As a reference I took Andreas Strangmüller's excellent FGRL rating list at 60''+0.6'' time control. It is close to my conditions and is built on very many games (very small error margins).
http://www.fastgm.de/60-0.60.html
My time control is 60''+1''. The openings are the 12 positions from the DeepMind paper, each played with both colors. I played a 24-round gauntlet of Lc0 v17 ID 11261 on a GTX 1060 against the current top 3 engines on four i7 cores, 72 games in total. The top 3 have this FGRL average rating:
=====================
SF dev       : 3454
Houdini 6.03 : 3364
Komodo 12.1  : 3323
=====================
Average      : 3380
Result of the gauntlet:
Rank Name              Elo   +/-  Games  Score  Draws
   0 lc0_v17 11261     -58    46     72  41.7%  66.7%
   1 SF dev            120    75     24  66.7%  66.7%
   2 Houdini 6.03       73    74     24  60.4%  70.8%
   3 Komodo 12.1.1     -14    87     24  47.9%  62.5%
72 of 72 games finished.
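For anyone who wants to check the Elo column, here is a minimal Python sketch. It assumes the usual logistic Elo formula, and the point totals (30 of 72 for Lc0, and so on) are my own back-calculation from the Score percentages, not output copied from the tournament tool. The function name `elo_from_score` is made up for this illustration.

```python
import math

def elo_from_score(points: float, games: int) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    score = points / games
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# Points are inferred from the Score column (assumption), e.g. 41.7% of 72 = 30.
results = {
    "lc0_v17 11261": (30.0, 72),
    "SF dev":        (16.0, 24),
    "Houdini 6.03":  (14.5, 24),
    "Komodo 12.1.1": (11.5, 24),
}

for name, (points, games) in results.items():
    print(f"{name:15s} {elo_from_score(points, games):+7.1f}")
```

Rounded to whole points, these reproduce the Elo column of the table above.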
========================================================================================
Then I played under the same conditions against much weaker engines of the same families, but 5 years old. Their FGRL average rating is:
=====================
SF DD        : 3064
Houdini 3    : 3080
Komodo 7a    : 3062
=====================
Average      : 3069
Result of the gauntlet:
Rank Name              Elo   +/-  Games  Score  Draws
   0 lc0_v17 11261      68    52     72  59.7%  58.3%
   1 Komodo 7a         -29    92     24  45.8%  58.3%
   2 Houdini 3         -73    74     24  39.6%  70.8%
   3 SF DD            -104   107     24  35.4%  45.8%
72 of 72 games finished.
========================================================================================
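We see that Leela's performance rating is much lower against the engines that are 300+ Elo points weaker: 3380 - 58 = 3322 against the current top 3, but only 3069 + 68 = 3137 against the 5-year-old ones.
The difference between the two performances is:
Difference in Lc0 performances: 185 +/- 69 Elo points (2SD).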
This is pretty huge, and the Elo curve is either not obeyed at all or severely compressed. From this alone one can already say that there is not much meaning in an "Elo strength of Leela against regular engines", never mind the scaling issues with time control, the hardware issues and the opening repertoire issue. It's pretty useless to talk about Leela in Elo terms when comparing it to regular engines, except maybe as an order of magnitude and always specifying the conditions.
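The arithmetic behind the 185 +/- 69 figure can be sketched in a few lines of Python. Two assumptions are mine: the two error margins are treated as independent and combined in quadrature (which matches the quoted 69), and the pool average is used as if it were a single opponent when computing what the Elo model would predict. All names in the snippet are hypothetical.

```python
import math

# FGRL pool averages and Lc0 gauntlet results (Elo, error margin) from the tables.
AVG_TOP, AVG_OLD = 3380, 3069
LC0_VS_TOP = (-58, 46)
LC0_VS_OLD = (68, 52)

perf_top = AVG_TOP + LC0_VS_TOP[0]   # 3322
perf_old = AVG_OLD + LC0_VS_OLD[0]   # 3137
diff = perf_top - perf_old           # 185

# Assumption: independent margins, combined in quadrature.
err = math.hypot(LC0_VS_TOP[1], LC0_VS_OLD[1])

def expected_score(rating_diff: float) -> float:
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

# If the 3322 performance carried over, this is the score the Elo model
# would predict against the old pool (pool average used as one opponent).
predicted = expected_score(perf_top - AVG_OLD)

print(f"difference: {diff} +/- {err:.0f}")
print(f"predicted score vs old pool: {predicted:.1%} (observed: 59.7%)")
```

Under these assumptions the Elo model would predict a score of roughly 81% against the old engines, against the 59.7% actually scored.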