Daniel Shawul wrote:It still blunders hugely after the +300 elo.
It does, no question about it. Even with the new cuDNN speedups (which need testing to see how that translates into results). It is a fascinating engine, but there is a ways to go yet.
With this sort of speed-up due to cuDNN, at LTC, say 60'+15'', your LC0 should be about the level of Houdini 1.5a on 4 threads. It would be a fascinating match; H1.5 was a very tactically astute engine. In 10 games there would be some LC0 wins, and I would be curious how LC0, which misses mates in 3 and is generally very weak in tactics (especially in tactical puzzles), can beat H1.5 mostly positionally and on long-term "plans". Seeing this cuDNN speed-up, I will get a GTX 1060 in less than 2 weeks.
You are greatly overestimating LC0 strength.
I ran another 1000-game match between LC0 ID227 at 1000 playouts and SF at depth 10, and LC0 didn't even reach 40%.
SF at depth 10 is some 800-1000 Elo weaker than H1.5 on 4 threads at a 60'+15'' TC.
LC0 with cuDNN on a 1060 runs at ~5000 nps with the 15x128 net, i.e. around 350'000 playouts per move at a 60'+15'' TC.
If you really believe 8.5 doublings for LC0 are worth anything close to 800 or 1000 Elo, what can I tell you, you are grossly mistaken. Just look at Andrey Chilantiev's results in the Tournaments section, where it is clear that a doubling in the number of playouts for LC0 yields 50-60 Elo.
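As a sanity check on the playouts-per-move figure, here is a rough back-of-the-envelope sketch. The 60-moves-per-side game length is my assumption, not from the post, and real engines manage time adaptively rather than splitting it evenly:

```python
def playouts_per_move(nps, base_minutes, increment_seconds, moves_per_side=60):
    """Rough playouts per move: nps times an even split of base time, plus increment."""
    seconds_per_move = base_minutes * 60 / moves_per_side + increment_seconds
    return nps * seconds_per_move

# ~5000 nps at 60'+15'' comes out near the post's ~350'000 playouts per move
print(playouts_per_move(5000, 60, 15))  # 375000.0
```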
You seem to extrapolate from very remote levels, which might vary greatly in Elo per doubling and many other things. I took the 2900 CCRL level at 1'+1'' on a card close to a 1060, reported here by at least 2 posters. This was with the non-cuDNN engine, which is 6x slower. Now the extrapolation begins: 50x time control --> 5.5 doublings. Scaling difference --> +35 Elo points per doubling (I got 60-65 at lower playouts), so at this LTC of 60'+15'' the non-cuDNN engine would perform at a 3100 CCRL Elo level. With cuDNN, a factor of 6 in nps is 2.5 doublings, and I assumed at least 70 Elo points per doubling (accounting for better scaling too) --> an additional 150 Elo points. 3100 + 150 = 3250 CCRL Elo level, similar to the CCRL Elo level of Houdini 1.5a on 4 threads.
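The extrapolation above can be written out as a small calculation. The Elo-per-doubling figures are the post's own assumptions, and the post rounds a couple of intermediate values (log2(50) is closer to 5.6 doublings than 5.5, and 2.5 doublings at 70 Elo each is ~175 rather than 150):

```python
import math

def elo_gain(speedup, elo_per_doubling):
    """Elo gained from a speedup factor, assuming a fixed gain per doubling of playouts."""
    return math.log2(speedup) * elo_per_doubling

base = 2900                     # reported CCRL level at 1'+1'' on a ~GTX 1060 class card
ltc = base + elo_gain(50, 35)   # 50x longer TC at an assumed 35 Elo per doubling
cudnn = ltc + elo_gain(6, 70)   # 6x nps from cuDNN at an assumed 70 Elo per doubling

print(round(ltc))    # 3098, the post's "3100 CCRL Elo level"
print(round(cudnn))  # 3278, which the post rounds down to 3250
```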
Well, when I get my GTX 1060, we might see who is closer, if no one will bother to test that in said conditions in say 10 games.
Laskos wrote:With cuDNN engine, a factor of 6 is 2.5 doublings, and I assumed at least 70 Elo points (accounting for better scaling too) --> additional 150 Elo points. 3100 + 150 = 3250 CCRL Elo level, similar to CCRL Elo level of Houdini 1.5a on 4 threads.
Well, when I get my GTX 1060, we might see who is closer, if no one will bother to test that in said conditions in say 10 games.
We will probably need to wait some time for a bug-free cuDNN version. I quickly tested the tensorflow version too, and it is affected in the same way as the cuDNN version, as Albert Silver also noted. Despite a 6x or greater nps advantage, the cuDNN version comfortably loses a direct match against the OpenCL version.
This is a total rewrite by mooskagh (Alexander Lyashuk), so the SMP code that divides data into batches probably has some serious bugs. The code is not officially supported because Gian-Carlo is paranoid about using cuDNN with LC0, thinking it would break the licences, which IMO is wrong, so he, and then also Garry, doesn't approve of changing it to cuDNN.
Daniel Shawul wrote:Which branch has the cuDNN version? I see only tensorflow code for training.
But the lc0 implementation (with cudnn) sounds like more of a rewrite, so it should not be used to measure the strength of play yet. The author is still trying to make it perform on par with the baseline lczero.
Laskos wrote:With cuDNN engine, a factor of 6 is 2.5 doublings, and I assumed at least 70 Elo points (accounting for better scaling too) --> additional 150 Elo points. 3100 + 150 = 3250 CCRL Elo level, similar to CCRL Elo level of Houdini 1.5a on 4 threads.
Well, when I get my GTX 1060, we might see who is closer, if no one will bother to test that in said conditions in say 10 games.
We will probably need to wait some time for a bug-free cuDNN version. I quickly tested the tensorflow version too, and it is affected in the same way as the cuDNN version, as Albert Silver also noted. Despite a 6x or greater nps advantage, the cuDNN version comfortably loses a direct match against the OpenCL version.
This is a total rewrite by mooskagh (Alexander Lyashuk), so the SMP code that divides data into batches probably has some serious bugs. The code is not officially supported because Gian-Carlo is paranoid about using cuDNN with LC0, thinking it would break the licences, which IMO is wrong, so he, and then also Garry, doesn't approve of changing it to cuDNN.
Ah, ok, I didn't know. I was really disquieted seeing those benches, and not having a reasonable GPU.
I can confirm that:
- LCZ on a GTX 1060 at 1m+1s TC is now in the CCRL 2900 ballpark (I tested ID228) when tested against engines in roughly the 2900-3100 range,
- the cuDNN version is much faster but doesn't look any stronger than the OpenCL one.
I also tried a match of LCZ (same HW and TC as above) against a minimal SF9 (1 CPU, 32 MB hash, no book, no syzygy) with the Noomen 2-move openings, and it was unfortunately an almost absolute slaughter, around 96% for SF. I'm really curious to see how LCZ is going to overcome SF, as many people seem certain it will happen. I'm not so sure that SF will go so quietly into the night.