Leela with one of the latest test30 nets doesn't solve this one; it sticks to f2-f3 for at least 20 minutes (40 million nodes).
The previous one Leela solves instantly and sticks to it: a2-a4.
Thanks for these positions, I will include them.
Laskos wrote: ↑Tue Jan 08, 2019 9:49 am On my positional Openings200 test suite, largely based on databases of human games, I used Polyglot with particular settings, because engines like Lc0 and SF behave very differently in depth and similar output. The criterion I used is whether the engine sticks to the correct solution from time/2 to time/1, as this seems the most representative of the moves actually played in games at roughly this time per move. The usual test, which runs from a very short time to the final time and counts a position as solved once the engine sticks to the solution for, say, 3 successive iterations, is unreliable: a regular engine can stick to the correct solution for 3 plies at very short times, only to change its mind at longer times on this positional test suite.
Lc0 on RTX 2070 GPU
Regular engines on 4 i7 fast cores.
Lc0's performance is very strong: it covers human opening knowledge, a hard domain, in a matter of seconds per position. I suspect that 15-20 of the 200 solutions in the test suite I built are wrong, so Lc0 with test30 nets approaches the upper limit of this positional test suite at longer times per position. Test30 ID32458 performs much better than test10 ID11261 positionally, but worse tactically (on WAC200, for example). All in all, they are about the same strength in CCRL 40/4 conditions. I do not know why they didn't manage to improve test30 tactically, as it's the main weakness of the latest nets.
Code: Select all
Stuck to the solution from 1s to 2s per position on 200 positions, top engines
Lc0 v20.1 ID32458: 143/200
Stockfish 10: 108/200
Komodo 12.3: 97/200
Ethereal 11.00: 89/200

Stuck to the solution from 10s to 20s per position on 200 positions, top engines
Lc0 v20.1 ID32458: 157/200
Stockfish 10: 128/200
Komodo 12.3: 117/200
Ethereal 11.00: 112/200
The link to this positional opening suite is here:
http://s000.tinyupload.com/?file_id=249 ... 2088614166
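The "sticks to the solution from time/2 to time/1" criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Polyglot setup used for the numbers in this post: the function name `stuck_to_solution` and the log format (a chronological list of `(elapsed_seconds, best_move)` reports) are hypothetical.

```python
from typing import List, Tuple

# Hypothetical log format: chronological (elapsed_seconds, best_move) reports,
# e.g. collected from an engine's output at each change or iteration.
Report = Tuple[float, str]


def stuck_to_solution(reports: List[Report], solution: str, final_time: float) -> bool:
    """Return True if the engine's best move equals `solution` during the
    whole interval from final_time / 2 to final_time."""
    half = final_time / 2.0
    move_at_half = None  # best move in effect when the interval starts
    for elapsed, move in reports:
        if elapsed > final_time:
            break  # reports after the final time are ignored
        if elapsed <= half:
            move_at_half = move  # latest report before the interval wins
        elif move != solution:
            return False  # the engine changed its mind inside the interval
    # Assumption: with no report at or before time/2, the position counts as unsolved.
    return move_at_half == solution


# The engine settles on a2a4 before 1s and keeps it until 2s: solved.
steady = [(0.2, "e2e4"), (0.8, "a2a4"), (1.5, "a2a4"), (1.9, "a2a4")]
# Three early reports agree with the solution, but the engine switches later: not solved.
fickle = [(0.3, "a2a4"), (0.6, "a2a4"), (0.9, "a2a4"), (1.6, "f2f3")]

solved_steady = stuck_to_solution(steady, "a2a4", 2.0)
solved_fickle = stuck_to_solution(fickle, "a2a4", 2.0)
```

The second log is the case the post warns about: a "3 successive iterations" rule would count it as solved, while the time/2 to time/1 rule does not.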
I don't think this positional test is of utmost relevance for strength: tactics are often involved in games, and unfortunately test30, although by now better positionally, is still weaker tactically than test10 and is not improving tactically. All in all, the latest test30 nets are just a bit stronger than the best test10 nets (in the region of 20 or so Elo points).
Javier Ros wrote: ↑Wed Jan 16, 2019 10:40 am
I think it would be very interesting to compute the values of your test for versions 32367 and 32409 (these two are the best in my opinion) and compare them with the current versions.
In addition, I believe your positional test can be used to predict which versions are the best for later testing, since there are so many versions that it is very difficult to choose the best one.
Code: Select all
Lc0 v20.1 ID32644: 756/1000
Lc0 v20.1 ID32458: 712/1000
Houdini 6.03: 558/1000
Komodo 12.3: 556/1000
Stockfish 10: 524/1000
Booot 6.3.1: 494/1000
Andscacs 0.95: 484/1000
Ethereal 11.00: 457/1000
Fire 7.1: 431/1000
Texel 1.07: 419/1000
You are right! Lc0 32644 is again playing 8..h5 in the 8th position of the Balsa_Top25 suite.
Laskos wrote: ↑Wed Jan 16, 2019 11:01 am
ID32644 surpasses every regular engine by a huge margin. I am pretty happy that my 2-year-old test suite can show the huge positional superiority of Lc0 with good nets. The suite does not rely on engine analysis (as STS does), but on databases of human games in the openings. I think Stockfish does not perform very well compared to Komodo and Houdini because Fishtest uses the stupid 2moves_v1 random openings rather than more regular openings.
Lc0 is a pattern learner. Cerebellum actually contains many of these openings. This positional strength of Lc0 probably extends to the midgame too, where most positions are not covered by any book.
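To put the "huge margins" in numbers, here is a rough sketch that converts the scores out of 1000 into solve rates and estimates how large the gap between two engines is relative to sampling noise. It assumes each position is an independent pass/fail trial and treats the two engines' results as independent samples; since all engines were run on the same positions, this is only a crude estimate. The helper names are made up for illustration.

```python
import math

# Scores out of 1000 taken from the table in this thread.
TOTAL = 1000
scores = {
    "Lc0 v20.1 ID32644": 756,
    "Lc0 v20.1 ID32458": 712,
    "Houdini 6.03": 558,
    "Komodo 12.3": 556,
    "Stockfish 10": 524,
}


def solve_rate(solved: int, total: int) -> float:
    """Fraction of positions where the engine stuck to the solution."""
    return solved / total


def diff_std_error(solved_a: int, solved_b: int, total: int) -> float:
    """Standard error of the difference of two solve rates,
    under the (assumed) independent binomial model."""
    pa, pb = solved_a / total, solved_b / total
    return math.sqrt(pa * (1 - pa) / total + pb * (1 - pb) / total)


best_lc0 = scores["Lc0 v20.1 ID32644"]
best_regular = scores["Houdini 6.03"]

gap = solve_rate(best_lc0, TOTAL) - solve_rate(best_regular, TOTAL)
se = diff_std_error(best_lc0, best_regular, TOTAL)
z = gap / se  # gap measured in standard errors

print(f"gap = {gap:.3f}, standard error = {se:.3f}, z = {z:.1f}")
```

Under these assumptions the gap between ID32644 and the best regular engine is many standard errors wide, whereas the gaps among Houdini, Komodo and Stockfish are far smaller.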
Thanks, it is very close to what the best test30 nets (32819, for example) get on this suite: about 153/200 at 1s to 2s per position. So, at 1s to 2s per position, Leela is roughly as strong in the openings as the Cerebellum book, which Stockfish analyzed for dozens of minutes per position, and that is quite a feat. At longer TC (say Blitz or longer), Leela by itself plays stronger than the Cerebellum book in the openings.