Lc0 with latest test30 nets is vastly superior positionally

Laskos
Posts: 9133
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Laskos » Tue Jan 15, 2019 11:09 pm

Kanizsa wrote:
Wed Jan 09, 2019 1:32 pm
[Attachment: b2 b3.JPG]
Another position that I suggest adding is this one. It is very, very difficult for a program to find b3, with the idea of answering Nc5 with Rb1! and b4 (what does Leela play?)
Leela with one of the latest test30 nets doesn't solve this one; it sticks to f2-f3 for at least 20 minutes (40 million nodes).
The previous one Leela solves instantly and sticks to the solution, a2-a4.

Thanks for these positions, I will include them.

Javier Ros
Posts: 181
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Javier Ros » Wed Jan 16, 2019 8:06 am

This tendency of the latest Lc0 versions to lose positional strength is consistent with the results of my tests and experiments; see

viewtopic.php?f=2&t=69582&sid=0b6d895e6 ... 10#p786318

Perhaps the evolution of Lc0's learning process is leading her to improve tactically and get worse positionally.
The love relationship between a chess engine tester and his computer can be summarized in one sentence:
Until heat do us part.

Henk
Posts: 5618
Joined: Mon May 27, 2013 8:31 am

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Henk » Wed Jan 16, 2019 8:51 am

What I remember is that an engine scores best on Win at Chess if it limits its reductions and pruning to plain alpha-beta search.

That is, it applies only optimizations that lose no information. The problem is that it then won't search deep.
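To illustrate the point about pruning without information loss: plain alpha-beta returns exactly the same value as exhaustive minimax and only skips branches that cannot change the result. Below is a minimal Python sketch on a made-up toy tree. The tree, the function names, and the leaf counter are illustrative assumptions and are not taken from any engine.

```python
import math

def minimax(node, maximizing):
    """Exhaustive search: every leaf of the tree is evaluated."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, counter=None):
    """Plain alpha-beta: same result as minimax, fewer leaves visited.

    `counter` is an optional one-element list used to count evaluated leaves.
    """
    if isinstance(node, (int, float)):
        if counter is not None:
            counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent already has a better option elsewhere
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, best)
        if alpha >= beta:  # the side to move at the parent will avoid this line
            break
    return best

# Hypothetical two-ply tree: a list is an inner node, a number is a leaf score.
TOY_TREE = [[3, 5], [6, 9], [1, 2]]
```

On this toy tree both functions agree on the root value, while alpha-beta skips the last leaf. Reductions and forward pruning as used in real engines go further and can discard lines that do matter, which is the trade-off against depth mentioned above.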

Javier Ros
Posts: 181
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Javier Ros » Wed Jan 16, 2019 9:40 am

Laskos wrote:
Tue Jan 08, 2019 8:49 am
On my positional Openings200 test suite, largely based on databases of human games, I used Polyglot with particular settings, as engines like Lc0 and SF behave very differently in their depth and similar output. The setting checks whether the engine sticks to the correct solution from time/2 to time/1, as this seems most representative of the moves actually played in games at roughly this time per move. The usual approach, running from a very short time to the final time and requiring the engine to stick to the solution for, say, 3 successive iterations, is unreliable: on this positional test suite a regular engine can stick to the correct solution for 3 plies at very short times, only to change its mind at longer times.

Lc0 on RTX 2070 GPU
Regular engines on 4 i7 fast cores.

Code:

Stuck to the solution from 1s to 2s per position on 200 positions, top engines

Lc0 v20.1 ID32458: 143/200
Stockfish 10:      108/200
Komodo 12.3:        97/200
Ethereal 11.00:     89/200

Stuck to the solution from 10s to 20s per position on 200 positions, top engines

Lc0 v20.1 ID32458: 157/200
Stockfish 10:      128/200
Komodo 12.3:       117/200
Ethereal 11.00:    112/200

Lc0's performance is very strong: it covers human opening knowledge, a hard kind of knowledge, in a matter of seconds per position. I suspect that 15-20 of the 200 solutions in the test suite I built are wrong, so Lc0 with test30 nets approaches the upper limit of this positional test suite at longer times per position. Test30 ID32458 performs much better positionally than test10 ID11261, but worse tactically (on WAC200, for example). All in all, they are about the same strength under CCRL 40/4 conditions. I do not know why they haven't managed to improve test30 tactically, as tactics are the main weakness of the latest nets.

The link to this positional opening suite is here:
http://s000.tinyupload.com/?file_id=249 ... 2088614166

I think it would be very interesting to compute the values of your test for versions 32367 and 32409 (the two best, in my opinion) and compare them with the current versions.
In addition, I believe your positional test can be used to predict which versions are the best candidates for later testing, since with so many versions it is very difficult to choose the best one.
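For anyone who wants to reproduce this kind of count, here is a minimal Python sketch of the "stuck to the solution from time/2 to time/1" criterion quoted above. It is only one possible reading of that description, not the actual Polyglot setting: the function name, the input format (chronological pairs of elapsed seconds and best move), and the rule that the move held at time/2 must already be the solution are all assumptions.

```python
def stuck_to_solution(samples, solution, total_time):
    """Return True if the engine held `solution` over [total_time/2, total_time].

    `samples` is a chronological list of (elapsed_seconds, best_move) pairs,
    one per reported iteration. Assumed rule: the last move reported at or
    before total_time/2 must be the solution, and no later report up to
    total_time may show a different move.
    """
    half = total_time / 2.0
    move_at_half = None
    for elapsed, move in samples:
        if elapsed > total_time:
            break
        if elapsed <= half:
            move_at_half = move      # move in force when the window opens
        elif move != solution:
            return False             # changed its mind inside the window
    return move_at_half == solution

# Hypothetical reports for one position with a 2 s limit (window is 1 s to 2 s).
HOLDS = [(0.2, "e2-e4"), (0.6, "d2-d4"), (0.9, "d2-d4"), (1.5, "d2-d4"), (1.9, "d2-d4")]
WAVERS = [(0.2, "d2-d4"), (1.2, "c2-c4"), (1.8, "d2-d4")]
```

With the solution "d2-d4", the first series counts as solved and the second does not, even though both end on the right move. The suite score is then the number of positions for which the function returns True.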

Laskos
Posts: 9133
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Laskos » Wed Jan 16, 2019 10:01 am

Javier Ros wrote:
Wed Jan 16, 2019 9:40 am
I think it would be very interesting to compute the values of your test for versions 32367 and 32409 (the two best, in my opinion) and compare them with the current versions.
In addition, I believe your positional test can be used to predict which versions are the best candidates for later testing, since with so many versions it is very difficult to choose the best one.
I don't think that this positional test is of the utmost relevance for strength: tactics are often involved in games, and unfortunately test30, although by now better positionally, is still weaker tactically than test10 and is not improving tactically. All in all, the latest test30 nets are just a bit stronger than the best test10 nets (by about 20 Elo points).

I saw some regression in the positional play of test30, but then it jumped again to record levels.
Here are the results for the top engines, counting positions where the engine stuck to the solution over the 1s to 2s interval per position. The latest ID32644 is incredibly strong on this suite:

Code:

Lc0 v20.1 ID32644: 756/1000
Lc0 v20.1 ID32458: 712/1000
Houdini 6.03:      558/1000
Komodo 12.3:       556/1000
Stockfish 10:      524/1000
Booot 6.3.1:       494/1000
Andscacs 0.95:     484/1000
Ethereal 11.00:    457/1000
Fire 7.1:          431/1000
Texel 1.07:        419/1000


ID32644 surpasses every regular engine by a huge margin. I am pretty happy that my two-year-old test suite can show the huge positional superiority of Lc0 with good nets. The suite does not rely on engine analysis (as STS does) but on databases of human games in the openings. I think Stockfish does not perform very well compared to Komodo and Houdini because Fishtest uses the stupid 2moves_v1 random openings rather than more regular openings.
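As a rough check on how large these margins are, the solved counts in the table can be turned into solve rates with a binomial standard error. The Python sketch below treats each of the 1000 scored positions as an independent trial and the engines as unpaired samples. Both are simplifying assumptions (all engines were run on the same positions, so a paired comparison would be more accurate), and the function names are made up for illustration.

```python
import math

# Solved counts copied from the table above (1s to 2s interval, out of 1000).
SOLVED = {
    "Lc0 v20.1 ID32644": 756,
    "Lc0 v20.1 ID32458": 712,
    "Houdini 6.03": 558,
    "Komodo 12.3": 556,
    "Stockfish 10": 524,
}
TOTAL = 1000

def rate_and_error(solved, total=TOTAL):
    """Solve rate and its binomial standard error, sqrt(p * (1 - p) / n)."""
    p = solved / total
    return p, math.sqrt(p * (1.0 - p) / total)

def gap_in_standard_errors(solved_a, solved_b, total=TOTAL):
    """Difference of two solve rates divided by its combined standard error."""
    p_a, se_a = rate_and_error(solved_a, total)
    p_b, se_b = rate_and_error(solved_b, total)
    return (p_a - p_b) / math.hypot(se_a, se_b)
```

Under these assumptions, `gap_in_standard_errors(756, 558)` puts ID32644 more than nine standard errors ahead of Houdini 6.03, while Houdini 6.03 and Komodo 12.3 (558 versus 556) are statistically indistinguishable.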

Javier Ros
Posts: 181
Joined: Fri Oct 12, 2012 10:48 am
Location: Seville (SPAIN)
Full name: Javier Ros

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Javier Ros » Wed Jan 16, 2019 11:28 am

You are right! Lc0 32644 is again playing 8...h5 in the 8th position of the Balsa_Top25 suite.
Lc0 is regaining positional strength! It hasn't played this move since 32409.

It finds it after 11 seconds, at ply 12, on a 2070.

FEN: rn1qkb1r/1p3ppp/p2pbn2/4p3/4P3/1NN1BP2/PPP3PP/R2QKB1R b KQkq - 0 8

Lc0201_32644:

....
11/25 00:03 82.133 23.910 -0,53 8. ... Bf8-e7 9.Qd1-d2 Nb8-d7 10.g2-g4 O-O 11.g4-g5 Nf6-h5 12.O-O-O b7-b5 13.Nc3-d5 Be6xd5 14.e4xd5 f7-f6 15.g5xf6 Be7xf6 16.Kc1-b1 Bf6-h4 17.Nb3-a5 Qd8-f6 18.Bf1-h3 Nd7-c5
11/26 00:05 138.822 24.265 -0,50 8. ... Bf8-e7 9.Qd1-d2 h7-h5 10.Nc3-d5 Nf6xd5 11.e4xd5 Be6-f5 12.Bf1-e2 Nb8-d7 13.O-O h5-h4 14.Nb3-a5 Qd8-c7 15.c2-c4 O-O 16.b2-b4 h4-h3 17.g2-g4 Bf5-g6 18.Ra1-c1 f7-f5
12/26 00:06 151.106 24.574 -0,50 8. ... Bf8-e7 9.Qd1-d2 h7-h5 10.Nc3-d5 Nf6xd5 11.e4xd5 Be6-f5 12.Bf1-e2 Nb8-d7 13.O-O h5-h4 14.Nb3-a5 Qd8-c7 15.c2-c4 O-O 16.b2-b4 h4-h3 17.g2-g4 Bf5-g6 18.Ra1-c1 f7-f5
12/27 00:07 186.884 25.977 -0,50 8. ... Bf8-e7 9.Qd1-d2 h7-h5 10.Nc3-d5 Nf6xd5 11.e4xd5 Be6-f5 12.Bf1-e2 Nb8-d7 13.O-O h5-h4 14.Nb3-a5 Qd8-c7 15.c2-c4 O-O 16.b2-b4 h4-h3 17.g2-g4 Bf5-g6 18.Ra1-c1 f7-f5
12/28 00:07 198.869 26.229 -0,50 8. ... Bf8-e7 9.Qd1-d2 h7-h5 10.Nc3-d5 Nf6xd5 11.e4xd5 Be6-f5 12.Bf1-e2 a6-a5 13.a2-a4 O-O 14.O-O Nb8-d7 15.Be2-b5 Nd7-f6 16.c2-c4 h5-h4 17.Qd2-f2 h4-h3 18.g2-g4 Bf5-d7
12/28 00:11 364.973 30.966 -0,41 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.O-O-O Bf8-e7 11.Kc1-b1 Ra8-c8 12.Nc3-d5 Nf6xd5 13.e4xd5 Be6-f5 14.Bf1-d3 Bf5xd3 15.Qd2xd3 Be7-g5 16.Be3-f2 Qd8-c7 17.c2-c3 O-O 18.Qd3-f5 Bg5-h6 19.Qf5xh5 b7-b5
12/28 00:16 557.394 33.195 -0,38 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Bf8-g7 13.Kc1-b1 Qd8-c7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Nb3-d2 Bh4-g5 21.Qh6-h3 e5-e4
12/29 00:17 568.714 33.207 -0,38 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Nb3-d2 Bh4-g5 21.Qh6-h3 Bg5xd2 22.Rd1xd2
12/30 00:20 690.376 34.461 -0,39 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Nb3-d2 Bh4-g5 21.Qh6-h3 Bg5xd2 22.Rd1xd2
13/30 00:20 696.869 34.560 -0,38 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Nb3-d2 Bh4-g5 21.Qh6-h3 Bg5xd2 22.Rd1xd2
13/31 00:22 783.187 34.605 -0,36 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Qh6-e3 Nb6-a4 21.Nb3-d2 e5-e4 22.Qe3xe4 Qe7xe4
13/32 00:24 842.683 34.421 -0,35 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Nd7-b6 16.Be3-g5 h5xg4 17.Bg5xf6 Bg7xf6 18.f3xg4 Bf6-h4 19.Qd2-h6 Qc7-e7 20.Qh6-e3 Nb6-a4 21.Nb3-d2 e5-e4 22.Qe3xe4 Qe7xe4
13/33 00:29 997.945 34.143 -0,34 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Rh1-g1 O-O 15.g2-g4 h5xg4 16.f3xg4 Nd7-b6 17.Qd2-g2 e5-e4 18.Be3-d4 Ra8-e8 19.c2-c4 e4-e3 20.h2-h4 Nb6xc4
13/37 00:33 1.197.343 35.456 -0,33 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Rh1-g1 O-O 15.g2-g4 h5xg4 16.f3xg4 Nd7-b6 17.Qd2-g2 e5-e4 18.Be3-d4 Ra8-e8 19.c2-c4 e4-e3 20.h2-h4 Nb6xc4
14/37 00:34 1.228.053 35.249 -0,33 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Rf8-c8 16.Rd1-c1 a6-a5 17.g4-g5 Nf6-e8 18.a2-a4 Qc7-d8 19.Be2-b5 Ne8-c7 20.c2-c4 b7-b6 21.Bb5-c6 Ra8-b8 22.Bc6-b5 Rb8-a8 23.Rh1-d1 Nc7-a6
14/37 00:37 1.350.835 36.200 -0,34 8. ... h7-h5 9.Qd1-d2 Nb8-d7 10.Nc3-d5 Be6xd5 11.e4xd5 g7-g6 12.O-O-O Qd8-c7 13.Kc1-b1 Bf8-g7 14.Bf1-e2 O-O 15.g2-g4 Rf8-c8 16.Rd1-c1 a6-a5 17.g4-g5 Nf6-e8 18.a2-a4 Qc7-d8 19.Be2-b5 Ne8-c7 20.Qd2-d3 Nc7xb5 21.Qd3xb5 b7-b6 22.Nb3-d2 Nd7-c5 23.Nd2-e4 Ra8-b8 24.Rh1-f1 Rc8-c7
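The point at which the engine switches to 8...h5 can also be read off the log programmatically. The Python sketch below assumes the columns are depth/seldepth, elapsed time as mm:ss, nodes, nodes per second, score, and the principal variation, with "." as the thousands separator and "," as the decimal mark. That column interpretation is a guess from the output above, and the names `PvLine`, `parse_line`, and `first_seen` are made up for illustration. The sample lines are taken from the log with the variations shortened.

```python
import re
from typing import NamedTuple, Optional

class PvLine(NamedTuple):
    depth: int
    seldepth: int
    seconds: int
    nodes: int
    nps: int
    score: float
    best_move: str

def parse_line(line: str) -> PvLine:
    """Parse one analysis line under the assumed column layout."""
    fields = line.split()
    depth, seldepth = (int(x) for x in fields[0].split("/"))
    minutes, secs = (int(x) for x in fields[1].split(":"))
    nodes = int(fields[2].replace(".", ""))      # "1.197.343" -> 1197343
    nps = int(fields[3].replace(".", ""))
    score = float(fields[4].replace(",", "."))   # "-0,53" -> -0.53
    best_move = ""
    for token in fields[5:]:
        move = re.sub(r"^\d+\.", "", token)       # drop a leading move number
        if move and move != "...":
            best_move = move
            break
    return PvLine(depth, seldepth, 60 * minutes + secs, nodes, nps, score, best_move)

def first_seen(lines, move: str) -> Optional[PvLine]:
    """Return the first parsed line whose best move equals `move`, if any."""
    for line in lines:
        entry = parse_line(line)
        if entry.best_move == move:
            return entry
    return None

# Lines from the log above, variations shortened.
LOG_SAMPLE = [
    "11/25 00:03 82.133 23.910 -0,53 8. ... Bf8-e7 9.Qd1-d2 Nb8-d7",
    "12/28 00:07 198.869 26.229 -0,50 8. ... Bf8-e7 9.Qd1-d2 h7-h5",
    "12/28 00:11 364.973 30.966 -0,41 8. ... h7-h5 9.Qd1-d2 Nb8-d7",
    "13/37 00:33 1.197.343 35.456 -0,33 8. ... h7-h5 9.Qd1-d2 Nb8-d7",
]
```

Calling `first_seen(LOG_SAMPLE, "h7-h5")` returns the entry at 11 seconds and depth 12, which matches the figures quoted before the log.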

Jouni
Posts: 1953
Joined: Wed Mar 08, 2006 7:15 pm

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Jouni » Wed Jan 16, 2019 12:46 pm

Lc0 score is no surprise with 40 MB of learning data. How about testing Brainfish?
Jouni

Laskos
Posts: 9133
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Laskos » Wed Jan 16, 2019 1:14 pm

Jouni wrote:
Wed Jan 16, 2019 12:46 pm
Lc0 score is no surprise with 40 MB of learning data. How about testing Brainfish?
Lc0 is a pattern learner. Cerebellum actually contains many of these openings. This positional strength of Lc0 probably extends to the middlegame too, where most positions are not covered by any book.
Aside from that, I had trouble testing BrainFish in Polyglot.

Jouni
Posts: 1953
Joined: Wed Mar 08, 2006 7:15 pm

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Jouni » Sat Feb 02, 2019 8:22 pm

Brainfish is difficult to test because it doesn't use its book in analysis mode :x. But I was curious and tested it manually. The score was 152/200 with a "0" sec limit.
Jouni

Laskos
Posts: 9133
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Lc0 with latest test30 nets is vastly superior positionally

Post by Laskos » Sat Feb 02, 2019 9:28 pm

Jouni wrote:
Sat Feb 02, 2019 8:22 pm
Brainfish is difficult to test because it doesn't use its book in analysis mode :x. But I was curious and tested it manually. The score was 152/200 with a "0" sec limit.
Thanks, that is very close to what the best test30 nets on this suite (32819, for example) get, about 153/200, at 1s to 2s per position. So at 1s to 2s per position in the openings, Leela is roughly as strong as the Cerebellum book, which Stockfish analyzed for dozens of minutes per position, and that is quite a feat. At longer TC (say Blitz or longer), Leela by itself plays stronger than the Cerebellum book in the openings.
