MMarco wrote: ↑Mon Oct 05, 2020 2:06 am
Laskos wrote: ↑Sat Oct 03, 2020 8:00 pm
Code: Select all
Endgames:
Score of lc0_LS15 vs Fruit_21: 17 - 17 - 66 [0.500] 100
... lc0_LS15 playing White: 8 - 10 - 32 [0.480] 50
... lc0_LS15 playing Black: 9 - 7 - 34 [0.520] 50
... White vs Black: 15 - 19 - 66 [0.480] 100
Elo difference: 0.0 +/- 39.8, LOS: 50.0 %, DrawRatio: 66.0 %
Finished match
Interesting. I guess that was at a fast time control. I ran Lc0 tcec-19 (on a rtx 2060) vs Stockfish tcec-19 (1 core Ryzen 9 4900H) at 100s + 1s with 5-men syzygy, on your test set. With these conditions, Lc0 and Stockfish are usually about on par (see my tournaments here: http://talkchess.com/forum3/viewtopic.p ... 56#p860072 ). My test conditions are such that engines calculate about 1000-1500 times fewer nodes per move than at TCEC.
Code: Select all
   # PLAYER               :  RATING  ERROR  PLAYED    (%)  CFS    W    D    L   D(%)
   1 Lc0 tcec-19          :     0.0   23.1     100  50.00   50   32   36   32  36.00
   2 Stockfish tcec-19    :     0.0   23.1     100  50.00  ---   32   36   32  36.00
White advantage = -21.05 +/- 29.75
Draw rate (equal opponents) = 36.10 % +/- 4.51
Games: https://gofile.io/d/xz9EBb
I would guess that the bad result against Fruit is due to Leela missing tactics at low depth. Given a reasonable time control, Leela is on par with Stockfish in this endgame test.
Do not use TBs and do not use any adjudications, it's important here, in this particular set-up. I am testing myself at longer TC (100 + 1) from these openings Lc0 J92-160 on RTX 2070 against SF12 on 4 i7 cores, a pretty favorable for Leela set-up (old Leela ratio of about 2.5), let's see.
Policy determining quiet early opening preferences of Leela
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Policy determining quiet early opening preferences of Leela
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Policy determining quiet early opening preferences of Leela
Laskos wrote: ↑Mon Oct 05, 2020 8:57 am
Do not use TBs and do not use any adjudications, it's important here, in this particular set-up. I am testing myself at longer TC (100 + 1) from these openings Lc0 J92-160 on RTX 2070 against SF12 on 4 i7 cores, a pretty favorable for Leela set-up (old Leela ratio of about 2.5), let's see.
I interrupted after 24 games (12 pairs side and reversed)
Code: Select all
Score of SF_12 vs Lc0_J92-190_CUDA: 4 - 2 - 18 [0.542] 24
... SF_12 playing White: 2 - 2 - 8 [0.500] 12
... SF_12 playing Black: 2 - 0 - 10 [0.583] 12
... White vs Black: 2 - 4 - 18 [0.458] 24
Elo difference: 29.0 +/- 69.9, LOS: 79.3 %, DrawRatio: 75.0 %
24 longer TC games are here:
https://gofile.io/d/Ydjt7N
I will share the new opening suite and new LTC results.
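For readers who want to check the summary line of the match output above, here is a minimal sketch of how the Elo difference, LOS and draw ratio can be recomputed from the win/loss/draw counts. It assumes the usual logistic Elo model and the common normal-approximation formula for LOS; the function names are my own and are not taken from the match tool.

```python
import math

def score_fraction(wins, losses, draws):
    """Fraction of points scored (win = 1, draw = 0.5)."""
    return (wins + 0.5 * draws) / (wins + losses + draws)

def elo_difference(wins, losses, draws):
    """Elo difference implied by the score under the logistic model.

    Undefined for a 0 % or 100 % score (division by zero / log of zero).
    """
    s = score_fraction(wins, losses, draws)
    return 400.0 * math.log10(s / (1.0 - s))

def likelihood_of_superiority(wins, losses):
    """LOS from decisive games only (draws carry no information here)."""
    if wins + losses == 0:
        return 0.5
    x = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 0.5 * (1.0 + math.erf(x))

def draw_ratio(wins, losses, draws):
    return draws / (wins + losses + draws)

# The interrupted 24-game match: SF_12 scored 4 - 2 - 18.
w, l, d = 4, 2, 18
print(f"Elo difference: {elo_difference(w, l, d):.1f}")
print(f"LOS: {100 * likelihood_of_superiority(w, l):.1f} %")
print(f"DrawRatio: {100 * draw_ratio(w, l, d):.1f} %")
```

With 4 - 2 - 18 this gives roughly 29 Elo and an LOS of about 79 %, in line with the reported line. The error margin is not reproduced here; with only 24 games it is far larger than the difference itself.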
-
- Posts: 4610
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Policy determining quiet early opening preferences of Leela
Laskos wrote: ↑Sat Oct 03, 2020 10:04 pm
...
In fact you are more correct than I thought. In endgames Lc0 LS15 (one of the best nets out there) on RTX 2070 is similar in strength to...Fruit 2.1 on one core, underperforming by at least 1000 Elo points compared to openings, where it is the strongest engine on my PC (and I have no upper limit of its strength in the openings). So, you are basically right, if by move 30-40 Lc0 is not winning, it will hardly win later.
Code: Select all
Endgames:
Score of lc0_LS15 vs Fruit_21: 17 - 17 - 66 [0.500] 100
... lc0_LS15 playing White: 8 - 10 - 32 [0.480] 50
... lc0_LS15 playing Black: 9 - 7 - 34 [0.520] 50
... White vs Black: 15 - 19 - 66 [0.480] 100
Elo difference: 0.0 +/- 39.8, LOS: 50.0 %, DrawRatio: 66.0 %
Finished match
PGN:
https://gofile.io/d/U8aHbN
...
Nobody probably knows in many of these endgames whether they are won or drawn. You have the PGN file. I can play from these endgames SF12 against Fruit 2.1, but keep in mind that endgames contribute to less than 15% of the Elo of an engine. The bottom line is: Lc0 with a good net and strong GPU is as weak in endgames as Fruit 2.1 on one core, a tremendous underperformance you agree or not.
I checked the games now from the match LC0 vs. Fruit and after this it wasn't necessary anymore to check the SF12 vs. Fruit games.
The reason is that something seems inherently weak in those LC0 games. It often reaches a totally won endgame, but in 95% of those games
it boils down to the issue that it cannot convert K+R vs. K 'endgames' and lets them run into 50-move draws.
(Those are most of the 'wrong result' games; there are a very few others too, such as K+Q vs. K+N. It is also possible that early adjudication hides
this problem from most people.)
Just one of many examples:
[pgn]
[Event "?"]
[Site "?"]
[Date "2020.10.03"]
[Round "7"]
[White "Fruit_21"]
[Black "lc0_LS15"]
[Result "1/2-1/2"]
[TimeControl "15+0.25"]
[GameDuration "00:00:58"]
[GameEndTime "2020-10-03T11:28:16.905 GTB Daylight Time"]
[GameStartTime "2020-10-03T11:27:18.850 GTB Daylight Time"]
[PlyCount "133"]
[FEN "Q7/8/5p2/4p1k1/5p2/5P1K/3q4/8 b - - 0 1"]
[SetUp "1"]
{--------------
Q . . . . . . .
. . . . . . . .
. . . . . p . .
. . . . p . k .
. . . . . p . .
. . . . . P . K
. . . q . . . .
. . . . . . . .
black to play
--------------}
1... Qd7+ {+7.70/7 4.00} 2. Kg2 {-2.12/13 7.00} Qe6 {+8.02/6 5.50} 3. Qb7
{-2.12/12 7.70} Kh6 {+9.46/6 6.00} 4. Qb2 {-2.11/12 4.10} Qg8+
{+9.27/5 6.00} 5. Kf1 {-2.13/14 6.50} Qc4+ {+10.89/6 4.30} 6. Kf2
{-2.13/14 4.40} Qd4+ {+14.47/6 4.80} 7. Qxd4 {-10.71/25 5.20} exd4
{+11.29/8 3.00} 8. Ke2 {-10.57/17 5.00} Kg5 {+11.40/9 8.70} 9. Kd3
{-10.60/18 5.80} Kh4 {+12.63/8 2.10} 10. Kxd4 {-10.71/21 6.10} Kg3
{+12.65/8 6.30} 11. Ke4 {-10.63/19 4.50} f5+ {+11.93/9 2.90} 12. Ke5
{-10.63/19 5.80} Kxf3 {+12.15/8 4.40} 13. Kxf5 {-10.76/16 5.70} Kg3
{+9.47/9 8.00} 14. Ke4 {-10.68/15 4.10} f3 {+8.88/8 4.20} 15. Ke3
{-10.75/14 5.50} f2 {+10.51/8 5.90} 16. Ke2 {-10.74/12 5.00} Kg2
{+11.03/8 3.30} 17. Ke3 {-10.74/10 4.70} f1=R {+21.64/7 1.10} 18. Ke4
{-5.70/15 3.80} Kg3 {+23.33/6 6.40} 19. Ke3 {-5.67/11 4.80} Kg4
{+20.95/5 5.50} 20. Kd4 {-5.70/14 5.30} Kf4 {+31.48/5 5.40} 21. Kd3
{-5.88/16 3.30} Ra1 {+24.00/5 5.40} 22. Kc4 {-5.88/12 4.20} Ke4
{+28.95/5 5.40} 23. Kc5 {-5.88/12 3.90} Kd3 {+25.38/5 5.40} 24. Kd5
{-5.88/14 3.40} Ra5+ {+25.60/5 5.50} 25. Ke6 {-5.84/10 3.70} Kd4
{+33.27/5 5.40} 26. Kf6 {-5.87/11 3.20} Ke4 {+39.94/5 5.30} 27. Ke6
{-5.84/10 3.50} Ra4 {+39.44/5 5.40} 28. Kd6 {-6.03/16 4.60} Kd4
{+34.76/5 5.50} 29. Ke6 {-5.87/10 2.90} Ra3 {+29.41/5 5.40} 30. Kf5
{-5.97/12 3.30} Ra1 {+29.58/5 5.40} 31. Ke6 {-6.02/14 3.30} Ke4
{+35.36/5 5.40} 32. Kd6 {-5.88/12 3.00} Re1 {+27.05/5 5.30} 33. Kc5
{-5.93/14 4.50} Ke5 {+28.31/5 5.30} 34. Kc4 {-6.22/17 3.50} Ke4
{+26.53/5 5.20} 35. Kc5 {+0.00/59 1.00} Kd3 {+28.08/5 5.00} 36. Kd5
{-5.84/10 4.40} Kc3 {+17.57/5 5.50} 37. Kc5 {-5.84/10 4.80} Rh1
{+26.82/5 5.20} 38. Kd5 {-5.84/10 4.60} Kb4 {+36.82/5 5.30} 39. Ke4
{-5.84/10 4.60} Kc4 {+38.80/5 5.10} 40. Kf4 {-5.88/10 2.80} Rb1
{+28.87/5 5.10} 41. Ke4 {-5.88/10 2.80} Kc3 {+28.93/5 4.90} 42. Ke5
{-5.84/10 4.40} Kd3 {+38.68/5 4.90} 43. Kd5 {-5.84/10 4.40} Rg1
{+25.00/5 4.90} 44. Ke5 {-5.84/10 4.20} Kc4 {+31.73/5 4.80} 45. Kf4
{-5.88/10 3.50} Re1 {+24.54/5 4.70} 46. Kf5 {-5.84/10 4.30} Kd5
{+30.53/5 4.60} 47. Kf4 {-5.88/11 4.30} Kd4 {+23.88/5 4.50} 48. Kg5
{-5.91/12 4.10} Ke4 {+32.04/5 4.20} 49. Kf6 {-5.84/10 4.20} Kf4
{+29.20/5 4.20} 50. Kg6
{-M14/18 0.28}
50... Rd1 {+27.89/5 4.10} 51. Kf6 {-6.02/10 3.20} Ke4 {+26.51/5 3.90} 52.
Ke6 {-5.84/10 4.10} Rh1 {+27.90/5 3.80} 53. Kd6 {-5.88/9 2.50} Kd4
{+34.18/5 3.60} 54. Ke6 {-5.88/10 3.70} Kc5 {+36.94/5 3.60} 55. Ke5
{-5.88/10 4.00} Rh4 {+47.81/5 3.40} 56. Kf5 {-5.90/11 3.80} Ra4
{+34.03/5 3.40} 57. Ke5 {-5.84/9 2.50} Rd4 {+21.92/5 3.30} 58. Kf5
{-6.25/12 2.70} Kd5 {+24.04/5 3.10} 59. Kg6 {-6.14/11 3.70} Ke5
{+34.20/5 3.00} 60. Kf7 {-6.03/10 4.30} Kf5 {+19.44/5 3.10} 61. Ke7
{-M14/19 0.24}
61... Rb4 {+17.12/5 3.00} 62. Kd6 {-5.84/10 3.90} Ra4 {+5.37/5 2.90} 63.
Kd5 {+0.00/10 3.50} Rh4 {+1.52/5 2.80} 64. Kc5 {+0.00/18 2.30} Ke5
{+0.74/5 2.80} 65. Kc6 {+0.00/28 2.50} Rh1 {+0.24/5 2.70} 66. Kc5
{+0.00/34 2.40} Ke4 {+0.00/2 2.70} 67. Kd6 {+0.00/56 7.30} Rh6+
{+0.00/2 2.60}
{Draw by fifty moves rule} 1/2-1/2[/pgn]
It is also completely incomprehensible why it should promote to a Rook instead of a Queen on move 17 of the above game.
[d]8/8/8/8/8/4K3/5pk1/8 b - - 3 17
I am not sure, though, whether this is a problem of the LS15 net used, of the very low depths, or of something else.
I am also not sure whether the comparison is fair, because Syzygy tablebase support was added to LC0 due to some endgame weaknesses
(much more pronounced than in traditional engines). There are also nets already trained for endgames, which could be used for this test.
Moreover, I thought the problem of getting too close to 50-move draws in winning endgames was solved in LC0 long ago?
I haven't worked with LC0 for a long time, and I have not read the Discord channel for a long period either.
https://rwbc-chess.de
trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
-
- Posts: 195
- Joined: Sun Apr 12, 2020 1:09 am
- Full name: Marc-O Moisan-Plante
Re: Policy determining quiet early opening preferences of Leela
It depends on what you mean by endgame play. For me, endgame play is much larger than converting TB positions. It doesn't matter to me if the engine cannot mate with N and B against a bare king, as long as it plays well before that: if it is able to suppress the opponent's counterplay and transform its positional or material advantage to reach an absolutely won position given in the TBs.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Policy determining quiet early opening preferences of Leela
Guenther wrote: ↑Mon Oct 05, 2020 12:16 pm
I checked the games now from the match LC0 vs. Fruit and after this it wasn't necessary anymore to check the SF12 vs. Fruit games.
...
Thanks for this analysis. I am interested in regular nets which in learning use the standard opening position, in order to show that in the openings Lc0 with a net like that is extremely strong, but in endgames no better than a weak by current standards engine. Indeed, such misses as you have found are not very relevant and might be due to the net used or to too short time control. I am currently checking at 100s+1s J92-190 against SF12 from these 50 openings:
Code: Select all
8/6k1/2p1p3/n1P3BP/1p1P4/8/2K5/8 w - - ce 109; acd 33; acs 5.000; c0 "Stockfish 12";
8/2q4k/6p1/7p/3Q4/7P/6P1/1r4BK b - - ce 142; acd 36; acs 5.000; c0 "Stockfish 12";
5r2/3k2p1/3p4/4p3/1P4QP/8/5r2/6K1 b - - ce 219; acd 23; acs 5.000; c0 "Stockfish 12";
8/7p/2P1b2P/4B3/p1kP4/P7/3K4/8 w - - ce 120; acd 50; acs 5.000; c0 "Stockfish 12";
8/6nk/5np1/4Q2p/8/1BK5/5q2/2B5 b - - ce 171; acd 23; acs 5.000; c0 "Stockfish 12";
8/4Qpk1/6p1/1p5r/5PKP/P3R3/7q/8 b - - ce 215; acd 24; acs 5.000; c0 "Stockfish 12";
5rk1/7p/3p2p1/1PnP4/2P5/4K2P/8/R4B2 w - - ce 179; acd 24; acs 5.000; c0 "Stockfish 12";
8/3B4/k4p1p/6p1/1PbpP1P1/5K1P/8/8 b - - ce 180; acd 44; acs 5.000; c0 "Stockfish 12";
8/1p3p1k/6q1/1Q6/p3p3/P6P/2r1B2K/4B3 b - - ce 144; acd 24; acs 5.000; c0 "Stockfish 12";
4b3/6p1/6p1/5p2/7P/3k2P1/2p5/2B2K2 b - - ce 223; acd 42; acs 5.000; c0 "Stockfish 12";
8/3r1k2/7p/5qpP/2Q5/5BP1/6K1/8 b - - ce 159; acd 26; acs 5.000; c0 "Stockfish 12";
8/pp4k1/4p2R/3b4/3Pp3/8/PP6/1K6 w - - ce 116; acd 26; acs 5.000; c0 "Stockfish 12";
8/2p5/8/1pkr1p2/p3r3/P1P2KP1/1P6/5Q2 b - - ce 183; acd 24; acs 5.000; c0 "Stockfish 12";
8/1B1b4/6p1/p2Kp1k1/1p6/1Pb2P2/P7/7R w - - ce 234; acd 28; acs 5.000; c0 "Stockfish 12";
8/P7/6p1/5p2/5P2/3R1kPP/q7/6K1 b - - ce 140; acd 33; acs 5.000; c0 "Stockfish 12";
2R5/pp5p/2p3k1/8/3N2p1/2P5/PPK5/5r2 w - - ce 179; acd 26; acs 5.000; c0 "Stockfish 12";
8/6pk/7p/2q1b2P/4P1P1/1Q1K1P2/8/8 b - - ce 250; acd 30; acs 5.000; c0 "Stockfish 12";
4R3/8/4n1pk/2r5/5PK1/5QP1/4q3/8 b - - ce 247; acd 26; acs 5.000; c0 "Stockfish 12";
8/6k1/8/1p2P2p/2n1BR1P/2r3P1/4K3/8 w - - ce 112; acd 27; acs 5.000; c0 "Stockfish 12";
8/2p4r/1p3k2/1P1p4/P2R1P1p/7K/8/8 b - - ce 250; acd 28; acs 5.000; c0 "Stockfish 12";
8/pp2kn1r/6R1/3P1R2/4r3/P6P/2P3P1/6K1 b - - ce 176; acd 23; acs 5.000; c0 "Stockfish 12";
8/1q4kp/1p3p2/p3p2P/P1P5/6P1/2P5/5QK1 b - - ce 107; acd 24; acs 5.000; c0 "Stockfish 12";
k2r4/7p/4P3/8/Pb6/1P1p1R2/N7/1K6 w - - ce 218; acd 25; acs 5.000; c0 "Stockfish 12";
r7/8/1n1N4/p7/2P1pk2/1P6/1K5P/R7 w - - ce 107; acd 23; acs 5.000; c0 "Stockfish 12";
1r6/4p3/1P1p1k2/7p/7r/8/2Q3P1/6K1 w - - ce 121; acd 25; acs 5.000; c0 "Stockfish 12";
8/1B4p1/3kbp1p/3p4/1P1K2P1/5P2/6P1/8 w - - ce 179; acd 35; acs 5.000; c0 "Stockfish 12";
8/7k/6pp/1Rbq3r/8/4p2P/2Q3PB/7K b - - ce 144; acd 25; acs 5.000; c0 "Stockfish 12";
1R6/5bk1/8/3p1R1p/3P3P/3B1PK1/8/2q5 w - - ce 191; acd 25; acs 5.000; c0 "Stockfish 12";
8/6p1/6Q1/1p4B1/p1n1k2P/P1P1p1PK/1q6/8 b - - ce 180; acd 25; acs 5.000; c0 "Stockfish 12";
4r1k1/1K2Bp1p/6p1/P2pP1P1/8/3b4/1R6/8 w - - ce 117; acd 26; acs 5.000; c0 "Stockfish 12";
4b3/3r2k1/Bb2p1p1/1P2P3/5PK1/4p1P1/2Q5/8 w - - ce 227; acd 24; acs 5.000; c0 "Stockfish 12";
5r2/1Bq1kp1R/1p2p3/2p5/8/8/1P6/1K3Q2 w - - ce 248; acd 24; acs 5.000; c0 "Stockfish 12";
8/8/7p/8/1bN1nk2/3p4/5PKP/5N2 b - - ce 177; acd 30; acs 5.000; c0 "Stockfish 12";
8/4k3/3q4/3b1P1p/4B1pP/2Q3P1/7K/8 w - - ce 118; acd 32; acs 5.000; c0 "Stockfish 12";
3r3k/2K1b2P/7P/4pB2/R3P3/p4P2/8/8 w - - ce 124; acd 44; acs 5.000; c0 "Stockfish 12";
1Q4k1/5p1p/4p3/8/1P5N/5PPK/r2q3P/8 b - - ce 173; acd 27; acs 5.000; c0 "Stockfish 12";
6k1/5p2/b5p1/1pnp4/8/5P2/6KP/1R3B2 w - - ce 244; acd 29; acs 5.000; c0 "Stockfish 12";
2b5/2P3k1/2KBr3/1P4P1/8/8/3Q4/6q1 b - - ce 237; acd 26; acs 5.000; c0 "Stockfish 12";
3b1k2/8/1p4p1/1P2q3/2Pp3P/3Q2P1/2B3K1/8 w - - ce 116; acd 27; acs 5.000; c0 "Stockfish 12";
8/2p5/1pP5/p5p1/P2k1b2/5P1p/2R5/5K2 b - - ce 197; acd 30; acs 5.000; c0 "Stockfish 12";
2R5/5pp1/4k3/2P1p2P/6P1/5P2/2r2K2/8 w - - ce 237; acd 27; acs 5.000; c0 "Stockfish 12";
8/p5r1/kp6/5K2/P1RN1B2/8/8/4b3 w - - ce 207; acd 22; acs 5.000; c0 "Stockfish 12";
1b4k1/1P5p/4N2P/6P1/3KB3/8/8/6r1 w - - ce 149; acd 25; acs 5.000; c0 "Stockfish 12";
7k/4qNbp/1p2p3/p3P3/5Q2/7P/6P1/7K b - - ce 240; acd 28; acs 5.000; c0 "Stockfish 12";
8/5pk1/3Q1pnp/8/3p4/7P/5PP1/3q1BK1 b - - ce 238; acd 30; acs 5.000; c0 "Stockfish 12";
1k6/2b2R2/p1p5/P1P5/1P2Kp2/6p1/8/8 w - - ce 128; acd 49; acs 5.000; c0 "Stockfish 12";
6k1/5r2/R6p/1p1qp3/3n2Q1/6P1/5N1K/8 b - - ce 175; acd 26; acs 5.000; c0 "Stockfish 12";
8/2N3pk/5b1p/3Q3P/6K1/5PP1/8/2q5 w - - ce 190; acd 32; acs 5.000; c0 "Stockfish 12";
8/3nk3/4q1p1/2bR3p/4P2P/p5P1/P1Q3K1/8 b - - ce 116; acd 25; acs 5.000; c0 "Stockfish 12";
Qn3rk1/3p2p1/p6p/7P/P7/6K1/6P1/8 w - - ce 227; acd 35; acs 5.000; c0 "Stockfish 12";
It is early to say anything, but after 10 games I don't see trivial misses by Lc0 J92-190. Let's see. I will post the result and the PGN.
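For anyone who wants to filter or inspect the opening list above, here is a small sketch of how such EPD records can be read. In EPD, `ce` is the centipawn evaluation, `acd` the analysis depth, `acs` the analysis time in seconds and `c0` a free comment. The function name `parse_epd` and the returned dictionary layout are my own choices, and splitting on `;` assumes no semicolon occurs inside a quoted string, which holds for this list.

```python
def parse_epd(line):
    """Split one EPD record into its position fields and its opcodes."""
    fields = line.strip().split(None, 4)
    placement, side, castling, en_passant = fields[:4]
    operations = {}
    rest = fields[4] if len(fields) > 4 else ""
    for op in rest.split(";"):
        op = op.strip()
        if not op:
            continue
        name, _, value = op.partition(" ")
        operations[name] = value.strip().strip('"')
    return {
        "position": " ".join(fields[:4]),
        "side": side,
        "ops": operations,
    }

# Two records copied from the list above.
records = [
    '8/6k1/2p1p3/n1P3BP/1p1P4/8/2K5/8 w - - ce 109; acd 33; acs 5.000; c0 "Stockfish 12";',
    '8/2q4k/6p1/7p/3Q4/7P/6P1/1r4BK b - - ce 142; acd 36; acs 5.000; c0 "Stockfish 12";',
]
parsed = [parse_epd(r) for r in records]
evals = [int(p["ops"]["ce"]) for p in parsed]
print("side to move:", [p["side"] for p in parsed])
print("ce range:", min(evals), "to", max(evals))
```

Run over the whole list, the `ce` range shows how unbalanced the start positions are, which is the property these openings were selected for.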
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Policy determining quiet early opening preferences of Leela
Here in 100 games at 100 + 1 from the new openings:
Code: Select all
Score of SF_12 vs Lc0_J92-190_CUDA: 34 - 29 - 37 [0.525] 100
... SF_12 playing White: 14 - 15 - 21 [0.490] 50
... SF_12 playing Black: 20 - 14 - 16 [0.560] 50
... White vs Black: 28 - 35 - 37 [0.465] 100
Elo difference: 17.4 +/- 54.4, LOS: 73.6 %, DrawRatio: 37.0 %
Finished match
Pair-wise (side and reversed) in 50 pairs SF12 score is +8 -3 =39
Not that conclusive a result from these unbalanced openings too. Moreover, Lc0 manages to beat SF12 in 3 pairs of side and reversed games, which surprises me. Will check overnight Lc0 vs Fruit from these unbalanced openings.
PGN:
https://gofile.io/d/w2WZ8f
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: Policy determining quiet early opening preferences of Leela
MMarco wrote: ↑Mon Oct 05, 2020 12:34 pm
It depends on what you mean by endgame play. For me, endgame play is much larger than converting TB positions. It doesn't matter to me if the engine cannot mate with N and B against a bare king, as long as it plays well before that: if it is able to suppress the opponent's counterplay and transform its positional or material advantage to reach an absolutely won position given in the TBs.
The TBs aren't just converting what the engine (possibly) cannot. The TBs are guiding its earlier play, through the TB hits in the engine's search.
If you really take the position (I certainly do not) that conversion skills don't matter, you should just run engine matches without TBs and adjudicate when they get down to 5 pieces. (The argument is weak anyway, because why should we believe that 5-piece endgames are just "conversion" and don't need "ability to suppress opponent counterplay, transform its advantage", etc.?)
-
- Posts: 4610
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Policy determining quiet early opening preferences of Leela
Laskos wrote: ↑Mon Oct 05, 2020 11:45 pm
Here in 100 games at 100 + 1 from the new openings:
Code: Select all
Score of SF_12 vs Lc0_J92-190_CUDA: 34 - 29 - 37 [0.525] 100
... SF_12 playing White: 14 - 15 - 21 [0.490] 50
... SF_12 playing Black: 20 - 14 - 16 [0.560] 50
... White vs Black: 28 - 35 - 37 [0.465] 100
Elo difference: 17.4 +/- 54.4, LOS: 73.6 %, DrawRatio: 37.0 %
Finished match
Pair-wise (side and reversed) in 50 pairs SF12 score is +8 -3 =39
Not that conclusive a result from these unbalanced openings too. Moreover, Lc0 manages to beat SF12 in 3 pairs of side and reversed games, which surprises me. Will check overnight Lc0 vs Fruit from these unbalanced openings.
PGN:
https://gofile.io/d/w2WZ8f
It seems you changed the LC0 net too? Those games are completely different now and I am sure it has nothing to do with the start positions.
Looking at the depths in the final stage, I would conclude that the difference from the previous match is due much more to the net than
to the time control. This would mean the LS15 net has to be very weak at rudimentary endgames.
(The depth difference to the previous match is just one ply in the final stage. It was 4-5 in the first one and is 5-6 in this one)
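A depth comparison like this can be made less manual with a short script. The sketch below assumes the move comments in the PGN have the form `{score/depth number}` as in the game posted earlier in the thread, and that every move carries such a comment so that sides simply alternate. The names `COMMENT` and `depths_by_side` are hypothetical.

```python
import re
from statistics import mean

# Matches comments such as {+7.70/7 4.00} or {-M14/18 0.28};
# board diagrams and result comments do not match.
COMMENT = re.compile(r"\{\s*([+-]?(?:M\d+|\d+\.\d+))/(\d+)\s+[\d.]+\s*\}")

def depths_by_side(movetext, first_to_move="white"):
    """Collect the reported search depth of each move, split by side."""
    if first_to_move == "white":
        order = ("white", "black")
    else:
        order = ("black", "white")
    depths = {"white": [], "black": []}
    for index, match in enumerate(COMMENT.finditer(movetext)):
        depths[order[index % 2]].append(int(match.group(2)))
    return depths

# First moves of the Fruit_21 vs lc0_LS15 game (Black to play).
sample = (
    "1... Qd7+ {+7.70/7 4.00} 2. Kg2 {-2.12/13 7.00} Qe6 {+8.02/6 5.50} 3. Qb7\n"
    "{-2.12/12 7.70}"
)
result = depths_by_side(sample, first_to_move="black")
print("black (Lc0) mean depth:", mean(result["black"]))
print("white (Fruit) mean depth:", mean(result["white"]))
```

Restricting the input to the last 20 or 30 moves of each game would give the "final stage" depths directly. Note that Lc0 and alpha-beta engines report depth very differently, so only depths of the same engine should be compared across matches.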
https://rwbc-chess.de
trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Policy determining quiet early opening preferences of Leela
Guenther wrote: ↑Tue Oct 06, 2020 10:01 am
It seems you changed the LC0 net too? Those games are completely different now and I am sure it has nothing to do with the start positions.
Looking at the depths in the final stage I would conclude that the difference to the previous match is much more owing to the net instead
of the time control. This would mean the LS15 net has to be very weak at rudimentary endgames.
(The depth difference to the previous match is just one ply in the final stage. It was 4-5 in the first one and is 5-6 in this one)
Yes, I changed the net to the one used in TCEC by the devs. It is trained on the latest T60 games too which is in line with Lc0 framework. Yes, with the old opening positions I would have probably gotten a similar result, just that with these openings I made sure they are not too drawish pair-wise (side and reversed). The depth is not that much higher at this longer TC because this net is larger and 3-3.5 times slower in NPS.
I ran the same match conditions overnight with Lc0 against Fruit, and now Fruit is destroyed (which also shows that the openings are good at discerning superiority):
Code: Select all
Score of Fruit_21 vs Lc0_J92-190_CUDA: 4 - 46 - 50 [0.290] 100
... Fruit_21 playing White: 0 - 24 - 26 [0.260] 50
... Fruit_21 playing Black: 4 - 22 - 24 [0.320] 50
... White vs Black: 22 - 28 - 50 [0.470] 100
Elo difference: -155.5 +/- 47.4, LOS: 0.0 %, DrawRatio: 50.0 %
Finished match
https://gofile.io/d/qQFSN1
Pair-wise score is +40 -0 =10 for Lc0.
All in all, it seems that Lc0 with this net is only mildly weaker than SF12 in endgames at this time control and hardware, say at the level of Komodo or Ethereal, but this has to be checked. That would mean that it underperforms by a couple of hundred Elo points in endgames, not 1000 as I stated earlier. This surprises me; experiments one year ago with Lc0 had a different outcome, IIRC. Lc0 still seems to have a longer path to conversion than the traditional engines, but it usually doesn't miss much now.
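The pair-wise figures quoted in this thread can be derived from the per-game results with a few lines of code. This is a sketch under the assumption that a pair consists of the two games played from the same opening with colors reversed, and that a pair counts as won when the engine scores more than 1 point out of 2, lost when it scores less, and drawn otherwise. The function name and the input layout are hypothetical.

```python
def pairwise_score(pairs):
    """Count won, lost and drawn opening pairs for one engine.

    `pairs` holds one (score_as_white, score_as_black) tuple per opening,
    with each score being 1, 0.5 or 0 from that engine's point of view.
    """
    wins = losses = draws = 0
    for as_white, as_black in pairs:
        total = as_white + as_black
        if total > 1:
            wins += 1
        elif total < 1:
            losses += 1
        else:
            draws += 1
    return wins, losses, draws

# Made-up results for five openings, only to show the counting.
example = [(1, 0.5), (0.5, 0.5), (1, 0), (0, 0.5), (1, 1)]
won, lost, drawn = pairwise_score(example)
print(f"+{won} -{lost} ={drawn}")
```

Counting by pairs removes most of the bias of an unbalanced opening: a win with the favorable side followed by a loss with the unfavorable side is a drawn pair, so only results that differ between the two engines show up as decisive.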