LCZero: Progress and Scaling. Relation to CCRL Elo

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

noobpwnftw
Posts: 429
Joined: Sun Nov 08, 2015 10:10 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by noobpwnftw » Sun May 20, 2018 9:32 pm

Are you sure that the new networks performing better at low node counts but scaling worse is not due to over-fitting?

jkiliani
Posts: 143
Joined: Wed Jan 17, 2018 12:26 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by jkiliani » Sun May 20, 2018 9:44 pm

noobpwnftw wrote:
Sun May 20, 2018 9:32 pm
Are you sure that the new networks performing better at low node counts but scaling worse is not due to over-fitting?
The value head of recent nets used to be very slightly worse than that of 237. With 321, this may no longer be the case, according to some measurements. The reduced quality of the value head after the regression phase was indeed due to over-fitting, but it has been recovering since 287 and improving ever since. Once it has completely recovered (which may or may not already have happened), the scaling properties should also be fully restored, and the new nets will be better at any time control.

User avatar
Laskos
Posts: 10197
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos » Sun May 20, 2018 10:05 pm

jkiliani wrote:
Sun May 20, 2018 8:14 pm
Laskos wrote:
Sun May 20, 2018 6:24 pm
It seems to be a bit more complicated. At 1'+ 1'' on a GTX 1060 with LC0 CUDA, the latest nets seem even stronger than NN237. But at 15'+ 15'', NN237 seems stronger. I left NN319 playing against Komodo 10.2; it lost 5 games in a row due to tactical blunders, and the eval graph was also unstable. I interrupted the match and reverted to NN237, and in the 5 games so far there are 2 Komodo wins and 3 draws, with only one game lost due to a blunder. Still waiting for a win by LC0 in 10 games. The sample is too small, but I saw a similar thing in games against Houdini 1.5a. It seems NN237 scales better with TC or playouts, having a better value-head eval. It is strange, since at nodes=1 the latest nets are some 150-200 Elo points stronger than NN237. Really, they should roll back to the v0.7 engine; the current nets are being trained in some schizophrenic way with v0.10.
There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties. I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 NN320 is almost 200 Elo points stronger than NN237, yet scales worse and is still tactically weaker. There seems to be plenty of room for improvement with the 192x15 net and a bug-fixed engine.

I already got a win from LC0 CUDA NN237 in game 7 against Komodo 10.2 at 15'+ 15'' time control. The score for LC0 against Komodo 10.2 is +1 -3 =3, or about a 100 Elo point difference, putting LC0 NN237 under these longer time control conditions above the 3200 CCRL 40/4' Elo level, as in the games in similar conditions against Houdini 1.5a. Keep in mind that the longer the TC, the better the rating of LC0 (at least with the NN237 value head), because it scales better.

Here is the game won by LC0 against Komodo 10.2. Komodo is a tough opponent, as it has a very good eval of the initial phases of the game and of imbalances, where LC0 usually gains large advantages against weaker opponents. But in this game, Komodo 10.2 was pretty clueless about what was happening up to move 35, believing it had a large advantage while the game was going very well for LC0. The match will end in 2-3 hours with a total of 10 games.

Anyway, I am pretty amazed by these performances at longer TC against Houdini 1.5a and Komodo 10.2, some of the recent top dogs, and objections that the samples are too small don't impress me. All in all (I have some other games played), LC0 CUDA NN237 with Albert's settings on a GTX 1060 6GB with 2 CPU threads performs unexpectedly well at longer TC.

[Event "My Tournament"]
[Site "?"]
[Date "2018.05.20"]
[Round "7"]
[White "LC0_GPU_CUDA"]
[Black "Komodo 10.2"]
[Result "1-0"]
[FEN "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPPQPPP/RNB1KB1R w KQkq - 0 1"]
[PlyCount "151"]
[SetUp "1"]
[TimeControl "900+15"]

1. g3 {+0.09/2 49s} Bc5 {+0.10/25 58s} 2. c3 {+0.11/2 52s} Nf6 {+0.06/25 37s}
3. Bg2 {+0.14/2 34s} O-O {+0.07/26 42s} 4. b4 {+0.15/2 46s} Bb6 {+0.02/25 49s}
5. O-O {+0.14/2 63s} a6 {+0.09/26 44s} 6. d3 {+0.15/2 42s} Re8 {+0.10/26 49s}
7. Bg5 {+0.22/2 39s} d6 {+0.07/26 65s} 8. Nbd2 {+0.21/2 34s} h6 {+0.08/27 47s}
9. Bh4 {+0.14/2 60s} g5 {+0.74/24 25s} 10. Nxg5 {0.00/2 20s} hxg5 {+0.38/25 44s}
11. Bxg5 {0.00/2 8.8s} Bg4 {+0.58/24 15s} 12. Bf3 {+0.04/2 38s}
Be6 {+0.52/24 45s} 13. Nc4 {+0.20/2 38s} Ba7 {+0.41/25 62s}
14. Ne3 {+0.29/2 42s} Kg7 {+0.43/24 16s} 15. Ng4 {+0.25/2 22s}
Bxg4 {+0.57/25 20s} 16. Bxg4 {+0.22/2 6.7s} Rh8 {+0.49/25 30s}
17. Kg2 {+0.21/2 31s} Qe8 {+0.44/25 73s} 18. f4 {+0.20/2 34s}
Nxg4 {+0.69/25 18s} 19. Qxg4 {+0.73/2 32s} Qc8 {+0.50/26 31s}
20. f5 {+0.59/2 33s} Kf8 {+0.80/21 15s} 21. Rad1 {+0.46/2 44s}
Qd7 {+0.72/25 45s} 22. h4 {+0.86/2 34s} Rh7 {+0.74/24 17s} 23. a4 {+0.67/2 48s}
Re8 {+0.62/23 63s} 24. Qf3 {+0.54/2 48s} Rh8 {+0.62/25 45s}
25. Qe2 {+0.71/2 42s} Ne7 {+0.71/24 35s} 26. f6 {+0.66/2 32s} Nc8 {+0.64/23 18s}
27. Qd2 {+0.83/2 47s} Rd8 {+0.76/22 13s} 28. Bh6+ {+0.72/2 21s}
Ke8 {+0.92/24 10s} 29. Bg7 {+0.73/2 9.1s} Rh7 {+1.03/26 18s}
30. Qe2 {+0.81/2 12s} Qxa4 {+1.12/23 21s} 31. h5 {+1.02/2 28s}
Qd7 {+0.43/24 54s} 32. Qf3 {+1.03/2 24s} Qe6 {+0.95/20 13s} 33. h6 {+1.12/2 17s}
Nb6 {+1.12/23 39s} 34. d4 {+1.22/2 9.0s} c6 {+0.44/22 59s} 35. g4 {+1.42/2 26s}
exd4 {-0.18/25 78s} 36. cxd4 {+1.76/2 25s} Kd7 {-0.62/25 68s}
37. g5 {+1.67/2 17s} Rdh8 {-0.90/21 27s} 38. Qh3 {+2.49/2 36s}
Qxh3+ {-0.52/22 25s} 39. Kxh3 {+2.83/2 25s} Nc4 {-0.67/25 33s}
40. Rfe1 {+2.94/2 42s} Nb2 {-0.12/26 17s} 41. e5 {+3.18/2 17s}
Ke6 {-1.07/20 8.3s} 42. exd6+ {+2.95/2 33s} Kxd6 {-1.49/22 7.5s}
43. Rd2 {+3.05/2 5.0s} Nc4 {-1.40/24 5.7s} 44. Rdd1 {+3.22/2 5.6s}
Nb2 {-1.39/26 16s} 45. Rb1 {+4.05/2 20s} Bxd4 {-1.29/24 17s}
46. Re4 {+4.22/2 17s} c5 {-1.35/24 11s} 47. bxc5+ {+5.93/2 23s}
Kxc5 {-1.70/24 37s} 48. Rxd4 {+6.43/2 15s} Kxd4 {-2.95/19 4.4s}
49. Rxb2 {+6.61/2 8.8s} b5 {-4.63/22 27s} 50. Kg4 {+6.98/2 12s}
Rb8 {-4.35/22 7.1s} 51. g6 {+8.27/2 23s} fxg6 {-6.32/24 23s}
52. f7+ {+8.92/2 35s} Kd5 {-6.48/20 4.4s} 53. Re2 {+10.80/2 32s}
Rf8 {-7.31/22 12s} 54. Bxf8 {+11.30/2 16s} Rxf7 {-7.52/24 6.6s}
55. Bg7 {+11.45/2 7.9s} Rf1 {-12.01/25 36s} 56. Rh2 {+11.67/2 17s}
Rg1+ {-12.07/24 5.3s} 57. Kf4 {+12.07/2 31s} Rf1+ {-12.07/20 19s}
58. Kg5 {+12.06/2 20s} Rg1+ {-12.07/25 7.0s} 59. Kf6 {+12.08/2 10s}
Rf1+ {-12.07/21 16s} 60. Kxg6 {+12.91/2 22s} Rg1+ {-12.10/26 12s}
61. Kf7 {+13.08/2 14s} Rc1 {-250.00/25 22s} 62. Rh5+ {+12.87/2 33s}
Ke4 {-250.00/21 18s} 63. h7 {+12.75/2 19s} Rc8 {-M40/23 18s}
64. Re5+ {+16.10/2 27s} Kf3 {-M34/21 1.1s} 65. Re8 {+17.93/2 22s}
Rxe8 {-M32/21 0.95s} 66. Kxe8 {+17.83/2 13s} Ke3 {-M32/21 1.5s}
67. Kd7 {+19.28/2 31s} Kf4 {-M30/20 1.6s} 68. h8=Q {+20.38/2 42s}
b4 {-M26/20 2.6s} 69. Ke6 {+21.24/3 26s} Kg3 {-M16/20 3.1s}
70. Kf5 {+25.35/2 18s} Kf2 {-M10/35 1.7s} 71. Ke4 {+35.99/3 13s}
Kg2 {-M8/99 0.59s} 72. Be5 {+51.55/2 14s} b3 {-M6/99 0.037s}
73. Qh2+ {+M75/2 13s} Kf1 {-M4/5 0s} 74. Kf3 {+122.28/2 9.9s} a5 {-M2/99 0.008s}
75. Qe2+ {+127.03/2 19s} Kg1 {-M2/5 0s} 76. Qg2# {+128.00/2 12s, White mates}
1-0

Albert Silver
Posts: 2894
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Albert Silver » Sun May 20, 2018 10:42 pm

Laskos wrote:
Sun May 20, 2018 10:05 pm
jkiliani wrote:
Sun May 20, 2018 8:14 pm
Laskos wrote:
Sun May 20, 2018 6:24 pm
It seems to be a bit more complicated. At 1'+ 1'' on a GTX 1060 with LC0 CUDA, the latest nets seem even stronger than NN237. But at 15'+ 15'', NN237 seems stronger. I left NN319 playing against Komodo 10.2; it lost 5 games in a row due to tactical blunders, and the eval graph was also unstable. I interrupted the match and reverted to NN237, and in the 5 games so far there are 2 Komodo wins and 3 draws, with only one game lost due to a blunder. Still waiting for a win by LC0 in 10 games. The sample is too small, but I saw a similar thing in games against Houdini 1.5a. It seems NN237 scales better with TC or playouts, having a better value-head eval. It is strange, since at nodes=1 the latest nets are some 150-200 Elo points stronger than NN237. Really, they should roll back to the v0.7 engine; the current nets are being trained in some schizophrenic way with v0.10.
There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties. I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 the NN320 is almost 200 Elo points stronger than NN237, but scales worse and tactically it is still weaker.
Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting is that the 20x256 net scored the same in tactics as NN237, when both are tested with LC0 Optimized. Frankly, I had not expected this at half the speed.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."

jp
Posts: 1331
Joined: Mon Apr 23, 2018 5:54 am

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by jp » Mon May 21, 2018 8:45 am

Albert Silver wrote:
Sun May 20, 2018 10:42 pm
Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting is that the 20x256 net scored the same in tactics as NN237, when both are tested with LC0 Optimized. Frankly, I had not expected this at half the speed.
What is the 20x256 Net??

User avatar
Laskos
Posts: 10197
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos » Tue May 22, 2018 7:01 am

Albert Silver wrote:
Sun May 20, 2018 10:42 pm
Laskos wrote:
Sun May 20, 2018 10:05 pm
jkiliani wrote:
Sun May 20, 2018 8:14 pm

There was a discussion about rollback on Discord yesterday, it isn't happening. At low node counts (800), current nets are far stronger than Id 237, although as you observed they don't scale quite as well (yet). But the quality of the value head is still improving, which is also the deciding factor in determining scaling properties. I'm not too worried this won't fix itself in the end, since we're going to upgrade to a 256x20 network eventually when there's no more improvement on the 192x15 architecture. Lc0 beating Komodo on your setup may not be happening yet, but I'm optimistic that it will soon, either still on 192x15 or at the latest once we go 256x20 (the AlphaZero size).
I hope the quality of the value head will continue to improve substantially, as for now it seems very strange that at nodes=1 the NN320 is almost 200 Elo points stronger than NN237, but scales worse and tactically it is still weaker.
Not sure why, but NN321 is now the best in tactics, beating NN237 in WAC Revised by one position. Also interesting is that the 20x256 net scored the same in tactics as NN237, when both are tested with LC0 Optimized. Frankly, I had not expected this at half the speed.
The later NNs are already at some 3200 CCRL Elo level on the GTX 1060, even at the short 1m + 1s time control. Here is the result against Houdini 1.5a (3170 CCRL):

Code: Select all

1m + 1s
Score of LC0_CUDA_NN322 vs Houdini 1.5a: 43 - 33 - 24 [0.550]
Elo difference: 34.86 +/- 60.09

100 of 100 games finished.
At a TC like 15m + 15s, the CCRL rating is even higher, maybe 3250 or so. The snag is that these later nets do not scale as well with TC (or playouts) as NN237.
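As a side note on the numbers in the match result above: the reported Elo difference follows from the score fraction via the standard logistic Elo model. A minimal sketch (an illustration, not the exact tool used for the match) that reproduces the 34.86 figure from the 43 - 33 - 24 score:

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a match score, using the standard
    logistic model: expected score p = 1 / (1 + 10**(-diff/400))."""
    games = wins + losses + draws
    p = (wins + 0.5 * draws) / games  # draws count as half a point
    return -400 * math.log10(1 / p - 1)

# 43 wins, 33 losses, 24 draws -> score fraction 0.550
print(round(elo_diff(43, 33, 24), 2))  # matches the reported 34.86
```

The same formula gives roughly -100 Elo for the +1 -3 =3 score against Komodo mentioned earlier in the thread; the +/- 60 error bar reflects the small 100-game sample.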

jkiliani
Posts: 143
Joined: Wed Jan 17, 2018 12:26 pm

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by jkiliani » Tue May 22, 2018 7:49 pm

Laskos wrote:
Tue May 22, 2018 7:01 am
At a TC like 15m + 15s, the CCRL rating is even higher, maybe 3250 or so. The snag is that these later nets do not scale as well with TC (or playouts) as NN237.
You might try Id 329 sometime. A test of the value head today yielded the best result of any network so far, for 329, which is a very promising indication that the net may also scale very well.
Attachments
Value329.png

User avatar
Laskos
Posts: 10197
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Laskos » Tue May 22, 2018 9:45 pm

jkiliani wrote:
Tue May 22, 2018 7:49 pm
Laskos wrote:
Tue May 22, 2018 7:01 am
At a TC like 15m + 15s, the CCRL rating is even higher, maybe 3250 or so. The snag is that these later nets do not scale as well with TC (or playouts) as NN237.
You might try Id 329 sometime. A test of the value head today yielded the best result of any network so far, for 329, which is a very promising indication that the net may also scale very well.
Yes, I was curious myself, and upon arriving home I played LTC games to check a bit; they take time. Here is the result (I found it sufficiently interesting to post in a new thread):
http://talkchess.com/forum3/viewtopic.php?f=2&t=67537
It seems that by now the newest nets are the best at all time controls.

Kanizsa
Posts: 32
Joined: Mon Feb 20, 2017 7:29 am
Location: Rialto, Venice

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Kanizsa » Thu May 24, 2018 11:25 am

Laskos wrote:
Sun Apr 01, 2018 9:37 am
peter wrote: Hi Robin!
CheckersGuy wrote: That's indeed a very impressive result, but that's probably what neural nets are good at. It's kind of interesting: weaker traditional alpha-beta engines are decent at tactics and suffer from bad positional play, while with Leela0 it's the other way around :lol:
LC0 seems already close to very strong engines in this opening suite. At this pace of advancement in positional understanding, I will be very curious how it develops.
Hi Kai,
what about your latest experiments with this opening suite?
Are the latest nets of LC0 (those >300) positionally better than Stockfish & Komodo?

Albert Silver
Posts: 2894
Joined: Wed Mar 08, 2006 8:57 pm
Location: Rio de Janeiro, Brazil

Re: LCZero: Progress and Scaling. Relation to CCRL Elo

Post by Albert Silver » Thu May 24, 2018 2:11 pm

Kanizsa wrote:
Thu May 24, 2018 11:25 am
Laskos wrote:
Sun Apr 01, 2018 9:37 am
Hi Robin!

LC0 seems already close to very strong engines in this opening suite. At this pace of advancement in positional understanding, I will be very curious how it develops.
Hi Kai,
what about your latest experiments with this opening suite?
Are the latest nets of LC0 (those >300) positionally better than Stockfish & Komodo?
Offhand, I'd say maybe, but that is a very speculative maybe. One cannot remove tactics from the equation, so oversights in its calculations will affect its decisions. An argument such as "this would be a great move if... it didn't lose a piece" holds no water in my book.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."

Post Reply