LCZero: Progress and Scaling. Relation to CCRL Elo

Albert Silver · Post by **Albert Silver** » Wed May 30, 2018 8:01 am

Laskos wrote: ↑Wed May 30, 2018 1:56 am
Werewolf wrote: ↑Mon May 28, 2018 10:25 pm To me it looks like stalling. I wonder if this can be resurrected.
From NN342 to NN352 it indeed seems to be stalling against an A/B engine, Arasan 20.5. Here are the results of the gauntlet:
Code:
Games Completed = 1600 of 1600 (Avg game length = 13.001 sec)
Settings = Gauntlet/64MB/100ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 29352 sec elapsed, 0 sec remaining
 1.  Arasan 20.5              	1062.5/1600	872-347-381  	(L: m=347 t=0 i=0 a=0)	(D: r=297 i=55 f=7 s=9 a=13)	(tpm=107.6 d=16.21 nps=1682392)
 2.  Lc0 NN342                	269.0/800	172-434-194  	(L: m=429 t=0 i=5 a=0)	(D: r=149 i=26 f=4 s=6 a=9)	(tpm=108.8 d=1.25 nps=1871)
 3.  Lc0 NN352                	268.5/800	175-438-187  	(L: m=436 t=0 i=2 a=0)	(D: r=148 i=29 f=3 s=3 a=4)	(tpm=108.6 d=1.22 nps=1700)
Lc0 is the cuDNN version with default settings. I am tired of fiddling with its parameters, with performance varying across time controls, and it seems there are some serious bugs anyway.

So I finished the CLOP run after 1500 games, which took a full 4 days to run. Details are: 1m+1s, used id338 (after healing so evals are not crazy), 3 opponents to avoid bias in favor of one (rated around 3080-3090 CCRL each), randomized opening suite. The computer I tested on has a GTX 1060 6GB.

Final result is cPUCT=3.15 and FPU=0.17
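
For readers unfamiliar with the two parameters: the sketch below shows how an exploration constant (cPUCT) and a first-play-urgency (FPU) reduction typically enter an AlphaZero-style PUCT selection rule. It is a simplified illustration, not Lc0's actual code. The function names, the dict layout of `children`, and the treatment of FPU=0.17 as a reduction subtracted from the parent's Q are all assumptions made for the example.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, cpuct):
    """Generic PUCT score: value estimate Q plus an exploration bonus U."""
    u = cpuct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u


def select_child(children, parent_q, parent_visits, cpuct=3.15, fpu_reduction=0.17):
    """Return the index of the child with the highest PUCT score.

    children: list of dicts with keys 'prior', 'visits', 'value_sum'
    (a hypothetical layout chosen for this sketch).
    """
    def score(child):
        if child["visits"] > 0:
            q = child["value_sum"] / child["visits"]
        else:
            # Unvisited child: no Q yet, so assume parent Q minus the FPU reduction.
            q = parent_q - fpu_reduction
        return puct_score(q, child["prior"], parent_visits, child["visits"], cpuct)

    return max(range(len(children)), key=lambda i: score(children[i]))


children = [
    {"prior": 0.6, "visits": 10, "value_sum": 5.0},
    {"prior": 0.4, "visits": 0, "value_sum": 0.0},
]
# A large cpuct favors the unvisited move; cpuct=0 falls back to pure Q.
print(select_child(children, parent_q=0.5, parent_visits=10))
print(select_child(children, parent_q=0.5, parent_visits=10, cpuct=0.0))
```

A higher cPUCT widens the search toward moves the policy head likes but that have few visits, while a larger FPU reduction makes never-visited moves look worse by default, which is why the two values interact with how many nodes the engine gets per move.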

Laskos · Post by **Laskos** » Wed May 30, 2018 1:23 pm

Albert Silver wrote: ↑Wed May 30, 2018 8:01 am
So I finished the CLOP run after 1500 games, which took a full 4 days to run. Details are: 1m+1s, used id338 (after healing so evals are not crazy), 3 opponents to avoid bias in favor of one (rated around 3080-3090 CCRL each), randomized opening suite. The computer I tested on has a GTX 1060 6GB.

Final result is cPUCT=3.15 and FPU=0.17

Thanks Albert for your great, lengthy work. I will put your settings compared to default against Arasan 20.5 in a gauntlet of 1600 ultra-fast games total, to see if the settings help. One of the problems of this LCzero in general is that it is scaling in some uncharted ways, and the settings may vary with time control, but if I will get something with your settings in ultra-fast games, then some confidence of fast testing may open new ways to optimize it. I will post the results in some maybe 7 hours, it still takes some time.

Milos · Post by **Milos** » Wed May 30, 2018 1:42 pm

Laskos wrote: ↑Wed May 30, 2018 1:23 pm Thanks Albert for your great, lengthy work. I will put your settings compared to default against Arasan 20.5 in a gauntlet of 1600 ultra-fast games total, to see if the settings help. One of the problems of this LCzero in general is that it is scaling in some uncharted ways, and the settings may vary with time control, but if I will get something with your settings in ultra-fast games, then some confidence of fast testing may open new ways to optimize it. I will post the results in some maybe 7 hours, it still takes some time.

I don't believe changing PUCT from 3 to 3.15 or FPUR from -0.05 to 0.15 would have an impact anywhere near large enough for you to measure reliably with your sample of games. The idea that you can actually optimize a relatively complex thing such as UCT working in conjunction with a complex two-head ResNet with just 2 fixed numbers is by itself pretty ridiculous. But anyway, good luck burning electricity.
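
How large a difference can a gauntlet of this size resolve? A rough sketch, assuming the usual logistic Elo model and independent games, estimates the 95% margin from a win/loss/draw record. The function names are made up for illustration, and the W/L/D figures are taken from the Lc0 NN342 line of the gauntlet quoted earlier in the thread.

```python
import math


def elo_from_score(p):
    """Logistic Elo difference implied by a score fraction p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_margin(wins, losses, draws, z=1.96):
    """Approximate Elo margin (95% by default) from one engine's W/L/D record."""
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n
    # Per-game score variance, with a draw counted as 0.5 points.
    var = (wins + 0.25 * draws) / n - p * p
    se = math.sqrt(var / n)
    # Delta method: scale the score error by the slope of the Elo curve at p.
    slope = 400.0 / (math.log(10) * p * (1.0 - p))
    return z * se * slope


margin = elo_margin(172, 434, 194)  # Lc0 NN342: 172 wins, 434 losses, 194 draws
print(f"NN342: {elo_from_score(269.0 / 800):.0f} Elo vs Arasan, +/- {margin:.0f}")
```

With about 800 games per engine the margin comes out on the order of 20 Elo, so a setting change worth only a few Elo would not show up at this sample size.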

JJJ · Post by **JJJ** » Wed May 30, 2018 2:18 pm

Laskos wrote: ↑Wed May 30, 2018 1:23 pm
Thanks Albert for your great, lengthy work. I will put your settings compared to default against Arasan 20.5 in a gauntlet of 1600 ultra-fast games total, to see if the settings help. One of the problems of this LCzero in general is that it is scaling in some uncharted ways, and the settings may vary with time control, but if I will get something with your settings in ultra-fast games, then some confidence of fast testing may open new ways to optimize it. I will post the results in some maybe 7 hours, it still takes some time.

I'm testing these settings with Lc0 cuDNN id 357 against Houdini 1.5 at 1 min + 1 sec. From my point of view, Lc0 is now much better in attack. It sees wins many moves before Houdini does, and it no longer seems to mess up these winning positions like it did.

Laskos · Post by **Laskos** » Wed May 30, 2018 3:10 pm

Milos wrote: ↑Wed May 30, 2018 1:42 pm
I don't believe changing PUCT from 3 to 3.15 or FPUR from -0.05 to 0.15 would have an impact anywhere near large enough for you to measure reliably with your sample of games. The idea that you can actually optimize a relatively complex thing such as UCT working in conjunction with a complex two-head ResNet with just 2 fixed numbers is by itself pretty ridiculous. But anyway, good luck burning electricity.

No, the defaults are PUCT 1.2 and FPUR 0.20. Let's see, although I don't expect a large improvement. But earlier CLOP results were marred by many issues.

Werewolf · Post by **Werewolf** » Wed May 30, 2018 8:01 pm

CMCanavessi wrote: ↑Mon May 28, 2018 10:33 pm
Werewolf wrote: ↑Mon May 28, 2018 10:25 pm To me it looks like stalling. I wonder if this can be resurrected.
How can you "resurrect" something that's not dead?

Well.....323 to the present looks a lot like stalling to me. Even the inflated self-play elo isn't going up. Maybe it's a temporary stall...maybe it's not.

Laskos · Post by **Laskos** » Wed May 30, 2018 9:05 pm

Laskos wrote: ↑Wed May 30, 2018 1:23 pm Thanks Albert for your great, lengthy work. I will put your settings compared to default against Arasan 20.5 in a gauntlet of 1600 ultra-fast games total, to see if the settings help. One of the problems of this LCzero in general is that it is scaling in some uncharted ways, and the settings may vary with time control, but if I will get something with your settings in ultra-fast games, then some confidence of fast testing may open new ways to optimize it. I will post the results in some maybe 7 hours, it still takes some time.

I played 1500 games, probably I typed 1500 instead of 1600, but it doesn't matter, the result seems conclusive:

Code:

    Program                            Score      %      Elo    +   -    Draws

  1 Arasan 20.5                    : 997.0/1500  66.5      60   16  16   21.2 %
  2 Lc0 357 default                : 272.5/ 750  36.3     -38   23  23   20.7 %
  3 Lc0 357 Albert CLOP            : 230.5/ 750  30.7     -81   23  23   21.7 %

So, the new CLOP settings are 43 +/- 32 Elo points weaker than the default settings at 0.1 s/move. Probably settings depend on time control (and hardware, but we have identical GTX 1060 6GB), as I myself observed. It's a pain playing with them.
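
The 43 +/- 32 figure can be reproduced from the table, assuming the two error bars are independent and combine in quadrature (an assumption; the rating tool may compute it differently). The helper name is hypothetical.

```python
import math


def elo_difference(elo_a, err_a, elo_b, err_b):
    """Difference of two ratings with error bars added in quadrature."""
    diff = elo_a - elo_b
    err = math.sqrt(err_a ** 2 + err_b ** 2)  # assumes independent errors
    return diff, err


# Values from the table: CLOP settings -81 +/- 23, default settings -38 +/- 23.
diff, err = elo_difference(-81, 23, -38, 23)
print(f"CLOP minus default: {diff} +/- {err:.1f} Elo")
# prints: CLOP minus default: -43 +/- 32.5 Elo
```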

Milos · Post by **Milos** » Thu May 31, 2018 2:38 am

Laskos wrote: ↑Wed May 30, 2018 9:05 pm
I played 1500 games, probably I typed 1500 instead of 1600, but it doesn't matter, the result seems conclusive:
Code:
    Program                            Score      %      Elo    +   -    Draws

  1 Arasan 20.5                    : 997.0/1500  66.5      60   16  16   21.2 %
  2 Lc0 357 default                : 272.5/ 750  36.3     -38   23  23   20.7 %
  3 Lc0 357 Albert CLOP            : 230.5/ 750  30.7     -81   23  23   21.7 %
So, the new CLOP settings are 43 +/- 32 Elo points weaker than the default settings at 0.1 s/move. Probably settings depend on time control (and hardware, but we have identical GTX 1060 6GB), as I myself observed. It's a pain playing with them.

And if you ran it at the same TC as Adam, they'd be better than default, and then again if you ran it at a longer TC they might become worse again.
You can't model different depths with 2 fixed parameters. It is hopeless, and I really don't understand how ppl don't see such a simple thing. Imagine if A/B search, instead of null move, LMR, and all kinds of depth-dependent tricks, had only 2 fixed numbers that control the shape of the search tree. How well do you think it would work?

Albert Silver · Post by **Albert Silver** » Thu May 31, 2018 3:27 am

Laskos wrote: ↑Wed May 30, 2018 9:05 pm
I played 1500 games, probably I typed 1500 instead of 1600, but it doesn't matter, the result seems conclusive:
Code:
    Program                            Score      %      Elo    +   -    Draws

  1 Arasan 20.5                    : 997.0/1500  66.5      60   16  16   21.2 %
  2 Lc0 357 default                : 272.5/ 750  36.3     -38   23  23   20.7 %
  3 Lc0 357 Albert CLOP            : 230.5/ 750  30.7     -81   23  23   21.7 %
So, the new CLOP settings are 43 +/- 32 Elo points weaker than the default settings at 0.1 s/move. Probably settings depend on time control (and hardware, but we have identical GTX 1060 6GB), as I myself observed. It's a pain playing with them.

You're aware that the engine already has a 100ms (0.1s) overhead in the UCI settings, so basically you are running instant moves. I don't doubt your results, but I think you must be aware already that Leela does not scale in any way like normal A/B engines. My testing was already at an average of 1-2 seconds per move. For me an engine is for analyzing and playing, and I don't know anyone who does either at a pace of microseconds, so I see no point in tuning for that.

pohl4711 · Post by **pohl4711** » Thu May 31, 2018 4:55 am

At the moment, I am playing two gauntlets on my third notebook: Komodo 5 vs. Leela Cuda with default settings and Komodo 5 vs. Leela Cuda with the older CLOP settings (cpuct=3.168, fpu reduction=-0.0683), at 5'+3" thinking time. I want to play at least 200 games each. When this is done, I will try your new CLOP-tuned values, too. It is interesting that both CLOP tunings gave nearly the same cpuct value (around 3.1). Only the fpu reduction is very different.

Stefan (SPCC)
