Crazy good LTC Kiudee "mod" setting just by adjusting CPuct

Laskos · Post by **Laskos** » Fri Jan 31, 2020 10:27 am

I had this post which showed a significant improvement with Kiudee Lc0 settings over the defaults:

Laskos wrote: ↑Mon Jan 27, 2020 4:51 pm It's good that you opened this thread. Yes, the Kiudee settings hold up well across a wide range of conditions. I will present some current results with one of the latest T59 nets on my OC-ed RTX 2070 GPU. With Lc0 0.23.2, these 128x10b nets hit the ceiling of about 100k NPS of the engine's (pseudo) MCTS search. Smaller nets like 64x6 or 48x5 run at basically the same speed because of this ceiling. Since the number of nodes per move is what matters most for deriving LTC behavior, I used the T59 run, which is strong by now.

First, a bit about your result: you probably use "too good, too balanced" openings, which translate into a very high draw rate. Combined with small samples (100 games), your results are hardly conclusive. Using unbalanced openings (played pair-wise, each opening with sides reversed, to diminish the noise) and more games (500 per match), I got reasonable draw rates and small error margins. The error margins are smaller than the naive ones because of the 5-nomial variance for unbalanced openings, discussed here:
https://www.chessprogramming.org/index. ... l_Analysis

I also compute the 5-nomial Normalized Elo, as simple Elo differences tend to compress with increasing time control. Normalized Elo is described here:
http://hardy.uhasselt.be/Toga/normalized_elo.pdf

I first played matches at the ultra-fast time control of 6s + 0.1s to see whether the Kiudee settings give an advantage. The matches were both self-play and against different opponents. In self-play, the Elo bonus from the Kiudee settings was inflated by about 20%. I tried T40, SV, LS, T59 and T60 nets with and without the Kiudee settings, with SF in the mix as an opponent too. In these ultra-fast games, I got a bonus of about 45-50 Elo in self-play and about 40 Elo against a different opponent. The bonus is maybe a tiny bit larger for T59 and T60 nets, probably because they are trained with different settings (noise etc.). It is worth noting that it is important to test at equal time controls, not at equal nodes, even in self-play, because the Kiudee settings can affect the NPS behavior.

To check the time control behavior, I used three time controls: 6s + 0.1s, corresponding to about 20k npm on average; 30s + 0.3s, about 100k npm; and 120s + 1.2s, about 400k npm. I used a late T59 net in self-play, Kiudee versus default. Here are the results in matches of 500 games each:

Kiudee settings:
CPuct=2.147
FpuValue=0.443
PolicyTemperature=1.607
CPuctBase=18368
CPuctFactor=2.815

TC: 6s + 0.1s (about 20k nodes per move)
Score of T_59_Kiudee vs T_59: 147 - 77 - 276 [0.570]
Elo difference: 49.0 +/- 8.2, LOS: 100.0 %, DrawRatio: 55.2 %
Normalized Elo: 0.270

TC: 30s + 0.3s (about 100k nodes per move)
Score of T_59_Kiudee vs T_59: 125 - 57 - 318 [0.568]
Elo difference: 47.5 +/- 7.2, LOS: 100.0 %, DrawRatio: 63.6 %
Normalized Elo: 0.299

TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee vs T_59: 121 - 66 - 313 [0.555]
Elo difference: 38.4 +/- 7.1, LOS: 100.0 %, DrawRatio: 62.6 %
Normalized Elo: 0.245

The Elo error margins shown are 1 SD 5-nomial.
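For comparison with the 5-nomial margins, here is a minimal sketch of the naive calculation from the win/loss/draw counts alone. It assumes the usual logistic Elo formula and treats every game as independent (a trinomial model); the function names are mine, and this is not necessarily how the figures above were produced.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def trinomial_elo(wins, losses, draws):
    """Elo difference and naive 1 SD error, treating games as independent."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    # Delta method: d(Elo)/d(score) = 400 / (ln(10) * score * (1 - score)).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return elo_from_score(score), slope * math.sqrt(var / n)

# (wins, losses, draws) copied from the three matches above.
matches = {
    "6s + 0.1s": (147, 77, 276),
    "30s + 0.3s": (125, 57, 318),
    "120s + 1.2s": (121, 66, 313),
}
for tc, (w, l, d) in matches.items():
    elo, sd = trinomial_elo(w, l, d)
    print(f"{tc}: Elo {elo:.1f} +/- {sd:.1f} (naive 1 SD)")
```

The Elo differences reproduce the reported ones. The naive 1 SD for the first match comes out at roughly 10.4 Elo, against the 8.2 quoted under the 5-nomial model, which shows how much the pairing tightens the margins.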

The scaling of the Kiudee settings is given by the 5-nomial Normalized Elo. Although the difference is within the 95% confidence interval, the Kiudee bonus seems to decrease a bit, with some 80% confidence, going from 100k npm to 400k npm. It decreases just a bit, and 400k npm on my RTX 2070 with T60 net is already a slow rapid, not far from real LTC.
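One way to arrive at a figure near 80% is sketched below. It assumes the standard error of a Normalized Elo estimate is about 1/sqrt(games), ignoring the uncertainty in the estimated variance, and that the two matches are independent. The function name is hypothetical and the original calculation may have differed.

```python
import math

def prob_first_larger(nelo_a, nelo_b, games_a, games_b):
    """Approximate probability that match A's true Normalized Elo exceeds B's."""
    # Assumed standard error of each estimate: 1 / sqrt(games).
    se_diff = math.sqrt(1.0 / games_a + 1.0 / games_b)
    z = (nelo_a - nelo_b) / se_diff
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Normalized Elo at 100k npm versus 400k npm, 500 games each.
p = prob_first_larger(0.299, 0.245, 500, 500)
print(f"{p:.2f}")
```

Under these assumptions the probability comes out at about 0.80.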

To check whether that small decrease has a simple cure: of the very few facts I knew about the settings, one was that a somewhat larger CPuct is recommended for LTC. So, now under test is a 120s + 1.2s match of 500 games with the Kiudee settings and CPuct = 2.600 instead of CPuct = 2.147.

It's too early, with one more day to go, and things might change a bit, but a provisional result is here:
TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee_mod vs T_59: 51 - 30 - 129 [0.550]
Elo difference: 34.9 +/- 29.1, LOS: 99.0 %, DrawRatio: 61.4 %

210 of 500 games finished.

My impression is that the Kiudee settings were obtained by tuning them all together (CLOP-like), and if the CPuct setting is not orthogonal to all the other parameters, a simple attempt like this, just increasing CPuct for longer TC, won't get you the desired results. If one knows how the other parameters relate to CPuct at longer TC, one might try to tune fewer parameters with longer TC games. The number of games necessary for tuning explodes exponentially with the number of parameters, so having 2-3 instead of 5 would help greatly. Also, if one knows roughly how these 2-3 parameters relate to one another at longer TC (more npm), then one can do just a simple exploration, almost a manual one, using longer TC games.
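To put rough numbers on the exponential growth, here is a small sketch using a full grid as a crude model. The five levels per parameter and the 500 games per grid point are hypothetical choices for illustration, and CLOP-like tuners do not actually visit every grid point, so this only shows the trend.

```python
def grid_points(levels_per_parameter, num_parameters):
    """Number of settings in a full grid over the parameters."""
    return levels_per_parameter ** num_parameters

LEVELS = 5             # hypothetical number of candidate values per parameter
GAMES_PER_POINT = 500  # hypothetical match length per setting

for num_parameters in (5, 3, 2):
    points = grid_points(LEVELS, num_parameters)
    print(f"{num_parameters} parameters: {points} settings, "
          f"{points * GAMES_PER_POINT} games")
```

Going from 5 parameters to 2 shrinks the grid from 3125 settings to 25 in this toy model.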

One slightly unpleasant thing was that at longer TC, some 400k nodes per move (npm), I got a smaller improvement both in Elo and in Normalized Elo. So, it was unclear what happens at really long TC on an RTX GPU (more than 1000k npm). First, intuitively, I tried a larger CPuct of 2.600 for LTC (about 400k npm), with bad results, worse than the result with the Kiudee CPuct = 2.147. I thought that I had overshot the optimum and tried a CPuct of 2.300, with similarly bad results. But now I tried a SMALLER CPuct at LTC, with some pretty crazy improvement, already outside error margins even in preliminary results.

Smaller CPuct = 1.900 instead of CPuct = 2.147 at the LONGER time control:

TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee_mod vs T_59: 48 - 10 - 130 [0.601]
Elo difference: 71.2 +/- 9.6, LOS: 100.0 %, DrawRatio: 69.1 %

188 of 500 games finished.

Normalized Elo: 0.558

The error margins (1 SD) and the Normalized Elo are computed using 5-nomial for paired games (side-reversed from unbalanced openings).
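A minimal sketch of such a 5-nomial calculation follows. It assumes Normalized Elo is defined as (score - 0.5) divided by the per-game standard deviation, with that deviation taken from the variance of the game-pair results. The pair counts in the example are hypothetical, since the thread reports only win/loss/draw totals, and the names are mine.

```python
import math

# Per-game average score of a pair whose two games total 0, 0.5, 1, 1.5 or 2.
PAIR_SCORES = (0.0, 0.25, 0.5, 0.75, 1.0)

def pentanomial_stats(pair_counts):
    """Score, Elo, 1 SD Elo error and Normalized Elo from game-pair counts."""
    pairs = sum(pair_counts)
    games = 2 * pairs
    mean = sum(c * x for c, x in zip(pair_counts, PAIR_SCORES)) / pairs
    var_pair = sum(c * (x - mean) ** 2
                   for c, x in zip(pair_counts, PAIR_SCORES)) / pairs
    # The mean over `pairs` pairs has variance var_pair / pairs, which equals
    # (2 * var_pair) / games, so the effective per-game variance is 2 * var_pair.
    sd_game = math.sqrt(2.0 * var_pair)
    elo = -400.0 * math.log10(1.0 / mean - 1.0)
    # Delta method for the Elo error (1 SD).
    slope = 400.0 / (math.log(10.0) * mean * (1.0 - mean))
    elo_sd = slope * sd_game / math.sqrt(games)
    return {"score": mean, "elo": elo, "elo_sd": elo_sd,
            "normalized_elo": (mean - 0.5) / sd_game}

# Hypothetical pair counts: 100 pairs (200 games), no 0-2 or 2-0 pairs.
stats = pentanomial_stats([0, 10, 60, 30, 0])
print(stats)
```

The reported figures look consistent with this definition: for the 188-game result, (0.601 - 0.5) / 0.558 gives a per-game SD of about 0.181, which maps to roughly 9.6 Elo at 1 SD.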

A pretty astounding improvement with the Kiudee "mod" settings at longer TC, significantly better than that at STC.

I will probably let this test run up to 250 games, but the result is beyond doubt. I was so surprised that I checked my settings, even the hardware, but I couldn't find anything wrong.
Then, by fiddling with just the CPuct of the Kiudee settings, I will try to find improvements at shorter time controls. That's much easier, as the burden was the LTC games (about 400k npm), which took days. And maybe I will see a trend of the CPuct value with npm (or time control) and try to extrapolate to real LTC (above 1000k npm).

I didn't expect that fiddling with just CPuct could give such a boost; I thought that several related parameters would have to be tuned together.
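For context on how CPuct interacts with CPuctBase and CPuctFactor, here is a sketch of the exploration constant as a function of node visits. It assumes the form CPuct + CPuctFactor * ln((N + CPuctBase) / CPuctBase), which is my understanding of Lc0's search and should be checked against the source of the version in use. The function name is mine.

```python
import math

def effective_cpuct(parent_visits, cpuct, cpuct_base, cpuct_factor):
    """Assumed Lc0-style exploration constant at a node with N visits."""
    growth = math.log((parent_visits + cpuct_base) / cpuct_base)
    return cpuct + cpuct_factor * growth

# CPuctBase and CPuctFactor from the Kiudee settings quoted in this thread.
BASE, FACTOR = 18368, 2.815

for visits in (0, 20_000, 100_000, 400_000):
    kiudee = effective_cpuct(visits, 2.147, BASE, FACTOR)
    mod = effective_cpuct(visits, 1.900, BASE, FACTOR)
    print(f"N={visits}: Kiudee {kiudee:.3f}, mod {mod:.3f}")
```

Under this assumed form, lowering CPuct from 2.147 to 1.900 shifts the whole curve down by a constant 0.247, while the logarithmic term still makes exploration grow with the node count.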

Laskos · Post by **Laskos** » Fri Jan 31, 2020 11:06 am

Then again, I am surprised. A CPuct = 1.900 doesn't seem very intuitive to LTC, does it? How re-farctoring works with Lc0? Really, I feel I have to check and re-check every detail of my setup. Even if a self-play, an improvement of 70 Elo points with 70% draw rate at 400k npm is a bit hard to believe (for me, at least). "Normalized Elo" which gives the invariant scaling of the improvement exploded.

Kiudee · Post by **Kiudee** » Fri Jan 31, 2020 11:12 am

Laskos wrote: ↑Fri Jan 31, 2020 11:06 am Then again, I am surprised. A CPuct = 1.900 doesn't seem very intuitive to LTC, does it? How re-farctoring works with Lc0? Really, I feel I have to check and re-check every detail of my setup. Even if a self-play, an improvement of 70 Elo points with 70% draw rate at 400k npm is a bit hard to believe (for me, at least). "Normalized Elo" which gives the invariant scaling of the improvement exploded.

Thanks for doing proper tests of your modification. I currently have a distributed tuning run in progress for a mix of STC and LTC and can confirm that lower CPuct values look promising. I won’t disclose any new parameter combinations before convergence has been reached of course.

edit: I see that you tested against lc0 in self-play. Try validating the change against Stockfish.

Eduard · Post by **Eduard** » Fri Jan 31, 2020 11:15 am

Even 50 Elo would be fantastic.

Laskos · Post by **Laskos** » Fri Jan 31, 2020 11:38 am

Kiudee wrote: ↑Fri Jan 31, 2020 11:12 am
Thanks for doing proper tests of your modification. I currently have a distributed tuning run in progress for a mix of STC and LTC and can confirm that lower CPuct values look promising. I won’t disclose any new parameter combinations before convergence has been reached of course.

edit: I see that you tested against lc0 in self-play. Try validating the change against Stockfish.

After that test is finished with 250 games (for consistency of fixed finish) I might try to validate against SF, but it will take time, hundreds of games at 120'' + 1.2'' (on average 400k npm on my 2070 GPU) take time. It would be interesting, what works well against Lc0 might not work well against SF, it was seen before.

pohl4711 · Post by **pohl4711** » Fri Jan 31, 2020 11:56 am

Laskos wrote: ↑Fri Jan 31, 2020 11:38 am
After that test is finished with 250 games (for consistency of fixed finish) I might try to validate against SF, but it will take time, hundreds of games at 120'' + 1.2'' (on average 400k npm on my 2070 GPU) take time. It would be interesting, what works well against Lc0 might not work well against SF, it was seen before.

Because Leelenstein 13.2 does not seem to be a big progress, I aborted this testrun (will be tested later, of course). And I will repeat my testrun of
Lc0 0.23.2k t40-1541 (20x256) with your changed Kiudee-setting (CPuct=1.900), first (300 Armageddon games with 8'+5'' vs SF 191210). Will take 6 days. But, if it does not work well, I will perhaps abort it. I will report here.

Laskos · Post by **Laskos** » Fri Jan 31, 2020 12:05 pm

pohl4711 wrote: ↑Fri Jan 31, 2020 11:56 am
Because Leelenstein 13.2 does not seem to be a big progress, I aborted this testrun (will be tested later, of course). And I will repeat my testrun of
Lc0 0.23.2k t40-1541 (20x256) with your changed Kiudee-setting (CPuct=1.900), first (300 Armageddon games with 8'+5'' vs SF 191210). Will take 6 days. But, if it does not work well, I will perhaps abort it. I will report here.

Oh thanks, that would really have been a pain for me to check; it would have taken days. Maybe I will start something, but if I need my CPU and GPU for other things, I will abandon it.

pohl4711 · Post by **pohl4711** » Fri Jan 31, 2020 12:09 pm

Makes more sense, that I am doing the testrun, because I already have a comparable result of that net with Kiudee-setting...

Laskos · Post by **Laskos** » Fri Jan 31, 2020 12:11 pm

pohl4711 wrote: ↑Fri Jan 31, 2020 12:09 pm Makes more sense, that I am doing the testrun, because I already have a comparable result of that net with Kiudee-setting...

Yes, I needed to run a gauntlet of SF against two Lc0, double games. Painful.

mwyoung · Post by **mwyoung** » Fri Jan 31, 2020 6:15 pm

Laskos wrote: ↑Fri Jan 31, 2020 11:06 am Then again, I am surprised. A CPuct = 1.900 doesn't seem very intuitive to LTC, does it? How re-farctoring works with Lc0? Really, I feel I have to check and re-check every detail of my setup. Even if a self-play, an improvement of 70 Elo points with 70% draw rate at 400k npm is a bit hard to believe (for me, at least). "Normalized Elo" which gives the invariant scaling of the improvement exploded.

Again, don't be shy about moving CPuct and Temp Policy. For long time controls with good Lc0 speed, you always need to increase CPuct and tune it together with Temp Policy. I am currently running CPuct = 3.5 and Temp Policy = 1.75. Results have been awesome!
