Laskos wrote: ↑Mon Jan 27, 2020 3:51 pm
It's good that you opened this thread. Yes, the Kiudee settings hold up well across a wide range of conditions. I will present some current results with one of the latest T59 nets on my overclocked RTX 2070 GPU. With Lc0 0.23.2, these 128x10b nets hit the ceiling of about 100k NPS of the engine's (pseudo) MCTS search. Smaller nets like 64x6 or 48x5 run at basically the same speed because of this ceiling. Since the number of nodes per move is what matters most for deriving LTC behavior, I used the T59 run, which is strong by now.
First, a bit about your result: you probably use "too good, too balanced" openings, which translate into a very high draw rate. Combined with small samples (100 games), your results are hardly conclusive. Using unbalanced openings (played pair-wise, normal and side-reversed, to diminish the noise) and more games (500 per match), I got reasonable draw rates and small error margins. The error margins are smaller than the naive ones because of the 5-nomial variance for unbalanced openings, discussed here:
https://www.chessprogramming.org/index. ... l_Analysis
I also compute the 5-nomial Normalized Elo, as simple Elo differences tend to compress with increasing time control. The Normalized Elo is described here:
I first played matches at the ultra-fast time control of 6s + 0.1s to see whether the Kiudee settings give an advantage. The matches were both self-play and against different opponents. In self-play, the Elo bonus from the Kiudee settings came out inflated by about 20%. I tried T40, SV, LS, T59 and T60 nets with and without the Kiudee settings, with SF in the mix as an opponent too. In these ultra-fast games, I got a bonus of about 45-50 Elo in self-play and about 40 Elo against a different opponent. The bonus is maybe a tiny bit larger for T59 and T60 nets, probably because they were trained with different settings (noise etc.). It is worth noting that it is important to test at equal time controls, not at equal nodes, even in self-play, because the Kiudee settings can affect the NPS behavior.
To check the time control behavior, I used three time controls: 6s + 0.1s (about 20k npm on average), 30s + 0.3s (about 100k npm) and 120s + 1.2s (about 400k npm). I used a late T59 net in self-play, Kiudee versus default. Here are the results in matches of 500 games each:
TC: 6s + 0.1s (about 20k nodes per move)
Score of T_59_Kiudee vs T_59: 147 - 77 - 276 [0.570]
Elo difference: 49.0 +/- 8.2, LOS: 100.0 %, DrawRatio: 55.2 %
Normalized Elo: 0.270
TC: 30s + 0.3s (about 100k nodes per move)
Score of T_59_Kiudee vs T_59: 125 - 57 - 318 [0.568]
Elo difference: 47.5 +/- 7.2, LOS: 100.0 %, DrawRatio: 63.6 %
Normalized Elo: 0.299
TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee vs T_59: 121 - 66 - 313 [0.555]
Elo difference: 38.4 +/- 7.1, LOS: 100.0 %, DrawRatio: 62.6 %
Normalized Elo: 0.245
The Elo error margins shown are 1 SD 5-nomial.
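As a rough cross-check of the three results, the Normalized Elo can be backed out from the reported score and the 1 SD Elo margin alone. This is only a sketch: it assumes the Normalized Elo is (score - 0.5) divided by the per-game standard deviation, and that the Elo margin was obtained by propagating the score's standard error through the logistic Elo formula. The function names are made up for illustration.

```python
import math

def elo_slope(score):
    """d(Elo)/d(score) for the logistic model Elo = -400*log10(1/score - 1)."""
    return 400.0 / (math.log(10) * score * (1.0 - score))

def normalized_elo_from_margin(score, elo_sd, games):
    """Assumed definition: (score - 0.5) / per-game SD of the result."""
    score_se = elo_sd / elo_slope(score)       # standard error of the mean score
    per_game_sd = score_se * math.sqrt(games)  # SD of a single game result
    return (score - 0.5) / per_game_sd

# Score and 1 SD Elo margin as reported for the three 500-game matches
for tc, score, elo_sd in [("6s + 0.1s", 0.570, 8.2),
                          ("30s + 0.3s", 0.568, 7.2),
                          ("120s + 1.2s", 0.555, 7.1)]:
    print(tc, round(normalized_elo_from_margin(score, elo_sd, 500), 3))
```

Under these assumptions the values come out close to the reported 0.270, 0.299 and 0.245, with small deviations expected from the rounding of the printed margins.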
The scaling of the Kiudee settings is given by the 5-nomial Normalized Elo. Although the difference is within the 95% confidence interval, the Kiudee bonus seems to decrease a bit, with some 80% confidence, going from 100k npm to 400k npm. The decrease is small, and 400k npm on my RTX 2070 with T60 net is already a slow rapid, not far from real LTC.
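The "some 80% confidence" figure can be approximately reproduced with a normal approximation. The sketch below assumes that the standard error of a Normalized Elo estimate is about 1/sqrt(games) (it ignores the uncertainty in the variance estimate) and that the two matches are independent; the function name is hypothetical.

```python
import math

def prob_decrease(nelo_a, nelo_b, games_a, games_b):
    """Approximate probability that the true Normalized Elo of A exceeds that of B."""
    # Assumption: SE of each Normalized Elo estimate is about 1/sqrt(games)
    se_diff = math.sqrt(1.0 / games_a + 1.0 / games_b)
    z = (nelo_a - nelo_b) / se_diff
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

# 100k npm (0.299) versus 400k npm (0.245), 500 games each
print(round(prob_decrease(0.299, 0.245, 500, 500), 2))
```

With these inputs the probability lands at roughly 0.80, in line with the estimate quoted above.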
I wanted to check whether that small decrease can be cured by something simple. Of the very few facts I knew about the settings, one is that a somewhat larger CPuct is recommended for LTC. So a 120s + 1.2s match of 500 games is now running with the Kiudee settings and CPuct = 2.600 instead of CPuct = 2.147.
It's too early, with one more day to go, and things might change a bit, but here is a provisional result:
TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee_mod vs T_59: 51 - 30 - 129 [0.550]
Elo difference: 34.9 +/- 29.1, LOS: 99.0 %, DrawRatio: 61.4 %
210 of 500 games finished.
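For readers who want to check such result lines, the Elo difference and LOS in the provisional result above can be recomputed from the win/loss/draw counts. This is a minimal sketch using the usual logistic Elo formula and the common normal-approximation LOS formula based on wins and losses only. I assume these are the formulas behind the match output, which the post does not state.

```python
import math

def elo_diff(wins, losses, draws):
    """Logistic Elo difference from the match score."""
    score = (wins + 0.5 * draws) / (wins + losses + draws)
    return -400.0 * math.log10(1.0 / score - 1.0)

def los(wins, losses):
    """Likelihood of superiority, normal approximation (draws carry no information)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Provisional result: 51 - 30 - 129 after 210 games
print(round(elo_diff(51, 30, 129), 1), round(100.0 * los(51, 30), 1))
```

For the 51 - 30 - 129 score this gives values matching the reported 34.9 Elo and 99.0 % LOS.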
My impression is that the Kiudee settings were obtained by tuning all parameters together (CLOP-like), and if CPuct is not orthogonal to the other parameters, a simple attempt like just increasing CPuct for longer TC won't get you the desired results. If one knows how the other parameters relate to CPuct at longer TC, one might try to tune fewer parameters with longer TC games. The number of games necessary for tuning explodes exponentially with the number of parameters, so having 2-3 instead of 5 would help greatly. Also, if one knows roughly how these 2-3 parameters relate to one another at longer TC (more npm), then one can do a simple, almost manual exploration using longer TC games.
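To make the exponential growth concrete, here is a toy count of how many parameter combinations a full grid would need. This is purely illustrative: a CLOP-like tuner does not enumerate a grid, and the choice of 4 candidate values per parameter is an arbitrary assumption.

```python
def grid_points(levels_per_param, num_params):
    """Number of combinations in a full grid (illustrative model only)."""
    return levels_per_param ** num_params

# Assumed: 4 candidate values per parameter
for num_params in (2, 3, 5):
    print(num_params, grid_points(4, num_params))
```

Going from 5 parameters down to 2 or 3 shrinks the space by one to two orders of magnitude in this toy model, which is the kind of saving that matters when each LTC match takes days.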
One slightly unpleasant thing was that at longer TC, some 400k nodes per move (npm), I got a smaller improvement both in Elo and in Normalized Elo. So it was unclear what happens at really long TC on an RTX GPU (more than 1000k npm). First, following intuition, I tried a larger CPuct of 2.600 at LTC (about 400k npm), with bad results, worse than with the Kiudee CPuct = 2.147. I thought that I had overshot the optimum and tried a CPuct of 2.300, with similarly bad results. But now I have tried a SMALLER CPuct at LTC, with a pretty crazy improvement that is already outside the error margins even in preliminary results.
Smaller CPuct = 1.900 instead of CPuct = 2.147 at the LONGER time control:
TC: 120s + 1.2s (about 400k nodes per move)
Score of T_59_Kiudee_mod vs T_59: 48 - 10 - 130 [0.601]
Elo difference: 71.2 +/- 9.6, LOS: 100.0 %, DrawRatio: 69.1 %
188 of 500 games finished.
The error margins (1 SD) and the Normalized Elo are computed using 5-nomial for paired games (side-reversed from unbalanced openings).
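A minimal sketch of how such 5-nomial numbers can be computed from paired games follows. The pair counts are made-up placeholders, not data from these matches, and the Normalized Elo definition (score advantage over per-game standard deviation) is my assumption about the convention used.

```python
import math

def pentanomial_stats(pair_counts):
    """pair_counts: pairs scoring 0, 0.5, 1, 1.5, 2 points (in that order)."""
    pairs = sum(pair_counts)
    values = [0.0, 0.25, 0.5, 0.75, 1.0]  # pair score rescaled to [0, 1]
    mean = sum(c * v for c, v in zip(pair_counts, values)) / pairs
    var_pair = sum(c * (v - mean) ** 2 for c, v in zip(pair_counts, values)) / pairs
    score_se = math.sqrt(var_pair / pairs)  # 1 SD error of the mean score
    per_game_sd = math.sqrt(2.0 * var_pair) # equivalent SD per single game
    nelo = (mean - 0.5) / per_game_sd       # assumed Normalized Elo definition
    return mean, score_se, nelo

# Hypothetical counts for 250 pairs (500 games), for illustration only
mean, score_se, nelo = pentanomial_stats([2, 20, 120, 80, 28])
print(round(mean, 3), round(score_se, 4), round(nelo, 3))
```

Because the two games of a pair from an unbalanced opening tend to offset each other, the pair-level variance is typically smaller than what a per-game trinomial model gives, which is why the margins come out tighter than the naive ones.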
A pretty astounding improvement with the Kiudee "mod" settings at longer TC, significantly better than the one at STC.
I will probably let this test run up to 250 games, but the result is beyond doubt. I was so surprised that I checked my settings, even the hardware, but I couldn't find anything wrong.
Next, by fiddling with just the CPuct of the Kiudee settings, I will try to find improvements at shorter time controls. That's much easier, as the burden was with the LTC games (about 400k npm), which took days. And maybe I will see a trend in the CPuct value with npm (or time control) and try to extrapolate to real LTC (above 1000k npm).
I didn't expect that fiddling with just CPuct could give such boosts, since I thought several related parameters would have to be tuned together.