30s+0.4s
2 x 2080 super (cudnn-fp16), Threads=3
LS 14.3 vs BrainFish_200420 23 CPU (because it's faster than abrok for me)
6 man TB in search for both (and cutechess adjudication).
So average TC is about 6x and GPU speed is roughly 10x? (I benched max of 84k NPS after about 40s, with 0.4s search I think it's more like 70k at start but very position dependent, some are as low as 35k). Seems to get around 150k to 300k nodes per move (instead of 500 or 14k from previous tune).
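As a rough sanity check on those node counts, here is a minimal sketch of the think time per move they imply. It assumes the NPS stays constant at the 70k start-of-search figure quoted above, which is a simplification since NPS is very position dependent; the function name is just for illustration.

```python
def implied_think_time(nodes_per_move, nps):
    """Seconds of search needed to reach nodes_per_move at a constant nps."""
    return nodes_per_move / nps

# Assumption: 70k NPS at the start of search, taken from the estimate above.
NPS_AT_START = 70_000

for nodes in (150_000, 300_000):
    seconds = implied_think_time(nodes, NPS_AT_START)
    print(f"{nodes} nodes -> about {seconds:.1f} s per move")
```

At 70k NPS that works out to a bit over 2 s and a bit over 4 s per move for the low and high ends of the range.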
Last edited by jjoshua2 on Thu May 14, 2020 4:39 pm, edited 2 times in total.
I am doing 6 games per iteration instead of 4 to reduce noise. Currently at iteration 645, which is thus after 3870 games. Note the red line values are the current best and the orange are kind of error bars, but the real error is greater than this.
I tested 1000 games in selfplay on same TC and GPUs not too far from these settings and they were -10 elo +- 10 elo, which isn't too surprising because trade penalty is known to reduce elo in self play.
The value below is the winrate and its error bar on a scale where -1 means LS wins all games, 0 is a 50% winrate, and 1 means SF wins all games. So roughly a 59% winrate +-3.5% for LS is expected at the optimum shown here (not the average of all points shown).
(array([-0.18201357]), array([0.07259538]))
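The conversion from that -1 to 1 scale to a winrate can be sketched like this. It is only an illustration of the arithmetic, not the tuner's own code; the function name is made up, and it assumes the error bar scales linearly with the score.

```python
def to_ls_winrate(score, err):
    """Map a score on the [-1, 1] scale to an LS winrate.

    -1 means LS wins all games, 0 is 50%, 1 means SF wins all games,
    so the LS winrate is 0.5 - score / 2 and the error bar is halved.
    """
    ls_winrate = 0.5 - score / 2.0
    winrate_err = err / 2.0
    return ls_winrate, winrate_err

# Values from the optimizer output quoted above.
ls_winrate, winrate_err = to_ls_winrate(-0.18201357, 0.07259538)
print(f"LS winrate: {ls_winrate:.1%} +- {winrate_err:.1%}")
```

This gives about 59.1% +- 3.6%, in line with the rough figure quoted.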
Thanks for sharing your work.
It is nice to see a *reason* why parameters should be set to certain values.
Otherwise, it feels like poking something with a stick and hoping that it works.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Thanks @Dann Corbit! I also neglected to point out the obvious: the previous tuning was done with a net that is around 4x smaller/faster and thus weaker, yet this run is still doing a lot more nodes despite the slowdown.
I did some testing of iteration 674 with a 250*2 game gauntlet and it ended up behind kiudee defaults. Here is iteration 805. It's changing very slowly but over many iterations it adds up. FPU and Cpuct are lower and cpuctFactor and policy are higher now.
Note that if you open the image in a new tab you will see it is high enough resolution to read all the numbers.
At iteration 1185 the red and orange look pretty converged now. That doesn't mean it won't change later, but there are finally a lot of dots everywhere, so it's probably good performance here even on other computers with a similar setup. Cpuctfactor is remarkably close to kiudee's tune. Tradepenalty is about 0.00003; it would be better if this were refactored so the value could be called 3, maybe.
From testing this a bit on a computer with a significantly worse leela ratio, it seems close in elo to the kiudee tune in an SF gauntlet, but much better in the midgame and worse in the endgame.