Elo points gain from doubling time
Posted: Mon Dec 10, 2012 8:14 pm
by Laskos
I played a gauntlet with Komodo 5 at different times per move against Houdini 3 at 1s/move. This took a while because the time controls are not very short; until now I had only seen such tests performed at ultra-short fixed-time or fixed-depth controls (by Don and Adam).
Code:
  Program             Games    Elo
1 Komodo 5  4s    :    2000   3131
2 Komodo 5  2s    :    2000   3050
3 Komodo 5  1s    :    4016   2957
4 Komodo 5  0.5s  :    4017   2850
5 Houdini 3 1s    :   12033   3036
The scaling of Komodo 5 with doubling time is:
Code:
From 2s/move to 4s/move (blitz)      +81 Elo points
From 1s/move to 2s/move              +93 Elo points
From 0.5s/move to 1s/move (bullet)  +107 Elo points
The fit is:
107*0.87^(log2(t)) = 107*t^(-0.20) Elo points per doubling of time (or of cores, assuming perfect scaling), where t is the time per move in seconds.
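As a rough check on where the 0.87 and the -0.20 come from, the sketch below recomputes the gains from the gauntlet table and the ratio between successive gains. The variable names are illustrative, and taking the plain average of the two ratios is an assumption; the original fit may have been done differently.

```python
import math

# Komodo 5 ratings from the gauntlet table, keyed by seconds per move.
elo = {0.5: 2850, 1.0: 2957, 2.0: 3050, 4.0: 3131}

times = sorted(elo)
# Elo gain for each doubling of time: 0.5->1, 1->2, 2->4.
gains = [elo[b] - elo[a] for a, b in zip(times, times[1:])]

# Ratio of each gain to the previous one (how fast the gain shrinks).
ratios = [g2 / g1 for g1, g2 in zip(gains, gains[1:])]
mean_ratio = sum(ratios) / len(ratios)

# r^(log2 t) == t^(log2 r), so the power-law exponent is log2(r).
exponent = math.log2(mean_ratio)

print(gains)                 # [107, 93, 81]
print(round(mean_ratio, 2))  # 0.87
print(round(exponent, 2))    # -0.2
```

With only three data points, the ratio is pinned down by two numbers, so the exponent should be read as indicative only.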
Extrapolating to longer time controls: for 120 min/40 moves (180 s/move) on one core it gives 107*180^(-0.20) ~ 40 Elo points per doubling of time. On eight cores at the same 120 min/40 moves LTC it's ~30 Elo points per doubling. Of course, this is an extrapolation.
Further speculation: going to infinite time control, the total improvement from 1s/move is the geometric sum 107/(1-0.87) ~ 820 Elo points, so Komodo 5 would be limited to something like 4000 Elo points of strength (calibrated to the current lists) at infinite time control.
I think the formula 107*(time per move in seconds)^(-0.20) Elo points is useful as a rule of thumb for the gain from doubling time. This is for one modern core; on several cores, the time should be multiplied by the number of cores.
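The extrapolation and the infinite-time limit can be reproduced in a few lines. This is only a sketch of the arithmetic in the post; the function name `gain_per_doubling` is made up for illustration, and nothing here validates the extrapolation itself.

```python
def gain_per_doubling(seconds_per_move: float) -> float:
    """Rule-of-thumb Elo gain from one doubling of time (fit from the post)."""
    return 107.0 * seconds_per_move ** (-0.20)

# 120 minutes for 40 moves averages 180 seconds per move.
ltc_seconds = 120 * 60 / 40
ltc_gain = gain_per_doubling(ltc_seconds)

# If each doubling yields 0.87 times the previous gain, the total over
# infinitely many doublings is a geometric series: 107 / (1 - 0.87).
total_gain = 107.0 / (1.0 - 0.87)

print(round(ltc_gain, 1))  # a bit under 40
print(round(total_gain))   # 823
```

For the multi-core case, one would pass `cores * seconds_per_move` instead, under the perfect-scaling assumption stated in the post.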
Kai
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 9:11 pm
by hgm
Your testing method is somewhat susceptible to systematic error, because you test against a fixed-strength opponent. So the ratings you get are sensitive to the rating model, and the saturation you see could very well be caused by the shape of the rating curve being different from what the Elo calculator assumes.
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 10:22 pm
by Laskos
hgm wrote:Your testing method is somewhat susceptible to systematic error, because you test against a fixed-strength opponent. So the ratings you get are sensitive to the rating model, and the saturation you see could very well be caused by the shape of the rating curve being different from what the Elo calculator assumes.
I don't think this is a serious issue here. The ratings of Komodo are connected only through Houdini, and the points given strictly reflect the Elo curve. It's true that the Elo curve might deviate from the real rating curve for engines, but I have seen possible deviations only in the tails, over a span of some 1200 Elo points (possibly a Gaussian), not over the 250 points shown here. Yes, the points here assume the Elo model, but I don't think that is problematic on this span.
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 10:51 pm
by hgm
Well, what you think and what you measure are two different things...
If you plan to continue this investigation, I would be really curious to see what happens if you also throw in a 0.25s, 0.5s and 2s Houdini.
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 10:55 pm
by Jouni
BTW, in the CSS forum there is a search-depth test with Houdini 3. The result is weird:
Code:
Depth     Elo diff.  Result
 1 -  2   222        217.5-782.5 (+82=271-647)
 2 -  3   185        256-744 (+75=362-563)
 3 -  4   151        295.5-704.5 (+67=457-476)
 4 -  5   128        323.5-676.5 (+69=509-422)
 5 -  6   122        331.5-668.5 (+69=525-406)
 6 -  7   107        350.5-649.5 (+72=557-371)
 7 -  8   120        333.5-666.5 (+54=559-387)
 8 -  9   175        268-732 (+28=480-492)
 9 - 10   123        330.5-669.5 (+39=583-378)
10 - 11    92        370-630 (+29=682-289)
11 - 12    75        393.5-606.5 (+49=689-262)
12 - 13    63        410.5-589.5 (+41=739-220)
13 - 14    63        411-589 (+28=766-206)
14 - 15    46        433.5-566.5 (+36=795-169)
What on earth happens in the 8-9 ply match!?
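For reference, the Elo difference column can be recomputed from the match result with the standard logistic Elo formula. This is a minimal sketch; it assumes the second number in each result is the deeper side's score out of 1000 games, and the function name `elo_diff` is illustrative.

```python
import math

def elo_diff(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

# (depth pair, points scored by the deeper side out of 1000 games)
rows = [("1 - 2", 782.5), ("8 - 9", 732.0), ("14 - 15", 566.5)]
for pair, points in rows:
    print(pair, round(elo_diff(points / 1000.0)))
# 1 - 2 222
# 8 - 9 175
# 14 - 15 46
```

The values agree with the table, so the 8-9 jump is in the raw scores rather than in the conversion to Elo.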
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 11:11 pm
by hgm
The result is weird anyway: the draw fraction seems to go up enormously at higher depth. I would consider 80% draws a ridiculously high draw fraction, between nearly equal engines.
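The draw fraction per match can be read straight off the "=" counts in the table. A small sketch, with the draw counts copied from the table and each match assumed to be 1000 games:

```python
# Deeper depth of each pairing -> number of draws in that 1000-game match.
draws_by_depth = {
    2: 271, 3: 362, 4: 457, 5: 509, 6: 525, 7: 557, 8: 559,
    9: 480, 10: 583, 11: 682, 12: 689, 13: 739, 14: 766, 15: 795,
}

draw_fraction = {depth: n / 1000 for depth, n in draws_by_depth.items()}

for depth, frac in draw_fraction.items():
    print(f"{depth - 1:2d} - {depth:2d}  {frac:.1%}")
```

The fraction climbs from about 27% at depth 1-2 to about 80% at depth 14-15, with a dip at 8-9.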
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 11:24 pm
by Rebel
Extremes happen from time to time. Here is a snapshot from a 1+1 match I am running in 8 threads. After 100 games I have:
Code:
ProDeo 1.82 (work) vs ProDeo 1.81 (main) 10-12-2012 using [TIME 1+1]
Testing:
[HR_DEPTH = 1] * default=3
# ENGINE : RATING POINTS PLAYED (%)
1 WORK1 : 2649.4 11.0 13 84.6%
2 MAIN5 : 2625.7 10.5 13 80.8%
3 MAIN7 : 2596.3 9.0 12 75.0%
4 WORK4 : 2587.5 9.5 13 73.1%
5 MAIN3 : 2577.7 8.5 12 70.8%
6 MAIN6 : 2549.0 7.0 11 63.6%
7 MAIN2 : 2544.8 7.5 12 62.5%
8 WORK8 : 2500.0 7.0 14 50.0%
9 MAIN8 : 2500.0 7.0 14 50.0%
10 WORK2 : 2455.2 4.5 12 37.5%
11 WORK6 : 2451.0 4.0 11 36.4%
12 WORK3 : 2422.3 3.5 12 29.2%
13 MAIN4 : 2412.5 3.5 13 26.9%
14 WORK7 : 2403.7 3.0 12 25.0%
15 WORK5 : 2374.3 2.5 13 19.2%
16 MAIN1 : 2350.6 2.0 13 15.4%
Engine WORK (elo 2500) vs Engine MAIN (elo 2500) estimated TPR 2465 (-35)
28-34-38 (100) match score 45.0 - 55.0 (45.0%)
Won-loss 28-38 = -10 (100 games) draws 34.0%
LOS = 11.1% Elo Error Margin +56 -56
WORK 4:15:26 (25.920M nodes) NPS = 1.691K
MAIN 4:17:49 (25.971M nodes) NPS = 1.679K
Depth Stats MIDG END0 END1 END2
WORK 11.63 12.09 12.80 16.70
MAIN 11.20 12.16 12.41 16.12
MAIN = ProDeo 1.81
WORK = ProDeo 1.81 + a code change
Please explain why WORK1 scores 11/13 while the same version (as WORK5) scores only 2.5/13, and the total score is -10.
I am confident that after 2000 games (or so) things will be clearer.
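The LOS figure in the output above can be approximated from the win and loss counts alone, using the common normal approximation in which draws are ignored. This is a sketch only; it is not necessarily the formula the match tool uses, and the function name is illustrative.

```python
import math

def likelihood_of_superiority(wins: int, losses: int) -> float:
    """Normal-approximation LOS; draws carry no information here."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# WORK vs MAIN after 100 games: 28 wins, 34 draws, 38 losses.
los = likelihood_of_superiority(28, 38)
print(f"{los:.1%}")  # roughly 11%
```

With a 10-game deficit over only 66 decisive games, a LOS near 11% is far from conclusive, which fits the expectation that many more games are needed.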
Re: Elo points gain from doubling time
Posted: Mon Dec 10, 2012 11:38 pm
by Laskos
hgm wrote:Well, what you think and what you measure are two different things...
If you plan to continue this investigation, I would be really curious to see what happens if you also throw in a 0.25s, 0.5s and 2s Houdini.
Apart from the Elo model, which is pretty irrelevant here (on the -100 to +100 point interval all models are almost linear), the problem I see is the still-large error margins. These matches of 2,000-4,000 blitz games take a lot of time, so I will not throw in Houdinis at different TCs, as I just wanted to see the rule-of-thumb law.
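To put a number on "almost linear" and on how little the model choice matters in this range, here is a sketch comparing the logistic Elo curve with a normal (Gaussian) one on the interval -100 to +100. The Gaussian standard deviation of 200*sqrt(2) is an assumption taken from the classic normal Elo model, not something stated in this thread.

```python
import math

def logistic_expected(diff: float) -> float:
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def normal_expected(diff: float, sigma: float = 200.0 * math.sqrt(2.0)) -> float:
    """Expected score under a normal model (sigma is an assumed value)."""
    return 0.5 * (1.0 + math.erf(diff / (sigma * math.sqrt(2.0))))

# Largest disagreement between the two models on [-100, +100].
max_gap = max(
    abs(logistic_expected(d) - normal_expected(d)) for d in range(-100, 101)
)
print(f"{max_gap:.4f}")
```

On this interval the two curves differ by well under one percentage point of expected score, while the Komodo ratings in the table span a wider range (2850 to 3131) around Houdini's 3036.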
Re: Elo points gain from doubling time
Posted: Tue Dec 11, 2012 8:24 am
by M ANSARI
Interesting test! I guess the value most people assume for a doubling of cores is around 40 Elo points, but that is at longer time controls. For super-fast matches I think the architecture of the engine makes a huge difference when different engines are pitted against each other. I would expect process-based vs. thread-based engines in particular to show a huge difference at super-fast time controls compared with longer ones.
Re: Elo points gain from doubling time
Posted: Tue Dec 11, 2012 8:37 am
by hgm
Laskos wrote:These 2,000-4,000 blitz games matches take a lot of time, so I will not throw in different TC Houdinis, as I just wanted to see the rule of thumb law.
Well, the 'rule of thumb' seems to be 100 Elo per doubling. (Which, btw, is more than I expected. I always assumed 70 Elo per doubling.) The rest of the analysis seems mostly analysing noise, based on a +7 +/- 4.2 @1s and a -19 +/- 6.3 @ 4s.
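One reading of the +7 and -19 figures is that they are the measured gains minus a flat 100 Elo per doubling. The sketch below only reproduces those residuals; the error margins quoted are not recomputed, since the post does not say how they were obtained.

```python
# Measured gain on reaching each time per move (seconds), from the first post.
gains = {1.0: 107, 2.0: 93, 4.0: 81}

FLAT_RULE = 100  # constant Elo per doubling, as suggested in the post

# Deviation of each measured gain from the flat rule.
residuals = {t: g - FLAT_RULE for t, g in gains.items()}
print(residuals)  # {1.0: 7, 2.0: -7, 4.0: -19}
```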