Elo Increase per Doubling
Posted: Mon May 07, 2012 1:40 pm
In the past few months, several people (such as Peter Österlund and myself) have measured the increase in Elo per doubling of time control. The increases found in each case were, on average, much higher than the often quoted range of 50 to 70 Elo per doubling of speed (which is emulated by doubling the time control). However, one objection that has been raised is that the time controls used were too short. While I do not believe the previous results were invalid, I do know that the short base time control added noise to the measurement. So, I decided to test the Elo increase per doubling at a longer base time control.
As can be seen in the first table below, the increase per doubling was 83, 85, and 77 Elo. Gaviota's score rose by 11 percentage points with each doubling (29%, 40%, 51%, 62%).
When the games from a round-robin (RR) tournament among Gaviota's opponents are included in the rating calculation, the increase per doubling becomes 84, 86, and 77 Elo.
The test was set up as follows:
- I used Gaviota 0.85.1 64-bit as the engine to measure.
- Gaviota played against 10 opponents (Jonny 4.00, SmarThink 1.20 64-bit, Glaurung 2.2 64-bit, Booot 5.1.0, Quazar 0.4 64-bit, Nemo 1.01b 64-bit, Naum 4.2 64-bit, Gull 1.0a 64-bit, Spike 1.4, Hannibal 1.1 64-bit).
- The base time control was 1 minute + 1 second per move.
- I randomly chose 100 positions in EPD format to use for all games. Each position was played twice with colors reversed, so there were 200 games in each match.
- Gaviota played a gauntlet against the opponents at the base time control, and then again at 2 times, 4 times, and 8 times the base time control. In the tables below, the suffixes (2), (4), and (8) after Gaviota's name indicate the multiple of the base time control.
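The size of the test follows directly from the setup above. The sketch below is only an illustration of that arithmetic; the variable and function names are made up, and it assumes that both the base time and the increment are scaled by the same multiplier, which the post does not spell out.

```python
# Illustrative sketch of the schedule described above (all names are hypothetical).
BASE_SECONDS = 60          # base time control: 1 minute ...
INCREMENT_SECONDS = 1      # ... + 1 second per move
MULTIPLIERS = [1, 2, 4, 8] # base, 2x, 4x and 8x time for Gaviota
NUM_OPPONENTS = 10
NUM_POSITIONS = 100


def time_control(multiplier):
    """Return (base, increment) in seconds.

    Assumption: base time and increment are both multiplied.
    """
    return BASE_SECONDS * multiplier, INCREMENT_SECONDS * multiplier


# Each position is played twice, with colors reversed.
games_per_match = NUM_POSITIONS * 2
# One gauntlet = one match against each opponent.
games_per_gauntlet = games_per_match * NUM_OPPONENTS
# One gauntlet per time-control multiplier.
total_gauntlet_games = games_per_gauntlet * len(MULTIPLIERS)

for m in MULTIPLIERS:
    base, inc = time_control(m)
    print(f"x{m}: {base} s + {inc} s/move, {games_per_gauntlet} games")
print("total gauntlet games:", total_gauntlet_games)
```

This matches the roughly 2000 games listed for each Gaviota entry and the roughly 800 games (4 matches of 200) listed for each opponent in the first table.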
Rank Name                      Elo    +    - games score oppo. draws
   1 Naum 4.2 64-bit           163   21   21   800   74%   -24   27%
   2 Gull 1.0a 64-bit          113   20   20   800   68%   -24   28%
   3 Gaviota 0.85.1 64-bit(8)   97   12   12  2000   62%    10   30%
   4 Quazar 0.4 64-bit          39   19   19   800   59%   -24   32%
   5 Hannibal 1.1 64-bit        33   19   19   800   58%   -24   31%
   6 Spike 1.4                  33   19   19   800   58%   -24   31%
   7 Gaviota 0.85.1 64-bit(4)   20   12   12  2000   51%    10   32%
   8 Nemo 1.01 beta 64-bit      -6   19   19   800   52%   -24   32%
   9 Booot 5.1.0               -35   19   19   801   48%   -24   33%
  10 Glaurung 2.2 JA 64-bit    -43   19   19   800   47%   -24   25%
  11 Gaviota 0.85.1 64-bit(2)  -65   12   12  2001   40%    10   28%
  12 SmarThink 1.20 64-bit     -99   20   20   800   40%   -24   25%
  13 Jonny 4.00               -100   20   20   800   40%   -24   26%
  14 Gaviota 0.85.1 64-bit    -148   13   13  2000   29%    10   26%
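The per-doubling figures are simply the differences between consecutive Gaviota ratings in the table above. A minimal sketch of that arithmetic, with the ratings copied from the table and illustrative variable names:

```python
# Gaviota 0.85.1 ratings from the gauntlet table, keyed by time-control multiplier.
gaviota_elo = {1: -148, 2: -65, 4: 20, 8: 97}

multipliers = sorted(gaviota_elo)
# Gain for each doubling: rating at 2x minus rating at 1x, and so on.
gains = [gaviota_elo[b] - gaviota_elo[a]
         for a, b in zip(multipliers, multipliers[1:])]
average_gain = sum(gains) / len(gains)

print(gains)                   # [83, 85, 77]
print(round(average_gain, 1))  # 81.7
```

Each Gaviota rating carries an error margin of about 12 to 13 Elo in the table, so the individual differences are less precise than the single numbers suggest.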
I also ran a round-robin (RR) tournament among Gaviota's opponents. When I include those games, the ratings are as follows:
Rank Name                      Elo    +    - games score oppo. draws
   1 Naum 4.2 64-bit           165   11   11  2600   73%   -13   28%
   2 Gull 1.0a 64-bit          103   10   10  2600   65%    -8   32%
   3 Gaviota 0.85.1 64-bit(8)   98   12   12  2000   62%    10   30%
   4 Spike 1.4                  49   10   10  2600   57%    -4   32%
   5 Hannibal 1.1 64-bit        40   10   10  2600   56%    -3   34%
   6 Quazar 0.4 64-bit          22   10   10  2600   53%    -2   35%
   7 Gaviota 0.85.1 64-bit(4)   21   12   12  2000   51%    10   32%
   8 Nemo 1.01 beta 64-bit     -10   10   10  2600   48%     1   32%
   9 Booot 5.1.0               -29   10   10  2601   46%     2   34%
  10 Glaurung 2.2 JA 64-bit    -47   10   10  2600   43%     4   29%
  11 Gaviota 0.85.1 64-bit(2)  -65   12   12  2001   40%    10   28%
  12 Jonny 4.00                -95   11   11  2600   36%     7   27%
  13 SmarThink 1.20 64-bit    -102   11   11  2600   35%     8   26%
  14 Gaviota 0.85.1 64-bit    -149   13   13  2000   29%    10   26%
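The score and Elo columns are related, approximately, through the usual logistic Elo curve. The sketch below assumes the standard formula (expected score = 1 / (1 + 10^(-D/400))); this is not necessarily the exact model used by the rating tool that produced the tables, so the numbers it prints are only a rough cross-check. The function names are illustrative.

```python
import math


def expected_score(elo_diff):
    """Expected score for a player rated elo_diff above the opposition
    (standard logistic Elo formula, assumed here)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score):
    """Inverse of expected_score: Elo difference implied by a score fraction."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Gaviota's score at each time-control multiplier, from the tables.
gaviota_scores = {1: 0.29, 2: 0.40, 4: 0.51, 8: 0.62}

for multiplier, score in gaviota_scores.items():
    diff = elo_diff_from_score(score)
    print(f"x{multiplier}: score {score:.0%} -> about {diff:+.0f} Elo vs. the opposition")
```

Because the curve is close to linear near 50%, a steady gain of 11 percentage points per doubling corresponds to a roughly constant Elo gain over this range, which agrees with the 77 to 86 Elo steps in the tables.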
From my previous test, using a base time control of 6 seconds + 0.1 seconds per move, I found that Gaviota 0.84 gained 104 Elo per doubling (on average). Although the method of measurement differed between the two tests, and two different versions of Gaviota are being compared, it may not be unreasonable to claim that this shows that:
1) At depths higher than those used to determine the original estimate of 50 to 70 Elo per doubling of speed (from "How Computers Play Chess" by David Levy and Monty Newborn?), the expected increase per doubling for modern engines is greater than the quoted numbers.
2) The expected increase per doubling of speed may decrease at longer time controls. If the base time control were 120 minutes + 90 seconds per move, the measured Elo increase per doubling might be 40 to 70 Elo.
All the games used to compute the ratings can be found here:
http://www.mediafire.com/file/guwf0e3x9 ... me_Odds.7z