Elo Increase per Doubling

Robert
Posts: 20
Joined: Tue Oct 07, 2008 2:53 am
Location: Brasil

Re: Elo Increase per Doubling

Post by Robert »

Don wrote:At low depths the amount of ELO per doubling is quite large and at high depths it is much lower.
This is true (and quite logical), and it must depend on your BF too.

Higher the BF => less ELO gain per doubling.
Lower the BF => more ELO gain ....


Do you agree?
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo Increase per Doubling

Post by Don »

Robert wrote:
Don wrote:At low depths the amount of ELO per doubling is quite large and at high depths it is much lower.
This is true (and quite logical), and it must depend on your BF too.

Higher the BF => less ELO gain per doubling.
Lower the BF => more ELO gain ....


Do you agree?
I agree that the lower the BF, the more ELO gain for a properly written program. In fact, the reason today's programs are so much stronger than programs of 20 years ago is the very low BF. We do pay for this in quality, but it's an excellent trade-off.

But I'm not sure it explains the low vs high ELO gain per doubling, as the BF of most programs doesn't vary that much at high depths.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Elo Increase per Doubling

Post by Adam Hair »

Don wrote:
But I'm not sure it explains the low vs high ELO gain per doubling, as the BF of most programs doesn't vary that much at high depths.
I thought that the possible explanation for this is that being able to search x+1 plies instead of x plies does not give quite as much increase in Elo as being able to search x plies instead of x-1 plies.

I have some data to share, but it is even less conclusive than I remembered.

First, here are some results from self-testing where Fruit 2.1 and Houdini 1.03 each play themselves at different depths (both were round robins):

Rank Name              Elo    +    - games score oppo. draws 
   1 Fruit_2.1_ply12   482   18   18  1981   91%   -60   15% 
   2 Fruit_2.1_ply11   387   16   16  1981   84%   -49   19% 
   3 Fruit_2.1_ply10   275   15   15  1981   74%   -35   20% 
   4 Fruit_2.1_ply9    158   14   14  1981   63%   -20   21% 
   5 Fruit_2.1_ply8     47   13   13  1981   52%    -6   20% 
   6 Fruit_2.1_ply7    -93   14   14  1981   39%    11   17% 
   7 Fruit_2.1_ply6   -237   15   15  1982   28%    30   13% 
   8 Fruit_2.1_ply5   -408   18   18  1982   15%    51   11% 
   9 Fruit_2.1_ply4   -613   24   24  1982    4%    77    6% 


Rank Name                 Elo    +    - games score oppo. draws 
  1 Houdini_1.03a_ply16   287    8    8  3649   85%   -36   28% 
  2 Houdini_1.03a_ply15   231    7    7  3648   79%   -29   32% 
  3 Houdini_1.03a_ply14   173    7    7  3650   72%   -22   36% 
  4 Houdini_1.03a_ply13   102    6    6  3648   63%   -13   39% 
  5 Houdini_1.03a_ply12    30    6    6  3649   53%    -4   36% 
  6 Houdini_1.03a_ply11   -53    6    6  3649   42%     7   33% 
  7 Houdini_1.03a_ply10  -145    7    7  3649   30%    18   28% 
  8 Houdini_1.03a_ply9   -231    8    8  3649   20%    29   21% 
  9 Houdini_1.03a_ply8   -394   10   10  3649    7%    49   12%   
There seems to be a trend where the Elo delta decreases as the depth increases, but not conclusively.
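
The per-ply deltas can be read straight off the two tables above. The following is a minimal Python sketch that does the subtraction; the dictionaries are copied by hand from the Elo columns, and the helper name `elo_deltas` is only illustrative.

```python
# Elo ratings copied from the two tables above, keyed by fixed search depth.
fruit = {4: -613, 5: -408, 6: -237, 7: -93, 8: 47,
         9: 158, 10: 275, 11: 387, 12: 482}
houdini = {8: -394, 9: -231, 10: -145, 11: -53, 12: 30,
           13: 102, 14: 173, 15: 231, 16: 287}


def elo_deltas(ratings):
    """Return {ply: Elo(ply) - Elo(ply - 1)} for consecutive plies."""
    plies = sorted(ratings)
    return {deep: ratings[deep] - ratings[shallow]
            for shallow, deep in zip(plies, plies[1:])}


fruit_deltas = elo_deltas(fruit)
houdini_deltas = elo_deltas(houdini)

print("Fruit 2.1:    ", list(fruit_deltas.values()))
print("Houdini 1.03a:", list(houdini_deltas.values()))
```

Keep in mind that every rating in the tables carries its own error margin (the "+" and "-" columns), so neighbouring deltas that differ by only a few Elo should not be read as a real difference.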

The next piece of data is a comparison of move selections made by Gaviota 0.84 when restricted to different depths. I randomly selected 1000 positions from the positions that come with the sim tool, then modified the tool so that it gave "go depth x" instead of "go infinite":

Key:
  1) Gaviota 0.84 depth 02(time: 50 ms  scale: 1.0)
  2) Gaviota 0.84 depth 03(time: 50 ms  scale: 1.0)
  3) Gaviota 0.84 depth 04(time: 50 ms  scale: 1.0)
  4) Gaviota 0.84 depth 05(time: 50 ms  scale: 1.0)
  5) Gaviota 0.84 depth 06(time: 50 ms  scale: 1.0)
  6) Gaviota 0.84 depth 07(time: 50 ms  scale: 1.0)
  7) Gaviota 0.84 depth 08(time: 50 ms  scale: 1.0)
  8) Gaviota 0.84 depth 09(time: 50 ms  scale: 1.0)
  9) Gaviota 0.84 depth 10(time: 50 ms  scale: 1.0)
 10) Gaviota 0.84 depth 11(time: 50 ms  scale: 1.0)
 11) Gaviota 0.84 depth 12(time: 50 ms  scale: 1.0)
 12) Gaviota 0.84 depth 13(time: 50 ms  scale: 1.0)
 13) Gaviota 0.84 depth 14(time: 50 ms  scale: 1.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13
  1.  ----- 59.50 50.10 45.80 42.10 38.60 36.80 37.00 34.80 33.70 33.40 31.80 30.60
  2.  59.50 ----- 66.60 57.30 53.00 46.90 44.00 43.30 41.80 40.10 40.10 36.50 36.00
  3.  50.10 66.60 ----- 71.20 63.00 57.30 52.40 48.50 46.40 44.60 43.70 41.60 39.70
  4.  45.80 57.30 71.20 ----- 76.10 64.70 57.80 53.90 50.90 47.60 47.40 45.60 43.90
  5.  42.10 53.00 63.00 76.10 ----- 76.90 65.80 61.20 56.50 53.00 52.40 49.10 47.60
  6.  38.60 46.90 57.30 64.70 76.90 ----- 79.40 71.40 63.80 59.50 57.30 53.90 52.90
  7.  36.80 44.00 52.40 57.80 65.80 79.40 ----- 83.20 73.30 67.20 63.20 60.00 56.30
  8.  37.00 43.30 48.50 53.90 61.20 71.40 83.20 ----- 82.20 73.90 69.90 64.80 61.70
  9.  34.80 41.80 46.40 50.90 56.50 63.80 73.30 82.20 ----- 84.90 78.70 71.30 66.00
 10.  33.70 40.10 44.60 47.60 53.00 59.50 67.20 73.90 84.90 ----- 86.50 77.50 72.50
 11.  33.40 40.10 43.70 47.40 52.40 57.30 63.20 69.90 78.70 86.50 ----- 81.10 76.60
 12.  31.80 36.50 41.60 45.60 49.10 53.90 60.00 64.80 71.30 77.50 81.10 ----- 82.80
 13.  30.60 36.00 39.70 43.90 47.60 52.90 56.30 61.70 66.00 72.50 76.60 82.80 -----
Again, there does appear to be a trend, but not conclusively. In fact, the similarity between successive plies actually drops twice (depth 9 vs. 10 and depth 12 vs. 13) and then rises again.
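
To make the "successive plies" comparison explicit, here is a small Python sketch that walks along the first off-diagonal of the matrix above and reports where the similarity falls. The list is copied by hand from the table; the variable names are illustrative only.

```python
# Similarity (%) between each depth and the next one, read off the first
# off-diagonal of the matrix above. With 0-based index i, entry i compares
# depth i + 2 with depth i + 3 (key 1 in the table is depth 02).
successive = [59.50, 66.60, 71.20, 76.10, 76.90, 79.40,
              83.20, 82.20, 84.90, 86.50, 81.10, 82.80]
first_depth = 2

drops = []
for i in range(1, len(successive)):
    if successive[i] < successive[i - 1]:
        shallow = first_depth + i  # shallower depth of the pair
        drops.append((shallow, shallow + 1, successive[i]))

for shallow, deep, sim in drops:
    print(f"depth {shallow} vs {deep}: similarity falls to {sim:.2f}%")
```

With only 1000 positions per pairing, small drops like these may well be noise rather than a property of the search.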

If I had the resources, I would try even higher plies.

My speculation is that as the average depth searched becomes deeper and deeper, it is less likely in a game to find a better move than you could find searching one ply less. To me, that would seem to translate into less increase in Elo per doubling.

I do think I will revisit both of these types of measurements. Also, Miguel has suggested that I measure the increase in Elo when the number of nodes is doubled.
JuLieN
Posts: 2949
Joined: Mon May 05, 2008 12:16 pm
Location: Bordeaux (France)
Full name: Julien Marcel

Re: Elo Increase per Doubling

Post by JuLieN »

Thanks a lot, Adam! :) Exactly the data I was searching for!

I made a plot out of it:

Image
(With Delta Elo(x) = Elo(x)-Elo(x-1) )

So there IS a diminishing return when depth increases! Actually, data from plies 1-5 would be very interesting as well! :)
"The only good bug is a dead bug." (Don Dailey)
[Blog: http://tinyurl.com/predateur ] [Facebook: http://tinyurl.com/fbpredateur ] [MacEngines: http://tinyurl.com/macengines ]
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo Increase per Doubling

Post by Don »

Adam Hair wrote:
There seems to be a trend where the Elo delta decreases as the depth increases, but not conclusively.
Beyond the low depths the change from ply to ply is much more gradual, but it's there. Your "delta" data for Houdini goes like this:

163, 86, 92, 83, 72, 71, 58, 56, which is pretty conclusive: you go from 163 to 56. Beyond depth 20 you should expect the tapering to be more pronounced; in other words, very gradual and almost flat. If it were not, you would run out of ELO gains, which means the program would be playing almost perfect chess. I may just be seeing patterns that don't exist, but the decline seems to come in pairs. I don't know what the branching factor of Houdini is, but for most strong programs it's 2 or less.

I wonder if you could also estimate the strength of a program with only self-play by attaching a value to the ELO gain per doubling. Assuming all programs scale the same, I'll bet you would come within 200 ELO of the program's strength this way. For example, if you are gaining 50 ELO at some level, it might mean that at that level you are playing at 2800 strength. That's just an example, not an actual value. And this is just a hypothesis, not an assertion :-)

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Elo Increase per Doubling

Post by Laskos »

Adam, what are you using for 2^n*time versus 1*time at short time controls? I played a bit with fixed-ply games of n+1 versus n plies about half a year ago. My impression is that unrelated engines behave quite distinctively as depth increases. I thought of inventing a "mostly search" similarity tool :D (in contrast to the mostly-eval Sim03 tester, if I understood correctly), based on the shape of the Elo increase.

Here is the picture for Houdini 1.5 (if I found the correct file)

Image

For StockFish 2.1

Image


I had some results for IvanHoe, which was closer in its shape to Houdini than to StockFish.


The Branching Factor of Houdini 1.5 with respect to depth:

Image

So it seems that the delta Elo decreases with depth (with some odd-even bumps between plies), and with respect to the number of doublings in time it decreases even further, because the BF slowly increases.
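
The link between "per ply" and "per doubling" can be sketched with a simple model. The Python below assumes that the time to finish depth d grows like BF ** d, which is an idealisation and not something measured in this thread; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def plies_per_doubling(bf):
    """Extra plies gained by doubling the time, assuming time ~ bf ** depth."""
    return math.log(2) / math.log(bf)


def elo_per_doubling(elo_per_ply, bf):
    """Convert an Elo gain per ply into an Elo gain per time doubling."""
    return elo_per_ply * plies_per_doubling(bf)


# Hypothetical inputs: the same 70 Elo per ply at three branching factors.
for bf in (1.6, 2.0, 2.5):
    print(f"BF {bf}: {plies_per_doubling(bf):.2f} plies, "
          f"{elo_per_doubling(70.0, bf):.1f} Elo per doubling")
```

Under this model a BF of exactly 2 makes one doubling worth one ply, and any growth in the BF shrinks the gain per doubling even if the gain per ply stayed constant.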

Kai
Rebel
Posts: 6997
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Elo Increase per Doubling

Post by Rebel »

Adam Hair wrote: The next piece of data is a comparison of move selections made by Gaviota 0.84 when restricted to different depths. [...]
Here is another statistic that tells the same story. You will find it as SEARCH.TXT in any ProDeo folder. Massive changes during the first iterations, then stabilizing.

                                    SEARCH OVERVIEW        
                                    ===============        

Depth           Moves                    Moves                   Moves    
               Changed                  Changed                 Changed   
             Middle Game              Normal Endgame         Simple Endgame

 1     287628 - 239661 = 83.3%   182770 - 130099 = 71.2% 137389 - 85574 = 62.3%
 2     285662 - 128595 = 45.0%   180799 - 66267 = 36.7%  135944 - 43882 = 32.3%
 3     273200 - 93203 = 34.1%    172731 - 44025 = 25.5%  128720 - 33322 = 25.9%
 4     265436 - 71316 = 26.9%    168022 - 34396 = 20.5%  124031 - 21202 = 17.1%
 5     248266 - 69761 = 28.1%    165326 - 33659 = 20.4%  120118 - 23454 = 19.5%
 6     234111 - 62769 = 26.8%    164107 - 31491 = 19.2%  117806 - 20101 = 17.1%
 7     195376 - 46723 = 23.9%    163587 - 28854 = 17.6%  116897 - 18995 = 16.2%
 8     193120 - 41092 = 21.3%    161748 - 26394 = 16.3%  115869 - 17255 = 14.9%
 9     174273 - 30610 = 17.6%    147300 - 20968 = 14.2%  115092 - 15863 = 13.8%
10     139846 - 21779 = 15.6%    119788 - 13869 = 11.6%  113580 - 14486 = 12.8%
11     102808 - 15820 = 15.4%     94937 - 9607 = 10.1%    95777 - 10773 = 11.2%
12      72567 - 9435 = 13.0%      74857 - 6347 =  8.5%    81248 - 7541 =  9.3%
13      48367 - 5607 = 11.6%      57179 - 4111 =  7.2%    68112 - 5507 =  8.1%
14      28744 - 2615 =  9.1%      43487 - 2528 =  5.8%    55727 - 4074 =  7.3%
15      14545 - 923 =  6.3%       33337 - 1577 =  4.7%    44482 - 2939 =  6.6%
16       6744 - 226 =  3.4%       26180 - 729 =  2.8%     34434 - 1920 =  5.6%
17       3472 - 46 =  1.3%        21286 - 350 =  1.6%     26435 - 1241 =  4.7%
18       2494 - 6 =  0.2%         18288 - 133 =  0.7%     20672 - 776 =  3.8%
19       2196 - 2 =  0.0%         16875 - 55 =  0.3%      16191 - 392 =  2.4%
20       2090 - 0 =  0.0%         16214 - 18 =  0.1%      13191 - 233 =  1.8%
21       2045 - 0 =  0.0%         15871 - 7 =  0.0%       11327 - 157 =  1.4%
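
For anyone reading the table: each cell appears to have the form "total - changed = percentage", where the percentage is the second number divided by the first. That reading is an assumption based on the arithmetic, not on any ProDeo documentation. A short Python check on a few Middle Game rows:

```python
# (depth, first number, second number, reported %) copied from a few
# Middle Game cells of the table above.
rows = [
    (1, 287628, 239661, 83.3),
    (2, 285662, 128595, 45.0),
    (4, 265436, 71316, 26.9),
    (8, 193120, 41092, 21.3),
    (16, 6744, 226, 3.4),
]


def changed_pct(total, changed):
    """Percentage of 'changed' relative to 'total', to one decimal place."""
    return round(100.0 * changed / total, 1)


recomputed = [(depth, changed_pct(total, changed))
              for depth, total, changed, _ in rows]

for (depth, pct), (_, _, _, reported) in zip(recomputed, rows):
    print(f"depth {depth:2d}: recomputed {pct:5.1f}%  reported {reported:5.1f}%")
```

Note that the first number also shrinks with depth, so the deepest rows rest on far fewer samples than the shallow ones.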
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo Increase per Doubling

Post by Don »

Ed,

What is "moves changed"? Do you mean the move is changed in the final iteration?

Rebel
Posts: 6997
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Elo Increase per Doubling

Post by Rebel »

Adam Hair wrote:I do think I will revisit both of these types of measurements. Also, Miguel has suggested to me to measure the increase in Elo when the number of nodes is doubled.
Good idea, with just one caveat: time control is lost. But since that's true for both engines...

Is there no interface that allows each engine its own time control?
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Elo Increase per Doubling

Post by Don »

Rebel wrote:
Adam Hair wrote:I do think I will revisit both of these types of measurements. Also, Miguel has suggested to me to measure the increase in Elo when the number of nodes is doubled.
Good idea, with just one caveat: time control is lost. But since that's true for both engines...

Is there no interface that allows each engine its own time control?
I'm running a test of pure node doubling now. I need my computer for other things, so this will be just a few hundred games. Here is what I have so far (it's still running):

Rank Name    Elo      +      -    games   score   oppo.   draws 
   1 10    4607.7   96.1   96.1     169   94.1%  3851.3   11.8% 
   2 09    4498.6   84.1   84.1     169   87.9%  3861.6   17.2% 
   3 08    4400.2   81.3   81.3     170   80.0%  3875.8   12.9% 
   4 07    4270.4   79.3   79.3     170   70.9%  3888.8   15.9% 
   5 06    4135.0   81.4   81.4     170   61.8%  3902.4   14.1% 
   6 05    3969.2   84.4   84.4     170   51.2%  3918.9    5.9% 
   7 04    3757.1   84.5   84.5     170   37.4%  3940.1   11.2% 
   8 03    3654.1   87.0   87.0     170   30.3%  3950.5   11.2% 
   9 02    3543.2   89.7   89.7     170   23.5%  3961.5   10.6% 
  10 01    3323.0  108.9  108.9     170   12.1%  3983.6    5.3% 
  11 00    3000.0  179.5  179.5     170    1.5%  4015.9    0.6% 


      TIME       RATIO    log(r)     NODES    log(r)  ave DEPTH    GAMES   PLAYER
 ---------  ----------  --------  --------  --------  ---------  -------   --
    0.0022       1.000     0.000     0.001     0.000     3.1752      170   00
    0.0028       1.302     0.264     0.001     0.685     4.2193      170   01
    0.0046       2.122     0.752     0.002     1.378     5.3787      170   02
    0.0082       3.748     1.321     0.004     2.071     6.5317      170   03
    0.0158       7.263     1.983     0.008     2.765     7.6667      170   04
    0.0326      14.939     2.704     0.016     3.457     8.7110      170   05
    0.0577      26.458     3.276     0.033     4.151     9.7802      170   06
    0.1052      48.235     3.876     0.066     4.844    10.8375      170   07
    0.1926      88.292     4.481     0.131     5.537    11.8340      170   08
    0.3901     178.827     5.186     0.262     6.230    13.0136      169   09
    0.7313     335.232     5.815     0.524     6.923    14.0442      169   10
As Ed observes, you do lose time control, but it's the same with fixed depth. The average depth is interesting too: we get a little more than a ply for each doubling of nodes.

For reference, version 00 is 512 nodes and each subsequent level doubles that.
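
The "little more than a ply per doubling" figure can be checked from the table. Below is a minimal Python sketch using the "ave DEPTH" column and the 512-node starting point; the last step, which turns the average gain into an implied effective branching factor, assumes that nodes grow like EBF ** depth and is only a rough model.

```python
BASE_NODES = 512  # version 00, as stated above

# Average depth per level, copied from the "ave DEPTH" column above.
avg_depth = [3.1752, 4.2193, 5.3787, 6.5317, 7.6667, 8.7110,
             9.7802, 10.8375, 11.8340, 13.0136, 14.0442]

# Node budget for each level: 512, 1024, ..., doubling every level.
nodes = [BASE_NODES * 2 ** level for level in range(len(avg_depth))]

# Depth gained by each individual doubling, and the average over all of them.
depth_gain = [deep - shallow for shallow, deep in zip(avg_depth, avg_depth[1:])]
mean_gain = (avg_depth[-1] - avg_depth[0]) / (len(avg_depth) - 1)

# Assumed model: nodes ~ ebf ** depth, so one doubling buys
# log(2) / log(ebf) plies and therefore ebf = 2 ** (1 / mean_gain).
implied_ebf = 2 ** (1 / mean_gain)

print(f"nodes per level: {nodes[0]} ... {nodes[-1]}")
print(f"mean depth gain per doubling: {mean_gain:.4f} plies")
print(f"implied effective branching factor: {implied_ebf:.2f}")
```

On these numbers the average gain comes out at about 1.09 plies per doubling, which under the assumed model corresponds to an effective branching factor a little below 2.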

I'll report again later with the deltas and a graph.

Don