TalkChess.com

Posted: **Tue Jul 23, 2013 4:27 pm**

With Houdini 3 I played large number of games on 2x nodes versus 1x nodes per move level, or effective doubling time control:

Code: Select all

                                    w     l     d     Elo

1)  4k nodes vs 2k nodes          +3862 -352  =786   +303
2)  8k nodes vs 4k nodes          +3713 -374  =913   +280
3)  16k nodes vs 8k nodes         +3399 -436 =1165   +237
4)  32k nodes vs 16k nodes        +3151 -474 =1374   +208
5)  64k nodes vs 32k nodes        +2862 -494 =1641   +179
6)  128k nodes vs 64k nodes       +2613 -501 =1881   +156
7)  256k nodes vs 128k nodes       +942 -201  =855   +136
8)  512k nodes vs 256k nodes       +900 -166  =930   +134
9)  1024k nodes vs 512k nodes      +806 -167 =1026   +115
10) 2048k nodes vs 1024k nodes     +344  -83  =572    +93
11) 4096k nodes vs 2048k nodes     +307  -85  =607    +79

The gain from doubling the nodes I fitted with a/(b*x^c + 1), where x is the number of doublings, getting the correlation 0.99
The plot is here:

The 40/4', 40/40' and 40/120' CCRL and CEGT levels are shown, and the resulting gain from doubling in this extrapolation is ~70 points at 40/4', ~55 points at 40/40' and ~45 points at 40/120'. The limiting value I get by summing up to infinity over all doublings (infinite time control), and is 1707 points above the Houdini 3 40/40' CCRL level. So, I get 4877 Elo points on CCRL the rating of the perfect engine, similar to what I remember Don got some time ago.

The draw ratio I fitted with a shifted logistic, getting the correlation 0.999. In self play we can expect a very high percentage of draws going to very long time controls.
The plot is here:

The hardest to quantify to me was the win/loss ratio, which I somehow assumed to be constant at longer TC. It seems not to be the case, win/loss ratio seems to decrease with time control (or nodes). I fitted it with 1 + 1/(a*x + b), getting a correlation 0.96.
The plot is here:

Posted: **Tue Jul 23, 2013 5:46 pm**

Hello Kai:

Very interesting study, as usual.

Laskos wrote:The hardest to quantify to me was the win/loss ratio, which I somehow assumed to be constant at longer TC. It seems not to be the case, win/loss ratio seems to decrease with time control (or nodes). I fitted it with 1 + 1/(a*x + b), getting a correlation 0.96.

I see that the fit of draw ratio is superb and the fit of Elo gain is also good. You were a bit worried about the fit of wins/loses ratio. I suggest you to use both Elo gain and draw ratio equations in a try to do a better fit (I do not know if it will work):

Code: Select all

W = wins/games; D = draws/games; L = loses/games.
Elo gain = delta = 400*log{[1 + (W - L)]/[1 - (W - L)]}
W - L = [10^(delta/400) - 1]/[10^(delta/400) + 1]

W + D + L = 1 by definition; 1 - D = W + L.
W = [(W + L) + (W - L)]/2 = [1 - D + (W - L)]/2.
L = [(W + L) - (W - L)]/2 = [1 - D - (W - L)]/2.

W/L = [1 - D + (W - L)]/[1 - D - (W - L)] = {2*10^(delta/400) - [10^(delta/400) + 1]*D}/{2 - [10^(delta/400) + 1]*D}

For simplicity of read: y = 10^(delta/400).
W/L = [2y - (y + 1)*D]/[2 - (y + 1)*D]

For simplicity of read: u = (y + 1)*D.
W/L = (wins/games)/(loses/games) = wins/loses = (2y - u)/(2 - u)

I hope no typos. You can replace delta by a/(b*x^c + 1) and D by the shifted logistic. The new equation will not be nice but hopefully it will bring a better fit. Just my two cents.

Regards from Spain.

Ajedrecista.

Posted: **Tue Jul 23, 2013 6:52 pm**

Have you compared with Ken Thompson's and Berliner's results? Both saw something similar. Doubling anything at shallow depth represents a larger change than at deeper depths, since the tree is exponential in nature... There's always been a diminishing return aspect of such tests. But the key has been that while it continues to diminish, it doesn't go to zero for any practical speed...

Posted: **Tue Jul 23, 2013 10:51 pm**

Ajedrecista wrote:Hello Kai:

Very interesting study, as usual.

Laskos wrote:The hardest to quantify to me was the win/loss ratio, which I somehow assumed to be constant at longer TC. It seems not to be the case, win/loss ratio seems to decrease with time control (or nodes). I fitted it with 1 + 1/(a*x + b), getting a correlation 0.96.
I see that the fit of draw ratio is superb and the fit of Elo gain is also good. You were a bit worried about the fit of wins/loses ratio. I suggest you to use both Elo gain and draw ratio equations in a try to do a better fit (I do not know if it will work):
Code: Select all
W = wins/games; D = draws/games; L = loses/games.
Elo gain = delta = 400*log{[1 + (W - L)]/[1 - (W - L)]}
W - L = [10^(delta/400) - 1]/[10^(delta/400) + 1]

W + D + L = 1 by definition; 1 - D = W + L.
W = [(W + L) + (W - L)]/2 = [1 - D + (W - L)]/2.
L = [(W + L) - (W - L)]/2 = [1 - D - (W - L)]/2.

W/L = [1 - D + (W - L)]/[1 - D - (W - L)] = {2*10^(delta/400) - [10^(delta/400) + 1]*D}/{2 - [10^(delta/400) + 1]*D}

For simplicity of read: y = 10^(delta/400).
W/L = [2y - (y + 1)*D]/[2 - (y + 1)*D]

For simplicity of read: u = (y + 1)*D.
W/L = (wins/games)/(loses/games) = wins/loses = (2y - u)/(2 - u)
I hope no typos. You can replace delta by a/(b*x^c + 1) and D by the shifted logistic. The new equation will not be nice but hopefully it will bring a better fit. Just my two cents.

Regards from Spain.

Ajedrecista.

Hi Jesus, thanks, I used the simplest possible (and according to common sense and asymptotic values) expressions to fit the curves. I computed as you showed the formula for Wins/Losses with the two fittings for Elo advantage (delta) and draw ratio (D) which worked very well. The plot Wins/Losses with the number of doublings is here:

The correlation is now 0.95 compared to the old 0.96. So, one of my 3 previous fittings seems using a wrong function. Win/Loss ratio remains a bit of a mystery for longer time controls, but at least up to now little contradicts it decreases with longer TC.

Posted: **Tue Jul 23, 2013 11:11 pm**

bob wrote:Have you compared with Ken Thompson's and Berliner's results? Both saw something similar. Doubling anything at shallow depth represents a larger change than at deeper depths, since the tree is exponential in nature... There's always been a diminishing return aspect of such tests. But the key has been that while it continues to diminish, it doesn't go to zero for any practical speed...

The diminishing return was expected, I wanted to see the magnitude of this, and to get a clue what the doubling means on real-life rating lists like CCRL and CEGT. Also, at very long time controls little is known about the doubling, I seem to get 45 points on one modern core for 40/120' games, and 25 points for 40/1 week games. It's sure never zero, but decreases appreciably.

Posted: **Wed Jul 24, 2013 11:33 am**

Hello Kai:

Later that my post but before your reply I did some work on Excel and Derive: the numbers I got with what I wrote in the code box were identical to the true numbers, so my formula of wins/loses is correct.

I tried ZunZun to fit the first two curves but I did not find the equations you used (did you fitted the curves with MATLAB?), so I adjusted them with quadratic equations in Excel (I obtained similar R² values than you); later I used Derive for plot my formula (2y - u)/(2 - u) with the quadratic equations inside it. Then I calculated R² by its own definition of 1 - SS_res/SS_tot... I obtained something like R² ~ 0.951 (writing from memory), so it is almost the same as your report. The thing that does not like me is that this function has a minimum at 2.048MN vs. 1.024MN with the data provided. May I suggest to try increasing the number of games for reach 5000 games in each level? I guess that it is difficult, but it should stabilize the results and you will narrow error bars.

This morning, I observed a strange pattern from level 8 and a reason for it could be the 'short' number of games (you know, more randomness). This pattern is present in both Elo gain and wins/loses graphs... it looks like they are very correlated (I am not sure it is the correct word).

I tried to adjust the wins/loses numbers with 1 + 1/(ax + b) (just as you) and once again I did not find this equation at ZunZun (maybe I do not know how to search there), but this time I could use Excel with a little trick:

Code: Select all

1 + 1/(ax + b) = wins/loses; ax + b = loses/(wins - loses)
a ~ 0.024913; b ~ 0.072522

I plotted 1 + 1/(0.024913x + 0.072522) in Derive together with the original data and it seems that the coefficients are correct (other than roundings). Using the definition of R² again in Excel, I have obtained R² ~ 0.967779, which should be the same or almost the same for you. This last point only pretends to confirm that the adjusted line you plotted was 1 + 1/(0.024913x + 0.072522) more less.

Please keep up your good work!

Regards from Spain.

Ajedrecista.

Posted: **Wed Jul 24, 2013 1:53 pm**

You can enter user-defined functions at Zunzun.

Posted: **Thu Jul 25, 2013 2:16 am**

Hi Jesus,
I added on more point to the data, the most difficult one, 8192k nodes vs 4096k nodes per move, already a real blitz time control, not just bullet or ultra-short. It took one day and a half to have 1,000 games. I don't think the number of games is too small to draw conclusions, and I sure cannot have them all at 5,000 games, only the faster ones.

Code: Select all

1)  4k nodes vs 2k nodes          +3862 -352  =786   +303
2)  8k nodes vs 4k nodes          +3713 -374  =913   +280
3)  16k nodes vs 8k nodes         +3399 -436 =1165   +237
4)  32k nodes vs 16k nodes        +3151 -474 =1374   +208
5)  64k nodes vs 32k nodes        +2862 -494 =1641   +179
6)  128k nodes vs 64k nodes       +2613 -501 =1881   +156
7)  256k nodes vs 128k nodes       +942 -201  =855   +136
8)  512k nodes vs 256k nodes       +900 -166  =930   +134
9)  1024k nodes vs 512k nodes      +806 -167 =1026   +115
10) 2048k nodes vs 1024k nodes     +344  -83  =572    +93
11) 4096k nodes vs 2048k nodes     +307  -85  =607    +79
12) 8192k nodes vs 4096k nodes     +290  -70  =640    +78

It changed very little the overall picture, showing the consistency of the results. From the fitting, the doubling at 40/4' CCRL 1 core gives ~70 points,
40/40' gives 55 points, and 40/120' gives 45 points. The limiting value for perfect engine is 4870 Elo points on CCRL. Very little changed adding this additional (12th) doubling.

Gain in Elo points from doubling fitted with a/(b*x^c + 1) is shown here:

Draw ratio fitted with a shifted logistic is shown here:

Now the ambiguous Win/Loss ratio. I fitted with a bit broader function {c + 1/(a*x + b)}, instead of {1+1/(a*x + b)}. The fitting gave c>1, so the ratio Win/Loss decreases slowly to a value of 1.27 at infinity.

Here is your interpolation with Win/Loss from the draw ratio and Elo gain per doubling.

It seems to stay pretty constant at longer TC. So, it's still a bit of mystery whether the Win/Loss ratio slowly decreases or remains pretty constant at longer time controls.

Posted: **Thu Jul 25, 2013 11:43 am**

I also derived the rule-of-thumb formula for gain in Elo points from doubling nodes (or time):

N = number of nodes per move
Elo gain for doubling is ~ 18100/{log(N)}^2

Dealing with Houdini 3 only, it can be said that Houdini at 2x nodes scales the same as Houdini at 1x nodes, and this means that one can plot Elo difference between equally scaling engines.

Here is the strength as a number of doublings, and as it is obvious, equally scaling engines should approach in strength with the number of doublings, or time control.

Posted: **Thu Jul 25, 2013 1:08 pm**

Hello Kai:

Laskos wrote:I also derived the rule-of-thumb formula for gain in Elo points from doubling nodes (or time):

N = number of nodes per move
Elo gain for doubling is ~ 18100/{log(N)}^2

Dealing with Houdini 3 only, it can be said that Houdini at 2x nodes scales the same as Houdini at 1x nodes, and this means that one can plot Elo difference between equally scaling engines.

Here is the strength as a number of doublings, and as it is obvious, equally scaling engines should approach in strength with the number of doublings, or time control.

Thank you very much for your effort!

I have applied your formula of 18100/[ln(nodes per move)]² for each doubling. I have two doubts:

a) In a level of play (for example, level 1 ---> 4000 vs. 2000) I do not know how to interpret your formula: given the number of nodes N, your formula must be applied in a match of (2N vs. N) or in a match of (N vs. N/2)?

b) In other post, you stated that given Houdini 3 rating at CCRL 40/40' (around level 16 in your nomenclature), then the perfect rating should be around 4870 Elo, that is 1700 Elo better.

I added the first 50 million of terms starting the number of nodes at 65.536 MN = 6.5536e+7 nodes and 131.072 MN = 1.31072e+8 nodes (due to the doubt explained in a)) corresponding to level 16 if I am not wrong. I used the following code in Fortran 95:

Code: Select all

program limit
implicit none
integer :: n
integer, parameter :: max = 50000000  ! max = 50 million.
real(KIND=3) :: delta(max), ln_nodes(max), sum_
do n = 1, max
  ln_nodes(n) = log(6.5536d7) + (n + 0d0)*log(2d0)  ! nodes = (6.5536e+7)*2^n
  delta(n) = 1.81d4/(ln_nodes(n)*ln_nodes(n))
end do
sum_ = sum(delta)
print*, sum_
write(*,*) 3.17d3 + sum_
end program limit

Of course, at line 7 I used both 6.5536d7 and 1.31072d8. In the first case I more less obtained 3170 + 1423.28 = 4593.28 Elo, while in the second case I more less obtained 3170 + 1371.47 = 4541.47 Elo. My results are very far from yours...

I know that a rule of thumb may not be exact but I also do not know if I am doing anything wrong. I guess that a/(b*x^c + 1) and 18100/[ln(nodes per move)]² clearly differ at very long TC.

Regards from Spain.

Ajedrecista.

TalkChess.com

Scaling at 2x nodes (or doubling time control).

Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).

Re: Scaling at 2x nodes (or doubling time control).