Scaling at 2x nodes (or doubling time control).

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
User avatar
Laskos
Posts: 9410
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Scaling at 2x nodes (or doubling time control).

Post by Laskos » Thu Jul 25, 2013 12:58 pm

Ajedrecista wrote:Hello Kai:
Laskos wrote:I also derived the rule-of-thumb formula for gain in Elo points from doubling nodes (or time):

N = number of nodes per move
Elo gain for doubling is ~ 18100/{log(N)}^2


Dealing with Houdini 3 only, it can be said that Houdini at 2x nodes scales the same as Houdini at 1x nodes, and this means that one can plot Elo difference between equally scaling engines.

Here is the strength as a number of doublings, and as it is obvious, equally scaling engines should approach in strength with the number of doublings, or time control.

Image
Thank you very much for your effort!

I have applied your formula of 18100/[ln(nodes per move)]² for each doubling. I have two doubts:

a) In a level of play (for example, level 1 ---> 4000 vs. 2000) I do not know how to interpret your formula: given the number of nodes N, your formula must be applied in a match of (2N vs. N) or in a match of (N vs. N/2)?
I use 2N vs N, in 40/4' case 8' vs 4'.

b) In other post, you stated that given Houdini 3 rating at CCRL 40/40' (around level 16 in your nomenclature), then the perfect rating should be around 4870 Elo, that is 1700 Elo better.

I added the first 50 million of terms starting the number of nodes at 65.536 MN = 6.5536e+7 nodes and 131.072 MN = 1.31072e+8 nodes (due to the doubt explained in a)) corresponding to level 16 if I am not wrong. I used the following code in Fortran 95:

Code: Select all

program limit
implicit none
integer :: n
integer, parameter :: max = 50000000  ! max = 50 million.
real(KIND=3) :: delta(max), ln_nodes(max), sum_
do n = 1, max
  ln_nodes(n) = log(6.5536d7) + (n + 0d0)*log(2d0)  ! nodes = (6.5536e+7)*2^n
  delta(n) = 1.81d4/(ln_nodes(n)*ln_nodes(n))
end do
sum_ = sum(delta)
print*, sum_
write(*,*) 3.17d3 + sum_
end program limit
Of course, at line 7 I used both 6.5536d7 and 1.31072d8. In the first case I more less obtained 3170 + 1423.28 = 4593.28 Elo, while in the second case I more less obtained 3170 + 1371.47 = 4541.47 Elo. My results are very far from yours... :(

I know that a rule of thumb may not be exact but I also do not know if I am doing anything wrong. I guess that a/(b*x^c + 1) and 18100/[ln(nodes per move)]² clearly differ at very long TC.

Regards from Spain.

Ajedrecista.
You are doing nothing wrong, the rule of thumb formula does not behave very well for exceedingly small and exceedingly large values (say below 10^3 nodes and above 10^11 nodes), so summing it up to infinity is not the way to calculate the limiting value. The exponent is not exactly 2, and 18100 therefore is not exact. The true fitted function is (as a number of doublings, not nodes)
323.437/(0.0601579*n^1.59204 + 1)
where n is the number of doublings from 2k nodes. n=16 is more or less CCRL 40/40' level of 1 core 3170 points Houdini 3. I summed up this expression from 16 to infinity to get 1700 Elo points above this 3170 CCRL rating, i.e. 4870 Elo on CCRL for perfect engine.

User avatar
Laskos
Posts: 9410
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Scaling at 2x nodes (or doubling time control).

Post by Laskos » Thu Jul 25, 2013 5:34 pm

Returning to the delicate issue, the scaling Wins/Losses, I got the correlation coefficients for the fitted and interpolated curves.
The fit is with function {c + 1/(a*x + b)}. The fitting gave c=1.27, so the ratio Wins/Losses decreases slowly to a value of 1.27 at infinity. I want to use the simplest expression for the fitting, given some common sense asymptotic values, I could certainly introduce more variables to fit better, but I am not interested in artificial overfitting with weird functions.
Image
Correlation = 0.98385

With your interpolation Wins/Losses from the draw ratio and Elo gain per doubling. This is not a fit.
Image
Correlation = 0.98557

Your interpolation gives a bit higher correlation, but not significantly. What can be said is the that ratio Wins/Losses doesn't vary very significantly over a large span (from game in one minute to game in one day). Maybe it decreases a bit with time, but it's speculation.

User avatar
Laskos
Posts: 9410
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Scaling at 2x nodes (or doubling time control).

Post by Laskos » Mon Jul 29, 2013 8:15 pm

Final point, 13th doubling, 16384k nodes vs 8192k nodes per move, equivalent to CCRL and CEGT 40/8' vs 40/4'. It took 3 days to get the result.

Code: Select all

                                    w     l     d     Elo

1)  4k nodes vs 2k nodes          +3862 -352  =786   +303
2)  8k nodes vs 4k nodes          +3713 -374  =913   +280
3)  16k nodes vs 8k nodes         +3399 -436 =1165   +237
4)  32k nodes vs 16k nodes        +3151 -474 =1374   +208
5)  64k nodes vs 32k nodes        +2862 -494 =1641   +179
6)  128k nodes vs 64k nodes       +2613 -501 =1881   +156
7)  256k nodes vs 128k nodes       +942 -201  =855   +136
8)  512k nodes vs 256k nodes       +900 -166  =930   +134
9)  1024k nodes vs 512k nodes      +806 -167 =1026   +115
10) 2048k nodes vs 1024k nodes     +344  -83  =572    +93
11) 4096k nodes vs 2048k nodes     +307  -85  =607    +79
12) 8192k nodes vs 4096k nodes     +290  -70  =640    +78
13) 16384k nodes vs 8192k nodes    +262  -68  =670    +68

The gain from doubling the nodes I fitted with a/(b*x^c + 1), where x is the number of doublings. The plot is here:
Image

The 40/4', 40/40' and 40/120' CCRL and CEGT levels are shown, and the resulting gain from doubling in this extrapolation is ~70 points at 40/4' on 1 core to ~35 points at 40/120' on 8 cores. The limiting value I get is summing to infinity over all doublings (infinite time control), and is 1640 points above the Houdini 3 40/40' CCRL level. So, I get 4810 Elo points on CCRL the rating of the perfect engine.


The draw ratio I fitted with a shifted logistic.
The plot is here:
Image

The intricate wins/losses ratio, which I somehow assumed for a long time to be constant at longer TC. I fitted it with a + 1/(b*x^c + d), and it seems to consolidate my view.
The plot is here:
Image

And the interpolated from Draw Ratio and Elo Gain is here:
Image

Both show a pretty constant Wins/Losses ratio at longer time controls.

duncan
Posts: 10202
Joined: Mon Jul 07, 2008 8:50 pm

Re: Scaling at 2x nodes (or doubling time control).

Post by duncan » Thu Aug 06, 2015 10:42 am

Laskos wrote:
It changed very little the overall picture, showing the consistency of the results. From the fitting, the doubling at 40/4' CCRL 1 core gives ~70 points,
40/40' gives 55 points, and 40/120' gives 45 points. The limiting value for perfect engine is 4870 Elo points on CCRL. Very little changed adding this additional (12th) doubling.
so what can a perfect engine give away and still convincingly defeat a 2800 gm, a rook ? rook and pawn ?

User avatar
Laskos
Posts: 9410
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Scaling at 2x nodes (or doubling time control).

Post by Laskos » Thu Aug 06, 2015 11:17 am

duncan wrote:
Laskos wrote:
It changed very little the overall picture, showing the consistency of the results. From the fitting, the doubling at 40/4' CCRL 1 core gives ~70 points,
40/40' gives 55 points, and 40/120' gives 45 points. The limiting value for perfect engine is 4870 Elo points on CCRL. Very little changed adding this additional (12th) doubling.
so what can a perfect engine give away and still convincingly defeat a 2800 gm, a rook ? rook and pawn ?
Not even a knight. These things are not linear, against a 1800 ELO player, a 2700 player can give a rook, so a rook is worth 900 points in this case. But is seems 3200 Komodo cannot even give a knight to a 2200 FM, the larger 1000 points difference. My guess, a perfect engine playing anti-human chess can give some 2 not very important pawns to a 2800 GM for a balanced match, but it's just a guess.

Uri Blass
Posts: 8554
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: Scaling at 2x nodes (or doubling time control).

Post by Uri Blass » Thu Aug 06, 2015 4:04 pm

Laskos wrote:
Ajedrecista wrote:Hello Kai:
Laskos wrote:I also derived the rule-of-thumb formula for gain in Elo points from doubling nodes (or time):

N = number of nodes per move
Elo gain for doubling is ~ 18100/{log(N)}^2


Dealing with Houdini 3 only, it can be said that Houdini at 2x nodes scales the same as Houdini at 1x nodes, and this means that one can plot Elo difference between equally scaling engines.

Here is the strength as a number of doublings, and as it is obvious, equally scaling engines should approach in strength with the number of doublings, or time control.

Image
Thank you very much for your effort!

I have applied your formula of 18100/[ln(nodes per move)]² for each doubling. I have two doubts:

a) In a level of play (for example, level 1 ---> 4000 vs. 2000) I do not know how to interpret your formula: given the number of nodes N, your formula must be applied in a match of (2N vs. N) or in a match of (N vs. N/2)?
I use 2N vs N, in 40/4' case 8' vs 4'.

b) In other post, you stated that given Houdini 3 rating at CCRL 40/40' (around level 16 in your nomenclature), then the perfect rating should be around 4870 Elo, that is 1700 Elo better.

I added the first 50 million of terms starting the number of nodes at 65.536 MN = 6.5536e+7 nodes and 131.072 MN = 1.31072e+8 nodes (due to the doubt explained in a)) corresponding to level 16 if I am not wrong. I used the following code in Fortran 95:

Code: Select all

program limit
implicit none
integer :: n
integer, parameter :: max = 50000000  ! max = 50 million.
real(KIND=3) :: delta(max), ln_nodes(max), sum_
do n = 1, max
  ln_nodes(n) = log(6.5536d7) + (n + 0d0)*log(2d0)  ! nodes = (6.5536e+7)*2^n
  delta(n) = 1.81d4/(ln_nodes(n)*ln_nodes(n))
end do
sum_ = sum(delta)
print*, sum_
write(*,*) 3.17d3 + sum_
end program limit
Of course, at line 7 I used both 6.5536d7 and 1.31072d8. In the first case I more less obtained 3170 + 1423.28 = 4593.28 Elo, while in the second case I more less obtained 3170 + 1371.47 = 4541.47 Elo. My results are very far from yours... :(

I know that a rule of thumb may not be exact but I also do not know if I am doing anything wrong. I guess that a/(b*x^c + 1) and 18100/[ln(nodes per move)]² clearly differ at very long TC.

Regards from Spain.

Ajedrecista.
You are doing nothing wrong, the rule of thumb formula does not behave very well for exceedingly small and exceedingly large values (say below 10^3 nodes and above 10^11 nodes), so summing it up to infinity is not the way to calculate the limiting value. The exponent is not exactly 2, and 18100 therefore is not exact. The true fitted function is (as a number of doublings, not nodes)
323.437/(0.0601579*n^1.59204 + 1)
where n is the number of doublings from 2k nodes. n=16 is more or less CCRL 40/40' level of 1 core 3170 points Houdini 3. I summed up this expression from 16 to infinity to get 1700 Elo points above this 3170 CCRL rating, i.e. 4870 Elo on CCRL for perfect engine.
I think elo is dependent on the question who play against who.

My guess is that if you play n nodes only against 4n nodes and not against 2n nodes then maybe you will get a smaller result
and if you play n nodes only against n*(2^0.5) nodes then you will get a higher result.

If I am right then 2n against n inflate the perfect rating considering the fact that the difference at high level between n and 2n is very small and it is better to use not n against 2n but n against 3n or n against 4n at higher levels to have the same difference between every pair of players and not to have very small difference when you get closer to perfect.

User avatar
Laskos
Posts: 9410
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Scaling at 2x nodes (or doubling time control).

Post by Laskos » Thu Aug 06, 2015 4:39 pm

Uri Blass wrote:
I think elo is dependent on the question who play against who.

My guess is that if you play n nodes only against 4n nodes and not against 2n nodes then maybe you will get a smaller result
and if you play n nodes only against n*(2^0.5) nodes then you will get a higher result.

If I am right then 2n against n inflate the perfect rating considering the fact that the difference at high level between n and 2n is very small and it is better to use not n against 2n but n against 3n or n against 4n at higher levels to have the same difference between every pair of players and not to have very small difference when you get closer to perfect.
You mean non-transitivity of the ELO ratings?
Isn't it opposite, 4n against n will get a bit larger ELO that two times 2n versus n? In case the logistic is incorrect, and Gaussian fits better on large ELO spans. It's perfectly possible, even probable that 16n or 32n versus n during the first doublings will get a significantly higher ELO difference than 8 or 16 doublings. The problem is about large ELO spans, and 2n versus n has no such ELO differences where the ELO is non-transitive by more than 1-2%. I think (2^0.5) * n and 2^1 * n has very tiny differences. And it's important to keep differences small, where logistic or whatever is pretty much irrelevant (the rating is basically linear with irrelevant coefficient of proportionality, in case of the usual logistic or Gaussian it's something like 700*(score-0.5)). So, 2n versus n and smaller than 2 factors will be all very close, and it should be irrelevant, if the ELO difference is small each new result, whether we add a diminishing small ELO difference or a constant small ELO difference.

Gurcan Uckardes
Posts: 196
Joined: Tue Oct 28, 2014 11:42 pm
Contact:

Re: Scaling at 2x nodes (or doubling time control).

Post by Gurcan Uckardes » Thu Aug 06, 2015 5:23 pm

Hello Kai. Yummy studies! Keep up with the subject deeper...
I suggest you cross check the findings with Komodo and Stockfish too depending on your spare time.
My blog for Android users: http://chesstroid.blogspot.com

Post Reply