Math question

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

jdart
Posts: 3842
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Math question

Post by jdart » Wed Jul 17, 2019 1:57 pm

Zenmastur wrote:
Wed Jul 17, 2019 7:34 am

Well, if you're keeping the book statistics up to date as new games are played then the highest scoring move should be played next. But this isn't the best move AND it's not meant to be the best move. There are two considerations involved: one is the best scoring move and the other is the move that the book will learn the most information from. These are balanced, or can be balanced, depending on the type of play. If you are using the book in a testing framework then I would think the balance should be in favor of book learning speed.
The multi-armed bandit formulation of the problem assumes there is a cost to exploring suboptimal moves. If you are doing offline training then there isn't really a cost in terms of bad outcomes (you don't care if you lose games), but there is still a cost in terms of training time. You don't want to spend more time than necessary looking at suboptimal parts of the search tree.

Thompson sampling tries to balance exploitation (choosing so-far best moves) and exploration (looking at others) so that you efficiently converge to finding the move with the best outcome. Over time it does less exploration as more data is available. There are some mathematical proofs of efficiency.
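As a concrete illustration (my own minimal sketch, not Arasan's actual code; the draws-as-half-points treatment is an assumption), Thompson sampling over book statistics could look like this in Python:

```python
import random

def thompson_pick(stats):
    """Pick a book move by Thompson sampling.

    stats maps move -> (wins, draws, losses). Each move's score rate
    gets a Beta posterior; we draw one sample per move and play the
    move with the highest sample. Draws are counted as half a win
    and half a loss, which is only one possible treatment.
    """
    best_move, best_sample = None, -1.0
    for move, (w, d, l) in stats.items():
        # Beta(1, 1) uniform prior plus observed (pseudo-)counts.
        sample = random.betavariate(w + 0.5 * d + 1, l + 0.5 * d + 1)
        if sample > best_sample:
            best_move, best_sample = move, sample
    return best_move

book = {"e2e4": (1833, 3415, 1208), "g1f3": (567, 932, 400)}
move = thompson_pick(book)  # e2e4 more often, but g1f3 still gets explored
```

As more games accumulate, the Beta posteriors narrow and the sampling naturally does less exploration, which is the convergence behaviour described above.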

It is not the only algorithm for this problem though. See https://lilianweng.github.io/lil-log/20 ... tions.html for an outline of some others.

In addition, though, I will mention that in my experience book learning even with a good algorithm is not as efficient as you would like, especially if you are using self-play for training. The reason is that the draw rate with self-play is very high and so bad moves are not "punished" enough. With large numbers of training games it could probably still work, but it will converge slowly.

--Jon

Ajedrecista
Posts: 1401
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.

Re: Math question.

Post by Ajedrecista » Wed Jul 17, 2019 7:22 pm

Hello Ed:

I worked a little on a method of my own, which could be unsound but gives some results that may suit your needs:

Code: Select all

For each i-th opening move:
n_i = w_i + d_i + l_i
µ_i = (w_i + 0.5*d_i)/(w_i + d_i + l_i)

q_i = µ_i * (n_i)^a

Weight_i = q_i / SUM(q_i)          // SUM(Weight_i) = 100%
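A direct Python transcription of this pseudocode (a rough sketch; the function name is mine):

```python
def book_weights(stats, a=0.5):
    """Normalized weights: q_i = mu_i * n_i**a, Weight_i = q_i / SUM(q_i).

    stats maps move -> (wins, draws, losses); a is the tunable exponent.
    """
    q = {}
    for move, (w, d, l) in stats.items():
        n = w + d + l
        mu = (w + 0.5 * d) / n      # score percentage of the move
        q[move] = mu * n ** a       # popularity factor n**a
    total = sum(q.values())
    return {move: qi / total for move, qi in q.items()}
```

With a = 1 the weights reduce to plain score-weighted popularity, and with a = 0 only the score percentage matters.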
Using your input values with different values of a, I get the following results with the help of Excel:

Code: Select all

Opening Book : books\elo2700.bin
Positions    : 716.101

Book Weight Score Depth Learn   Total     Won    Draw    Loss     Perc      WD      WDL
e2e4  41.93%     0    0     0    6456    1833    3415    1208    54.84%   41.92%   30.27%
d2d4  37.19%     0    0     0    5727    1607    3067    1053    54.83%   37.19%   27.09%
g1f3  12.23%     0    0     0    1899     567     932     400    54.39%   12.23%    8.22%
c2c4   7.00%     0    0     0    1092     314     555     223    54.16%    7.00%    4.78%
b2b3   0.78%     0    0     0     120      45      40      35    54.16%    0.77%    0.39%
g2g3   0.56%     0    0     0      72      35      25      12    65.97%    0.56%    0.46%
f2f4   0.11%     0    0     0      14       7       5       2    67.85%    0.11%    0.09%
b1c3   0.10%     0    0     0      19       5       6       8    42.10%    0.09%    0.00%
d2d3   0.03%     0    0     0       3       3       0       0   100.00%    0.03%    0.03%
b2b4   0.02%     0    0     0       3       2       0       1    66.66%    0.02%    0.01%
a2a3   0.01%     0    0     0       3       1       1       1    50.00%    0.01%    0.00%
a2a4   0.01%     0    0     0       1       1       0       0   100.00%    0.01%    0.01%
Totals                          15409    4420    8046    2943

Code: Select all

       a = -10.0    a = -5.0    a = -2.0    a = -1.0
Move     Weight      Weight      Weight      Weight
====================================================
e2e4      0.00%       0.00%       0.00%       0.00%
d2d4      0.00%       0.00%       0.00%       0.01%
g1f3      0.00%       0.00%       0.00%       0.02%
c2c4      0.00%       0.00%       0.00%       0.03%
b2b3      0.00%       0.00%       0.00%       0.25%
g2g3      0.00%       0.00%       0.01%       0.51%
f2f4      0.00%       0.00%       0.28%       2.68%
b1c3      0.00%       0.00%       0.09%       1.23%
d2d3      0.00%       0.41%       8.92%      18.44%
b2b4      0.00%       0.27%       5.95%      12.29%
a2a3      0.00%       0.20%       4.46%       9.22%
a2a4    100.00%      99.12%      80.29%      55.33%
---------------------------------------------------
SUM     100.00%     100.00%     100.00%     100.00%

Code: Select all

       a = -0.5    a = 0.0    a = 0.1    a = 0.2
Move    Weight      Weight     Weight     Weight
================================================
e2e4     0.25%       7.17%     11.35%     16.26%
d2d4     0.27%       7.17%     11.21%     15.88%
g1f3     0.46%       7.11%      9.96%     12.63%
c2c4     0.61%       7.08%      9.38%     11.26%
b2b3     1.83%       7.08%      7.52%      7.24%
g2g3     2.88%       8.62%      8.71%      7.96%
f2f4     6.72%       8.87%      7.60%      5.90%
b1c3     3.58%       5.50%      4.86%      3.89%
d2d3    21.39%      13.07%      9.60%      6.39%
b2b4    14.26%       8.71%      6.40%      4.26%
a2a3    10.70%       6.54%      4.80%      3.20%
a2a4    37.05%      13.07%      8.60%      5.13%
------------------------------------------------
SUM    100.00%     100.00%    100.00%    100.00%

Code: Select all

       a = 0.3    a = 0.5    a = 0.7    a = 1.0
Move    Weight     Weight     Weight     Weight
===============================================
e2e4    21.28%     29.81%     35.84%     41.93%
d2d4    20.53%     28.07%     32.96%     37.20%
g1f3    14.62%     16.04%     15.10%     12.23%
c2c4    12.33%     12.11%     10.21%      7.01%
b2b3     6.36%      4.01%      2.18%      0.77%
g2g3     6.64%      3.79%      1.85%      0.56%
f2f4     4.18%      1.72%      0.61%      0.11%
b1c3     2.84%      1.24%      0.47%      0.09%
d2d3     3.88%      1.17%      0.30%      0.04%
b2b4     2.59%      0.78%      0.20%      0.02%
a2a3     1.94%      0.59%      0.15%      0.02%
a2a4     2.79%      0.68%      0.14%      0.01%
-----------------------------------------------
SUM    100.00%    100.00%    100.00%    100.00%

Code: Select all

       a = 2.0    a = 10.0    a = 50.0    a = 80.0
Move    Weight     Weight      Weight      Weight
==================================================
e2e4    52.59%     76.82%      99.75%      99.99%
d2d4    41.38%     23.18%       0.25%       0.01%
g1f3     4.51%      0.00%       0.00%       0.00%
c2c4     1.49%      0.00%       0.00%       0.00%
b2b3     0.02%      0.00%       0.00%       0.00%
g2g3     0.01%      0.00%       0.00%       0.00%
f2f4     0.00%      0.00%       0.00%       0.00%
b1c3     0.00%      0.00%       0.00%       0.00%
d2d3     0.00%      0.00%       0.00%       0.00%
b2b4     0.00%      0.00%       0.00%       0.00%
a2a3     0.00%      0.00%       0.00%       0.00%
a2a4     0.00%      0.00%       0.00%       0.00%
--------------------------------------------------
SUM    100.00%    100.00%     100.00%     100.00%
As you can see, a = 1 gives your original weights. a <= 0 gives weird weights (like giveaway chess, where worst is best) and a -> +infinity rewards the most played move regardless of its score.

I think that a = 0.5 gives suitable weights for your needs:

Code: Select all

a = 0.5

Move    Weight    Cumulative
============================
e2e4    29.81%      29.81%
d2d4    28.07%      57.88%
g1f3    16.04%      73.92%
c2c4    12.11%      86.02%
b2b3     4.01%      90.04%
g2g3     3.79%      93.83%
f2f4     1.72%      95.54%
b1c3     1.24%      96.78%
d2d3     1.17%      97.96%
b2b4     0.78%      98.74%
a2a3     0.59%      99.32%
a2a4     0.68%     100.00%
The 'Big Four' are selected over 86% of the time and Nf3 + c4 are selected over 28% of the time. OTOH, outsider moves are selected circa 1% of the time.
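Given such a table, actually choosing the move at game time is a single weighted draw; a Python sketch (weights truncated to the 'Big Four', with the remaining mass lumped into a hypothetical "other" entry purely for brevity):

```python
import random

# a = 0.5 weights from the table above; the eight minor moves are
# combined into "other" here only to keep the example short.
weights = {"e2e4": 29.81, "d2d4": 28.07, "g1f3": 16.04, "c2c4": 12.11,
           "other": 13.97}

# random.choices samples proportionally to the weights and builds the
# cumulative table internally, so the Cumulative column is implicit.
move = random.choices(list(weights), weights=list(weights.values()))[0]
```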

Probably a = 0.7 or nearby values are worth a try:

Code: Select all

a = 0.7

Move    Weight    Cumulative
============================
e2e4    35.84%      35.84%
d2d4    32.96%      68.80%
g1f3    15.10%      83.90%
c2c4    10.21%      94.10%
b2b3     2.18%      96.28%
g2g3     1.85%      98.13%
f2f4     0.61%      98.74%
b1c3     0.47%      99.20%
d2d3     0.30%      99.51%
b2b4     0.20%      99.71%
a2a3     0.15%      99.86%
a2a4     0.14%     100.00%
The 'Big Four' are selected over 94% of the time and Nf3 + c4 are selected over 25% of the time. Outsider moves will be selected less than 1% of the time with your data.

Good luck!

Regards from Spain.

Ajedrecista.

Zenmastur
Posts: 512
Joined: Sat May 31, 2014 6:28 am

Re: Math question

Post by Zenmastur » Thu Jul 18, 2019 9:03 am

Rebel wrote:
Wed Jul 17, 2019 9:42 am
Zenmastur wrote:
Wed Jul 17, 2019 7:34 am
Which might look some thing like this:

Move ranking Score = (Average game score for this move) + SQRT(Constant1*LOG(Total number of games played from this position)/number of times this move has been played).
Looks like something to try; what initial number to feed "Constant1" with?
2 is a good starting place, I guess. I haven't had time to find my spreadsheet and notes yet. But if you're interested I'll post the formulas I have when I find them.
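The formula quoted above translates almost literally into code (a sketch; Constant1 defaults to the suggested 2, and a natural log is assumed, as is usual for UCB-style bounds):

```python
import math

def move_rank_score(avg_score, total_games, move_games, c1=2.0):
    """UCB-style ranking: average game score plus an exploration bonus
    that shrinks as a move accumulates games relative to the position
    total, so rarely tried moves bubble up for exploration."""
    return avg_score + math.sqrt(c1 * math.log(total_games) / move_games)
```

Among moves with equal average scores, the least played one ranks highest, which is exactly the exploration behaviour being discussed.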

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.

Ajedrecista
Posts: 1401
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.

Re: Math question.

Post by Ajedrecista » Thu Jul 18, 2019 6:10 pm

Hello again:

Taking the same idea as in my previous post, but using the lower bound of the Wilson score interval of µ (with or without continuity correction) instead of µ, I obtain the following weights with your data:

Code: Select all

Opening Book : books\elo2700.bin
Positions    : 716.101

Book Weight Score Depth Learn   Total     Won    Draw    Loss     Perc      WD      WDL
e2e4  41.93%     0    0     0    6456    1833    3415    1208    54.84%   41.92%   30.27%
d2d4  37.19%     0    0     0    5727    1607    3067    1053    54.83%   37.19%   27.09%
g1f3  12.23%     0    0     0    1899     567     932     400    54.39%   12.23%    8.22%
c2c4   7.00%     0    0     0    1092     314     555     223    54.16%    7.00%    4.78%
b2b3   0.78%     0    0     0     120      45      40      35    54.16%    0.77%    0.39%
g2g3   0.56%     0    0     0      72      35      25      12    65.97%    0.56%    0.46%
f2f4   0.11%     0    0     0      14       7       5       2    67.85%    0.11%    0.09%
b1c3   0.10%     0    0     0      19       5       6       8    42.10%    0.09%    0.00%
d2d3   0.03%     0    0     0       3       3       0       0   100.00%    0.03%    0.03%
b2b4   0.02%     0    0     0       3       2       0       1    66.66%    0.02%    0.01%
a2a3   0.01%     0    0     0       3       1       1       1    50.00%    0.01%    0.00%
a2a4   0.01%     0    0     0       1       1       0       0   100.00%    0.01%    0.01%
Totals                          15409    4420    8046    2943

Code: Select all

Using the lower bound of Wilson score interval (95% confidence) of µ WITHOUT continuity correction:
(a = 0.7)

Move    Weight    Cumulative
============================
e2e4    36.67%      36.67%
d2d4    33.67%      70.33%
g1f3    15.14%      85.47%
c2c4    10.09%      95.56%
b2b3     1.90%      97.46%
g2g3     1.60%      99.06%
f2f4     0.39%      99.46%
b1c3     0.27%      99.72%
d2d3     0.14%      99.86%
b2b4     0.07%      99.93%
a2a3     0.04%      99.97%
a2a4     0.03%     100.00%

Code: Select all

Using the lower bound of Wilson score interval (95% confidence) of µ WITH continuity correction:
(a = 0.7)

Move    Weight    Cumulative
============================
e2e4    36.74%      36.74%
d2d4    33.74%      70.48%
g1f3    15.17%      85.65%
c2c4    10.11%      95.75%
b2b3     1.89%      97.64%
g2g3     1.58%      99.23%
f2f4     0.36%      99.59%
b1c3     0.24%      99.83%
d2d3     0.10%      99.93%
b2b4     0.04%      99.97%
a2a3     0.02%      99.99%
a2a4     0.01%     100.00%
These weights look reasonable to me.
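For completeness, here is how the lower bound without continuity correction can be computed (this is the standard Wilson formula; the function name is mine):

```python
import math

def wilson_lower(mu, n, z=1.96):
    """Lower bound of the Wilson score interval for an observed score
    rate mu over n games (no continuity correction). Small samples are
    pulled strongly toward zero, so a 100% score over 3 games no
    longer outranks a 55% score over thousands of games."""
    denom = 1 + z * z / n
    center = mu + z * z / (2 * n)
    margin = z * math.sqrt(mu * (1 - mu) / n + z * z / (4 * n * n))
    return (center - margin) / denom

print(wilson_lower(1.0, 3))       # d2d3's 100% over 3 games -> about 0.44
print(wilson_lower(0.5484, 6456)) # e2e4's 54.84% over 6456 -> about 0.54
```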

------------------------

I tried the same idea with a bigger opening book: the 'huge book' at Shredder web:

Code: Select all

Move       n         w         d         l         µ
=======================================================
e2e4    242221     83248     97815     61158     54.56%
d2d4    204646     72648     85419     46579     56.37%
g1f3     59276     19990     26061     13225     55.71%
c2c4     45320     15914     19274     10132     56.38%
g2g3      5864      2037      2382      1445     55.05%
b2b3      1505       526       542       437     52.96%
f2f4      1078       314       366       398     46.10%
b1c3       464       135       154       175     45.69%
b2b4       266        85        92        89     49.25%
e2e3       108        38        29        41     48.61%
d2d3        99        27        41        31     47.98%
c2c3        52        18        15        19     49.04%
a2a3        42        12        12        18     42.86%
h2h3        29         8         5        16     36.21%
g2g4        25         8         6        11     44.00%
h2h4         8         5         1         2     68.75%
b1a3         3         0         2         1     33.33%
g1h3         3         1         1         1     50.00%
a2a4         1         0         1         0     50.00%
-------------------------------------------------------
SUM     561010    195014    232218    133778

Code: Select all

Using the lower bound of Wilson score interval (95% confidence) of µ WITHOUT continuity correction:
(a = 0.7)

Move    Weight    Cumulative
============================
e2e4    36.22%      36.22%
d2d4    33.25%      69.46%
g1f3    13.75%      83.22%
c2c4    11.53%      94.74%
g2g3     2.65%      97.39%
b2b3     0.96%      98.35%
f2f4     0.65%      99.00%
b1c3     0.34%      99.34%
b2b4     0.24%      99.59%
e2e3     0.12%      99.70%
d2d3     0.11%      99.81%
c2c3     0.06%      99.88%
a2a3     0.05%      99.92%
h2h3     0.03%      99.95%
g2g4     0.03%      99.98%
h2h4     0.02%      99.99%
b1a3     0.00%     100.00%
g1h3     0.00%     100.00%
a2a4     0.00%     100.00%

Code: Select all

Using the lower bound of Wilson score interval (95% confidence) of µ WITH continuity correction:
(a = 0.7)

Move    Weight    Cumulative
============================
e2e4    36.22%      36.22%
d2d4    33.25%      69.48%
g1f3    13.76%      83.23%
c2c4    11.53%      94.76%
g2g3     2.65%      97.41%
b2b3     0.96%      98.37%
f2f4     0.65%      99.01%
b1c3     0.34%      99.36%
b2b4     0.24%      99.60%
e2e3     0.12%      99.72%
d2d3     0.11%      99.83%
c2c3     0.06%      99.89%
a2a3     0.04%      99.93%
h2h3     0.02%      99.96%
g2g4     0.03%      99.98%
h2h4     0.01%     100.00%
b1a3     0.00%     100.00%
g1h3     0.00%     100.00%
a2a4     0.00%     100.00%
Given the big difference in the number of games played among the 'Big Four' (43.18%, 36.48%, 10.57% and 8.08%), I obtain plausible weights IMHO, although the math is not so simple.

You can tweak the parameters z and a. z ~ 1.96 or z = 2 are pretty standard, but a is more subjective and amenable to testing.

Regards from Spain.

Ajedrecista.

Zenmastur
Posts: 512
Joined: Sat May 31, 2014 6:28 am

Re: Math question

Post by Zenmastur » Fri Jul 19, 2019 6:18 am

jdart wrote:
Wed Jul 17, 2019 1:57 pm
Zenmastur wrote:
Wed Jul 17, 2019 7:34 am

Well, if you're keeping the book statistics up to date as new games are played then the highest scoring move should be played next. But this isn't the best move AND it's not meant to be the best move. There are two considerations involved: one is the best scoring move and the other is the move that the book will learn the most information from. These are balanced, or can be balanced, depending on the type of play. If you are using the book in a testing framework then I would think the balance should be in favor of book learning speed.
The multi-armed bandit formulation of the problem assumes there is a cost to exploring suboptimal moves. If you are doing offline training then there isn't really a cost in terms of bad outcomes (you don't care if you lose games), but there is still a cost in terms of training time. You don't want to spend more time than necessary looking at suboptimal parts of the search tree.
While this may be true in theory, you don't want the book algorithm exploring risky moves during tournaments. Some of the moves that WILL be explored are VERY bad. This occurs mostly at leaf nodes, since not all moves may have been tried yet. So some method of preventing these exploratory moves from being played during critical times (like tournaments) should be provided.

There are a couple of ways of doing this. One is to simply not allow untested moves to be explored during tournaments. This WILL cause certain positions in the book to become “unbalanced”. Once this restriction is removed the algorithm will try to re-balance the book when it can. The problem here is that it may not see the affected nodes for quite some time. If more tournaments are played in the meantime the problem will get worse. Another way to handle the issue is to skew the algorithm more towards exploration during testing. This will tend to “correct” the unbalanced node issue. There are additional things that can be done to circumvent this problem that rely on domain-specific knowledge to better handle untested moves at leaf nodes. An alternative method is to make a copy of the book and then artificially balance the copy for tournaments in such a way that the untested moves aren't about to come to the top of the “next to be played” move list. It's a pain in the ass and an awkward “fix”.

While “training time” may not be “free” it's better to use a little extra time than to allow the algorithm to shoot itself in the foot by making risky exploratory moves OTB in tournaments. Or at least that's my take on the matter.
Thompson sampling tries to balance exploitation (choosing so-far best moves) and exploration (looking at others) so that you efficiently converge to finding the move with the best outcome. Over time it does less exploration as more data is available. There are some mathematical proofs of efficiency.

It is not the only algorithm for this problem though. See https://lilianweng.github.io/lil-log/20 ... tions.html for an outline of some others.

In addition, though, I will mention that in my experience book learning even with a good algorithm is not as efficient as you would like, especially if you are using self-play for training. The reason is that the draw rate with self-play is very high and so bad moves are not "punished" enough. With large numbers of training games it could probably still work, but it will converge slowly.

--Jon
I'm interested in how you set it up, as there are many ways to exploit chess-specific knowledge when implementing an MAB algorithm for chess opening books that will have an effect on the speed of learning. As for self-play not being as effective, it almost sounds like the parameters need tweaking. A huge advantage of self-play is that the time can be hidden in “normal” testing inside the testing framework. The time controls for these games are VERY fast and a huge number of games are played per year. E.g. the SF testing framework can produce 1,000 to 2,000 games per minute, or between 500,000,000 and 1,000,000,000 games per year. This should be MORE than sufficient to create a robust opening book. And it will only get better with time.

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.

Rebel
Posts: 4788
Joined: Thu Aug 18, 2011 10:04 am

Re: Math question.

Post by Rebel » Fri Jul 19, 2019 7:45 am

Ajedrecista wrote:
Wed Jul 17, 2019 7:22 pm
As you can see, a = 1 gives your original weights. a <= 0 gives weird weights (like giveaway chess, where worst is best) and a -> +infinity rewards the most played move regardless of its score.
I like the formula, especially because it's so easy to tune with a single parameter [a]. I think I am going to use it, so thanks.
90% of coding is debugging, the other 10% is writing bugs.

Rebel
Posts: 4788
Joined: Thu Aug 18, 2011 10:04 am

Re: Math question.

Post by Rebel » Fri Jul 19, 2019 7:52 am

Ajedrecista wrote:
Thu Jul 18, 2019 6:10 pm
Given the big difference in the number of games played among the 'Big Four' (43.18%, 36.48%, 10.57% and 8.08%), I obtain plausible weights IMHO, although the math is not so simple.

You can tweak the parameters z and a. z ~ 1.96 or z = 2 are pretty standard, but a is more subjective and amenable to testing.

Regards from Spain.

Ajedrecista.
I am missing the definition of [z] in your post, or I am in need of new glasses.
90% of coding is debugging, the other 10% is writing bugs.

jdart
Posts: 3842
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Math question

Post by jdart » Fri Jul 19, 2019 1:43 pm

Zenmastur wrote:
Fri Jul 19, 2019 6:18 am
A huge advantage of self-play is that the time can be hidden in “normal” testing inside the testing framework. The time controls for these games are VERY fast and a huge number of games are played per year. E.g. the SF testing framework can produce 1,000 to 2,000 games per minute, or between 500,000,000 and 1,000,000,000 games per year. This should be MORE than sufficient to create a robust opening book. And it will only get better with time.
You're welcome to try this but I don't think hyper-fast games are a promising way to train a book that would be used at tournament level time controls. Note also Stockfish testing games start with a fixed set of openings.

--Jon

Zenmastur
Posts: 512
Joined: Sat May 31, 2014 6:28 am

Re: Math question

Post by Zenmastur » Fri Jul 19, 2019 4:19 pm

jdart wrote:
Fri Jul 19, 2019 1:43 pm
Zenmastur wrote:
Fri Jul 19, 2019 6:18 am
A huge advantage of self-play is that the time can be hidden in “normal” testing inside the testing framework. The time controls for these games are VERY fast and a huge number of games are played per year. E.g. the SF testing framework can produce 1,000 to 2,000 games per minute, or between 500,000,000 and 1,000,000,000 games per year. This should be MORE than sufficient to create a robust opening book. And it will only get better with time.
You're welcome to try this but I don't think hyper-fast games are a promising way to train a book that would be used at tournament level time controls.
Theoretically you can train the book with random playouts of the games, so there is no technical requirement to have a strong “agent” play the games. My assertion is that a non-random player that uses domain-specific knowledge will make the book converge orders of magnitude faster, if the difference between its playing strength and that of a random player is sufficient to compensate for the additional time required to perform the search and apply the domain-specific knowledge.

A single core on my machine when running at stock speed will see about 1,500k NPS and reach about depth 15-17 or so in 100 milliseconds using the latest version of SF. SF normal tests are done at a time control of 10 seconds plus a 100-millisecond increment IIRC. So search depths in the range of 15 to 17 are relevant to this discussion. Its exact playing strength is unknown, but if we are comparing its strength to a random player then we have this:
Laskos wrote:
Sun Feb 15, 2015 12:08 pm
I did some Monte Carlo simulations assuming reasonable beta distributions of random mover inaccuracies, depth=3 inaccuracies and depth=10 inaccuracies. The beta distribution is useful to model the random variables limited to intervals of finite length, and works well in this case. My simulations are in steps of 5%, so "85%" means that 85% of moves are random, 15% are either depth=3 or depth=10. The results are here:

Code: Select all

Each 10000 simulations, depth=3

95% vs 100%   +114 Elo points
90% vs  95%   +116 Elo points
85% vs  90%   +118 Elo points
80% vs  85%   +117 Elo points
75% vs  80%   +121 Elo points 
70% vs  75%   +120 Elo points
65% vs  70%   +123 Elo points
60% vs  65%   +115 Elo points
55% vs  60%   +121 Elo points
50% vs  55%   +126 Elo points
45% vs  50%   +126 Elo points
40% vs  45%   +139 Elo points
35% vs  40%   +145 Elo points
30% vs  35%   +146 Elo points
25% vs  30%   +148 Elo points
20% vs  25%   +164 Elo points
15% vs  20%   +161 Elo points
10% vs  15%   +180 Elo points
 5% vs  10%   +179 Elo points
 0% vs   5%   +189 Elo points
_____________________________

Total     +2768 Elo points Gaviota depth=3 versus random engine 



Each 10000 simulations, depth=10

95% vs 100%   +124 Elo points
90% vs  95%   +131 Elo points
85% vs  90%   +126 Elo points
80% vs  85%   +128 Elo points
75% vs  80%   +136 Elo points 
70% vs  75%   +132 Elo points
65% vs  70%   +125 Elo points
60% vs  65%   +140 Elo points
55% vs  60%   +129 Elo points
50% vs  55%   +136 Elo points
45% vs  50%   +137 Elo points
40% vs  45%   +151 Elo points
35% vs  40%   +147 Elo points
30% vs  35%   +164 Elo points
25% vs  30%   +168 Elo points
20% vs  25%   +181 Elo points
15% vs  20%   +192 Elo points
10% vs  15%   +195 Elo points
 5% vs  10%   +196 Elo points
 0% vs   5%   +192 Elo points
_____________________________

Total     +3030 Elo points Gaviota depth=10 versus random engine
The difference with this pool of depth=10 compared to depth=3 is 262 Elo points. The real difference is at least 900 Elo points, confirming Uri's hypothesis of dependence on the pool of players.
From this information we can “guess” that the playing strength difference between a random player and current DEV SF searching to a depth of more than 10 will be well in excess of 3000 Elo.

Now the question becomes this: if we allow longer games (say double the time controls), will it improve the rate at which the book “converges” to the “true” value of a given position? (IIRC a time doubling yields about a 70 Elo difference in playing strength.) If it changes the “real time” rate of convergence then clearly you will be right and I will be wrong.

In my favor, the branching factor during the opening is much larger than 2, and doubling the time controls means half as many games will be played. In order to “see” all positions arising from a given position there is a minimum number of games that must be played (i.e. the branching factor of the position in question) before the algorithm will “begin” to work properly. The branching factor isn't going to change regardless of what the time controls are set to. This seems to favor more games per unit time, and I don't think the marginal increase in the strength of the engine's play due to the longer time controls is going to compensate for halving the number of games per unit time. That would be my guess anyway!
Note also Stockfish testing games start with a fixed set of openings.

--Jon
Yes, I am aware of this. I'm also aware that this could be changed if there were sufficient reason to do so.

Regards,

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.

Robert Pope
Posts: 510
Joined: Sat Mar 25, 2006 7:27 pm

Re: Math question.

Post by Robert Pope » Fri Jul 19, 2019 4:57 pm

Rebel wrote:
Fri Jul 19, 2019 7:52 am
Ajedrecista wrote:
Thu Jul 18, 2019 6:10 pm
Given the big difference in the number of games played among the 'Big Four' (43.18%, 36.48%, 10.57% and 8.08%), I obtain plausible weights IMHO, although the math is not so simple.

You can tweak the parameters z and a. z ~ 1.96 or z = 2 are pretty standard, but a is more subjective and amenable to testing.

Regards from Spain.

Ajedrecista.
I am missing the definition of [z] in your post, or I am in need of new glasses.
z ~ 1.959963985 (95% confidence in a normal distribution; you can change z)
It's in one of the very early posts.
