my first cluster test result with R

Daniel Shawul · Post by **Daniel Shawul** » Fri Jan 29, 2010 3:28 pm

I ran a self-play test between versions of Scorpio with different values of R (the null-move reduction). In the past I tested values of 2 and 4 with no success, so I wanted to tune the default value of 3 using fractional values close to it.

scorpio t1           R = 2.5
scorpio 2.4.7        R = 3
scorpio t2           R = 3.5
scorpio t3           R = 4

I used Dann Corbit's neutral.pgn with 600 games. First I divided them into three groups to see whether I would get similar results across the three tests, and I think the results are close enough. I take this as an indication that the result should hold as long as my tests are similar in content. Here are the results:

neutral1 test
Rank Name            Elo    +    - games score oppo. draws
   1 Scorpio_t1       32   12   12  1200   57%   -11   41%
   2 Scorpio_t2        1   12   12  1200   50%     0   48%
   3 Scorpio_2.4.7    -5   12   12  1200   49%     2   49%
   4 Scorpio_t3      -28   12   12  1200   44%     9   47%

neutral2 test
Rank Name            Elo    +    - games score oppo. draws
   1 Scorpio_t1       31   12   12  1200   57%   -10   45%
   2 Scorpio_t2        7   12   12  1199   52%    -2   47%
   3 Scorpio_2.4.7   -14   12   12  1199   47%     5   52%
   4 Scorpio_t3      -24   12   12  1200   45%     8   49%

neutral3 test
Rank Name            Elo    +    - games score oppo. draws
   1 Scorpio_t1       18   12   12  1200   54%    -6   44%
   2 Scorpio_2.4.7     3   12   12  1200   51%    -1   52%
   3 Scorpio_t2       -2   12   12  1200   50%     1   48%
   4 Scorpio_t3      -19   12   12  1200   46%     6   47%

Overall

Rank Name            Elo    +    - games score oppo. draws
   1 Scorpio_t1       27    8    7  3600   56%    -9   43%
   2 Scorpio_t2        2    8    7  3599   51%    -1   48%
   3 Scorpio_2.4.7    -5    7    8  3599   49%     2   51%
   4 Scorpio_t3      -24    7    7  3600   45%     8   48%

The TC is 40/10. I know this is fast, so the results are hard to take at face value. It looks like R = 2.5 performs better by a big margin! I am guessing this is probably due to the shallower search depths reached at that TC. So I am going to redo the test at 40/30 and finally at 40/60. If I don't see any change I will stop there; otherwise I will keep increasing the TC.
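
As a rough cross-check on how much weight the margins in the tables can bear, here is a small sketch that converts a score fraction into an Elo difference with the standard logistic formula and estimates a ~95% margin from the game count and draw rate. It is an approximation under stated assumptions: the function names are made up for illustration, games are treated as independent, and the rating tool that produced the tables may use a different model, so the numbers need not match the tables exactly.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_margin(score: float, draw_rate: float, games: int, z: float = 1.96) -> float:
    """Approximate +/- Elo margin (z = 1.96 for ~95%), assuming independent games."""
    wins = score - draw_rate / 2.0
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = wins + 0.25 * draw_rate - score * score
    std_err = math.sqrt(variance / games)
    # Slope of the Elo curve at this score (delta method).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return z * std_err * slope


if __name__ == "__main__":
    # Overall row for Scorpio_t1: 56% score, 43% draws, 3600 games.
    print(round(elo_from_score(0.56), 1), round(elo_margin(0.56, 0.43, 3600), 1))
```

For the overall Scorpio_t1 row this gives a margin of roughly 8 to 9 Elo, the same order as the +/- columns, while the 1200-game sub-tests have correspondingly wider margins.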

Daniel

Michael Sherwin · Post by **Michael Sherwin** » Fri Jan 29, 2010 4:34 pm

You should also test the classic:

R = 2 if remaining depth < 6

R = 3 if remaining depth > 5
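
The rule above can be sketched as a small function. This is only an illustration, not Scorpio's code: the names `adaptive_r` and `null_move_depth` are hypothetical, and the `depth - 1 - R` convention for the reduced null-move search is an assumption, since engines differ on how they count the reduction.

```python
def adaptive_r(remaining_depth: int, border: int = 5,
               shallow_r: int = 2, deep_r: int = 3) -> int:
    """Return the null-move reduction R for a remaining depth in plies.

    R = shallow_r when remaining_depth <= border (i.e. < 6 by default),
    R = deep_r when remaining_depth > border.
    """
    return deep_r if remaining_depth > border else shallow_r


def null_move_depth(remaining_depth: int) -> int:
    """Depth of the reduced search after a null move.

    Assumes the common depth - 1 - R convention, clamped at zero.
    """
    return max(0, remaining_depth - 1 - adaptive_r(remaining_depth))


if __name__ == "__main__":
    for depth in range(1, 9):
        print(depth, adaptive_r(depth), null_move_depth(depth))
```

Keeping the border as a parameter makes it easy to test a `depth >= 5` variant against `depth > 5` by passing `border=4`.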

Daniel Shawul · Post by **Daniel Shawul** » Fri Jan 29, 2010 5:18 pm

Will do.
I will also add R = 2. In the past it didn't work for me at longer time controls, but it might work here at this fast time control. If that is the case, it raises the question of whether fast time controls can be trusted.

Daniel

Dann Corbit · Post by **Dann Corbit** » Fri Jan 29, 2010 8:38 pm

Here's a "captain obvious" idea:
If the data indicates better R at certain time controls, then have R be a function of time control.

rjgibert · Post by **rjgibert** » Fri Jan 29, 2010 9:03 pm

Dann Corbit wrote:
Here's a "captain obvious" idea:
If the data indicates better R at certain time controls, then have R be a function of time control.

The real meaning of a given TC is hardware-dependent. It would be better to set R in a hardware-independent way, e.g. by depth.

Daniel Shawul · Post by **Daniel Shawul** » Sat Jan 30, 2010 12:22 am

Test finished

Legend

scorpio t1           R = 2.5
scorpio 2.4.7        R = 3
scorpio t2           R = 3.5
scorpio t3           R = 4 
scorpio t4           R = 2
scorpio t5           R = 2/3   (depth >= 5)

Halfway through the tests I realized I had depth >= 5 instead of depth > 5.
It shouldn't matter much, I suppose?

Rank Name            Elo    +    - games score oppo. draws
   1 Scorpio_t1       18    7    6  6000   53%    -4   46%
   2 Scorpio_t2        8    7    7  5999   52%    -2   48%
   3 Scorpio_2.4.7     6    7    6  5999   51%    -1   49%
   4 Scorpio_t5        4    7    6  6000   51%    -1   48%
   5 Scorpio_t4      -12    6    7  6000   48%     2   47%
   6 Scorpio_t3      -24    7    6  6000   45%     5   48%

R = 2 is worse even at this fast time control, so it is out of the next test. That saves me time, because only higher values of R should perform better at longer TCs. R = 2/3 doesn't look promising either, but I will keep it in with the correction (depth > 5).

Daniel

Daniel Shawul · Post by **Daniel Shawul** » Sat Jan 30, 2010 12:29 am

Yes, I will do it by depth, but R = 2 doesn't look promising even at faster time controls. The best bet right now looks like R = 2.5/3.5 with the border at depth > 5.
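
A fractional R such as 2.5 only makes sense if depth is tracked at a finer resolution than one ply. Below is a minimal sketch of the 2.5/3.5 scheme under the assumption that depth is stored in integer half-ply units; `UNITS_PER_PLY`, `plies_to_units` and `null_move_reduction` are hypothetical names, not taken from Scorpio.

```python
UNITS_PER_PLY = 2  # hypothetical resolution: 1 unit = half a ply


def plies_to_units(plies: float) -> int:
    """Convert plies to integer depth units, rejecting values finer than the resolution."""
    units = plies * UNITS_PER_PLY
    if units != int(units):
        raise ValueError(f"{plies} plies is not representable at {UNITS_PER_PLY} units per ply")
    return int(units)


def null_move_reduction(depth_units: int, border_plies: float = 5,
                        shallow_r: float = 2.5, deep_r: float = 3.5) -> int:
    """Return R in depth units: deep_r above the border, shallow_r otherwise."""
    if depth_units > plies_to_units(border_plies):
        return plies_to_units(deep_r)
    return plies_to_units(shallow_r)


if __name__ == "__main__":
    for plies in (4, 5, 5.5, 6):
        units = plies_to_units(plies)
        print(plies, null_move_reduction(units) / UNITS_PER_PLY)
```

Working in integer units keeps the comparison against the border exact and avoids floating-point depth arithmetic inside the search.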
