I ran a self-play test between different versions of Scorpio with different values of R (null move reduction). In the past I tested values of 2 and 4 with no success, so I wanted to tune the default value of 3 using fractional values close to it.
scorpio t1 R = 2.5
scorpio 2.4.7 R = 3
scorpio t2 R = 3.5
scorpio t3 R = 4
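A fractional R only makes sense if the search counts depth in units finer than one ply. Here is a minimal sketch of that bookkeeping, assuming a hypothetical engine that stores depth in quarter-ply units and uses the common depth - 1 - R convention for the null move search. The constant UNIT_DEPTH and the function name are illustrative and not taken from Scorpio's source.

```python
# Hypothetical fractional-ply bookkeeping; UNIT_DEPTH is an assumed constant.
UNIT_DEPTH = 4  # depth units per ply, so 0.25 ply is the smallest step


def null_move_depth(depth_units: int, r_plies: float) -> int:
    """Return the depth (in units) of the reduced search after a null move.

    The null move itself consumes one ply, and R further plies are removed.
    The result is clamped at zero.
    """
    reduction_units = round(r_plies * UNIT_DEPTH)
    return max(depth_units - UNIT_DEPTH - reduction_units, 0)


# The four versions under test, keyed by name.
versions = {"t1": 2.5, "2.4.7": 3.0, "t2": 3.5, "t3": 4.0}
for name, r in versions.items():
    # Start from a remaining depth of 8 plies and report the result in plies.
    print(name, null_move_depth(8 * UNIT_DEPTH, r) / UNIT_DEPTH)
```

With whole-ply depth the values 2.5 and 3.5 could not be represented at all, which is why the unit size matters for this kind of tuning.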
I used Dann Corbit's neutral.pgn with 600 games. First I divided them into three groups to see whether I get similar results across the three tests, and I think the results are close enough. I take this as an indication that the result should hold as long as my tests are similar in content. Here are the results:
neutral1 test
Rank Name            Elo    +    -  games  score  oppo.  draws
   1 Scorpio_t1       32   12   12   1200    57%    -11    41%
   2 Scorpio_t2        1   12   12   1200    50%      0    48%
   3 Scorpio_2.4.7    -5   12   12   1200    49%      2    49%
   4 Scorpio_t3      -28   12   12   1200    44%      9    47%
neutral2 test
Rank Name            Elo    +    -  games  score  oppo.  draws
   1 Scorpio_t1       31   12   12   1200    57%    -10    45%
   2 Scorpio_t2        7   12   12   1199    52%     -2    47%
   3 Scorpio_2.4.7   -14   12   12   1199    47%      5    52%
   4 Scorpio_t3      -24   12   12   1200    45%      8    49%
neutral3 test
Rank Name            Elo    +    -  games  score  oppo.  draws
   1 Scorpio_t1       18   12   12   1200    54%     -6    44%
   2 Scorpio_2.4.7     3   12   12   1200    51%     -1    52%
   3 Scorpio_t2       -2   12   12   1200    50%      1    48%
   4 Scorpio_t3      -19   12   12   1200    46%      6    47%
Overall
Rank Name            Elo    +    -  games  score  oppo.  draws
   1 Scorpio_t1       27    8    7   3600    56%     -9    43%
   2 Scorpio_t2        2    8    7   3599    51%     -1    48%
   3 Scorpio_2.4.7    -5    7    8   3599    49%      2    51%
   4 Scorpio_t3      -24    7    7   3600    45%      8    48%
The TC is 40/10. I know this is fast, so the results are hard to take at face value. It looks like R = 2.5 performs better by a big margin! I am guessing this is due to the shallower search depths reached at that TC. So I am going to redo the test at 40/30 and finally at 40/60. If I don't see any change I will stop there; otherwise I will keep increasing the TC.
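To put the score percentages and error bars in perspective, here is a rough sketch that converts a score to a rating difference with the plain logistic formula D = 400 log10(s / (1 - s)) and checks whether two reported Elo ranges overlap. This is not BayesElo's own model, which also accounts for draws and a prior, so its figures come out somewhat different. The function names are illustrative, and the overlap check is only a crude stand-in for a proper significance test.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by an expected score under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def intervals_overlap(elo_a, plus_a, minus_a, elo_b, plus_b, minus_b) -> bool:
    """True if the two [elo - minus, elo + plus] ranges share any point."""
    return elo_a - minus_a <= elo_b + plus_b and elo_b - minus_b <= elo_a + plus_a


# Overall table: Scorpio_t1 (R = 2.5) versus Scorpio_2.4.7 (R = 3).
print(round(elo_diff_from_score(0.56)))       # t1 scored 56% overall
print(intervals_overlap(27, 8, 7, -5, 7, 8))  # t1 range against the default's range
```

For the overall table the t1 and 2.4.7 ranges do not overlap, while the t2 and 2.4.7 ranges do, which matches the impression that only R = 2.5 stands out at this TC.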
You should also test the classic:
R = 2 if remaining depth < 6
R = 3 if remaining depth > 5
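The rule above can be sketched as a small function. This assumes remaining depth is measured in whole plies and that the reduced search depth follows the common depth - 1 - R convention; the function name is made up for illustration.

```python
def classic_adaptive_r(remaining_depth: int) -> int:
    """Null move reduction in plies for the given remaining depth (in plies)."""
    return 3 if remaining_depth > 5 else 2


for depth in (4, 5, 6, 10):
    r = classic_adaptive_r(depth)
    # depth - 1 - r is the depth of the reduced search after the null move,
    # clamped at zero (assumed convention).
    print(depth, r, max(depth - 1 - r, 0))
```

Note that the two conditions are complementary for integer depths: "< 6" and "> 5" split the range exactly at the 5/6 boundary.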
Will do.
I will also add R = 2. In the past it didn't work for me at longer time controls, but it might work at this fast time control. If that is the case, it raises the question of whether to trust fast time controls.
Here's a "captain obvious" idea:
If the data indicates better R at certain time controls, then have R be a function of time control.
The real meaning of a given TC is hardware dependent. It would be better to choose R in a hardware-independent way, e.g. by depth.
R = 2 is worse even at this fast time control, so it is out of the next test. That saves me time, because only higher values of R should perform better at longer TCs. R = 2/3 doesn't look promising either, but I will keep it in with the correction (depth > 5).
Yes, I will do it by depth, but R = 2 doesn't look promising even at faster time controls. The best bet right now looks like R = 2.5/3.5 with the border at depth > 5.