There are several fixed depth tests in computer chess history. Some of these tests can be found here.
I have used Quazar 0.4 w32 for this experiment (depth d vs. depth (d - 1)), which is still unfinished, although I do not know whether I will be able to run more matches. I would like to add 2500 or 5000 more games in the coming months (once CuteChess GUI 1.0 is out with the feature to pause and resume matches), but I will probably not manage it because each new 2500-game match takes an insane amount of time for me.
I use cutechess-cli 0.5.1 for this fixed depth testing. The engines.json files look like this one:
Code: Select all
[
{
"command" : "Quazar_0.4_w32.exe",
"name" : "Depth_9",
"options" : [
{
"name" : "Hash",
"value" : 16
}
],
"protocol" : "uci",
"workingDirectory" : "H:\\d9"
},
{
"command" : "Quazar_0.4_w32.exe",
"name" : "Depth_10",
"options" : [
{
"name" : "Hash",
"value" : 16
}
],
"protocol" : "uci",
"workingDirectory" : "H:\\d10"
}
]
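The same kind of file can be generated for any pair of depths. This is only a minimal Python sketch: it assumes every pair follows the pattern of the Depth_9/Depth_10 example, and both the helper name `engine_entries` and the `H:\d<depth>` working directories for the other depths are assumptions for illustration.

```python
import json

def engine_entries(d, exe="Quazar_0.4_w32.exe", hash_mb=16):
    """Return the two engines.json entries for the match depth d-1 vs depth d."""
    return [
        {
            "command": exe,
            "name": f"Depth_{depth}",
            "options": [{"name": "Hash", "value": hash_mb}],
            "protocol": "uci",
            # Assumed layout: one working directory per depth, e.g. H:\d9
            "workingDirectory": f"H:\\d{depth}",
        }
        for depth in (d - 1, d)
    ]

if __name__ == "__main__":
    # json.dumps escapes the backslash, giving "H:\\d9" as in the file above
    print(json.dumps(engine_entries(10), indent=4))
```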
Code: Select all
cutechess-cli -engine conf="Depth_9" depth=9 -engine conf="Depth_10" depth=10 -each tc=inf -concurrency 2 -draw 100 50 -resign 5 900 -games 2 -rounds 1250 -pgnin klo_250_eco_a00-e97_variations.pgn -repeat -pgnout 09_Vs_10.pgn
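For reference, the nine command lines can be built from one template. This is a hedged sketch, not my actual script: it assumes all matches used exactly the flags of the 9 vs 10 command above and the `NN_Vs_MM.pgn` output naming, and `match_command` is a hypothetical helper name.

```python
def match_command(d, games_per_round=2, rounds=1250,
                  openings="klo_250_eco_a00-e97_variations.pgn"):
    """Build the cutechess-cli command line for depth d-1 vs depth d."""
    lo, hi = d - 1, d
    return (
        f'cutechess-cli -engine conf="Depth_{lo}" depth={lo} '
        f'-engine conf="Depth_{hi}" depth={hi} '
        "-each tc=inf -concurrency 2 -draw 100 50 -resign 5 900 "
        f"-games {games_per_round} -rounds {rounds} "
        f"-pgnin {openings} -repeat "
        f"-pgnout {lo:02d}_Vs_{hi:02d}.pgn"
    )

# One command per match: 1 vs 2, 2 vs 3, ..., 9 vs 10
for d in range(2, 11):
    print(match_command(d))
```

With 2 games per round and 1250 rounds, each command yields the 2500 games of one match.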
Code: Select all
version 0057.2, Copyright (C) 1997-2010 Remi Coulom.
compiled Apr 5 2012 17:26:01.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under the terms and conditions of the GNU General Public License.
See http://www.gnu.org/copyleft/gpl.html for details.
ResultSet>readpgn 01_Vs_02.pgn
2500 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 02_Vs_03.pgn
5000 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 03_Vs_04.pgn
7500 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 04_Vs_05.pgn
10000 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 05_Vs_06.pgn
12500 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 06_Vs_07.pgn
15000 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 07_Vs_08.pgn
17500 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 08_Vs_09.pgn
20000 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>readpgn 09_Vs_10.pgn
22500 game(s) loaded, 0 game(s) with unknown result ignored.
ResultSet>elo
ResultSet-EloRating>mm 1 1
Iteration 100: 0.0058971
Iteration 200: 0.0006629
Iteration 300: 7.88414e-005
00:00:00,00
ResultSet-EloRating>confidence 0.95
0.95
ResultSet-EloRating>ratings
Rank  Name           Elo     Diff      +      -  Games   Score    Oppo.   Draws     Win  W-L-D
1     Depth_10    850.50     0.00  10.36  10.36   2500  69.96%   716.04  44.80%  47.56%  1189-191-1120
2     Depth_9     716.04  -134.46   7.45   7.45   5000  51.62%   701.84  41.52%  30.86%  1543-1381-2076
3     Depth_8     553.17  -162.86   7.62   7.62   5000  49.78%   552.50  36.48%  31.54%  1577-1599-1824
4     Depth_7     388.96  -164.21   7.94   7.94   5000  52.88%   361.27  30.36%  37.70%  1885-1597-1518
5     Depth_6     169.37  -219.59   8.17   8.17   5000  49.47%   174.79  26.70%  36.12%  1806-1859-1335
6     Depth_5     -39.37  -208.75   8.17   8.17   5000  49.94%   -41.55  25.92%  36.98%  1849-1855-1296
7     Depth_4    -252.48  -213.11   8.46   8.46   5000  51.16%  -267.93  21.28%  40.52%  2026-1910-1064
8     Depth_3    -496.48  -244.00   9.44   9.44   5000  54.37%  -547.21  15.06%  46.84%  2342-1905-753
9     Depth_2    -841.94  -345.46   9.15   9.15   5000  43.80%  -772.13  17.20%  35.20%  1760-2380-860
10    Depth_1   -1047.77  -205.83  11.67  11.67   2500  24.00%  -841.94  22.40%  12.80%  320-1620-560
ResultSet-EloRating>x
ResultSet>x
I fit a straight line in x = ln(depth) for the following reason. When depth d tends to infinity, ln(d) ~ 1 + 1/2 + 1/3 + ... + 1/d and ln(d - 1) ~ 1 + 1/2 + 1/3 + ... + 1/(d - 1), each up to the same additive constant, which cancels in the difference; so delta_x = ln(d) - ln(d - 1) = ln[d/(d - 1)] ~ 1/d. If Y(x) = mx + n, then dY/dx = m and the estimated Elo gain from depth (d - 1) to depth d is delta_Y = m*delta_x ~ m/d ---> 0 if d ---> infinity (diminishing returns exist with this model).
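To put numbers on the m/d estimate, here is a small Python sketch. It takes the slope m = 1296.6 from the fit given later in this post; the helper name `predicted_gain` is only illustrative, and the depths beyond 10 are pure extrapolation of the model, not measured results.

```python
import math

M = 1296.6  # slope of the line Y = M*ln(depth) + n, as fitted in this post

def predicted_gain(d, m=M):
    """Elo gain predicted by the line when going from depth d-1 to depth d."""
    return m * math.log(d / (d - 1))

# Compare the exact model gain with the m/d approximation
for d in (5, 10, 20, 40, 80):
    print(f"d={d:3d}  gain={predicted_gain(d):7.1f}  m/d={M / d:7.1f}")
```

The two columns get closer as d grows, and both shrink towards 0.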
A quadratic function fails under the same kind of analysis when it is taken in the depth itself: Y(d) = ad² + bd + c; dY/dd = 2ad + b; delta_d = 1; estimated Elo gain = delta_Y = (dY/dd)*delta_d = 2a*[d + (d - 1)]/2 + b = a(2d - 1) + b, which does not tend to 0 if d ---> infinity (unless a = 0): diminishing returns do not exist with this model (the same holds for polynomials of higher degree). In dY/dd, I evaluate at the midpoint [d + (d - 1)]/2 because it makes sense to me. Note that delta_x ~ 1/d only applies when x = ln(d); a quadratic in x = ln(d) would give delta_Y ~ [2a*ln(d) + b]/d, which still tends to 0. I am not taking error bars into account.
For depth_i > 4, using Y(depth_i) ~ 1296.6*ln(depth_i) - 2137.6 and computing the errors (rounded to 0.1 Elo) as error_i = Y(depth_i) - rating_i:
Code: Select all
Depth:   Rating:   Error:
------   -------   ------
5          -39.4    -11.4
6          169.4     16.2
7          389.0     -3.5
8          553.2      5.4
9          716.0     -4.7
10         850.5     -2.6
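The coefficients are consistent with an ordinary least-squares fit of rating against ln(depth) for depths 5 to 10. The fitting method is an assumption here, since it is not stated above; the sketch below uses only the standard library, and `fit_line` is a hypothetical helper name.

```python
import math

# BayesElo ratings for depths 5..10, copied from the ratings list above
ratings = {5: -39.37, 6: 169.37, 7: 388.96, 8: 553.17, 9: 716.04, 10: 850.50}

def fit_line(points):
    """Ordinary least squares for y = m*x + n over a list of (x, y) pairs."""
    xs, ys = zip(*points)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    m = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

points = [(math.log(d), r) for d, r in ratings.items()]
m, n = fit_line(points)
print(f"Y(depth) ~ {m:.1f}*ln(depth) {n:+.1f}")

# error_i = Y(depth_i) - rating_i, as defined above
for d, r in ratings.items():
    print(d, round(r, 1), round(m * math.log(d) + n - r, 1))
```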
Another known fact is the growth of the draw ratio as depth increases (except at very low depths, where my results are a bit strange): this steady growth is easy to see in the BayesElo output.
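The draw ratio can also be computed per match instead of per engine. A minimal sketch, using the win-loss-draw totals from the cutechess-cli scores quoted in this post (the variable and function names are just for illustration):

```python
# (match, wins, losses, draws) of the lower depth, from the cutechess-cli scores
matches = [
    ("1 vs 2", 320, 1620, 560),
    ("2 vs 3", 140, 2060, 300),
    ("3 vs 4", 282, 1765, 453),
    ("4 vs 5", 261, 1628, 611),
    ("5 vs 6", 221, 1594, 685),
    ("6 vs 7", 212, 1638, 650),
    ("7 vs 8", 247, 1385, 868),
    ("8 vs 9", 192, 1352, 956),
    ("9 vs 10", 191, 1189, 1120),
]

def draw_ratio(wins, losses, draws):
    """Fraction of drawn games in one match."""
    return draws / (wins + losses + draws)

ratios = [draw_ratio(w, l, d) for _, w, l, d in matches]
for (name, *_), r in zip(matches, ratios):
    print(f"Depth {name}: {100 * r:.1f}% draws")
```

Per match, the draw ratio goes from 12.0% (depth 2 vs 3) up to 44.8% (depth 9 vs 10), with the depth 1 vs 2 match (22.4%) as the odd one out.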
I provide all the PGN files (around 8 MB, compressed), the win-loss-draw statistics of each 2500-game match, and the openings used (as a PGN file):
Fixed_depth_testing_of_Quazar_0.4_w32.rar (8.08 MB)
IIRC, this Zippyshare link will die after 30 days of inactivity.
Code: Select all
Finished game 2499 (Depth_2 vs Depth_1): 1/2-1/2 {Draw by 3-fold repetition}
Score of Depth_1 vs Depth_2: 320 - 1620 - 560 [0.24] 2500
ELO difference: -200
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_2 vs Depth_3): 0-1 {Black wins by adjudication}
Score of Depth_2 vs Depth_3: 140 - 2060 - 300 [0.12] 2500
ELO difference: -353
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_3 vs Depth_4): 1/2-1/2 {Draw by adjudication}
Score of Depth_3 vs Depth_4: 282 - 1765 - 453 [0.20] 2500
ELO difference: -237
Finished match
Warning: QObject::killTimers: timers cannot be stopped from another thread
-----------------------------------------------------------------------------------
Finished game 2499 (Depth_5 vs Depth_4): 1/2-1/2 {Draw by adjudication}
Score of Depth_4 vs Depth_5: 261 - 1628 - 611 [0.23] 2500
ELO difference: -213
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_5 vs Depth_6): 0-1 {Black wins by adjudication}
Score of Depth_5 vs Depth_6: 221 - 1594 - 685 [0.23] 2500
ELO difference: -214
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_6 vs Depth_7): 0-1 {Black wins by adjudication}
Score of Depth_6 vs Depth_7: 212 - 1638 - 650 [0.21] 2500
ELO difference: -225
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_7 vs Depth_8): 0-1 {Black wins by adjudication}
Score of Depth_7 vs Depth_8: 247 - 1385 - 868 [0.27] 2500
ELO difference: -171
Finished match
-----------------------------------------------------------------------------------
Finished game 2500 (Depth_8 vs Depth_9): 1/2-1/2 {Draw by 3-fold repetition}
Score of Depth_8 vs Depth_9: 192 - 1352 - 956 [0.27] 2500
ELO difference: -175
Finished match
-----------------------------------------------------------------------------------
Finished game 2499 (Depth_10 vs Depth_9): 1/2-1/2 {Draw by adjudication}
Score of Depth_9 vs Depth_10: 191 - 1189 - 1120 [0.30] 2500
ELO difference: -147
Finished match
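The "ELO difference" lines above are consistent with the usual logistic formula applied to the match score. The sketch below assumes that formula and assumes rounding to the nearest integer; neither is taken from the cutechess-cli source, and `elo_difference` is a hypothetical helper name.

```python
import math

def elo_difference(wins, losses, draws):
    """Elo difference implied by a match score under the logistic model."""
    score = (wins + 0.5 * draws) / (wins + losses + draws)
    return -400.0 * math.log10(1.0 / score - 1.0)

# Depth_1 vs Depth_2 and Depth_9 vs Depth_10, from the scores above
print(round(elo_difference(320, 1620, 560)))
print(round(elo_difference(191, 1189, 1120)))
```

Note that these are raw per-match differences, so they need not coincide with the gaps between the BayesElo ratings.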
Regards from Spain.
Ajedrecista.