Which engine should win?
Which engine should win a ten-game, 5-minute blitz tournament: Komodo 13.02, ShashChess 8, Stockfish 10, or SugaR XPrO 1.6.1? Everything is pretty much default; threads = 4. All engines are x64.
David S.
Re: Which engine should win?
Hello:
There is a Round-Robin simulator that can simulate the outcome of a round-robin tournament from a few input parameters.
I looked into CCRL 40/4 ratings as suggested above, so let's hope that 40/4 ratings are similar to those of 5-minute blitz games. The input parameters were:
- Ratings: the rating of each engine if available; if not, the rating of a similar engine (here, SugaR XPrO 1.4 and ShashChess 6.0 stand in for SugaR XPrO 1.6.1 and ShashChess 8). Finally:
Code: Select all
"Stockfish 10 64-bit 4CPU", 3547
"SugaR XPrO 1.4 64-bit 4CPU", 3535
"ShashChess 6.0 64-bit 4CPU", 3505
"Komodo 13.02 64-bit 4CPU", 3490
- White advantage: the global white advantage of CCRL 40/4 games. I know that a better estimator would have been the white advantage of top-level engines, but this was more difficult to calculate. Anyway:
Code: Select all
Testing summary:
___________________________
Total: 2'051'440 games
played by 2'369 programs
27518 CPU days (X2 4600+)
White wins: 797'129 (38.9%)
Black wins: 645'243 (31.5%)
Draws: 609'068 (29.7%)
White score: 53.7%
So I computed White advantage = 400*log10{[797129 + (609068)/2]/[645243 + (609068)/2]} = 400*log10[(1101663)/(949777)] ~ 25.77 ~ 26 Elo.
- Draw rate for equal opponents: I took the weighted average of the draw rates between these four opponents, which are almost equal in strength. I only found 36 draws in 40 games for Stockfish vs. ShashChess and 62 draws in 90 games for Stockfish vs. Komodo, so I assumed Draw rate = (36 + 62)/(40 + 90) = 98/130 ~ 0.7538 ~ 75%.
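The white-advantage and draw-rate inputs can be reproduced from the CCRL game counts and the head-to-head draw counts quoted above. This is only a minimal sketch; the function names are my own and are not part of the simulator.

```python
import math

def white_advantage_elo(white_wins, black_wins, draws):
    """Elo edge implied by White's points versus Black's points (a draw counts as half a point)."""
    white_points = white_wins + draws / 2
    black_points = black_wins + draws / 2
    return 400 * math.log10(white_points / black_points)

def pooled_draw_rate(results):
    """Weighted average draw rate from (draws, games) pairs."""
    total_draws = sum(draws for draws, _ in results)
    total_games = sum(games for _, games in results)
    return total_draws / total_games

# CCRL 40/4 totals: White wins, Black wins, draws
adv = white_advantage_elo(797129, 645243, 609068)
# Stockfish vs. ShashChess: 36 draws in 40 games; Stockfish vs. Komodo: 62 draws in 90 games
rate = pooled_draw_rate([(36, 40), (62, 90)])

print(f"White advantage: {adv:.2f} Elo")  # 25.77
print(f"Draw rate: {rate:.4f}")           # 0.7538
```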
I chose 1,000,000 simulations and 10 legs in each simulation (each pair of engines meets 10 times), so each complete tournament simulation of n engines and g legs will have g*[n*(n - 1)/2] games (60 games with g = 10 and n = 4).
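The game count is easy to check. A small sketch, assuming g is the number of legs (games per pairing); the helper name is hypothetical:

```python
def round_robin_games(engines, legs):
    """Games in a round robin: every pair of engines meets once per leg."""
    pairings = engines * (engines - 1) // 2
    return legs * pairings

games_per_tournament = round_robin_games(engines=4, legs=10)
total_games = games_per_tournament * 1_000_000  # one million simulated tournaments

print(games_per_tournament, total_games)  # 60 60000000
```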
Here are the results:
Code: Select all
.\rrsim-v0.5-win>rrsim-win32 -h
Program to simulate round robin tournaments
quick example: rrsim -i input_example.csv -w30 -d30 -s100 -p simulated.pgn
- Processes input_example.csv
- White advantage is 30 points, draw rate is 30%, simulated 100 times
- Games simulated are saved in simulated.pgn
usage: rrsim [-OPTION]
-h print this help
-v print version number and exit
-L display the license information
-q quiet (no output except error messages)
-i <file> input file, with comma separated value format
-p <file> output file, containing all simulated games (optional)
-w <num> white advantage (rating points), default = 0.0
-d <num> draw rate for equal opponents (%), default = 50.0
-s <num> number of simulation repeats, default = 1
-r starts with reversed colors, (optional)
-l round robing legs, default = 1
[...]
.\rrsim-v0.5-win>rrsim-win32 -i input_example.csv -w 26 -d 75.0 -s 1000000 -l 10
Simulations: 1000000
0 10 20 30 40 50 60 70 80 90 100 (%)
|----|----|----|----|----|----|----|----|----|----|
[*************************************************]
Tournament engines = 4
Tournament boards = 2
Tournament games/leg = 6
Tournament rounds/leg = 3
Tournament legs = 10
Tournament total games = 60
Tournament total rounds = 30
Simulations = 1000000
Total games = 60000000
draw rate (equal strength) = 75.0%
White advantage = 26.0
First Engine (Stockfish 10 64-bit 4CPU: 3547.0) Stats:
won = 512968
shared = 94253
loss = 392779
total = 1000000
won outright % = 51.3
won shared % = 9.4
[...]
First Engine (SugaR XPrO 1.4 64-bit 4CPU: 3535.0) Stats:
won = 298289
shared = 86976
loss = 614735
total = 1000000
won outright % = 29.8
won shared % = 8.7
[...]
First Engine (ShashChess 6.0 64-bit 4CPU: 3505.0) Stats:
won = 56373
shared = 34296
loss = 909331
total = 1000000
won outright % = 5.6
won shared % = 3.4
[...]
First Engine (Komodo 13.02 64-bit 4CPU: 3490.0) Stats:
won = 20676
shared = 15957
loss = 963367
total = 1000000
won outright % = 2.1
won shared % = 1.6
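For readers who want to experiment without the simulator, here is a rough Monte Carlo sketch of the same kind of tournament. It is not the simulator's algorithm: the draw model (the draw rate shrinking as the expected score moves away from 50%) and the color alternation per leg are my own assumptions, so the percentages it prints will not match the output above exactly.

```python
import random
from itertools import combinations

def game_probabilities(elo_white, elo_black, white_adv, draw_rate_equal):
    """Return (P(White wins), P(draw), P(Black wins)) under a simple ASSUMED draw model."""
    diff = elo_white + white_adv - elo_black
    expected = 1 / (1 + 10 ** (-diff / 400))  # logistic Elo expectation for White
    # Assumption: draws become rarer as the expected score moves away from 50%.
    p_draw = draw_rate_equal * 2 * min(expected, 1 - expected)
    p_white = expected - p_draw / 2
    return p_white, p_draw, 1 - p_white - p_draw

def simulate(ratings, white_adv, draw_rate_equal, legs, runs, seed=0):
    """Count outright and shared first places over `runs` simulated tournaments."""
    rng = random.Random(seed)
    names = list(ratings)
    outright = dict.fromkeys(names, 0)
    shared = dict.fromkeys(names, 0)
    for _ in range(runs):
        points = dict.fromkeys(names, 0.0)
        for leg in range(legs):
            for a, b in combinations(names, 2):
                # Assumption: colors are swapped on every other leg.
                white, black = (a, b) if leg % 2 == 0 else (b, a)
                p_w, p_d, _ = game_probabilities(
                    ratings[white], ratings[black], white_adv, draw_rate_equal)
                r = rng.random()
                if r < p_w:
                    points[white] += 1
                elif r < p_w + p_d:
                    points[white] += 0.5
                    points[black] += 0.5
                else:
                    points[black] += 1
        best = max(points.values())
        leaders = [n for n in names if points[n] == best]
        table = outright if len(leaders) == 1 else shared
        for n in leaders:
            table[n] += 1
    return outright, shared

ratings = {"Stockfish 10": 3547, "SugaR XPrO 1.4": 3535,
           "ShashChess 6.0": 3505, "Komodo 13.02": 3490}
runs = 2000
outright, shared = simulate(ratings, white_adv=26, draw_rate_equal=0.75,
                            legs=10, runs=runs)
for name in ratings:
    print(f"{name:15s} outright {outright[name] / runs:6.1%}  "
          f"shared {shared[name] / runs:6.1%}")
```

With only 2,000 runs the figures are noisy; raise `runs` for smoother estimates.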
Summarizing:
Code: Select all
64-bit 4CPU:
                          Outright win   Shared win   Win (outright + shared)
Stockfish 10 (3547)          51.30 %       9.43 %            60.73 %
SugaR XPrO 1.4 (3535)        29.83 %       8.70 %            38.53 %
ShashChess 6.0 (3505)         5.64 %       3.43 %             9.07 %
Komodo 13.02 (3490)           2.07 %       1.60 %             3.67 %
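The columns of this table can be cross-checked with a few lines. A sketch using only the percentages from the table: if every tie for first place involved exactly two engines, the shared column would add up to twice the share of tied tournaments, so any excess points to ties among three or four engines.

```python
outright = {"Stockfish 10": 51.30, "SugaR XPrO 1.4": 29.83,
            "ShashChess 6.0": 5.64, "Komodo 13.02": 2.07}
shared = {"Stockfish 10": 9.43, "SugaR XPrO 1.4": 8.70,
          "ShashChess 6.0": 3.43, "Komodo 13.02": 1.60}

single_winner = sum(outright.values())      # % of tournaments with one clear winner
shared_sum = sum(shared.values())           # counts each tied tournament once per leader
tied_tournaments = 100 - single_winner      # % of tournaments with a tie for first
excess = shared_sum - 2 * tied_tournaments  # > 0 means some ties involve 3+ engines

print(f"single winner: {single_winner:.2f} %")  # 88.84 %
print(f"shared column: {shared_sum:.2f} %")     # 23.16 %
print(f"excess over two-way ties: {excess:.2f} %")  # 0.84 %
```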
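So there is a single winner in around 88.84% (~ 8/9) of the simulations, and the shared-win percentages add up to 23.16% (~ 22/95). Since 88.84% + (23.16%)/2 = 100.42% > 100%, some of the shared wins must involve more than two engines.
Regards from Spain.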
Ajedrecista.