by Laskos » Fri Apr 07, 2017 7:49 pm
The excellent FGRL rating list (http://www.fastgm.de/index.html) contains two Top 10 rating lists, at 10' + 6'' and 60' + 15'' TC, with identical engines on one core, so we can make direct comparisons of engine performance.
1/
10' + 6''
10' + 6''
Ordo v1.0.9.2: 3000
#  Engine           :  Elo  Diff Error  Points    (%)     W     D     L   D(%)  CFS    W/L
------------------------------------------------------------------------------------------
1  Stockfish 8      : 3151     0     9  1916.0  70.96  1209  1414    77  52.37   89  15.70
2  Komodo 10.4      : 3143    -8     9  1889.0  69.96  1224  1330   146  49.26   63   8.38
3  Houdini 5.01     : 3141   -10     8  1882.0  69.70  1193  1378   129  51.04  100   9.25
4  Deep Shredder 13 : 3009  -142     8  1390.0  51.48   630  1520   550  56.30  100  1.145
5  Fire 5           : 2983  -168     8  1289.0  47.74   542  1494   664  55.33  100  0.816
6  Fizbo 1.9        : 2957  -194     8  1186.0  43.93   476  1420   804  52.59  100  0.592
7  Gull 3           : 2941  -210     8  1125.0  41.67   399  1452   849  53.78  100  0.470
8  Andscacs 0.89    : 2901  -250     8   975.5  36.13   330  1291  1079  47.81   98  0.306
9  Fritz 15         : 2889  -262     8   930.0  34.44   282  1296  1122  48.00   72  0.251
10 Chiron 4         : 2885  -266     8   917.5  33.98   271  1293  1136  47.89  ---  0.239
White advantage = 40.58 +/- 2.07
Draw rate (equal opponents) = 63.46 % +/- 0.53
2/
60' + 15''
60' + 15''
Ordo v1.2.6: 3000
#  Engine           :  Elo  Diff Error  Points    (%)     W     D     L   D(%)  CFS    W/L
------------------------------------------------------------------------------------------
1  Stockfish 8      : 3146     0    12   950.5  70.41   587   727    36  53.85   51  16.31
2  Komodo 10.4      : 3146     0    12   950.0  70.37   615   670    65  49.63  100   9.46
3  Houdini 5.01     : 3119   -27    11   903.5  66.93   516   775    59  57.41  100   8.74
4  Deep Shredder 13 : 3015  -131    11   706.5  52.33   304   805   241  59.63   99  1.261
5  Fire 5           : 2997  -149    10   670.5  49.67   287   767   296  56.81  100  0.970
6  Fizbo 1.9        : 2949  -197    11   577.5  42.78   208   739   403  54.74   83  0.516
7  Gull 3           : 2941  -205    11   562.5  41.67   172   781   397  57.85   97  0.433
8  Andscacs 0.89    : 2926  -220    11   533.0  39.48   176   714   460  52.89  100  0.383
9  Chiron 4         : 2885  -261    11   457.0  33.85   126   662   562  49.04   88  0.224
10 Fritz 15         : 2875  -271    11   439.0  32.52   106   666   578  49.33  ---  0.183
White advantage = 39.23 +/- 2.84
Draw rate (equal opponents) = 66.78 % +/- 0.74
Elo is not an adequate parametrization of the scaling. Ratings at longer time controls are subject to Elo compression, due to the increasing draw rate. So a weaker engine might appear to approach a stronger one Elo-wise (to gain strength relatively), but this might be just due to the increasing number of draws, without the relative strength being affected. The Win/Loss ratio of each engine in the list is more closely related to relative strength. Here I post the list of how the engines' Win/Loss ratios scale from Blitz TC to Long TC, together with 100*log10 of the scaling, so that the values are additive.
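A minimal Python sketch of the compression effect, assuming the standard logistic Elo formula; the helper name `elo_diff` and the sample W/L ratio of 2 are illustrative and not taken from the FGRL lists. With the Win/Loss ratio held fixed, raising the draw rate alone shrinks the Elo difference.

```python
import math

def elo_diff(win_loss_ratio, draw_rate):
    """Elo difference implied by a W/L ratio and a draw rate (logistic model)."""
    decisive = 1.0 - draw_rate                      # fraction of games that are not drawn
    wins = decisive * win_loss_ratio / (1.0 + win_loss_ratio)
    score = wins + 0.5 * draw_rate                  # expected score per game
    return 400.0 * math.log10(score / (1.0 - score))

# Same relative strength (W/L = 2), progressively more draws.
for d in (0.3, 0.5, 0.7):
    print(f"draw rate {d:.0%}: Elo diff {elo_diff(2.0, d):6.1f}")
```

The Elo difference falls as the draw rate rises even though the W/L ratio never changes, which is why the W/L ratio is used for the comparison here.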
Scaling to Long Time Control on one core (W1, L1 are the wins and losses at 10' + 6''; W2, L2 are the wins and losses at 60' + 15''):
#  Engine           : Scaling = (W2*L1)/(W1*L2)   100*log10(Scaling)
--------------------------------------------------------------------
1  Andscacs 0.89    : 1.252                         9.76
2  Fire 5           : 1.189                         7.52
3  Komodo 10.4      : 1.129                         5.27
4  Deep Shredder 13 : 1.101                         4.18
5  Stockfish 8      : 1.039                         1.66
6  Houdini 5.01     : 0.945                        -2.46
7  Chiron 4         : 0.937                        -2.83
8  Gull 3           : 0.921                        -3.57
9  Fizbo 1.9        : 0.872                        -5.95
10 Fritz 15         : 0.729                       -13.73
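The list above can be recomputed from the W and L columns of the two tables. This is a minimal Python sketch, taking W1, L1 from the 10' + 6'' table and W2, L2 from the 60' + 15'' table; the names `BLITZ`, `LONG`, `scaling` and `scaling_list` are illustrative. Values computed from the raw counts can differ from the posted ones in the last digit.

```python
import math

# (wins, losses) per engine, copied from the W and L columns of the two tables.
BLITZ = {  # 10' + 6''
    "Stockfish 8": (1209, 77),
    "Komodo 10.4": (1224, 146),
    "Houdini 5.01": (1193, 129),
    "Deep Shredder 13": (630, 550),
    "Fire 5": (542, 664),
    "Fizbo 1.9": (476, 804),
    "Gull 3": (399, 849),
    "Andscacs 0.89": (330, 1079),
    "Fritz 15": (282, 1122),
    "Chiron 4": (271, 1136),
}
LONG = {  # 60' + 15''
    "Stockfish 8": (587, 36),
    "Komodo 10.4": (615, 65),
    "Houdini 5.01": (516, 59),
    "Deep Shredder 13": (304, 241),
    "Fire 5": (287, 296),
    "Fizbo 1.9": (208, 403),
    "Gull 3": (172, 397),
    "Andscacs 0.89": (176, 460),
    "Fritz 15": (106, 578),
    "Chiron 4": (126, 562),
}

def scaling(w1, l1, w2, l2):
    """Long-TC Win/Loss ratio divided by the Blitz Win/Loss ratio."""
    return (w2 * l1) / (w1 * l2)

def scaling_list(first, second):
    """Return (engine, scaling, 100*log10(scaling)) rows, best scaling first."""
    rows = []
    for name, (w1, l1) in first.items():
        w2, l2 = second[name]
        s = scaling(w1, l1, w2, l2)
        rows.append((name, s, 100.0 * math.log10(s)))
    return sorted(rows, key=lambda row: row[1], reverse=True)

if __name__ == "__main__":
    for rank, (name, s, log_s) in enumerate(scaling_list(BLITZ, LONG), start=1):
        print(f"{rank:<2} {name:<16} : {s:6.3f} {log_s:7.2f}")
```

Because the log10 values are additive, the scaling of one engine relative to another is simply the difference of their 100*log10 entries.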
by Dann Corbit » Fri Apr 07, 2017 8:08 pm
So using this measure, Andscacs scales best with longer time and Fritz the worst.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
by Laskos » Fri Apr 07, 2017 8:16 pm
Dann Corbit wrote:So using this measure, Andscacs scales best with longer time and Fritz the worst.
Yes.
by fastgm » Fri Apr 07, 2017 8:49 pm
Hello Kai,
thank you very much for the comparison.
Here is the data from my third rating list, 60'' + 0.6'':
60'' + 0.6''
Ordo v1.2.6: 3000
#  Engine           :  Elo  Diff Games  Points    (%)     W     D     L   D(%)    W/L
-------------------------------------------------------------------------------------
1  Stockfish 8      : 3208     0  2250  1722.0  76.53  1308   828   114  36.80  11.47
2  Houdini 5        : 3205    -3  2250  1714.0  76.18  1319   790   141  35.11   9.35
3  Komodo 10.4      : 3184   -24  2250  1663.0  73.91  1263   800   187  35.56   6.75
4  Deep Shredder 13 : 3004  -204  2250  1148.0  51.02   675   946   629  42.04  1.073
5  Fire 5           : 2973  -235  2250  1053.0  46.80   635   836   779  37.16  0.815
6  Fizbo 1.9        : 2947  -261  2250   974.0  43.29   575   798   877  35.47  0.656
7  Gull 3           : 2918  -290  2250   884.5  39.31   459   851   940  37.82  0.489
8  Fritz 15         : 2858  -350  2250   711.0  31.60   337   748  1165  33.24  0.289
9  Andscacs 0.89    : 2858  -350  2250   708.5  31.49   372   673  1205  29.91  0.309
10 Chiron 4         : 2844  -364  2250   672.0  29.87   291   762  1197  33.87  0.243
White advantage = 34.62
Draw rate (equal opponents) = 46.34 %
by Laskos » Fri Apr 07, 2017 8:55 pm
Thank you very much, I will compute tomorrow morning the relative ratios from Bullet to Long Time Control.
by cdani » Fri Apr 07, 2017 9:28 pm
Laskos wrote:Dann Corbit wrote:So using this measure, Andscacs scales best with longer time and Fritz the worst.
Yes.
You can bet I tried very hard to obtain this.
by Dann Corbit » Fri Apr 07, 2017 9:42 pm
cdani wrote: You can bet I tried very hard to obtain this.
I guess that all the efforts to obtain this are via pruning, since all the experiments ran on a single thread (so it has nothing to do with SMP).
I think that this is the right direction for a giant win (the next big revolution, like null move and LMR were in their day).
by cdani » Fri Apr 07, 2017 9:51 pm
Dann Corbit wrote: I guess that all the efforts to obtain this are via pruning, since all the experiments ran on a single thread (so it has nothing to do with SMP).
I think that this is the right direction for a giant win (the next big revolution, like null move and LMR were in their day).
I can't point to one concrete cause. I try to make sure that every patch I accept scales well, or at least is neutral, so it's an accumulated effect. Anyway, even though this comes from long ago, I'm never sure whether the next patch I do will kill part of what was achieved, as of course I cannot test at very long time controls.
by JJJ » Fri Apr 07, 2017 10:00 pm
This confirms my intuition that Komodo scales better than Stockfish 8 with time.
by mjlef » Sat Apr 08, 2017 12:50 am
Laskos wrote: Thank you very much, I will compute tomorrow morning the relative ratios from Bullet to Long Time Control.
It would be great to calculate the same kind of scaling based on number of cores/threads. Of course more cores help you search deeper, just as longer time does, so that would have to be taken into account.
Kai, as always, great stuff! Thanks.
Mark