Laskos wrote:
The consistency is encouraging, but it does not prove yet that the methodology is correct.

bob wrote:
At least it suggests it is not somewhat random noise.

Indeed, I hope to make some sense out of this. I am encouraged by several indications that something is going on there:
1/ Andscacs-Sungorus is an abomination combining an extremely weak eval with a very strong search. Although it beats SOS, Fruit and Shredder 6 in normal games, it comes last in this list by far, as it should.
2/ Fruit was known to have a simple eval, and although it beats both Shredder 6 and SOS in normal games, it comes lower in this list, as it should.
3/ While strength is not a determining factor in the list, the newer and stronger engines still tend to score better, indicating progress over time, as it should.
I re-tested everything, taking more care with the averages. Between the opening-middlegame and endgame phases I took a weighted geometric average, giving the opening-middlegame double weight because it matters more in play than the endgame. I also used geometric rather than arithmetic averages when computing the mean for each count.
So, if a = number of nodes used during the late opening and b = number of nodes used during the endgame, the mean is (b*a^2)^(1/3) instead of the old (a+b)/2. The results are close to the old ones, which shows the apparent robustness of these Elos, and hopefully they relate mostly to the eval of the engines.
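To make the averaging concrete, here is a minimal Python sketch of the weighted geometric mean described above, next to the old arithmetic mean. The function names and the sample node counts are made up for illustration and are not taken from the actual test setup; only the formula (b*a^2)^(1/3) comes from the post.

```python
import math


def weighted_geometric_mean(opening_nodes: float, endgame_nodes: float) -> float:
    """Return (b * a^2)^(1/3), with a = opening_nodes and b = endgame_nodes.

    The opening-middlegame count gets weight 2 and the endgame count weight 1.
    Computed in log space, which is equivalent to the closed form.
    """
    if opening_nodes <= 0 or endgame_nodes <= 0:
        raise ValueError("node counts must be positive")
    log_mean = (2.0 * math.log(opening_nodes) + math.log(endgame_nodes)) / 3.0
    return math.exp(log_mean)


def arithmetic_mean(opening_nodes: float, endgame_nodes: float) -> float:
    """Return the old, unweighted mean (a + b) / 2."""
    return (opening_nodes + endgame_nodes) / 2.0


if __name__ == "__main__":
    # Hypothetical node counts, chosen only to illustrate the difference.
    a, b = 2000.0, 500.0
    print(f"weighted geometric mean: {weighted_geometric_mean(a, b):.1f}")
    print(f"arithmetic mean:         {arithmetic_mean(a, b):.1f}")
```

With these made-up inputs the two means come out close to each other (roughly 1260 versus 1250), which fits the observation that the new results stay near the old ones. The gap widens as the two phase counts diverge or as the weighting pulls the result toward the opening-middlegame count.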
The new, slightly more refined list, with node counts to depth 4 included:
 # PLAYER               : RATING  POINTS  PLAYED    (%)  NODES to depth 4
 1 Gaviota 1.0          :  153.9   706.5    1000  70.7%  1078
 2 Komodo 3             :  138.6   688.0    1000  68.8%  1852
 3 Houdini 4            :  128.9   676.0    1000  67.6%  1967
 4 Komodo 8             :  120.2   665.0    1000  66.5%  1418
 5 Houdini 1.5          :   91.4   627.5    1000  62.8%  2177
 6 RobboLito 0.085      :   62.3   588.0    1000  58.8%  1862
 7 Shredder 12 depth    :   13.0   518.5    1000  51.9%  2350
 8 Shredder 12          :    0.0  9569.0   17992  53.2%
 9 Stockfish 21.03.2015 :   -2.1   497.0    1000  49.7%   558
10 Andscacs 0.72        :  -20.1   469.5     996  47.1%  3351
11 Texel 1.05           :  -29.9   457.5    1000  45.8%  1823
12 Stockfish 2.1.1      :  -45.1   436.0    1000  43.6%  1899
13 Crafty 24.1          :  -84.7   380.0     996  38.2%  1364
14 Shredder 6PB         :  -92.1   371.5    1000  37.1%  3269
15 Shredder 9.1         : -104.6   355.0    1000  35.5%  3375
16 Strelka 2.0          : -122.9   331.5    1000  33.1%  4508
17 SOS 5.1              : -178.3   265.5    1000  26.6%  3079
18 Fruit 2.1            : -179.7   264.0    1000  26.4%  7268
19 Andscacs-Sung        : -339.4   126.0    1000  12.6%  3699