Dann Corbit provided me with an excellent EPD file of 1245 KID (King's Indian Defence) opening positions. KID plans are 15-20 moves long, so I used peculiar adjudication conditions:
1/ When both engines agree on an advantage of at least 70cp for one side on 2 consecutive moves, the game is adjudicated as a Win
2/ When the engines play 20 moves from the opening without a Win, the game is adjudicated as a Draw
The purpose of these adjudication conditions is to show the serious advantage one gets out of KID openings, without caring how the engines convert it (conversion is no longer related to the KID opening itself).
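To make the two rules concrete, here is a minimal Python sketch of how I read them. It is not LittleBlitzer's actual code: the function name `adjudicate`, the input format (one pair of evaluations per move, both in centipawns from White's point of view) and the result strings are my own assumptions for illustration.

```python
from typing import Iterable, Tuple

WIN_CP = 70      # rule 1: advantage threshold in centipawns
WIN_MOVES = 2    # rule 1: consecutive moves on which both engines must agree
DRAW_MOVES = 20  # rule 2: move limit counted from the opening position

def adjudicate(evals: Iterable[Tuple[int, int]]) -> str:
    """Hypothetical adjudicator.

    evals yields one (engine_a_cp, engine_b_cp) pair per move, both scores
    expressed from White's point of view (an assumption of this sketch).
    """
    white_streak = black_streak = 0
    for move_no, (a, b) in enumerate(evals, start=1):
        # Both engines must see at least WIN_CP for the same side.
        white_streak = white_streak + 1 if min(a, b) >= WIN_CP else 0
        black_streak = black_streak + 1 if max(a, b) <= -WIN_CP else 0
        if white_streak >= WIN_MOVES:
            return "1-0"
        if black_streak >= WIN_MOVES:
            return "0-1"
        if move_no >= DRAW_MOVES:
            return "1/2-1/2"
    return "*"  # game still in progress, nothing to adjudicate yet
```

For example, `adjudicate([(10, 20), (80, 75), (90, 71)])` gives a White win on the third move, while 20 moves of level evaluations give a Draw.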
I can use these conditions with LittleBlitzer, but not with Cutechess-Cli. Cutechess has a broken adjudication implementation, which works reasonably under normal conditions, but not under these peculiar ones. I used the top 7 engines at 1 second per move.
The LittleBlitzer output was the following:
Games Completed = 12600 of 12600 (Avg game length = 37.623 sec)
Settings = RR/32MB/1000ms per move/M 70cp for 2 moves, D 20 moves/EPD:C:\LittleBlitzer\greasy_kid_stuff.epd(1245)
Time = 63237 sec elapsed, 0 sec remaining
1. Komodo 11.01 64-bit 2062.0/3600 861-337-2402 (L: m=0 t=0 i=0 a=337) (D: r=6 i=0 f=0 s=0 a=2396) (tpm=1009.9 d=16.44 nps=992868)
2. Stockfish 260517 64 BMI2 1945.5/3600 877-586-2137 (L: m=0 t=0 i=0 a=586) (D: r=12 i=0 f=0 s=0 a=2125) (tpm=1010.2 d=17.11 nps=1109620)
3. Houdini 5.01 Pro x64-popc 1802.5/3600 783-778-2039 (L: m=0 t=0 i=0 a=778) (D: r=15 i=0 f=0 s=0 a=2024) (tpm=1009.0 d=15.09 nps=1417130)
4. Deep Shredder 13 x64 1720.0/3600 564-724-2312 (L: m=0 t=0 i=0 a=724) (D: r=18 i=0 f=0 s=0 a=2294) (tpm=1013.8 d=17.31 nps=1173046)
5. Gull 3 x64 1682.0/3600 576-812-2212 (L: m=0 t=0 i=0 a=812) (D: r=12 i=0 f=0 s=0 a=2200) (tpm=1012.3 d=14.68 nps=1584731)
6. Andscacs 0.91b 1682.0/3600 385-621-2594 (L: m=1 t=0 i=0 a=620) (D: r=16 i=0 f=0 s=0 a=2578) (tpm=1021.2 d=15.56 nps=839012)
7. Fritz 15 1706.0/3600 478-666-2456 (L: m=0 t=0 i=0 a=666) (D: r=13 i=0 f=0 s=0 a=2443) (tpm=985.5 d=14.08 nps=524156)
I then performed the rating calculations. ELO is not very relevant here: it depends too much on the Draw rate, and the Draw rate, which is determined by the adjudication conditions, is not what matters in this test. I used WILO and Normalized ELO, which, not only here but in general, are better indicators of strength.
WILO:
# PLAYER : RATING ERROR POINTS PLAYED (%) CFS(next)
1 Komodo 11.01 64-bit : 3140.6 21.0 861.0 1198 71.9 100
2 Stockfish 260517 64 BMI2 : 3068.2 18.9 877.0 1463 59.9 100
3 Houdini 5.01 Pro x64-popc : 3010.9 18.1 783.0 1561 50.2 100
4 Deep Shredder 13 x64 : 2960.0 19.8 564.0 1288 43.8 84
5 Fritz 15 : 2944.3 20.3 478.0 1144 41.8 54
6 Gull 3 x64 : 2942.6 18.3 576.0 1388 41.5 72
7 Andscacs 0.91b : 2933.4 22.4 385.0 1006 38.3 ---
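As a quick check of what the WILO table counts, the PLAYED and (%) columns can be reproduced from the LittleBlitzer win-loss-draw figures by dropping the draws. The sketch below is only that arithmetic; the function name `decisive_score` is my own, and the RATING column itself comes from the rating tool's fit over the whole pool, which this does not attempt to reproduce.

```python
def decisive_score(wins: int, losses: int) -> float:
    """Score percentage over decisive games only (draws ignored)."""
    played = wins + losses  # corresponds to the PLAYED column
    return 100.0 * wins / played

# Wins and losses taken from the LittleBlitzer output above.
for name, w, l in [("Komodo 11.01", 861, 337),
                   ("Stockfish 260517", 877, 586),
                   ("Houdini 5.01", 783, 778)]:
    print(f"{name}: {w + l} decisive games, {decisive_score(w, l):.1f}%")
```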
Normalized ELO:
# PLAYER : RATING ERROR POINTS PLAYED
1 Komodo 11.01 64-bit : 0.261 0.033 2062.0 3600
2 Stockfish 260517 64 BMI2 : 0.128 0.033 1945.5 3600
3 Houdini 5.01 Pro x64-popc : 0.002 0.033 1802.5 3600
4 Deep Shredder 13 x64 : -0.075 0.033 1720.0 3600
5 Fritz 15 : -0.093 0.033 1706.0 3600
6 Gull 3 x64 : -0.106 0.033 1682.0 3600
7 Andscacs 0.91b : -0.125 0.033 1682.0 3600
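The Normalized ELO figures above appear consistent with the mean score minus 0.5, divided by the per-game standard deviation of the score. That formula is my assumption from checking the numbers, not something stated by the rating tool, and the function name `normalized_elo` is hypothetical. A minimal Python sketch:

```python
from math import sqrt

def normalized_elo(wins: int, losses: int, draws: int) -> float:
    """(mean score - 0.5) / per-game standard deviation (assumed formula)."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n       # mean result per game
    mean_sq = (wins + 0.25 * draws) / n    # mean of squared results
    sigma = sqrt(mean_sq - score ** 2)     # per-game standard deviation
    return (score - 0.5) / sigma

# Win-loss-draw counts taken from the LittleBlitzer output above.
print(round(normalized_elo(861, 337, 2402), 3))  # Komodo 11.01
print(round(normalized_elo(877, 586, 2137), 3))  # Stockfish 260517
```

Because the draws enter through the standard deviation, a higher Draw rate shrinks sigma as well as the score margin, which is why this measure is less sensitive to the adjudication conditions than plain ELO.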
Komodo 11.01 came out as the clear winner in this KID contest, although under normal conditions and with general openings it is significantly inferior to Stockfish dev at this time control.
For me as a patzer, this result is more convincing than the result of the thread started by Om Prakash's question.