majortom
Posts: 669 Joined: Mon Nov 04, 2013 10:19 pm
by majortom » Sun Dec 25, 2016 6:24 pm
Komodo 10.3 vs. Stockfish 221216, TC 120"+2", 1 core per engine, hash 2048 MB, each position will be played with both colours. 94 games played so far:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  98  43  39    94  63.8 %       0  63.8 %
2 Komodo 10.3 64-bit         :   0  39  43    94  36.2 %      98  63.8 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  98    94 (+ 30,= 60,-  4), 63.8 %
2 Komodo 10.3 64-bit         :   0    94 (+  4,= 60,- 30), 36.2 %
Games
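For readers who want to check the Elo column, here is a minimal sketch that converts a win/draw/loss record into an Elo difference with the standard logistic formula. Whether the rating tool behind the table applies exactly this formula is an assumption, and the function names are made up for illustration, so small differences from the posted numbers are possible.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Stockfish's record after 94 games, taken from the table: +30 =60 -4
s = score_fraction(30, 60, 4)
print(f"score = {100 * s:.1f} %, Elo difference = {elo_diff(s):+.0f}")
```

The same two functions can be applied to the later updates by swapping in the new win/draw/loss counts.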
Lyudmil Tsvetkov
Posts: 6052 Joined: Tue Jun 12, 2012 12:41 pm
by Lyudmil Tsvetkov » Mon Dec 26, 2016 9:36 am
majortom wrote: Komodo 10.3 vs. Stockfish 221216, TC 120"+2", 1 core per engine, hash 2048 MB, each position will be played with both colours. 94 games played so far:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  98  43  39    94  63.8 %       0  63.8 %
2 Komodo 10.3 64-bit         :   0  39  43    94  36.2 %      98  63.8 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  98    94 (+ 30,= 60,-  4), 63.8 %
2 Komodo 10.3 64-bit         :   0    94 (+  4,= 60,- 30), 36.2 %
Games
interesting; seemingly, play against H improved dramatically, while play against SF did not change at all or changed for the worse.
the overall strength increase is evident, but I would like to see some statistically more relevant tests (for example Ingo's or CCRL) to conclude where exactly Komodo now stands: 1st, 2nd or 3rd.
by majortom » Mon Dec 26, 2016 10:18 am
Forgot to add some info:
Komodo is playing with default contempt in the two matches with long TC (1800"+10"), and with contempt 0 in this match with TC 120"+2".
by majortom » Mon Dec 26, 2016 10:34 am
BTW in the last 16 games Komodo played well: +3 -0 =13 and after 168 games:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  74  30  29   168  60.4 %       0  67.3 %
2 Komodo 10.3 64-bit         :   0  29  30   168  39.6 %      74  67.3 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  74   168 (+ 45,=113,- 10), 60.4 %
2 Komodo 10.3 64-bit         :   0   168 (+ 10,=113,- 45), 39.6 %
Games
by majortom » Mon Dec 26, 2016 11:54 am
Lyudmil Tsvetkov wrote: but I would like to see some statistically more relevant tests (for example Ingo's or CCRL) to conclude where exactly Komodo now stands: 1st, 2nd or 3rd.
I don't put much trust in rating lists with a large number of engines of very different strength levels, where there are very few direct games between the top engines compared with the games against much weaker opponents.
I guess that's why Komodo does not have contempt 0 by default.
I don't like the formula:
if a > c by 100 Elo and b > c by 100 Elo, then a = b
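The objection can be made concrete with the usual logistic Elo model, sketched here in Python. In that model a rating is a single number, so two engines that each score the same against a common opponent c end up with the same rating and a predicted 50 % score against each other, whatever a direct match would show. The function name and the ratings are hypothetical and only illustrate the assumption being criticised.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# Hypothetical ratings: a and b are each 100 Elo above c,
# measured only through games against c.
c = 0.0
a = c + 100.0
b = c + 100.0

print(f"a vs c: {expected_score(a, c):.3f}")  # roughly 0.64
print(f"b vs c: {expected_score(b, c):.3f}")  # roughly 0.64
print(f"a vs b: {expected_score(a, b):.3f}")  # the model can only answer 0.5
```

A direct match between a and b is the only way to test that last prediction, which is the point of running head-to-head games between the top engines.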
by majortom » Tue Dec 27, 2016 8:53 am
After 240 games:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  68  26  25   240  59.8 %       0  66.2 %
2 Komodo 10.3 64-bit         :   0  25  26   240  40.2 %      68  66.2 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  68   240 (+ 64,=159,- 17), 59.8 %
2 Komodo 10.3 64-bit         :   0   240 (+ 17,=159,- 64), 40.2 %
Games
by majortom » Tue Dec 27, 2016 8:39 pm
After 280 games:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  64  23  22   280  59.1 %       0  68.2 %
2 Komodo 10.3 64-bit         :   0  22  23   280  40.9 %      64  68.2 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  64   280 (+ 70,=191,- 19), 59.1 %
2 Komodo 10.3 64-bit         :   0   280 (+ 19,=191,- 70), 40.9 %
Games
by majortom » Fri Dec 30, 2016 8:45 pm
Match is finished:
  Program                      Elo   +   - Games   Score  Av.Op.   Draws
1 Stockfish 221216 64 POPCNT :  64  22  21   300  59.2 %       0  68.3 %
2 Komodo 10.3 64-bit C0      :   0  21  22   300  40.8 %      64  68.3 %
Individual statistics:
1 Stockfish 221216 64 POPCNT :  64   300 (+ 75,=205,- 20), 59.2 %
2 Komodo 10.3 64-bit C0      :   0   300 (+ 20,=205,- 75), 40.8 %
Games
The best result is with contempt -10 so far.
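As a rough check on the + and - columns of the final table, the sketch below estimates a 95 % interval for the Elo difference from the win/draw/loss counts, using a normal approximation to the per-game score. That the rating tool computes its margins this way is an assumption, and the helper names are hypothetical.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margins(wins, draws, losses, z=1.96):
    """Return (Elo difference, upper margin, lower margin).

    Uses a normal approximation for the mean score per game;
    z = 1.96 corresponds to a two-sided 95 % interval.
    """
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n
    # Sample variance of a single game result (1, 0.5 or 0 points).
    var = (wins * (1.0 - s) ** 2
           + draws * (0.5 - s) ** 2
           + losses * s ** 2) / n
    half_width = z * math.sqrt(var / n)
    centre = elo_diff(s)
    plus = elo_diff(s + half_width) - centre
    minus = centre - elo_diff(s - half_width)
    return centre, plus, minus

# Stockfish's final record from the table: +75 =205 -20
centre, plus, minus = elo_with_margins(75, 205, 20)
print(f"Elo {centre:+.0f}  (+{plus:.0f} / -{minus:.0f})")
```

With 300 games the interval is still roughly 20 Elo on either side, which is why a result with a different contempt setting needs a similar number of games before the two can be compared.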