Komodo 10.3 vs. Stockfish 221216, TC 120+2

majortom
Posts: 669
Joined: Mon Nov 04, 2013 10:19 pm

Post by majortom »

Komodo 10.3 vs. Stockfish 221216, TC 120"+2", 1 core per engine, hash 2048 MB; each position will be played with both colours. 94 games played so far:

Code:


    Program                           Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish 221216 64 POPCNT     :   98   43  39    94    63.8 %      0   63.8 %
  2 Komodo 10.3 64-bit             :    0   39  43    94    36.2 %     98   63.8 %

Individual statistics:

  1 Stockfish 221216 64 POPCNT     :   98   94 (+ 30,= 60,-  4), 63.8 %
  2 Komodo 10.3 64-bit             :    0   94 (+  4,= 60,- 30), 36.2 %
Games
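
For anyone who wants to check the Elo column by hand, here is a minimal sketch that turns a win/draw/loss record into a score percentage and an Elo difference. It assumes the standard logistic Elo formula; the rating tool that produced the table may use a slightly different model, so the result can differ from the table by about a point. The function names are made up for illustration.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Stockfish's record after 94 games, taken from the table above.
s = score_fraction(wins=30, draws=60, losses=4)
print(f"score = {100 * s:.1f} %, Elo diff = {elo_diff(s):+.1f}")
```

With the 94-game record this lands within a point of the 98 Elo shown in the table.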
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by Lyudmil Tsvetkov »

majortom wrote:Komodo 10.3 vs. Stockfish 221216, TC 120"+2", 1 core per engine, hash 2048 MB; each position will be played with both colours. 94 games played so far: [...]

Interesting; seemingly, play against Houdini improved dramatically, while play against Stockfish did not change at all or changed for the worse.

The overall strength increase is evident, but I would like to see some more statistically significant tests (for example Ingo's or CCRL) before concluding where exactly Komodo now stands: 1st, 2nd or 3rd.
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

Forgot to add some info:

Komodo is playing with default contempt in the two matches with long TC (1800"+10"), and with contempt 0 in this match with TC 120"+2".
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

BTW, in the last 16 games Komodo played well: +3 -0 =13. After 168 games:

Code:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish 221216 64 POPCNT     :   74   30  29   168    60.4 %      0   67.3 %
  2 Komodo 10.3 64-bit             :    0   29  30   168    39.6 %     74   67.3 %

Individual statistics:

  1 Stockfish 221216 64 POPCNT     :   74  168 (+ 45,=113,- 10), 60.4 %
  2 Komodo 10.3 64-bit             :    0  168 (+ 10,=113,- 45), 39.6 %
Games
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

Lyudmil Tsvetkov wrote:but I would like to see some statistically more relevant tests (for example Ingo's or CCRL) to conclude where exactly Komodo now stands: 1st, 2nd or 3rd.
I don't put much trust in rating lists with a large number of engines of very different strength levels, where there are very few direct games between the top engines compared with the games against much weaker opponents.
I guess that's why Komodo does not have contempt 0 by default.
I don't like the formula:

Code:

if a > c by 100 Elo and b > c by 100 Elo, then a = b
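
To make the objection concrete, here is a small sketch of how that inference falls out of a rating model. It assumes the standard logistic expected-score formula and uses the hypothetical engines a, b and c from the formula above with made-up ratings; it does not reproduce how any particular rating list is calculated.

```python
def expected_score(rating_diff):
    """Expected score for the higher-rated side (logistic Elo model, assumed)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))

# Hypothetical ratings: a and b each measured only against c, both at +100.
rating = {"a": 100.0, "b": 100.0, "c": 0.0}

a_vs_c = expected_score(rating["a"] - rating["c"])
b_vs_c = expected_score(rating["b"] - rating["c"])

# The model then predicts a vs b without a single direct game between them.
a_vs_b = expected_score(rating["a"] - rating["b"])

print(f"a vs c: {100 * a_vs_c:.1f} %, b vs c: {100 * b_vs_c:.1f} %")
print(f"predicted a vs b: {100 * a_vs_b:.1f} %")
```

The prediction for a vs b is exactly 50 %, whatever would actually happen in a direct match; that is the assumption a head-to-head match like this one avoids.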
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

After 240 games:

Code:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish 221216 64 POPCNT     :   68   26  25   240    59.8 %      0   66.2 %
  2 Komodo 10.3 64-bit             :    0   25  26   240    40.2 %     68   66.2 %

    Individual statistics:

  1 Stockfish 221216 64 POPCNT     :   68  240 (+ 64,=159,- 17), 59.8 %
  2 Komodo 10.3 64-bit             :    0  240 (+ 17,=159,- 64), 40.2 %
Games
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

After 280 games:

Code:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish 221216 64 POPCNT     :   64   23  22   280    59.1 %      0   68.2 %
  2 Komodo 10.3 64-bit             :    0   22  23   280    40.9 %     64   68.2 %

    Individual statistics:
  
  1 Stockfish 221216 64 POPCNT     :   64  280 (+ 70,=191,- 19), 59.1 %
  2 Komodo 10.3 64-bit             :    0  280 (+ 19,=191,- 70), 40.9 %
Games
majortom

Re: Komodo 10.3 vs. Stockfish 221216, TC 120+2

Post by majortom »

Match is finished:

Code:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Stockfish 221216 64 POPCNT     :   64   22  21   300    59.2 %      0   68.3 %
  2 Komodo 10.3 64-bit C0          :    0   21  22   300    40.8 %     64   68.3 %

    Individual statistics:

  1 Stockfish 221216 64 POPCNT     :   64  300 (+ 75,=205,- 20), 59.2 %
  2 Komodo 10.3 64-bit C0          :    0  300 (+ 20,=205,- 75), 40.8 %
Games

The best result is with contempt -10 so far.
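
The + and - columns in the tables can be approximated in the same way. The sketch below assumes a normal approximation with the per-game variance of a win/draw/loss result and a 95 % interval (z = 1.96); the rating tool may compute its margins differently, and the function name is made up for illustration.

```python
import math

def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo difference for a W/D/L record.

    Assumes the logistic Elo model and a normal approximation of the score.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)

    def to_elo(s):
        return 400.0 * math.log10(s / (1.0 - s))

    return to_elo(score - margin), to_elo(score), to_elo(score + margin)

# Final record from Stockfish's side: +75 =205 -20 over 300 games.
low, mid, high = elo_interval(75, 205, 20)
print(f"Elo diff = {mid:.0f} (+{high - mid:.0f} / -{mid - low:.0f})")
```

For the final 300-game record this gives margins close to the +22 / -21 in the table. The interval stays well above zero, so the result at this time control is unlikely to be noise.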