Symmetric multiprocessing (SMP) scaling - SF8 Contempt=10

fastgm · Post by **fastgm** » Sat May 13, 2017 4:32 am

Symmetric multiprocessing (SMP) scaling
Stockfish 8 with and without Contempt Factor under Windows & Linux

Windows: Windows 10 Professional 64-Bit, Dual AMD Opteron 6376 @ 2.3 GHz
Linux: Ubuntu Server 16.04 LTS (HVM) 64-Bit, Amazon EC2 Instance, m4.16xlarge, Intel Xeon E5-2686v4 @ 2.3 GHz

Stockfish: default settings, except Contempt, 128 MB Hash
Cutechess-Cli: no draw and resign rules, no ponder, no learning, no tablebases, 1500 different opening positions, alternating colors
TC = time control, T = number of threads, Depth = Average search depth, Elostat Start Elo = 3000

Here are the results, also available as a PDF file:
http://www.fastgm.de/schach/SMP-scaling-SF8-C10.pdf


Windows – 1 thread vs 2 threads
 
TC = 60" + 0.05"  – default Contempt = 0                                    Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth  
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T2  : 3031   6   6   3000    58.7 %   73.7 %  21.59         1 Stockfish 8 T2  : 3035   6   6   3000    60.0 %   71.3 %  21.90
  2 Stockfish 8 T1  : 2969   6   6   3000    41.3 %   73.7 %  20.33         2 Stockfish 8 T1  : 2965   6   6   3000    40.0 %   71.3 %  20.51
 
  Result     : 1761.0/3000 (+655,=2212,-133)                                Result     : 1800.0/3000 (+731,=2138,-131)
  Perf.      : 58.7 %                                                       Perf.      : 60.0 %
  Elo        : 3061                                                         Elo        : 3070                  +9 Elo

---------------------------------------------------------------------------------------------------------------------------------------------- 
 
Linux – 1 thread vs 2 threads
 
TC = 10" + 0.1"  – default Contempt = 0                                     Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T2  : 3039   7   7   3000    61.0 %   68.1 %  19.84         1 Stockfish 8 T2  : 3043   7   7   3000    62.0 %   63.0 %  20.56
  2 Stockfish 8 T1  : 2961   7   7   3000    39.0 %   68.1 %  18.73         2 Stockfish 8 T1  : 2957   7   7   3000    38.0 %   63.0 %  19.39
 
  Result     : 1830.0/3000 (+809,=2042,-149)                                Result     : 1860.0/3000 (+915,=1890,-195)
  Perf.      : 61.0 %                                                       Perf.      : 62.0 %
  Elo        : 3078                                                         Elo        : 3085                  +8 Elo

----------------------------------------------------------------------------------------------------------------------------------------------  
 
Windows – 1 thread vs 4 threads
 
TC = 60" + 0.05" – default Contempt = 0                                     Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth 
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T4  : 3056   7   7   3000    65.6 %   65.7 %  23.08         1 Stockfish 8 T4  : 3062   7   7   3000    67.2 %   62.1 %  23.21
  2 Stockfish 8 T1  : 2944   7   7   3000    34.4 %   65.7 %  20.66         2 Stockfish 8 T1  : 2938   7   7   3000    32.8 %   62.1 %  20.69

  Result     : 1968.0/3000 (+982,=1972,-46)                                 Result     : 2015.5/3000 (+1084,=1863,-53)
  Perf.      : 65.6 %                                                       Perf.      : 67.2 %
  Elo        : 3112                                                         Elo        : 3124                  +12 Elo

----------------------------------------------------------------------------------------------------------------------------------------------  
 
Linux – 1 thread vs 4 threads
 
TC = 10" + 0.1" – default Contempt = 0                                      Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T4  : 3064   7   7   3000    67.6 %   61.2 %  21.41         1 Stockfish 8 T4  : 3074   8   8   3000    70.1 %   56.0 %  22.20
  2 Stockfish 8 T1  : 2936   7   7   3000    32.4 %   61.2 %  19.39         2 Stockfish 8 T1  : 2926   8   8   3000    29.9 %   56.0 %  19.97

  Result     : 2028.5/3000 (+1111,=1835,-54)                                Result     : 2104.0/3000 (+1264,=1680,-56)
  Perf.      : 67.6 %                                                       Perf.      : 70.1 %
  Elo        : 3128                                                         Elo        : 3148                  +20 Elo

----------------------------------------------------------------------------------------------------------------------------------------------  
  
Windows – 1 thread vs 8 threads
 
TC = 60" + 0.05" – default Contempt = 0                                     Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth 
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T8  : 3079   8   8   3000    71.2 %   56.4 %  23.87         1 Stockfish 8 T8  : 3090   8   8   3000    73.8 %   50.7 %  24.25
  2 Stockfish 8 T1  : 2921   8   8   3000    28.8 %   56.4 %  20.40         2 Stockfish 8 T1  : 2910   8   8   3000    26.2 %   50.7 %  20.42
 
  Result     : 2135.5/3000 (+1289,=1693,-18)                                Result     : 2215.0/3000 (+1454,=1522,-24)
  Perf.      : 71.2 %                                                       Perf.      : 73.8 %
  Elo        : 3157                                                         Elo        : 3180                  +23 Elo

----------------------------------------------------------------------------------------------------------------------------------------------  
 
Linux – 1 thread vs 8 threads
 
TC = 10" + 0.1" – default Contempt = 0                                      Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T8  : 3093   8   8   3000    74.4 %   50.4 %  22.35         1 Stockfish 8 T8  : 3101   9   9   3000    76.2 %   45.6 %  23.22
  2 Stockfish 8 T1  : 2907   8   8   3000    25.6 %   50.4 %  19.38         2 Stockfish 8 T1  : 2899   9   9   3000    23.8 %   45.6 %  20.09
 
  Result     : 2232.5/3000 (+1477,=1511,-12)                                Result     : 2285.5/3000 (+1601,=1369,-30)
  Perf.      : 74.4 %                                                       Perf.      : 76.2 %
  Elo        : 3185                                                         Elo        : 3202                  +17 Elo
 
---------------------------------------------------------------------------------------------------------------------------------------------- 
 
Linux – 1 thread vs 16 threads
 
TC = 10" + 0.1" – default Contempt = 0                                      Contempt = 10
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T16 : 3113   13  13  1516    78.7 %   42.0 %  23.33         1 Stockfish 8 T16 : 3123   10  10  3000    80.4 %   38.5 %  23.96
  2 Stockfish 8 T1  : 2887   13  13  1516    21.3 %   42.0 %  19.33         2 Stockfish 8 T1  : 2877   10  10  3000    19.6 %   38.5 %  19.89

  Result     : 1192.5/1516 (+874,=637,-5)                                   Result     : 2411.5/3000 (+1834,=1155,-11)
  Perf.      : 78.7 %                                                       Perf.      : 80.4 %
  Elo        : 3227                                                         Elo        : 3245                  +18 Elo

----------------------------------------------------------------------------------------------------------------------------------------------  

Linux – 1 thread vs 32 threads
 
TC = 10" + 0.1" – default Contempt = 0                                      Contempt = 10 
 
    Program           Elo    +   -   Games   Score    Draws   Depth           Program           Elo    +   -   Games   Score    Draws   Depth
 -------------------------------------------------------------------       -------------------------------------------------------------------
  1 Stockfish 8 T32 : 3115   13  13  1502    79.0 %   41.5 %  23.03         1 Stockfish 8 T32 : 3130   15  14  1510    81.7 %   36.0 %  23.89
  2 Stockfish 8 T1  : 2885   13  13  1502    21.0 %   41.5 %  19.45         2 Stockfish 8 T1  : 2870   14  15  1510    18.3 %   36.0 %  20.07

  Result     : 1186.0/1502 (+874,=624,-4)                                   Result     : 1234.0/1510 (+962,=544,-4)
  Perf.      : 79.0 %                                                       Perf.      : 81.7 %
  Elo        : 3230                                                         Elo        : 3260                  +30 Elo

.
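For readers who want to reproduce the "Elo" line under each table: the values shown appear consistent with taking the Start Elo of 3000 and adding the Elo difference implied by the match score under the usual logistic formula. The following is a minimal sketch under that assumption. The function names are made up for illustration, and this is not necessarily how Elostat computes its figures internally.

```python
import math

START_ELO = 3000  # "Elostat Start Elo" from the test setup above


def elo_diff(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400 * math.log10(score / (1 - score))


def match_elo(wins, draws, losses):
    """Return (score fraction, Start Elo + implied Elo difference)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return score, START_ELO + elo_diff(score)


# Windows, 1 thread vs 2 threads, Contempt = 0: (+655,=2212,-133)
score, elo = match_elo(655, 2212, 133)
print(f"{score:.1%} -> {elo:.0f}")  # prints "58.7% -> 3061"
```

Feeding in the Contempt = 10 result of the same table (+731,=2138,-131) gives 3070 in the same way.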

lucasart · Post by **lucasart** » Sun May 14, 2017 3:02 am

Thank you for this high quality test.

So Contempt really has a bigger effect than I thought for large elo differences. Or perhaps the effect is magnified due to self-testing, where repetition draws are pervasive (all Contempt does in SF is penalize repetition draws).

This does demonstrate what I was saying in the other thread:
http://www.talkchess.com/forum/viewtopic.php?t=63903

Komodo does not have better SMP scaling than SF. It's an optical illusion, explained by Contempt, and possibly by better single-threaded scaling (better eval and search, not SMP related).

Looking at 1 vs 16 thread Linux result with Contempt=0:
* Komodo scores 80.8% (http://www.talkchess.com/forum/viewtopic.php?t=63955)
* Stockfish scores 78.7% (see above)

This tiny difference can easily be explained by any of the following: (1) error bars (only 1500 games), (2) Komodo's contempt could be better than SF's, because it also avoids exchanges, which SF doesn't do, (3) Komodo may also outscale SF on a single thread.
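To put a number on point (1), here is a rough sketch of how an error margin can be estimated from the win/draw/loss counts of a single match. It uses a plain normal approximation on the per-game results and the logistic Elo formula. Both are assumptions made for illustration, not necessarily the method behind the "+" and "-" columns in the tables, and the function name is hypothetical.

```python
import math


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (Elo difference, approximate 95% half-width) for one match."""
    n = wins + draws + losses
    s = (wins + 0.5 * draws) / n

    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def to_elo(p):
        return 400 * math.log10(p / (1 - p))

    # Convert the score interval to Elo and report half of its width.
    low, high = to_elo(s - z * se), to_elo(s + z * se)
    return to_elo(s), (high - low) / 2


# Stockfish 8, 16 threads vs 1 thread, Linux, Contempt = 0: (+874,=637,-5)
diff, margin = elo_with_margin(874, 637, 5)
print(f"{diff:+.0f} Elo, roughly +/- {margin:.0f}")
```

With about 1500 games the margin comes out in the low teens of Elo, the same order of magnitude as the figures in the table.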

Looking at Komodo's Contempt=0 vs. Contempt=10 results, we also see that Komodo's contempt implementation is a lot better than SF's. It gains a lot more Elo from weaker opponents than SF does: +65 Elo on 16 threads (Linux), whereas SF only gets +18 in the same test.

In turn, this explains why Komodo 10.4 is on par with SF 8 in rating lists, but significantly weaker in face-to-face matches. It just racks up a lot more Elo by avoiding draws with weaker opponents.

mjlef · Post by **mjlef** » Mon May 15, 2017 3:37 am

lucasart wrote:Thank you for this high quality test.

So Contempt really has a bigger effect than I thought for large elo differences. Or perhaps the effect is magnified due to self-testing, where repetition draws are pervasive (all Contempt does in SF is penalize repetition draws).

This does demonstrate what I was saying in the other thread:
http://www.talkchess.com/forum/viewtopic.php?t=63903

Komodo does not have better SMP scaling than SF. It's an optical illusion, explained by Contempt, and possibly by better single-threaded scaling (better eval and search, not SMP related).

Looking at 1 vs 16 thread Linux result with Contempt=0:
* Komodo scores 80.8% (http://www.talkchess.com/forum/viewtopic.php?t=63955)
* Stockfish scores 78.7% (see above)

This tiny difference can easily be explained by any of the following: (1) error bars (only 1500 games), (2) Komodo's contempt could be better than SF's, because it also avoids exchanges, which SF doesn't do, (3) Komodo may also outscale SF on a single thread.

Looking at Komodo's Contempt=0 vs. Contempt=10 results, we also see that Komodo's contempt implementation is a lot better than SF's. It gains a lot more Elo from weaker opponents than SF does: +65 Elo on 16 threads (Linux), whereas SF only gets +18 in the same test.

In turn, this explains why Komodo 10.4 is on par with SF 8 in rating lists, but significantly weaker in face-to-face matches. It just racks up a lot more Elo by avoiding draws with weaker opponents.

I agree with 2. I do not think 1 applies (see below) and I am less sure about 3, although I judge it to be likely true, past the 2 minute level.

Andreas sent us some other information where he did a scaling run of Komodo using a Contempt of 0. I hope he posts it here since it gives more information about the effect of Komodo's Contempt. I agree that Komodo's Contempt is more effective. In the 1 thread versus 32 threads test, Komodo with a Contempt of 10 scored 345 Elo stronger. With Contempt set to 0, it was 271 Elo stronger. So Contempt at this level was worth a nice gain of about 75 Elo. But Stockfish with Contempt=0 gained 230 Elo going from 1 to 32 threads. That is a difference of 41 Elo compared with the Contempt=0 Komodo 1:32 results. The error margins were about 15 Elo at these higher thread counts (it is hard to get a lot of games in). But 41 is well past 15, of course. So Komodo's better Elo scaling at higher thread counts is not explained by the error margins.
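The arithmetic in the paragraph above can be checked in a few lines. The sketch below uses only the figures quoted in the post. The one added assumption, labeled in the code, is that the two results being compared are independent, so their error margins combine in quadrature instead of being compared against a single margin of 15.

```python
import math

# Elo gains for 1 thread vs 32 threads, as quoted in the post.
komodo_c10 = 345    # Komodo, Contempt = 10
komodo_c0 = 271     # Komodo, Contempt = 0
stockfish_c0 = 230  # Stockfish 8, Contempt = 0
margin = 15         # approximate error margin quoted for each result

contempt_gain = komodo_c10 - komodo_c0   # value of Contempt for Komodo
scaling_gap = komodo_c0 - stockfish_c0   # Komodo vs Stockfish at Contempt = 0

# Assumption: the two matches are independent, so the margin of their
# difference is sqrt(margin^2 + margin^2), not just one margin.
combined_margin = math.hypot(margin, margin)

print(contempt_gain, scaling_gap, round(combined_margin, 1))  # prints "74 41 21.2"
```

The quoted 345 and 271 give 74, close to the 75 mentioned in the post, presumably a rounding difference in the underlying figures. Even against the wider combined margin of about 21 Elo, the 41 Elo gap still lies outside it.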

Comparing the Stockfish 8 Contempt = 0 results for 1:16 and 1:32 shows a gain of just 3 Elo. Komodo going from 1:16 to 1:32, also with a Contempt of 0, gains 21 Elo, again past the error margin. I would love more games to lower the error margins, but the data suggests Komodo still scales better with more cores, even with Contempt set to 0.
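The gain per doubling of threads can be read off the Stockfish tables in the first post. The sketch below converts the Linux Contempt = 0 scores into Elo over the 1-thread engine with the logistic formula (an assumption, as is the `gains` helper name) and prints what each doubling adds. Each figure comes from a separate match against 1 thread with its own error margin, so the per-doubling values are only indicative.

```python
import math

# Linux, Contempt = 0: score of the T-thread engine against 1 thread,
# copied from the tables in the first post.
scores = {2: 0.610, 4: 0.676, 8: 0.744, 16: 0.787, 32: 0.790}


def elo_diff(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400 * math.log10(score / (1 - score))


gains = {}       # Elo added by each doubling of the thread count
previous = 0.0   # Elo over T1 at the previous thread count (T1 itself = 0)
for threads, score in scores.items():
    elo = elo_diff(score)
    gains[threads] = elo - previous
    print(f"T{threads:<2} {elo:6.1f} Elo over T1, {gains[threads]:+6.1f} from this doubling")
    previous = elo
```

The last step, from 16 to 32 threads, adds only about 3 Elo, which matches the figure quoted above.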

There is an effect of time scaling in Komodo's Elo. In general I would agree that more cores means more depth (and so, in a way, more time for an equivalent single-core run), but our in-house testing shows Komodo's worst results against Stockfish 8 are at the 1-2 minute (plus 1%) time levels. So we would expect Komodo to do worse when going from a 10 sec level to a many-core "equivalent" of 1-2 minutes.

Basically I feel the best evidence still suggests Komodo's thread scaling is better right now. This could change in the future as each team discovers better ways to use so many processors.

Since these results all come from each program playing against itself, things could be different in direct play. I would love to see some games above a few minutes to get past the 1-2 minute range where Komodo's results are worst. But it would take a lot of hardware to run this.

fastgm · Post by **fastgm** » Mon May 15, 2017 10:26 am

mjlef wrote:Andreas sent us some other information where he did a scaling run of Komodo using a Contempt of 0. I hope he posts it here since it gives more information about the effect of Komodo's Contempt.

Mark, I did it already:
http://talkchess.com/forum/viewtopic.php?t=63955

mjlef wrote:Comparing the Stockfish 8 Contempt = 0 results for 1:16 and 1:32 shows a gain of just 3 Elo. Komodo going from 1:16 to 1:32, also with a Contempt of 0, gains 21 Elo, again past the error margin. I would love more games to lower the error margins, but the data suggests Komodo still scales better with more cores, even with Contempt set to 0.

More games will follow. I will report the results.

mjlef · Post by **mjlef** » Wed May 17, 2017 2:48 am

Thanks Andreas. Somehow I missed that one. Thanks again for running these great experiments.
