Threads factor: Komodo, Houdini, Stockfish and Zappa

fastgm
Posts: 818
Joined: Mon Aug 19, 2013 6:57 pm

Post by fastgm »

Conditions:
Hardware: Dual AMD Opteron 6376, 32 x 2.3 GHz (Turbo Core off)
OS: Windows 7 Pro 64-Bit
GUI: no
Settings: all engines default settings
Large Tables: no
Position: starting position
Time: 20 seconds

UCI commands:
setoption name threads value 1 (to 32)
go movetime 20000

The tests were run in console mode.
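For anyone who wants to repeat the procedure, the command sequence above could be generated roughly as follows. This is only a sketch, not the script used for these tests: the function name `uci_commands` is made up for illustration, and piping the commands into an engine and collecting its output is left out.

```python
def uci_commands(max_threads=32, movetime_ms=20000):
    """Yield (threads, commands) pairs matching the UCI commands listed above.

    Hypothetical helper: it only builds the command strings. A real session
    would also need the usual UCI handshake and a way to read the engine output.
    """
    for threads in range(1, max_threads + 1):
        yield threads, [
            f"setoption name threads value {threads}",
            f"go movetime {movetime_ms}",
        ]


# Show the commands for the first two thread settings.
for threads, commands in uci_commands(max_threads=2):
    print(threads, commands)
```

Each thread setting gets its own pair of commands, so repeated runs per setting just mean sending the same pair again.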

Here are the values from 1 to 32 threads, from the starting position, with 20 seconds of computing time.
nps = nodes per second

Image

Komodo, Zappa and Houdini are almost equal up to 16 threads (factor 11.94 - 11.37 - 11.34).
Stockfish DD and the latest Stockfish version lie somewhat behind (factor 8.01 - 9.79).

Komodo still scales excellently beyond 16 threads. Zappa also shows a very good SMP implementation.
Beyond 16 threads, Houdini and Stockfish DD benefit much less than the other tested engines.

Increase from 16 to 32 threads:

Komodo TCECr: 11.94 to 20.60 (+73%)
Zappa Mexico II: 11.37 to 16.46 (+45%)
Stockfish 140513: 9.79 to 14.21 (+45%)
Stockfish DD: 8.01 to 10.18 (+27%)
Houdini 4 Pro: 11.34 to 13.48 (+19%)
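The percentages in this list are the relative gain of the 32-thread factor over the 16-thread factor. A small sketch that recomputes them from the factors quoted above (the function name `percent_increase` is just illustrative):

```python
# Thread factors at 16 and 32 threads, as quoted in the list above.
factors = {
    "Komodo TCECr": (11.94, 20.60),
    "Zappa Mexico II": (11.37, 16.46),
    "Stockfish 140513": (9.79, 14.21),
    "Stockfish DD": (8.01, 10.18),
    "Houdini 4 Pro": (11.34, 13.48),
}


def percent_increase(factor_16, factor_32):
    """Relative gain from 16 to 32 threads, rounded to a whole percent."""
    return round((factor_32 / factor_16 - 1) * 100)


for engine, (f16, f32) in factors.items():
    print(f"{engine}: +{percent_increase(f16, f32)}%")
```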

Image
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by zullil »

Your Stockfish data seem remarkably monotone as a function of the number of threads. The factor increases each time you increment the number of threads (except once, when the number of threads goes from 21 to 22). Are you recording an average of multiple runs for each number of threads? If so, how many runs with each threads setting?

I ask because these are fixed-time searches, so essentially you are recording the total number of nodes searched in the twenty seconds. I'd imagine that this would vary quite a lot from run to run, even with no change in the threads setting. The trees searched might differ significantly, with even the best move changing from search to search. For example, here are two consecutive Stockfish searches, each done with 16 threads. Look at how much they differ:

info depth 24 seldepth 36 score cp 22 nodes 151240061 nps 7561246 time 20002 multipv 1 pv e2e4 c7c5 b1c3 d7d6 g1f3 e7e5 f1c4 f8e7 a2a3 g8f6 e1g1 e8g8 b2b4 b8d7 d2d3 a7a6 c1d2 b7b5 c4d5 f6d5 c3d5 c8b7 b4c5 d7c5 d2a5 d8d7 d5b6 d7d8
info nodes 151240061 time 20002
bestmove e2e4 ponder c7c5

info depth 25 seldepth 33 score cp 29 nodes 182162118 nps 9106739 time 20003 multipv 1 pv d2d4 d7d5 c2c4 e7e6 g1f3 d5c4 b1c3 b8c6 e2e4
info nodes 182162118 time 20003
bestmove d2d4 ponder d7d5
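
To put a number on the difference between the two searches above, the final `info` lines can be parsed and their nps values compared. This is a rough sketch; `parse_info` is a hypothetical helper, and the two input lines are the ones shown above with the pv cut off.

```python
import re


def parse_info(line):
    """Extract depth, nodes and nps from a UCI 'info' line (hypothetical helper)."""
    fields = {}
    for key in ("depth", "nodes", "nps"):
        # \b keeps "depth" from matching inside "seldepth".
        match = re.search(rf"\b{key} (\d+)", line)
        if match:
            fields[key] = int(match.group(1))
    return fields


run_a = "info depth 24 seldepth 36 score cp 22 nodes 151240061 nps 7561246 time 20002"
run_b = "info depth 25 seldepth 33 score cp 29 nodes 182162118 nps 9106739 time 20003"

a, b = parse_info(run_a), parse_info(run_b)
# Relative nps difference of the second run over the first, in percent.
spread = (b["nps"] / a["nps"] - 1) * 100
print(f"nps difference between the two runs: {spread:.1f}%")
```

For these two runs the second search is roughly 20% faster in nps, with no change in the threads setting.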
fastgm
Posts: 818
Joined: Mon Aug 19, 2013 6:57 pm

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by fastgm »

zullil wrote:Are you recording an average of multiple runs for each number of threads?
Yes.
zullil wrote:If so, how many runs with each threads setting?
5 runs per thread setting, 800 runs overall!
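
The run count works out as 5 engines x 32 thread settings x 5 runs. The sketch below checks that arithmetic and shows one way an averaged threads factor could be formed. It assumes the factor is the mean nps at n threads divided by the mean nps at 1 thread, which is not spelled out in the posts; the nps samples are invented for illustration and are not measured values.

```python
from statistics import mean

ENGINES = [
    "Komodo TCECr",
    "Zappa Mexico II",
    "Stockfish 140513",
    "Stockfish DD",
    "Houdini 4 Pro",
]
THREAD_SETTINGS = range(1, 33)  # 1 to 32 threads
RUNS_PER_SETTING = 5

total_runs = len(ENGINES) * len(THREAD_SETTINGS) * RUNS_PER_SETTING


def threads_factor(nps_runs_n, nps_runs_1):
    """Assumed definition: mean nps with n threads over mean nps with 1 thread."""
    return mean(nps_runs_n) / mean(nps_runs_1)


# Hypothetical nps samples, five runs each (not taken from the tests).
one_thread = [1_000_000, 1_010_000, 990_000, 1_005_000, 995_000]
eight_threads = [8_000_000, 8_100_000, 7_900_000, 8_050_000, 7_950_000]

print(total_runs)
print(threads_factor(eight_threads, one_thread))
```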
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by lucasart »

Thanks Andreas. Your posts are always interesting, especially this one!
* Impressive scaling by Komodo
* Great improvement thanks to Joona's "late join" patch
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by Uri Blass »

Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.

Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by Laskos »

These are NPS figures. It is hard to tell what they mean strength-wise, or in terms of effective speed-up. Time to depth (TTD) won't help much either, as even SF with Joona's patch widens its search a bit, not to mention Komodo.
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by lucasart »

Uri Blass wrote:Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.

Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
Good point. TTD would be a better measure than NPS. The ideal measure is Elo, but it is extremely costly to calculate with good enough precision.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by syzygy »

Uri Blass wrote:Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.

Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
This is of course true, but it does show that SF and H4 quite likely still have room for improvement here.

An interesting question is whether Komodo's SMP implementation is comparable at all with those of Zappa, SF and H4 (which are all YBWC tree splitters with some further refinements). As Richard Vida mentioned on the fishcooking forum, it might be that Komodo uses a "lazy SMP"-like approach:
http://talkchess.com/forum/viewtopic.php?t=46858
http://talkchess.com/forum/viewtopic.ph ... 350#504350
Isaac
Posts: 265
Joined: Sat Feb 22, 2014 8:37 pm

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by Isaac »

I think it would be interesting to repeat the exact same test with a different FEN, particularly an endgame FEN.

I expect Komodo to gain a lot (in TCEC I have seen it reach 56 Mnps in the endgame, compared to 16 Mnps in the early game, so surely more cores = better), while SF DD should show a totally different graph (it drops from 16 Mnps in the early game to 7 Mnps in the endgame; a quad core is faster than 16 cores, so more cores = worse performance).
It would be interesting to see how the newer SF dev versions are doing compared to SF DD.
As a side note, Critter showed Pentium 4-level performance in the endgame while running on 16 cores: 750 kN/s.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Threads factor: Komodo, Houdini, Stockfish and Zappa

Post by michiguel »

Uri Blass wrote:Interesting information but the target of chess programs is not to search more nodes but to earn playing strength.

Nodes are not proportional to playing strength and I guess that for the same engine,
the same number of nodes with 1 thread is better than the same number of nodes with many threads.
But that is not the point of the experiment. This tells us about the upper limit of scalability, which is useful to know. In addition, it tells us how that upper limit suffers from the addition of cores. For instance, Houdini starts to have problems right after 16 cores. Up to that point, it is among the best.

Miguel