
stockfish threading model

Posted: Fri May 13, 2016 3:16 pm
by flok
Hi,

A while ago someone told me that Stockfish's threading model is rather simple: just start x threads at the root, all doing the same search, with the transposition table serving as a form of shared state.

I googled for more details, couldn't find them. But I wonder: is there more to it? Because a quick test showed no improvement here compared to a single-threaded version of my program (also without locking in the tt). Also, in the output I see each thread return the same pv, with each thread continuously finishing an iteration at the same moment. I tried starting thread x at iteration level 1 + x (iteration level: search depth) or even 1 + x * 2, but in both cases after only 1, 2, or maybe 3 steps they end up in that same-pv-same-depth situation.
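
For readers who want something concrete, the scheme described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Stockfish's code: the toy position type and the helpers `mix`, `evaluate`, `child`, `SharedTT`, `negamax` and `lazy_smp` are all hypothetical names, the search is plain negamax without alpha-beta, and the table only accepts entries of exactly the requested depth so that the result stays deterministic. The table is guarded by a mutex, as in the locked tt mentioned in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Toy stand-in for a chess position: a 64-bit state with three "moves".
// All of this is hypothetical scaffolding, not engine code.
static std::uint64_t mix(std::uint64_t x) {
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    return x ^ (x >> 33);
}
static int evaluate(std::uint64_t pos) { return int(mix(pos) % 201) - 100; }
static std::uint64_t child(std::uint64_t pos, int move) { return mix(pos * 3 + move + 1); }

// Transposition table shared by all threads, protected by a single mutex.
struct SharedTT {
    struct Entry { int depth; int score; };
    std::mutex m;
    std::unordered_map<std::uint64_t, Entry> table;

    bool probe(std::uint64_t key, int depth, int& score) {
        std::lock_guard<std::mutex> lock(m);
        auto it = table.find(key);
        if (it == table.end() || it->second.depth != depth) return false;
        score = it->second.score;
        return true;
    }
    void store(std::uint64_t key, int depth, int score) {
        std::lock_guard<std::mutex> lock(m);
        auto& e = table[key];
        if (depth >= e.depth) e = {depth, score};  // prefer deeper results
    }
};

// Plain negamax; the only state it shares with other threads is the table.
int negamax(std::uint64_t pos, int depth, SharedTT& tt) {
    if (depth == 0) return evaluate(pos);
    int score;
    if (tt.probe(pos, depth, score)) return score;
    int best = -1000;
    for (int move = 0; move < 3; ++move)
        best = std::max(best, -negamax(child(pos, move), depth - 1, tt));
    tt.store(pos, depth, best);
    return best;
}

// Start num_threads identical iterative-deepening searches at the root.
// Thread id begins at depth 1 + id, the staggering tried in the post above.
int lazy_smp(std::uint64_t root, int max_depth, int num_threads) {
    assert(num_threads >= 1 && max_depth >= 1);
    SharedTT tt;
    std::vector<int> result(num_threads, 0);
    std::vector<std::thread> workers;
    for (int id = 0; id < num_threads; ++id)
        workers.emplace_back([&, id] {
            for (int depth = std::min(1 + id, max_depth); depth <= max_depth; ++depth)
                result[id] = negamax(root, depth, tt);
        });
    for (auto& w : workers) w.join();
    return result[0];  // every thread finishes at max_depth with the same score
}
```

Calling `lazy_smp(42, 6, 4)` should return the same score as `lazy_smp(42, 6, 1)`. In this toy nothing makes the threads explore the tree differently, so the helpers mostly find the answers the first thread already stored, which is one plausible reading of the same-pv-same-depth behaviour described above.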

Re: stockfish threading model

Posted: Fri May 13, 2016 6:08 pm
by Dann Corbit
Dan Homan popularized the JaBOT (Just a Bunch Of Threads) model. The popular name is "Lazy SMP".

Lots of people are using it now.

It appears to work so well that nobody can explain the reason for its extreme success.

There is even an indication that it works better with hyperthreading turned on and all virtual cores busy. It defies the imagination. See Dmitri's posts on the topic.

An interesting trick some people are using is to have some threads start a node or two forward from the root when there are lots of threads.

Re: stockfish threading model

Posted: Fri May 13, 2016 9:12 pm
by Michel
Dann Corbit wrote: Dan Homan popularized the JaBOT (Just a Bunch Of Threads) model. The popular name is "Lazy SMP".
It was in fact Toga II that introduced this idea. From the rating lists it could easily be verified that Toga II scaled just as well as Stockfish, which at that time had a rather sophisticated YBW implementation.

Re: stockfish threading model

Posted: Fri May 13, 2016 9:25 pm
by Dann Corbit
When was this introduced?
I would like to have a look at the first implementation.

Re: stockfish threading model

Posted: Fri May 13, 2016 9:48 pm
by cdani
flok wrote: I googled for more details, couldn't find them. But I wonder: is there more to it? Because a quick test showed no improvement here compared to a single-threaded version of my program (also without locking in the tt). Also, in the output I see each thread return the same pv, with each thread continuously finishing an iteration at the same moment. I tried starting thread x at iteration level 1 + x (iteration level: search depth) or even 1 + x * 2, but in both cases after only 1, 2, or maybe 3 steps they end up in that same-pv-same-depth situation.
Is it possible that some variables are not well isolated from one thread to another? The only communication between threads should be the tt.
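
To make the isolation point concrete, here is a minimal sketch in which each thread owns all of its mutable search state. The `ThreadState` fields, `fake_search` and `run_isolated` are hypothetical names chosen for illustration; a real engine would keep its own set of per-thread tables, and the "search" here is only a stand-in loop.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-thread search state: everything a search mutates
// apart from the shared tt lives here, one instance per thread.
struct ThreadState {
    std::uint64_t nodes = 0;
    std::array<int, 64> killer{};        // one killer-move slot per ply
    std::array<int, 64 * 64> history{};  // from-square x to-square counters
};

// Stand-in for a search: it touches only the state it was handed.
void fake_search(ThreadState& st, int work) {
    for (int i = 0; i < work; ++i) {
        ++st.nodes;
        st.killer[i % 64] = i;
        ++st.history[(i * 7) % (64 * 64)];
    }
}

// Run one fake search per thread and return each thread's node count.
std::vector<std::uint64_t> run_isolated(int num_threads, int work) {
    assert(num_threads > 0);
    std::vector<ThreadState> states(num_threads);
    std::vector<std::thread> workers;
    for (int id = 0; id < num_threads; ++id)
        workers.emplace_back(fake_search, std::ref(states[id]), work);
    for (auto& w : workers) w.join();
    std::vector<std::uint64_t> counts;
    for (const auto& st : states) counts.push_back(st.nodes);
    return counts;
}
```

If `nodes`, `killer` or `history` were globals or function-level statics instead, every thread would write to the same memory, and per-thread counts like the ones returned by `run_isolated` would no longer be reliable.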

Re: stockfish threading model

Posted: Fri May 13, 2016 10:02 pm
by flok
cdani wrote:
flok wrote: I googled for more details, couldn't find them. But I wonder: is there more to it? Because a quick test showed no improvement here compared to a single-threaded version of my program (also without locking in the tt). Also, in the output I see each thread return the same pv, with each thread continuously finishing an iteration at the same moment. I tried starting thread x at iteration level 1 + x (iteration level: search depth) or even 1 + x * 2, but in both cases after only 1, 2, or maybe 3 steps they end up in that same-pv-same-depth situation.
Is it possible that some variables are not well isolated from one thread to another? The only communication between threads should be the tt.
Hmm, I don't think so. If there were, it would be unprotected access (e.g. no locking, barriers, etc.), because only my tt implementation uses locks, and that would show up when I run things in helgrind (valgrind for threading).

Re: stockfish threading model

Posted: Fri May 13, 2016 10:09 pm
by cdani
flok wrote: Hmm, I don't think so. If there were, it would be unprotected access (e.g. no locking, barriers, etc.), because only my tt implementation uses locks, and that would show up when I run things in helgrind (valgrind for threading).
I don't think locking is that important or has a noticeable effect. Andscacs shares the tt and about 5 other small hashes, such as the pawn hash, between threads and does nothing to prevent problems. No locking, no nothing. And it all works nicely even at 31 threads.
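
For engines that want a lock-free table but still want to reject entries that two threads wrote at the same time, one technique known in chess programming as lockless hashing stores the key XORed with the data. The sketch below illustrates that idea only; it is not a description of what Andscacs or Stockfish do, and `Slot` and `LocklessTT` are hypothetical names. It uses relaxed atomics so the concurrent accesses are well defined in C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One table slot: the key is stored XORed with the data word.
struct Slot {
    std::atomic<std::uint64_t> key_xor_data{0};
    std::atomic<std::uint64_t> data{0};
};

class LocklessTT {
public:
    explicit LocklessTT(std::size_t size) : slots_(size) { assert(size > 0); }

    // No lock: two threads may interleave their writes to the same slot.
    void store(std::uint64_t key, std::uint64_t data) {
        Slot& s = slots_[key % slots_.size()];
        s.key_xor_data.store(key ^ data, std::memory_order_relaxed);
        s.data.store(data, std::memory_order_relaxed);
    }

    // If the two words come from different stores, (k ^ d) will almost
    // certainly not equal the probing key, so the entry is treated as a miss.
    // Caveat of this sketch: an empty slot looks like key 0 with data 0.
    bool probe(std::uint64_t key, std::uint64_t& data) const {
        const Slot& s = slots_[key % slots_.size()];
        const std::uint64_t d = s.data.load(std::memory_order_relaxed);
        const std::uint64_t k = s.key_xor_data.load(std::memory_order_relaxed);
        if ((k ^ d) != key) return false;
        data = d;
        return true;
    }

private:
    std::vector<Slot> slots_;
};
```

A torn entry is then simply a miss that costs a re-search of that node, with no need for any thread to wait on another.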