More questions about threads

Posted: Sat Apr 21, 2018 11:25 am
by Patrice Duhamel
YBWC is working in Cheese and I have more questions.

For YBWC I have almost the same implementation as Stockfish 6, but without the late join. The main hash table is shared (lockless), and each thread has its own pawn hash table, history table and killer tables. The engine creates N-1 threads, and the main thread searches, polls for input and handles timing.
I have a class with all the data and functions a thread needs to search at a splitpoint; I allocate this object on the stack and initialize it before starting a search.

How can I set the main thread's maximum stack size on Linux/Mac OS/Android?
I know there is pthread_attr_setstacksize(), but it only works for newly created threads.
On Windows I can use a compiler option with Visual Studio or MinGW gcc to set the maximum stack size.
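
For reference, a minimal sketch of how pthread_attr_setstacksize() is applied to a newly created thread. Running the search entry point in such a thread, so that the main thread's own stack limit matters less, is one possible workaround and an assumption here, not something confirmed in this thread. The names search_entry and run_with_stack are hypothetical.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical stand-in for the engine's search entry point.
static void* search_entry(void* arg) {
    *static_cast<bool*>(arg) = true;  // record that the thread actually ran
    return nullptr;
}

// Runs search_entry in a new thread with the requested stack size.
// Returns true if the attribute was accepted and the thread ran to completion.
bool run_with_stack(std::size_t stack_bytes) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return false;

    // The stack size is set on the attribute object, before pthread_create().
    bool ok = pthread_attr_setstacksize(&attr, stack_bytes) == 0;

    // Read the value back from the attribute to confirm it was stored.
    std::size_t stored = 0;
    ok = ok && pthread_attr_getstacksize(&attr, &stored) == 0
            && stored == stack_bytes;

    bool ran = false;
    pthread_t tid;
    if (ok && pthread_create(&tid, &attr, search_entry, &ran) == 0) {
        pthread_join(tid, nullptr);
    } else {
        ok = false;
    }
    pthread_attr_destroy(&attr);
    return ok && ran;
}
```

A call such as run_with_stack(16 * 1024 * 1024) asks for a 16 MiB stack. The size must be at least PTHREAD_STACK_MIN, and some systems also require a multiple of the page size.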

What is the "most stable" way to compare the SMP version against the single-threaded version, and what should I compare?
I tried 60 positions from the Arasan test suite 20 (hash table 1024 Kb):

- search to depth 20 :
1 cpu : 2333.60 s , 9 832 260 510 nodes , 4 213 339 nodes/s
2 cpu : 1378.63 s , 11 469 668 794 nodes , 8 319 625 nodes/s
4 cpu : 802.31 s , 12 651 149 446 nodes , 15 768 327 nodes/s

- search 30 s / position :
1 cpu : 1800 s , 7 588 029 630 nodes , 4 204 779 nodes/s , depth (min,average,max) 17, 19, 28
2 cpu : 1800 s , 14 972 597 971 nodes , 8 296 888 nodes/s , depth (min,average,max) 17, 21, 33
4 cpu : 1800 s , 28 714 040 333 nodes , 15 911 273 nodes/s , depth (min,average,max) 19, 21, 33

Is it possible to measure how much time threads are inactive in YBWC?

I use staged move generation, so I don't know how many legal moves I have before playing them. In YBWC the search can split for nothing, because there may not be enough legal moves left for the other threads. How much does that cost?
I found that 20% of the searches at a splitpoint have no moves to search.

Re: More questions about threads

Posted: Sat Apr 21, 2018 11:37 am
by smatovic
What is the "most stable" way to compare the SMP version against the single-threaded version, and what should I compare?
e.g., mean of 4 runs of time to depth on Hyatt24 positions with variable depth:

http://talkchess.com/forum/viewtopic.ph ... 70&t=56937

--
Srdja

Re: More questions about threads

Posted: Sat Apr 21, 2018 11:46 am
by syzygy
Patrice Duhamel wrote: What is the "most stable" way to compare the SMP version against the single-threaded version, and what should I compare?
The only way is to play games and compare Elo.

Re: More questions about threads

Posted: Sat Apr 21, 2018 11:51 am
by syzygy
Patrice Duhamel wrote: I use staged move generation, so I don't know how many legal moves I have before playing them. In YBWC the search can split for nothing, because there may not be enough legal moves left for the other threads. How much does that cost?
I found that 20% of the searches at a splitpoint have no moves to search.
I think, in case of an engine like Stockfish with a super low branching factor, the problem is not so much the cost of splitting but the lack of nodes with lots of children. I suspect the problem of YBWC is exactly what the abbreviation stands for. It may be a really bad idea to wait for the search of the first move to complete.

Re: More questions about threads

Posted: Sat Apr 21, 2018 12:25 pm
by Patrice Duhamel
smatovic wrote: e.g., mean of 4 runs of time to depth on Hyatt24 positions with variable depth:
I will try this.

Re: More questions about threads

Posted: Sat Apr 21, 2018 12:29 pm
by Patrice Duhamel
syzygy wrote: The only way is to play games and compare Elo.
Yes, but playing games takes more time. I was looking for something faster to run tests with, and then confirm later with games.

Re: More questions about threads

Posted: Sat Apr 21, 2018 3:47 pm
by jdart
Is it possible to measure how much time threads are inactive in YBWC?
A profiling tool like VTune can show you this. But I have also implemented a sampling technique: I keep a bitmap of active threads, and periodically during the search I read the bitmap and collect statistics on how many threads were active, on average.
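
A minimal sketch of such a sampling scheme, assuming at most 64 threads and a single thread that takes the samples. The class and method names are hypothetical and this is not the actual engine code.

```cpp
#include <atomic>
#include <cstdint>

// One bit per search thread, set while that thread is busy searching.
class ActivitySampler {
public:
    // Called by a search thread when it starts working.
    void set_active(int thread_id) {
        active_.fetch_or(std::uint64_t{1} << thread_id,
                         std::memory_order_relaxed);
    }

    // Called by a search thread when it runs out of work.
    void set_idle(int thread_id) {
        active_.fetch_and(~(std::uint64_t{1} << thread_id),
                          std::memory_order_relaxed);
    }

    // Called periodically by one thread only (for example the one that
    // already polls input and time), so the counters need no locking.
    void sample() {
        std::uint64_t bits = active_.load(std::memory_order_relaxed);
        int count = 0;
        for (; bits != 0; bits &= bits - 1) ++count;  // clears lowest set bit
        total_active_ += static_cast<std::uint64_t>(count);
        ++samples_;
    }

    // Average number of active threads over all samples taken so far.
    double average_active() const {
        return samples_ ? static_cast<double>(total_active_) / samples_ : 0.0;
    }

private:
    std::atomic<std::uint64_t> active_{0};
    std::uint64_t total_active_ = 0;
    std::uint64_t samples_ = 0;
};
```

Dividing average_active() by the number of search threads gives a rough utilisation figure; the accuracy depends on how often sample() is called.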
I use staged move generation, so I don't know how many legal moves I have before playing them. In YBWC the search can split for nothing, because there may not be enough legal moves left for the other threads. How much does that cost?
I found that 20% of the searches at a splitpoint have no moves to search.
I solved this by actually forcing a full move generation (all stages) after all split conditions have been satisfied (there is also another, technical reason why I needed to do this). So if there are no moves there will be no other threads activated. I also do not split while in check because generally there will be few nodes to search there.
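
The ordering described above (cheap split conditions first, full move generation last) could look roughly like this. It is a sketch under assumptions: the field names, the minimum-depth condition and the generate_all callback are hypothetical stand-ins for whatever the engine really uses.

```cpp
#include <vector>

struct Move { int from; int to; };

// Hypothetical per-node state; every field name here is an assumption.
struct SplitNode {
    int  depth;
    int  min_split_depth;
    bool in_check;
    bool idle_thread_available;
};

// Decides whether to split at this node. generate_all stands for running
// all remaining move generation stages; it is only called once the cheap
// conditions have passed, so its cost is paid only at real split candidates.
template <typename GenerateAll>
bool should_split(const SplitNode& node, GenerateAll generate_all,
                  std::vector<Move>& remaining) {
    if (node.in_check) return false;                  // few moves expected
    if (node.depth < node.min_split_depth) return false;
    if (!node.idle_thread_available) return false;

    remaining = generate_all();                       // force full generation
    return !remaining.empty();                        // no moves, no split
}
```

When should_split() returns false because the list is empty, no helper thread is woken up, which avoids the splits that find nothing to search.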

However: like the Stockfish team has done, I am shifting over to using LazySMP now, because YBWC gave poor scaling on large core counts such as TCEC is using.

--Jon

Re: More questions about threads

Posted: Sat Apr 21, 2018 6:15 pm
by Patrice Duhamel
jdart wrote: However: like the Stockfish team has done, I am shifting over to using LazySMP now, because YBWC gave poor scaling on large core counts such as TCEC is using.
And with fewer cores (2, 4, 8), is LazySMP also better than YBWC?

I will try to look at ABDADA and LazySMP later.

Re: More questions about threads

Posted: Sat Apr 21, 2018 8:42 pm
by jdart
And with fewer cores (2, 4, 8), is LazySMP also better than YBWC?
It is better on 4 cores, in my testing (haven't tested on 2).

--Jon