Parallel search/LazySMP question

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
jdart
Posts: 3824
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Parallel search/LazySMP question

Post by jdart » Sun Dec 17, 2017 8:55 pm

So, I understand some engines are basically using the ABDADA approach, where a bit in the hash table signifies that a thread is searching that node, so that other parallel threads will not also try to visit it (although they may help searching some of its successor nodes).

But it seems that engines using (or saying they use) LazySMP do not necessarily do this. A few I have looked at are: Texel, Stockfish, Demolito. I do not claim to understand everything I see there. But one thing I don't see is a mechanism to prevent redundant searches. It looks like each worker has its own move generator, and if started at the same depth, they will produce the same moves, and search in parallel the same move list, which means that some moves will be searched twice. I assume I am missing something?
(My engine, which uses YBWC, prevents this because the same MoveGenerator instance is shared by all parallel threads at the same depth (it is thread-safe), and it will not deliver the same move twice).

--Jon

Pio
Posts: 119
Joined: Sat Feb 25, 2012 9:42 pm
Location: Stockholm
Contact:

Re: Parallel search/LazySMP question

Post by Pio » Sun Dec 17, 2017 9:02 pm

They will search different moves due to things like the history table and other structures shared by the threads that make difference in the move ordering and LMR

another thing that will prevent duplicate searches is to launch the threads with different depths

jdart
Posts: 3824
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Parallel search/LazySMP question

Post by jdart » Sun Dec 17, 2017 9:12 pm

History etc. may make the order of the moves different but don't change the fact that the overall set of moves is the same.

And some threads are being launched with the same depth.

--Jon

User avatar
hgm
Posts: 23714
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: Parallel search/LazySMP question

Post by hgm » Sun Dec 17, 2017 9:20 pm

LazySMP is not ABDADA. It is a lot lazier.

Pio
Posts: 119
Joined: Sat Feb 25, 2012 9:42 pm
Location: Stockholm
Contact:

Re: Parallel search/LazySMP question

Post by Pio » Sun Dec 17, 2017 9:24 pm

The thing is that the depths will be different because the move order might change and that LMR is a function of move order. This in some way prevents the threads to search the same tree

CheckersGuy
Posts: 273
Joined: Wed Aug 24, 2016 7:49 pm

Re: Parallel search/LazySMP question

Post by CheckersGuy » Sun Dec 17, 2017 9:57 pm

Pio wrote:The thing is that the depths will be different because the move order might change and that LMR is a function of move order. This in some way prevents the threads to search the same tree
On top of that I would add that they will search different moves because of thread scheduling and information shared with the transposition table.

D Sceviour
Posts: 456
Joined: Mon Jul 20, 2015 3:06 pm
Contact:

Re: Parallel search/LazySMP question

Post by D Sceviour » Sun Dec 17, 2017 11:32 pm

jdart wrote: It looks like each worker has its own move generator,...
Almost. There are no slave/worker threads. Each thread acts independently with a shared hash table.
and if started at the same depth, they will produce the same moves, and search in parallel the same move list,...
Each thread might produce different moves and values due to different hash reads at different times. Peter Osterlund accepts the first PV returned and rejects all others.
which means that some moves will be searched twice. I assume I am missing something?
The search tries to avoid this, but the odd move still gets through at the root due to overlapping and racing conditions. A problem that is it is unclear what root aspiration values each thread should get, especially if racing conditions get two or more threads searching the same move.

sedicla
Posts: 146
Joined: Fri Jan 07, 2011 11:51 pm
Location: USA

Re: Parallel search/LazySMP question

Post by sedicla » Mon Dec 18, 2017 12:05 am

I just released my engine with parallel search. I looked mostly at stockfish and demolito code.
My implementation uses shared hash table. Each thread starts at the same time, and for even threads (thread id 2, 4, 6...) I change the depth to be searched so it creates different "search spots".
The threads only shares the hash table, each one has its own search tables and move generation.
I think this is a very simple implementation and I want to do more research. It looks to me that in general there is not much cooperation between threads and there's something to gain in the area.

jdart
Posts: 3824
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Parallel search/LazySMP question

Post by jdart » Mon Dec 18, 2017 12:10 am

I think the YBWC improvements in recent Crafty (25.2) are quite interesting. Basically threads enter an idle loop where they scan a list of available split points and select the most promising one to join. Threads can create split points even pre-emptively when no other threads are available to service them at the moment.

The overall implementation is designed to boost efficiency by locking less and distributing the work of splitting a thread. (It also has some optional NUMA optimizations).

I have seen Crafty hit 40-50M nps even on fairly common hardware. Part of the speed is that it has a simpler search than most and does futility cutoffs based only on material, not on a full eval. But part of the speed is also the parallel search efficiency.

--Jon

sedicla
Posts: 146
Joined: Fri Jan 07, 2011 11:51 pm
Location: USA

Re: Parallel search/LazySMP question

Post by sedicla » Mon Dec 18, 2017 12:25 am

Crafty approach seems interesting, I will take a look, specially how it defines promising split points.

Post Reply