So, I understand some engines are basically using the ABDADA approach, where a bit in the hash table signifies that a thread is searching that node, so that other parallel threads will not also try to visit it (although they may help searching some of its successor nodes).
But it seems that engines using (or saying they use) LazySMP do not necessarily do this. A few I have looked at are: Texel, Stockfish, Demolito. I do not claim to understand everything I see there. But one thing I don't see is a mechanism to prevent redundant searches. It looks like each worker has its own move generator, and if started at the same depth, they will produce the same moves, and search in parallel the same move list, which means that some moves will be searched twice. I assume I am missing something?
(My engine, which uses YBWC, prevents this because the same MoveGenerator instance is shared by all parallel threads at the same depth (it is thread-safe), and it will not deliver the same move twice).
--Jon
Parallel search/LazySMP question
Moderators: hgm, Rebel, chrisw
-
- Posts: 4366
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
-
- Posts: 334
- Joined: Sat Feb 25, 2012 10:42 pm
- Location: Stockholm
Re: Parallel search/LazySMP question
They will search different moves due to things like the history table and other structures shared by the threads that make difference in the move ordering and LMR
another thing that will prevent duplicate searches is to launch the threads with different depths
another thing that will prevent duplicate searches is to launch the threads with different depths
-
- Posts: 4366
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Parallel search/LazySMP question
History etc. may make the order of the moves different but don't change the fact that the overall set of moves is the same.
And some threads are being launched with the same depth.
--Jon
And some threads are being launched with the same depth.
--Jon
-
- Posts: 27790
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Parallel search/LazySMP question
LazySMP is not ABDADA. It is a lot lazier.
-
- Posts: 334
- Joined: Sat Feb 25, 2012 10:42 pm
- Location: Stockholm
Re: Parallel search/LazySMP question
The thing is that the depths will be different because the move order might change and that LMR is a function of move order. This in some way prevents the threads to search the same tree
-
- Posts: 273
- Joined: Wed Aug 24, 2016 9:49 pm
Re: Parallel search/LazySMP question
On top of that I would add that they will search different moves because of thread scheduling and information shared with the transposition table.Pio wrote:The thing is that the depths will be different because the move order might change and that LMR is a function of move order. This in some way prevents the threads to search the same tree
-
- Posts: 570
- Joined: Mon Jul 20, 2015 5:06 pm
Re: Parallel search/LazySMP question
Almost. There are no slave/worker threads. Each thread acts independently with a shared hash table.jdart wrote: It looks like each worker has its own move generator,...
Each thread might produce different moves and values due to different hash reads at different times. Peter Osterlund accepts the first PV returned and rejects all others.and if started at the same depth, they will produce the same moves, and search in parallel the same move list,...
The search tries to avoid this, but the odd move still gets through at the root due to overlapping and racing conditions. A problem that is it is unclear what root aspiration values each thread should get, especially if racing conditions get two or more threads searching the same move.which means that some moves will be searched twice. I assume I am missing something?
-
- Posts: 178
- Joined: Sat Jan 08, 2011 12:51 am
- Location: USA
- Full name: Alcides Schulz
Re: Parallel search/LazySMP question
I just released my engine with parallel search. I looked mostly at stockfish and demolito code.
My implementation uses shared hash table. Each thread starts at the same time, and for even threads (thread id 2, 4, 6...) I change the depth to be searched so it creates different "search spots".
The threads only shares the hash table, each one has its own search tables and move generation.
I think this is a very simple implementation and I want to do more research. It looks to me that in general there is not much cooperation between threads and there's something to gain in the area.
My implementation uses shared hash table. Each thread starts at the same time, and for even threads (thread id 2, 4, 6...) I change the depth to be searched so it creates different "search spots".
The threads only shares the hash table, each one has its own search tables and move generation.
I think this is a very simple implementation and I want to do more research. It looks to me that in general there is not much cooperation between threads and there's something to gain in the area.
-
- Posts: 4366
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Parallel search/LazySMP question
I think the YBWC improvements in recent Crafty (25.2) are quite interesting. Basically threads enter an idle loop where they scan a list of available split points and select the most promising one to join. Threads can create split points even pre-emptively when no other threads are available to service them at the moment.
The overall implementation is designed to boost efficiency by locking less and distributing the work of splitting a thread. (It also has some optional NUMA optimizations).
I have seen Crafty hit 40-50M nps even on fairly common hardware. Part of the speed is that it has a simpler search than most and does futility cutoffs based only on material, not on a full eval. But part of the speed is also the parallel search efficiency.
--Jon
The overall implementation is designed to boost efficiency by locking less and distributing the work of splitting a thread. (It also has some optional NUMA optimizations).
I have seen Crafty hit 40-50M nps even on fairly common hardware. Part of the speed is that it has a simpler search than most and does futility cutoffs based only on material, not on a full eval. But part of the speed is also the parallel search efficiency.
--Jon
-
- Posts: 178
- Joined: Sat Jan 08, 2011 12:51 am
- Location: USA
- Full name: Alcides Schulz
Re: Parallel search/LazySMP question
Crafty approach seems interesting, I will take a look, specially how it defines promising split points.