Questions on SMP search

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Questions on SMP search

Post by bhlangonijr »

Concerning the following structures of your chess engine:

- Main Hash Table
- Pawn Hash Table
- Killers array
- History array

How you deal with these structures, when implementing a SMP search? To be more specific:
- For what structures, you keep a separate instance for each thread and what structures share a single instance with all threads;
- How you handle the concurrent access - if applicable - for each of these structures;
- How to maintain the killers/history arrays.

Thanks,
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Questions on SMP search

Post by Joost Buijs »

I have almost everything thread global except for the killer arrays, they are maintained locally.

For the hash table(s) I use lockless (Hyatt style) hashing, locking each TT access is way to expensive.

For the history array I don't do anything special, because in my engine this is an atomic operation.

Recently I tried to change the history array from thread global to thread local. After doing al lot of tests it seems this performes worse, probably because there is less information per thread.
ldesnogu

Re: Questions on SMP search

Post by ldesnogu »

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Questions on SMP search

Post by Daniel Shawul »

You have to share the main one for SMP. Sharing that table is so important for alpha-beta , distributed hash tables are sometimes used even on clusters.
I currently do not share them there since it seems a lot of work for a 15% speed up. I do not share the others even on SMP systems. I do not "lockless hash" but test for the legality of the move instead.
User avatar
marcelk
Posts: 348
Joined: Sat Feb 27, 2010 12:21 am

Re: Questions on SMP search

Post by marcelk »

bhlangonijr wrote:Concerning the following structures of your chess engine:

- Main Hash Table
- Pawn Hash Table
- Killers array
- History array

How you deal with these structures, when implementing a SMP search? To be more specific:
- For what structures, you keep a separate instance for each thread and what structures share a single instance with all threads;
- How you handle the concurrent access - if applicable - for each of these structures;
- How to maintain the killers/history arrays.

Thanks,
I have everything separated by address space.
Only the transposition table is in shared memory.

I don't use locking or 'lockless' tricks but just accept the
race conditions and make sure the engine doesn't
choke on reading an invalid move from the table but
silently discards it.

I might speed up communication speed in a later version
also through shared memory.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Questions on SMP search

Post by bob »

marcelk wrote:
bhlangonijr wrote:Concerning the following structures of your chess engine:

- Main Hash Table
- Pawn Hash Table
- Killers array
- History array

How you deal with these structures, when implementing a SMP search? To be more specific:
- For what structures, you keep a separate instance for each thread and what structures share a single instance with all threads;
- How you handle the concurrent access - if applicable - for each of these structures;
- How to maintain the killers/history arrays.

Thanks,
I have everything separated by address space.
Only the transposition table is in shared memory.

I don't use locking or 'lockless' tricks but just accept the
race conditions and make sure the engine doesn't
choke on reading an invalid move from the table but
silently discards it.

I might speed up communication speed in a later version
also through shared memory.
I share hash and phash. I have local killers and don't do history at all in Crafty. I don't think shared killers make much sense as they are pretty much a "local issue."

+
User avatar
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: Questions on SMP search

Post by rvida »

My implementation (in Critter):

- Main hash is shared. No locking nor lockless xoring is done. Moves from TT are checked for legality instead.

- Pawn hash and evaluation cache are local to thread.

- History array is shared. Since writes here are atomic there is no need to care about conflicts. History tends to be very noisy anyway with higher depths. There might be some speed penalty due to cache coherency traffic / snooping.

- With killer moves it is a bit more complicated. While each thread maintains its private killer array, during split() operation a few plies worth of killer moves are copied over.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Questions on SMP search

Post by mcostalba »

rvida wrote:My implementation (in Critter):

- Main hash is shared. No locking nor lockless xoring is done. Moves from TT are checked for legality instead.

- Pawn hash and evaluation cache are local to thread.

- History array is shared. Since writes here are atomic there is no need to care about conflicts. History tends to be very noisy anyway with higher depths. There might be some speed penalty due to cache coherency traffic / snooping.

- With killer moves it is a bit more complicated. While each thread maintains its private killer array, during split() operation a few plies worth of killer moves are copied over.
In Stockfish is 100% the same.

The only difference is that we don't have an evaluation cache but we use the main hash table to store evaluation together with the usual data.
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Re: Questions on SMP search

Post by bhlangonijr »

rvida wrote:My implementation (in Critter):

- Main hash is shared. No locking nor lockless xoring is done. Moves from TT are checked for legality instead.

- Pawn hash and evaluation cache are local to thread.

- History array is shared. Since writes here are atomic there is no need to care about conflicts. History tends to be very noisy anyway with higher depths. There might be some speed penalty due to cache coherency traffic / snooping.

- With killer moves it is a bit more complicated. While each thread maintains its private killer array, during split() operation a few plies worth of killer moves are copied over.
Interesting. I understand Bob's position that killers are a "local issue", although in the non-smp version of the search implementation we usually don't keep many killers arrays per sub-tree, for example. I guess we end up creating a somewhat different move ordering scheme, when implementing the SMP version of the search. Is this a problem when testing new changes - not related to the SMP code?
bhlangonijr
Posts: 482
Joined: Thu Oct 16, 2008 4:23 am
Location: Milky Way

Re: Questions on SMP search

Post by bhlangonijr »

bob wrote:
I share hash and phash. I have local killers and don't do history at all in Crafty. I don't think shared killers make much sense as they are pretty much a "local issue."

+
Thanks all you guys for the answers. :)

Another important question is, how you do the load balance when distributing the work among the available threads?

In my second attempt to implement SMP (the first one failed miserably), I have a SMP search method with a thread-safe access to the move stack, control variables, hash, etc. All the available threads are booked and then call simultaneously the SMP search method, passing as parameter the same move stack, control variables, etc. The master is defined as the thread which called the "split" method. If some slave finishes its work it will become available for helping in a further split call. But when the master finishes its work it will stay in a busy loop until all slaves get their work done.

With limited testing, I got in average 1.6 speed up when using 2 cores and about 2.9 when using up to 6 cores (Considering NPS). Also, it appears that the speed up gained when using 2 cores doesn't help at all, in Elo terms...

What are your numbers? Mine are too bad? :)

Thanks,