Implementing parallel search

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Implementing parallel search

Post by matthewlai » Tue Sep 02, 2014 8:14 am

Hello!

Question about parallel search - how are people implementing it?

I read up on YBW, and it makes sense conceptually, but I can't seem to come up with a way to implement it efficiently.

I understand that in search(), we first search the first node serially, before dividing up the remaining nodes.

The way I am thinking about doing it is, have the master thread look for an idle thread from a thread pool, and have them each take one of the remaining nodes.

The problem is, what should the master do, then? Should it also pick a node and start searching? If not, it looks like it would be difficult to manage number of actively-searching threads, and there will be many sleeping threads. If so, what if one of the children it spawned finishes early, and needs more work to do?

Thanks!
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

ZirconiumX
Posts: 1327
Joined: Sun Jul 17, 2011 9:14 am

Re: Implementing parallel search

Post by ZirconiumX » Tue Sep 02, 2014 8:44 am

matthewlai wrote:Hello!

Question about parallel search - how are people implementing it?

I read up on YBW, and it makes sense conceptually, but I can't seem to come up with a way to implement it efficiently.

I understand that in search(), we first search the first node serially, before dividing up the remaining nodes.

The way I am thinking about doing it is, have the master thread look for an idle thread from a thread pool, and have them each take one of the remaining nodes.

The problem is, what should the master do, then? Should it also pick a node and start searching? If not, it looks like it would be difficult to manage number of actively-searching threads, and there will be many sleeping threads. If so, what if one of the children it spawned finishes early, and needs more work to do?

Thanks!
From what I know, the master thread searching too is a standard implementation.

Matthew:out
Some believe in the almighty dollar.

I believe in the almighty printf statement.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: Implementing parallel search

Post by matthewlai » Tue Sep 02, 2014 9:08 am

ZirconiumX wrote:
matthewlai wrote:Hello!

Question about parallel search - how are people implementing it?

I read up on YBW, and it makes sense conceptually, but I can't seem to come up with a way to implement it efficiently.

I understand that in search(), we first search the first node serially, before dividing up the remaining nodes.

The way I am thinking about doing it is, have the master thread look for an idle thread from a thread pool, and have them each take one of the remaining nodes.

The problem is, what should the master do, then? Should it also pick a node and start searching? If not, it looks like it would be difficult to manage number of actively-searching threads, and there will be many sleeping threads. If so, what if one of the children it spawned finishes early, and needs more work to do?

Thanks!
From what I know, the master thread searching too is a standard implementation.

Matthew:out
Ah I see. I guess in that case the threads must be self-organizing, through some kind of data structure without an active master.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

ymatioun
Posts: 64
Joined: Fri Oct 18, 2013 9:40 pm
Location: New York

Re: Implementing parallel search

Post by ymatioun » Tue Sep 02, 2014 11:52 am

I also found this confusing when i was implementing parallel search.
So the way i did it was several slave threads (4 for 4-core CPU), waiting on an event. When master thread reaches split point, it puts position data into global variable, triggers slave event and goes to sleep. Slaves pick-up the solve, and when they are done, wake up master and go to sleep themselves.

Youri.

jdart
Posts: 3825
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Implementing parallel search

Post by jdart » Tue Sep 02, 2014 3:59 pm

Feel free to look at my implementation, which is doing what you describe:

https://github.com/jdart1/arasan-chess/tree/master/src

see Search::searchSMP and the ThreadPool class (threadp.h/.cpp).

Implementing the "helpful master" concept will increase performance but it is not a huge gain, so if you want you can defer implementing this.

--Jon

syzygy
Posts: 4455
Joined: Tue Feb 28, 2012 10:56 pm

Re: Implementing parallel search

Post by syzygy » Tue Sep 02, 2014 6:41 pm

matthewlai wrote:If so, what if one of the children it spawned finishes early, and needs more work to do?
Usually slave threads pick a (next) move on their own. You obviously need to do some locking to make sure that two slaves do not pick the same move.

syzygy
Posts: 4455
Joined: Tue Feb 28, 2012 10:56 pm

Re: Implementing parallel search

Post by syzygy » Tue Sep 02, 2014 6:44 pm

jdart wrote:Implementing the "helpful master" concept will increase performance but it is not a huge gain, so if you want you can defer implementing this.
I don't think the OP is there, yet.

The way I understand the question is: should the master be one of the search threads that search moves at the split node, or should it only somehow loop around waiting for available slaves in order to serve them with moves.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: Implementing parallel search

Post by matthewlai » Tue Sep 02, 2014 8:20 pm

ymatioun wrote:I also found this confusing when i was implementing parallel search.
So the way i did it was several slave threads (4 for 4-core CPU), waiting on an event. When master thread reaches split point, it puts position data into global variable, triggers slave event and goes to sleep. Slaves pick-up the solve, and when they are done, wake up master and go to sleep themselves.

Youri.
That sounds pretty simple, but if a slave splits again and becomes a master, aren't there only 3 threads searching now?

It's possible to spawn another thread in that case (6 threads total, 2 sleeping), but I am afraid it will end up with hundreds of sleeping threads which would kill scheduler performance.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

ymatioun
Posts: 64
Joined: Fri Oct 18, 2013 9:40 pm
Location: New York

Re: Implementing parallel search

Post by ymatioun » Tue Sep 02, 2014 9:52 pm

I only allow splits by master thread. So when a slave thread has no more work, but some slaves are still working, CPU utilization drops, from 4 running threads down to 3, 2, then 1. This is certainly suboptimal, but very easy to implement.

Youri.

matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: Implementing parallel search

Post by matthewlai » Tue Sep 02, 2014 11:17 pm

ymatioun wrote:I only allow splits by master thread. So when a slave thread has no more work, but some slaves are still working, CPU utilization drops, from 4 running threads down to 3, 2, then 1. This is certainly suboptimal, but very easy to implement.

Youri.
Ah! I see.

It does sound very easy.

Do you know what average CPU utilization looks like over say a typical game?
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

Post Reply