"How To" guide to parallel-izing an engine, easy a

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

TomKerrigan
Posts: 64
Joined: Fri Jul 03, 2015 9:22 pm

"How To" guide to parallel-izing an engine, easy a

Post by TomKerrigan »

Hey guys, I did a lot of work on my parallel search earlier this year and came up with a pretty good way of doing it (IMO). I figured I'd make some web pages about it, and here they are:

http://www.tckerrigan.com/Chess/Parallel_Search/

Basically it's a very simple, easy, straightforward "method" of implementing parallel search, and if you follow the steps I think you will probably get speedups that are similar to DTS's. (You can see my own speedup numbers if you follow the link.)

The main algorithm involved is ABDADA, but I've simplified the implementation and added some of my own (simple) optimizations that improve its performance by 5-10%.

I look forward to any feedback people might have about this method and my web pages, and I definitely look forward to other people trying the method and hearing about how well it works for them!
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: o.t.

Post by Frank Quisinsky »

Hi Tom,

20 years later ...

Very happy to read your name in TalkChess. Thinking directly on the good old Winboard times and our first Winboard tourneys with Stober, Zarkov and first programmers create an engine for Winboard. You gave us a very strong support we need to public the tourneys and to animated others.

And of course I remember on your nice mails at this time. My English isn't better all the years but again ... I am very happy to read your name here.

A good day ...
Tom wrote!

Friendly
Frank

PS: From programming no idea ... but I have to write my shortly message!
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: "How To" guide to parallel-izing an engine, ea

Post by Evert »

Nice guide, and for me very timely! I've been mulling over (re)writing my YBW inspired parallel algorithm, and ABDADA looks appealing (apart from the name).
Very useful!
TomKerrigan
Posts: 64
Joined: Fri Jul 03, 2015 9:22 pm

Re: o.t.

Post by TomKerrigan »

Great to hear from you again Frank! Hearing from you is bringing back a lot of memories. I'm glad you're still on these message boards. I feel guilty that I haven't been!
User avatar
Steve Maughan
Posts: 1221
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: "How To" guide to parallel-izing an engine, ea

Post by Steve Maughan »

Hi Tom,

Many thanks. It's timely for me as I'm thinking about giving Maverick a parallel search.

Quick question - have you tried Lazy SMP? It seems to be simple to implement and all the rage with top engines (e.g. Stockfish and Komodo).

- Steve
http://www.chessprogramming.net - Maverick Chess Engine
TomKerrigan
Posts: 64
Joined: Fri Jul 03, 2015 9:22 pm

Re: "How To" guide to parallel-izing an engine, ea

Post by TomKerrigan »

Haha, ABDADA is a mouthful. I had a hard time remembering the sequence of letters for a while until I started pronouncing it: ab-da-da. I thought about calling my implementation something else but then I thought it would be disrespectful to Jean-Christophe Weill. The algorithm really is a stroke of genius IMO--self-synchronizing YBW! Who thinks of this stuff?

Really excited to see if you have any feedback or thoughts about the stuff I posted. Feel free to post here or email me privately.
TomKerrigan
Posts: 64
Joined: Fri Jul 03, 2015 9:22 pm

Re: "How To" guide to parallel-izing an engine, ea

Post by TomKerrigan »

Steve Maughan wrote:Hi Tom,

Many thanks. It's timely for me as I'm thinking about giving Maverick a parallel search.

Quick question - have you tried Lazy SMP? It seems to be simple to implement and all the rage with top engines (e.g. Stockfish and Komodo).
...
I haven't tried Lazy SMP but I don't see the point. Lazy SMP seems like what you would get if you were trying to implement ABDADA and then gave up halfway through--instead of systematically ordering moves to divide up the work between threads (as ABDADA calls for), you just do things kinda randomly and hope for the best.

I can understand people's reluctance to implement ABDADA in the past, because the implementation in the original paper is a bit tricky.

With my "Simplified ABDADA" implementation, I think there's no reason not to use ABDADA. It's literally only a few lines of code, with absolutely minimal changes to a single-threaded search.

http://www.tckerrigan.com/Chess/Paralle ... ed_ABDADA/
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: "How To" guide to parallel-izing an engine, ea

Post by lucasart »

Interesting. I'll give this a try.

But it will need some adaptation to work. The main problem is that, with shared hashtable (aka lazy), you certainly don't want to have all threads searching the same depth, or running their own iterative deepening loop in a desynchronized manner.

There is a TON of elo in shared hashtable, by simply:
1/ having half the threads at depth=d, and the other half at depth=d+1. SF extends this to more complicated patterns, but this works well up to 4 cores (and my machine only has 4).
2/ when a thread completes a depth, signal all threads working on this depth (or lower) to stop *immediately* (eg. global slatomic signal, threads checks signal and throws an exception to return to ID loop). They go back to the ID loop, where useful work is given to them (heeding rule #1).

I would be very surprised if this gains any elo on top of what I already have (which scales quasi perfectly elo wise up to 4 cores).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: "How To" guide to parallel-izing an engine, ea

Post by lucasart »

Another thing about your simplified ABDADA: it only applies at root. Shouldn't the logic be applied at every node (perhaps with a min depth) ? This requires some data in TT entries, like how many threads are searching the hash move in this position. Or perhaps I'm reinventing ABDADA…

I dont see how any root only rule can be scalable at long tc.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: "How To" guide to parallel-izing an engine, ea

Post by mar »

lucasart wrote:Another thing about your simplified ABDADA: it only applies at root.
Where do you see it only applies at root?

As I understand it Tom uses a separate small hash table (pos AND move hashed together) which he uses to postpone (defer) searching moves already searched by other threads (except for the first move).
This is only applied when depth >= limit (in this case 3).

It's a nice idea and should be trivial to implement and try.