Parallel Search with Transposition Table

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Parallel Search with Transposition Table

Post by jdart »

Since parallelizing alpha-beta looks very difficult and way above my skill level, I decided to take the "easy" route and just have a thread pool handle the search for all the moves at the root.
This is problematic, because the usual approach (PVS search) is to do the supposed best move first, obtain an accurate score, and then narrow the search window so that subsequent moves are only tested to see if they improve the current best score or not. That narrow-window search is typically much faster than a search with wider bounds.

Re concurrency: all you have to do to make it work is to make your hash table access method "synchronized", although as others have noted performance may be an issue because you will acquire/release locks frequently.

--Jon
User avatar
Bloodbane
Posts: 154
Joined: Thu Oct 03, 2013 4:17 pm

Re: Parallel Search with Transposition Table

Post by Bloodbane »

I though GC was pretty fast if it is done right, i.e. using some sort of a system based on when an object was last accessed, can't remember the name. Is Java not doing GC that way or is it just an feature of hashMap?
Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
https://github.com/mAarnos
DrRibosome
Posts: 19
Joined: Tue Mar 12, 2013 5:31 pm

Re: Parallel Search with Transposition Table

Post by DrRibosome »

Hmm, I wrote an engine in java as well and it seems to have comparable performance to a lot of the cpp engines. Everything is pre-allocated, etc on startup so there's no allocations at all in the search, and the jvm optimizes all the numeric operations really well
daylen
Posts: 40
Joined: Fri Dec 30, 2011 5:33 am
Location: Berkeley, CA

Re: Parallel Search with Transposition Table

Post by daylen »

Thanks for your detailed post. It's true that Java is doing a lot of object creation/deletion in, as you said, the move generator. What profiler did you use that showed garbage collection? jvisualvm, the profiler that ships with Java, doesn't show it.

I'll try to read more about PVS. Right now I know what it is on a very high-level but not really how to go about implementing it. The idea is to explore the left-most branch and then do the rest of the tree in parallel...

By using a ConcurrentHashMap I'm basically using method (2)—synchronizing access with read/write locks. I'm *also* doing sanity checking of data I get out of the ConcurrentHashMap. However there are still inconsistency issues, so the bug must be elsewhere.
daylen
Posts: 40
Joined: Fri Dec 30, 2011 5:33 am
Location: Berkeley, CA

Re: Parallel Search with Transposition Table

Post by daylen »

It's called generational garbage collection. From Wikipedia:

"It has been empirically observed that in many programs, the most recently created objects are also those most likely to become unreachable quickly (known as infant mortality or the generational hypothesis). A generational GC (also known as ephemeral GC) divides objects into generations and, on most cycles, will place only the objects of a subset of generations into the initial white (condemned) set."
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Parallel Search with Transposition Table

Post by syzygy »

rtitle wrote:The alternative is to ignore thread safety except for some sanity checking that you don't get back garbage due to corrupted entries. This obviously is not foolproof - you might get back something non-garbage but wrong - but it might be good enough for chess search where an erroneous result from the hashtable during search isn't fatal. (By contrast, you wouldn't want your bank trashing your bank balance and saying "oh, that's just corruption because we used a non-threadsafe database for speed").
You also don't want your bank to calculate your balance based on wild heuristics that nobody really understands but just happen to work. A chess engine is just not comparable to a bank.

But with C++11/C11 atomics you can at least get rid of UB.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: Parallel Search with Transposition Table

Post by syzygy »

daylen wrote:Thanks for your detailed post. It's true that Java is doing a lot of object creation/deletion in, as you said, the move generator.
If already generating moves involves object creation/deletion, then, with all due respect, it is time to start over.

Use integers to represent moves, not objects.
daylen
Posts: 40
Joined: Fri Dec 30, 2011 5:33 am
Location: Berkeley, CA

Re: Parallel Search with Transposition Table

Post by daylen »

Doing so would require almost a complete rewrite of the Position class. A Move object encapsulates 4 integers: the x, y for the source and destination squares. Would the speed up from using one integer be significant?
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Parallel Search with Transposition Table

Post by Rein Halbersma »

daylen wrote:Doing so would require almost a complete rewrite of the Position class. A Move object encapsulates 4 integers: the x, y for the source and destination squares. Would the speed up from using one integer be significant?
Ronald presumably meant: if you dynamically have to create/delete objects (i.e. using malloc/free, or C++ new/delete), rather than on the stack.
User avatar
emadsen
Posts: 434
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Parallel Search with Transposition Table

Post by emadsen »

So you are implementing this in Java, presumably because you don't care about performance. But you want to parallelize it, because you do care about performance. Make up your mind. If you want performance, you probably want to go with C or C++.
I agree, depending on your goals. This forum can be a bit performance or ELO-obsessed, as Fernando reminds us from time to time. If your goal is to produce a high-performance, strong engine, then yes, these technology choices are contradictory. However, if the goal of the project is to produce a sparring partner, or gain some intellectual satisfaction by correctly implementing difficult algorithms, then these arguments over languages and performance are irrelevant. I've been spending a lot of time lately testing various algorithms to implement UCI_LimitStrength and UCI_Elo, which adds value to a chess program in a different manner than NPS.

Besides, my slow .NET engine is nearly 2500 ELO against human opposition on ICC. It's ridiculous how strong these chess engines are when the distorting effect of computer-computer testing is removed.

So, the original poster has to decide what are his goals.
profiling showed over 90% of its time spent garbage collecting
Seems very high. My engine spends about 12% in garbage collection.
each TT entry in the HashMap is an object, and they need to be garbage-collected after replacement.
No need for garbage collection. Just pre-allocate the TT entries and reuse.
My C# chess engine: https://www.madchess.net