Research paper comparing various parallel search algorithms

koedem · Post by **koedem** » Wed Nov 10, 2021 4:33 pm

Is there a reasonably recent paper that compares various parallel search algorithms? For simplicity let's look at ones for PV search based algorithms in particular. I see there is this https://www.chessprogramming.org/Parall ... h#Taxonomy however, that seems quite old and I also can't access the original source.

The background is, for some university lecture I'm taking, I may among other options give a short talk about some parallel algorithms for certain problems or, alternatively, write a wikipedia article or add a section to an existing one (for instance the one on alpha-beta search). The effort is supposed to be not too high (20 minutes of talking or one wiki section won't contain an extraordinary amount of information) so maybe this topic is not a good idea in the first place, but I might want to give it a try.

Now currently I have been looking through various chessprogramming pages, read some forum posts here and so forth, but that info is rather widely spread. Some overview might make it more clear whether this is a good idea in the first place.

Furthermore, unrelated to this specific project, if there were no such up to date paper existing, do you think there would be interest in writing one? To me this seems like an obvious paper that "should" exist so to speak.

koedem · Post by **koedem** » Wed Nov 10, 2021 5:04 pm

koedem wrote: ↑Wed Nov 10, 2021 4:33 pm Furthermore, unrelated to this specific project, if there were no such up to date paper existing, do you think there would be interest in writing one? To me this seems like an obvious paper that "should" exist so to speak.

Can't edit my post anymore, I mean in particular, if I tried to write such a paper would anyone be interested in reading it?

Karlo Bala · Post by **Karlo Bala** » Wed Nov 10, 2021 5:26 pm

koedem wrote: ↑Wed Nov 10, 2021 5:04 pm
koedem wrote: ↑Wed Nov 10, 2021 4:33 pm Furthermore, unrelated to this specific project, if there were no such up to date paper existing, do you think there would be interest in writing one? To me this seems like an obvious paper that "should" exist so to speak.
Can't edit my post anymore, I mean in particular, if I tried to write such a paper would anyone be interested in reading it?

Well, if it was meant to be a scientific paper, then there would be readers for sure

Ajedrecista · Post by **Ajedrecista** » Wed Nov 10, 2021 9:05 pm

Hello:

koedem wrote: ↑Wed Nov 10, 2021 4:33 pm[...] I see there is this https://www.chessprogramming.org/Parall ... h#Taxonomy however, that seems quite old and I also can't access the original source.

[...]

I did not found the original article with the headers and footers of ICGA Journal (ICCA Journal in those days if you look at the free PDFs here) in a brief search. However, I found a PDF that could be a transcription of the original article:

http://citeseerx.ist.psu.edu/viewdoc/su ... .1.47.8438

http://citeseerx.ist.psu.edu/viewdoc/do ... 1&type=pdf

I also found the paper in PS format:

https://webdocs.cs.ualberta.ca/~games/a ... nomy.ps.gz

But it is the same PDF than before once you convert it to PDF with an online converter.

Regards from Spain.

Ajedrecista.

koedem · Post by **koedem** » Thu Nov 11, 2021 2:11 pm

I wonder, is the following statement as to why people switched away from YBW reasonable?

As I interpret it, YBW and similar do work rather well if the branching factor is sufficiently high (which makes sense, because the parallelism happens where there are different branches to look at). However, modern engines like Stockfish prune so aggressively that the effective branching factor is rather low and thus it is much harder to parallelize. And so, an algorithm like lazySMP, which in itself is not that "great" can be competitive and even outperform YBW because it is hit less by the low branching factor?
(coupled with the fact that such a lazySMP approach can also be more effective on a modern engine like Stockfish with many search heuristics to wiggle at for non determinism, than if you for instance had a plain PV search)

jorose · Post by **jorose** » Thu Nov 11, 2021 7:52 pm

koedem wrote: ↑Thu Nov 11, 2021 2:11 pm I wonder, is the following statement as to why people switched away from YBW reasonable?

As I interpret it, YBW and similar do work rather well if the branching factor is sufficiently high (which makes sense, because the parallelism happens where there are different branches to look at). However, modern engines like Stockfish prune so aggressively that the effective branching factor is rather low and thus it is much harder to parallelize. And so, an algorithm like lazySMP, which in itself is not that "great" can be competitive and even outperform YBW because it is hit less by the low branching factor?
(coupled with the fact that such a lazySMP approach can also be more effective on a modern engine like Stockfish with many search heuristics to wiggle at for non determinism, than if you for instance had a plain PV search)

Back in my masters, I had a group project where our group implemented a baseline chess engine with only "correct" pruning, in the sense that the engine was guaranteed to reach the exact minimax optimal solution at any depth and with any number of threads. This implied fairly minimal pruning and we expected performance to be pretty fine under these conditions. What I remember is that we used YBW and it scaled very poorly to high thread counts.

Unfortunately, I am not sure I have access to the class project code still, nor could I guarantee there were no serious bugs or such things, but that's what I remember about my takeaway.

I wrote the bulk of the code and chess functionality for my team and was actually very happy with that part of the code. I might have higher standards now, but at the time I decided to fork the project at the end of the semester. In the fork, the first thing I did was remove any of the multithreading code portions and then started adding a ton of improvements. I also renamed the engine, it is now called "Winter".

koedem · Post by **koedem** » Fri Nov 12, 2021 7:57 pm

jorose wrote: ↑Thu Nov 11, 2021 7:52 pm Back in my masters, I had a group project where our group implemented a baseline chess engine with only "correct" pruning, in the sense that the engine was guaranteed to reach the exact minimax optimal solution at any depth and with any number of threads. This implied fairly minimal pruning and we expected performance to be pretty fine under these conditions. What I remember is that we used YBW and it scaled very poorly to high thread counts.

Unfortunately, I am not sure I have access to the class project code still, nor could I guarantee there were no serious bugs or such things, but that's what I remember about my takeaway.

Oh, interesting. I thought I had read in some papers that YBW was supposed to give good speedups up to a 1000 cores or so, although I don't know how trustworthy those numbers are either. But given that it was one of the main algorithms 10 years ago it shouldn't have been so terrible? Maybe it suffers from poor move ordering, but that shouldn't be all?!

jorose · Post by **jorose** » Fri Nov 19, 2021 8:45 pm

koedem wrote: ↑Fri Nov 12, 2021 7:57 pm
jorose wrote: ↑Thu Nov 11, 2021 7:52 pm Back in my masters, I had a group project where our group implemented a baseline chess engine with only "correct" pruning, in the sense that the engine was guaranteed to reach the exact minimax optimal solution at any depth and with any number of threads. This implied fairly minimal pruning and we expected performance to be pretty fine under these conditions. What I remember is that we used YBW and it scaled very poorly to high thread counts.

Unfortunately, I am not sure I have access to the class project code still, nor could I guarantee there were no serious bugs or such things, but that's what I remember about my takeaway.
Oh, interesting. I thought I had read in some papers that YBW was supposed to give good speedups up to a 1000 cores or so, although I don't know how trustworthy those numbers are either. But given that it was one of the main algorithms 10 years ago it shouldn't have been so terrible? Maybe it suffers from poor move ordering, but that shouldn't be all?!

I think it likely depends largely on the conditions. I love computer chess and would love to write research papers in this field, but have avoided doing it as so many factors are completely dependent on the exact conditions.

It's possible we had a bug or its possible the design of our barebones engine meant the engine could not properly utilize YBW for scaling.

I also wouldn't be confident in the old YBW papers, however. Do they focus on chess or general game trees? If they focus on chess, how do the engines they use compare to modern engines in terms of search features? If they rely on general game trees, how comparable is their simulated data for chess? How do they measure scaling? Are they looking at time to depth, node counts, time to find mate, or something completely different?

In the course we chose our design specifically so speedup in terms of time to depth would be well defined. Minimax would take X time to reach a depth with a certain PV resulting an eval score S and our YBW search would get the same length PV at that depth and take time Y to reach the same eval. PVs could be different but we verified the length was the one we expected and eval of the final position was correct and identical. Unfortunately this design also made it completely incomparable to modern chess engines.

I think we had a hash table with 64 bit (might have even been 128 bits to be sure we detect collisions) zobrist hashing and our search was an AlphaBeta search, but if you think about it, almost every other feature of modern search adds a component of nondeterminism that can result in missing the optimal PV at a given depth. I think we had a QSearch, but this is actually non-trivial given the constraints.

I think on paper LazySMP does not result in as high depth gains as we would expect, but in practice it is hard to beat. This is especially the case considering the benefits such as ease of implementation, which also leads to fewer bugs and better chances of nothing going wrong at thread counts you might not have the hardware to test on.

Somewhat ironically, at the time we didn't test LazySMP, as we assumed it would be more complicated as it was much newer. Had we spent 10 minutes to properly read up on it we would have for sure added it, but I guess that's how projects at uni courses kind of go.

Another fun fact is I wrote most of the codebase while I was playing the European Club Cup in Serbia that year. So you can imagine I was playing up to 6 hours of chess a day, preparing for my opponents of the next day while trying to also keep in touch with the project partners back home and somehow getting a functional chess engine that was clean enough my non-chess playing partners could understand and modify to generate project data. It is kind of a miracle the codebase ended up being something I liked well enough to fork into becoming my personal project.

Deberger · Post by **Deberger** » Sat Nov 20, 2021 1:27 am

koedem wrote: ↑Wed Nov 10, 2021 4:33 pm I also can't access the original source.

Brockington, Mark G. "A taxonomy of parallel game-tree search algorithms." ICCA Journal 19, no. 3 (1996): 162-174.

pdf

Research paper comparing various parallel search algorithms

Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms.

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms

Re: Research paper comparing various parallel search algorithms