I split at the root all the time. If you use 8 cores, there is an 87.5% (7/8) chance that the "master" runs out of work before the other threads finish. The savings are most dramatic at the root, but they apply everywhere. When I parallelized Crafty back in '96, I wrote the current code, but I only turned it on piece by piece since the debugging is a pain. Allowing the master to join in was pretty complicated, but it was a pretty significant performance improvement.
jdart wrote:
Sure, but that is the best case for helpful master. In general the gain is less.
One of the other threads is hung up on a long search because the move it is searching will eventually fail high. Why wouldn't you want the original master to join in and help resolve that fail high before time runs out and you miss it completely?
--Jon
The way the tree is split and re-split, without the "helpful master" concept, at 7/8 of the split points you end up with one CPU "lost" for a while. Certainly worth trying to eliminate that.
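To make the 7/8 figure concrete, here is a rough Python sketch, not anything taken from Crafty. It assumes every thread at a split point is equally likely to be the last one to finish, which is a simplification, and the function names are made up for illustration. Under that assumption the master sits idle at (N-1)/N of the split points unless it is allowed to help.

```python
import random


def idle_master_probability(threads: int) -> float:
    """Chance the master is NOT the last thread to finish at a split point.

    Assumption (not from the original post): each thread is equally
    likely to finish last, so the master finishes last 1/threads of the time.
    """
    return (threads - 1) / threads


def simulate(threads: int, trials: int = 100_000, seed: int = 1) -> float:
    """Estimate the same probability with random, independent search times."""
    rng = random.Random(seed)
    idle = 0
    for _ in range(trials):
        # Hypothetical search times; thread 0 plays the role of the master.
        times = [rng.random() for _ in range(threads)]
        # The master has to wait whenever some other thread takes longer.
        if times[0] < max(times):
            idle += 1
    return idle / trials


print(idle_master_probability(8))  # 0.875, i.e. 7/8
print(simulate(8))                 # should land close to 0.875
```

With 2 threads the same formula gives 1/2 and with 16 it gives 15/16, so under this assumption the share of split points with a waiting master grows with the core count.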
Just my $.02 of course. I am currently revisiting "late-join" since I have pretty much revamped that part of my parallel search anyway...