SMP and questions

syzygy · Post by **syzygy** » Fri Nov 23, 2012 11:58 pm

zamar wrote:3) When splitting at root and second move is taking long time, other threads can usually quickly refute the other moves and then they start helping in resolving the problematic move. Of course if we have two moves which are both taking long to complete, we have a problem, but my instincts say that this is extremely rear.

When a root move has failed high and is being researched with an open window, it might be an idea to let threads that finish searching their root move join the search of the move that failed high?

And another question. Did you try (at any PV split node) checking whether alpha has been updated by another thread before researching a fail high? (And maybe even before researching with full depth a move that failed high when searched with reduced depth?) I have not yet tested whether this is an improvement for me, but in theory it seems this could somewhat reduce the search overhead.

So basically (only for PV split nodes):

Code: Select all

do &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -alpha-1, alpha, d&#41;;
&#125; while &#40;value > alpha && value <= sp->alpha&#41;;

if &#40;value > alpha && value < beta&#41; &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -beta,  -alpha, d&#41;;
&#125;

jdart · Post by **jdart** » Sat Nov 24, 2012 2:49 am

Intel Parallel Studio is free for non-commercial use on Linux and can give you an idea of how efficient your SMP is and where the bottlenecks are.

If you are doing it right, you should get nearly 100% usage on all threads, except for the little bit of startup time before you are hitting the minimum split depth (I do not split if the depth is too low or if in check, because if in check you will typically only have a few moves to search).

With > 2 threads you need to implement the "helpful master" concept to achieve this .. i.e. a master thread should not be waiting for its slave threads to complete, but should help them complete.

--Jon

bob · Post by **bob** » Sat Nov 24, 2012 5:40 pm

syzygy wrote:
zamar wrote:3) When splitting at root and second move is taking long time, other threads can usually quickly refute the other moves and then they start helping in resolving the problematic move. Of course if we have two moves which are both taking long to complete, we have a problem, but my instincts say that this is extremely rear.
When a root move has failed high and is being researched with an open window, it might be an idea to let threads that finish searching their root move join the search of the move that failed high?

And another question. Did you try (at any PV split node) checking whether alpha has been updated by another thread before researching a fail high? (And maybe even before researching with full depth a move that failed high when searched with reduced depth?) I have not yet tested whether this is an improvement for me, but in theory it seems this could somewhat reduce the search overhead.

So basically (only for PV split nodes):
Code: Select all
do &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -alpha-1, alpha, d&#41;;
&#125; while &#40;value > alpha && value <= sp->alpha&#41;;

if &#40;value > alpha && value < beta&#41; &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -beta,  -alpha, d&#41;;
&#125;

Why would you do that? You just found a move that raises beta, why not stop everyone and sic all threads on this specific move so that you can get a score back?

syzygy · Post by **syzygy** » Sat Nov 24, 2012 6:27 pm

bob wrote:
syzygy wrote:So basically (only for PV split nodes):
Code: Select all
do &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -alpha-1, alpha, d&#41;;
&#125; while &#40;value > alpha && value <= sp->alpha&#41;;

if &#40;value > alpha && value < beta&#41; &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -beta,  -alpha, d&#41;;
&#125;
Why would you do that? You just found a move that raises beta, why not stop everyone and sic all threads on this specific move so that you can get a score back?

No, I did not find a move that raised beta. I found a move that would have raised alpha, except that alpha has already been raised by a parallel thread searching the same PV node. I think the pseudocode above expresses it correctly. sp->alpha is the value of alpha that might have been updated by other threads (but will always be < beta).

Suppose you're searching the root. The first move returned +0.5. At some point you start searching all other moves in parallel. Now one move fails high, so you research it. It returns +1.5. Now another move fails high, i.e. it returns > +0.5. Do you research it with an open window (+1.5, INF) (or (+1.5, +1.5 + delta) in case of aspiration search), or do you first test with a nullwindow whether it is > +1.5 ?

The latter seems to have good chances of showing that the value is <= +1.5, in which case you save some time.

syzygy · Post by **syzygy** » Sat Nov 24, 2012 6:33 pm

bob wrote:Why would you do that? You just found a move that raises beta, why not stop everyone and sic all threads on this specific move so that you can get a score back?

Or do you propose that in case of a move that is going to raise alpha in a PV node, you stop all other threads searching the same node and put them on the open window research of the move that failed high? In that case the aborted searches will have to be completely redone (but with an improved alpha).

I don't think this is what you meant, but I'll have a look at crafty to see how you are dealing with this case.

syzygy · Post by **syzygy** » Sat Nov 24, 2012 6:46 pm

syzygy wrote:
bob wrote:Why would you do that? You just found a move that raises beta, why not stop everyone and sic all threads on this specific move so that you can get a score back?
Or do you propose that in case of a move that is going to raise alpha in a PV node, you stop all other threads searching the same node and put them on the open window research of the move that failed high? In that case the aborted searches will have to be completely redone (but with an improved alpha).

I don't think this is what you meant, but I'll have a look at crafty to see how you are dealing with this case.

From SearchParallel() I understand that the "PVS re-search code" (and all other invocations of Search()) simply uses the local alpha and beta without checking whether a parallel thread in the meantime has raised alpha.

You might be able to reduce the search overhead a bit by updating alpha to a value stored in a location shared by all threads searching the particular PV node, each time before you start a search or re-search of a move in a PV node.

Don · Post by **Don** » Tue Nov 27, 2012 4:16 am

mcostalba wrote:
Edsel Apostol wrote: What are the other tricks used to minimize idle time spent by the threads? I have read "Active reparenting" from the SF team but I have not studied SF yet how this is implemented
or was this implemented at all?
The idea is very simple. Implemenatation, as usual with SMP, is simple only after you made it to work .

Once a slave has finished searching, instead of returning to idle loop waiting to be picked up by a new master, it looks for an active split point, register itself as a split point's slave and starts searching on it. The idea is that average idle time is greatly reduced in this case.

Unfortunaty the patch failed to score an increase and so has been retired:

https://github.com/mcostalba/Stockfish/ ... 95083a3ab9

My guess is that the idle time is anyhow already small on average so that the patch does not change the things a lot. One possibility that I have not tested is to enable active reparenting together with increasing minimum split depth. This is because increasing split depth reduces the SMP overhead but at the same time increases the average slave's idle time. So at first glance could make sense to increase the first and enable the second.

This is essentially how it works with the Cilk scheduler too. You grab work from your own work queue first if there is something there, but if you run out of work you pick another queue at random and steal work from that one.

I think Vincent is advocating having only a single queue, I'm not sure though I'm still trying to figure out what he said.

Don

rvida · Post by **rvida** » Tue Nov 27, 2012 8:24 am

Houdini wrote: 7) Don't split good captures or killers - the probability of a cut-off is too high.

However, Houdini (1.5) always tries to split after the first move was searched. (IIRC)

rvida · Post by **rvida** » Tue Nov 27, 2012 8:52 am

syzygy wrote: You might be able to reduce the search overhead a bit by updating alpha to a value stored in a location shared by all threads searching the particular PV node, each time before you start a search

Makes sense. I thought everyone is doing that. If not, they should.

syzygy wrote: or re-search of a move in a PV node.

This is not so obvious. It might be better to retry the reduced search with updated alpha. If it still returns a score above the new alpha only then do a full depth re-search.

bob · Post by **bob** » Tue Nov 27, 2012 5:59 pm

syzygy wrote:
bob wrote:
syzygy wrote:So basically (only for PV split nodes):
Code: Select all
do &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -alpha-1, alpha, d&#41;;
&#125; while &#40;value > alpha && value <= sp->alpha&#41;;

if &#40;value > alpha && value < beta&#41; &#123;
  alpha = sp->alpha;
  value = -search&#40;..., -beta,  -alpha, d&#41;;
&#125;
Why would you do that? You just found a move that raises beta, why not stop everyone and sic all threads on this specific move so that you can get a score back?
No, I did not find a move that raised beta. I found a move that would have raised alpha, except that alpha has already been raised by a parallel thread searching the same PV node. I think the pseudocode above expresses it correctly. sp->alpha is the value of alpha that might have been updated by other threads (but will always be < beta).

Suppose you're searching the root. The first move returned +0.5. At some point you start searching all other moves in parallel. Now one move fails high, so you research it. It returns +1.5. Now another move fails high, i.e. it returns > +0.5. Do you research it with an open window (+1.5, INF) (or (+1.5, +1.5 + delta) in case of aspiration search), or do you first test with a nullwindow whether it is > +1.5 ?

The latter seems to have good chances of showing that the value is <= +1.5, in which case you save some time.

Sounded like a "beta cutoff" to me. Which means you need to raise beta and try again.

Are you not using PVS? If so this would seem to be relevant to my comments, since once you search the first move and establish alpha (YBW) you then search with beta = alpha + 1, and if you get a fail-high, you know you have found a better move. Why not stop everyone, and sic 'em on that single move to get a result back sooner?

For your question, as soon as one move fails high at the root, I stop, then restart the search with everyone searching the same root move to get a score quicker. I don't have any way for the threads to have different alpha values, because any change to alpha has to follow a fail-high, and after a fail-high, everyone is working together on that root move...\

SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions

Re: SMP and questions