Easy easy move

hgm · Post by **hgm** » Fri Aug 02, 2013 8:37 pm

In a typical mini-Shogi game almost all moves are captures or checks, and very often there is only one leply that doesn't immediately blunder the game away. So it seems not wasting too much thinking time on such moves should really help. Especially in ponder games, where you don't want to grant your opponent free thinking time with a guaranteed ponder hit!

So I am looking for a simple implementation of easy move. So far I found this:

Code: Select all

int easy;

int Search&#40;int alpha, int beta, int depth&#41;
&#123;
  for&#40;iterDepth=1; iterDepth<depth; iterDepth++) &#123; //&#40;I&#41;ID
    alpha = originalAlpha, bestScore = -INFINITY; firstMove = TRUE;
    for&#40;ALL_MOVES&#41; &#123;

      Make&#40;move&#41;;
      score = -Search&#40;-beta, -alpha, iterDepth-1&#41;; // recursion
      UnMake&#40;move&#41;;

      if&#40;score > bestScore&#41; &#123; // normal alpha-beta score propagation
        bestScore = score;
        if&#40;score > alpha&#41; &#123;
          alpha = score;
          if&#40;score >= beta&#41; break; // beta cutoff;
        &#125;
      &#125;

      if&#40;NodeIsRoot&#40;)) &#123;
        // ********* HERE COMES THE EASY-MOVE CODE&#58; ************
        if&#40;iterDepth > 3 && easy&#41; &#123;
          if&#40;!firstMove && score > alpha - easy&#41; easy = 0; 
          alpha = alpha - easy; // lower alpha for next move
        &#125;
        // *************** END OF EASY-MOVE CODE ***************
        if&#40;TimeUsed&#40;) > TimeLimit&#40;easy&#41; && !DisastrousFailLow&#40;)) break; // be happy with what we have
      &#125;
      firstMove = FALSE;
    &#125; // next move

    if&#40;NodeIsRoot&#40;)) &#123;
      PrintPV&#40;);
      if&#40;TimeUsed&#40;) > SafeToStartNewIter&#40;easy&#41;) break; // no time for new iteration
    &#125;

  &#125; // next depth

  return bestScore;
&#125;

The engine in question doesn't have separate routines for searching root and other nodes, so within its unified Search() there are two if(NodeIsRoot()) statements that cover all stuff that only has to be done in the root (with a perfectly predicted branch on a trivial condition, so it doesn't give a measurable slowdown). There happens to be one in a useful spot, at the end of the loop over moves, in which it is checked if we should grant extra time to search more moves. Without putting any extra load on the normal search I can put other root code there, in particular the easy-move code.

The idea behind the implementation is that in the first few iterations a unique best move should have floated to the beginning of the move list, and will be searched first. The variable firstMove already kept track of that for the benefit of PVS. Now if a later move at iteration 4 or higher has a score that is within a margin 'easy' from the current best move, as recorded in this iteration's alpha, (which will certainly the case if it actually improved alpha, as then score = alpha > alpha - easy after increasing alpha), we clear 'easy', to indicate that life is not easy, but a real choice is to be made.

Of course normally you would not know how close the later moves score to alpha, because if they are below alpha they fail low, and the score is just an upper limit. So we have to lower alpha to alpha - easy, which is done for the next move if the current one did not already spoil the easiness. This opens the window with which the next move will be searched. (I guess I could also lower beta, but PVS uses window {alpha, alpha+1} on the later moves anyway, so as long as they keep failing low beta is never used. And the first one that doesn't fail low compared to the artificially lowered alpha will spoil the easiness once and for all, hopefully at quite low depth, so I did not bother.)

So as long as 'easy' has some non-zero value, all later moves will be searched with the artificially lowered alpha, enabling us to reliably check if the easiness criterion remains fulfilled. The global varianble 'easy' can be initialized before the search to whatever threshold we want to use for a move to qualify as easy.

Does this make any sense?

The easy-move code is put before the tests it there is time to continue, so that these tests can make use of the value of 'easy' to greatly reduce the time quota if 'easy' is still non-zero.

bob · Post by **bob** » Sat Aug 03, 2013 6:17 am

hgm wrote:In a typical mini-Shogi game almost all moves are captures or checks, and very often there is only one leply that doesn't immediately blunder the game away. So it seems not wasting too much thinking time on such moves should really help. Especially in ponder games, where you don't want to grant your opponent free thinking time with a guaranteed ponder hit!

So I am looking for a simple implementation of easy move. So far I found this:
Code: Select all
int easy;

int Search&#40;int alpha, int beta, int depth&#41;
&#123;
  for&#40;iterDepth=1; iterDepth<depth; iterDepth++) &#123; //&#40;I&#41;ID
    alpha = originalAlpha, bestScore = -INFINITY; firstMove = TRUE;
    for&#40;ALL_MOVES&#41; &#123;

      Make&#40;move&#41;;
      score = -Search&#40;-beta, -alpha, iterDepth-1&#41;; // recursion
      UnMake&#40;move&#41;;

      if&#40;score > bestScore&#41; &#123; // normal alpha-beta score propagation
        bestScore = score;
        if&#40;score > alpha&#41; &#123;
          alpha = score;
          if&#40;score >= beta&#41; break; // beta cutoff;
        &#125;
      &#125;

      if&#40;NodeIsRoot&#40;)) &#123;
        // ********* HERE COMES THE EASY-MOVE CODE&#58; ************
        if&#40;iterDepth > 3 && easy&#41; &#123;
          if&#40;!firstMove && score > alpha - easy&#41; easy = 0; 
          alpha = alpha - easy; // lower alpha for next move
        &#125;
        // *************** END OF EASY-MOVE CODE ***************
        if&#40;TimeUsed&#40;) > TimeLimit&#40;easy&#41; && !DisastrousFailLow&#40;)) break; // be happy with what we have
      &#125;
      firstMove = FALSE;
    &#125; // next move

    if&#40;NodeIsRoot&#40;)) &#123;
      PrintPV&#40;);
      if&#40;TimeUsed&#40;) > SafeToStartNewIter&#40;easy&#41;) break; // no time for new iteration
    &#125;

  &#125; // next depth

  return bestScore;
&#125;
The engine in question doesn't have separate routines for searching root and other nodes, so within its unified Search() there are two if(NodeIsRoot()) statements that cover all stuff that only has to be done in the root (with a perfectly predicted branch on a trivial condition, so it doesn't give a measurable slowdown). There happens to be one in a useful spot, at the end of the loop over moves, in which it is checked if we should grant extra time to search more moves. Without putting any extra load on the normal search I can put other root code there, in particular the easy-move code.

The idea behind the implementation is that in the first few iterations a unique best move should have floated to the beginning of the move list, and will be searched first. The variable firstMove already kept track of that for the benefit of PVS. Now if a later move at iteration 4 or higher has a score that is within a margin 'easy' from the current best move, as recorded in this iteration's alpha, (which will certainly the case if it actually improved alpha, as then score = alpha > alpha - easy after increasing alpha), we clear 'easy', to indicate that life is not easy, but a real choice is to be made.

Of course normally you would not know how close the later moves score to alpha, because if they are below alpha they fail low, and the score is just an upper limit. So we have to lower alpha to alpha - easy, which is done for the next move if the current one did not already spoil the easiness. This opens the window with which the next move will be searched. (I guess I could also lower beta, but PVS uses window {alpha, alpha+1} on the later moves anyway, so as long as they keep failing low beta is never used. And the first one that doesn't fail low compared to the artificially lowered alpha will spoil the easiness once and for all, hopefully at quite low depth, so I did not bother.)

So as long as 'easy' has some non-zero value, all later moves will be searched with the artificially lowered alpha, enabling us to reliably check if the easiness criterion remains fulfilled. The global varianble 'easy' can be initialized before the search to whatever threshold we want to use for a move to qualify as easy.

Does this make any sense?

The easy-move code is put before the tests it there is time to continue, so that these tests can make use of the value of 'easy' to greatly reduce the time quota if 'easy' is still non-zero.

I rewrote my easy move code in the last version of Crafty. Basic idea is to set a time multiplier to 1.0. As each iteration is searched, if the best move changes, the multiplier is increased. More changes = bigger change. If the best move does not change, the multiplier is reduced. Took a bit of tuning, but it turned out to be worth a few Elo. And another benefit is that the inverse of "easy" is also sort of detected, as if it keeps changing best moves, the target time (base time * multiplier) keeps climbing making the thing search longer when it seems to be unsure what is best...

sje · Post by **sje** » Sat Aug 03, 2013 7:31 am

Symbolic keeps a list of the predicted variations returned from each iteration of the search. If the last N (N=5 or so) PVs all had the same first move and at least 1/4 of the allocated time has passed, then the first move of the most recent PV is an easy move (my name "obvious move"). The parameters can be modified slightly if the obvious move is also the predicted move from the prior search.

hgm · Post by **hgm** » Sat Aug 03, 2013 8:41 am

bob wrote:As each iteration is searched, if the best move changes, the multiplier is increased. More changes = bigger change. If the best move does not change, the multiplier is reduced. Took a bit of tuning, but it turned out to be worth a few Elo. And another benefit is that the inverse of "easy" is also sort of detected, as if it keeps changing best moves, the target time (base time * multiplier) keeps climbing making the thing search longer when it seems to be unsure what is best...

A problem I see with that is that I very often have situations where the best move does not change at all for the first N iterations, because the score of all moves is supplied by the hash table. Only when the draft of the hashed entries is no longer sufficient it starts thinking, but often it does not change move even in a position where there are two close moves (e.g. where it would alternate two moves during the first N iterations when you do them with an empty hash table). And quite often it is not even able to finish the first iteration it really has to work on, but is aborted before finishing searching the first move, in which case it certainly does not change.

In your algorithm, these would cause the false impression of an easy move. You would even get less chance to finish the first iteration it has to do real work.

Don · Post by **Don** » Sat Aug 03, 2013 8:24 pm

bob wrote: I rewrote my easy move code in the last version of Crafty. Basic idea is to set a time multiplier to 1.0. As each iteration is searched, if the best move changes, the multiplier is increased. More changes = bigger change. If the best move does not change, the multiplier is reduced. Took a bit of tuning, but it turned out to be worth a few Elo. And another benefit is that the inverse of "easy" is also sort of detected, as if it keeps changing best moves, the target time (base time * multiplier) keeps climbing making the thing search longer when it seems to be unsure what is best...

That's a very sensible implementation. It's probably the most important of all the factors that can be considered. Does your system give more weight to later iterations? I think that would be an improvement if you are not already doing that but probably not a big deal and not tested. It should probably asymptotically approach some maximum value too so that it doesn't grow forever (or at least have a cutoff) - I would like to hear some more details if you are willing to provide them.

bob · Post by **bob** » Sat Aug 03, 2013 10:26 pm

hgm wrote:
bob wrote:As each iteration is searched, if the best move changes, the multiplier is increased. More changes = bigger change. If the best move does not change, the multiplier is reduced. Took a bit of tuning, but it turned out to be worth a few Elo. And another benefit is that the inverse of "easy" is also sort of detected, as if it keeps changing best moves, the target time (base time * multiplier) keeps climbing making the thing search longer when it seems to be unsure what is best...
A problem I see with that is that I very often have situations where the best move does not change at all for the first N iterations, because the score of all moves is supplied by the hash table. Only when the draft of the hashed entries is no longer sufficient it starts thinking, but often it does not change move even in a position where there are two close moves (e.g. where it would alternate two moves during the first N iterations when you do them with an empty hash table). And quite often it is not even able to finish the first iteration it really has to work on, but is aborted before finishing searching the first move, in which case it certainly does not change.

In your algorithm, these would cause the false impression of an easy move. You would even get less chance to finish the first iteration it has to do real work.

That's certainly an issue. My only hope is that it eventually gets deep enough. One key, an easy move never drops below 60% or so of the normal time allocation...

The one "holy grail" I searched for for about 6 months was to find some signature that "this position is complicated". Not just a fail-low, which is often too late. Hsu claimed to have had something.. When I asked a long while back, it seemed to not work for me thanks to the aggressive null-move stuff, reductions and pruning...

bob · Post by **bob** » Sat Aug 03, 2013 10:36 pm

Don wrote:
bob wrote: I rewrote my easy move code in the last version of Crafty. Basic idea is to set a time multiplier to 1.0. As each iteration is searched, if the best move changes, the multiplier is increased. More changes = bigger change. If the best move does not change, the multiplier is reduced. Took a bit of tuning, but it turned out to be worth a few Elo. And another benefit is that the inverse of "easy" is also sort of detected, as if it keeps changing best moves, the target time (base time * multiplier) keeps climbing making the thing search longer when it seems to be unsure what is best...
That's a very sensible implementation. It's probably the most important of all the factors that can be considered. Does your system give more weight to later iterations? I think that would be an improvement if you are not already doing that but probably not a big deal and not tested. It should probably asymptotically approach some maximum value too so that it doesn't grow forever (or at least have a cutoff) - I would like to hear some more details if you are willing to provide them.

I have not stopped playing with it, but 23.6 has the last thing I did. Yes later iterations "count more" because of how I increase/decrease this "multiplier". And both directions have well-established bounds.

Here's a summary of the code in 23.6 as released:

1. if there was more than one best move for an iteration,

if difficulty currently is < 100
difficulty = 100 + 20 * number_of_best_moves
else
difficulty = 80 * difficulty / 100 + 20 * number_of_best_moves

The idea is that if we are in an "easy move" mode (where difficulty factor is < 100, I quickly set it back to 100 + 20 * the number of best moves we found. If we change our mind 4 times, it goes to 180 (1.8x normal target time).

2. If there was just one best move,

difficulty = 90 * difficulty / 100.

IE drop it by 10%.

The allowable range is 60 <= difficulty <= 180.

The usual fail-low stuff is still present so that if the score drops, it can use quite a bit of time. Ditto for searching the current root moves that are in progress, before aborting, to be sure we are not about to fail high on one of 'em...

hgm · Post by **hgm** » Sun Aug 04, 2013 1:17 pm

Well, I am looking more for thinking-time reductions of a factor 5-10, in positions where all moves except one have a losing score (say < -700, in terms of normal Chess).

Don · Post by **Don** » Sun Aug 04, 2013 2:09 pm

hgm wrote:Well, I am looking more for thinking-time reductions of a factor 5-10, in positions where all moves except one have a losing score (say < -700, in terms of normal Chess).

Komodo uses a scheme similar to Bob's but takes node counts of moves into consideration - and it can play very fast. So you can make it play as fast as you wish. However you could also easily allow iteration score to be a factor, perhaps speeding up even more when the score is not close to zero.

Don

hgm · Post by **hgm** » Sun Aug 04, 2013 5:47 pm

I guess I should only take an easy move if the score is close to equality. If the easy-move margin is 300 cP, choosing between a move that scores 500 and a lot of others that score ~200 cP is a real choice, as the latter are still winning. Playing the 500 cP "blindly" could bungle the win, when deeper searched would have revealed the move to be poisoned, while +200 would still have easily won.

Easy easy move

Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move

Re: Easy easy move