Playing better moves in drawish positions (anti-0.00)

Onno Garms · Post by **Onno Garms** » Sun Mar 13, 2011 8:40 pm

This is what I had been working on when I decided to give up Onno. I
do not yet know whether this idea can be made to work. Currently it
does not work. I call it anti-0.00.

The problem is the following: in some positions, Onno has a strange
idea of who has the upper hand. In reality, Onno should fight for a
draw. But Onno thinks that
- he has the upper hand
- he cannot avoid repetition

This problem occurs with Onno quite often, most likely due to some
problems in eval. But other engines sometimes have this kind of problem
too, especially when they use positive contempt.

As Onno thinks that the position is a draw anyway due to repetition, he
will play bad moves. Suppose the real value of the position is -30
from Onno's viewpoint, but Onno thinks it is +20 apart from the
repetition. Onno can then make a weak move that reduces the value to +5,
because he thinks the repetition is coming anyway. But this reduces
the real value to -45.
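
The numbers above can be played through in a small sketch. It assumes, purely for illustration, that a line in which the opponent can force a repetition is scored as min(eval, 0) from the engine's point of view. The function name `searched_value` is made up for this example and is not taken from Onno.

```cpp
#include <algorithm>

// Illustrative only: value in centipawns that the search reports for a move,
// from the engine's point of view, when the opponent is able to force a
// repetition afterwards. The opponent takes the draw (0) whenever the
// evaluation is positive for us, so the search sees min(eval, 0).
constexpr int searched_value(int eval_after_move) {
    return std::min(eval_after_move, 0);
}

// Numbers from the post: the sound move keeps the (mis)evaluation at +20,
// the weak move drops it to +5. Both collapse to 0.00, so the search has no
// reason to prefer the sound move, although the real values are -30 and -45.
static_assert(searched_value(20) == 0, "sound move is reported as 0.00");
static_assert(searched_value(5) == 0, "weak move is reported as 0.00 as well");
```

Because both moves are reported as 0.00, the choice between them is left to move ordering, which is how the weak move can end up being played.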

In order to solve this problem, I first implemented contempt. I run
normal searches without contempt (or with the contempt given in the UCI
options). If this returns 0.00, I run an additional search with low
contempt (i.e. Onno wants a draw) to make Onno go for the draw that is
indicated by the 0.00.

Maybe it is advisable to modify eval so that it does not return 0
by coincidence. But at least when working with milli-pawns this hardly
happens anyway.

I also thought about somehow integrating an "option to draw the game" into
the position value and doing just one search. But I arrived at the
conclusion that this is impossible.

Here is my implementation, but it does not yet work properly. What
should be tried:
- Always run first the mode that was in effect at the previous depth.
- Only switch to draw mode when the value 0.00 arose at two or three
consecutive depths.

Code:

  // NOT YET WORKING!!

  // convert move list into PV list
  PV* pv_list      = create_pv_list (move_list);
  PV* pv_list_draw = create_pv_list (move_list);
  d_callback->set_pv (pv_list, move_list.size());
  d_callback->set_null_scores (d_null_scores);

  // search by iterative deepening
  // ----------------------------------------------------------------------
  int depth;
  int draw_depth = 0;
  bool draw = false;
  for (depth = 1; depth<SearchBase::c_end_height && !d_terminator->interrupt(); ++depth)
  {
    // set mode
    d_eval_parameter->set_mode (EvalParameter::neutral);
    d_trans_map->set_draw_mode (EvalParameter::neutral);

    // do the search for given depth
    d_callback->send_nodes (d_node_counter->total(), d_tbhit_counter->total());
    d_callback->send_depth (depth);
    d_terminator->send_depth_start (depth);
    root_search (depth, p_board, p_multi_pv, move_list.size(), pv_list);

    if (pv_list[0].value()==0)
    {
      // setup for draw search: the side to move is the one that wants the draw
      EvalParameter::DrawMode draw_mode =
        p_board.col2mov()==Color::white ? EvalParameter::white_wants_draw
                                        : EvalParameter::black_wants_draw;
      d_eval_parameter->set_mode (draw_mode);
      d_trans_map->set_draw_mode (draw_mode);
      // draw research up to current depth
      while (draw_depth < depth)
      {
        ++draw_depth;
        root_search (draw_depth, p_board, p_multi_pv, move_list.size(), pv_list_draw);
      }
      // switch to results from draw research
      d_callback->set_pv (pv_list_draw, move_list.size());
      d_callback->set_null_scores (true);
      draw = true;
    }
    // switch back to normal results
    else
    {
      d_callback->set_pv (pv_list, move_list.size());
      d_callback->set_null_scores (d_null_scores);
      draw = false;
    }

    // evaluate result of search
    if (d_terminator->interrupt())
      break;
    d_terminator->send_depth_finished ();
  }
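
The second item of the "what should be tried" list (switch to draw mode only after 0.00 arose at several consecutive depths) could be sketched roughly as follows. This is a standalone illustration, not part of Onno: the helper `first_draw_mode_depth` and its parameters are hypothetical, and it only tracks the streak of 0.00 root scores without touching the search itself.

```cpp
#include <initializer_list>

// Hypothetical helper: given the root scores of the normal search at depths
// 1, 2, 3, ..., return the first depth at which draw mode would be switched
// on when that takes `required` consecutive 0.00 results.
// Returns 0 if the streak never gets that long.
constexpr int first_draw_mode_depth(std::initializer_list<int> root_scores,
                                    int required) {
    int streak = 0;  // number of consecutive depths that returned 0.00
    int depth = 0;
    for (int score : root_scores) {
        ++depth;
        streak = (score == 0) ? streak + 1 : 0;  // any other score resets it
        if (streak >= required)
            return depth;
    }
    return 0;
}
```

With `required` set to 1 this reduces to the behaviour of the loop above, which switches as soon as a single depth returns 0.00.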

Michel · Post by **Michel** » Mon Mar 14, 2011 6:47 pm

I can confirm that this problem is real.

At the recent TCEC tournament (division F) GnuChess blundered against Naraku precisely for this reason. After seeing that the opponent could force a repetition it gave away a pawn, mistakenly thinking that after the sacrifice it was still better and the opponent would still settle for the repetition. Of course this is not what happened....

The source of this problem is of course the eval. In this case it was caused by GnuChess having a bad passed pawn eval.

hgm · Post by **hgm** » Tue Mar 15, 2011 2:14 pm

I guess the most efficient way to solve this problem is not to assign a 0.00 score to repetitions (just as it is also very ill-advised to assign 0-scores to end-games that tablebase probing reveals as draws). Not all draws are equal. In particular, forcing a repetition is better than being forced into a repetition, as you have the choice, and the 0-score is just a lower limit. So it would make sense to score the move that creates the repetition as somewhat > 0. How much you could make dependent on the score of the best other move. Add some contempt to it (say 200cP) as a margin for mis-evaluation, and then divide the result by 10.

Then, if you think the opponent has -150 as an alternative to drawing, the draw would be scored as +5. But if you sac a Pawn first, his alternative will score -50, and the draw would score +15 in his favor. So you would prefer to go for the draw without sacrificing the pawn first.
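
The arithmetic of this rule can be written down as a short sketch. The names and the use of integer division are assumptions made for the example. Scores are in centipawns from the point of view of the side that is free to take the repetition.

```cpp
// Sketch of the scoring rule described above. Names are hypothetical.
constexpr int kMisevalMarginCp = 200;  // margin for mis-evaluation (from the post)
constexpr int kDivisor = 10;           // scaling factor (from the post)

// Score of a repetition for the side that may choose it, given the score of
// that side's best alternative move. Alternatives below -200 cP fall outside
// the margin assumed in the post and are not handled specially here.
constexpr int repetition_score(int best_alternative_cp) {
    return (best_alternative_cp + kMisevalMarginCp) / kDivisor;
}

// The two cases from the post: without the pawn sacrifice the opponent's
// draw is worth +5 to him, after the sacrifice it is worth +15 to him.
static_assert(repetition_score(-150) == 5, "alternative -150 gives +5");
static_assert(repetition_score(-50) == 15, "alternative -50 gives +15");
```

Seen from the other side these are -5 and -15, which is why the line without the pawn sacrifice is preferred.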

The price is that you would have to make the draw score raise alpha not to 0, but to -200, to get an exact score for breaking out of the loop. And if 0 <= alpha to start with, it is more of a problem. If alpha = +15cP (say), a move with score better than -40 would upgrade the rep-score to +16 or above, so it won't fail low. To know it, you would have to search all other moves with alpha = -40, instead of +15. But you might have already searched some moves with alpha = +15 before you encounter the move that repeats, and then you would have to re-search those with a more stringent bound. With IID you would of course become aware of the repetition possibility in the first iteration, so you could apply the window shift to the final iteration. Note that this problem only occurs for alpha in the range 0-22: to correct a rep-draw score to +23 you would need another move with score above +30, which would be above the corrected draw score by itself. With milli-Pawn granularity the range could be made even smaller (dividing by 100, instead of 10), so it would hardly ever occur, unless the search is really comparing different rep-draws in order to determine the best one to go for.
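
The range 0-22 follows from a short calculation. Write $s$ for the score of the best other move and $d(s)$ for the corrected draw score, both in centipawns, with the 200 cP margin and the divisor 10 from the post:

```latex
d(s) = \frac{s + 200}{10}, \qquad
d(s) > s \;\iff\; s + 200 > 10\,s \;\iff\; s < \frac{200}{9} \approx 22.2
```

So the corrected draw score can only exceed the alternative it was derived from when that alternative is below about +22 cP. For example, $d(30) = 23 < 30$, and for alpha = +15 the condition $d(s) \ge 16$ is equivalent to $s \ge -40$. If the divisor were 100 and the margin stayed at 200 cP (an assumption, since the post does not say), the same calculation would give $s < 200/99 \approx 2$ cP.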

FlavusSnow · Post by **FlavusSnow** » Wed Mar 16, 2011 3:44 am

Maybe I'm misunderstanding you, Mr. Muller. Are you saying to score a draw positively?

A positive score for a draw is counterintuitive to me. I'd think that in general you would want an engine to avoid a draw when winning or even, and to seek a draw when losing. Wouldn't this be achieved by a slightly negative value for a draw?

So if the search reveals a draw, that score would be returned as a constant that is slightly negative. Ideally the value of a draw would change with your opponent's rating!

hgm · Post by **hgm** » Wed Mar 16, 2011 8:24 am

What you describe is known as 'contempt factor'. This can be combined with the idea above, by adding it to the draw score afterwards. So my explanation would apply to the case where you are absolutely neutral to the idea of a draw. If a draw should be considered a good result, because your opponent is stronger, or you were playing black, you add positive contempt to the draw score.

But the reason the score is positive in the case I explain is that it is a 'draw or better' for the side that chooses it. So it describes the 'or better' part, really. Normally the score of a node is the score of the best move, because that is the one you are going to play. But here you are comparing a score that is absolutely certain (because the game is finished) with one that is uncertain (it might change on deeper search), and that begs some correction.

Onno Garms · Post by **Onno Garms** » Sat Mar 19, 2011 4:59 pm

I also thought about integrating different levels of draw scores into the normal search, but I couldn't make up a working scheme for this.

hgm wrote: In particular, forcing a repetition is better than being forced into a repetition

How can you tell who forces the repetition and who is being forced? This seems to be unrelated to the question of whose move creates the first repetition. Suppose that with bQh4, wPg2, wKg1 white makes a move at the other end of the board. Then black decides to force a repetition by 1... Qe1 2. Kh2 Qh4. White's 3. Kg1 creates the repetition. When you start with bQg3, 3... Qe1 creates the first repetition.

hgm wrote: as you have the choice, and the 0-score is just a lower limit. So it would make sense to score the move that creates the repetition as somewhat > 0. How much you could make dependent on the score of the best other move. Add some contempt to it (say 200cP) as a margin for mis-evaluation, and then divide the result by 10.

I couldn't get the idea of integrating draw options into the eval to a level where I could have started implementation. I mostly thought about having something like a struct of a value and flags which player can draw the game.
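
For illustration only, "a struct of a value and flags which player can draw the game" might look something like the sketch below. All names are hypothetical, and the rule in `effective_value` (each side takes a draw whenever that beats playing on) is an assumption of this example. It describes a single node and says nothing about how such flags would be propagated up the tree.

```cpp
// Hypothetical sketch of a value plus flags saying which player can draw.
// Scores are in centipawns from the point of view of the side to move.
struct DrawAwareValue {
    int value;               // value of the node if nobody takes the draw
    bool mover_can_draw;     // side to move can force a repetition
    bool opponent_can_draw;  // the other side can force a repetition
};

// What the side to move can expect if both players take a draw whenever it
// is better for them than playing on (an assumption of this example).
constexpr int effective_value(DrawAwareValue v) {
    if (v.value > 0 && v.opponent_can_draw) return 0;  // opponent bails out
    if (v.value < 0 && v.mover_can_draw) return 0;     // mover bails out
    return v.value;
}
```

Keeping `value` next to the flags, instead of collapsing everything to 0.00 immediately, is what would let a search still distinguish +20 from +5 in lines that both end in a repetition.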

Your suggestion means to evaluate positions where your opponent has the choice whether to draw the game or not to the interval (0, 0.20] (assuming that a margin of 2.0 for misevaluation is always sufficient). Might be more appropriate to map to (0, 0.01), but that is not my point.

The problem seems to be that to know the value of the best other move, you will have to search the other moves. At the very node where the repetition occurs, that is possible. But you will need that information at earlier nodes too. For example, a player might need to sacrifice his queen before giving perpetual check. Nonetheless his alternative to drawing the game is not continuing with queen odds but playing a different move earlier.

At least this means overhead in all positions for bookkeeping of "draw options". I doubt that it is worth it. Even worse, I could not find a scheme for propagating "draw options" along the search tree.

In the following tree, suppose that at the root node it is white's turn. At each node I have a boolean saying whether black can force repetition and a value of the node (from the perspective of the player whose turn it is at that node), assuming that black does not want repetition.

Code:

b  (-20,t)   (-10,f)   (-30,t)
         \    /          |
w        (+10,f)       (30,t)
              \       /
               \     /
b              (-30,t)           (-25,t)
                     \          /
                      \        /
w                       (30,t)

The engine will play the left move at root node. However when the evaluation is off by 40, the right move is better:

Code:

b  (20,t)   (30,f)    (10,t)
         \    /          |
w        (-20,t)      (-10,t)
              \       /
               \     /
b               (20,t)           (15,t)
                     \          /
                      \        /
w                       (-15,t)

bob · Post by **bob** » Sat Mar 19, 2011 6:12 pm

FlavusSnow wrote: Maybe I'm misunderstanding you, Mr. Muller. Are you saying to score a draw positively?

A positive score for a draw is counterintuitive to me. I'd think that in general you would want an engine to avoid a draw when winning or even, and to seek a draw when losing. Wouldn't this be achieved by a slightly negative value for a draw?

So if the search reveals a draw, that score would be returned as a constant that is slightly negative. Ideally the value of a draw would change with your opponent's rating!

I can give you several ideas, all used in current crafty.

First, for egtb positions. Mates are obvious, but what about draws? For example, krpkr might be drawn, but the onus is on the losing side to draw it. Crafty scores any draw where it is ahead in material as +.01. Any draw where it is down in material is -.01, and any draw where material is even is 0.00. That makes it prefer to stay in the KRP vs KR rather than just tossing the pawn away, since both are draws. In the former, your opponent can make a simple error and lose. In the latter, he will only lose if he hangs his rook.
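
As a rough sketch, the three-way draw score described here could look like this. The function name and signature are made up for the example and are not taken from the Crafty source. Scores are in centipawns, so +.01 is 1.

```cpp
// Sketch of the tablebase draw scoring described above, from the engine's
// point of view. Material totals may be in any consistent unit.
constexpr int tablebase_draw_score(int own_material, int opponent_material) {
    if (own_material > opponent_material) return 1;   // +0.01: ahead in material
    if (own_material < opponent_material) return -1;  // -0.01: down in material
    return 0;                                         //  0.00: material is even
}
```

With illustrative piece values of 500 for a rook and 100 for a pawn, the drawn KRP vs KR ending scores +1 for the side with the pawn, while the KR vs KR position reached by giving the pawn away scores 0, so keeping the pawn is preferred.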

Second, if you are playing someone rated higher than you, a draw is a good result. If you are playing someone rated lower than you, a draw is a bad result. You can vary the draw score, so that when playing a weaker opponent, a draw is -.2 or -.5 (you have to tune this to avoid making silly mistakes of course). If you play someone much better than you, you can use +.20 or +.50, bigger values for stronger opponents.

It works well. You can look at the Crafty source to see how it works. In particular, look at option.c "rating" command which is where Crafty learns its rating and the opponent's rating from xboard when playing on a server.
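
A minimal sketch of such a rating-dependent draw score is shown below. It is not Crafty's code: the function name is hypothetical and the rating thresholds (100 and 400 points) are placeholders that would need tuning, as noted above. Only the signs and the rough magnitudes (.20 and .50) follow the description.

```cpp
// Sketch of a rating-dependent draw score in centipawns, from the engine's
// point of view. Thresholds are placeholders chosen for this example.
constexpr int draw_score_for_ratings(int own_rating, int opponent_rating) {
    const int diff = own_rating - opponent_rating;  // > 0: engine is stronger
    if (diff >= 400) return -50;   // much weaker opponent: a draw is a bad result
    if (diff >= 100) return -20;   // weaker opponent
    if (diff <= -400) return 50;   // much stronger opponent: a draw is a good result
    if (diff <= -100) return 20;   // stronger opponent
    return 0;                      // roughly equal ratings
}
```

The value returned here would replace the constant 0.00 wherever the search scores a draw.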

Onno Garms · Post by **Onno Garms** » Sat Mar 19, 2011 7:21 pm

bob wrote: I can give you several ideas, all used in current crafty.

These ideas are definitely useful, but I don't think they address the original problem.

First, for egtb positions.

Your solution will make the engine stop playing ridiculous-looking moves in many cases, but at least if the opponent also has egtb, it will not change the result of the game. Correct me if I am wrong.

Second, [contempt]

Sure. Implementing contempt is a prerequisite for my idea but does not solve the original problem unless you play with negative contempt (which might not be what you want). Positive contempt might even enlarge the original problem. My suggestion can be described in short as automatic enabling of contempt when necessary.

bob · Post by **bob** » Sat Mar 19, 2011 9:58 pm

Onno Garms wrote:
bob wrote: I can give you several ideas, all used in current crafty.
These ideas are definitely useful, but I don't think they address the original problem.

First, for egtb positions.
Your solution will make the engine stop playing ridiculous-looking moves in many cases, but at least if the opponent also has egtb, it will not change the result of the game. Correct me if I am wrong.

If both have EGTBs, _nothing_ will change the result of the game. I'm not sure I understand your point. I have found the -.01, 0.00 and +.01 idea to be helpful, as even in repetitions, that is my draw score, not 0.00. Crafty is unlikely to walk away from a -0.01 score, since it is down in material, while at +0.01, it might well suddenly decide to avoid the repetition if it can improve on that +0.01 in any way (it does have a material advantage, obviously).

Second, [contempt]
Sure. Implementing contempt is a prerequisite for my idea but does not solve the original problem unless you play with negative contempt (which might not be what you want). Positive contempt might even enlarge the original problem. My suggestion can be described in short as automatic enabling of contempt when necessary.

I do both negative and positive, all based on rating(crafty)-rating(opponent). I am not sure it matters whether you try to get too clever, or just implement a simple negative contempt when you are stronger than your opponent, and a positive contempt when you are weaker. For Crafty I use the -0.01 ~ +0.01 window for ratings within 100 points of each other. If crafty is stronger, I go to +.20, +.30 or +.50 depending on whether the opponent is over 100 points better, over 300 points better, or over 500 points better. If you don't, you draw too many games on ICC where your opponent plays very drawish openings and you see draws before he makes a mistake... And since he is a lot weaker, he _will_ make those mistakes.
