Threat information from evaluation to inform q-search

gladius · Post by **gladius** » Tue Sep 15, 2009 9:38 am

I ran into an interesting position in an extremely fast test game:

[d]4r3/pp1brp2/4p1k1/4P3/5R2/P1R4P/1P3PP1/6K1 b[/d]

My engine played Kg5 and was promptly mated. It's a #7, so 14 ply should find this threat. I was trying to think of something that would help in general game play and avoid this threat as well.

I ended up using the king safety information from the evaluation, and returning a kingInDanger[2] from eval. Then, if kingInDanger[], I generate checking moves much further in q-search than I normally would. The threshold for setting kingInDanger changes the personality a fair bit

.

This seems to avoid search explosion, and helped quite a bit in this position, and in finding threats in general. I'm sure that other people do similar things with evaluation and search extensions/reductions. I haven't seen it in the strong open source engines, but they still manage to beat up on me anyway

.

mcostalba · Post by **mcostalba** » Tue Sep 15, 2009 12:54 pm

gladius wrote: This seems to avoid search explosion, and helped quite a bit in this position

I would think it can be very dangerous to tweak the engine to make it succed in a given position.

Almost all engine tweaks works in a statistical fashion: sometime they help, sometime they don't, sometime they even weak the engine.

A feature or a tweak is considered good when it helps more then it hurts in a big bunch of real games, not when it never hurts or when it helps only on specific cases.

As example also null search, considered good by almost everybody, in some specific positions could hurt, I am not talking of zugzwang here, just position where null search fails low so we need to carry on the search anyway and null search turned out (almost) a waste of time (there are a lot of these positions). But it also happens that in the lot of positions when null search fails high we save a lot of work, so at the end the balance is positive. That's why it works.

The real games result is the metric used by CCRL, CEGT or also by tournments, so this is the metric I try to stick with, I am not interested in other kind of metrics because are not used by publicly evaluating the strength of an engine.

BubbaTough · Post by **BubbaTough** » Tue Sep 15, 2009 2:33 pm

Great idea Gary! I am much in favor of exploring ways to use chess knowledge to guide the search. As Marco implied, the real test is to see if it helps your engine win games. If you run your engine against a bunch of opponents for a bunch of games with and without that change, I would be interested to hear the results.

-Sam

Zach Wegner · Post by **Zach Wegner** » Tue Sep 15, 2009 7:43 pm

Not a bad idea. It's somewhat similar to some ideas I've played with of creating bitboards of "critical squares" in the eval, usually near the king or passed pawns. Moves to those squares are ordered higher and sometimes not reduced. I really can't remember any test results from this though.

I do agree with Sam that the evaluation can return a lot more information than a score, and can (or at least should) be very useful in guiding the search.

gladius · Post by **gladius** » Tue Sep 15, 2009 8:20 pm

mcostalba wrote:I would think it can be very dangerous to tweak the engine to make it succed in a given position.

Yes, it's certainly best to measure success against a bunch of other engines in actual games. But for trying ideas out really quickly, it's nice to have a few test positions to see the actual behavior of the search. If it works well there, then it moves onto game testing for me

.

In this case, it seemed to be ~5 elo improvement in game testing. Of course, no 40k games measurement here (I usually use 512 games), so it's not guaranteed. I have run multiple test runs since I added this change though, and it seems to hold up so far though.

gladius · Post by **gladius** » Wed Sep 16, 2009 2:59 am

Just for fun, I tried a form of this idea out in Stockfish as well, to see how it would do in a strong program (mine isn't quite there). It appears that in Stockfish, it acts a bit like a higher contempt setting, scoring more wins against weaker opposition, but not doing quite as well against stronger opponents.

Here is the code, modifications to qsearch(). It's a bit hacky in that it doesn't do anything if the eval. is cached, but oh well

.

Code: Select all

    Depth checkDepth = depth;

    if &#40;isCheck&#41;
        staticValue = -VALUE_INFINITE;

    else if &#40;tte && tte->type&#40;) == VALUE_TYPE_EVAL&#41;
    &#123;
        // Use the cached evaluation score if possible
        assert&#40;tte->value&#40;) == evaluate&#40;pos, ei, threadID&#41;);
        assert&#40;ei.futilityMargin == Value&#40;0&#41;);

        staticValue = tte->value&#40;);
    &#125;
    else
    &#123;
        staticValue = evaluate&#40;pos, ei, threadID&#41;;

        // If the king is threatened by two or more pieces, one of which
        // is a queen or rook, generate checking moves in second ply of
        // q-search.
        if &#40;ei.kingAttackersCount&#91;pos.side_to_move&#40;)&#93; >= 2 && 
            ei.kingAttackersWeight&#91;pos.side_to_move&#40;)&#93; >= 5 &&
            depth >= -OnePly&#41;
            checkDepth = Depth&#40;0&#41;;
     &#125;

     // <snip>

     MovePicker mp = MovePicker&#40;pos, ttMove, checkDepth, H&#41;;

Here are the results (of course, take with a grain of salt - small # of games, and still well within error bounds). Anyway, it looks like it might be interesting to experiment with.

1 stockfish-extend.exe ELO:260 512.0 (424.5 : 87.5)
1 stockfish.exe ELO:251 512.0 (410.0 : 102.0)

mcostalba · Post by **mcostalba** » Wed Sep 16, 2009 6:52 am

gladius wrote: Here is the code, modifications to qsearch(). It's a bit hacky in that it doesn't do anything if the eval. is cached, but oh well .

Wow, thanks a lot !

Regarding the cached eval is not a problem because, as you can see from the assert, we store the eval only if ei.futilityMargin == Value(0) and this it means king is not threathened, if you see in evaluate_king() in evaluation.cpp:

Code: Select all

     // Finally, extract the king safety score from the SafetyTable&#91;&#93; array.
      // Add the score to the evaluation, and also to ei.futilityMargin.  The
      // reason for adding the king safety score to the futility margin is
      // that the king safety scores can sometimes be very big, and that
      // capturing a single attacking piece can therefore result in a score
      // change far bigger than the value of the captured piece.
      Value v = apply_weight&#40;SafetyTable&#91;attackUnits&#93;, WeightKingSafety&#91;us&#93;);

      ei.mgValue -= sign * v;

      if &#40;us == p.side_to_move&#40;))
          ei.futilityMargin += v;

so ei.futilityMargin value could be use as a trigger instead of your home grown condition, should be more or less equivalent.

I will properly test your patch and in case I will merge in development version.

Thanks again Gary

gladius · Post by **gladius** » Wed Sep 16, 2009 7:06 am

mcostalba wrote: Regarding the cached eval is not a problem because, as you can see from the assert, we store the eval only if ei.futilityMargin == Value(0) and this it means king is not threathened, if you see in evaluate_king() in evaluation.cpp:

No problem on the little patch, hopefully it turns out to be good, or gives you ideas to play with

.

As for the ei.futilityMargin, unless I'm missing something, that's going to always be zero? This is a direct copy/paste from qsearch:

Code: Select all

    ei.futilityMargin = Value&#40;0&#41;; // Manually initialize futilityMargin

    if &#40;isCheck&#41;
        staticValue = -VALUE_INFINITE;

    else if &#40;tte && tte->type&#40;) == VALUE_TYPE_EVAL&#41;
    &#123;
        // Use the cached evaluation score if possible
        assert&#40;tte->value&#40;) == evaluate&#40;pos, ei, threadID&#41;);
        assert&#40;ei.futilityMargin == Value&#40;0&#41;);

So, if tte != null, and it's of VALUE_TYPE_EVAL, ei.futilityMargin will be zero.

mcostalba · Post by **mcostalba** » Wed Sep 16, 2009 7:14 am

gladius wrote:
mcostalba wrote: So, if tte != null, and it's of VALUE_TYPE_EVAL, ei.futilityMargin will be zero.
Yes it is.

This is due to a limitation in available fields in the TT entry, we have only one field for storing the normal TT value. Sometime we use it also for the evaluation score instead of the TT value, but because there is only one field we have no space to store _also_ the ei.futilityMargin

Because ei.futilityMargin is almost always at 0 we come up with the hack to store evaluation _only_ when ei.futilityMargin == 0 so that we don't need an extra field in the TT entry and in any case we cover the common case because ei.futilityMargin == 0 almost always.

When ei.futilityMargin != 0 we don't cache the evaluation score, that's the reason it _seems_ that is always at zero.

gladius · Post by **gladius** » Wed Sep 16, 2009 7:38 am

mcostalba wrote:Because ei.futilityMargin is almost always at 0 we come up with the hack to store evaluation _only_ when ei.futilityMargin == 0 so that we don't need an extra field in the TT entry and in any case we cover the common case because ei.futilityMargin == 0 almost always.

When ei.futilityMargin != 0 we don't cache the evaluation score, that's the reason it _seems_ that is always at zero.

Ah, the joy of chess engine programming. Cool stuff, thanks for the explanation. Anyway, great work on Stockfish! It has not much more code than Garbochess and is way, way stronger.

I need to be more disciplined with my testing, but I have too much fun trying random things that usually hurt

.

Threat information from evaluation to inform q-search

Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search

Re: Threat information from evaluation to inform q-search