SF search inconsistency

Ralph Stoesser · Post by **Ralph Stoesser** » Thu May 20, 2010 9:53 pm

In this position (white wins)
[d]8/6QP/8/8/5k2/8/8/K2q4 w - - 0 1
SF default shows drawish score at depth 9 and 12.

If I disable pruning (always return false in ok_to_prune()), I see a 0.00 score at depths 25, 26 and 27. If I additionally disable razoring, I see a 0.00 score at depths 18 and 19.

But if I only remove this code in qsearch()
search.cpp, line 1657

Code:

    if (bestValue >= beta)
    {
        // Store the score to avoid a future costly evaluation() call
        if (!isCheck && !tte && ei.futilityMargin[pos.side_to_move()] == 0)
            TT.store(pos.get_key(), value_to_tt(bestValue, ply), VALUE_TYPE_EV_LO, Depth(-127*OnePly), MOVE_NONE);

        return bestValue;
    }

With this change I no longer see wrong drawish scores in this position. Removing this part also doesn't seem to make SF play weaker; if anything it is the other way round (+315, -296, =1038 at 1 min/game, in favor of the change).

I don't have much time at the moment to analyse and debug SF, so I am just reporting what I've found so far.

Any thoughts about this issue?

Eelco de Groot · Post by **Eelco de Groot** » Fri May 21, 2010 1:07 am

Ralph Stoesser wrote:In this position (white wins)
[d]8/6QP/8/8/5k2/8/8/K2q4 w - - 0 1
SF default shows drawish score at depth 9 and 12.

If I disable pruning (always return false in ok_to_prune()), I see a 0.00 score at depths 25, 26 and 27. If I additionally disable razoring, I see a 0.00 score at depths 18 and 19.

But if I only remove this code in qsearch()
search.cpp, line 1657
Code:
    if (bestValue >= beta)
    {
        // Store the score to avoid a future costly evaluation() call
        if (!isCheck && !tte && ei.futilityMargin[pos.side_to_move()] == 0)
            TT.store(pos.get_key(), value_to_tt(bestValue, ply), VALUE_TYPE_EV_LO, Depth(-127*OnePly), MOVE_NONE);

        return bestValue;
    }
With this change I no longer see wrong drawish scores in this position. Removing this part also doesn't seem to make SF play weaker; if anything it is the other way round (+315, -296, =1038 at 1 min/game, in favor of the change).

I don't have much time at the moment to analyse and debug SF, so I am just reporting what I've found so far.

Any thoughts about this issue?

The position seems too difficult without tablebases to base any changes in qsearch on. After 1. Kb2 it is mate in 44 moves, but the other possible move, 1. Ka2, is only a draw. So it is a very narrow path to victory, and 44 moves is of course beyond the search horizon.

I would look at the repetition code first rather than at the Stand Pat rule, because Stand Pat is rather basic to qsearch. Qsearch is also a basic part of razoring, so the problem could be somewhere there, or in the static evaluation.

I don't think this match result is significant. The result without the code is not worse, but that does not tell you whether the draw scores are really a problem or whether they are directly related to this piece of code. I think it is too easy to just drop search code like the Stand Pat rule based on one position. I do think it is worthwhile code; it may just not be working here in conjunction with other pieces of search or eval.

I see one 0.00 score at the moment:

[d]8/6QP/8/8/5k2/8/8/K2q4 w - -

Engine: Rainbow Serpent 1.7.1s(dc) Build 052 (Athlon 2009 MHz, 256 MB)
by Tord Romstad, Marco Costalba, Joona Kiiski

1.00 0:00 +3.22 1.Kb2 (55) 0

2.00 0:00 +3.23 1.Kb2 Qd2+ 2.Kb3 (207) 0

3.00 0:00 +3.43 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd2+
4.Kc5 Qa5+ 5.Kd6 (19.265) 68

4.00 0:00 +3.87 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kc4 Qe6+ 5.Kc5 (25.593) 86

5.00 0:00 +3.71 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kc4 Qc6+ 5.Kd4 Qd6+ 6.Kc3 Qc6+
7.Kd4 (32.183) 103

6.00 0:00 +3.55 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Kb7 Qd5+
7.Kc7 (58.879) 164

7.00 0:00 +3.67 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Kb7 Qd5+
7.Kc7 Qc5+ 8.Kd8 (88.146) 226

8.00 0:00 +4.00 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Kb7 Qd5+
7.Kc7 Qc5+ 8.Kd8 Qd5+ 9.Ke7 Qc5+
10.Ke6 Qf5+ 11.Kd6 Qd3+ 12.Ke7 Qe4+
13.Kd6 Qb4+ 14.Ke6 (134.779) 297

9.00 0:00 +3.23-- 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Kb7 Qd5+
7.Kc7 Qc5+ 8.Kd8 Qd5+ 9.Ke7 Qb7+
10.Kf6 Qa6+ 11.Kf7 Qc4+ 12.Kf6 Qd4+
13.Kg6 Qe4+ 14.Kf7 (185.106) 338

9.00 0:00 +3.03 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Kb7 Qd5+
7.Kc7 Qc5+ 8.Kd8 Qd5+ 9.Qd7 Qe5
10.Qe8 (245.114) 402

10.00 0:00 +3.11 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Ka6 Qc6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qf5+ 9.Kd8 Qd5+
10.Qd7 Qa8+ 11.Ke7 Qh8 12.Ke6 Kg5 (436.765) 508

11.00 0:01 +3.11 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Ka6 Qc6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qf5+ 9.Kd8 Qd5+
10.Qd7 Qa8+ 11.Ke7 Qh8 12.Ke6 Kg5 (592.730) 558

12.01 0:01 +3.19++ 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Ka6 Qc6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qe6+
10.Kd8 Qf6+ 11.Qe7 Qh8+ 12.Kd7 Kf5
13.Qf7+ Kg5 14.Qg8+ (900.368) 626

12.01 0:02 +3.27++ 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Ka6 Qd3+ 6.Kb6 Qb3+
7.Kc7 Qc4+ 8.Kd8 Qd5+ 9.Qd7 Qa8+
10.Ke7 Qh8 11.Qd2+ Ke4 12.Qc2+ Kd5
13.Kf7 (1.505.193) 678

12.01 0:03 +3.43++ 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd3+ 5.Kc5 Qc2+ 6.Kb4 Qb1+
7.Ka5 Qa2+ 8.Kb5 Qb3+ 9.Kc6 Qc2+
10.Kb7 Qe4+ 11.Kb8 Qb4+ 12.Kc8 (2.430.663) 717

12.01 0:04 +3.27-- 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd3+ 5.Kc5 Qa3+ 6.Kb6 Qb4+
7.Kc7 Qa5+ 8.Kb7 Qb5+ 9.Kc8 Qa6+
10.Qb7 Qf6 11.Qc7+ Kf3 12.Qd7 Qh8+
13.Kc7 Kf4 14.Qf7+ (2.920.767) 730

13.01 0:04 +3.10 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qb8+ 5.Kc6 Qa8+ 6.Kb6 Qb8+
7.Kc5 Qc8+ 8.Kb4 Qb8+ 9.Kc3 Qc8+
10.Kb2 Qb8+ 11.Kc2 Qc8+ 12.Qc3 Qf5+
13.Qd3 Qc5+ 14.Kb3 (3.596.877) 745

14.01 0:06 +2.57-- 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd3+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Qb7 Qe8+ 9.Ka7 Qa4+
10.Kb6 Qb4+ 11.Kc7 Qc3+ 12.Kd6 Qe5+
13.Kd7 Qg7+ 14.Kc6 (4.677.702) 759

14.01 0:06 +2.05-- 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Qb7 Qe8+ 9.Ka7 Qa4+
10.Kb6 Qb4+ 11.Kc7 Qc3+ 12.Kd6 Qe5+
13.Kd7 Qg7+ 14.Kc6 (4.851.001) 766

14.01 0:06 +1.00-- 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Qb7 Qe8+ 9.Qc8 Qb5+
10.Kc7 Qe5+ 11.Kb6 Qb2+ 12.Ka5 Qe5+
13.Kb4 Qe7+ (5.045.159) 772

14.01 0:07 0.00 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Qb7 Qe8+ 9.Qc8 Qb5+
10.Kc7 Qe5+ 11.Kb6 Qb2+ 12.Ka5 Qd2+
13.Kb6 Qb2+ (5.874.117) 789

15.01 0:11 +3.51 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qb8+ 5.Kc6 Qa8+ 6.Qb7 Qf8
7.Qc7+ Kf5 8.Qd6 Qa8+ 9.Kb6 Qh8
10.Qd5+ Kf4 11.Qf7+ Ke4 12.Qg8 Qb2+
13.Kc7 Qe5+ 14.Kd8 (9.553.927) 812

16.01 0:21 +3.39 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Kd7 Qg7+ 12.Kc6 Qg6+
13.Kb7 Qb1+ 14.Kc8 (17.657.796) 834

17.01 0:30 +3.23 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Qf8+ 12.Kc7 Qe7+
13.Kc6 Qe8+ 14.Kb6 (25.530.821) 851

18.01 0:43 +3.67++ 1.Kb2 Qd2+ 2.Kb3 Qd3+ 3.Kb4 Qd6+
4.Kb5 Qd5+ 5.Kb6 Qd6+ 6.Ka7 Qc5+
7.Kb8 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Qf8+ 12.Kc7 Qe7+
13.Kc6 Qe6+ 14.Kb7 (37.432.088) 855

18.01 1:08 +4.12++ 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Ka4 Qc6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qe4+ 8.Kb8 Qb4+ 9.Kc8 (59.032.664) 861

18.01 1:34 +5.01++ 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Ka4 Qc6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qe4+ 8.Kb8 Qb4+ 9.Kc8 (82.070.367) 866

18.01 1:41 +5.01 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Ka4 Qc6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Kg3 12.Qd3+ Kg2
13.Qe2+ Kg3 14.Qe1+ (88.284.648) 870

19.01 2:31 +5.57 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Ka4 Qc6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Ke3 12.Kd7 Qf7+
13.Kc6 Qe8+ 14.Kd5 (132.267.162) 872

20.01 3:03 +5.37 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Ka4 Qc6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Ke3 12.Kd7 Qf7+
13.Kd8 Qf8+ 14.Kc7 (161.753.573) 881

21.01 4:42 +5.41 1.Kb2 Qd2+ 2.Kb3 Qd5+ 3.Kb4 Qd6+
4.Ka5 Qc5+ 5.Ka6 Qc6+ 6.Ka7 Qa4+
7.Kb7 Qb5+ 8.Kc8 Qa6+ 9.Qb7 Qf6
10.Qc7+ Kf3 11.Qc2 Ke3 12.Kd7 Qf7+
13.Kd8 Qf8+ 14.Kc7 (256.049.611) 907

best move: Ka1-b2 time: 4:47.391 min n/s: 906.712 nodes: 260.580.167

Eelco

Ralph Stoesser · Post by **Ralph Stoesser** » Fri May 21, 2010 1:53 am

The repetition code seems OK to me; it was the first piece of code I checked in this regard. Then I considered late move pruning and razoring as the reason for the wrong draw scores, but they are not the cause.

The stand pat rule in qsearch() seems a bit fishy to me in this kind of position because, as far as I understand, it would return many wrong draw scores coming from static eval in tactical positions where a regular quiescence search would easily detect the right non-draw score.

Eelco de Groot · Post by **Eelco de Groot** » Fri May 21, 2010 3:41 am

Ralph Stoesser wrote: The repetition code seems OK to me; it was the first piece of code I checked in this regard. Then I considered late move pruning and razoring as the reason for the wrong draw scores, but they are not the cause.

The stand pat rule in qsearch() seems a bit fishy to me in this kind of position because, as far as I understand, it would return many wrong draw scores coming from static eval in tactical positions where a regular quiescence search would easily detect the right non-draw score.

The idea behind Stand Pat is something like: "I have the move and the static evaluation is already above beta. If I can't improve on this with any move I can make, tactical or otherwise, I must be in Zugzwang. Barring that, there is no need to search further if static eval > beta is correct. In qsearch I can only look at tactical moves anyway, so I have no sure way to detect Zugzwang. Some of my pieces may still be en prise, however." Threat information helps here, and I am sure you can now see something missing in the code you have examined yourself: ei.futilityMargin only contains King Safety; the threat information is not added to the futility margin. It is added in Rainbow Serpent, and I think it makes some difference.
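To make the rule concrete, here is a minimal sketch of a fail-soft negamax quiescence search with stand pat. It is not the Stockfish code: the `Node` type, with a precomputed static score and a list of capture successors, is a hypothetical stand-in for a real position and move generator.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical toy node: a static score from the side to move's point of
// view, plus the positions reachable by "tactical" moves (captures).
struct Node {
    int staticEval;
    std::vector<Node> captures;
};

// Fail-soft negamax quiescence search with stand pat.
int qsearch(const Node& node, int alpha, int beta) {
    // Stand pat: the side to move may decline every capture, so the static
    // evaluation is taken as a lower bound on the node's value.
    int bestValue = node.staticEval;
    if (bestValue >= beta)
        return bestValue;              // cutoff without searching any move
    alpha = std::max(alpha, bestValue);

    // Otherwise try the tactical moves and keep the best result.
    for (const Node& child : node.captures) {
        int value = -qsearch(child, -beta, -alpha);
        if (value > bestValue) {
            bestValue = value;
            if (value >= beta)
                break;                 // beta cutoff
            alpha = std::max(alpha, value);
        }
    }
    return bestValue;
}
```

Note that the cutoff trusts the static evaluation completely: if that number is wrong for the position, stand pat returns it without any search to correct it.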

Regards, Eelco

mcostalba · Post by **mcostalba** » Fri May 21, 2010 1:08 pm

Ralph Stoesser wrote: With this change I no longer see wrong drawish scores in this position. Removing this part also doesn't seem to make SF play weaker; if anything it is the other way round (+315, -296, =1038 at 1 min/game, in favor of the change).

This is an unexpected result, and if I find some testing time it would be interesting to try to verify it... although it is not top priority.

mcostalba · Post by **mcostalba** » Fri May 21, 2010 1:18 pm

Eelco de Groot wrote: Threat information helps here, and I am sure you can now see something missing in the code you have examined yourself: ei.futilityMargin only contains King Safety; the threat information is not added to the futility margin. It is added in Rainbow Serpent, and I think it makes some difference.

Hi Eelco,

I mostly agree with the stand pat rationale you laid out. But in the evaluation used to "stand pat", threats are _already_ accounted for.

Instead you suggest adding them to ei.futilityMargin as well, which is _not_ used by stand pat, so it is off-topic and has nothing to do with what we are discussing now. Or am I missing something?

Regarding your idea of adding threats to the futility margin, it would be important to first understand where this parameter is used. It is used in qsearch, specifically in this line:

Code:

futilityBase = staticValue + FutilityMarginQS + ei.futilityMargin[pos.side_to_move()];

From this we can see that, because staticValue is the position evaluation and already includes the threats, adding the threat score to ei.futilityMargin as well ends up counting it twice in the final futilityBase value.

I don't know whether your idea works or not, but it is certainly not intuitive why it should. IMHO it should not, and if it does, it means threat scores are much underestimated, so your patch would be a (wrong) fix for an unrelated issue.
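The double counting can be shown with a small arithmetic sketch. Everything here is an assumption made for illustration: the `EvalTerms` split, the function names, and the numbers are hypothetical, and the `FutilityMarginQS` value is a placeholder rather than the constant used in the engine.

```cpp
// All values are hypothetical centipawn numbers chosen for illustration.
struct EvalTerms {
    int other;       // everything in the static evaluation except threats
    int threats;     // threat score, already part of the static evaluation
    int kingSafety;  // what ei.futilityMargin is said to contain today
};

const int FutilityMarginQS = 128;  // placeholder value, an assumption

// The static evaluation already includes the threat score.
int static_value(const EvalTerms& e) {
    return e.other + e.threats;
}

// futilityBase as in the quoted line: the margin holds king safety only.
int futility_base_current(const EvalTerms& e) {
    return static_value(e) + FutilityMarginQS + e.kingSafety;
}

// futilityBase if threats were also added to the futility margin:
// the threat score now enters the sum twice.
int futility_base_with_threats(const EvalTerms& e) {
    return static_value(e) + FutilityMarginQS + (e.kingSafety + e.threats);
}
```

With `other = 100`, `threats = 30` and `kingSafety = 20`, the first function gives 278 and the second 308; the difference is exactly the threat score counted a second time.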

Marco

zamar · Post by **zamar** » Fri May 21, 2010 2:39 pm

Ralph Stoesser wrote: Any thoughts about this issue?

It's a known fact that the transposition table does not work fully correctly when it comes to repetition draws. So it's not surprising that making less use of the TT could solve the problem for this specific position. How this affects other positions is of course a different issue...

Ralph Stoesser · Post by **Ralph Stoesser** » Sat May 22, 2010 10:24 am

zamar wrote:
Ralph Stoesser wrote: Any thoughts about this issue?
It's a known fact that the transposition table does not work fully correctly when it comes to repetition draws. So it's not surprising that making less use of the TT could solve the problem for this specific position. How this affects other positions is of course a different issue...

I don't understand what you mean. SF detects repetition by comparing position hash keys from the history stack of the current search line. Positions detected as draws by repetition will not end up in the TT. It would be an error to save positions detected as draws by repetition in the TT, because the draw score is not related to the position itself but only to the current search line.
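A repetition check of the kind described above can be sketched as follows. This is a simplified illustration, not the SF implementation: the function name, the `keys` vector standing in for the history stack, and the `reversiblePlies` counter are all assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// keys holds one hash key per position on the current line, oldest first;
// keys.back() is the position being tested. reversiblePlies is the number
// of plies since the last irreversible move (capture or pawn move), since
// no repetition can reach back past such a move.
bool is_repetition(const std::vector<std::uint64_t>& keys,
                   std::size_t reversiblePlies) {
    if (keys.empty())
        return false;
    const std::size_t last = keys.size() - 1;
    const std::size_t limit = std::min(reversiblePlies, last);
    // Only positions with the same side to move can match, so step back two
    // plies at a time; the earliest possible repetition is four plies back.
    for (std::size_t back = 4; back <= limit; back += 2)
        if (keys[last - back] == keys[last])
            return true;
    return false;
}
```

The result depends on the whole `keys` vector and not just on the last key, which is why a draw found this way belongs to the search line rather than to the position.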

@Marco: After 3000 games the score is +555, -568, =1877, so SF default has taken the lead. I'll keep testing until 5000 games, because I have nothing else to test at the moment.
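For a rough sense of what these match scores mean, the sketch below converts a win/loss/draw record into a score fraction, an Elo difference under the usual logistic model, and a standard error of the score. The struct and function names are made up for illustration, and the error estimate assumes the games are independent.

```cpp
#include <cmath>

// Record from the point of view of the modified version.
struct MatchResult {
    int wins;
    int losses;
    int draws;
};

// Fraction of points scored, counting a draw as half a point.
double score_fraction(const MatchResult& r) {
    const double games = r.wins + r.losses + r.draws;
    return (r.wins + 0.5 * r.draws) / games;
}

// Elo difference implied by a score fraction (logistic model).
double elo_difference(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Standard error of the score fraction, treating games as independent.
double score_std_error(const MatchResult& r) {
    const double games = r.wins + r.losses + r.draws;
    const double s = score_fraction(r);
    const double variance = (r.wins * (1.0 - s) * (1.0 - s)
                             + r.losses * s * s
                             + r.draws * (0.5 - s) * (0.5 - s)) / games;
    return std::sqrt(variance / games);
}
```

By this estimate the first result (+315, -296, =1038) is roughly +4 Elo for the change and the 3000-game result (+555, -568, =1877) roughly -1.5 Elo. The standard error of the first run is about 0.75 percentage points of score, or around 5 Elo, so both results are within noise of zero.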

zamar · Post by **zamar** » Sat May 22, 2010 11:29 am

Ralph Stoesser wrote: I don't understand what you mean. SF detects repetition by comparing position hash keys from the history stack of the current search line. Positions detected as draws by repetition will not end up in the TT. It would be an error to save positions detected as draws by repetition in the TT, because the draw score is not related to the position itself but only to the current search line.

Draw scores detected in leaf nodes affect the scores of parent nodes, grandparent nodes, etc., and those get saved in the TT. Those nodes then get probed in some completely different line and provide false information. That is the problem, but there is no sane way to fix it without radically affecting the performance of the program.
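The effect can be reproduced on a toy graph. The sketch below is not engine code: the six-node graph, its leaf values, and the function names are invented for illustration. It uses plain negamax, scores any node already on the current path as a draw, never stores that draw itself, and stores every interior node's value in a table keyed by node only.

```cpp
#include <algorithm>
#include <map>
#include <vector>

// Toy game graph (hypothetical). Nodes: 0=R, 1=A, 2=B, 3=C, 4=D, 5=E.
// Moves: R->A, R->E, A->B, A->D, B->A, B->C, E->B. C and D are leaves.
struct Graph {
    std::vector<std::vector<int>> children;
    std::vector<int> leafValue;  // score for the side to move at a leaf
};

Graph toy_graph() {
    return Graph{{{1, 5}, {2, 4}, {1, 3}, {}, {}, {2}},
                 {0, 0, 0, 3, -4, 0}};
}

// Negamax with repetition detection on the current path. If tt is not
// null, interior node values are cached by node, ignoring the path.
int search(const Graph& g, int node, std::vector<int>& path,
           std::map<int, int>* tt) {
    if (std::find(path.begin(), path.end(), node) != path.end())
        return 0;  // repetition on this line: draw, never stored itself
    if (tt) {
        auto hit = tt->find(node);
        if (hit != tt->end())
            return hit->second;  // value may come from a different path
    }
    if (g.children[node].empty())
        return g.leafValue[node];

    path.push_back(node);
    int best = -1000000;
    for (int child : g.children[node])
        best = std::max(best, -search(g, child, path, tt));
    path.pop_back();

    if (tt)
        (*tt)[node] = best;  // parent of a repetition draw gets stored
    return best;
}

int root_value(bool useTT) {
    Graph g = toy_graph();
    std::vector<int> path;
    std::map<int, int> tt;
    return search(g, 0, path, useTT ? &tt : nullptr);
}
```

Here `root_value(false)` gives -3, while `root_value(true)` gives 0. Node B is first searched on the line R, A, B, where moving back to A is a repetition, so B is stored as 0. When B is reached again via E, where A is not on the path, the stored 0 is reused and the root reports a draw that is not there.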
