stockfish fail high fail low

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
zamar
Posts: 613
Joined: Sun Jan 18, 2009 6:03 am

Re: stockfish fail high fail low

Post by zamar » Sat Apr 24, 2010 11:15 am

The crucial difference between search_pv() and root_search()
is that when in root_search() value >= beta, this node is going to be part of the new PV.
But when in search_pv() value >= beta, this means that in parent node value <= alpha which means fail low and trying of another move, so the node won't be part of the new PV.

In some my very early tests (around year ago) I did test removing the "value < beta" condition from search_pv() and it did slightly weaker. However there has been drastic restructuring of the search tree since then, so you never know. However as explained above, I find it unlikely that this change could be useful.
Joona Kiiski

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Sat Apr 24, 2010 11:23 am

I think I will not test the short code idea in the near future.
That is because I finally used the "look and see" approach and I have find out that it "pv verification" still does not solve all the bad behaviour, not even in the testopsition posted at the start of the this thread.
I am not sure why. I have to think about it more deeply and meanwhile test other ideas that might accidentally help.

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 7:17 pm

Re: stockfish fail high fail low

Post by mcostalba » Sat Apr 24, 2010 11:32 am

QED wrote:I think I will not test the short code idea in the near future.
That is because I finally used the "look and see" approach and I have find out that it "pv verification" still does not solve all the bad behaviour, not even in the testopsition posted at the start of the this thread.
I am not sure why. I have to think about it more deeply and meanwhile test other ideas that might accidentally help.
Ok, perhaps I will give it a try. I have read Joona's post and he is very possibly right that the patch doesn't work....but you never know ;-)

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Sat Apr 24, 2010 12:16 pm

Joona Kiiski wrote:But when in search_pv() value >= beta, this means that in parent node value <= alpha which means fail low and trying of another move, so the node won't be part of the new PV.
The patch is not useful if exact value >= beta. In that case it only slows search down a bit.
It is useful when search_pv() would return value <= alpha, but search() incorrectly returns value >= beta (because it does not see 50-moves rule or something like that), which avoids search_pv() call.

Without patch, we could, in parent node, falsely report fail low instead of incoming fail high. That could result in playing suboptimal moves.

Whether avoiding this bad decisions (if the patch really avoids them) is worth the small slowdown of search is not known, so it should be tested.

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 7:17 pm

Re: stockfish fail high fail low

Post by mcostalba » Sat Apr 24, 2010 2:32 pm

QED wrote:
Joona Kiiski wrote:But when in search_pv() value >= beta, this means that in parent node value <= alpha which means fail low and trying of another move, so the node won't be part of the new PV.
The patch is not useful if exact value >= beta. In that case it only slows search down a bit.
It is useful when search_pv() would return value <= alpha, but search() incorrectly returns value >= beta (because it does not see 50-moves rule or something like that), which avoids search_pv() call.

Without patch, we could, in parent node, falsely report fail low instead of incoming fail high. That could result in playing suboptimal moves.

Whether avoiding this bad decisions (if the patch really avoids them) is worth the small slowdown of search is not known, so it should be tested.
Yes, in search_pv() pruning and LMR is much less aggressive and it can happen that a node that fails high in search() gives instead a value < beta when searched with search_pv() .

Just to give some numbers (always needed when talking about this things), the following measuring with dbg_hit_on_c() shows an hit rate of 27% on a set of testing positions.

Code: Select all

if &#40;doFullDepthSearch&#41;
&#123;
    // Full depth non-pv search using alpha as upperbound
    ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
    value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;

    bool fail_high = value >= beta;

    // If non-pv search fails high then research at same depth but as
    // PV to get a correct score or eventually a fail high above beta.
    if &#40;value > alpha&#41;
        value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;

    dbg_hit_on_c&#40;fail_high, value < beta&#41;; // hits on a given condition
&#125;

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Sun Apr 25, 2010 1:26 pm

Another patch. Stabilises, but hurts performance.

So, I have thought about it a little deeper. I see two main contributions to fail high/low instability. One of them is aspiration re-search, when we visit the same node with the same depth, but in the meantime we have added more data to transposition table. This generally produces better moves, so it is good.

The other contribution is from situations where search() returns values that significantly differ from what search_pv() would return. This can make search_pv() blind to some moves, only do discover them suddenly when aspiration window gets wide enough (if ever). This could make stockfish to waste time (or play suboptimal moves), so it is bad. But if we want to benefit from TT and aggressive non-pv pruning, we could not avoid this situations completely.

Is I understand, the original stockfish 1.7.1 assures that when search_pv() returns value between alpha and beta, it would be based on qsearch() value at the end of pv, and not just on some TT hit. But values outside (alpha, beta) interval could be just TT hits.

So I tried to make a patch that would assure that all return values of search_pv() are from pv end qsearch. It all happens in steps 14 and 15 of search_pv(). Dropping value<beta from pv search condition takes care of fail high situations, fail low situations need another code:

Code: Select all

        // Step 14. Reduced search
        // if the move fails high will be re-searched at full depth.
        bool doFullDepthSearch = true;

        if (    depth >= 3 * OnePly
            && !dangerous
            && !captureOrPromotion
            && !move_is_castle&#40;move&#41;
            && !move_is_killer&#40;move, ss&#91;ply&#93;))
        &#123;
            ss&#91;ply&#93;.reduction = pv_reduction&#40;depth, moveCount&#41;;
            if &#40;ss&#91;ply&#93;.reduction&#41;
            &#123;
                // We are trying to improve bestValue, independently of alpha.
                value = -search&#40;pos, ss, -bestValue, newDepth-ss&#91;ply&#93;.reduction, ply+1, true, threadID&#41;;
                doFullDepthSearch = &#40;value > bestValue&#41;;
            &#125;
        &#125;

        // Step 15. Full depth search
        if &#40;doFullDepthSearch&#41;
        &#123;
            ss&#91;ply&#93;.reduction = Depth&#40;0&#41;;
            value = -search&#40;pos, ss, -bestValue, newDepth, ply+1, true, threadID&#41;;

            // Step extra. pv search &#40;only in PV nodes&#41;
            if &#40;value > bestValue&#41;
                value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
        &#125;
Basically, this code uses bestValue instead of alpha for decision whether to search at full depth, so it might introduce a big slowdown, especially in the positions like the tesposition at the start of this thread.

Anywyay, I was going to sleep, so I started 1000 games match (1 thread, 32MB hash, 30 seconds/game) between this patch and the short patch. The short patch was only 18 elo stronger, with likelihood of superiority 98 (against 1).

Maybe now I can finally try to implement something MTD-like again, using search_pv() with zero width aspiration window.

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 7:17 pm

Re: stockfish fail high fail low

Post by mcostalba » Sun Apr 25, 2010 3:33 pm

QED wrote:The short patch was only 18 elo stronger, with likelihood of superiority 98 (against 1).
Well, 18 ELO points are not peanuts ;-)

...and the "short patch" seems to be weaker then current released version though test is still running, around 700 games played, I'll wait until this evening before to stop the match.

So it seems we are not lucky this time :-(

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Tue Jun 15, 2010 8:24 pm

I got another idea of possible source of search instability. I was guided by the idea of always returning value based on pv_search() or qsearch(). But what should we do if we are failing low and we have discarded some moves based on reduced search()? Conservative approach is to do fail hard. I do not understand all of consequences, but the patch (single thread only) is simple:

Code: Select all

diff -dur src-Ch/search.cpp src-Pv4Ch/search.cpp
--- src-Ch/search.cpp   2010-04-20 00&#58;45&#58;49.000000000 +0200
+++ src-Pv4Ch/search.cpp        2010-06-15 21&#58;32&#58;00.000000000 +0200
@@ -1180,8 +1180,14 @@
             value = -search&#40;pos, ss, -alpha, newDepth, ply+1, true, threadID&#41;;
 
             // Step extra. pv search &#40;only in PV nodes&#41;
-            if &#40;value > alpha && value < beta&#41;
+            if &#40;value > alpha&#41;
                 value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
+            else
+                value = alpha;
+        &#125;
+        else
+        &#123;
+            value = alpha;
         &#125;
       &#125;
For testing, I said to myself that using "-concurrency 2" increases load so the test is less exact, but doubling number of games should more than compensate for it. From the point of original version, the match ended +378 =1135 -487 (or ++24 +=91 +-83 =+101 ==334 =-148 -+55 -=127 --37 for minimatches). Bayeselo says it is +16(+-12) elo for the patched version, with likelihood of superiority of 99% (against 0%). Conditions as usual: Ponder=false, Threads=1, Hash=32, stockfish book, -bookdepth 40, -repeat, but now -games 2000.

I am not sure how to change sp_search_pv() to work accordingly, so now I am going to test with "&& value < beta)" back in the pv search condition to see what happens.

EDIT: tc=/1:00

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Wed Jun 16, 2010 11:37 pm

Something is wrong. The new patch was

Code: Select all

vrato@notas&#58;~/Prg/stockfish-171$ diff -dur src-Ch src-Pv5Ch 
diff -dur src-Ch/search.cpp src-Pv5Ch/search.cpp
--- src-Ch/search.cpp   2010-04-20 00&#58;45&#58;49.000000000 +0200
+++ src-Pv5Ch/search.cpp        2010-06-15 22&#58;33&#58;24.000000000 +0200
@@ -1182,6 +1182,12 @@
             // Step extra. pv search &#40;only in PV nodes&#41;
             if &#40;value > alpha && value < beta&#41;
                 value = -search_pv&#40;pos, ss, -beta, -alpha, newDepth, ply+1, threadID&#41;;
+            else if &#40;value < beta&#41;
+                value = alpha;
+        &#125;
+        else
+        &#123;
+            value = alpha;
         &#125;
       &#125;
but it scored +209 =579 -212 against original version and +227 =598 -175 against previous patch. Bayeselo says it means 91:8 probability the previous patch was in fact weaker than original stockfish. Is it possible that before previous test I forgot to recompile original stockfish with new version of gcc? Anyway, more tests to come, this time using gaviota-starters.pgn.

QED
Posts: 60
Joined: Thu Nov 05, 2009 8:53 pm

Re: stockfish fail high fail low

Post by QED » Fri Jun 18, 2010 7:56 am

So, previous patch vs original ended +232 =551 -217 and new patch vs original ended +208 =583 -209. The new patch hass still the same strength as original, the previous patch still looks slightly stronger than original, but after 1000 games LOS is only 68:31. More tests to come.

Post Reply