strange tt(?) problem

flok · Post by **flok** » Thu Oct 27, 2022 8:32 pm

Hi,

For a certain position ( r3k2r/1Qp2pp1/p1nqpn1p/3p1b2/3P4/2P1P3/PP1NBPP1/R1B2RK1 b kq - 2 12 ), my program Dog starts to behave unexpected (in autoplay as below but also regularly): it only emits moves with score 0 after depth 11 and from then for every depth and move. This is with pondering and lazy smp disabled.

Code: Select all

uci
id name Dog v0.9
id author Folkert van Heusden
option name Threads type spin default 2 min 1 max 2
option name Ponder type check default true
uciok
position fen r3k2r/1Qp2pp1/p1nqpn1p/3p1b2/3P4/2P1P3/PP1NBPP1/R1B2RK1 b kq - 2 12
play
info depth 1 score cp -65 nodes 154 time 3 nps 51333 pv e8g8
info depth 2 score cp -98 nodes 520 time 5 nps 104000 pv a8a7
info depth 3 score cp -76 nodes 971 time 8 nps 121375 pv a8a7
info depth 4 score cp -13 nodes 5163 time 31 nps 166548 pv e8g8
info depth 5 score cp -11 nodes 7490 time 45 nps 166444 pv e8g8
info depth 6 score cp -5 nodes 15695 time 90 nps 174388 pv e8g8
info depth 7 score cp -3 nodes 46177 time 262 nps 176248 pv e8g8
info depth 8 score cp -1 nodes 86046 time 484 nps 177780 pv e8g8
info depth 9 score cp -8 nodes 202484 time 1086 nps 186449 pv e8g8
info depth 10 score cp -12 nodes 403509 time 2146 nps 188028 pv e8g8
info depth 11 score cp 0 nodes 815945 time 4415 nps 184812 pv e8g8
info depth 12 score cp 0 nodes 1644502 time 9028 nps 182155 pv e8g8
info depth 1 score cp 0 nodes 1 time 1 nps 1000 pv b7b3
info depth 2 score cp 0 nodes 2 time 1 nps 2000 pv b7b3
info depth 3 score cp 0 nodes 3 time 1 nps 3000 pv b7b3
info depth 4 score cp 0 nodes 4 time 1 nps 4000 pv b7b3
info depth 5 score cp 0 nodes 5 time 1 nps 5000 pv b7b3
info depth 6 score cp 0 nodes 6 time 1 nps 6000 pv b7b3
info depth 7 score cp 0 nodes 7 time 1 nps 7000 pv b7b3
info depth 8 score cp 0 nodes 8 time 1 nps 8000 pv b7b3
info depth 9 score cp 0 nodes 9 time 1 nps 9000 pv b7b3
info depth 10 score cp 0 nodes 10 time 1 nps 10000 pv b7b3
info depth 11 score cp 0 nodes 11 time 1 nps 11000 pv b7b3
info depth 12 score cp 0 nodes 1039145 time 5795 nps 179317 pv b7b3
info depth 1 score cp 0 nodes 1 time 1 nps 1000 pv f6g4
info depth 2 score cp 0 nodes 2 time 1 nps 2000 pv f6g4
info depth 3 score cp 0 nodes 3 time 1 nps 3000 pv f6g4
info depth 4 score cp 0 nodes 4 time 1 nps 4000 pv f6g4
info depth 5 score cp 0 nodes 5 time 1 nps 5000 pv f6g4
info depth 6 score cp 0 nodes 6 time 1 nps 6000 pv f6g4
info depth 7 score cp 0 nodes 7 time 1 nps 7000 pv f6g4
info depth 8 score cp 0 nodes 8 time 1 nps 8000 pv f6g4
info depth 9 score cp 0 nodes 9 time 1 nps 9000 pv f6g4
info depth 10 score cp 0 nodes 10 time 1 nps 10000 pv f6g4
info depth 11 score cp 0 nodes 11 time 1 nps 11000 pv f6g4
...

The full code can be viewed at https://github.com/folkertvanheusden/Do ... er/app/src but here's the part where I retrieve a record from TT and put one back (only in search, nog in qs and here with extra comments):

Code: Select all

        std::optional<libchess::Move> tt_move { };
        uint64_t       hash        = pos.hash();
        std::optional<tt_entry> te = tti->lookup(hash, depth);

        if (te.has_value()) {  // any record returned?
                if (te.value().data_._data.m)  // does it contain a move?
                        tt_move = libchess::Move(te.value().data_._data.m); // convert move from tt format

                if (tt_move.has_value() && pos.is_legal_move(tt_move.value()) == false) {   // check if move is valid (e.g. for collision check)
                        tt_move.reset();
                }
                else if (te.value().data_._data.depth >= depth) {  // tt-move depth >= current depth
                        bool use       = false;

                        int csd        = max_depth - depth;
                        int score      = te.value().data_._data.score;
                        int work_score = abs(score) > 9800 ? (score < 0 ? score + csd : score - csd) : score;

                        if (te.value().data_._data.flags == EXACT)
                                use = true;
                        else if (te.value().data_._data.flags == LOWERBOUND && work_score >= beta)
                                use = true;
                        else if (te.value().data_._data.flags == UPPERBOUND && work_score <= alpha)
                                use = true;

                        if (use) {
                                if (tt_move.has_value()) {   // if tt-move is valid, return both the score and the move
                                        *m = tt_move.value();

                                        return work_score;
                                }

                                if (!is_root_position) {  // tt-move is not valid, return only the score (when not in root position)
                                        *m = libchess::Move(0);

                                        return work_score;
                                }
                        }
                }
        }

Code: Select all

        tt_entry_flag flag = EXACT;

        if (best_score <= start_alpha)   // start-alpha is the alpha at start of search()
                flag = UPPERBOUND;
        else if (best_score >= beta)
                flag = LOWERBOUND;

        tti->store(hash, flag, depth, best_score, 
                        best_score > start_alpha || tt_move.has_value() == false ? best_move : tt_move.value());

Any ideas/suggestions?

expositor · Post by **expositor** » Thu Oct 27, 2022 11:18 pm

On the first search, the reported statistics look fine, but on subsequent searches of the same position

Code: Select all

info depth 1 score cp 0 nodes 1 time 1 nps 1000 pv b7b3
info depth 2 score cp 0 nodes 2 time 1 nps 2000 pv b7b3
info depth 3 score cp 0 nodes 3 time 1 nps 3000 pv b7b3
info depth 4 score cp 0 nodes 4 time 1 nps 4000 pv b7b3
info depth 5 score cp 0 nodes 5 time 1 nps 5000 pv b7b3
info depth 6 score cp 0 nodes 6 time 1 nps 6000 pv b7b3
info depth 7 score cp 0 nodes 7 time 1 nps 7000 pv b7b3
info depth 8 score cp 0 nodes 8 time 1 nps 8000 pv b7b3
info depth 9 score cp 0 nodes 9 time 1 nps 9000 pv b7b3
info depth 10 score cp 0 nodes 10 time 1 nps 10000 pv b7b3
info depth 11 score cp 0 nodes 11 time 1 nps 11000 pv b7b3

the number of nodes searched is equal to the depth, which means the engine is never searching more than a single node (the root) at each step. My guess is that it's returning immediately, since

the cache entry now exists and te.value().data_._data.depth >= depth (e.g. 11 ≥ 1, 11 ≥ 2, ...)
flags == EXACT (or at least I expect it to be at the root), and
there are no further guards on returning.

The typical practice, as far as I'm aware, is

don't return if the score is exact unless the score is above beta or under alpha, and
don't allow the engine to exit early in expected PV nodes, regardless of the score.

That first idea expressed in code might look something like

Code: Select all

uint8_t estimate = te.value().data_._data.flags;

bool use = ((estimate == EXACT || estimate == LOWERBOUND) && work_score >= beta) ||
           ((estimate == EXACT || estimate == UPPERBOUND) && work_score <= alpha) ;

JVMerlino · Post by **JVMerlino** » Fri Oct 28, 2022 12:53 am

expositor wrote: ↑Thu Oct 27, 2022 11:18 pm The typical practice, as far as I'm aware, is
don't return if the score is exact unless the score is above beta or under alpha....

I'm not sure about this. I believe that, if the exact hash entry is also of the required minimum depth, you should always return SOME value. But your implementation of what value to return will depend on whether or not you have implemented fail-hard or fail-soft. I use fail-hard in Myrddin, and this is my code for when the hash probe returns an exact value:

Code: Select all

if (nHashFlags & HASH_EXACT)
{
	if (nHashEval >= nBeta) return(nBeta);
        if (nHashEval <= nAlpha) return(nAlpha);
        return nHashEval;
}

A fail-soft implementation would simply do this:

Code: Select all

if (nHashFlags & HASH_EXACT) return nHashEval;

hgm · Post by **hgm** » Fri Oct 28, 2022 7:03 am

This is normal behavior. After having done a 12-ply search and playing the PV move, you are in a position that has been searched to 11 ply, and has an exact score. So up to d=11 the root gives a usable hash hit. Only at d=12 the stored depth is insufficient, and a real search has to be done.

Only thing suspect is that the score stays 0 even after a new search. But that doesn't have to be wrong. Scores can often stay the same in several iterations, because the earlier iterations were extended by QS to the next quiet position. So your PV at d=11 might be 15 ply long because 2 pieces get traded at the end, and would still be PV at d=15.

flok · Post by **flok** » Sat Oct 29, 2022 5:15 pm

Thank you all for the help.

flok · Post by **flok** » Sat Oct 29, 2022 7:35 pm

hgm wrote: ↑Fri Oct 28, 2022 7:03 am Only thing suspect is that the score stays 0 even after a new search. But that doesn't have to be wrong. Scores can often stay the same in several iterations, because the earlier iterations were extended by QS to the next quiet position. So your PV at d=11 might be 15 ply long because 2 pieces get traded at the end, and would still be PV at d=15.

I found that it also happens from the root-position after a while.

For example also "2B5/8/8/6k1/3K4/8/8/8 w - - 90 126" triggers it. It is timing related because it does not every time I try that test-position get triggered.

This is very suspicious.

flok · Post by **flok** » Sat Oct 29, 2022 8:25 pm

flok wrote: ↑Sat Oct 29, 2022 7:35 pm
hgm wrote: ↑Fri Oct 28, 2022 7:03 am Only thing suspect is that the score stays 0 even after a new search. But that doesn't have to be wrong. Scores can often stay the same in several iterations, because the earlier iterations were extended by QS to the next quiet position. So your PV at d=11 might be 15 ply long because 2 pieces get traded at the end, and would still be PV at d=15.
I found that it also happens from the root-position after a while.

For example also "2B5/8/8/6k1/3K4/8/8/8 w - - 90 126" triggers it. It is timing related because it does not every time I try that test-position get triggered.

This is very suspicious.

Ok I added some logging to Dog and found that for "2B5/8/8/6k1/3K4/8/8/8 w - - 90 126" the positions + score that are put in tt are
~ 50% halfmoves > 100 or position is repeated or insufficient material
~ 34% caused by a TT hit returning 0

strange tt(?) problem

strange tt(?) problem

Re: strange tt(?) problem

Re: strange tt(?) problem

Re: strange tt(?) problem

Re: strange tt(?) problem

Re: strange tt(?) problem

Re: strange tt(?) problem