First post (and FailHigh question!)

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: First post (and FailHigh question!)

Post by Sven »

xmas79 wrote:This is one of the flow I traced successfully. The same thing is happening all over the tree. That's why engine fails high when BETA is less than INF. When BETA is +INF no cut-offs happens and normal score is then reported.

That's all.
Hi Natale,

nice to read about your progress! I have taken a look into your data, and at a first glance it seems that your hash table works correctly regarding mate scores, which is good news.

There is one thing I do not understand, so I hope you can explain it to me. Your iteration 15 that was started with beta at root about +600 fails high with a mate score after 1.Be2 (even though this is not the optimal move, but that's ok for a fail-high). This is fine. But now you start a research with beta=+INF. The big question arises: why does the research not find a PV with a mate score? It could find the optimal line 1.Ke5 Kh2 (or 1...Kg2) 2.Kf4 Kh3 and so on which is indeed +M11 from the root. It might be possible that this is missed in iteration 15 if there is not enough data yet in the hash table for it. But I would be very surprised if it would not even find the suboptimal line 1.Ke5 Kg1 2.Be2 which is a transposition of 1.Be2 Kg1 2.Ke5 having been searched during the first attempt of iteration 15 that failed high!

Therefore I suspect a general problem in backing up scores from a subtree. Currently I have no other explanation of the behaviour you describe. I am very sure that your research must find a mate score if it works correctly.

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: First post (and FailHigh question!)

Post by Sven »

Sven Schüle wrote:
xmas79 wrote:This is one of the flow I traced successfully. The same thing is happening all over the tree. That's why engine fails high when BETA is less than INF. When BETA is +INF no cut-offs happens and normal score is then reported.

That's all.
Hi Natale,

nice to read about your progress! I have taken a look into your data, and at a first glance it seems that your hash table works correctly regarding mate scores, which is good news.

There is one thing I do not understand, so I hope you can explain it to me. Your iteration 15 that was started with beta at root about +600 fails high with a mate score after 1.Be2 (even though this is not the optimal move, but that's ok for a fail-high). This is fine. But now you start a research with beta=+INF. The big question arises: why does the research not find a PV with a mate score? It could find the optimal line 1.Ke5 Kh2 (or 1...Kg2) 2.Kf4 Kh3 and so on which is indeed +M11 from the root. It might be possible that this is missed in iteration 15 if there is not enough data yet in the hash table for it. But I would be very surprised if it would not even find the suboptimal line 1.Ke5 Kg1 2.Be2 which is a transposition of 1.Be2 Kg1 2.Ke5 having been searched during the first attempt of iteration 15 that failed high!

Therefore I suspect a general problem in backing up scores from a subtree. Currently I have no other explanation of the behaviour you describe. I am very sure that your research must find a mate score if it works correctly.
I'd like to add: please follow the path from the nodes where mate scores are found up to the root, especially during the iteration that fails high. Do these mate scores make it to the root? Why? After rereading my post and my older posts in this thread I am no longer sure that the fail-high is really correct, it could still be possible that the fail-high should not happen since the opponent can make a move that avoids getting mated within the horizon AND avoids stepping through positions having mate scores in the TT.

Another question: what do you do if you get a hash hit when searching a node to depth 7 but the TT entry you have retrieved has a draft of only 5? I hope you do not handle this as a hash hit.

Sven
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: First post (and FailHigh question!)

Post by xmas79 »

Hi Sven!
Sven Schüle wrote:There is one thing I do not understand, so I hope you can explain it to me. Your iteration 15 that was started with beta at root about +600 fails high with a mate score after 1.Be2 (even though this is not the optimal move, but that's ok for a fail-high). This is fine.
What I posted is the fail-high iteration, there's no research part in this output. It fails-high after 1.Ke5 and not Be2. This output states that engine first tries Be2, then tries Ke5 and here fails high. In fact the end of the iteration is:

Code: Select all

2180   CheckHash    301908   0x7FDB715732C364AB   -31972                                      5     10         ------                            8/8/8/8/5K2/3N1B1k/8/8 b - -       1. Kd6-e5 Kh1-h2 2. Ke5-f4 Kh2-h3 3. Bd1-f3
2181   InsertHash   9468     0x6EDE23771B7824FE   300        0x698B468E203024FF   31971       4     11         ------         Bd1-f3             8/8/8/8/5K2/3N3k/8/3B4 w - -       1. Kd6-e5 Kh1-h2 2. Ke5-f4 Kh2-h3
2182   InsertHash   58296    0x28093263506F1C46   -600       0x28093263506F1C46   -31970      3     12         Kh2-h3         ------             8/8/8/8/5K2/3N4/7k/3B4 b - -       1. Kd6-e5 Kh1-h2 2. Ke5-f4
2183   InsertHash   10996    0x0000000000000000   0          0x76DE067062082AF7   31969       2     13         ------         Ke5-f4             8/8/8/4K3/8/3N4/7k/3B4 w - -       1. Kd6-e5 Kh1-h2
2184   InsertHash   179040   0x2B2C20B96315449F   -600       0x2B2C20B96315449F   -31968      1     14         Kh1-g1         ------             8/8/8/4K3/8/3N4/8/3B3k b - -       1. Kd6-e5
2185   InsertHash   351452   0x3F3A725C4E1D5CDE   600        0x3F3A725C4E1D5CDE   31967       0     15         Bd1-c2         Kd6-e5             8/8/3K4/8/8/3N4/8/3B3k w - -
Notice the last line: ply=0, depth=15, old best move = Bc2, new best move=Ke5. Fail-high. This is the output of the engine in this point:

Code: Select all

--------------------------------------------------------------------------------
 00:00:01.062   15     Bd1-c2      15.0M    +6.00        1. Bd1-c2    Kh1-g1
                                                         2. Bc2-b1    Kg1-f1
                                                         3. Bb1-a2    Kf1-g1
                                                         4. Ba2-b3    Kg1-f1
                                                         5. Bb3-d1    Kf1-g1
                                                         6. Bd1-c2    Kg1-f1
                                                         7. Bc2-b1    Kf1-g1
                                                         8. Bb1-a2
 00:00:01.521   15     Kd6-e5      13.8M  +MATE17
Mate17, which is a mate in 33 plies ---> 31967.
Sven Schüle wrote:But now you start a research with beta=+INF. The big question arises: why does the research not find a PV with a mate score? It could find the optimal line 1.Ke5 Kh2 (or 1...Kg2) 2.Kf4 Kh3 and so on which is indeed +M11 from the root. It might be possible that this is missed in iteration 15 if there is not enough data yet in the hash table for it. But I would be very surprised if it would not even find the suboptimal line 1.Ke5 Kg1 2.Be2 which is a transposition of 1.Be2 Kg1 2.Ke5 having been searched during the first attempt of iteration 15 that failed high!

Therefore I suspect a general problem in backing up scores from a subtree. Currently I have no other explanation of the behaviour you describe. I am very sure that your research must find a mate score if it works correctly.

Sven
Now I resarch with a beta=+INF. And what happens here is very difficult to track down... Research is longer and I have no "special" cases to attach my debugging code.... But all I can say is the research code is THE SAME code was just executed in the search, only difference is a different beta. I think that widening beta to +INF avoid some cut-offs from happening when black have to move: when I set beta to +INF in the research and when doing a checkhash when is black to move, bounds are [-INF, -500] instead of a [-700, -500] (remember window is 100cp around material eval, which is 600cp). I mean that a probe of a value like -31980 actually do causes a cut-off when window is [-700, -500] backing-up a mate score, and actually do NOT causes a cut-off whe window is [-INF, -500]:

Code: Select all

case SCORE_IS_AT_MOST:
			if &#40;hash.boundValue <= alpha&#41;
				return hash.boundValue;
			break;
And these NOT cut-offs "forces" the search (or research, the code is the same... not copy&paste, THE SAME) to keep on trying moves, so search will behave more like "normal" alpha-beta without TT, and the value I get back to the root node is an AlphaBeta true score, +6.00, which is the right one...

Natl.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: First post (and FailHigh question!)

Post by xmas79 »

Hi Sven,
I read your post while posting... Here's the answer:
Sven Schüle wrote:I'd like to add: please follow the path from the nodes where mate scores are found up to the root, especially during the iteration that fails high. Do these mate scores make it to the root? Why? After rereading my post and my older posts in this thread I am no longer sure that the fail-high is really correct, it could still be possible that the fail-high should not happen since the opponent can make a move that avoids getting mated within the horizon AND avoids stepping through positions having mate scores in the TT.
See my previous post: an hash hit causes search to return immediately without allowing black to try to escape from mate in 14 or 16 or 17. That's how I back-up mate scores.
Sven Schüle wrote:Another question: what do you do if you get a hash hit when searching a node to depth 7 but the TT entry you have retrieved has a draft of only 5? I hope you do not handle this as a hash hit.
Sven
Hash hit requires a signature match and a draft condition (stored draft >= current draft). Tested with Fine#70, nothing would work....

Good night,
Natl.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: First post (and FailHigh question!)

Post by Sven »

xmas79 wrote:Mate17, which is a mate in 33 plies ---> 31967.
Two questions:

1) Are the black replies forced? I don't think so. If, for instance, 1.Ke5 Kh2 leads to a forced mate for white then you can't back this up to the parent unless you also checked that all other black replies (siblings of 1...Kh2) are not better! This is different from white-to-move positions where you leave a node at a cutoff.

2) Where do you add an offset of 1 to your mate scores? In your recent output I notice that the mate scores are increased by 1 for each ply (apart from the sign), e.g. 31971, -31970, 31969, -31968, 31967. If these are already "ply corrected" as you mentioned in your previous post then I would like to know where that additional offset of 1 comes from. In my view it is wrong, and therefore a possible cause of your problem. If you search a node at ply P and retrieve a mate score "mate in X plies from here" from TT and apply the ply correction then you have to return "mate in (P+X) plies from root", and the parent will immediately see this as "mated in (P+X) plies from root". But there is no "+ 1" offset anywhere in this calculation, never! The backed up scores (when ignoring my point 1 for a while, of course!!) should be 31971, -31971, 31971, -31971, 31971 instead.

Sven
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: First post (and FailHigh question!)

Post by xmas79 »

Sven Schüle wrote:1) Are the black replies forced? I don't think so. If, for instance, 1.Ke5 Kh2 leads to a forced mate for white then you can't back this up to the parent unless you also checked that all other black replies (siblings of 1...Kh2) are not better! This is different from white-to-move positions where you leave a node at a cutoff.
Black replies are not forced of course. But as I said in a post earlier, I check hash before searching ANY move, so if I get a hash hit and bound conditions are met I backup that score and leave that node, black or white side to move. Notice that white-to-move positions have BETA cut-offs, while black positions can have ALPHA cut-offs. So as already said, when white-to-move a beta cut-off cause search to leave the current node with positive mate score (as search improved alpha above beta after making a move), but when black-to-move ALPHA cut-offs happens because score returned from a hash hit is below alpha, and in this case no black moves have been searched and never will, as cut-off means leaving this node with a negative score.
Sven Schüle wrote:2) Where do you add an offset of 1 to your mate scores? In your recent output I notice that the mate scores are increased by 1 for each ply (apart from the sign), e.g. 31971, -31970, 31969, -31968, 31967. If these are already "ply corrected" as you mentioned in your previous post then I would like to know where that additional offset of 1 comes from. In my view it is wrong, and therefore a possible cause of your problem. If you search a node at ply P and retrieve a mate score "mate in X plies from here" from TT and apply the ply correction then you have to return "mate in (P+X) plies from root", and the parent will immediately see this as "mated in (P+X) plies from root". But there is no "+ 1" offset anywhere in this calculation, never! The backed up scores (when ignoring my point 1 for a while, of course!!) should be 31971, -31971, 31971, -31971, 31971 instead.

Sven
Values in the output are NOT the values backed-up by search itself, they are:
- when insert ---> values backed-up by search at ply X and stored in the hash by the parent node at ply X-1 and ply-corrected, so if at ply 10 black is checkmated search returns a score of -31990 (-32000+10=-31990), parent node (ply=9) will get back this score of 31990, and this value stored in hash will be ply-corrected, so I will store in the hash the value of 31990+9 =31999 (and this is ok since this is a mate in 1 ply position), and the parent of this node (ply=8) too will receive back a score of -31990, which (if not improved) will be stored as a -31990-8=-31998 (which is ok because it is a mated in 2 plies) and so on...
- when retrieving ---> values ply-reverse-corrected: if I at ply 9 I get a hash hit with a score of 31999 stored in the hash, this score will be translated to root distance before being used by search ---> 31999-9 = 31990, which will eventually be backed-up as -31990 and so on..

Actually search scores are relative to root, hash scores are relative to the stored position.

This +1 is intrinsic in the ply corrections, and it comes off when moving "back" from ply X to ply X-1, hence the correction of -1. And this is normal as when you have a mate in 9 plies in one position, parent position is a mated in 10 plies, parent parent is mate in 11, hence this +1 effect you see in the output.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: First post (and FailHigh question!)

Post by Sven »

xmas79 wrote:
Sven Schüle wrote:1) Are the black replies forced? I don't think so. If, for instance, 1.Ke5 Kh2 leads to a forced mate for white then you can't back this up to the parent unless you also checked that all other black replies (siblings of 1...Kh2) are not better! This is different from white-to-move positions where you leave a node at a cutoff.
Black replies are not forced of course. But as I said in a post earlier, I check hash before searching ANY move, so if I get a hash hit and bound conditions are met I backup that score and leave that node, black or white side to move.
A hash hit does not say anything about forced or not forced replies since the variation ends at that node. What I meant is that the variations where you showed certain mate scores in your output contain unforced and non-optimal black moves, so even if you find a mate score below such an unforced move this should not directly affect the root.

And why "of course"? In my view you cannot fail high at the root after backing up from unforced replies of the opponent. Failing high at the root means that you state the score for white is "at least" X but could be even better when searching with an infinite window. You cannot state that, however, if black can do better anywhere on the path through which you have backed up the fail-high score to the root.

That whole thing sounds fishy to me, have you double-checked your bound conditions? It is o.k. to immediately leave a node on a correct hash hit. But if a TT cutoff occurs with a value <= alpha then you must be sure that the stored value really was an upper bound since otherwise you might get a wrong beta cutoff at the parent. So this implies, if you store a negative mate score as an upper bound in the TT then you must be sure that no other (in this case black) sibling move allows for a longer mate. Is that the case?
xmas79 wrote:
Sven Schüle wrote:2) Where do you add an offset of 1 to your mate scores? [...]
Values in the output are NOT the values backed-up by search itself, they are:
- when insert ---> values backed-up by search at ply X and stored in the hash by the parent node at ply X-1 and ply-corrected, so if at ply 10 black is checkmated search returns a score of -31990 (-32000+10=-31990), parent node (ply=9) will get back this score of 31990, and this value stored in hash will be ply-corrected, so I will store in the hash the value of 31990+9 =31999 (and this is ok since this is a mate in 1 ply position), and the parent of this node (ply=8) too will receive back a score of -31990, which (if not improved) will be stored as a -31990-8=-31998 (which is ok because it is a mated in 2 plies) and so on...
O.k., so maybe I misunderstood your initial description where you wrote that the scores were already "ply-corrected".

But now that sounds as if you store values in TT after having finished unmakeMove() and the corresponding backing up of scores. The most usual way I know is to store values in TT immediately before returning from a node, *prior* to unmakeMove(). Maybe you do the same but it is not clear for me from your description. If you don't store at the end of a node but within the move loop of the parent then this is another potential implementation problem. I know: so far none of my suspicions was confirmed but that is how it goes - otherwise we would be done already ;-)

I still fail to believe in your "no bug" theory. In my opinion an engine that does not go beyond pure alpha-beta, mate scoring, material values, and TT hash should not behave the way you describe your engine, so for me either it is wrong to suddenly fail high with +M17 in iteration 15 (still my favourite explanation), or if finding that +M17 (= 33 plies) as a lower bound is actually correct already in iteration 15 then the research must find a forced mate as well (not necessarily the shortest since that may require a deeper search), at least the one from the iteration step that failed high or a transposition of it.

As you can see, I have not given up your case :-)

Sven
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Further analysis

Post by xmas79 »

Hi all,
I conducted a deeper analysis on this topic, and I want to show my results. This is another long post... Sorry for that...

1) What I wrote in my previous posts seems to still hold true.
2) I rechecked all the parts of my engine. I retested my move generator by perfting all over here and there and I'm absolutely sure I have no bugs.
3) Then I looked at the problem carefully by dumping every kind of information could be useful triggered by mate scores. As early said, when a mate score is inserted/retrieved from TT I dump a progressive id number, current ply, current draft, current executed line and so on... But this time I dumped EVERY mate score in the search from depth 1 to depth 15. I remember you that CurrentPly + CurrentDraft = K, where K is a constant and specifically is the current search depth coming from the iterative search. So when we have ply=8 and draft=4 --> we are in a search at depth=12, and we are 8 plies far from root and 4 plies far from horizon. Here there are the first 8 entries of this log:

Code: Select all

prog    func          key       oldhash              type_old    oldvalue    newhash               type_new    newvalue    alpha    beta    ply    depth    old best move    best move    fen                                 Current Line                        
1       InsertHash    285448    0x192E0708477C5B09   AT_MOST     600         0x192E0708477C5B09    AT_LEAST    31999       600      601     10     2        ------           Nb2-d3       8/8/8/8/8/4K3/1N2B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Nd3-b2 Kg1-f1 3. Kd5-d4 Kf1-e1 4. Kd4-e3 Ke1-f1 5. Bd1-e2 Kf1-e1                         
2       InsertHash    74536     0x72092B9D17592329   AT_MOST     600         0x72092B9D17592329    AT_LEAST    31999       600      601     10     2        ------           Nc1-d3       8/8/8/8/8/4K3/4B3/2N1k3 w - -       1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-c1 Kg1-f1 5. Bd1-e2 Kf1-e1                         
3       InsertHash    216960    0x6EBE44F27C2B4F81   AT_MOST     600         0x6EBE44F27C2B4F81    AT_LEAST    31999       600      601     10     2        ------           Nf2-d3       8/8/8/8/8/4K3/4BN2/4k3 w - -        1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-f2 Kg1-f1 5. Bd1-e2 Kf1-e1                         
4       InsertHash    11200     0x224A6FCC1FD82BC1   AT_MOST     600         0x224A6FCC1FD82BC1    AT_LEAST    31999       600      601     10     2        ------           Nb4-c2       8/8/8/8/1N6/4K3/4B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-b4 Kg1-f1 5. Bd1-e2 Kf1-e1                         
5       InsertHash    491056    0x157757007CAF7E30   AT_MOST     600         0x157757007CAF7E30    AT_LEAST    31999       600      601     10     2        ------           Nf4-g2       8/8/8/8/5N2/4K3/4B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-f4 Kg1-f1 5. Bd1-e2 Kf1-e1                         
6       InsertHash    215712    0x1D52617D3A434AA2   AT_MOST     600         0x1D52617D3A434AA2    AT_LEAST    31999       600      601     10     2        ------           Nc5-d3       8/8/8/2N5/8/4K3/4B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-c5 Kg1-f1 5. Bd1-e2 Kf1-e1                         
7       InsertHash    219000    0x56A66F973733577A   AT_MOST     600         0x56A66F973733577A    AT_LEAST    31999       600      601     10     2        ------           Ne5-d3       8/8/8/4N3/8/4K3/4B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Kd5-d4 Kg1-f1 3. Kd4-e3 Kf1-g1 4. Nd3-e5 Kg1-f1 5. Bd1-e2 Kf1-e1                         
8       InsertHash    285448    0x7066308408A45B0A   AT_MOST     600         0x192E0708477C5B09    AT_LEAST    31999       600      601     10     3        ------           Nb2-d3       8/8/8/8/8/4K3/1N2B3/4k3 w - -       1. Kd6-d5 Kh1-g1 2. Nd3-b2 Kg1-f1 3. Kd5-d4 Kf1-e1 4. Kd4-e3 Ke1-f1 5. Bd1-e2 Kf1-e1                         
So far so good, the very first (NON forced) mate entry is found at ply 12, where black makes very dumb moves and gets mated in 11 plies. The sequence is: 1. Kd6-d5 Kh1-g1 2. Nd3-b2 Kg1-f1 3. Kd5-d4 Kf1-e1 4. Kd4-e3 Ke1-f1 5. Bd1-e2 Kf1-e1 which triggers a mate in 1 ply with 6. Nb2-d3. Notice that scores stored at depth 12 (from id 1 to id 7, id 8 is about depth=13) are only positive mate scores, AKA only "white wins", and that means black CAN avoid being mated by selecting proper replies. This is quite obvious if you look at this position, and from this log we can say that my search is right, as we don't see any "black loses" mate scores in TT, that would mean search would have backed-up a (wrong) "white wins" score. So search actually finds that black CANNOT be mated in 12 (11) plies, and my search code is OK.

Now I want to show you a "reverse" analysis point of view, where we can see what's happening here (and there...). I choose to start from this position [d]8/8/8/8/8/3N4/4BK2/7k b - -
which is one of the positions already seen in my previous post. In that post I wrote "...it is a mated 12 plies...". Well FIDE should remove me instantly 1000 elo points as this is a mate in 8 plies!!! 1. ... Kh2 2. Bf1 Kh1 3. Ne1 Kh2 4. Nf3+ Kh1 5. Bg2#... Very different from mated in 12 plies... So now we found something that's wrong. Let's see what it is by looking at the (reversed) log:

Code: Select all

prog    func          key       oldhash              type_old    oldvalue    newhash               type_new    newvalue    alpha    beta    ply    depth    old best move    best move    fen                                 Current Line                        
429     InsertHash    169028    0x65C529B604356BBB   AT_LEAST    -600        0x65C529B604356BBB    AT_MOST     -31988      -601     -600    9      6        Kh1-h2           ------       8/8/8/8/8/3N4/4BK2/7k b - -         1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2                         
428     InsertHash    1488      0x38370F7F052805D3   AT_MOST     600         0x38370F7F052805D3    AT_LEAST    31989       600      601     10     5        ------           Be2-g4       8/8/8/8/8/3N4/4BK1k/8 w - -         1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2 Kh1-h2                         
427     InsertHash    41108     0x45B1279F747F5F68   AT_LEAST    -600        0x528A28727AEF5F6B    AT_MOST     -31990      -601     -600    11     4        Kg1-f1           ------       8/8/8/8/6B1/3N4/5K1k/8 b - -        1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2 Kh1-h2 6. Be2-g4                         
426     InsertHash    143616    0x0F780EBB7BF23103   AT_MOST     600         0x0F780EBB7BF23103    AT_LEAST    31991       600      601     12     3        ------           Kf2-g3       8/8/8/8/6B1/3N4/5K2/7k w - -        1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2 Kh1-h2 6. Be2-g4 Kh2-h1                         
425     InsertHash    456672    0x62CD69D87571081E   AT_LEAST    -600        0x52937EE31471081C    AT_MOST     -31992      -601     -600    13     2        Kf1-e1           ------       8/8/8/8/6B1/3N2K1/8/7k b - -        1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2 Kh1-h2 6. Be2-g4 Kh2-h1 7. Kf2-g3                         
424     CheckHash     84508     0x1E0C1E5B2DD94A1C   AT_LEAST    31993                                                     600      601     14     1        Bg4-e2                        8/8/8/8/6B1/3N2K1/8/6k1 w - -       1. Bd1-e2 Kh1-g1 2. Kd6-c5 Kg1-h1 3. Kc5-d4 Kh1-g1 4. Kd4-e3 Kg1-h1 5. Ke3-f2 Kh1-h2 6. Be2-g4 Kh2-h1 7. Kf2-g3 Kh1-g1                         
ID 429 is our starting position, a real mate in 8 plies, but a mate in 12 plies for the engine. Why? Well, seeing what was done previously, we can see that at id 424 it checked this position "8/8/8/8/6B1/3N2K1/8/6k1 w - -", which is effectively a mate in 7 plies, so after a litte of TT math:
- Hashtable check is at ply 14
- Retrieved hashtable score for ID 424 is 31993
- Retrieved hashtable score corrected to root-distance is 31993 - 14 = 31979
- Score is backed-up to ply 13 and negated, so at ply 13 we have a score of -31979
- Score must be stored in the TT at ply 13
- Score is corrected to this-position-distance ---> -31979 - 13 = -31992
- This score is stored in TT with no best move available (as no move can improve the score)

Question is: How we end up to this position? Is this sequence (from ply 14 to ply 9) really forced? Is this a correct "back-up score" procedure? Let's look through FEN positions:
1) at ply 14 (id 424) we see that after Be2 everything is forced. So score back-up is correct as black cannot improve the score.
2) at ply 13 (id 425) black has only one legal move. So score back-up is correct as black cannot improve the score.
3) at ply 12 (id 426) we see that after Kg3 everything is forced as per point 2. So score back-up is correct as black cannot improve the score.
4) at ply 11 (id 427) black has only one legal move. So score back-up is correct as black cannot improve the score.
5) at ply 10 (id 428) we see that after Bg4 everything is forced as per point 4. So score back-up is correct as black cannot improve the score.
6) at ply 9 (id 429) black has only one legal move. So score back-up is correct as black cannot improve the score.
At ply 8 black can escape so no other score back-up happens (and I have no entries in log).

So another question arises: if at ply 9 we have a real mate in 8 plies, why engine doesn't find it in 8 plies? An answer could be that at ply 9 engine has only 6 plies before reaching its horizon, but this post is about this question already, an answer like that enables recursion on this post :))). The answer should be elsewhere... Precisely, let's focus on position at id 428 where there is a mate in 7 plies (since at pos 429 black has only one legal move and then we can skip its analysis and go directly to its parent) by playing Bf1, and this move is missed.

Position at idx 428 [d]8/8/8/8/8/3N4/4BK1k/8 w - -
I dumped all the generated moves:

Code: Select all

1. Be2-d1
2. Be2-f1
3. Be2-f3
4. Be2-g4
5. Be2-h5
6. Nd3-c1
7. Nd3-e1
8. Nd3-b2
9. Nd3-b4
10. Nd3-f4
11. Nd3-c5
12. Nd3-e5 
13. Kf2-e1
14. Kf2-f1
15. Kf2-g1
16. Kf2-g2
17. Kf2-e3
18. Kf2-f3
19. Kf2-g3
We have 19 white moves, where 3 are illegal: Kg1, Kg2, Kg3. These move are NOT excluded during generation, only makemove will see that they are illegal and won't perform any action. All these moves are tried in this dumped order, since they were generated and dumped by the move generator state machine. So that's the time to debug this sub-tree... I will avoid to go too deep and leave some of the details, but keep in mind that the back-up log you have seen is a "back-up", so it is produced on LEAVING a node, on a fail-high, a back-up. What I'm going to do is debugging the node when I ENTER this position. Just to be more clear, engine reaches this position, then generate moves and try them and at a certain point deep in the tree a move will fail high, which will produce the entry at idx 424, and then anoter backup will produce the insert at idx 425 and so on... Now I'm settling BEFORE making the first move.

First move is Bd1 which doesn't increase score.
Second move is Bf1, which leads to a forced mate in 7 plies, BUT!!!! Bf1 has a forced mating sequence 7 plies long.... And stepping through this sub-tree I discovered that after playing Bf1 there are only hash misses, which leaves the engine with the only option to rely on search for this position... And try to figure out what happens if you have a mate in 7 plies AND you don't have any TT hit down the tree AND you have only 5 plies left before reaching horizon? (in the position at id 428 we are at ply 10 and have only 5 plies before reaching the horizon)?? The result is that engine misses the mate because it has no sufficient depth to see it!!! Yep that's a mate in 7 plies, but engine CORRECTLY misses it....
And then it's the turn of Bf3, which doesn't increase score.
And then there is Bg4... Which at some point deep in the tree (specifically at ply 14) hash-hit the position that will be logged at idx 424, which triggers the mate score and start the back-up described above.... and causes the position at ply 9 (id 429) to have a mate score of mated in 12 plies instead of mated in 8 plies.
And this completes the analysis of the first of the "problems" I found.

The second problem I found was already posted and discussed: sometimes the always-replace policy of my TT implementation WILL overwrite a slot which already contains a better mate score (for the same position, I don't mean a hash collision). There's no news about it, and there's an example out of the log (id 445):
[d]8/8/8/8/5K2/3N4/4B3/6k1 w - -

Code: Select all

prog    func          key       oldhash              type_old    oldvalue    newhash               type_new    newvalue    alpha    beta    ply    depth    old best move    best move    fen                                 Current Line                        
444     CheckHash     177536    0x04F139910C354A7D   AT_MOST     -31986                                                    -601     -600    7      8        ------                        8/8/8/8/5K2/3N1B2/8/6k1 b - -       1. Bd1-e2 Kh1-g1 2. Kd6-e5 Kg1-h1 3. Ke5-f4 Kh1-g1 4. Be2-f3                 
445     InsertHash    290860    0x26E748F85FB4702C   AT_LEAST    31993       0x26E748F85FB4702C    AT_LEAST   31985        600      601     6      9        Kf4-g3           Be2-f3       8/8/8/8/5K2/3N4/4B3/6k1 w - -       1. Bd1-e2 Kh1-g1 2. Kd6-e5 Kg1-h1 3. Ke5-f4 Kh1-g1                 
Here we see that at id 445 the engine replaces a good (and obvious) "mate in 7 plies" (Kg3) with a less elegant mate in 15 plies (Bf3).... Why???? Because after Bf3 down the tree engine hash-hits position at ID 444 and say "hey man, that's a mate!", and since Bf3, mate in 15 plies, is tried before Kg3 (king moves are generated at last of generation process), mate in 7 plies, it fails-high with Bf3 and not with Kg3.... And that's perfectly correct.

Now the last question is: how is possible that when beta is less than +INF (or MATE score in general) engine fails-high and when raising beta to +INF engine doesn't fail-high anymore? From what I can understand, the answer is the same I already gave in the previous post: when searching with beta less than a mate score some moves are failing-high because they get a hash cut-off (on both negative and positive side of the window), as a mate score is always greater than beta. Raising beta above a MATE score, let's say above MATE in X plies, search will get lesser and lesser cut-offs, until beta is raised up to the "highest" score that triggers the fail-high, in which case a cut-off cannot occur in any case and the score backed-up to the root is then EXACT one. As an example, below is the engine output modified to behave like the following on fail-high:
- on first fail high, raise beta to +MATE (30000)
- on next fail-high, add +1 to beta

Code: Select all

.
.
.
new beta value is 30001
 00&#58;00&#58;06.753   15     Kd6-e5       2.4M  +MATE17 >24
 00&#58;00&#58;06.753   15  23&#58;Kd6-e5
new beta value is 30002
 00&#58;00&#58;06.755   15     Kd6-e5     500.0   +MATE17 >24
 00&#58;00&#58;06.755   15  23&#58;Kd6-e5
new beta value is 30003
.
.
.
new beta value is 31967
 00&#58;00&#58;10.222   15     Kd6-e5     1000.0   +MATE17 >24
 00&#58;00&#58;10.223   15  23&#58;Kd6-e5
new beta value is 31968
 00&#58;00&#58;10.224   15     Kd6-e5     1000.0   +MATE17 >24
 00&#58;00&#58;10.224   15  23&#58;Kd6-e5
new beta value is 31969
 00&#58;00&#58;10.246   15     Kd6-e5      11.1K  +MATE16 >24
 00&#58;00&#58;10.247   15  23&#58;Kd6-e5
new beta value is 31970
 00&#58;00&#58;10.248   15     Kd6-e5     1000.0   +MATE16 >24
 00&#58;00&#58;10.545   15     Kd6-e5     496.2K    +6.00        1. Kd6-e5    Kh1-g2
                                                         2. Bd1-c2    Kg2-f1
                                                         3. Bc2-b1    Kf1-g1
                                                         4. Bb1-a2    Kg1-f1
                                                         5. Ba2-b3    Kf1-g1
                                                         6. Bb3-a4    Kg1-f1
                                                         7. Ba4-c2    Kf1-g1
                                                         8. Bc2-b1
and this is the correponding log excerpt:

Code: Select all

prog    func          key       oldhash              type_old    oldvalue    newhash               type_new    newvalue    alpha    beta    ply    depth    old best move    best move    fen                                 Current Line                        
2315    CheckHash     351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967                                                     0        0       0      127      Kd6-e5                        8/8/3K4/8/8/3N4/8/3B3k w - -                    
2316    CheckHash     179040    0x2B2C20B96315449F   AT_MOST     -31968                                                    -30001   -550    1      14       ------                        8/8/8/4K3/8/3N4/8/3B3k b - -        1. Kd6-e5 
2317    InsertHash    351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967       0x3F3A725C4E1D5CDE    AT_LEAST    31967       550      30001   0      15       Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
2318    InsertHash    351452    0x3F3A725C4E1D5CDE   AT_LEAST    31967       0x3F3A725C4E1D5CDE    FORCED_PV   31967       0        0       0      127      Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
2319    CheckHash     351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967                                                     0        0       0      127      Kd6-e5                        8/8/3K4/8/8/3N4/8/3B3k w - -    
2320    CheckHash     179040    0x2B2C20B96315449F   AT_MOST     -31968                                                    -30002   -550    1      14       ------                        8/8/8/4K3/8/3N4/8/3B3k b - -        1. Kd6-e5 
2321    InsertHash    351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967       0x3F3A725C4E1D5CDE    AT_LEAST    31967       550      30002   0      15       Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
2322    InsertHash    351452    0x3F3A725C4E1D5CDE   AT_LEAST    31967       0x3F3A725C4E1D5CDE    FORCED_PV   31967       0        0       0      127      Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
.
.
.
prog    func          key       oldhash              type_old    oldvalue    newhash               type_new    newvalue    alpha    beta    ply    depth    old best move    best move    fen                                 Current Line                        
10176   CheckHash     179040    0x2B2C20B96315449F   AT_MOST     -31968                                                    -31966   -550    1      14       ------                        8/8/8/4K3/8/3N4/8/3B3k b - -        1. Kd6-e5 
10177   InsertHash    351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967       0x3F3A725C4E1D5CDE    AT_LEAST    31967       550      31966   0      15       Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
10178   InsertHash    351452    0x3F3A725C4E1D5CDE   AT_LEAST    31967       0x3F3A725C4E1D5CDE    FORCED_PV   31967       0        0       0      127      Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
10179   CheckHash     351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967                                                     0        0       0      127      Kd6-e5                        8/8/3K4/8/8/3N4/8/3B3k w - -    
10180   CheckHash     179040    0x2B2C20B96315449F   AT_MOST     -31968                                                    -31967   -550    1      14       ------                        8/8/8/4K3/8/3N4/8/3B3k b - -        1. Kd6-e5 
10181   InsertHash    351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967       0x3F3A725C4E1D5CDE    AT_LEAST    31967       550      31967   0      15       Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
10182   InsertHash    351452    0x3F3A725C4E1D5CDE   AT_LEAST    31967       0x3F3A725C4E1D5CDE    FORCED_PV   31967       0        0       0      127      Kd6-e5           Kd6-e5       8/8/3K4/8/8/3N4/8/3B3k w - -    
10183   CheckHash     351452    0x3F3A725C4E1D5CDE   FORCED_PV   31967                                                     0        0       0      127      Kd6-e5                        8/8/3K4/8/8/3N4/8/3B3k w - -    
10184   CheckHash     179040    0x2B2C20B96315449F   AT_MOST     -31968                                                    -31968   -550    1      14       ------                        8/8/8/4K3/8/3N4/8/3B3k b - -        1. Kd6-e5 
10185   CheckHash     329372    0x67B340015ABD069F   AT_LEAST    31975                                                     550      31968   2      13       Ke5-f4                        8/8/8/4K3/8/3N4/8/3B2k1 w - -       1. Kd6-e5 Kh1-g1 
10186   CheckHash     294724    0x2A8D222E21CC7F46   AT_LEAST    31969                                                     31967    31968   2      13       Ke5-f4                        8/8/8/4K3/8/3N4/6k1/3B4 w - -       1. Kd6-e5 Kh1-g2 
10187   CheckHash     308744    0x745A163D13AB49F7   AT_MOST     -31970                                                    -31968   -31967  3      12       ------                        8/8/8/8/5K2/3N4/6k1/3B4 b - -       1. Kd6-e5 Kh1-g2 2. Ke5-f4 
10188   CheckHash     68836     0x359D644E0A510CE7   AT_LEAST    31987                                                     31967    31968   4      11       Bd1-f3                        8/8/8/8/5K2/3N4/8/3B1k2 w - -       1. Kd6-e5 Kh1-g2 2. Ke5-f4 Kg2-f1 
where at ID 2315 it retrieves the PV from hashtable (we are at ply 0), at ID 2316 (ply=1) it checks hashtable and gets a hit with a score of -31968, backs-up to previous ply (ply 0) and fails-high with a score of AT_LEAST 31967, which is coherent with the previous results (notice current alpha value is -30001, since it's black turn). Then at id 2318 I store PV into the hash, as you can notice some entries are marked as FORCED_PV and have ply and draft with predefined values (0 and 127), which help me to extract PV from hashtable. They are ok. Then at ID 2319 search starts over with different bounds, you can see it by looking at ID 2320 (-30002). The thing goes on until beta reaches 31968 ad ID 10183, followed by a hash check at ID 10184, where alpha is -31968 which is less than the returned value of the hash hit (-31967), so it will NOT get a hash hit and the search goes on... failing-high again a little later with a +MATE17... And so on....

So, definitely, I can say this is a completely "normal" behaviour, AKA it's a typical search instability thing.


Thanks for watching, comments are welcome!

Best Regards,
Natl.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Further analysis

Post by Sven »

Hi Natale,

this is an impressive work of analysis which I have rarely seen on such a level of precision - well done!

It is hard for me to find some good questions right now, for the very reason that I am not yet done with carefully reading all of your text. I have processed roughly the first half of it. So I would like to start with just one question that came into my mind.

Do you store the move that causes a cutoff as "best move" in the TT? Or only a move with a score >alpha and <beta? From your description I am not sure about it. If you don't do so yet, would you mind trying it, while also ensuring that the TT move is always tried first (maybe by simply moving it up within the move list)? How does your search behave then?

Sven
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Further analysis

Post by xmas79 »

Hi Sven,
I already store the best move in the TT slot on every beta cutoff (SCORE_IS_AT_LEAST), and on every "SCORE_IS_EXACT" score (at the end of search when alpha is improved). I also take care of putting PV into hashtable before starting another iteration, so PV nodes are searched first.
When I wrote "generated moves are..." I mean that I have hash miss hence no best move.


Best regards,
Natale.