how can IID with null allowed cause wrong mate move from tt

MahmoudUthman
Posts: 234
Joined: Sat Jan 17, 2015 11:54 pm

how can IID with null allowed cause wrong mate move from tt

Post by MahmoudUthman »

I mistakenly wrote the IID search call like this:

Code:

AlphaBeta(alpha, beta, IID_depth, AllowNull);

passing the AllowNull Boolean instead of false. While my engine was playing White against Nejmet, it reached this position:
[d] 4Q3/2p2Npk/n7/3P3p/1P4qP/5p2/6P1/4B1K1 w - - 6 44
and played e1d2, reporting it at depth 2 as +mate in 1.

This doesn't happen when I change the last argument to false. What I don't understand is how the wrong IID call could cause this, if it really was the cause in the first place. And how can I verify that it was the cause?
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: how can IID with null allowed cause wrong mate move from

Post by Sven »

If the problem is reproducible by setting up that position and letting the engine search, then dump the search tree to a file: for example, log the move string of every make/unmake call, indented according to the current ply, and log every return with its score and the reason for returning. That should not be too difficult for depth=2.
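A minimal sketch of such a tree dump in C++, in case it helps. The class name `TreeLogger`, its method names, and the reason strings are hypothetical and not taken from any engine; the idea is only to emit one indented line per make, unmake, and return.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): writes one indented line
// per make/unmake/return event so the search tree can be read top-down.
class TreeLogger {
public:
    explicit TreeLogger(std::ostream& out) : out_(out) {}

    // Call right after making a move; ply is the ply of the node making it (root = 0).
    void onMake(int ply, const std::string& move) {
        indent(ply);
        out_ << "make " << move << '\n';
    }

    // Call right after unmaking the move, with the same ply as onMake.
    void onUnmake(int ply, const std::string& move) {
        indent(ply);
        out_ << "unmake " << move << '\n';
    }

    // Call before every return from the search, with the score and a reason
    // such as "tt-hit", "beta-cutoff", "null-move" or "all-moves-searched".
    void onReturn(int ply, int score, const std::string& reason) {
        indent(ply);
        out_ << "return " << score << " (" << reason << ")\n";
    }

private:
    // Two spaces of indentation per ply.
    void indent(int ply) {
        for (int i = 0; i < ply; ++i) out_ << "  ";
    }

    std::ostream& out_;
};
```

Constructed with a `std::ofstream`, this writes the dump to a file; a return whose reason is a hash hit carrying a mate score would then stand out immediately in a depth-2 tree.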
MahmoudUthman

Re: how can IID with null allowed cause wrong mate move from

Post by MahmoudUthman »

It's only reproducible by replaying the entire game. If I try the position alone, the engine reports the right score and move.
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: how can IID with null allowed cause wrong mate move from

Post by Dann Corbit »

In case you did not already do this: build a test so that, in debug mode, you recompute the hash key from scratch and compare it with the incrementally updated hash key using assert.
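A sketch of what such a check might look like, assuming Zobrist hashing on a toy board. `Position`, `Zobrist`, `computeKey`, and `makeMove` are illustrative names, and castling, en passant, and promotion are deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <random>

// Toy board used only for this sketch: 0 = empty, 1..12 = piece codes.
struct Position {
    std::array<int, 64> board{};
    bool whiteToMove = true;
    std::uint64_t key = 0;  // incrementally updated hash key
};

// Random numbers for every (piece, square) pair plus one for the side to move.
struct Zobrist {
    std::uint64_t piece[13][64];
    std::uint64_t side;
    Zobrist() {
        std::mt19937_64 rng(12345);  // fixed seed, arbitrary choice
        for (auto& row : piece)
            for (auto& value : row) value = rng();
        side = rng();
    }
};

inline const Zobrist& zobrist() {
    static const Zobrist z;
    return z;
}

// Recompute the key from scratch (slow, intended for debug builds only).
inline std::uint64_t computeKey(const Position& p) {
    std::uint64_t k = 0;
    for (int sq = 0; sq < 64; ++sq)
        if (p.board[sq] != 0) k ^= zobrist().piece[p.board[sq]][sq];
    if (!p.whiteToMove) k ^= zobrist().side;
    return k;
}

// Make a plain move or capture and update the key incrementally.
inline void makeMove(Position& p, int from, int to) {
    const Zobrist& z = zobrist();
    const int moving = p.board[from];
    const int captured = p.board[to];
    if (captured != 0) p.key ^= z.piece[captured][to];
    p.key ^= z.piece[moving][from] ^ z.piece[moving][to];
    p.key ^= z.side;
    p.board[to] = moving;
    p.board[from] = 0;
    p.whiteToMove = !p.whiteToMove;
    assert(p.key == computeKey(p));  // compiled out when NDEBUG is defined
}
```

The same assert would go into the unmake routine and into the null-move make/unmake, since a null move also changes the side to move.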
MahmoudUthman

Re: how can IID with null allowed cause wrong mate move from

Post by MahmoudUthman »

Dann Corbit wrote:In case you did not already do this: Build a test so that in debug mode you completely rebuild the hash and compare with the incremental hash using assert.
I already do that for all incrementally updated variables.
Ras
Posts: 2488
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: how can IID with null allowed cause wrong mate move from

Post by Ras »

MahmoudUthman wrote:what I don't understand is how can the wrong IID call cause this
I think I have an idea of what might be happening.

1) A null move is tried, which gives Qxg2# with a mate score. This is done before trying the IID with depth==3, so that the null move serves as the refutation move.
2) That becomes the eval for the root position, which is stored in the hash table. This would be the actual bug: since null-move positions are not actual game positions, they don't belong in the hash table. Alternatively, you could have two hash tables, one for each side to move.
3) IID with depth==3 is tried. Since this position is in the hash table (from step 2), the search aborts with score==mate.
4) Since it is mate anyway, just some random legal move is played.

You can verify this by switching off the hash table. If the bug goes away, then I think the root cause is that you have hash table storage active during null-move searches.
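One way to run that experiment without ripping out code is a debug switch on the table itself. The following is only a sketch under assumed names (`TranspositionTable`, `setEnabled`, an always-replace scheme); a real table would also store a bound type and a best move.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TTEntry {
    std::uint64_t key = 0;
    int depth = 0;
    int score = 0;
    bool valid = false;
};

// Hypothetical table with a debug switch; names are illustrative only.
class TranspositionTable {
public:
    explicit TranspositionTable(std::size_t size) : entries_(size) {
        assert(size > 0);
    }

    // With the switch off, the search behaves as if it had no hash table.
    void setEnabled(bool on) { enabled_ = on; }

    void store(std::uint64_t key, int depth, int score) {
        if (!enabled_) return;  // stores become no-ops
        TTEntry& e = entries_[key % entries_.size()];
        e.key = key;
        e.depth = depth;
        e.score = score;
        e.valid = true;
    }

    // Returns nullptr on a miss, or always when the table is switched off.
    const TTEntry* probe(std::uint64_t key) const {
        if (!enabled_) return nullptr;
        const TTEntry& e = entries_[key % entries_.size()];
        return (e.valid && e.key == key) ? &e : nullptr;
    }

private:
    std::vector<TTEntry> entries_;
    bool enabled_ = true;
};
```

If the wrong move disappears with the switch off, that points at the table contents, though it does not by itself say which store put the bad entry there.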
Sven

Re: how can IID with null allowed cause wrong mate move from

Post by Sven »

If I compare the original problem description with your points above, I see nothing that the two have in common.
- The OP wrote that the engine reports +mate in 1 ("white mates"). You write about "black mates" (after null move).
- The null move is never tried at the root since you never try it at a PV node, and the root node is a PV node.
- If the null move fails low (after 1...Qxg2#), then this would not affect the parent node (if there were any), so the resulting score (checkmated) could never become a "best score" anywhere.
- Apart from that, there is in general no problem with storing the result of a null move search in the hash table. That is only done if the null move fails high (which is what you expect most of the time), and then it actually makes sense to store the result as type "lower bound" in the hash table, with the original (not reduced) depth.
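The last two points can be sketched in C++ as follows. `Bound`, `TTEntry`, `ttCutoff`, and `entryAfterNullMove` are hypothetical names, and storing the returned null-move score (rather than beta) is an assumption made for the sketch.

```cpp
#include <cassert>

enum class Bound { Exact, Lower, Upper };

struct TTEntry {
    int depth;
    int score;
    Bound bound;
};

// Returns true and sets `score` if the stored entry permits an immediate cutoff
// for a search of the given depth and window.
inline bool ttCutoff(const TTEntry& e, int depth, int alpha, int beta, int& score) {
    if (e.depth < depth) return false;  // stored search was too shallow
    if (e.bound == Bound::Exact ||
        (e.bound == Bound::Lower && e.score >= beta) ||
        (e.bound == Bound::Upper && e.score <= alpha)) {
        score = e.score;
        return true;
    }
    return false;
}

// What gets stored after a null-move search, following the description above:
// fail high -> lower bound at the original (not reduced) depth,
// fail low  -> nothing is stored, so a "checkmated" score never reaches the table.
inline bool entryAfterNullMove(int nullScore, int beta, int originalDepth, TTEntry& out) {
    if (nullScore < beta) return false;
    out = TTEntry{originalDepth, nullScore, Bound::Lower};
    return true;
}
```

A lower-bound entry can only cut off a later probe whose beta is at or below the stored score, so it cannot be mistaken for an exact mate score.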

I think that the problem is not necessarily related to the hash table. The OP wrote that he can't reproduce the problem by setting up the position and letting the engine search, only by "replaying the game". If "replaying" means just playing all moves of the game on the board, then there is no hash table influence that would not also be present when setting up the position directly. The only difference would be that the engine knows the game history. If, however, "replaying" means playing the game under identical conditions against the same opponent, then the hash table may be involved, but other causes are also possible (in general, any kind of undefined behaviour that occurred during the game prior to that position).
Ras

Re: how can IID with null allowed cause wrong mate move from

Post by Ras »

Sven Schüle wrote:- The OP wrote that the engine reports +mate in 1 ("white mates"). You write about "black mates" (after null move).
Of course, because the basic problem is that the engine reports mate in 1 and plays Bd2. After that, the mate in 1 is Qxg2#.
Sven Schüle wrote:- The null move is never tried at the root since you never try it at a PV node
You can very well try a null move at the root (if not in check) for move ordering. Before running a search at depth==3, you can run a null-move search with depth==2 and use its best reply as the "threat move".

When going for the real depth==3 search, this threat move can be sorted to the top of the list at all nodes with depth-level==2. This gives a considerable number of cut-offs, especially in positions like this one. On average and at depth==3, I've measured something like 40% fewer QS node evaluations when feeding in the threat move from the null-move search at root level.
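The ordering step itself could look roughly like this. It is a sketch that represents moves as plain strings; `promoteThreatMove` is a made-up name, and an engine would use its own move type.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Moves the threat move (if it is in the list) to the front, keeping the
// relative order of all other moves. Returns false if the move is not present.
inline bool promoteThreatMove(std::vector<std::string>& moves, const std::string& threat) {
    auto it = std::find(moves.begin(), moves.end(), threat);
    if (it == moves.end()) return false;
    // Rotate [begin, it] so that *it comes first and the rest shift right by one.
    std::rotate(moves.begin(), it, it + 1);
    return true;
}
```

In the position from this thread, the threat move found by the null-move search would be g4g2, which would then be tried first at the matching nodes.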
Sven Schüle wrote:- so the resulting score (checkmated) could never become a "best score" anywhere.
It shouldn't, but if the result is stored in the hash table AND there is only one hash table regardless of which side is to move, then this could explain why the engine makes the stupid move Bd2.

And the question remains how a null-move search would lead the engine to play Bd2 and get mated on the next ply. The only mate in 1 after Bd2 is Black's mate, and some unlucky combination of hash tables and null moves would at least produce some mate in 1, albeit for Black, of course.
Sven

Re: how can IID with null allowed cause wrong mate move from

Post by Sven »

The topic is different. The OP asked for help finding the cause of his problem, so this is not about "speculation". You can be 99.99% sure that trying the null move at the root node leads nowhere. I also explained why a score resulting from a null move failing *low* can't make it into the hash table. And again, the OP wrote "+mate in 1", not "-mate in 1".
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: how can IID with null allowed cause wrong mate move from

Post by matthewlai »

Ras wrote:
MahmoudUthman wrote:what I don't understand is how can the wrong IID call cause this
I think I have an idea of what might be happening.

1) A null move is tried, which gives Qxg2# with a mate score. This is done before trying the IID with depth==3, so that the null move serves as the refutation move.
2) That becomes the eval for the root position, which is stored in the hash table. This would be the actual bug: since null-move positions are not actual game positions, they don't belong in the hash table. Alternatively, you could have two hash tables, one for each side to move.
3) IID with depth==3 is tried. Since this position is in the hash table (from step 2), the search aborts with score==mate.
4) Since it is mate anyway, just some random legal move is played.

You can verify this by switching off the hash table. If the bug goes away, then I think the root cause is that you have hash table storage active during null-move searches.
There's really no reason why the hash table can't be used during a null-move search, as long as the side to move is part of the hash key (and it should be anyway, since it's possible to reach the same position with different sides to move even without a null move). This is equivalent to using two hash tables, but without the inefficient memory usage.
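As a small illustration, assuming Zobrist-style keys: the constant `kSideToMoveKey` and the function `keyAfterNullMove` below are made up for this sketch, and a real engine would use a random number from its own Zobrist table.

```cpp
#include <cassert>
#include <cstdint>

// Arbitrary non-zero constant standing in for the random side-to-move key.
constexpr std::uint64_t kSideToMoveKey = 0x9E3779B97F4A7C15ULL;

// Key of the position reached by a null move: same pieces, other side to move.
// (An engine that hashes the en passant square would also clear it here.)
inline std::uint64_t keyAfterNullMove(std::uint64_t key) {
    return key ^ kSideToMoveKey;
}
```

Because the two keys differ, a probe for the real position cannot match an entry stored for the null-move position, provided the full key is compared on each probe.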
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.