Transposition table bug - Points to incorrect alpha-beta?

hgm · Post by **hgm** » Fri Aug 01, 2014 10:02 am

The following method for debugging engines has proved invaluable to me:

1) Keep track of the path to the current node in a globally accessible stack of moves in the current path (i.e. push the move in MakeMove() and MakeNull(), and pull it in UnMake()).

2) This allows you to prefix any print statements for debugging to only be active in nodes that you specify, by prefixing them with an if(PATH), where PATH is some #defined complex expression involving the moves stack and current ply level. By #defining PATH as 0, the optimizer would completely remove all debugging code.

3) Print ply level, depth remaining, IID depth remaining (if applicable) with any debug info you print, so that you can easily see where it comes from.

4) Print the move, the score it received, the best score so far, alpha and beta at the point where you return from the recursion (e.g. immediately after UnMake())

This allows you to see the the list of searched moves in any node you want, together with the score they receive. You the start at the root (#define PATH plyLevel==1) for a position where it picks an obviously wrong move, to see if the problem is caused by a too-high score of thad bad move, or a too-low score of the good move that it should have chosen instead. You then follow the branch with the faulty score, by including it in the PATH test (#define PATH plyLevel == 2 && path[1] == POORMOVE). This way you can zoom in on the source of the faulty evaluation.

In the presence of hashing, your zooming in will usually end in a node that obtained the faulty score from the hash table. To debug that you will have to print out information (still controlled by if(PATH)) at the point in your code whare you make hash cutoffs:

5) print the hash key and other hash-entry content whenever you get a hash hit.

6) add a print in StoreHash wkich is NOT subject to the usual if(PATH), but triggers on a given hash key, and prints the path to the current node, as well as all info it stores in the entry.

When your zooming in on the error ended in a hash hit, you then know the entry where it was stored from (5), and you can run again after adapting the test for (6) to see where the hashed value came from. You can then set the path to the node responsible for storing the offending hashed value, and continue zooming in to see where the offending score came from.

If you have a reproducible error, this always works, although it is a bit of a pain when the error occurs only at large depth.

rob · Post by **rob** » Fri Aug 01, 2014 10:39 pm

Yes, these all would be very helpful (maybe even requisite) for fixing a transposition table bug.

I've done all this, plus I've done 3 more steps that have greatly simplified the task of getting this working (I think).

The first thing I do is I use position 837 from the ECM test suite. This case fails for me at a depth of 4, and the number of hash table entries at the end of alpha-beta+TT = 1,547. That's a very small number of hash table entries.
Position 837 is 6nk/pn2qr1r/1pbp1p1p/2p1pPpN/P1P1P1PP/2PP3R/7Q/2BBK2R w K - 0 1;

The next thing is I log every significant action taken by, e.g., negamax_ab_cpv (negamax with alpha beta plus capture the principal variation) to disk. I needed this some time ago for debugging other problems, e.g., engine crash - what do I need to do to reproduce the input that caused the crash. Similarly, I log every significant action taken by the alpha beta plus TT implementation. I set this up so that alpha beta + capture principal variation is as similar to alpha beta + TT as possible.

That allows me to drop the output of both into a diff application and it automatically identifies the first point at which the execution traces differ.

What I find is that the cause of the problem is that at some point, the hash table is consulted to determine the score of a position. The hash table says the score is, e.g., -1.0 when the actual score is 1.0. This permanently damages the execution (all follow-on results are wrong) and finally the function returns a silly best move. Usually the returned move is among the worst possible moves, e.g., capture pawn defended by a pawn with a queen. Then I use your method of tracing backwards in the execution (in my case, just search in a text editor backwards for the matching hash value) to find whether the hash table entry is set properly or used improperly. The answer (hopefully I am thinking about this properly) is that the hash table entry has the correct value at set time (insertion into the hash table). Later, when the hash table entry is retrieved for reuse, the depth of reuse does not match the depth of the hash table entry. It's off-by-one. So the value is negative in the bad execution trace. Eventually this leads to a large fire and a choice among the worst possible moves is returned as the best possible move for the input position.

If you like, I can provide the execution traces and the exact line where the executions differ.

Topic shift: Might you be able to tell me more about why my formulation of return value for checkmate seems wrong?

I use return -(INFINITY - (max_depth - depthleft)); because that way if I am five moves away from checkmate, the score of the position is -999,995, if I am four moves away from checkmate, the score of the position is -999,996, if I am three moves away from checkmate, the score is -999,997. INFINITY is always #define'd to -1000000, max_depth is always constant and equal to, e.g., 4 if the engine will be searching to a deepest depth of 4. Depth left starts at 4, then becomes 3, then becomes 2, finally ends at zero. At zero, my alpha beta and alpha beta+TT code calls evaluate(position), where position is continuously updated with a makemove(…) type routine. I don't use make/unmake as this is just a speed optimization and I don't use incremental Zobrist hashing anyway, I calculate the hash value by iterating over the entire board. Makemove(…) writes the output to a local tmp variable and it's copy-make. When done, the tmp variable just falls out of scope and it's gone. All of this is to simplify the situation, at least for now. I can speed things up later once correctness is finished.

This seems to me to produce exactly the effect you are stating is required, that the fastest path to checkmate will receive the best (most highly negative) score.

Rob

bob · Post by **bob** » Fri Aug 01, 2014 10:49 pm

hgm wrote:The following method for debugging engines has proved invaluable to me:

1) Keep track of the path to the current node in a globally accessible stack of moves in the current path (i.e. push the move in MakeMove() and MakeNull(), and pull it in UnMake()).

2) This allows you to prefix any print statements for debugging to only be active in nodes that you specify, by prefixing them with an if(PATH), where PATH is some #defined complex expression involving the moves stack and current ply level. By #defining PATH as 0, the optimizer would completely remove all debugging code.

3) Print ply level, depth remaining, IID depth remaining (if applicable) with any debug info you print, so that you can easily see where it comes from.

4) Print the move, the score it received, the best score so far, alpha and beta at the point where you return from the recursion (e.g. immediately after UnMake())

This allows you to see the the list of searched moves in any node you want, together with the score they receive. You the start at the root (#define PATH plyLevel==1) for a position where it picks an obviously wrong move, to see if the problem is caused by a too-high score of thad bad move, or a too-low score of the good move that it should have chosen instead. You then follow the branch with the faulty score, by including it in the PATH test (#define PATH plyLevel == 2 && path[1] == POORMOVE). This way you can zoom in on the source of the faulty evaluation.

In the presence of hashing, your zooming in will usually end in a node that obtained the faulty score from the hash table. To debug that you will have to print out information (still controlled by if(PATH)) at the point in your code whare you make hash cutoffs:

5) print the hash key and other hash-entry content whenever you get a hash hit.

6) add a print in StoreHash wkich is NOT subject to the usual if(PATH), but triggers on a given hash key, and prints the path to the current node, as well as all info it stores in the entry.

When your zooming in on the error ended in a hash hit, you then know the entry where it was stored from (5), and you can run again after adapting the test for (6) to see where the hashed value came from. You can then set the path to the node responsible for storing the offending hashed value, and continue zooming in to see where the offending score came from.

If you have a reproducible error, this always works, although it is a bit of a pain when the error occurs only at large depth.

There is another trick I use regularly, a monstrous sanity-check. I call this at several points when -DDEBUG is used, it compares all of the chess info for sanity. For example, it recomputes the hash signature from scratch and verifies that it matches the currently active signature. It compares bit boards for sanity (no bit can be set more than once in all the bit boards). Anything computed incrementally (piece counts, material balance, etc) are also checked. Slows things down but certainly pinpoints the specific node / move that breaks something.

rob · Post by **rob** » Sat Aug 02, 2014 1:25 am

I put my execution traces online.

alpha beta
alpha beta+TT

The execution traces first differ on line 52458 of alpha beta+TT.
This matches up to line 60851 of alpha beta.

Or just search for "hashval = 10276814561371569928". The second appearance of this hashval is where it is retrieved from the hash table.

Execution starts failing there due to the value being negated when it should not be.

Rob

bob · Post by **bob** » Sat Aug 02, 2014 5:12 pm

rob wrote:I put my execution traces online.

alpha beta
alpha beta+TT

The execution traces first differ on line 52458 of alpha beta+TT.
This matches up to line 60851 of alpha beta.

Or just search for "hashval = 10276814561371569928". The second appearance of this hashval is where it is retrieved from the hash table.

Execution starts failing there due to the value being negated when it should not be.

Rob

You DID include white-to-move in the hash signature, correct? You can reach the same position with either side to move, and thanks to negamax, that would nicely reverse the sign if your hash signature does not include side-to-move...

rob · Post by **rob** » Sat Aug 02, 2014 10:05 pm

Yes.

Actually, I have implemented with #ifdef … #endif two different approaches, one is XOR in the Zobrist random number for side to move only if black is on move (this should be sufficient). Or I can XOR in a different random number for black on move and white on move. Both implementations operate identically, that is, both reproduce the TT bug.

Rob

rob · Post by **rob** » Sat Aug 02, 2014 10:39 pm

I put a copy of my negamax ab + TT implementation online, with just a bit of simplification, removal of some unnecessary details (like how draw by 3-fold repetition is detected).

It's essentially a clone of Tony Marsland's recipe given in Figure 7 of his paper "A Review of Game Tree Pruning".

negamax_ab_TT

Rob

rob · Post by **rob** » Sat Aug 02, 2014 10:48 pm

Sorry, there are the following modifications to Tony Marsland's recipe:

1. I don't attempt to retrieve and use the hash table move before move generation. I just drop into checking all possible moves after generating all possible moves. This should not be a problem.

2. If I get a beta cutoff, I add local_move_record to the hash table entry as the best move, whereas if I don't get a beta cutoff, I add local_move_record[moveindex] to the hash table entry as the best move. Since I don't use the best move in the hash table, it doesn't matter if one or both of these is wrong.

Rob

hgm · Post by **hgm** » Sat Aug 02, 2014 11:00 pm

rob wrote:Actually, I have implemented with #ifdef … #endif two different approaches, one is XOR in the Zobrist random number for side to move only if black is on move (this should be sufficient). Or I can XOR in a different random number for black on move and white on move. Both implementations operate identically, that is, both reproduce the TT bug.

When you differentially update the key there is an even more efficient method, where you don't have to XOR in an stm key at all. The trick is that you XOR all keys in the Zobrist table, including those for empty squares, with the stm key. Then the regular key update

hashKey ^= Zobrist[departurePiece][from] ^ Zobrist[arrivalPiece][to] ^ Zobrist[victim][to];

will automatically XOR the stm key into the hashKey, as there is an odd number of Zobrist keys involved.

Of course when the stm key and all other keys are randomly chosen, XORing the stm key with the piece keys doesn't make the piece keys any less random, so you might as well not do it. The only thing then is that in the initialization you fill the Zobrist table for the empty square (conventionally taken as 0) with what you otherwise would have used for stm key. Only for the null move you would then have to explicitly XOR with this stm key (as only modification).

op12no2 · Post by **op12no2** » Sat Aug 02, 2014 11:00 pm

You don't seem to overtly unmake makemove() - is that done in makemove() itself, for the previous move?

Transposition table bug - Points to incorrect alpha-beta?

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta