Transposition table bug - Points to incorrect alpha-beta?

rob · Post by **rob** » Thu Jul 31, 2014 5:12 am

Hi,

I have a chess engine implementation that I've put together by reading through the many research papers published post 1950's to today. I've for the most part successfully implemented move-ordering methods such as killers, history heuristic, sorting based on lva-mvv, and so on.

I have arrived at a point where a transposition table implementation will be very beneficial and have implemented a nearly perfect copy of a version given in a paper by Tony Marsland, "A Review of Game Tree Pruning", Figure 7. After implementation, I tested using several of the well-known test suites, ECM and a test set I have typed in from Bruce Pandolfini, "More Chessercizes, Checkmate!".

I was quite puzzled by the results of testing. The transposition table version works perfectly about 99% of the time. Very rarely it does fail however, and in those cases the playing ability of the engine flips so that rather than playing a sensible move it plays apparently a choice among the worst possible moves. For example, from Pandolfini's "More Chessercizes, Checkmate!", position 81:

Position 81

Search Test case

r1b3kr/pppp1p1p/5n1B/3PR3/4n3/2P2N2/P4PPP/R5K1 w - - 0 1

Chessboard:
8 r b k r
7 p p p p p p
6 n B
5 P R
4 n
3 P N
2 P P P P
1 R K
a b c d e f g h

side to move = white
half move clock: 0
full move counter: 1

White mates in 3 moves.
Now constructing mate in 5 total moves.
best_move_nabcpv called with depth = 5
line.cmove = 5
score: 999995.000
Final pv:
Move: WKnight f3 -> d2
Move: BPawn a7 -> a5
Move: WKnight d2 -> e4
Move: BPawn a5 -> a4
Move: WKnight e4 -> f6
Number of evaluations: 198,783
Best move: WKnight f3 -> d2
Score: 999995.000
Elapsed time: 2.828 seconds

best_move_nab_tt004 called with depth = 5
Board hash value = 8499245688989433999
Move: WKnight f3 -> d2
Move: BRook a8 -> b8
Move: WKnight d2 -> e4
Move: BKnight f6 -> d5
Move: WRook e5 -> d5
Number of evaluations: 204,832
Score: 2.000
Elapsed time: 2.927 seconds

Hash table entries = 11,091

best_move_nabcpv is just negamax alpha beta and further, capture the principal variation. best_move_nab_tt004 is just negamax alpha beta with transposition tables implemented according to Marsland's recipe (Figure 7).

The evaluation function is a very simple queen=9, rook=5, bishop/knight=3, pawn=1.

The king and checkmate are set to 1,000,000.

Notice how the transposition table version runs off the rails right at the very last move, preferring to move the white rook to d5 (useless) rather than playing the white knight to f6, checkmate.

I believe I've debugged this all the way back to a flaw in my alpha beta code. In other words, even though my alpha beta code seems to work properly every time, it's actually incorrect.

I wonder if I might ask 4 questions.

Briefly, a bit more context:

- When I detect checkmate, I always return -1,000,000
Well, actually, -(1,000,000 - (max depth - depth remaining))
So this will always be a negative number
depth remaining goes to zero at the leaves of my search (the horizon nodes)

- When I evaluate a chess board, I evaluate score = white pieces - black pieces.
If side to move = white, return the above score
If side to move = black, return -score.

I think the glaring inconsistency between the above two scoring mechanisms is the source of my problems.

My questions are:

In your chess engine, are you returning a side relative checkmate score?
E.g., +1,000,000 if white checkmates and -1,000,000 if black checkmates?
Or are you always returning a negative number, e.g., -1,000,000?

In your chess engine, are you returning score = white pieces - black pieces
and then negate if the side to move is black? (Assuming of course this very trivial evaluation function)

Have you encountered transposition tables working about 99% of the time, and trace through the logic (yes, many thousands of cases) for a simple case and found that the problem was one of the alpha beta values that works its way backwards was wrong in the sign only (magnitude of the value is correct)?

Any idea why alpha beta would seem to be working correctly despite this flaw? I would think that the scoring inconsistency above would cause the alpha beta evaluation to be incorrect most of the time. But I'm not seeing this. I believe what I'm seeing is correct operation 100% of the time. This puzzles me.

Thank you!

Rob

op12no2 · Post by **op12no2** » Thu Jul 31, 2014 9:14 am

Hi Rob,

I think the glaring inconsistency between the above two scoring mechanisms is the source of my problems.

I'm at a similar stage of my chess engine adventure to you so I am by no means totally confident of these answers but I have been there recently so...

Your scoring is not inconsistent, it's OK. In negamax scores are relative to the side on the move, so checkmate is always -ve and in eval() if you compute more +ve == better for white and more -ve == better for black (as you do) then you need to return -score if black is on the move (as you do). Then everything is relative to the side on the move.

I return (-20000 + ply) for checkmate where ply is distance from the root and essentially you are doing the same, but I also tweak that value when storing/retrieving from the TT so that it's correct when you get a hash hit and ply is different.

However I commented out that code and it still finds Nd2 d6 Nxe4 d6xe5 Nxf6 so maybe that's a red herring in this case - it is something you need to do though (if you don't). There are lots of threads about it floating around.

My code (single file) is here if you are interested: http://op12no2.me/toys/lozza/1.2/lozza.js

op12no2 · Post by **op12no2** » Thu Jul 31, 2014 9:36 am

PS: Have you tried generating the hash from scratch in eval() and comparing it with your running hash? I do this every now and again and sometimes find I've broken the running hash when messing around with makemove().

kinderchocolate · Post by **kinderchocolate** » Thu Jul 31, 2014 10:06 am

I think you should disable the TT to convince yourself your bug has something to do with the TT. Your bug might be something else.

rob · Post by **rob** » Thu Jul 31, 2014 1:57 pm

I always generate the hash from scratch, not incrementally as suggested by Zobrist. This will be slower, but I'm having problems with correctness at the moment. I can speed it up later.

rob · Post by **rob** » Thu Jul 31, 2014 2:31 pm

I have a function called best_move(…) that accepts as parameters a description of the board, a depth to search to, an indicator of which side is on move, and a pointer to a data structure which contains the best move to make given the state of the other passed in parameters.

What best_move(…) does is call the following functions:

best_move_ncpv(…) This is a minimax implementation
best_move_nab(…) This is a "dumb" alpha-beta implementation which determines the score for each possible first move from the input position.
best_move_nabcpv(…) This is an alpha-beta implementation with principal variation capture
best_move_nabcpv_hh(…) This is an alpha-beta implementation with move sorting according to the history heuristic
best_move_nabcpv_id(…) This is an alpha-beta implementation with iterative deepening and sorting the principal variation of the previous iteration as the first move to be considered in the present iteration.

… and on and on …

So I am running all possible variations every time best_move(…) is called. Then, after that, I check the returned move score and returned best move from each variation and check that they are consistent (could be different moves that produce the same final score, e.g., many ways to checkmate in 3 moves).

As long as I am executing the variations that don't use transposition tables, all is well. That is to say, the final score is the same regardless of which of the above variants is called. But when I add in an implementation that uses transposition tables (such as pvs or even simple alpha beta+TT) , the value it produces as the final negamax alpha beta score of the input position is always wrong. Further, when I look at the captured principal variation, it's filled with silly moves like queen takes a pawn that is defended by a pawn and no further compensation can occur.

This suggests (at least to me) that the problem is with the TT table implementation. But my implementation is in fact Tony Marsland's recipe described in a paper he authored ("A Review of Game-Tree Pruning"). It would be rare for a paper to contain serious errors in its figures. Therefore I think the fault must be mine, especially since it seems that I'm scoring the boards inconsistently. Perhaps this inconsistent scoring is gumming up the TT table with bad values.

Rob

bob · Post by **bob** » Thu Jul 31, 2014 5:56 pm

rob wrote:
My questions are:

In your chess engine, are you returning a side relative checkmate score?
E.g., +1,000,000 if white checkmates and -1,000,000 if black checkmates?
Or are you always returning a negative number, e.g., -1,000,000?

I assume you are using negamax? If so, -=bad for current side, +=good, so if you are checkmated, - is correct. But notice that when you get back to the previous ply, the sign is reversed (recursive call is something like val = -Search(depth, wtm, etc)) so the score is + which is good for that side since he mated the opponent. I simply return -(MATE - ply) where MATE is largest integer score I want to ever produce. I am not sure why you are factoring in depth however, that seems wrong.

In your chess engine, are you returning score = white pieces - black pieces
and then negate if the side to move is black? (Assuming of course this very trivial evaluation function)

I do that for simplicity, I write the eval where +=good for white, and then at the end, if it is black to move, I return that value with sign reversed.

Have you encountered transposition tables working about 99% of the time, and trace through the logic (yes, many thousands of cases) for a simple case and found that the problem was one of the alpha beta values that works its way backwards was wrong in the sign only (magnitude of the value is correct)?

Any idea why alpha beta would seem to be working correctly despite this flaw? I would think that the scoring inconsistency above would cause the alpha beta evaluation to be incorrect most of the time. But I'm not seeing this. I believe what I'm seeing is correct operation 100% of the time. This puzzles me.

Thank you!

Rob

Hard to guess what might be wrong. But one thing is for certain, you want a correct alpha/beta search BEFORE you start debugging things that will affect it (IE hashing). Debugging two things at once is way more than twice as hard as debugging each thing separately.

rob · Post by **rob** » Fri Aug 01, 2014 6:26 am

I return (-20000 + ply) for checkmate where ply is distance from the root and essentially you are doing the same, but I also tweak that value when storing/retrieving from the TT so that it's correct when you get a hash hit and ply is different.

However I commented out that code and it still finds Nd2 d6 Nxe4 d6xe5 Nxf6 so maybe that's a red herring in this case - it is something you need to do though (if you don't). There are lots of threads about it floating around.

Can you point me to one of the threads about this? It's not mentioned in any of the papers which provide sample TT implementations.

Rob

op12no2 · Post by **op12no2** » Fri Aug 01, 2014 9:39 am

rob wrote: Can you point me to one of the threads about this? It's not mentioned in any of the papers which provide sample TT implementations.

Sure, I'll have a look for some. NB: I think Robert's point about using depth is an important one. We need to focus on the shortest mate, so what seems important to me is distance from root ("ply" in my code) rather than "depth" - distance to horizon, which could be almost anything.

Ditto for killers, depth is irrelevant, ply is needed for the index. You kinda do that with maxply - depth, but if the same position is reached at different depths - and when you add extensions and reductions - that will increasingly not evaluate to ply (distance from root), so your engine may start chasing longer mates; I think.

I'll get back to you with some links...

op12no2 · Post by **op12no2** » Fri Aug 01, 2014 9:54 am

Rob,

Look for "mate score" in this CPW entry:-

https://chessprogramming.wikispaces.com/Checkmate

CPW was my primary source to write my engine (thanks guys), as well as posts here and elsewhere.

The idea is that you need a mate score based on distance from root (ply) so you can focus on the shortest one.

So one way is to return the mate score as is, but save it adjusted from the current ply at the point of detection and then readjust to the (possibly different) current ply when grabbing it from the TT to get a score to return.

Again - I think - it certainly seems logical to me - but I'm no expert at this; just been there recently...

op12no2 wrote: but if the same position is reached at different depths

I was sloppy - "but if the same position is reached at different ply".

Transposition table bug - Points to incorrect alpha-beta?

Transposition table bug - Points to incorrect alpha-beta?

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta

Re: Transposition table bug - Points to incorrect alpha-beta