Page 1 of 3

Key collision handling

Posted: Fri Oct 21, 2011 6:18 pm
by Zlaire
I suspect I have a problem with key collisions in my transposition table. Annoying bugger to track down since it happens so seldom (once per 1000 games or so). Currently I'm only using 32-bit keys which I guess makes the problem far worse.

So far I've only seen it in positions where the king was captured which causes an out bounds exception in my makeMove-code.

Does is sound reasonable to get a key collision every 1000 games with 32 bits or should I be looking for something else?

And if it is a key collision how would you suggest an engine should handle it? (I'm thinking just catching the exception and do a research on the position, better then crashing and better than adding "validateHashMove")

Upping the bits to 64 should be done in either case, but I still don't want the engine to crash when it once in a blue moon gets a key collision.

(I'm still trying to corner the bug to be sure what it is, fourth attempt now after having missed it with some stupidly placed print traces)

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:28 pm
by mcostalba
Zlaire wrote:I suspect I have a problem with key collisions in my transposition table. Annoying bugger to track down since it happens so seldom (once per 1000 games or so). Currently I'm only using 32-bit keys which I guess makes the problem far worse.

So far I've only seen it in positions where the king was captured which causes an out bounds exception in my makeMove-code.

Does is sound reasonable to get a key collision every 1000 games with 32 bits or should I be looking for something else?

And if it is a key collision how would you suggest an engine should handle it? (I'm thinking just catching the exception and do a research on the position, better then crashing and better than adding "validateHashMove")

Upping the bits to 64 should be done in either case, but I still don't want the engine to crash when it once in a blue moon gets a key collision.

(I'm still trying to corner the bug to be sure what it is, fourth attempt now after having missed it with some stupidly placed print traces)
In Stockfish we validate and check for legality the TT move before to try it. This is especially helpful in SMP where corrupted TT entries are not so rare given that normally the TT table is shared across the threads and is accessed without locking. In Crafty there is a trick (based on XOR-ing the fields) to detect corrupted entries, but this does not saves you from key collisions.

Our scheme guarantees to detect and discard both corrupted TT entries and key collisions both in practical and theoretical terms, so I consider this superior to alternatives ;-)

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:29 pm
by bob
Zlaire wrote:I suspect I have a problem with key collisions in my transposition table. Annoying bugger to track down since it happens so seldom (once per 1000 games or so). Currently I'm only using 32-bit keys which I guess makes the problem far worse.

So far I've only seen it in positions where the king was captured which causes an out bounds exception in my makeMove-code.

Does is sound reasonable to get a key collision every 1000 games with 32 bits or should I be looking for something else?

And if it is a key collision how would you suggest an engine should handle it? (I'm thinking just catching the exception and do a research on the position, better then crashing and better than adding "validateHashMove")

Upping the bits to 64 should be done in either case, but I still don't want the engine to crash when it once in a blue moon gets a key collision.

(I'm still trying to corner the bug to be sure what it is, fourth attempt now after having missed it with some stupidly placed print traces)
Very possible. There are a couple of ideas to try:

(1) 64 bits is much safer, but obviously it can't be 100% safe.

(2) fix your code so that it won't crash on an illegal move.

(3) Don't play an illegal hash move. I do this by validating (in a simple way) that the move from the hash table is illegal, and when this happens, I just ignore the hash move completely. You could ignore the score, but testing (paper I wrote with Cozzie a couple of years back) shows that errors that infrequent will not affect the tree in a measurable way.

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:34 pm
by Zlaire
mcostalba wrote:In Stockfish we validate and check for legality the TT move before to try it.
Isn't that costly for something that happens very rarely?

I guess my approach of researching on exception fails on the fact that not every illegal move is guaranteed to trigger an exception, so some (most) of the time the engine gets the move completely messed up.

But still, if you get it once every 1000 games (which seems like a lot, even with smp), it means you unnecessarily slow the engine down the other 999.

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:37 pm
by mcostalba
Zlaire wrote:
mcostalba wrote:In Stockfish we validate and check for legality the TT move before to try it.
Isn't that costly for something that happens very rarely?
The TT move is something that happens very rarely ;-) so the impact is minimal, moreover the implementation of Position::is_pseudo_legal() is quite optimized to be fast. Profiling shows that impact is barely measurable.

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:47 pm
by Zlaire
bob wrote: (2) fix your code so that it won't crash on an illegal move.
I actually didn't see this as a problem, but rather as a kind of a safety net. If I allow the engine to play illegal moves, who knows what comes out the other end. On the other hand, currently it's a side effect and only happens on certain illegal moves.

This needs to be fixed for sure.
mcostalba wrote: The TT move is something that happens very rarely so the impact is minimal
Interesting, I always assumed this was a very common occurrence, especially as I get my IID move from the TT. But maybe that's part of the problem?

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:52 pm
by mcostalba
Zlaire wrote: Interesting, I always assumed this was a very common occurrence, especially as I get my IID move from the TT. But maybe that's part of the problem?
How many moves per node on average you try ? 5, 10 or 15 ? Among this at maximum one is the TT move and you don't always have a TT move especially at low depths where the nodes are the most. Moreover in ALL nodes you normally don't have a TT move.

I also second Bob regarding fixing your code to avoid illegal moves: it is not a safety net, it is a dangerous way to hide possible bugs, and because these kind of bugs normally are already very subtle per se, I would say you don't need to make them even more subtle avoiding filtering out illegal moves.

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:55 pm
by bob
Zlaire wrote:
bob wrote: (2) fix your code so that it won't crash on an illegal move.
I actually didn't see this as a problem, but rather as a kind of a safety net. If I allow the engine to play illegal moves, who knows what comes out the other end. On the other hand, currently it's a side effect and only happens on certain illegal moves.

This needs to be fixed for sure.
mcostalba wrote: The TT move is something that happens very rarely so the impact is minimal
Interesting, I always assumed this was a very common occurrence, especially as I get my IID move from the TT. But maybe that's part of the problem?
It IS a very common thing. Not quite sure what he means by "very rarely". Every position of type LOWER has a best move. Every position of type EXACT" has a best move. And in Crafty, every position of type UPPER has a best move also, as I store the first move searched if I end up failing low, as that at least gives me one free move to search the next time I hit this position, without having to generate moves...

For me, > 25% of positions have a hash move. For the normal program, I'd expect it to be over 10% for LOWER position hits.

I can measure it to get exact numbers if you are curious...

Re: Key collision handling

Posted: Fri Oct 21, 2011 6:58 pm
by bob
mcostalba wrote:
Zlaire wrote:I suspect I have a problem with key collisions in my transposition table. Annoying bugger to track down since it happens so seldom (once per 1000 games or so). Currently I'm only using 32-bit keys which I guess makes the problem far worse.

So far I've only seen it in positions where the king was captured which causes an out bounds exception in my makeMove-code.

Does is sound reasonable to get a key collision every 1000 games with 32 bits or should I be looking for something else?

And if it is a key collision how would you suggest an engine should handle it? (I'm thinking just catching the exception and do a research on the position, better then crashing and better than adding "validateHashMove")

Upping the bits to 64 should be done in either case, but I still don't want the engine to crash when it once in a blue moon gets a key collision.

(I'm still trying to corner the bug to be sure what it is, fourth attempt now after having missed it with some stupidly placed print traces)
In Stockfish we validate and check for legality the TT move before to try it. This is especially helpful in SMP where corrupted TT entries are not so rare given that normally the TT table is shared across the threads and is accessed without locking. In Crafty there is a trick (based on XOR-ing the fields) to detect corrupted entries, but this does not saves you from key collisions.

Our scheme guarantees to detect and discard both corrupted TT entries and key collisions both in practical and theoretical terms, so I consider this superior to alternatives ;-)
I do BOTH in Crafty. Xor for reliability, but test for legality to avoid crashing. The XOR is really useful for pawn hashing since there are so many fields in the structure, and bogus ones can cause grave problems, such as infinite loops and such...

Re: Key collision handling

Posted: Fri Oct 21, 2011 7:02 pm
by mcostalba
bob wrote: I do BOTH in Crafty. Xor for reliability, but test for legality to avoid crashing. The XOR is really useful for pawn hashing since there are so many fields in the structure, and bogus ones can cause grave problems, such as infinite loops and such...
Yes, that's the reason for pawn (and material) keys we have per-thread tables. The hit rate is so high that a single shared table would yield very little added benefit if any, and we avoid a lot of troubles with possible corrupted data.