What happens during a hash collision in say Stockfish and Cr

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Isaac
Posts: 265
Joined: Sat Feb 22, 2014 8:37 pm

What happens during a hash collision in say Stockfish and Cr

Post by Isaac »

Hello people,
I'm wondering what happens if there's a hash collision in say Stockfish or Crafty or Sjaak.
I haven't read their codes so I'm just guessing that the info stored in the buckets is position (FEN?) + eval (and depth?).
When 2 such informations share the same bucket, there is a hash collision. What happens? Do the engine retrieve the wrong score 100% of the time? 50% of the time? Or can the collision be resolved at the expense of computational power in which case the collision just slows down the engine but doesn't give wrong score for an analyzed position?
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: What happens during a hash collision in say Stockfish an

Post by syzygy »

Instead of FENs, engines use Zobrist hashing to map positions to 64-bit numbers and use those numbers as a key into the hash table. Part of the key is used to determine the hash entry (or bucket) for a position.

If two positions map to the same hash entry, there is still a check whether the remaining bits of the Zobrist key are the same.

Inevitably, there will be positions that map to the same 64-bit Zobrist key. When that happens, the wrong score will be used. Usually not a big deal, because this only happens in a few nodes in a huge huge search tree and will almost never affect the outcome of the search.

SF stores relatively few "check" bits in the hash entry, so will have more collisions (but more entries per GB of Hash). That has not been shown to negatively affect performance, yet.
Isaac
Posts: 265
Joined: Sat Feb 22, 2014 8:37 pm

Re: What happens during a hash collision in say Stockfish an

Post by Isaac »

Thanks for the answer, Ronald.
If I understand well, it's because of the type of hashing (Zobrist) that a collision cannot be resolved, fine. But why would the wrong score always be returned instead of the correct one? Is it because it is more recent and somehow overwrites or I don't know what else, the old and correct entry?
User avatar
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What happens during a hash collision in say Stockfish an

Post by hgm »

Note that there are two kind of collisions: key collisions and index collisions. The latter are usually referred to as replacements. Normally one uses a hash key of 64 bits, and only some 25-30 of this are used to compute the hash index, i.e. the location (or small set of locations known as 'bucket') in the hash table where the position will be stored. If two positions map to the same index this can usually be resolved, because the part of the hash key (or at least 32 bits of it) that is not used as index is stored in the hash entry (the 'signature'). If this happens the new entry just overwrites the old one (or the one you are least eager to keep in the bucket), as you obviously have not enough space to keep all of them, and recently visited nodes are more likely to be visited again soon than older nodes. If there later is a probe for the older one anyway, it will be recognized as a miss, because none of the positions stored in the bucket now has a matching signature.

Only in case of a key collision (which is excessively rare) the program has no way to know it gets a 'false positive', (as the signature is now also the same), and will use the score as if it is the sought one, while in fact it could be anything. With a 32-bit signature and 4 entries per bucket only one in a billion hash probes will be a key collision.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: What happens during a hash collision in say Stockfish an

Post by syzygy »

Isaac wrote:Thanks for the answer, Ronald.
If I understand well, it's because of the type of hashing (Zobrist) that a collision cannot be resolved, fine. But why would the wrong score always be returned instead of the correct one? Is it because it is more recent and somehow overwrites or I don't know what else, the old and correct entry?
Well, if the values currently stored in a hash entry came from the node currently being searched, then there is no hash collision.

If the values currently stored in a hash entry came from a different node and nevertheless the hashkeys match, we have a collision and there is no reason why the value we find is anyway related to the current node. Because it came from a different node.

Note that we can only talk about the values currently stored in an entry. Once you overwrite an entry with values for a new position, the old values are gone forever. Memory, once overwritten, has no memory. So we are only ever looking at the "most recent" stored values.