Stockfish hash implementation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Stockfish hash implementation

Post by jdart »

Stockfish stores only the high-order 32 bits of the hash code in each hash entry. Some number of bits from the low 32 bits of the hash code will be used as the index to get the hash cluster. But still, if you have less than 4 billion hash entries, you will use less than the full 64 bits in retrieving and matching the hash entry. For example if you have 1048576 hash entries, you are only using 20 + 32 bits = 52 bits. Apparently the assumption is that this is enough to prevent hash collisions. I don't know if this is valid or not but empirically it seems to work.
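
For illustration, here is a rough C++ sketch of that indexing and signature scheme (the entry layout and names are invented for this sketch, not taken from Stockfish's code):

    #include <cstdint>
    #include <cstddef>

    // Rough sketch (made-up layout, not Stockfish's actual code): the low bits
    // of the 64-bit key select the entry, and only the high 32 bits are stored
    // in the entry as a signature.
    struct TTEntry {
        uint32_t key32;    // high 32 bits of the Zobrist key
        uint16_t move;
        int16_t  value;
        int8_t   depth;
        uint8_t  bound;
    };

    struct TranspositionTable {
        TTEntry*    entries;
        std::size_t count;   // assumed to be a power of two, e.g. 1048576 = 2^20

        TTEntry* probe(uint64_t key) const {
            std::size_t idx = std::size_t(key) & (count - 1);   // low bits = index
            TTEntry* e = &entries[idx];
            // With 2^20 entries, a match implies 20 index bits plus 32 stored
            // bits = 52 bits of the key agree.
            return e->key32 == uint32_t(key >> 32) ? e : nullptr;
        }
    };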

Also Stockfish does not use the "lockless hashing" technique found in Crafty, involving an XOR of the hash code with the value, nor does it use an actual lock. So the assumption also seems to be made that updates to the hash table will be atomic.

--Jon
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish hash implementation

Post by zamar »

jdart wrote:Stockfish stores only the high-order 32 bits of the hash code in each hash entry. Some number of bits from the low 32 bits of the hash code will be used as the index to get the hash cluster. But still, if you have less than 4 billion hash entries, you will use less than the full 64 bits in retrieving and matching the hash entry. For example if you have 1048576 hash entries, you are only using 20 + 32 bits = 52 bits. Apparently the assumption is that this is enough to prevent hash collisions. I don't know if this is valid or not but empirically it seems to work.
The chance of a collision occurring is minimalistic. The chance that a random collision causes serious damage to the search is minimalistic. minimalistic^2 = astronomically small.
Also Stockfish does not use the "lockless hashing" technique found in Crafty, involving an XOR of the hash code with the value, nor does it use an actual lock. So the assumption also seems to be made that updates to the hash table will be atomic.
Absolutely not. We need to carefully validate the data in the fetched entry, so that we don't crash in case two threads write simultaneously. So the strategy is not to avoid corrupted data, but to cope with corrupted data :)
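
As an illustration of that strategy, a rough sketch of validating a hash move before it is used (the helper names are assumed placeholders, not Stockfish's real API):

    #include <cstdint>

    // Sketch only (assumed helper names, not Stockfish's actual code): data
    // read from the shared table is treated as untrusted and is validated
    // against the current position before it is ever searched or played.
    struct Position;                                  // engine position (assumed)
    bool pseudo_legal(const Position&, uint16_t m);   // assumed cheap check
    bool fully_legal(const Position&, uint16_t m);    // assumed strict check

    uint16_t usable_tt_move(const Position& pos, uint16_t ttMove) {
        // A colliding or half-written entry can hand back any 16-bit pattern,
        // so an invalid move is simply discarded instead of crashing MakeMove.
        if (ttMove != 0 && pseudo_legal(pos, ttMove) && fully_legal(pos, ttMove))
            return ttMove;
        return 0;   // treat as "no hash move"
    }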
Joona Kiiski
User avatar
Steve Maughan
Posts: 1221
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: Stockfish hash implementation

Post by Steve Maughan »

If I remember correctly, back around 1998 there was quite a bit of discussion about the necessary size of the hash key. I think the conclusion was that 32 bits are not quite enough: they gave a false positive about every move or so. So by adding an additional 10 to 20 bits you're more than well covered. I guess you'll still need to check the legality of the hash move - but that's good practice anyway.

There'll also be a small gain from minimizing the memory-bandwidth requirements of the hash table.
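
As a rough back-of-envelope (with assumed numbers, not anything measured in this thread), the expected false-match rate scales roughly as probes divided by 2^(effective key bits):

    #include <cstdio>
    #include <cmath>

    // Assumed figures, for illustration only: expected false matches per search
    // is roughly probes * comparisonsPerProbe / 2^effectiveBits.
    int main() {
        const double probes   = 1e8;  // assumed nodes searched for one move
        const double perProbe = 1.0;  // assumed signature comparisons per probe
        const int bitCounts[] = {32, 52, 64};
        for (int bits : bitCounts)
            std::printf("%2d effective bits -> ~%.1e expected false matches per move\n",
                        bits, probes * perProbe / std::pow(2.0, bits));
        return 0;
    }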

Steve
micron
Posts: 155
Joined: Mon Feb 15, 2010 9:33 am
Location: New Zealand

Re: Stockfish hash implementation

Post by micron »

zamar wrote:The chance of a collision occurring is minimalistic. The chance that a random collision causes serious damage to the search is minimalistic. minimalistic^2 = astronomically small.
The worst outcome is that a bad TT move crashes MakeMove(). Without additional safeguards, the risk is only 'minimalistic'.
To get 'astronomically small' or zero, you have to validate the TT move, either crudely or precisely, before playing it.

With a full 64-bit hash, validation seems redundant.
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Stockfish hash implementation

Post by jdart »

1998 was a long time ago (many generations of Moore's Law).

--Jon
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Stockfish hash implementation

Post by mcostalba »

micron wrote: To get 'astronomically small' or zero, you have to validate the TT move, either crudely or precisely, before playing it.
I don't know what you mean by "crudely or precisely", but for us "to validate" has a single meaning.
micron wrote:
With a full 64-bit hash, validation seems redundant.
"Seems" is not enough. Thank you.


To make it short: people claim that the hash key is too short, and then fall short of identifying a reliable solution. Joona is the only one in this thread who has indicated the _only_ and _correct_ solution (namely, validating the move before using it), and he gets blamed for sloppiness by the very same people. :-) :-)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Stockfish hash implementation

Post by bob »

jdart wrote:Stockfish stores only the high-order 32 bits of the hash code in each hash entry. Some number of bits from the low 32 bits of the hash code will be used as the index to get the hash cluster. But still, if you have less than 4 billion hash entries, you will use less than the full 64 bits in retrieving and matching the hash entry. For example if you have 1048576 hash entries, you are only using 20 + 32 bits = 52 bits. Apparently the assumption is that this is enough to prevent hash collisions. I don't know if this is valid or not but empirically it seems to work.

Also Stockfish does not use the "lockless hashing" technique found in Crafty, involving an XOR of the hash code with the value, nor does it use an actual lock. So the assumption also seems to be made that updates to the hash table will be atomic.

--Jon
For the second, I don't think so. I use lockless hashing because I have a pretty simplistic (and very fast) legality checker that can be fooled. And if I make an illegal move, since this is a make/unmake program (Crafty) vs a copy/make, I would "die" and lose. The xor trick solves this simply and reliably and I can't measure any speed penalty at all.
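
For illustration, a rough sketch of that xor trick with an assumed two-word entry layout (not Crafty's actual code):

    #include <cstdint>
    #include <atomic>

    // Sketch only (assumed layout): each entry is two 64-bit words, and the
    // first word is the 64-bit key XORed with the data word.
    struct Entry {
        std::atomic<uint64_t> word1;   // key ^ data
        std::atomic<uint64_t> word2;   // packed move/score/depth/bound
    };

    void store(Entry& e, uint64_t key, uint64_t data) {
        // Each 64-bit store is atomic by itself; the pair as a whole is not.
        e.word1.store(key ^ data, std::memory_order_relaxed);
        e.word2.store(data, std::memory_order_relaxed);
    }

    bool probe(const Entry& e, uint64_t key, uint64_t& data) {
        uint64_t w1 = e.word1.load(std::memory_order_relaxed);
        uint64_t w2 = e.word2.load(std::memory_order_relaxed);
        data = w2;
        // If another thread has overwritten one word but not the other, the
        // XOR no longer reproduces the key and the torn entry is rejected.
        return (w1 ^ w2) == key;
    }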

If I disable the xor stuff, I might see 10-20 illegal hash moves during a game. That means that 10-20 hash probes hit positions whose stored moves were not legal, implying that the score is likely wrong as well. That is not very many in light of the hash collision tests I did with Cozzie a couple of years back. 10-20 positions with illegal moves might translate to 100-200 positions with the wrong information (a broken hash entry) where the move is still legal but the score/bound is wrong. That won't cause any measurable issues based on the collision testing I mentioned, UNLESS that illegal move will crash your engine. Mine would crash, theirs apparently does not, so it really is safe if that is not an issue.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Stockfish hash implementation

Post by Don »

Steve Maughan wrote:If I remember correctly, back around 1998 there was quite a bit of discussion about the necessary size of the hash key. I think the conclusion was that 32 bits are not quite enough: they gave a false positive about every move or so. So by adding an additional 10 to 20 bits you're more than well covered. I guess you'll still need to check the legality of the hash move - but that's good practice anyway.

There'll also be a small gain from minimizing the memory-bandwidth requirements of the hash table.

Steve
The issue for a practical chess engine is not how many collisions you get but how often a collision actually changes the move choice. This consideration could be worth several bits, because it's unlikely that one bad node out of tens of millions will cause you to play a different move at the root.

It would be fun to run a test with various values of N bits for the hash table. I'll bet that you could get away with 30 bits without much of an Elo loss. I don't know that for sure, as I have not tried it since about 15 years ago, when 32 bits was a lot :-)
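
One cheap way to run such an experiment, sketched with assumed names (not taken from any engine): keep the table as it is but mask the stored signature down to N bits, then compare the Elo at several values of N:

    #include <cstdint>

    // Sketch only (assumed scheme): simulate an N-bit signature by masking the
    // 32-bit stored key, so fewer bits take part in the match and more
    // accidental hits occur.
    uint32_t masked_signature(uint64_t key, int signatureBits) {
        uint32_t sig = uint32_t(key >> 32);
        if (signatureBits >= 32)
            return sig;                              // full 32-bit signature
        return sig & ((1u << signatureBits) - 1u);   // keep only the low N bits
    }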