Is 64 bits enough as a hash lenght

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Is 64 bits enough as a hash lenght

Post by Sven »

mathmoi wrote:So the remainings 33 bits will be the ones to cause a hash collision. That means I'll get a collisions every 2^33 probes.
I can't follow these numbers yet.

1) First of all, 64-29=35 :-)

2) With this correction, why do you derive from this number of 35 remaining bits that you get a collision every 2^35 probes?

My understanding is that the collision frequency depends upon the whole hash key length (here 64) but also upon something like the total number of different chess positions. You get a collision if your current node has the same 64 bit hash key as another position already stored in the table. Same index, and also same remaining bits (assuming they are stored, too), but the position is different.

I can't tell exactly how the dependency between these two numbers is, nor whether it is possible at all to estimate the collision frequency based on them. Maybe with 2^128 different chess positions and 64 bit hash key you might get collisions every 2^(128-64) = 2^64 probes, I don't know.

I agree with others that 96 bit hash keys are not necessary. Also I think that a very basic check for pseudo-legality of the hash move could be acceptable if your engine would crash otherwise. You could check that the right colour is on the 'from' square, that no friendly piece is captured, and in case of a pawn move, that it does not move backwards and also not to the 1st (8th) rank without promoting. This might be cheap enough while preserving a minimum of robustness.

Furthermore, count the number of detected collisions (hash move not pseudo-legal) and observe the count over many games. I suspect it will stay 0 for a long time, longer than some minutes only (I guess, some thousands of games).

Sven
User avatar
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

96 bit hashes

Post by sje »

I'd say that the probability of a false positive match when using 96 bits for a signature is rather less than the probability of a general CPU/memory glitch. So using a 96 bit signature might be faster as the move sanity check can be safely skipped.
Dann Corbit
Posts: 12550
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Is 64 bits enough as a hash lenght

Post by Dann Corbit »

jromang wrote:
bob wrote: assuming you have decent Zobrist random numbers...
What are decent numbers ? I blindly use a Mersenne Twister random generator, and have not checked the results ; is there a special thing to do to select good Zobrist keys ?
I seem to recall reading a paper where effort is made in regard to hamming distance.

"Most chess programs use random numbers for their Zobrist keys. However, the average and minimum Hamming Distances have to be maximised, because they represent the relative interdependency of the bit-vectors, or Zobrist keys. Using the math from coding theory and cryptography, there are ways to generate M numbers of length N that satisfy these constraints. These codes are called Bose-Chaudhuri-Hochquenghem (BCH) codes."

T. Warnock and B. Wendorff. Search Tables in Computer Chess. ICCA Journal, 11(1):10–13, 1988.

Possibly helpful:
http://citeseerx.ist.psu.edu/search;jse ... &sort=cite
Dann Corbit
Posts: 12550
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: 96 bit hashes

Post by Dann Corbit »

sje wrote:I'd say that the probability of a false positive match when using 96 bits for a signature is rather less than the probability of a general CPU/memory glitch. So using a 96 bit signature might be faster as the move sanity check can be safely skipped.
What happens if the quality of your hash is not perfect so that you get funnels and collisions at only {say} 60 bits? It will be really hard to know this is a problem because it will be incredibly hard to reproduce but it will be a large enough defect to cause your program to fail once every few dozen games at slow enough time control.
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Is 64 bits enough as a hash lenght

Post by Gerd Isenberg »

jromang wrote:
bob wrote: assuming you have decent Zobrist random numbers...
What are decent numbers ? I blindly use a Mersenne Twister random generator, and have not checked the results ; is there a special thing to do to select good Zobrist keys ?
What about this one? (Just a wild guess)
1. build a vector N (12*64 = 768) zobrist keys zob, f.i. by mersenne twister.
2. a vector of (N*N-N)/2 (294,528) pairs of zobrist keys zob ^ zob[j] with j > i
3. a vector of (N*N+N)/2 (295,296) keys, 1 and 2 concatenated
4. calc minimum and average hamming distance, minHD and avgHD over 3
5. repeat 1..4 with different seed in X runs to maximize minHD with min avgHD.

Some C++ code for minimum and average hamming distance on the fly:

Code: Select all

int minAvgHammingDistance(U64 v[], int n, int &avgHD)
{
   int i, j, hd, sumHD = 0, minHD = 64;
   for &#40;i=0;   i < n-1; i++)
   for &#40;j=i+1; j < n  ; j++)
   &#123;
      hd = popcount&#40;v&#91;i&#93; ^ v&#91;j&#93;); // hamming distance
      minHD  = min&#40;minHD, hd&#41;;
      sumHD += hd;
   &#125;
   n = &#40;n*n - n&#41; / 2;
   avgHD = &#40;sumHD + n/2&#41; / n;
   return minHD;
&#125;
In my white to move only color flipper I have black keys dependent on the 384 whites:

Code: Select all

zob&#91;blackPiece&#93;&#91;sq&#93; == bswap&#40;zob&#91;whitePiece&#93;&#91;sq^56&#93;)
I wonder whether this byte swapped dependency makes the effective number of keybits only 32? On the other hand I don't need to consider color to move since it is always whites turn.

Code: Select all

f.i. white pawn on c2 is 0x8040201008040201,
then black pawn on c7 is 0x0102040810204080,
and both x-ored together 0x8142241818244281 
with always byte 0==7, 1==6, 2==5, 3==4 for all flipped piece pairs. So only 32-significant bits for pairs of pieces.

Gerd
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Is 64 bits enough as a hash lenght

Post by Gerd Isenberg »

Clearly for such huge n sumHD should be long long.
With n = 295.296 there are 43.599.716.160 loop cycles with popcnt.
Even with one cycle popcnt, minimum and average hamming distance of that huge arrays may take some minutes.

Code: Select all

int minAvgHammingDistance&#40;U64 v&#91;&#93;, int n, int &avgHD&#41;
&#123;
   U64 sumHD = 0
   int i, j, hd, minHD = 64;
   for &#40;i=0;   i < n-1; i++)
   for &#40;j=i+1; j < n  ; j++)
   &#123;
      hd = popcount&#40;v&#91;i&#93; ^ v&#91;j&#93;); // hamming distance
      minHD  = min&#40;minHD, hd&#41;;
      sumHD += hd;
   &#125;
   n = &#40;n*n - n&#41; / 2;
   avgHD = &#40;int&#41;(&#40;sumHD + n/2&#41; / n&#41;;
   return minHD;
&#125;
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Is 64 bits enough as a hash lenght

Post by Michael Sherwin »

mathmoi wrote:Hi,

We all know that we need to check the validity of a TT move before we execute it on the board because otherwise we could end up corrupting the engine's internal board - and probably crash the engine.

What are the probabilities of this occuring? Let's say with a 8Gb TT and searching 15 millions nps (both values are a higher than what I expect to be able to achieve with my current hardware).

8*1024^3/16 = 536870912 entries in the TT

I need the 29 lower bits of the hash to find the index in the TT.

So the remainings 33 bits will be the ones to cause a hash collision. That means I'll get a collisions every 2^33 probes.

2^33/15000000 = 572.6 seconds between each collisions.

That means that if I don't check the validity of the move in the TT I'll probably crash every 9.5 minutes.

However if I use a 96 bits hash size (32+64bits) I can use the 29 lower bits of the 32 part to get an index and store the full 64 bits part in the TT to look for collisions. Now i'll get a collision every 2^64 probes or :

2^64/15000000 = 1229782938247 seconds between each collisions or about 40000 years.

Now my question to you. Can it be a good idea to use a 96 bits hash and completely avoid the validity check of the TT moves (since we don't mind crashing every 40000 years)?

MP
I tried it with RomiChess. Performance did not suffer, so I see no reason not to use 96 bits.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
jromang

Re: Is 64 bits enough as a hash lenght

Post by jromang »

Thanks a lot ! Does anyone have a set of precomputed Zobrist Keys with good Hamming distances ? :D
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Is 64 bits enough as a hash lenght

Post by Aleks Peshkov »

Gerd Isenberg wrote: In my white to move only color flipper I have black keys dependent on the 384 whites:

Code: Select all

zob&#91;blackPiece&#93;&#91;sq&#93; == bswap&#40;zob&#91;whitePiece&#93;&#91;sq^56&#93;)
I wonder whether this byte swapped dependency makes the effective number of keybits only 32? On the other hand I don't need to consider color to move since it is always whites turn.

Code: Select all

f.i. white pawn on c2 is 0x8040201008040201,
then black pawn on c7 is 0x0102040810204080,
and both x-ored together 0x8142241818244281 
with always byte 0==7, 1==6, 2==5, 3==4 for all flipped piece pairs. So only 32-significant bits for pairs of pieces.

Gerd
There are reasons to turn internal board each ply and I am going to do the same. But I do not understand the reason of such artificial hash keys bits false dependency.

IMHO black pawn on c7 and white pawn on c2 do not transpose to each other because complement white/black positions have different side_to_move.
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Is 64 bits enough as a hash lenght

Post by Gerd Isenberg »

Aleks Peshkov wrote:There are reasons to turn internal board each ply and I am going to do the same. But I do not understand the reason of such artificial hash keys bits false dependency.

IMHO black pawn on c7 and white pawn on c2 do not transpose to each other because complement white/black positions have different side_to_move.
The idea is to bswap the hashkey as well. To color flip the board after make and before unmake. With this dependency while flipping the board, the black pawn on c7 becomes the white pawn on c2 and the same for all other xored zobrist keys that make the hashkey. It gains additional hits from the TT in symmetrical positions, while making nullmoves or loosing tempi.

So my next opening book will take respect of that fact and will encourage HansDamf to play symmetrical openings for some additional elo points ;-)