Is 64 bits enough as a hash length

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

Michael Sherwin wrote:
mathmoi wrote:Hi,

We all know that we need to check the validity of a TT move before we execute it on the board, because otherwise we could end up corrupting the engine's internal board - and probably crash the engine.

What are the probabilities of this occurring? Let's say with an 8 GB TT and searching 15 million NPS (both values are higher than what I expect to be able to achieve with my current hardware).

8*1024^3/16 = 536870912 entries in the TT

I need the 29 lower bits of the hash to find the index in the TT.

So the remaining 35 bits will be the ones to cause a hash collision. That means I'll get a collision every 2^35 probes.

2^35/15000000 = 2290.6 seconds between collisions.

That means that if I don't check the validity of the move in the TT, I'll probably crash every 38 minutes.

However, if I use a 96-bit hash (32+64 bits), I can use the 29 lower bits of the 32-bit part to get an index and store the full 64-bit part in the TT to look for collisions. Now I'll get a collision every 2^64 probes, or:

2^64/15000000 = 1229782938247 seconds between collisions, or about 40000 years.

Now my question to you: can it be a good idea to use a 96-bit hash and completely avoid the validity check of the TT moves (since we don't mind crashing every 40000 years)?

MP
I tried it with RomiChess. Performance did not suffer, so I see no reason not to use 96 bits.
Using that logic, on 64-bit machines going to 128-bit signatures will not hurt performance, so go to 128...

There is a computational cost that an optimized program will show. And storing an extra 32 or 64 bits of signature means that the hash entry will be 32 or 64 bits larger, reducing the total number of entries for a given hash size.
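For the record, the arithmetic in the quoted post is easy to check mechanically. A minimal sketch, taking the poster's 15M NPS and 16-byte-entry figures at face value and assuming one probe per node:

Code: Select all

#include <stdio.h>
#include <stdint.h>

int main(void) {
  const double nps = 15000000.0;        /* the poster's assumed speed      */
  const int key_bits = 64;
  const int index_bits = 29;            /* 8 GB / 16-byte entries = 2^29   */
  /* A false match must agree in all the bits not used for indexing: */
  double probes = (double)(1ULL << (key_bits - index_bits));
  printf("64-bit key:    one collision every %.0f s (%.1f min)\n",
         probes / nps, probes / nps / 60.0);
  /* With a full 64-bit signature stored in each entry, the naive
     estimate becomes 2^64 probes between collisions: */
  double probes96 = 18446744073709551616.0;   /* 2^64 */
  printf("96-bit scheme: one collision every %.0f s (~%.0f years)\n",
         probes96 / nps, probes96 / nps / 3.156e7);
  return 0;
}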
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Is 64 bits enough as a hash length

Post by Gerd Isenberg »

bob wrote: You could always do something like this, just for a sanity check:

Code: Select all

min = 65;
avg = 0;
for (i = 0; i < numrands; i++)
  for (j = 0; j < numrands; j++) {
    if (i != j) {
      min = Min(min, PopCnt(rand[i] ^ rand[j]));
      avg += PopCnt(rand[i] ^ rand[j]);
    }
  }
avg /= numrands;
printf("min hamming distance = %d\n", min);
printf("average hamming distance = %d\n", avg);
Since rand[i] ^ rand[j] == rand[j] ^ rand[i], you loop twice as often as necessary; use j > i instead of i != j. Also, your average might be a bit too optimistic :-)

Code: Select all

min = 65;
avg = 0;
for (i = 0; i < numrands - 1; i++)
  for (j = i + 1; j < numrands; j++) {
    min = Min(min, PopCnt(rand[i] ^ rand[j]));
    avg += PopCnt(rand[i] ^ rand[j]);
  }
avg /= numrands * (numrands - 1) / 2;
printf("min hamming distance = %d\n", min);
printf("average hamming distance = %d\n", avg);
bob wrote: that will give you some idea where you stand.

Optimal min and average would be 64, but that is impossible unless numrands is 2. :) I do not know what the minimal acceptable number is for you; it depends on how many numbers you use. Crafty has 896 values (64 squares x wtm x piece type = 64 x 2 x 7).

Low numbers for min/avg mean you can improve things. One solution is to set a minimum hamming distance you will accept, and then just generate random numbers one at a time, throwing out any that is too close to one already generated. Set the threshold too large and it will never complete, of course... If you do this, you will probably want to find the set once (which can take some time for large min values) and then make the numbers constants to avoid the startup-time cost.
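A minimal sketch of that generate-and-reject scheme (the threshold and the xorshift generator here are illustrative choices, not Crafty's):

Code: Select all

#include <stdio.h>
#include <stdint.h>

#define NUMRANDS    896   /* Crafty-sized key table, per the quote        */
#define MIN_HAMMING 20    /* assumed threshold; set too high, never ends  */

static uint64_t rand64(void) {          /* xorshift64, just a stand-in    */
  static uint64_t s = 88172645463325252ULL;
  s ^= s << 13; s ^= s >> 7; s ^= s << 17;
  return s;
}

static int popcnt64(uint64_t x) {       /* portable PopCnt                */
  int n = 0;
  while (x) { x &= x - 1; n++; }        /* clear lowest set bit           */
  return n;
}

int main(void) {
  uint64_t keys[NUMRANDS];
  int n = 0;
  while (n < NUMRANDS) {
    uint64_t c = rand64();
    int ok = 1;
    for (int i = 0; i < n; i++)
      if (popcnt64(keys[i] ^ c) < MIN_HAMMING) { ok = 0; break; }
    if (ok) keys[n++] = c;              /* keep it, else throw it out     */
  }
  printf("%d keys, pairwise hamming distance >= %d\n", n, MIN_HAMMING);
  return 0;
}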
Have you tried this one, to Hamming-check all XOR pairs of all Zobrist keys?
http://64.68.157.89/forum/viewtopic.php ... 93&t=22274
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Is 64 bits enough as a hash length

Post by Michael Sherwin »

bob wrote:
Michael Sherwin wrote:
mathmoi wrote:Hi,

We all know that we need to check the validity of a TT move before we execute it on the board, because otherwise we could end up corrupting the engine's internal board - and probably crash the engine.

What are the probabilities of this occurring? Let's say with an 8 GB TT and searching 15 million NPS (both values are higher than what I expect to be able to achieve with my current hardware).

8*1024^3/16 = 536870912 entries in the TT

I need the 29 lower bits of the hash to find the index in the TT.

So the remaining 35 bits will be the ones to cause a hash collision. That means I'll get a collision every 2^35 probes.

2^35/15000000 = 2290.6 seconds between collisions.

That means that if I don't check the validity of the move in the TT, I'll probably crash every 38 minutes.

However, if I use a 96-bit hash (32+64 bits), I can use the 29 lower bits of the 32-bit part to get an index and store the full 64-bit part in the TT to look for collisions. Now I'll get a collision every 2^64 probes, or:

2^64/15000000 = 1229782938247 seconds between collisions, or about 40000 years.

Now my question to you: can it be a good idea to use a 96-bit hash and completely avoid the validity check of the TT moves (since we don't mind crashing every 40000 years)?

MP
I tried it with RomiChess. Performance did not suffer, so I see no reason not to use 96 bits.
Using that logic, on 64-bit machines going to 128-bit signatures will not hurt performance, so go to 128...

There is a computational cost that an optimized program will show. And storing an extra 32 or 64 bits of signature means that the hash entry will be 32 or 64 bits larger, reducing the total number of entries for a given hash size.
If, however, someone is already storing the full 64-bit signature, then storing a 64-bit signature that is totally separate from the key costs no additional storage.

Also, having an extra 32 bits added to the key is superfluous in all respects.

And if the hash is not used in the qsearch, and updating the hash is taken out of MakeMove() and TakeBack() and put in its own routine, then updating an extra 32-bit value during the normal search costs nothing. Separating them out also gives a higher node rate.
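A sketch of what that separation might look like (names and table shapes are assumptions for illustration, not RomiChess's code): the key update lives in its own routine, called only where the search actually hashes, so the extra 32-bit key rides along in the same pass.

Code: Select all

#include <stdint.h>

typedef struct { uint64_t key64; uint32_t key32; } HashKeys;

extern uint64_t zob64[2][7][64];   /* [side][piece type][square] */
extern uint32_t zob32[2][7][64];

/* Called only from the full-width search, never from qsearch or from
   MakeMove()/TakeBack() themselves. Quiet moves only; captures and
   promotions would XOR the extra pieces in/out the same way. */
void UpdateHash(HashKeys *h, int side, int piece, int from, int to) {
  h->key64 ^= zob64[side][piece][from] ^ zob64[side][piece][to];
  h->key32 ^= zob32[side][piece][from] ^ zob32[side][piece][to];
}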
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Is 64 bits enough as a hash length

Post by hgm »

bob wrote:What will happen when the illegal move is a castle move, after you have already castled? This is what kills me as I make/unmake rather than using the slower copy/make approach. Making a castling move with that side not having a king on (say) e1 and rook on h1 is catastrophic. Because when you make the move you put a king on g1 and a rook on f1, and now you have an extra king and rook. And when you unmake the move, you keep both and totally corrupt the search. Some will crash with two kings, some will not. But either way if you make/unmake you are essentially dead.
Castling is encoded in my move list as the King move (e1g1, etc.), and this part of the move is performed by the same code in MakeMove() that performs normal moves. So it moves whatever is on the From square to the To square, and remembers what was on both for UnMake(). So if e1 was empty, it just moves the emptiness to g1, perhaps capturing something there.

Now castlings are intercepted before doing this basic MakeMove (recognized by some mode bits in the move encoding that flag it as a castling), in order to move the Rook. Originally, the code that did this only copied the contents of (say) h1 to f1 (so that it would not need separate code for black and white), so that no Rook would appear out of nowhere if h1 was empty. But the UnMake was just clearing f1, and this could lead to the disappearance of the original piece on f1. So as a defensive measure I now have this Rook-moving code remember the old contents of f1, and restore them on UnMake. As this code is not executed very often (and after both sides lose their castling rights, not at all), it has no measurable impact on performance. The worst effect is probably that it needs an extra byte in the stack frame of the recursive search, to remember the old contents of the Rook target.
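As a sketch of that defensive bookkeeping (names are illustrative, not Joker's, and clearing the from-square is an assumption; the post only mentions the copy):

Code: Select all

#define MAXPLY 64

extern int board[64];                 /* piece codes, 0 = empty           */
static int rookTemp[MAXPLY];          /* old contents of the Rook target  */

/* Rook part of castling, done around the basic King MakeMove: */
void MakeCastleRook(int ply, int from, int to) {
  rookTemp[ply] = board[to];          /* remember whatever was on f1      */
  board[to]     = board[from];        /* copy: an empty h1 spawns no Rook */
  board[from]   = 0;                  /* assumed; the text mentions only the copy */
}

void UnMakeCastleRook(int ply, int from, int to) {
  board[from] = board[to];
  board[to]   = rookTemp[ply];        /* restore instead of just clearing */
}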
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Is 64 bits enough as a hash length

Post by hgm »

jromang wrote:
bob wrote: assuming you have decent Zobrist random numbers...
What are decent numbers? I blindly use a Mersenne Twister random generator and have not checked the results; is there anything special to do to select good Zobrist keys?
I once tested my Zobrist keys for dependency by XORing every combination of 3 of them ((64 choose 3) x 13^3 = 91M combinations) and putting the results in a hash table. If I get no collisions there, it means that no combination of 6 Zobrist keys (or 4, or 2) XORs to zero. By checking all XORs of 4 keys against the table (18G of them, so it takes some time), I could rule out dependence in any combination of 7 keys (or 5, or 3).

Note there are already more than 2^55 combinations of 7 keys, and nearly 2^62 combinations of 8 keys (assuming 13 keys per square). The side-to-move, castling-rights and e.p.-square keys multiply the latter number to a value above 2^64, so with 64-bit keys you cannot have all combinations of 8 keys independent, whatever you do.

In reality, Joker uses only ~56 bits of the hash key: 32 bits are stored, and ~24 are used to select the group of entries to be probed. If I do the test looking only at the 56 bits that Joker uses, I cannot avoid collisions even within combinations of 7 keys. So what I checked is whether I get approximately the expected number of those collisions. By trying many different sets of randomly generated Zobrist keys (limiting the key length to 48 bits, so that I already get collisions amongst the sets of 6 keys and can do the tests quite quickly), I found that the actual number of collisions does not vary very much. You could select the set with the smallest number of collisions, but the difference is unlikely to be spectacular. The only thing you want to verify is that the set you use does not accidentally contain a group of 6 or 5 dependent keys. The chance of this is very low, but if you were so unlucky, it would spectacularly drive up the number of collisions.

I also applied this method to make sure that some of the memory-saving schemes I use do not lead to an enhanced collision rate. It turns out they don't. So the Zobrist table in micro-Max only needs to occupy less than 1KB of precious L1 cache space... 8-)
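A scaled-down sketch of that triple-XOR table test (key count, table size and the generator here are illustrative; the real test covered all 13 x 64 keys, 91M entries):

Code: Select all

#include <stdio.h>
#include <stdint.h>

#define NKEYS   64                 /* scaled down from 13 x 64 keys      */
#define LOGSIZE 22
#define TSIZE   (1u << LOGSIZE)

static uint64_t key[NKEYS];
static uint64_t table[TSIZE];      /* open addressing, 0 = empty slot    */

static uint64_t rand64(void) {     /* xorshift64, a stand-in generator   */
  static uint64_t s = 0x9E3779B97F4A7C15ULL;
  s ^= s << 13; s ^= s >> 7; s ^= s << 17;
  return s;
}

int main(void) {
  long collisions = 0;
  for (int i = 0; i < NKEYS; i++) key[i] = rand64();
  for (int i = 0; i < NKEYS; i++)
    for (int j = i + 1; j < NKEYS; j++)
      for (int k = j + 1; k < NKEYS; k++) {
        uint64_t x = key[i] ^ key[j] ^ key[k];
        if (x == 0) { collisions++; continue; }   /* 3 dependent keys    */
        uint32_t h = (uint32_t)(x >> (64 - LOGSIZE));
        while (table[h] && table[h] != x) h = (h + 1) & (TSIZE - 1);
        if (table[h] == x) collisions++; /* two triples XOR equal: a set */
        else table[h] = x;               /* of 6 (or fewer) dependent keys */
      }
  printf("triple-XOR collisions: %ld\n", collisions);
  return 0;
}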
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

Gerd Isenberg wrote:
bob wrote: You could always do something like this, just for a sanity check:

Code: Select all

min = 65;
avg = 0;
for (i = 0; i < numrands; i++)
  for (j = 0; j < numrands; j++) {
    if (i != j) {
      min = Min(min, PopCnt(rand[i] ^ rand[j]));
      avg += PopCnt(rand[i] ^ rand[j]);
    }
  }
avg /= numrands;
printf("min hamming distance = %d\n", min);
printf("average hamming distance = %d\n", avg);
Since rand[i] ^ rand[j] == rand[j] ^ rand[i], you loop twice as often as necessary; use j > i instead of i != j. Also, your average might be a bit too optimistic :-)

Code: Select all

min = 65;
avg = 0;
for (i = 0; i < numrands - 1; i++)
  for (j = i + 1; j < numrands; j++) {
    min = Min(min, PopCnt(rand[i] ^ rand[j]));
    avg += PopCnt(rand[i] ^ rand[j]);
  }
avg /= numrands * (numrands - 1) / 2;
printf("min hamming distance = %d\n", min);
printf("average hamming distance = %d\n", avg);
bob wrote: that will give you some idea where you stand.

Optimal min and average would be 64, but that is impossible unless numrands is 2. :) I do not know what the minimal acceptable number is for you; it depends on how many numbers you use. Crafty has 896 values (64 squares x wtm x piece type = 64 x 2 x 7).

Low numbers for min/avg mean you can improve things. One solution is to set a minimum hamming distance you will accept, and then just generate random numbers one at a time, throwing out any that is too close to one already generated. Set the threshold too large and it will never complete, of course... If you do this, you will probably want to find the set once (which can take some time for large min values) and then make the numbers constants to avoid the startup-time cost.
Have you tried this one, to Hamming-check all XOR pairs of all Zobrist keys?
http://64.68.157.89/forum/viewtopic.php ... 93&t=22274
Yes. We have had this discussion multiple times in the past. I don't remember what mine came out to, but I believe something less than 32. I think that for my parallel programming course this fall I am going to use this as one assignment, and let students throw some _really_ heavy-duty hardware at the problem of finding the set of PRNs with the maximum average hamming distance. I try to find things that are computationally difficult without being difficult to actually program, then let 'em go at it with POSIX threads, then MPI, then both, so that the entire cluster can work on one run.

As far as the double computation goes, I wasn't trying to write the most efficient thing I could; I was just illustrating how to check. :) Yes, the second loop index should be limited based on the first, to make it N^2/2 iterations rather than N^2...
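The exhaustive pairwise scan parallelizes trivially with POSIX threads. A minimal sketch (not the course assignment, just the obvious block split of the outer loop):

Code: Select all

#include <stdio.h>
#include <stdint.h>
#include <pthread.h>

#define NUMRANDS 896
#define NTHREADS 4

static uint64_t zkeys[NUMRANDS];   /* the keys under test               */

static uint64_t rand64(void) {     /* xorshift64, a stand-in generator  */
  static uint64_t s = 88172645463325252ULL;
  s ^= s << 13; s ^= s >> 7; s ^= s << 17;
  return s;
}

static int popcnt64(uint64_t x) {
  int n = 0;
  while (x) { x &= x - 1; n++; }
  return n;
}

typedef struct { int begin, end, min; long long sum; } Work;

static void *scan(void *arg) {     /* each thread scans a block of i    */
  Work *w = (Work *)arg;
  w->min = 65; w->sum = 0;
  for (int i = w->begin; i < w->end; i++)
    for (int j = i + 1; j < NUMRANDS; j++) {
      int d = popcnt64(zkeys[i] ^ zkeys[j]);
      if (d < w->min) w->min = d;
      w->sum += d;
    }
  return NULL;
}

int main(void) {
  pthread_t tid[NTHREADS];
  Work work[NTHREADS];
  for (int i = 0; i < NUMRANDS; i++) zkeys[i] = rand64();
  for (int t = 0; t < NTHREADS; t++) {
    work[t].begin = t * NUMRANDS / NTHREADS;       /* block split; a bit */
    work[t].end = (t + 1) * NUMRANDS / NTHREADS;   /* unbalanced for a   */
    pthread_create(&tid[t], NULL, scan, &work[t]); /* triangular loop    */
  }
  int min = 65; long long sum = 0;
  for (int t = 0; t < NTHREADS; t++) {
    pthread_join(tid[t], NULL);                    /* reduce the partials */
    if (work[t].min < min) min = work[t].min;
    sum += work[t].sum;
  }
  long long pairs = (long long)NUMRANDS * (NUMRANDS - 1) / 2;
  printf("min %d, avg %.2f over %lld pairs\n", min, (double)sum / pairs, pairs);
  return 0;
}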
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

hgm wrote:
bob wrote:What will happen when the illegal move is a castle move, after you have already castled? This is what kills me as I make/unmake rather than using the slower copy/make approach. Making a castling move with that side not having a king on (say) e1 and rook on h1 is catastrophic. Because when you make the move you put a king on g1 and a rook on f1, and now you have an extra king and rook. And when you unmake the move, you keep both and totally corrupt the search. Some will crash with two kings, some will not. But either way if you make/unmake you are essentially dead.
Castling is encoded in my move list as the King move (e1g1, etc.), and this part of the move is performed by the same code in MakeMove() that performs normal moves. So it moves whatever is on the From square to the To square, and remembers what was on both for UnMake(). So if e1 was empty, it just moves the emptiness to g1, perhaps capturing something there.

Now castlings are intercepted before doing this basic MakeMove (recognized by some mode bits in the move encoding that flag it as a castling), in order to move the Rook. Originally, the code that did this only copied the contents of (say) h1 to f1 (so that it would not need separate code for black and white), so that no Rook would appear out of nowhere if h1 was empty. But the UnMake was just clearing f1, and this could lead to the disappearance of the original piece on f1. So as a defensive measure I now have this Rook-moving code remember the old contents of f1, and restore them on UnMake. As this code is not executed very often (and after both sides lose their castling rights, not at all), it has no measurable impact on performance. The worst effect is probably that it needs an extra byte in the stack frame of the recursive search, to remember the old contents of the Rook target.
Then what about en passant? In the real case there is a pawn on the target square; in the wrong case there is not. Are you saving everything you actually change on the board (i.e. tempfrom[ply] = board[from]; board[from] = 0;)? I chose not to do that, since the move already has the piece being moved and the piece being captured, so it cuts out all the "temps"...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

Michael Sherwin wrote:
bob wrote:
Michael Sherwin wrote:
mathmoi wrote:Hi,

We all know that we need to check the validity of a TT move before we execute it on the board, because otherwise we could end up corrupting the engine's internal board - and probably crash the engine.

What are the probabilities of this occurring? Let's say with an 8 GB TT and searching 15 million NPS (both values are higher than what I expect to be able to achieve with my current hardware).

8*1024^3/16 = 536870912 entries in the TT

I need the 29 lower bits of the hash to find the index in the TT.

So the remaining 35 bits will be the ones to cause a hash collision. That means I'll get a collision every 2^35 probes.

2^35/15000000 = 2290.6 seconds between collisions.

That means that if I don't check the validity of the move in the TT, I'll probably crash every 38 minutes.

However, if I use a 96-bit hash (32+64 bits), I can use the 29 lower bits of the 32-bit part to get an index and store the full 64-bit part in the TT to look for collisions. Now I'll get a collision every 2^64 probes, or:

2^64/15000000 = 1229782938247 seconds between collisions, or about 40000 years.

Now my question to you: can it be a good idea to use a 96-bit hash and completely avoid the validity check of the TT moves (since we don't mind crashing every 40000 years)?

MP
I tried it with RomiChess. Performance did not suffer, so I see no reason not to use 96 bits.
Using that logic, on 64-bit machines going to 128-bit signatures will not hurt performance, so go to 128...

There is a computational cost that an optimized program will show. And storing an extra 32 or 64 bits of signature means that the hash entry will be 32 or 64 bits larger, reducing the total number of entries for a given hash size.
If, however, someone is already storing the full 64-bit signature, then storing a 64-bit signature that is totally separate from the key costs no additional storage.

Also, having an extra 32 bits added to the key is superfluous in all respects.

And if the hash is not used in the qsearch, and updating the hash is taken out of MakeMove() and TakeBack() and put in its own routine, then updating an extra 32-bit value during the normal search costs nothing. Separating them out also gives a higher node rate.
How can it "cost nothing"? The most you can update in one cycle is 64 bits on a 64-bit machine; 96 bits requires exactly double the effort. If you mean that because you are not hashing in q-search the extra work is not performed there, I agree. But the cost of doing it in the normal search is still non-zero, and it offers exactly nothing with respect to accuracy. As for storing 64 bits in the hash: if one does that, one is _already_ wasting 32 bits of memory per entry that could be used for more entries instead.
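In struct form, the storage half of that argument looks like this (layouts are illustrative, not Crafty's):

Code: Select all

#include <stdint.h>

typedef struct {            /* 16 bytes: 64-bit signature + packed data */
  uint64_t signature;
  uint64_t data;            /* move, score, depth, bound, age, ...      */
} Entry64;

typedef struct {            /* widening the signature grows the entry   */
  uint64_t sig_lo, sig_hi;  /* to 24 bytes                              */
  uint64_t data;
} Entry128;

static const int64_t TABLE_BYTES = 8LL * 1024 * 1024 * 1024;
/* TABLE_BYTES / sizeof(Entry64)  = 536,870,912 entries
   TABLE_BYTES / sizeof(Entry128) = 357,913,941 entries: a third fewer,
   and 24-byte entries no longer pack evenly into 64-byte cache lines. */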
hgm
Posts: 27808
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Is 64 bits enough as a hash length

Post by hgm »

bob wrote: Then what about en passant? In the real case there is a pawn on the target square; in the wrong case there is not. Are you saving everything you actually change on the board (i.e. tempfrom[ply] = board[from]; board[from] = 0;)? I chose not to do that, since the move already has the piece being moved and the piece being captured, so it cuts out all the "temps"...
Yes, I do the tempfrom[ply] = board[from]; stuff. In MakeMove I need to get the moved piece into a register anyway, to store it to board[to]. I figured it would not make much difference whether I load that register from the move list or from the board, and it saves me the trouble of storing the piece in the move list. The "postpone what might be pruned" principle helps here: if I stored the pieces in the move list, I would have to do it for all the moves; saving into the temporary I only have to do for the moves that are actually made.
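A sketch of that save-and-restore scheme (names are illustrative, not Joker's; e.p. and castling side effects would save their extra squares the same way):

Code: Select all

#define MAXPLY 64

static int board[64];                  /* piece codes, 0 = empty         */
static int tempFrom[MAXPLY], tempTo[MAXPLY];

void MakeMove(int ply, int from, int to) {
  tempFrom[ply] = board[from];         /* remember what was on both...   */
  tempTo[ply]   = board[to];
  board[to]     = board[from];         /* ...then move whatever is on    */
  board[from]   = 0;                   /* 'from', even if it is nothing  */
}

void UnMake(int ply, int from, int to) {
  board[from] = tempFrom[ply];         /* restore exactly what changed   */
  board[to]   = tempTo[ply];
}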