Is 64 bits enough as a hash length

Discussion of chess software programming and technical issues.

Moderator: Ras

hgm
Posts: 28395
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Is 64 bits enough as a hash length

Post by hgm »

jromang wrote:Thanks a lot! Does anyone have a set of precomputed Zobrist Keys with good Hamming distances? :D
I guess not. As sets with good Hamming distances can just as easily make poor (= dependent) keys as sets with poor Hamming distances, no one bothers...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

mathmoi wrote:
bob wrote:
mathmoi wrote:Hi,

We all know that we need to check the validity of a TT move before we execute it on the board, because otherwise we could end up corrupting the engine's internal board - and probably crashing the engine.

What are the probabilities of this occurring? Let's say with an 8 GB TT and searching 15 million nps (both values are higher than what I expect to be able to achieve with my current hardware).

8*1024^3/16 = 536870912 entries in the TT

I need the 29 lower bits of the hash to find the index in the TT.

So the remaining 33 bits will be the ones to cause a hash collision. That means I'll get a collision every 2^33 probes.

2^33/15000000 = 572.6 seconds between collisions.

That means that if I don't check the validity of the move in the TT I'll probably crash every 9.5 minutes.

However, if I use a 96-bit hash (32+64 bits) I can use the 29 lower bits of the 32-bit part to get an index and store the full 64-bit part in the TT to look for collisions. Now I'll get a collision every 2^64 probes, or:

2^64/15000000 = 1229782938247 seconds between collisions, or about 40,000 years.

Now my question to you: can it be a good idea to use a 96-bit hash and completely avoid the validity check of the TT moves (since we don't mind crashing every 40,000 years)?

MP
It is completely worthless to use 96 bits. If you look at the paper Cozzie and I wrote about the effect of hash collisions on the quality of the search, the results were surprising. One error every 1,000,000 nodes made absolutely no measurable difference to the program's playing skill. So worrying about improving it beyond 1 error in 2^64 is pointless. BTW assuming that the upper bits are the only collision component is incorrect. All the bits matter. The larger the table, the more entries you have and the greater chance for collisions to actually happen in a way that will affect the program. But collisions are very unlikely anyway with 64 bit signatures assuming you have decent Zobrist random numbers...
I read your paper about the effect of hash collisions on the quality of the search. What I'm trying to avoid is not a bad evaluation caused by a collision. I'm trying to avoid a crash of my engine caused by an illegal move taken from the TT.
Then you have three choices for a parallel search implementation:

(1) use traditional locks to protect the atomicity of the store/fetch process, so that the entry you fetch is one specific position plus its associated data and you can trust the best move as being valid.

(2) use the lockless approach from previous Crafty versions.

(3) validate the move from the hash table before using it. I found it more efficient to validate the moves than to ensure hashing correctness... YMMV.

BTW assuming that the upper bits are the only collision component is incorrect. All the bits matter. The larger the table, the more entries you have and the greater chance for collisions to actually happen in a way that will affect the program.
I think my assumption is correct if the TT is full (as it is most of the time). Suppose I use the 24 lower bits as an index. These bits will _always_ match the hash of the position stored at this index. So the remaining 64-24 bits will make the difference between a collision or not.

Am I wrong?

MP

Not quite, because _most_ of the time the lower 24 bits will be different. So it is only in those rare cases where the position agrees in the lower 24 bits that the upper 40 bits "save the day". This is a form of the "birthday paradox". The bigger the table, the greater the probability of a collision, even though both tests might use exactly 64 bits. With one entry, the chance of a collision is 1 in 2^64. With 2^64 entries, the chance of a collision is 1 in 2^64. No difference at all from a practical perspective. And the probability of a collision is so small that you can certainly safely ignore it if you validate the hash move before making it. If you don't, you probably need some sort of hashless mechanism, as any form of hashing is "lossy" and offers the potential for a disaster.

In Crafty, if I run on 20M nps hardware and tell it to display every time a hash move is invalid, I typically see a half-dozen such reports in an entire game... At 20M nodes per second, times an hour per game, that is a very small percentage.
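As a quick check of the arithmetic quoted above, a few lines of stand-alone C reproduce the numbers (an editorial sketch, taking the index-bits model at face value; note it uses 64-29 = 35 remaining bits, the correction Sven makes further down in the thread):

Code:

#include <math.h>
#include <stdio.h>

int main(void) {
    double entries = 8.0 * 1024 * 1024 * 1024 / 16;  /* 536870912 TT entries */
    int index_bits = (int)round(log2(entries));      /* 29 bits of index     */
    double nps = 15e6;                               /* probes per second    */

    /* at a given slot, only the non-index key bits can differ */
    printf("64-bit key: %.0f s between collisions\n",
           pow(2.0, 64 - index_bits) / nps);         /* ~2290 s, not 572.6   */

    /* the proposed 96-bit scheme stores the full 64-bit part  */
    printf("96-bit key: %.0f s between collisions\n",
           pow(2.0, 64) / nps);                      /* ~40,000 years        */
    return 0;
}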
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: 96 bit hashes

Post by bob »

Dann Corbit wrote:
sje wrote:I'd say that the probability of a false positive match when using 96 bits for a signature is rather less than the probability of a general CPU/memory glitch. So using a 96 bit signature might be faster as the move sanity check can be safely skipped.
What happens if the quality of your hash is not perfect, so that you get funnels and collisions at only {say} 60 bits? It will be really hard to know this is a problem because it will be incredibly hard to reproduce, but it will be a large enough defect to cause your program to fail once every few dozen games at a slow enough time control.
I agree. Either zero errors, or make it error-tolerant by validating the moves before making them, assuming making an illegal move will crash your program.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: 96 bit hashes

Post by jwes »

Dann Corbit wrote:
sje wrote:I'd say that the probability of a false positive match when using 96 bits for a signature is rather less than the probability of a general CPU/memory glitch. So using a 96 bit signature might be faster as the move sanity check can be safely skipped.
What happens if the quality of your hash is not perfect, so that you get funnels and collisions at only {say} 60 bits? It will be really hard to know this is a problem because it will be incredibly hard to reproduce, but it will be a large enough defect to cause your program to fail once every few dozen games at a slow enough time control.
There is a simple way to test the quality of your hash: add 32 bits (or 64, for the truly paranoid) to your hash key, run tests for some days counting the number of times that the original hash keys match but the added bits do not (false positives), and compare to the theoretical number of false positives.
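In code, the counter jwes describes could be as small as this (a sketch; the TTEntry layout and the check32 field are invented for illustration, with the extra bits coming from a second, independent Zobrist set):

Code:

#include <stdint.h>

typedef struct {
    uint64_t key;       /* the normal 64-bit signature                */
    uint32_t check32;   /* extra bits from an independent Zobrist set */
    /* ... move, score, depth, etc. ... */
} TTEntry;

static uint64_t matches, false_positives;

/* call on every probe with the current position's two signatures */
void tally(const TTEntry *e, uint64_t key, uint32_t check32) {
    if (e->key == key) {
        matches++;
        if (e->check32 != check32)
            false_positives++;   /* 64-bit match on a different position */
    }
}

After a long run, compare false_positives against the theoretical expectation; a count well above it means the 64-bit keys are weaker than they should be.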
PK-4

Re: Is 64 bits enough as a hash length

Post by PK-4 »

jromang wrote:Does anyone have a set of precomputed Zobrist Keys with good Hamming distances? :D
Take any three numbers A, ~A and A^~A, e.g. 00001111, 11110000 and 11111111. Good Hamming distances, but bad keys because of the dependency.
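That triple is easy to check mechanically: the pairwise Hamming distances look healthy, yet the three keys XOR to zero, which is exactly the dependency (a minimal demonstration; __builtin_popcount is the GCC/Clang intrinsic):

Code:

#include <stdio.h>

int main(void) {
    unsigned a = 0x0F, b = 0xF0, c = a ^ b;   /* 00001111, 11110000, 11111111 */
    printf("HD(a,b)=%d HD(a,c)=%d HD(b,c)=%d\n",
           __builtin_popcount(a ^ b),         /* 8 */
           __builtin_popcount(a ^ c),         /* 4 */
           __builtin_popcount(b ^ c));        /* 4 */
    printf("a^b^c = %u\n", a ^ b ^ c);        /* 0: any two give the third */
    return 0;
}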
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: Is 64 bits enough as a hash length

Post by Zach Wegner »

I always hear this criticism of Hamming distances, but I think it's flawed. First of all, you don't want Hamming distances close to 64, you want them close to 32. Second, for large numbers of hashkeys there won't be sets of hashkeys that satisfy that paradoxical good-Hamming-but-dependent property. So given my first point, your first two numbers are already in violation. Try to construct as many 8-bit hashkeys as you can with HD 4.

You can use:
11111111
00001111
11001100
10101010

...and a few more related ones, and then you're out. So construct a full set of hashkeys that minimizes the maximum HD and maximizes the minimum, and I bet they'll be better than just a set of plain numbers. This is of course just my intuition, but it makes sense to me.
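Zach's exercise can be brute-forced in a few lines (a sketch; a greedy scan, so the set it finds depends on the order in which values are tried):

Code:

#include <stdio.h>

int main(void) {
    int set[256], n = 0;
    for (int v = 0; v < 256; v++) {
        int ok = 1;
        for (int i = 0; i < n; i++)
            if (__builtin_popcount(set[i] ^ v) != 4) { ok = 0; break; }
        if (ok)
            set[n++] = v;   /* keep v only if HD to every member is 4 */
    }
    printf("%d values with pairwise HD 4:", n);
    for (int i = 0; i < n; i++)
        printf(" %02x", set[i]);
    printf("\n");
    return 0;
}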
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

jromang wrote:
bob wrote: assuming you have decent Zobrist random numbers...
What are decent numbers? I blindly use a Mersenne Twister random generator and have not checked the results; is there a special thing to do to select good Zobrist keys?
Simplest test is to have significant Hamming distance between any pair of numbers. A distance of one (1) is very dangerous, since you now only affect one bit when hashing, making it far easier to get a collision later.

Common mistakes I have seen:

(1) 32 bit values in 64 bit words, giving 32 bit hashing signatures rather than 64.

(2) using floats. Since random numbers are usually in the range 0 ... 1.0, the exponent bits and sign are almost always the same, which reduces the average hamming distance by about 16. I use the low-order 32 bits of two successive 64-bit numbers and combine them into one 64-bit value (see the sketch after this list) to get rid of the exponents and sign bits.

(3) using a PRNG that has a short period so that you get repeated numbers. Even a twister can do this if the wrong starting values are used.
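For (2), the combine step needs a shift, otherwise the two 32-bit halves just overlap and you are back to a 32-bit signature (a sketch, assuming some decent 32-bit generator behind rand32()):

Code:

#include <stdint.h>

extern uint32_t rand32(void);   /* any decent 32-bit PRNG; assumed here */

uint64_t rand64(void) {
    /* shift one draw into the high half so all 64 bits end up random */
    return ((uint64_t)rand32() << 32) | (uint64_t)rand32();
}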

You could always do something like this, just for a sanity check:

Code:

/* Min() and PopCnt() are the engine's own helpers; rand[] holds the keys */
min = 65;
avg = 0;
for (i = 0; i < numrands; i++)
  for (j = 0; j < numrands; j++)
    if (i != j) {
      min = Min(min, PopCnt(rand[i] ^ rand[j]));  /* missing ')' restored */
      avg += PopCnt(rand[i] ^ rand[j]);
    }
avg /= numrands * (numrands - 1);  /* average over all ordered pairs */
printf("min hamming distance = %d\n", min);
printf("average hamming distance = %d\n", avg);
That will give you some idea of where you stand.

Optimal min and average would be 64, but that is impossible unless numrands is 2. :) I do not know what the minimal acceptable number is for you; it depends on how many numbers you use. Crafty has 896 values (64 squares x wtm x piece types = 64 x 2 x 7).

Low numbers for min/avg mean you can improve things. One solution is to set a minimum Hamming distance you will accept and then just generate random numbers one at a time, throwing one out if it is too close to any already generated. Set this too large and it will never complete, of course... If you do this you will probably want to find the set once (which can take some time for large min values) and then make the numbers constants to avoid the startup time cost.
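That generate-and-throw-out loop might look like this (a sketch: rand64() as in the earlier snippet, MIN_HD is whatever floor you decide to accept, and __builtin_popcountll stands in for PopCnt):

Code:

#include <stdint.h>

#define NUMRANDS 896   /* Crafty's count: 64 x 2 x 7        */
#define MIN_HD   20    /* too high and this loop never ends */

extern uint64_t rand64(void);   /* as in the sketch above */

uint64_t zobrist[NUMRANDS];

void generate_keys(void) {
    int n = 0;
    while (n < NUMRANDS) {
        uint64_t cand = rand64();
        int ok = 1;
        for (int i = 0; i < n; i++)
            if (__builtin_popcountll(zobrist[i] ^ cand) < MIN_HD) {
                ok = 0;   /* too close to an existing key: reject */
                break;
            }
        if (ok)
            zobrist[n++] = cand;
    }
}

Once a set passes, print it out and paste it in as constants, as the post suggests, so the startup cost is paid only once.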
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: 96 bit hashes

Post by bob »

jwes wrote:
Dann Corbit wrote:
sje wrote:I'd say that the probability of a false positive match when using 96 bits for a signature is rather less than the probability of a general CPU/memory glitch. So using a 96 bit signature might be faster as the move sanity check can be safely skipped.
What happens if the quality of your hash is not perfect, so that you get funnels and collisions at only {say} 60 bits? It will be really hard to know this is a problem because it will be incredibly hard to reproduce, but it will be a large enough defect to cause your program to fail once every few dozen games at a slow enough time control.
There is a simple way to test the quality of your hash: add 32 bits (or 64, for the truly paranoid) to your hash key, run tests for some days counting the number of times that the original hash keys match but the added bits do not (false positives), and compare to the theoretical number of false positives.
You don't need to do this. You can find this from the 90's somewhere in r.g.c.c... Several of us did exactly that, although I was able to run a tad more data since I was using a Cray at the time. Except we took it a bit further and did _exact_ matches for the verification: I just encoded the position arithmetically into something like 200 bits, and then when the hash signature matched but the exact position did not, we called that an error. And with 64 bits the results were essentially "perfect" in real games. I don't remember the specific number of true 64-bit collisions, but I do remember it was very close to zero (if not exactly zero). 32 bits was horrible and unusable. Only thing was, at the time we didn't have the collision data Cozzie and I produced, so it seems that even 32-bit signatures are safe enough, as in the tests we ran they did not produce enough errors to change scores or moves at all.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

hgm wrote:
mathmoi wrote:That means that if I don't check the validity of the move in the TT I'll probably crash every 9.5 minutes.
Although I agree with the calculation up to here, at this point you seem to draw an unjustified conclusion:

The only thing that you have shown is that you will have a key collision every 9.5 minutes (and if you probe 2 or 4 entries at a time, even one per 5 or 2.5 min). So one might assume the hash move will be essentially a random one, not related to the position.

But will such a move crash your engine? Not necessarily! It depends on your engine design. In micro-Max, which has a 0x88 board and no piece list, any combination of (FromSqr, ToSqr) squares can be harmlessly interpreted as a move, as long as the squares are on the board. (As they always are when this was a valid move in some other position, since the board doesn't change size during the game.) It just moves the contents of the FromSqr to the ToSqr, remembering the old contents of the ToSqr so that the position can be perfectly restored on UnMake(). It does not matter if the FromSqr contains an own piece, an opponent's piece, or is empty.

In Joker, which has a piece list, most illegal moves are still perfectly executed. (I made sure that the empty square has an encoding that is a valid piece number in the piece list, so that if an empty square is moved, updating its position does not lead to memory corruption. I needed this anyway, because non-captures are treated as captures of an empty square.)

The only illegal move that could crash Joker was a castling when there was a piece in the new location of the Rook, as UnMake() would always restore that square to empty. But the probability that the hash move is a castling is not very large. In the special code of MakeMove() and UnMake() that handles the Rook part of castling, I have now fixed that. At the expense of some overhead, of course, but this is code that is hardly ever executed. Only a minor fraction of the moves are castlings. (And during most of the game castling is no longer possible anyway.)
What will happen when the illegal move is a castling, after you have already castled? This is what kills me, as I make/unmake rather than using the slower copy/make approach. Making a castling move without that side having a king on (say) e1 and a rook on h1 is catastrophic: when you make the move you put a king on g1 and a rook on f1, and now you have an extra king and rook. And when you unmake the move, you keep both and totally corrupt the search. Some programs will crash with two kings, some will not. But either way, if you make/unmake you are essentially dead.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Is 64 bits enough as a hash length

Post by bob »

Sven Schüle wrote:
mathmoi wrote:So the remaining 33 bits will be the ones to cause a hash collision. That means I'll get a collision every 2^33 probes.
I can't follow these numbers yet.

1) First of all, 64-29=35 :-)

2) With this correction, why do you derive from this number of 35 remaining bits that you get a collision every 2^35 probes?

My understanding is that the collision frequency depends upon the whole hash key length (here 64) but also upon something like the total number of different chess positions. You get a collision if your current node has the same 64 bit hash key as another position already stored in the table. Same index, and also same remaining bits (assuming they are stored, too), but the position is different.

I can't tell exactly how the dependency between these two numbers is, nor whether it is possible at all to estimate the collision frequency based on them. Maybe with 2^128 different chess positions and 64 bit hash key you might get collisions every 2^(128-64) = 2^64 probes, I don't know.

I agree with others that 96 bit hash keys are not necessary. Also I think that a very basic check for pseudo-legality of the hash move could be acceptable if your engine would crash otherwise. You could check that the right colour is on the 'from' square, that no friendly piece is captured, and in case of a pawn move, that it does not move backwards and also not to the 1st (8th) rank without promoting. This might be cheap enough while preserving a minimum of robustness.

Furthermore, count the number of detected collisions (hash move not pseudo-legal) and observe the count over many games. I suspect it will stay 0 for a long time, much longer than some minutes (I'd guess some thousands of games).

Sven
With a make/unmake approach (fastest way I have tried) I need to ensure the following:

(1) the correct piece is on <from>, and if this is a capture, the correct piece is on <to>; else <to> needs to be empty.

(2) for castling, the STM must have castling rights that match the move being made, and the 4-5 squares involved must be correct: for O-O for white, the king must be on e1, the rook on h1, and f1 and g1 must be empty.

(3) for EP captures, the position must be correct, with a pawn to capture on the EP square. I don't have to check actual legality, such as being sure a knight is not being moved to an impossible square, because such a move could never make it into the table. I mainly want to make sure that I don't "create" pieces when I unmake a move that was not legal/possible in the first place.

It takes very little time, as there are specific requirements for each type of move and you only have to test the required conditions for the piece being moved, not everything.
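Put together, those three checks might look roughly like this (an editorial sketch of the idea, not Crafty's code; the board layout, piece codes, and move fields are all invented names):

Code:

enum { EMPTY, WPAWN, WKNIGHT, WROOK, WKING, BPAWN, BKNIGHT, BROOK, BKING };
enum { WHITE, BLACK };
enum { E1 = 4, F1 = 5, G1 = 6, H1 = 7 };   /* a1 = 0 mailbox indices       */
enum { WKS_CASTLE = 1 /* , ... other castling-right flags ... */ };

typedef struct {
    int board[64];                 /* piece code per square                */
    int castle_rights;             /* bit mask of remaining rights         */
    int ep_victim_sq;              /* square of a capturable pawn, if any  */
    int stm;                       /* side to move                         */
} Position;

typedef struct {
    int from, to, piece, victim;   /* victim == EMPTY for a quiet move     */
    int is_castle, castle_flag, is_ep;
} Move;

/* returns 1 if the hash move is safe to make/unmake, 0 otherwise */
int hash_move_ok(const Position *pos, Move m) {
    /* (1) mover on <from>; <to> empty, or holding exactly the victim */
    if (pos->board[m.from] != m.piece || pos->board[m.to] != m.victim)
        return 0;

    /* (2) castling: matching right plus exact placement and empty path
           (white O-O shown; the other three flavours are analogous)   */
    if (m.is_castle) {
        if (!(pos->castle_rights & m.castle_flag))
            return 0;
        if (m.castle_flag == WKS_CASTLE &&
            (pos->board[E1] != WKING || pos->board[H1] != WROOK ||
             pos->board[F1] != EMPTY || pos->board[G1] != EMPTY))
            return 0;
    }

    /* (3) en passant: an enemy pawn must really sit on the EP square */
    if (m.is_ep &&
        pos->board[pos->ep_victim_sq] != (pos->stm == WHITE ? BPAWN : WPAWN))
        return 0;

    return 1;
}

Only the branch for the move's own type runs, which is why the cost is negligible compared with what a corrupting make/unmake would do.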