How do I test the quality of the Zobrist keys?

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How do I test the quality of the Zobrist keys?

Post by hgm »

If you have key collisions in KBNK, something must be very bad in your Zobrist randoms: it would mean that the XOR of only 8 keys could give you zero. It must be possible to do significantly better than that, as with KBNK there are only 2^23 = 8M positions.

It should be easy enough to calculate them all, and check which two are the same...
jswaff

Re: How do I test the quality of the Zobrist keys?

Post by jswaff »

jesper_nielsen wrote:
bob wrote:
jswaff wrote:
bob wrote:One simple test is to measure the hamming distance between any two Zobrist numbers (hamming distance is the number of bits that are different between two binary values). The minimum hamming distance between pairs of numbers is an interesting value. If you have several pairs with very small hamming distances, you have a higher probability of a collision when you use those numbers.

For any single value, you can have a Zobrist number with a hamming distance of 64. (a and the binary complement of a, for example). I don't know that anyone has ever computed the minimum hamming distance between any two values in a set of N values. But if you have pairs with a distance of 1 or 2 or 3 (or even 0 heaven forbid) you need to replace one of those values.
I've performed that test in an earlier version of Prophet. I don't remember the value I came up with - I think it was 14. I use 851 keys in all, and the Random() (or Random32()?) function from Crafty. Unfortunately I don't have the code laying around anymore to confirm that number.

--
James
Here's a thing to try:

compute the first N numbers (N = number you need in total) and compute the sum of squares of 64 - the hamming distance between each pair of numbers. Now generate number N+1, and try replacing each original number, one at a time, with the new number, and see if the sum of squares is smaller. Pick the set with the smallest sum of squares. Repeat until you can't stand it any longer. There are other ways to do the same sort of algorithmic approach to finding the best set.

The 64-N is so that smaller hamming distances produce larger numbers, and you want to minimize that. I might just do this myself to see how the optimized set compares to the first N which I now use...

I think I will do this on our cluster, which can eat this kind of problem up by letting each node use a different random number stream.

One note, if you do optimize the set of numbers, don't expect any better play. The tests Anthony and I ran on the ICGA hashing collision paper shows that this is not going to have any significant effect on the search at all. But it is an interesting exercise...
I think i will try your algorithm.

I have been using my set of pseudo-random numbers for a year or so, and never seen anything to make me suspect any problems.

But then when i was testing a KRKN endgame bitbase, my engine would chrash, because of an illegal move from the hash table.

For the zobrist keys to be a problem, i suppose there would not only have to be a collision between positions, but those positions would have to be a part of the same search as well.

Thanks,
Jesper
You could take the time to implement Bob's experiment, but you don't need "optimal," you need "good enough." I am pretty sure you are going to find your problems are due to the crappy PRNG supplied by the dot.net framework not being good enough. (And I'm not bashing dot.net, many PRNGs are pretty crappy.)

And, IMHO your engine should never crash in the event of a collision, no matter how much you minimize the possibility. Add some sort of move verification.

Good luck,

--
James
pijl

Re: How do I test the quality of the Zobrist keys?

Post by pijl »

hgm wrote:If you have key collisions in KBNK, something must be very bad in your Zobrist randoms: it would mean that the XOR of only 8 keys could give you zero. It must be possible to do significantly better than that, as with KBNK there are only 2^23 = 8M positions.

It should be easy enough to calculate them all, and check which two are the same...
Yes, that should be quite easy.
Triggered by this question I've written a little function in CTD to check the xor of different key permutations. Checking all permutations of 4 keys took a little under 3 minutes on my Opteron, and it is currently doing the 5 keys permutations. So far no collisions have been found.
I'm only checking the relevant keys though, skipping the pawns on a1 and so on. The total number of keys I have is 769.
I'll let the check of 5 key permutations finish, and probably do the first part of the 6 key permutations, and finishing the 6 key ones will probably take several months.
Richard.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How do I test the quality of the Zobrist keys?

Post by hgm »

So what exactly do you mean by "4-key permutations". Take any combination of 4 keys, and test if they XOR to zero?

Or calculating all XORs of 4 keys and check if they are unique? (By tabulating them and look for a collision.)

The latter would effectively check if any combinations of eight different keys XORed to zero.
pijl

Re: How do I test the quality of the Zobrist keys?

Post by pijl »

hgm wrote:So what exactly do you mean by "4-key permutations". Take any combination of 4 keys, and test if they XOR to zero?

Or calculating all XORs of 4 keys and check if they are unique? (By tabulating them and look for a collision.)

The latter would effectively check if any combinations of eight different keys XORed to zero.
The first. And a bit in a dumb way too as the permutation of four keys of the white king is calculated too. I guess quite some efficiency can be gained when ruling out permutations with >2 keys for a king of the same color, >2 keys for ep, >2 keys for castling etc. But it is much easier to get the dumb way to work in a faultfree way :-)

I'm not sure how to do the second method efficiently as it requires about 12Gb of space ( 769*768*767*766 / 24 (4!) times 8 bytes ) just to store the xor results, and comparing the individual entries of that table would also be quite some task.

We're talking big number of possible combinations here.
anyway, if you only want to check KBNK, you can forget about the queen, rook, pawn, castling and EP keys, which makes it much more manageable to calculate the permutations.

Richard.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How do I test the quality of the Zobrist keys?

Post by hgm »

Yes, in th KBNK case you only have to take one key for wK, one for bK, one for wN and half of the keys for wB. That is only 8M.

But if 12Gb is too much (you could write it to disk?), you could XOR all key triples, which is only 769*768*767/3! x 8byte = 300MB. So in a 512MB hash table you could store them all, and it would be equivalent to testing all 6-key permutations for zero.

And if you would compare all 4-key permutations to this table, you would have checked all 7-key combinations for zero. In a few minutes... Point is that by comparing against a table of millions of entry, in stead of just a single number (zero), you speed up the process millions of times.
pijl

Re: How do I test the quality of the Zobrist keys?

Post by pijl »

hgm wrote:Yes, in th KBNK case you only have to take one key for wK, one for bK, one for wN and half of the keys for wB. That is only 8M.

But if 12Gb is too much (you could write it to disk?), you could XOR all key triples, which is only 769*768*767/3! x 8byte = 300MB. So in a 512MB hash table you could store them all, and it would be equivalent to testing all 6-key permutations for zero.

And if you would compare all 4-key permutations to this table, you would have checked all 7-key combinations for zero. In a few minutes...
I guess generating them is not much slower than retrieving from memory, especially since the key tables will fit in the cache. Checking all 6 or 7 key combinations in a few minutes is wishful thinking imho.

btw, I think we both made a small calculation error. All 4-key permutations will take 116Gb, and all 3-key permutations will take about 600Mb, assuming 8 bytes per entry, of course.
Richard.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How do I test the quality of the Zobrist keys?

Post by hgm »

You are right about the 600Mb, but you miss the point.

You hash, so you will do the comparison with 75 million stored entries by a single memory access. To do it sequentially you would have to generate all 75 million. And that is significantly slower than a single memory access... You will not do a linear search through the table to check if the entry is there (which would indeed be comparable to generating the keys).
pijl

Re: How do I test the quality of the Zobrist keys?

Post by pijl »

hgm wrote:You are right about the 600Mb, but you miss the point.

You hash, so you will do the comparison with 75 million stored entries by a single memory access. To do it sequentially you would have to generate all 75 million. And that is significantly slower than a single memory access... You will not do a linear search through the table to check if the entry is there (which would indeed be comparable to generating the keys).
Ok, I see what you mean. Yes, I guess by storing all permutations of three keys in a hashtable (with some sort of bucket or overflow system to hold all permutations that map onto the same hashtable entry) may be a bit faster, even when having to store it on disk. Maybe I'll take a quick look at it later.
Richard.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: How do I test the quality of the Zobrist keys?

Post by jwes »

jesper_nielsen wrote:
When i was testing a KRKN endgame bitbase, my engine would chrash, because of an illegal move from the hash table.

For the zobrist keys to be a problem, i suppose there would not only have to be a collision between positions, but those positions would have to be a part of the same search as well.

Thanks,
Jesper
This looks more like a bug than a hash collision. Try making a test version with a full position representation in the hash table (easy for krkn) and when you get a hash hit, compare the position to the position on the board.