jswaff wrote:bob wrote:One simple test is to measure the hamming distance between any two Zobrist numbers (hamming distance is the number of bits that are different between two binary values). The minimum hamming distance between pairs of numbers is an interesting value. If you have several pairs with very small hamming distances, you have a higher probability of a collision when you use those numbers.
For any single value, you can have a Zobrist number with a hamming distance of 64. (a and the binary complement of a, for example). I don't know that anyone has ever computed the minimum hamming distance between any two values in a set of N values. But if you have pairs with a distance of 1 or 2 or 3 (or even 0 heaven forbid) you need to replace one of those values.
I've performed that test in an earlier version of Prophet. I don't remember the value I came up with - I think it was 14. I use 851 keys in all, and the Random() (or Random32()?) function from Crafty. Unfortunately I don't have the code laying around anymore to confirm that number.
--
James
Here's a thing to try:
compute the first N numbers (N = number you need in total) and compute the sum of squares of 64 - the hamming distance between each pair of numbers. Now generate number N+1, and try replacing each original number, one at a time, with the new number, and see if the sum of squares is smaller. Pick the set with the smallest sum of squares. Repeat until you can't stand it any longer. There are other ways to do the same sort of algorithmic approach to finding the best set.
The 64-N is so that smaller hamming distances produce larger numbers, and you want to minimize that. I might just do this myself to see how the optimized set compares to the first N which I now use...
I think I will do this on our cluster, which can eat this kind of problem up by letting each node use a different random number stream.
One note, if you do optimize the set of numbers, don't expect any better play. The tests Anthony and I ran on the ICGA hashing collision paper shows that this is not going to have any significant effect on the search at all. But it is an interesting exercise...