Only the bits that are not part of the index into the hash table count. If the hash table has say 2^24 entries, only 40 bits of the hash key are available to distinguish between nodes mapped to the same entry. Collission rate is 1 in every 2^40 nodes. If the hash table has 2^28 entries, collission rate will be 1 in every 2^36 nodes. (Of course if the index comes from a different hash key, the full 64 bits count.)Daniel Shawul wrote:Ok that will be 2^64 with a bucket size of 1 for a 64 bits hashkey (no change).
The replacement scheme has no effect on the collision rate (if we ignore the rate of hash hits). With a full table and a bucket size of 1, for each node searched there is only a collision if the node currently stored in its bucket happens to have the same hashkey, which is determined by the distinguishing bits (40 bits or 36 bits in the examples above).My doubt is with the replacement scheme.
Of course the replacement scheme will affect individual collisions (if B does not overwrite C, there will be a collision between A and C, if B does overwrite C, there is no collision), but it will not affect the overall collision rate. The probability that A == C is the same as the probability that B == C. The point is that each time you compare to only one key. With a very small hash table (and the full 64 bits stored, so that the number of bits that "count" is high), collisions will be extremely rare even if the total tree searched has many nodes with identical hash keys.In the previous birthday formula, we assume that each key is compared against all available keys which is what brings up the probablity higher. But for this case I don't know how to quantitiy how many useful comparisons that lead to collision will be done. With a hashtable many positions map to the same entry (assume butcket_size=1). The entry is home to many hash signatures that may actually be real collisions or not in most cases. To detect collision the stored key should happen to be exactly same as the one we are going to store. Say A,B,C map to the same entry but only A == C. First we store C, then B, then A. Thus we will not detect collision. So don't you think the replacement scheme will have an effect ?
Of course everything under the assumption that the Zobrist keys are independently and uniformly distributed.