Rotated hash

AlvaroBegue · Post by **AlvaroBegue** » Tue Sep 13, 2016 8:14 pm

bob wrote:
jwes wrote:
syzygy wrote:The math is easy, but to do the math we have to know a bit more.

1. How do you know a hit is false? Or in other words: what does tt_falsehits mean precisely?
I have a second independent 32 bit hash key for testing, so tt_falsehits means that the first key matched, but the second key didn't.

syzygy wrote:2. Do you use buckets of size 1?
Yes, but I use wtm as the low bit of the index, so all entries for a bucket ar the same color.

syzygy wrote:3. Do you store the whole 64-bit key in hash entries?
Only 48 bits, but I used 27 bits of the 64 for the index.
Doing this, how do you know it is not the 32 bit independent hash key that is causing the problem?

If the 32-bit verification hash is truly independent, he has one chance in 2^32 of being wrong. I like those odds.

hgm · Post by **hgm** » Tue Sep 13, 2016 8:15 pm

This can leave collisions undetected (with an extremely low probability), but it can never make the same position have different keys, and fake a colision that way.

bob · Post by **bob** » Tue Sep 13, 2016 10:36 pm

AlvaroBegue wrote:
bob wrote:
jwes wrote:
syzygy wrote:The math is easy, but to do the math we have to know a bit more.

1. How do you know a hit is false? Or in other words: what does tt_falsehits mean precisely?
I have a second independent 32 bit hash key for testing, so tt_falsehits means that the first key matched, but the second key didn't.

syzygy wrote:2. Do you use buckets of size 1?
Yes, but I use wtm as the low bit of the index, so all entries for a bucket ar the same color.

syzygy wrote:3. Do you store the whole 64-bit key in hash entries?
Only 48 bits, but I used 27 bits of the 64 for the index.
Doing this, how do you know it is not the 32 bit independent hash key that is causing the problem?
If the 32-bit verification hash is truly independent, he has one chance in 2^32 of being wrong. I like those odds.

Not so fast. EVERY probe has a probability of being wrong. Which would you trust, a 64 bit key or a 32 bit key? If you compare both and call it an error when one matches and the other doesn't, WHICH one was the problem???

That was why I asked.

Only way I know to do this with any accuracy is to NOT hash the second value, store the actual complete board position which can be done in something like 168 bits. Then you can not only compare hash signatures, but actual board positions. If they disagree, only the hash can be wrong.

This is how I have measured collision rates in the past.

bob · Post by **bob** » Tue Sep 13, 2016 10:37 pm

hgm wrote:This can leave collisions undetected (with an extremely low probability), but it can never make the same position have different keys, and fake a colision that way.

Why? If you compute the 64 bit key and the 32 bit key separately, how can you not have occasional disagreement on matches unless you somehow have zero actual collisions in either?

syzygy · Post by **syzygy** » Tue Sep 13, 2016 10:50 pm

jwes wrote:
syzygy wrote:The math is easy, but to do the math we have to know a bit more.

1. How do you know a hit is false? Or in other words: what does tt_falsehits mean precisely?
I have a second independent 32 bit hash key for testing, so tt_falsehits means that the first key matched, but the second key didn't.

OK, excellent.

syzygy wrote:2. Do you use buckets of size 1?
Yes, but I use wtm as the low bit of the index, so all entries for a bucket ar the same color.

I don't think using wtm exclusively for the low bit of the index is the best idea, but we can leave that for another discussion.

syzygy wrote:3. Do you store the whole 64-bit key in hash entries?
Only 48 bits, but I used 27 bits of the 64 for the index.

Yes, so the real question is how many bits you still have left in an entry to distinguish it from other keys that go to that same entry. So the answer is 64-27=37?

Ideally you would get on average one false hit every 1 in 2^37 unsuccesful probes. 2^37 is 137 billion and you get 1 after about 50 billion. That seems to be well within the range of what you may expect based on just a single false hit.

syzygy · Post by **syzygy** » Tue Sep 13, 2016 10:53 pm

bob wrote:
jwes wrote:
syzygy wrote:The math is easy, but to do the math we have to know a bit more.

1. How do you know a hit is false? Or in other words: what does tt_falsehits mean precisely?
I have a second independent 32 bit hash key for testing, so tt_falsehits means that the first key matched, but the second key didn't.

syzygy wrote:2. Do you use buckets of size 1?
Yes, but I use wtm as the low bit of the index, so all entries for a bucket ar the same color.

syzygy wrote:3. Do you store the whole 64-bit key in hash entries?
Only 48 bits, but I used 27 bits of the 64 for the index.
Doing this, how do you know it is not the 32 bit independent hash key that is causing the problem?

The "shortness" of the 32-bit independent hash key will only result in 1 in 4 billion false hits going undetected. (For probes that miss there is no test of the independent key. For probes that are true hits, the independent hash keys will necessarily correspond as well.)

hgm · Post by **hgm** » Tue Sep 13, 2016 11:29 pm

bob wrote:
hgm wrote:This can leave collisions undetected (with an extremely low probability), but it can never make the same position have different keys, and fake a colision that way.
Why? If you compute the 64 bit key and the 32 bit key separately, how can you not have occasional disagreement on matches unless you somehow have zero actual collisions in either?

Both keys are a single-valued function of the position. So if the 32-bit key differs, then you can be absolutely certain the positions were not the same. If the 64-bit key then is equal, you can be absolutely certain that that key had a collision...

Rotated hash

Re: Rotated hash

Re: Rotated hash

Re: Rotated hash

Re: Rotated hash

Re: Rotated hash

Re: Rotated hash

Re: Rotated hash