Hi.
I tried to save some space in my transposition table by saving only 16 bits of the zobrist key instead of 32 bits as I did before.
Now with default tt size of ~128MB I'm getting a (not detected) hash collision just by searhing from startpos up to depth 11:
First Position:
Key: 631bda933644df0e
FEN: rnbqkb1r/pppp1ppp/7B/8/3QP3/8/PPP2PPP/RN2KBNR b KQkq - 0 4
Second Position:
Key: 631b27f21cc4df0e
FEN: r1bqkb1r/ppp1p1pp/2n4n/3p1P2/8/2NB1P2/PPPP2PP/R1BQK1NR b KQkq - 0 5
The first 16 bits are saved in the TT entry, the last 22 bits are used for the index number in the table so you can see the collision can't be detected.
My question is: Is this collision "normal" with only 16 bits saved? Or is there a problem in my zobrist code or maybe even in the random number generator I'm using (http://www.burtleburtle.net/bob/rand/smallprng.html)?
Any help is appreciated.
- Andreas
Not detected collisions in tt probing
Moderators: hgm, Rebel, chrisw
-
- Posts: 584
- Joined: Fri Mar 30, 2018 7:20 am
- Full name: Andreas Matthies
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Not detected collisions in tt probing
There are two things to check, the size of the hash, and how to get the index. Have a look at https://chessprogramming.wikispaces.com ... on%20Table if not already and also this http://preshing.com/20110504/hash-colli ... abilities/.
In Weini, I compute theoretical (type-2) collision rate using this approximation. My hash are 64bits, so the risk of type-1 collision is quite low.
In Weini, I compute theoretical (type-2) collision rate using this approximation. My hash are 64bits, so the risk of type-1 collision is quite low.
Code: Select all
double n = (double)Stats::visitednodes;
double m = (double)Definitions::ttSize;
double col = n-m+m*std::pow(1.-1./m,n);
LOG(logINFO) << "col rate (theorical) " << (n>0?100.f*col/n:0) << "%";
-
- Posts: 584
- Joined: Fri Mar 30, 2018 7:20 am
- Full name: Andreas Matthies
Re: Not detected collisions in tt probing
Thanks for your answer. Of course I know the chessprogramming page and my problem deals with this statement:
"... many programmers choose to trade-off space for accuracy and only store that part of the hash key not already considered as index, or even less".
In my engine I always used the "even less". I have a (default) 128MB tt size which leads (considering entry size and bucket number) to an index length of 22bit (lower bits of the key) and I'm storing (upper) 32bit of the key inside the entry. So there remain 64-22-32 = 10bit for undetectable hash collision. Probability of two different hash keys differ only in these 10bit should be VERY low.
Now I try to store only 16 bit which increases the probability to "two hash keys differ only in these (64-22-16=)26bit". Before I start to calculate, I had a look at the stockfish source and they also store only 16bit of the key in the tt entry. So I guess it should be possible, shouldn't it?
I will try a different PRNG and see if it works better...
- Andreas
"... many programmers choose to trade-off space for accuracy and only store that part of the hash key not already considered as index, or even less".
In my engine I always used the "even less". I have a (default) 128MB tt size which leads (considering entry size and bucket number) to an index length of 22bit (lower bits of the key) and I'm storing (upper) 32bit of the key inside the entry. So there remain 64-22-32 = 10bit for undetectable hash collision. Probability of two different hash keys differ only in these 10bit should be VERY low.
Now I try to store only 16 bit which increases the probability to "two hash keys differ only in these (64-22-16=)26bit". Before I start to calculate, I had a look at the stockfish source and they also store only 16bit of the key in the tt entry. So I guess it should be possible, shouldn't it?
I will try a different PRNG and see if it works better...
- Andreas
-
- Posts: 5563
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Not detected collisions in tt probing
16 bits can distinguish no more than 65536 positions, so this is normal.
That the other 22 bits are identical for these positions is no surprise, since all positions mapped to that TT entry have those 22 bits in common. Once the TT is full, every new probe of a TT entry will find a position in that entry that necessarily shares those 22 bits with the position being probed. If the probe is a real hit, the other 16 bits will also be the same. If the probe is not a real hit, the other bits will still be the same once for every 65536 positions probed.
So if you search a million positions per second, you should expect a dozen or so undetected errors per second.
In Stockfish the situation is even worse since each TT bucket contains three positions, which triples the number of false hits. Apparently this does not affect playing strength in a measurable way.
That the other 22 bits are identical for these positions is no surprise, since all positions mapped to that TT entry have those 22 bits in common. Once the TT is full, every new probe of a TT entry will find a position in that entry that necessarily shares those 22 bits with the position being probed. If the probe is a real hit, the other 16 bits will also be the same. If the probe is not a real hit, the other bits will still be the same once for every 65536 positions probed.
So if you search a million positions per second, you should expect a dozen or so undetected errors per second.
In Stockfish the situation is even worse since each TT bucket contains three positions, which triples the number of false hits. Apparently this does not affect playing strength in a measurable way.
-
- Posts: 5563
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Not detected collisions in tt probing
This formula is always mentioned when this topic comes up but imo not very useful since it ignores the reality that the hash table gets full and its entries overwritten. It does not matter that many positions will have identical hash keys if those positions are not stored in the TT at the same time.xr_a_y wrote: ↑Wed May 09, 2018 2:34 pm There are two things to check, the size of the hash, and how to get the index. Have a look at https://chessprogramming.wikispaces.com ... on%20Table if not already and also this http://preshing.com/20110504/hash-colli ... abilities/.
In Weini, I compute theoretical (type-2) collision rate using this approximation. My hash are 64bits, so the risk of type-1 collision is quite low.Code: Select all
double n = (double)Stats::visitednodes; double m = (double)Definitions::ttSize; double col = n-m+m*std::pow(1.-1./m,n); LOG(logINFO) << "col rate (theorical) " << (n>0?100.f*col/n:0) << "%";
Once you assume the TT is full, which it will always be if you don't clear the TT between moves, the math becomes very simple. With N bits stored in the TT entry and M entries per bucket, the error rate is 1 error per 2^N/M probes that do not hit.
Last edited by syzygy on Wed May 09, 2018 5:06 pm, edited 1 time in total.
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Not detected collisions in tt probing
You could also check if the hash move is legal in the current position, as further verification.
One final twist to that trick consist of storing the move XORed with the additional bits of the hash key that you are not saving or using for addressing. That way the probability that the move you read back is valid is drastically reduced. If you use a 16-bit move encoding and there are only 40 legal moves, the trick will reduce the undetected collision rate by a factor of ~1600.
One final twist to that trick consist of storing the move XORed with the additional bits of the hash key that you are not saving or using for addressing. That way the probability that the move you read back is valid is drastically reduced. If you use a 16-bit move encoding and there are only 40 legal moves, the trick will reduce the undetected collision rate by a factor of ~1600.
-
- Posts: 584
- Joined: Fri Mar 30, 2018 7:20 am
- Full name: Andreas Matthies
Re: Not detected collisions in tt probing
This is really a miracle.
My engine starts to play really crap if I e.g. store the static evaluation of the position in tt and use it with (not reliable working) 16bit hash key detection.
Changing to a different PRNG doesn't change anything as you expected. So I will stay at 32bit keys in the tt.
Thanks!
-
- Posts: 584
- Joined: Fri Mar 30, 2018 7:20 am
- Full name: Andreas Matthies
Re: Not detected collisions in tt probing
Good idea. WIll try if the descrease of speed by this extra check can be compensated by the saved space in tt.AlvaroBegue wrote: ↑Wed May 09, 2018 3:36 pm You could also check if the hash move is legal in the current position, as further verification.
One final twist to that trick consist of storing the move XORed with the additional bits of the hash key that you are not saving or using for addressing. That way the probability that the move you read back is valid is drastically reduced. If you use a 16-bit move encoding and there are only 40 legal moves, the trick will reduce the undetected collision rate by a factor of ~1600.
Thanks.
-
- Posts: 536
- Joined: Thu Mar 09, 2006 3:01 pm
Re: Not detected collisions in tt probing
Hash collisions have been looked at and have surprisingly little impact.
See:
http://www.craftychess.com/hyatt/collisions.html
See:
http://www.craftychess.com/hyatt/collisions.html
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Not detected collisions in tt probing
Indeed, an a-b search is quite permissive for an occasional error now and then.brianr wrote: ↑Wed May 09, 2018 4:00 pm Hash collisions have been looked at and have surprisingly little impact.
See:
http://www.craftychess.com/hyatt/collisions.html
The biggest problem is that there is a possibility that the engine will crash when it gets an illegal move from the TT without verifying it.
To be on the safe side I store the whole 64 bit key and still have room enough for everything else.
I use 4 entries of 16 byte in a bucket, each bucket aligned on a 64 byte cache line, I don't see any reason why I should try to make it smaller.
Nowadays a PC has so much memory that you will never use it completely for your TT anyway.