Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?
I have them, not many, less than 1/10 per mil of the hash hits.
It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
Speaking of the hash table
Moderators: hgm, Rebel, chrisw
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Speaking of the hash table
Yeah it's unlikely it's collissions as with 64 bits only 1 in a 200 billion lookups should be a collission.Rebel wrote:Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?
I have them, not many, less than 1/10 per mil of the hash hits.
It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
Must be implementation specific. Maybe if nullmove is best move you do not check the value of the register that you store with.
However... ..speaking about collissions...
If you index to the hashtable do you use bits of the hashkey?
The lowerbits?
In such case storing 64 bits or 48 bits is hardly difference as you already use those bits in the lookup.
-
- Posts: 613
- Joined: Sun Jan 18, 2009 7:03 am
Re: Speaking of the hash table
Multithreading. Typical size of hash table entry size is 128 bits.
Thread1.write(64bits)
Thread2.write(64bits)
Thread2.write(64bits)
Thread1.write(64bits)
leads to corrupted entry
Thread1.write(64bits)
Thread2.write(64bits)
Thread2.write(64bits)
Thread1.write(64bits)
leads to corrupted entry
Joona Kiiski
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Speaking of the hash table
First of all this can only happen when you have a parallel searching engine.zamar wrote:Multithreading. Typical size of hash table entry size is 128 bits.
Thread1.write(64bits)
Thread2.write(64bits)
Thread2.write(64bits)
Thread1.write(64bits)
leads to corrupted entry
Did Rebel already get SMP?
Secondly this happens even more seldom than collissions at PC hardware as it can only happen with stores
Around 1 in 200 billion stores
In case Ed has his hashtable aligned and a multiple of it forms a cacheline, you can prove it cannot happen at a PC.
Note that in case of an i7 (not sandy bridge), it shows itself as having 64 bytes cacheline, however the memory controller only stores 192 bytes at a time. Sandy Bridge stores 256 bytes at once.
Has 4 memory channels.
Note that when reading it can give a premature abort reading less bytes, which is why it has such fast latency for us to the RAM.
It can abort after reading 32 bytes.
So the only way it can go wrong is when an entry is at the end of a cacheline, in this case 256 bytes and by accident 1 cacheline gets updated quicker than another.
Odds of this happening is real real tiny, especially if your hashtable is some gigabytes in size.
So how you represented it cannot happen as it doesn't write 64 bits. It writes 256 bytes at once.
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Speaking of the hash table
Another possibility: a dimm is broken. One that doesn't get used by the OS itself but where Rebel allocates hashtable.Rebel wrote:Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?
I have them, not many, less than 1/10 per mil of the hash hits.
It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
-
- Posts: 613
- Joined: Sun Jan 18, 2009 7:03 am
Re: Speaking of the hash table
AFAIK hash key collisions and multithreading are the only source of illegal moves in TT.
Not 100% sure, but I think that cache line alignment doesn't protect you from the issue.
We have no control of the thread execution, so in extreme case it could happen:
thread1.write(64bits)
sleep(1second)
thread2.write(64bits)
thread2.write(64bits)
sleep(1second)
thread1.write(64bits)
[sleep(1second) = situation when OS decides to pause thread execution]
Not 100% sure, but I think that cache line alignment doesn't protect you from the issue.
We have no control of the thread execution, so in extreme case it could happen:
thread1.write(64bits)
sleep(1second)
thread2.write(64bits)
thread2.write(64bits)
sleep(1second)
thread1.write(64bits)
[sleep(1second) = situation when OS decides to pause thread execution]
Joona Kiiski
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Speaking of the hash table
This must be some kind of bug, maybe there is something wrong with your Sobrist numbers.
I saw this happening once when I replaced my 64 bit random function with another one that contained a small bug.
I use 64 bit keys and I can let my program run for ages before I get a wrong move from the hashtable. I don't test for legality anymore because this really is not necessary.
I saw this happening once when I replaced my 64 bit random function with another one that contained a small bug.
I use 64 bit keys and I can let my program run for ages before I get a wrong move from the hashtable. I don't test for legality anymore because this really is not necessary.
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Speaking of the hash table
What you write is impossible in the hardwarezamar wrote:AFAIK hash key collisions and multithreading are the only source of illegal moves in TT.
Not 100% sure, but I think that cache line alignment doesn't protect you from the issue.
We have no control of the thread execution, so in extreme case it could happen:
thread1.write(64bits)
sleep(1second)
thread2.write(64bits)
thread2.write(64bits)
sleep(1second)
thread1.write(64bits)
[sleep(1second) = situation when OS decides to pause thread execution]
Even impossible in fact in hardware from 90s.
They wrote 32 bytes at once
It really has cachelines of 64 bytes which is 512 bits and it really writes 1536 bits at once not 64.
So only when an entry overlaps at 2 cachelines and 2 different threads start to write to the RAM then it can happen.
As all this is atomic optimized, even in case 2 write simultaneous, it still hardly happens. That's why odds for it is smaller than a collision.
-
- Posts: 1822
- Joined: Thu Mar 09, 2006 11:54 pm
- Location: The Netherlands
Re: Speaking of the hash table
Maybe just a bad DIMM.Joost Buijs wrote:This must be some kind of bug, maybe there is something wrong with your Sobrist numbers.
I saw this happening once when I replaced my 64 bit random function with another one that contained a small bug.
I use 64 bit keys and I can let my program run for ages before I get a wrong move from the hashtable. I don't test for legality anymore because this really is not necessary.
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Speaking of the hash table
Yeah, this is possible. But with a bad DIMM Windows will probably crash as well. I know there are many Windows haters overhere, but I have to knock it off, Windows 7 never crashed on me once since I use it. And that is quite some time.