Speaking of the hash table

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Speaking of the hash table

Post by Rebel »

Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?

I have them, not many, less than 1/10 per mil of the hash hits.

It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Speaking of the hash table

Post by diep »

Rebel wrote:Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?

I have them, not many, less than 1/10 per mil of the hash hits.

It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
Yeah it's unlikely it's collissions as with 64 bits only 1 in a 200 billion lookups should be a collission.

Must be implementation specific. Maybe if nullmove is best move you do not check the value of the register that you store with.

However... ..speaking about collissions...

If you index to the hashtable do you use bits of the hashkey?
The lowerbits?

In such case storing 64 bits or 48 bits is hardly difference as you already use those bits in the lookup.
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Speaking of the hash table

Post by zamar »

Multithreading. Typical size of hash table entry size is 128 bits.

Thread1.write(64bits)
Thread2.write(64bits)
Thread2.write(64bits)
Thread1.write(64bits)

leads to corrupted entry
Joona Kiiski
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Speaking of the hash table

Post by diep »

zamar wrote:Multithreading. Typical size of hash table entry size is 128 bits.

Thread1.write(64bits)
Thread2.write(64bits)
Thread2.write(64bits)
Thread1.write(64bits)

leads to corrupted entry
First of all this can only happen when you have a parallel searching engine.

Did Rebel already get SMP?

Secondly this happens even more seldom than collissions at PC hardware as it can only happen with stores :)

Around 1 in 200 billion stores :)

In case Ed has his hashtable aligned and a multiple of it forms a cacheline, you can prove it cannot happen at a PC.

Note that in case of an i7 (not sandy bridge), it shows itself as having 64 bytes cacheline, however the memory controller only stores 192 bytes at a time. Sandy Bridge stores 256 bytes at once.

Has 4 memory channels.

Note that when reading it can give a premature abort reading less bytes, which is why it has such fast latency for us to the RAM.

It can abort after reading 32 bytes.

So the only way it can go wrong is when an entry is at the end of a cacheline, in this case 256 bytes and by accident 1 cacheline gets updated quicker than another.

Odds of this happening is real real tiny, especially if your hashtable is some gigabytes in size.

So how you represented it cannot happen as it doesn't write 64 bits. It writes 256 bytes at once.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Speaking of the hash table

Post by diep »

Rebel wrote:Now that we are discussing the TT I have a question about a phenomenon I never understood --> illegal moves in the TT. How is this possible?

I have them, not many, less than 1/10 per mil of the hash hits.

It's not because of collisions, I can test with 48 and 64 bit keys and the percentage remains the same.
Another possibility: a dimm is broken. One that doesn't get used by the OS itself but where Rebel allocates hashtable.
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Speaking of the hash table

Post by zamar »

AFAIK hash key collisions and multithreading are the only source of illegal moves in TT.

Not 100% sure, but I think that cache line alignment doesn't protect you from the issue.

We have no control of the thread execution, so in extreme case it could happen:

thread1.write(64bits)
sleep(1second)
thread2.write(64bits)
thread2.write(64bits)
sleep(1second)
thread1.write(64bits)

[sleep(1second) = situation when OS decides to pause thread execution]
Joona Kiiski
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Speaking of the hash table

Post by Joost Buijs »

This must be some kind of bug, maybe there is something wrong with your Sobrist numbers.

I saw this happening once when I replaced my 64 bit random function with another one that contained a small bug.

I use 64 bit keys and I can let my program run for ages before I get a wrong move from the hashtable. I don't test for legality anymore because this really is not necessary.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Speaking of the hash table

Post by diep »

zamar wrote:AFAIK hash key collisions and multithreading are the only source of illegal moves in TT.

Not 100% sure, but I think that cache line alignment doesn't protect you from the issue.

We have no control of the thread execution, so in extreme case it could happen:

thread1.write(64bits)
sleep(1second)
thread2.write(64bits)
thread2.write(64bits)
sleep(1second)
thread1.write(64bits)

[sleep(1second) = situation when OS decides to pause thread execution]
What you write is impossible in the hardware :)

Even impossible in fact in hardware from 90s.

They wrote 32 bytes at once :)

It really has cachelines of 64 bytes which is 512 bits and it really writes 1536 bits at once not 64.

So only when an entry overlaps at 2 cachelines and 2 different threads start to write to the RAM then it can happen.

As all this is atomic optimized, even in case 2 write simultaneous, it still hardly happens. That's why odds for it is smaller than a collision.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Speaking of the hash table

Post by diep »

Joost Buijs wrote:This must be some kind of bug, maybe there is something wrong with your Sobrist numbers.

I saw this happening once when I replaced my 64 bit random function with another one that contained a small bug.

I use 64 bit keys and I can let my program run for ages before I get a wrong move from the hashtable. I don't test for legality anymore because this really is not necessary.
Maybe just a bad DIMM.
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: Speaking of the hash table

Post by Joost Buijs »

Yeah, this is possible. But with a bad DIMM Windows will probably crash as well. I know there are many Windows haters overhere, but I have to knock it off, Windows 7 never crashed on me once since I use it. And that is quite some time.