Starting with Hash Tables.

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Luis Babboni
Posts: 464
Joined: Sat Feb 28, 2015 4:37 pm
Location: Argentina

Re: Starting with Hash Tables.

Post by Luis Babboni »

Mmmmm..... what is the usual way to lay out TT entries with 256 MB as the limit?

I'm not convinced about using just 4 bytes to store the position key.
I use 8 bytes to detect the threefold-repetition rule.

I think 4,294,967,296 different numbers (2^(8x4)) are not enough to avoid the risk that different positions end up with the same key, even with as few as 10,000,000 positions analyzed, which is all that Soberango's poor performance allows at the 40/4 time control; maybe half of them are repeats, which leaves you 5,000,000.
I think (am I right?) that 5,000,000 keys have a higher chance of colliding among 4,295,000,000 values than 5 among 4,295, which is around 2 in 1,000 if I'm right.

.... maybe that risk is a reasonable one, no matter the consequences?
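
For what it's worth, here is a rough birthday-paradox check of that worry (just a sketch; the 5,000,000 positions and the 4-byte key are the assumptions from above, the rest is plain arithmetic):

Code:
#include <cstdio>

int main() {
    // Expected number of colliding pairs among n random keys drawn
    // from m possible values is roughly n * (n - 1) / (2 * m).
    double n = 5.0e6;          // distinct positions searched (assumed above)
    double m = 4294967296.0;   // 2^32 possible 4-byte keys
    double expectedPairs = n * (n - 1.0) / (2.0 * m);
    printf("Expected colliding key pairs: %.0f\n", expectedPairs);  // about 2900
    return 0;
}

So over the whole search some key collisions are practically certain, which is exactly the worry here.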
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Starting with Hash Tables.

Post by syzygy »

Luis Babboni wrote: .... maybe that risk is a reasonable one, no matter the consequences?
Stockfish stores only 16 bits, which gives only 65536 possible values. For some reason 0 is excluded, so 65535 values are left (and 1 in every 65536 positions simply cannot be stored).

To make matters worse, Stockfish stores 3 positions per bucket. This has the effect that, with a filled hash table, about one in every 22,000 probes that should fail (because the position is not stored in the TT) will incorrectly succeed and read random data.

So SF makes hundreds of hash errors per second, but still plays OK.
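
For readers new to this, a minimal sketch of that scheme (not Stockfish's actual layout; the field names and sizes here are only illustrative): the bucket index comes from the low bits of the Zobrist key, and only a 16-bit slice of the key is stored for verification.

Code:
#include <cstdint>
#include <cstddef>

// Illustrative TT entry: a 16-bit key slice plus the stored data.
struct TTEntry {
    uint16_t key16;     // 16-bit slice of the 64-bit Zobrist key
    int16_t  score;     // search score
    uint16_t move;      // best move, encoded
    uint8_t  depth;     // remaining search depth
    uint8_t  genBound;  // generation and bound type packed together
};

// A bucket holds three entries that all share one table index.
struct TTBucket {
    TTEntry entry[3];   // 3 * 8 = 24 bytes
    uint8_t pad[8];     // pad to 32 bytes so buckets pack nicely into cache lines
};
static_assert(sizeof(TTBucket) == 32, "bucket should be 32 bytes");

// A probe compares only 16 key bits, so with 3 entries per bucket
// roughly 3 / 65536 (about 1 in 22,000) of the probes that *should*
// miss will nevertheless match one of the entries by accident.
TTEntry* probe(TTBucket* table, size_t numBuckets, uint64_t key) {
    TTBucket& b = table[key % numBuckets];    // index from the low bits
    uint16_t key16 = (uint16_t)(key >> 48);   // verification slice from the high bits
    for (int i = 0; i < 3; ++i)
        if (b.entry[i].key16 == key16)
            return &b.entry[i];
    return nullptr;                           // genuine miss
}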
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Starting with Hash Tables.

Post by syzygy »

Luis Babboni wrote: I think 4,294,967,296 different numbers (2^(8x4)) are not enough to avoid the risk that different positions end up with the same key, even with as few as 10,000,000 positions analyzed, which is all that Soberango's poor performance allows at the 40/4 time control; maybe half of them are repeats, which leaves you 5,000,000.
I think (am I right?) that 5,000,000 keys have a higher chance of colliding among 4,295,000,000 values than 5 among 4,295, which is around 2 in 1,000 if I'm right.
You should not look at the total number of positions searched, but at the number of positions that might be stored in the location in memory where your probe function will look for the current position. It does not matter if the positions with the same key have been seen during the search but are no longer stored in the hashtable.
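
Concretely, with the Stockfish-style numbers from the previous post: a failing probe is compared against at most the 3 entries of one bucket, so the false-hit chance is about 3/65536 per failing probe, no matter whether the whole search has visited 5 million positions or 5 billion. Two positions with equal (partial) keys that never occupy the same slot at the same time cannot cause an error.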
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Starting with Hash Tables.

Post by Sven »

Luis Babboni wrote: Mmmmm..... what is the usual way to lay out TT entries with 256 MB as the limit?

I'm not convinced about using just 4 bytes to store the position key.
I use 8 bytes to detect the threefold-repetition rule.

I think 4,294,967,296 different numbers (2^(8x4)) are not enough to avoid the risk that different positions end up with the same key, even with as few as 10,000,000 positions analyzed, which is all that Soberango's poor performance allows at the 40/4 time control; maybe half of them are repeats, which leaves you 5,000,000.
I think (am I right?) that 5,000,000 keys have a higher chance of colliding among 4,295,000,000 values than 5 among 4,295, which is around 2 in 1,000 if I'm right.

.... maybe that risk is a reasonable one, no matter the consequences?
Of the 64 bits forming the hash key, 23 bits (in your case, with 8 million buckets) are taken as the index, so storing them is redundant since they *must* be the same for every entry in that bucket. The remaining 41 bits would need at least 6 bytes. Storing only four bytes of them means there is a small (but acceptable) chance of using a TT entry that actually belongs to a position with a different hash key.
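
A minimal sketch of that split, assuming 2^23 = 8,388,608 buckets (the function names are only illustrative):

Code:
#include <cstdint>

constexpr int      INDEX_BITS  = 23;                 // 2^23 = 8,388,608 buckets
constexpr uint64_t NUM_BUCKETS = 1ULL << INDEX_BITS;

// The low 23 bits select the bucket; they never need to be stored,
// because every entry in this bucket shares them by construction.
inline uint64_t bucketIndex(uint64_t key) {
    return key & (NUM_BUCKETS - 1);
}

// Of the remaining 41 bits, store only 32 as the verification key;
// the 9 bits that get dropped are the "small but acceptable" risk.
inline uint32_t verificationKey(uint64_t key) {
    return (uint32_t)(key >> INDEX_BITS);
}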

But that is not the only "inaccuracy" that needs to be accepted. The current estimate is that there are about 10^46.7 different chess positions, which is about 2^155. So on average there would be around 2^(155-64) = 2^91, or about 10^27, different positions sharing the same 64-bit hash key. But we don't care about that, relying a lot on the low probability that two positions that are "close" to each other (i.e., differing only by a few piece locations) have the same key. Zobrist hashing somehow gives us this level of confidence.

In practical games, actual hash collisions where two positions share the same hash key are rare enough to ignore the problem. Similar reasoning also allows us to store only part of the hash key bits that do not form the table index. A hash table can't provide "perfect" information; it is not designed for that purpose. But used efficiently and with care, it will provide "sufficiently reliable" information.
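
For completeness, a minimal sketch of the Zobrist scheme behind those keys (illustrative only, not taken from any particular engine): every (piece, square) pair gets a fixed random 64-bit number, the key of a position is the XOR of the numbers of all pieces on the board, and making a move updates the key incrementally with a few XORs.

Code:
#include <cstdint>
#include <random>

uint64_t zobrist[12][64];    // one random 64-bit number per (piece type, square)
uint64_t zobristSideToMove;  // a full engine also has castling and en-passant numbers

void initZobrist() {
    std::mt19937_64 rng(123456789);  // fixed seed: the keys must be reproducible
    for (int p = 0; p < 12; ++p)
        for (int sq = 0; sq < 64; ++sq)
            zobrist[p][sq] = rng();
    zobristSideToMove = rng();
}

// Quiet move of piece p from square 'from' to square 'to':
uint64_t updateKey(uint64_t key, int p, int from, int to) {
    key ^= zobrist[p][from];    // remove the piece from its old square
    key ^= zobrist[p][to];      // put it on the new square
    key ^= zobristSideToMove;   // the side to move flips
    return key;                 // a capture would also XOR out the captured piece
}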
Luis Babboni
Posts: 464
Joined: Sat Feb 28, 2015 4:37 pm
Location: Argentina

Re: Starting with Hash Tables.

Post by Luis Babboni »

Thanks Sven and Ronald, I think I understood everything but this: :oops:
syzygy wrote: ...
You should not look at the total number of positions searched, but at the number of positions that might be stored in the location in memory where your probe function will look for the current position. It does not matter if the positions with the same key have been seen during the search but are no longer stored in the hashtable.