Hashing based on move lists

ZirconiumX · Post by **ZirconiumX** » Sun Jun 30, 2013 6:47 pm

Gusev wrote:
Henk wrote:
syzygy wrote:
Henk wrote:In my chess program, which has a dreadful performance by the way, I use the hashcode to look up an entry. If there are more entries with the same hashcode an equality operation is applied on the board representation.
So you store the whole board representation in the hash entry. That is not very competitive.

I do not want a hash function that possibly may look up wrong positions through collisions. I'm not gona implement a fruit machine. (gambling machine).
If you prefer dreadful performance over a negligble amount of incorrect probes that have an even more negligble chance of affecting the move played, then this is certainly the way to go.
I do not know at this moment if this dreadful performance is caused by cloning a complete key or a partial key(zobrist key). Of course it's less efficient. Do you know what percentage of these zobrist probes collide ?
See http://www.cis.uab.edu/hyatt/collisions.html for info.

With some estimated figures, (and an actual one), I get:

Code: Select all

                                                 1
--------------------------------------------------------------------------------------------
  100,000 games played with no collisions * 60 moves/game * 1,000,000,000 probes per search

= 1.6666666 * 10^-16

Needless to say, the day Crafty gets a hash collision is the day I buy a lottery ticket.

Matthew:out

Gerd Isenberg · Post by **Gerd Isenberg** » Sun Jun 30, 2013 7:13 pm

Gusev wrote:
ZirconiumX wrote:This is a chess-specific, very low collision (if not zero) hashing method.

This is basically a "do one thing and do it well" method.

Matthew:out
I don't know about zero, because it is impossible to describe a chess position uniquely using only 64 bits. The main idea is that we should be able to do better by controlling piece signatures directly, than by relying upon a pseudo-random number generator.

Sorry if I missed it, how do you distinguish same type white and black pieces (no pawns) which have same attack sets in the final 64-bit key?

ZirconiumX · Post by **ZirconiumX** » Sun Jun 30, 2013 7:21 pm

Gerd Isenberg wrote:
Gusev wrote:
ZirconiumX wrote:This is a chess-specific, very low collision (if not zero) hashing method.

This is basically a "do one thing and do it well" method.

Matthew:out
I don't know about zero, because it is impossible to describe a chess position uniquely using only 64 bits. The main idea is that we should be able to do better by controlling piece signatures directly, than by relying upon a pseudo-random number generator.
Sorry if I missed it, how do you distinguish same type white and black pieces (no pawns) which have same attack sets in the final 64-bit key?

He takes the complement (~attack_set) of the piece's attack set for black.

Matthew:out

rjgibert · Post by **rjgibert** » Sun Jun 30, 2013 7:28 pm

Gerd Isenberg wrote:
rjgibert wrote:Using OR is terrible. With pawns on b3, e3 and h3, adding any number of additional pawns to the 3rd rank will leave the signature unchanged.

Using XOR is better, but still bad. Compare the position with pawns on a3, b3, g3 and h3 with the position with pawns on d3 and e3.
Yep, but that collision already requires a lot of plies to occur in a typical search with a common root ancestor. I somehow like the idea to use attack sets on the otherwise empty board as "Zobrist" keys. I guess along with checking stored move is legal, this is quite safe. I guess there are collisions with less pieces different, but I fail to find some on the fly.

I assume you are referring to the XOR version. I used an example with pawns, because pawns are numerous. So how about this example with rooks: Compare the position with rooks on f3 and c6 against the position with rooks on f6 and c3.

sje · Post by **sje** » Sun Jun 30, 2013 7:29 pm

What is the expected false positive rate of 128 bit keys vs 64 bit keys? The answer depends on many factors, but my estimate is if a fast, multicore machine gets less than one false positive per day with 64 bit hashing, then moving to 128 bit hashing (reducing the error rate by about a factor of 2^32) means a false positive will occur at the rate of less than one per ten million years.

Cosmic rays causing random bit flips are much more of a threat.

The slowdown in per node processing time is almost unmeasurable. Actually, since no move sanity test is needed with 128 bit keys, the overall rate may be slightly higher. The only real penalty is the extra space needed for each transposition table entry and the added time required for access.

ZirconiumX · Post by **ZirconiumX** » Sun Jun 30, 2013 7:39 pm

sje wrote:What is the expected false positive rate of 128 bit keys vs 64 bit keys? The answer depends on many factors, but my estimate is if a fast, multicore machine gets less than one false positive per day with 64 bit hashing, then moving to 128 bit hashing (reducing the error rate by about a factor of 2^32) means a false positive will occur at the rate of less than one per ten million years.

Cosmic rays causing random bit flips are much more of a threat.

The slowdown in per node processing time is almost unmeasurable. Actually, since no move sanity test is needed with 128 bit keys, the overall rate may be slightly higher. The only real penalty is the extra space needed for each transposition table entry and the added time required for access.

This is true, but Zobrist keys have one problem. They are based on random numbers, and if the random numbers aren't very random at all, then you might have a problem.

Gusev hashing works by removing the randomness element, but is slightly harder to compute.

Matthew:out

ZirconiumX · Post by **ZirconiumX** » Sun Jun 30, 2013 7:50 pm

rjgibert wrote:
Gerd Isenberg wrote:
rjgibert wrote:Using OR is terrible. With pawns on b3, e3 and h3, adding any number of additional pawns to the 3rd rank will leave the signature unchanged.

Using XOR is better, but still bad. Compare the position with pawns on a3, b3, g3 and h3 with the position with pawns on d3 and e3.
Yep, but that collision already requires a lot of plies to occur in a typical search with a common root ancestor. I somehow like the idea to use attack sets on the otherwise empty board as "Zobrist" keys. I guess along with checking stored move is legal, this is quite safe. I guess there are collisions with less pieces different, but I fail to find some on the fly.
I assume you are referring to the XOR version. I used an example with pawns, because pawns are numerous. So how about this example with rooks: Compare the position with rooks on f3 and c6 against the position with rooks on f6 and c3.

Rf3 and Rc6:

Code: Select all

- - X - - X - -
- - X - - X - -
X X X X X - X X
- - X - - X - -
- - X - - X - -
X X - X X X X X
- - X - - X - -
- - X - - X - -

Rc3 and Rf6:

Code: Select all

- - X - - X - -
- - X - - X - -
X X - X X X X X
- - X - - X - -
- - X - - X - -
X X X X X - X X
- - X - - X - -
- - X - - X - -

Still bulletproof.

Matthew:out

Gerd Isenberg · Post by **Gerd Isenberg** » Sun Jun 30, 2013 7:54 pm

ZirconiumX wrote:
Gerd Isenberg wrote:
Gusev wrote:
ZirconiumX wrote:This is a chess-specific, very low collision (if not zero) hashing method.

This is basically a "do one thing and do it well" method.

Matthew:out
I don't know about zero, because it is impossible to describe a chess position uniquely using only 64 bits. The main idea is that we should be able to do better by controlling piece signatures directly, than by relying upon a pseudo-random number generator.
Sorry if I missed it, how do you distinguish same type white and black pieces (no pawns) which have same attack sets in the final 64-bit key?
He takes the complement (~attack_set) of the piece's attack set for black.

Matthew:out

Thanks, just found the init code in position.cpp, ahh yes, flip is complement and not byteswap. Complement is xor with -1, so each even number of black pieces xored, cancels the complement, hmm..., sounds dangerous.

hgm · Post by **hgm** » Sun Jun 30, 2013 8:03 pm

Indeed, this is what I also remarked above, except that it was not clear to me that they are just using Zobrist hashing with hand-made keys. (Or rather, I could not believe they would propose something so obviously flawed.)

What they propose is of course disastrous. It is hard to think of a scheme that would cause more collisions than this. Perhaps unless you would initialize all keys to zero from the start...

ZirconiumX · Post by **ZirconiumX** » Sun Jun 30, 2013 8:04 pm

Gerd Isenberg wrote:
ZirconiumX wrote:
Gerd Isenberg wrote:
Gusev wrote:
ZirconiumX wrote:This is a chess-specific, very low collision (if not zero) hashing method.

This is basically a "do one thing and do it well" method.

Matthew:out
I don't know about zero, because it is impossible to describe a chess position uniquely using only 64 bits. The main idea is that we should be able to do better by controlling piece signatures directly, than by relying upon a pseudo-random number generator.
Sorry if I missed it, how do you distinguish same type white and black pieces (no pawns) which have same attack sets in the final 64-bit key?
He takes the complement (~attack_set) of the piece's attack set for black.

Matthew:out
Thanks, just found the init code in position.cpp, ahh yes, flip is complement and not byteswap. Complement is xor with -1, so each even number of black pieces xored, cancels the complement, hmm..., sounds dangerous.

Got any better ideas?

Matthew:out

Hashing based on move lists

Re: Hashing based on move lists

Re: Hashing based on move lists

Re: Hashing based on move lists

Re: Hashing based on move lists

A brief comment on using 128 bit Zobrist hashing

Re: A brief comment on using 128 bit Zobrist hashing

Re: Hashing based on move lists

Re: Hashing based on move lists

Re: Hashing based on move lists

Re: Hashing based on move lists