Zobrist keys - measure of quality?

Discussion of chess software programming and technical issues.

Moderator: Ras

Rein Halbersma
Posts: 772
Joined: Tue May 22, 2007 11:13 am

Re: Zobrist keys - measure of quality?

Post by Rein Halbersma »

hgm wrote:
mar wrote:..., with the idea that this should (in theory) lower the possibility of cancellation,
This theory has already been debunked here many times. E.g. with four 12-bit keys, 0xF00, 0x0F0, 0x00F and 0xFFF any pair has 8 out of 12 different (the 'Hamming distance'), while they are dependent. (The XOR of all of them is 0.) While a simple 1, 2, 4, 8 would have provided collision-free keys, while they would all have 10 bits in common.

In fact there exists an algorithm to generate a set of keys with maximum Hamming distance. This results in a set where any key is the XOR of 2 others. Which is absolutely the worst possible key set.

A good way to test the quality of your key set is to store all XORs of 3 keys in a hash table (using that XOR as the hash). With 1000 keys you get 166M XORs. If two diffent sets give the same XOR (i.e. you get a collission in that hash table) your key set contains a 6th-order dependency, which is pretty bad. To detect 7th-order dependencies you can check for all products of 4 keys if it occurs in the table. (This takes very long with 1000 keys, as you have ~40G combinations.)
Zobrist hashing is also known as Tabulation hashing in the literature, and in the last few years it has been shown that it has a lot of nice statistical properties, even though Zobrist hashing is not even 4-independent in general. See e.g. http://arxiv.org/pdf/1011.5200.pdf or google for the authors' names for more accessible introductions.
User avatar
SMIRF
Posts: 91
Joined: Wed Mar 26, 2014 4:29 pm
Location: Buettelborn/Hessen/Germany

Re: Zobrist keys - measure of quality?

Post by SMIRF »

mar wrote:Thanks, so now I see that it's nonsense (or wrong measure)... Well... I also found in the links posted by Evert that it has been discussed already in 2001.
Whether it would be nonsense or not, I am using 64 bit zobrist atoms consisting of 32 0-bits and 32 1-bits. Thus applying a xored Zobrist atom always will switch 32 bits.

Random values traditionally chosen as zobrist atoms could in extrem encounter a zero value which would then switch nothing at all. Thus I concluded, the bigger the absolute difference between zero- and one-bits, the greater the chance of producing indentic overlayed keys.

This is pure intention, but I am sure, that using such equal bitted (randomized) Zobrist atoms as I do cannot harm anything.
AlvaroBegue
Posts: 932
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Zobrist keys - measure of quality?

Post by AlvaroBegue »

I studied the number of collisions for a hash function in a different setting (hashing short strings, like ticker symbols in the stock market) and I found that cyclic redundancy codes (CRC) performed better than could be expected from purely random numbers. You can describe CRC as a Zobrist key, by making the i-th entry in the Zobrist table be the coefficients of the polynomial x^(i+64) modulo P(x), where P(x) is a 65-th degree polynomial.

If you want to test out this idea, here's some C++ code to generate the Zobrist table:

Code: Select all

#include <cstdio>

int main() {
  unsigned long long const P = 0x42F0E1EBA9EA3693ull;
  unsigned long long h = P;
  for (int i = 0; i < 1000; ++i) {
    std::printf("%016llx\n", h);
    h = (h & 0x8000000000000000u) ? (h << 1) ^ P : h << 1;
  }
}
sean_vn
Posts: 4
Joined: Thu Feb 05, 2015 4:49 pm

Re: Zobrist keys - measure of quality?

Post by sean_vn »

Interesting discussion. This seems like a good forum for the sort of programming I am doing at the moment. Has anyone done any work on quasi-random numbers and hash tables? I tried Marsaglia's simple Weyl number idea with hash tables. It seems better than using a linear probe offset of 1.
AlvaroBegue
Posts: 932
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Zobrist keys - measure of quality?

Post by AlvaroBegue »

Arg, there is a typo and the forum won't let me edit the message. Here's the corrected code:

Code: Select all

#include <cstdio>

int main() {
  unsigned long long const P = 0x42F0E1EBA9EA3693ull;
  unsigned long long h = P;
  for (int i = 0; i < 1000; ++i) {
    std::printf("%016llx\n", h);
    h = (h & 0x8000000000000000ull) ? (h << 1) ^ P : (h << 1);
  }
}
mar
Posts: 2691
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Zobrist keys - measure of quality?

Post by mar »

AlvaroBegue wrote:I studied the number of collisions for a hash function in a different setting (hashing short strings, like ticker symbols in the stock market) and I found that cyclic redundancy codes (CRC) performed better than could be expected from purely random numbers.
Interesting, I did a similar test of hash functions some time ago (looking for a good general-purpose hashing function) and found that CRC-32 indeed has pretty good distribution (but also quite slow unless you use slicing tables).
I also found that Hsieh's SuperFastHash is not a) that super fast and b) produces more collisions that other hashes I tried.
I used english wordlist of 1.6M words, hashing strings.
I'm posting my hash function here (originally inspired by MurmurHash3), I hereby place it in public domain. It only has 32-bit output though.
Another disadvantage right now is that it requires 32-bit aligned input and is not incremental.
Also it passed smhasher tests.

Code: Select all

// "random" keys
// primes are essential for multiplicative keys
static const UInt KEY1 = 0xdb91908du;	// prime
static const UInt KEY2 = 0x6be5be6fu;	// prime
static const UInt KEY3 = 0x53a28fc9u;	// prime
static const UInt KEY4 = 0xf8ffd50bu;	// prime

// rotate (left, could be right, doesn't matter because left = 32-right)
static inline UInt Rotate( UInt u, Byte amount )
{
	CORE_ASSERT( amount && amount < 32 );
	return (u << amount) | (u >> (Byte)(32-amount));
}

// faster than Murmur3 32-bit
static inline UInt MyHashUInt( UInt u, UInt h )
{
	// passes smhasher tests
	u = Rotate( u*KEY1, 1 );
	return u -= Rotate( h ^ KEY2, 11 );
}

static inline UInt MyHash( const void *buf, size_t len, UInt seed = 0 )
{
	const Byte *b = (const Byte *)buf;
	CORE_ASSERT( len < 4 || !((UIntPtr)b & 3) );
	const UInt *u = (const UInt *)b;
	size_t ulen = len/4;

	UInt h = KEY3 ^ seed;
	h ^= len * KEY4;

	b += ulen*4;
	len &= 3;

	while ( ulen-- ) {
		h = MyHashUInt( *u++, h );
	}

	// process remaining data if any:
	UInt last = 0;
	switch( len )
	{
	case 3:
		last = (UInt)b[2] << 16;
	case 2:
		last |= (UInt)b[1] << 8;
	case 1:
		last |= (UInt)b[0];
		h = MyHashUInt( last, h );
	default:
		break;
	}

	// finalize (avalanche)
	h ^= h >> 7;
	h ^= h << 21;
	return h;
}
This comes directly from my new simple framework, so UInt = uint32_t, Byte = uint8_t, CORE_ASSERT = assert.
It's also quite fast except that clang unfortunately missed to fold one of the rotations (I assume because another optimization is applied before rot-folding).
I will look at your code in more detail later, thanks.
EDIT: it also produces different results on little vs big endian machines but since I used it as general-purpose hashing function for my hashset, I don't really care as long as it's fast and has good distribution.
User avatar
cdani
Posts: 2204
Joined: Sat Jan 18, 2014 10:24 am
Location: Andorra

Re: Zobrist keys - measure of quality?

Post by cdani »

I use this to generate random numbers:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
Daniel Anulliero
Posts: 773
Joined: Fri Jan 04, 2013 4:55 pm
Location: Nice

Re: Zobrist keys - measure of quality?

Post by Daniel Anulliero »

Hi all
Excuse me , if I ask a question (a little bit of topic..) to the experts here?
I quote that from Zobrist hashing wiki :
At program initialization, we generate an array of pseudorandom numbers [7] [8]:
One number for each piece at each square
One number to indicate the side to move is black
Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speedEight numbers to indicate the file of a valid En passant square, if any
Why only one number for black ? Why not for white too?
I have two numbers for white and black in my engine...
It seems There is something I misunderstood ...
Bests
Dany
[/quote]
mar
Posts: 2691
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Zobrist keys - measure of quality?

Post by mar »

Daniel Anulliero wrote:Hi all
Excuse me , if I ask a question (a little bit of topic..) to the experts here?
I quote that from Zobrist hashing wiki :
At program initialization, we generate an array of pseudorandom numbers [7] [8]:
One number for each piece at each square
One number to indicate the side to move is black
Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speedEight numbers to indicate the file of a valid En passant square, if any
Why only one number for black ? Why not for white too?
I have two numbers for white and black in my engine...
It seems There is something I misunderstood ...
Bests
Dany
You only need one value to encode STM, so say if black's to move, you include the STM key (in fact, each time you make a move you simply xor with STM key unless it's a nullmove) - so if you build key from scratch,
if it's white to move you xor with 0 and if black then you xor with the STM key (you could also use one key for WTM and xor with 0 if its' BTM - this really doesn't matter).
If you use two, it will still work of course but it's not necessary.
The same holds for en passant square - you could use 64 values just like for any other square but that would be a waste since knowing only the file is enough, you can get along with just 8 values to encode the ep square.
Actually I made the same mistake and use too many keys to encode castling rights.
Even for chess960, only 3 states per side will do (can castle short, can castle long and can castle both long and short), no castling thus requires no extra key.
This is because rook/king positions are already "encoded" using piece keys.
Daniel Anulliero
Posts: 773
Joined: Fri Jan 04, 2013 4:55 pm
Location: Nice

Re: Zobrist keys - measure of quality?

Post by Daniel Anulliero »

Thanks for the very quick answer :wink:
Bests
Dany