Zobrist hashing is also known as Tabulation hashing in the literature, and in the last few years it has been shown that it has a lot of nice statistical properties, even though Zobrist hashing is not even 4-independent in general. See e.g. http://arxiv.org/pdf/1011.5200.pdf or google for the authors' names for more accessible introductions.hgm wrote:This theory has already been debunked here many times. E.g. with four 12-bit keys, 0xF00, 0x0F0, 0x00F and 0xFFF any pair has 8 out of 12 different (the 'Hamming distance'), while they are dependent. (The XOR of all of them is 0.) While a simple 1, 2, 4, 8 would have provided collision-free keys, while they would all have 10 bits in common.mar wrote:..., with the idea that this should (in theory) lower the possibility of cancellation,
In fact there exists an algorithm to generate a set of keys with maximum Hamming distance. This results in a set where any key is the XOR of 2 others. Which is absolutely the worst possible key set.
A good way to test the quality of your key set is to store all XORs of 3 keys in a hash table (using that XOR as the hash). With 1000 keys you get 166M XORs. If two diffent sets give the same XOR (i.e. you get a collission in that hash table) your key set contains a 6th-order dependency, which is pretty bad. To detect 7th-order dependencies you can check for all products of 4 keys if it occurs in the table. (This takes very long with 1000 keys, as you have ~40G combinations.)
Zobrist keys - measure of quality?
Moderator: Ras
-
Rein Halbersma
- Posts: 772
- Joined: Tue May 22, 2007 11:13 am
Re: Zobrist keys - measure of quality?
-
SMIRF
- Posts: 91
- Joined: Wed Mar 26, 2014 4:29 pm
- Location: Buettelborn/Hessen/Germany
Re: Zobrist keys - measure of quality?
Whether it would be nonsense or not, I am using 64 bit zobrist atoms consisting of 32 0-bits and 32 1-bits. Thus applying a xored Zobrist atom always will switch 32 bits.mar wrote:Thanks, so now I see that it's nonsense (or wrong measure)... Well... I also found in the links posted by Evert that it has been discussed already in 2001.
Random values traditionally chosen as zobrist atoms could in extrem encounter a zero value which would then switch nothing at all. Thus I concluded, the bigger the absolute difference between zero- and one-bits, the greater the chance of producing indentic overlayed keys.
This is pure intention, but I am sure, that using such equal bitted (randomized) Zobrist atoms as I do cannot harm anything.
-
AlvaroBegue
- Posts: 932
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Zobrist keys - measure of quality?
I studied the number of collisions for a hash function in a different setting (hashing short strings, like ticker symbols in the stock market) and I found that cyclic redundancy codes (CRC) performed better than could be expected from purely random numbers. You can describe CRC as a Zobrist key, by making the i-th entry in the Zobrist table be the coefficients of the polynomial x^(i+64) modulo P(x), where P(x) is a 65-th degree polynomial.
If you want to test out this idea, here's some C++ code to generate the Zobrist table:
If you want to test out this idea, here's some C++ code to generate the Zobrist table:
Code: Select all
#include <cstdio>
int main() {
unsigned long long const P = 0x42F0E1EBA9EA3693ull;
unsigned long long h = P;
for (int i = 0; i < 1000; ++i) {
std::printf("%016llx\n", h);
h = (h & 0x8000000000000000u) ? (h << 1) ^ P : h << 1;
}
}
-
sean_vn
- Posts: 4
- Joined: Thu Feb 05, 2015 4:49 pm
Re: Zobrist keys - measure of quality?
Interesting discussion. This seems like a good forum for the sort of programming I am doing at the moment. Has anyone done any work on quasi-random numbers and hash tables? I tried Marsaglia's simple Weyl number idea with hash tables. It seems better than using a linear probe offset of 1.
-
AlvaroBegue
- Posts: 932
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Zobrist keys - measure of quality?
Arg, there is a typo and the forum won't let me edit the message. Here's the corrected code:
Code: Select all
#include <cstdio>
int main() {
unsigned long long const P = 0x42F0E1EBA9EA3693ull;
unsigned long long h = P;
for (int i = 0; i < 1000; ++i) {
std::printf("%016llx\n", h);
h = (h & 0x8000000000000000ull) ? (h << 1) ^ P : (h << 1);
}
}
-
mar
- Posts: 2691
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Zobrist keys - measure of quality?
Interesting, I did a similar test of hash functions some time ago (looking for a good general-purpose hashing function) and found that CRC-32 indeed has pretty good distribution (but also quite slow unless you use slicing tables).AlvaroBegue wrote:I studied the number of collisions for a hash function in a different setting (hashing short strings, like ticker symbols in the stock market) and I found that cyclic redundancy codes (CRC) performed better than could be expected from purely random numbers.
I also found that Hsieh's SuperFastHash is not a) that super fast and b) produces more collisions that other hashes I tried.
I used english wordlist of 1.6M words, hashing strings.
I'm posting my hash function here (originally inspired by MurmurHash3), I hereby place it in public domain. It only has 32-bit output though.
Another disadvantage right now is that it requires 32-bit aligned input and is not incremental.
Also it passed smhasher tests.
Code: Select all
// "random" keys
// primes are essential for multiplicative keys
static const UInt KEY1 = 0xdb91908du; // prime
static const UInt KEY2 = 0x6be5be6fu; // prime
static const UInt KEY3 = 0x53a28fc9u; // prime
static const UInt KEY4 = 0xf8ffd50bu; // prime
// rotate (left, could be right, doesn't matter because left = 32-right)
static inline UInt Rotate( UInt u, Byte amount )
{
CORE_ASSERT( amount && amount < 32 );
return (u << amount) | (u >> (Byte)(32-amount));
}
// faster than Murmur3 32-bit
static inline UInt MyHashUInt( UInt u, UInt h )
{
// passes smhasher tests
u = Rotate( u*KEY1, 1 );
return u -= Rotate( h ^ KEY2, 11 );
}
static inline UInt MyHash( const void *buf, size_t len, UInt seed = 0 )
{
const Byte *b = (const Byte *)buf;
CORE_ASSERT( len < 4 || !((UIntPtr)b & 3) );
const UInt *u = (const UInt *)b;
size_t ulen = len/4;
UInt h = KEY3 ^ seed;
h ^= len * KEY4;
b += ulen*4;
len &= 3;
while ( ulen-- ) {
h = MyHashUInt( *u++, h );
}
// process remaining data if any:
UInt last = 0;
switch( len )
{
case 3:
last = (UInt)b[2] << 16;
case 2:
last |= (UInt)b[1] << 8;
case 1:
last |= (UInt)b[0];
h = MyHashUInt( last, h );
default:
break;
}
// finalize (avalanche)
h ^= h >> 7;
h ^= h << 21;
return h;
}
It's also quite fast except that clang unfortunately missed to fold one of the rotations (I assume because another optimization is applied before rot-folding).
I will look at your code in more detail later, thanks.
EDIT: it also produces different results on little vs big endian machines but since I used it as general-purpose hashing function for my hashset, I don't really care as long as it's fast and has good distribution.
-
cdani
- Posts: 2204
- Joined: Sat Jan 18, 2014 10:24 am
- Location: Andorra
Re: Zobrist keys - measure of quality?
I use this to generate random numbers:
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
http://www.math.sci.hiroshima-u.ac.jp/~ ... T/emt.html
Daniel José -
http://www.andscacs.com
-
Daniel Anulliero
- Posts: 773
- Joined: Fri Jan 04, 2013 4:55 pm
- Location: Nice
Re: Zobrist keys - measure of quality?
Hi all
Excuse me , if I ask a question (a little bit of topic..) to the experts here?
I quote that from Zobrist hashing wiki :
I have two numbers for white and black in my engine...
It seems There is something I misunderstood ...
Bests
Dany
[/quote]
Excuse me , if I ask a question (a little bit of topic..) to the experts here?
I quote that from Zobrist hashing wiki :
Why only one number for black ? Why not for white too?At program initialization, we generate an array of pseudorandom numbers [7] [8]:
One number for each piece at each square
One number to indicate the side to move is black
Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speedEight numbers to indicate the file of a valid En passant square, if any
I have two numbers for white and black in my engine...
It seems There is something I misunderstood ...
Bests
Dany
[/quote]
-
mar
- Posts: 2691
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Zobrist keys - measure of quality?
You only need one value to encode STM, so say if black's to move, you include the STM key (in fact, each time you make a move you simply xor with STM key unless it's a nullmove) - so if you build key from scratch,Daniel Anulliero wrote:Hi all
Excuse me , if I ask a question (a little bit of topic..) to the experts here?
I quote that from Zobrist hashing wiki :Why only one number for black ? Why not for white too?At program initialization, we generate an array of pseudorandom numbers [7] [8]:
One number for each piece at each square
One number to indicate the side to move is black
Four numbers to indicate the castling rights, though usually 16 (2^4) are used for speedEight numbers to indicate the file of a valid En passant square, if any
I have two numbers for white and black in my engine...
It seems There is something I misunderstood ...
Bests
Dany
if it's white to move you xor with 0 and if black then you xor with the STM key (you could also use one key for WTM and xor with 0 if its' BTM - this really doesn't matter).
If you use two, it will still work of course but it's not necessary.
The same holds for en passant square - you could use 64 values just like for any other square but that would be a waste since knowing only the file is enough, you can get along with just 8 values to encode the ep square.
Actually I made the same mistake and use too many keys to encode castling rights.
Even for chess960, only 3 states per side will do (can castle short, can castle long and can castle both long and short), no castling thus requires no extra key.
This is because rook/king positions are already "encoded" using piece keys.
-
Daniel Anulliero
- Posts: 773
- Joined: Fri Jan 04, 2013 4:55 pm
- Location: Nice
Re: Zobrist keys - measure of quality?
Thanks for the very quick answer
Bests
Dany
Bests
Dany