On-the fly hash key generation?

AlvaroBegue · Post by **AlvaroBegue** » Tue Jan 12, 2016 10:17 pm

Evert wrote:
hgm wrote:HaChu does that. It multiplies a piece key with a square key.
That's close-ish to what I was thinking, but I see it still gets those from tables, right? Of course the tables can be much smaller, so it may not be worth it to go for full calculation...

How about something like this?

(piece_type * BOARD_SIZE + square + 1) * MAGIC

hgm · Post by **hgm** » Tue Jan 12, 2016 10:47 pm

I wonder if you could make them independent enough, that way. If you would have an additive key, rather than and XOR key, this would certainly not work. As the pieceType*BOARD_SIZE + square attains any value, so there would always be a pair of pieces with low piece types that would add up to a value still in the range that could be reached by a single piece with a large pieceType. And the MAGIC multiplication then would not change that.

So apparently the scheme hinges on the rather subtle difference between XOR and addition.

cdani · Post by **cdani** » Wed Jan 13, 2016 9:21 am

One idea could be to share 7 of 8 bytes (or 4 of 8...) with the next key, thus saving roughly 7/8 of memory. If this works, should suffice for wildest chess variants.

Of course will require unaligned memory accesses, but I suppose this cannot be very terrible, or at least will be probably better than bringing something from the next level of cache. Anyway sure here there is someone that knows more than me about this.

Also didn't think about how to implement this in an efficient way in C.

Aleks Peshkov · Post by **Aleks Peshkov** » Wed Jan 13, 2016 9:51 am

hgm wrote:Aleks Peshkov seems to have tried a scheme for normal Chess where you had just 12 hash keys, one for each piece type, and then rotated those by the square number:

Zobrist(piece, square) = key[piece] << square | key[piece] >> 63-square;

n << sq | n >> (64-sq);

Opposite side pieces encoded as ~bswap(k). So two same pieces either same or opposite colors on the same square has exactly 0 hash.

I have 7 keys. One extra key is necessary to encode en-passant and Chess960 castling rights.

hgm · Post by **hgm** » Wed Jan 13, 2016 9:58 am

cdani wrote:One idea could be to share 7 of 8 bytes (or 4 of 8...) with the next key, thus saving roughly 7/8 of memory. If this works, should suffice for wildest chess variants.

Indeed, micro-Max and Shokidoki do this. Shokidoki because with an 81-square board and 2x14 piece types a full Zobrist table would provide a rather heavy cache load (28x81*8 = 18KB, while at the time the usual L1 size was still 16KB). In micro-Max mainly because the rand() function delivered only 16-bit values, so that it would be a lot of work to fill 8-byte variables.

Code: Select all

int BOARDSIZE=256;
char T&#91;8200&#93;;

// during the engine initialization
 N=32*BOARDSIZE+7;while&#40;N-->BOARDSIZE+3&#41;T&#91;N&#93;=rand&#40;)>>9;

// access
*&#40;int*)&#40;T + square + BOARDSIZE*pieceType&#41;

Note this acces them as 32-bit integers, because micro-Max keeps the two halves of the hash key in separate variables (one used for indexing, the other stored in the entry as signature). The second key is then using square+4 instead of square for its update.

If you would want to do 64-bit acces, you would just have to write *(long long int*) in stead of *(int*). The unaligned accesses turned out to have no speed penalty as long as they do not straddle two 64-byte cache lines. If they do, it just requires an extra cycle. But with 0x88 square numbering there is no need to ever straddle a cache line.

However, with 2x100 piece types and 625 board squares it would still require 125KB, far exceding L1. So I did not consider it suitable for HaChu.

AlvaroBegue · Post by **AlvaroBegue** » Wed Jan 13, 2016 11:04 am

hgm wrote:I wonder if you could make them independent enough, that way. If you would have an additive key, rather than and XOR key, this would certainly not work. As the pieceType*BOARD_SIZE + square attains any value, so there would always be a pair of pieces with low piece types that would add up to a value still in the range that could be reached by a single piece with a large pieceType. And the MAGIC multiplication then would not change that.

So apparently the scheme hinges on the rather subtle difference between XOR and addition.

Of course this is true. The purpose of my formula was to compute a hash of the number (piece_type * BOARD_SIZE + square) in some quick and dirty way. I ran a simple test and I found out these linear relations are more problematic than I expected.

Here's my simple test: To see how good a table of 64*12 32-bit numbers is at not producing collisions, I try to find sets of 4 different indices whose entries XOR to 0.

If the table were filled up with 32-bit random integers, you expect the number of such sets to be around 5 (I simply measured this experimentally). Using the formula I proposed, I get numbers in the thousands. So that's not very good. I tried a few other things and this one seems to be good:

h = piece_type * BOARD_SIZE + square + 1;
return (h * (h + MAGIC1)) * MAGIC2;

A particular choice of constants that works very well is MAGIC1 = 16045690984833335023 and MAGIC2 = 11400714819323198485.

AlvaroBegue · Post by **AlvaroBegue** » Wed Jan 13, 2016 11:36 am

Another idea: Just use the mixing function of MurmurHash2.

h = piece_type * BOARD_SIZE + square;
h *= 0xc6a4a7935bd1e995ULL;
h ^= h >> 47;
h *= 0xc6a4a7935bd1e995ULL;

This works really well with my test: Only 1 collision if I use the lower 32 bits, 2 if I use the higher 32 bits.

hgm · Post by **hgm** » Wed Jan 13, 2016 12:08 pm

Now the question is, how many arithmetic operations are you willing to do to replace one table lookup. If the size of the table is no issue, I think a memory load is not any slower than a multiply, and it would just depend on whether the multiply unit or the memory unit is more in demand by the rest of your program.

Aleks' method with a table of just 7 keys, seems rather hard to beat. Even when you extend it to 32 piece types, the amount of storage required is still totally negligible. Extending the number of board squares would be less straightforward, though. Perhaps using the higher-order bits of the square number as part of the index for the piece key.

AlvaroBegue · Post by **AlvaroBegue** » Wed Jan 13, 2016 12:14 pm

hgm wrote:Now the question is, how many arithmetic operations are you willing to do to replace one table lookup. If the size of the table is no issue, I think a memory load is not any slower than a multiply, and it would just depend on whether the multiply unit or the memory unit is more in demand by the rest of your program.

Aleks' method with a table of just 7 keys, seems rather hard to beat. Even when you extend it to 32 piece types, the amount of storage required is still totally negligible. Extending the number of board squares would be less straightforward, though. Perhaps using the higher-order bits of the square number as part of the index for the piece key.

EDIT: Never mind. I implemented a buggy version of Aleks's idea.

Aleks Peshkov · Post by **Aleks Peshkov** » Wed Jan 13, 2016 1:10 pm

hgm wrote:Aleks' method with a table of just 7 keys, seems rather hard to beat.

But you have to explore a better key set then random.

I personally brute-force test a subset of Zobrist numbers with decent Hamming Distances to find a key combination. They seems to be as good (and probably better) as random 800 keys, but I do not know a good metric how to prove.

On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?

Re: On-the fly hash key generation?