Overlapped Zobrist keys array

stegemma · Post by **stegemma** » Tue Oct 06, 2009 1:17 pm

I would like to add to my program a transposition table and i'm studying Zobrist key. The first thing i've seen is that this method requires

a "huge" array of key. Maybe is not very huge but, more for my fun than for effective needing, i've searched a way to reduce this array size,

keeping the same validity of the keys.

In C, we access this array with something like this:

U64 zobrist[pcMAX][coMAX][sqMAX];

has reported in this link:

http://web.archive.org/web/200708222040 ... obrist.htm

Maybe somebody uses a different approach, but, at end, he/she always have an array of 64 bit values.

As said in another thread, assembly programmers look at things in another way... so i've looked at the key array as a sequence of 64 bit values, despite from the index that we use. We should get the index, somehow, of course, but looking at the array as a sequence of 64 bit values will let us see the whole array as... a flow of bits, ideally separated any 64 of them:

1001...11010 1010...1011 11...01011

In C we access this array at any 64 bit boundary. But why we need this 64 bit boundary access? If we don't care about processor data alignment, we can access this array at any lower boundary, without loosing the "randomness" of the keys! In fact, because any bit is important only for its position and the xor that we use to build the position key acts in a bit per bit way, without to influence any other bit (as, for sample, a sum will instead does, because of carry flag) then we can access the Zobrist random key array with any bit boundary, less or equal to 64! I think (or well i hope) that it can be mathematically proved that the randomness of the generated keys does'nt will be affected by the boundary that we choose.

So, we can access this array on any byte or word or dword boundary and even bit boundary. That means that we can use a very short Zobrist key array, just accessing that array with bit boundary. What does it means in practice? In the sample above, the index to the 64 bit value that we're searching for becomes an index to the first bit of the 64 bits needed. Obviously this gives us the shortest possible array but it is hard to extract the bits we needs (we should access 64+8 bits and then shift the result according to the bit index). An easiest way is to use a byte boundary, with something like this:

U64 GetZobristKey(int piece, int color, int square)
{
int index = piece*color*square;
const char *p = ((char *)zobrist) + index;
return *(U64 *)p;
}

In the latter, our array does'nt need:

(piece*color*square * 64) bits

as the original and well known implementation but only:

(piece*color*square * 8 + 3*8) bits

This will reduce our array of about 8th times in size.

If we use a bit boundary, the array becomes

(piece*color*square + 63) bits

One thing that i can't prove, for now, is if really the randomness of the keys will be preserved or if the generated keys are instead somehow "degenerated". If one can prove that the randomness is the same and the final postion keys collision is the same, then we have a way to use a very short Zobrist random keys array. This can help to reduce CPU cache usage of our programs (at the cost of data misalignment and code complexity!) or maybe is helpless at all... but for handled devices, with low memory available, it could be still usefull.

What do you think about this idea? I'm really crazy?

Code: Select all

hgm · Post by **hgm** » Tue Oct 06, 2009 1:36 pm

This works in fact very well. Micro-Max uses this method, and I never have had any problems with it. Because micro-Max uses a 0x88 board, the misaligned 64-bit access never crosses a cache-line boundary. This makes it without any performance penalty whatsoever, on the CPUs I am using. It reduces the footprint of the Zobrist array to 1034 bytes, which is really important for efficient L1 caching.

In key-dependency tests I have never been able to detect any performance difference between the byte-overlapped method and completely random keys. Selecting keys for a lrge Hamming distance usually produces highly inferior results.

One should not overdo it, though; I also tred keys that were overlapping 63 bits, rather than 56, and this produced significantly larger dependency.

plattyaj · Post by **plattyaj** » Tue Oct 06, 2009 1:40 pm

The concept that needs to be preserved in any approach to hash keys is that you want to have a large Hamming distance between one key and another:

http://en.wikipedia.org/wiki/Hamming_distance

If you can generate a set of keys that have a good hamming distance it will work fine for the hash table.

Now, why you would write an assembler program and then add the overhead of reading unaligned words and shifting is another matter

Andy.

stegemma · Post by **stegemma** » Tue Oct 06, 2009 2:31 pm

plattyaj wrote:...If you can generate a set of keys that have a good hamming distance it will work fine for the hash table.

Now, why you would write an assembler program and then add the overhead of reading unaligned words and shifting is another matter

Andy.

Ok, but what's important is distance between the starting Zobrist keys or between the position keys? I think that the last is the more important distance to compute, instead of those of the singular keys used. I don't know if starting from overlapped keys (with "bad" hamming distance) gives or not bad Hamming distance between final position keys.

Obviously i will test for the different boundaries in my program, before to choose the best one (maybe 64 bit boundary or 32 bit can work the same, in my 32 bit CPU). I'll avoid shifting at all, i know that shifting has to be avoid, in actual CPUs, but maybe in the future could becomes valid and cache saving could overcome the overhead for the major complexity. Of course, future CPUs can have bigger cache... so this could never happen.

But, if we have less keys, wath about to compute the best ones, instead to generate them randomly? thinking on the way keys will be overlapped, maybe, maybe De Bruijn sequences can be used....

hgm · Post by **hgm** » Tue Oct 06, 2009 3:37 pm

Code: Select all

#define U &#40;1<<24&#41;
struct _ &#123;int K,V;char X,Y,D;&#125; A&#91;U&#93;;           /* hash table, 16M entries  */

char
T&#91;1035&#93;;                                       /* hash translation table   */

#define K&#40;A,B&#41; *&#40;int*)&#40;T+A+&#40;B&8&#41;+S*&#40;B&7&#41;)
#define J&#40;A&#41; K&#40;y+A,b&#91;y&#93;)-K&#40;x+A,u&#41;-K&#40;H+A,t&#41;

...
       J+=J&#40;0&#41;;Z+=J&#40;8&#41;+G-S;                    /* update hash key&#40;s&#41;       */

stegemma · Post by **stegemma** » Tue Oct 06, 2009 3:44 pm

hgm wrote:

Code: Select all

#define U &#40;1<<24&#41;
struct _ &#123;int K,V;char X,Y,D;&#125; A&#91;U&#93;; ...

Obscured C... or maybe something has been missed?

hgm · Post by **hgm** » Tue Oct 06, 2009 4:13 pm

This is part of the micro-Max source. It is not really obfuscated. Just written in a compact way.

As it is a 32-bit code, I explicitly keep account of the low and the high 32-bit of the hash key, J and Z. (As would happen on a 32-bit machine after compilation anyway.) Because this requires me to do a similar update on J and Z, I made a macro J() for that. (There is no naming conflict between macros with one parameter and other uses of the same symbol.)

K(A,B) fetches the Zobrist key for square A and (colored) piece type B; the "8" bit of the piece encodes the color. S is a constant equal to 128. So the number of the key is derived as

square + (pieceType&8) + 128*(pieceType)

This makes the color bit conveniently fill in the hole in the 0x88 square numbers, so that black pieces use the table part corresponding to te 'hidded' squares of the 0x88 board. x and y are the from and the to square, H the capture square (which can be different from y in case of e.p. capture). u is the moved piece, t the captured piece, and b[y] the moved piece after the move (which can be different from u because of promotion), as b[] is the board. So J is the complete update of the hash key for one move.

For the other half of the key I offset everything by 8 squares.

The important part is the *(int*) cast on the char T[] array, which does the thing you propose.

stegemma · Post by **stegemma** » Tue Oct 06, 2009 5:24 pm

hgm wrote:...The important part is the *(int*) cast on the char T[] array, which does the thing you propose.

In fact it was the only part that i have understood the first time

.

It seems to me that you add keys (J+=...; Z+=...) instead of xoring them (J^=...; Z^=...), it's right? Why do you have choosen this method?

hgm · Post by **hgm** » Tue Oct 06, 2009 5:39 pm

I guess I always use + in stead of ^ because it is more general. XORing only works if you never have more than one piece on the same square. Adding the Zobrist keys would work for any number of pieces per square. In games where pieces can be stacked (e.g. in the Crazyhouse holdings) ^ would not work at all, as two identical pieces on the same square would be indistinguishable from an empty square. Of course there are work-arounds for that, by assigning separate keys to each (copyNumber, pieceType, square) combination, but these bloat the key table even more, for apparently no reason at all. With + I can treat the holdings like they are just a single extra square.

P. Villanueva · Post by **P. Villanueva** » Tue Oct 06, 2009 5:42 pm

Hi, Stefano

You may take into consideration the use of polyglot opening books hashing scheme. Later, if you want to make your engine compatible with polyglot books you will have most of the work already done.
More info here:

http://alpha.uhasselt.be/Research/Algeb ... ormat.html

Overlapped Zobrist keys array

Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array

Re: Overlapped Zobrist keys array