Zobrist benchmark

Discussion of chess software programming and technical issues.

Moderator: Ras

sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Zobrist benchmark

Post by sje »

I have serious doubts about using Hamming distance metrics among a Zobrist key set as an evaluation of the key generation process.

Instead, I propose taking the 822,518 distinct BTM positions at ply five depth from the initial array to fill a 1,048,576 entry (2^20, no chaining) table using a given key generation scheme to arrive at a figure of merit. A perfect distribution would leave 226,058 slots free; the worse the key set, the greater the free slot count.
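The proposed figure-of-merit test can be sketched as follows. Since the 822,518 ply-five positions are not reproduced here, random piece placements stand in for them, and Python's Mersenne Twister stands in for the key-generation scheme under test; all names are illustrative.

```python
import random

TABLE_BITS = 20
TABLE_SIZE = 1 << TABLE_BITS        # 1,048,576 entries, no chaining

def make_zobrist(seed, pieces=12, squares=64):
    """One 64-bit key per piece/square pair (the scheme under test)."""
    rng = random.Random(seed)
    return [[rng.getrandbits(64) for _ in range(squares)]
            for _ in range(pieces)]

def free_slots(keys, n_positions=822_518, seed=1):
    """Fill the table by the low 20 key bits; return the unused slot count."""
    rng = random.Random(seed)
    occupied = bytearray(TABLE_SIZE)
    for _ in range(n_positions):
        h = 0
        # stand-in "position": eight random piece/square placements
        for _ in range(8):
            h ^= keys[rng.randrange(12)][rng.randrange(64)]
        occupied[h & (TABLE_SIZE - 1)] = 1
    return TABLE_SIZE - sum(occupied)

keys = make_zobrist(42)
print(free_slots(keys))
```

Note that with random stand-in positions even a good key set should land near the binomial expectation of roughly 478,000 free slots rather than the perfect 226,058; what the figure of merit measures is how far a given key set drifts above that baseline.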
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Zobrist benchmark

Post by hgm »

I agree that Hamming distance is totally irrelevant. But I am not sure I understand the test you propose: you hash based on only 20 bits of the key? If you count collisions where the full 64 bits must be the same, you won't have any with only 822K positions. You would not even have any with a 40-bit key. But testing only 20 bits of the key proves nothing about the other 44...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Zobrist benchmark

Post by sje »

I agree that a simple test like the one proposed will only test 20 bits. But at least it is a real-world test: simple, platform independent, and repeatable. Also, unless there is something special about the way the 20 bits used out of the 64 (or whatever) are generated, the test results should be fairly indicative.
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Zobrist benchmark

Post by hgm »

It is not unlikely that the high-order bits are generated completely differently from the low-order bits. In early Joker versions, for instance, a bug crept in because I switched from random() to rand() for better platform independence, and later switched to a library where rand() returned only 16-bit randoms rather than 32-bit ones! So I would definitely advise a test that exercises all bits.

The most meaningful test I could think of is the one described in the other thread: look for the minimum number of board mutations needed to cause a collision. It should be possible to exclude any dependency within any sub-set of 7 keys, with a day of number-crunching, by storing all 3-key XORs in a hash table and matching all 4-key XORs against that table. That guarantees that collisions can only occur between positions that differ on at least 8 squares.
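At toy scale, the 3-versus-4 XOR search can be sketched like this; the key width is deliberately cut to 16 bits so that a dependency actually turns up in seconds (a real run would use the engine's full key set at 64 bits):

```python
from itertools import combinations
import random

rng = random.Random(7)
keys = [rng.getrandbits(16) for _ in range(40)]   # deliberately short keys

# store every XOR of a 3-key subset in a hash table
triples = {}
for t in combinations(range(len(keys)), 3):
    triples.setdefault(keys[t[0]] ^ keys[t[1]] ^ keys[t[2]], []).append(t)

# match every XOR of a 4-key subset against that table
hits = []
for q in combinations(range(len(keys)), 4):
    x = keys[q[0]] ^ keys[q[1]] ^ keys[q[2]] ^ keys[q[3]]
    for t in triples.get(x, []):
        if not set(t) & set(q):     # disjoint subsets: 7 keys XOR to zero
            hits.append((t, q))

print(len(hits))
```

Every hit is a set of 7 keys whose XOR is zero, i.e. a pair of positions differing on 7 squares that would collide; an empty result certifies that any collision needs at least 8 differing squares.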
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Zobrist benchmark

Post by bob »

sje wrote:I have serious doubts about using Hamming distance metrics among a Zobrist key set as an evaluation of the key generation process.

Instead, I propose taking the 822,518 distinct BTM positions at ply five depth from the initial array to fill a 1,048,576 entry (2^20, no chaining) table using a given key generation scheme to arrive at a figure of merit. A perfect distribution would leave 226,058 slots free; the worse the key set, the greater the free slot count.
I can try that pretty easily, in fact. I will test with my original PRNG numbers and the latest optimized numbers. However, I believe minimum Hamming distance _is_ important. If two of the numbers are identical, that is an easy way to produce hash errors or bad book-move errors.
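The pairwise check can be sketched directly; the PRNG here merely stands in for whatever generator the engine uses, and 788 matches the key count quoted for Crafty later in the thread:

```python
import random

def min_hamming(keys):
    """Smallest Hamming distance over all key pairs; 0 flags duplicates."""
    best = 64
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            best = min(best, bin(keys[i] ^ keys[j]).count("1"))
    return best

rng = random.Random(0)
keys = [rng.getrandbits(64) for _ in range(788)]
print(min_hamming(keys))
```

A result of 0 is exactly the identical-numbers failure case: two keys whose XOR cancels completely.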

Burton Wendroff did some analysis on this back in the late 80s or early 90s that I will try to dig up (was in the JICCA).
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Zobrist benchmark

Post by bob »

sje wrote:I agree that a simple test like the one proposed will only test 20 bits. But at least it is a real-world test: simple, platform independent, and repeatable. Also, unless there is something special about the way the 20 bits used out of the 64 (or whatever) are generated, the test results should be fairly indicative.
I was thinking more along the lines of generating the Zobrist keys, then measuring the minimum hamming distance on those. Zeros would be particularly bad of course...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Zobrist benchmark

Post by bob »

hgm wrote:It is not unlikely that the high-order bits are generated completely differently from the low-order bits. In early Joker versions, for instance, a bug crept in because I switched from random() to rand() for better platform independence, and later switched to a library where rand() returned only 16-bit randoms rather than 32-bit ones! So I would definitely advise a test that exercises all bits.

The most meaningful test I could think of is the one described in the other thread: look for the minimum number of board mutations needed to cause a collision. It should be possible to exclude any dependency within any sub-set of 7 keys, with a day of number-crunching, by storing all 3-key XORs in a hash table and matching all 4-key XORs against that table. That guarantees that collisions can only occur between positions that differ on at least 8 squares.
If you want to do this the right way, you would not just pick random values to test. Out of the 64 knight random numbers, given a specific first number, there are only 8 associated values (at most) that would ever be XORed together with it. General Hamming distance is a much stricter test of that issue, and as such it might not be as realistic...

I got into this with one simple issue in Crafty. The existing PRNG approach produced a pair of values that were identical, in the same piece table. They were also placed so that I could move a piece to square X and then from X to Y, where X and Y had the same random key, and end up right back at the starting Zobrist key. I fixed this a year or two ago, but thought it would be interesting to check the entire set of 788 numbers more carefully, which is where I am now. My code ran for about 18 hours and could do no better than a minimum distance of 23 between any pair of numbers from the above set.
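That round-trip failure can be shown in a few lines; the values below are made up for the demonstration:

```python
knight = [0] * 64
knight[18] = 0x123456789ABCDEF0       # square X
knight[35] = 0x123456789ABCDEF0       # square Y: accidentally identical

h = 0xCAFEBABE                        # hash of some position, knight on X
h_after = h ^ knight[18] ^ knight[35] # incremental update for X -> Y
print(h == h_after)                   # True: the move left the key unchanged
```

Two different positions, one signature: the XOR of two equal keys cancels, so the incremental update is a no-op.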

I am now going to do some hashing experiments using old and new numbers to see if there is anything of significance...
hgm
Posts: 28443
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Zobrist benchmark

Post by hgm »

bob wrote: If you want to do this the right way, you would not just pick random values to test. Out of the 64 knight random numbers, given a specific first number, there are only 8 associated values (at most) that would ever be XORed together with it. General Hamming distance is a much stricter test of that issue, and as such it might not be as realistic...
I have my doubts whether one could buy much leeway by exploiting the special knowledge that in regular chess positions some groups of keys cannot occur together (so that their dependence has zero impact). The reason is that the expected number of dependent sub-sets of N keys grows very aggressively with N (because the total number of keys is so large). So if you cannot avoid dependent sub-sets of 8 keys at 64-bit key length, the number of such sub-groups of 9 keys already runs into the millions. You will not be able to keep all (or even most) of the keys occurring in these sub-sets apart by distributing them over a mere 12 classes for the piece types.

But I am very curious about the actual collision rates you will measure. With 64-bit keys you are of course not likely to measure any collisions at all in reasonable time, so you would probably have to simulate the effect by going to a shorter key length, and optimize your Hamming distance for that key length.
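The truncation idea can be sketched as a direct collision count; distinct random signatures stand in for distinct positions here, and the key width is a parameter (names are illustrative):

```python
import random

def collisions(n_positions, key_bits, seed=3):
    """Count signature collisions among distinct positions at a given width."""
    rng = random.Random(seed)
    mask = (1 << key_bits) - 1
    seen, hits = set(), 0
    for _ in range(n_positions):
        sig = rng.getrandbits(64) & mask   # truncated Zobrist key
        if sig in seen:
            hits += 1                      # same short key, different position
        else:
            seen.add(sig)
    return hits

# birthday estimate: about n^2 / 2^(bits+1) collisions for n << 2^bits
print(collisions(10_000, 16), collisions(10_000, 48))
```

At 16 bits the count lands near the birthday estimate (around 700 here); at 48 bits it is essentially always zero, which is why short keys are needed to collect measurable statistics.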
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Zobrist benchmark

Post by sje »

bob wrote:I can try that pretty easily, in fact. I will test with my original PRNG numbers and the latest optimized numbers. However, I believe minimum Hamming distance _is_ important. If two of the numbers are identical, that is an easy way to produce hash errors or bad book-move errors.

Burton Wendroff did some analysis on this back in the late 80s or early 90s that I will try to dig up (was in the JICCA).
I can remember some papers on collision placement management, but nothing about key generation.

When I get a chance, I'll get the CIL Toolkit to run the above benchmark. Its key set is slightly different from Crafty's, as it has only eight hash codes for en passant file identification.

I suppose I could generate an external file with the 822,518 FEN lines for distribution, but it would be a hefty download.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Zobrist benchmark

Post by jwes »

bob wrote:
hgm wrote:It is not unlikely that the high-order bits are generated completely differently from the low-order bits. In early Joker versions, for instance, a bug crept in because I switched from random() to rand() for better platform independence, and later switched to a library where rand() returned only 16-bit randoms rather than 32-bit ones! So I would definitely advise a test that exercises all bits.

The most meaningful test I could think of is the one described in the other thread: look for the minimum number of board mutations needed to cause a collision. It should be possible to exclude any dependency within any sub-set of 7 keys, with a day of number-crunching, by storing all 3-key XORs in a hash table and matching all 4-key XORs against that table. That guarantees that collisions can only occur between positions that differ on at least 8 squares.
If you want to do this the right way, you would not just pick random values to test. Out of the 64 knight random numbers, given a specific first number, there are only 8 associated values (at most) that would ever be XORed together with it. General Hamming distance is a much stricter test of that issue, and as such it might not be as realistic...

I got into this with one simple issue in Crafty. The existing PRNG approach produced a pair of values that were identical, in the same piece table. They were also placed so that I could move a piece to square X and then from X to Y, where X and Y had the same random key, and end up right back at the starting Zobrist key. I fixed this a year or two ago, but thought it would be interesting to check the entire set of 788 numbers more carefully, which is where I am now. My code ran for about 18 hours and could do no better than a minimum distance of 23 between any pair of numbers from the above set.

I am now going to do some hashing experiments using old and new numbers to see if there is anything of significance...
One simple way to test the Zobrist keys is to add a second hash key to each hash entry. Then if the first key matches but the second does not, you have a false match; by comparing the number of false matches to the expected number, you can tell how good your Zobrist keys are. The advantage of this is that you can test your Zobrist keys in actual game play, which is where you really want them to work.
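The double-key check can be sketched as follows; the 16-bit primary key is deliberately short so that false matches show up quickly, and the table layout is made up for the demo:

```python
import random

rng = random.Random(5)
table = {}            # primary key -> secondary (verification) key
false_matches = 0
probes = 0

for _ in range(50_000):
    primary = rng.getrandbits(16)    # short on purpose: forces matches
    secondary = rng.getrandbits(64)  # independent second signature
    probes += 1
    if primary in table:
        if table[primary] != secondary:
            false_matches += 1       # primary matched, position did not
    else:
        table[primary] = secondary

print(false_matches, "false matches in", probes, "probes")
```

If the measured count departs noticeably from the birthday-style expectation for the primary key width, the keys are suspect.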

Another thing that just occurred to me: if the problem with your Zobrist keys is in the bits used to generate the table address, it will show up not as excessive false matches but as unbalanced loading of the hash table.