transpositiontable, zobrist and > 15% collisions

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

flok

transpositiontable, zobrist and > 15% collisions

Post by flok »

Hi,

Triggered by the suggestion a week or so ago that my hashing code was broken I took a look at my chess-engine (which is different from the book generation code).
In the transpositiontable-code I added two counters; hits and collisions. A hit is increased if tt[hash % n] == hash, collisions if != hash.
To my surprise I still get over 15% collisions. That is with ~7 plies search depth. Earlier today that was over 35% by the way: I had forgotten to replace the rand() by something better (I now read a uint64_t from /dev/urandom).

Maybe you guys can spot the problem?

init

Code: Select all

        for&#40;int pos=0; pos<64; pos++) &#123; 
                for&#40;int type=0; type<6 * 2; type++)
                        zobristPieceAtLoc&#91;type&#93;&#91;pos&#93; = temp&#91;ti++&#93;;
        &#125; 

        zobristIsWhite = temp&#91;ti++&#93;;

        for&#40;int i=0; i<4; i++)
                zobristCastling&#91;i&#93; = temp&#91;ti++&#93;;

        for&#40;int i=0; i<8; i++)
                zobristEnPassant&#91;i&#93; = temp&#91;ti++&#93;;
Note the temp[ti++]: I first fill that temp-array with values and before using them I check for duplicates. Should not happen but as is possible with random it can happen.

lookup

Code: Select all

tptEntry Tpt&#58;&#58;lookup&#40;const uint64_t hash&#41;
&#123;
        unsigned long int index = hash % nEntries;

        tptEntry copy = invalid;

        if &#40;entries&#91;index&#93;.hash == hash&#41;
                copy = entries&#91;index&#93;;

        return copy;
&#125;
put

Code: Select all

void Tpt&#58;&#58;store&#40;const uint64_t hash, const tptEntryFlag f, const int d, const int value&#41;
&#123;
        unsigned long int index = hash % nEntries;

        entries&#91;index&#93;.hash = hash;
        entries&#91;index&#93;.value = &#40;int16_t&#41;value;
        entries&#91;index&#93;.depth = &#40;int8_t&#41;d; // FIXME uint8_t?
        entries&#91;index&#93;.flags = &#40;int8_t&#41;f;
&#125;
d is depth, f are the flags (low-/upper-/etc), value is the evaluation score

hashing

Code: Select all

uint64_t Tpt&#58;&#58;calcZobrist&#40;const Board * const b, const PlayerColor c, const Move *const epMove&#41;
&#123;
        uint64_t hash = 0;

        if &#40;c == WHITE&#41;
                hash ^= zobristIsWhite;

        for&#40;int y=0; y<8; y++) &#123;
                int offset = y * 8;

                for&#40;int x=0; x<8; x++, offset++) &#123;
                        const ChessPiece *const cur = b -> getAt&#40;x, y&#41;;

// getType&#40;) returns 0...5
                        if &#40;cur != NULL&#41;
                                hash ^= zobristPieceAtLoc&#91;cur -> getType&#40;) + &#40;cur -> getColor&#40;) == BLACK ? 6 &#58; 0&#41; &#93;&#91;offset&#93;;
                &#125;
        &#125;

// white king first move
        if &#40;b -> getAt&#40;4, 0&#41; && b -> getAt&#40;4, 0&#41; -> getFirstMove&#40;))
        &#123;
// left rook
                if &#40;b -> getAt&#40;0, 0&#41; && b -> getAt&#40;0, 0&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;0&#93;;
// right rook
                if &#40;b -> getAt&#40;7, 0&#41; && b -> getAt&#40;7, 0&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;1&#93;;
        &#125;

        if &#40;b -> getAt&#40;4, 7&#41; && b -> getAt&#40;4, 7&#41; -> getFirstMove&#40;))
        &#123;
                if &#40;b -> getAt&#40;0, 7&#41; && b -> getAt&#40;0, 7&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;2&#93;;

                if &#40;b -> getAt&#40;7, 7&#41; && b -> getAt&#40;7, 7&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;3&#93;;
        &#125;

        if &#40;epMove&#41;
                hash ^= zobristEnPassant&#91;epMove -> tx&#93;;

        return hash;
&#125;
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transpositiontable, zobrist and > 15% collisions

Post by bob »

flok wrote:Hi,

Triggered by the suggestion a week or so ago that my hashing code was broken I took a look at my chess-engine (which is different from the book generation code).
In the transpositiontable-code I added two counters; hits and collisions. A hit is increased if tt[hash % n] == hash, collisions if != hash.
To my surprise I still get over 15% collisions. That is with ~7 plies search depth. Earlier today that was over 35% by the way: I had forgotten to replace the rand() by something better (I now read a uint64_t from /dev/urandom).

Maybe you guys can spot the problem?

init

Code: Select all

        for&#40;int pos=0; pos<64; pos++) &#123; 
                for&#40;int type=0; type<6 * 2; type++)
                        zobristPieceAtLoc&#91;type&#93;&#91;pos&#93; = temp&#91;ti++&#93;;
        &#125; 

        zobristIsWhite = temp&#91;ti++&#93;;

        for&#40;int i=0; i<4; i++)
                zobristCastling&#91;i&#93; = temp&#91;ti++&#93;;

        for&#40;int i=0; i<8; i++)
                zobristEnPassant&#91;i&#93; = temp&#91;ti++&#93;;
Note the temp[ti++]: I first fill that temp-array with values and before using them I check for duplicates. Should not happen but as is possible with random it can happen.

lookup

Code: Select all

tptEntry Tpt&#58;&#58;lookup&#40;const uint64_t hash&#41;
&#123;
        unsigned long int index = hash % nEntries;

        tptEntry copy = invalid;

        if &#40;entries&#91;index&#93;.hash == hash&#41;
                copy = entries&#91;index&#93;;

        return copy;
&#125;
put

Code: Select all

void Tpt&#58;&#58;store&#40;const uint64_t hash, const tptEntryFlag f, const int d, const int value&#41;
&#123;
        unsigned long int index = hash % nEntries;

        entries&#91;index&#93;.hash = hash;
        entries&#91;index&#93;.value = &#40;int16_t&#41;value;
        entries&#91;index&#93;.depth = &#40;int8_t&#41;d; // FIXME uint8_t?
        entries&#91;index&#93;.flags = &#40;int8_t&#41;f;
&#125;
d is depth, f are the flags (low-/upper-/etc), value is the evaluation score

hashing

Code: Select all

uint64_t Tpt&#58;&#58;calcZobrist&#40;const Board * const b, const PlayerColor c, const Move *const epMove&#41;
&#123;
        uint64_t hash = 0;

        if &#40;c == WHITE&#41;
                hash ^= zobristIsWhite;

        for&#40;int y=0; y<8; y++) &#123;
                int offset = y * 8;

                for&#40;int x=0; x<8; x++, offset++) &#123;
                        const ChessPiece *const cur = b -> getAt&#40;x, y&#41;;

// getType&#40;) returns 0...5
                        if &#40;cur != NULL&#41;
                                hash ^= zobristPieceAtLoc&#91;cur -> getType&#40;) + &#40;cur -> getColor&#40;) == BLACK ? 6 &#58; 0&#41; &#93;&#91;offset&#93;;
                &#125;
        &#125;

// white king first move
        if &#40;b -> getAt&#40;4, 0&#41; && b -> getAt&#40;4, 0&#41; -> getFirstMove&#40;))
        &#123;
// left rook
                if &#40;b -> getAt&#40;0, 0&#41; && b -> getAt&#40;0, 0&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;0&#93;;
// right rook
                if &#40;b -> getAt&#40;7, 0&#41; && b -> getAt&#40;7, 0&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;1&#93;;
        &#125;

        if &#40;b -> getAt&#40;4, 7&#41; && b -> getAt&#40;4, 7&#41; -> getFirstMove&#40;))
        &#123;
                if &#40;b -> getAt&#40;0, 7&#41; && b -> getAt&#40;0, 7&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;2&#93;;

                if &#40;b -> getAt&#40;7, 7&#41; && b -> getAt&#40;7, 7&#41; -> getFirstMove&#40;))
                        hash ^= zobristCastling&#91;3&#93;;
        &#125;

        if &#40;epMove&#41;
                hash ^= zobristEnPassant&#91;epMove -> tx&#93;;

        return hash;
&#125;
That's not what most call a collision. The percentage of collisions using your definition is directly proportional to the size of the ttable. If you have 64K entries, you only use rightmost 16 bits. If you search 4 billion nodes, you will have a BUNCH of those kinds of collisions. What most call collisions is where the 64 bit hash signatures match but the positions are still different.
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: transpositiontable, zobrist and > 15% collisions

Post by Dann Corbit »

You have something broken in your hash.
15% collisions means that there is a programming error.

A suggestion:
1. Create a routine that totally rebuilds the hash code from scratch given the current game state.
2. When incrementally updating your hash, call the new routine when in debug mode. E.g.:
#ifdef _DEBUG
full_hash_code = GenerateHashFromGameState(gs);
#endif
assert(full_hash_code == hash_code);
3. Things to check (where things often go wrong):
A. Hashing null move
B. Hashing Castle state
C. Hashing e.p. state
D. Hashing side to move

Check your undo move operation to see that everything is perfectly restored.
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: transpositiontable, zobrist and > 15% collisions

Post by Dann Corbit »

Considering Bob's reply, you may not really have a problem.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: transpositiontable, zobrist and > 15% collisions

Post by xmas79 »

Code: Select all

tptEntry Tpt&#58;&#58;lookup&#40;const uint64_t hash&#41; 
&#123; 
        unsigned long int index = hash % nEntries; 

        tptEntry copy = invalid; 

        if &#40;entries&#91;index&#93;.hash == hash&#41; 
                copy = entries&#91;index&#93;; 

        return copy; 
&#125; 
Are you really returning a thing (not a native data type) declared inside the function? This is calling for troubles IMHO....

My two cents...
flok

Re: transpositiontable, zobrist and > 15% collisions

Post by flok »

Thanks all!

@robert: right, makes sense.

@dann: apart from what bob said, I added your suggestion to the _DEBUG path and indeed the current version is broken. It is the en-passant code that I recently added that is broken. For the moment I'll rip out the incremental code and go ponder on a smarter implementation (while still being thread-safe and scalable et).

@natale: returning e.g. a struct is perfectly fine in c/c++ afaik. Returning a pointer to a non-static local variable would break yes but I'm now returning by value, not by pointer.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: transpositiontable, zobrist and > 15% collisions

Post by matthewlai »

xmas79 wrote:

Code: Select all

tptEntry Tpt&#58;&#58;lookup&#40;const uint64_t hash&#41; 
&#123; 
        unsigned long int index = hash % nEntries; 

        tptEntry copy = invalid; 

        if &#40;entries&#91;index&#93;.hash == hash&#41; 
                copy = entries&#91;index&#93;; 

        return copy; 
&#125; 
Are you really returning a thing (not a native data type) declared inside the function? This is calling for troubles IMHO....

My two cents...
That is perfectly valid C++ and is done all the time.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: transpositiontable, zobrist and > 15% collisions

Post by matthewlai »

flok wrote:(I now read a uint64_t from /dev/urandom)
If you are using C++11/14, the <random> header has everything you need. You can even get things like different distributions, etc. And it deals with all the underlying platform-dependent stuff for you.

std::mt19937_64 is a 64-bit PRNG.

There's also a std::random_device class that abstracts away the platform-dependent random source (on Linux it would read from /dev/urandom).

Usual use case is to use std::random_device to seed a Mersenne Twister.

Code: Select all

std&#58;&#58;random_device rd;
std&#58;&#58;mt19937_64 gen&#40;rd&#40;));
Then calling "gen()" gives you a high quality 64-bit random number with uniform distribution.

This should work on all platforms.

To generate a random integer between 0 to 10 with uniform distribution (which modulus does not give you) -

Code: Select all

std&#58;&#58;uniform_int_distribution<> dist&#40;0, 10&#41;;
"dist(gen)" gives you a number between 0 and 10.

You can use another great C++11 feature - functional, to bind dist and gen together into a simple function.

Code: Select all

auto drawFunc = std&#58;&#58;bind&#40;dist, gen&#41;;
...
int randomNum = drawFunc&#40;);
Lot's of good stuff in <random>.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
User avatar
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: transpositiontable, zobrist and > 15% collisions

Post by hgm »

flok wrote:In the transpositiontable-code I added two counters; hits and collisions. A hit is increased if tt[hash % n] == hash, collisions if != hash.
These are not collissions. Just positions competing for the same hash entry. In fact what you describe is hash hits vs hash misses. What is a reasonable number for those depends on the size of your search tree and hash table. If the tree is 100 times larger than the hash table, you will necessarily on average map 100 tree nodes to 1 hash entry. The chances that you will get a hash hit are very slim then; it would require the same node to be visited in the tree twice in a row before any of the other 99 competing with it are visited. If the tree is much smaller than the hash table, and you don't count probing empty entries as misses, you would expect only very few misses.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: transpositiontable, zobrist and > 15% collisions

Post by xmas79 »

matthewlai wrote:That is perfectly valid C++ and is done all the time.
I know, and I'm sorry for not being clear in advance. I didn't say this is the problem, what I really meant is that I generally tend to avoid code patterns that could lead to silly mistakes, and this belongs to one of these IMHO.

BTW, I would have coded the lookup in this way:

Code: Select all

tptEntry *Tpt&#58;&#58;lookup&#40;const uint64_t hash&#41; 
&#123; 
        unsigned long int index = hash % nEntries; 
        return &#40;entries&#91;index&#93;.hash == hash ? &entries&#91;index&#93; &#58; NULL&#41;;
&#125;