transpositiontable, zobrist and > 15% collisions

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transpositiontable, zobrist and > 15% collisions

Post by bob »

xmas79 wrote:
bob wrote:What does buckets have to do with this in a non-threaded search? WHO is going to change the bucket between when you compute the bucket address and when you decide to access an entry within the bucket? The alpha/beta search is a sequential process when only using one thread.
I was referring to something like (pseudo):

Code: Select all

search(a,b)
{
    score = a;
    ptr = hashProbe();
    TryCutoffByUsingPointer(ptr);
    TryHashMoveByUsingPointer(ptr);
    ForAllMoves {
        Make();
        score = -search(b,a)
        UnMake();
        if (score > b) {
            StoreHashByUsingPointer(ptr);
            return score;
        }
    }
    
    if (alphaImproved ) {
            StoreHashByUsingPointer(ptr);
    }
    return score;
}
After the use of ptr in hash move phase there is a search, and it can be very deep, allowing for the TT to be "trashed". With this logic in mind, here's an example:

Suppose we have 4 buckets per hash slot and suppose 64k entries (0xffff is the key hash mask), it can happen that we get a pointer for the hash key 0x12340000, and we get the hash index 0, so buckets 0, 1, 2, 3, corresponding to memory addresses 0, 16, 32, 48 (eg 16 bytes per entry). Suppose the TT is empty, we stores the hash key 0x12340000 at address 0. Then the search goes (among others) through positions 0xfa020000, 0x12fe0000, 0x16930000, 0x98af0000, 0xffb10000 etc... and then again 0x12340000. All these positions will map to hash index 0, however they will go in different buckets. Eg 0xfa020000 in bucket 1, 0x12fe0000 in bucket 2, 0x16930000 in bucket 3. Now 0x98af0000 and 0xffb10000 must be stored and they will eventually replace the entry 0x12340000 in bucket 0. When the position with hash 0x12340000 will be encountered again, it may be stored at a different bucket, so the pointer to the old hash entry is no longer valid.

Hope it was clear what I meant.
That is a really lousy design however. I can think of no valid reason to compute the hash index at the start of the search, then use that hash index at the end of that ply to do a store. It would work if you use buckets, since when you do the store you have to search through the bucket to find the best entry to replace.

But the bottom line is this: Don't write code that has so many ways to break it that you are guaranteed to encounter one or more of them as time progresses. Computing the hash address takes zero time (typically a simple 1/2 cycle AND operation). There's no reason to do that far away from where you need it, and there are good reasons for doing it right when you need it, namely avoiding a memory reference to fetch the pointer you calculated so long ago.

The time consuming part of the hash probe/store is the memory access for the table entry(s) (or a bucket of them). Moving the address calculations far away from where it is used really makes no sense at all, and just invites bugs. The type of bugs one sees in assembly language where you do a load long before you need it to avoid the latency, then forget about doing it and use that register for something else, which destroys the value you pre-loaded.

BTW you are using a non-traditional definition of bucket. The usual idea is a bucket contains N distinct hash entries. You map to a specific bucket, then choose the best entry from within that bucket. In your explanation, it would work just fine, because a pointer to a bucket doesn't identify the specific entry within that bucket you want...
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: transpositiontable, zobrist and > 15% collisions

Post by xmas79 »

bob wrote:BTW you are using a non-traditional definition of bucket. The usual idea is a bucket contains N distinct hash entries. You map to a specific bucket, then choose the best entry from within that bucket. In your explanation, it would work just fine, because a pointer to a bucket doesn't identify the specific entry within that bucket you want...
Yes I see, after re-reading my post I understand I used the terms "bucket" to identify what is commonly called "hash entry", and "hash slot" to identify what is commonly called "buckets". Sorry for that.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transpositiontable, zobrist and > 15% collisions

Post by bob »

matthewlai wrote:
xmas79 wrote:To do what? The restriction comes from the algorithm, and from my (I'll be conservative here saying "my" and not "chess") algorithm point of view, nothing should be done with that entry, except "replace". So I still don't see the point.
Yes, the restriction comes from the algorithm you have now. It may no longer be true if you decide to change the algorithm later, for example, to add the null move flag.

I think this "flag" idea is a bad one. The "avoid null move" trick depends on the bound from the table entry compared to the bound for the current search, to predict whether a null-move search would fail high or not. A flag would lose this information since the bounds can change iteration by iteration, and even in the same iteration when a fail high or fail low at the root occurs.

If a shallower search (but deeper than the null-move reduced search) should fail low based on the table entry, there's no point in doing a null-move search since it would produce a value even worse. A simple flag does not capture ALL of that information, it would only capture it for the current specific position where the search was done and the result was stored.

I try to be more explicit. IN THE Folkert IMPLEMENTATION there are no multilple buckets for the same hash index, so one hash key will ALWAYS map to the SAME memory address (hash % nEntry). Now when I decide to use my pointer:

1) the TT entry is still the same entry (entryes[hash % nEntry].hash == hash), but I overwrite the entry. No problems at all.
2) the TT entry belongs to another position (entryes[hash % nEntry].hash != hash). First question: how it is possible in a single-threaded search? BTW, I should use the entry (that is a replace operation) .... Uhm... Replacement scheme:
2a) always replace: I replace it, stop. No matter what's inside the TT entry. No prolems.
2b) depth preferred: I have to decide if replace the entry or not. So I read through the pointer, or from some other means, but I will actually end up reading the exact SAME entry from the exact SAME memory address. So no different behaviour.

Now, in a TT implementation with multiple buckets per hash index, this is a bit different because an hash key can map to different hash buckets at different addresses, and there is "space" between the READ use of the hash entry (beginning of the search function) and the WRITE use (the end of the search, or at least after a subtree search), and this can change the content of the buckets at the same hash index, making a store through the pointer a very bad decision.

And of course, in a multithreaded search something more can happen, something so weird that your TT entry will change under your compiler's feet between the two 8 bytes read, as Bob said.
Yes, the hash key will always map to the same address. However, many hash keys map to the same address. The address may now store another position.

It's possible in a single threaded search if you perform another search before accessing it, because that search would have triggered stores to the TT, and may have stored another position that mapped to the same slot.

If you use always-replace and overwrite the whole entry all the time, this wouldn't be a problem. However, if you are not careful (in a few months time, when you have forgotten about all these restrictions), you may decide that if you got a hit, the entry must belong to this position, and there's no need to write the hash key.

ANYWAYS, the point of all these is not to say that your code is currently buggy. It's not. However, it's a style of coding that makes hard to find bugs easy to creep in, if you ever let your guard down and forget about these not-very-intuitive restrictions while changing your algorithm.

It's safer to just make a copy of the entry (with appropriate synchronization or lock-free check if multithreaded).
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: transpositiontable, zobrist and > 15% collisions

Post by matthewlai »

bob wrote:
matthewlai wrote:
xmas79 wrote:To do what? The restriction comes from the algorithm, and from my (I'll be conservative here saying "my" and not "chess") algorithm point of view, nothing should be done with that entry, except "replace". So I still don't see the point.
Yes, the restriction comes from the algorithm you have now. It may no longer be true if you decide to change the algorithm later, for example, to add the null move flag.

I think this "flag" idea is a bad one. The "avoid null move" trick depends on the bound from the table entry compared to the bound for the current search, to predict whether a null-move search would fail high or not. A flag would lose this information since the bounds can change iteration by iteration, and even in the same iteration when a fail high or fail low at the root occurs.
Yes, now that I thought about it, I agree it's a bad idea. I just tried to come up with a quick example to demonstrate the problem of returning pointers to internal data, from an encapsulation perspective.
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transpositiontable, zobrist and > 15% collisions

Post by bob »

matthewlai wrote:
bob wrote:
matthewlai wrote:
xmas79 wrote:To do what? The restriction comes from the algorithm, and from my (I'll be conservative here saying "my" and not "chess") algorithm point of view, nothing should be done with that entry, except "replace". So I still don't see the point.
Yes, the restriction comes from the algorithm you have now. It may no longer be true if you decide to change the algorithm later, for example, to add the null move flag.

I think this "flag" idea is a bad one. The "avoid null move" trick depends on the bound from the table entry compared to the bound for the current search, to predict whether a null-move search would fail high or not. A flag would lose this information since the bounds can change iteration by iteration, and even in the same iteration when a fail high or fail low at the root occurs.
Yes, now that I thought about it, I agree it's a bad idea. I just tried to come up with a quick example to demonstrate the problem of returning pointers to internal data, from an encapsulation perspective.
My take on this is that guns, pointers, even a glass of water can ALL be dangerous when used in the wrong way. :)
matthewlai
Posts: 793
Joined: Sun Aug 03, 2014 4:48 am
Location: London, UK

Re: transpositiontable, zobrist and > 15% collisions

Post by matthewlai »

bob wrote:
matthewlai wrote:
bob wrote:
matthewlai wrote:
xmas79 wrote:To do what? The restriction comes from the algorithm, and from my (I'll be conservative here saying "my" and not "chess") algorithm point of view, nothing should be done with that entry, except "replace". So I still don't see the point.
Yes, the restriction comes from the algorithm you have now. It may no longer be true if you decide to change the algorithm later, for example, to add the null move flag.

I think this "flag" idea is a bad one. The "avoid null move" trick depends on the bound from the table entry compared to the bound for the current search, to predict whether a null-move search would fail high or not. A flag would lose this information since the bounds can change iteration by iteration, and even in the same iteration when a fail high or fail low at the root occurs.
Yes, now that I thought about it, I agree it's a bad idea. I just tried to come up with a quick example to demonstrate the problem of returning pointers to internal data, from an encapsulation perspective.
My take on this is that guns, pointers, even a glass of water can ALL be dangerous when used in the wrong way. :)
Certainly. And my take on this is that this is one of the wrong ways :).
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.