Transposition Age Tutorial

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: Transposition Age Tutorial

Post by kbhearn »

That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
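A rough, untested C sketch of the counting argument (the 1024-entry table and the random filling are just for illustration): fill a table to 50% with random stores, then ask how often a fresh random store would have to push something out, for 1-entry versus 4-entry buckets.

```c
/* Illustration only: fill a 1024-entry table to 50% occupancy with random
   stores and measure the chance that a new random store hits a completely
   full bucket (i.e. would have to evict). */
#include <stdio.h>
#include <stdlib.h>

#define ENTRIES 1024

static double evict_chance(int bucket_size) {
    int nbuckets = ENTRIES / bucket_size;
    int *count = calloc(nbuckets, sizeof(int));
    int used = 0;
    /* random stores until the table is half full; a store that hits a
       full bucket would evict, so it adds no new entry */
    while (used < ENTRIES / 2) {
        int b = rand() % nbuckets;
        if (count[b] < bucket_size) { count[b]++; used++; }
    }
    /* fraction of buckets that are completely full = chance that a fresh
       random store has to push something out */
    int full = 0;
    for (int b = 0; b < nbuckets; b++)
        if (count[b] == bucket_size) full++;
    free(count);
    return (double)full / nbuckets;
}

int main(void) {
    printf("1-entry buckets: %.2f\n", evict_chance(1));   /* 0.50 by construction */
    printf("4-entry buckets: %.2f\n", evict_chance(4));   /* well below 0.50      */
    return 0;
}
```

Run as-is, the 1-entry line prints 0.50 by construction, while the 4-entry line comes out well below that.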
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

kbhearn wrote:
That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
You are overlooking one key point. There are 1/4 as many buckets as there are entries, so you are just as likely to find a full bucket as you are a used single entry. The advantage is that with the bucket, you get a choice of what to replace, saving the better entries.
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: Transposition Age Tutorial

Post by kbhearn »

bob wrote:
kbhearn wrote:
That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
You are overlooking one key. There are 1/4 as many buckets as there are entries. So you are just as likely to find a full bucket as you are a single used entry. The advantage is that with the bucket, you get a choice of what to replace, saving better entries.
No, i'm not overlooking that...

single entry: 1024 buckets of 1 entry, 512 used
4-entry: 256 buckets of 4 entries, total 1024 entries. if 128 of those buckets are 4/4 filled, then 128 of them are 0/4 at 512 used. very statistically unlikely.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

kbhearn wrote:
bob wrote:
kbhearn wrote:
That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
You are overlooking one key. There are 1/4 as many buckets as there are entries. So you are just as likely to find a full bucket as you are a single used entry. The advantage is that with the bucket, you get a choice of what to replace, saving better entries.
No, i'm not overlooking that...

single entry: 1024 buckets of 1 entry, 512 used
4-entry: 256 buckets of 4 entries, total 1024 entries. if 128 of those buckets are 4/4 filled, then 128 of them are 0/4 at 512 used. very statistically unlikely.
When the table is 1/2 full, there is not a lot of overwriting going on. Or do you believe that your Zobrist numbers are poor enough that you don't evenly distribute hash signatures across the table?

You can easily test this. And yes, this bucket approach should result in a table with fewer unused entries. But it won't be a lot better. The issue is not just overwriting, but having a choice in what you overwrite.

You could (I will try to do this myself and post results) run with a table size of 1M entries. After doing 1M stores (with all different keys, so stores where the signatures match don't count), loop over the table to see how many unused entries there are...
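A minimal sketch of such a fill test in C (a standalone illustration, not Crafty's actual code; TOTAL_ENTRIES, rand64 and the other names are made up for this sketch). Setting BUCKET_SIZE to 1 gives the one-entry-per-bucket variant:

```c
/* Sketch of the fill test described above (illustrative names, not Crafty's
   code): store random signatures into a 1M-entry table and print how full it
   is after every 1M stores.  Replacement here is just "overwrite slot 0" when
   a bucket is full, which is enough to measure fill rate, not search quality. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOTAL_ENTRIES (1 << 20)              /* 1M entries                  */
#define BUCKET_SIZE   4                      /* set to 1 for the second run */
#define NBUCKETS      (TOTAL_ENTRIES / BUCKET_SIZE)

static uint64_t table[TOTAL_ENTRIES];        /* 0 = unused, else signature  */

static uint64_t rand64(void) {               /* quick stand-in for a Zobrist key */
    return ((uint64_t)rand() << 42) ^ ((uint64_t)rand() << 21) ^ (uint64_t)rand();
}

int main(void) {
    memset(table, 0, sizeof table);
    for (int pass = 1; pass <= 20; pass++) {
        for (int i = 0; i < TOTAL_ENTRIES; i++) {
            uint64_t sig = rand64();
            if (sig == 0) sig = 1;           /* 0 means "unused" in this sketch */
            uint64_t *bucket = &table[(sig % NBUCKETS) * BUCKET_SIZE];
            int slot = 0;                    /* bucket full: overwrite slot 0   */
            for (int j = 0; j < BUCKET_SIZE; j++)
                if (bucket[j] == 0 || bucket[j] == sig) { slot = j; break; }
            bucket[slot] = sig;
        }
        int used = 0;
        for (int i = 0; i < TOTAL_ENTRIES; i++)
            if (table[i]) used++;
        printf("stores_done=%d used=%d\n",
               pass * TOTAL_ENTRIES, (int)(100.0 * used / TOTAL_ENTRIES));
    }
    return 0;
}
```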

For my first run, I used 1M entries (16 MB) with the normal 4-entry bucket approach. I'm giving two numbers on each line: the first is the number of actual stores done for distinct signatures, the second is the percentage of the table in use:

stores_done=1048576 used=80
stores_done=2097152 used=98
stores_done=3145728 used=99
stores_done=4194304 used=99
stores_done=5242880 used=99
stores_done=6291456 used=100
stores_done=7340032 used=100
stores_done=8388608 used=100
stores_done=9437184 used=100
stores_done=10485760 used=100
stores_done=11534336 used=100
stores_done=12582912 used=100
stores_done=13631488 used=100
stores_done=14680064 used=100
stores_done=15728640 used=100
stores_done=16777216 used=100
stores_done=17825792 used=100
stores_done=18874368 used=100
stores_done=19922944 used=100
stores_done=20971520 used=100

To keep it simple, the next set of data uses 4M entries (1M buckets), but only the first entry of each bucket is used or checked, so there are still 1M usable entries, but now no buckets to help spread the data over the table.

stores_done=1048576 used=62
stores_done=2097152 used=84
stores_done=3145728 used=94
stores_done=4194304 used=97
stores_done=5242880 used=99
stores_done=6291456 used=99
stores_done=7340032 used=99
stores_done=8388608 used=99
stores_done=9437184 used=99
stores_done=10485760 used=99
stores_done=11534336 used=99
stores_done=12582912 used=99
stores_done=13631488 used=99
stores_done=14680064 used=99
stores_done=15728640 used=99
stores_done=16777216 used=99
stores_done=17825792 used=100
stores_done=18874368 used=100
stores_done=19922944 used=100
stores_done=20971520 used=100

Observations: with 4-entry buckets, the hash table is 99% used after 3M stores. With 1-entry buckets, it takes 5M stores to reach that point. That is not very significant: both fill the hash table within 5M nodes, less than a factor of two apart. Beyond that point, there is no difference.

For a quick comparison of search efficiency, here is the same position searched to the same depth, with the second run using 4x the memory so that the hash tables are equal in entries.

time=5.35(100%) nodes=24458569(24.5M) fh1=92% nps=4.6M
time=15.25(100%) nodes=69474405(69.5M) fh1=92% nps=4.6M

So both use 1M entries, one with 4-entry buckets, one with 1-entry buckets. Both get the hash table to 99%+ full, but notice the speed difference: a factor of 3x slower for the one-probe hash algorithm, because it has no choice in what it overwrites.

That's where the gain comes from, having a choice and making a wise one, rather than filling the table better...
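To make "having a choice" concrete, here is a rough C sketch of a 4-entry bucket store (the entry layout and names are illustrative, not Crafty's code): it prefers an empty or matching slot, and otherwise evicts the entry with the lowest draft so the deep, expensive entries survive.

```c
/* Rough sketch, not Crafty's code: a 4-entry bucket store that prefers an
   empty or matching slot and otherwise evicts the entry with the lowest
   draft, so the deep, expensive entries survive. */
#include <stdint.h>

typedef struct {
    uint64_t sig;      /* full signature, 0 = empty              */
    int16_t  score;
    uint16_t move;
    uint8_t  depth;    /* draft                                  */
    uint8_t  bound;    /* exact / lower / upper                  */
    uint8_t  age;      /* search counter (see later posts)       */
    uint8_t  pad;
} TTEntry;             /* 16 bytes: 4 entries = one 64-byte cache line */

typedef struct { TTEntry e[4]; } TTBucket;

void tt_store(TTBucket *b, uint64_t sig, int16_t score, uint16_t move,
              uint8_t depth, uint8_t bound) {
    TTEntry *victim = &b->e[0];
    for (int i = 0; i < 4; i++) {
        if (b->e[i].sig == 0 || b->e[i].sig == sig) {  /* free slot or same position */
            victim = &b->e[i];
            break;
        }
        if (b->e[i].depth < victim->depth)             /* otherwise keep deeper drafts */
            victim = &b->e[i];
    }
    victim->sig = sig; victim->score = score; victim->move = move;
    victim->depth = depth; victim->bound = bound;
}
```

The one-entry table has no such option: whatever hashes to a slot replaces it, regardless of how valuable the old entry was.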
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

D Sceviour wrote:
hgm wrote:
D Sceviour wrote:Is this correct, that a four-entry hash table has to allocate four times as much memory as a single-entry (always overwrite) hash table to achieve the same or better results?
No, that is certainly not correct. You should always compare replacement schemes under the constraint that they use the same amount of total memory. The idea of the schemes is to make optimal use of the resources that you have, and their merit is how well they do that.

And when you are testing the schemes, you should always do it under conditions of very heavy table overloading. If your tree fits entirely in the hash table, replacement should have no effect, as it is never done. (So the most stupid algorithm, that doesn't need to waste any time on deciding what to replace, would probably come out as best.) Note that the number of unique positions in a tree is often an order of magnitude lower than the number of nodes the search reports, due to transpositions. In the graphs I showed earlier you can see that the search only starts to slow down when the overload factor gets above 10 (= total nodes in tree 10 times larger than the number of hash entries).
The proposed theory does not show up in practice. Results for 40 games between two virtually identical 32-bit programs (1 CPU, 32 MB hash table) were equal (20-20). One program used a simplified double-entry hash scheme similar to the one described by Alexandru Mosoi for Zurichess; no transposition age is used there, it just wraps around and overwrites across the two slots. The other engine used a single-entry always-overwrite hash. Exactly equal results were also obtained for the first 4-entry test with transposition age (but that used a 512 MB hash).

Perhaps multiple cores might make a difference. It takes a long time to fill up a 1-4 GB table. With today's large amounts of memory available, the multi-entry system looks more like a fine-tuning adjustment for high-performance engines, with limited expected gains. If memory is sufficient, there should be no need to decide what to overwrite; there should be room for everything.

I have difficulty visualizing the multi-entry hash. How does dividing the number of available entries by four, so that four entries can be written at one allocation address, improve anything? It looks like another break-even scheme. On the other hand, other people are posting different results. At least multi-entry does not make matters worse.
See my post below. Unless you consider a middlegame search that takes only 1/3 the time with 4-entry buckets versus 1-entry buckets (same total number of entries) to be break-even, the bucket approach is a HUGE winner.

Here's data for a position (Fine #70) that depends heavily on good hashing, searched to depth 48 with 1M hash entries, one thread, etc. The first run uses 1M entries in 256K buckets of 4 entries; the second uses 1M entries in buckets of 1 entry, so both use 1M entries.

time=2.87(100%) nodes=15831788(15.8M) fh1=93% nps=5.5M
time=20.92(100%) nodes=133246971(133.2M) fh1=85% nps=6.4M

I'd consider a factor of 6x+ to be more than "break even". Having a choice of what to overwrite is a huge improvement over a bare-bones "hash and overwrite what you see" simplistic approach.

For the record, this stuff is not new. As I said, hashing in chess has been tested to death, many times. There is a good reason why most are using buckets today.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

bob wrote:
Evert wrote:Here's a thought that initially helped me to understand why using multiple buckets is useful. Note that I don't think the reasoning is valid, but it nevertheless helped me.

You have a limited selection of keys with which to build the hash signature (768 for the position, a few more for castling rights, side to move and en-passant square). You use the lower N bits of this signature to index the transposition table (the remaining bits are used as a verification). Now, ideally, you want the search to address all positions in the table evenly (what use are table entries that are not used?). Question: how do you know that the hash signatures you use will do this? Answer: you don't. But by not only considering the indexed position itself, but also the next (say) 3 entries you smooth out any peaks in the frequency with which particular indices are encountered.

As I said, I don't think this line of reasoning holds up to scrutiny (for one thing, I've never actually tested whether there is any discrepancy in the frequency with which particular indices are generated, but if there is then a different set of keys can probably eliminate it), but it originally helped me to see a point to doing things this way.
That's really not the issue. It is all about choices. The more choices you have when you do a replacement, the better the decision is... The only problem is that there is a computational cost that grows with the number of entries in the bucket, and that cost can eventually outweigh the gain you get from better replacement accuracy.
You can find a similar discussion for set-associative caches. 2-way is much better than 1-way (direct-mapped). 4-way is better than 2-way, but not by as much as 2-way over 1-way: a sort of diminishing returns, because as you increase N, the duplicated hardware to compare tags, the extra data pathways, etc. all take chip space, and something else has to shrink to keep total chip area under control.

But 4-entry buckets are so far superior to 1-entry buckets that there is no reason to even consider a direct-mapped hash table. Use at least 4-way associativity. I might try to produce some data for 1-way through 8-way and post it here later...
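For reference, the probe side of a 4-way bucket looks roughly like the sketch below (same illustrative 16-byte entry layout as the store sketch earlier in the thread); the point is that all four entries share one 64-byte cache line, so scanning the bucket costs little more than a single probe.

```c
/* Probe-side sketch for a 4-way bucket: one index computation, then a short
   scan.  All four entries sit in one 64-byte cache line, so checking four
   slots costs little more than checking one.  Names are illustrative. */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t sig;
    int16_t  score;
    uint16_t move;
    uint8_t  depth, bound, age, pad;
} TTEntry;                                   /* 16 bytes */

typedef struct { TTEntry e[4]; } TTBucket;   /* 64 bytes */

TTEntry *tt_probe(TTBucket *table, size_t nbuckets, uint64_t sig) {
    TTBucket *b = &table[sig % nbuckets];    /* or sig & (nbuckets - 1) for a power of 2 */
    for (int i = 0; i < 4; i++)
        if (b->e[i].sig == sig)
            return &b->e[i];
    return NULL;                             /* miss */
}
```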
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: Transposition Age Tutorial

Post by kbhearn »

bob wrote:
kbhearn wrote:
bob wrote:
kbhearn wrote:
That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
You are overlooking one key. There are 1/4 as many buckets as there are entries. So you are just as likely to find a full bucket as you are a single used entry. The advantage is that with the bucket, you get a choice of what to replace, saving better entries.
No, i'm not overlooking that...

single entry: 1024 buckets of 1 entry, 512 used
4-entry: 256 buckets of 4 entries, total 1024 entries. if 128 of those buckets are 4/4 filled, then 128 of them are 0/4 at 512 used. very statistically unlikely.
When the table is 1/2 full, there is not a lot of overwriting going on. Or do you believe that your Zobrist numbers are poor enough that you don't evenly distribute hash signatures across the table?
That was rather my point... At half full, with single entries you have to overwrite 50% of the time, whereas with 4-entry buckets there is 'not much overwriting going on' (~15%). And while that isn't terribly significant at the normal operating point, the same argument carries over if you have some method to determine which half of the entries you're not worried about overwriting ('making a choice', as you put it).
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

kbhearn wrote:
bob wrote:
kbhearn wrote:
bob wrote:
kbhearn wrote:
That makes no sense. A fixed-size hash table has the same number of entries whether single or multi. They should fill up at exactly the same rate.
Indeed they do fill at the same rate, but while there are empty spots it's more likely that a random node will find one if it hits a multientry bucket in a 50% used HT than if it hits a single entry bucket in a 50% used HT.

the single entry has 2 states per bucket: used/unused - if you hit used you're pushing something out. Hence 50% filled is exactly 50% chance to have to push something out.

the 4-entry has 5 states: 0/4, 1/4, 2/4, 3/4, 4/4 - only 4/4 forces you to replace something. but if half the buckets are at 4/4 in a 50% full table that'd mean the other half the buckets were at 0/4. very unlikely unless you have an error in your code. Hence at 50% filled, there is < 50% chance of having to push something out with a multientry bucket.

edit: to extend this to something truly useful, you could replace 'filled' with 'would prefer not to replace' in the above arguments provided you have some method to detect the things that you don't mind being gone (such as age, depth)
You are overlooking one key. There are 1/4 as many buckets as there are entries. So you are just as likely to find a full bucket as you are a single used entry. The advantage is that with the bucket, you get a choice of what to replace, saving better entries.
No, i'm not overlooking that...

single entry: 1024 buckets of 1 entry, 512 used
4-entry: 256 buckets of 4 entries, total 1024 entries. if 128 of those buckets are 4/4 filled, then 128 of them are 0/4 at 512 used. very statistically unlikely.
When the table is 1/2 full, there is not a lot of overwriting going on. Or do you believe that your Zobrist numbers are poor enough that you don't evenly distribute hash signatures across the table?
That was rather my point... At half full, with single entries you have to overwrite 50% of the time, whereas with 4-entry buckets there is 'not much overwriting going on' (~15%). And while that isn't terribly significant at the normal operating point, the same argument carries over if you have some method to determine which half of the entries you're not worried about overwriting ('making a choice', as you put it).
If you look at my data, this isn't much of an issue. With 4-entry buckets, after storing 1M entries the table is 80% full, while the one-entry table is 62% full. That's not a particularly significant difference. 1M stores/nodes takes no time at all today, and good random numbers give a pretty good hash distribution by themselves. After only 3M stores, the two are about equal in entries used. It is the choice of what to replace that dominates from start to end.
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Transposition Age Tutorial

Post by D Sceviour »

It might be possible to overwrite an important hash entry that has not been searched yet in the current iteration. Therefore, would it be prudent to test transposition age + 2 to make sure that an about-to-be used entry is not overwritten prematurely?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

D Sceviour wrote:It might be possible to overwrite an important hash entry that has not been searched yet in the current iteration. Therefore, would it be prudent to test transposition age + 2 to make sure that an about-to-be used entry is not overwritten prematurely?
In theory you could replace the one with the oldest age first. I use depth + age myself. If an entry comes from an older iteration, I replace the one with the lowest draft.
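A small sketch of that preference, with made-up names rather than Crafty's actual code: a search-age counter is bumped at the start of every search, entries from older searches are preferred as victims, and among equally-old entries the lowest draft goes first.

```c
/* Sketch of an age + depth replacement preference (illustrative names, not
   Crafty's code): entries left over from older searches are preferred as
   victims, and among equally-old entries the lowest draft is evicted first. */
#include <stdint.h>

static uint8_t search_age;                   /* bumped once per new search */

void tt_new_search(void) { search_age++; }   /* wraps after 256 searches; fine for a sketch */

/* Lower value = better victim: old entries always rank below current-search
   entries, then the lowest draft loses. */
static int keep_value(uint8_t entry_age, uint8_t entry_depth) {
    int from_current_search = (entry_age == search_age);
    return from_current_search * 256 + entry_depth;
}

/* pick which of a 4-entry bucket's (age, draft) pairs to overwrite */
int pick_victim(const uint8_t age[4], const uint8_t depth[4]) {
    int victim = 0;
    for (int i = 1; i < 4; i++)
        if (keep_value(age[i], depth[i]) < keep_value(age[victim], depth[victim]))
            victim = i;
    return victim;
}
```

In a scheme like this, an entry stored during the current search is only evicted when every entry in its bucket is also from the current search, which also addresses the concern above about prematurely overwriting an about-to-be-used entry from the current iteration.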