Transposition Age Tutorial

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Age Tutorial

Post by Aleks Peshkov »

Zenmastur wrote:A “significant performance drop” as compared to what? An infinite-size table?
Different positions have drastically different hit rates, so no single number can be given.

I discovered that my 3+2 slot scheme became remarkably worse than the standard 4-slot bucket scheme at the moment when the 3-way part was, on average, completely full and started to lose information, while 4 slots on average were still enough. When both replacement schemes became over-flooded, the difference between the 3+2-way and the 4-way scheme was not so important, and the more sophisticated replacement started to gain an advantage.
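To illustrate what a plain 4-slot bucket store can look like, here is a minimal sketch; the entry layout and the "same position, else empty slot, else shallowest entry" order are illustrative assumptions, not the exact 3+2 scheme above or any specific engine's code.

Code: Select all

#include <stdint.h>

/* Illustrative 4-way bucket store: prefer the slot that already holds this
   position, then an empty slot, and only then overwrite the shallowest
   entry. All field names are assumptions made for this sketch. */

typedef struct {
    uint64_t key;    /* hash signature; 0 is treated as "empty" here */
    int16_t  score;
    uint8_t  depth;
    uint16_t move;
} TTEntry;

typedef struct { TTEntry slot[4]; } TTBucket;

void tt_store(TTBucket *b, uint64_t key, int16_t score,
              uint8_t depth, uint16_t move)
{
    TTEntry *victim = NULL;

    for (int i = 0; i < 4; i++)              /* 1. same position already stored? */
        if (b->slot[i].key == key) { victim = &b->slot[i]; break; }

    if (!victim)                             /* 2. any empty slot left? */
        for (int i = 0; i < 4; i++)
            if (b->slot[i].key == 0) { victim = &b->slot[i]; break; }

    if (!victim) {                           /* 3. overwrite the shallowest entry */
        victim = &b->slot[0];
        for (int i = 1; i < 4; i++)
            if (b->slot[i].depth < victim->depth)
                victim = &b->slot[i];
    }

    victim->key   = key;   victim->score = score;
    victim->depth = depth; victim->move  = move;
}

The observation above is about where such a scheme starts to hurt: once step 2 stops finding empty slots, every store becomes an overwrite, and the quality of step 3 starts to matter.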
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

Zenmastur wrote:
hgm wrote:This theorem uses 'full' in a meaning that has no practical relevance. I don't care if there still is a single empty entry in a table of a million, and that it would take a million more stores to get a hit on that entry that fills it. From a performance point of view a table that is still 10% empty is practically indistinguishable from a full table. The performance starts to fall when you start to overwrite entries, and this in fact happens significantly before the number of stores equals the table size. ...
I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
So, I guess you need to define the word “performance” explicitly as your current usage seems counter-intuitive and vague. I think a TT is performing at its best when it's completely full, since this is when it has the highest probability of producing a hit when all other factors are held constant.

This graph of yours:

[Graph: Effect of hash-table size on search time]

Seems to contradict your statement that “performance starts to fall when you start to overwrite entries”. It's not clear how “loading factor” is defined as used in the graph but, assuming it is related to the number of unique stores into a table of k entries in a linear way (e.g. a loading factor of 10 represents 10 * k unique stores into a hash table of k entries), then it looks to me like the performance stays about the same over a wide range of “loading factors”, i.e. from ~0.5 to ~5. This contradicts your 1/e example.
hgm wrote:...In an always-replace table of size k after k stores 1/e of the table would still be empty. But only by virtue that 1/e of the stored entries have already been overwritten, which will cause a very significant performance drop. ...
A “significant performance drop” as compared to what? An infinite-size table?

I doubt this is true and your graph certainly doesn't support this conclusion. In order for there to be any degradation in performance, a location has to be referenced at least 3 times: once to place an entry in it, once to overwrite it, and once to reference the original data, and this has to happen in that order. If the data in the cell is referenced and produces a hit before it's overwritten, no performance is lost. At least not until another (fourth) reference to that location is made.
hgm wrote:... I vaguely recall that for perft(6) the number of unique positions is already an order of magnitude smaller than the number of nodes.
I think I'll stand by my previous assertion that the number of entries in the hash table is the controlling factor here and not the number of transpositions, since if the transpositions were the cause of this effect the cache hit rate would be VERY high. Besides, what engine uses a full-width search without Alpha-Beta, which is what this statement seems to imply?

Regards,

Zen
I disagree. :)

The purpose of the table is NOT to "get hits". It is to "reduce effort" or "reduce tree size". Nothing more, nothing less. How full or empty the table is has nothing to do with this. This problem is all about keeping the entries that will help the most, and getting rid of those that will help the least (or not at all).

Table entries are referenced far more than just three times in a search. In fact, on a 30-ply search, one entry can get probed 30 times, with its age updated each time, even if it never causes a cutoff. It can still provide a best move.

Again, it is about effort saved, nothing else.
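To make that concrete, here is a minimal probe sketch, assuming a conventional entry layout (key, depth, bound, age, move) and a power-of-two table; the field names, bound encoding and signature are illustrative, not Crafty's or any other engine's actual code. Even when the stored draft is too shallow for a cutoff, the probe refreshes the entry's age and hands back the stored move for move ordering.

Code: Select all

#include <stdint.h>
#include <stddef.h>

enum { BOUND_NONE, BOUND_UPPER, BOUND_LOWER, BOUND_EXACT };

typedef struct {
    uint64_t key;
    int16_t  score;
    uint8_t  depth, age, bound;
    uint16_t move;
} TTEntry;

/* Returns 1 on a usable cutoff, 0 otherwise; *tt_move is filled on any hit. */
int tt_probe(TTEntry *table, size_t num_entries, uint64_t key,
             int depth, int alpha, int beta, uint8_t current_age,
             uint16_t *tt_move, int *score)
{
    TTEntry *e = &table[key & (num_entries - 1)];  /* num_entries: power of two */

    *tt_move = 0;
    if (e->key != key)
        return 0;                   /* miss */

    e->age   = current_age;         /* refresh: the entry proved useful again */
    *tt_move = e->move;             /* stored move helps ordering regardless  */

    if (e->depth < depth)
        return 0;                   /* draft too shallow for a cutoff         */

    if (e->bound == BOUND_EXACT ||
        (e->bound == BOUND_LOWER && e->score >= beta) ||
        (e->bound == BOUND_UPPER && e->score <= alpha)) {
        *score = e->score;
        return 1;                   /* hash cutoff: whole subtree saved       */
    }
    return 0;
}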
User avatar
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Age Tutorial

Post by hgm »

Zenmastur wrote:I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
Wrong! You can get nearly 100% hits on a table that is 99% empty, because you are not picking a random place to look for entries: you look for them in the same place where you stored them. So how much larger the table is than you need doesn't really play a role at all.
So, I guess you need to define the word “performance” explicitly as your current usage seems counter-intuitive and vague.
Hash-table performance = hit rate. Nothing vague about that. When you overwrite entries, probes that would otherwise have been hits will now be misses because the entry has disappeared. If they had not been overwritten (e.g. because the table was infinitely large, or the replacement scheme was infinitely smart), you would have had those extra hits.
I think a TT is performing at its best when it's completely full, since this is when it has the highest probability of producing a hit when all other factors are held constant.
Wrong again. What the filling fraction determines is the number of necessary overwrites, not the hit rate. The larger you make the table for a given search, the emptier it will get, but the shorter the time needed to complete that search.
This graph of yours:
...
Seems to contradict your statement that “performance starts to fall when you start to overwrite entries”. It's not clear how “loading factor” is defined as used in the graph but, assuming it is related to the number of unique stores into a table of k entries in a linear way (e.g. a loading factor of 10 represents 10 * k unique stores into a hash table of k entries),
Wrong assumption. It does not represent unique stores but the number of tree nodes of an unpruned tree.
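In other words, the loading factor on the horizontal axis is (nodes in the unpruned tree) / (table entries), i.e. N/k rather than the unique-store count divided by k.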
... then it looks to me like the performance stays about the same over a wide range of “loading factors”, i.e. from ~0.5 to ~5. This contradicts your 1/e example.
This is because the number of unique positions is about an order of magnitude smaller than the total number of positions in the tree. The number of unique positions is hard to determine, however.

Furthermore, note that the 1/e overwrite fraction is for an always-replace table. Even with buckets of 4 and merely preferring empty slots over filled ones when storing, you will have only very few overwrites before the filling fraction gets close to 100%. And the graph is for a much smarter replacement scheme.
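For reference, the 1/e figure follows from assuming k unique stores whose slots are independent and uniformly distributed over an always-replace table of k entries: the chance that a given slot is still empty after all k stores is (1 - 1/k)^k, which tends to 1/e ≈ 0.37 for large k. Those k stores then occupy only about k*(1 - 1/e) distinct slots, so roughly k/e of them have already been overwritten, which is the fraction just mentioned.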
A “significant performance drop” as compared to what? An infinite-size table?
Compared to a situation without overwrites, which can always be achieved by a large-enough table, and helped by smarter replacement.
I doubt this is true and your graph certainly doesn't support this conclusion.
The graph clearly shows that if you make the same search in a smaller table (i.e. at larger overload), the search time goes up, once you pass the point where the tree no longer fits in the table.
In order for there to be any degradation in performance, a location has to be referenced at least 3 times: once to place an entry in it, once to overwrite it, and once to reference the original data, and this has to happen in that order. If the data in the cell is referenced and produces a hit before it's overwritten, no performance is lost. At least not until another (fourth) reference to that location is made.
Yeah, so?
I think I'll stand by my previous assertion that the number of entries in the hash table is the controlling factor here and not the number of transpositions, since if the transpositions were the cause of this effect the cache hit rate would be VERY high.
Not sure at all what cache-hit rate has to do with any of this. But besides that, your conclusion about it also seems wrong. The cache is quite small compared to the table, and not able to catch most transpositions.
Besides, what engine uses a full-width search without Alpha-Beta, which is what this statement seems to imply?
I am not sure why you think this implies such a thing. If on average every unique position occurs 10 times in the perft tree, why would it be different in an alpha-beta search tree? Do you know any particular reason why the alpha-beta tree would not be a representative sampling of the perft tree, but would discriminate against duplicates?
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Transposition Age Tutorial

Post by Zenmastur »

bob wrote:
Zenmastur wrote:I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
I disagree. :)

The purpose of the table is NOT to "get hits". It is to "reduce effort" or "reduce tree size". Nothing more, nothing less.


I'll agree that reducing effort is the goal as long as you'll agree that in order to do this it must get hits. Without them the TT is wasted effort. So the two are related, albeit indirectly.
bob wrote:How full or empty the table is has nothing to do with this. ...
If we're going to nit-pick the way the sentences are worded, then you're wrong, since an empty table can never produce a useful hit and a full one clearly can.
bob wrote:This problem is all about keeping the entries that will help the most, and getting rid of those that will help the least (or not at all).
Agreed, but this has little to do with my original point assuming the replacement algorithm is held constant while the TT size is changed. I believe I made this clear near the top of the post by stating that all other factors should be held constant. I DID NOT include an exception for the TT replacement algorithm so it should be held constant.
bob wrote:Table entries are referenced far more than just three times in a search. In fact, on a 30 ply search, one can get probed 30 times,...
While this is true, these comments were directed at this statement by HGM:
HGM wrote:...In an always-replace table of size k after k stores 1/e of the table would still be empty. But only by virtue that 1/e of the stored entries have already been overwritten, which will cause a very significant performance drop. ...
So, you're considerably less likely to get 30 probes to the same location with only k probes into a table with k locations “IF” the table is of reasonably large size (e.g. 1 million locations or more) and the data is IID, which I assume it must be, otherwise the statement he made is incorrect. As long as we're nit-picking, and apparently we are, he failed to specify that the k stores be unique; therefore his statement can't possibly be correct.
bob wrote:Again, it is about effort saved, nothing else.
We can agree on that, but it has little to do with the original point I was making. See my first post under this topic.

hgm wrote:
Zenmastur wrote:I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
Wrong!
Really? Something seems amiss.
hgm wrote:
This graph of yours:
...
Seems to contradict your statement that “performance starts to fall when you start to overwrite entries”. It's not clear how “loading factor” is defined as used in the graph but, assuming it is related to the number of unique stores into a table of k entries in a linear way (e.g. a loading factor of 10 represents 10 * k unique stores into a hash table of k entries),
Wrong assumption. It does not represent unique stores but number of tree nodes of an unpruned tree.
I see! You'll forgive me if I thought this graphic was for a pruned Alpha-Beta search from a “normal” engine. I'm sure that I'm not the only one who thought this. Now that you have revealed its true nature, its true value, or lack thereof, becomes apparent.

Let me shorten this conversation a bit.

Regards,

Forrest
User avatar
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Age Tutorial

Post by hgm »

Zenmastur wrote:
bob wrote:Table entries are referenced far more than just three times in a search. In fact, on a 30 ply search, one can get probed 30 times,...
While this is true, these comments were directed at this statement by HGM:
HGM wrote:...In an always-replace table of size k after k stores 1/e of the table would still be empty. But only by virtue that 1/e of the stored entries have already been overwritten, which will cause a very significant performance drop. ...
So, you're considerably less likely to get 30 probes to the same location with only k probes into a table with k locations “IF” the table is of reasonably large size (e.g. 1 million locations or more) and the data is IID, which I assume it must be, otherwise the statement he made is incorrect. As long as we're nit-picking, and apparently we are, he failed to specify that the k stores be unique; therefore his statement can't possibly be correct.
Of course they are unique. If you have a duplicate it would already be in the table, and you would not have to store it...
hgm wrote:Wrong assumption. It does not represent unique stores but number of tree nodes of an unpruned tree.
I see! You'll forgive me if I thought this graphic was for a pruned Alpha-Beta search from a “normal” engine. I'm sure that I'm not the only one who thought this. Now that you have revealed its true nature, its true value, or lack thereof, becomes apparent.
My bad... I thought it was understood we were talking about an alpha-beta ('search') tree, as opposed to a 'perft' tree. So when I said 'unpruned' I meant "unpruned by hash cutoffs".
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

Zenmastur wrote:
bob wrote:
Zenmastur wrote:I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
I disagree. :)

The purpose of the table is NOT to "get hits". It is to "reduce effort" or "reduce tree size". Nothing more, nothing less.


I'll agree that reducing effort is the goal as long as you'll agree that in order to do this it must get hits. Without them the TT is wasted effort. So the two are related, albeit indirectly.
bob wrote:How full or empty the table is has nothing to do with this. ...
If we're going to nit-pick the way the sentences are worded, then you're wrong, since an empty table can never produce a useful hit and a full one clearly can.
bob wrote:This problem is all about keeping the entries that will help the most, and getting rid of those that will help the least (or not at all).
Agreed, but this has little to do with my original point assuming the replacement algorithm is held constant while the TT size is changed. I believe I made this clear near the top of the post by stating that all other factors should be held constant. I DID NOT include an exception for the TT replacement algorithm so it should be held constant.
bob wrote:Table entries are referenced far more than just three times in a search. In fact, on a 30 ply search, one can get probed 30 times,...
While this is true, these comments were directed at this statement by HGM:
HGM wrote:...In an always-replace table of size k after k stores 1/e of the table would still be empty. But only by virtue that 1/e of the stored entries have already been overwritten, which will cause a very significant performance drop. ...
So, you're considerably less likely to get 30 probes to the same location with only k probes into a table with k locations “IF” the table is of reasonably large size (e.g. 1 million locations or more) and the data is IID, which I assume it must be, otherwise the statement he made is incorrect. As long as we're nit-picking, and apparently we are, he failed to specify that the k stores be unique; therefore his statement can't possibly be correct.
bob wrote:Again, it is about effort saved, nothing else.
We can agree on that, but it has little to do with the original point I was making. See my first post under this topic.

hgm wrote:
Zenmastur wrote:I disagree. The purpose of a hash table is to get hits. One factor controlling the number of hits is the percentage of entries in the table that are occupied. If the TT is empty then it's completely worthless.
Wrong!
Really? Something seems amiss.
hgm wrote:
This graph of yours:
...
Seems to contradict your statement that “performance starts to fall when you start to overwrite entries”. It's not clear how “loading factor” is defined as used in the graph but, assuming it is related to the number of unique stores into a table of k entries in a linear way (e.g. a loading factor of 10 represents 10 * k unique stores into a hash table of k entries),
Wrong assumption. It does not represent unique stores but number of tree nodes of an unpruned tree.
I see! You'll forgive me if I thought this graphic was for a pruned Alpha-Beta search from a “normal” engine. I'm sure that I'm not the only one who thought this. Now that you have revealed its true nature, its true value, or lack thereof, becomes apparent.

Let me shorten this conversation a bit.

Regards,

Forrest
You overlook the obvious. If you use an anti-depth-preferred replacement, your hit rate will likely increase (I will test this and report numbers in a bit). But you miss out on the saved effort of the deeper entries. So just increasing the hit rate is not the answer; increasing the hit rate in positions where more effort would be expended on a miss is the answer.
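A sketch of the trade-off this implies, in code: choose the eviction victim by how little effort a future miss on it would cost, rather than by how often it gets probed. The entry layout and the weight of 4 per age step below are made-up numbers for illustration, not Crafty's or any other engine's actual policy.

Code: Select all

#include <stdint.h>

typedef struct {
    uint64_t key;
    uint8_t  depth;   /* draft of the stored result          */
    uint8_t  age;     /* search counter at last store/probe  */
} TTEntry;

/* Lower value = cheaper to lose = better eviction victim. Deep entries
   represent large subtrees that would have to be re-searched on a miss,
   so they are kept; stale entries are discounted because they are
   unlikely to be probed again. */
static int entry_value(const TTEntry *e, uint8_t current_age)
{
    uint8_t staleness = (uint8_t)(current_age - e->age);   /* wraps safely */
    return (int)e->depth - 4 * (int)staleness;
}

TTEntry *pick_victim(TTEntry *bucket, int n, uint8_t current_age)
{
    TTEntry *victim = &bucket[0];
    for (int i = 1; i < n; i++)
        if (entry_value(&bucket[i], current_age) < entry_value(victim, current_age))
            victim = &bucket[i];
    return victim;
}

An "anti-depth-preferred" rule would flip the sign of the depth term: raw hit counts tend to go up, because shallow positions are probed far more often, but each of those hits saves only a tiny subtree, which is exactly the objection above.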