Transposition Age Tutorial

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

hgm wrote:
bob wrote:Here's a question: Suppose you use a bucket size of 4, which is typical although some are beginning to go with 8, using an 8-byte uber-compressed table entry (different topic). All four entries in that bucket represent the same draft, so how do you choose which one to replace? Age might be a usable alternative when draft alone isn't enough to break the tie. What if two entries in the table represent the same draft (which is over-populated and should cause a replacement)? How do you choose which one? Again, age might help. Over-populated and produced by the previous search would seem to be the better choice.
The point is that this is all pretty unlikely. If every draft occurs about equally often, and fills, say, 5% of the table, the probability that all 4 entries in a bucket have the same draft is 1 in 8,000.

The drafts higher than a certain N are not yet 'saturated', i.e. they occur less frequently than every draft below N, and in fact all drafts above N together occur hardly more often than any single draft below N, because the production rate decreases exponentially with draft: each next-higher draft is produced, say, 3 times less often than the previous one. So roughly each draft takes up 1/(N+1.5) of the table. But the exponential production rate also applies to the lower drafts. If draft N was produced just enough to exactly hit the 1/(N+1.5) mark, draft N-1 would have been produced 3 times as much, so that 2/3 of it must already have been replaced. And draft N-3 has already overloaded the table by a factor of 27. So if you are at the 27th move of the game (excluding the opening moves), the number of draft N-3 entries that can be held in the table corresponds to a single move. And for draft N-4 an entry would typically already have been overwritten 3 times before the current search completes.

So when N=20, drafts 1-16 would basically all be from the current search. Even at draft 17 only about 1/e of the entries from a previous search would have survived. Only for drafts 18, 19 and 20 would you have a good chance that there might be a stale entry around that you would preferably like to replace. But it is very unlikely that it would be in the same bucket. And on a global scale you gain very little, just a bit more efficiency in 3 of the 20 drafts.
If you work with the 1/8000 idea, you will still have to contend with WAY more than 8000 buckets. E.g., in the parallel search testing I was reporting on a few months ago, I was using 64G of hash, which is 4G entries, or 1G "buckets". So I would expect to see that a pretty significant number of times. But you only need two entries in a bucket with the same draft before the problem occurs, since that pair could be the over-populated draft positions, and you have to choose which of the two to replace. If you assume, say, a 32-ply search, you could see drafts between 1 and 32 (ignoring the edge case at the root). The probability of any two entries in a bucket having the same draft is fairly low (6 * (1/32)^2 = about 1/200 if I did my mental math reasonably). With 1G buckets, that is about 5M buckets that meet that criterion.

It would seem that you need some rational way to choose, whether it be random or age or something else. Age is trivial to do.
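For concreteness, the tie-break described above might look like this. This is a minimal sketch only; the Entry layout, field names, and the exact policy (entries from an older search first, then lowest draft) are illustrative assumptions, not Crafty's actual code:

```c
#include <stdint.h>

/* Hypothetical entry layout; field names are illustrative. */
typedef struct {
    uint64_t key;
    int8_t   draft;
    uint8_t  age;    /* value of the search counter when stored */
} Entry;

enum { BUCKET_SIZE = 4 };

/* Pick a victim: prefer any entry from an older search (stale),
   otherwise the lowest draft; staleness breaks draft ties. */
int pick_victim(const Entry bucket[BUCKET_SIZE], uint8_t current_age)
{
    int victim = 0;
    for (int i = 1; i < BUCKET_SIZE; i++) {
        int i_stale = bucket[i].age != current_age;
        int v_stale = bucket[victim].age != current_age;
        if (i_stale != v_stale) {            /* stale beats fresh */
            if (i_stale) victim = i;
        } else if (bucket[i].draft < bucket[victim].draft) {
            victim = i;                      /* lower draft loses */
        }
    }
    return victim;
}
```

With all four entries at the same age, this degenerates to "replace the shallowest"; the age field only matters when a stale entry is present.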
D Sceviour
Posts: 570
Joined: Mon Jul 20, 2015 5:06 pm

Re: Transposition Age Tutorial

Post by D Sceviour »

The question of over-writing is similar to asking how many killer moves should be saved and tried - one, two, eight? Which killer should be tried first? Which killer should be overwritten? I found a big increase in performance using one killer move, but no significant change using extra killers. The single exception is an extra mate killer that is tried first if the first move leads to mate. However, in this situation something extra is known about the position; that is, there is a mate threat.

The first test results for using a Crafty-style multiple hash entry (compared to an always-overwrite scheme) are disappointing so far. There is an obvious slowdown in performance, as more hash entries have to be tested. A quick tournament test showed no change in Elo, so the expected improvement looks nominal unless some other kind of error is being missed.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Age Tutorial

Post by Aleks Peshkov »

bob wrote:Here's a question: Suppose you use a bucket size of 4, which is typical although some are beginning to go with 8, using an 8-byte uber-compressed table entry (different topic). All four entries in that bucket represent the same draft, so how do you choose which one to replace? Age might be a usable alternative when draft alone isn't enough to break the tie. What if two entries in the table represent the same draft (which is over-populated and should cause a replacement)? How do you choose which one? Again, age might help. Over-populated and produced by the previous search would seem to be the better choice.
My slots in the bucket are ordered: the highest entry is the deepest one, or the most recently written or read if depths are equal. Every read and write reorders the bucket. Entries outdated by age are pushed below any actual-age entry, but are ordered the same way among themselves.

When I had a 4-slot bucket, it was profitable to overwrite not the shallowest entry but the one of the same or nearest lower depth (a simplified equidistribution idea). With a 3-slot bucket, overwriting the shallowest (or the oldest of equal depth) performs better.

Now I have a bucket of 3x 16-byte any-depth slots + 2x 8-byte depth-0 slots (which need no depth and age fields).

I increment the age counter after every search depth iteration, but the current and current-1 ages are both treated as actual.
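The ordering rule described above could be sketched as follows. The Slot layout and the `stamp` recency field are invented for illustration; only the ordering rule (fresh above stale, deeper first, most recently touched first at equal depth) and the "current and current-1 count as actual" convention come from the post:

```c
#include <stdint.h>

/* Illustrative slot layout; not the poster's actual structure. */
typedef struct {
    uint32_t key;
    int8_t   depth;
    uint8_t  age;     /* search counter at last write */
    uint8_t  stamp;   /* recency within equal depth: higher = newer */
} Slot;

/* Ages of the current and current-1 iterations both count as actual. */
static int is_fresh(const Slot *s, uint8_t cur)
{
    return (uint8_t)(cur - s->age) <= 1;
}

/* Fresh ranks above stale; within a freshness class, deeper first;
   at equal depth, the most recently touched entry ranks first. */
static int ranks_above(const Slot *a, const Slot *b, uint8_t cur)
{
    int af = is_fresh(a, cur), bf = is_fresh(b, cur);
    if (af != bf) return af;
    if (a->depth != b->depth) return a->depth > b->depth;
    return a->stamp > b->stamp;
}

/* After touching slot i (read or write), bubble it up into place. */
void reorder(Slot bucket[3], int i, uint8_t cur)
{
    while (i > 0 && ranks_above(&bucket[i], &bucket[i - 1], cur)) {
        Slot t = bucket[i];
        bucket[i] = bucket[i - 1];
        bucket[i - 1] = t;
        i--;
    }
}
```

A nice property of keeping the bucket sorted this way is that the replacement victim is always the last slot, so no per-probe victim search is needed.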
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Age Tutorial

Post by Aleks Peshkov »

D Sceviour wrote:The first test results for using a Crafty style multiple hash entry (compared to an always overwrite scheme) are disappointing so far. There is obviously a slowdown in performance, as more hash entries have to be tested. A quick tournament test showed no change in elo, so expected improvement look nominal unless there is another type of error missing.
You need to scale the transposition table size to the test time control if you want to optimize for a normal table size at tournament time control.
User avatar
hgm
Posts: 27807
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Age Tutorial

Post by hgm »

bob wrote:If you work with the 1/8000 idea, you will still have to contend with WAY more than 8000 buckets. E.g., in the parallel search testing I was reporting on a few months ago, I was using 64G of hash, which is 4G entries, or 1G "buckets". So I would expect to see that a pretty significant number of times. But you only need two entries in a bucket with the same draft before the problem occurs, since that pair could be the over-populated draft positions, and you have to choose which of the two to replace. If you assume, say, a 32-ply search, you could see drafts between 1 and 32 (ignoring the edge case at the root). The probability of any two entries in a bucket having the same draft is fairly low (6 * (1/32)^2 = about 1/200 if I did my mental math reasonably). With 1G buckets, that is about 5M buckets that meet that criterion.

It would seem that you need some rational way to choose, whether it be random or age or something else. Age is trivial to do.
The way I implemented it is that I just overwrite the first over-represented entry I encounter in the normal test order of the entries in a bucket. That minimizes the number of key compares you have to do to find it again on a probe, as well as the number of entries you have to test for being over-represented. That seems even easier to do. But you can very well combine it with an aging scheme. I tried that too, in the following way: you associate a small counter with each entry, say 4 bits, and when you encounter an entry with an over-represented draft in the course of searching for an entry to replace, you increment that counter. You only replace it because of its over-representation when that counter overflows. (You might still replace it for having the lowest depth in the bucket, in which case you reset the counter to 0.) This makes the lifetime of entries of a given draft more equal: instead of the surviving fraction decaying exponentially, so that most live very short and a few live very long, the survival curve starts to look more like a step function.
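The counter mechanism could be sketched like this. Only the 4-bit counter and the replace-on-overflow rule come from the description above; the Entry layout and function names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical entry with a small aging counter. */
typedef struct {
    uint64_t key;
    int8_t   draft;
    uint8_t  hits;   /* 4-bit over-representation counter */
} Entry;

/* Called when this entry's draft is found to be over-represented
   while scanning the bucket for a victim. Returns 1 when the 4-bit
   counter overflows, i.e. the entry may now be replaced. */
int bump_overrep(Entry *e)
{
    e->hits = (e->hits + 1) & 0xF;   /* 4-bit wraparound */
    return e->hits == 0;             /* overflow -> replaceable */
}
```

An over-represented entry thus survives up to 16 replacement attempts before it is evicted, which is what flattens the survival curve toward a step function.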
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

Aleks Peshkov wrote:
bob wrote:Here's a question: Suppose you use a bucket size of 4, which is typical although some are beginning to go with 8, using an 8-byte uber-compressed table entry (different topic). All four entries in that bucket represent the same draft, so how do you choose which one to replace? Age might be a usable alternative when draft alone isn't enough to break the tie. What if two entries in the table represent the same draft (which is over-populated and should cause a replacement)? How do you choose which one? Again, age might help. Over-populated and produced by the previous search would seem to be the better choice.
My slots in the bucket are ordered: the highest entry is the deepest one, or the most recently written or read if depths are equal. Every read and write reorders the bucket. Entries outdated by age are pushed below any actual-age entry, but are ordered the same way among themselves.

When I had a 4-slot bucket, it was profitable to overwrite not the shallowest entry but the one of the same or nearest lower depth (a simplified equidistribution idea). With a 3-slot bucket, overwriting the shallowest (or the oldest of equal depth) performs better.

Now I have a bucket of 3x 16-byte any-depth slots + 2x 8-byte depth-0 slots (which need no depth and age fields).

I increment the age counter after every search depth iteration, but the current and current-1 ages are both treated as actual.
What kind of info can you store in 8 bytes? Obviously you eliminate more than just depth and age. What's the benefit?
User avatar
hgm
Posts: 27807
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Age Tutorial

Post by hgm »

bob wrote:What kind of info can you store in 8 bytes? Obviously you eliminate more than just depth and age. What's the benefit?
32 bits signature, 13 bits move packed with 2 bits flag, 16 bits score? Perhaps 1 bit for in-check?

In Joker I used 9-byte entries, just the above plus 1 byte of depth. Then seven of those fitted in a cache line, with 1 byte to spare. Because I did not feel like testing 7 entries on a probe, I organized them as 3 groups of 2 depth-preferred entries, plus one always-replace entry, where everything would go that could not beat the draft of the two depth-preferred entries.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Age Tutorial

Post by Aleks Peshkov »

bob wrote:What kind of info can you store in 8 bytes? Obviously you eliminate more than just depth and age.
I have a 12-bit move encoding, so a 12-bit move + 16-bit eval + 2 bits for bound flags fits easily inside a 64-bit slot. The rest is the lock key.
What's the benefit?
My main reason is that 3 slots are much easier to keep ordered than shuffling 4.

The benefit of (3x free + 2x rigid) slot allocation over 4x free is good (almost 5/4 better) when table space is critically low, slowly diminishes to a negative gain when the table is about exactly 4x the search space, and becomes slightly better again as the table grows relative to the search space.

(2x free + 4x rigid) performed much worse. I tried rigid depth: 3-2-1-0, 2-1-0-0 and 1-1-0-0.
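One possible packing of such a 64-bit slot follows. The 12-bit move, 16-bit eval, and 2-bit bound widths are from the post; putting the remaining 34 bits at the top as the lock key, and the field order, are my assumptions:

```c
#include <stdint.h>

/* Layout (low to high bits): 12-bit move | 16-bit score |
   2-bit bound | 34-bit lock key. 12+16+2+34 = 64. */
static inline uint64_t pack(uint16_t move12, int16_t score,
                            unsigned bound2, uint64_t lock34)
{
    return ((uint64_t)(move12 & 0xFFF))
         | ((uint64_t)(uint16_t)score << 12)
         | ((uint64_t)(bound2 & 0x3)  << 28)
         | ((lock34 & ((1ULL << 34) - 1)) << 30);
}

static inline uint16_t move_of(uint64_t s)  { return s & 0xFFF; }
static inline int16_t  score_of(uint64_t s) { return (int16_t)(s >> 12); }
static inline unsigned bound_of(uint64_t s) { return (s >> 28) & 0x3; }
static inline uint64_t lock_of(uint64_t s)  { return s >> 30; }
```

A 64-bit slot also means a single aligned store publishes the whole entry atomically on common hardware, which sidesteps the torn-entry problem that larger entries must handle with lockless XOR tricks.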
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Age Tutorial

Post by Aleks Peshkov »

hgm wrote:In Joker I used 9-byte entries, just the above plus 1 byte of depth. Then seven of those fitted in a cache line, with 1 byte to spare. Because I did not feel like testing 7 entries on a probe, I organized them as 3 groups of 2 depth-preferred entries, plus one always-replace entry, where everything would go that could not beat the draft of the two depth-preferred entries.
I suggest 2 groups of 3-slot buckets. The more slots, the more difficult it is to use them effectively, but "shallowest of 3" easily outperforms 2 depth-preferred + 1 always-replace.

I observed that the standard "shallowest of 4" can be improved with a quasi-equidistributed scheme: replace the slot of equal depth, or the nearest slot of lower depth (except that I always keep the previous deepest entry and replace the second-deepest in that case).

Example: 5, 4, 2, 1.
Writing new depth 4 or greater: replace slot with depth 4.
Writing new entry with depth 2 or 3: replace slot with depth 2.
Writing 1 or 0: replace the shallowest, depth 1.

A pure LRU cache scheme is obviously bad for a chess transposition table, but you can keep LRU information among entries of the same (or almost the same) depth.
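The victim choice in the worked example above can be sketched as follows, assuming the bucket's depths are kept sorted in descending order, as in the 5, 4, 2, 1 example:

```c
#include <stdint.h>

/* Depths assumed sorted descending, e.g. {5, 4, 2, 1}.
   Slot 0 (the deepest) is always kept; a new entry as deep or
   deeper than it evicts the second-deepest slot instead. */
int equidist_victim(const int8_t depth[4], int8_t new_depth)
{
    for (int i = 1; i < 4; i++)
        if (depth[i] <= new_depth)
            return i;          /* first slot of equal or lower depth */
    return 3;                  /* nothing lower: take the shallowest */
}
```

This reproduces the example: depth 4 or greater evicts the depth-4 slot, depth 2 or 3 evicts the depth-2 slot, and depth 0 or 1 evicts the shallowest slot.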
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Age Tutorial

Post by bob »

hgm wrote:
bob wrote:What kind of info can you store in 8 bytes? Obviously you eliminate more than just depth and age. What's the benefit?
32 bits signature, 13 bits move packed with 2 bits flag, 16 bits score? Perhaps 1 bit for in-check?

In Joker I used 9-byte entries, just the above plus 1 byte of depth. Then seven of those fitted in a cache line, with 1 byte to spare. Because I did not feel like testing 7 entries on a probe, I organized them as 3 groups of 2 depth-preferred entries, plus one always-replace entry, where everything would go that could not beat the draft of the two depth-preferred entries.
Yes, but the bound is the important information, and with that you need draft. I had a 25.0 version that used 8-byte entries, but the collision rate was way too high to accept. I used 12 bits for the move (from/to only; "decorate" the actual move later when trying to use it), a 2-bit type, a 16-bit score, 7 bits for draft, 2 bits for age, and the rest for the signature. It worked, but it introduced some interesting problems. I went back to simple and safe.