transposition table details

Don · Post by **Don** » Tue May 26, 2009 4:23 pm

I'm trying to improve my hash table table implementation after noticing that some programs respond instantly when you re-search the same position to the same depth. For instance from Rybka, "go depth 15" takes 20 seconds the first time, but if then do a "go depth 15" the second time it takes 0.12 seconds. (These are UCI interface commands.)

My own program also responds much quicker the second time, due to information in the hash tables, but not nearly so much as Rybka.

I have experimented so far with 3 basic replacement schemes:

a very basic scheme that does not overwrite a deeper entry, otherwise it always overwrites. Only 1 probe.

A scheme that segments the table into sets of 4 records and the program chooses the "best" entry among the 4. When writing it's possible that no entries are appropriate because it will not overwrite an entry of greater depth.

The standard 2 level table, depth based for one, always overwrite for the other.

In all cases, I always overwrite an entry if the "age" is different, in other words it was created by a previous search.

I am trying to collect information on what variations of schemes people have tried to guide my own experiments. This information is not in one place, but it's scattered throughout various posts and pretty incomplete. They rarely provide the gory details.

What I've also noticed is that in fixed depth 9 ply games some schemes appear to play "stronger" by a few ELO than others. I cannot be completely sure of this since the difference is only about 15 ELO and I have played less than 10,000 games (it could be statistical noise) but it's clear the searches vary signficantly enough that the games with the same openings are not duplicating. With LMR you would clearly expect that, so no surprise. I would also expect that the better implementation could have a slight impact on strength, at least when combined with LMR.

Here is the exact thing I'm doing with method 3 (the standard 2 level table) keeping in mind that AGE always trumps anything else, so an entry is viewed as empty if it was produced by a previous search:

if the depth based entry has a greater depth, use the entry from the always replace table. Whichever is used, gets overwritten regardless of the key. I have not experimented with whether to overwrite the depth entry if the KEY matches (even if the depth in the table is higher.)

Some other details I have heard that people use:

1. I think Bob Hyatt does not overwrite the depth based table when age is CLOSE to the current age.

2. Rybka (it is claimed) does not store scores that are extremely high or low but I'm not sure of the details. It is miserably slower for me when I do this so I cannot imagine what Rybka is doing.

3. Give each hash table entry 2 sets of values, depth based and always overwrite. This is a little different from method 3, the standard 2 level table because the key can be shared.

4. Various probing schemes, such as set associative (which is what seems to work best for me) and linear hashing to find an entry.

5. Depth based 2 level table is a different size (2X ?)

I have found in the simple implementation that you should always overwrite an entry with the same key. I have not tried to experiment with this too much.

There are a zillion details of course on how this might be done and I fear I missing some big point.

hgm · Post by **hgm** » Tue May 26, 2009 4:38 pm

It was quite difficult to compare different replacement strategies using a search as benchmark, since this leads to very noisy data. So I tested mainly on perft trees. The scheme that seemed to work best was equidistributed-depth replacement:

The favorable scaling of this scheme did seem to persist in searches:

Don · Post by **Don** » Tue May 26, 2009 4:59 pm

hgm wrote:It was quite difficult to compare different replacement strategies using a search as benchmark, since this leads to very noisy data. So I tested mainly on perft trees. The scheme that seemed to work best was equidistributed-depth replacement:

The favorable scaling of this scheme did seem to persist in searches:

I'm not sure what you mean by the phrase "equidistributed-depth replacement" but I would be willing to try it if I can figure out how that is done. I vaguely remember something posted on this.

The way I get my times is I just play autotest games and I average the time per game up to the Nth move, where N is 40 right now. For something like this I use fixed depth games.

The schemes I mentioned vary signficantly in search time. For instance the set associate hash takes 5.02 seconds per game and the "standard 2 level table" takes 5.518 per game. I trust this to about 1 decimal place so we have 5.0 vs 5.5. In this case, we have about 6000 games.

The set method I use has a scoring system to decide which of the 4 entries is the least valuable.

The most important criteria is the key, I always overwrite if the key is the same (so that I don't have multiple versions of the same entry in the table.)

The age overrides that, and is of secondary importance.

Then, I try to retain entries of the greatest depth.

I don't store an entry at all if I have to overwrite an entry of greater depth, unless of course it's from a different age or has the same key.

This has tested best of anything I've tried, but I'm quite sure I haven't tried everything!

So how do you do equidistributed-depth replacement?

hgm · Post by **hgm** » Tue May 26, 2009 5:33 pm

In equidistributed draft I try to keep the number of entries of each draft stored in the table approximately equal. If a certain draft in the depth-preferred table gets more populous than the lowest draft stored in the depth-preferred table (as higher drafts eventually will, with depth-preferred replacement) I treat entries with that draft as if they were empty (as far as the replacement decision is concerned).

If there in no such 'overrepresented' entry in the hash bucket, I replace the lowest draft. (Even if this is higher than that of the new one.)

That is the basic idea. In practice, the implementation used for taking the presented data had 7 entries per 64-byte hash bucket (allocated in register with the cache line), of which one (entry #6) was a dedicated replace-always entry. The hask key maps to one of the 'primary slots' 0-4, and if this contained an overrepresented entry it was replaced. If not, the smallest draft of the primary slot and its right neighbor was determined. If this had a draft smaller than or equal to the new entry, it was replaced. If it had a larger draft, the new entry went to the always-replace slot.

This way each entry can end up in only three places. On a hit, I just update the existing entry, to avoid duplicates.

zamar · Post by **zamar** » Tue May 26, 2009 5:57 pm

Don wrote:
2. Rybka (it is claimed) does not store scores that are extremely high or low but I'm not sure of the details. It is miserably slower for me when I do this so I cannot imagine what Rybka is doing.

My _theory_ of this is that Rybka is not avoiding to store them, but is rounding them down or up. So instead of storing for example +9.32 FLAG_LOWER, it stores it as +5.11 FLAG_LOWER. When value at root is around usual -3.00 -- +3.00 both values are enough to cause cutoffs in search tree. When beta is over 5, well... we are winning anyway... Just a theory!!

Don · Post by **Don** » Tue May 26, 2009 6:23 pm

I have a question for Bob Hyatt.

I saw a forum post where you claimed that you do not try to deal with draw by repetition (and presumably by 50 move rule) scores. In my program I do not store any score of zero, but I fear that is a terrible blunder because I get no help from the hash table towards recognizing draw by repetition.

Am I correct in assuming you completely ignore draw issues as far as storing or not storing them in the hash table?

I found a position where my program was sluggish seeing a repetition draw, even compared to many other weaker programs. I'm pretty sure the reason is because of how I deal badly with this case and wonder if just storing them is better than what I'm doing.

michiguel · Post by **michiguel** » Tue May 26, 2009 6:25 pm

Don wrote:I'm trying to improve my hash table table implementation after noticing that some programs respond instantly when you re-search the same position to the same depth. For instance from Rybka, "go depth 15" takes 20 seconds the first time, but if then do a "go depth 15" the second time it takes 0.12 seconds. (These are UCI interface commands.)

My own program also responds much quicker the second time, due to information in the hash tables, but not nearly so much as Rybka.

I have experimented so far with 3 basic replacement schemes:

a very basic scheme that does not overwrite a deeper entry, otherwise it always overwrites. Only 1 probe.

A scheme that segments the table into sets of 4 records and the program chooses the "best" entry among the 4. When writing it's possible that no entries are appropriate because it will not overwrite an entry of greater depth.

The standard 2 level table, depth based for one, always overwrite for the other.

In all cases, I always overwrite an entry if the "age" is different, in other words it was created by a previous search.

I am trying to collect information on what variations of schemes people have tried to guide my own experiments. This information is not in one place, but it's scattered throughout various posts and pretty incomplete. They rarely provide the gory details.

What I've also noticed is that in fixed depth 9 ply games some schemes appear to play "stronger" by a few ELO than others. I cannot be completely sure of this since the difference is only about 15 ELO and I have played less than 10,000 games (it could be statistical noise) but it's clear the searches vary signficantly enough that the games with the same openings are not duplicating. With LMR you would clearly expect that, so no surprise. I would also expect that the better implementation could have a slight impact on strength, at least when combined with LMR.

Here is the exact thing I'm doing with method 3 (the standard 2 level table) keeping in mind that AGE always trumps anything else, so an entry is viewed as empty if it was produced by a previous search:

if the depth based entry has a greater depth, use the entry from the always replace table. Whichever is used, gets overwritten regardless of the key. I have not experimented with whether to overwrite the depth entry if the KEY matches (even if the depth in the table is higher.)

Some other details I have heard that people use:

1. I think Bob Hyatt does not overwrite the depth based table when age is CLOSE to the current age.

2. Rybka (it is claimed) does not store scores that are extremely high or low but I'm not sure of the details. It is miserably slower for me when I do this so I cannot imagine what Rybka is doing.

3. Give each hash table entry 2 sets of values, depth based and always overwrite. This is a little different from method 3, the standard 2 level table because the key can be shared.

4. Various probing schemes, such as set associative (which is what seems to work best for me) and linear hashing to find an entry.

5. Depth based 2 level table is a different size (2X ?)

I have found in the simple implementation that you should always overwrite an entry with the same key. I have not tried to experiment with this too much.

There are a zillion details of course on how this might be done and I fear I missing some big point.

Could you post the position in which you see that behavior with Rybka, with all the details? (memory, depth, time etc.). In that way, we can compare data.

Miguel

Don · Post by **Don** » Tue May 26, 2009 6:40 pm

michiguel wrote: Could you post the position in which you see that behavior with Rybka, with all the details? (memory, depth, time etc.). In that way, we can compare data.
Miguel

The position is the opening position. Here is the test which I just ran directly in raw mode using UCI commands and I grep out much of the noise:

Code: Select all

drd@greencheeks&#58;~/Games/Chess/TOOLS/ThorstenRater$ ~/bin/rybka64 | grep time
position startpos
go depth 15
info depth 2 score cp 26 time 1 nodes 72 nps 73728 pv g1f3
info depth 2 time 2 nodes 86 nps 44032 
info depth 3 score cp 9 time 2 nodes 110 nps 56320 pv g1f3
info depth 3 time 3 nodes 194 nps 66218 
info depth 4 score cp 13 time 4 nodes 241 nps 61696 pv g1f3
info depth 4 time 5 nodes 323 nps 66150 
info depth 5 score cp 9 time 7 nodes 454 nps 66413 pv g1f3 g8f6
info depth 5 time 11 nodes 724 nps 67397 
info depth 6 score cp 6 time 20 nodes 1215 nps 62208 pv g1f3 g8f6 b1c3
info depth 6 time 30 nodes 1874 nps 63965 
info depth 7 score cp 10 time 43 nodes 2583 nps 61511 pv g1f3 g8f6 b1c3 b8c6
info depth 7 time 60 nodes 3640 nps 62122 
info depth 8 score cp 6 time 92 nodes 5579 nps 62096 pv g1f3 g8f6 b1c3 b8c6 d2d4
info depth 8 time 122 nodes 7377 nps 61918 
info depth 9 score cp 9 time 158 nodes 8914 nps 57771 pv g1f3 g8f6 b1c3 b8c6 d2d4 d7d5
info depth 9 time 202 nodes 11741 nps 59518 
info depth 10 score cp 8 time 474 nodes 30988 nps 66944 pv g1f3 g8f6 g2g3 b8c6 f1g2 d7d6 b1c3
info depth 10 time 545 nodes 36626 nps 68816 
info depth 11 score cp 9 time 814 nodes 55768 nps 70155 pv g1f3 g8f6 g2g3 g7g6 b1c3 f8g7 d2d3 b8c6
info depth 11 time 920 nodes 65381 nps 72771 
info depth 12 score cp 11 time 1350 nodes 94621 nps 71771 pv g1f3 g8f6 g2g3 c7c5 f1g2 b8c6 b1c3 d7d6 e1g1
info depth 12 time 1637 nodes 122582 nps 76679 
info depth 13 score cp 13 time 2523 nodes 182442 nps 74047 pv g1f3 g8f6 g2g3 c7c5 f1g2 b8c6 b1c3 d7d5 e1g1 d5d4
info depth 13 time 3515 nodes 268359 nps 78179 
info depth 14 score cp 11 time 6022 nodes 455321 nps 77424 pv g1f3 g8f6 g2g3 b8c6 f1g2 d7d5 e1g1 g7g6 d2d4 f8g7 b1c3
info depth 14 time 8894 nodes 717331 nps 82589 
info depth 15 score cp 10 time 16515 nodes 1332595 nps 82626 pv g1f3 g8f6 g2g3 b8c6 f1g2 d7d5 e1g1 g7g6 d2d4 f8g7 b1c3 e8g8
info depth 15 score cp 16 time 29096 nodes 2356319 nps 82927 pv e2e4 g8f6 b1c3 b8c6 g1f3 e7e5 f1c4 f8c5 e1g1 e8g8 d2d3 d7d6 c1e3
info depth 15 time 29341 nodes 2379878 nps 83057 
info depth 15 time 29341 nodes 2379878 nps 83057 
go depth 15
info depth 2 time 1 nodes 30 nps 30720 
info depth 3 time 1 nodes 37 nps 37888 
info depth 4 time 2 nodes 43 nps 22016 
info depth 5 time 2 nodes 50 nps 25600 
info depth 6 time 2 nodes 57 nps 29184 
info depth 7 time 2 nodes 63 nps 32256 
info depth 8 time 2 nodes 70 nps 35840 
info depth 9 time 2 nodes 76 nps 38912 
info depth 10 time 2 nodes 83 nps 42496 
info depth 11 time 2 nodes 89 nps 45568 
info depth 12 time 2 nodes 96 nps 49152 
info depth 13 time 2 nodes 102 nps 52224 
info depth 14 score cp 16 time 2 nodes 104 nps 53248 pv e2e4 g8f6 b1c3 b8c6 g1f3 e7e5 f1c4 f8c5 e1g1 e8g8 d2d3 d7d6 c1e3
info depth 14 time 2 nodes 111 nps 56832 
info depth 15 score cp 16 time 2 nodes 113 nps 57856 pv e2e4 g8f6 b1c3 b8c6 g1f3 e7e5 f1c4 f8c5 e1g1 e8g8 d2d3 d7d6 c1e3
info depth 15 time 2 nodes 119 nps 60928 
info depth 15 time 2 nodes 119 nps 60928

You will notice that the first "go depth 15" command takes 29.3 seconds in this case.

The second time I run this it takes 0.002 seconds!

This makes me think that Vas has a rule not to ever overwrite PV nodes, it's probably as simple as that or something similar.

rvida · Post by **rvida** » Tue May 26, 2009 7:17 pm

You will notice that the first "go depth 15" command takes 29.3 seconds in this case.

The second time I run this it takes 0.002 seconds!

This makes me think that Vas has a rule not to ever overwrite PV nodes, it's probably as simple as that or something similar.

Or simply feed back the PV to the TT after end of each iteration/search

bob · Post by **bob** » Tue May 26, 2009 7:28 pm

Don wrote:I'm trying to improve my hash table table implementation after noticing that some programs respond instantly when you re-search the same position to the same depth. For instance from Rybka, "go depth 15" takes 20 seconds the first time, but if then do a "go depth 15" the second time it takes 0.12 seconds. (These are UCI interface commands.)

My own program also responds much quicker the second time, due to information in the hash tables, but not nearly so much as Rybka.

I have experimented so far with 3 basic replacement schemes:

a very basic scheme that does not overwrite a deeper entry, otherwise it always overwrites. Only 1 probe.

A scheme that segments the table into sets of 4 records and the program chooses the "best" entry among the 4. When writing it's possible that no entries are appropriate because it will not overwrite an entry of greater depth.

The standard 2 level table, depth based for one, always overwrite for the other.

In all cases, I always overwrite an entry if the "age" is different, in other words it was created by a previous search.

I am trying to collect information on what variations of schemes people have tried to guide my own experiments. This information is not in one place, but it's scattered throughout various posts and pretty incomplete. They rarely provide the gory details.

What I've also noticed is that in fixed depth 9 ply games some schemes appear to play "stronger" by a few ELO than others. I cannot be completely sure of this since the difference is only about 15 ELO and I have played less than 10,000 games (it could be statistical noise) but it's clear the searches vary signficantly enough that the games with the same openings are not duplicating. With LMR you would clearly expect that, so no surprise. I would also expect that the better implementation could have a slight impact on strength, at least when combined with LMR.

Here is the exact thing I'm doing with method 3 (the standard 2 level table) keeping in mind that AGE always trumps anything else, so an entry is viewed as empty if it was produced by a previous search:

if the depth based entry has a greater depth, use the entry from the always replace table. Whichever is used, gets overwritten regardless of the key. I have not experimented with whether to overwrite the depth entry if the KEY matches (even if the depth in the table is higher.)

Some other details I have heard that people use:

1. I think Bob Hyatt does not overwrite the depth based table when age is CLOSE to the current age.

2. Rybka (it is claimed) does not store scores that are extremely high or low but I'm not sure of the details. It is miserably slower for me when I do this so I cannot imagine what Rybka is doing.

3. Give each hash table entry 2 sets of values, depth based and always overwrite. This is a little different from method 3, the standard 2 level table because the key can be shared.

4. Various probing schemes, such as set associative (which is what seems to work best for me) and linear hashing to find an entry.

5. Depth based 2 level table is a different size (2X ?)

I have found in the simple implementation that you should always overwrite an entry with the same key. I have not tried to experiment with this too much.

There are a zillion details of course on how this might be done and I fear I missing some big point.

The one thing I have planned is to fix something that I am not sure is correct, namely that if the table entry is not from the current search (age test) I overwrite period. I'm not sure that is the best idea for the depth-preferred table, but this probably needs some testing in that going to the always-store will leave a lot of stuff in the depth-preferred table that may well no longer be useful.

BTW I do an exact "age" match. Either the position comes from the current search or it is "stale" and will be overwritten quickly...

I just ran a quick test, setting up a position, telling it to search to depth=16, then backing up one move and telling it to search agan. Here are results:

time=10.18 mat=0 n=16580308 fh=94% nps=1.6M
time=0.09 mat=0 n=132776 fh=93% nps=1.0M

So a re-search does go quickly. If yours is not, you are most likely overwriting hash entries you really want to keep around.

transposition table details

transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details

Re: transposition table details