transposition table details

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transposition table details

Post by bob »

Don wrote:I have a question for Bob Hyatt.

I saw a forum post where you claimed that you do not try to deal with draw by repetition (and presumably by 50 move rule) scores. In my program I do not store any score of zero, but I fear that is a terrible blunder because I get no help from the hash table towards recognizing draw by repetition.

Am I correct in assuming you completely ignore draw issues as far as storing or not storing them in the hash table?

I found a position where my program was sluggish seeing a repetition draw, even compared to many other weaker programs. I'm pretty sure the reason is because of how I deal badly with this case and wonder if just storing them is better than what I'm doing.
You are correct. I have seen many that don't store draws, don't store mate scores (some just store bounds, some ignore mates completely). I chose to ignore this as either solution does not solve the problem we have where table entries are position-based while rep/50move draws are path-based.

I have tried lots of different approaches over the years, but have never found anything that beats the classic "Belle approach" of the two-level table, where one is depth-preferred and the other is "always" store. You really can not afford to just give up and not store a hash entry, because you will lose a lot of "localized" information when you do. Ken's approach was really designed (obviously) for a hardware implementation based on a finite-state machine, since he could not have a variable-time-length hash probe in his hardware. However, it works well. I didn't use this in cray blitz because the vector hardware gave us a better way, but our approach back then was not much better than Belle's. I've been using Ken's idea since early Crafty implementations where I tested every idea I had seen. Bruce Moreland did the same and settled on the Belle approach for the same reason. Can't respond for what others are trying.

For me, fine 70 is a _classic_ to show the problems with path information and draws. Crafty finds the right move quickly, but on occasion will fail low when it realizes that a path that has been hashed is incorrect due to a repetition or 50 move draw way out there. It never changes its mind from the correct move, but at extreme depths (40+) the scores occasionally bounce around due to this. I once thought I had yet another draw bug and spent a week looking at a huge trace of the tree being searched to figure out this was the problem.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transposition table details

Post by bob »

rvida wrote:
You will notice that the first "go depth 15" command takes 29.3 seconds in this case.

The second time I run this it takes 0.002 seconds!

This makes me think that Vas has a rule not to ever overwrite PV nodes, it's probably as simple as that or something similar.
Or simply feed back the PV to the TT after end of each iteration/search
Good suggestion. I do this to make sure the PV is present in the hash table since it can be replaced during a deepish search.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: transposition table details

Post by hgm »

In my engine HaQiKi D I tried a new approach to rep draws, which seems to work OK. When the search bumps into a repetition, I return the score as a depth=0 score even when the originally requestd depth was much larger. That way all the positions leading to it and potentially contaminated by the repetition score (because it was an all-node, or because the repetition occurred in the cut branch) get a lower depth, and thus are flushed quickly from the hash table, or don't have their score used because the draft was not sufficient. As soon as the contamination stops, because a cut-move to an uncontaminated branch is found, the depth of the score jumps back to normal.
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: transposition table details

Post by Don »

bob wrote: I have tried lots of different approaches over the years, but have never found anything that beats the classic "Belle approach" of the two-level table, where one is depth-preferred and the other is "always" store.
I wonder about that. That has been what I've used for years, but recently I tried breaking the entries into sets of 4 entries and then picking out the "worst" one of those to replace. That is giving me a very clear speedup.

But it could be that I could improve on the two-level table. Perhaps I'm doing it wrong.

With the 2 level table approach I try the depth table first, and overwrite if I find a shallower entry there (should I overwrite an equally deep entry here too?)

Otherwise, I just write the entry in the "always store" table.

I can do more experimentation, but I will mention that my table is not stressed - the table is more than large enough. I know this can sometimes make things come out different. But given a table with adequate entries the set method is giving me a good boost. I mention this because the two tables are the same size and I have heard that some use 2X bigger table for the depth preferred table.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: transposition table details

Post by hgm »

When I was tring this, the shallowest-of-4 replacement gave a very clear speedup over shallowest-of-2. Pure replace-always or depth-preferred were total losers. Shallowest-of-7 was a little worse than shallowest-of-4.

So shallowest-of-4, which is what most people use, is close to optimum.

But as the graph shows, equi-distributed draft can still beat it by a factor 1.33 at high loading factor.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: transposition table details

Post by bob »

Don wrote:
bob wrote: I have tried lots of different approaches over the years, but have never found anything that beats the classic "Belle approach" of the two-level table, where one is depth-preferred and the other is "always" store.
I wonder about that. That has been what I've used for years, but recently I tried breaking the entries into sets of 4 entries and then picking out the "worst" one of those to replace. That is giving me a very clear speedup.

But it could be that I could improve on the two-level table. Perhaps I'm doing it wrong.

With the 2 level table approach I try the depth table first, and overwrite if I find a shallower entry there (should I overwrite an equally deep entry here too?)

Otherwise, I just write the entry in the "always store" table.

I can do more experimentation, but I will mention that my table is not stressed - the table is more than large enough. I know this can sometimes make things come out different. But given a table with adequate entries the set method is giving me a good boost. I mention this because the two tables are the same size and I have heard that some use 2X bigger table for the depth preferred table.
I treat my table like a 2-way associative cache. If I replace a depth-preferred entry, I always slide it over to the always-store entry before overwriting. In Cray Blitz, and in very early Crafty versions, I used a set of size N (N on the Cray was 8, where I tried 4 and 8 on Crafty).

I would not begin to say there is not a better approach. I can only say that _I_ have not found anything that worked better. And remember, tree size is not the only thing that matters. If you probe more entries, you are going to hurt performance some no matter what. This performance hit has to also be factored in, which is where testing in real games becomes more important.

BTW that last statement of yours is wrong. The Belle approach used two tables, where the always-store table is 2x bigger than the depth-preferred table. I do that in Crafty, although I do a bit of funny stuff to tie the three entries into one "wad" so that I get 'em all in one cache line fill 1/2 the time.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: transposition table details

Post by diep »

Don wrote:I'm trying to improve my hash table table implementation after noticing that some programs respond instantly when you re-search the same position to the same depth. For instance from Rybka, "go depth 15" takes 20 seconds the first time, but if then do a "go depth 15" the second time it takes 0.12 seconds. (These are UCI interface commands.)

My own program also responds much quicker the second time, due to information in the hash tables, but not nearly so much as Rybka.
Don,

In Diep i also have true bounds, do you?

You mean like this i assume:
depth
New depth(currently 61) :
15
anal
Analysis mode is on
putting engine to search errorlevel=0!
00:00 20 1 0 (4) 1 (0,0) 0.480 Ng1-f3
00:00 60 3 0 (4) 1 (0,0) 0.807 e2-e4
00:00 820 41 0 (4) 2 (0,1) 0.001 e2-e4 e7-e5
00:00 5580 279 0 (4) 3 (1,5) 0.385 e2-e4 d7-d5 Nb1-c3 d5xe4 Nc3xe4
++ g1-f3 procnr=0 terug=386 org=[385;386] newwindow=[385;520000]
00:00 11560 578 0 (4) 3 (2,8) 0.409 Ng1-f3 d7-d5 e2-e3
++ d2-d4 procnr=0 terug=410 org=[409;410] newwindow=[409;520000]
00:00 15040 752 0 (4) 3 (2,11) 0.431 d2-d4 e7-e6 Nb1-c3
00:00 21840 1092 0 (4) 4 (2,18) 0.001 d2-d4 e7-e6 Nb1-c3 Nb8-c6
++ e2-e3 procnr=1 terug=2 org=[1;2] newwindow=[1;520000]
00:00 82614 5783 0 (4) 5 (20,91) 0.425 d2-d4 d7-d5 Ng1-f3 Bc8-f5 Bc1-f4
00:00 138710 13871 0 (4) 6 (32,164) -0.003 d2-d4 Ng8-f6 Nb1-c3 d7-d6 Bc1-f4 Bc8-
f5
++ g1-f3 procnr=3 terug=-2 org=[-3;-2] newwindow=[-3;520000]
00:00 161818 17800 0 (4) 6 (40,223) 0.001 Ng1-f3 Ng8-f6 Nb1-c3 Nb8-c6 d2-d3 e7-e
6
00:00 225090 49520 0 (4) 7 (101,641) 0.504 Ng1-f3 d7-d6 e2-e4 Ng8-f6 Nb1-c3 Bc8-
g4 Bf1-c4
00:00 244612 78276 0 (4) 8 (144,947) -0.002 Ng1-f3 d7-d6 Nb1-c3 Nb8-c6 d2-d3 Ng8
-f6 Bc1-f4 e7-e5
++ d2-d4 procnr=2 terug=-1 org=[-2;-1] newwindow=[-2;520000]
++ e2-e4 procnr=2 terug=-1 org=[-2;-1] newwindow=[-2;520000]
++ b1-c3 procnr=2 terug=-1 org=[-2;-1] newwindow=[-2;520000]
++ d2-d3 procnr=0 terug=-1 org=[-2;-1] newwindow=[-2;520000]
++ e2-e3 procnr=3 terug=-1 org=[-2;-1] newwindow=[-2;520000]
00:00 386156 312787 0 (4) 9 (275,2216) 0.365 Ng1-f3 d7-d5 Nb1-c3 Bc8-f5 d2-d4 Nb
8-a6 Bc1-f4 Na6-b4 Ra1-c1
++ d2-d4 procnr=1 terug=366 org=[365;366] newwindow=[365;520000]
00:00 411024 406914 0 (4) 9 (297,2432) 0.464 d2-d4 Ng8-f6 Ng1-f3 e7-e6 Nb1-c3 Bf
8-b4 Bc1-g5 Bb4xc3 b2xc3 d7-d5
00:01 440695 572904 0 (4) 10 (341,2960) 0.018 d2-d4 d7-d5 Nb1-c3 Bc8-f5 Bc1-g5 N
g8-f6 e2-e3 Nb8-c6 Bf1-b5 e7-e6
++ g1-f3 procnr=1 terug=19 org=[18;19] newwindow=[18;520000]
00:01 453335 652803 0 (4) 10 (373,3428) 0.146 Ng1-f3 d7-d5 Nb1-c3 Bc8-f5 d2-d3 N
g8-f6 Nf3-d4 Bf5-g4 h2-h3 e7-e5 h3xg4 e5xd4
00:02 478591 1143834 0 (4) 11 (545,5815) 0.370 Ng1-f3 Ng8-f6 Nb1-c3 e7-e6 d2-d4
Bf8-b4 Qd1-d3 Bb4xc3 b2xc3 Nb8-c6 Bc1-g5 d7-d5
00:04 513530 2126017 0 (4) 12 (720,8278) 0.154 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4
Bc8-g4 e2-e3 e7-e6 Bf1-b5 Bf8-b4 c2-c3 Bb4-d6 Bb5xc6 b7xc6 Bf4xd6 c7xd6
++ d2-d4 procnr=0 terug=155 org=[154;155] newwindow=[154;520000]
00:05 540202 3143976 0 (4) 12 (873,10284) 0.182 d2-d4 Ng8-f6 Nb1-c3 d7-d5 Ng1-f3
Bc8-f5 Nf3-h4 Bf5-g4 f2-f3 Bg4-h5 g2-g4 Bh5-g6 Nh4xg6 h7xg6
00:11 563638 6746753 0 (4) 13 (1200,15505) 0.226 d2-d4 d7-d5 Ng1-f3 Nb8-c6 Bc1-f
4 Bc8-f5 c2-c3 Ng8-f6 Qd1-b3 e7-e5 d4xe5 Nf6-e4 Nb1-d2
00:24 570693 14255918 0 (4) 14 (1782,24513) 0.139 d2-d4 d7-d5 Bc1-f4 Ng8-f6 e2-e
3 Bc8-g4 Bf1-e2 Bg4xe2 Qd1xe2 Nb8-d7 Nb1-c3 Rh8-g8
++ g1-f3 procnr=2 terug=140 org=[139;140] newwindow=[139;520000]
00:29 584888 17482327 0 (4) 14 (1849,26054) 0.184 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-
f4 Ng8-f6 Nb1-c3 Nf6-h5 Bf4-d2 Bc8-f5 e2-e3 Nh5-f6 Bf1-d3 Nf6-e4 Nc3xe4 d5xe4
00:47 590558 27850748 0 (4) 15 (2125,29909) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-
f4 Ng8-f6 Nb1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
AbortAll
time taken to abort = 0 centiseconds
anal
Analysis mode is on
putting engine to search errorlevel=0!
00:00 33 1 0 (4) 1 (0,0) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb1-c
3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 100 3 0 (4) 1 (0,0) 0.385 e2-e4 d7-d5 e4xd5 Ng8-f6 Bf1-b5 Bc8-d7 Nb1-c3
Nf6xd5 Bb5xd7 Qd8xd7 Qd1-h5 Nd5xc3 a2-a3 Nc3-e4 f2-f4
00:00 700 21 0 (4) 2 (0,1) 0.385 e2-e4 d7-d5 e4xd5 Ng8-f6 Bf1-b5 Bc8-d7 Nb1-c
3 Nf6xd5 Bb5xd7 Qd8xd7 Qd1-h5 Nd5xc3 a2-a3 Nc3-e4 f2-f4
00:00 1366 41 0 (4) 3 (0,2) 0.385 e2-e4 d7-d5 e4xd5 Ng8-f6 Bf1-b5 Bc8-d7 Nb1-c
3 Nf6xd5 Bb5xd7 Qd8xd7 Qd1-h5 Nd5xc3 a2-a3 Nc3-e4 f2-f4
00:00 4100 123 0 (4) 4 (0,6) 0.001 e2-e4 d7-d5 e4xd5 Ng8-f6 Bf1-b5 Bc8-d7 Nb1-
c3 Nf6xd5 Bb5xd7 Qd8xd7 Qd1-h5 Nd5xc3 a2-a3 Nc3-e4 f2-f4
++ g1-f3 procnr=2 terug=193 org=[1;2] newwindow=[1;520000]
00:00 4166 125 0 (4) 4 (0,6) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb1
-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 4800 144 0 (4) 5 (0,7) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb1
-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 5466 164 0 (4) 6 (0,8) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb1
-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 6133 184 0 (4) 7 (0,9) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb1
-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 6800 204 0 (4) 8 (0,10) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb
1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 7466 224 0 (4) 9 (0,11) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 Nb
1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 8133 244 0 (4) 10 (0,12) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 8800 264 0 (4) 11 (0,13) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 9466 284 0 (4) 12 (0,14) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 10133 304 0 (4) 13 (0,15) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 10800 324 0 (4) 14 (0,16) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
00:00 11466 344 0 (4) 15 (0,17) 0.193 Ng1-f3 Nb8-c6 d2-d4 d7-d5 Bc1-f4 Ng8-f6 N
b1-c3 e7-e6 Nc3-b5 Bf8-d6 Nb5xd6 c7xd6 e2-e3 Nf6-e4 Bf1-d3 Bc8-d7
AbortAll
time taken to abort = 0 centiseconds
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: transposition table details

Post by diep »

I award a price for the invention of the wording:
"equidistributed-depth"

The price to be given in beer or wine form at the upcoming ICT tournament in Leiden to Harm-Geert Muller.

The official price name is: "hiding crappy research by inventing new complex descriptive name"

Vincent
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: transposition table details

Post by mjlef »

Don,

I used a kind of "real depth" as follows:

a. Always repace entries with hash key match (of course)
b. Store this value in the "real depth" part of the hash key:

depth+age

where "age" is the number of ply from the beginning of the game. For example, if we are in the 5th move of the game (10 half moves), and in the current position we have completed a depth=5 search, I store 5+10 in the "realdepth" part of the hash. I do not store age at all in the hash table.

Now, just always replace the lowest values "realdepth" entry. As the game goes on, and the age value increases, older entires get automatically over written, but deeper searches hang around longer, as I think they should.

Note realdepth is just used to decide on replacement. I also store the search depth in the table, of course. And I clear the table at the start of a new game, since age gets reset then also.

Try this. very easy to implement, and it has tested better than anything else I have come up with.

Mark
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: transposition table details

Post by mcostalba »

Don wrote:
You will notice that the first "go depth 15" command takes 29.3 seconds in this case.

The second time I run this it takes 0.002 seconds!

This makes me think that Vas has a rule not to ever overwrite PV nodes, it's probably as simple as that or something similar.
I have played with an idea that I called "Iterative deepening shortcut"

This is how it works.

- At the end of a search say at depth 13 we return the best move and store the ponder move and also the PV

- If the next opponent move is the ponder move we will detect this (because we remember previous ponder move) and so we know that the PV we have stored is the best one until depth 13 - 2 = 11

- So we try the PV and do not change it until we don't reach at least depth 11. This avoids false fail highs at lower depth that we know in advance will be reverted by a next search at deeper depths. This also avoids spikes and instability at very thight times when you have to return the best move before to reach the previous depth -2.

I have still not managed to make it work right, but I would think it has some potential. Another possibility is to skip completely the Iterative Loop and jump directly to Iteration 11 because we know the PV that we have stored is the best until that depth.

Have someone tested this idea before ?

Thanks
Marco