Hash table division

diep · Post by **diep** » Fri Apr 06, 2012 4:29 pm

Rebel wrote:
diep wrote:
Rebel wrote:
bob wrote:
Rebel wrote:When implementing hash tables in the days that memory was limited (1 Mb if you were rich) I set-up my hash table data structure as follows:

- One hash table for odd plies
- One hash table for even plies

And where ever possible make the odd ply HT twice as big as the even ply HT because that one is accessed the most due to the AB behaviour. And it also worked great in case of a full HT. Anno 2012 I wonder if that's still the case.

Anyone using a somewhat similar system ?
You realize that if you do an even-ply search, 1/2 of the hash stores are going to be from that last layer of even-ply searches??? That is, for every depth N-1 node where N-1 is odd, you will always see a depth N search (and probe).

I don't see why this makes any sense at all...
I have a display, it gives the percentages how both HT tables fill and the odd-ply HT fills a lot faster than the even ply HT. It's exactly what one might expect from a well ordered search. And so it makes sense to declare a bigger odd-ply-HT than an even-ply-HT. Just try yourself.
Maybe Bob didn't understand you mean odd iterations versus even iterations.
That's of course possible. But I mean from the root.
Code: Select all
ply-1 e2-e4  &#40;odd&#41;   --> into odd-HT
ply-2 e7-e5  &#40;even&#41;  --> into even-HT
ply-3 Ng1-f3 &#40;odd&#41;   --> into odd-HT
ply-4 Nb8-c6 &#40;even&#41;  --> into even HT
Hope that clears things up

Yes very clear - sorry for the confusion.

Note that it doesn't seem like a very efficient scheme at todays computing Ed.

As from the mainline you'll get a huge amount of nodes where the even/odd just is reversed thanks to mainline reasons.

Ack - i should say : *behind* the mainline you get a lot of nodes with a minimal search window where even/odd just is the reverse of the allnode/cutnode property from what you wrote down.

You basically sort it now colorbased if i may say so.
In Diep i'm not doing that.

In fact i take care that positions from black and white that are the same, get stored nearby each other.

Advantage is that i can reuse the evaluation value better then, as the evaluation from white is of course -black ones. As i store it always from white side viewpoint no big deal in fact.

Besides that of course we have already probably in one of our caches this c acheline now from the RAM, so if we nullmove we get our hashentry for near free from the hashtable.

diep · Post by **diep** » Fri Apr 06, 2012 4:47 pm

Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?

We can prove it is slower at todays RAM.

He's fetching in turn from the white table and the black table.

So odds is bigger that the distance within the RAM between the 2 fetches is larger on average than if we use 1 table, which is slower.

Of course it'll be maybe 1 nanosecond, so near impossible to measure

hgm · Post by **hgm** » Fri Apr 06, 2012 6:24 pm

Rebel wrote:From the root. The perfect search (except for the best move) goes like this:

odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)
odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)

etc.

No, I don't get this. The odd plies (except for the root) have beta cutoffs too, right? It just depends on where the sub-tree branched from the PV. Not how far it is from the root. E.g. after

ply-1: e2-e4
ply-2: a7-a6

every odd ply starting from ply-3 will be a beta cutoff, and every even ply an all-node.

Rebel · Post by **Rebel** » Fri Apr 06, 2012 6:53 pm

Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?

Because of testing, what else is the final measurement?

Rebel · Post by **Rebel** » Fri Apr 06, 2012 7:05 pm

hgm wrote:
Rebel wrote:From the root. The perfect search (except for the best move) goes like this:

odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)
odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)

etc.
No, I don't get this. The odd plies (except for the root) have beta cutoffs too, right? It just depends on where the sub-tree branched from the PV. Not how far it is from the root. E.g. after

ply-1: e2-e4
ply-2: a7-a6

every odd ply starting from ply-3 will be a beta cutoff, and every even ply an all-node.

I said: The perfect search (except for the best move) ....

To emphasize how a well ordered search behaves resulting in more odd-ply-moves to store in the HT.

Besides, in your example (1.e4 a6) logically considering (1.e4 e5) already has been searched no single ply-3 move will give a beta-cut-off.

diep · Post by **diep** » Fri Apr 06, 2012 7:14 pm

hgm wrote:
Rebel wrote:From the root. The perfect search (except for the best move) goes like this:

odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)
odd - xxxxxxxxxxxxxxxxxxxxx (all moves searched) (no beta cut-off)
even - x (beta cut-off) (only one move searched)

etc.
No, I don't get this. The odd plies (except for the root) have beta cutoffs too, right? It just depends on where the sub-tree branched from the PV. Not how far it is from the root. E.g. after

ply-1: e2-e4
ply-2: a7-a6

every odd ply starting from ply-3 will be a beta cutoff, and every even ply an all-node.

If you do things in odd and even then strictly spoken you have a black table and a white table, so doesn't hurt performance other than when you have more tricks in hashtable like i do with Diep and/or a caching difference when you use nullmove bigtime.

So Ed doesn't need to store what side he's storing for.

Saves out 1 bit of storage, but in itself from transposition viewpoint it has the same performance like one big table.

Vincent

Houdini · Post by **Houdini** » Fri Apr 06, 2012 7:24 pm

Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?

You didn't answer the question.
WHY would splitting up the hash table accelerate the search?

BubbaTough · Post by **BubbaTough** » Fri Apr 06, 2012 7:29 pm

Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?

One possibility is if the hash associativity is the same (seems 4 is usual) then you are getting a better hit rate because none of the opposite color_to_move positions are competing for the same spots.

-Sam

bob · Post by **bob** » Fri Apr 06, 2012 8:58 pm

diep wrote:
bob wrote:
sje wrote:Symbolic has always done this with one table for WTM and one for BTM.
I simply xor a STM value. I'm not sure I'd want two different tables. What happens in positions where one side has lots of mobility and one side is very restricted? One table fills up, the other is barely used...

I've never thought, in any scenario, that two separate tables works, unless you talk about two tables in the Belle sense, where one is a depth-preferred replacement strategy while the other is an always-store table. But for Belle, both store both WTM/BTM entries, so there is no "side-to-move boundary".
Actually Bob you were doing 2 total different lookup until start of this century. Even some years after i noticed it and wrote about it several times in the forum, Crafty kept losing system time to 2 random probes.

I already mentioned that as the "Belle approach" in my post above. But the table was not divided into a black and a white part. Never done that. What I do today, I did in Cray Blitz back in the 80's as anyone can check (we used 8-entry buckets on the Cray since vector loads made that a "for free" operation, as opposed to 4-entry buckets today). And we never had a separate table for black and white, because nothing makes those positions balance evenly.

Now i say this in my own words: when it was convenient to wintel programmer Eugene Nalimov, to change that, he did do so.
Of course not for his employer, it was done well in time before AMD had a faster latency to the RAM with opteron and built in memory controller in k8, so we can't prove any relationship there. Sincethen you're also doing sequential probes Bob
And not a day before until Eugene modified that...

Vincent

p.s. normally spoken specint2004 would have released in 2004 and crafty also was a candidate for specint2004. Intel didn't have an on die memory controller until years later, so that Nalimov changed that, despite me writing it for years into the forum, was a most lucky change for wintel, which the years before that, despite mewriting it regurary in the CCC, wasn't convenient for them as during the on die memory controllers, intel had the fastest random latency. That changed to AMD's advantage when april 2003 they released the opteron that had an on die memory controller.
Interesting to see how $100 billion companies still are dependant there upon lucky coincidences... ...delaying this change that long really was lucky for them....

bob · Post by **bob** » Fri Apr 06, 2012 9:01 pm

BubbaTough wrote:
bob wrote:
Rebel wrote:When implementing hash tables in the days that memory was limited (1 Mb if you were rich) I set-up my hash table data structure as follows:

- One hash table for odd plies
- One hash table for even plies

And where ever possible make the odd ply HT twice as big as the even ply HT because that one is accessed the most due to the AB behaviour. And it also worked great in case of a full HT. Anno 2012 I wonder if that's still the case.

Anyone using a somewhat similar system ?
You realize that if you do an even-ply search, 1/2 of the hash stores are going to be from that last layer of even-ply searches??? That is, for every depth N-1 node where N-1 is odd, you will always see a depth N search (and probe).

I don't see why this makes any sense at all...
Given the variety in search depths that goes on in current programs, I would think it would be pretty hard to guess how things will work...real data seems necessary with funny effects from LMR, null move, and so on.

-Sam

I actually tested this just now, although not an extensive test. Since I don't hash in q-search, I ignored those positions, but for all the rest, I simply created a pair of counters like this: "int64_t node_count[2]" and at the top of search, I incremented the counter depending on side to move. Bottom line, the numbers vary. For 4 initial wtm positions, normal search of 60 seconds, they were roughly 50-50 as to wtm/btm nodes. Ditto for 4 initial BTM positions. I also tried fixed depth to depth N and then N+1, and the same was seen. No "pattern" of one being always larger than the other.

I don't think the idea is worthwhile and can actually hurt since the smaller table will be stressed more.

Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division

Re: Hash table division