Hash table division

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Hash table division

Post by diep »

BubbaTough wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
One possibility is that if the hash associativity is the same (4 seems to be usual), then you are getting a better hit rate, because none of the opposite color_to_move positions are competing for the same spots.

-Sam
The table is 2x smaller, don't forget to factor that in :)
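
To make the two schemes concrete, here is a minimal C sketch: one shared table with the side to move XORed into the Zobrist key, versus the two per-color tables Sam's reasoning implies, each half the size. The entry layout, table sizes and the stm_key constant are illustrative assumptions, not anyone's actual engine code.

Code: Select all

#include <stdint.h>

#define ENTRIES (1u << 16)   /* total entry budget, a power of two; illustrative */

typedef struct { uint64_t key; int16_t score; uint8_t depth; } Entry;

/* Scheme A: one shared table; the side to move is XORed into the
   Zobrist key, so WTM and BTM positions compete for the same slots. */
static Entry shared_tt[ENTRIES];
static const uint64_t stm_key = 0x9D39247E33776D41ull; /* arbitrary Zobrist constant */

Entry *probe_shared(uint64_t zobrist, int wtm)  /* wtm is 0 or 1 */
{
    uint64_t key = wtm ? zobrist ^ stm_key : zobrist;
    return &shared_tt[key & (ENTRIES - 1)];
}

/* Scheme B: two per-color tables, each half the size.  A WTM entry
   can never evict a BTM entry, but each side only sees half the RAM. */
static Entry split_tt[2][ENTRIES / 2];

Entry *probe_split(uint64_t zobrist, int wtm)
{
    return &split_tt[wtm][zobrist & (ENTRIES / 2 - 1)];
}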
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Hash table division

Post by diep »

bob wrote:
diep wrote:
bob wrote:
sje wrote:Symbolic has always done this with one table for WTM and one for BTM.
I simply xor a STM value. I'm not sure I'd want two different tables. What happens in positions where one side has lots of mobility and one side is very restricted? One table fills up, the other is barely used...

I've never thought, in any scenario, that two separate tables work, unless you talk about two tables in the Belle sense, where one is a depth-preferred replacement strategy while the other is an always-store table. But for Belle, both store both WTM/BTM entries, so there is no "side-to-move boundary".
Actually Bob, you were doing 2 totally different lookups until the start of this century. Even for some years after I noticed it and wrote about it several times in the forum, Crafty kept losing system time to 2 random probes.
I already mentioned that as the "Belle approach" in my post above. But the table was not divided into a black and a white part. Never done that. What I do today, I did in Cray Blitz back in the 80's as anyone can check (we used 8-entry buckets on the Cray since vector loads made that a "for free" operation, as opposed to 4-entry buckets today). And we never had a separate table for black and white, because nothing makes those positions balance evenly.
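
For reference, a sketch of the Belle-style pairing described above: a depth-preferred table next to an always-store table, with both colors sharing both tables. The entry layout and the exact replacement test are assumptions for illustration, not Crafty's or Cray Blitz's actual code.

Code: Select all

#include <stdint.h>

#define TT_BITS 16
#define TT_SIZE (1u << TT_BITS)

typedef struct { uint64_t key; int16_t score; uint8_t depth; uint8_t flags; } Entry;

static Entry depth_pref[TT_SIZE];   /* keeps the deepest result seen    */
static Entry always[TT_SIZE];       /* unconditionally keeps the newest */

void tt_store(uint64_t key, int score, int depth, int flags)
{
    Entry *d = &depth_pref[key & (TT_SIZE - 1)];
    if (depth >= d->depth) {        /* deeper (or equal) draft wins     */
        d->key = key; d->score = (int16_t)score;
        d->depth = (uint8_t)depth; d->flags = (uint8_t)flags;
    } else {                        /* loser spills to the always-store */
        Entry *a = &always[key & (TT_SIZE - 1)];
        a->key = key; a->score = (int16_t)score;
        a->depth = (uint8_t)depth; a->flags = (uint8_t)flags;
    }
}

const Entry *tt_probe(uint64_t key)
{
    const Entry *d = &depth_pref[key & (TT_SIZE - 1)];
    if (d->key == key) return d;
    const Entry *a = &always[key & (TT_SIZE - 1)];
    return a->key == key ? a : 0;
}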


Now I'll say this in my own words: when it was convenient for wintel programmer Eugene Nalimov to change that, he did so.
Of course not for his employer; it was done well in time, before AMD got a faster latency to RAM with the Opteron and its built-in memory controller in the K8, so we can't prove any relationship there. Since then you're also doing sequential probes, Bob :)
And not a day before Eugene modified it... :)

Vincent

p.s. Normally speaking, SPECint2004 would have been released in 2004, and Crafty was also a candidate for SPECint2004. Intel didn't have an on-die memory controller until years later, so that Nalimov changed that, despite me writing about it for years in the forum, was a most lucky change for wintel. In the years before, despite me writing it regularly in the CCC, the change wasn't convenient for them, as in the era before on-die memory controllers Intel had the fastest random latency. That changed to AMD's advantage when in April 2003 they released the Opteron, which had an on-die memory controller.
Interesting to see how $100 billion companies are still dependent there upon lucky coincidences... delaying this change that long really was lucky for them...
Of course everything you write is correct - you were doing 2 random probes. 2 sequential probes are a lot faster though. That's the change I am speaking about.
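
The cache effect being pointed at here, in sketch form: two probes at unrelated indices likely cost two cache misses, while a 4-entry bucket at a single index fits one 64-byte line (assuming 16-byte entries and a 64-byte-aligned table). Illustrative only, not Crafty's actual probe code.

Code: Select all

#include <stdint.h>

#define TT_SIZE (1u << 16)

typedef struct { uint64_t key; int16_t score; uint8_t depth; uint8_t pad[5]; } Entry; /* 16 bytes */

static Entry tt[TT_SIZE];

/* Two "random" probes: the indices come from unrelated key bits, so
   the second probe is almost certainly a second cache miss. */
const Entry *probe_random2(uint64_t key)
{
    const Entry *e = &tt[key & (TT_SIZE - 1)];
    if (e->key == key) return e;
    e = &tt[(key >> 32) & (TT_SIZE - 1)];
    return e->key == key ? e : 0;
}

/* Sequential probes: the bucket index is rounded to a multiple of 4,
   so with 16-byte entries and a 64-byte-aligned table all four
   candidates live in one cache line. */
const Entry *probe_bucket(uint64_t key)
{
    const Entry *b = &tt[key & (TT_SIZE - 4)];  /* index rounded to bucket */
    for (int i = 0; i < 4; i++)
        if (b[i].key == key) return &b[i];
    return 0;
}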
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Hash table division

Post by bob »

Rebel wrote:
bob wrote:
Rebel wrote:When implementing hash tables in the days that memory was limited (1 Mb if you were rich) I set-up my hash table data structure as follows:

- One hash table for odd plies
- One hash table for even plies

And wherever possible make the odd-ply HT twice as big as the even-ply HT, because that one is accessed the most due to the AB behaviour. It also worked great in case of a full HT. Anno 2012 I wonder if that's still the case.

Anyone using a somewhat similar system?
You realize that if you do an even-ply search, 1/2 of the hash stores are going to be from that last layer of even-ply searches??? That is, for every depth N-1 node where N-1 is odd, you will always see a depth N search (and probe).

I don't see why this makes any sense at all...
I have a display that gives the percentages to which both HT tables fill, and the odd-ply HT fills a lot faster than the even-ply HT. It's exactly what one might expect from a well-ordered search. And so it makes sense to declare a bigger odd-ply HT than even-ply HT. Just try it yourself.
See my response to Sam. I simply counted wtm and btm positions throughout the search (none in q-search since I don't hash there) and found a reasonable balance...

I'll post the numbers in a few minutes, will have to extract and format...

Here we go.

First position is BTM, kopec#22, searched to depth=22
wtm=10,718,001 btm=8,587,532
time=39.66 mat=0 n=118,450,786 fh=95% nps=3.0M

The "even plies" (wtm, since BTM was root) were bigger, but not by a lot.

Next position is WTM, kopec#24, searched to depth=19
wtm=5411683 btm=6442429
time=49.83 mat=0 n=109541418 fh=93% nps=2.2M

Even plies (btm this time) slightly larger.

Next (and final) position is wtm, kopec#21, searched to depth=20
wtm=6760964 btm=3109970
time=15.51 mat=0 n=53688784 fh=94% nps=3.5M

This time the odd plies (wtm) are bigger.

I ran the entire set and saw absolutely no consistency, other than that the totals seemed to bounce around, favoring one side or the other about an equal number of times...

Here's a few others just for fun, but just the wtm/btm counters:

wtm=7060090 btm=6452962
wtm=3893962 btm=3689548
wtm=9946589 btm=16398992
wtm=4324817 btm=6880681
wtm=8218700 btm=5259557



Those last ones were searched with a fixed time limit rather than a fixed depth...

Seems to be no correlation between STM and node distribution, nor odd/even ply depth.
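
For comparison with the numbers above, Ed's odd/even division as described upthread, in hypothetical sketch form; the sizes and entry layout are placeholders, not Rebel's actual implementation.

Code: Select all

#include <stdint.h>

#define EVEN_SIZE (1u << 16)
#define ODD_SIZE  (1u << 17)   /* odd-ply table twice as big */

typedef struct { uint64_t key; int16_t score; uint8_t depth; } Entry;

static Entry even_tt[EVEN_SIZE];
static Entry odd_tt[ODD_SIZE];

/* Pick the table by the parity of the ply the position sits at. */
Entry *probe_by_ply(uint64_t key, int ply)
{
    if (ply & 1)
        return &odd_tt[key & (ODD_SIZE - 1)];
    return &even_tt[key & (EVEN_SIZE - 1)];
}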
Rebel
Posts: 6997
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Hash table division

Post by Rebel »

Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
When the HT becomes full the branch factor goes up. It helps then if the HT that is used most has a bigger size.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Hash table division

Post by diep »

Rebel wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
When the HT becomes full the branch factor goes up. It helps then if the HT that is used most has a bigger size.
Thumbs up

p.s. Just got back in my seat for the Muppet show; think I'm gonna sit here and watch it all play out

...swap swap swap
Last edited by diep on Sat Apr 07, 2012 12:28 am, edited 1 time in total.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Hash table division

Post by Houdini »

Rebel wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
When the HT becomes full the branch factor goes up. It helps then if the HT that is used most has a bigger size.
It helps if the HT has a bigger size.
That's actually a very good argument for NOT splitting the HT.
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Hash table division

Post by BubbaTough »

diep wrote:
BubbaTough wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
One possibility is that if the hash associativity is the same (4 seems to be usual), then you are getting a better hit rate, because none of the opposite color_to_move positions are competing for the same spots.

-Sam
The table is 2x smaller, don't forget to factor that in :)
The total size allocated for hash is the same, it's just a matter of what spot a position can go into. I would guess for most searches nowadays it's about 50/50 between black and white to move, which Bob's results seem to support. So splitting the hash into 2 hashes may very well be a good idea, but they should probably be the same size for most.

-Sam
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Hash table division

Post by diep »

BubbaTough wrote:
diep wrote:
BubbaTough wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
One possibility is that if the hash associativity is the same (4 seems to be usual), then you are getting a better hit rate, because none of the opposite color_to_move positions are competing for the same spots.

-Sam
The table is 2x smaller, don't forget to factor that in :)
The total size allocated for hash is the same, it's just a matter of what spot a position can go into. I would guess for most searches nowadays it's about 50/50 between black and white to move, which Bob's results seem to support. So splitting the hash into 2 hashes may very well be a good idea, but they should probably be the same size for most.

-Sam
Ed's assumption is that his table is 1.5 times the size you can allocate.

He's using an AND instruction to find the index and if you have 1 GB of RAM in total then you can't use the entire RAM for hashtable.

So YOUR hashtable then is 512MB and HIS hashtable then is 768MB.

Get the idea?

Now Diep's hashtable implementation is slow anyway, so I 'waste' a single multiplication instruction (thanks to Dieter Buerssner for explaining that to me some 10+ years ago) and can allocate 899MB maybe, so I beat both of you - but that's not the discussion here :)

Are you using an AND instruction?

p.s. Ed always got a huge nps, and remember those cards he sold back in the day - the ChessMachine Schröder - they had in total like 1MB of RAM or so; hashtable size matters a lot then.

p.s.2 Contrary to my expectation, Kermit scored in the Muppet Show yesterday, so the standings right now are: Kermit 1, rest of the Muppet Show 0. Not bad, knowing Kermit has already been living off his pension for a while...
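
The multiplication trick credited to Dieter Buerssner above, sketched with the GCC/Clang-specific 128-bit integer type: taking the high half of a 64x64-bit product maps a key uniformly onto any table size, so the entry count no longer needs to be a power of two. This is a sketch of the idea, not Diep's actual code.

Code: Select all

#include <stdint.h>

/* AND indexing: the entry count must be a power of two, so with
   1 GB of RAM you may only get to use 512 MB of it for hash. */
static inline uint64_t index_and(uint64_t key, uint64_t n_entries)
{
    return key & (n_entries - 1);   /* valid only if n_entries == 2^k */
}

/* Multiply indexing: n_entries can be anything (e.g. 1.5x bigger),
   at the cost of one multiplication.  The high 64 bits of the
   128-bit product land uniformly in [0, n_entries). */
static inline uint64_t index_mul(uint64_t key, uint64_t n_entries)
{
    return (uint64_t)(((unsigned __int128)key * n_entries) >> 64);
}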
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Hash table division

Post by hgm »

Rebel wrote:I said: The perfect search (except for the best move) ....

To emphasize how a well-ordered search behaves, resulting in more odd-ply moves to store in the HT.

Besides, in your example (1.e4 a6), logically, considering that (1.e4 e5) has already been searched, not a single ply-3 move will give a beta cut-off.
No, I still don't get it. 'Perfect' search does not mean you search only the PV, right? You just search it first, and search cut-moves before other moves in non-PV nodes. Otherwise it would not really be a 'search', it would just be walking a known PV.

So after 1.e4 e5 (the PV) has been searched, all other ply-2 moves will be refuted by ply-3 cut moves (e.g. 1.e4 a6 2.Nf3!). All 19 sub-optimal replies to 1.e4 will be refuted by such a ply-3 cut move. Just as all 19 sub-optimal white first moves will be cut by a single cut-move on ply-2, which will lead to an all-node at ply-3. So ply-3 has 19 all-nodes and 19 cut-nodes.
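
A 'perfect' search in this sense is the minimal alpha-beta tree of Knuth and Moore, in which the number of nodes at ply k, for a uniform branching factor b, is b^ceil(k/2) + b^floor(k/2) - 1. A toy program to print that distribution (b = 20 is just a chess-like assumption):

Code: Select all

#include <stdio.h>
#include <stdint.h>

/* Integer power, small exponents only. */
static uint64_t ipow(uint64_t b, int e)
{
    uint64_t r = 1;
    while (e-- > 0) r *= b;
    return r;
}

int main(void)
{
    const uint64_t b = 20;   /* roughly chess-like branching factor */
    for (int k = 0; k <= 8; k++) {
        /* nodes at ply k in the minimal tree (Knuth & Moore) */
        uint64_t nodes = ipow(b, (k + 1) / 2) + ipow(b, k / 2) - 1;
        printf("ply %d: %12llu nodes\n", k, (unsigned long long)nodes);
    }
    return 0;
}

Whatever parity the deepest searched ply has, it dominates the totals, which lines up with bob's earlier objection about the last layer of an even-ply search.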
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Hash table division

Post by BubbaTough »

diep wrote:
BubbaTough wrote:
diep wrote:
BubbaTough wrote:
Houdini wrote:
Rebel wrote:
Houdini wrote:
Rebel wrote:Of course I agree with you that a faster search should be the end result and for me it does, I am just trying to explain the logic behind the approach.
Why does it result in a faster search?
Because of testing, what else is the final measurement?
You didn't answer the question.
WHY would splitting up the hash table accelerate the search?
One possibility is that if the hash associativity is the same (4 seems to be usual), then you are getting a better hit rate, because none of the opposite color_to_move positions are competing for the same spots.

-Sam
The table is 2x smaller, don't forget to factor that in :)
The total size allocated for hash is the same, it's just a matter of what spot a position can go into. I would guess for most searches nowadays it's about 50/50 between black and white to move, which Bob's results seem to support. So splitting the hash into 2 hashes may very well be a good idea, but they should probably be the same size for most.

-Sam
Ed's assumption is that his table is 1.5 times the size you can allocate.

He's using an AND instruction to find the index and if you have 1 GB of RAM in total then you can't use the entire RAM for hashtable.

So YOUR hashtable then is 512MB and HIS hashtable then is 768MB.

Get the idea?

Now Diep's hashtable implementation is slow anyway, so I 'waste' a single multiplication instruction (thanks to Dieter Buerssner for explaining that to me some 10+ years ago) and can allocate 899MB maybe, so I beat both of you - but that's not the discussion here :)

Are you using an AND instruction?

p.s. Ed always got a huge nps, and remember those cards he sold back in the day - the ChessMachine Schröder - they had in total like 1MB of RAM or so; hashtable size matters a lot then.

p.s.2 Contrary to my expectation, Kermit scored in the Muppet Show yesterday, so the standings right now are: Kermit 1, rest of the Muppet Show 0. Not bad, knowing Kermit has already been living off his pension for a while...
Ahhh...OK. Thanks for explaining it.

-Sam