Transposition Table and nps drop

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mathmoi
Posts: 286
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec

Transposition Table and nps drop

Post by mathmoi »

Hi,

My engine is actually using a simple minimax (no AlphaBeta) function for the moment. I want to get it completely functional before I write the AB search and all the other improvements.

This week I implemented a transposition table, because I want to be able to retrieve the principal variation from it. So I implemented it into my Minimax function and I get confused by the results.

If I use a small hash table of 4Kb (256 entries), the search for my test positions takes 34.4s at 8.5M nps searching 293M nodes.

If I use a 32Mb hash table the search takes 22.6s at 5M nps searching 112M nodes

and if I use a 256Mb the search takes 25.2s at 4.3M nps searching 107M nodes.

I was expecting a drop in nodes per seconds, but it seems that with a 256Mb hash tables, the nps drop by about 50% making the search even slower than with a 32 Mo hash table.

However, when I profile my engine, the profiling code seems to use only <1% of the total cpu time. What does that means? Do I have a bug or is this due the cache miss the huge table will inevitably cause?
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Table and nps drop

Post by hgm »

How do you count your nodes? If a move leads to a position that is found in the hash table, do you count that as a node, or do you only count nodes where you do move generation, or only count leave nodes where you do evaluation?

What is your eplacement algorithm? Do you replace everything, or do you make an attempt to spare deep eentries

How many test positions do you average (or sum) to get the quoted times?
mathmoi
Posts: 286
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec

Re: Transposition Table and nps drop

Post by mathmoi »

hgm wrote:How do you count your nodes? If a move leads to a position that is found in the hash table, do you count that as a node, or do you only count nodes where you do move generation, or only count leave nodes where you do evaluation?

What is your eplacement algorithm? Do you replace everything, or do you make an attempt to spare deep eentries

How many test positions do you average (or sum) to get the quoted times?
Hi,

I count every calls to Minimax, so it counts internal nodes and leaf nodes. At a position if a move is found in the transposition table it's counted as 1 node.

I actually use a replace if deeper or equal scheme. I'll try to use a replace always one to see what happens.

The data I provided are the sum of 6 positions searched to depth 5 or 6. The positions are the ones used by the benchmark command of Crafty.

Thanks for your help.
mathmoi
Posts: 286
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec

Re: Transposition Table and nps drop

Post by mathmoi »

Hi,

I just tried the sames test with a replace always scheme and the results are only slightly better.

Can all the caches miss on the TT probe add ups to 50% of the CPU time?
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Table and nps drop

Post by hgm »

Yes, easily. To fetch randomly addressed data from memory takes 386 CPU clock cycles on my 1.3GHz Pentium M. On faster machines, it will take even more CPU clocks, as only the CPU will be faster, not the memory. And a CPU can do between 2 and 3 instructions per clock cycle. So doing a single probe costs you as much as ~1000 instructions.

I do not know what exactly you do in your leave nodes, but if you do one hash probe, and less than 1000 instructions of work, the TT probes take 50%.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Transposition Table and nps drop

Post by Aleks Peshkov »

Without minimal positional evaluation, alpha-beta pruning and quiescence search your search trees are artificial. Optimizing performance in such unnatural conditions can make final code even worse.
Harald Johnsen

Re: Transposition Table and nps drop

Post by Harald Johnsen »

What are you doing with your table ?

Retrieving the pv for the search at the next depth is not enought.

Are you using it to do some cutoff ?
Are you using it to order the moves with IID ?

HJ.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Transposition Table and nps drop

Post by hgm »

Well, he does do cutoffs, or his node count couldn't go down dependent on hash size. Ordering the moves would have little effect without alpha-beta.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Transposition Table and nps drop

Post by bob »

mathmoi wrote:Hi,

My engine is actually using a simple minimax (no AlphaBeta) function for the moment. I want to get it completely functional before I write the AB search and all the other improvements.

This week I implemented a transposition table, because I want to be able to retrieve the principal variation from it. So I implemented it into my Minimax function and I get confused by the results.

If I use a small hash table of 4Kb (256 entries), the search for my test positions takes 34.4s at 8.5M nps searching 293M nodes.

If I use a 32Mb hash table the search takes 22.6s at 5M nps searching 112M nodes

and if I use a 256Mb the search takes 25.2s at 4.3M nps searching 107M nodes.

I was expecting a drop in nodes per seconds, but it seems that with a 256Mb hash tables, the nps drop by about 50% making the search even slower than with a 32 Mo hash table.

However, when I profile my engine, the profiling code seems to use only <1% of the total cpu time. What does that means? Do I have a bug or is this due the cache miss the huge table will inevitably cause?
I have never seen such a drop in NPS, unless you are either (a) so small that everything fits into cache until you start probing the hash which causes things to get replaced frequently; (b) using a hash implementation that is too slow, perhaps using multiple probes or something...

In crafty, nps is relatively stable regardless of hash size used, but the tree gets smaller (as yours apparently does) as the size is increased...
mathmoi
Posts: 286
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec

Re: Transposition Table and nps drop

Post by mathmoi »

hgm wrote:Yes, easily. To fetch randomly addressed data from memory takes 386 CPU clock cycles on my 1.3GHz Pentium M. On faster machines, it will take even more CPU clocks, as only the CPU will be faster, not the memory. And a CPU can do between 2 and 3 instructions per clock cycle. So doing a single probe costs you as much as ~1000 instructions.

I do not know what exactly you do in your leave nodes, but if you do one hash probe, and less than 1000 instructions of work, the TT probes take 50%.
Hi H.G.Muller,

You pin pointed the issue. Before I added the TT code in my Minimax function at depth=0, I was returning Eval() wich is really fast since it only return an incrementally updated value. When I added my TT code, I did a probe _before_ I checked for depth=0, So I was doing an extra probe at each leaf nodes.

Thanks for you inputs.