Crafty's four hash tables

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Crafty's four hash tables

Post by zullil »

"BEFORE you set the hash size for any of the four hash tables, ..."

Must admit I don't know what the "path hash table" does, and am wondering about optimal sizes for the four tables. Bob?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Crafty's four hash tables

Post by bob »

zullil wrote:"BEFORE you set the hash size for any of the four hash tables, ..."

Must admit I don't know what the "path hash table" does, and am wondering about optimal sizes for the four tables. Bob?
First, what they do:

(1) hash=n is the standard trans/ref table. This should vary with the time control. 8 gigabytes is 512M entries (16 bytes per entry). My basic choice was to not hash in q-search, and my basic assumption is that q-search is typically about 90% of total nodes searched. So if the time control will search 5 billion nodes per move, 8 gigs would be about right. For my 20 core box, searching at 80-100M nodes per second, 5G divided by 100M is 50, which is about 5x more than I would shoot for, but it would work well. Only guideline I use is that bigger means more TLB pressure and the search will slow down due to extra memory accesses to translate virtual to real. Smaller will increase NPS, but then the tree size will grow due to fewer TT hits..

(2) hashp=n is the pawn hash. I generally use 64M and leave it alone. Longer time controls doesn't seem to add proportionally more pawn positions beyond some point.

(3) hashe=n is the eval hash. This ONLY affects NPS, so I tune this to produce the highest NPS for a given time control, it won't affect scoring, PVs and that sort of thing, it just eliminates the time needed to evaluate some of the positions.

(4) phash=n is the path hash table. I am still playing with this from time to time, but it is intended to address the graph history interaction problem where you would see a short PV due to a hash hit. This solves that for almost all positions, but collisions are possible and you can still see a <HT> short PV on occasion. Bigger is better. It won't slow the search down if it is too big as it is rarely accessed, but when it is accessed, a miss will create a truncated PV. When move ordering goes south, this will be used a lot, and overwrites occur. Then you begin to see a <HT> PV. And it will likely stick around for several iterations or searches thanks to GHI issues...
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Crafty's four hash tables

Post by zullil »

bob wrote:
zullil wrote:"BEFORE you set the hash size for any of the four hash tables, ..."

Must admit I don't know what the "path hash table" does, and am wondering about optimal sizes for the four tables. Bob?
First, what they do:

(1) hash=n is the standard trans/ref table. This should vary with the time control. 8 gigabytes is 512M entries (16 bytes per entry). My basic choice was to not hash in q-search, and my basic assumption is that q-search is typically about 90% of total nodes searched. So if the time control will search 5 billion nodes per move, 8 gigs would be about right. For my 20 core box, searching at 80-100M nodes per second, 5G divided by 100M is 50, which is about 5x more than I would shoot for, but it would work well. Only guideline I use is that bigger means more TLB pressure and the search will slow down due to extra memory accesses to translate virtual to real. Smaller will increase NPS, but then the tree size will grow due to fewer TT hits..

(2) hashp=n is the pawn hash. I generally use 64M and leave it alone. Longer time controls doesn't seem to add proportionally more pawn positions beyond some point.

(3) hashe=n is the eval hash. This ONLY affects NPS, so I tune this to produce the highest NPS for a given time control, it won't affect scoring, PVs and that sort of thing, it just eliminates the time needed to evaluate some of the positions.

(4) phash=n is the path hash table. I am still playing with this from time to time, but it is intended to address the graph history interaction problem where you would see a short PV due to a hash hit. This solves that for almost all positions, but collisions are possible and you can still see a <HT> short PV on occasion. Bigger is better. It won't slow the search down if it is too big as it is rarely accessed, but when it is accessed, a miss will create a truncated PV. When move ordering goes south, this will be used a lot, and overwrites occur. Then you begin to see a <HT> PV. And it will likely stick around for several iterations or searches thanks to GHI issues...
Thanks, Bob. Very helpful.