Cache over-writing and PV's

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Cache over-writing and PV's

Post by gladius »

bob wrote:Which is what I wanted when I came up with this idea. I want to see the exact endpoint that led to this score, not some halfway point where I have to use imagination to guess what happened after that <HT> sentinel in the first example.
This is quite a nice solution! I've prototyped it in SF and it works very well.

Here is the code for anyone interested:
https://github.com/glinscott/Stockfish/ ... v?expand=1

Warning: not production ready :)
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Cache over-writing and PV's

Post by Rein Halbersma »

bob wrote: What you will discover is that if you double the search time, you double the number of PV stores as a max. I've been writing up the hash path idea. Here is a bit of data. I used fine #70 which is a real stress-test for hashing.

Total probes/stores done = 387,877,352
Total times exact code executed (either HashProbe() OR HashStore()) = 2813

So almost 400M nodes searched, and I either probed OR stored an entry in the hash path table 2800 times. You can't even measure that in terms of time expended, it is so small. The benefit is that (a) you don't do this ridiculous "don't allow EXACT hash hits on the PV" which gives up some efficiency and (b) you get PVs that are complete every time except for the random hash hit. But I use a bucket size of 16 for this table since it is so infrequently accessed, which really drives overwrites down, and it doesn't require a huge table either.
Nice writeup! So just to make sure: you could in principle even use a dynamically allocated array (std::vector in C++) for this secondary hash, that just doubles its capacity (using realloc, or std::vector automatic expansion) when it becomes full? And then you'd be 100% protected from the rare hash overwrite at little to no extra cost. Is that correct?
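The bucketed side table bob describes above (rarely accessed, 16-entry buckets, full paths stored for EXACT scores) might look roughly like the sketch below. All names, sizes, and the replacement policy are illustrative guesses, not Crafty's actual code:

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a "hash path" side table: buckets of 16
// entries, each entry keeping the full move path (PV) that produced
// an EXACT score. Layout and names are illustrative only.
struct PathEntry {
    uint64_t key = 0;              // full Zobrist key of the position
    std::vector<uint16_t> moves;   // the stored PV, as encoded moves
    int age = 0;                   // search counter, for replacement
};

constexpr int kBucketSize = 16;    // large buckets: the table is probed
constexpr int kBuckets = 256;      // so rarely that overwrites are rare

struct HashPathTable {
    std::array<std::array<PathEntry, kBucketSize>, kBuckets> buckets{};

    void store(uint64_t key, const std::vector<uint16_t>& pv, int age) {
        auto& bucket = buckets[key % kBuckets];
        PathEntry* victim = &bucket[0];
        for (auto& e : bucket) {
            // reuse a matching or empty slot; otherwise evict the oldest
            if (e.key == key || e.moves.empty()) { victim = &e; break; }
            if (e.age < victim->age) victim = &e;
        }
        *victim = PathEntry{key, pv, age};
    }

    const PathEntry* probe(uint64_t key) const {
        const auto& bucket = buckets[key % kBuckets];
        for (const auto& e : bucket)
            if (e.key == key && !e.moves.empty()) return &e;
        return nullptr;  // rare case: path was overwritten or never stored
    }
};
```

With only a few thousand probes/stores per search, even a linear scan of a 16-entry bucket costs effectively nothing.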
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

Rein Halbersma wrote:
bob wrote: What you will discover is that if you double the search time, you double the number of PV stores as a max. I've been writing up the hash path idea. Here is a bit of data. I used fine #70 which is a real stress-test for hashing.

Total probes/stores done = 387,877,352
Total times exact code executed (either HashProbe() OR HashStore()) = 2813

So almost 400M nodes searched, and I either probed OR stored an entry in the hash path table 2800 times. You can't even measure that in terms of time expended, it is so small. The benefit is that (a) you don't do this ridiculous "don't allow EXACT hash hits on the PV" which gives up some efficiency and (b) you get PVs that are complete every time except for the random hash hit. But I use a bucket size of 16 for this table since it is so infrequently accessed, which really drives overwrites down, and it doesn't require a huge table either.
Nice writeup! So just to make sure: you could in principle even use a dynamically allocated array (std::vector in C++) for this secondary hash, that just doubles its capacity (using realloc, or std::vector automatic expansion) when it becomes full? And then you'd be 100% protected from the rare hash overwrite at little to no extra cost. Is that correct?
It would seem so. However, not knowing what kind of overhead that might incur, one would obviously want to test it. With so few stores, there is nothing that says a linear list would not work equally well, so that you never overwrite anything. I started off with a hash-based approach because it seemed like the most logical solution; whether it is the best one, I do not know for certain. There could be more stores in some oddly pathological position, obviously. All of my tests showed minimal numbers (I let it run on ICC for a few days playing some pretty long games, 60 minutes and above) and I didn't see a single "alarming" probe/store total, which at least suggests it is reasonably bounded.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Cache over-writing and PV's

Post by Rein Halbersma »

bob wrote:
Rein Halbersma wrote:
bob wrote: What you will discover is that if you double the search time, you double the number of PV stores as a max. I've been writing up the hash path idea. Here is a bit of data. I used fine #70 which is a real stress-test for hashing.

Total probes/stores done = 387,877,352
Total times exact code executed (either HashProbe() OR HashStore()) = 2813

So almost 400M nodes searched, and I either probed OR stored an entry in the hash path table 2800 times. You can't even measure that in terms of time expended, it is so small. The benefit is that (a) you don't do this ridiculous "don't allow EXACT hash hits on the PV" which gives up some efficiency and (b) you get PVs that are complete every time except for the random hash hit. But I use a bucket size of 16 for this table since it is so infrequently accessed, which really drives overwrites down, and it doesn't require a huge table either.
Nice writeup! So just to make sure: you could in principle even use a dynamically allocated array (std::vector in C++) for this secondary hash, that just doubles its capacity (using realloc, or std::vector automatic expansion) when it becomes full? And then you'd be 100% protected from the rare hash overwrite at little to no extra cost. Is that correct?
It would seem so. However, not knowing what kind of overhead that might incur, one would obviously want to test it. With so few stores, there is nothing that says a linear list would not work equally well, so that you never overwrite anything. I started off with a hash-based approach because it seemed like the most logical solution; whether it is the best one, I do not know for certain. There could be more stores in some oddly pathological position, obviously. All of my tests showed minimal numbers (I let it run on ICC for a few days playing some pretty long games, 60 minutes and above) and I didn't see a single "alarming" probe/store total, which at least suggests it is reasonably bounded.
2800 entries is about a dozen capacity doublings for 400M nodes. In C++, std::vector has exactly that behavior, and on average every element in the array is reallocated only about once (the last 50% never, the 25% before that once, the 12.5% before that twice, etc.). So a dozen calls to realloc, about 2800 copies of entries, and that's it.

Of course, in an engine, you keep the allocated memory of that secondary hash table in between searches. I think that's what I'll try in my program. I need 100% guaranteed PVs.
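The append-only alternative discussed above (grow a vector, never overwrite, keep the allocation between searches) could be sketched like this; the types and names are hypothetical, not from any particular engine:

```cpp
#include <cstdint>
#include <vector>

// Sketch of an append-only PV store: since only a few thousand paths
// are stored per search, a plain growable array can hold them all,
// so nothing is ever overwritten. Names are illustrative only.
struct StoredPv {
    uint64_t key;                 // Zobrist key of the PV node
    std::vector<uint16_t> moves;  // the full path from that node
};

struct PvStore {
    std::vector<StoredPv> entries;

    void store(uint64_t key, std::vector<uint16_t> pv) {
        // push_back doubles capacity when full; each element is moved
        // O(1) amortized times, as noted in the post above
        entries.push_back({key, std::move(pv)});
    }

    const StoredPv* probe(uint64_t key) const {
        // a linear scan is fine at ~2800 entries per search
        for (const auto& e : entries)
            if (e.key == key) return &e;
        return nullptr;
    }

    void new_search() {
        // keep the allocated capacity between searches:
        // clear() destroys the elements but does not release memory
        entries.clear();
    }
};
```

The linear probe is deliberately naive; at these entry counts it is cheaper than maintaining any index structure.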
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Cache over-writing and PV's

Post by syzygy »

gladius wrote:
bob wrote:Which is what I wanted when I came up with this idea. I want to see the exact endpoint that led to this score, not some halfway point where I have to use imagination to guess what happened after that <HT> sentinel in the first example.
This is quite a nice solution! I've prototyped it in SF and it works very well.

Here is the code for anyone interested:
https://github.com/glinscott/Stockfish/ ... v?expand=1

Warning: not production ready :)
How about storing in each entry only the first move of the PV and walking down the PV to obtain it in full (basically as is already being done for the regular tt table)?

The PVCache would need more entries, as overwriting an entry will more easily lose (part of) the PV, but each entry would take up far less memory.
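The move-only scheme proposed above (store just the best move per PV node, then re-walk from the root) might look like the following sketch. Here next_key() is only a stand-in for making the move on the board and computing the child's Zobrist key, and every name is illustrative:

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using Move = uint16_t;

// Placeholder for make_move + incremental hash update: maps a position
// key and a move to the child position's key. A real engine would play
// the move on the board instead.
uint64_t next_key(uint64_t key, Move m) {
    return key * 6364136223846793005ULL + m + 1442695040888963407ULL;
}

// Reconstruct the PV by repeatedly probing a move-only cache, applying
// the stored best move at each node until an entry is missing.
std::vector<Move> walk_pv(const std::unordered_map<uint64_t, Move>& pv_cache,
                          uint64_t root_key, int max_len = 128) {
    std::vector<Move> pv;
    uint64_t key = root_key;
    while ((int)pv.size() < max_len) {
        auto it = pv_cache.find(key);
        if (it == pv_cache.end()) break;  // entry overwritten or absent:
                                          // the PV is truncated here
        pv.push_back(it->second);
        key = next_key(key, it->second);
    }
    return pv;
}
```

The truncation case in the loop is exactly the failure mode discussed later in the thread: losing any single entry cuts the reconstructed PV short.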
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Cache over-writing and PV's

Post by syzygy »

Rein Halbersma wrote:Of course, in an engine, you keep the allocated memory of that secondary hash table in between searches. I think that's what I'll try in my program. I need 100% guaranteed PVs.
If you make sure not to overwrite any entries, then only storing the best move (and maybe a score) for each PV node seems an even better idea. Since you don't lose any entries, you will always be able to reconstruct the full PV by walking the full path.

But of course nowadays it might not matter whether each entry takes up 16 bytes or 256 bytes...
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Cache over-writing and PV's

Post by gladius »

syzygy wrote:
gladius wrote:
bob wrote:Which is what I wanted when I came up with this idea. I want to see the exact endpoint that led to this score, not some halfway point where I have to use imagination to guess what happened after that <HT> sentinel in the first example.
This is quite a nice solution! I've prototyped it in SF and it works very well.

Here is the code for anyone interested:
https://github.com/glinscott/Stockfish/ ... v?expand=1

Warning: not production ready :)
How about storing in each entry only the first move of the PV and walking down the PV to obtain it in full (basically as is already being done for the regular tt table)?

The PVCache would need more entries, as overwriting an entry will more easily lose (part of) the PV, but each entry would take up far less memory.
I tried it, and it didn't work very well unfortunately. Even with a (relatively) huge table of 1 million entries, I was still getting a lot of truncated PVs. Here is the branch if you are interested in what it looked like:
https://github.com/glinscott/Stockfish/ ... h?expand=1

There is certainly more experimentation I can do, but the accuracy of the crafty method is going to be tough to beat I think.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Cache over-writing and PV's

Post by xmas79 »

Rein Halbersma wrote:
bob wrote: What you will discover is that if you double the search time, you double the number of PV stores as a max. I've been writing up the hash path idea. Here is a bit of data. I used fine #70 which is a real stress-test for hashing.

Total probes/stores done = 387,877,352
Total times exact code executed (either HashProbe() OR HashStore()) = 2813

So almost 400M nodes searched, and I either probed OR stored an entry in the hash path table 2800 times. You can't even measure that in terms of time expended, it is so small. The benefit is that (a) you don't do this ridiculous "don't allow EXACT hash hits on the PV" which gives up some efficiency and (b) you get PVs that are complete every time except for the random hash hit. But I use a bucket size of 16 for this table since it is so infrequently accessed, which really drives overwrites down, and it doesn't require a huge table either.
Nice writeup! So just to make sure: you could in principle even use a dynamically allocated array (std::vector in C++) for this secondary hash, that just doubles its capacity (using realloc, or std::vector automatic expansion) when it becomes full? And then you'd be 100% protected from the rare hash overwrite at little to no extra cost. Is that correct?
This was my first approach; then I rewrote it as a hashtable of std::vectors of PVs to keep the overhead even smaller.
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Cache over-writing and PV's

Post by xmas79 »

bob wrote:...Which is what I wanted when I came up with this idea. I want to see the exact endpoint that led to this score, not some halfway point where I have to use imagination to guess what happened after that <HT> sentinel in the first example.
Yes, that's what I usually get in my engine too.

But another question arises: I noticed that sometimes there is a difference between the score of the PV retrieved from the "hashed paths" table and the score stored in the probed TT entry. How can that happen? I also noticed that sometimes the score MUST be forced to be different, because the retrieved path contains a repetition that must be scored as a draw. So when I retrieve the PV on an exact hash hit, I play it out on the board, take care of repetitions, and update the score accordingly. Is this a good approach?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

xmas79 wrote:
bob wrote:...Which is what I wanted when I came up with this idea. I want to see the exact endpoint that led to this score, not some halfway point where I have to use imagination to guess what happened after that <HT> sentinel in the first example.
Yes, that's what I usually get in my engine too.

But another question arises: I noticed that sometimes there is a difference between the score of the PV retrieved from the "hashed paths" table and the score stored in the probed TT entry. How can that happen? I also noticed that sometimes the score MUST be forced to be different, because the retrieved path contains a repetition that must be scored as a draw. So when I retrieve the PV on an exact hash hit, I play it out on the board, take care of repetitions, and update the score accordingly. Is this a good approach?
Different depths. Remember that the PV you yank from the hash path table goes to the endpoint where that PV was stored. If you then overwrite that entry in the main TT with a shallower entry, a deeper entry, or an entry with a different bound, the hash score won't match the score at the end of the retrieved PV. But if you make every move from root to tip, the evaluation should match the score backed up to the root exactly, unless there is a bug somewhere.

Draws are, and always have been, problematic. When you graft paths, as the ttable does, repetitions will be overlooked, since the ttable has no path information, just a score and such. Not much you can do there, except to disallow exact hash matches anywhere, which seems much worse than the problem.
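The replay-and-adjust idea from the last two posts (play the retrieved PV out and force a draw score when a repetition appears) can be sketched as below. Position keys stand in for actually making moves on the board, and real repetition detection would also need side-to-move and twofold/threefold handling, so this is only an illustration:

```cpp
#include <cstdint>
#include <vector>

constexpr int kDrawScore = 0;

// After retrieving a PV on an exact hash hit, replay it and check
// whether any position along it repeats one already seen, either on
// the game path or earlier in the PV itself. If so, the line is a
// draw by repetition and the hashed score must be overridden.
int adjust_score_for_repetition(int tt_score,
                                const std::vector<uint64_t>& game_path,
                                const std::vector<uint64_t>& pv_keys) {
    std::vector<uint64_t> seen = game_path;  // positions before the PV
    for (uint64_t key : pv_keys) {
        for (uint64_t prev : seen)
            if (prev == key)
                return kDrawScore;  // repetition inside the line: draw
        seen.push_back(key);
    }
    return tt_score;  // no repetition: trust the hashed score
}
```

This is the check the ttable itself cannot perform, since grafted entries carry no path information.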