Cache over-writing and PV's

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Cache over-writing and PV's

Post by syzygy »

hgm wrote: This is exactly why a quality engine does not dredge the PV out of the hash. (I did that in Joker, and usually the PV is completely different than the one that actually produced the score. E.g. a PV that ends in a checkmate, while the score is not a mate score.)
So check the scores stored in the hash and cut the PV where it starts to deviate.
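A minimal sketch of that idea, assuming a negamax score convention and hypothetical tt_probe()/make_move()/is_legal() plumbing (mate-score ply adjustment omitted; this is not any engine's actual code):

Code: Select all
#include <cstdint>
#include <vector>

// Assumed engine plumbing (hypothetical, for illustration only):
struct Position { uint64_t key() const; };
struct Move { int raw; };
enum Bound { NONE, EXACT, LOWER, UPPER };
struct TTEntry { Move move; int score; Bound bound; };
bool tt_probe(uint64_t key, TTEntry& e);   // false if no usable entry
bool is_legal(const Position& pos, Move m);
void make_move(Position& pos, Move m);

// Walk the best-move chain out of the hash, but stop ("cut") as soon as
// the stored score no longer matches the score backed up to the root.
std::vector<Move> hash_pv(Position pos, int root_score) {
    std::vector<Move> pv;
    int expected = root_score;            // from side-to-move's point of view
    TTEntry e;
    while (tt_probe(pos.key(), e) && e.bound == EXACT && is_legal(pos, e.move)) {
        if (e.score != expected) break;   // cut where it starts to deviate
        pv.push_back(e.move);
        make_move(pos, e.move);
        expected = -expected;             // negamax: flip every half-move
    }
    return pv;
}

Without the score check (and a legality check on the hash move), this is exactly the dredging that produces hgm's mate-PV-without-a-mate-score artifacts.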
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Cache over-writing and PV's

Post by hgm »

Stan Arts wrote:
hgm wrote: This is exactly why a quality engine does not dredge the PV out of the hash. (I did that in Joker, and usually the PV is completely different than the one that actually produced the score. E.g. a PV that ends in a checkmate, while the score is not a mate score.)
That sounds pretty bad/buggy. What caused that?
Basically temporally inverted hash grafting:

The engine searches the best defense first, and this pushes the mate over the horizon. Then it starts to search the inferior defense, which gets it to the same hopeless position faster, so that now it has enough depth left to see it gets checkmated.
Stan Arts wrote: Because my hash-dredged PV's have a pretty high accuracy rate.

Could it have something to do with your replacement scheme and the order in which you retrieve moves from which part of the hash?
The deeper search of the same node that was reached by a shorter path in sub-optimal defense will overwrite the original, less deep search of that node. That is pretty usual.
Stan Arts wrote: The biggest, and in fact only, problem I have is with repetitions: when my program allows a repetition but will play something else the third time. Of course the hash PV can't diverge, and it then shows a faulty rep-draw PV. Then again, it doesn't go to 0.00 there like an idiot, while that seems to be the norm.
In Joker I noticed that it is quite dangerous to allow repeats in the PV. The engine starts to abuse it to reduce its effective search depth in the face of trouble. E.g. when it can deliver a perpetual, but doesn't want to, because it thinks it is ahead. It then just throws in an extra cycle of the perpetual, to effectively reduce the search depth of what follows by 4 ply. (And then, one move later, it can perhaps do it again, to stall discovery of the inevitable doom to which it is heading by 4 more ply.)
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Cache over-writing and PV's

Post by hgm »

syzygy wrote: So check the scores stored in the hash and cut the PV where it starts to deviate.
It would still give you incomplete PVs. So I would prefer the Crafty method.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Cache over-writing and PV's

Post by Rein Halbersma »

hgm wrote: The more reliable way is the 'triangular-array method'. The problem there is that it suffers from hash cuts, which is why some engines do not allow hash cuts in PV nodes. Crafty does allow the hash cuts, but has a separate hash table to hash complete PVs, so the tail of the PV can be restored on a hash cutoff. (And if that is overwritten, there is of course no hash cut.)
I use the triangular-array method. At the root in my iterative deepening, immediately after my pvs() has returned, I insert the PV into the TT. This guarantees that on the next iteration, the PV is searched as the "left-most" path in the search tree. This also guarantees there can be no hash cuts along the initial PV, and no special side-table is necessary. Has worked splendidly for me for years: in DEBUG mode I assert() that the returned score from pvs() at the root is obtained by walking down the PV and calling eval() at the PV-leaf.

The only PV issues I ever have are grafting and search instability, but they are not defects of the triangular-array method.
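For reference, a minimal sketch of the triangular-array bookkeeping plus that root-seeding trick, with hypothetical eval()/generate_moves()/make_move()/unmake_move()/tt_store() helpers standing in for the real engine; null-window re-searches and TT probing are omitted, and this is an illustration of the idea, not Rein's actual code:

Code: Select all
#include <algorithm>

enum { MAX_PLY = 128 };
Move pv[MAX_PLY][MAX_PLY];     // pv[ply] = best line found at this ply
int  pv_len[MAX_PLY];

int pvs(Position& pos, int alpha, int beta, int depth, int ply) {
    pv_len[ply] = 0;
    if (depth == 0) return eval(pos);         // PV leaf: score == eval()
    for (Move m : generate_moves(pos)) {
        make_move(pos, m);
        int score = -pvs(pos, -beta, -alpha, depth - 1, ply + 1);
        unmake_move(pos, m);
        if (score >= beta) return score;      // fail high: no PV to record
        if (score > alpha) {                  // new best: graft child's PV
            alpha = score;
            pv[ply][0] = m;
            std::copy(pv[ply + 1], pv[ply + 1] + pv_len[ply + 1], pv[ply] + 1);
            pv_len[ply] = pv_len[ply + 1] + 1;
        }
    }
    return alpha;
}

// The root trick: after each iteration, stuff the PV back into the TT.
// Only the stored move matters: it makes the PV move the hash move at
// every node along the line, so the next iteration searches the PV as
// the left-most path. Depth 0 keeps these entries from causing hash cuts.
void seed_tt_with_pv(Position pos, int score) {
    for (int i = 0; i < pv_len[0]; ++i) {
        tt_store(pos.key(), pv[0][i], score, /*depth=*/0, EXACT);
        make_move(pos, pv[0][i]);
        score = -score;                       // keep score side-to-move relative
    }
}

The DEBUG check Rein mentions is then simply: walk down pv[0], and assert that eval() at the leaf reproduces the score pvs() returned at the root.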
xmas79
Posts: 286
Joined: Mon Jun 03, 2013 7:05 pm
Location: Italy

Re: Cache over-writing and PV's

Post by xmas79 »

syzygy wrote:"Oh boy", seems I got you upset.
Please don't give too much importance to yourself...
Even if you get SF (or any other engine) to return a long PV, putting a lot of trust in the second half of the PV isn't very wise.
Yes... That's also the case in the OP, where the PV was badly truncated at the 4th half-move while the score says it's mate in 24 moves, isn't it? Please again...

It seems that you have time (and answers) to reply to other posts, but you still fail to answer my simple question... Oh wait... You don't have an answer... Just because you cannot have one... Well, I will live with that... And so will all the other people on this forum who are reading these words and seeing you in such an embarrassing situation...

Bye bye...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

Adam Hair wrote:
bob wrote:
syzygy wrote:
gladius wrote:
hgm wrote: Note that the design goal of Stockfish is not to be a good or useful engine. Its goal is to optimize beating a small set of opponents in bullet games under conditions where hash replacement virtually does not occur. Any patches that would improve its hashing, or solve the problems that you are experiencing, would be instantly rejected, as they would not cause a higher win rate in the bullet games.
This is just not true. The goal is indeed to be a "good and useful" engine. The *means* to achieve this is indeed by self-testing in bullet matches most of the time. This has proven to be a very effective strategy for improving strength. Changes that affect hash strategy are certainly tested at hash levels where replacement plays a major role (4mb hash at 60s).
I'm still waiting for any of the critics to explain how they are successfully using 10-minute searches to improve their engines. Those engines must be really excellent at playing long time controls! I'm sure they'd blow SF away.

Oh wait...
Now there's an argument.

But SF testing is far from perfect and has too many regression errors. Any time you see a comment like "you should only test the same idea 3-4 times max, as otherwise you will likely get a false positive", it makes you wonder... or it should. The early SPRT terminations make testing go faster, but at a significant cost.
I am not sure that I understand what you mean by "significant cost". I may be misremembering, but I think that the SPRT bounds (which are used to determine when to pass or reject a change) have been set to hold both false positives and negatives to 5%.
bob wrote: And not all modifications work well at bullet. Some only show a gain at longer time controls. So that IS a valid criticism to make for anyone that uses bullet games in the wrong conditions.
I can only quote the FAQ. Which was something like this:

Q: How many times can I resubmit the same fix (I assume with a slight change of some kind)?

A: 4-5 runs total should be the max to avoid a false positive.

I've been more of a "run more games to reduce that probability" guy myself. Yes, quick SPRT-type tests using LOS are useful, but (for me) not as a "confirmation" of anything.
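For concreteness, a back-of-the-envelope check of that FAQ number, assuming each run independently holds the Type I error to the nominal 5% (a sketch of the statistics, not fishtest's actual accounting):

Code: Select all
#include <cmath>
#include <cstdio>

// Resubmitting the same (actually neutral) change n times passes at
// least once by pure luck with probability 1 - 0.95^n.
int main() {
    for (int n = 1; n <= 5; ++n)
        std::printf("runs=%d  P(at least one false positive) = %4.1f%%\n",
                    n, 100.0 * (1.0 - std::pow(0.95, n)));
    return 0;   // runs=5 -> ~22.6%, hence the FAQ's cap on resubmissions
}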
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

Rein Halbersma wrote:
hgm wrote: The more reliable way is the 'triangular-array method'. The problem there is that it suffers from hash cuts, which is why some engines do not allow hash cuts in PV nodes. Crafty does allow the hash cuts, but has a separate hash table to hash complete PVs, so the tail of the PV can be restored on a hash cutoff. (And if that is overwritten, there is of course no hash cut.)
I use the triangular-array method. At the root in my iterative deepening, immediately after my pvs() has returned, I insert the PV into the TT. This guarantees that on the next iteration, the PV is searched as the "left-most" path in the search tree. This also guarantees there can be no hash cuts along the initial PV, and no special side-table is necessary. Has worked splendidly for me for years: in DEBUG mode I assert() that the returned score from pvs() at the root is obtained by walking down the PV and calling eval() at the PV-leaf.

The only PV issues I ever have are grafting and search instability, but they are not defects of the triangular-array method.
Sorry, but that is wrong. I have ALWAYS stored the PV from the depth n-1 search before starting the depth n search. But you can STILL get exact hash hits, whether from the previous iteration via a transposition, or from a previous search. And these exact hash hits terminate the PV instantly by providing a score (but no PV) for the sub-tree below the hit. I have reported on a workable solution that I use in Crafty. The overhead is so small I could not measure it, and the short PVs disappear.

If you use my "trick" you get no short PVs; grafting will simply produce a longer PV, never losing the PV at all until some max limit is reached (for me it is 128 plies).
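A hedged sketch of how such a separate PV hash can work — the names and table size are invented, apart from the 128-ply cap bob mentions, and this shows the general idea rather than Crafty's actual implementation:

Code: Select all
#include <algorithm>
#include <cstdint>
#include <vector>

struct Move { int raw; };            // assumed engine move type (hypothetical)

struct PVEntry {
    uint64_t key = 0;                // full key, to detect overwrites
    int      len = 0;
    Move     moves[128];             // the 128-ply cap bob mentions
};

std::vector<PVEntry> pv_table(1 << 16);   // small side table, always-replace

// When storing an EXACT score for a PV node in the normal TT, also
// record the tail of the PV from that position onward.
void store_pv(uint64_t key, const Move* tail, int len) {
    PVEntry& e = pv_table[key & (pv_table.size() - 1)];
    e.key = key;
    e.len = std::min(len, 128);
    std::copy(tail, tail + e.len, e.moves);
}

// On an exact hash hit at a PV node, try to splice the stored tail back
// in. If the entry was overwritten, return false -- and then simply do
// not take the hash cut, exactly as hgm describes.
bool load_pv(uint64_t key, Move* out, int& len) {
    const PVEntry& e = pv_table[key & (pv_table.size() - 1)];
    if (e.key != key || e.len == 0) return false;
    std::copy(e.moves, e.moves + e.len, out);
    len = e.len;
    return true;
}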
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

syzygy wrote:
Zenmastur wrote: I don't know if I agree 100%, but I do know that if you're dredging your cache for the PV and the time control is long enough that major portions of the cache are overwritten before the move is made or analysis stopped, the last PV output is likely to be trashed. This makes it completely unreliable and is a nightmare as far as time is concerned.
Clipping of the PV in SF does not make the non-clipped part unreliable.
I save the PV's and then use the engine to search each and every move in them to ferret out all the trash moves. I can't ever recall a case where the PV was accurate. I always find errors, and the evaluations always change.
Sure, but what do you expect?

If the root position is searched to depth 40, then the position after 20 half moves was searched only to depth 20. That's just how things work.

Even if you get SF (or any other engine) to return a long PV, putting a lot of trust in the second half of the PV isn't very wise.
Unless you do it right. Then it is perfectly accurate.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Cache over-writing and PV's

Post by bob »

Stan Arts wrote:
hgm wrote: This is exactly why a quality engine does not dredge the PV out of the hash. (I did that in Joker, and usually the PV is completely different than the one that actually produced the score. E.g. a PV that ends in a checkmate, while the score is not a mate score.)
That sounds pretty bad/buggy. What caused that?

Because my hash-dredged PV's have a pretty high accuracy rate.

Could it have something to do with your replacement scheme and the order in which you retrieve moves from which part of the hash?

The biggest, and in fact only, problem I have is with repetitions: when my program allows a repetition but will play something else the third time. Of course the hash PV can't diverge, and it then shows a faulty rep-draw PV. Then again, it doesn't go to 0.00 there like an idiot, while that seems to be the norm.

Come to think of it, I could fix the faulty rep thing by "guessing" a best move / looking at what else is available in the hash when the score isn't 0.00. Well, or a hash-triangle thing! Hmm.
Here is a quick test that will illustrate the problem...

Run a decent search, then take each PV, input all the moves, and dump the score. Does it match the score backed up for that PV? Almost certainly not for every PV. And for each of those, some of the moves are wrong.

Particularly positions near the end of the PV, where entries get overwritten like mad as the search bounces around. And the score from the first time you searched that position won't quite be the same as the score from the second time (which overwrote entries near the end of the PV) due to different depths and/or transpositions...

It simply doesn't work. For me, my most useful debugging tool has been to spot a PV and score that look strange, then just suck in the current position and the moves in the PV, and use the "score" command to see what went wrong. And I always get the same score that was displayed with the PV. Otherwise, how do you debug things when you have an evaluation that looks odd but no idea what the corresponding position looks like...
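A tiny harness in the spirit of that test, with hypothetical make_move()/eval() helpers (Rein's DEBUG assert does essentially this check; it is not Crafty's "score" command):

Code: Select all
#include <vector>

// Assumed plumbing (hypothetical): Position, Move, make_move(), eval().
// Replay the PV and compare the static eval at its leaf with the score
// that was backed up to the root. Only meaningful for exact PVs that
// end at an eval() leaf (no hash clipping, no qsearch tail).
bool pv_matches_score(Position pos, const std::vector<Move>& pv,
                      int reported_score) {
    int sign = 1;                    // root side-to-move's perspective
    for (Move m : pv) {
        make_move(pos, m);
        sign = -sign;                // negamax: flip every half-move
    }
    return sign * eval(pos) == reported_score;
}

For a hash-dredged PV this check typically fails somewhere along the line, exactly as described above; for a triangular-array PV it should hold for every iteration.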
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Cache over-writing and PV's

Post by syzygy »

bob wrote:
syzygy wrote:
Zenmastur wrote: I don't know if I agree 100%, but I do know that if you're dredging your cache for the PV and the time control is long enough that major portions of the cache are overwritten before the move is made or analysis stopped, the last PV output is likely to be trashed. This makes it completely unreliable and is a nightmare as far as time is concerned.
Clipping of the PV in SF does not make the non-clipped part unreliable.
I save the PV's and then use the engine to search each and every move in them to ferret out all the trash moves. I can't ever recall a case where the PV was accurate. I always find errors, and the evaluations always change.
Sure, but what do you expect?

If the root position is searched to depth 40, then the position after 20 half moves was searched only to depth 20. That's just how things work.

Even if you get SF (or any other engine) to return a long PV, putting a lot of trust in the second half of the PV isn't very wise.
Unless you do it right. Then it is perfectly accurate.
I'm talking about "perfectly accurate" PVs. The second half of such a PV is hardly reliable from a "move quality" point of view and will contain "trash moves".

Note that the OP is complaining about the moves of "perfectly accurate" PVs. (Those output by somewhat recent versions of SF are accurate insofar as they are not clipped.)