Singular Margin based on depth?

lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Singular Margin based on depth?

Post by lkaufman »

michiguel wrote:
lkaufman wrote:I thought that the idea behind requiring a margin is obvious, namely that if a move is best by just a couple of centipawns at a reduced depth, the probability that it is best at the current depth is not that much over 50%, and so you could very likely be extending the wrong move. Am I missing something here? What is your take on the question in the subject line?
Everything you do in an engine is a gamble; depending on the outcome and the risk, there is a reasonable bet to make. At higher depths, in general, a mistake is a very serious one, so you may want to be more conservative; hence it is not rare to want higher margins. You really want to make sure that the probability of being wrong decreases.

Miguel
Of course at a greater depth, the consequences of a decision are much more significant PER NODE than near the leaves, but it's not obvious to me that the cost of being wrong X% of the time is greater at greater depths, nor is it obvious that the tradeoff between missing a good extension and making a bad one varies with depth. I'm not saying that this is wrong, only that it has not been explained here.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Singular Margin based on depth?

Post by lkaufman »

rvida wrote:
lkaufman wrote:One of the ideas in the Ippolit based programs (and in Critter) is that when doing the test for Singularity (for Singular Extension), the margin is proportional to depth. Aside from whether this tests well or not, can anyone explain the rationale for this idea? It is not at all obvious to us why a deeper search should require a larger margin. One might even argue that move ordering gets better with depth, so a smaller margin might be needed at higher depths.
Original purpose was to limit the overhead of the exclusion search. Larger margin causes it to fail-high faster on non-singular moves. At lower depths the overhead is smaller and a smaller margin is affordable.

But with other improvements in Critter's search this became (almost) a non-issue and now I use a fixed margin.
"At lower depths the overhead is smaller." I don't get this. If you are using a depth reduction of 50% (or any percentage), the cost of the exclusion search should be a HIGHER percentage at lower depths than at higher ones. Can you explain this comment?

By the way, perhaps it's also a non-issue for Komodo as we have not detected a benefit to using depth-proportional margins here.
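For concreteness, here is a minimal sketch of the kind of exclusion-search test under discussion, with the margin written both ways. The names (exclusion_search, tt_value) and the constants (the factor of 2, the fixed 50, the half-depth reduction) are illustrative assumptions, not Critter's or Komodo's actual code:

    #include <functional>

    // 'exclusion_search' stands in for a real engine's null-window search of
    // every move EXCEPT the hash move, at the given (reduced) depth.
    using SearchFn = std::function<int(int alpha, int beta, int depth)>;

    // True if no other move comes within 'margin' of the hash move's stored
    // score at reduced depth, i.e. the hash move looks singular.
    bool is_singular(int tt_value, int depth, SearchFn exclusion_search)
    {
        int margin = 2 * depth;        // depth-proportional (the ip*/Critter style)
        // int margin = 50;            // alternative: a fixed margin
        int singular_beta = tt_value - margin;
        int score = exclusion_search(singular_beta - 1, singular_beta, depth / 2);
        return score < singular_beta;  // nothing else came close -> extend the hash move
    }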
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: Singular Margin based on depth?

Post by rvida »

lkaufman wrote: "At lower depths the overhead is smaller." I don't get this. If you are using a depth reduction of 50% (or any percentage), the cost of the exclusion search should be a HIGHER percentage at lower depths than at higher ones. Can you explain this comment?
The absolute number of extra nodes is what matters, not the percentage.
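A rough back-of-the-envelope illustration of that point, assuming an effective branching factor of 2 purely for the sake of the example:

    #include <cmath>
    #include <cstdio>

    // Compare one exclusion search at half depth with the subtree it sits in.
    int main()
    {
        const double b = 2.0;  // assumed effective branching factor
        for (int depth = 4; depth <= 16; depth += 4) {
            double full = std::pow(b, depth);      // ~nodes of the normal search
            double excl = std::pow(b, depth / 2);  // ~nodes of the exclusion search
            std::printf("depth %2d: exclusion ~%6.0f nodes = %5.1f%% of %7.0f\n",
                        depth, excl, 100.0 * excl / full, full);
        }
        // The percentage shrinks as depth grows, but the absolute node count rises.
        return 0;
    }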
Eelco de Groot
Posts: 4565
Joined: Sun Mar 12, 2006 2:40 am

Re: Singular Margin based on depth?

Post by Eelco de Groot »

rvida wrote:
lkaufman wrote:One of the ideas in the Ippolit based programs (and in Critter) is that when doing the test for Singularity (for Singular Extension), the margin is proportional to depth. Aside from whether this tests well or not, can anyone explain the rationale for this idea? It is not at all obvious to us why a deeper search should require a larger margin. One might even argue that move ordering gets better with depth, so a smaller margin might be needed at higher depths.
Original purpose was to limit the overhead of the exclusion search. Larger margin causes it to fail-high faster on non-singular moves. At lower depths the overhead is smaller and a smaller margin is affordable.

But with other improvements in Critter's search this became (almost) a non-issue and now I use a fixed margin.
I vote for Richard's answer in this thread :P My own rationale has changed a bit as well. The main purpose I see for the singular extension is the case where the singular move eventually fails low, i.e. after a deeper search the move no longer produces a beta cutoff and so, of course, is no longer singular either. Only then has it had any value to search the hash move deeper. But if, in the position where you test for singularity, there are at least two moves that give a beta cutoff, then even if the first one fails the second may still take over, so in that case there is little point in a singular extension. Basically this means you do not need a margin at all; theoretically it makes more sense just to check whether there are two moves that fail high against beta. The next factor is that the null-window results of the singularity test degrade very quickly, like all null-window searches. Knowing that the null-window result against the best move is better than some margin is not worth very much in precision, because you use very shallow searches (typically half depth), and if these go deeper the lines become less accurate (the search 'degrades', so to speak, and should drop further below beta if it cannot reach it in the first place because of imprecise moves), while the hash result itself is not fixed, so you are testing against a moving target.

The larger margin, then, is there because you would still like to use this very imprecise null-window result even after it has degraded. But other than allowing you to use degraded null-window searches, I don't think the margin has a real meaning; its tuning depends more on how fast your null-window searches degrade in accuracy than on some actual distance to beta or, even more imprecise, the distance to the hash result of the stored beta cutoff.

Regards, Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
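A minimal sketch of the margin-free variant Eelco describes, checking for a second fail-high move at reduced depth instead of testing against a margin. The names and the half-depth reduction are illustrative assumptions, not any engine's actual code:

    #include <functional>
    #include <vector>

    using Move = int;
    // 'search_move' stands in for a null-window search of a single move.
    using SearchMoveFn = std::function<int(Move m, int alpha, int beta, int depth)>;

    // Count moves that beat beta at reduced depth.  If two or more do, a second
    // move can take over when the first fails, so nothing is extended; if exactly
    // one does, that move is treated as singular.
    bool is_singular_no_margin(const std::vector<Move>& moves, int beta,
                               int depth, SearchMoveFn search_move)
    {
        int fail_highs = 0;
        for (Move m : moves)
            if (search_move(m, beta - 1, beta, depth / 2) >= beta)
                if (++fail_highs >= 2)
                    return false;          // at least two cutoff moves -> no extension
        return fail_highs == 1;            // exactly one move beats beta -> singular
    }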
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Singular Margin based on depth?

Post by lkaufman »

rvida wrote:
lkaufman wrote: "At lower depths the overhead is smaller." I don't get this. If you are using a depth reduction of 50% (or any percentage), the cost of the exclusion search should be a HIGHER percentage at lower depths than at higher ones. Can you explain this comment?
The absolute number of extra nodes is what matters, not the percentage.
Why?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Singular Margin based on depth?

Post by bob »

lkaufman wrote:
bob wrote:
lkaufman wrote:One of the ideas in the Ippolit based programs (and in Critter) is that when doing the test for Singularity (for Singular Extension), the margin is proportional to depth. Aside from whether this tests well or not, can anyone explain the rationale for this idea? It is not at all obvious to us why a deeper search should require a larger margin. One might even argue that move ordering gets better with depth, so a smaller margin might be needed at higher depths.
Unknown. Certainly nothing Hsu/Campbell wrote about a "real SE implementation" suggests this. However, the ip* "singular extension" is really a "hack" that doesn't leave me with a "good feeling." You only extend if you get a hash hit with a best move, etc??? Hsu used a "sticky transposition table" so that once a move was flagged as "singular" it would at least be extended for the next iteration, regardless of whether the singular test returned true or not... Perhaps this is a way to limit the overhead of extending "best moves" that are nowhere near "singular" in nature?
The singular extension for hash move only appears to have been a major factor in the rating jump of Rybka 3, though I don't know if the idea was original or is on record as having been used previously in another program. Whether it is a "hack" or not, it works. But back to the question at hand, I guess you are saying that if a certain move is best by some constant margin move after move, it would get extended repeatedly if the excess over the second move is just over the singular margin. So I see the logic of the idea now, although it is addressing a rather unrealistic scenario (constant gap between move 1 and move 2). Anyway, this is probably the reason for the algorithm; thanks for a very intelligent reply!
The Hsu "sticky" stuff was done to address a search instability issue. When I did SE in the 1993/1994 Cray Blitz versions, I originally left that out not knowing whether it was that important or not. It was. The problem is that if you SE a move on iteration N, but when you get to N+1, things have changed a bit and the SE search says "don't extend". Now you are warping the shape of the tree and can see instability in the eval and/or PV moves. I don't remember the specifics since that was almost 20 years ago now, but as I carefully read his explanation and created the sticky TT table, the instability problem was mitigated. There was a "time-out" built in. SE says singular this iteration, but not the next. But the sticky says "treat it as singular". But if the next iteration also says "not singular" it would be removed. I don't remember the precise specifics about that, without getting to the office to dig up Hsu's ICGA paper on the algorithm...

How are these programs varying depth? For example, if the margin gets larger as remaining depth increases, that is probably a stop-gap measure to prevent the tree from exploding, as those moves are less likely to get extended near the root, where the extension affects a very large sub-tree. If the margin gets bigger near the tips, I am not sure what might lead one to do that...

I was never a huge fan of SE, because it was only done at some of the nodes, which leaves holes. What if you are at an ALL node and you have only one good "out" move? No extension at all, because Hsu only defined singular extensions for CUT nodes and PV nodes... I remember his paper saying "No satisfactory definition for singular has been discovered for ALL nodes."
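As a rough illustration of the "sticky" scheme described above, here is one way such a flag could live in a transposition-table entry. The field names and the two-iteration time-out are assumptions based on the recollection above, not Hsu's actual implementation:

    // A "sticky" singular flag with a simple time-out.
    struct TTEntry {
        unsigned long long key;
        int move, score, depth, bound;
        int singular_ttl;   // >0 means "treat as singular", even if the
                            // current singular test disagrees
    };

    // Called after the singular test at this node on the current iteration.
    void update_sticky(TTEntry& e, bool singular_now)
    {
        if (singular_now)
            e.singular_ttl = 2;     // sticks around for the next iteration too
        else if (e.singular_ttl > 0)
            --e.singular_ttl;       // fades out if the test keeps failing
    }

    bool should_extend(const TTEntry& e, bool singular_now)
    {
        return singular_now || e.singular_ttl > 0;
    }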
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Singular Margin based on depth?

Post by bob »

rvida wrote:
bob wrote:However, the ip* "singular extension" is really a "hack" that doesn't leave me with a "good feeling." You only extend if you get a hash hit with a best move, etc???
1.) If you don't have a best move there is nothing singular to extend anyway.
2.) You don't need a hash hit to get a best move, do you? If a best move exists, IID will find it for you.
The idea for this "SE" type thing is something I personally renamed "TT-singular extensions." The move to extend comes from the hash table, period. I don't suppose there is anything wrong with an IID search, but those are done extremely rarely anyway, since they happen only at PV nodes and are very rarely needed (usually after an unresolved fail-high or fail-low at the previous iteration).

I had issues testing this in Crafty, because I use a speed optimization and ALWAYS store a best move, whether it is a PV, CUT or ALL node. For an ALL node, I just store the first move searched. Why? Next time around the resulting positions may well get a fail-low cutoff due to this search, and I'd rather avoid the move generation overhead. I always search the TT move before generating anything. It is a percentage point or two faster. But it wrecks hell out of this idea because now every move from a hash probe is subject to extension if the bound is reasonable. I had to simply exclude the SE test condition if the TT entry was "UPPER", even though I had a best move.
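A condensed sketch of the guard described above, i.e. requiring more than just a stored move before running the singular test. The constants and field names are illustrative, not Crafty's source:

    // A hash move alone is not enough: a move stored at an ALL node comes with
    // an upper bound, so its score says nothing about it being best by a margin.
    enum Bound { LOWER, EXACT, UPPER };

    struct HashEntry { int move, score, depth; Bound bound; };

    bool can_try_singular(const HashEntry& e, int depth)
    {
        return e.move != 0
            && e.bound != UPPER        // skip moves stored at ALL nodes
            && e.depth >= depth - 3;   // illustrative depth requirement
    }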
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Singular Margin based on depth?

Post by bob »

lkaufman wrote:
rvida wrote:
lkaufman wrote:One of the ideas in the Ippolit based programs (and in Critter) is that when doing the test for Singularity (for Singular Extension), the margin is proportional to depth. Aside from whether this tests well or not, can anyone explain the rationale for this idea? It is not at all obvious to us why a deeper search should require a larger margin. One might even argue that move ordering gets better with depth, so a smaller margin might be needed at higher depths.
Original purpose was to limit the overhead of the exclusion search. Larger margin causes it to fail-high faster on non-singular moves. At lower depths the overhead is smaller and a smaller margin is affordable.

But with other improvements in Critter's search this became (almost) a non-issue and now I use a fixed margin.
"At lower depths the overhead is smaller." I don't get this. If you are using a depth reduction of 50% (or any percentage), the cost of the exclusion search should be a HIGHER percentage at lower depths than at higher ones. Can you explain this comment?

By the way, perhaps it's also a non-issue for Komodo as we have not detected a benefit to using depth-proportional margins here.
I sense some apples and oranges in this discussion. The intent of the "margin" was to define just how much better a single move has to be before we trigger a singular extension. The smaller the margin, the more moves trigger this extension. The larger the margin, the lower the cost since there are fewer extensions triggered. The way to control overhead is to control the depth of the singular search. Less depth = less effort but less accuracy...
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Singular Margin based on depth?

Post by lkaufman »

[quote="bobThe Hsu "sticky" stuff was done to address a search instability issue. When I did SE in the 1993/1994 Cray Blitz versions, I originally left that out not knowing whether it was that important or not. It was. The problem is that if you SE a move on iteration N, but when you get to N+1, things have changed a bit and the SE search says "don't extend". Now you are warping the shape of the tree and can see instability in the eval and/or PV moves. I don't remember the specifics since that was almost 20 years ago now, but as I carefully read his explanation and created the sticky TT table, the instability problem was mitigated. There was a "time-out" built in. SE says singular this iteration, but not the next. But the sticky says "treat it as singular". But if the next iteration also says "not singular" it would be removed. I don't remember the precise specifics about that, without getting to the office to dig up Hsu's ICGA paper on the algorithm...

How are these programs varying depth? For example, if the margin gets larger as remaining depth increases, that is probably a stop-gap measure to prevent the tree from exploding, as those moves are less likely to get extended near the root, where the extension affects a very large sub-tree. ."[/quote]

Yes, the margin gets larger with depth; in fact they just use depth or depth * constant as the margin. But I'm still unclear on the rationale. True, near the root the extension affects larger subtrees, but near the leaves the extension occurs vastly more often, so it is not obvious whether the extension costs more in total near the root or near the leaves. Can you explain why you think the total cost of all such extensions would be greater near the root than near the leaves?
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Singular Margin based on depth?

Post by lkaufman »

Eelco de Groot wrote:
rvida wrote:
lkaufman wrote:One of the ideas in the Ippolit based programs (and in Critter) is that when doing the test for Singularity (for Singular Extension), the margin is proportional to depth. Aside from whether this tests well or not, can anyone explain the rationale for this idea? It is not at all obvious to us why a deeper search should require a larger margin. One might even argue that move ordering gets better with depth, so a smaller margin might be needed at higher depths.
Original purpose was to limit the overhead of the exclusion search. Larger margin causes it to fail-high faster on non-singular moves. At lower depths the overhead is smaller and a smaller margin is affordable.

But with other improvements in Critter's search this became (almost) a non-issue and now I use a fixed margin.
I vote for Richard's answer in this thread :P My own rationale has changed a bit as well. The main purpose I see for the singular extension is the case where the singular move eventually fails low, i.e. after a deeper search the move no longer produces a beta cutoff and so, of course, is no longer singular either. Only then has it had any value to search the hash move deeper. But if, in the position where you test for singularity, there are at least two moves that give a beta cutoff, then even if the first one fails the second may still take over, so in that case there is little point in a singular extension. Basically this means you do not need a margin at all; theoretically it makes more sense just to check whether there are two moves that fail high against beta. The next factor is that the null-window results of the singularity test degrade very quickly, like all null-window searches. Knowing that the null-window result against the best move is better than some margin is not worth very much in precision, because you use very shallow searches (typically half depth), and if these go deeper the lines become less accurate (the search 'degrades', so to speak, and should drop further below beta if it cannot reach it in the first place because of imprecise moves), while the hash result itself is not fixed, so you are testing against a moving target.

The larger margin, then, is there because you would still like to use this very imprecise null-window result even after it has degraded. But other than allowing you to use degraded null-window searches, I don't think the margin has a real meaning; its tuning depends more on how fast your null-window searches degrade in accuracy than on some actual distance to beta or, even more imprecise, the distance to the hash result of the stored beta cutoff.

Regards, Eelco
I'm not completely following your explanation about degraded nullwindow searches, but my own interpretation of it would be that if you reduce the search depth (for the singular test) by a percentage of depth remaining rather than by a constant number of plies, you need a larger margin due to the increased gap between singular depth and actual depth. Is that what you are saying? If so it sounds right.
What do you (and others reading this) think about the question of whether the margin should be from beta, from the hash score, or from some function of both?
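For concreteness, the two reference points the question asks about might look like this; both are purely illustrative, and neither is claimed to be what any particular engine does:

    // Margin measured from the stored hash score of the candidate move...
    int singular_threshold_from_tt(int tt_value, int depth)
    {
        return tt_value - 2 * depth;
    }

    // ...versus the same margin measured from beta at the current node.
    int singular_threshold_from_beta(int beta, int depth)
    {
        return beta - 2 * depth;
    }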