Probabilistic approach to optimize search tree

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

Sven Schüle wrote: Can you explain what you mean by "a cutoff changes the best move at root", "a cutoff affects PV", and "a cutoff results in nothing"? Which kind of cutoffs are you thinking of - most probably not the usual "beta cutoffs"?

In my view, performing a cutoff *never* changes the best move at root or the PV. That is somehow inherent to a cutoff: you save searching more nodes based on the observation that searching these nodes would *not* have any effect on the PV, best move and evaluation at root.

Maybe we only have a different understanding of certain terms?
Sven
It's just a terminological thing. There are "safe cutoffs" – like the usual alpha-beta cutoffs. And "unsafe" ones – futility pruning, razoring, null-move pruning, forward pruning and so on. For example, most modern engines cut off all "quiet" moves in the bottom part of the search tree if we have already lost some material. If the opponent has won a queen, it's usually pointless to shuffle pieces without checks, captures and promotions – everything is probably already lost, so it's better to skip these subtrees and save the time for more promising ones.

Of course unsafe cutoffs can change the root eval/move – if you cut a move that would be in the PV of an exhaustive search, or the refutation of the PV of a non-exhaustive search.
The Force Be With You!
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Probabilistic approach to optimize search tree

Post by zamar »

Sven Schüle wrote:In my view, performing a cutoff *never* changes the best move at root or the PV. That is somehow inherent to a cutoff: you save searching more nodes based on the observation that searching these nodes would *not* have any effect on the PV, best move and evaluation at root.

Maybe we only have a different understanding of certain terms?

Sven
Okay Sven,

I'll explain a bit:

As you know, modern chess engines spend most of their time in null-window search, where the tree looks pretty much like this:

- first move fails high (cutoff) [depth = 1]
- All moves fail low [depth = 2]
- First move fails high (cutoff) [depth = 3]
- All moves fail low [depth = 4]
... etc ...

A typical, perfectly ordered tree. Now, what I think this thread is all about are the "unexpected cutoffs": nodes where all moves are supposed to fail low, but one of them doesn't.

We'd like to (and all modern engines do) reduce and prune at the end of the move list to gain speed. But the risk is that we might miss some of those unexpected cutoffs. Some of these don't matter, but some are really valuable and propagate to root and change the assessment of the position.
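
Schematically the danger zone looks like this. Just a generic zero-window + late-move-reduction sketch in C++ with invented margins, where the usual engine scaffolding (Position, Move, qsearch(), generate_ordered_moves()) is assumed to exist; it is not any particular engine's code:

Code:

// Zero-window search with late move reductions, to show where an
// "unexpected cutoff" can be missed.  Position, Move, qsearch() and
// generate_ordered_moves() are assumed engine scaffolding; the reduction
// formula and margins are invented for illustration only.
int search(Position& pos, int beta, int depth)   // window is (beta - 1, beta)
{
    if (depth <= 0)
        return qsearch(pos, beta);

    int moveCount = 0;
    for (Move m : generate_ordered_moves(pos)) {
        ++moveCount;
        pos.do_move(m);

        // Late moves get a reduced depth.  If such a move would in fact
        // have failed high at full depth, the reduced search may fail low
        // instead -- that is the "missed unexpected cutoff".
        int newDepth = depth - 1;
        if (moveCount > 3 && depth >= 3 && !pos.in_check())
            newDepth -= 1 + moveCount / 8;       // invented reduction formula

        int score = -search(pos, 1 - beta, newDepth);

        // Verify an unexpected fail high of a reduced move at full depth.
        if (score >= beta && newDepth < depth - 1)
            score = -search(pos, 1 - beta, depth - 1);

        pos.undo_move(m);

        if (score >= beta)
            return score;                        // fail high (cutoff)
    }
    return beta - 1;                             // all moves failed low
}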

Now if I understood correctly Sergei was suggesting some statistical methods to prune as much as possible and minimize the risk of missing valuable unexpected cutoffs.

I've thought about this a lot previously and pointed out that this would be very difficult.

Please feel free to correct me if I'm wrong.
Joona Kiiski
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

zamar wrote:Please feel free to correct me if I'm wrong.
You are right. Difficult doesn't mean impossible :)
When we're using history pruning/reducing, our decision to cut off is based on the history value. In Rybka and its clones it works like "if eval + history_score < alpha → cutoff". In my SmarThink in the early 2000s I used: "if we're in a quiet node and this is a quiet move, and we have already searched the killers and the 2 best history moves → make a reduced-depth search, and if it returns a score below alpha → just skip the move".
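
To make both schemes concrete – a rough sketch only, not the actual Rybka or SmarThink code; history_score, the 3-ply reduction and the zero-window search() routine (score >= beta means fail high) are all placeholders:

Code:

// Scheme 1 (Rybka-style): purely static decision from eval + history.
bool prune_by_history(int eval, int historyScore, int alpha)
{
    return eval + historyScore < alpha;           // skip the quiet move outright
}

// Scheme 2 (early SmarThink-style): cheap verification search.  In a quiet
// node, for a quiet move that is not a killer and not one of the two best
// history moves, probe with a reduced-depth zero-window search first and
// skip the move if even that cannot get above alpha.
// search(pos, beta, depth) is some zero-window routine assumed to exist.
bool prune_by_verification(Position& pos, Move m, int alpha, int depth)
{
    pos.do_move(m);
    int probe = -search(pos, -alpha, depth - 3);  // reduced-depth probe (made-up reduction)
    pos.undo_move(m);
    return probe <= alpha;                        // cannot reach alpha -> just skip the move
}
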
Both of these approaches work, but there are a lot of parameters to determine. In my scheme – the reduction amount (should it depend on the move order? on the eval? on the eval change caused by the move?). For example, in SmarThink I successfully used an "optimistic eval" – an eval that counts only the attacking terms of one side: pawn advances, king attack. If material is lost but our "optimism" is growing, we should still explore this subtree, and so on, and so on. There are a lot of options and ideas. But all of them should be statistically validated. Otherwise a lot of brilliant ideas can end up buried under tons of dust) And, conversely, some common techniques could turn out to be worthless.
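
The "optimistic eval" part could look roughly like this – the terms and the veto rule are invented for illustration, not SmarThink's actual code:

Code:

// "Optimistic eval": score only the aggressive terms of the side to move
// (pawn advances, king attack, ...) and ignore everything else.
// pawn_storm_bonus() and king_attack_bonus() are hypothetical helpers.
int optimistic_eval(const Position& pos)
{
    int score = 0;
    score += pawn_storm_bonus(pos);     // advanced pawns near the enemy king
    score += king_attack_bonus(pos);    // attackers and open lines around the king
    return score;                       // deliberately one-sided and optimistic
}

// Pruning veto: even if material is already lost, keep searching subtrees
// where our "optimism" has grown compared to a few plies earlier.
bool allow_pruning(const Position& pos, int optimismBefore)
{
    return optimistic_eval(pos) <= optimismBefore;  // no growing attack -> pruning allowed
}
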
The Force Be With You!
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Probabilistic approach to optimize search tree

Post by Sven »

zamar wrote:
Sven Schüle wrote:In my view, performing a cutoff *never* changes the best move at root or the PV. That is somehow inherent to a cutoff: you save searching more nodes based on the observation that searching these nodes would *not* have any effect on the PV, best move and evaluation at root.

Maybe we only have a different understanding of certain terms?

Sven
Okay Sven,

I'll explain a bit:

As you know, modern chess engines spend most of their time in null-window search, where the tree looks pretty much like this:

- first move fails high (cutoff) [depth = 1]
- All moves fail low [depth = 2]
- First move fails high (cutoff) [depth = 3]
- All moves fail low [depth = 4]
... etc ...

A typical, perfectly ordered tree. Now, what I think this thread is all about are the "unexpected cutoffs": nodes where all moves are supposed to fail low, but one of them doesn't.

We'd like to (and all modern engines do) reduce and prune at the end of the move list to gain speed. But the risk is that we might miss some of those unexpected cutoffs. Some of these don't matter, but some are really valuable and propagate to root and change the assessment of the position.

Now if I understood correctly Sergei was suggesting some statistical methods to prune as much as possible and minimize the risk of missing valuable unexpected cutoffs.

I've thought about this a lot previously and pointed out that this would be very difficult.

Please feel free to correct me if I'm wrong.
The biggest problem I had was the sentence "The only really valuable cutoffs are those which change bestmove at root.", where the word "change" was not clear enough for me. Now I know what was meant. So the effect on the root node results from comparing two possible decisions at a given node: A) I prune (and risk handling the current node as a "fail low" even though it might in fact have become a "fail high" without pruning), or B) I don't prune. One decision may change the outcome at the root compared to the other. My initial thought was about how a single cutoff could change the best move at root during the search, which it can't, but that was not the intended viewpoint.

Thanks for enlightening me in this interpretation issue :-)

Sven
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: Probabilistic approach to optimize search tree

Post by diep »

Sergei S. Markoff wrote:When we're using the alpha-beta family, in the best case we can reduce the node count to the square root of the node count of the exhaustive search tree.
Practically speaking this is wrong, though theoretically speaking professor Bob has an argument there.

The proof doesn't take hash tables into account, for example. At that point it's already far under the square root.

This means that a number of moves fall away, so we always end up under the square root, even in a full-width search.

Furthermore we have lossy algorithms such as nullmove which really reduce our search a lot. Now you can say this is 'just shortening search depth', yet the effective impact is that we generate a cutoff in say 70%+ of the positions.
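
For reference, the standard recursive null-move scheme looks like this – a generic sketch, not Diep's code; R = 3 and the guard conditions are the usual textbook choices, and qsearch()/search_moves() are assumed to exist elsewhere:

Code:

// Generic recursive null-move pruning sketch.  The side to move passes; if
// a reduced-depth zero-window search still fails high, the node is cut off
// without trying a single real move.
const int R = 3;                                  // textbook reduction, not a recommendation

int search_with_nullmove(Position& pos, int beta, int depth)
{
    if (depth <= 0)
        return qsearch(pos, beta);

    if (!pos.in_check() && pos.has_non_pawn_material() && depth > R) {
        pos.do_null_move();
        int score = -search_with_nullmove(pos, 1 - beta, depth - 1 - R);
        pos.undo_null_move();
        if (score >= beta)
            return score;                         // null-move cutoff (the common case
                                                  // at fail-high nodes)
    }

    return search_moves(pos, beta, depth);        // normal move loop, defined elsewhere
}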


Well, that's good, we know, but the main problem is fail-high nodes, where we are still going to examine all moves.
If our only knowledge is that a position is a lower bound (fail high), then we know that on average we searched under 1 move, as we got a fail high right away – usually from the null move.

If you count the null move as the first move, rather than as move number 0, then we get a cutoff after visiting on average under 2.0 moves.
We all know that in all modern chess programs we're using several approaches to skip at least a huge part of those moves: zero-move heuristic, history pruning, futility pruning/razoring etc.

The theoretically optimal way is to have:
1. An estimate of the error probability when skipping this move
2. An estimate of the number of nodes skipped by the cutoff (N)
3. An estimate of the error probability if we do not search N more nodes (some estimate of the error margin as a function of nodes searched should be useful).

It's obvious that each of these values has to be estimated somehow. The simplest way is to use some polynomial functions of history table parameters, number of pieces, depth and so on.

It can also be generalized to reductions/extensions.

Anyone tried this approach?
I can't make anything of what you write here. But you might want to investigate probability cutoff, also known as ProbCut; there are some articles from the 90s about it.
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

diep wrote:
Sergei S. Markoff wrote:When we're using the alpha-beta family, in the best case we can reduce the node count to the square root of the node count of the exhaustive search tree.
Practically speaking this is wrong, though theoretically speaking professor Bob has an argument there.

The proof doesn't take hash tables into account, for example. At that point it's already far under the square root.

This means that a number of moves fall away, so we always end up under the square root, even in a full-width search.
But the size of the hashtable is fixed and much smaller than the number of chess positions. So if the search is deep enough, the influence of the hashtable should become negligibly small.
I think that the square root is a good estimate, but of course there are a lot of things it doesn't take into account: quiescence specifics, extensions and so on.
diep wrote:Furthermore we have lossy algorithms such as nullmove which really reduce our search a lot. Now you can say this is 'just shortening search depth', yet the effective impact is that we generate a cutoff in say 70%+ of the positions.
Yes, of course, null move does a lot. I think it's not too hard to estimate its average influence, but that's more of a theoretical question) Here I've tried to discuss some more practical things)
diep wrote:I can't make anything of what you write here. But you might want to investigate probability cutoff, also known as ProbCut; there are some articles from the 90s about it.
Yes, I've read about it.

Now, one of the particularly interesting questions is: should we distinguish nodes by their distance to the PV? We all know the common idea used in Fruit: no reductions/zero-move cutoffs in PV nodes. But what about nodes close to the PV? For example, the direct non-PV siblings of a PV node?

It's interesting to analyze common scenarios of miscalculations in search trees. It looks like a good idea to extend PV nodes in some way (a lot of modern engines extend more in the PV), because the PV line could be forced and we may finally reach some refutation. Conversely, in non-PV nodes we reach some depth «horizon» at which all false refutations are refuted by some supposedly irrefutable attacking plan :)
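
One purely hypothetical way to make the «distance to PV» idea concrete is to pass down the number of plies since the path left the PV and let the reduction depend on it – no engine is claimed to do exactly this:

Code:

#include <algorithm>

// pvDistance: 0 in PV nodes, 1 for their direct non-PV siblings, and so on;
// the caller would increase it each time the path leaves the PV.
// The base formula and the "reduce less near the PV" adjustment are invented.
int reduction(int depth, int moveCount, int pvDistance)
{
    if (pvDistance == 0)                    // PV node: no reduction (Fruit-style)
        return 0;
    int r = 1 + moveCount / 8;              // invented base formula
    if (pvDistance == 1)                    // direct sibling of a PV node: reduce less
        r = std::max(0, r - 1);
    return std::min(r, depth - 1);
}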

So, that's why it's important to perform a statistical analysis of search trees — it could help us to create optimal chess-specific extension/reduction/cutoff rules.
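
As a toy illustration of what such statistically derived rules could look like in code – every feature, coefficient and threshold below is a placeholder that would have to be fitted from real search-tree data, nothing here is a working recipe:

Code:

#include <cmath>

struct PruneFeatures {
    int historyScore;   // from the history table
    int pieceCount;     // material left on the board
    int depth;          // remaining depth
    int evalMargin;     // alpha minus static eval
};

// Estimated probability that skipping this move changes the root result.
// A made-up logistic model: good history and more remaining depth raise the
// risk, a large eval deficit lowers it.
double error_probability(const PruneFeatures& f)
{
    double x = 0.01 * f.historyScore + 0.10 * f.depth
             - 0.02 * f.evalMargin - 0.01 * f.pieceCount;
    return 1.0 / (1.0 + std::exp(-x));
}

// Rough estimate of the nodes saved by skipping this subtree.
double estimated_subtree_size(const PruneFeatures& f)
{
    return std::pow(2.0, f.depth);          // toy branching model
}

bool should_prune(const PruneFeatures& f)
{
    // Prune when the expected damage per node saved is small enough.
    return error_probability(f) / estimated_subtree_size(f) < 1e-6;
}
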
The Force Be With You!