Probabilistic approach to optimize search tree

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

When we use the alpha-beta family of algorithms, in the best case we can reduce the node count to the square root of the node count of the exhaustive search tree.
That's good, as we know, but the main problem is fail-high nodes, where we are still going to examine all moves.

We all know that all modern chess programs use several approaches to skip at least a large part of those moves: the null-move heuristic, history pruning, futility pruning/razoring, etc.

The theoretically optimal way is to have:
1. An estimate of the error probability in the case of skipping this move
2. An estimate of the number of nodes skipped by the cutoff (N)
3. An estimate of the error probability if we do not search those N nodes (some evaluation of how the error margin depends on the number of nodes searched would be useful)

It's obvious that each of these values has to be estimated. The simplest way is to use some polynomial functions of history-table parameters, the number of pieces, depth, and so on (see the sketch below).

It can also be generalized to reductions/extensions.
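To make this concrete, here is a minimal sketch of such an estimator, assuming a logistic model over a handful of move features; the feature set and the weights are purely hypothetical placeholders, not tuned values:

Code: Select all

// Hypothetical sketch: estimate the probability that skipping a move is an
// error, using a logistic model over simple features. Features and weights
// are illustrative placeholders only.
#include <cmath>

struct MoveFeatures {
    double historyScore;   // normalized history-table score for the move
    int    pieceCount;     // number of pieces left on the board
    int    depth;          // remaining search depth
    bool   isCapture;
    bool   givesCheck;
};

// Weights would come from offline regression on logged search trees.
static const double W[6] = { -2.1, 0.03, 0.25, 1.4, 1.8, -0.5 /* bias */ };

double skipErrorProbability(const MoveFeatures& f) {
    double z = W[5]
             + W[0] * f.historyScore
             + W[1] * f.pieceCount
             + W[2] * f.depth
             + W[3] * (f.isCapture  ? 1.0 : 0.0)
             + W[4] * (f.givesCheck ? 1.0 : 0.0);
    return 1.0 / (1.0 + std::exp(-z));   // logistic link
}

// Prune only when the estimated risk is below some threshold, e.g.:
// if (skipErrorProbability(f) < 0.01) skipMove();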

Has anyone tried this approach?
The Force Be With You!
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

Really, is no one interested? :)
The Force Be With You!
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

I don't have much time to continue the experiments due to the huge amount of work in my business. That's why I want to publish some ideas that may be useful to the community. It would be really great to get some feedback if anyone wants to continue exploring in this direction :)

I think it's necessary to perform a wide statistical exploration of search trees to obtain some objective tools. Right now extensions and reductions are a sort of art, not science. The decisions to cut/extend/reduce should stand on the shoulders of statistical analysis :)

First, it's necessary to obtain a huge random sample of nodes from exhaustive alpha-beta search trees, built from different positions taken from real games. For each node we should record (see the sketch below):

1. Number of nodes in subtree
2. Whether a cutoff at this node would affect the root score
3. History/killer data for the move
4. Check flag
5. Mate threat flag
6. Move capture flag
7. Move promotion flag
8. Move check flag
9. SEE evaluation for the move
10. Move piece
11. Capture piece
12. Move history score
13. Move killer 1/2 flags
14. PSQ (piece-square) score of the move
15. Depth
16. Something else?

I think we need a sample of maybe 1–2 million nodes, randomly selected from real games.
Using some type of regression, we should then be able to build good functions to estimate the parameters from my first message.
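As a minimal sketch of the kind of per-node record I mean (all field names and the logging format are illustrative assumptions, not an existing engine's API):

Code: Select all

// Hypothetical per-node record matching the list above, plus a flat-file
// logger for later offline regression.
#include <cstdint>
#include <cstdio>

struct NodeRecord {
    uint64_t subtreeNodes;         // 1. number of nodes in the subtree
    bool     affectsRoot;          // 2. would a cutoff here change the root score?
    int      historyScore;         // 3./12. history-table score for the move
    bool     inCheck;              // 4. check flag (side to move in check)
    bool     mateThreat;           // 5. mate threat flag
    bool     isCapture;            // 6. move capture flag
    bool     isPromotion;          // 7. move promotion flag
    bool     givesCheck;           // 8. move check flag
    int      seeScore;             // 9. SEE evaluation of the move
    int      movedPiece;           // 10. move piece
    int      capturedPiece;        // 11. capture piece
    bool     isKiller1, isKiller2; // 13. killer 1/2 flags
    int      psqDelta;             // 14. PSQ score change of the move
    int      depth;                // 15. remaining depth
};

// Append one record to a flat file for later offline regression.
void logRecord(FILE* out, const NodeRecord& r) {
    std::fprintf(out, "%llu,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\n",
                 (unsigned long long)r.subtreeNodes, r.affectsRoot,
                 r.historyScore, r.inCheck, r.mateThreat, r.isCapture,
                 r.isPromotion, r.givesCheck, r.seeScore, r.movedPiece,
                 r.capturedPiece, r.isKiller1, r.isKiller2, r.psqDelta,
                 r.depth);
}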
The Force Be With You!
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Probabilistic approach to optimize search tree

Post by Daniel Shawul »

Sergei S. Markoff wrote: When we use the alpha-beta family of algorithms, in the best case we can reduce the node count to the square root of the node count of the exhaustive search tree. That's good, as we know, but the main problem is fail-high nodes, where we are still going to examine all moves.

We all know that all modern chess programs use several approaches to skip at least a large part of those moves: the null-move heuristic, history pruning, futility pruning/razoring, etc.

The theoretically optimal way is to have:
1. An estimate of the error probability in the case of skipping this move
2. An estimate of the number of nodes skipped by the cutoff (N)
3. An estimate of the error probability if we do not search those N nodes (some evaluation of how the error margin depends on the number of nodes searched would be useful)

It's obvious that each of these values has to be estimated. The simplest way is to use some polynomial functions of history-table parameters, the number of pieces, depth, and so on.

It can also be generalized to reductions/extensions.

Has anyone tried this approach?
There is a statistical method of search that is proven to converge to alpha-beta given infinite time. It is called UCT and is used successfully in Go. It keeps the top part of the tree in memory in order to expand it selectively. Alpha-beta is depth-first, so you are not going to revisit a node or keep statistics on it to measure the uncertainty of the decision, so I am not sure what you are trying to achieve.

If your goal is to build a general rule that can be used to measure the risk of skipping a move, it will be just a static decision that you could have developed through any means... UCT, on the other hand, keeps dynamic statistics for its pruning/expansion decisions, and it requires the tree to be kept in memory. You can be very selective with it and reach depths equivalent to what can be achieved using LMR.
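A minimal sketch of the UCT selection rule (UCB1), just to illustrate the kind of in-memory statistics I mean; a real implementation adds expansion, simulation and backpropagation, and all names here are illustrative:

Code: Select all

// Minimal UCT (UCB1) node and selection rule. Only the in-memory statistics
// and the exploration/exploitation trade-off are shown.
#include <cmath>
#include <vector>

struct UctNode {
    int visits = 0;
    double totalScore = 0.0;          // accumulated result of simulations
    std::vector<UctNode> children;

    double ucb1(int parentVisits, double c = 1.41) const {
        if (visits == 0) return 1e9;  // always try unvisited children first
        return totalScore / visits
             + c * std::sqrt(std::log((double)parentVisits) / visits);
    }
};

// Descend to the child with the best exploration/exploitation balance.
UctNode* selectChild(UctNode& node) {
    UctNode* best = nullptr;
    double bestValue = -1e18;
    for (UctNode& child : node.children) {
        double v = child.ucb1(node.visits);
        if (v > bestValue) { bestValue = v; best = &child; }
    }
    return best;
}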
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

Daniel Shawul wrote: There is a statistical method of search that is proven to converge to alpha-beta given infinite time. It is called UCT and is used successfully in Go.
Thanks, it's quite interesting.
Daniel Shawul wrote: It keeps the top part of the tree in memory in order to expand it selectively.
BTW, the TT is also a way of storing parts of the search tree. With some eviction schemes it will also retain the top part :)
Daniel Shawul wrote: Alpha-beta is depth-first, so you are not going to revisit a node or keep statistics on it to measure the uncertainty of the decision
Theoretically yes, but due to IID and the TT, in practice no.
Daniel Shawul wrote: If your goal is to build a general rule that can be used to measure the risk of skipping a move, it will be just a static decision that you could have developed through any means...
So such a rule should be developed. I have never seen any attempt to do it, except maybe the Rybka approach, but that looks very primitive :)
The Force Be With You!
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Probabilistic approach to optimize search tree

Post by zamar »

Sergei S. Markoff wrote: The theoretically optimal way is to have:
1. An estimate of the error probability in the case of skipping this move
2. An estimate of the number of nodes skipped by the cutoff (N)
3. An estimate of the error probability if we do not search those N nodes (some evaluation of how the error margin depends on the number of nodes searched would be useful)
I don't think it's that simple... The problem is that some cutoffs are much more valuable than others.

* The only really valuable cutoffs are those which change the best move at the root.

* Also valuable are those which affect the PV and thereby affect the evaluation of the best move.

* Most cutoffs are in fact quite worthless... We get a cutoff, but in the parent node the opponent just chooses a different move, and the cutoff amounts to nothing.

I've not collected any statistics, but considering how rarely the best move changes at the root and how many cutoffs we get in the subtree, >99.9% of cutoffs must be "worthless".

To really optimize the search tree, we'd need to be able to somehow classify cutoffs, minimizing the number of examined nodes while also minimizing the number of missed _valuable_ cutoffs; the sketch below shows one way to start measuring this.
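A rough sketch of the sort of counters I mean (purely illustrative; comparing root best-move changes against all fail-highs only bounds the fraction of potentially valuable cutoffs, it does not classify individual ones):

Code: Select all

// Instrumentation sketch: count every beta cutoff and every root best-move
// change, and report the ratio as a crude upper bound on "valuable" cutoffs.
#include <cstdint>
#include <cstdio>

struct CutoffStats {
    uint64_t betaCutoffs = 0;     // every fail-high anywhere in the tree
    uint64_t rootBestChanges = 0; // how often the root best move changed
};

static CutoffStats g_stats;

// Call at every fail-high in the search:
inline void onBetaCutoff() { ++g_stats.betaCutoffs; }

// Call at the root whenever a new best move replaces the old one:
inline void onRootBestChange() { ++g_stats.rootBestChanges; }

void printCutoffReport() {
    std::printf("cutoffs: %llu, root best-move changes: %llu (ratio %.6f%%)\n",
                (unsigned long long)g_stats.betaCutoffs,
                (unsigned long long)g_stats.rootBestChanges,
                g_stats.betaCutoffs
                    ? 100.0 * g_stats.rootBestChanges / g_stats.betaCutoffs
                    : 0.0);
}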

Easier said than done. That's probably why I haven't heard of anyone doing it successfully.

Just my two cents.
Joona Kiiski
Sergei S. Markoff
Posts: 227
Joined: Mon Sep 12, 2011 11:27 pm
Location: Moscow, Russia

Re: Probabilistic approach to optimize search tree

Post by Sergei S. Markoff »

zamar wrote: I don't think it's that simple... The problem is that some cutoffs are much more valuable than others.

* The only really valuable cutoffs are those which change the best move at the root.

* Also valuable are those which affect the PV and thereby affect the evaluation of the best move.

* Most cutoffs are in fact quite worthless... We get a cutoff, but in the parent node the opponent just chooses a different move, and the cutoff amounts to nothing.
See my list of node parameters to examine: one of them is whether a cutoff at the node affects the root score, so it looks to be nearly the same thing.
All three of these points can be summarized as one: root eval change. The probability of this event should be estimated, of course. And of course that's not easy, but chess AI itself is not an easy thing :) It's OK :)
The Force Be With You!
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Probabilistic approach to optimize search tree

Post by Sven »

zamar wrote:
Sergei S. Markoff wrote: The theoretically optimal way is to have:
1. An estimate of the error probability in the case of skipping this move
2. An estimate of the number of nodes skipped by the cutoff (N)
3. An estimate of the error probability if we do not search those N nodes (some evaluation of how the error margin depends on the number of nodes searched would be useful)
I don't think it's that simple... The problem is that some cutoffs are much more valuable than others.

* The only really valuable cutoffs are those which change the best move at the root.

* Also valuable are those which affect the PV and thereby affect the evaluation of the best move.

* Most cutoffs are in fact quite worthless... We get a cutoff, but in the parent node the opponent just chooses a different move, and the cutoff amounts to nothing.

I've not collected any statistics, but considering how rarely the best move changes at the root and how many cutoffs we get in the subtree, >99.9% of cutoffs must be "worthless".

To really optimize the search tree, we'd need to be able to somehow classify cutoffs, minimizing the number of examined nodes while also minimizing the number of missed _valuable_ cutoffs.

Easier said than done. That's probably why I haven't heard of anyone doing it successfully.

Just my two cents.
Can you explain what you mean by "a cutoff changes the best move at root", "a cutoff affects PV", and "a cutoff results in nothing"? Which kind of cutoffs are you thinking of - most probably not the usual "beta cutoffs"?

In my view, performing a cutoff *never* changes the best move at the root or the PV. That is inherent to a cutoff: you save searching more nodes based on the observation that searching those nodes would *not* have any effect on the PV, the best move, or the evaluation at the root.

Maybe we only have a different understanding of certain terms?

Sven
F. Bluemers
Posts: 868
Joined: Thu Mar 09, 2006 11:21 pm
Location: Nederland

Re: Probabilistic approach to optimize search tree

Post by F. Bluemers »

* Most cutoffs are in fact quite worthless... We get a cutoff, but in the parent node the opponent just chooses a different move, and the cutoff amounts to nothing.
He means fail-highs.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Probabilistic approach to optimize search tree

Post by Sven »

F. Bluemers wrote:
* Most cutoffs are in fact quite worthless... We get a cutoff, but in the parent node the opponent just chooses a different move, and the cutoff amounts to nothing.
He means fail-highs.
How is "fail-high" different from a cutoff? If searching a subtree runs onto a fail-high then the previous move of the opponent is refuted, I still don't see why that affects the root.

We must be using different wording, I'm quite sure about that, but I don't see where...

In my understanding you can only affect the root by improving the PV, which is not achieved by cutoffs or fail-highs, but mainly by improving alpha at a sibling of a PV node without exceeding beta. A small sketch below illustrates the distinction.
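To pin down the terminology, here is a minimal negamax fragment, with stub types standing in for a real engine's primitives (purely illustrative), showing the two distinct events: a beta cutoff (fail-high), which merely refutes the opponent's previous move, versus an alpha improvement, which is the only event that can propagate a new best move toward the root:

Code: Select all

#include <vector>

// Stub types standing in for a real engine's primitives (illustrative only).
struct Move {};
struct Position {
    void make(const Move&) {}
    void unmake(const Move&) {}
};
int evaluate(const Position&) { return 0; }
std::vector<Move> generateMoves(const Position&) { return {}; }

int search(Position& pos, int alpha, int beta, int depth) {
    if (depth == 0) return evaluate(pos);

    for (const Move& m : generateMoves(pos)) {
        pos.make(m);
        int score = -search(pos, -beta, -alpha, depth - 1);
        pos.unmake(m);

        if (score >= beta)
            return beta;   // beta cutoff / fail-high: refutes the opponent's
                           // previous move; nothing new propagates upward
        if (score > alpha)
            alpha = score; // alpha improved: a new best move here; only this
                           // event can change the PV and thus reach the root
    }
    return alpha;
}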

Sven