A new algorithm accounting for the uncertainty in eval funcs

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: A new algorithm accounting for the uncertainty in eval f

Post by Karlo Bala »

tholden wrote:Dear all,

The document below details a new algorithm for solving games such as chess which I'd love to persuade someone to implement. Even if you've no intention of implementing it, I would still love to hear your comments. I'm not naive enough to think it will beat cutting edge algorithms, but it will at least play differently, arguably in a more "human-like" manner, in which there is considerable value.

Before you dismiss me as another nut-job, let me say by way of defence that I am a professional academic economist, so I understand the issues around decisions under uncertainty well. I also used to be a professional computer games programmer for EA, so I have some experience of practical implementation issues. I have implemented basic alpha-beta in the past, so I'm aware of the "standard" approach to these problems as well.

http://www.tholden.org/wp-content/uploa ... -Chess.pdf

Any comments would be much appreciated,

Best regards,

Tom
I'm curious. Suppose you have, for example, a subtree with 50 possible moves, each with a probability of victory of about 5%. Before playing a move, the probability that at least one of them leads to victory is around 92%. After playing the move, the probability of victory falls to 5%. In a second subtree there is only one move, with a probability of victory of around 50%. Which subtree should be chosen?
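For reference, the 92% figure is just the independence calculation applied to those numbers; a quick check in Python:

```python
p_each = 0.05                       # each of the 50 moves wins with probability 5%
p_subtree = 1 - (1 - p_each) ** 50  # P(at least one win) under independence
print(round(p_subtree, 3))          # -> 0.923
```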
Best Regards,
Karlo Balla Jr.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

AlvaroBegue wrote:What I am disagreeing with is that it's a good idea to give a 95% winning probability to a position where whatever you do you only have a 1% probability of winning. Chances are the position is too desperate and what you do next is irrelevant.
OK. So in my language, you are disputing that the errors in the evaluation functions at all those child nodes are uncorrelated. This seems to be the assumption made by Baum and Smith as well, but I'm happy to defer to your wisdom that it is inappropriate. Having thought about it, I now see that there is a less ad hoc way of dealing with it.

Step back to the point at which we estimate the evaluation function. This is a multinomial logit applied to panel data, where the N dimension indexes games and the T dimension indexes moves within a game. (Consequently, in practice, one would want to allow for random effects to capture heterogeneity in players at a particular time, and possibly also player fixed effects.) Now, the errors in this panel regression will be correlated in time. If it is "surprising" to the evaluation function that a position ends in a win, then the chances are it will still be "surprising" after one more move is made. Thus lagged errors ought to have significant predictive power, and the extent of this can be measured.
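A minimal sketch of the kind of two-pass estimation this describes, assuming a simplified binary win/loss target rather than the full multinomial model with random effects; the `positions.csv` file and the column names are hypothetical:

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical panel: one row per (game, move); 'result' = 1 if the side to
# move eventually won.  File and column names are illustrative only.
df = pd.read_csv("positions.csv")      # columns: game_id, move_no, f1..f3, result
features = ["f1", "f2", "f3"]

# First pass: plain logit of the final result on the position features.
X = sm.add_constant(df[features])
first = sm.Logit(df["result"], X).fit(disp=False)

# "Surprise" for each position, then its lag within the same game.
df["resid"] = df["result"] - first.predict(X)
df["lag_resid"] = df.groupby("game_id")["resid"].shift(1).fillna(0.0)

# Second pass: the lagged surprise enters as an extra regressor.  A clearly
# positive coefficient on lag_resid is the serial correlation described above.
X2 = sm.add_constant(df[features + ["lag_resid"]])
second = sm.Logit(df["result"], X2).fit(disp=False)
print(second.summary())
```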

Now, returning to play, all of the child nodes of any parent node will share the same lagged error. Provided that lagged error is still viewed as unknown (i.e. the uncertainty has not been removed by evaluation to greater depth), this introduces substantial correlation in the outcomes of all of the child nodes. Thus the probability that one of them is a win in your example may be much closer to 1% than to 95%. Obviously there are some technical statistical details to be worked out here, but the idea is hopefully reasonably clear.
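To see how a shared parent-level error changes the numbers, here is a small Monte Carlo under an illustrative one-factor probit model; the shared term stands in for the unknown lagged error, and the particular parameterisation is mine, not from the post:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def p_at_least_one_win(n_children=50, p_marginal=0.05,
                       shared_sd=0.0, idio_sd=1.0, n_sims=200_000):
    # Each child "wins" if a latent score (shared parent error + idiosyncratic
    # error) exceeds a threshold chosen so that the marginal win probability
    # of every individual child is p_marginal.
    total_sd = np.hypot(shared_sd, idio_sd)
    thresh = norm.ppf(1.0 - p_marginal) * total_sd
    shared = rng.normal(0.0, shared_sd, size=(n_sims, 1))
    idio = rng.normal(0.0, idio_sd, size=(n_sims, n_children))
    wins = (shared + idio) > thresh
    return wins.any(axis=1).mean()

print(p_at_least_one_win(shared_sd=0.0))   # independent children: ~0.92
print(p_at_least_one_win(shared_sd=2.0))   # strong shared error: well below 0.92
```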

As ever, this is still making an approximating assumption, which you might reasonably disagree with. Here we are implicitly assuming that the correlation between the errors in the evaluation functions following moves which are played is the same as the correlation between the errors following moves which are not played. One would in fact expect the correlation to be much lower between the moves that are never played (and hence never observed in the data). Maybe a richer probabilistic model could capture this as well; it is something I would think about were someone to take up the challenge of implementing this. Alternatively, you could just take a weighted average of the probabilities coming from ignoring serial correlation and those coming from taking it into account.
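The weighted-average hedge at the end is trivial to write down; the weight here is purely illustrative and would have to be tuned or estimated:

```python
def blended_win_prob(p_independent, p_correlated, w=0.5):
    # w = 1 recovers the pure independence calculation, w = 0 the full
    # serial-correlation model; w = 0.5 is an arbitrary illustrative choice.
    return w * p_independent + (1 - w) * p_correlated
```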
AlvaroBegue wrote: When you then compare moves that have been explored to different depths, the scores will tell you more about the parity of the subtree than about the position.
This is an interesting point. However, the fix above should deal with it, and even if it did not, the weak dominance heuristic would ensure that all subtrees were evaluated to the same parity if this were really an issue.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

The underlying idea is that once you choose the first sub-tree, you'll evaluate its children to greater depth, and if each event was truly independent, you would have a 92% chance of finding one which was a definite win. But of course the events may not be independent, and in that case a more sophisticated approach as discussed in my previous post is necessary.

I know it is counter-intuitive, but the only way the logic can be wrong is if there is this non-independence. The method given in the previous post will tend to overestimate the correlation between events, meaning that with a weighted average of the two approaches we could come close to the true probabilities.
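A toy simulation of the disagreement (my own illustration, not from either post): if deeper search can reliably identify which of the 50 children actually wins, choosing the first subtree pays off about 92% of the time; if the 5% estimates never sharpen, playing into that subtree is worth only about 5%:

```python
import numpy as np

rng = np.random.default_rng(1)
n_games, n_moves = 100_000, 50

# Subtree A: 50 moves, each an independent 5% win.  Subtree B: one 50% move.
wins_a = rng.random((n_games, n_moves)) < 0.05
wins_b = rng.random(n_games) < 0.50

resolved = wins_a.any(axis=1).mean()   # deeper search finds the winning child: ~0.92
unresolved = wins_a[:, 0].mean()       # no extra information, pick a child blindly: ~0.05
print(resolved, unresolved, wins_b.mean())  # ~0.92  ~0.05  ~0.50
```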
Karlo Bala
Posts: 373
Joined: Wed Mar 22, 2006 10:17 am
Location: Novi Sad, Serbia
Full name: Karlo Balla

Re: A new algorithm accounting for the uncertainty in eval f

Post by Karlo Bala »

tholden wrote:The underlying idea is that once you choose the first sub-tree, you'll evaluate its children to greater depth, and if each event was truly independent, you would have a 92% chance of finding one which was a definite win. But of course the events may not be independent, and in that case a more sophisticated approach as discussed in my previous post is necessary.

I know it is counter-intuitive, but the only way the logic can be wrong is if there is this non-independence. The method given in the previous post will tend to overestimate the correlation between events, meaning that with a weighted average of the two approaches we could come close to the true probabilities.
That's the problem. The algorithm will choose the wrong subtree! Since the probability of victory in the subtree is 92% while the probability of victory of each individual move is only 5%, after playing a move the probability of victory drops to 5%. The problem is that even if the moves are independent, it is very difficult to choose the move that leads to victory, no matter how deep the algorithm searches. Making a move (if it is not a blunder) should not change the evaluation so much.
Best Regards,
Karlo Balla Jr.
tholden
Posts: 15
Joined: Thu Oct 23, 2014 1:46 pm

Re: A new algorithm accounting for the uncertainty in eval f

Post by tholden »

If the events that each move is a win are not independent of one another, then the evaluated probability of a win from the sub-tree will be far lower than 92%, so the algorithm will not choose that sub-tree in the first place. Not being able to find the good move even after deep evaluation of each possibility is indicative of precisely such non-independence.

I'm confident that the serial correlation approach is tractable, and will deliver reasonable probabilities in these circumstances.
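As a quick numerical check of the claim that non-independence pulls the subtree probability well below 92%, here is the same one-factor latent model as in the Monte Carlo above, evaluated by numerical integration for a range of correlations; again the parameterisation is my own illustration:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def subtree_win_prob(n=50, p=0.05, rho=0.0):
    # P(at least one of n children wins) when the children's latent scores
    # share an equicorrelation rho (one-factor Gaussian model).
    t = norm.ppf(1.0 - p)
    def integrand(u):
        cond_no_win = norm.cdf((t - np.sqrt(rho) * u) / np.sqrt(1.0 - rho))
        return (1.0 - cond_no_win ** n) * norm.pdf(u)
    return quad(integrand, -8, 8)[0]

for rho in (0.0, 0.3, 0.6, 0.9):
    print(rho, round(subtree_win_prob(rho=rho), 3))
# rho = 0 reproduces ~0.923; as rho -> 1 the value falls toward the single-move 0.05
```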