Bayesian Evaluation Functions

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Posts: 267
Joined: Thu Jan 22, 2015 2:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Bayesian Evaluation Functions

Post by jorose » Wed Feb 15, 2017 9:25 am

Disclaimer: I don't know if this has been tried and discussed by anyone before; it's a rather obvious idea, so that's not unlikely. That being said, I haven't found it anywhere, so I thought I'd post anyway.

Tapered evaluation is a feature that is generally considered indispensable for strong engines nowadays. It is logical (in the endgame I want my king in the center, but earlier in the game he'll get mated if he isn't tucked away in a corner) but at the same time rather arbitrary. Why are we linearly interpolating between opening (or middlegame) and endgame? Why only those two phases? Why are we using only material for our phase features, and why are we giving each piece the specified value?

I would like to suggest a more theoretically sound approach to this problem, based on a standard idea from machine learning. Instead of a tapered evaluation function, I recommend using conditional scores, with the final score being a weighted average derived from a Gaussian Mixture Model (GMM). A GMM interprets our data (in our case, a set of chess positions) as stemming from a number of Gaussian distributions, each with an associated weighting factor. Instead of calculating one score for the opening and one for the endgame, we calculate a score for each of the GMM's K mixtures and then take a weighted average, with the weights being the relative probability of the current position under each Gaussian times its weighting factor.
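A minimal sketch of this blended evaluation (function names are mine, and I use diagonal Gaussians for simplicity, not any particular engine's implementation):

```python
import math

def gaussian_density(x, mean, var):
    """Density of a diagonal Gaussian at feature vector x."""
    log_p = 0.0
    for xi, mu, v in zip(x, mean, var):
        log_p -= 0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
    return math.exp(log_p)

def mixture_eval(x, mix_weights, means, variances, mixture_scores):
    """Blend one evaluation score per mixture, weighted by the relative
    probability of the position under each Gaussian times its mixing weight."""
    dens = [w * gaussian_density(x, m, v)
            for w, m, v in zip(mix_weights, means, variances)]
    total = sum(dens)
    return sum(d / total * s for d, s in zip(dens, mixture_scores))
```

With two symmetric mixtures a position equidistant from both means simply averages the two scores; as the features drift toward one mixture, its score dominates smoothly.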

We could interpret chess as having just two kinds of positions, opening and endgame, as tapered evaluation does, but that is not actually the case. For example, chess players typically learn that in positions with opposite-colored bishops, material is less valuable. With a GMM we can have arbitrarily many different kinds of positions.

What the optimal number of mixtures is depends on our data, our choice of features and how much time we are willing to spend in the evaluation function at run time. The more mixtures we have, the more data (or tuning iterations) we need to tune our parameters, as each mixture has its own copy of the evaluation parameters. Depending on our choice of features, a GMM may or may not be able to recognize different types of positions; e.g. we have to explicitly compute whether there are opposite-colored bishops if we want the GMM to weight such positions differently. Finally, the more mixtures we have, the more scores we have to calculate for the weighted average. Since time is limited at run time, spending more time calculating a greater number of scores leaves less time for search.

In order to implement a GMM I recommend reading up on it properly, but here is a quick rundown. The standard way to fit a GMM is the expectation-maximization (EM) algorithm. Essentially we do the following:
  1. Load the data.
  2. Choose distinct random points to initialize the means of our Gaussians.
  3. Assign each data point a probability of belonging to each Gaussian. For example, we could start by assigning the Gaussian with the closest mean a probability of 100%.
  4. For each Gaussian, recalculate the mean and standard deviation from the weighted data points, and set its weighting factor to the sum of the weights of its data points.
  5. Recalculate the relative probability of each data point belonging to each weighted Gaussian.
  6. Repeat steps 4 and 5 until we are happy :)
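The loop above can be sketched as a toy 1-D EM (illustrative only: real engine features are multi-dimensional, and the numerical guards here are my own assumptions, not part of any particular implementation):

```python
import math
import random

def em_gmm_1d(data, k, iters=50):
    """Toy EM for a 1-D Gaussian mixture; returns (weights, means, variances)."""
    means = random.sample(data, k)        # step 2: distinct random initial means
    vars_ = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step (steps 3 and 5): probability of each Gaussian per data point
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, vars_)]
            s = sum(dens)
            # guard: fall back to uniform if every density underflowed to zero
            resp.append([d / s for d in dens] if s > 0 else [1.0 / k] * k)
        # M-step (step 4): re-estimate each Gaussian from its weighted points
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-9)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vars_[j] = max(sum(r[j] * (x - means[j]) ** 2
                               for r, x in zip(resp, data)) / nj, 1e-6)
            weights[j] = nj / len(data)
    return weights, means, vars_
```

The variance floor and the underflow fallback are the usual practical crutches; without them a mixture can collapse onto a single point and the responsibilities blow up.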
In my own current chess engine this approach is working rather well. I am currently using a GMM with 4 mixtures and 6 features to calculate the probability of a position belonging to each Gaussian. Those 6 features are: the sum of differences between the players' respective piece counts, the counts of knights, bishops, rooks and queens, and finally the distance between the two kings. One interesting detail is that std::exp became a bottleneck for me; this was solved by only calling the function once, at the end of the joint probability calculation for each Gaussian, and by using a single-precision float for the exponent. That being said, my engine is still in its baby stages and rather weak, so I have no idea whether this would be effective in any half-decent engine.
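The exp restructuring mentioned above can be sketched like this (the original is C++ with std::exp; this Python version just shows the same algebraic reshuffling, with made-up function names):

```python
import math

def joint_density_naive(x, mean, var):
    """One exp call per feature: the pattern that makes exp a bottleneck."""
    p = 1.0
    for xi, mu, v in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mu) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return p

def joint_density_one_exp(x, mean, var):
    """Accumulate the whole exponent (and log-normalizers) first,
    then call exp exactly once at the end of the joint calculation."""
    log_p = 0.0
    for xi, mu, v in zip(x, mean, var):
        log_p -= 0.5 * ((xi - mu) ** 2 / v + math.log(2.0 * math.pi * v))
    return math.exp(log_p)
```

Both return the same joint density; the second calls exp once per Gaussian instead of once per feature, which is where the speedup comes from.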

What do you guys think? Has this been recommended before? Perhaps someone could recommend a library, so that not everyone who wants to try this needs to implement a GMM on their own.

Posts: 3824
Joined: Fri Mar 10, 2006 4:23 am

Re: Bayesian Evaluation Functions

Post by jdart » Wed Feb 15, 2017 2:39 pm

I think it's a sound approach. However, it is known that piece relationships are distorted in the endgame. For example Rook + Knight vs Rook is a draw, so the value of the Knight there is effectively zero. If you take these cases out and treat them as special cases, then I am not sure how much you need this approach for the cases that remain.

Another way to approach this is to use different scaling factors for different evaluation features. Texel does this to some extent, for example for passed pawns. The scaling factors can be tuned like any other eval parameter.
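A trivial sketch of the idea (the feature names and scale values below are made up for illustration, not Texel's actual parameters):

```python
# Hypothetical per-feature scaling factors; in practice these would be
# tuned like any other evaluation parameter.
SCALE = {"material": 1.0, "passed_pawn": 1.3, "king_safety": 0.8}

def evaluate(terms):
    """terms: feature name -> raw centipawn score for that feature.
    Each feature's contribution is multiplied by its own scaling factor."""
    return sum(SCALE[name] * score for name, score in terms.items())
```

For example, evaluate({"material": 100, "passed_pawn": 10, "king_safety": -20}) blends the raw terms with their individual scales, giving roughly 97 centipawns here.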

