Re: How Do You Automatically Tune Your Evaluation Tables
Posted: Wed Jan 08, 2014 4:29 pm
A different approach is to try to make the evaluation function a good predictor of the game result, without playing games. (Of course I would play many games after tuning to verify the new weights are better than the previous ones.)
The basic mechanism would be something like this:
(1) Get a database of [~1 million?] positions with associated results.
(2) From each position, run a quiescence search and extract the leaf position whose static evaluation the search returned; this produces a new database of quiescent positions with associated results.
(3) Define the probability of winning as sigmoid(C*evaluation), where sigmoid(x):=1/(1+exp(-x)) and the constant C is chosen so the evaluation has the usual scale in pawns (I got C=0.58 or something like that, but I am quoting from memory).
(4) Use non-linear regression to estimate the parameters of the evaluation function that maximize the [log-]likelihood of the results.
One needs to do something to handle draws, but probably treating them as half a victory and half a loss would be fine.
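To make steps (3) and (4) concrete, here is a rough sketch of the quantity being maximized, with draws counted as half a win and half a loss. The function names are made up for illustration, the evaluation is assumed to be in pawns from the point of view of the side whose result is recorded, and C=0.58 is only the from-memory value quoted above, so treat it as a placeholder.

```python
import math

C = 0.58  # placeholder: the value quoted from memory in the post

def sigmoid(x):
    # Numerically stable form of 1/(1+exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def win_probability(eval_pawns, c=C):
    # Step (3): map an evaluation in pawns to a probability of winning.
    return sigmoid(c * eval_pawns)

def log_likelihood(evals, results, c=C):
    # results: 1.0 = win, 0.5 = draw, 0.0 = loss.
    # A draw contributes half the log-probability of a win and half
    # the log-probability of a loss.
    total = 0.0
    for e, r in zip(evals, results):
        p = win_probability(e, c)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += r * math.log(p) + (1.0 - r) * math.log(1.0 - p)
    return total
```

Step (4) then amounts to searching for the evaluation parameters that make `log_likelihood` as large as possible over the quiescent-position database.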
Notice that if your evaluation function is a linear combination of terms and you are trying to figure out the coefficients, step (4) is logistic regression.
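For the linear case, a minimal sketch of the fit might look like the following. It assumes each quiescent position has already been turned into a feature vector (e.g. material differences), that the evaluation is the dot product of the weights with that vector, and that results are coded 1 / 0.5 / 0. The name `fit_weights`, the learning rate, and the epoch count are my own choices for illustration, and plain batch gradient ascent stands in for whatever optimizer one would really use.

```python
import math

def sigmoid(x):
    # Numerically stable form of 1/(1+exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def fit_weights(features, results, c=0.58, lr=0.5, epochs=1000):
    """Estimate linear evaluation weights by gradient ascent on the
    log-likelihood.

    features: list of feature vectors, one per quiescent position
    results:  1.0 = win, 0.5 = draw, 0.0 = loss
    c:        scale constant (0.58 is the from-memory value in the post)
    lr, epochs: hypothetical tuning knobs, not taken from the post
    """
    n = len(features[0])
    m = len(features)
    w = [0.0] * n
    for _ in range(epochs):
        grad = [0.0] * n
        for f, r in zip(features, results):
            evaluation = sum(wi * fi for wi, fi in zip(w, f))
            p = sigmoid(c * evaluation)
            # Gradient of r*log(p) + (1-r)*log(1-p) w.r.t. w_i is c*(r-p)*f_i.
            for i in range(n):
                grad[i] += c * (r - p) * f[i]
        # Step along the average gradient.
        w = [wi + lr * g / m for wi, g in zip(w, grad)]
    return w
```

On a real database one would probably hand this to an off-the-shelf logistic regression or a quasi-Newton optimizer instead; the loop is only meant to show that the gradient is simple when the evaluation is linear.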
I have only done small-scale tests with this idea, but the Junior team seems to have used it extensively, as described in this paper: http://www.ratio.huji.ac.il/node/2362 . They seem to handle draws in a complicated way, but other than that I think their ideas are similar to mine (I haven't read the paper in a while).