Ab-initio evaluation tuning
Posted: Wed Aug 30, 2017 2:18 pm
I've been writing a new evaluation function from scratch, and one of the things I've been looking into is tuning it from the beginning. Right now it only has a material evaluation (a general quadratic function, so that it can handle material imbalances), and I'm trying to tune the piece values. The results confuse me a bit, however.
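A quadratic material evaluation of this kind can be sketched as follows. This is only an illustration: the exact form (a linear part plus a weight per pair of pieces) and the names material_score, LINEAR_MG and QUADRATIC_MG are assumptions, not the engine's actual code. The numbers are the tuned MG values from the table in this post.

```python
PIECES = ("P", "N", "B", "R", "Q")

def material_score(counts, linear, quadratic):
    """Material score for one side, in pawn units.

    counts:    dict piece -> number of pieces of that type
    linear:    dict piece -> piece value
    quadratic: dict (piece, piece) -> weight per pair of pieces
    """
    score = sum(linear[p] * counts.get(p, 0) for p in PIECES)
    for (a, b), weight in quadratic.items():
        if a == b:
            # Number of distinct pairs of the same piece type.
            pairs = counts.get(a, 0) * (counts.get(a, 0) - 1) // 2
        else:
            # Every piece of type a paired with every piece of type b.
            pairs = counts.get(a, 0) * counts.get(b, 0)
        score += weight * pairs
    return score

# Tuned MG values from the post; BB is the only quadratic term tuned so far.
LINEAR_MG = {"P": 0.96, "N": 3.46, "B": 3.37, "R": 4.40, "Q": 8.79}
QUADRATIC_MG = {("B", "B"): 0.17}

one_bishop = material_score({"B": 1}, LINEAR_MG, QUADRATIC_MG)
two_bishops = material_score({"B": 2}, LINEAR_MG, QUADRATIC_MG)
print(one_bishop, two_bishops)
```

With this form a single bishop scores just its piece value, and the bishop pair bonus only appears once a side has two bishops, which is why it can be treated as a quadratic term.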
Some details:
1. By "tune" I mean that I pass the evaluation through a logistic function that maps it to an expected game result (0...1, with 0 = black wins, 1 = white wins, 0.5 = draw). The idea is pretty standard. I use f(x) = 1/(1+exp(-k*x)), with x the evaluation in pawn units and k a scaling constant (which happens to be 1 according to a Monte Carlo best-parameter estimate).
2. I use the test position set by Alexandru Mosoi (http://talkchess.com/forum/viewtopic.php?p=686204).
3. To fit the data, I use stochastic gradient descent with 1000 positions in each gradient estimate. I haven't tried anything fancier, but I did first try the Simplex algorithm from GSL as an alternative. That works OK too.
4. In the evaluation, I have fixed the value of a pawn in the endgame (VALUE_P_EG) at 256. This fixes the scale of the evaluation, which is otherwise arbitrary.
5. During tuning, evaluation parameters are treated as double-precision floating point numbers.
6. I started with 11 parameters to tune: piece values for N, B, R, Q in MG and EG, pawn value in MG and bishop pair bonus for MG and EG (this is in fact just the quadratic term in the material evaluation).
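The setup in points 1 to 5 can be sketched roughly as below. This is a simplified illustration under several assumptions that the post does not state: a squared-error loss, an evaluation that is linear in the piece values (no MG/EG split, no quadratic terms), and synthetic positions generated from made-up "true" values. The names logistic, evaluate, mean_squared_error and sgd are hypothetical.

```python
import math
import random

def logistic(x, k=1.0):
    """Map an evaluation x (pawn units) to an expected result in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k * x))

def evaluate(theta, features):
    """Linear material evaluation; features are white-minus-black piece counts."""
    return sum(t * f for t, f in zip(theta, features))

def mean_squared_error(theta, data, k=1.0):
    """Mean squared difference between predicted and actual results."""
    return sum((logistic(evaluate(theta, f), k) - r) ** 2 for f, r in data) / len(data)

def sgd(theta, data, fixed, k=1.0, rate=0.5, batch_size=1000, steps=300, seed=1):
    """Mini-batch SGD; indices in `fixed` are held constant to pin the scale."""
    rng = random.Random(seed)
    theta = [float(t) for t in theta]  # parameters are doubles during tuning
    for _ in range(steps):
        batch = rng.choices(data, k=batch_size)  # sample with replacement
        grad = [0.0] * len(theta)
        for features, result in batch:
            p = logistic(evaluate(theta, features), k)
            # d/dx of (p - result)^2, where dp/dx = k * p * (1 - p)
            common = 2.0 * (p - result) * k * p * (1.0 - p)
            for i, f in enumerate(features):
                grad[i] += common * f
        for i in range(len(theta)):
            if i not in fixed:
                theta[i] -= rate * grad[i] / batch_size
    return theta

# Synthetic data: made-up "true" P, N, B, R, Q values, noise-free expected results.
data_rng = random.Random(0)
true_theta = [1.0, 3.0, 3.0, 5.0, 9.0]
data = []
for _ in range(5000):
    features = [data_rng.randint(-1, 1) for _ in true_theta]
    data.append((features, logistic(evaluate(true_theta, features))))

start = [1.0, 2.0, 2.0, 3.0, 5.0]
tuned = sgd(start, data, fixed={0})  # index 0 (pawn) stays at 1.0
loss_before = mean_squared_error(start, data)
loss_after = mean_squared_error(tuned, data)
print(loss_before, loss_after, tuned)
```

Holding the pawn value fixed plays the same role as pinning VALUE_P_EG in point 4: without it, the overall scale of the evaluation and the constant k are interchangeable.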
The initial parameters and the values I get after tuning are listed in the two tables at the end of this post.
What I find particularly odd is the lower endgame value of the minor pieces. Without playing any games (which is a while away yet), they look wrong to me. With these values, an engine that is a minor ahead would adopt a trade-avoiding strategy (because its minor would be devalued as the game approaches the endgame) unless it can get an extra pawn in the bargain. Conversely, an engine that is a minor behind will gladly exchange material.
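The effect can be made concrete with a small calculation. The sketch below assumes the MG and EG values are blended linearly by a game-phase factor, which is a common scheme but is not stated in the post; the function name tapered and the phase convention are hypothetical. The piece values are the tuned ones from the table.

```python
def tapered(mg, eg, phase):
    """Blend MG and EG values; phase = 1.0 is full middlegame, 0.0 is bare endgame."""
    return phase * mg + (1.0 - phase) * eg

KNIGHT_MG, KNIGHT_EG = 3.46, 2.39  # tuned knight values from the post
PAWN_MG, PAWN_EG = 0.96, 1.00      # tuned pawn values from the post

for phase in (1.0, 0.5, 0.0):
    knight = tapered(KNIGHT_MG, KNIGHT_EG, phase)
    pawn = tapered(PAWN_MG, PAWN_EG, phase)
    print(f"phase {phase:.1f}: knight {knight:.3f}, knight minus pawn {knight - pawn:.3f}")
```

Under this assumption the side that is a knight up sees its advantage drop from 3.46 to 2.39 as material comes off, a loss of roughly one pawn, which matches the "extra pawn in the bargain" observation.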
This makes me wonder if there's something I'm overlooking in how I've implemented my optimiser. Has anyone else done a similar experiment and seen similar results? In particular, I would like to know if anyone has tried to fit Alexandru Mosoi's dataset with just material and arrived at "correct" piece values that represent the data.
The initial parameters are
      MG     EG
P    0.80   1.00
N    3.25   3.50
B    3.25   3.50
R    4.50   5.50
Q    9.00   9.75
BB   0.00   0.00
After tuning, I get
      MG     EG
P    0.96   1.00
N    3.46   2.39
B    3.37   2.54
R    4.40   4.70
Q    8.79   9.29
BB   0.17   0.21