Some details:
1. By "tune" I mean that I pass the evaluation through a logistic function that maps it to an expected game result (between 0 and 1, with 0 = black wins, 1 = white wins, 0.5 = draw) and fit the evaluation parameters so that this prediction matches the actual results. The idea is pretty standard. I use f(x) = 1/(1+exp(-k*x)), with x the evaluation in pawn units and k a scaling constant (which happens to be 1 according to a Monte Carlo best-parameter estimate).
2. I use the test position set by Alexandru Mosoi (http://talkchess.com/forum/viewtopic.php?p=686204).
3. To fit the data, I use stochastic gradient descent with 1000 positions in each estimate of the gradient. I haven't tried anything fancier, but I did first try the simplex algorithm from GSL as an alternative, and that works OK too.
4. In the evaluation, I have fixed the value of a pawn in the end game (VALUE_P_EG) at 256. This fixes the scale for the evaluation, which is otherwise arbitrary.
5. During tuning, evaluation parameters are treated as double-precision floating point numbers.
6. I started with 11 parameters to tune: the piece values for N, B, R and Q in MG and EG, the pawn value in MG, and the bishop pair bonus in MG and EG (this is in fact just the quadratic term in the material evaluation).
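A minimal Python sketch of points 1 and 3, for illustration only. It assumes a mean squared error between the logistic prediction and the game result (the loss is not stated above) and a purely linear evaluation over hypothetical feature vectors, such as white-minus-black piece counts. All function names and the learning rate are made up for the example.

```python
import math
import random

def predicted_result(eval_pawns, k=1.0):
    """Logistic map from an evaluation in pawn units to an expected result in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-k * eval_pawns))

def evaluate(weights, features):
    """Linear evaluation: dot product of the weights and the feature values."""
    return sum(w * f for w, f in zip(weights, features))

def minibatch_gradient(weights, batch, k=1.0):
    """Gradient of the mean squared error over a batch of (features, result) pairs."""
    grad = [0.0] * len(weights)
    for features, result in batch:
        p = predicted_result(evaluate(weights, features), k)
        # Chain rule: d/dw (p - r)^2 = 2 (p - r) * k * p * (1 - p) * feature
        common = 2.0 * (p - result) * k * p * (1.0 - p)
        for i, f in enumerate(features):
            grad[i] += common * f
    return [g / len(batch) for g in grad]

def sgd(weights, data, steps=2000, batch_size=1000, learning_rate=1.0,
        fixed=(), k=1.0, seed=0):
    """Plain SGD; indices listed in `fixed` are never updated (e.g. a pinned pawn value)."""
    rng = random.Random(seed)
    weights = list(weights)
    for _ in range(steps):
        batch = rng.sample(data, min(batch_size, len(data)))
        grad = minibatch_gradient(weights, batch, k)
        for i, g in enumerate(grad):
            if i not in fixed:
                weights[i] -= learning_rate * g
    return weights
```

Keeping one weight in `fixed` plays the role of pinning VALUE_P_EG: that parameter stays at its initial value and sets the scale for the rest.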
The initial parameters are
      MG    EG
P    0.80  1.00
N    3.25  3.50
B    3.25  3.50
R    4.50  5.50
Q    9.00  9.75
BB   0.00  0.00
After tuning, the parameters are
      MG    EG
P    0.96  1.00
N    3.46  2.39
B    3.37  2.54
R    4.40  4.70
Q    8.79  9.29
BB   0.17  0.21
This makes me wonder if there's something I'm overlooking in how I've implemented my optimiser. Has anyone else done a similar experiment and got similar results? In particular, I would like to know if anyone has tried to fit Alexandru Mosoi's dataset with just material and arrived at "correct" piece values that represent the data.
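One cheap way to rule out a bug in the gradient itself is to compare it against a central finite difference of the loss. The sketch below assumes the same mean squared error over a logistic prediction and a linear evaluation. The loss, the feature vectors and all names are assumptions for illustration, not the code used for the results above.

```python
import math

def loss(weights, batch, k=1.0):
    """Mean squared error between the logistic prediction and the game result."""
    total = 0.0
    for features, result in batch:
        x = sum(w * f for w, f in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-k * x))
        total += (p - result) ** 2
    return total / len(batch)

def analytic_gradient(weights, batch, k=1.0):
    """Hand-derived gradient of `loss`; this is the code under test."""
    grad = [0.0] * len(weights)
    for features, result in batch:
        x = sum(w * f for w, f in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-k * x))
        common = 2.0 * (p - result) * k * p * (1.0 - p)
        for i, f in enumerate(features):
            grad[i] += common * f
    return [g / len(batch) for g in grad]

def numeric_gradient(weights, batch, k=1.0, h=1e-6):
    """Central finite-difference approximation of the gradient of `loss`."""
    grad = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += h
        down[i] -= h
        grad.append((loss(up, batch, k) - loss(down, batch, k)) / (2.0 * h))
    return grad

def max_gradient_error(weights, batch, k=1.0):
    """Largest absolute difference between the analytic and numeric gradients."""
    a = analytic_gradient(weights, batch, k)
    n = numeric_gradient(weights, batch, k)
    return max(abs(x - y) for x, y in zip(a, n))
```

If `max_gradient_error` is tiny on a handful of positions, the gradient code is probably fine and the surprising values are more likely to come from the data, the model or the step size.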