nionita wrote: Hi Michael,
Do I understand correctly that you try to tune only the material values with this method?
If yes, then my opinion is that this will not work unless you are very lucky (which would actually be bad luck again in the end, because next time it will not work). Here is why I think it doesn't work:
The tuner tries to "explain" the differences between the training positions with too few parameters (the material values). But we all know that this is by far not enough. The result is then biased by the few hundred thousand positions used in the training. But the number of positions encountered in a chess game (in the analysis, not just in the moves actually played!) is much larger than a few hundred thousand, and their variation (especially in material imbalance) is far greater than what you see in a real game!
So what you actually get is a set of overfitted parameter values.
Just my 2 cents.
[Edit: overfitting must be the wrong term: it should be a model error - the model is too simple - and the training set is probably not representative.]
Regards, Nicu
Hello, Nicu,
I tune all parameters like this, but the idea is to tune only a subset of all parameters at the same time.
Currently I pick a feature (like mobility values or file/rank values) as the subset, so the subset includes
4, 5, 8, ..., 16, 64 parameters to tune. Of course, there would be no problem mixing parameters of different features.
Material values were mainly chosen in this thread to explain the ideas, but there are no restrictions at all.
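To make the subset idea concrete, here is a minimal sketch of tuning only a chosen group of parameters while all others stay fixed, assuming a Texel-style setup (minimizing the squared error between a sigmoid of the evaluation and the game result). Every name here (evaluate, positions, the data layout) is hypothetical, not Michael's actual code:

```python
import math

def sigmoid(score, k=1.0):
    """Map a centipawn score to an expected game result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * score / 400.0))

def evaluate(pos, params):
    """Toy linear evaluation: dot product of feature counts and weights."""
    return sum(f * p for f, p in zip(pos["features"], params))

def error(positions, params, k=1.0):
    """Mean squared error between predicted and actual game results."""
    return sum((pos["result"] - sigmoid(evaluate(pos, params), k)) ** 2
               for pos in positions) / len(positions)

def tune_subset(positions, params, subset, step=1, max_passes=50):
    """Coordinate descent restricted to the indices in `subset`
    (e.g. only the material values, or only the mobility values)."""
    params = list(params)
    best = error(positions, params)
    improved, passes = True, 0
    while improved and passes < max_passes:
        improved = False
        passes += 1
        for i in subset:                # only the chosen feature group moves
            for delta in (+step, -step):
                params[i] += delta
                e = error(positions, params)
                if e < best:
                    best = e
                    improved = True
                    break               # keep the improvement
                params[i] -= delta      # revert and try the other direction
    return params, best
```

The point of the `subset` argument is exactly the scheme described above: one call tunes the 4-5 material values, the next call the 16 or 64 mobility values, and mixing indices from different features is just a different `subset` list.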
The databases I use include 200000/400000/... games, up to 40 million positions. The opening post shows that
even small databases with only 100000 positions or fewer are able to give usable results.
Especially because I want a set of positions that is not biased by any property (like game phase),
the main control loop iterates over games (not positions), to get a complete picture. So, looping over 1000 games means
looping over 1000 games * (let's say) 100 positions = 100000 positions with balanced characteristics.
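The "loop over games, not positions" idea can be sketched like this: cap the number of positions taken per game, so no single long game (and no overrepresented game phase) dominates the training set. The data layout and the per_game=100 figure follow the arithmetic in the post; the function name is made up:

```python
import random

def sample_positions(games, per_game=100, seed=42):
    """Take at most `per_game` positions from each game, so every game
    contributes roughly equal weight regardless of its length."""
    rng = random.Random(seed)       # fixed seed for reproducible samples
    sample = []
    for game in games:              # outer loop runs over games ...
        if len(game) <= per_game:
            sample.extend(game)     # short game: take every position
        else:
            # ... and a random draw inside the game avoids favoring
            # any particular phase (opening, middlegame, endgame)
            sample.extend(rng.sample(game, per_game))
    return sample
```

With 1000 games and per_game=100 this yields at most 100000 positions, matching the 1000 * 100 = 100000 count above.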
The tuner tries to "explain" the differences between the training positions with too few parameters (the material values).
That is a good point. I will enable some more parameters, so the (non-)orthogonality of features may have more influence than I expect at this stage.
But it does not explain why it works for a "simple" evaluation and not when score interpolation is included.
Well, it works too, but the results are somewhat unexpected.
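For readers following along: "score interpolation" here presumably refers to a tapered evaluation that blends a midgame and an endgame score by game phase (whether Michael's engine does exactly this is an assumption). A minimal sketch:

```python
def tapered_score(mg_score, eg_score, phase, max_phase=24):
    """Linearly interpolate between midgame and endgame scores.
    `phase` counts remaining non-pawn material, from max_phase
    (opening) down to 0 (bare endgame)."""
    phase = max(0, min(phase, max_phase))   # clamp to the valid range
    return (mg_score * phase + eg_score * (max_phase - phase)) // max_phase
```

One plausible source of the unexpected results: with interpolation, every tuned term exists in two flavors (mg and eg) whose contributions are coupled through the phase, so the error surface for a subset is less independent than in the single-score case.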
As I said, I need some time to produce/provide some data and to formulate some thoughts on the results.
First, I need to go to work...