Daniel Shawul wrote:Evolutionary methods for optimization have been discussed here before, so my first thought
was that yours was a GA. These can indeed find a global optimum despite the user's initial guesses
for the parameters. They take a lot more objective-function evaluations (games), though.
I don't know how important this is to the topic of chess evaluation terms, as we already have
a "good" estimate of the parameters before the experiment anyway.
If anybody has ever observed a local optimum when optimizing parameters, I'd be curious to know. I have never observed any, but it is very probable that some exist.
In fact, I implemented plenty of algorithms to compare their performance. Local quadratic regression performs better than the others on any function that is smooth and flat at the maximum. The second-best algorithm in my tests is the cross-entropy method (CEM). CEM is a population-based method, somewhat similar to a genetic algorithm.
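For readers who have not seen it, here is a minimal sketch of the basic CEM loop (illustrative only, not my actual implementation; the function name and parameters are arbitrary): sample a population from a Gaussian, keep an elite fraction, and refit the Gaussian to the elite.

```python
import numpy as np

def cross_entropy_method(f, mu, sigma, n_samples=50, n_elite=10, n_iters=30):
    """Maximize f by repeatedly fitting a Gaussian to the elite samples."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    for _ in range(n_iters):
        # Draw a population around the current search distribution.
        pop = mu + sigma * np.random.randn(n_samples, mu.size)
        scores = np.array([f(x) for x in pop])
        # Keep the best-scoring samples and refit the mean/stddev to them.
        elite = pop[np.argsort(scores)[-n_elite:]]
        mu, sigma = elite.mean(axis=0), elite.std(axis=0)
    return mu

# Example: the maximum of -(x - 3)^2 is at x = 3 in every dimension.
best = cross_entropy_method(lambda x: -np.sum((x - 3.0) ** 2),
                            mu=np.zeros(4), sigma=np.ones(4))
```

Like a GA, each iteration costs a full population of objective evaluations (games).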
Daniel Shawul wrote:For Newton-Raphson, I recall that the Hessian matrix can be computed from only its (n^2 + n)/2 distinct entries, since it is symmetric.
To reduce the cost of all these function evaluations, we can use quasi-Newton methods like BFGS.
We can also use radial basis functions to approximate the response and determine optimal values
from far fewer samples. The reason for my question was that I had set a limit of n^2
samples per iteration for the methods I am familiar with. I now understand that your method
makes many function calls per iteration but converges faster than the rest overall, which is the final goal.
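To make the symmetry point concrete, here is a small finite-difference sketch (purely illustrative; the helper name and step size h are arbitrary) that computes only the (n^2 + n)/2 upper-triangular entries and mirrors them:

```python
import numpy as np

def fd_hessian(f, x, h=1e-4):
    """Central-difference Hessian: only the (n^2 + n)/2 upper-triangular
    entries are computed (4 evaluations of f each), then mirrored."""
    x = np.asarray(x, float)
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = H[j, i] = (f(x + ei + ej) - f(x + ei - ej)
                                 - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H
```

A quasi-Newton method such as BFGS avoids this cost by building its Hessian approximation from successive gradients, e.g. scipy.optimize.minimize(lambda x: -f(x), x0, method='BFGS') for maximization.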
In fact, I compute the full inverse of the Hessian with a Cholesky decomposition. The full inverse is not required for Newton-Raphson, but some of the other algorithms I implemented require it. With the full inverse, it is easy to generate random samples according to the posterior distribution, or to estimate the variance.
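Here is a sketch of how one Cholesky factorization can serve all three purposes (Newton step, variance estimates, posterior samples). It assumes we are near a maximum, so -H is positive definite; the function below is an illustration, not my actual code:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def newton_step_and_samples(x, grad, hess, n_samples=5):
    """One Newton-Raphson step for maximization, plus samples from the
    Gaussian approximation N(x_new, (-H)^-1) of the posterior."""
    x = np.asarray(x, float)
    A = -hess                              # positive definite near a maximum
    c, low = cho_factor(A)
    x_new = x + cho_solve((c, low), grad)  # Newton step: x - H^{-1} g
    # Diagonal of the full inverse A^{-1}: per-parameter variance estimates.
    var = np.diag(cho_solve((c, low), np.eye(x.size)))
    # If A = L L^T, then z = L^{-T} u with u ~ N(0, I) has covariance A^{-1}.
    L = np.linalg.cholesky(A)
    u = np.random.randn(x.size, n_samples)
    samples = x_new[:, None] + np.linalg.solve(L.T, u)
    return x_new, var, samples.T
```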
Rémi