Hi,
I did a lot of research on black-box parameter optimization last year, that is, optimization from the win/loss/draw outcomes of games. I suppose many people do it manually with tools such as bayeselo, choosing parameter values by hand. Clever automatic procedures can probably be more efficient than choosing parameter values manually.
My intuition is that regression-based approaches are the most efficient, but it is very difficult to come up with a good, stable algorithm. I am still working on mine.
I recommend the cross-entropy method. It is very simple and efficient:
http://www.personeel.unimaas.nl/m-winan ... rossmc.pdf
I found that the algorithm used by Guillaume can be improved by tricks described in that paper:
http://www.informs-sim.org/wsc09papers/044.pdf
This algorithm is not optimal, but it is certainly better than any genetic algorithm or simulated annealing. Simulated annealing is not very efficient with noisy measurements of performance; it works for noiseless measurements with tricky local optima, which is not the situation at all when optimizing a chess program. The cross-entropy method can be viewed as a kind of genetic algorithm, except that it is based on more solid theoretical principles.
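To make the idea concrete, here is a minimal sketch of the cross-entropy method on an artificial noisy function, in the spirit of the testing advice below. This is not Rémi's implementation; the function names, population size, and elite fraction are illustrative choices. Each iteration samples candidates from a Gaussian, keeps the best fraction, and refits the Gaussian to that elite set.

```python
import random
import statistics

def cross_entropy_method(fitness, dim, iterations=50, samples=100, elite_frac=0.1):
    """Minimal cross-entropy optimizer: fit a Gaussian to the elite samples."""
    mean = [0.0] * dim
    std = [5.0] * dim
    n_elite = max(1, int(samples * elite_frac))
    for _ in range(iterations):
        population = [[random.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(samples)]
        population.sort(key=fitness, reverse=True)
        elite = population[:n_elite]
        # Refit the Gaussian to the elite set (per-dimension mean and std).
        for d in range(dim):
            values = [x[d] for x in elite]
            mean[d] = statistics.mean(values)
            std[d] = statistics.pstdev(values) + 1e-3  # floor avoids premature collapse
    return mean

# Artificial objective with a peak at (3, -2), plus measurement noise
# standing in for the randomness of game outcomes.
def noisy_fitness(x):
    return -((x[0] - 3) ** 2 + (x[1] + 2) ** 2) + random.gauss(0, 0.5)

random.seed(1)
best = cross_entropy_method(noisy_fitness, dim=2)
```

Because selection averages over the whole elite set, the noise in individual fitness evaluations tends to wash out, which is the property that makes this family of methods suitable here.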
Before trying with a real program, you should test algorithms on artificial functions. It is much more efficient.
Rémi
Parameter tuning
Re: Parameter tuning
Rémi Coulom wrote: Before trying with a real program, you should test algorithms on artificial functions. It is much more efficient.
I have been thinking about a chess program simulator to test tuning algorithms.
The simulator would simulate a set of chess programs, each with a certain Elo rating (unknown to the tuning algorithm).
When two programs play a game, the outcome (win/draw/loss) is determined by the Elo rating difference between the two programs and a random component.
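Such an outcome model could be sketched as follows, using the standard Elo expected-score formula. The fixed draw probability is an assumption for illustration; a real draw model would depend on the rating difference and the draw tendency of the programs.

```python
import random

def expected_score(elo_a, elo_b):
    """Standard Elo expected score for player A."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))

def play_game(elo_a, elo_b, draw_prob=0.3):
    """Sample one game outcome for A (1 = win, 0.5 = draw, 0 = loss).

    The fixed draw rate is a simplifying assumption; the win probability is
    rescaled so A's expected score still matches the Elo formula.
    """
    e = expected_score(elo_a, elo_b)
    r = random.random()
    if r < draw_prob:
        return 0.5
    # Solve w + 0.5 * draw_prob = e for the win probability w, then clamp.
    w = min(max(e - draw_prob / 2, 0.0), 1.0 - draw_prob)
    return 1.0 if r < draw_prob + w else 0.0
```

For example, two equal-rated programs should average a score near 0.5 over many simulated games, while a 200-point favorite should average roughly 0.76.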
A first challenge is picking a good program from a set of programs of fixed (but noisily measured) strength, using a limited number of games
(if I am not mistaken, this is referred to as the multi-armed bandit problem).
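A classic bandit strategy for this selection problem is UCB1, sketched below under simple assumptions: each candidate program plays against a fixed reference opponent, the reward is the game score, and the candidate played most often within the budget is reported as the best. The win probabilities and budget are illustrative.

```python
import math
import random

def pick_best_program(win_probs, budget=2000, seed=0):
    """UCB1 over a fixed set of candidate programs; reward is the game result
    against an implicit fixed reference opponent (a simplifying assumption)."""
    rng = random.Random(seed)
    k = len(win_probs)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, budget + 1):
        untried = [i for i in range(k) if counts[i] == 0]
        if untried:
            arm = untried[0]  # play every candidate once before using the index
        else:
            # UCB1 index: empirical mean plus an exploration bonus that
            # shrinks as a candidate accumulates games.
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return max(range(k), key=lambda i: counts[i])  # most-played candidate

best = pick_best_program([0.3, 0.4, 0.7, 0.35])
```

The appeal for the simulator is that the same interface (play a game, observe a score) matches what a real tuning run would see.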
Next, we can assign a set of parameters to each program, which influences its playing strength.
Depending on the outcome of games, the tuning algorithm can modify the parameters.
The question then is how to map the parameters of a program to an Elo rating such that it is a realistic model of a real chess program.
It is probably best to start with something simple, such as making the Elo rating decrease in proportion to the distance from a single optimal set of parameter values.
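That simple mapping could look like the sketch below. The peak rating and slope are arbitrary illustrative constants, and the linear falloff is just one easy starting point; a realistic model would presumably have multiple scales, interactions between parameters, and plateaus.

```python
import math

def simulated_elo(params, optimum, peak_elo=2800, slope=100.0):
    """Toy mapping (assumption): rating decreases linearly with the
    Euclidean distance between the parameter vector and the optimum."""
    distance = math.sqrt(sum((p - o) ** 2 for p, o in zip(params, optimum)))
    return peak_elo - slope * distance
```

The tuning algorithm would only ever see game outcomes generated from these ratings, never the ratings or the optimum itself, mirroring the black-box setting of the first post.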
And, as you mentioned, a simulator would be much faster than playing actual games, and it makes it easy to measure the performance of the tuning algorithm.
Jan