CLOP for Noisy Black-Box Parameter Optimization

Rémi Coulom · Post by **Rémi Coulom** » Thu Sep 01, 2011 12:32 pm

Hi,

This is my paper for the Tilburg conference:

Title: CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning

Abstract: Artificial intelligence in games often leads to the problem of parameter tuning. Some heuristics may have coefficients, and they should be tuned to maximize the win rate of the program. A possible approach consists in building local quadratic models of the win rate as a function of program parameters. Many local regression algorithms have already been proposed for this task, but they are usually not robust enough to deal automatically and efficiently with very noisy outputs and non-negative Hessians. The CLOP principle, which stands for Confident Local OPtimization, is a new approach to local regression that overcomes all these problems in a simple and efficient way. It consists in discarding samples whose estimated value is confidently inferior to the mean of all samples. Experiments demonstrate that, when the function to be optimized is smooth, this method outperforms all other tested algorithms.
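To make the discarding idea in the abstract concrete, here is a deliberately simplified one-parameter sketch in Python. It is not the algorithm from the paper: it assumes an ordinary least-squares quadratic fit, and a fixed `margin` stands in crudely for the "confidently inferior" test. The names `fit_quadratic`, `keep_promising`, and `true_win_rate` are made up for illustration.

```python
import random


def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ a + b*x + c*x**2 (normal equations, Gauss-Jordan)."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix of the normal equations.
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def keep_promising(xs, ys, margin=0.0):
    """Drop samples whose fitted value is below the mean fitted value by more than margin.

    Assumption: `margin` is a crude stand-in for a real confidence bound.
    """
    a, b, c = fit_quadratic(xs, ys)
    fitted = [a + b * x + c * x * x for x in xs]
    mean_fit = sum(fitted) / len(fitted)
    return [(x, y) for x, y, f in zip(xs, ys, fitted) if f >= mean_fit - margin]


def true_win_rate(x):
    """Hypothetical smooth win rate with its maximum at x = 0.3."""
    return 0.8 - 0.3 * (x - 0.3) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    # Noisy observations: each sample is a single win (1.0) or loss (0.0).
    ys = [1.0 if rng.random() < true_win_rate(x) else 0.0 for x in xs]
    kept = keep_promising(xs, ys, margin=0.02)
    print(len(xs), "samples ->", len(kept), "kept")
```

Repeating the fit on the kept samples concentrates the model around the region where the estimated win rate is highest, which is the intuition behind "local" in the name.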

pdf and source code:
http://remi.coulom.free.fr/CLOP/

It works no miracles: you'll have to play a lot of games to get really good parameters. But it is certainly much more efficient than any manual method you could use with bayeselo. It is also more efficient than any other algorithm I am aware of.

Compared to the old version of QLR, I solved all the instability problems. I do not have a mathematical proof of convergence, but I am convinced it always works well, unless the maximum is at a discontinuity, which never happens in practice.

Comments and questions are welcome.

Rémi

zamar · Post by **zamar** » Fri Sep 02, 2011 10:52 am

Very impressive achievement Remi. I will try your new tool at some point in the near future!

Rémi Coulom · Post by **Rémi Coulom** » Fri Sep 02, 2011 11:45 am

Thanks Joona.

I added some screenshots of the program to the web page, and a description:

These are some screenshots of an old version in action. You can also run the program from the command line, which is more convenient for use on a remote cluster. The program can deal with chess outcomes (win/draw/loss), and integer parameters. The program is written in C++ with Qt, so it can be compiled and run on Windows, Linux, and MacOS.

http://remi.coulom.free.fr/CLOP/

Rémi

Zach Wegner · Post by **Zach Wegner** » Fri Sep 02, 2011 6:44 pm

zamar wrote:Very impressive achievement Remi. I will try your new tool at some point in the near future!

+1

I am very glad that you decided to re-release your source code, even though you are "commercial" now. Thanks!

I will try and understand your paper. Do you know how well your ideas could be applied to non-game-playing applications (i.e., a floating point objective function)?

mcostalba · Post by **mcostalba** » Fri Sep 02, 2011 7:34 pm

Rémi Coulom wrote:Thanks Joona.

I added some screenshots of the program to the web page, and a description:

These are some screenshots of an old version in action. You can also run the program from the command line, which is more convenient for use on a remote cluster. The program can deal with chess outcomes (win/draw/loss), and integer parameters. The program is written in C++ with Qt, so it can be compiled and run on Windows, Linux, and MacOS.

http://remi.coulom.free.fr/CLOP/

Rémi

As you perhaps know, Stockfish has been tuned automatically by means of an undisclosed algorithm written by Joona, so I am very happy that he finds your work useful, because he is the tuning expert on the SF team... and so I hope this will give him good hints to further tune the tuner.

Rémi Coulom · Post by **Rémi Coulom** » Fri Sep 02, 2011 9:58 pm

Zach Wegner wrote:Do you know how well your ideas could be applied to non-game-playing applications (i.e., a floating point objective function)?

I did not try, but I expect the basic idea of CLOP would work well in many situations, even the completely noiseless case.

I am really enthusiastic about this algorithm, because it is extremely simple, and very universal. The problem of optimizing a function from noisy (or noiseless) observations has really been researched a lot for more than 50 years. It is really difficult to contribute anything significant to this field. I am looking forward to the feedback of optimization specialists. Time will tell if CLOP makes an impact.

Rémi

Rein Halbersma · Post by **Rein Halbersma** » Sat Sep 03, 2011 12:07 am

Hi Rémi,

Can CLOP also be applied to LOS as an objective function? Suppose that a tournament has a very skewed prize money distribution (e.g. in poker). In such cases, I can imagine that programs slightly below the absolute top might want to maximize a different mean-variance combination than the best programs, e.g. optimize their chance of winning a tournament, rather than their Elo.

Rein

Rémi Coulom · Post by **Rémi Coulom** » Sat Sep 03, 2011 11:24 am

Rein Halbersma wrote:Hi Rémi,

Can CLOP also be applied to LOS as an objective function? Suppose that a tournament has a very skewed prize money distribution (e.g. in poker). In such cases, I can imagine that programs slightly below the absolute top might want to maximize a different mean-variance combination than the best programs, e.g. optimize their chance of winning a tournament, rather than their Elo.

Rein

I don't really understand your question.

The objective function can be the expected value of any random variable that depends on parameters. So, if, instead of playing one game and getting the result, you play a tournament and observe LOS over a specific opponent, you can use CLOP to optimize it. But that would be a strange way to use CLOP. If you wish to optimize a program against a set of opponents instead of just one opponent, you can use the "Replications" option of CLOP to play a game against each opponent, and then CLOP will maximize the average winning rate against all these opponents.
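As a rough illustration of playing one game against each opponent, a script could choose the opponent from the seed it receives. This sketch assumes that successive replications of the same parameter values are called with successive seeds; that is an assumption to check against the README, not something stated in this thread. The function name and the opponent names are hypothetical.

```python
def pick_opponent(seed, opponents):
    """Cycle through the opponents as the seed increases.

    Assumption: replications of one parameter setting get consecutive seeds,
    so a number of replications equal to len(opponents) visits each opponent once.
    """
    if not opponents:
        raise ValueError("at least one opponent is required")
    return opponents[seed % len(opponents)]


if __name__ == "__main__":
    # Hypothetical opponent names, for illustration only.
    opponents = ["engine_a", "engine_b", "engine_c"]
    for seed in range(6):
        print(seed, pick_opponent(seed, opponents))
```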

Regarding your example, if I understand correctly, you mean letting the program be more aggressive when it needs to win, and safer when a draw is OK. For that, you could tune your evaluation with CLOP, considering that a draw is a loss (resp. a win) to make it play aggressively (resp. defensively). Make sure your evaluation function is asymmetric, then. It may be more efficient than using just contempt.
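The draw-as-loss (or draw-as-win) idea can be sketched as a small mapping applied to the real game result before it is reported to the tuner. The single-letter codes `W`, `D`, and `L` and the function name are assumptions chosen for illustration.

```python
def outcome_for_tuner(result, draw_counts_as):
    """Map a game result ('W', 'D' or 'L') to the outcome reported to the tuner.

    draw_counts_as='L' rewards aggressive play (a draw is as bad as a loss);
    draw_counts_as='W' rewards safe play (a draw is as good as a win).
    """
    if result not in ("W", "D", "L"):
        raise ValueError("result must be 'W', 'D' or 'L'")
    if draw_counts_as not in ("W", "L"):
        raise ValueError("draw_counts_as must be 'W' or 'L'")
    return draw_counts_as if result == "D" else result


if __name__ == "__main__":
    for result in ("W", "D", "L"):
        print(result, "->", outcome_for_tuner(result, draw_counts_as="L"))
```

Tuning twice, once with each setting, would give two parameter sets, one for must-win situations and one for situations where a draw is enough.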

Rémi

Andres Valverde · Post by **Andres Valverde** » Sat Sep 03, 2011 12:28 pm

Can anybody explain in a nutshell how one can use it in engine tuning? I find the doc very interesting but pretty abstract.

Rémi Coulom · Post by **Rémi Coulom** » Sat Sep 03, 2011 12:40 pm

Andres Valverde wrote:Can anybody explain in a nutshell how one can use it in engine tuning? I find the doc very interesting but pretty abstract.

Did you read the doc in the "README" file?

Did you manage to open DummyExperiment.clop with CLOP?

Once you manage to run this experiment, you should be all set up. All you have to do is write your own scripts to replace DummyScript.py. Run DummyScript.py without arguments (or look at the source) for an explanation.
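For orientation, a replacement script might look roughly like the sketch below. The calling convention is an assumption, not something stated in this thread: the script is taken to receive a processor id, an integer seed, and then name/value pairs for the parameters, and to print `W`, `L`, or `D`. Running DummyScript.py without arguments gives the authoritative description. `play_game` is a hypothetical stand-in for launching a real game.

```python
import random
import sys


def parse_arguments(argv):
    """Split [processor, seed, name1, value1, ...] into its parts (assumed layout)."""
    if len(argv) < 4 or len(argv) % 2 != 0:
        raise ValueError("expected: processor seed name value [name value ...]")
    processor = argv[0]
    seed = int(argv[1])
    params = {argv[i]: float(argv[i + 1]) for i in range(2, len(argv), 2)}
    return processor, seed, params


def play_game(seed, params):
    """Hypothetical stand-in for a real game: returns 'W', 'D' or 'L'."""
    rng = random.Random(seed)
    # Toy model: the win probability peaks when every parameter equals 0.5.
    distance = sum((value - 0.5) ** 2 for value in params.values())
    p_win = max(0.0, 0.6 - distance)
    roll = rng.random()
    if roll < p_win:
        return "W"
    return "D" if roll < p_win + 0.2 else "L"


def main(argv):
    try:
        _processor, seed, params = parse_arguments(argv)
    except ValueError as error:
        print(error, file=sys.stderr)
        return 1
    # The outcome is written to standard output for the tuner to read.
    print(play_game(seed, params))
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

In a real setup, `play_game` would start the engine with the given parameter values, play one game against the opponent, and translate the result.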

Rémi
