Tool for automatic black-box parameter optimization released

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Tool for automatic black-box parameter optimization rele

Post by Daniel Shawul »

I can email it to you if you want.
mathmoi
Posts: 286
Joined: Mon Mar 13, 2006 5:23 pm
Location: Québec

Re: Tool for automatic black-box parameter optimization rele

Post by mathmoi »

Bonjour Rémi,

It would be much appreciated.

My email is mathieu@mathieupage.com

Thanks
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Tool for automatic black-box parameter optimization rele

Post by Daniel Shawul »

Dammit Mathieu, please read the poster's name :)
I had the same problem once but fortunately found the files stashed on
an old hard drive. I hope I am not 'infringing' copyright or anything.
Enjoy
IQ
Posts: 162
Joined: Thu Dec 17, 2009 10:46 am

Re: Tool for automatic black-box parameter optimization rele

Post by IQ »

Hi,

I just glanced over the presentation PDF and wonder why you do not use a full Markov chain Monte Carlo (MCMC) approach, since you have already stated your problem in a Bayesian context. Advantages might be:
a) much less computational burden (no Hessian, hill climbing, etc.)
b) a more parsimonious model, as you already use a sigma-sampler and give the posterior
c) faster convergence, especially for large parameter spaces, and robustness to identification problems
d) it is easy to extend the model to incorporate all kinds of dependencies/correlations in the parameter space

The key insight is that an MCMC chain converges to the full posterior distribution, so that calculating expected values is really easy, and the Bernstein-von Mises theorem ensures that the expected values converge to the maximum likelihood estimate if n is large enough. Since the "largeness" of n is purely technical, you can just replicate the existing data multiple times to get this kind of convergence, with the net result usually still being faster than many other optimization methods in the case of large parameter spaces, especially with interdependencies/correlations.

Did you investigate this? Or are there some reasons against it that I am missing?
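As a sketch of the random-walk Metropolis-Hastings idea suggested above, here is a toy example that samples the posterior of a single Bernoulli win-rate parameter from a hypothetical game record (the model and the numbers are illustrative, not QLR's actual model):

```python
import math
import random

def log_posterior(p, wins, losses):
    # Log-likelihood of a Bernoulli win-rate model with a flat prior on (0, 1).
    if not 0.0 < p < 1.0:
        return float("-inf")
    return wins * math.log(p) + losses * math.log(1.0 - p)

def metropolis_hastings(wins, losses, n_samples=20000, step=0.05, seed=0):
    # Random-walk Metropolis-Hastings: propose p' ~ N(p, step) and accept
    # with probability min(1, posterior(p') / posterior(p)).
    rng = random.Random(seed)
    p = 0.5
    samples = []
    for _ in range(n_samples):
        proposal = p + rng.gauss(0.0, step)
        delta = log_posterior(proposal, wins, losses) - log_posterior(p, wins, losses)
        if math.log(rng.random()) < delta:
            p = proposal
        samples.append(p)
    return samples

# 60 wins out of 100 games: the chain should concentrate near p = 0.6.
samples = metropolis_hastings(wins=60, losses=40)
posterior_mean = sum(samples[5000:]) / len(samples[5000:])  # discard burn-in
```

Averaging the post-burn-in samples gives the posterior expectation directly, which is the "calculating expected values is easy" point above.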
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Tool for automatic black-box parameter optimization rele

Post by Rémi Coulom »

mathmoi wrote:Sorry for the thread necromancy, but is this tool still available somewhere?

Thanks
I became a commercial programmer recently, so I decided to not give away my secret weapons any more. Still the previous version was GPL, and I am sure a few members of this forum have a copy. I don't mind if they distribute it.

Rémi
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Tool for automatic black-box parameter optimization rele

Post by Rémi Coulom »

IQ wrote:Hi,

I just glanced over the presentation PDF and wonder why you do not use a full Markov chain Monte Carlo (MCMC) approach, since you have already stated your problem in a Bayesian context. Advantages might be:
a) much less computational burden (no Hessian, hill climbing, etc.)
b) a more parsimonious model, as you already use a sigma-sampler and give the posterior
c) faster convergence, especially for large parameter spaces, and robustness to identification problems
d) it is easy to extend the model to incorporate all kinds of dependencies/correlations in the parameter space

The key insight is that an MCMC chain converges to the full posterior distribution, so that calculating expected values is really easy, and the Bernstein-von Mises theorem ensures that the expected values converge to the maximum likelihood estimate if n is large enough. Since the "largeness" of n is purely technical, you can just replicate the existing data multiple times to get this kind of convergence, with the net result usually still being faster than many other optimization methods in the case of large parameter spaces, especially with interdependencies/correlations.

Did you investigate this? Or are there some reasons against it that I am missing?
Hi Roberto,

My method computes the maximum of the posterior with gradient descent (conjugate gradient, in fact), and it works very well. QLR also has an MCMC sampler for the posterior. The efficiency of MCMC sampling algorithms depends on having a reasonable approximation of the posterior distribution. By computing the maximum and the Hessian at the maximum, I can get a good Gaussian approximation that is then used by MCMC. In fact, the posterior samples I plotted in those slides were obtained by MCMC, IIRC.

But estimating the posterior distribution is not the main problem at all in this kind of algorithm. The main problem is converging to parameters that produce the optimal winning rate, which is different. Each sample of the posterior is a quadratic form, and most of these samples don't even have a maximum. This is what makes the problem difficult.

The last version of QLR I made public had convergence problems. It is still very usable, and should behave well in most simple cases. If you wish to optimize just one parameter that has a strong influence on playing strength, then there should be no problem.
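The pipeline described above (maximize the posterior, take the Hessian at the maximum, and use the resulting Gaussian) is the Laplace approximation. A minimal one-dimensional sketch, using plain gradient ascent and finite differences in place of QLR's actual conjugate-gradient code, on a toy win-rate log-posterior:

```python
import math

def log_posterior(p, wins=60, losses=40):
    # Toy Bernoulli win-rate log-posterior (flat prior on (0, 1)).
    if not 0.0 < p < 1.0:
        return float("-inf")
    return wins * math.log(p) + losses * math.log(1.0 - p)

def laplace_approximation(f, x0=0.5, lr=1e-4, h=1e-5, iters=10000):
    # Step 1: find the posterior maximum by gradient ascent
    # (a stand-in for the conjugate-gradient maximization).
    x = x0
    for _ in range(iters):
        grad = (f(x + h) - f(x - h)) / (2 * h)
        x += lr * grad
    # Step 2: the second derivative at the maximum gives the width of the
    # Gaussian approximation: sigma^2 = -1 / f''(x_max).
    second = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return x, math.sqrt(-1.0 / second)

mode, sigma = laplace_approximation(log_posterior)
```

The resulting Gaussian N(mode, sigma^2) can then serve as a proposal or starting distribution for the MCMC stage.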

Rémi
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Tool for automatic black-box parameter optimization rele

Post by Daniel Shawul »

Hi Remi
My apologies, I thought it was offline for other reasons. Of course I won't give it to anyone anymore.
FlavusSnow
Posts: 89
Joined: Thu Apr 01, 2010 5:28 am
Location: Omaha, NE

Re: Tool for automatic black-box parameter optimization rele

Post by FlavusSnow »

I'll first comment that I wasn't able to get a copy before it fell from the face of the earth ;)

But regardless, a lot of people were instantly trying to apply this to evaluation parameters. Couldn't it also be applied specifically to search parameters? Essentially, parameters whose sole purpose is to determine when to extend or reduce, or when to apply a certain pruning method or quiescence search, including the number of plies to extend or reduce, etc.

Eval is certainly important, but if past history is any indication, large Elo breakthroughs are usually found in the search.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Tool for automatic black-box parameter optimization rele

Post by Rémi Coulom »

Daniel Shawul wrote:Hi Remi
My apologies, I thought it was offline for other reasons. Of course I won't give it to anyone anymore.
Don't worry. No problem with that. I distributed it under GPL, so in fact anybody could re-distribute it with or without modifications. I simply won't distribute my new version.

Rémi
IQ
Posts: 162
Joined: Thu Dec 17, 2009 10:46 am

Re: Tool for automatic black-box parameter optimization rele

Post by IQ »

Rémi Coulom wrote: My method computes the maximum of the posterior with gradient descent (conjugate gradient, in fact), and it works very well. QLR also has an MCMC sampler for the posterior. The efficiency of MCMC sampling algorithms depends on having a reasonable approximation of the posterior distribution. By computing the maximum and the Hessian at the maximum, I can get a good Gaussian approximation that is then used by MCMC. In fact, the posterior samples I plotted in those slides were obtained by MCMC, IIRC.
This is the point I was talking about. You already use MCMC, but only as a second step, after you approximate with a Gaussian. I suggested skipping this approximation step and integrating the step-1 maximization into a full Bayesian framework. The fact that the quadratic forms sometimes do not have a maximum would make it even better suited, since e.g. a Metropolis-Hastings algorithm has a higher acceptance probability in regions with more probability mass. Usually this approximates the posterior and makes it easy to calculate expectations, and as n gets larger it converges to maximum likelihood. This would get rid of the intermediate maximization step in a robust way. And if n is not large enough, you could just replicate the data. But maybe it's difficult to find a useful prior for the quadratic forms?
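The data-replication trick can be illustrated with the Gaussian (Laplace) width of the same toy win-rate posterior: replicating the data k times leaves the maximum-likelihood estimate unchanged but shrinks the posterior standard deviation by a factor of sqrt(k), which is the concentration the Bernstein-von Mises argument relies on:

```python
import math

def posterior_sd(wins, losses):
    # Width of the Gaussian (Laplace) approximation of the win-rate
    # posterior: sigma = sqrt(-1 / f''(p_mle))
    # for f(p) = wins * log(p) + losses * log(1 - p).
    p = wins / (wins + losses)                    # maximum-likelihood estimate
    second = -wins / p**2 - losses / (1 - p)**2   # f'' at the maximum
    return math.sqrt(-1.0 / second)

base = posterior_sd(60, 40)
replicated = posterior_sd(600, 400)  # the same data replicated 10 times
# base / replicated == sqrt(10): the posterior is sqrt(10) times narrower.
```

So replication sharpens the posterior around the MLE without changing its location; whether that helps in practice still depends on the model, as noted above.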