Hi Roberto,

IQ wrote:
> This is the point I was talking about. You already use MCMC, but only as a second step, after you approximate with a Gaussian. I suggested skipping this approximation step and integrating the step-1 maximization into a fully Bayesian framework. The fact that the quadratic forms sometimes do not have a maximum would make it even more suitable, since, e.g., a Metropolis-Hastings algorithm has a higher acceptance probability in regions with more probability mass. Usually this approximates the posterior and makes it easy to calculate expectations; as n gets larger, it converges to maximum likelihood. This would get rid of the intermediate maximization step in a robust way. And if n is not large enough, you could just replicate the data. But maybe it's difficult to find a useful prior for the quadratic forms?
>
> Rémi Coulom wrote:
> > My method computes the maximum of the posterior with gradient descent (conjugate gradient, in fact), and it works very well. QLR also has an MCMC sampler for the posterior. The efficiency of MCMC sampling algorithms depends on having a reasonable approximation of the posterior distribution. By computing the maximum and the Hessian at the maximum, I can get a good Gaussian approximation that is then used by MCMC. In fact, the posterior samples I plotted in those slides were obtained by MCMC, IIRC.
Thanks for your advice, but I am afraid what you mention does not solve any of the difficult problems. I repeat that I have no difficulty estimating the posterior and taking samples from it. The fact that some samples are not negative definite is not a major problem. I tried for a while to use a prior that ensures they are negative definite, but it does not really help, and it is not very important.
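For concreteness, here is a minimal sketch of the approximate-then-sample scheme quoted above: maximize the posterior, build a Gaussian (Laplace) approximation from the Hessian at the maximum, and use its covariance to scale a Metropolis-Hastings sampler. The log_posterior below is a toy stand-in, not QLR's actual model.

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for the log-posterior of the regression coefficients.
def log_posterior(theta):
    return -0.5 * np.sum((theta - 1.0) ** 2)

dim = 3

# Step 1: maximum of the posterior by conjugate gradient.
theta_map = minimize(lambda t: -log_posterior(t), np.zeros(dim), method="CG").x

# Step 2: finite-difference Hessian at the maximum; the Gaussian (Laplace)
# approximation has covariance inv(-H).
eps = 1e-4
H = np.zeros((dim, dim))
for i in range(dim):
    for j in range(dim):
        ei = eps * np.eye(dim)[i]
        ej = eps * np.eye(dim)[j]
        H[i, j] = (log_posterior(theta_map + ei + ej)
                   - log_posterior(theta_map + ei - ej)
                   - log_posterior(theta_map - ei + ej)
                   + log_posterior(theta_map - ei - ej)) / (4 * eps ** 2)
cov = np.linalg.inv(-H)

# Step 3: random-walk Metropolis-Hastings, with the Laplace covariance
# setting the proposal scale (the proposal is symmetric, so the acceptance
# ratio reduces to the posterior ratio).
rng = np.random.default_rng(0)
theta, samples = theta_map.copy(), []
for _ in range(10_000):
    proposal = rng.multivariate_normal(theta, cov)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)
```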
The main difficulty for the stability and convergence of the algorithm is making the regression local. Since the function to be maximized is never quadratic in practice, quadratic regression has a bias. So, in order to converge to the maximum, it is necessary to progressively remove samples that are far from the maximum. This is difficult to do: if remote samples are removed too fast, the algorithm may become unstable, and if they are not removed fast enough, the algorithm is inefficient.
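To illustrate this tension, here is a minimal 1-D sketch of a localized quadratic fit: samples are down-weighted with distance from the current estimate of the maximum, and the weighting radius shrinks over time. The Gaussian weighting, step cap, and shrink rate are illustrative assumptions, not QLR's actual scheme.

```python
import numpy as np

def local_quadratic_fit(x, y, center, radius):
    # Gaussian weights: samples far from the current maximum contribute less.
    w = np.exp(-0.5 * ((x - center) / radius) ** 2)
    X = np.column_stack([np.ones_like(x), x - center, (x - center) ** 2])
    WX = X * w[:, None]
    # Weighted least squares: solve (X^T W X) beta = X^T W y.
    return np.linalg.solve(X.T @ WX, WX.T @ y)

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, 500)
y = np.sin(x) + 0.1 * rng.normal(size=500)        # non-quadratic target

center, radius = 1.0, 1.5
for _ in range(30):
    b0, b1, b2 = local_quadratic_fit(x, y, center, radius)
    if b2 < 0:                                    # the local fit has a maximum
        step = -b1 / (2 * b2)
        center += np.clip(step, -radius, radius)  # cap the move for stability
    # Shrinking too fast risks instability; too slow wastes samples.
    radius = max(0.9 * radius, 0.3)
print(center)   # should approach pi/2, the maximum of sin
```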
After a lot of complicated theoretical and empirical study, I have come up with a satisfactory solution now. So I don't really need any advice, since I know how to make everything work efficiently and robustly. The last version of QLR I made public did not have all the problems solved (but almost). The current private one is OK.
There are other issues, such as "how do I estimate the location of the maximum?" and "with which parameters do I play the next game?". I also found some good solutions for those.
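For the second question, one plausible heuristic (a Thompson-sampling-style rule, explicitly not a description of what QLR does) is to draw one posterior sample of the quadratic's coefficients and play the next game at that sample's maximum, falling back to a random probe when the sampled quadratic has no maximum. The posterior mean and covariance below are hypothetical placeholders.

```python
import numpy as np

def next_point(beta_mean, beta_cov, lo, hi, rng):
    # Draw one posterior sample of the 1-D quadratic b0 + b1*x + b2*x^2.
    b0, b1, b2 = rng.multivariate_normal(beta_mean, beta_cov)
    if b2 < 0:                      # the sampled quadratic has a maximum
        x = -b1 / (2 * b2)
    else:                           # no maximum: fall back to a random probe
        x = rng.uniform(lo, hi)
    return float(np.clip(x, lo, hi))

rng = np.random.default_rng(0)
beta_mean = np.array([0.0, 1.0, -0.5])   # hypothetical posterior mean
beta_cov = 0.01 * np.eye(3)              # hypothetical posterior covariance
print(next_point(beta_mean, beta_cov, -3.0, 3.0, rng))
```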
All these problems are very subtle and difficult. Even if I don't release the software to the public, I may write a paper about it some day.
Rémi