CLOP for Noisy Black-Box Parameter Optimization

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CLOP - a trial for xiangqi engine

Post by Daniel Shawul »

I am also interested in tuning piece values. I have tons of variants for which I don't know the piece values (I can't even make a guess for most of them). Later I may even be forced to determine piece values at run time.

OK, my question: judging from the name, CLOP seems to do local optimization. Can it find optimal values when I have no idea what kind of games I am about to play?
What is the advantage of CLOP over other "general" response-surface models that use, for example, radial basis functions (RBF) to fit almost any curve? Does the fact that the objective function is stochastic affect the choice above? In the past I have used gradient-descent methods to optimize eval parameters one by one. Does CLOP take correlation of parameters into consideration? That seems very important for determining piece values. By that I mean: given 5 parameters to tune, does it give me a correlation table after I run the optimization?

Thank you
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: CLOP - a trial for xiangqi engine

Post by Rémi Coulom »

Daniel Shawul wrote:I am also interested in tuning piece values. I have tons of variants for which I don't know the piece values (I can't even make a guess for most of them). Later I may even be forced to determine piece values at run time.

OK, my question: judging from the name, CLOP seems to do local optimization. Can it find optimal values when I have no idea what kind of games I am about to play?
Yes, CLOP is completely domain-independent. I never use it for chess, personally.
What is the advantage of CLOP over other "general" response-surface models that use, for example, radial basis functions (RBF) to fit almost any curve?
Most of the time, the function to be optimized does not have any tricky local optima. By taking advantage of this assumption, CLOP can be faster than methods that perform global optimization instead of local optimization.
Does the fact that the objective function is stochastic affect the choice above?
No. It is mainly a choice of local vs global optimization.
In the past I have used gradient-descent methods to optimize eval parameters one by one. Does CLOP take correlation of parameters into consideration? That seems very important for determining piece values. By that I mean: given 5 parameters to tune, does it give me a correlation table after I run the optimization?
Yes, CLOP is good at taking correlations into consideration. For chess, the Knight and Bishop values have a very strong correlation, so this is important.

On the CLOP web site, you can find a data file by Gian-Carlo Pascutto that shows the application of CLOP to his program. CLOP should work well for optimizing piece values.
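
As an illustration, a hypothetical experiment file for piece values could look like this (the script name, parameter names, and ranges here are made up; see DummyExperiment.clop and the README in the CLOP distribution for the exact syntax):

Code: Select all

Name PieceValues
Script ./run-match.sh
Processor local
IntegerParameter KnightValue 200 400
IntegerParameter BishopValue 200 400
IntegerParameter RookValue 400 700
Replications 2
DrawElo 100
H 3
Correlations all

With "Correlations all", the regression should include the cross terms between parameters, which is what captures correlations such as the Knight/Bishop one mentioned above.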

Rémi
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Issue....

Post by Michel »

I managed to lose the data from 100000 games!

Here is what I did.

I ran some 75000 games. Unfortunately, CLOP seemed to be converging to a local maximum which I knew was not optimal.

Since I knew better parameters than what CLOP was converging to, I decided to let CLOP run for 25000 samples using a small window around the known good values. The idea was that, after enlarging the window again, the regression would be pulled towards the good values.

Unfortunately, this had the effect that the previous 75000 samples were (silently) cropped to the small window, making them worthless :-(

Normally I would still have had the backup file, but since I did not immediately understand what was going on, I restarted clop several times, losing the backup file in the process.

I think it would be more logical if samples lying outside the parameter window were simply ignored instead of cropped.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Issue....

Post by Rémi Coulom »

Michel wrote:I managed to lose the data from 100000 games!

Here is what I did.

I ran some 75000 games. Unfortunately, CLOP seemed to be converging to a local maximum which I knew was not optimal.

Since I knew better parameters than what CLOP was converging to, I decided to let CLOP run for 25000 samples using a small window around the known good values. The idea was that, after enlarging the window again, the regression would be pulled towards the good values.

Unfortunately, this had the effect that the previous 75000 samples were (silently) cropped to the small window, making them worthless :-(

Normally I would still have had the backup file, but since I did not immediately understand what was going on, I restarted clop several times, losing the backup file in the process.

I think it would be more logical if samples lying outside the parameter window were simply ignored instead of cropped.
CLOP crops the display to the current parameter range, but it does not lose data. It should still take that data into consideration. If you re-widen the ranges, you should find your old data again. It is only a display problem.

If there is a convergence problem with your data, I am curious to take a look.

Rémi
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Issue....

Post by Michel »

If you re-widen the ranges, you should find your old data again.
Unfortunately, this was not the case; that is how I noticed. The data was there, but clipped.

I can try to debug it, although I currently have a bit of a problem finding my way around the source code.


Code: Select all

/////////////////////////////////////////////////////////////////////////////
// Transform x to the [-1, 1] range
/////////////////////////////////////////////////////////////////////////////
double CIntegerParameter::TransformToQLR(double x) const
{
    return -1.0 + 2.0 * (x - Min + 0.5) / (Max - Min + 1.0);
}
In the inverse transform, some clipping seems to be going on:

Code: Select all

/////////////////////////////////////////////////////////////////////////////
// Transform x to the [Min, Max] range
/////////////////////////////////////////////////////////////////////////////
double CIntegerParameter::TransformFromQLR(double x) const
{
    int i = Min + int((Max - Min + 1) * (x + 1.0) * 0.5);
    if (i > Max)
        i = Max;
    return i;
}
This clipping does not occur for CLinearParameter:

Code: Select all

/////////////////////////////////////////////////////////////////////////////
// Transform x to the [Min, Max] range
/////////////////////////////////////////////////////////////////////////////
double CLinearParameter::TransformFromQLR(double x) const
{
    return Min + (Max - Min) * (x + 1.0) * 0.5;
}
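
To see the difference concretely, here is a minimal stand-alone round-trip check (free functions with hardcoded bounds that mirror the methods quoted above; a sketch, not the actual CLOP classes):

Code: Select all

#include <cstdio>

// Simplified stand-alone copies of the two inverse transforms quoted
// above, with the bounds hardcoded for illustration.
const int Min = 0;
const int Max = 9;

double IntegerFromQLR(double x) // mirrors CIntegerParameter::TransformFromQLR
{
    int i = Min + int((Max - Min + 1) * (x + 1.0) * 0.5);
    if (i > Max)
        i = Max;
    return i;
}

double LinearFromQLR(double x) // mirrors CLinearParameter::TransformFromQLR
{
    return Min + (Max - Min) * (x + 1.0) * 0.5;
}

int main()
{
    // x = 1.4 lies outside [-1, 1], which is where an old sample lands
    // after the parameter window has been narrowed.
    std::printf("integer: %g\n", IntegerFromQLR(1.4)); // 9 (clipped to Max)
    std::printf("linear:  %g\n", LinearFromQLR(1.4));  // 10.8 (kept outside the window)
    return 0;
}

So on the upper side the integer transform silently pulls out-of-window samples back to Max, while the linear transform leaves them where they are.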
If there is a convergence problem with your data, I am curious to take a look.


Well, I lost my data :-( If I had not lost it, there are various things I could have checked.

EDIT: fixed confusion between IntegerParameter and IntegerGammaParameter.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Issue....

Post by Rémi Coulom »

You are right, sorry. I'll improve clop for this case.

Rémi
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Issue....

Post by Michel »

You are right, sorry. I'll improve clop for this case.
No need to say sorry. Software of this complexity is bound to have small issues.

Something else: would it be meaningful to choose the prior differently?

Instead of taking the quadratic function ax^2+bx+c, with a, b, c Gaussian with mean zero and fixed standard deviation, perhaps one could take

a(x-b)^2+c

with a, b, c Gaussian. This would make it possible to pre-bias CLOP around a suggested value for b.

EDIT: I guess I am suggesting trading the linear term of the quadratic function for the choice of an extremum.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: Issue....

Post by Rémi Coulom »

Michel wrote:
You are right, sorry. I'll improve clop for this case.
No need to say sorry. Software of this complexity is bound to have small issues.

Something else: would it be meaningful to choose the prior differently?

Instead of taking the quadratic function ax^2+bx+c, with a, b, c Gaussian with mean zero and fixed standard deviation, perhaps one could take

a(x-b)^2+c

with a, b, c Gaussian. This would make it possible to pre-bias CLOP around a suggested value for b.

EDIT: I guess I am suggesting trading the linear term of the quadratic function for the choice of an extremum.
I thought about it, and about many other alternatives. A major problem with this approach is the non-linearity it introduces. With ax^2+bx+c, the strength estimate depends linearly on the parameters of the regression. This is an extremely good property to have: it allows very efficient algorithms and ensures a unique maximum of the posterior. Introducing non-linearities makes the regression a lot more complex.
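
To make the non-linearity concrete, expanding the suggested prior in the monomial basis gives (standard algebra, nothing CLOP-specific):

Code: Select all

s(x) = a x^2 + b x + c
     = \begin{pmatrix} a & b & c \end{pmatrix}
       \begin{pmatrix} x^2 \\ x \\ 1 \end{pmatrix}
       \quad \text{(linear in } a, b, c \text{)}

a (x - b)^2 + c = a x^2 - 2ab\,x + (a b^2 + c)

In the second form the coefficients involve the products ab and ab^2, so the model is no longer linear in its parameters, and the posterior can have several local maxima.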

I already gave this advice in the README of clop: if you have an idea of good values for the parameters, you can prime the clop experiment by starting with a narrow interval around these values. Then, you can widen the parameter range. (I am not 100% certain now, but I expect this would cause no data loss with the current clop.) It seems to be the right way to bias the clop process with prior knowledge.
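
For example (hypothetical parameter name and ranges, same experiment-file syntax as above), start the experiment with a narrow range around the known good value:

Code: Select all

IntegerParameter KnightValue 290 330

and once the regression has accumulated data there, widen the line to

Code: Select all

IntegerParameter KnightValue 200 400

and resume the same experiment.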

Rémi
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Daniel Shawul »

Note that, in general, you cannot use CLOP to estimate the strength of the optimised program accurately. Win rates produced by CLOP are biased. Win rate over all samples is pessimistic. Local win rate tends to be optimistic. The real win rate of optimal parameters is somewhere in-between.
Is that the reason why the Elo I calculate from the winning-rate percentage sometimes does not match the one displayed on the GUI with 95% confidence? For example, I saw a 47% winning rate, suggesting -21 Elo, but only -5 Elo is displayed. Also, since the currently selected parameters do not get all the games played (some early games are probably truncated), how are the mean and confidence intervals estimated?
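
For reference, I am converting the winning rate p to an Elo difference D with the usual logistic formula (the standard conversion, nothing CLOP-specific):

Code: Select all

p = \frac{1}{1 + 10^{-D/400}}
\quad\Longrightarrow\quad
D = -400 \log_{10}\!\left(\frac{1}{p} - 1\right),
\qquad
p = 0.47 \;\Rightarrow\; D \approx -20.9

which is where my -21 Elo figure comes from.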
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: CLOP for Noisy Black-Box Parameter Optimization

Post by Michel »

(I am not 100% certain now, but I expect this would cause no data loss with the current clop.)
I can confirm this is safe. It does not lose data.
It seems to be the right way to bias the clop process with prior knowledge.
Yes, this is what I am doing now. I assume it will work (still only 33000 games).
From a theoretical point of view, this method appears to be slightly inelegant, though...