Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 7:05 pm
by D Sceviour
Middle game bonuses sometimes follow a linear pattern, and endgame values follow a logarithmic pattern. These patterns can be used to generate mg/eg tables with appropriate formulae. Texel's tuning method can be used to fine-tune the coefficients. My research engine demonstrated a definite strength increase.

For example, knight mobility values can be generated:

 mg(x) = 6.6 x - 28.86	      linear
 eg(x) = 24.76 ln(x) - 43.82	logarithmic
where x = the number of escape squares that a knight can move to. If x = 0 then the knight is trapped. The results are:

knm_mg(8) = { -29,-22,-16,-9,-2, 4, 11, 17, 24 }
knm_eg(8) = { -50,-44,-27,-17,-9,-4, 1, 4, 8 }

knm_eg(0) = -50 is set arbitrarily, because ln(0) is undefined.

The first coefficient is an accurate description of the shape and proportion, but the second coefficient may have to be altered intuitively to make sense against known evaluations of positions. That problem may be more an aberration of my version of Texel's tuning method, which overall seems to generate extreme values that are often larger than might be expected. Further, it adds extra variation to all the other tuning values, which never seem to stabilize. Does anyone have any suggestions on how to get tuning values to "stabilize"? Those who have tried Texel's tuning method should understand what this means. To a degree, one simply has to trust the tuning method.

Good luck if you try this.

Re: Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 7:59 pm
by jdart
IMO it is simpler to just let each possible number of escape squares be its own parameter, and tune them independently.

Re convergence, you need to have a large enough training set, for starters. It also makes a difference what kind of positions are included. You want diversity so that all relevant features are present in the training set, but you also generally want to avoid very lopsided positions because changing parameter values has very little effect on their outcomes.

Re avoiding extreme values: you can also try penalizing large parameter values by applying L2 regularization (add a penalty term that is a constant times the sum of squares of the normalized parameters).

--Jon

Re: Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 8:21 pm
by D Sceviour
jdart wrote:Re convergence, you need to have a large enough training set, for starters. It also makes a difference what kind of positions are included. You want diversity so that all relevant features are present in the training set, but you also generally want to avoid very lopsided positions because changing parameter values has very little effect on their outcomes.
500,000 positions were tested for the values. That should be sufficient diversity.
Re avoiding extreme values: you can also try penalizing large parameter values by applying L2 regularization (add a penalty term that is a constant times the sum of squares of the normalized parameters).
--Jon
https://en.wikipedia.org/wiki/Regulariz ... thematics)

Interesting. Could you expand a little more? That is, do you have an L2-regularization-for-dummies example that can be applied quickly to the formula? Texel's tuning is:

#define sigmoid(s) ( 1 / (1 + 10^(-K*s/400)) )

and the error for the test set is added up with:

E += (result - sigmoid(score))^2

Re: Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 8:46 pm
by jdart
Peter says he used about 8 million positions for tuning. My current training set is about 6 million.

L2 regularization is easy: you just add a penalty term:

lambda*sum(param(i)^2)

where param(i) is normalized (one way is to subtract the midpoint and divide by the range of each parameter), and lambda is a constant (have to tune this).

The partial derivative of the regularization term with respect to each param(i) is

2*lambda*param(i)
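In code, the penalty and its per-parameter gradient contribution might look like this sketch, assuming params[] already holds the normalized values:

```c
/* Sketch of the L2 penalty added to the objective, assuming params[]
 * holds the already-normalized parameter values. */
static double l2_penalty(const double *params, int n, double lambda)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += params[i] * params[i];
    return lambda * sum;
}

/* Per-parameter contribution to the gradient of the penalty */
static double l2_gradient(double param_i, double lambda)
{
    return 2.0 * lambda * param_i;
}
```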

Re: Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 9:54 pm
by D Sceviour
jdart wrote:Peter says he used about 8 million positions for tuning. My current training set is about 6 million.
You have a lot of patience to wait! Generally, getting the numbers in the ballpark is usually good enough. As you mention, hand tuning can be just as effective. The importance of exact precision has to be assessed for each variable. The next generation of tuning may increase the number of test positions, since the variables are now mostly "in the ballpark".
L2 regularization is easy: you just add a penalty term:
I am lost. Add a penalty term to what? I think you mean this:

E += (result - sigmoid(score))^2 + lambda*sum(param(i)^2)
where param(i) is normalized (one way is to subtract the midpoint and divide by the range of each parameter),
Still lost. The knight middle-game coefficient 6.6 given above is a linear constant independent of any (i)...
and lambda is a constant (have to tune this).
Ok, easy enough to find once everything else is in order.

Re: Logarithmic Patterns In Evaluations

Posted: Sat Dec 09, 2017 11:51 pm
by jdart
No, I am not saying hand tune mobility. I am saying, Knight with one square to move to gets one weight, and Knight with two squares to move to gets another weight. Both weights can be auto-tuned. They are independent.

Yes, the regularization penalty is added to the objective.

The reason you normalize is that the purpose of regularization is to penalize extreme values. So you want large values of the parameters (+ or -) to give larger penalties.

Actually, while doing the tuning steps, you should normalize all your parameters (tuned weights) so that they have the same range (typically 0..1). You can re-scale them once the optimization is done.
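That mapping to a common 0..1 range and back can be sketched with a pair of hypothetical helpers (names are illustrative, not from any engine in the thread):

```c
/* Hypothetical helpers for the normalization described above: map each
 * parameter to 0..1 for tuning, then rescale once optimization is done.
 * lo/hi are the allowed bounds of the parameter. */
static double normalize(double v, double lo, double hi)
{
    return (v - lo) / (hi - lo);
}

static double rescale(double t, double lo, double hi)
{
    return lo + t * (hi - lo);
}
```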

--Jon

Re: Logarithmic Patterns In Evaluations

Posted: Sun Dec 10, 2017 12:15 am
by D Sceviour
jdart wrote:No, I am not saying hand tune mobility. I am saying, Knight with one square to move to gets one weight, and Knight with two squares to move to gets another weight. Both weights can be auto-tuned. They are independent.
I do not agree they are independent. They follow an average logarithmic pattern of increase.
The reason you normalize is that the purpose of regularization is to penalize extreme values. So you want large values of the parameters (+ or -) to give larger penalties.
This can be approached from a different point of view. The following type of position is won for white in almost all queen positions:
[d]8/8/8/6k1/3Q2p1/8/5K2/8 w - - 0 1
Queen mobility numbers are irrelevant here, and a tuner would drive their values toward infinity. Instead, a test such as:

if whiteMaterial < 300 then ...

will bypass any mobility terms in the static evaluation. Some static evaluators also call this a lazy evaluation cutoff.
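As a minimal sketch of that guard (the 300-centipawn threshold is the one given in the post):

```c
/* Sketch of the material cutoff described above: below the threshold,
 * the mobility terms are skipped entirely in the static evaluation. */
static int mobility_applies(int whiteMaterial)
{
    return whiteMaterial >= 300;   /* 300 cp, per the post */
}
```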

Re: Logarithmic Patterns In Evaluations

Posted: Sun Dec 10, 2017 3:29 pm
by D Sceviour
D Sceviour wrote:
jdart wrote:No, I am not saying hand tune mobility. I am saying, Knight with one square to move to gets one weight, and Knight with two squares to move to gets another weight. Both weights can be auto-tuned. They are independent.
I do not agree they are independent. They follow an average logarithmic pattern of increase.
It should be added further that this is the point of the exercise. Individual tuning of a parameter only tries to make up for the inadequacy of other variables, in spite of Peter Osterlund's claim that the tuning method adapts somewhat for elasticity of values. The result is an uneven curve with occasional spikes and unexplainable values.

By forcing the parameters to follow a smooth curve, other piece values can fit their curve better. The result should be that pieces will not fight with each other for control of space on the board, but adapt with each other to maximize mobility. The final test is whether there is an increase in strength. This cannot ultimately be seen by changing mobility for only one piece, but for all pieces so they can co-ordinate. Also, by forcing a natural logarithmic curve and testing its coefficients, a smaller sample size for the test set should produce a faster convergence.

This method is new (to me) but eventually it should be adaptable to all tables. The next step will be to try passed pawn values.

Re: Logarithmic Patterns In Evaluations

Posted: Sun Dec 10, 2017 9:34 pm
by jdart
> The result is an uneven curve with occasional spikes and unexplainable values.

This is to be expected, because your training data is noisy (you only know one of the three possible results for each position, and some of those results are bogus: you might have gotten a draw result from a lost position, for example).

The more training data you have, the less a problem this is.

But also: you really have to look at the results of training. Does it play better, despite what look to you like weird values? If so then you should probably leave it be, and not try to coerce it into values that look ok to you.

--Jon