Ab-initio evaluation tuning

asanjuan · Post by **asanjuan** » Thu Aug 31, 2017 12:27 pm

Evert wrote: 4. In the evaluation, I have fixed the value of a pawn in the end game (VALUE_P_EG) at 256. This fixes the scale for the evaluation, which is otherwise arbitrary.

There's no need to do that. You are biasing the result. The function is not arbitrary at all.
If the "real" value of the pawn turns out to be less than the fixed one, the regression will probably lower the values of the other pieces so that the whole set still fits the game outcomes.

I could be wrong, but I think you should let the algorithm fit the pawn endgame value too. If I am right, you will see the pawn endgame value drop and the endgame values of the other pieces rise to something similar to their middlegame values.

Evert · Post by **Evert** » Thu Aug 31, 2017 12:30 pm

hgm wrote: Actually he might have gotten it from me: I remember a thread here where I proposed a Bishop-dependent Pawn value, to be handled through the Pawn hash. Upon which Tord remarked that we needed a Pawn-dependent Bishop value, and that I pointed out this is the same, and just a quadratic cross term between Bishops and Pawns.

I never knew that!
Years ago I read a forum post by Tord about the quadratic form used in Glaurung, but I don't remember which forum it was on (it wasn't this one; it might have been WB) and I can't find it now. All I remember is that he described it as a generalisation of the bishop pair bonus to other piece types.
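
As a rough illustration of the "quadratic cross term" idea, here is a minimal sketch. It is not Glaurung's actual code; the function name `material_score`, the dictionaries, and all numbers are made up for the example. It only shows that a Bishop-dependent Pawn value and a Pawn-dependent Bishop value are two readings of the same Bishop x Pawn term.

```python
def material_score(counts, linear, quadratic):
    """Linear piece values plus quadratic cross terms between piece counts.

    counts:    piece type -> number of pieces
    linear:    piece type -> value per piece
    quadratic: (type_a, type_b) -> coefficient of count_a * count_b
    """
    score = sum(linear[p] * n for p, n in counts.items())
    for (a, b), coeff in quadratic.items():
        score += coeff * counts.get(a, 0) * counts.get(b, 0)
    return score

# Made-up numbers, purely for illustration.
linear = {"P": 1.0, "B": 3.0}
quadratic = {("B", "P"): -0.02}
counts = {"P": 6, "B": 2}

total = material_score(counts, linear, quadratic)

# The same cross term read two ways:
per_pawn = linear["P"] + quadratic[("B", "P")] * counts["B"]    # Bishop-dependent Pawn value
per_bishop = linear["B"] + quadratic[("B", "P")] * counts["P"]  # Pawn-dependent Bishop value
```

Folding the cross term into either the Pawn value or the Bishop value reproduces the same total, which is the equivalence described in the quoted post.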

Evert · Post by **Evert** » Thu Aug 31, 2017 12:40 pm

asanjuan wrote: There's no need to do that. You are biasing the result. The function is not arbitrary at all.
If the "real" value of the pawn turns out to be less than the fixed one, the regression will probably lower the values of the other pieces so that the whole set still fits the game outcomes.

I could be wrong, but I think you should let the algorithm fit the pawn endgame value too. If I am right, you will see the pawn endgame value drop and the endgame values of the other pieces rise to something similar to their middlegame values.

I can give it a shot, but I'm not convinced. Piece values are arbitrary up to a scale factor, which I fixed by setting the Pawn EG value. Allowing that to vary as well is equivalent to scaling the whole evaluation, which doesn't actually matter.

...

On second thought, that's what the constant in the logistic function does as well (fix the conversion of evaluation score to game outcome), so fixing both for the tuning algorithm might be wrong. I'll investigate that.

asanjuan · Post by **asanjuan** » Thu Aug 31, 2017 12:48 pm

Evert wrote: [...] On second thought, that's what the constant in the logistic function does as well (fix the conversion of evaluation score to game outcome), so fixing both for the tuning algorithm might be wrong. I'll investigate that.

You are fitting game outcomes (1, 0 or 0.5) over
f(x) = 1/(1+exp(-k*x))

The pawn value should be the result of the optimisation process, not an arbitrary number. Your goal is to minimize the error of that fit, not to obtain 1.00 for the pawn value, which is a "cosmetic" result. So: you need to tune ALL parameter values.
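
A minimal sketch of the fit being described, assuming a mean-squared-error objective (the thread does not say which error measure is used). The names `predicted_score` and `fit_error` and the sample data are hypothetical.

```python
import math

def predicted_score(x, k=1.0):
    """Logistic mapping f(x) = 1 / (1 + exp(-k*x)) from evaluation to expected score."""
    return 1.0 / (1.0 + math.exp(-k * x))

def fit_error(evals, outcomes, k=1.0):
    """Mean squared difference between game outcomes (1, 0.5, 0) and predictions.

    Assumption: squared error is used here for illustration only.
    """
    total = 0.0
    for x, result in zip(evals, outcomes):
        total += (result - predicted_score(x, k)) ** 2
    return total / len(evals)

# Made-up evaluations (in pawns, from White's point of view) and game results.
evals = [2.0, 0.0, -1.5]
outcomes = [1.0, 0.5, 0.0]
error = fit_error(evals, outcomes)
```

A tuner would adjust the evaluation parameters that produce `evals` so that `error` becomes as small as possible.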

For the record: Rhetoric also uses k=1 and I tune EVERY parameter.

Just try.

asanjuan · Post by **asanjuan** » Thu Aug 31, 2017 1:02 pm

asanjuan wrote: [...] The pawn value should be the result of the optimisation process, not an arbitrary number. [...] So: you need to tune ALL parameter values. [...]

Another point.

This is just a thought, not something I have tested, but I've worked quite a lot with this stuff. In my experience, without PSTs or knowledge about pawn advances, the learning process won't learn enough about how important pawns are in endgames: the game outcome will almost surely be adjudicated after the promotion, so the tuning will produce low values for pawns because it learns that they are not important.

The tuning can learn that the important thing is the piece after the promotion, not the pawn itself.

AlvaroBegue · Post by **AlvaroBegue** » Thu Aug 31, 2017 1:09 pm

Evert wrote:[...] Something else that bothers me: we're trying to fit the continuous outcome of a logistic function to the ordinal outcome of the game. I suspect it might be better to assign each position a value in the range 0..1 based on a number of games played from that position, with the value determined by the average score (the fraction of points won by White). The computation cost to do that is huge though...

This should not bother you at all. A probability is a continuous quantity that expresses our expectation of a binary result, and we use logistic regression to fit probabilities to discrete outcomes all the time. If you have time to play more games, you should probably play them from new positions so you get a more complete sample of situations on the board, instead of playing them from the same positions.

AlvaroBegue · Post by **AlvaroBegue** » Thu Aug 31, 2017 1:13 pm

asanjuan wrote:[...]

You are fitting game outcomes (1, 0 or 0.5) over
f(x) = 1/(1+exp(-k*x))

The pawn value should be the result of the optimisation process, not an arbitrary number. Your goal is to minimize the error of that fit, not to obtain 1.00 for the pawn value, which is a "cosmetic" result. So: you need to tune ALL parameter values.

The introduction of the factor k allows you to have an arbitrary scale in the evaluation. If you like evaluations where the EG value of the pawn is 1.00, you can have them. If you like evaluations where k is 1.00, you can have them. Proportional evaluation functions are equivalent, except for the effects introduced by rounding.
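
The equivalence of proportional evaluations can be checked numerically. This is a minimal sketch: the position value and `k_pawns = 1.0` are assumed for illustration, and 256 is the internal pawn value mentioned at the start of the thread.

```python
import math

def predicted_score(x, k):
    """Logistic mapping from evaluation x to expected score."""
    return 1.0 / (1.0 + math.exp(-k * x))

# Hypothetical evaluation of one position, expressed in two proportional scales.
eval_in_pawns = 1.5            # scale where the EG pawn is 1.00
scale = 256.0                  # internal scale where the EG pawn is 256
eval_internal = eval_in_pawns * scale

k_pawns = 1.0                  # assumed k for the pawn-unit scale
k_internal = k_pawns / scale   # k that compensates for the larger scale

p_pawns = predicted_score(eval_in_pawns, k_pawns)
p_internal = predicted_score(eval_internal, k_internal)
```

Both calls predict the same expected score, because only the product k*x enters the logistic function.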

Evert · Post by **Evert** » Thu Aug 31, 2017 2:05 pm

asanjuan wrote: You are fitting game outcomes (1, 0 or 0.5) over
f(x) = 1/(1+exp(-k*x))

The pawn value should be the result of the optimisation process, not an arbitrary number. Your goal is to minimize the error of that fit, not to obtain 1.00 for the pawn value, which is a "cosmetic" result. So: you need to tune ALL parameter values.

For the record: Rhetoric also uses k=1 and I tune EVERY parameter.

Just try.

Well, my point was that you can fix either the EG value of a Pawn, or k. Fixing both probably removes too many degrees of freedom. So one can either

Fix k and tune Value[P][EG]
Fix Value[P][EG] and tune k

Now, k is not actually important to the engine because the engine is not concerned with mapping evaluation scores to predicted game results, only that a larger evaluation corresponds to better winning chances.

For cosmetic reasons, I would prefer to fix Value[P][EG] in the engine, but when tuning it's actually easier to fix k (at least in the way I implemented it, I just add the pawn value to the list of evaluation parameters to tune). You can of course do both after a fashion: fix k, tune the evaluation, then for the purpose of playing games rescale everything so that the end-game value of a pawn is fixed. The only drawback then is that you need to determine k again for the next run.

Anyway, allowing the pawn value to vary gives me this:

   MG    EG
P  0.67  1.07
N  3.27  2.94
B  3.19  3.09
R  4.08  5.39
Q  9.13  9.93
BB 0.10  0.21

so the problem certainly looks to be smaller, but is not actually gone. I guess that means that to do this correctly, I do need some passed-pawn terms.
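
The "fix k, tune, then rescale" step described in this post can be sketched with the values from the table above. The helper `rescale_to_pawn_eg` is hypothetical, and `k=1.0` is an assumed value, since the post does not state the k used for this run.

```python
# Values copied from the table above as (MG, EG) pairs.
values = {
    "P": (0.67, 1.07),
    "N": (3.27, 2.94),
    "B": (3.19, 3.09),
    "R": (4.08, 5.39),
    "Q": (9.13, 9.93),
    "BB": (0.10, 0.21),
}

def rescale_to_pawn_eg(values, k, target=1.0):
    """Scale every term so the EG pawn equals `target`; adjust k to compensate."""
    factor = target / values["P"][1]
    scaled = {name: (mg * factor, eg * factor) for name, (mg, eg) in values.items()}
    # Dividing k by the same factor keeps k * evaluation unchanged.
    return scaled, k / factor

scaled, new_k = rescale_to_pawn_eg(values, k=1.0)  # k=1.0 is an assumption
```

After rescaling, the predicted game results are unchanged, but k has to be stored (or redetermined) for the next tuning run, which is the drawback mentioned above.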

asanjuan · Post by **asanjuan** » Thu Aug 31, 2017 2:13 pm

AlvaroBegue wrote: The introduction of the factor k allows you to have an arbitrary scale in the evaluation. If you like evaluations where the EG value of the pawn is 1.00, you can have them. If you like evaluations where k is 1.00, you can have them. Proportional evaluation functions are equivalent, except for the effects introduced by rounding.

In other words: go and try different values of k until you get 1.00 for the endgame pawn value after tuning every parameter.

But you can't fix both at the same time, because fixing the EG pawn value introduces a bias in the rest of the parameters for a given k.

Evert · Post by **Evert** » Thu Aug 31, 2017 2:18 pm

AlvaroBegue wrote: This should not bother you at all. A probability is a continuous quantity that expresses our expectation of a binary result, and we use logistic regression to fit probabilities to discrete outcomes all the time. If you have time to play more games, you should probably play them from new positions so you get a more complete sample of situations on the board, instead of playing them from the same positions.

Maybe.
But how certain are you that the outcome is correct? If the result is 1-0, but a match over 10 games would result in 7-3, then we would be better off using 0.7 rather than 1.0 for that position. Worse, what if the result of a 10-game match is 1-9? Right now we treat positions that are won because you are a minor ahead as though they should give the same score as positions that are won because you are a queen ahead. They aren't, of course. Allowing for a difference there should reduce the noise in the fit. Of course adding more positions can also do that, but each individual position is then less important. On the other hand, reducing noise is not a goal per se.

The interesting positions here are not those where you are a piece ahead (or behind), of course, but those closer to the draw score, around the inflection point of the logistic function. I might try to make an estimate for how much time it would take to get a better estimate for some of those.
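
One way to compare the two targets for the 7-3 example is sketched below. It assumes a squared-error objective (the thread does not specify the loss) and a match with no draws; the function names are hypothetical.

```python
def squared_loss_individual(results, p):
    """Sum of squared errors when each game result is its own data point."""
    return sum((r - p) ** 2 for r in results)

def squared_loss_averaged(results, p):
    """Squared error against the match average, weighted by the number of games."""
    mean = sum(results) / len(results)
    return len(results) * (mean - p) ** 2

# The 7-3 match from the post, treated as 7 wins and 3 losses (assumption: no draws).
results = [1.0] * 7 + [0.0] * 3
target = sum(results) / len(results)

# Difference between the two losses for several predicted scores p.
gaps = [squared_loss_individual(results, p) - squared_loss_averaged(results, p)
        for p in (0.1, 0.5, 0.9)]
```

Under these assumptions the two losses differ only by a constant that does not depend on the prediction p, so for the same set of games they are minimized by the same p. What the averaged target requires is the extra games themselves.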
