Exchange

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

abulmo2
Posts: 433
Joined: Fri Dec 16, 2016 11:04 am
Location: France
Full name: Richard Delorme

Re: Exchange

Post by abulmo2 »

xr_a_y wrote: Mon Jul 01, 2019 9:23 am This is scary. Do other engines exhibit such lower EG values?
For Amoeba, the pawn, rook, bishop pair and material imbalance terms have significantly higher values during the endgame phase. Knight and bishop keep about the same value, and the queen is slightly stronger. However, Amoeba's evaluation function is of course more complex than just material, and the game phase influences many other values.
Richard Delorme
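For readers unfamiliar with how a game phase can shift piece values, here is a minimal sketch of a tapered evaluation in Python. It is not Amoeba's code: the piece values, phase weights and function names below are hypothetical, chosen only to illustrate linear interpolation between a middlegame (MG) and an endgame (EG) value.

```python
# Hypothetical (MG, EG) piece values in centipawns; illustrative only,
# not taken from Amoeba or any other engine.
PIECE_VALUES = {
    "P": (100, 130),
    "N": (320, 320),
    "B": (330, 330),
    "R": (500, 560),
    "Q": (950, 980),
}

# Assumed phase weight per piece type; 24 corresponds to all minor and
# major pieces of both sides still being on the board.
PHASE_WEIGHT = {"P": 0, "N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24


def game_phase(counts):
    """counts maps a piece letter to the total number on the board (both sides)."""
    phase = sum(PHASE_WEIGHT[piece] * n for piece, n in counts.items())
    return min(phase, MAX_PHASE)


def tapered_value(piece, phase):
    """Interpolate linearly between the MG value (full phase) and the EG value (phase 0)."""
    mg, eg = PIECE_VALUES[piece]
    return (mg * phase + eg * (MAX_PHASE - phase)) / MAX_PHASE
```

With these assumed numbers a rook is worth 500 at full material and 560 once all pieces are gone, with intermediate values in between.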
PK
Posts: 893
Joined: Mon Jan 15, 2007 11:23 am
Location: Warsza

Re: Exchange

Post by PK »

My Texel tuning also consistently gets a significantly lower value for a knight in the endgame and a slightly lower one for a bishop. Rook and queen behave as expected. The observation is consistent across two engines, with different K factors but the same set of positions. I guess it might be fixed by increasing the endgame pawn value first.
hgm
Posts: 27791
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Exchange

Post by hgm »

xr_a_y wrote: Mon Jul 01, 2019 11:28 am I agree with your analysis, but:

1°) my recent optimization runs (Texel tuning on quiet.edp) disagree, even when starting from higher EG values: they converge to lower EG values (without the additional term, of course), and I don't get why yet.

2°) the added term won't affect the opening, because the scaling factor depends on the material imbalance, which is small in the opening (I guess above 1 pawn we are in the crazy part of opening theory).

I should add that the PSTs used (from rofchade1) are almost centered, so I think they do not influence those EG values too much.
Texel tuning is suspect, because it suffers from the same problem as what makes LC0 suck: it only tunes on expected score, ignoring how long/difficult it is to actually get that score. And what we are discussing here is a progress measure, which is furthermore expected to only become important when the imbalance is so large that the score expectancy nearly saturates. (We already concluded that within the draw range it would actually backfire to hasten trading.) So it would be rather ill-defined in the fitting process.

Also, Texel tuning is only as good as the terms you give it to tune. A global up-scaling of EG values would promote trading both inside and outside the draw range, which probably hurts more in the draw range than it helps (result-wise) outside it (as it mostly helps progress). You cannot expect it to get a good result if there are no eval terms that can make the distinction between decided and drawish positions, because you force it to average things that should have been distinguished.

I always compare this with fitting a circle with a square. You can make an art of tuning the size of a square such that on average as many randomly chosen points in the plane will fall inside and outside it as they do with the circle. But no matter how precisely you do this, using the square to determine whether an individual point will be inside or outside the circle will always have a pretty large error rate. If you had used an octagon instead of a square it would do much better, even without tuning, with just a reasonable guess for its size (e.g. with its corners on the circle, or its edges touching it).
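The analogy can be checked numerically. The sketch below assumes a unit circle, interprets the "tuned" square as the axis-aligned square with the same area as the circle, and counts how often each shape disagrees with the circle on a regular grid of points. All function names and the grid size are made up for this illustration.

```python
import math

# Outward unit normals of the 8 edges of a regular octagon centered at the origin.
NORMALS = [
    (math.cos(math.pi / 8 + k * math.pi / 4), math.sin(math.pi / 8 + k * math.pi / 4))
    for k in range(8)
]


def in_circle(x, y):
    return x * x + y * y <= 1.0


def in_square(x, y, half_side):
    return abs(x) <= half_side and abs(y) <= half_side


def in_octagon(x, y, apothem):
    # A point is inside when it lies behind all 8 edge lines.
    return all(x * nx + y * ny <= apothem for nx, ny in NORMALS)


def disagreement(shape, n=300, extent=1.5):
    """Fraction of grid points in [-extent, extent]^2 that `shape`
    classifies differently from the unit circle."""
    step = 2 * extent / n
    wrong = 0
    for i in range(n):
        x = -extent + (i + 0.5) * step
        for j in range(n):
            y = -extent + (j + 0.5) * step
            if shape(x, y) != in_circle(x, y):
                wrong += 1
    return wrong / (n * n)


# Square "tuned" to have exactly the circle's area: side = sqrt(pi).
tuned_square = disagreement(lambda x, y: in_square(x, y, math.sqrt(math.pi) / 2))
# Octagon with its corners on the circle (apothem = cos(pi/8)).
inscribed_octagon = disagreement(lambda x, y: in_octagon(x, y, math.cos(math.pi / 8)))
# Octagon with its edges touching the circle (apothem = 1).
circumscribed_octagon = disagreement(lambda x, y: in_octagon(x, y, 1.0))

print(tuned_square, inscribed_octagon, circumscribed_octagon)
```

Under these assumptions both untuned octagons misclassify clearly fewer points than the area-matched square, which is the point of the comparison: a better-shaped model beats a perfectly tuned but ill-shaped one.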
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Exchange

Post by xr_a_y »

How would you tune piece values without Texel tuning? CLOP would be too slow, I guess.
hgm
Posts: 27791
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Exchange

Post by hgm »

No tuning method can tune eval terms that are missing.

It also depends on what you tune the eval to represent. I would not tune it purely for expected score, but in case of a won position, also for duration of the win. Trying to teach it that +14 is just as good as +10 is very detrimental.
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Exchange

Post by xr_a_y »

hgm wrote: Mon Jul 01, 2019 5:36 pm No tuning method can tune eval terms that are missing.
You stated in a previous post that there is a redundancy of evaluation terms ;-)
hgm wrote: Mon Jul 01, 2019 5:36 pm It also depends what you tune the eval to represent. I would not tune it purely for expected score, but in case of a won position, also for duration of the win. Trying to teach it that +14 is just as good as +10 is very detrimental.
I think Texel tuning can do this, as it minimizes the error between the win/loss result and a sigmoid of the static score of a quiet position. This way a +14 position has a smaller error than a +10 one. So it should be possible to see that bigger EG values are better, no?

Oh! I get your point now: we are not using the "distance to win" of each position... :(
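To put rough numbers on this, here is a minimal sketch of the per-position error that Texel tuning minimizes, assuming the commonly used sigmoid 1 / (1 + 10^(-K*q/400)) with the score q in centipawns, K = 1, and +10 / +14 read as pawns (1000 and 1400 centipawns). The function names are made up for the example, and K would normally be fitted to the engine.

```python
def expected_score(score_cp, k=1.0):
    """Map a static score in centipawns to an expected game result in [0, 1].
    The 10^(-k*q/400) form and k=1.0 are assumptions for this sketch."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def squared_error(score_cp, result, k=1.0):
    """Squared difference between the game result (1, 0.5 or 0) and the sigmoid.
    Note the result carries no information about how long the win took."""
    return (result - expected_score(score_cp, k)) ** 2


# A won game (result = 1.0) evaluated at +2, +10 and +14 pawns.
for pawns in (2, 10, 14):
    print(pawns, squared_error(100 * pawns, 1.0))
```

Under these assumptions the +14 position does have a smaller error than the +10 one, but both errors are already close to zero because the sigmoid saturates, so the gradient pushing toward larger values in won positions is tiny compared with the one at +2. And since the target is only the game result, nothing in the error rewards a shorter win.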