cursed parameters in Texel tuning


PK
Posts: 893
Joined: Mon Jan 15, 2007 11:23 am
Location: Warsza

cursed parameters in Texel tuning

Post by PK »

I have been doing a kind of manually aided Texel tuning for a couple of months. The procedure is that I change weights by hand and accept a change when the new value scores better, according to a sigmoid-based error function, across a million or two positions evaluated by quiescence search. Obviously this is not as efficient as automatic tuning, but it allowed me to notice a curious phenomenon, which I call "cursed parameters". In my engine I have identified several parameters that yield a gain according to the sigmoid function but decrease playing strength in test matches (1000 to 4000 games). On the other hand, there are parameters (such as a queen check threat) that have always been significant in game testing but have no impact on a tuning run. My hypothesis is that these parameters occur too frequently for the losing side. Is there any automatic way to detect cursed parameters in order to avoid tuning them?
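
For readers who have not seen the setup, the acceptance test described in this post can be sketched roughly as follows. This is only an illustration, not PK's actual code: the names `win_probability` and `tuning_error`, the scaling constant `k`, and the divisor 400 are assumptions based on the commonly described form of Texel tuning.

```python
def win_probability(score_cp, k=1.0):
    """Map a centipawn score (White's point of view) to an expected
    game result in [0, 1] with a logistic curve.
    k and the divisor 400 are assumed scaling choices."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def tuning_error(scores_cp, results, k=1.0):
    """Mean squared difference between actual game results
    (1.0 win, 0.5 draw, 0.0 loss for White) and the results
    predicted from the quiescence-search scores."""
    total = 0.0
    for score, result in zip(scores_cp, results):
        total += (result - win_probability(score, k)) ** 2
    return total / len(results)
```

A manual step then amounts to recomputing `tuning_error` over the whole position set with the changed weight and keeping the change only if the error went down.
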
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: cursed parameters in Texel tuning

Post by jdart »

Tuning for maximum alignment between game result and score is imperfect because, among other things, there will always be cases where the game-changing move is far away from the point where you are measuring the score. The method relies on this kind of noise being averaged out over a large number of positions. But some noise always remains, and the rarer a tuning feature is, the noisier its tuned value will be.

I don't think there is any real substitute for game testing. Parameter tuning is supposed to correlate with performance in real games at reasonable depths, but the correlation is not perfect.

Regularization (a feature of automatic tuning) may help. L2 regularization helps keep individual parameters from getting too large, and L1 regularization helps zero out parameters that don't contribute significantly to prediction.
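
As a rough sketch of what that means in practice, the penalties are simply added to the tuning error before two candidate weight sets are compared. The function name and the coefficients `l1` and `l2` below are illustrative assumptions, not values from any particular tuner, and a real tuner would normally penalize the distance from some sensible default rather than from zero for terms like piece values.

```python
def regularized_error(base_error, weights, l1=0.0, l2=0.0):
    """Add L1 and L2 penalties to an already computed tuning error.

    base_error: unpenalized error over the position set.
    weights:    the tunable parameters being penalized.
    l1, l2:     penalty strengths (assumed, need tuning themselves).
    """
    l1_term = l1 * sum(abs(w) for w in weights)   # pushes small weights to zero
    l2_term = l2 * sum(w * w for w in weights)    # discourages very large weights
    return base_error + l1_term + l2_term
```

With a penalty in place, a rarely occurring feature has to reduce the prediction error by more than its penalty costs before a larger weight is accepted, which is the sense in which regularization damps noisy parameters.
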

(Also, btw, 1000-4000 games is quite a low number for verification. You need more than that to get reasonably tight error bounds, to the point where you can really see whether a modest parameter change is good or not.)
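
To put a number on that, here is a minimal sketch of an Elo estimate with an approximate 95% interval computed from win/draw/loss counts. It assumes independent games and a normal approximation for the mean score; the function names are made up for illustration, and the 40% draw rate used in the note after the code is only an example.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Uses the per-game variance of the result (1, 0.5 or 0) and a
    normal approximation; z=1.96 gives roughly a 95% interval.
    Assumes the score is not so extreme that the bounds leave (0, 1).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    stderr = math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))
```

For an even match with 40% draws, `elo_interval(300, 400, 300)` gives an interval of roughly plus or minus 17 Elo at 1000 games, and `elo_interval(1200, 1600, 1200)` roughly plus or minus 8 Elo at 4000 games, so a change worth only a few Elo cannot be distinguished from noise at those sample sizes.
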

--Jon