Hi all,
Because I never succeeded in improving Barbarossa with the Texel tuning method, even after a very deep dive over the last few months, I want to discuss some basic principles, to be sure I am not making a basic mistake in my assumptions. Just in case there is still some interest in this topic after NNUE took over.
So I just want to postulate a few principles, which to me are evident, but maybe not to everybody, and maybe discuss them a bit:
1. Any significant improvement of the Texel loss function (see the sketch after this list) should give an improvement of the evaluation function (except for small noisy improvements after just a few optimization steps, which are possible because the training set is finite)
2. The method used to improve the loss function is not relevant (except maybe for efficiency)
3. An improved eval function should, most of the time, result in better playing strength of the engine
4. (Actually a corollary of 1) Even if you optimize just some of the eval function's parameters, this should lead to a better engine
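To fix the terminology, here is a minimal Python sketch of the loss function I mean, in the standard Texel formulation. qeval is a placeholder for the engine's quiescence evaluation (centipawns, from White's point of view), not Barbarossa's actual code, and the value of K is just an example:

```python
K = 1.13  # scaling constant, fitted once to the data set and then kept fixed

def sigmoid(score):
    # Maps a centipawn score to an expected game result in (0, 1).
    return 1.0 / (1.0 + 10.0 ** (-K * score / 400.0))

def texel_loss(positions, params, qeval):
    # positions: list of (position, result) with result in {0.0, 0.5, 1.0}
    # Mean squared error between actual and predicted game results.
    total = 0.0
    for pos, result in positions:
        total += (result - sigmoid(qeval(pos, params))) ** 2
    return total / len(positions)
```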
Needless to say, I got large improvements of the loss function with different methods and by optimizing different eval parameters, with no success in playing strength, or at best very little (like 3 Elo, maybe just noise). Otherwise I would not be asking you for help.
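For concreteness, one such method is the simple local search from the original Texel description; a sketch reusing texel_loss from above (the step size of 1 centipawn is an assumption):

```python
def local_search(positions, params, qeval):
    # params: mutable list of integer eval parameters (centipawns)
    best = texel_loss(positions, params, qeval)
    improved = True
    while improved:
        improved = False
        for i in range(len(params)):
            for step in (1, -1):
                params[i] += step
                loss = texel_loss(positions, params, qeval)
                if loss < best:
                    best = loss
                    improved = True
                    break          # keep the change, move to the next parameter
                params[i] -= step  # revert and try the other direction
    return params, best
```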
It is clear that the size of the data set is very important, but I experimented with ~4 million positions, mostly quiet (like CCRL 3200). Compared to the number of optimized parameters (at most 100, let's say), I guess overfitting can be excluded (see the sanity check below). Meanwhile, it looks like many people got significantly stronger engines (like tens of Elo) even with smaller data sets.
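To rule out overfitting more directly, one can hold out part of the data set; a sketch reusing the functions above:

```python
import random

def check_overfitting(positions, params, qeval):
    # Hold out 10% of the positions, tune on the rest, compare losses.
    random.shuffle(positions)
    split = int(0.9 * len(positions))
    train, valid = positions[:split], positions[split:]
    tuned, train_loss = local_search(train, list(params), qeval)
    print("train loss:", train_loss)
    print("valid loss:", texel_loss(valid, tuned, qeval))
```

If the validation loss drops together with the training loss and the engine still does not get stronger, the problem is not overfitting but the link between the loss and playing strength.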