Issue with Texel Tuning

Discussion of chess software programming and technical issues.

Moderator: Ras

JVMerlino
Posts: 1396
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: Issue with Texel Tuning

Post by JVMerlino »

Fulvio wrote: Tue Feb 14, 2023 8:45 pm
JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
This is a great explanation imho:
Thanks very much for that, but almost 13 hours of video will probably melt my brain before I can get a solid grasp of the concept. :lol:
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Issue with Texel Tuning

Post by algerbrex »

JVMerlino wrote: Tue Feb 14, 2023 10:12 pm
Fulvio wrote: Tue Feb 14, 2023 8:45 pm
JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
This is a great explanation imho:
Thanks very much for that, but almost 13 hours of video will probably melt my brain before I can get a solid grasp of the concept. :lol:
Once you understand what a gradient is and what it tells us about a function, understanding gradient descent becomes much easier, I think.

So my advice would be to spend a little time flipping through the relevant sections of a college-level calculus textbook on gradients, or I'm sure you can find some good resources online. You don't even need to understand all the underlying theory, just the basics of what a gradient is and how to compute one.

The essential idea is that the gradient of a function is a vector pointing in the direction of steepest ascent from a given point, so negating this vector gives us the direction of steepest descent. We can leverage this to find a local/global minimum of a function by starting at a point in the parameter space, computing the gradient, negating it, and "taking a step" in the direction of this negated gradient vector. In code this looks like iterating through the negated gradient vector and adding each partial derivative to the corresponding parameter to update it. But since we don't want to overshoot the minimum, we take only a fraction of each partial derivative; this fraction is known as the learning rate.

So in pseudo-code something like:

Code: Select all

parameters := list of starting parameters

for iteration := 0; iteration < NUMBER_OF_ITERATIONS; iteration++ {
    negated_gradient := compute_negated_gradient(parameters)

    for i := 0; i < length(negated_gradient); i++ {
        parameters[i] += learning_rate * negated_gradient[i]
    }
}
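To make the loop above concrete, here is a runnable version in Python. The function being minimized, the learning rate, and the iteration count are all arbitrary illustrations (not from any engine); a real tuner would replace `compute_negated_gradient` with the gradient of its evaluation error.

```python
# Gradient descent on a toy function f(x, y) = (x - 3)^2 + (y + 1)^2,
# whose minimum is at (3, -1). Purely illustrative values below.
NUMBER_OF_ITERATIONS = 500
learning_rate = 0.05

def compute_negated_gradient(params):
    """Analytic gradient of f, negated: -df/dx = -2(x - 3), -df/dy = -2(y + 1)."""
    x, y = params
    return [-2.0 * (x - 3.0), -2.0 * (y + 1.0)]

parameters = [0.0, 0.0]  # arbitrary starting point

for iteration in range(NUMBER_OF_ITERATIONS):
    negated_gradient = compute_negated_gradient(parameters)
    for i in range(len(negated_gradient)):
        parameters[i] += learning_rate * negated_gradient[i]

# parameters has now converged very close to the minimum (3, -1)
```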
The hard part is taking your evaluation, developing a multivariable function to model it, and computing the gradient of that function. I have some formulas sketched out in LaTeX that I'd be happy to share if that would be helpful, as I spent some time last year computing the relevant derivations. And I'm pretty sure Andrew's paper conveys much of the same information, so check that out too.
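As one hedged sketch of that step: assuming a linear evaluation (eval = sum of weight × feature) and the common logistic mapping sigma(E) = 1 / (1 + exp(-K * E)), the gradient of the Texel-style mean squared error can be written out by hand. The constant `K`, the feature layout, and the function names below are all illustrative, not a specific engine's implementation.

```python
import math

K = 0.01  # scaling constant; normally fitted to the dataset first

def sigmoid(e):
    """Map a centipawn-like eval to an expected score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-K * e))

def negated_gradient(weights, dataset):
    """dataset: list of (features, result) pairs, result in {0.0, 0.5, 1.0}.
    Returns the negated gradient of the mean squared error
    L = (1/N) * sum_i (result_i - sigmoid(eval_i))^2."""
    grad = [0.0] * len(weights)
    for features, result in dataset:
        e = sum(w * f for w, f in zip(weights, features))
        s = sigmoid(e)
        # d/dw_j of (result - s)^2 = -2 * (result - s) * s * (1 - s) * K * f_j
        common = -2.0 * (result - s) * s * (1.0 - s) * K
        for j, f in enumerate(features):
            grad[j] -= common * f  # minus sign negates the gradient
    return [g / len(dataset) for g in grad]
```

For example, on a single won position the negated gradient pushes the weight of an active feature upward, which is exactly the direction that increases the predicted score.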
emadsen
Posts: 440
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Issue with Texel Tuning

Post by emadsen »

JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
You could use a particle swarm algorithm and not compute any gradient.
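For illustration, a minimal particle-swarm sketch might look like the following. The toy objective and every hyperparameter here are illustrative choices; a real tuner would plug in the Texel mean squared error over its dataset as the objective. Note that no gradient appears anywhere.

```python
import random

def objective(p):
    """Toy objective with minimum at (3, -1); stands in for the tuning error."""
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

def pso(dim=2, particles=20, iterations=200,
        w=0.7, c1=1.5, c2=1.5, lo=-10.0, hi=10.0, seed=42):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best = [p[:] for p in pos]                # each particle's best-seen position
    best_f = [objective(p) for p in pos]
    g = best[best_f.index(min(best_f))][:]    # swarm-wide best position
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull toward own best, pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best[i][d] - pos[i][d])
                             + c2 * r2 * (g[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = objective(pos[i])
            if f < best_f[i]:
                best_f[i], best[i] = f, pos[i][:]
                if f < objective(g):
                    g = pos[i][:]
    return g

result = pso()  # converges near (3, -1) without ever computing a derivative
```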
Erik Madsen | My C# chess engine: https://www.madchess.net
ciorap0
Posts: 23
Joined: Sun Feb 12, 2023 11:10 am
Full name: Vlad Ciocoiu

Re: Issue with Texel Tuning

Post by ciorap0 »

Whiskers wrote: Tue Feb 14, 2023 4:27 pm A good way to test your tuning algorithm is to let it change, say, only the value of a pawn, and see if the results are reasonable.
But what does reasonable mean? I expect the Elo difference to be insignificant with only one changed parameter.
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Issue with Texel Tuning

Post by algerbrex »

ciorap0 wrote: Sat Feb 18, 2023 9:26 am
Whiskers wrote: Tue Feb 14, 2023 4:27 pm A good way to test your tuning algorithm is to let it change, say, only the value of a pawn, and see if the results are reasonable.
But what does reasonable mean? I expect the Elo difference to be insignificant with only one changed parameter.
True, tuning one parameter probably won't make a huge difference in Elo. But you mentioned in your original post that you got unusually small/large values for some evaluation weights. So after tuning, check that the weights finish with reasonable values. Of course, the range of appropriate values varies with the specifics of the engine, but for typical implementations I think values between 60 and 120 are reasonable here. Any higher or lower, and you might be dealing with issues in your dataset and/or tuner.

As said earlier, first make sure the tuning algorithm works by using an established dataset, like the Zurichess dataset. Then you can work on creating an original dataset, which I definitely think is an admirable goal, and one I'm working on myself.