Issue with Texel Tuning

Discussion of chess software programming and technical issues.

Moderator: Ras

JVMerlino
Posts: 1396
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: Issue with Texel Tuning

Post by JVMerlino »

Fulvio wrote: Tue Feb 14, 2023 8:45 pm
JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
This is a great explanation imho:
Thanks very much for that, but almost 13 hours of video will probably melt my brain before I can get a solid grasp of the concept. :lol:
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Issue with Texel Tuning

Post by algerbrex »

JVMerlino wrote: Tue Feb 14, 2023 10:12 pm
Fulvio wrote: Tue Feb 14, 2023 8:45 pm
JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
This is a great explanation imho:
Thanks very much for that, but almost 13 hours of video will probably melt my brain before I can get a solid grasp of the concept. :lol:
Once you understand what a gradient is and what it tells us about a function, understanding gradient descent becomes much easier, I think.

So my advice would be to spend a little time flipping through the relevant sections of a college-level calculus textbook on gradients, or I'm sure you can find some good resources online. You don't even need to understand all the underlying theory, just the basics of what a gradient is and how to compute one.

The essential idea is that the gradient of a function is a vector pointing in the direction of steepest ascent from a given point, so negating this vector gives us the direction of steepest descent. We can leverage this to find a local/global minimum of a function by starting at a point in the parameter space, computing the gradient, negating it, and "taking a step" in the direction of this negated gradient vector. In code this looks like iterating through the negated gradient vector and adding each partial derivative to the corresponding parameter to update it. But since we don't want to overshoot the minimum, we take only a fraction of each partial derivative; this fraction is known as the learning rate.

So in pseudo-code something like:

Code: Select all

parameters := list of starting parameters

for iteration := 0; iteration < NUMBER_OF_ITERATIONS; iteration++ {
    negated_gradient := compute_negated_gradient(parameters)

    for i := 0; i < length(negated_gradient); i++ {
        parameters[i] += learning_rate * negated_gradient[i]
    }
}
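To make the loop above concrete, here is a runnable version in Python. The function being minimized, the learning rate, and the iteration count are all arbitrary illustrations (not from any engine); a real tuner would replace `compute_negated_gradient` with the gradient of its evaluation error.

```python
# Gradient descent on a toy function f(x, y) = (x - 3)^2 + (y + 1)^2,
# whose minimum is at (3, -1). Purely illustrative values below.
NUMBER_OF_ITERATIONS = 500
learning_rate = 0.05

def compute_negated_gradient(params):
    """Analytic gradient of f, negated: -df/dx = -2(x - 3), -df/dy = -2(y + 1)."""
    x, y = params
    return [-2.0 * (x - 3.0), -2.0 * (y + 1.0)]

parameters = [0.0, 0.0]  # arbitrary starting point

for iteration in range(NUMBER_OF_ITERATIONS):
    negated_gradient = compute_negated_gradient(parameters)
    for i in range(len(negated_gradient)):
        parameters[i] += learning_rate * negated_gradient[i]

# parameters has now converged very close to the minimum (3, -1)
```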
The hard part is taking your evaluation, developing a multivariable function to model it, and computing the gradient of that function. I have some formulas sketched out in LaTeX that I'd be happy to share if that would be helpful, as I spent some time last year computing the relevant derivations. And I'm pretty sure Andrew's paper conveys much of the same information, so check that out too.
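As one hedged sketch of that step: assuming a linear evaluation (eval = sum of weight × feature) and the common logistic mapping sigma(E) = 1 / (1 + exp(-K * E)), the gradient of the Texel-style mean squared error can be written out by hand. The constant `K`, the feature layout, and the function names below are all illustrative, not a specific engine's implementation.

```python
import math

K = 0.01  # scaling constant; normally fitted to the dataset first

def sigmoid(e):
    """Map a centipawn-like eval to an expected score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-K * e))

def negated_gradient(weights, dataset):
    """dataset: list of (features, result) pairs, result in {0.0, 0.5, 1.0}.
    Returns the negated gradient of the mean squared error
    L = (1/N) * sum_i (result_i - sigmoid(eval_i))^2."""
    grad = [0.0] * len(weights)
    for features, result in dataset:
        e = sum(w * f for w, f in zip(weights, features))
        s = sigmoid(e)
        # d/dw_j of (result - s)^2 = -2 * (result - s) * s * (1 - s) * K * f_j
        common = -2.0 * (result - s) * s * (1.0 - s) * K
        for j, f in enumerate(features):
            grad[j] -= common * f  # minus sign negates the gradient
    return [g / len(dataset) for g in grad]
```

For example, on a single won position the negated gradient pushes the weight of an active feature upward, which is exactly the direction that increases the predicted score.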
emadsen
Posts: 440
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Issue with Texel Tuning

Post by emadsen »

JVMerlino wrote: Tue Feb 14, 2023 6:58 pm Sigh - it took me years to wrap my brain around Texel tuning before I could feel good about trying to implement it. Now I'm back to "square zero" with GD - I don't understand it at all! :oops: :D
You could use a particle swarm algorithm and not compute any gradient.
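For illustration, a minimal particle-swarm sketch might look like the following. The toy objective and every hyperparameter here are illustrative choices; a real tuner would plug in the Texel mean squared error over its dataset as the objective. Note that no gradient appears anywhere.

```python
import random

def objective(p):
    """Toy objective with minimum at (3, -1); stands in for the tuning error."""
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

def pso(dim=2, particles=20, iterations=200,
        w=0.7, c1=1.5, c2=1.5, lo=-10.0, hi=10.0, seed=42):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best = [p[:] for p in pos]                # each particle's best-seen position
    best_f = [objective(p) for p in pos]
    g = best[best_f.index(min(best_f))][:]    # swarm-wide best position
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends inertia, pull toward own best, pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best[i][d] - pos[i][d])
                             + c2 * r2 * (g[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = objective(pos[i])
            if f < best_f[i]:
                best_f[i], best[i] = f, pos[i][:]
                if f < objective(g):
                    g = pos[i][:]
    return g

result = pso()  # converges near (3, -1) without ever computing a derivative
```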
Erik Madsen | My C# chess engine: https://www.madchess.net
ciorap0
Posts: 23
Joined: Sun Feb 12, 2023 11:10 am
Full name: Vlad Ciocoiu

Re: Issue with Texel Tuning

Post by ciorap0 »

Whiskers wrote: Tue Feb 14, 2023 4:27 pm A good way to test your tuning algorithm is to let it change, say, only the value of a pawn, and see if the results are reasonable.
But what does reasonable mean? I expect the Elo difference to be insignificant with only one changed parameter.
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Issue with Texel Tuning

Post by algerbrex »

ciorap0 wrote: Sat Feb 18, 2023 9:26 am
Whiskers wrote: Tue Feb 14, 2023 4:27 pm A good way to test your tuning algorithm is to let it change, say, only the value of a pawn, and see if the results are reasonable.
But what does reasonable mean? I expect the Elo difference to be insignificant with only one changed parameter.
True, tuning one parameter probably won't make a huge difference in Elo. But you mentioned in your original post that you got unusually small/large values for some evaluation weights. So after tuning, check that the weights finish with reasonable values. Of course, the range of appropriate values varies with the specifics of the engine, but for typical implementations I think values between 60 and 120 are reasonable here. Any higher or lower, and you might be dealing with issues in your dataset and/or tuner.

As said earlier, first make sure the tuning algorithm works by using an established dataset, like the Zurichess dataset. Then you can work on creating an original dataset, which I definitely think is an admirable goal, and one I'm working on myself.