I will add some comments here that will hopefully help clear up some of the confusion.

AlvaroBegue wrote:
    The trick is doing the gradient descent. While it would be possible to do it on the search function itself, it would be hard to make that efficient. So instead, you need to recover what position gave the eval that was propagated to the root, and then compute the gradient of the evaluation function at that node.

Desperado wrote:
    Maybe i should think about it twice, but the pv eval should be passed to the root as search result. So at first glance i don't know in what way the "eval at the end of the pv" is different to the search result score.

The basic texel tuning method treats the evaluation function and the q-search function as black boxes. You put in a position and a set of parameter values, and you get out an evaluation score. How the score is computed is completely irrelevant for the tuning algorithm.

Without any assumptions about how the evaluation function works internally, you are restricted to quite primitive algorithms for finding a minimum in parameter space. The pseudo code on the Chess Programming Wiki (CPW), for example, varies one parameter at a time, moving in whichever direction decreases the error. It stops when no change to any single parameter gives a smaller error.
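
To make the CPW-style local search concrete, here is a minimal sketch of that one-parameter-at-a-time loop. The function name `localOptimize` and the `computeE` callback (which is assumed to return the tuning error for a parameter vector) are illustrative, not taken from any particular engine:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// One-parameter-at-a-time local search, as in the CPW pseudo code:
// repeatedly try +1 and -1 steps on each parameter and keep any step
// that lowers the error; stop when a full pass yields no improvement.
std::vector<int> localOptimize(
    std::function<double(const std::vector<int>&)> computeE,
    std::vector<int> params)
{
    double bestE = computeE(params);
    bool improved = true;
    while (improved) {
        improved = false;
        for (std::size_t i = 0; i < params.size(); ++i) {
            params[i] += 1;                 // try stepping up
            double newE = computeE(params);
            if (newE < bestE) {
                bestE = newE;
                improved = true;
            } else {
                params[i] -= 2;             // try stepping down instead
                newE = computeE(params);
                if (newE < bestE) {
                    bestE = newE;
                    improved = true;
                } else {
                    params[i] += 1;         // neither helped; restore
                }
            }
        }
    }
    return params;
}
```

Note that this only needs black-box access to the error function, which is exactly why it works without any knowledge of the evaluation internals, and also why it is slow: every probe costs a full pass over the training positions.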

If we assume that the function to be minimized is differentiable almost everywhere, but still treat it as a black box, we could use various gradient-based optimization methods to speed up the search for a local minimum. Since the function is a black box, the required partial derivatives cannot be computed directly, so they have to be approximated using finite differences instead. Typically something like

Code: Select all

`dE/dp_i ~= E(p_i + 1) - E(p_i)`

or the more accurate central difference

Code: Select all

`dE/dp_i ~= (E(p_i + 1) - E(p_i - 1)) / 2`
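
A sketch of the central-difference version, using a step of one since Texel-style parameters are integers. As above, `computeE` is an assumed black-box error function and `numericGradient` is an illustrative name:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Approximate the gradient dE/dp_i by central differences with step 1:
// grad[i] = (E(p_i + 1) - E(p_i - 1)) / 2.
std::vector<double> numericGradient(
    std::function<double(const std::vector<int>&)> computeE,
    std::vector<int> params)
{
    std::vector<double> grad(params.size());
    for (std::size_t i = 0; i < params.size(); ++i) {
        params[i] += 1;
        double ePlus = computeE(params);
        params[i] -= 2;
        double eMinus = computeE(params);
        params[i] += 1;                     // restore original value
        grad[i] = (ePlus - eMinus) / 2.0;
    }
    return grad;
}
```

Note the cost: two full evaluations of E per parameter, which is what makes finite differences expensive for evaluation functions with hundreds of parameters.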

If we further assume that the evaluation function has a certain structure, so that the evaluation score is computed from the position and the parameters using only a well-defined set of operations, and also assume that it is written in a language that supports generic types and operator overloading, then it is possible to implement a framework that automatically computes the partial derivatives at the same time as the evaluation score. See for example this article for an explanation of how this can be done.
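
The core idea of such a framework (forward-mode automatic differentiation) can be sketched in a few lines. A "dual number" carries a value together with its derivative, and overloaded arithmetic operators propagate both through the computation. The names here (`Dual`, `evalSketch`) are illustrative and not taken from any existing framework:

```cpp
// A dual number: function value plus its derivative with respect to
// one chosen parameter.
struct Dual {
    double val;   // the score contribution itself
    double der;   // d(score)/d(parameter)
};

// Overloaded operators apply the usual differentiation rules.
Dual operator+(Dual a, Dual b) { return { a.val + b.val, a.der + b.der }; }
Dual operator-(Dual a, Dual b) { return { a.val - b.val, a.der - b.der }; }
Dual operator*(Dual a, Dual b) {
    // product rule: (f*g)' = f'*g + f*g'
    return { a.val * b.val, a.der * b.val + a.val * b.der };
}

// A toy "evaluation term": score = materialWeight * material + mobility.
// Seeding materialWeight with derivative 1 makes the derivative of the
// score with respect to that weight fall out automatically.
Dual evalSketch(Dual materialWeight, double material, double mobility) {
    return materialWeight * Dual{ material, 0.0 } + Dual{ mobility, 0.0 };
}
```

Evaluating `evalSketch(Dual{3.0, 1.0}, 5.0, 2.0)` yields the score 17 and, in the same pass, the derivative 5 (the material count), with no extra evaluation calls, which is where the speed advantage over finite differences comes from.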

Álvaro has implemented such a framework, called ruy_tune. It is written in C++, and for it to work the evaluation function must be converted into a template, with the score type as a template parameter. With such a modified evaluation function, the gradient can be computed much faster than with finite differences. (At least I think it will be much faster; I have not actually tested this.)

However, for this to work you would have to find the position at the end of the PV and use that position to compute the evaluation score and the corresponding gradient. If you instead wanted to apply the automatic gradient computation to the q-search function itself, the q-search would also have to be converted into a template, and the framework would have to be extended to overload the comparison operators as well, so that the mini-maxing still works.
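
To illustrate why the comparison operators matter: mini-maxing compares scores, so a derivative-carrying score type must compare by its value part while the derivative simply rides along. A minimal sketch, with `Dual` and `maxScore` as illustrative names (the latter standing in for the "pick the best child" step inside a templated q-search):

```cpp
#include <vector>

// Value-plus-derivative score type, as produced by the automatic
// differentiation framework.
struct Dual {
    double val;
    double der;
};

// Compare by the value part only; the derivative does not influence
// which move is considered best.
bool operator<(const Dual& a, const Dual& b) { return a.val < b.val; }

// A templated "best child score" step, as would occur inside a templated
// q-search. It compiles both for plain numeric scores and for Dual.
template <typename Score>
Score maxScore(const std::vector<Score>& children) {
    Score best = children[0];
    for (const Score& s : children)
        if (best < s)
            best = s;
    return best;
}
```

The payoff is that the score returned to the root automatically carries the gradient of the evaluation at the leaf that produced it, which is exactly the "position at the end of the PV" information discussed above.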