Texel tuning method question

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

petero2
Posts: 594
Joined: Mon Apr 19, 2010 5:07 pm
Location: Sweden
Contact:

Re: Texel tuning method question

AlvaroBegue wrote:
Desperado wrote:Maybe I should think about it twice, but the pv eval should be passed to the root as the search result. So at first glance I don't know in what way the "eval at the end of the pv" is different from the search result score.
The trick is doing the gradient descent. While it would be possible to do it on the search function itself, it would be hard to make that efficient. So instead, you need to recover what position gave the eval that was propagated to the root, and then compute the gradient of the evaluation function at that node.
I will add some comments here that will hopefully help clear up some of the confusion.

The basic texel tuning method treats the evaluation function and the q-search function as black boxes. You put in a position and a set of parameter values, and you get out an evaluation score. How the score is computed is completely irrelevant for the tuning algorithm.

Without any assumptions about how the evaluation function works internally, you are restricted to quite primitive algorithms for finding a minimum in parameter space. The pseudo code on CPW, for example, varies one parameter at a time, following the downward direction, and stops when no smaller value can be found in any direction.

If we assume that the function to minimize is differentiable almost everywhere but still treat the function as a black box, we could use various gradient based optimization methods to speed up the search for a local minimum. Since the function is a black box it would not be possible to directly compute the required partial derivatives, so they would have to be approximated using finite differences instead. Typically something like

Code: Select all

``dE/dPi ~= E(pi+1) - E(pi)``
or

Code: Select all

``dE/dPi ~= (E(pi+1) - E(pi-1))/2``
If there are M parameters the first formula would require M+1 evaluations to compute the value and all partial derivatives. The second formula is more accurate but would require 2*M+1 evaluations.
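As a toy illustration of those two finite-difference schemes (the error function E here is a stand-in quadratic bowl, not a real tuning error):

```python
def E(params):
    # Stand-in for the real tuning error; a simple quadratic bowl.
    return sum((p - 3.0) ** 2 for p in params)

def forward_diff_gradient(params, h=1):
    base = E(params)                      # 1 evaluation
    grad = []
    for i in range(len(params)):          # + M evaluations
        bumped = list(params)
        bumped[i] += h
        grad.append((E(bumped) - base) / h)
    return grad

def central_diff_gradient(params, h=1):
    grad = []
    for i in range(len(params)):          # 2*M evaluations
        up, down = list(params), list(params)
        up[i] += h
        down[i] -= h
        grad.append((E(up) - E(down)) / (2 * h))
    return grad
```

With M parameters the forward scheme costs M+1 evaluations and the central scheme 2*M+1 (counting the one needed for the value itself), matching the counts above.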

If we further assume that the evaluation function has a certain structure, so that the evaluation score is computed from the position and parameters using only a well-defined set of operations, and assume that the evaluation function is written in a language that supports generic types and operator overloading, it is possible to implement a framework that automatically computes the partial derivatives at the same time as the evaluation score is computed. See for example this article for an explanation of how this can be done.
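A minimal sketch of the operator-overloading idea, in Python rather than C++ and not the framework from the article: a "dual number" type carries a value together with one partial derivative, so evaluating an expression also evaluates its derivative. Real frameworks generalize this to all parameters at once.

```python
class Dual:
    """A value paired with its derivative w.r.t. one chosen parameter."""
    def __init__(self, value, deriv=0.0):
        self.value = value   # the score contribution
        self.deriv = deriv   # d(value)/d(parameter)

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (uv)' = u'v + uv'
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

# Toy "eval" of 3 pawns and a knight, differentiated w.r.t. the pawn
# value by seeding its derivative with 1.
pawn_value = Dual(100.0, 1.0)
knight_value = Dual(300.0, 0.0)
score = 3 * pawn_value + knight_value
```

An evaluation function written generically over the score type can be instantiated with `Dual` instead of `int` and the derivative comes out for free.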

Álvaro has implemented such a framework, which is called ruy_tune. It is written in C++ and a requirement for it to work is that the evaluation function is converted to a template, where the score type is a template parameter. With such a modified evaluation function, the gradient can be computed much faster than if it were computed using finite differences. (At least I think it will be much faster, I have not actually tested this.)

However, for this to work you would have to find the position at the end of the PV and use that position to compute the evaluation score and the corresponding gradient. If you wanted to apply the automatic gradient computation technique to the q-search function, the q-search function would also have to be converted to a template, and the framework would have to be extended to overload also comparison operators in order to make the mini-maxing work.

AlvaroBegue
Posts: 922
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Texel tuning method question

petero2 wrote:[...]
Álvaro has implemented such a framework, which is called ruy_tune. It is written in C++ and a requirement for it to work is that the evaluation function is converted to a template, where the score type is a template parameter. With such a modified evaluation function, the gradient can be computed much faster than if it were computed using finite differences. (At least I think it will be much faster, I have not actually tested this.)

However, for this to work you would have to find the position at the end of the PV and use that position to compute the evaluation score and the corresponding gradient. If you wanted to apply the automatic gradient computation technique to the q-search function, the q-search function would also have to be converted to a template, and the framework would have to be extended to overload also comparison operators in order to make the mini-maxing work.
Correct. Computing the gradient on the QS directly is a colossal waste of time, at least with the method I implemented. It is much faster to run the QS saving the PV and then compute the gradient using the end-of-PV position.

What I did with RuyTune is turn the original positions into quiet positions using this end-of-PV method using my existing evaluation function, and then not worry too much about the fact that tweaking the evaluation function could result in a different position being picked. I could rerun this periodically (as someone else has suggested in this thread), but I think it would make very little difference in practice.

jdart
Posts: 3864
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Texel tuning method question

Because my eval function is relatively simple I actually do a closed-form gradient computation. For me this entails some duplication of logic between the actual eval function and the tuning code. I also have some code that does the finite difference calculation and compares with the gradient computation and verifies that they are giving the same result within some small error margin.

Computing the gradient is fairly straightforward but it is important to take account of material value scaling, if that is used. And I have a few bits, notably king safety computation, that are nonlinear and for which gradient computation is non-trivial, but still doable.

This is quite a bit more dev work than finite differences though.
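A hedged illustration of the kind of cross-check described above, with an invented toy eval term that applies a material-scaling factor: the closed-form gradient must include the scaling, and it can be verified against a finite difference.

```python
def toy_eval(passed_pawn_bonus, material_fraction):
    # Invented eval term: a bonus scaled by remaining material
    return passed_pawn_bonus * material_fraction

def closed_form_grad(passed_pawn_bonus, material_fraction):
    # d(eval)/d(bonus); the material scaling must appear here too
    return material_fraction

def finite_diff_grad(bonus, fraction, h=1e-6):
    # Central difference used only to verify the closed form
    return (toy_eval(bonus + h, fraction) - toy_eval(bonus - h, fraction)) / (2.0 * h)
```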

--Jon

Cheney
Posts: 85
Joined: Thu Sep 27, 2012 12:24 am

Re: Texel tuning method question

Hi,

I have a question at this point in regards to speed and method for testing various parameters.

I am just testing the mechanics and speed of the tuning method on a single thread and get about 1M positions tested in just under 20 seconds. I am OK with this for now but as I think about it, and I see this has been discussed in other posts, if I expose a few dozen parameters to tune, this could take weeks or longer, right?

What I envision is this, for example:
- I have 10 parameters that I want to tune.
- I want to use a delta for each parameter to test, let's just say delta +/- 10. For each parameter, this is 21 different values from value-10 to value+10.

That is a lot of combinations for ~20 seconds per test.

I do not know if this is called a "local search" or not, but the more parameters I want to expose the more time that is needed.

Am I seeing this wrong? Maybe this is the simplest brute force and there are better ways. Can someone help clear this idea up and give me a nudge in the right direction?

AlvaroBegue
Posts: 922
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Texel tuning method question

Cheney wrote:Hi,

I have a question at this point in regards to speed and method for testing various parameters.

I am just testing the mechanics and speed of the tuning method on a single thread and get about 1M positions tested in just under 20 seconds. I am OK with this for now but as I think about it, and I see this has been discussed in other posts, if I expose a few dozen parameters to tune, this could take weeks or longer, right?

What I envision is this, for example:
- I have 10 parameters that I want to tune.
- I want to use a delta for each parameter to test, let's just say delta +/- 10. For each parameter, this is 21 different values from value-10 to value+10.

That is a lot of combinations for ~20 seconds per test.

I do not know if this is called a "local search" or not, but the more parameters I want to expose the more time that is needed.

Am I seeing this wrong? Maybe this is the simplest brute force and there are better ways. Can someone help clear this idea up and give me a nudge in the right direction?
Let me see if I understand what you are saying. If we only had one parameter to tune, you could imagine computing the derivative of your loss function (the thing you are minimizing) with respect to your parameter by setting the parameter 10 points higher, then 10 points lower, and approximating the derivative like this:

(d Loss) / (d Param) ~= (Loss(Param+10) - Loss(Param-10)) / 20

If you want to do this with P parameters, you would need 2P evaluations of the loss function (i.e., 2P passes through all the data), which gets expensive quickly.

Enter automatic differentiation. You can actually compute all those derivatives in 2 or 3 times the cost of computing the loss function once, regardless of P. The method is called "reverse-mode automatic differentiation". Neural nets people call it "backpropagation". And people that want to point out how obvious it all is in retrospect call it "the chain rule".
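A minimal reverse-mode sketch (this is not RuyTune's API, just the idea): for a toy linear eval fed through a sigmoid into a squared error, one forward pass plus one backward pass yields the derivative with respect to every parameter at once.

```python
import math

def loss_and_gradient(params, positions):
    """params: weight list; positions: list of (features, result) pairs.
    Loss is the mean squared error of sigmoid(score) vs the game result."""
    n = len(positions)
    grad = [0.0] * len(params)
    total = 0.0
    for features, result in positions:
        # Forward pass: toy linear eval, then sigmoid
        score = sum(w * f for w, f in zip(params, features))
        s = 1.0 / (1.0 + math.exp(-score))
        err = s - result
        total += err * err / n
        # Backward pass: chain rule through the MSE and the sigmoid;
        # this fills in ALL partial derivatives in one sweep
        dscore = 2.0 * err * s * (1.0 - s) / n
        for i, f in enumerate(features):
            grad[i] += dscore * f
    return total, grad
```

The cost of the backward sweep is a small constant factor on top of the forward pass, regardless of how many parameters there are.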

Last year I made RuyTune available so people could do this kind of thing on their engines. Unfortunately, I don't think I managed to make it user-friendly enough. But if you are interested, I can try to help you to make use of it, or at least the automatic-differentiation piece of it. See here: https://bitbucket.org/alonamaloh/ruy_tune

Cheney
Posts: 85
Joined: Thu Sep 27, 2012 12:24 am

Re: Texel tuning method question

I'm sorry but I have not worked with derivatives (knowingly) since 1990, and I only recall that they are used to figure out the rate of change of a function, or something like that. That being said, I need to let what you wrote sink in and dust off my old calculus books (and read some more online).

However, since it is apparent I do not fully understand what you wrote, let me share this sample code with you on what I think I need to do next in case my original post is misleading:

Code: Select all

``````bestK = 1.13 // I calculated the best K
bestE = 0.145667  // This is the minimum E when K was determined
delta = 10

oP1 = eval.pawn1 // A parameter we want to adjust and test
oR1 = eval.rook1 // Another parameter

for p1 = oP1-delta; p1 <= oP1+delta; p1++
    for p2 = oR1-delta; p2 <= oR1+delta; p2++
        // Set the penalty/bonus value of the eval parameters
        eval.pawn1 = p1
        eval.rook1 = p2

        // Run the process of loading FENs, doing qSearch, and calculating sigmoid and E
        E = DoQSearchAndGetE()

        // Preserve the parameters if E was lowered.
        if E < bestE
            // these parameters lowered E, save them
            bestP1 = p1
            bestP2 = p2
            bestE = E
        end if
    next p2
next p1
``````
This is how I see testing changed parameters to look for the best combination that lowers E. In the above sample, the two loops account for 441 tests (21 x 21). The more parameters I want to tune, the more time is needed (unless I somehow multi-thread this).

Is what I am thinking correct as a process? Maybe I do not need a loop? I just looked at the pseudo code on CPW for the "local search" and, if I am reading it right, it only adjusts a parameter by 1?

As for RuyTune and your offer to help, I did review the CPW writeup, went through some other posts about it, and glanced at the code. At this time, I'd like to continue to work on this method since I have put a fair amount of time into it. Once done, I'd like to take a deeper look at RuyTune.

AlvaroBegue
Posts: 922
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: Texel tuning method question

The code you posted is a brute-force search over all parameter settings. This doesn't scale well. If you have 2 parameters to adjust, you are measuring the function at 441 different points (21^2). If you have 10 parameters to adjust, you'll never finish.

An alternative is to consider your function E as a function of several variables. The case of 2 variables is easy to visualize. Imagine you are standing on some terrain, trying to get to the lowest point possible. Just by looking a little around you, you can figure out the direction of steepest descent from where you are and take a step in that direction. Repeat this many times, until you reach the bottom of some pit, where there is no direction in which you can continue to descend.

This is called "gradient descent", since the gradient (which is what reverse-mode automatic differentiation computes) tells you what the direction of steepest descent is. There are many many variations on this idea. Maybe with this initial explanation you can make some sense of what RuyTune is about.
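A minimal sketch of that loop, using an invented two-parameter bowl in place of the real error E:

```python
def loss(x, y):
    # Invented bowl standing in for the Texel error E; minimum at (2, -1)
    return (x - 2.0) ** 2 + (y + 1.0) ** 2

def gradient(x, y):
    # Direction of steepest ascent; we step against it
    return 2.0 * (x - 2.0), 2.0 * (y + 1.0)

def gradient_descent(x, y, step=0.1, iterations=200):
    for _ in range(iterations):
        gx, gy = gradient(x, y)
        x -= step * gx  # small step in the steepest-descent direction
        y -= step * gy
    return x, y
```

In real tuning the gradient would come from automatic differentiation rather than a closed form, but the descent loop looks the same.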

Anyway, feel free to contact me whenever you want to give RuyTune a try.

Ferdy
Posts: 4160
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Texel tuning method question

Sample basic Texel tuning code in python.

Tune.py

Code: Select all

``````"""
Sample Texel tuning in Python

Tested on python 3.6

"""

import sys
import random
from copy import deepcopy
import time

MAXBESTE = 1000.0

def DoQSearchAndGetE(K):
    """ Just return random for now """
    time.sleep(0.5)  # Sleep for 0.5s for simulation purposes
    return random.randint(14400, 14600)/100000.0

def SendParamToEngine(param):
    """ Set the param to the engine, pass for now """
    pass

def Tune(paramToOptimize, initBestE, delta, K, iLog, cycleLimit):
    """ paramToOptimize is a list of lists
        initBestE is a float
        delta is a list
    """
    goodCnt = 1
    goodCntOld = 0
    lastBestE = initBestE
    lastBestParam = []

    for g in range(cycleLimit):  # g = 0, 1, 2, ... cycleLimit-1
        if iLog:
            print('Tune >> param cycle: %d' % (g+1))

        # Exit tuning if goodCnt is not increased
        if goodCnt <= goodCntOld:
            if iLog:
                print('Tune >> bestE has not been improved, exiting the tuning now ...')
            break
        goodCntOld = goodCnt

        if len(lastBestParam):
            paramToOptimize = deepcopy(lastBestParam)

        for p in paramToOptimize:
            pName = p[0]
            pValue = p[1]

            for d in delta:
                dValue = pValue + d
                if iLog:
                    print('Tune >> param to optimize: %s' % (pName))
                    print('Tune >> delta: %+d' % (d))

                # Create a new param set instead of paramToOptimize.
                # This is the set that will be sent to the engine.
                paramToBeTested = []

                for a in paramToOptimize:
                    if a[0] == pName:
                        a[1] = dValue
                    paramToBeTested.append([a[0], a[1]])

                if iLog:
                    print('Tune >> paramSet to try: %s' % (paramToBeTested))
                    print('Tune >> Send this set to engine')

                # Send param values to engine
                SendParamToEngine(paramToBeTested)

                if iLog:
                    print('Tune >> lastBestE: %0.5f' % (lastBestE))
                    print('Tune >> Calculate E')

                E = DoQSearchAndGetE(K)

                if iLog:
                    print('Tune >> CalculatedE: %0.5f' % (E))

                if E < lastBestE:
                    goodCnt += 1
                    lastBestE = E

                    if iLog:
                        print('Tune >> NewBestE: %0.5f' % (lastBestE))
                        print('Tune >> CalculatedE is good -------- !!\n')

                    lastBestParam = deepcopy(paramToBeTested)

                    # Get out of 'for delta' and try the next p
                    break
                else:
                    if iLog:
                        print('Tune >> CalculatedE is not good ----- ?\n')

        # Log if we have reached the cycle limit
        if g == cycleLimit-1:
            if iLog:
                print('Tune >> param cycle limit has been reached, exiting tuning now ...')

    return lastBestE, lastBestParam

def main(argv):

    bestK = 1.13
    bestESTart = MAXBESTE
    enableLog = True
    paramCycleLimit = 1000

    paramToOptimize = [
        ['pawn', 100],
        ['knight', 300],
        ['bishop', 300]
    ]
    delta = [+1, -1]

    # Show init values
    print('origBestE       : %0.5f' % (bestESTart))
    print('origParam       : %s' % (paramToOptimize))
    print('bestK           : %0.3f' % (bestK))
    print('delta           : %s' % (delta))
    print('paramCycleLimit : %d' % (paramCycleLimit))
    print('EnableLogging   : %s\n' % ('On' if enableLog else 'Off'))

    t1 = time.perf_counter()

    # Run the tuner
    optiE, optiParam = Tune(paramToOptimize, bestESTart, delta, bestK, enableLog, paramCycleLimit)

    t2 = time.perf_counter()

    print('\nbestE   : %0.5f' % (optiE))
    print('bestParam : %s' % (optiParam))
    print('Elapsed   : %ds' % (t2-t1))

if __name__ == "__main__":
    main(sys.argv[1:])``````
Calculation of E is done by random number for simulation purposes.
Sample output:

Code: Select all

``````origBestE       : 1000.00000
origParam       : [['pawn', 100], ['knight', 300], ['bishop', 300]]
bestK           : 1.130
delta           : [1, -1]
paramCycleLimit : 1000
EnableLogging   : On

Tune >> param cycle: 1
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 101], ['knight', 300], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 1000.00000
Tune >> Calculate E
Tune >> CalculatedE: 0.14514
Tune >> NewBestE: 0.14514
Tune >> CalculatedE is good -------- !!

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 101], ['knight', 301], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14514
Tune >> Calculate E
Tune >> CalculatedE: 0.14517
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 101], ['knight', 299], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14514
Tune >> Calculate E
Tune >> CalculatedE: 0.14462
Tune >> NewBestE: 0.14462
Tune >> CalculatedE is good -------- !!

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 101], ['knight', 299], ['bishop', 301]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14462
Tune >> Calculate E
Tune >> CalculatedE: 0.14560
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 101], ['knight', 299], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14462
Tune >> Calculate E
Tune >> CalculatedE: 0.14596
Tune >> CalculatedE is not good ----- ?

Tune >> param cycle: 2
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 102], ['knight', 299], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14462
Tune >> Calculate E
Tune >> CalculatedE: 0.14543
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: pawn
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 100], ['knight', 299], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14462
Tune >> Calculate E
Tune >> CalculatedE: 0.14435
Tune >> NewBestE: 0.14435
Tune >> CalculatedE is good -------- !!

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 300], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14496
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 100], ['knight', 298], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14599
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 298], ['bishop', 301]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14575
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 100], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14457
Tune >> CalculatedE is not good ----- ?

Tune >> param cycle: 3
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 101], ['knight', 299], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14533
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: pawn
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 299], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14448
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 99], ['knight', 300], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14555
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 298], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14553
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 99], ['knight', 298], ['bishop', 301]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14553
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14435
Tune >> Calculate E
Tune >> CalculatedE: 0.14419
Tune >> NewBestE: 0.14419
Tune >> CalculatedE is good -------- !!

Tune >> param cycle: 4
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14419
Tune >> Calculate E
Tune >> CalculatedE: 0.14416
Tune >> NewBestE: 0.14416
Tune >> CalculatedE is good -------- !!

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 299], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14444
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 100], ['knight', 297], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14539
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 297], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14510
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 100], ['knight', 297], ['bishop', 298]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14425
Tune >> CalculatedE is not good ----- ?

Tune >> param cycle: 5
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 101], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14557
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: pawn
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14561
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 99], ['knight', 299], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14535
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 297], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14416
Tune >> Calculate E
Tune >> CalculatedE: 0.14404
Tune >> NewBestE: 0.14404
Tune >> CalculatedE is good -------- !!

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 99], ['knight', 297], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14599
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 99], ['knight', 297], ['bishop', 298]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14518
Tune >> CalculatedE is not good ----- ?

Tune >> param cycle: 6
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 297], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14426
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: pawn
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 98], ['knight', 297], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14437
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 98], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14475
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: knight
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 98], ['knight', 296], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14519
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 98], ['knight', 296], ['bishop', 300]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14522
Tune >> CalculatedE is not good ----- ?

Tune >> param to optimize: bishop
Tune >> delta: -1
Tune >> paramSet to try: [['pawn', 98], ['knight', 296], ['bishop', 298]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14404
Tune >> Calculate E
Tune >> CalculatedE: 0.14477
Tune >> CalculatedE is not good ----- ?

Tune >> param cycle: 7
Tune >> bestE has not been improved, exiting the tuning now ...

bestE   : 0.14404
bestParam : [['pawn', 99], ['knight', 297], ['bishop', 299]]
Elapsed   : 18s``````
So you get your best param and best error
bestE : 0.14404
bestParam : [['pawn', 99], ['knight', 297], ['bishop', 299]]

In certain situations you can stop the tuning and record the best result so far.
Remember the best error and the best param.
When you resume the tuning, you can use that last best tuning info.
So your next input run, for example, would be:

Code: Select all

``````bestESTart = 0.14404
paramToOptimize = [
    ['pawn', 99],
    ['knight', 297],
    ['bishop', 299]
]``````
At a point where there is a successful reduction of the error, like:

Code: Select all

``````Tune >> param cycle: 4
Tune >> param to optimize: pawn
Tune >> delta: +1
Tune >> paramSet to try: [['pawn', 100], ['knight', 298], ['bishop', 299]]
Tune >> Send this set to engine
Tune >> lastBestE: 0.14419
Tune >> Calculate E
Tune >> CalculatedE: 0.14416
Tune >> NewBestE: 0.14416
Tune >> CalculatedE is good -------- !!``````
Remember the param tuned:
[['pawn', 100], ['knight', 298], ['bishop', 299]]
You can use that set to test your engine in actual matches; this is called sampling. It can happen that those values actually perform well in real game tests.

It is good to experiment with a varied delta array, like:

Code: Select all

``delta = [+1, -1, +2, -2, +3, -3]``
Expand the param list if you like:

Code: Select all

``````paramToOptimize = [
    ['pawn', 100],
    ['knight', 300],
    ['bishop', 300],
    ['rook', 500],
    ['queen', 1000]
]``````

Ferdy
Posts: 4160
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Code bug fix

Tune.py

Code: Select all

``````"""
Sample Texel tuning in Python

Tested on python 3.6

"""

import sys
import random
from copy import deepcopy
import time

MAXBESTE = 1000.0

def DoQSearchAndGetE(K):
    """ Just return random for now """
    time.sleep(0.5)  # Sleep for 0.5s for simulation purposes
    return random.randint(14400, 14600)/100000.0

def SendParamToEngine(param):
    """ Set the param in the engine, pass for now """
    pass

def Tune(paramToOptimize, initBestE, delta, K, iLog, cycleLimit):
    """ paramToOptimize is a list of lists
        initBestE is a float
        delta is a list
    """
    goodCnt = 1
    goodCntOld = 0
    lastBestE = initBestE
    lastBestParam = []

    for g in range(cycleLimit):  # g = 0, 1, 2, ... cycleLimit-1
        if iLog:
            print('Tune >> param cycle: %d' % (g+1))

        # Exit tuning if goodCnt has not increased
        if goodCnt <= goodCntOld:
            if iLog:
                print('Tune >> bestE has not been improved, exiting the tuning now ...')
            break
        goodCntOld = goodCnt

        # Update paramToOptimize with lastBestParam
        if len(lastBestParam):
            paramToOptimize = deepcopy(lastBestParam)

        for p in paramToOptimize:
            pName = p[0]
            pValue = p[1]

            for d in delta:
                dValue = pValue + d
                if iLog:
                    print('Tune >> param to optimize: %s' % (pName))
                    print('Tune >> delta: %+d' % (d))

                # Update paramToOptimize with lastBestParam
                if len(lastBestParam):
                    paramToOptimize = deepcopy(lastBestParam)

                # Create a new param set instead of paramToOptimize.
                # This is the set that will be sent to the engine.
                paramToBeTested = []

                for a in paramToOptimize:
                    if a[0] == pName:
                        a[1] = dValue
                    paramToBeTested.append([a[0], a[1]])

                if iLog:
                    print('Tune >> paramSet to try: %s' % (paramToBeTested))
                    print('Tune >> Send this set to engine')

                # Send param values to the engine
                SendParamToEngine(paramToBeTested)

                if iLog:
                    print('Tune >> lastBestE: %0.5f' % (lastBestE))
                    print('Tune >> Calculate E')

                E = DoQSearchAndGetE(K)

                if iLog:
                    print('Tune >> CalculatedE: %0.5f' % (E))

                if E < lastBestE:
                    goodCnt += 1
                    lastBestE = E

                    if iLog:
                        print('Tune >> NewBestE: %0.5f' % (lastBestE))
                        print('Tune >> CalculatedE is good -------- !!\n')

                    lastBestParam = deepcopy(paramToBeTested)

                    # Get out of the delta loop and try the next p
                    break
                else:
                    if iLog:
                        print('Tune >> CalculatedE is not good ----- ?\n')

        # Log if we have reached the cycle limit
        if g == cycleLimit-1:
            if iLog:
                print('Tune >> param cycle limit has been reached, exiting tuning now ...')

    return lastBestE, lastBestParam

def main(argv):

    bestK = 1.13
    bestEStart = MAXBESTE
    enableLog = True
    paramCycleLimit = 1000

    paramToOptimize = [
        ['pawn', 100],
        ['knight', 300],
        ['bishop', 300]
    ]
    delta = [+1, -1]

    # Show init values
    print('origBestE       : %0.5f' % (bestEStart))
    print('origParam       : %s' % (paramToOptimize))
    print('bestK           : %0.3f' % (bestK))
    print('delta           : %s' % (delta))
    print('paramCycleLimit : %d' % (paramCycleLimit))
    print('EnableLogging   : %s\n' % ('On' if enableLog else 'Off'))

    t1 = time.perf_counter()

    # Run the tuner
    optiE, optiParam = Tune(paramToOptimize, bestEStart, delta, bestK, enableLog, paramCycleLimit)

    t2 = time.perf_counter()

    print('\nbestE     : %0.5f' % (optiE))
    print('bestParam : %s' % (optiParam))
    print('Elapsed   : %ds' % (t2-t1))

if __name__ == "__main__":
    main(sys.argv[1:])``````
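Note that the script above stubs out DoQSearchAndGetE with a random value. In a real Texel tuner, E is the mean squared error between the game results and a sigmoid of the q-search scores. A sketch of what that computation might look like (qScores and results are assumed to be parallel lists built from your own training positions; the scaling follows the usual 10^(-K*q/400) form):

```python
def ComputeE(qScores, results, K):
    """Texel tuning error: mean squared difference between the game
    result (1, 0.5 or 0 from white's point of view) and a sigmoid of
    the q-search score q in centipawns, scaled by K."""
    errSum = 0.0
    for q, r in zip(qScores, results):
        sigmoid = 1.0 / (1.0 + 10.0 ** (-K * q / 400.0))
        errSum += (r - sigmoid) ** 2
    return errSum / len(qScores)

# A perfectly predicted set gives an E close to 0:
print('%0.5f' % ComputeE([800, -800], [1.0, 0.0], 1.13))
```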

Code: Select all

``````# Update paramToOptimize with lastBestParam
if len(lastBestParam):
    paramToOptimize = deepcopy(lastBestParam)``````
It goes inside the delta loop:

Code: Select all

``````for d in delta:
    dValue = pValue + d
    if iLog:
        print('Tune >> param to optimize: %s' %(pName))
        print('Tune >> delta: %+d' %(d))

    # Update paramToOptimize with lastBestParam
    if len(lastBestParam):
        paramToOptimize = deepcopy(lastBestParam)

    # Create a new param set instead of paramToOptimize.
    # This is the set that will be sent to the engine.
    paramToBeTested = []``````
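As a side note, the stale-value problem this fix addresses can also be avoided by never mutating paramToOptimize in the first place. A minimal sketch of that alternative (this is not the code from the post):

```python
def BuildParamToBeTested(paramToOptimize, pName, dValue):
    """Copy the current param set, replacing only the entry being probed.
    paramToOptimize is left untouched, so no reset from lastBestParam
    is needed between delta trials."""
    return [[name, dValue if name == pName else value]
            for name, value in paramToOptimize]

params = [['pawn', 100], ['knight', 300], ['bishop', 300]]
print(BuildParamToBeTested(params, 'knight', 301))
print(params)  # unchanged
```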

Cheney
Posts: 85
Joined: Thu Sep 27, 2012 12:24 am

Re: Texel tuning method question

Yes, a brute force approach, and I know that would take a long time. However, your explanation of the gradient descent was spot on!

I started to use that concept and after some time I have some tuned numbers. I am basically testing with about 3M positions now and have eight parameters I want to play with (all pawn items, like isolated, doubled, backward, and chained, for mg and eg). I know the values I currently have are good, so I let the tuning process start from those. To "walk" the parameters, I increase the first by one and test; if E is not lower, I decrease the original value by 1. I then repeat this for the next parameter, and keep repeating until there are no improvements, recording each improvement. One question: can I walk one parameter until it no longer improves and then walk the next one? I figure there would be a point where earlier parameters are no longer optimal. I need to test this.
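The walking scheme described above is essentially coordinate descent, and the usual remedy for earlier parameters going stale is to keep making full passes over all parameters until a whole pass yields no improvement. A minimal sketch (error_fn stands in for the E computation over the position set; the toy error below is just for illustration):

```python
def CoordinateDescent(params, error_fn, step=1):
    """params: dict name -> value. error_fn(params) -> float.
    Repeats full passes over all parameters until none improves,
    so earlier parameters are revisited after later ones change."""
    bestE = error_fn(params)
    improved = True
    while improved:
        improved = False
        for name in params:
            for d in (+step, -step):
                params[name] += d
                E = error_fn(params)
                if E < bestE:
                    bestE = E
                    improved = True
                else:
                    params[name] -= d  # revert the trial step
    return params, bestE

# Toy error with its minimum at pawn=100, knight=310:
err = lambda p: (p['pawn'] - 100) ** 2 + (p['knight'] - 310) ** 2
print(CoordinateDescent({'pawn': 98, 'knight': 305}, err))
```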

Some of the parameters stay close to the values I originally hand tuned, while others go somewhere I did not expect - e.g., the chained pawn in mg gets a negative "bonus".

I tested the new values by having that engine play the hand-tuned engine - the auto-tuned one loses, but only by a few Elo over 10K games.

I then zeroed out all my pawn parameters and started over. The new auto-tuned values are still close to the previous auto-tuned values (one might be off by +/- 1). The interesting thing is I had all three engines play - one with zero values, one with hand-tuned values, and one with auto-tuned values.

The auto-tuned engine loses consistently, while my hand-tuned one wins consistently.

Now maybe this just won't work for me, or I am doing something wrong or missing the point. I'm not sure where to go with this. I was thinking of extracting positions from GM games, or from games pulled from CCRL, and testing to find out what happens. I am still working on this ...

Thanks again!