CLOP: when to stop?

cdani · Post by **cdani** » Wed Nov 09, 2016 9:48 pm

zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself?

Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.

zenpawn · Post by **zenpawn** » Wed Nov 09, 2016 10:53 pm

cdani wrote:
zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself?
Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.

OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me.

cdani · Post by **cdani** » Fri Nov 11, 2016 7:18 am

zenpawn wrote:
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me.

Once you have this done, should be relative simple. The idea is to call the calc_e function with the obtained fixed value (and you don't change it never) but changing one of your static evaluation parameters, and if the returned value is lower, the change is supposedly good.

zenpawn · Post by **zenpawn** » Fri Nov 11, 2016 12:09 pm

cdani wrote: Once you have this done, should be relative simple. The idea is to call the calc_e function with the obtained fixed value (and you don't change it never) but changing one of your static evaluation parameters, and if the returned value is lower, the change is supposedly good.

OK, so the E() in the CPW pseudo-code for localOptimize is your calc_e(K), correct? Surprised they didn't make it a function argument as well, if only to make it clearer where the K would be used.

In your calc_e, how is game_result obtained? Is that from the FEN? Would it be possible to use another engine's eval as the comparison instead of the final result?

lucasart · Post by **lucasart** » Fri Nov 11, 2016 12:12 pm

zenpawn wrote:
cdani wrote:
zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself?
Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me.

I use the Newton-Raphson method for that. Very easy:
- you want to minimise the error E(lambda), distance to the logistic with parameter lambda.
- you start with a guess value (or any value really), and apply iteratively the following:
1. compute the second order polynomial tangent to the function
2. this polynomial is convex (upwards parabola). the min value is your new starting point. go to step 1.

The reason this works (for any starting point) is that the function E is convex. This is very fortunate indeed, because minimizing in general can be very complicated (eg. even if you start in an area where E is convex, an iteration of NR can throw you into another area past an inflexion point, and it can diverge or converge to another local minimum instead).

Here's my code:
https://github.com/lucasart/Demolito/bl ... rc/tune.cc

Code: Select all

double error&#40;double lambda&#41;
&#123;
    double sum = 0;

    for &#40;size_t i = 0; i < fens.size&#40;); i++) &#123;
        const double logistic = 1 / &#40;1.0 + std&#58;&#58;exp&#40;-lambda * qsearches&#91;i&#93;));
        sum += std&#58;&#58;abs&#40;scores&#91;i&#93; - logistic&#41;;
    &#125;

    return sum / fens.size&#40;);
&#125;

void logistic&#40;)
&#123;
    double lambda = 0.4, e0 = error&#40;lambda&#41;;
    double h = 0.01;

    while &#40;true&#41; &#123;
        // Compute function in 3 points
        const double ep = error&#40;lambda + h&#41;;
        const double em = error&#40;lambda - h&#41;;

        // Deduce first and second order derivative estimates
        const double e1 = &#40;ep - em&#41; / &#40;2 * h&#41;;
        const double e2 = &#40;ep - 2 * e0 + em&#41; / &#40;h * h&#41;;

        // New Lambda is the minimum of the tangent parabola
        const double newLambda = &#40;e2 * lambda - e1&#41; / e2;

        // Error and difference for the next iteration
        e0 = error&#40;newLambda&#41;;
        h = &#40;newLambda - lambda&#41; / 10;

        std&#58;&#58;cout << "lambda = " << newLambda << ", error&#40;lambda&#41; = " << e0 << '\n';

        if &#40;std&#58;&#58;abs&#40;newLambda - lambda&#41; < 0.00001&#41;
            break;

        lambda = newLambda;
    &#125;
&#125;

PS: Surprisingly, the absolute distance seemed to work much better than quadratic distance.

You can also do it by dichotomy if you want. The math are simpler, but it is less elegant, and the code will be more complicated (and require more iterations).

CLOP: when to stop?

Re: CLOP: when to stop?

Re: CLOP: when to stop?

Re: CLOP: when to stop?

Re: CLOP: when to stop?

Re: CLOP: when to stop?