
Re: CLOP: when to stop?

Posted: Tue Nov 08, 2016 3:53 pm
by AlvaroBegue
jdart wrote:
You have to go through R rounds of tuning, where I've found the 10k rounds figure fairly accurate.
This is CLOP you are talking about, or Texel? For the latter my tuning program typically converges in 50-60 iterations. It uses either Adagrad or ADAM (https://arxiv.org/pdf/1412.6980.pdf) to find the optimum.

--Jon
I tried some first-order gradient-descent algorithms (momentum, Adam) and the convergence was too slow. Perhaps I could play around with larger learning rates...
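
For reference, a single Adam step on the vector of evaluation parameters looks roughly like this (just a sketch with made-up names, not my actual tuning code):

Code:

#include <cmath>
#include <vector>

// One Adam update on the evaluation parameters, given the gradient of the
// tuning error at the current point. All names here are illustrative.
void adamStep(std::vector<double>& params, const std::vector<double>& grad,
              std::vector<double>& m, std::vector<double>& v, int t,
              double alpha = 0.001, double beta1 = 0.9, double beta2 = 0.999,
              double eps = 1e-8)
{
    for (size_t i = 0; i < params.size(); i++) {
        m[i] = beta1 * m[i] + (1 - beta1) * grad[i];            // 1st moment
        v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i];  // 2nd moment
        const double mHat = m[i] / (1 - std::pow(beta1, t));    // bias correction
        const double vHat = v[i] / (1 - std::pow(beta2, t));
        params[i] -= alpha * mHat / (std::sqrt(vHat) + eps);
    }
}

The step size alpha (0.001 by default in the paper) is what I would try increasing first.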

Anyway, yesterday I managed to get L-BFGS working and my first tests seem very promising.

Re: CLOP: when to stop?

Posted: Tue Nov 08, 2016 6:30 pm
by zenpawn
ZirconiumX wrote:
zenpawn wrote: Yeah, Texel tuning definitely intrigues me as well, but there doesn't seem to be a ready-made program to execute the algorithm on any 'ole engine like CLOP.
It requires more effort, yes, but the 150 Elo leap my engine made with it shows it can produce big rewards. The problem with such a universal tuner is that the communication latencies are massive.
I'd be happy to implement it myself, but the math is apparently over my head. :(

Re: CLOP: when to stop?

Posted: Tue Nov 08, 2016 7:42 pm
by cdani
zenpawn wrote:
ZirconiumX wrote:
zenpawn wrote: Yeah, Texel tuning definitely intrigues me as well, but there doesn't seem to be a ready-made program to execute the algorithm on any 'ole engine like CLOP.
It requires more effort, yes, but the 150 Elo leap my engine made with it shows it can produce big rewards. The problem with such a universal tuner is that the communication latencies are massive.
I'd be happy to implement it myself, but the math is apparently over my head. :(
Some code that may help with Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20

Re: CLOP: when to stop?

Posted: Wed Nov 09, 2016 1:26 am
by zenpawn
cdani wrote: Some code that may help with Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?

Re: CLOP: when to stop?

Posted: Wed Nov 09, 2016 9:48 pm
by cdani
zenpawn wrote:
cdani wrote: Some code that may help with Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
It is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

It can then be used for the tuning itself. With the code, I think you will be able to follow the explanation on the chessprogramming website more easily.
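
To give the idea, here is a sketch of that step (only an illustration, not the exact code from the link):

Code:

#include <cmath>
#include <vector>

struct Entry { double qScore; double result; };  // result: 1.0, 0.5 or 0.0

// Mean squared error E between the game results and the sigmoid of the
// quiescence scores, for a given scaling constant K (as on the CPW page).
double calc_e(const std::vector<Entry>& data, double K)
{
    double sum = 0;
    for (const Entry& e : data) {
        const double sigmoid = 1.0 / (1.0 + std::pow(10.0, -K * e.qScore / 400.0));
        sum += (e.result - sigmoid) * (e.result - sigmoid);
    }
    return sum / data.size();
}

// Coarse scan for the K that minimizes E; this K is then kept fixed.
double find_best_k(const std::vector<Entry>& data)
{
    double bestK = 1.0, bestE = calc_e(data, 1.0);
    for (double K = 0.5; K <= 3.0; K += 0.01) {
        const double e = calc_e(data, K);
        if (e < bestE) { bestE = e; bestK = K; }
    }
    return bestK;
}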

Re: CLOP: when to stop?

Posted: Wed Nov 09, 2016 10:53 pm
by zenpawn
cdani wrote:
zenpawn wrote:
cdani wrote: Some code that may help with Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
It is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

It can then be used for the tuning itself. With the code, I think you will be able to follow the explanation on the chessprogramming website more easily.
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably it's somehow used in the local optimization pseudocode at CPW (though I don't see it there). And where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right? So without such a comparison, how does it know whether it's making progress? All very confusing to me. :(

Re: CLOP: when to stop?

Posted: Fri Nov 11, 2016 7:18 am
by cdani
zenpawn wrote:
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably it's somehow used in the local optimization pseudocode at CPW (though I don't see it there). And where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right? So without such a comparison, how does it know whether it's making progress? All very confusing to me. :(
Once you have this done, it should be relatively simple. The idea is to call the calc_e function with the K value you obtained (which you never change again), but with one of your static evaluation parameters changed; if the returned error is lower, the change is presumably good.
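
In rough C++ the loop is something like this (only a sketch; here calc_e is assumed to re-score the test positions with the current parameters and the fixed K):

Code:

#include <cstddef>
#include <vector>

double calc_e(const std::vector<int>& params, double K);  // error over the FEN set, defined elsewhere

// Sketch of the local search: K stays fixed, each evaluation parameter is
// nudged by +1 and -1, and a change is kept only when the error over the
// test positions goes down. Names are illustrative.
std::vector<int> localOptimize(std::vector<int> params, double K)
{
    double bestE = calc_e(params, K);
    bool improved = true;

    while (improved) {
        improved = false;
        for (size_t i = 0; i < params.size(); i++) {
            const int deltas[2] = {1, -1};
            for (int delta : deltas) {
                params[i] += delta;
                const double e = calc_e(params, K);
                if (e < bestE) {
                    bestE = e;          // keep the change
                    improved = true;
                } else {
                    params[i] -= delta; // revert it
                }
            }
        }
    }
    return params;
}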

Re: CLOP: when to stop?

Posted: Fri Nov 11, 2016 12:09 pm
by zenpawn
cdani wrote: Once you have this done, it should be relatively simple. The idea is to call the calc_e function with the K value you obtained (which you never change again), but with one of your static evaluation parameters changed; if the returned error is lower, the change is presumably good.
OK, so the E() in the CPW pseudo-code for localOptimize is your calc_e(K), correct? I'm surprised they didn't make K a function argument as well, if only to make it clearer where it would be used.

In your calc_e, how is game_result obtained? Is that from the FEN? Would it be possible to use another engine's eval as the comparison instead of the final result?

Re: CLOP: when to stop?

Posted: Fri Nov 11, 2016 12:12 pm
by lucasart
zenpawn wrote:
cdani wrote:
zenpawn wrote:
cdani wrote: Some code that may help with Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
It is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

It can then be used for the tuning itself. With the code, I think you will be able to follow the explanation on the chessprogramming website more easily.
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably it's somehow used in the local optimization pseudocode at CPW (though I don't see it there). And where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right? So without such a comparison, how does it know whether it's making progress? All very confusing to me. :(
I use the Newton-Raphson method for that. Very easy:
- You want to minimise the error E(lambda), the distance to the logistic with parameter lambda.
- You start with a guess value (or any value, really) and iteratively apply the following:
1. Compute the second-order polynomial tangent to the function at the current point.
2. This polynomial is convex (an upward parabola); its minimum is your new starting point. Go to step 1.

The reason this works (from any starting point) is that the function E is convex. This is very fortunate indeed, because minimizing in general can be very complicated (e.g. even if you start in an area where E is convex, an iteration of NR can throw you past an inflexion point into another area, and it can diverge or converge to another local minimum instead).

Here's my code:
https://github.com/lucasart/Demolito/bl ... rc/tune.cc

Code:

#include <cmath>
#include <cstddef>
#include <iostream>
#include <vector>

// fens, qsearches and scores are globals defined elsewhere in tune.cc:
// the positions, their quiescence scores, and the corresponding game results.

double error(double lambda)
{
    double sum = 0;

    for (size_t i = 0; i < fens.size(); i++) {
        const double logistic = 1 / (1.0 + std::exp(-lambda * qsearches[i]));
        sum += std::abs(scores[i] - logistic);
    }

    return sum / fens.size();
}

void logistic()
{
    double lambda = 0.4, e0 = error(lambda);
    double h = 0.01;

    while (true) {
        // Compute the function at 3 points
        const double ep = error(lambda + h);
        const double em = error(lambda - h);

        // Deduce first and second order derivative estimates
        const double e1 = (ep - em) / (2 * h);
        const double e2 = (ep - 2 * e0 + em) / (h * h);

        // New lambda is the minimum of the tangent parabola, i.e. lambda - e1 / e2
        const double newLambda = (e2 * lambda - e1) / e2;

        // Error and difference for the next iteration
        e0 = error(newLambda);
        h = (newLambda - lambda) / 10;

        std::cout << "lambda = " << newLambda << ", error(lambda) = " << e0 << '\n';

        if (std::abs(newLambda - lambda) < 0.00001)
            break;

        lambda = newLambda;
    }
}
PS: Surprisingly, the absolute distance seemed to work much better than quadratic distance.

You can also do it by dichotomy if you want. The math is simpler, but it is less elegant, and the code will be more complicated (and require more iterations).
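
For what it's worth, one way to do the dichotomy is to bisect on the sign of the numerical derivative, something like this (a sketch only; the bracket [0, 2] and the tolerances are arbitrary choices):

Code:

double error(double lambda);     // the error() from the code above

// Sketch: since error() is convex, its derivative changes sign exactly once,
// so we can halve a bracket around the minimum at every step. This needs
// more iterations than Newton-Raphson to reach the same precision.
double minimizeByBisection()
{
    double lo = 0.0, hi = 2.0;   // assumed to bracket the minimum
    const double h = 1e-4;       // step for the numerical derivative

    while (hi - lo > 1e-5) {
        const double mid = (lo + hi) / 2;
        const double slope = (error(mid + h) - error(mid - h)) / (2 * h);
        if (slope > 0)
            hi = mid;            // minimum lies to the left of mid
        else
            lo = mid;            // minimum lies to the right of mid
    }
    return (lo + hi) / 2;
}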