CLOP: when to stop?

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
AlvaroBegue
Posts: 920
Joined: Tue Mar 09, 2010 2:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: CLOP: when to stop?

Post by AlvaroBegue » Tue Nov 08, 2016 2:53 pm

jdart wrote:
You have to go through R rounds of tuning, where I've found the 10k rounds figure fairly accurate.
This is CLOP you are talking about, or Texel? For the latter my tuning program typically converges in 50-60 iterations. It uses either Adagrad or ADAM (https://arxiv.org/pdf/1412.6980.pdf) to find the optimum.

--Jon
I tried some first-order gradient-descent algorithms (momentum, Adam) and the convergence was too slow. Perhaps I could play around with larger learning rates...

Anyway, yesterday I managed to get L-BFGS working and my first tests seem very promising.

zenpawn
Posts: 296
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: CLOP: when to stop?

Post by zenpawn » Tue Nov 08, 2016 5:30 pm

ZirconiumX wrote:
zenpawn wrote: Yeah, Texel tuning definitely intrigues me as well, but there doesn't seem to be a ready-made program to execute the algorithm on any 'ole engine like CLOP.
It requires more effort, yes, but the 150 elo leap my engine made with it shows it can produce big rewards. The problem with such a universal tuner is the communication latencies are massive.
I'd be happy to implement it myself, but the math is apparently over my head. :(

User avatar
cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra
Contact:

Re: CLOP: when to stop?

Post by cdani » Tue Nov 08, 2016 6:42 pm

zenpawn wrote:
ZirconiumX wrote:
zenpawn wrote: Yeah, Texel tuning definitely intrigues me as well, but there doesn't seem to be a ready-made program to execute the algorithm on any 'ole engine like CLOP.
It requires more effort, yes, but the 150 elo leap my engine made with it shows it can produce big rewards. The problem with such a universal tuner is the communication latencies are massive.
I'd be happy to implement it myself, but the math is apparently over my head. :(
Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20

zenpawn
Posts: 296
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: CLOP: when to stop?

Post by zenpawn » Wed Nov 09, 2016 12:26 am

cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?

User avatar
cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra
Contact:

Re: CLOP: when to stop?

Post by cdani » Wed Nov 09, 2016 8:48 pm

zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.

zenpawn
Posts: 296
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: CLOP: when to stop?

Post by zenpawn » Wed Nov 09, 2016 9:53 pm

cdani wrote:
zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me. :(

User avatar
cdani
Posts: 2104
Joined: Sat Jan 18, 2014 9:24 am
Location: Andorra
Contact:

Re: CLOP: when to stop?

Post by cdani » Fri Nov 11, 2016 6:18 am

zenpawn wrote:
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me. :(
Once you have this done, should be relative simple. The idea is to call the calc_e function with the obtained fixed value (and you don't change it never) but changing one of your static evaluation parameters, and if the returned value is lower, the change is supposedly good.

zenpawn
Posts: 296
Joined: Sat Aug 06, 2016 6:31 pm
Location: United States

Re: CLOP: when to stop?

Post by zenpawn » Fri Nov 11, 2016 11:09 am

cdani wrote: Once you have this done, should be relative simple. The idea is to call the calc_e function with the obtained fixed value (and you don't change it never) but changing one of your static evaluation parameters, and if the returned value is lower, the change is supposedly good.
OK, so the E() in the CPW pseudo-code for localOptimize is your calc_e(K), correct? Surprised they didn't make it a function argument as well, if only to make it clearer where the K would be used.

In your calc_e, how is game_result obtained? Is that from the FEN? Would it be possible to use another engine's eval as the comparison instead of the final result?

User avatar
lucasart
Posts: 3040
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart
Contact:

Re: CLOP: when to stop?

Post by lucasart » Fri Nov 11, 2016 11:12 am

zenpawn wrote:
cdani wrote:
zenpawn wrote:
cdani wrote: Some code that maybe can help about Texel tuning:

http://www.talkchess.com/forum/viewtopi ... =&start=20
Hi Daniel,

Is this some sort of pre-step to the Texel tuning itself? :?
Is the "Compute the K that minimizes E." part that you can find here:
https://chessprogramming.wikispaces.com ... ing+Method

And then can be used to tune. With the code I think you will be able to follow more easy the explanation in chessprogramming website.
OK, so it is indeed a pre-step, yes? Once you have this constant K, however, it's unclear what follows. Presumably, it's somehow used in the local optimization pseudo code at CPW (though I don't see it). And, where is the comparison against an oracle evaluation? It's not playing full games like CLOP, right?, so without such a comparison, how does it know whether it's making progress. All very confusing to me. :(
I use the Newton-Raphson method for that. Very easy:
- you want to minimise the error E(lambda), distance to the logistic with parameter lambda.
- you start with a guess value (or any value really), and apply iteratively the following:
1. compute the second order polynomial tangent to the function
2. this polynomial is convex (upwards parabola). the min value is your new starting point. go to step 1.

The reason this works (for any starting point) is that the function E is convex. This is very fortunate indeed, because minimizing in general can be very complicated (eg. even if you start in an area where E is convex, an iteration of NR can throw you into another area past an inflexion point, and it can diverge or converge to another local minimum instead).

Here's my code:
https://github.com/lucasart/Demolito/bl ... rc/tune.cc

Code: Select all

double error(double lambda)
{
    double sum = 0;

    for &#40;size_t i = 0; i < fens.size&#40;); i++) &#123;
        const double logistic = 1 / &#40;1.0 + std&#58;&#58;exp&#40;-lambda * qsearches&#91;i&#93;));
        sum += std&#58;&#58;abs&#40;scores&#91;i&#93; - logistic&#41;;
    &#125;

    return sum / fens.size&#40;);
&#125;

void logistic&#40;)
&#123;
    double lambda = 0.4, e0 = error&#40;lambda&#41;;
    double h = 0.01;

    while &#40;true&#41; &#123;
        // Compute function in 3 points
        const double ep = error&#40;lambda + h&#41;;
        const double em = error&#40;lambda - h&#41;;

        // Deduce first and second order derivative estimates
        const double e1 = &#40;ep - em&#41; / &#40;2 * h&#41;;
        const double e2 = &#40;ep - 2 * e0 + em&#41; / &#40;h * h&#41;;

        // New Lambda is the minimum of the tangent parabola
        const double newLambda = &#40;e2 * lambda - e1&#41; / e2;

        // Error and difference for the next iteration
        e0 = error&#40;newLambda&#41;;
        h = &#40;newLambda - lambda&#41; / 10;

        std&#58;&#58;cout << "lambda = " << newLambda << ", error&#40;lambda&#41; = " << e0 << '\n';

        if &#40;std&#58;&#58;abs&#40;newLambda - lambda&#41; < 0.00001&#41;
            break;

        lambda = newLambda;
    &#125;
&#125;
PS: Surprisingly, the absolute distance seemed to work much better than quadratic distance.

You can also do it by dichotomy if you want. The math are simpler, but it is less elegant, and the code will be more complicated (and require more iterations).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

Post Reply