
Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:16 am
by Ronald
I would expect that a correct outcome of the position is important for the tuning result. If you use the outcome of games that were probably played by engines of different strength, the "true" outcome may differ more often from the game outcome and thus create error. There's only one way to find out, however: did you already make a comparison?

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:45 am
by Joost Buijs
Ronald wrote: Thu Aug 30, 2018 11:16 am I would expect that a correct outcome of the position is important for the tuning result. If you use the outcome of games that were probably played by engines of different strength, the "true" outcome may differ more often from the game outcome and thus create error. There's only one way to find out, however: did you already make a comparison?
No, I haven't made a comparison yet; when I can find some time for it, I will.

Somehow I have the feeling that, with enough games, statistics will make up for the lower quality (or uncertain outcome) of the games played by weaker engines.

Most of my time I spend working on a new version of my engine, which at the moment has only basic search with material and PSQ. Right now I'm busy implementing YBW using C++11 threads; in the past I always used Windows threads, and things like mutexes, locks and condition variables behave somewhat differently now. Maybe lazy SMP would be a better choice, at least ten times easier to implement.
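For what it's worth, lazy SMP needs little more than this minimal sketch with C++11 threads (the search function here is just a stub, and a real engine would of course also share the transposition table between the threads):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// All threads search the same root independently, sharing only a stop
// flag; the main thread's result is the one that gets used. No YBW-style
// split points are needed.
std::atomic<bool> stopFlag{false};

int search(int rootScore, int depth) {          // stand-in for the real search
    int score = rootScore;
    for (int d = 1; d <= depth && !stopFlag.load(); ++d) score += d;
    return score;
}

int lazySmpSearch(int rootScore, int depth, int numThreads) {
    stopFlag = false;
    std::vector<std::thread> helpers;
    for (int t = 1; t < numThreads; ++t)        // helpers, varied depths
        helpers.emplace_back([=] { search(rootScore, depth + (t & 1)); });
    int best = search(rootScore, depth);        // main thread's result wins
    stopFlag = true;                            // tell helpers to wind down
    for (auto& h : helpers) h.join();
    return best;
}
```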

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:53 am
by mar
Joost Buijs wrote: Thu Aug 30, 2018 9:57 am I think using an eval cache when tuning is an error: each time you modify a term and call the evaluator you will get the cached value instead of the new value. The same holds for quiescence: if you use TT pruning in quiescence you have to disable it.
Not at all: you clear it before each iteration, so you modify your parameter(s), then run through the set of positions with the eval cache enabled.
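A rough sketch of what I mean (all names made up, the eval is a dummy): the cache is cleared once after the parameter vector changes, and duplicate positions inside the same pass then hit the cache instead of re-running eval().

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Tuner {
    std::vector<int> params;                      // eval weights being tuned
    std::unordered_map<uint64_t, int> evalCache;  // position hash -> score
    int evalCalls = 0;

    int evaluate(uint64_t posHash) {              // stand-in for the real eval()
        auto it = evalCache.find(posHash);
        if (it != evalCache.end()) return it->second;
        ++evalCalls;
        int score = params[0] * int(posHash % 7); // dummy evaluation
        evalCache.emplace(posHash, score);
        return score;
    }

    // One tuning pass: modify a parameter, clear the cache, rescan the set.
    long long iterate(const std::vector<uint64_t>& positions, int delta) {
        params[0] += delta;
        evalCache.clear();                        // stale scores must go
        long long sum = 0;
        for (uint64_t h : positions) sum += evaluate(h);
        return sum;
    }
};
```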

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 12:05 pm
by Joost Buijs
mar wrote: Thu Aug 30, 2018 11:53 am
Joost Buijs wrote: Thu Aug 30, 2018 9:57 am I think using an eval cache when tuning is an error: each time you modify a term and call the evaluator you will get the cached value instead of the new value. The same holds for quiescence: if you use TT pruning in quiescence you have to disable it.
Not at all: you clear it before each iteration, so you modify your parameter(s), then run through the set of positions with the eval cache enabled.
You're right, that is something I didn't think of. I only use a pawn eval cache, which is currently disabled when tuning. To be honest, I haven't looked at it very carefully yet, and never spent any effort on making it faster.

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 4:30 pm
by jdart
By the way, I have also tried a recent method called SVRG (https://papers.nips.cc/paper/4937-accel ... uction.pdf), which is an improved version of SGD. But I couldn't get it to work. I am pretty sure that is an implementation issue and not a defect in the algorithm, which does seem promising.
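For reference, this is a toy sketch of the SVRG update on a 1-D least-squares problem (the data, step size and epoch count are made up for illustration, not taken from my tuner):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Toy 1-D SVRG for least squares f_i(w) = 0.5*(w*x_i - y_i)^2.
// Each outer epoch takes a snapshot ws and its full gradient mu; inner
// steps use the variance-reduced gradient g_i(w) - g_i(ws) + mu.
double svrg(const std::vector<double>& x, const std::vector<double>& y,
            double w, double eta, int epochs) {
    const int n = static_cast<int>(x.size());
    auto grad = [&](double wv, int i) { return (wv * x[i] - y[i]) * x[i]; };
    for (int s = 0; s < epochs; ++s) {
        const double ws = w;                    // snapshot
        double mu = 0.0;                        // full gradient at snapshot
        for (int i = 0; i < n; ++i) mu += grad(ws, i);
        mu /= n;
        for (int t = 0; t < 2 * n; ++t) {       // inner stochastic steps
            int i = std::rand() % n;
            w -= eta * (grad(w, i) - grad(ws, i) + mu);
        }
    }
    return w;
}
```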

--Jon

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:24 pm
by Sven
Ronald wrote: Thu Aug 30, 2018 10:14 am I used the "quiet-labeled.epd" set created by Zurichess for Texel tuning, which contains 750,000 quiet positions. Because they are quiet you don't need to call quiescence; you can call the eval function directly, which saves a lot of time.
I do exactly the same for Jumbo ("quiet-labeled.epd" and only calling eval), and I think my tuning runs quite fast. I have also implemented parallel computation of the E function for different training positions.

Jumbo currently has 2 * 137 = 274 eval parameters (MG + EG), which I always tune all at once (about 10 of them are excluded from tuning, e.g. the EG pawn material value). But my eval function is not very complex, and more than half of the parameters belong to the PST. Basically I use the original Texel tuning method.

When using 8 threads in parallel, the time needed for one iteration, i.e. one walk over all parameters where some of them are modified, is about 2-3 minutes. The number of iterations until convergence obviously depends on many factors and can't be predicted, but I never observed much more than 100 iterations, so the longest tuning run that I can remember took roughly 4 hours (not sure though, since the last time was several months ago). The first complete tuning of Jumbo (performed at the end of 2017) gave an improvement of about 100 Elo points.
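For clarity, one such iteration of the original Texel local search looks roughly like this (computeError stands in for evaluating E over the whole training set; the names are mine, not Jumbo's):

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// One iteration walks over all parameters, trying +1 and -1 and keeping
// any step that lowers the error E; iterations are repeated until a full
// walk improves nothing.
bool texelIteration(std::vector<int>& params,
                    const std::function<double(const std::vector<int>&)>& computeError,
                    double& bestE) {
    bool improved = false;
    for (std::size_t i = 0; i < params.size(); ++i) {
        const int orig = params[i];
        params[i] = orig + 1;                    // try a step up
        double e = computeError(params);
        if (e < bestE) { bestE = e; improved = true; continue; }
        params[i] = orig - 1;                    // then a step down
        e = computeError(params);
        if (e < bestE) { bestE = e; improved = true; continue; }
        params[i] = orig;                        // neither helped: revert
    }
    return improved;
}
```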

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:33 pm
by Sven
Joost Buijs wrote: Thu Aug 30, 2018 12:05 pm
mar wrote: Thu Aug 30, 2018 11:53 am
Joost Buijs wrote: Thu Aug 30, 2018 9:57 am I think using an eval cache when tuning is an error: each time you modify a term and call the evaluator you will get the cached value instead of the new value. The same holds for quiescence: if you use TT pruning in quiescence you have to disable it.
Not at all: you clear it before each iteration, so you modify your parameter(s), then run through the set of positions with the eval cache enabled.
You're right, that is something I didn't think of. I only use a pawn eval cache, which is currently disabled when tuning. To be honest, I haven't looked at it very carefully yet, and never spent any effort on making it faster.
I do not understand how an eval cache can help to speed up Texel tuning. To my knowledge, an eval cache stores the result of the whole evaluation call as one value per position. Since you need to clear that cache after modifying any eval parameter, and since all training positions are different from each other, a Texel tuning implementation that calls eval() (with 100% quiet positions) will not get any benefit from an eval cache: no position will be evaluated more than once for the same set of parameter values. The same basically holds for implementations calling qsearch(), since there it is only possible in rare cases that one training position in the input file leads to another training position via a capture sequence that is also part of the qsearch() of the first position, but this is certainly an exception.

A pawn hash helps of course; I use this in Jumbo. I have also implemented some logic that avoids clearing the pawn hash whenever it is clear that this would not change anything, which speeds up the tuning run further.
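The idea is roughly this (hypothetical layout, not Jumbo's actual code): each parameter is tagged with whether it feeds into the pawn evaluation, and the pawn hash is only cleared when a pawn-related parameter actually changed.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Param { int value; bool affectsPawnEval; };

struct PawnHash {
    std::unordered_map<unsigned long long, int> table;
    int clears = 0;
    void clear() { table.clear(); ++clears; }
};

// Apply a tuning step; only invalidate the pawn hash when needed,
// since clearing it otherwise throws away still-valid entries.
void applyChange(std::vector<Param>& params, std::size_t idx, int delta,
                 PawnHash& pawnHash) {
    params[idx].value += delta;
    if (params[idx].affectsPawnEval)  // e.g. passed-pawn bonus, pawn PST
        pawnHash.clear();
}
```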

Re: Texel tuning speed

Posted: Thu Aug 30, 2018 11:52 pm
by Sven
xr_a_y wrote: Thu Aug 30, 2018 12:16 am It seems Weini is able to run the qsearch needed for each position and each evaluation of the error in around 0.06 milliseconds.

Let's say I have only 100,000 positions and want to optimize 10 parameters.
That will require, let's say, at least 100,000 x 10 x 100 qsearch calls, so about 1h40min of computation.
Do you mean that *one* qsearch() call takes 0.06 ms? That would be quite a lot, so I don't think that is what you mean. On the other hand, 100,000 qsearch() calls plus one error calculation should also take much longer than 0.06 ms.

Therefore my question is: how often do you calculate the error function, once per training set and per set of parameter values (as it is intended), or once per position (which I would not understand)?
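Just to be explicit about what I mean by the error function: it is computed once per parameter vector, over the whole training set, as E = 1/N * sum_i (R_i - sigma(q_i))^2 with the usual Texel sigmoid sigma(q) = 1 / (1 + 10^(-K*q/400)). A minimal sketch:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Predicted win probability from a centipawn score q; K is a scaling
// constant that is fitted once and then kept fixed during tuning.
double sigmoidWinProb(double q, double K) {
    return 1.0 / (1.0 + std::pow(10.0, -K * q / 400.0));
}

// Mean squared error over all training positions: scores are the q_i
// (eval or qsearch results in centipawns), results the R_i in {0, 0.5, 1}.
double meanSquaredError(const std::vector<double>& scores,
                        const std::vector<double>& results,
                        double K) {
    double e = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        double d = results[i] - sigmoidWinProb(scores[i], K);
        e += d * d;
    }
    return e / scores.size();
}
```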

Re: Texel tuning speed

Posted: Fri Aug 31, 2018 12:05 am
by Sven
Sven wrote: Thu Aug 30, 2018 11:24 pm I have also implemented parallel computation of the E function for different training positions.
This may sound confusing and is actually wrong; what I meant was that I compute eval() and the corresponding sigmoid value in parallel for different training positions.

Re: Texel tuning speed

Posted: Fri Aug 31, 2018 6:53 am
by Joost Buijs
Sven wrote: Thu Aug 30, 2018 11:33 pm
Joost Buijs wrote: Thu Aug 30, 2018 12:05 pm
mar wrote: Thu Aug 30, 2018 11:53 am
Joost Buijs wrote: Thu Aug 30, 2018 9:57 am I think using an eval cache when tuning is an error: each time you modify a term and call the evaluator you will get the cached value instead of the new value. The same holds for quiescence: if you use TT pruning in quiescence you have to disable it.
Not at all: you clear it before each iteration, so you modify your parameter(s), then run through the set of positions with the eval cache enabled.
You're right, that is something I didn't think of. I only use a pawn eval cache, which is currently disabled when tuning. To be honest, I haven't looked at it very carefully yet, and never spent any effort on making it faster.
I do not understand how an eval cache can help to speed up Texel tuning. To my knowledge, an eval cache stores the result of the whole evaluation call as one value per position. Since you need to clear that cache after modifying any eval parameter, and since all training positions are different from each other, a Texel tuning implementation that calls eval() (with 100% quiet positions) will not get any benefit from an eval cache: no position will be evaluated more than once for the same set of parameter values. The same basically holds for implementations calling qsearch(), since there it is only possible in rare cases that one training position in the input file leads to another training position via a capture sequence that is also part of the qsearch() of the first position, but this is certainly an exception.

A pawn hash helps of course; I use this in Jumbo. I have also implemented some logic that avoids clearing the pawn hash whenever it is clear that this would not change anything, which speeds up the tuning run further.
In the training set, not every position has to be unique: a position that was won in game A can be drawn or lost in game B, and this all averages out in the error function. When you train with unique positions only, you have no such statistical info whatsoever.

Another possibility might be to prepare a training set containing only unique positions, scored not 0, 0.5 or 1.0 but with an average between 0 and 1.0. I've been thinking about this, but I don't know whether it mathematically amounts to the same thing; that is something I still have to look at.
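The preparation step I have in mind would look something like this (sketch, hypothetical names): collapse duplicate positions into one training entry whose label is the mean of the observed game results, so a position won in one game and drawn in another gets 0.75.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Labeled { std::string fen; double result; };

// Merge duplicate FENs into one entry with the averaged result.
std::vector<Labeled> averageDuplicates(const std::vector<Labeled>& in) {
    std::unordered_map<std::string, std::pair<double, int>> acc; // sum, count
    for (const auto& e : in) {
        auto& a = acc[e.fen];
        a.first += e.result;
        a.second += 1;
    }
    std::vector<Labeled> out;
    for (const auto& kv : acc)
        out.push_back({kv.first, kv.second.first / kv.second.second});
    return out;
}
```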