txt: automated chess engine tuning


Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: txt: automated chess engine tuning

Post by Joerg Oster »

brtzsnr wrote:I got similar values for Stockfish and Zurichess.

Unrelated, I changed the default eval depth to 0. Stockfish doesn't output a score for `go depth 0`, so it won't work with Stockfish. Also, I changed the script to only output quiet positions, as suggested earlier in this thread.
So I better keep the former version.

The first tuning session, just to get used to it, is up and running. So far, there are only two lines of output: base score = xxxxxxxxxxxxxx and new score = xxxxxxxxxxxxxxxxx ...
How do I know whether tuning makes progress? What are the current best values so far?

From txt.go I see that you are doing 10,000 steps. That would mean one tuning session would take about a month to finish?! (And this is only with about 500k positions, which may be too few!) :shock:
Jörg Oster
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: txt: automated chess engine tuning

Post by mar »

Joerg Oster wrote:I ran findk for both Zurichess and Stockfish.
Each iteration with Zurichess took about 14 minutes, while Stockfish needed about 4 minutes.
Yes, it seems very slow, as I expected. How many positions?
For comparison, my integrated tuner does one iteration (parallelized) in ~15-20 seconds on a quad (~6.5M positions).
I doubt you can use this tuning method without integrating it into the engine (technically you can, but you would have to wait a very long time).
Maybe a custom protocol instead of UCI might work, because, among other things, sending `position fen` each time will slow it down further. Doing a depth 1 search each time is also overkill.
It's much better to store the unpacked boards/positions in memory.
So all that's needed is to feed the engine with positions first and then run a command that simply does qsearch on each and outputs the final value.
This may work.
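
For illustration only, a bare-bones sketch of what such a command pair could look like in C++. The Position type, ParseFen and Qsearch below are placeholders for whatever the engine really provides, and the loadepd/evalall command names are invented for this sketch:

Code: Select all

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Placeholders standing in for the engine's real board type, FEN parser and qsearch.
struct Position { std::string fen; };
static Position ParseFen(const std::string& fen) { return Position{fen}; }
static int Qsearch(const Position& pos) { (void)pos; return 0; }

static std::vector<Position> g_positions;

// Hypothetical "loadepd <file>" command: parse every FEN once, keep the boards in memory.
static void LoadEpd(const std::string& file) {
    std::ifstream in(file);
    std::string fen;
    while (std::getline(in, fen))
        g_positions.push_back(ParseFen(fen));
}

// Hypothetical "evalall" command: qsearch every stored position, print one score per line.
// No per-position "position fen ..." parsing, no full search, just the quiescence value.
static void EvalAll() {
    for (const Position& pos : g_positions)
        std::cout << Qsearch(pos) << '\n';
    std::cout << std::flush;
}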
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: txt: automated chess engine tuning

Post by AlvaroBegue »

What optimization method is being used here? With moderate effort one can modify the engine to produce the gradient of the loss function, and then a clever optimization method like L-BFGS or conjugate gradient would be very fast.

Computing the gradient can be done with a very clever trick called "reverse-mode automatic differentiation". I have some sample C++ code, if anyone is interested.
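
To give a flavour of the idea (this is not the sample code mentioned above, just an illustrative sketch with made-up names), a tiny tape-based reverse-mode AD for scalar expressions can be written in a few dozen lines of C++:

Code: Select all

#include <cmath>
#include <cstdio>
#include <vector>

// Each tape node remembers its (up to two) parents and the local partial
// derivative of this node with respect to each parent.
struct Tape {
    struct Node { int a, b; double da, db; };
    std::vector<Node> nodes;
    int push(int a, double da, int b, double db) {
        nodes.push_back({a, b, da, db});
        return static_cast<int>(nodes.size()) - 1;
    }
};

static Tape tape;

struct Var { int id; double val; };

Var make_var(double v)      { return {tape.push(-1, 0.0, -1, 0.0), v}; }
Var operator+(Var x, Var y) { return {tape.push(x.id, 1.0,  y.id, 1.0),  x.val + y.val}; }
Var operator*(Var x, Var y) { return {tape.push(x.id, y.val, y.id, x.val), x.val * y.val}; }
Var sigmoid(Var x) {
    double s = 1.0 / (1.0 + std::exp(-x.val));
    return {tape.push(x.id, s * (1.0 - s), -1, 0.0), s};
}

// Reverse pass: walk the tape backwards once, accumulating d(output)/d(node)
// for every node. The cost is proportional to the forward computation.
std::vector<double> gradient(Var output) {
    std::vector<double> adj(tape.nodes.size(), 0.0);
    adj[output.id] = 1.0;
    for (int i = output.id; i >= 0; --i) {
        const Tape::Node& n = tape.nodes[i];
        if (n.a >= 0) adj[n.a] += n.da * adj[i];
        if (n.b >= 0) adj[n.b] += n.db * adj[i];
    }
    return adj;
}

int main() {
    Var w1 = make_var(2.0), w2 = make_var(-3.0);
    Var y  = sigmoid(w1 * w2 + w1);     // toy function: f(w1, w2) = sigmoid(w1*w2 + w1)
    std::vector<double> g = gradient(y);
    std::printf("df/dw1 = %g  df/dw2 = %g\n", g[w1.id], g[w2.id]);
    return 0;
}

One forward evaluation plus one backward sweep over the tape yields the derivative with respect to every input at once, which is what makes this attractive when there are hundreds of evaluation parameters.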
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: txt: automated chess engine tuning

Post by Joerg Oster »

mar wrote:
Joerg Oster wrote:I ran findk for both Zurichess and Stockfish.
Each iteration with Zurichess took about 14 minutes, while Stockfish needed about 4 minutes.
Yes, it seems very slow, as I expected. How many positions?
For comparison, my integrated tuner does one iteration (parallelized) in ~15-20 seconds on a quad (~6.5M positions).
I doubt you can use this tuning method without integrating it into the engine (technically you can, but you would have to wait a very long time).
Maybe a custom protocol instead of UCI might work, because, among other things, sending `position fen` each time will slow it down further. Doing a depth 1 search each time is also overkill.
It's much better to store the unpacked boards/positions in memory.
So all that's needed is to feed the engine with positions first and then run a command that simply does qsearch on each and outputs the final value.
This may work.
Thank you for your suggestions.
Before trying to modify Stockfish, for which I would definitely need help, I first want to know whether this tuning method will work for Stockfish at all.
I am not so sure about that ... :D

I will now generate a new epd file with fewer positions, and only quiet ones. This should also reduce the effort spent in qsearch.
Hopefully, this will give me at least a hint if further proceeding is worth the effort.
Jörg Oster
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: txt: automated chess engine tuning

Post by jdart »

I think most practitioners are approaching this as a classical machine learning problem, and the standard method for that is Stochastic Gradient Descent (SGD), although other methods can be used.
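
For concreteness, one SGD epoch over such a tuning problem could be sketched like this; gradSample is a placeholder for however the engine computes the gradient of one position's squared error (automatic differentiation, finite differences, ...):

Code: Select all

#include <cstddef>
#include <functional>
#include <vector>

// One SGD epoch: for each training position, step the parameter vector theta
// a little way against the gradient of that position's squared error.
void SgdEpoch(std::vector<double>& theta,
              std::size_t numSamples,
              const std::function<std::vector<double>(std::size_t, const std::vector<double>&)>& gradSample,
              double learningRate)
{
    for (std::size_t i = 0; i < numSamples; ++i) {
        const std::vector<double> g = gradSample(i, theta);
        for (std::size_t j = 0; j < theta.size(); ++j)
            theta[j] -= learningRate * g[j];   // move downhill on this sample's error
    }
}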

--Jon
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: txt: automated chess engine tuning

Post by AlvaroBegue »

jdart wrote:I think most practitioners are approaching this as a classical machine learning problem, and the standard method for that is Stochastic Gradient Descent (SGD), although other methods can be used.

--Jon
You would still need to be able to compute gradients quickly for this, so you would still benefit from automatic differentiation.
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: txt: automated chess engine tuning

Post by jdart »

SGD is an iterative method that incrementally updates the gradient as it goes through the data.

--Jon
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Re: txt: automated chess engine tuning

Post by AlvaroBegue »

jdart wrote:SGD is an iterative method that incrementally updates the gradient as it goes through the data.

--Jon
Errrr.... No, or at least not in the sense that I am talking about. Let me clarify.

The loss function is the sum of squares of errors over all samples, so its gradient is the sum of the gradients of the squares of the errors of the samples. In SGD you take steps in the direction of the gradient of each sample, or perhaps of a few samples (so-called minibatches). But you still need a method to compute the gradient of the square of the error for a particular sample.
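
Written out for the Texel-style loss discussed in this thread (game result r_i, quiescence score q_i(θ), and scaling constant K as found by findk; the exact sigmoid mapping is an assumption here), the structure is:

Code: Select all

E(\theta) = \sum_i \bigl( r_i - \sigma(K\, q_i(\theta)) \bigr)^2,
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}

\nabla E(\theta) = \sum_i -2\,\bigl( r_i - \sigma(K q_i) \bigr)\,
                   \sigma(K q_i)\bigl(1 - \sigma(K q_i)\bigr)\, K\, \nabla q_i(\theta)

So the only engine-specific piece in each term is ∇q_i(θ), the gradient of the qsearch score with respect to the parameters, and that is exactly what a reverse-mode pass delivers for every parameter at once.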

In machine learning, specifically in neural networks, the computation of this gradient is called backpropagation. It turns out that backpropagation is a particular instance of reverse-mode automatic differentiation. But you can do the same thing for any function. And in C++ you can do it automatically, if you make your evaluation function a template so that it can take any type for representing scores.
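
As a sketch of that template idea (the evaluation terms and type names below are invented for illustration, not taken from any particular engine):

Code: Select all

#include <iostream>

// The evaluation is written once as a template over the score type.
// Score only needs construction from a number plus +, - and *; doubles
// satisfy that, and so does a suitable AD value type with those operators.
template <typename Score>
Score Evaluate(int pawnCount, int doubledPawns,
               const Score& pawnValue, const Score& doubledPawnPenalty)
{
    // Two toy terms standing in for a real evaluation function.
    return Score(pawnCount) * pawnValue - Score(doubledPawns) * doubledPawnPenalty;
}

int main() {
    // Normal play: instantiate with plain doubles (or integers) for speed.
    std::cout << Evaluate<double>(8, 1, 100.0, 20.0) << '\n';   // prints 780

    // Tuning: instantiate the same template with an AD value type whose
    // parameters were registered on a tape, then run the reverse pass to get
    // d(score)/d(pawnValue) and d(score)/d(doubledPawnPenalty).
    return 0;
}

The same source then compiles to the fast numeric evaluation used in play and to the differentiable version used only during tuning.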
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: txt: automated chess engine tuning

Post by jdart »

The kicker for me is: does making the objective go down improve game play?

If the answer is no, then you are not heading down the right path.

--Jon
brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

Re: txt: automated chess engine tuning

Post by brtzsnr »

Joerg Oster wrote:
brtzsnr wrote:I got similar values for Stockfish and Zurichess.

Unrelated, I changed the default eval depth to 0. Stockfish doesn't output a score for `go depth 0`, so it won't work with Stockfish. Also, I changed the script to only output quiet positions, as suggested earlier in this thread.
So I better keep the former version.

The first tuning session, just to get used to it, is up and running. So far, there are only two lines of output: base score = xxxxxxxxxxxxxx and new score = xxxxxxxxxxxxxxxxx ...
How do I know whether tuning makes progress? What are the current best values so far?

From txt.go I see that you are doing 10,000 steps. That would mean one tuning session would take about a month to finish?! (And this is only with about 500k positions, which may be too few!) :shock:
If the optimizer finds a set of values that improves the base score, it will print those values. For me it takes about an hour to get a better score.

However, the values converge to extremes, which is quite bad. For example, the optimizer found:

Code: Select all

setvalue EndGameMaterial.DoublePawnPenalty 39
setvalue MidGameMaterial.DoublePawnPenalty 3
setvalue EndGameMaterial.PawnChainBonus 0
setvalue MidGameMaterial.PawnChainBonus 4
These give an Elo drop of ~20.

Code: Select all

T:1800 @ TC:40/60+0.20
W:435 L:541 D:824
LOS:0.000 ELO:-20.5±11.8
FWIW, the optimizer used 500k quiet positions with `go depth 0`. It's something I don't understand yet.