Joerg Oster wrote:
> I ran findk for both, Zurichess and Stockfish.
> Each iteration with Zurichess took about 14 minutes, while Stockfish needed about 4 minutes.

mar wrote:
> Yes, it seems very slow, as I expected. How many positions?

My measurements show a 50% overhead from going through UCI. eval takes about 20% of the CPU time, zurichess about 75% (of which 20% goes to decoding the FEN).
For comparison, my integrated tuner does one iteration (parallelized) in ~15-20 seconds on a quad (~6.5M positions).
I doubt you can use this tuning method without integrating it into the engine (technically you can, but you would have to wait for a very long time).
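For illustration, an integrated Texel-style tuning iteration of this shape can be sketched as follows. This is not code from the thread: the names (`tuning_error`, `chunk_error`) and the toy `(score, result)` pairs are hypothetical, and a real engine would use native threads in its own language rather than Python:

```python
from concurrent.futures import ThreadPoolExecutor

def sigmoid(score_cp, k):
    # map a centipawn score to an expected game result in [0, 1]
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def chunk_error(positions, k):
    # one worker's share: squared error between predicted and actual results
    return sum((result - sigmoid(score, k)) ** 2 for score, result in positions)

def tuning_error(positions, k, workers=4):
    # positions: pre-decoded (qsearch score, game result) pairs kept in
    # memory, so no FEN has to be re-parsed on any iteration
    step = (len(positions) + workers - 1) // workers
    chunks = [positions[i:i + step] for i in range(0, len(positions), step)]
    with ThreadPoolExecutor(workers) as pool:
        return sum(pool.map(lambda c: chunk_error(c, k), chunks)) / len(positions)
```

A tool like findk would then presumably just search over k to minimize this error; with the positions cached, each iteration is a pure in-memory pass.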
Maybe a custom protocol instead of UCI would work,
because, among other things, sending position fen for every position slows things down further. Doing a depth-1 search each time is also overkill.
It's much better to store the unpacked boards/positions in memory.
So all that's needed is to feed the engine with positions first and then run a command that simply does qsearch on each and outputs the final value.
This may work.
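A minimal command loop along those lines might look like this (a sketch only; the `loadpos`/`evalall` commands and the `engine` interface are invented for illustration, not an existing protocol):

```python
import sys

def tuning_loop(engine, lines=sys.stdin, out=sys.stdout):
    # "loadpos <fen>" decodes and stores a board once;
    # "evalall" then runs qsearch over every stored board and prints one
    # score per line, so FEN parsing is paid exactly once per position.
    boards = []
    for line in lines:
        cmd, _, arg = line.strip().partition(" ")
        if cmd == "loadpos":
            boards.append(engine.decode_fen(arg))
        elif cmd == "evalall":
            for board in boards:
                print(engine.qsearch(board), file=out)
        elif cmd == "quit":
            break
```

The tuner would feed all positions once up front and afterwards only issue `evalall` per iteration, removing the per-position FEN decode and search setup measured above.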
I could make txt much faster, but then everybody would need to write new encoding/decoding logic. I want people (like Joerg Oster, for example) to be able to start experimenting with txt on their engines without investing too much time. If they find it useful, writing a custom evaluation function that bypasses UCI is very easy (it took me 2 hours to write the generic one).
Nonetheless, right now I'm trying to get at least some improvement for Zurichess, which I haven't managed so far.