Impressive Texel-tuning results
Posted: Fri Jun 16, 2017 7:49 pm
About a month ago I tried to tune my evaluation values using the Texel-tuning method:
First I thought about how to create FENs for millions of positions, each labelled with the actual result of the game it came from. I used cute-chess at very fast time controls and pgn-extractor, which produced three files containing the positions: wins.epd, lost.epd and draws.epd.
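For illustration, here is a minimal sketch in Java of how such result-labelled files could be read in: every line of wins.epd, draws.epd and lost.epd gets the result 1.0, 0.5 or 0.0 attached. The LabeledPosition record and TrainingSetLoader class are invented for this sketch and are not chess22k's actual code.
Code: Select all
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// One training position: the EPD/FEN string plus the game result from
// white's point of view (1.0 = win, 0.5 = draw, 0.0 = loss).
record LabeledPosition(String fen, double result) {}

final class TrainingSetLoader {

    // Reads one EPD file and attaches the same result label to every line.
    static List<LabeledPosition> load(String file, double result) throws IOException {
        List<LabeledPosition> positions = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(file))) {
            if (!line.isBlank()) {
                positions.add(new LabeledPosition(line.trim(), result));
            }
        }
        return positions;
    }

    // Combines the three result files into one training set.
    static List<LabeledPosition> loadAll() throws IOException {
        List<LabeledPosition> all = new ArrayList<>();
        all.addAll(load("wins.epd", 1.0));
        all.addAll(load("draws.epd", 0.5));
        all.addAll(load("lost.epd", 0.0));
        return all;
    }
}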
Then I wrote some logic to import these FENs and implemented a local optimization routine as described on the CPW. My optimization routine can tune multiple arrays, and a 'step' can be specified per array. I also made it possible to specify that certain values in an array should be skipped, that values can be mirrored (left/right), and that values should sometimes be applied to both the white and black arrays.
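A rough sketch of that kind of CPW-style local search, assuming the tunable values live in plain int arrays and that some computeError supplier recomputes the error over the whole training set (see the error sketch further below); the names TunableArray and LocalOptimizer are made up for this example.
Code: Select all
import java.util.List;
import java.util.function.DoubleSupplier;

// Sketch of the CPW-style local search: nudge each tunable value up and down
// by its array's step size and keep a change only if the total error drops.
final class LocalOptimizer {

    // One tunable array plus the step used when nudging its entries.
    record TunableArray(int[] values, int step) {}

    // computeError is a placeholder for the (expensive) error calculation
    // over the whole training set, e.g. the TexelError sketch further below.
    static void tune(List<TunableArray> arrays, DoubleSupplier computeError) {
        double bestError = computeError.getAsDouble();
        boolean improved = true;
        while (improved) {
            improved = false;
            for (TunableArray array : arrays) {
                for (int i = 0; i < array.values().length; i++) {
                    for (int delta : new int[] { array.step(), -array.step() }) {
                        array.values()[i] += delta;
                        double error = computeError.getAsDouble();
                        if (error < bestError) {
                            bestError = error;          // keep the improvement
                            improved = true;
                        } else {
                            array.values()[i] -= delta; // revert the change
                        }
                    }
                }
            }
        }
    }
}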
The CPW describes that you should use the quiescence search for this method, but I wasn't happy with the speed of calculating the error, which took about 20 seconds (see: http://talkchess.com/forum/viewtopic.php?t=64189). I also read that the author of zurichess used the Texel-tuning method and had created several EPD files that can be used as a test set: http://www.talkchess.com/forum/viewtopic.php?t=61427
That zip file also contains a quiet.epd with only quiet positions, so with this set I don't need the quiescence search and can call the evaluation function directly. This is way faster and can quite easily be made multi-threaded, which is what I did.
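A minimal sketch of that error calculation: the usual Texel formula, the mean of (result - sigmoid(eval))^2 over all labelled quiet positions, with a parallel stream doing the multi-threading. Evaluation.evaluate(fen) is a hypothetical stand-in for the engine's evaluation, LabeledPosition is the record from the loading sketch above, and whether chess22k uses exactly this sigmoid is an assumption.
Code: Select all
import java.util.List;

// Mean squared error over the labelled quiet positions, using the common
// Texel sigmoid 1 / (1 + 10^(-k * score / 400)) to map a centipawn score
// to an expected result between 0 and 1.
final class TexelError {

    static double sigmoid(double centipawns, double k) {
        return 1.0 / (1.0 + Math.pow(10, -k * centipawns / 400.0));
    }

    // Evaluation.evaluate(fen) is a hypothetical stand-in for the engine's
    // evaluation function (centipawns from white's point of view). Because
    // the positions are already quiet, no quiescence search is needed, and
    // a parallel stream spreads the work over all cores.
    static double computeError(List<LabeledPosition> positions, double k) {
        return positions.parallelStream()
                .mapToDouble(p -> {
                    double diff = p.result() - sigmoid(Evaluation.evaluate(p.fen()), k);
                    return diff * diff;
                })
                .average()
                .orElse(0.0);
    }
}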
Next I calculated the scaling factor that gives the lowest error: 1.3
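One straightforward way to find such a scaling factor is to scan a range of candidate values with the current (untuned) evaluation and keep the one with the lowest error; the range and step below are illustrative choices, not the search actually used.
Code: Select all
import java.util.List;

// Scans a range of candidate scaling factors and keeps the one with the
// lowest error. The 0.5-2.0 range and 0.05 step are arbitrary for this sketch.
final class ScalingFactorSearch {

    static double findBestK(List<LabeledPosition> positions) {
        double bestK = 1.0;
        double bestError = Double.MAX_VALUE;
        for (double k = 0.5; k <= 2.0; k += 0.05) {
            double error = TexelError.computeError(positions, k);
            if (error < bestError) {
                bestError = error;
                bestK = k;
            }
        }
        return bestK;
    }
}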
My original evaluation function had an error of 0.0654; after tuning I got it down to 0.0586. These are the (sometimes obvious) major tweaks:
Code: Select all
- rook material : 500 -> 645
- queen material : 950 -> 1200
- pinned-pieces-penalty : always 15 -> 0-80, depending on the pinned-piece type
- pawn-storm-bonus : disabled
- pawn-shield-bonus : way higher scores when the pawn is almost promoted, now the king is defending the pawn! :)
- king-safety-scores : start at 50, then decreases, then increases, no clue why :P
- mobility : bigger penalties and bonuses
- psqt : subtle but relevant tweaks
All in all this gives an increase of more than 100 Elo points! See the table below for preliminary results (chess22k-ex has the tuned values, chess22k has the same logic without the tuned values).
Code: Select all
Rank  Name          Elo  +/-  Games  Score  Draws
   1  chess22k-ex   146   45    191  69.9%  26.7%
   2  Ruffian_105    31   42    193  54.4%  29.0%
   3  Maverick 1.5   -7   42    192  49.0%  29.2%
   4  chess22k      -20   41    192  47.1%  29.7%
   5  AnMon_5.75    -60   42    193  41.5%  26.9%
   6  chess22k 1.3  -83   40    191  38.2%  35.6%

576 of 1500 games finished.
Another thing I like is that you can easily see which evaluation values are actually not used.
It is now also easy to add a new feature to the evaluation function and see how it performs (assuming that a lower error results in a higher Elo score).
I guess chess22k 1.4 will be released soon!