How Do You Automatically Tune Your Evaluation Tables
Moderators: Harvey Williamson, Dann Corbit, hgm
Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.

 Posts: 303
 Joined: Sat Apr 28, 2012 4:18 pm
 Location: Austin, TX
 Contact:
How Do You Automatically Tune Your Evaluation Tables
I have a question, which I'm sure many programmers have wrestled with and likely solved. Basically, I'm curious how you automatically tune the tables in your chess program. I use CLOP as well as some homegrown solutions to tuning individual variables. Of course, CLOP also allows multiple variables to be tuned simultaneously, but the number of games required for convergence grows prodigiously, (45 variables usually require 4050K games for a reliable result). This problem is even worse for a table of 64 values. For some tables, it's possible to take advantage of the table's symmetry so that you only need to tune approximately 16 values, (rotating and mirroring the results works well for certain times of evaluation terms). But for others, you really should CLOP all 64 values. Unfortunately, that will never converge in any reasonable amount of time.
One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.
regards,
tom
One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.
regards,
tom
Re: How Do You Automatically Tune Your Evaluation Tables
Have you tried doing that followed by optimizing just the squares in between? You could even iteratively optimize half the squares each time after you've done the initial step you just mentioned.Tom Likens wrote: One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.

 Posts: 303
 Joined: Sat Apr 28, 2012 4:18 pm
 Location: Austin, TX
 Contact:
Re: How Do You Automatically Tune Your Evaluation Tables
I haven't, and it's probably worth a try. But, I'm still wondering if I'm missing something obvious. I've seen the statement a few times, that Stockfish's tables are autotuned, so they must have a methodology for it.rbarreira wrote:Have you tried doing that followed by optimizing just the squares in between? You could even iteratively optimize half the squares each time after you've done the initial step you just mentioned.Tom Likens wrote: One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.

 Posts: 4065
 Joined: Tue Mar 14, 2006 10:34 am
 Location: Ethiopia
 Contact:
Re: How Do You Automatically Tune Your Evaluation Tables
I guess that you have to use some kind of function to interpolate over the PSQTs and then tune its parameters instead of all the values. You can either use a predetermined shape (say quadratic) and tune its parameters, or try to build a 'response surface' automatically and then optimize on the model instead of the actual data. In the past I did an optimization on a deterministic blackbox function using radial basis functions to build a response surface, which proved to be much faster. IIRC CLOP itself may be using a flavor of that although in a stochastic sense since game outcomes always have some uncertainty. In any case there is no escaping the fact that you will have to reduce the parameter space somehow, which is hopefully low enough for PSQTs. How many 'key points' do you need to tune Knight's PSQT for instance? I guess that you would need the four corners, the sides and the center as well. They are symmetric so you have only 64/8=8 squares to begin with. It will be twice that if you have backrank penalties. Also if you can build PSQT from simple equations , as Fruit does/used to do, you will be able to tune the parameters directly.

 Posts: 303
 Joined: Sat Apr 28, 2012 4:18 pm
 Location: Austin, TX
 Contact:
Re: How Do You Automatically Tune Your Evaluation Tables
I've been converging (pardon the pun) to this same conclusion. It's obvious that CLOP or any other optimization technique can't be applied to all the squares independently, so some reduction of the statespace has to occur. It's interesting that Fruit uses an equation to build its piece square tables, I never knew that. I'd bet money that Fabien switched to this after wrestling with this same problem.Daniel Shawul wrote:I guess that you have to use some kind of function to interpolate over the PSQTs and then tune its parameters instead of all the values. You can either use a predetermined shape (say quadratic) and tune its parameters, or try to build a 'response surface' automatically and then optimize on the model instead of the actual data. In the past I did an optimization on a deterministic blackbox function using radial basis functions to build a response surface, which proved to be much faster. IIRC CLOP itself may be using a flavor of that although in a stochastic sense since game outcomes always have some uncertainty. In any case there is no escaping the fact that you will have to reduce the parameter space somehow, which is hopefully low enough for PSQTs. How many 'key points' do you need to tune Knight's PSQT for instance? I guess that you would need the four corners, the sides and the center as well. They are symmetric so you have only 64/8=8 squares to begin with. It will be twice that if you have backrank penalties. Also if you can build PSQT from simple equations , as Fruit does/used to do, you will be able to tune the parameters directly.
Re: How Do You Automatically Tune Your Evaluation Tables
Here's how I did it:Tom Likens wrote:I have a question, which I'm sure many programmers have wrestled with and likely solved. Basically, I'm curious how you automatically tune the tables in your chess program. I use CLOP as well as some homegrown solutions to tuning individual variables. Of course, CLOP also allows multiple variables to be tuned simultaneously, but the number of games required for convergence grows prodigiously, (45 variables usually require 4050K games for a reliable result). This problem is even worse for a table of 64 values. For some tables, it's possible to take advantage of the table's symmetry so that you only need to tune approximately 16 values, (rotating and mirroring the results works well for certain times of evaluation terms). But for others, you really should CLOP all 64 values. Unfortunately, that will never converge in any reasonable amount of time.
One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.
regards,
tom
* I started by replicating the Fruit 2.1 PST. I wrote the code myself, because I didn't like the Fruit coding style, but I replicated the functionality almost exactly.
* I handtuned and simplified, gradually reducing the number of parameters.
* Then I simplified even more and did some CLOP tuning.
The problem with CLOP is that the nb of games explodes with dimension. I never got any success beyond 3 parameters with CLOP. So I only tuned the most important PST, which are 1/ King 2/ Knight. And I tuned them using super simplified parametric form, with only 2 or 3 parameters.
But CLOP is really not the best solution. It's the best I know how to use, but Joona has a much better autotuner, and he's recently got some pretty amazing result tuning SF's King PST.
Perhaps in a theoretical sense CLOP is best, but in reality it's horribly costly, explodes with dimension. Also, CLOP always starts from scratch, it cannot just iteratively improve already good values.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.

 Posts: 303
 Joined: Sat Apr 28, 2012 4:18 pm
 Location: Austin, TX
 Contact:
Re: How Do You Automatically Tune Your Evaluation Tables
Hello Lucas,lucasart wrote: Here's how I did it:
* I started by replicating the Fruit 2.1 PST. I wrote the code myself, because I didn't like the Fruit coding style, but I replicated the functionality almost exactly.
* I handtuned and simplified, gradually reducing the number of parameters.
* Then I simplified even more and did some CLOP tuning.
The problem with CLOP is that the nb of games explodes with dimension. I never got any success beyond 3 parameters with CLOP. So I only tuned the most important PST, which are 1/ King 2/ Knight. And I tuned them using super simplified parametric form, with only 2 or 3 parameters.
But CLOP is really not the best solution. It's the best I know how to use, but Joona has a much better autotuner, and he's recently got some pretty amazing result tuning SF's King PST.
Perhaps in a theoretical sense CLOP is best, but in reality it's horribly costly, explodes with dimension. Also, CLOP always starts from scratch, it cannot just iteratively improve already good values.
Yes, I've done some of this myselfstarting with a (semi)reasonable guess and then essentially hand tuning, while simultaneously trying to reduce the variables to something manageable. I should probably look into what Joona has done, since I'd like to autotune things as much as possible (what's the point of having a computer if we don't let it do the drudge work?).
I tend to agree about CLOP. I think Remi has produced some amazing software, which has revolutionized the way we develop chess programs. Bayeselo taught everyone about statistics and the staggering number of games required to measure real progress, that alone was enough to make his reputation, CLOP was simply an added bonus. I also like the CLOP interfacebut, that said, I don't think CLOP is a silver bullet. It takes a *long* time to converge and once you get past a few parameters it's almost hopeless. I don't blame Remi for that, I just think it's a hard problem. One suggestion I'm going to consider is SPSA. It has the potential to do what I want and I might learn something along the way, (always a nice ancillary bonus).
regards,
tom
Re: How Do You Automatically Tune Your Evaluation Tables
Out of curiosity, the Fruit PST's are reasonable so I always wondered once you have reasonable PST's what the maximum elo gain would be if you optimize them. How many elo did you get?lucasart wrote:Here's how I did it:
* I started by replicating the Fruit 2.1 PST. I wrote the code myself, because I didn't like the Fruit coding style, but I replicated the functionality almost exactly.
* I handtuned and simplified, gradually reducing the number of parameters.
* Then I simplified even more and did some CLOP tuning.
 hgm
 Posts: 25390
 Joined: Fri Mar 10, 2006 9:06 am
 Location: Amsterdam
 Full name: H G Muller
 Contact:
Re: How Do You Automatically Tune Your Evaluation Tables
Indeed, this seems a legitimate worry. PST are a very course evaluation term, only expressing some very vague ideas, like centralization, advance, and staying away from edges and corners. Of course the range of the terms has to have a reasonable magnitude, so that it won't sacrifice a Rook to move a Knight out of a corner. But beyond that...
I always have the feeling that when I am doing this, I am trying to fit a circle with a square. For any part of the circle that you try to capture inside the square, by enlarging the latter, you will create new area in the square that sticks out of the circle. It seems quite pointless to try to determine the optimal square size that minimizes the nonoverlapping area to 3 digits, because you will improve from poor to poor.001. While going to an octagon (= new eval term, cutting off the corners) in stead of a square would immediately do far better in reducing the nonoverlapping area even if the square size was completely untuned (e.g. the smallest one that completely covered the circle). In fact starting from a square that minimized the nonoverlapping areas would not do particularly well when you made it into an octagon by cutting off the corners.
I always have the feeling that when I am doing this, I am trying to fit a circle with a square. For any part of the circle that you try to capture inside the square, by enlarging the latter, you will create new area in the square that sticks out of the circle. It seems quite pointless to try to determine the optimal square size that minimizes the nonoverlapping area to 3 digits, because you will improve from poor to poor.001. While going to an octagon (= new eval term, cutting off the corners) in stead of a square would immediately do far better in reducing the nonoverlapping area even if the square size was completely untuned (e.g. the smallest one that completely covered the circle). In fact starting from a square that minimized the nonoverlapping areas would not do particularly well when you made it into an octagon by cutting off the corners.
Re: How Do You Automatically Tune Your Evaluation Tables
I don't know, because this was the result of many patches over the years. Those were of two kinds:Rebel wrote:Out of curiosity, the Fruit PST's are reasonable so I always wondered once you have reasonable PST's what the maximum elo gain would be if you optimize them. How many elo did you get?lucasart wrote:Here's how I did it:
* I started by replicating the Fruit 2.1 PST. I wrote the code myself, because I didn't like the Fruit coding style, but I replicated the functionality almost exactly.
* I handtuned and simplified, gradually reducing the number of parameters.
* Then I simplified even more and did some CLOP tuning.
1/ simplification: removing parameters and trying to compensate by making something more general, plus some hand tuning. Those did not give much, but allowed me to simplify to an extreme and have very few parameters left.
2/ CLOP tuning of the most sensitive PSQ. Those are King and Knight. The other ones are not sensitive enough, so tuning them with CLOP is hopeless. That gave most of the elo, but was made possible by preliminary step 1/.
I would have to implement the Fruit PSQ again to be able to measure, and really I can't be bothered. But my guess is that I have gained easily 20 ELO out of tuning PSQ alone.
Fruit's PSQ is (like the rest of Fruit) very well designed. But it is not tuned properly. All features have default weights that are reasonable and within the right ballpark, but that's it. They are not optimized properly. Fabien was not interested in tuning.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.