How Do You Automatically Tune Your Evaluation Tables

AlvaroBegue · Post by **AlvaroBegue** » Wed Jan 08, 2014 4:29 pm

A different approach is to try to make the evaluation function be a good predictor of game result, without playing games. (Of course I would play many games after tuning to verify the new weights are better than the previous ones.)

The basic mechanism would be something like this:
(1) Get a database of [~1 million?] positions with associated results.
(2) From each position, run quiescence search and extract the position that the evaluation ultimately came from; in the process, write a new database of quiescent positions with associated results.
(3) Define the probability of winning as sigmoid(C*evaluation), where sigmoid(x):=1/(1+exp(-x)) and the constant C is chosen so the evaluation has the usual scale in pawns (I got C=0.58 or something like that, but I am quoting from memory).
(4) Use non-linear regression to estimate the parameters of the evaluation function that maximize the [log-]likelihood of the results.

One needs to do something to handle draws, but probably treating them as half a victory and half a loss would be fine.

Notice that if your evaluation function is a linear combination of terms and you are trying to figure out the coefficients, step (4) is logistic regression.

I have only done small-scale tests with this idea, but the Junior team seems to have used it extensively, as described in this paper: http://www.ratio.huji.ac.il/node/2362 . They seem to handle draws in a complicated way, but other than that I think their ideas are similar to mine (I haven't read the paper in a while).

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Jan 08, 2014 4:37 pm

Tom Likens wrote:I have a question, which I'm sure many programmers have wrestled with and likely solved. Basically, I'm curious how you automatically tune the tables in your chess program. I use CLOP as well as some homegrown solutions to tuning individual variables. Of course, CLOP also allows multiple variables to be tuned simultaneously, but the number of games required for convergence grows prodigiously, (4-5 variables usually require 40-50K games for a reliable result). This problem is even worse for a table of 64 values. For some tables, it's possible to take advantage of the table's symmetry so that you only need to tune approximately 16 values, (rotating and mirroring the results works well for certain times of evaluation terms). But for others, you really should CLOP all 64 values. Unfortunately, that will never converge in any reasonable amount of time.

One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.

regards,
--tom

Hi Tom.

Sorry to be slightly off topic, I would just like to tell you my opinion about psqt.
They are not fully symmetrical, as ranks play an important role into the equation.

Everyone knows that basically the numbers you have for the different squares are a function of 3 distinct values: values in terms of centralistation (i.e., e4,d4,e5,d5 squares score better than the extended center, etc.), values in terms of files (i.e., e and d files score higher than c and f files, etc.), and values in terms of ranks, or space advantage. When you add the 3 values for each square, you have the end value.

Now, values in terms of centralisation and values in terms of files are symmetrical, but values in terms of ranks/space advantage are not. So that the whole table could not possibly be symmetrical. Besides, when doing the sum total one should have in mind that values in terms of ranks are so important, that they singificantly outweigh values for both other categories taken together. So that it is impossible to maintain any kind of fully symmetrical approach. I think the most one can do is create the basic values for all squares on one of the central e or d files and then proceed decreasing those numbers with less central files by fixed steps. Thus, you might need to do just the 8 squares on one of the central e or d files, and tune other squares automatically. I do not think any other 8 squares can give you a better approach.

Now, you can totally neglect my advice, but I think when creating the main 8 values for the central file, the following could be true:

for pawns:
- 3rd rank gets minimally higher values than 2nd rank
- 4th rank gets minimally higher values than 3rd rank
- 5th rank gets much higher values than 4th rank
- 6th rank gets significantly higher values than 5th rank

for minor pieces:
- second rank gets minimally higher values than 1st rank (unless of course you decide to experiment with much bigger penalty for the back rank, which I doubt is the best approach, but some medium values could do well)
- 3rd rank gets minimally higher values than 2nd rank
- 4th rank gets minimally higher values than 3rd rank (but knights can get some additional bonus for that rank)
- 5th rank gets much higher values than 4th rank (even if a minor is not on an outpost, it still gains space and that is very important, outposts are not that frequent, but space is very important even without outposts)
- 6th rank gets significantly higher values than 5th rank
- 7th rank might get values as for the 4th rank (minors on the 7th are excellently placed)
- 8th rank might get the same values as for the 3rd rank (minors on the 8th are in no way worse placed than minors on the 3rd rank)

for rooks:
- second rank gets minimally higher values than 1st rank
- 3rd rank gets minimally higher values than 2nd rank (with or without rook lift bonus)
- 4th rank gets minimally higher values than 3rd rank
- 5th rank gets much higher values than 4th rank, but nowhere near as big as for minors
- 6th rank gets much higher values than 5th rank, with the same remark
- 7th rank gets best value, of course, equal to bonus for rook on 7th, and then you do not need such a bonus apart in the psqt
- 8th rank might be about equal to 6th rank

for the queen:
- 2nd rank gets minimally higher values than 1st rank
- 3rd rank gets minimally higher values than 2nd rank
- 4th rank gets minimally higher values than 3rd rank
- 5th rank gets much higher values than 4th rank, but not as big a difference as with minors
- 6th rank gets best value
- 7th rank might get about the same value as 5th rank
- 8th rank might get about the same value as 4th rank

Thus, all pieces have their specificities across ranks. The minors and the queen should get their best values on the 6th rank, while the rook on the 7th.

In the end, you need to have values specified just for the e1,e2,e3,e4,e5,e6,e7,e8 squares, and then proceed to lower those values for the f,g and h files by steps, and then mirror the other half of the board.

I remember playing some games with your engine in the past, it was a double-edged affair.

PK · Post by PK » Wed Jan 08, 2014 9:33 pm

Lyudmil, I'll take a liberty to disagree. higher bonuses for 5th and 6th rank probably aren't function of board geometry alone, but of proximity of enemy force, especially king. Thus a table with Your bonuses for advancing would probably beat a table without them, but for the wrong reasons, and would in turn be beaten by a "flat" table (awarding centralization alone) + king proximity bonus.

Rebel · Post by **Rebel** » Thu Jan 09, 2014 12:04 am

Lyudmil Tsvetkov wrote:for pawns:
- 3rd rank gets minimally higher values than 2nd rank
- 4th rank gets minimally higher values than 3rd rank
- 5th rank gets much higher values than 4th rank
- 6th rank gets significantly higher values than 5th rank

Here is a nice and nasty side effect. PST's scores interact with other eval terms, for instance mobilty, consider the diagram.

[d]rnbqkbnr/pppppppp/8/8/P7/8/1PPPPPPP/RNBQKBNR b KQkq a3[/d]

The move 1.a4 gets a too high score because mobility gains 2 squares (a2 and a3) for the rook. Things get worse if we follow your advice for 2.a5 (a much higher value than a4 you say). I remember my program favoring the a4 move for the wrong reason (mobility) and my fix was to favor the a2/a3 squares above a4 and a5 during the opening phase.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Jan 09, 2014 12:26 am

PK wrote:Lyudmil, I'll take a liberty to disagree. higher bonuses for 5th and 6th rank probably aren't function of board geometry alone, but of proximity of enemy force, especially king. Thus a table with Your bonuses for advancing would probably beat a table without them, but for the wrong reasons, and would in turn be beaten by a "flat" table (awarding centralization alone) + king proximity bonus.

Hi Pawel.

It is not about proximity, but rather about pawns and pieces into enemy territory restricting the enemy pieces' activity, and that is true regardless of whether those friendly pawns and pieces are close to the enemy king or on the opposite side very far away.

Yes, you can do it in many ways, for example having a flat table and king attack/proximity bonus. I would do it all with space advantage, for example by scoring space advantage in the tables higher for the side of the board where the enemy king is. If there is a way to optimize this table, then probably you can do without king attacks even, only space tables and mobility.

To take advantage of your presence here, let me ask you: do engines in general/or your engine score minor outposts on the side where the enemy king is? If so, does not that clash in some way with king attacks/proximity? What I know is that minor pieces on the 5th rank, even when not on outposts, are almost of the same worth as being outposted. But outposts on the 5th rank would occur in only about 1/3 of the cases when a minor is on that rank. So, in a way, if you rely only on outposts, you miss probably some good moves.

Same with rooks, why have a flat table when a rook on the 6th is much stronger than a rook on the 4th rank? I think it just originated that way, people started like that, then some successful patterns started being used by many authors, then they tuned all they eval and search to those patterns, and all people are doing it like that. But, if people started with space advantage tables, then surely they would also have found a way to successfully tune such tables with other eval and search parameters.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Jan 09, 2014 12:39 am

Rebel wrote:
Lyudmil Tsvetkov wrote:for pawns:
- 3rd rank gets minimally higher values than 2nd rank
- 4th rank gets minimally higher values than 3rd rank
- 5th rank gets much higher values than 4th rank
- 6th rank gets significantly higher values than 5th rank
Here is a nice and nasty side effect. PST's scores interact with other eval terms, for instance mobilty, consider the diagram.

[d]rnbqkbnr/pppppppp/8/8/P7/8/1PPPPPPP/RNBQKBNR b KQkq a3[/d]

The move 1.a4 gets a too high score because mobility gains 2 squares (a2 and a3) for the rook. Things get worse if we follow your advice for 2.a5 (a much higher value than a4 you say). I remember my program favoring the a4 move for the wrong reason (mobility) and my fix was to favor the a2/a3 squares above a4 and a5 during the opening phase.

It is really nice and nasty.

Of all the engines I remember Junior liked very much such moves, it would play h4-h5-h6 very frequently in the opening, forgetting about other pieces.

But with 1.e4 instead of 1.a4 you get a better posted central pawn than a4 and also better minor piece mobility. I have seen engines playing moves like a4 but maybe on the 4th or 5th move after some knights have been developed and probably one bishop. Could not it be that very heavily weighted rook mobility is the reason for such moves?

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Jan 09, 2014 12:53 am

lucasart wrote:
Tom Likens wrote:I have a question, which I'm sure many programmers have wrestled with and likely solved. Basically, I'm curious how you automatically tune the tables in your chess program. I use CLOP as well as some homegrown solutions to tuning individual variables. Of course, CLOP also allows multiple variables to be tuned simultaneously, but the number of games required for convergence grows prodigiously, (4-5 variables usually require 40-50K games for a reliable result). This problem is even worse for a table of 64 values. For some tables, it's possible to take advantage of the table's symmetry so that you only need to tune approximately 16 values, (rotating and mirroring the results works well for certain times of evaluation terms). But for others, you really should CLOP all 64 values. Unfortunately, that will never converge in any reasonable amount of time.

One idea I've tried, with mixed success, is to CLOP only selected squares and then perform a linear or quadratic interpolation of the squares between the selected squares. Essentially, establishing a gradient field. This has kind of worked, but I'm sure there must be a better way. I'm curious, what others have tried.

regards,
--tom
Here's how I did it:
* I started by replicating the Fruit 2.1 PST. I wrote the code myself, because I didn't like the Fruit coding style, but I replicated the functionality almost exactly.
* I hand-tuned and simplified, gradually reducing the number of parameters.
* Then I simplified even more and did some CLOP tuning.

The problem with CLOP is that the nb of games explodes with dimension. I never got any success beyond 3 parameters with CLOP. So I only tuned the most important PST, which are 1/ King 2/ Knight. And I tuned them using super simplified parametric form, with only 2 or 3 parameters.

But CLOP is really not the best solution. It's the best I know how to use, but Joona has a much better auto-tuner, and he's recently got some pretty amazing result tuning SF's King PST.

Perhaps in a theoretical sense CLOP is best, but in reality it's horribly costly, explodes with dimension. Also, CLOP always starts from scratch, it cannot just iteratively improve already good values.

Regarding Stockfish king table, I have some suspicions regarding the way Stockfish treats the b1 and c1 squares with long castling. I have played many games against it with Stockfish castling long, and, in 100% of the cases (but I really have not seen a single game where the engine would do otherwise) the next move of Stockfish immediately after castling and the king landing on c1/c8 would be to retreat its king to safety to b1. In this way on many occasions the engine misses some better moves that would not be available on the next move. I have seen many grandmaster/world champion games at long TC, and, in only 50% of cases the next move after castling long would be to retreat immediately the king to safety further to b1. Stockfish does that automatically all the time.

Well, probably this would be true only of blitz, but I find such a behaviour somewhat suspicious. It is either that the engine values b1 too much over c1, or that there is some other eval/search rule that tells the engine it should prioritize moving its king to safety. However, such a priority should not be so stringent after castling long.

PK · Post by PK » Thu Jan 09, 2014 1:12 am

Hi,

I'm not aware of any engine that scores outposts differently depending on king proximity, but this might be interesting thing to try.

As for space advantage tables, I have tried them several times, always failing, with one exception: knights genuinely want to forward. but my midgame bishop table looks as follows:

Code: Select all

const int pstBishopMg&#91;64&#93; =  
&#123; 
 //A1                                        H1
    -2,  -5, -16,  -5,  -5, -16,  -5,  -2,
    -5,   6,   3,   3,   3,   3,   6,  -5,
    -5,   5,   8,   5,   5,   8,   5,  -5,
    -5,   0,   5,  13,  13,   5,   0,  -5,
    -5,   0,   5,  13,  13,   5,   0,  -5,
    -5,   0,   8,   5,   5,   8,   0,  -5,
    -5,   3,   0,   0,   0,   0,   3,  -5,
    -2,  -5,  -5,  -5,  -5,  -5,  -5,  -2
// A8                                       H8
&#125;;

it contains centralization bonus (going in the rings), long diagonal bonus, starting square penalty and home ground penalty (these threes on the second row). since my engine has relatively high mobility scores, I prefer my piece/square tables to have a mild tendency for inhibiting forward movement of pieces.

I'd be interested to check how Your piece/square tables work per se, as the only element of evaluation function. Let's say, a three-way competition between Your set, a set that I'll design for this occassion and UFO set (http://chessprogramming.wikispaces.com/ ... n+function). I'd like to determine the following:

- is space-grabbing (i.e. Yours) set of table better than more laid-back one?
- how much UFO can be improved upon by using midgame/endgame interpolation?

all I ask for is that You (or anyone for that matter) supplies piece values and a set of 12 piece/square tables in the format presented above. I promise to create a provisional rating list for pst-only evaluations, using a crippled version of Rodent.

wgarvin · Post by **wgarvin** » Thu Jan 09, 2014 1:33 am

PK wrote:Hi,

I'm not aware of any engine that scores outposts differently depending on king proximity, but this might be interesting thing to try.

As for space advantage tables, I have tried them several times, always failing, with one exception: knights genuinely want to forward. but my midgame bishop table looks as follows:
Code: Select all
const int pstBishopMg&#91;64&#93; =  
&#123; 
 //A1                                        H1
    -2,  -5, -16,  -5,  -5, -16,  -5,  -2,
    -5,   6,   3,   3,   3,   3,   6,  -5,
    -5,   5,   8,   5,   5,   8,   5,  -5,
    -5,   0,   5,  13,  13,   5,   0,  -5,
    -5,   0,   5,  13,  13,   5,   0,  -5,
    -5,   0,   8,   5,   5,   8,   0,  -5,
    -5,   3,   0,   0,   0,   0,   3,  -5,
    -2,  -5,  -5,  -5,  -5,  -5,  -5,  -2
// A8                                       H8
&#125;;
it contains centralization bonus (going in the rings), long diagonal bonus, starting square penalty and home ground penalty (these threes on the second row). since my engine has relatively high mobility scores, I prefer my piece/square tables to have a mild tendency for inhibiting forward movement of pieces.

You wrote "home ground penalty", but it looks like you are giving +3 to white bishops on b2 through g2, so it would really be a "home ground bonus" ?

If I understood what you meant, this bonus is to help inhibit forward movement a little bit.

Tom Likens · Post by **Tom Likens** » Thu Jan 09, 2014 3:03 am

Lyudmil,

I don't disagree in theory with what you've wrote, but I personally find it difficult to create an exact set of rules to cover all the cases possible. I think even for what you list, as well thought out as it is, there are exceptions. I've tried this many times over the years and I've always found that my intuition is never really accurate enough. It's similar to "profiling by intuition", of course I know where all the slow code in my program is, the problem is that the stupid, brain-dead profiler won't agree!

I think this is similar, that's why I'd like to find an automated solution. Ideally, it would be something that I could feed a table, such as the one you describe, that it would use as the initial seed and from there it would start optimizing the values based on a fitness function. It would also be nice if it converged before the heat death of the universe.

Lyudmil Tsvetkov wrote:I remember playing some games with your engine in the past, it was a double-edged affair.

Yes, it's a double-edged sword programming it as well. It's agony when it plays something stupid beyond belief, a too often occurrence unfortunately--hence this thread.

regards,
--tom

How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables

Re: How Do You Automatically Tune Your Evaluation Tables