Stuck trying to come up with my own PST values

KhepriChess · Post by **KhepriChess** » Tue Feb 22, 2022 7:03 pm

I've spent quite a bit of time lately trying to come up with my own PST values (instead of using PeSTO's). I've read most of the forum links on the chessprogramming site about PSTs, and I wrote my own Texel tuner.

Unfortunately nothing I seem to do gets close to the strength of my engine with the PeSTO values. I think the best I've been able to do is something like a 40-50 Elo loss in self-play.

For example, I hand-tuned the bishop table to something like the following:

```
-20, -10, -10, -10, -10, -10, -10, -20,
-10,   0,   0,   0,   0,   0,   0, -10,
-10,   0,  -5,   5,   5,  -5,   0, -10,
-10,   5,   0,  15,  15,   0,   5, -10,
-10,   0,   0,  15,  15,   0,   0, -10,
-10,   0,  -5,  10,  10,  -5,   0, -10,
-10,  25,   0,  15,  15,   0,  20, -10,
-20, -10, -15, -10, -10, -15, -10, -20,
```

For my Texel tuner, I basically rewrote my evaluation function in C# (because JS is way too slow for Texel tuning). I grabbed one of the "texel tuning" files from https://rebel13.nl/download/data.html (and made sure to remove non-quiet positions). Running the tuner against 100,000 positions gave me the following for bishops:

```
-24,  -6, -16, -10, -16, -14, -12, -20,
 -6,   4,   0,   2,  -2,  -6,  -6, -12,
 -6,  -4,  -7,  -1,  -1,  -7,  -2, -14,
-14,  -1,  -6,   9,   9,  -6,   3, -14,
-12,  -2,  -2,  11,   9,  -4,  -6, -16,
-16,  -4,  -7,   8,   6,  -7,  -4, -16,
-10,  19,  -6,   9,   9,   4,  14, -16,
-26, -16, -11,  -6,  -6, -11,  -8, -22,
```

But my engine with those texel values was something like a 100 Elo loss (in self-play). I tried running the texel tuner against all the positions (something like 700,000), but the output values were just a ton of negative numbers.

I'm really at a loss here now. I'm not sure what to do.

federico · Post by **federico** » Tue Feb 22, 2022 8:35 pm

I had a similar problem when I tried Texel tuning. The PST was compensating for the value of the piece. Say a queen has a value of 900 and the average of the queen PST (the sum of all 64 piece-square values divided by 64) is +400: that means the queen's value is too low, and the optimal value is probably closer to 1300.

To solve it, I added a normalizing function after each update to the PST to ensure the average value of the PST was always zero. This pushes the piece value itself toward the correct value, instead of pushing all the values in the PST up or down.
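A minimal sketch of that kind of normalization, written in Python for brevity. The function name `normalize_pst` and the flat 64-entry list layout are assumptions for illustration, not the code described above. It subtracts the table's mean from every square and moves that mean into the piece value, so `piece_value + pst[sq]` is unchanged for every square:

```python
def normalize_pst(piece_value, pst):
    """Return (new_piece_value, new_pst) with the PST re-centred on zero.

    piece_value + pst[sq] is the same before and after for every square;
    only the split between the two terms changes.
    """
    mean = sum(pst) / len(pst)
    centred = [value - mean for value in pst]
    return piece_value + mean, centred


# Numbers from the example above: queen value 900, PST averaging +400.
# The table itself is made up so that its mean is exactly 400.
queen_value, queen_pst = normalize_pst(900, [400] * 32 + [380] * 16 + [420] * 16)
```

With integer-only evals the mean would need rounding, in which case the table ends up near zero on average rather than exactly zero.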

Hope this helps, and good luck with tuning!

JVMerlino · Post by **JVMerlino** » Tue Feb 22, 2022 9:04 pm

I never bothered to tune every value in my PST for just this reason. I then found a PST (and now I'm embarrassed to say that I can't remember where) that included piece values, so queen PST values are all around 900-950, for example. It turned out to be far better than what I had personally arrived at (no surprise), but I also found that I could get some good improvement on it by adding a single adjustment per piece and per mg/eg, as follows:

```c
// tuning adjustments to PSTs
#define QUEEN_MG_ADJ   37
#define QUEEN_EG_ADJ   23
#define ROOK_MG_ADJ     1
#define ROOK_EG_ADJ    15
#define BISHOP_MG_ADJ   0
#define BISHOP_EG_ADJ  -2
#define KNIGHT_MG_ADJ  19
#define KNIGHT_EG_ADJ   2
#define PAWN_MG_ADJ    -4
#define PAWN_EG_ADJ    -2
```

So clearly, for my engine, the PST caused queens to be valued too low, but bishops were almost perfect already, etc. You might want to give this a try, although it is a lazy approach.
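To make the idea concrete, here is a rough Python sketch of applying one offset per piece and phase to a material-inclusive PST. The offsets are the ones listed above; the dictionary layout, the `apply_adjustment` helper, and the flat placeholder table are assumptions for illustration only:

```python
# Per-piece, per-phase offsets taken from the defines above (centipawns).
ADJUSTMENTS = {
    ("queen", "mg"): 37, ("queen", "eg"): 23,
    ("rook", "mg"): 1, ("rook", "eg"): 15,
    ("bishop", "mg"): 0, ("bishop", "eg"): -2,
    ("knight", "mg"): 19, ("knight", "eg"): 2,
    ("pawn", "mg"): -4, ("pawn", "eg"): -2,
}


def apply_adjustment(pst, piece, phase):
    """Shift every square of a material-inclusive PST by one tuned offset."""
    offset = ADJUSTMENTS[(piece, phase)]
    return [value + offset for value in pst]


# Placeholder table: a flat 900 on every square stands in for a real queen PST.
queen_mg = apply_adjustment([900] * 64, "queen", "mg")
```

The tuner then only has 10 numbers to fit instead of one per square, which is much harder to overfit on a small position set.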

jm

op12no2 · Post by **op12no2** » Tue Feb 22, 2022 10:27 pm

Could you keep the piece values anchored at 1 3 3 5 9, for example, and just tune the PSTs? Or set the whole mg/eg bishop (for example) PSTs to 3 and not have piece values? Then there is only one feature coefficient and weight. NB: I find JavaScript fast enough, taking ~30 sec per 1M positions per ~1000 features/weights, but yeah, that means hours not minutes for tuning...
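A small Python sketch of the second option, folding the piece value into the table so that each square has a single weight. The bishop value of 330, the helper names, and the use of the hand-tuned bishop table from the first post are all assumptions for illustration. `split_material` shows one way to recover a separate material value (for move scoring, say) from a folded table:

```python
# Hand-tuned bishop table from the first post.
BISHOP_PST = [
    -20, -10, -10, -10, -10, -10, -10, -20,
    -10,   0,   0,   0,   0,   0,   0, -10,
    -10,   0,  -5,   5,   5,  -5,   0, -10,
    -10,   5,   0,  15,  15,   0,   5, -10,
    -10,   0,   0,  15,  15,   0,   0, -10,
    -10,   0,  -5,  10,  10,  -5,   0, -10,
    -10,  25,   0,  15,  15,   0,  20, -10,
    -20, -10, -15, -10, -10, -15, -10, -20,
]


def fold_material(piece_value, pst):
    """Combine a piece value and a relative PST into one absolute table."""
    return [piece_value + value for value in pst]


def split_material(absolute_pst):
    """Recover (piece_value, relative_pst); the piece value is the rounded mean."""
    piece_value = round(sum(absolute_pst) / len(absolute_pst))
    return piece_value, [value - piece_value for value in absolute_pst]


absolute = fold_material(330, BISHOP_PST)   # 330 is an assumed bishop value
material, relative = split_material(absolute)
```

Either representation gives the same per-square score; the difference is only in how many free parameters the tuner sees for "material".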

lithander · Post by **lithander** » Tue Feb 22, 2022 11:15 pm

What's the benefit of having material value modified by PSTs and not just use the PSTs to store the combination?

emadsen · Post by **emadsen** » Tue Feb 22, 2022 11:38 pm

KhepriChess wrote: ↑Tue Feb 22, 2022 7:03 pm But my engine with those texel values was something like a 100 Elo loss (in self-play). I tried running the texel tuner against all the positions (something like 700,000), but the output values were just a ton of negative numbers.

I'm really at a loss here now. I'm not sure what to do.

What are you tuning? I recommend tuning advancement, centrality, and closeness to corner for each piece for MG and EG. Well, pawns don't need the corner parameter.

That's a total of (3 * 6 * 2) - 2 = 34 parameters from which you can calculate the PSTs.
Or are you attempting to tune 64 * 6 * 2 = 768 parameters?

I also recommend more positions. Try 5 million or more.
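As a rough illustration of building a PST from three weights instead of 64, here is a Python sketch. The feature definitions (rank for advancement, edge distance for centrality, Chebyshev distance to the nearest corner) and the example weights are assumptions chosen for illustration, not the formulas of any particular engine:

```python
CORNERS = [(0, 0), (7, 0), (0, 7), (7, 7)]


def square_features(square):
    """Return (advancement, centrality, corner_closeness) for square 0..63, a1 = 0."""
    file, rank = square % 8, square // 8
    advancement = rank                                        # 0 on rank 1, 7 on rank 8
    centrality = min(file, 7 - file) + min(rank, 7 - rank)    # 0 in corners, 6 in the centre
    corner_distance = min(max(abs(file - cf), abs(rank - cr)) for cf, cr in CORNERS)
    corner_closeness = 3 - corner_distance                    # 3 in a corner, 0 in the centre
    return advancement, centrality, corner_closeness


def build_pst(advancement_w, centrality_w, corner_w):
    """Expand three tunable weights into a full 64-entry table."""
    table = []
    for square in range(64):
        adv, cen, cor = square_features(square)
        table.append(round(advancement_w * adv + centrality_w * cen + corner_w * cor))
    return table


# Made-up weights, only to show the shape of the result.
bishop_mg = build_pst(advancement_w=0, centrality_w=4, corner_w=-2)

# 3 weights * 6 piece types * 2 phases, minus the 2 unused pawn corner weights.
PARAMETER_COUNT = 3 * 6 * 2 - 2
```

The tuner adjusts only the weights and regenerates the tables before each error calculation.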

KhepriChess · Post by **KhepriChess** » Wed Feb 23, 2022 1:27 am

op12no2 wrote: ↑Tue Feb 22, 2022 10:27 pm Could you keep the piece values anchored at 1 3 3 5 9 for example and just tune the PSTs? Or set the whole mg/eg bishop (for example) PSTs to 3 and not have piece values? Then there is only one feature coefficient and weight. NB: I find Javascript fast enough, taking ~30 sec per 1M positions per ~1000 features/weights - but yeah that means hours not minutes for tuning...

That's actually what I've been doing. I just have the tuner changing the PST values, and not the piece values. And man, I don't know what you're doing but I was never able to get my JS tuner to be that fast. Even my C# version takes over an hour for 100,000 positions.

lithander wrote: ↑Tue Feb 22, 2022 11:15 pm What's the benefit of having material value modified by PSTs and not just use the PSTs to store the combination?

If I'm using material value elsewhere (like in move scoring), doesn't it make sense to separate them?

emadsen wrote: ↑Tue Feb 22, 2022 11:38 pm
What are you tuning? I recommend tuning advancement, centrality, and closeness to corner for each piece for MG and EG. Well, pawns don't need the corner parameter.

That's a total of (3 * 6 * 2) - 2 = 34 parameters from which you can calculate the PSTs.

Or are you attempting to tune 64 * 6 * 2 = 768 parameters?

I also recommend more positions. Try 5 million or more.

Your recommendation is more or less what I'm aiming for in tuning (and basically what I've tried to aim for in hand tuning up to this point). I've looked at the PSTs of a few other engines to get a sense of what they're doing, for some reference.

I'm trying to tune the latter (64 * 6 * 2), every square for every piece for MG and EG.

I've tried with more positions, but then the resulting PSTs are just a bunch of large negative numbers.

emadsen · Post by **emadsen** » Wed Feb 23, 2022 2:24 am

KhepriChess wrote: ↑Wed Feb 23, 2022 1:27 am Your recommendation is more or less what I'm aiming for in tuning (and basically what I've tried to aim for in hand tuning up to this point). I've looked at the PSTs of a few other engines to get a sense of what they're doing, for some reference.

I'm trying to tune the latter (64 * 6 * 2), every square for every piece for MG and EG.

I've tried with more positions, but then the resulting PSTs are just a bunch of large negative numbers.

Then it isn't.

You asked why your tuning efforts haven't yielded a stronger engine. I suggested it may be because you're tuning 23x more parameters against 1/7th the number of positions I recommended. You responded by saying that's "more or less" what you're doing. It's not.

You could be tuning against noise. How many games included a key maneuver of a white bishop to d6 in the endgame? Not near d6, exactly d6. The problem you're encountering, perhaps, is too easily convincing yourself, "I'm doing the same as others" or "I've done all that can be done." I'd be a little more cautious if I were you of casually dismissing what amounts to a 161x difference in eval minimization factors.
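The arithmetic behind those ratios, spelled out as a short Python check (variable names are made up for illustration; the figures are the ones quoted in this thread):

```python
full_table_params = 64 * 6 * 2        # every square, every piece, MG and EG
compact_params = 3 * 6 * 2 - 2        # three features per piece and phase, no pawn corner term

param_ratio = full_table_params / compact_params      # roughly 22.6, quoted as 23x
positions_ratio = 5_000_000 / 700_000                 # roughly 7.1, quoted as 1/7th

combined_factor = round(param_ratio) * round(positions_ratio)   # 23 * 7
```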

Please don't take this the wrong way. I'm just trying to answer your question as directly as I can. I certainly am not in possession of all the answers. I have a middling engine in terms of playing strength and a few years of experience with chess engine programming. Others have more of each. Listen to what other experienced chess engine authors suggest. If it doesn't align with what I'm saying, then consider my advice an outlier not worth following at this point.

For what it's worth, I think...

Step one is to gather more quiet positions correlated with game results. I suspect games from engines near the strength of your engine may be more helpful than games from elite engines, though I don't have hard data to prove that. Run these games yourself or download games from CCRL.

Step two is to increase the rigor of your process, beginning with decreasing the parameter search space your tuner must traverse. Verify the tuner actually is changing all param values. Also verify the tuner updates all lookup tables dependent on param values prior to calculating the eval error of the next iteration.

Step three is to investigate the numbers the tuner produces. Why does it return many large, negative numbers for PST values? How does this affect the static score of sample positions from your test set? What static eval does the stronger version of your engine return compared to the static eval of the weaker engine? As others have suggested, investigate whether the total value of material + PST changes drastically or only the PST value. If the tuned eval params produce nonsensical static scores, that suggests your tuner is not actually updating param values, isn't calculating the total error correctly (are you summing the square of each error so positive and negative eval diffs don't cancel each other out?), or isn't converting score to win percentage via a sigmoid function correctly.
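For reference while checking step three, a minimal Python sketch of the error calculation as it is commonly described for Texel tuning. The function names, the `(position, result)` sample format, and the default `k=1.0` are assumptions; in practice `k` is a scaling constant fitted to the engine's eval before tuning starts:

```python
def win_probability(score_cp, k=1.0):
    """Map a centipawn score (white's point of view) to an expected result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))


def mean_squared_error(samples, evaluate, k=1.0):
    """samples: iterable of (position, result), with result 1.0, 0.5 or 0.0 for white."""
    total = 0.0
    count = 0
    for position, result in samples:
        diff = result - win_probability(evaluate(position), k)
        total += diff * diff   # squaring keeps positive and negative errors from cancelling
        count += 1
    return total / count


# Two dead-equal evals, one game won and one lost: the raw errors (+0.5 and -0.5)
# would sum to zero, but the squared error does not.
example_error = mean_squared_error([(0, 1.0), (0, 0.0)], evaluate=lambda score: score)
```

If the tuner's error barely moves when a parameter is changed by a large amount, that points at the eval not picking up the new value (for example, stale lookup tables) rather than at the data.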

amanjpro · Post by **amanjpro** » Wed Feb 23, 2022 5:08 am

Zahak 6.2 (which is 2800 as per CCRL) was tuned over the same 700,000 quiet positions, and I tuned around 816 parameters with them. If the OP has issues making progress with tuning, then there is a bug somewhere: in the tuner, the eval, or the search.
