How to tune the bonus for having the move.

bob · Post by **bob** » Tue Jun 03, 2008 3:30 am

michiguel wrote:
Onno Garms wrote:0.05 is surprisingly small (but almost matches Strelka's value of 0.03).

As I have been taught that three tempi are a pawn in the opening and having the move is half a tempo, I would expect 1/6th of a pawn to be a good value. I would expect that this is fairly independent of the evaluation.
0.05 would be an "average" over all the situations that arise in different games. In closed positions and endgames it would be worth less, but in open, populated positions it would be worth more.

Miguel

That's exactly the idea. We have not tried to have an adaptive STM bonus as of yet, but the idea seems reasonable to test.
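
To make the numbers in this exchange concrete, here is a minimal Python sketch of a fixed side-to-move bonus. It assumes a negamax-style evaluation in centipawns; `PAWN`, `tempo_rule_bonus` and `evaluate` are hypothetical names for illustration, not taken from any engine mentioned in the thread.

```python
PAWN = 100  # centipawns; assumed unit for this sketch


def tempo_rule_bonus(tempi_per_pawn=3, fraction_of_tempo=0.5):
    """Bonus implied by the 'three tempi are a pawn' rule of thumb.

    Having the move is treated as a fraction of one tempo.
    """
    return PAWN * fraction_of_tempo / tempi_per_pawn


def evaluate(static_score_white, white_to_move, stm_bonus=5):
    """Return the score from the side to move's point of view.

    static_score_white: positional score from White's point of view.
    stm_bonus: fixed bonus for having the move (5 = 0.05 pawn).
    """
    score = static_score_white if white_to_move else -static_score_white
    return score + stm_bonus


if __name__ == "__main__":
    print(round(tempo_rule_bonus(), 2))      # rule-of-thumb value in centipawns
    print(evaluate(30, white_to_move=True))  # White is 0.30 up and to move
    print(evaluate(30, white_to_move=False)) # same position, Black to move
```

Under the rule of thumb Onno quotes, the bonus comes to about 17 centipawns, against the 5 centipawns (0.05 pawn) discussed above.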

Don · Post by **Don** » Tue Jun 03, 2008 6:20 am

bob wrote: There's no reason to be gun-shy. My comments came from experience because we have been trying various ways of evaluating changes for a couple of years. With a goal of trying to do this as quickly as possible. And the primary conclusion we have reached is that doing this quickly is not possible. And finally, testing with positions tends to make you handle those positions better, but whether this carries over into games may or may not be the case.

Bottom line is we have found no quick way to pull this off except for cases where the change is amazingly good, or horribly bad. But I have come to the conclusion that testing using positions is good for debugging or sanity testing, but not for evaluating changes, unless you are working on some specific tactical issue and have positions that you don't do well in normally. But even with those, it is critical to play a bunch of games to verify things, as some extensions improve tactical test results but hurt in real games...

Ditto on playing games. That's the only reliable way to measure improvements.

In fact, that's how I have always done it since writing my very first program. Larry and I were doing this back in the 8086 days. Each of us had 4 or 5 computers we tested with! That's pretty pathetic in comparison to what we have now with just a single computer but it's all we had to work with.

We flirted with trying to produce a large problem set that would tell us if we had improved the program, but you cannot even come close to doing that. However, problem sets are extremely useful for many other things, such as bug finding and hunting.

chrisw · Post by **chrisw** » Tue Jun 03, 2008 1:26 pm

Don wrote:
chrisw wrote:
Don wrote:I noticed that some people have suggested that a "having the move bonus" is beneficial to their program. I believe that too and have always used one in my programs. How to set it correctly seems to be a bit of a black art, but I have a suggestion that I would like others to try and report back.

You should start with a large timing test set. I use 100 positions to measure the general effects of performance tuning on my program. This is not a "find the solution" set, it is just a set of random positions from games to measure general search speed, node counts, etc, and I don't care what move is returned.

The basic idea is to TIME your program using various stand pat bonuses. The theory is that the most correct values will produce the fastest searches because there will be less "turbulence" in move choices, odd/even scores, etc. All of these things hurt the search.

It's just a theory I admit, but I tried it and I get reasonable values. Since the values are reasonable and you are just guessing anyway, why not use the values that produce the fastest search?

My program is very new and has a very primitive evaluation function that is not aggressive at all about positional values - so I would thus expect the stand pat bonus needed to be relatively low. That's what I get here, a value of 0.10 works best of the ones I tried. Here are the times for my 100 position timing set at various stand pat bonus values:
BONUS   AVE TIME      AVE NODES
-----   --------      ---------
 0.00      5.312      3,443,479
 0.05      4.734      3,119,767
 0.10      4.247      2,780,342
 0.15      4.692      2,938,563
 0.20      4.684      2,901,375
 0.30      5.862      3,310,455

As you can see, 0.10 appears to be best for me right now. 0.15 and 0.20 are non-optimal and I put 0.30 to get one that is obviously wrong but not totally ridiculous.

I would be curious about the results others get with such a test.
With a good quiescence function it ought not to matter too much. With a lousy quiescence function it will matter a lot, but it will also vary massively depending on the position.

There's no reason why tuning on search time reduction is going to optimise for strength, is there? You might try tuning on either a huge test set, or on the speed of finding the move actually chosen by the winning side in a set of high-Elo games.

So, assuming you have good quiescence, I would be inclined to suggest also modifying your idea for the sidetomovebonus by pieces left on the board. KQRBN would get a different value from KRBN, etc.
Who uses a good quiescence function nowadays? The biggest development over the past few years is to make the quies fast and stupid, so if you are right, then it does matter. I don't know of any evaluation function that smooths out positional scores.

- Don

Well, a strong chess player uses a good quiescence function. As in search until position quiet and evaluate.

Anything less than that is a downwards compromise based on inability to create a quiescence function that approximates it, and traditionally excused on the basis that bean-counting gives results.

I find the sidetomove bonus a kind of nonsense forced on the beancounting program as a kludge to cover its inherently problematical design.

You've heard of the concept of the coiled spring in chess? Perhaps in relation to some variations of the Sicilian. Black is forced back onto defending some weakness. But if White actually tries to do something and overextends, then, suddenly, all Black's pieces uncoil forwards from their defensive positions and White is in trouble.

My guess is that sidetomove bonus in that kind of situation is probably counterproductive.

Fact is this: the advantage of having the move is either positive, negative or neutral depending on the situation. Period.
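
The timing test Don describes in the quoted post could be sketched roughly as follows. This is only an illustration under stated assumptions: `search(position, bonus)` is a hypothetical hook into your own engine that searches one position with the given bonus and returns the node count, and `sweep_stm_bonus` and `best_by_nodes` are made-up helper names.

```python
import time


def sweep_stm_bonus(search, positions, bonuses):
    """Time a search over every position for each candidate bonus.

    search(position, bonus) is a hypothetical engine hook that runs a
    fixed-depth search and returns the number of nodes visited.
    Returns a list of (bonus, average_seconds, average_nodes) tuples.
    """
    results = []
    count = len(positions)
    for bonus in bonuses:
        total_time = 0.0
        total_nodes = 0
        for position in positions:
            start = time.perf_counter()
            total_nodes += search(position, bonus)
            total_time += time.perf_counter() - start
        results.append((bonus, total_time / count, total_nodes / count))
    return results


def best_by_nodes(results):
    """Pick the bonus with the lowest average node count."""
    return min(results, key=lambda row: row[2])[0]


if __name__ == "__main__":
    # The rows from Don's table above, fed back in as data.
    table = [
        (0.00, 5.312, 3443479),
        (0.05, 4.734, 3119767),
        (0.10, 4.247, 2780342),
        (0.15, 4.692, 2938563),
        (0.20, 4.684, 2901375),
        (0.30, 5.862, 3310455),
    ]
    print(best_by_nodes(table))
```

Ranking by node count rather than wall-clock time is a design choice made here because node counts are repeatable from run to run; the quoted post reports both. As chrisw points out, a faster search is not by itself evidence of stronger play.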

chrisw · Post by **chrisw** » Tue Jun 03, 2008 1:38 pm

Don wrote:
chrisw wrote:
Don wrote:
chrisw wrote:
Don wrote:I noticed that some people have suggested that a "having the move bonus" is beneficial to their program. I believe that too and have always used one in my programs. How to set it correctly seems to be a bit of a black art, but I have a suggestion that I would like others to try and report back.

You should start with a large timing test set. I use 100 positions to measure the general effects of performance tuning on my program. This is not a "find the solution" set, it is just a set of random positions from games to measure general search speed, node counts, etc, and I don't care what move is returned.

The basic idea is to TIME your program using various stand pat bonuses. The theory is that the most correct values will produce the fastest searches because there will be less "turbulence" in move choices, odd/even scores, etc. All of these things hurt the search.

It's just a theory I admit, but I tried it and I get reasonable values. Since the values are reasonable and you are just guessing anyway, why not use the values that produce the fastest search?

My program is very new and has a very primitive evaluation function that is not aggressive at all about positional values - so I would thus expect the stand pat bonus needed to be relatively low. That's what I get here, a value of 0.10 works best of the ones I tried. Here are the times for my 100 position timing set at various stand pat bonus values:
BONUS   AVE TIME      AVE NODES
-----   --------      ---------
 0.00      5.312      3,443,479
 0.05      4.734      3,119,767
 0.10      4.247      2,780,342
 0.15      4.692      2,938,563
 0.20      4.684      2,901,375
 0.30      5.862      3,310,455

As you can see, 0.10 appears to be best for me right now. 0.15 and 0.20 are non-optimal and I put 0.30 to get one that is obviously wrong but not totally ridiculous.

I would be curious about the results others get with such a test.
With a good quiescence function it ought not to matter too much. With a lousy quiescence function it will matter a lot, but it will also vary massively depending on the position.

There's no reason why tuning on search time reduction is going to optimise for strength, is there? You might try tuning on either a huge test set, or on the speed of finding the move actually chosen by the winning side in a set of high-Elo games.

So, assuming you have good quiescence, I would be inclined to suggest also modifying your idea for the sidetomovebonus by pieces left on the board. KQRBN would get a different value from KRBN, etc.
Who uses a good quiescence function nowadays? The biggest development over the past few years is to make the quies fast and stupid, so if you are right, then it does matter. I don't know of any evaluation function that smooths out positional scores.

- Don
The assumption that "nobody uses a good quiescence function nowadays" is not relevant to your original question. Maybe you do and maybe you don't, but I would answer the question of the best value for a sidetomove bonus on a philosophical basis and in general terms first, not by restricting it to what is best for a modern beancounting program. If the latter, I don't know and don't care: there isn't going to be a best bonus anyway, just a horrible kludge that couldn't possibly cover the situations that arise.

It depends on what can be done by the move advantage.

a) deliver mate - infinite value
b) grab a piece for free - value of piece
c) grab a pawn for free - value of pawn
d) push a freepawn - probably worth something, depending on how much material is on the board and where the king is
e) make a threat - depends, might be worth something
f) continue developing - worth something, depends on whether true or not
g) if can't do anything constructive, then likely worth nothing or even negative
h) start undoing a perfect position - negative
i) in zugzwang - more negative

Even with a quiescence function that picks some of that stuff up, having the move may be good in varying degrees, or it may be bad, again in varying degrees.

So, I'd just like to suggest that averaging over a bunch of positions without first going deeper into the problem is not terribly useful. I suggested instead grading the bonus against the material left on the board, just for starters.

Chris
Chris,

Breaking this up into middle game and end game with a transition between them (which is all the rage now) is a good idea.

I'm not sure you understand what the bonus is for. Let me explain it to you so that we are talking the same language.

Sometimes having the move can lead to checkmate or the win of material as you say, that is true. But we are relying on the quies search for things like this. Having the move bonus doesn't try to cover that. It also doesn't cover the case where one side is on the attack and the other side is defending - what is often called a "time" advantage. The search detects this pretty much and will play attacking style just to keep the opponent busy so that he cannot improve his position. Chess programs are pretty good at this and the stand pat bonus doesn't try to know this.

The stand pat bonus is only a device to compensate programs for that abrupt stopping point. At some point you have to stop the search and it's just plain silly to get a slightly higher score for the color that you happened to stop on. It's not just silly, but it weakens the play slightly and makes the search slower. It makes the search slower because a few hash table moves which work on odd ply searches will not work on even ply searches. This can apply to killers too. Stand Pat Bonus doesn't solve this or prevent it, but it does minimize it.

Thanks for the lesson

Your argument is that there is a step function in operation as you move between black and white in the search tree which can be smoothed by the addition of some averaged sidetomove bonus.

Well, I say not.

Sometimes there may be, sometimes there may not be and sometimes you'll get the step function backwards and degrade the evaluation score.

It all depends on the position. You're assuming always that it is possible to improve by having the move. But it isn't.

Hence an averaged step function smoother is a nonsense. That's not to say it can't statistically help an otherwise semi-clueless beancounter, but I think you should face up to what you're doing - namely a horrible kludge.
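
Both sides of this exchange mention grading the bonus by the material left on the board, with a transition between middle game and end game. A minimal Python sketch of that idea follows. Everything specific in it is an assumption for illustration: the phase weights (a commonly seen convention), the `mg_bonus` and `eg_bonus` values, and the function names.

```python
# Assumed phase weights per piece type; pawns and kings do not count.
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24  # 4 knights + 4 bishops + 4 rooks + 2 queens on the board


def game_phase(piece_counts):
    """Return 0 (bare end game) .. MAX_PHASE (all pieces still on).

    piece_counts maps a piece letter to the number of such pieces for
    both sides combined, e.g. {"N": 4, "B": 4, "R": 4, "Q": 2}.
    """
    phase = sum(PHASE_WEIGHTS[piece] * count
                for piece, count in piece_counts.items()
                if piece in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE)


def tapered_stm_bonus(piece_counts, mg_bonus=10, eg_bonus=2):
    """Interpolate the side-to-move bonus (centipawns) by game phase.

    mg_bonus and eg_bonus are placeholder values, not tuned numbers.
    """
    phase = game_phase(piece_counts)
    return (mg_bonus * phase + eg_bonus * (MAX_PHASE - phase)) / MAX_PHASE


if __name__ == "__main__":
    print(tapered_stm_bonus({"N": 4, "B": 4, "R": 4, "Q": 2}))  # full board
    print(tapered_stm_bonus({"R": 4, "B": 2, "N": 2}))          # queens off
    print(tapered_stm_bonus({}))                                # kings and pawns
```

This only scales an averaged bonus by material. It does not address chrisw's objection that the sign of the bonus can be wrong in a given position.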

Don · Post by **Don** » Tue Jun 03, 2008 4:45 pm

chrisw wrote: Thanks for the lesson

Your argument is that there is a step function in operation as you move between black and white in the search tree which can be smoothed by the addition of some averaged sidetomove bonus.

Well, I say not.

Sometimes there may be, sometimes there may not be and sometimes you'll get the step function backwards and degrade the evaluation score.

Welcome to my world - it's called an imperfect evaluation function. I think every term in my evaluation function (and yours too) is like this. I don't lose any sleep over it (although I try to improve on it) and yes I have faced up to this fact as you accuse me of not doing.

chrisw wrote: It all depends on the position. You're assuming always that it is possible to improve by having the move. But it isn't.

Hence an averaged step function smoother is a nonsense. That's not to say it can't statistically help an otherwise semi-clueless beancounter, but I think you should face up to what you're doing - namely a horrible kludge.

Of course it's a kludge. It's a trick and the idea is that hopefully you can't see behind the smoke and mirrors. Isn't that what CSTAL does with its evaluation function terms and weights? Isn't that what every program does? Are you going to tell me that CSTAL doesn't speculate?

mclane · Post by **mclane** » Tue Jun 03, 2008 5:15 pm

it seems we discuss in very big loops

each 10 years or more we come back to the same point of views.

hi don !!
long time since we met in the hague !

Don · Post by **Don** » Tue Jun 03, 2008 5:19 pm

mclane wrote:

it seems we discuss in very big loops
each 10 years or more we come back to the same point of views.

hi don !!
long time since we met in the hague !

Yes, I know this discussion by heart! Nice to hear from you.

bob · Post by **bob** » Tue Jun 03, 2008 6:49 pm

I agree that problem sets have their uses. WAC at 1 sec / move is a good sanity test to make sure you have not wrecked your search / extensions after some sort of change.

However, for real measurement, that isn't enough. One thing we encountered was something like a simple rook on the 7th bonus. You can find several positions in books where the correct move is to plant your rook on the 7th rank. And you can tune your rook on 7th bonus to make it do that in those positions, and think "aha, this is better". But have you ever seen a program plant its rook on the 7th rank, with no opponent pawns on the 7th and the opponent king up in the center of the board somewhere? So that the rook is very ineffectual where it stands? Some eval terms are pretty long-range (this is one of them, but king safety in general is an even bigger one). And you need a long-term test (a game) to see if a bonus is too high or too small...

Only problem is the time required to play enough games to be statistically significant...
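
To give a feel for how many games "statistically significant" takes, here is a rough Python sketch that turns a win/draw/loss record into an Elo estimate with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the mean score; the function names are hypothetical and this is not the procedure anyone in the thread says they use.

```python
import math


def elo_from_score(score):
    """Convert an expected score in (0, 1) to an Elo difference."""
    score = min(max(score, 1e-9), 1.0 - 1e-9)  # keep the log finite
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Uses the per-game variance of the score (1, 0.5 or 0) and a normal
    approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    stderr = math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))


if __name__ == "__main__":
    print(elo_with_margin(40, 20, 40))        # 100 games, even score
    print(elo_with_margin(4000, 2000, 4000))  # 10,000 games, even score
```

With an even 40/20/40 result over 100 games the interval is roughly 60 Elo on either side, so a change worth only a few Elo needs thousands of games before it can be separated from noise.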

Don · Post by **Don** » Tue Jun 03, 2008 7:09 pm

bob wrote:I agree that problem sets have their uses. WAC at 1 sec / move is a good sanity test to make sure you have not wrecked your search / extensions after some sort of change.

However, for real measurement, that isn't enough. One thing we encountered was something like a simple rook on the 7th bonus. You can find several positions in books where the correct move is to plant your rook on the 7th rank. And you can tune your rook on 7th bonus to make it do that in those positions, and think "aha, this is better". But have you ever seen a program plant its rook on the 7th rank, with no opponent pawns on the 7th and the opponent king up in the center of the board somewhere? So that the rook is very ineffectual where it stands? Some eval terms are pretty long-range (this is one of them, but king safety in general is an even bigger one). And you need a long-term test (a game) to see if a bonus is too high or too small...

Only problem is the time required to play enough games to be statistically significant...

We played a few games in our hotel room with Joel Benjamin at one of the tournaments. To our embarrassment our program "planted" the rook on the 7th due to poorly conditioned rules for when to do that.

EVERY program of course is subject to approximate rules that don't quite capture the whole picture. King safety is one of the toughest: it's correct except when it isn't. And massive autotesting is the only way to really know if the good outweighs the bad.

I now avoid almost any general rule that is that specific. For instance I don't have a rook to the 7th bonus, period. It's a stupid rule. It's better to capture the spirit behind the rule and implement that instead, even though that will be imperfect too. As we all know, rooks on the 7th are good because they attack pawns which are totally impossible to defend with other pawns, and to a lesser extent because a rook on the 7th can confine the enemy king to the back rank.

I'm also taking out the castling bonus. That should be covered by king safety and rook development, not a blind and simple minded bonus to castle no matter what.

The classic example is the rule to never move your queen out early. Even though that's probably not a rule in any of our programs, it's a rule of thumb for beginners and it has many exceptions. Years ago a master explained to me that this rule often caused weaker players to make errors, because it was burned into their brains, and he proposed a better rule: try to move your queen out as early as you safely can. His thought was that a rule that DISCOURAGES development is counter-productive. Of course we understand the spirit behind such rules, but slavish devotion to them gets us into trouble.

Anyway, this is beside the point. I fully agree with what you are saying here about testing. It's the principle of specificity. If you want to see how much weight you can bench press, measuring how fast you can run a marathon isn't a very appropriate metric. Even though there may be a little correlation it's not going to be a very good indicator. (I doubt the world's strongest man is a world class marathoner.)

michiguel · Post by **michiguel** » Tue Jun 03, 2008 7:56 pm

chrisw wrote:
Don wrote:
chrisw wrote:
Don wrote:I noticed that some people have suggested that a "having the move bonus" is beneficial to their program. I believe that too and have always used one in my programs. How to set it correctly seems to be a bit of a black art, but I have a suggestion that I would like others to try and report back.

You should start with a large timing test set. I use 100 positions to measure the general effects of performance tuning on my program. This is not a "find the solution" set, it is just a set of random positions from games to measure general search speed, node counts, etc, and I don't care what move is returned.

The basic idea is to TIME your program using various stand pat bonuses. The theory is that the most correct values will produce the fastest searches because there will be less "turbulence" in move choices, odd/even scores, etc. All of these things hurt the search.

It's just a theory I admit, but I tried it and I get reasonable values. Since the values are reasonable and you are just guessing anyway, why not use the values that produce the fastest search?

My program is very new and has a very primitive evaluation function that is not aggressive at all about positional values - so I would thus expect the stand pat bonus needed to be relatively low. That's what I get here, a value of 0.10 works best of the ones I tried. Here are the times for my 100 position timing set at various stand pat bonus values:
BONUS   AVE TIME      AVE NODES
-----   --------      ---------
 0.00      5.312      3,443,479
 0.05      4.734      3,119,767
 0.10      4.247      2,780,342
 0.15      4.692      2,938,563
 0.20      4.684      2,901,375
 0.30      5.862      3,310,455

As you can see, 0.10 appears to be best for me right now. 0.15 and 0.20 are non-optimal and I put 0.30 to get one that is obviously wrong but not totally ridiculous.

I would be curious about the results others get with such a test.
With a good quiescence function it ought not to matter too much. With a lousy quiescence function it will matter a lot, but it will also vary massively depending on the position.

There's no reason why tuning on search time reduction is going to optimise for strength, is there? You might try tuning on either a huge test set, or on the speed of finding the move actually chosen by the winning side in a set of high-Elo games.

So, assuming you have good quiescence, I would be inclined to suggest also modifying your idea for the sidetomovebonus by pieces left on the board. KQRBN would get a different value from KRBN, etc.
Who uses a good quiescence function nowadays? The biggest development over the past few years is to make the quies fast and stupid, so if you are right, then it does matter. I don't know of any evaluation function that smooths out positional scores.

- Don
Well, a strong chess player uses a good quiescence function. As in search until position quiet and evaluate.

Anything less than that is a downwards compromise based on inability to create a quiescence function that approximates it, and traditionally excused on the basis that bean-counting gives results.

I find the sidetomove bonus a kind of nonsense forced on the beancounting program as a kludge to cover its inherently problematical design.

You've heard of the concept of the coiled spring in chess? Perhaps in relation to some variations of the Sicilian. Black is forced back onto defending some weakness. But if White actually tries to do something and overextends, then, suddenly, all Black's pieces uncoil forwards from their defensive positions and White is in trouble.

My guess is that sidetomove bonus in that kind of situation is probably counterproductive.

Fact is this: the advantage of having the move is either positive, negative or neutral depending on the situation. Period.

In other words, the perfect bonus is one that is adaptive. Sometimes positive, sometimes negative. Most eval terms are gambling terms. This is one of them. If you set it fixed to zero, you are actually wrong when it should be positive and wrong when it should be negative. You are gambling that zero will be the best compromise. So, technically, everybody has a sidetomove bonus, many of them set to 0. I really doubt that is the best number.

For instance, it is difficult to find positions in the opening in which it is convenient for the opponent to have the move.

Miguel
