The importance of contempt

BubbaTough · Post by **BubbaTough** » Mon Aug 29, 2011 5:41 am

Don wrote: The rating lists don't broadcast rating differences to the engines, but it's probably just as well. If BOTH programs use the appropriate contempt factor, I believe the effect would cancel out and you would get similar results to both programs setting it to zero. Setting to zero handicaps both the stronger and the weaker program so perhaps it comes out the same. I'm not sure they are handicapped equally, but I'll bet any difference is pretty small.

I disagree. If the only effect of contempt were changing the draw score, that would be true. But as has been a consistent theme in every post of mine on this thread, that is not the only use of contempt. Judging how drawish something is, and adjusting your score via contempt accordingly, is a very rich and interesting area (used by humans with great frequency, as you alluded to), and if rating lists were to support it I would guess more chess researchers would be looking into it as well.

-Sam

Don · Post by **Don** » Mon Aug 29, 2011 1:45 pm

I'm more interested in why you don't think they cancel. If both programs are implementing your concept of "rich contempt" and applying it fully to all drawish concepts, does the stronger program benefit more, or the weaker program? And what is the reason you think it's highly asymmetrical?

Don · Post by **Don** » Mon Aug 29, 2011 2:12 pm

rbarreira wrote: Very interesting. According to that test the optimal value is probably even smaller than -30.

The data suggests a formula for computing contempt that should work in Komodo - it may vary for other programs and I'm not sure it applies to all levels because draws are much more common at deeper levels than I am testing, but it's not clear to me that this should matter. It's also not clear how this applies to human play.

The formula that suggests itself is that you should set contempt to 1 centipawn per 10 ELO of difference. So the draw score would be -50 if your opponent were 500 ELO weaker. Presumably it works in reverse: +50 if your opponent is 500 ELO stronger.
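The rule of thumb above can be written down directly. This is a minimal sketch in Python, assuming the linear relationship holds over the whole range; the function name, the `cp_per_10_elo` parameter and the clamp `limit` are illustrative additions, not anything taken from Komodo.

```python
def draw_score_from_elo(elo_diff: float,
                        cp_per_10_elo: float = 1.0,
                        limit: int = 100) -> int:
    """Draw score in centipawns, from the engine's point of view.

    elo_diff is (our rating - opponent rating). A positive difference
    (weaker opponent) yields a negative draw score, i.e. draws are disliked.
    The clamp is an assumption added here so that a badly wrong rating
    estimate cannot produce an extreme draw score.
    """
    raw = -elo_diff / 10.0 * cp_per_10_elo
    clamped = max(-limit, min(limit, raw))
    return round(clamped)


if __name__ == "__main__":
    for diff in (500, 270, 0, -500):
        print(diff, draw_score_from_elo(diff))
```

For the 270 ELO gap used in the test, this gives -27, close to the best-performing setting of -30 in the table, although the test did not try values beyond -30.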

BubbaTough · Post by **BubbaTough** » Mon Aug 29, 2011 2:35 pm

Don wrote: I'm more interested in why you don't think they cancel. If both programs are implementing your concept of "rich contempt" and applying it fully to all drawish concepts, does the stronger program benefit more, or the weaker program? And what is the reason you think it's highly asymmetrical?

errr....I am assuming both sides do not implement "rich contempt" the same way. The odds of two programs independently choosing the same set of factors and weights to define how drawish a position is are pretty low. And even if they did, there is an interaction with the eval, so assuming the two programs have different evals the result is different. There is also an interaction with search, of course.

Judging the chance of a draw is not much less complicated than judging the chance of a win (the job of eval). It just so happens that most programmer effort to date has been dedicated to judging the chance of a win.

-Sam

bob · Post by **bob** » Mon Aug 29, 2011 3:33 pm

Don wrote:
bob wrote:
Don wrote: Komodo has a parameter which allows you to set the draw score - often known as the contempt factor. If Komodo has contempt it should be negative. The nomenclature "draw score" is based on a suggestion by Gian Carlo.

How important is it? In the Leiden tournament we were playing a significantly weaker opponent in an early round and nearly accepted a draw. The draw score was fixed at the time and was set at zero, a sensible value that assumes you don't know if your opponent is weaker or stronger. We watched helplessly as the result was out of our hands and Komodo seemed to be OK with a draw, feeling its position was just slightly inferior. As things turned out, Komodo's opponent avoided the draw, although there were some repetitions, just not 3-fold. A very close call.

As a result of this we immediately added a contempt factor to the program in order to avoid this situation in the future and also in anticipation of having to play Rybka and other strong programs in the same tournament so we could use a negative contempt (or positive draw score.)

But my question is "How important is contempt?" My feeling at the time was that it would be unlikely to make more than a couple of ELO difference. But I think it's more important than I had imagined. In Komodo the contempt factor is applied to repetition and stalemate, but not other draws.
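One detail that is easy to get wrong when a non-zero draw score is returned for repetition and stalemate is its sign. The sketch below assumes a negamax-style search in which every score is from the point of view of the side to move, and a draw score expressed from the engine's point of view; the function and parameter names are hypothetical and are not taken from Komodo.

```python
def draw_value(draw_score: int, side_to_move: str, engine_side: str) -> int:
    """Value to return from a negamax node that is a repetition or stalemate.

    draw_score is from the engine's point of view (negative means the engine
    dislikes draws). Negamax scores are from the side to move's point of
    view, so the sign is flipped when the opponent is to move.
    """
    if side_to_move == engine_side:
        return draw_score
    return -draw_score


if __name__ == "__main__":
    # Engine plays white with a draw score of -30 centipawns.
    print(draw_value(-30, "white", "white"))  # engine to move
    print(draw_value(-30, "black", "white"))  # opponent to move
```

With this convention the draw looks bad for the engine at every node, whichever side is to move there, which is the intended effect of contempt.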

In my study I am simulating a weaker opponent by using fixed nodes levels and handicapping this "weak" player. "weak" is about 270 ELO weaker and plays a gauntlet against "strong" Komodo versions using various contempt factors. An opponent who is 270 ELO weaker still represents a dangerous opponent who will score almost 20% against you. I will repeat this study with opponents who are less different and more different to get a sense of how to compute a contempt factor given an ELO difference. I think that can be computed with a straightforward formula which can be used to convert a chess program evaluation score to a winning percentage (or ELO number) and would have to be calibrated for each program.

The match is still in progress, but the early indications are that this is far more important than I had realized previously.

Has anyone else studied this issue in any serious detail? Here is the current ongoing result of my small study where I'm testing "drawscore" of 0, -10, -20 and -30.
Code:
Rank Name              Elo      +      -    games   score   oppo.   draws 
   1 kse-4290.00-30  3030.5   10.8   10.8    6831   85.7%  2733.1   15.9% 
   2 kse-4290.00-20  3016.5   10.5   10.5    6831   84.8%  2733.1   17.4% 
   3 kse-4290.00-10  3013.1   10.4   10.4    6832   84.6%  2733.1   18.1% 
   4 kse-4290.00-0   3000.0   10.2   10.2    6832   83.7%  2733.1   19.8% 
   5 weak            2733.1    5.2    5.2   27326   15.3%  3015.0   17.8% 

I've had a variable draw score in Crafty for about 15 years now. I first had to add the "rating" command to xboard, and get Tim to make it standard. I then used that for several things. Book learning is an obvious one, as the quality of your opponent dictates what you should "learn" by playing an opening. Draw score is another: increase it against stronger opponents, reduce it against weaker opponents. I do use draw-score for more than just repetitions and 50-move draws; I also use it in the eval when I notice drawish positions (like opposite colored bishops, where I drag the score toward draw-score as material comes off, etc.).
We do a similar thing with opposite color bishops and several other endings - we multiply the score by some constant less than 1.0 for endings we consider drawish. We also assign scores to certain endings and reduce others by constant amounts to more reflect their actual winning chances. We could in principle modify all of these things to account for a higher or lower draw score.
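The two approaches described here, multiplying by a constant below 1.0 and dragging the score toward the draw score, can be combined in one interpolation. The following is only a sketch under stated assumptions: both scores are in centipawns from the same side's point of view, and the function name and `scale` parameter are made up for illustration rather than taken from Crafty or Komodo.

```python
def scale_toward_draw(score: int, draw_score: int, scale: float) -> int:
    """Pull an evaluation toward the draw score in a drawish ending.

    scale = 1.0 leaves the score untouched; scale = 0.0 returns the draw
    score itself. With draw_score = 0 this reduces to plain multiplication
    by a constant less than 1.0.
    """
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must be between 0.0 and 1.0")
    return round(draw_score + (score - draw_score) * scale)


if __name__ == "__main__":
    # A +100 evaluation in an opposite-colored-bishop ending, halved.
    print(scale_toward_draw(100, 0, 0.5))    # neutral draw score
    print(scale_toward_draw(100, -30, 0.5))  # draw score of -30
```

Interpolating toward the draw score rather than toward zero means a dead-drawn ending converges on whatever value the search assigns to a repetition, so the two kinds of draw stay comparable.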

For my specific test case it looks like just changing the draw score to -30 for rep, stalemate (and 50 move draw) gains at least 25 ELO against an opponent 270 ELO weaker or so. I assume that it would have a similar effect if you use +30 against an opponent 270 ELO stronger, but I have not tested that yet. I have already tested against an equal opponent, and even -10 contempt against an equal opponent hurts by 5 or more ELO, so this is more than just academic.

I did some playing around with this a long while back, and there are some issues. Crafty used to have a "wider range" of draw scores. And in tuning on the cluster, what I found was that you can EASILY go too far and drop elo all over the place, because if you try too hard to avoid a draw, you can easily do so by losing instead.

In studying the problem a lot, I noticed one recurring theme that I did not have much luck in solving... Some positions are drawn because they are drawn. And to avoid the draw, your only choice is to accept a loss. Others are drawn only because you think the alternative is slightly worse, but there is still a lot of "play" in the position and if you are stronger than your opponent, avoiding the draw makes sense.

The problem is, recognizing the difference between the two cases is difficult, and if you just tune for max elo, you are tuning for whichever case is more common. And that is as much a function of opening books and opponents as anything else, so the common case for one opponent might be playable if you avoid the draw, while against another it might not. The ones that are particularly hard deal with locked up pawn structure where you make a pawn break to avoid the draw, but your opponent is better positioned to penetrate first and you just invited him in and gave him the game.

This can be very tricky.

Don · Post by **Don** » Mon Aug 29, 2011 4:14 pm

I agree with you. It's understood that sometimes you will win and sometimes you will lose by using a contempt factor as it is nothing more than a calculated gamble. The whole point is that you are willing to accept an inferior position to avoid a draw because of your belief that you can overcome a small handicap and win. Your point is well taken that this gamble is not the same for different kinds of positions.

My thought on this is that the draw score or contempt is much more useful in the early part of the game. This is pretty obvious. If you are 200 ELO stronger, you are NOT 200 ELO stronger if you start from a position with equal chances closer to the end of the game. The reason is that in the opening you have many more opportunities to assert your superiority, but in the ending there are few moves left. The better program does not play better moves EVERY time it makes a move, just once in a while.

We talked about this on the go mailing list a long time ago and why the range of strengths in GO is wider than in chess. If you assign ELO ratings to go players and the average player is rated 1500, you would see players with ratings closer to 4000, not 3000 as in chess. The reason might be partly because the games are longer and this gives even a slightly stronger player more of an opportunity to assert his superiority. A chess analogy is that if you are 100 ELO stronger you have only about a 64% chance of winning the game, but if you were to play a 10 game match your chances of winning the match are MUCH higher than 64%. If the match is long enough your chances of winning approach certainty.
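The 64% figure and the match argument can be checked with the standard Elo expected-score formula. The sketch below makes a strong simplifying assumption that is not in the post: every game is decisive and is won independently with probability equal to the expected score, so draws are ignored. The function names are illustrative.

```python
from math import comb


def expected_score(elo_diff: float) -> float:
    """Standard Elo expected score for a player who is elo_diff stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def match_win_probability(p_game: float, games: int) -> float:
    """Chance of winning strictly more than half of `games` games.

    Assumes independent, decisive games, each won with probability p_game.
    A tied match does not count as a win.
    """
    need = games // 2 + 1
    return sum(
        comb(games, k) * p_game ** k * (1.0 - p_game) ** (games - k)
        for k in range(need, games + 1)
    )


if __name__ == "__main__":
    p = expected_score(100)
    print(f"single game: {p:.3f}")
    for n in (10, 50, 100):
        print(f"{n:3d}-game match: {match_win_probability(p, n):.3f}")
```

Under these assumptions the match probability rises with the number of games and approaches 1 for long matches, which is the same effect attributed above to longer games.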

So for the same reason I think it's a mistake to use the same contempt factor in the ending that you would use in the middlegame. In the endgame, as you say, a draw is usually a draw but in the middlegame a draw by repetition can be giving up too soon when you are the heavy favorite to win (or conversely, missing an opportunity for a draw when you are the underdog.)

So one suggestion is to phase the contempt out with stage of game, either partially or fully.
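A minimal sketch of that suggestion follows, assuming the engine already has a game-phase measure normalised to the range 0.0 (bare endgame) to 1.0 (full material). The names and the `endgame_fraction` parameter are hypothetical; `endgame_fraction = 0.0` phases contempt out fully, and larger values only partially.

```python
def tapered_draw_score(base_draw_score: int,
                       phase: float,
                       endgame_fraction: float = 0.0) -> int:
    """Scale the draw score by game phase.

    phase: 1.0 = opening/middlegame with full material, 0.0 = bare endgame.
    endgame_fraction: share of the contempt that survives into the endgame.
    """
    phase = max(0.0, min(1.0, phase))
    weight = endgame_fraction + (1.0 - endgame_fraction) * phase
    return round(base_draw_score * weight)


if __name__ == "__main__":
    for phase in (1.0, 0.5, 0.0):
        print(phase, tapered_draw_score(-30, phase),
              tapered_draw_score(-30, phase, endgame_fraction=0.5))
```

A linear taper is the simplest choice; whether it is the right shape would have to be settled by testing, as with the draw score itself.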

BubbaTough · Post by **BubbaTough** » Mon Aug 29, 2011 4:46 pm

If you are calculating chance of draw and chance of win reasonably, it will automatically account for these issues. Imagine if, instead of tracking centipawns, you track % chance white win, % chance black win, and % chance draw. If you were to do that, a contempt factor that intelligently balances taking risks to avoid a draw vs. increasing chance of losing is not that hard.

Of course, the problem for most is that they try to blend win % and draw % into a single number and THEN do contempt, which causes all the problems you both are describing (in my opinion).

-Sam
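To make the idea concrete, here is a sketch of contempt applied to separate win/draw/loss probabilities rather than to a single blended score. Everything in it is an assumption for illustration: the `WDL` type, the `draw_value` parameter (how much a draw is worth on a 0 to 1 scale, 0.5 being neutral), and the probabilities in the example, which are invented and not measured.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WDL:
    """Win, draw and loss probabilities from the engine's point of view."""
    win: float
    draw: float
    loss: float


def utility(wdl: WDL, draw_value: float = 0.5) -> float:
    """Expected payoff when a win is worth 1, a loss 0, a draw draw_value."""
    return wdl.win + draw_value * wdl.draw


def pick(candidates: Mapping[str, WDL], draw_value: float) -> str:
    """Choose the candidate line with the highest expected payoff."""
    return max(candidates, key=lambda name: utility(candidates[name], draw_value))


if __name__ == "__main__":
    # Invented numbers: take the repetition, or play on in a slightly
    # worse but still lively position.
    lines = {
        "repetition": WDL(win=0.0, draw=1.0, loss=0.0),
        "play_on": WDL(win=0.30, draw=0.35, loss=0.35),
    }
    print(pick(lines, draw_value=0.5))  # neutral: a draw is half a point
    print(pick(lines, draw_value=0.4))  # contempt: a draw is worth less
```

Because the loss probability stays visible, lowering `draw_value` only makes the engine avoid the draw when the alternative keeps enough winning chances, which is the balance a single centipawn number cannot express.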

bob · Post by **bob** » Mon Aug 29, 2011 5:10 pm

Don wrote: So for the same reason I think it's a mistake to use the same contempt factor in the ending that you would use in the middlegame. In the endgame, as you say, a draw is usually a draw but in the middlegame a draw by repetition can be giving up too soon when you are the heavy favorite to win (or conversely, missing an opportunity for a draw when you are the underdog.)

So one suggestion is to phase the contempt out with stage of game, either partially or fully.

I have worked on this idea off and on for a while. In Crafty, I tend to drag the score toward draw under certain circumstances, like opposite bishops as an example. But there is more to it. If you know that your position is pretty dire, and you have some choices, you could (a) trade pawns but not pieces, which makes the endgame that much less winnable; or (b) lock pawns to make it hard for the opponent to create an endgame advantage. But deciding when to do those, as opposed to preventing them from happening, becomes a problem. A pretty difficult one. Every time I've had some "breakthrough idea" it was quickly relegated to "broken idea". Trying to assess "is this winning, or losing?" statically is a real challenge. That's why we need a q-search and extensions, in fact. Yet we are applying the knowledge in the eval, so we sort of have to depend on it. And I have not yet found a way to make this happen...

bob · Post by **bob** » Mon Aug 29, 2011 5:16 pm

BubbaTough wrote: If you are calculating chance of draw and chance of win reasonably, it will automatically account for these issues. Imagine if, instead of tracking centipawns, you track % chance white win, % chance black win, and % chance draw. If you were to do that, a contempt factor that intelligently balances taking risks to avoid a draw vs. increasing chance of losing is not that hard.

Of course, the problem for most is that they try to blend win % and draw % into a single number and THEN do contempt, which causes all the problems you both are describing (in my opinion).

-Sam

The other issue is that we are applying this inside the evaluation function, which means we only have static information to go on. Yet, by admission, our static evaluation only works well when the position is tactically quiet. I can produce enough games to accurately fit a curve to the static eval vs win/lose/draw results. But I am not sure that the endpoint scores are as reliable as those that get backed up thru the tree to the root, which is the score I would see...

Don · Post by **Don** » Mon Aug 29, 2011 5:22 pm

BubbaTough wrote: If you are calculating chance of draw and chance of win reasonably, it will automatically account for these issues. Imagine if, instead of tracking centipawns, you track % chance white win, % chance black win, and % chance draw. If you were to do that, a contempt factor that intelligently balances taking risks to avoid a draw vs. increasing chance of losing is not that hard.

Of course, the problem for most is that they try to blend win % and draw % into a single number and THEN do contempt, which causes all the problems you both are describing (in my opinion).

-Sam

I agree with you on this. How much do you think this affects the ELO rating of Komodo on the rating lists where contempt is not used?
