final drawscore testing results.

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: final drawscore testing results.

Post by bob »

Don wrote:
hgm wrote:When your pawn value is pretty much constant over game phase, I would expect the contempt would have to be scaled with game phase. If you are playing a real patzer, being 100cP down in the opening phase should not really make you go for a draw yet. Better sac the Pawn and crush the opponent with your Queen and Rooks. OTOH, trying to avoid a draw in KRPKRP might be a completely hopeless affair even against a patzer, and giving the Pawn to avoid a draw will be a very bad idea unless you know he cannot search deeper than 1 ply..
My intuition on this is that it should apply mostly to the early game and that "real" draws (such as stalemate, 50 move rule or late endgame repetitions) should be almost zero.

However I actually tested that (by phasing out most of the contempt as the game progresses) and it did not test as well.
I have that test queued up along with some others. So far, nothing has improved the idea I posted to start this thread. But there's still time. :)
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: final drawscore testing results.

Post by Laskos »

Don wrote:
hgm wrote:When your pawn value is pretty much constant over game phase, I would expect the contempt would have to be scaled with game phase. If you are playing a real patzer, being 100cP down in the opening phase should not really make you go for a draw yet. Better sac the Pawn and crush the opponent with your Queen and Rooks. OTOH, trying to avoid a draw in KRPKRP might be a completely hopeless affair even against a patzer, and giving the Pawn to avoid a draw will be a very bad idea unless you know he cannot search deeper than 1 ply..
My intuition on this is that it should apply mostly to the early game and that "real" draws (such as stalemate, 50 move rule or late endgame repetitions) should be almost zero.

However I actually tested that (by phasing out most of the contempt as the game progresses) and it did not test as well.
Isn't hgm's example misleading? +0.60 in the opening usually means more than +0.60 in the endgame, so on average the contempt in the endgame should be larger. Only then should positions of hgm's type be identified.
Besides that, the linear formula must be wrong. The correct curve is given by performance as a function of score and of the phase of the game. Rybka had a table of performance vs. score, but it did not depend on the phase.

Kai
hgm
Posts: 27837
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: final drawscore testing results.

Post by hgm »

Well, I make the explicit caveat that this applies to pure centipawn values. Then +0.6 in the opening would mean exactly the same as +0.6 in the end-game, namely slightly more than half a pawn.

The point is that you expect to cash in on the rating difference during the game. But if the game is already mostly played, you will get less opportunity for that. Even Chad's Chess will have no difficulty drawing in KRKR against Houdini, despite the 3000-Elo rating difference. So the amount of material/score you can realistically hope to gain back during the game decreases as the game develops to its end, and the contempt factor thus should do likewise.
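
A minimal sketch of the phase-scaled contempt described above, not code from any engine in this thread. The piece values, the `MAX_NON_PAWN_MATERIAL` constant, the linear scaling and all function names are assumptions made for illustration. Note that Don reports earlier in the thread that phasing out the contempt did not test as well for him.

```python
# Hypothetical piece values in centipawns: Q=900, R=500, B=N=300.
# Non-pawn material for both sides at the start: 2 * (900 + 1000 + 600 + 600).
MAX_NON_PAWN_MATERIAL = 6200


def game_phase(non_pawn_material: int) -> float:
    """Return 1.0 with all pieces on the board and 0.0 with none left."""
    ratio = non_pawn_material / MAX_NON_PAWN_MATERIAL
    return max(0.0, min(1.0, ratio))


def scaled_contempt(base_contempt_cp: int, non_pawn_material: int) -> int:
    """Shrink the contempt linearly as material comes off the board."""
    return round(base_contempt_cp * game_phase(non_pawn_material))


def draw_score(base_contempt_cp: int, non_pawn_material: int,
               engine_to_move: bool) -> int:
    """Score for a draw from the side to move's point of view.

    Assumed convention: a draw counts as slightly bad for the engine
    (the presumed stronger side) and slightly good for its opponent.
    """
    contempt = scaled_contempt(base_contempt_cp, non_pawn_material)
    return -contempt if engine_to_move else contempt
```

With a base contempt of 50 cP this gives a draw score of -50 in the opening position and about -8 in KRKR (1000 cP of non-pawn material under the assumed values), so the engine gives up very little to avoid a draw once the game is mostly played.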
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: final drawscore testing results.

Post by Don »

Laskos wrote:
Don wrote:
hgm wrote:When your pawn value is pretty much constant over game phase, I would expect the contempt would have to be scaled with game phase. If you are playing a real patzer, being 100cP down in the opening phase should not really make you go for a draw yet. Better sac the Pawn and crush the opponent with your Queen and Rooks. OTOH, trying to avoid a draw in KRPKRP might be a completely hopeless affair even against a patzer, and giving the Pawn to avoid a draw will be a very bad idea unless you know he cannot search deeper than 1 ply..
My intuition on this is that it should apply mostly to the early game and that "real" draws (such as stalemate, 50 move rule or late endgame repetitions) should be almost zero.

However I actually tested that (by phasing out most of the contempt as the game progresses) and it did not test as well.
Isn't hgm example misleading, and +0.60 in the opening usually means more than +0.60 in the endgame, therefore, on average, contempt in the endgame should be larger? Only then hgm type of positions should be identified
Besides that, the linear formula must be wrong. The correct curve is given by performance as a function of score and of the phase of the game. Rybka had a table of performance vs score, but not on the phase.

Kai
I see your point. You can convert a score to a winning expectancy with a simple formula, but for Komodo the scale is a bit different depending on the stage of the game.

The problem I wanted to address with contempt was more about coming out of book and settling on a draw against a weaker opponent shortly thereafter, but I now believe that the principle is much more general and applies equally everywhere - with consideration for the issue you raise of course.
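
One common way to turn a score into a winning expectancy is a logistic curve. The sketch below assumes that form; the post does not say which formula Komodo uses, and the function name and the `scale` values are hypothetical, chosen only to show how a phase-dependent scale changes the result.

```python
def win_expectancy(score_pawns: float, scale: float = 4.0) -> float:
    """Map a score in pawns to an expected result between 0 and 1.

    Assumed logistic form: a score equal to `scale` pawns gives
    roughly a 91% expectancy. `scale` is a hypothetical tuning value.
    """
    return 1.0 / (1.0 + 10.0 ** (-score_pawns / scale))


# The same +0.60 score under two hypothetical phase-dependent scales.
steep_phase = win_expectancy(0.60, scale=3.0)
flat_phase = win_expectancy(0.60, scale=5.0)
print(f"steep: {steep_phase:.3f}  flat: {flat_phase:.3f}")
```

A smaller scale makes the curve steeper, so the same score corresponds to a higher expectancy. Which phase should get the steeper curve is exactly the question being argued here, so the sketch leaves it as a parameter.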
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: final drawscore testing results.

Post by Laskos »

hgm wrote:Well, I make the explicit caveat that this applies to pure centipawn values. Then +0.6 in the opening would mean exactly the same as +0.6 in the end-game, namely slightly more than half a pawn.

The point is that you expect to cash in on the rating difference during the game. But if the game is already mostly played, you will get less opportunity for that. Even Chad's Chess will have no difficulty drawing in KRKR against Houdini, despite the 3000-Elo rating difference. So the amount of material/score you can realistically hope to gain back during the game decreases as the game develops to its end, and the contempt factor thus should do likewise.
I don't quite understand what your +0.60 means. Is the performance for this +0.60 the same in all phases? How do you achieve that? In this impractical case I somewhat agree, though with less material the endgame could be longer, say 40-80 moves, and could still have a great impact on the outcome. In KRKR a large contempt wouldn't hurt, but KRPKRP is problematic. Imagine normal 10-12 men endgames. With some practical, say Fruit-like, plots of performance vs. (score + phase), my guess is that contempt would not vanish in any case. Quite the opposite: it would have to increase towards the endgame until the engines reach your kind of positions or tablebase positions, where the contempt has to be disabled.

Kai
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: final drawscore testing results.

Post by BubbaTough »

For optimal engine performance, +0.6 should have the same expected return (by expected return I mean win % * 1 + draw % * 0.5, assuming no contempt) in the opening as in the endgame. That way, the engine can make intelligent decisions about whether or not to trade into endgames. I am not claiming anything about whether Komodo (or any other engine) does this perfectly, but if it does not, the ideal solution is not scaling contempt but fixing the eval.

-Sam
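
The expected return defined above (win % * 1 + draw % * 0.5) can be computed from game counts as in this small sketch. The function name and the sample counts are made up for illustration and are not test results from this thread.

```python
def expected_return(wins: int, draws: int, losses: int) -> float:
    """Expected return: a win counts 1, a draw 0.5, a loss 0."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("need at least one game")
    return (wins + 0.5 * draws) / games


# Hypothetical counts for games that passed through a +0.6 position,
# split by the phase in which that score occurred.
opening_result = expected_return(wins=50, draws=30, losses=20)
endgame_result = expected_return(wins=35, draws=55, losses=10)
print(opening_result, endgame_result)
```

If the two numbers differ noticeably for the same score, the evaluation is not phase-consistent in the sense described above.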
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: final drawscore testing results.

Post by Laskos »

BubbaTough wrote:For optimal engine performance +0.6 should have the same expected return (by expected return I mean win % * 1 + draw % * 0.5 assuming no contempt) in the opening as in the endgame. That way, the engine can make intelligent decisions about whether to trade into endgames or not. I am not claiming that Komodo (or any other engine) does this perfectly or not, but if not, the solution ideally is not scaling contempt but fixing the eval.

-Sam
That is correct in the case of a sharp transition from one phase to another. In the case of an almost continuous "flow", the requirement is that the score be a monotonic mapping: say, a +0.60 score in a 26-man opening is mapped uniquely onto an _average_ score of +1.10 in a 12-man endgame. These different scores in these phases have the same final performance (expected return, in your words). Then +0.70 is mapped onto an average of +1.70, for example, the map being non-linear. The mapping is such as to be more stable at scores closer to 0.00. Theoretically one can invert the monotonic function, adjusting the score so that it is equal in all phases for equal performance, but that is unnecessary and I have never really seen it done.

Kai
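
A rough sketch of such a monotonic mapping and its inverse, using piecewise-linear interpolation. Only the pairs (+0.60, +1.10) and (+0.70, +1.70) come from the post above; the (0.00, 0.00) anchor, the interpolation scheme and all names are assumptions for illustration. A real curve would have to be fitted to measured performance data.

```python
from bisect import bisect_right

# Opening score -> average endgame score with equal performance.
# The (0.0, 0.0) anchor is an assumption; the other two pairs are
# the examples given in the post.
OPENING_SCORES = [0.0, 0.60, 0.70]
ENDGAME_SCORES = [0.0, 1.10, 1.70]


def interpolate(x: float, xs: list, ys: list) -> float:
    """Piecewise-linear interpolation over strictly increasing xs."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("score outside the tabulated range")
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def opening_to_endgame(score: float) -> float:
    """Endgame score with the same performance as this opening score."""
    return interpolate(score, OPENING_SCORES, ENDGAME_SCORES)


def endgame_to_opening(score: float) -> float:
    """Inverse map; it exists because both columns strictly increase."""
    return interpolate(score, ENDGAME_SCORES, OPENING_SCORES)
```

Because both columns are strictly increasing, swapping them gives the inverse map, which is the adjustment described in the last sentence of the post: rescaling scores so that equal scores mean equal performance in every phase.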