Contempt and the ELO model.

Michel · Post by **Michel** » Thu Sep 05, 2013 8:22 am

Standard contempt (changing the draw score) is supposed to make an engine play better against weaker engines and worse against stronger engines.

If(!) this is true then it is clearly impossible within the standard logistic elo model.

Question: is there a mathematically nice way of upgrading the elo model so that it can incorporate contempt like behaviour?

I assume it would involve introducing at least one additional engine parameter besides elo. Perhaps "aggressiveness"?

hgm · Post by **hgm** » Thu Sep 05, 2013 9:17 am

Setting contempt would simply decrease the DrawElo parameter of the model, right? You would get fewer draws, and force them to become either wins or losses. What they become depends (statistically) on the Elo difference with your opponent; you are not going to play any better just because you refuse to take draws.

To use this in an analysis you should allow a two-parameter description of players (strength, predisposition for contempt), where the DrawElo should be a function of the rating difference and predisposition for contempt of both players.

Uri Blass · Post by **Uri Blass** » Thu Sep 05, 2013 9:44 am

hgm wrote:Setting contempt would simply decrease the DrawElo parameter of the model, right? You would get fewer draws, and force them to become either wins or losses. What they become depends (statistically) on the Elo difference with your opponent; you are not going to play any better just because you refuse to take draws.

To use this in an analysis you should allow a two-parameter description of players (strength, predisposition for contempt), where the DrawElo should be a function of the rating difference and predisposition for contempt of both players.

I think that things are not so simple.

I can imagine that contempt that is too high may increase the number of draws against some weaker opponent.

Imagine that you play against some opponent and you need to choose between move A and move B.

After move A the opponent can force a draw, and you see it, but the opponent either does not see the draw or evaluates some alternative as better than a draw.
After move B the opponent has an advantage of 0.5 pawn but no forced draw that you can see.

With very high contempt you may prefer move B, then have to fight for a draw and end up with only a draw, when in practice you could have won with move A, because after move A the opponent would not force the draw but play a losing move.
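
The choice between A and B can be sketched as follows, assuming contempt is implemented by scoring a drawn position as minus the contempt value, with scores in centipawns from the engine's own point of view. The names `draw_score` and `preferred_move` are hypothetical, and the 50 centipawns for move B come from the 0.5 pawn in the example.

```c
/* With contempt, a position the engine considers drawn is scored as
 * -contempt_cp instead of 0 (assumed implementation of contempt). */
static int draw_score(int contempt_cp) {
    return -contempt_cp;
}

/* Returns 'A' or 'B' for the example above:
 *   move A: the opponent can force a draw, so it is scored as a draw;
 *   move B: the opponent is better by 50 centipawns, no forced draw seen. */
static char preferred_move(int contempt_cp) {
    int score_a = draw_score(contempt_cp);
    int score_b = -50;
    return (score_a >= score_b) ? 'A' : 'B';
}
```

With a contempt of 0 or 20 centipawns this sketch still picks A; once contempt exceeds 50 centipawns it switches to B, even though A might have won in practice.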

Michel · Post by **Michel** » Thu Sep 05, 2013 9:48 am

I like the suggestion but I would like to see a model...

How would you model "predisposition for contempt"? And how would DrawElo be computed from it?

DrawElo seems to depend on absolute elo and not just on elo difference. Of course this can be explained by the fact that the parameters (elo, "predisposition for contempt") may be somewhat correlated, with weaker engines generally having a higher "predisposition for contempt".

Another issue is that DrawElo is bigger for self play than for foreign play. But this is unmodelable with engine parameters I think. It would require instead a "similarity" parameter for pairs of engines.

My suggestion did not use DrawElo but instead an extra parameter "aggressiveness" (agg). The effective elo difference for a match between two engines would be (ignoring black/white):

agg1*agg2*(elo2-elo1)

Note that this increases the gauge group from a 1 dimensional group (translations) to a two dimensional group (translations and dilations). So "gauge fixing" (standard candles if you want) becomes an issue.

Kirill Kryukov · Post by **Kirill Kryukov** » Thu Sep 05, 2013 10:23 am

Michel wrote:Standard contempt (changing the draw score) is supposed to make an engine play better against weaker engines and worse against stronger engines.

If(!) this is true then it is clearly impossible within the standard logistic elo model.

Question: is there a mathematically nice way of upgrading the elo model so that it can incorporate contempt like behaviour?

I assume it would involve introducing at least one additional engine parameter besides elo. Perhaps "aggressiveness"?

This problem is one of the reasons why I am estimating performance slope in addition to ratings in KCEC list ("Perf. slope" column). I assume each engine gains or loses rating depending on difference from opponent, and that this value depends linearly on rating difference.

Some engines have significant slope, for example Requiem or Simontacchi (you can see the scatterplots of performances at those links). This slope only sometimes results from contempt. More often it's a consequence of severe bugs.

The easiest way to address the problem may be to first estimate the ratings normally, and then correct the ratings using the performance slope and average opponent for each engine. This won't be very accurate; a better way would be to incorporate these extra parameters into the model. Another additional parameter may be drawness. So each engine would be described by three numbers (rating, perf. slope, drawness) instead of just a rating.

I haven't completed the method yet, so for now it's just some observations.
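
One way such a slope could be estimated is an ordinary least-squares fit of rating gained or lost against rating difference. The sketch below is only an illustration under that assumption; the exact definition used for the KCEC "Perf. slope" column is not given in this thread, and `perf_slope` is a hypothetical name.

```c
#include <stddef.h>

/* Ordinary least-squares slope of y against x.
 * x[i]: rating difference to opponent i (opponent minus engine),
 * y[i]: rating gained or lost against opponent i
 *       (performance against that opponent minus the engine's rating).
 * Returns 0 when the slope is undefined (fewer than 2 points, or all
 * x values equal). */
static double perf_slope(const double *x, const double *y, size_t n) {
    double mx = 0.0, my = 0.0, sxx = 0.0, sxy = 0.0;
    size_t i;

    if (n < 2) {
        return 0.0;
    }
    for (i = 0; i < n; i++) {
        mx += x[i];
        my += y[i];
    }
    mx /= (double)n;
    my /= (double)n;
    for (i = 0; i < n; i++) {
        sxx += (x[i] - mx) * (x[i] - mx);
        sxy += (x[i] - mx) * (y[i] - my);
    }
    return (sxx > 0.0) ? sxy / sxx : 0.0;
}
```

A slope near zero means the engine performs at its rating regardless of opposition, which is what the plain Elo model assumes; a clearly non-zero slope is the kind of deviation discussed above.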

Michel · Post by **Michel** » Thu Sep 05, 2013 10:39 am

Kirill Kryukov wrote:This problem is one of the reasons why I am estimating performance slope in addition to ratings in KCEC list ("Perf. slope" column). I assume each engine gains or loses rating depending on difference from opponent, and that this value depends linearly on rating difference.

Hey that is very nice data! I will look closer at it.

It seems "Perf. slope" may be somewhat similar to the "aggressiveness" parameter I was proposing.

Modern Times · Post by **Modern Times** » Thu Sep 05, 2013 10:53 am

Michel wrote:
Kirill Kryukov wrote:This problem is one of the reasons why I am estimating performance slope in addition to ratings in KCEC list ("Perf. slope" column). I assume each engine gains or loses rating depending on difference from opponent, and that this value depends linearly on rating difference.

Hey that is very nice data! I will look closer at it.

It seems "Perf. slope" may be somewhat similar to the "aggressiveness" parameter I was proposing.

Yes - Kirill's site is superb. A hidden gem that many people are not aware of.

Daniel Shawul · Post by **Daniel Shawul** » Thu Sep 05, 2013 12:22 pm

I did something similar in Bopo, but it is not the same as contempt. Instead of a fixed 'Drawelo' and 'Homeadvantage', I have additional slope parameters for both that vary with the average elo of the players (not their difference). Then I used CG to optimize the likelihood function, which takes much longer than MM. This was found to give much better results, but Remi didn't like the two additional parameters and the slow computation.

As you pointed out, Drawelo is probably not the same as contempt, as it depends on average elo. You can have one additional parameter, contempt, for each player and scale the elo differences between players up or down based on it, as you did in your formula. It should be easy to incorporate it in bopo and solve with CG as follows, but it would require a parameter for each player, unlike the shared eloDraw and eloHomeAdvantage params.

Code:

static double win_prob(double pelo,double oelo) {
	double eloDelta = pelo - oelo; /****** change this to agg1*agg2*(pelo-oelo)*********/
	//double avg = (elo_to_gamma(pelo) + elo_to_gamma(oelo)) / 2; 
	double eloD = eloDraw;//eloDrawSlope * avg + eloDraw;
	double eloH = eloHome;//eloHomeSlope * avg + eloHome;
	if(eloModel == 0) {
		return logistic(-eloDelta - eloH + eloD);
	} else if(eloModel == 1) {
		double thetaD = elo_to_gamma(eloD);
		double f = thetaD * sqrt(logistic(eloDelta + eloH) * logistic(-eloDelta - eloH));
		return logistic(-eloDelta - eloH) / (1 + f);
		/*
		double g1 = elo_to_gamma(pelo);
		double g2 = elo_to_gamma(oelo);
		double th = elo_to_gamma(eloH);
		double td = elo_to_gamma(eloD);
		return th * g1 / (th * g1 + g2 + td * pow(th * g1 * g2,DVPWR));
		*/
	} else {
		return gaussian(-eloDelta - eloH + eloD);
	}
}

michiguel · Post by **michiguel** » Thu Sep 05, 2013 1:22 pm

Michel wrote:Standard contempt (changing the draw score) is supposed to make an engine play better against weaker engines and worse against stronger engines.

If(!) this is true then it is clearly impossible within the standard logistic elo model.

Question: is there a mathematically nice way of upgrading the elo model so that it can incorporate contempt like behaviour?

I assume it would involve introducing at least one additional engine parameter besides elo. Perhaps "aggressiveness"?

In Ordo's conceptual model, every match is played at the same temperature. In your case, there are players who play "hot" (win more against stronger opposition but lose more against weaker ones) or "cold" (the opposite). Temperature is part of the Boltzmann Beta (beta = 1/(k*T)), which ends up being the beta of the logistic equation. So, each player should have an extra parameter that modifies the particular beta for that particular match.

I do not think it will give significant improvement and could overfit the data. But it will be interesting to explore.

Miguel

Michel · Post by **Michel** » Thu Sep 05, 2013 4:02 pm

Daniel Shawul wrote:It should be easy to incorporate it in bopo and solve with CG as follows but it would require a parameter for each player unlike the shared eloDraw and eloHomeAdvantage params.

What is CG? Probably obvious but I feel a bit stupid right now...

Having an extra parameter is a nuisance but unavoidable, I think.

It would be nice if the model would quantify that certain engines play "drawish" or "aggressive".
