Contempt and the ELO model.

Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Contempt and the ELO model.

Post by Daniel Shawul »

Michel wrote:
Why don't we just do the following? I guess the reason we needed to incorporate 'contempt' into ratings is to figure out the 'real' performance of an engine against equally ranked opponents. As it is, Houdini's rating is correct for the given conditions, so there is nothing to correct, but now we have moved our interest to ratings against equally ranked opponents. So instead of trying to find out the contempt of each player, why not use one parameter for all players to weigh good scores against close opponents as more important? Let's call it 'contempt' again for lack of a better term. Then taking out the bottom 100 opponents brings down Houdini's rating while increasing the Stockfish/Komodo ratings, and so on. So the new contempt parameter we introduced describes our interest in what kind of rating we want to see. This is much easier to program and achieves the same goal as far as I understand...
Well this is not my motivation... I would like to resolve the incompatibility of contempt with the elo model by suitably augmenting the elo model.

What you are proposing is different: it is to change the standard maximum likelihood estimator to one which is _less sensitive_ to contempt settings in engines (if I understand correctly you are proposing some kind of weighted maximum likelihood estimator).
Let me repeat what I said elsewhere in this thread.
Also I am not clear about the discussion on contempt here. Is it something inherent that changes the rating calculation, or simply a description of our wish to see a certain kind of rating? I am more inclined to the latter now...
It is not clear to me whether the rating should be changed at all based on the contempt used by a certain engine, i.e. being aggressive against weak engines. That is the style of the engine, so why should it affect its rating? Going down this road would also imply changing the elo model for every other engine style that scores more against certain kinds of engines. I am of the opinion that this is simply an expression of our wish to see a certain kind of rating list, maybe something that is more useful for a small pool of roughly equal opponents etc. AFAIK your goal is the same as mine, i.e. to bring down the inflated ratings of engines with high contempt. If not, please explain your statement below, because I do not see any inconsistency, just our wish to produce a certain kind of rating, like the post-processing demonstrated by the two-step process I proposed earlier ...
Well this is not my motivation... I would like to resolve the incompatibility of contempt with the elo model by suitably augmenting the elo model.
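
To make the "weighted maximum likelihood estimator" reading above concrete, here is a minimal Python sketch. It assumes the standard logistic elo curve, counts a draw as half a win and half a loss, and takes the per-game weights as fixed inputs (for example computed from a first, unweighted pass, in the spirit of the two-step process mentioned above). The names expected_score and fit_elo and all sample numbers are made up for illustration; this is not the actual code of either poster.

```python
def expected_score(elo, opp_elo):
    """Logistic elo expectation of a player rated `elo` against `opp_elo`."""
    return 1.0 / (1.0 + 10.0 ** ((opp_elo - elo) / 400.0))


def fit_elo(games, weights, lo=0.0, hi=6000.0, iterations=100):
    """Weighted maximum likelihood rating of one player (illustrative only).

    games   : list of (opponent_elo, score) with score in {0, 0.5, 1}
    weights : one fixed, non-negative weight per game (an assumption:
              the weights must not depend on the rating being fitted)

    The derivative of the weighted log-likelihood is proportional to
    sum(w * (score - expected)), which decreases as elo grows, so a
    simple bisection on its sign finds the maximum.
    """
    def gradient(elo):
        return sum(w * (s - expected_score(elo, opp))
                   for (opp, s), w in zip(games, weights))

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gradient(mid) > 0.0:
            lo = mid   # rating still too low to explain the scores
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data: 1-1 against an equal opponent, 4-0 against a weak one.
games = [(3000, 1), (3000, 0), (2600, 1), (2600, 1), (2600, 1), (2600, 1)]
uniform = [1.0] * len(games)
downweighted = [1.0, 1.0, 0.1, 0.1, 0.1, 0.1]  # weak-opponent games count less

print("standard ML rating:", round(fit_elo(games, uniform), 1))
print("weighted ML rating:", round(fit_elo(games, downweighted), 1))
```

With these made-up numbers the weighted fit comes out lower than the standard one, because the clean sweep against the weak opponent contributes less. The weights have to be fixed before fitting: if they shrank as the fitted rating moved away from the opponents, the weighted likelihood could be pushed up simply by ignoring every game.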
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Contempt and the ELO model.

Post by Michel »

I do not want to _change_ the elo model but to _augment_ it with an extra parameter modeling contempt (a characteristic which as I demonstrated cannot be captured by the standard elo model).

There might be other engine characteristics which are not covered by the elo model, but contempt, if it works as advertised, would certainly be a very generic one.
Daniel Shawul

Re: Contempt and the ELO model.

Post by Daniel Shawul »

Michel wrote: I do not want to _change_ the elo model but to _augment_ it with an extra parameter modeling contempt (a characteristic which as I demonstrated cannot be captured by the standard elo model).

There might be other engine characteristics which are not covered by the elo model, but contempt, if it works as advertised, would certainly be a very generic one.
You are being so cryptic. Please use more words to explain your final goal, like I did. I am sure no one will miss what I am trying to convey, but you focus on finding trivial typos, which is annoying. For example, you missed what I meant by agg1 & agg2 in my previous post, or by 'changing' the elo model, by which I meant producing different ratings, etc. So please take your time and explain how your proposal affects the rating of Houdini, for instance, given that it allegedly uses more contempt than Stockfish. Because right now I do not understand your goal at all ...
Michel

Re: Contempt and the ELO model.

Post by Michel »

@Daniel

I really don't see how I could explain more. I am proposing a refined elo model. You are proposing a more robust elo estimator. What is there to explain?

[[ A robust estimator is an estimator which is insensitive to small modeling errors. See http://en.wikipedia.org/wiki/Robust_statistics ]]

Anyway, I noticed that my proposal has a 3-dimensional symmetry group (instead of the 2-dimensional one I claimed earlier). The extra symmetries are quite unintuitive, like

(elo,agg)--->(-1/elo,elo*agg)

So unless there is a good rule to pick one value out of a 3-dimensional family of optimal values (= maxima of the likelihood function), the proposal cannot really be used.
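
The point about symmetries and non-unique maxima can be illustrated on a simpler case. The augmented model is not spelled out in this part of the thread, so the sketch below uses the standard logistic elo model instead, which has a well-known one-parameter symmetry: adding the same constant to every rating leaves the likelihood unchanged, and rating lists pick one representative by a convention such as fixing the average. The function names, engine labels and results are hypothetical.

```python
import math


def expected_score(elo_a, elo_b):
    """Standard logistic elo expectation for A against B."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


def log_likelihood(ratings, games):
    """games: list of (player_a, player_b, score_of_a), score in {0, 0.5, 1}.

    A draw is counted as half a win and half a loss (a simplification).
    """
    total = 0.0
    for a, b, score in games:
        p = expected_score(ratings[a], ratings[b])
        total += score * math.log(p) + (1.0 - score) * math.log(1.0 - p)
    return total


def shift(ratings, offset):
    """The symmetry: move every rating by the same offset."""
    return {name: r + offset for name, r in ratings.items()}


def anchor_mean(ratings, target):
    """One possible selection rule: fix the average rating to `target`."""
    mean = sum(ratings.values()) / len(ratings)
    return shift(ratings, target - mean)


# Hypothetical ratings and results for three engines.
ratings = {"A": 3000.0, "B": 2900.0, "C": 2750.0}
games = [("A", "B", 1.0), ("A", "B", 0.5), ("B", "C", 1.0),
         ("A", "C", 1.0), ("C", "B", 0.5)]

# Same likelihood before and after the shift: the data cannot tell them apart.
print(log_likelihood(ratings, games))
print(log_likelihood(shift(ratings, 123.0), games))
print(anchor_mean(ratings, 2800.0))
```

For the standard model the selection rule is easy to agree on. With a 3-dimensional family that includes transformations like the one quoted above, an equally natural convention is much less obvious, which is the difficulty being described.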
Daniel Shawul

Re: Contempt and the ELO model.

Post by Daniel Shawul »

We have to avoid all technicalities of how to achieve a certain goal, and first state clearly what the goal is, with an example if possible. It should be easily understandable by anyone here and not be obscured by technical jargon. So your next post better not be a link to wiki :) but a bullet style answer to the following questions I pose once more:

a) Given Houdini's use of contempt in CEGT, what does your robust estimator do to the relative ratings of Houdini/Stockfish/Komodo afterwards?

b) Why would ratings be changed at all depending on style? This is a fundamental question. If we go down this road, it means we have to do the same for engines that have a certain playing style that allows them to score more against a certain class of engines. Just because contempt had an effect that allows one to score more against weaker engines, we felt the need to "correct" the ratings so that we weight scores against equally rated engines more, which I did in a straightforward manner. In fact, reviewing the thread, Kirilly seems to have stated something along those lines.

c) Your robust estimator reduces the significance of outlier scores (scores against low-rated engines, AFAIK). If that is the case, then our goals are the same, aren't they? The importance factor does something similar when a Gaussian distribution is used. Right now, results against all engines are considered equally important. With a window, all outliers, defined by say abs(elo1-elo2) > 100, are weighed less than the results that fall inside the window.
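
The two weighting schemes mentioned in (c) can be written down as small functions of the rating difference. This is only a sketch: the Gaussian width of 100 elo, the weight 0.5 given to games outside the window, and the function names are assumptions chosen for illustration, since the post only says that outliers are "weighed less".

```python
import math


def gaussian_weight(elo_diff, sigma=100.0):
    """Importance of a game that falls off smoothly with the rating gap.

    sigma (in elo) is an assumed width, not a value from the thread.
    """
    return math.exp(-0.5 * (elo_diff / sigma) ** 2)


def window_weight(elo_diff, width=100.0, outside=0.5):
    """Full weight inside the window, a reduced weight outside it.

    `outside` is a hypothetical value; 0.0 would drop outliers entirely.
    """
    return 1.0 if abs(elo_diff) <= width else outside


# Compare both schemes for a few rating gaps (elo1 - elo2).
for diff in (0, 50, 100, 200, 400):
    print(diff, round(gaussian_weight(diff), 3), window_weight(diff))
```

The Gaussian version avoids the hard cut at the window edge, so a game at a gap of 101 elo is treated almost the same as one at 99. Either function yields per-game weights that could be plugged into a weighted rating fit.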