1 draw=1 win + 1 loss (always!)

lkaufman · Post by **lkaufman** » Sat Sep 21, 2013 5:20 pm

Daniel Shawul wrote:
lkaufman wrote:
Daniel Shawul wrote:What's new here? To be more precise, it is only the Rao-Kupper draw model that says 1 draw = 1 win + 1 loss.
Probably Larry Kaufman would prefer to see

P(draw)**2=c*P(win)*P(loss)
I don't see how you arrived at this from the previous relation. This, I am afraid, is a different draw model, named Davidson, not the one currently used by bayeselo. In fact it says 2 draws = 1 win + 1 loss. There is a third model that says 1.5 draws = 1 win + 1 loss. More from http://www.grappa.univ-lille3.fr/~coulo ... tcomes.pdf
Well it's nice that there are so many draw models to choose from. But has there been any work on determining which one actually fits data from the game of chess?
Yes this one.
Clearly the Davidson model is the one that fits the way tournaments are scored and Elo ratings are calculated, so the burden of proof lies with anyone claiming that one of the other models is superior.
'Clearly' you don't know what you are talking about, which I pointed out before when you tried to tell us how ordo is better than bayeselo (but in reality you made a mistake with understanding scale). You should give citations as to why 2 * 0 = 1 + -1 (Davidson) is 'clearly' better than 0 = 1 + -1 (Rao-Kupper). In fact most use the Glenn-David model 1.5 * 0 = 1 + -1, such as Microsoft's rating software. There is even a work that says Davidson is worse for _human_ ratings, so you as a human GM should 'clearly' have knowledge of such things. You can find such statements here. "Joe, H. (1990). Extended use of paired comparison models, with applications to chess rankings. Journal of the Royal Statistical Society, 39(1):85–93." I think you should let your pal Glickman judge the merit of such works, because 'clearly' you are not qualified, and 'clearly' he is.

The Davidson model corresponds to the way tournaments are scored (one win and one loss equals two draws). The main human over-the-board rating systems (FIDE, USCF, and most other national rating systems) consider only score and ratings, not how the score was split up among wins and draws, so they also assume Davidson. So my statement is correct and doesn't need further proof. You mention Microsoft; what do they know about chess? What federation uses their software?
I didn't make any claim as to which model fits the data better, I asked for info on this, which I thank you for providing. So it seems from your comments that the model in the middle, Glenn-David, best fits chess data. If so, I would still argue that Ordo is superior to Bayeselo because although both are "wrong" by being equally far off from the "correct" Glenn-David model, at least Ordo corresponds to the way tournaments are scored and FIDE ratings calculated.
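
For readers keeping track of the model names, the two closed-form relations quoted above can be checked numerically. This is only a minimal sketch: it assumes the usual Bradley-Terry strength 10^(Elo/400), and the function names and the parameter values `theta=1.5` and `nu=1.0` are illustrative choices, not taken from bayeselo or Ordo. Glenn-David is not covered.

```python
import math

def strength(elo):
    """Bradley-Terry strength on the 400-point logistic Elo scale (assumed)."""
    return 10.0 ** (elo / 400.0)

def rao_kupper(elo_a, elo_b, theta=1.5):
    """Return (P(win), P(draw), P(loss)) for A; theta > 1 makes draws possible."""
    ga, gb = strength(elo_a), strength(elo_b)
    win = ga / (ga + theta * gb)
    loss = gb / (gb + theta * ga)
    return win, 1.0 - win - loss, loss

def davidson(elo_a, elo_b, nu=1.0):
    """Return (P(win), P(draw), P(loss)) for A; nu >= 0 scales the draw term."""
    ga, gb = strength(elo_a), strength(elo_b)
    tie = nu * math.sqrt(ga * gb)
    total = ga + gb + tie
    return ga / total, tie / total, gb / total

for diff in (0, 100, 300):
    w, d, l = rao_kupper(diff, 0)
    # Rao-Kupper: P(draw) / (P(win) * P(loss)) stays at theta**2 - 1
    print("Rao-Kupper", diff, d / (w * l))
    w, d, l = davidson(diff, 0)
    # Davidson: P(draw)**2 / (P(win) * P(loss)) stays at nu**2
    print("Davidson  ", diff, d * d / (w * l))
```

The first ratio being constant is what "1 draw = 1 win + 1 loss" means in likelihood terms; the second being constant is "2 draws = 1 win + 1 loss".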

Daniel Shawul · Post by **Daniel Shawul** » Sat Sep 21, 2013 5:28 pm

The Davidson model corresponds to the way tournaments are scored (one win and one loss equals two draws). The main human over-the-board rating systems (FIDE, USCF, and most other national rating systems) consider only score and ratings, not how the score was split up among wins and draws, so they also assume Davidson. So my statement is correct and doesn't need further proof. You mention Microsoft; what do they know about chess? What federation uses their software?

Better than you apparently.

I didn't make any claim as to which model fits the data better, I asked for info on this, which I thank you for providing. So it seems from your comments that the model in the middle, Glenn-David, best fits chess data.

Clearly I didn't say that!

If so, I would still argue that Ordo is superior to Bayeselo because although both are "wrong" by being equally far off from the "correct" Glenn-David model, at least Ordo corresponds to the way tournaments are scored and FIDE ratings calculated.

'Clearly' I said there is a publication that 'Davidson' (2*0 = 1 + -1) was worse for _human_ ratings. Joe, H. (1990), read it if you can. But it is obvious to you, self-proclaimed expert, that Davidson is the best. Please stop spewing crap you don't understand. That is why I didn't want to post a public link to the paper, to prevent an unnecessary brawl. I hope it stops here.

hgm · Post by **hgm** » Sat Sep 21, 2013 5:52 pm

Whether it is the way that FIDE calculates the ratings or scores tournaments is unfortunately totally irrelevant. You can calculate the same rating in many ways. A rating calculated from a finite data set is not the same as the true strength of the player, but a stochastic quantity, contaminated by noise. It is of no interest to reproduce that noise (and thus the calculated rating). What determines which calculation method is better is how fast they converge to the true value, not if one always gets the same result as the other.

Empirical data has shown that draws are a stronger indication for equality of the player strength than equally scoring series of wins and losses. It could just as well have been the other way around; it all depends on the win, loss and draw probability as a function of rating difference. These probabilities happen to be such that draws are more significant.

Because of that, counting two draws the same as one win plus one loss weights the available data inefficiently, in a way similar to randomly selecting half of a player's games to count double. With enough games you will get the same rating. But the method that weights the games properly will most likely be closer to the true result than one that doesn't.
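
What a given draw model implies about the weight of a draw can be checked directly. The sketch below is not empirical evidence; it only shows what the Rao-Kupper model says, assuming the 400-point logistic scale and an arbitrary `theta=1.5`. It measures how much log-likelihood is lost when the data are explained by a 200 Elo difference instead of equality. The helper names are made up for this illustration.

```python
import math

def rao_kupper(diff, theta=1.5):
    """(P(win), P(draw), P(loss)) for a player rated `diff` Elo above the opponent."""
    g = 10.0 ** (diff / 400.0)  # strength ratio, opponent strength fixed at 1
    win = g / (g + theta)
    loss = 1.0 / (1.0 + theta * g)
    return win, 1.0 - win - loss, loss

def log_lik(diff, wins, draws, losses):
    """Log-likelihood of a result tally at a given rating difference."""
    w, d, l = rao_kupper(diff)
    return wins * math.log(w) + draws * math.log(d) + losses * math.log(l)

def drop(wins, draws, losses, diff=200.0):
    """Log-likelihood lost by assuming `diff` Elo instead of equal strength."""
    return log_lik(0.0, wins, draws, losses) - log_lik(diff, wins, draws, losses)

print("1 win + 1 loss:", drop(1, 0, 1))
print("2 draws       :", drop(0, 2, 0))  # twice the drop of the line above
```

Under Rao-Kupper the two draws rule out the 200 Elo gap twice as strongly as the win plus loss, although both score 1 point out of 2. Under Davidson the two drops would be equal, since there P(draw)**2 is proportional to P(win)*P(loss).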

Uri Blass · Post by **Uri Blass** » Sat Sep 21, 2013 6:20 pm

Daniel Shawul wrote:
The Davidson model corresponds to the way tournaments are scored (one win and one loss equals two draws). The main human over-the-board rating systems (FIDE, USCF, and most other national rating systems) consider only score and ratings, not how the score was split up among wins and draws, so they also assume Davidson. So my statement is correct and doesn't need further proof. You mention Microsoft; what do they know about chess? What federation uses their software?
Better than you apparently.
I didn't make any claim as to which model fits the data better, I asked for info on this, which I thank you for providing. So it seems from your comments that the model in the middle, Glenn-David, best fits chess data.
Clearly I didn't say that!
If so, I would still argue that Ordo is superior to Bayeselo because although both are "wrong" by being equally far off from the "correct" Glenn-David model, at least Ordo corresponds to the way tournaments are scored and FIDE ratings calculated.
'Clearly' I said there is a publication that 'Davidson' (2*0 = 1 + -1) was worse for _human_ ratings. Joe, H. (1990), read it if you can. But it is obvious to you, self-proclaimed expert, that Davidson is the best. Please stop spewing crap you don't understand. That is why I didn't want to post a public link to the paper, to prevent an unnecessary brawl. I hope it stops here.

I do not see where Larry Kaufman said that Davidson is the best.
From his post:
"I didn't make any claim as to which model fits the data better"

If I understand correctly, your claim is that it is proved that Davidson is better for CCRL games and worse for human chess.

Rein Halbersma · Post by **Rein Halbersma** » Sat Sep 21, 2013 6:24 pm

Rémi Coulom wrote:
Rein Halbersma wrote:Re: drawing percentage as a function of strength, this seems to follow naturally from a statistical model where two players make moves with a small probability of an error. Maybe the value of the current position can then be modelled to follow a random walk with a drift proportional to the difference of the cumulative errors of both players. Once the position value crosses a threshold, a win or loss is realized.
This sounds very much like the Glenn-David model. The distribution of a sum of iid random values has the shape of a Gaussian.

But GD does not explain the draw rate in terms of the underlying error rates. I would expect that rating differences are related to relative error rates, and draw percentage to the absolute error rates. Perhaps there are some models from the theory of stochastic differential games.
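
A toy simulation of the random-walk picture makes that expectation concrete. Everything here is an assumption made for illustration: each player errs independently with a fixed probability per move, every error shifts the position value by one unit, and a game that never reaches the threshold within 60 moves is a draw. The names `play_game` and `outcome_rates` are hypothetical.

```python
import random

def play_game(err_a, err_b, rng, moves=60, threshold=3):
    """Return +1 if A wins, -1 if B wins, 0 for a draw."""
    value = 0  # position value from A's point of view
    for _ in range(moves):
        if rng.random() < err_a:
            value -= 1  # A errs, the position shifts toward B
        if rng.random() < err_b:
            value += 1  # B errs, the position shifts toward A
        if value >= threshold:
            return 1
        if value <= -threshold:
            return -1
    return 0

def outcome_rates(err_a, err_b, games=5000, seed=1):
    """Return (win, draw, loss) frequencies for A over simulated games."""
    rng = random.Random(seed)
    results = [play_game(err_a, err_b, rng) for _ in range(games)]
    return (results.count(1) / games,
            results.count(0) / games,
            results.count(-1) / games)

print("careful, equal:", outcome_rates(0.02, 0.02))
print("sloppy, equal :", outcome_rates(0.20, 0.20))
print("A errs less   :", outcome_rates(0.05, 0.10))
```

In this toy setup the first two pairings are equally matched, but the careful pair draws far more often than the sloppy pair, while the third pairing shows the score tilting toward the player with the lower error rate.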

Daniel Shawul · Post by **Daniel Shawul** » Sat Sep 21, 2013 6:31 pm

Uri Blass wrote:
I do not see where Larry Kaufman said that Davidson is the best.
from his post:
"I didn't make any claim as to which model fits the data better"
'Clearly' you don't read very well. Here is what his first post said:
Clearly the Davidson model is the one that fits the way tournaments are scored and Elo ratings are calculated, so the burden of proof lies with anyone claiming that one of the other models is superior.
I am not responsible for his change of position as he goes on...
If I understand correctly, your claim is that it is proved that Davidson is better for CCRL games and worse for human chess.
I claim now that you can't read what others post. I never said anything about human ratings, except to show that his claim that it is 'clear' that Davidson is best for human chess is utter crap, by giving him a reference to the opposite: Joe, H. (1990).

hgm · Post by **hgm** » Sat Sep 21, 2013 6:32 pm

Note that the distribution of the sum would only approach Gaussian if the individual errors have a finite variance. If that is not the case, the sum might be dominated by its largest term. It is not inconceivable that this is the case in Chess, where it is easy to give away the game with a single move, and that in many cases it is not obvious that this move is a poor one.
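
The effect is easy to see numerically. The sketch below compares how much of a sum of 1000 error magnitudes comes from the single largest term, once for half-normal magnitudes and once for Pareto magnitudes. The Pareto tail index 0.5 is a deliberately extreme, arbitrary choice (infinite variance, and in fact infinite mean) so that the effect is visible; it says nothing about how chess errors are actually distributed. `largest_share` is a hypothetical helper.

```python
import random

def largest_share(draw, n=1000, trials=200, seed=1):
    """Average fraction of a sum of n magnitudes contributed by its largest term."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [draw(rng) for _ in range(n)]
        total += max(xs) / sum(xs)
    return total / trials

# Finite variance: magnitudes of standard normal errors.
light = largest_share(lambda rng: abs(rng.gauss(0.0, 1.0)))
# Heavy tail: Pareto magnitudes with tail index 0.5 (assumed for illustration).
heavy = largest_share(lambda rng: rng.paretovariate(0.5))

print("largest term's share, finite variance:", light)
print("largest term's share, heavy tail     :", heavy)
```

With finite variance the largest term is a negligible part of the sum, which is the regime where the Gaussian approximation applies. In the heavy-tailed case a single term carries a large part of the total, which corresponds to one blunder deciding the game.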

Uri Blass · Post by **Uri Blass** » Sat Sep 21, 2013 7:23 pm

Daniel, I think that it is better to be more polite.
I say the following:
1) I do not agree with you about the meaning of Larry's claim.
His words that you quoted do not mean that the Davidson model is better.

His words that you quoted:
"The Davidson model is the one that fits the way tournaments are scored and Elo ratings are calculated, so the burden of proof lies with anyone claiming that one of the other models is superior."

Edit:
"Fits the way tournaments are scored" does not mean that the model is better.
It means only that tournaments are scored based on that model.

2) You said:
"'Clearly' I said there is a publication that 'Davidson' (2*0 = 1 + -1) was worse for _human_ ratings. Joe, H. (1990), read it if you can."

I guess, based on your last post, that you do not consider the relevant publication a proof that the Davidson model is worse for human chess,
while you do consider your publication a proof that the Davidson model is better for computer chess.

Note that I did not read the relevant publication so I say nothing about it
and I do not claim what is better for human chess.

Rémi Coulom · Post by **Rémi Coulom** » Sat Sep 21, 2013 7:41 pm

Daniel Shawul wrote:There is even a work that says Davidson is worse for _human_ ratings, so you as a human GM should 'clearly' have knowledge of such things. You can find such statements here. "Joe, H. (1990). Extended use of paired comparison models, with applications to chess rankings. Journal of the Royal Statistical Society, 39(1):85–93."

Well, that paper states that the Davidson model does not fit human data well because some human players tend to draw much more than others. But it does not compare the Davidson model to others. All three models would perform equally badly at differentiating the propensity for draws between individuals.

Daniel Shawul · Post by **Daniel Shawul** » Sat Sep 21, 2013 7:54 pm

Obviously there was no comparison made between different models, which is what the topic of the current work is. If it was clear that Davidson is better, then there would be no point to the current research. The reason given by Larry as to why he thinks that is nonsense. Of course ratings calculated by one model will fit the Elos calculated by the same model better than any other model. That doesn't mean other draw models don't fit the data better; it means FIDE would have to change the way they calculate ratings if a better model is found. That is why cross-correlation tests were necessary to compare different models.

Anyway I wish you didn't update the paper in public. I already sent Alvaro a version I had with completed sections in case you thought otherwise when you made the post.
