Name for elo without draws?

mvk · Post by **mvk** » Wed Sep 02, 2015 7:14 pm

If we have N games with results W, D and L (with W+D+L=N), the elo difference is calculated from P = (0*L + 0.5*D + 1*W) / N.

We can throw away the draws and compute elo' from P' = W / (N-D) using the same elo formula.

What you then get I now call 'pseudo-elo', but I'm wondering if there is a standard name for that quantity already.
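
To make the two quantities concrete, here is a minimal sketch. It assumes that "the same elo formula" means the usual logistic curve, elo = -400 * log10(1/P - 1); the post does not spell that out. The function names and the sample counts are hypothetical, chosen only for illustration.

```python
import math


def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (assumed logistic curve)."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo(w: int, d: int, l: int) -> float:
    """Ordinary elo: draws count as half a point, P = (0*L + 0.5*D + 1*W) / N."""
    n = w + d + l
    return elo_from_score((w + 0.5 * d) / n)


def pseudo_elo(w: int, d: int, l: int) -> float:
    """'Pseudo-elo': draws are thrown away, P' = W / (N - D) = W / (W + L)."""
    if w + l == 0:
        raise ValueError("no decisive games, P' is undefined")
    return elo_from_score(w / (w + l))


if __name__ == "__main__":
    # Hypothetical match result, not taken from the thread.
    w, d, l = 40, 50, 10
    print(f"elo        = {elo(w, d, l):.1f}")
    print(f"pseudo-elo = {pseudo_elo(w, d, l):.1f}")
```

Note that pseudo-elo becomes infinite as soon as one side has no decisive wins at all, and it is undefined when every game is drawn.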

Dann Corbit · Post by **Dann Corbit** » Wed Sep 02, 2015 8:46 pm

mvk wrote:If we have N games with results W, D and L (with W+D+L=N), the elo difference is calculated from P = (0*L + 0.5*D + 1*W) / N.

We can throw away the draws and compute elo' from P' = W / (N-D) using the same elo formula.

What you then get I now call 'pseudo-elo', but I'm wondering if there is a standard name for that quantity already.

I would call it wrong Elo.

Imagine two computer players A and B, very evenly matched.
They play one million and two games.
Players A and B draw one million times.
B wins the other two games.
Is B dominantly better than A?
No.
B has EXACTLY the SAME strength as A.
To calculate otherwise is simply incorrect.

mvk · Post by **mvk** » Wed Sep 02, 2015 9:01 pm

Thank you. As this "pseudo-elo", as I call it now, emerges from my evaluation's draw model, I think "wrong elo" is a rather poor name. I'm just looking if the quantity is already named elsewhere or not. I'm not looking for a value judgement.

Rein Halbersma · Post by **Rein Halbersma** » Wed Sep 02, 2015 9:30 pm

Dann Corbit wrote:
mvk wrote:If we have N games with results W, D and L (with W+D+L=N), the elo difference is calculated from P = (0*L + 0.5*D + 1*W) / N.

We can throw away the draws and compute elo' from P' = W / (N-D) using the same elo formula.

What you then get I now call 'pseudo-elo', but I'm wondering if there is a standard name for that quantity already.
I would call it wrong Elo.

Imagine two computer players A and B, very evenly matched.
They play one million and two games.
Players A and B draw one million times.
B wins the other two games.
Is B dominantly better than A?
No.
B has EXACTLY the SAME strength as A.
To calculate otherwise is simply incorrect.

Even though the Elo difference for this example is tiny, the Likelihood of Superiority (LOS) of B over A is 7/8. In the words of Remi Coulom: "A draw will at the same time make estimated Elo ratings closer to each other, and reduce the width of confidence intervals. It does this in such a way that the LOS does not change."
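
One way to arrive at the 7/8 figure, sketched here under an assumption the post does not state: put a uniform prior on the probability p that B wins a decisive game, so that after W wins and L losses p follows a Beta(W+1, L+1) distribution, and take LOS = P(p > 1/2). For integer counts this reduces to a binomial sum. The function name `los` is hypothetical.

```python
from fractions import Fraction
from math import comb


def los(wins: int, losses: int) -> Fraction:
    """P(p > 1/2) for p ~ Beta(wins + 1, losses + 1), i.e. a uniform prior.

    Uses the identity between the Beta CDF and the binomial tail:
    P(p > 1/2) = P(Binomial(wins + losses + 1, 1/2) <= wins).
    Draws are not an argument: they do not enter this calculation.
    """
    n = wins + losses + 1
    favourable = sum(comb(n, j) for j in range(wins + 1))
    return Fraction(favourable, 2 ** n)


if __name__ == "__main__":
    # Two wins for B, no losses; the one million draws play no role.
    print(los(2, 0))
```

Because the draw count never appears, the result is the same for 2 wins in 2 games as for 2 wins in one million and two games.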

michiguel · Post by **michiguel** » Wed Sep 02, 2015 10:06 pm

mvk wrote:If we have N games with results W, D and L (with W+D+L=N), the elo difference is calculated from P = (0*L + 0.5*D + 1*W) / N.

We can throw away the draws and compute elo' from P' = W / (N-D) using the same elo formula.

What you then get I now call 'pseudo-elo', but I'm wondering if there is a standard name for that quantity already.

I will use Wilo in the next version of Ordo (I may even change the name of the program not to confuse things) with a different model as I mentioned before.
http://www.talkchess.com/forum/viewtopi ... ilo#593262

The scales will be different, so the delta "wilos" will give a different meaning, but I have the opinion that it is _the_ way to measure strength based on some theoretical and experimental considerations. For instance, the scale of doubling speeds is very linear with respect to wilos, but it curves a lot with elo. At the higher end, we are actually underestimating the improvements of SF and Komodo. They are actually much stronger than what they look.

Miguel
PS: The new ordo-wilo will not throw away draws, but will incorporate them into a new draw model where draw rates vary with strength.

bob · Post by **bob** » Wed Sep 02, 2015 10:33 pm

mvk wrote:Thank you. As this "pseudo-elo", as I call it now, emerges from my evaluation's draw model, I think "wrong elo" is a rather poor name. I'm just looking if the quantity is already named elsewhere or not. I'm not looking for a value judgement.

This would be much closer to a sort of LOS measure, which only estimates which one is better (and the likelihood of 2 wins and 1M draws does suggest the program with two wins is slightly better), while Elo tries to estimate the difference between the two and provide a predictive estimate of how future games will go.

You might pose this to Remi. When we were talking about this very issue here years ago he mentioned the business about draws being irrelevant if all you care about is which program is better. I don't recall that he had a specific name, but he's a good one to ask.

Rein Halbersma · Post by **Rein Halbersma** » Wed Sep 02, 2015 10:48 pm

bob wrote:
mvk wrote:Thank you. As this "pseudo-elo", as I call it now, emerges from my evaluation's draw model, I think "wrong elo" is a rather poor name. I'm just looking if the quantity is already named elsewhere or not. I'm not looking for a value judgement.
This would be much closer to a sort of LOS measure, which only estimates which one is better (and the likelihood of 2 wins and 1M draws does suggest the program with two wins is slightly better), while Elo tries to estimate the difference between the two and provide a predictive estimate of how future games will go.

You might pose this to Remi. When we were talking about this very issue here years ago he mentioned the business about draws being irrelevant if all you care about is which program is better. I don't recall that he had a specific name, but he's a good one to ask.

See the post of his I quoted in my earlier post.

Dann Corbit · Post by **Dann Corbit** » Wed Sep 02, 2015 11:14 pm

Rein Halbersma wrote:
Dann Corbit wrote:
mvk wrote:If we have N games with results W, D and L (with W+D+L=N), the elo difference is calculated from P = (0*L + 0.5*D + 1*W) / N.

We can throw away the draws and compute elo' from P' = W / (N-D) using the same elo formula.

What you then get I now call 'pseudo-elo', but I'm wondering if there is a standard name for that quantity already.
I would call it wrong Elo.

Imagine two computer players A and B, very evenly matched.
They play one million and two games.
Players A and B draw one million times.
B wins the other two games.
Is B dominantly better than A?
No.
B has EXACTLY the SAME strength as A.
To calculate otherwise is simply incorrect.
Even though the Elo difference for this example is tiny, the Likelihood of Superiority (LOS) of B over A is 7/8. In the words of Remi Coulom: "A draw will at the same time make estimated Elo ratings closer to each other, and reduce the width of confidence intervals. It does this in such a way that the LOS does not change."

LOS of 7/8 is obvious horse crap.
The two wins are random noise in a million and two games.

hgm · Post by **hgm** » Wed Sep 02, 2015 11:53 pm

Dann Corbit wrote:LOS of 7/8 is obvious horse crap.
The two wins are random noise in a million and two games.

No!

The situation is the same as having 2 wins out of 2 games. That is not always just random noise; in many cases (actually most cases) it would be because the winning player is significantly stronger. The only thing proven by the long match is that the draw probability is stupendously large. But that does not imply anything at all about the ratio of the win probability to the loss probability. It could very well be that P(draw) = 0.999998, P(win) = 1.999999e-6 and P(loss) = 1e-12. One player could be perfect, and cannot lose at all, while the other is only nearly perfect, and makes a losing error about once every million games. The situation in top draughts is somewhat like that (except not for millions of games but for dozens of games).
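
As a quick check of the hypothetical probabilities above, this small sketch uses exact rational arithmetic to confirm that they sum to 1 and to show the share of decisive games that the stronger side would win. The variable names are illustrative only.

```python
from fractions import Fraction

# The hypothetical probabilities from the example above.
p_draw = Fraction("0.999998")
p_win = Fraction("1.999999e-6")
p_loss = Fraction("1e-12")

# They form a valid distribution.
total = p_draw + p_win + p_loss

# Probability of a win, given that the game is not a draw.
decisive_win_share = p_win / (p_win + p_loss)

if __name__ == "__main__":
    print(total)
    print(decisive_win_share, float(decisive_win_share))
```

So an almost-certain draw is fully compatible with one side winning nearly every game that is decided.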

If you only know that A beat B 2 times out of 2, the odds that B would beat A in the next game that is not a draw are rather poor. It would be very unwise to bet on that when the payout is not at least 4 times the investment.
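
One reading that reproduces the "4 times" figure is Laplace's rule of succession, which again assumes a uniform prior on the decisive-game win probability; the post itself does not name the method. The function name below is hypothetical.

```python
from fractions import Fraction


def next_decisive_win_prob(wins: int, losses: int) -> Fraction:
    """Rule of succession: (wins + 1) / (wins + losses + 2), uniform prior."""
    return Fraction(wins + 1, wins + losses + 2)


# A has won 2 decisive games out of 2.
p_a_wins_next = next_decisive_win_prob(2, 0)
p_b_wins_next = 1 - p_a_wins_next

# Break-even payout (total return per unit staked) for a bet on B.
fair_payout_on_b = 1 / p_b_wins_next

if __name__ == "__main__":
    print(p_a_wins_next, p_b_wins_next, fair_payout_on_b)
```

Under that assumption, a bet on B only breaks even when it returns four times the stake.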

Dann Corbit · Post by **Dann Corbit** » Thu Sep 03, 2015 12:29 am

This is clearly wrong.
Ignoring a million draws is lunacy.
If those players play a game, the outcome will be a draw.
It would be utterly unsurprising if, the next time they played a million and two games, the one who lost won three of them.

When the math says something utterly stupid, then the math is wrong.
