Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 9:44 am

Could someone explain to me where the current formulas for calculating error margins from the number of games come from (e.g. +/- 10 Elo points for 4000 games)? I mean, why those numbers and not others? What observations were made to arrive at those conclusions?
thx Fermin
Fermin Serrano
Author of 'Rodin' engine

Daniel Shawul
Posts: 3804
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Kempelen wrote:Could someone explain to me where the current formulas for calculating error margins from the number of games come from (e.g. +/- 10 Elo points for 4000 games)? I mean, why those numbers and not others? What observations were made to arrive at those conclusions?
thx Fermin
I think you would need an estimate of the sample standard deviation to give specific values. Otherwise all you can say is that for every doubling of the number of games, the error margin would be halved... For 95% confidence, i.e. 2 sigma, you have mean +/- 2 * sd / sqrt(n). Then, from the relation between winning percentage and Elo, you can tell how large the error margin is in terms of Elo. If you have a record of games, like WLWWWDDLLLWL etc., you can estimate the sample mean and sd from it.
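The recipe in this post can be sketched in Python as follows. This is only an illustration: the function names `score_stats` and `score_to_elo` are made up for the example, and the logistic Elo formula is an assumption (the post only says "winning percentage and elo relation"; a tool such as bayeselo uses its own model).

```python
import math

def score_stats(record: str):
    """Return (mean, sample sd, n) of per-game scores for a string of W/D/L results."""
    points = {"W": 1.0, "D": 0.5, "L": 0.0}
    scores = [points[c] for c in record]
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance with the usual n - 1 denominator.
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean, math.sqrt(var), n

def score_to_elo(p: float) -> float:
    """Assumed logistic Elo model: Elo difference implied by expected score p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))

mean, sd, n = score_stats("WLWWWDDLLLWL")
half_width = 2.0 * sd / math.sqrt(n)  # roughly 95% margin, in score units
low, high = mean - half_width, mean + half_width

print(f"score = {mean:.3f} +/- {half_width:.3f}")
print(f"Elo   = {score_to_elo(mean):+.0f} (range {score_to_elo(low):+.0f} .. {score_to_elo(high):+.0f})")
```

With only 12 games the interval is very wide, which is the point: the margin depends on both the spread of the results and the number of games.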

hgm
Posts: 23871
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Kempelen wrote:What observations were made to arrive at those conclusions?
No observations are needed, because this is simple mathematics. One also doesn't make observations to establish that 5+5=10...

It follows from the definition of standard deviation, confidence interval and the fact that the games are independent.
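For readers who want the step spelled out, the argument can be written roughly as below. The symbols are assumptions introduced for illustration: x_i is the score of game i (1, 0.5 or 0), sigma is the per-game standard deviation, n is the number of games, and the games are taken to be independent with the same distribution.

```latex
\begin{align}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{\sigma^{2}}{n} \qquad \text{(independent games)} \\
\text{95\% interval} &\approx \bar{x} \pm 2\,\frac{\sigma}{\sqrt{n}}
\end{align}
```

The factor 2 is the usual two-sigma rule for an approximately normal mean, so the only empirical input is sigma itself.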

Sven
Posts: 3830
Joined: Thu May 15, 2008 7:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Daniel Shawul wrote:I think you would need an estimate of the sample standard deviation to give specific values. Otherwise all you can say is that for every doubling of the number of games, the error margin would be halved...
In principle I agree, but the sample standard deviation is already an estimate (of the "real" standard deviation, which is unknown), so you don't need to estimate it again. And for every doubling of the number of games the error margin is divided by sqrt(2); the error margin is halved only when the number of games is multiplied by four.

Sven
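The square-root scaling can be illustrated with a small Python sketch. Everything specific here is an assumption for the example: an even match (expected score 0.5), a hypothetical draw ratio of 30%, the logistic Elo curve, and a linear approximation of that curve around 50%. The function name `elo_margin` is made up.

```python
import math

def elo_margin(n_games: int, draw_ratio: float = 0.3, sigmas: float = 2.0) -> float:
    """Approximate +/- Elo margin for an even match (expected score 0.5)."""
    # Per-game variance at 50%: wins and losses each deviate by 0.5, draws by 0.
    per_game_sd = math.sqrt((1.0 - draw_ratio) * 0.25)
    score_margin = sigmas * per_game_sd / math.sqrt(n_games)
    # Slope of the logistic Elo curve at p = 0.5: 400 / (ln(10) * p * (1 - p)).
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return score_margin * elo_per_score

for n in (1000, 2000, 4000, 16000):
    print(f"{n:6d} games: +/- {elo_margin(n):.1f} Elo")
```

Under these assumptions 4000 games come out at a little over 9 Elo, and about 11 Elo with no draws at all, which is in line with the "+/- 10 Elo for 4000 games" rule of thumb from the original question.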

Daniel Shawul
Posts: 3804
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Sven Schüle wrote:
In principle I agree, but the sample standard deviation is already an estimate (of the "real" standard deviation, which is unknown), so you don't need to estimate it again. And for every doubling of the number of games the error margin is divided by sqrt(2); the error margin is halved only when the number of games is multiplied by four.

Sven
Just left that mistake for Mr. Nitpicker to find. Correction, people: the s.e. is halved when the number of games is _quadrupled_, not doubled!

But I was not saying he should use the population sd, just the sample sd. Note that he mentioned only the number of games and the associated margin of error. It is impossible to determine the margin of error from the number of games alone, so he needs to estimate the standard deviation from the sample of games. 4000 games is pretty high, so the difference between the sample and population statistics doesn't matter much, even if the latter is known. It is a good question though, as I have noticed some say (including myself) that 4000 games gives +/- 10 Elo or so, which is not quite right. Probably that is what Fermin meant by observations. Most of the time you would get +/- 10 Elo from bayeselo at around 3-4 thousand games. For example, a 50% winning percentage could come from WLWLWLWLWL or from DDDDDDDDDD. The former has a big sd, but the latter has 0 sd!

```
sd = sqrt(wins * (1 - m)^2 + losses * (0 - m)^2 + draws * (0.5 - m)^2) / sqrt(n - 1)
```
If the draw ratio is 0, it is possible to tell the error margin from the number of games and winning percentage alone.
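As a rough check of the two 10-game examples, the formula above can be evaluated directly from win/draw/loss counts. This is a minimal sketch; the function name `sample_sd` is made up for the example, and `m` is taken to be the mean score per game and `n` the total number of games, as the formula implies.

```python
import math

def sample_sd(wins: int, draws: int, losses: int) -> float:
    """Sample standard deviation of per-game scores (win=1, draw=0.5, loss=0)."""
    n = wins + draws + losses
    m = (wins + 0.5 * draws) / n  # mean score per game
    ss = wins * (1 - m) ** 2 + losses * (0 - m) ** 2 + draws * (0.5 - m) ** 2
    return math.sqrt(ss / (n - 1))

print(sample_sd(wins=5, draws=0, losses=5))   # WLWLWLWLWL
print(sample_sd(wins=0, draws=10, losses=0))  # DDDDDDDDDD
```

Both matches score 50%, yet the first gives an sd of about 0.53 and the second exactly 0, so the score and the number of games alone do not fix the margin.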

Posts: 9725
Joined: Wed Jul 26, 2006 8:21 pm

Daniel Shawul wrote:
Just left that mistake for Mr. Nitpicker to find. Correction, people: the s.e. is halved when the number of games is _quadrupled_, not doubled!

But I was not saying he should use the population sd, just the sample sd. Note that he mentioned only the number of games and the associated margin of error. It is impossible to determine the margin of error from the number of games alone, so he needs to estimate the standard deviation from the sample of games. 4000 games is pretty high, so the difference between the sample and population statistics doesn't matter much, even if the latter is known. It is a good question though, as I have noticed some say (including myself) that 4000 games gives +/- 10 Elo or so, which is not quite right. Probably that is what Fermin meant by observations. Most of the time you would get +/- 10 Elo from bayeselo at around 3-4 thousand games. For example, a 50% winning percentage could come from WLWLWLWLWL or from DDDDDDDDDD. The former has a big sd, but the latter has 0 sd!

```
sd = sqrt(wins * (1 - m)^2 + losses * (0 - m)^2 + draws * (0.5 - m)^2) / sqrt(n - 1)
```
If the draw ratio is 0, it is possible to tell the error margin from the number of games and winning percentage alone.
DDDD... gives 0 sd only in the approximation you use.

Kai

Daniel Shawul
Posts: 3804
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

How? Note that I am calculating the sd of the winning percentage. When you go to the Elo calculation with bayeselo, there are of course elodraw and eloadvantage. So a DDDDDD is more variable when you consider those things... Anyway, I was just trying to demonstrate why one can't tell the margin of error in Elo from the number of games and winning percentage alone.

Posts: 9725
Joined: Wed Jul 26, 2006 8:21 pm

Daniel Shawul wrote:How? Note that I am calculating the sd of the winning percentage. When you go to the Elo calculation with bayeselo, there are of course elodraw and eloadvantage. So a DDDDDD is more variable when you consider those things... Anyway, I was just trying to demonstrate why one can't tell the margin of error in Elo from the number of games and winning percentage alone.
That thing you wrote is the error (SD) in the normal approximation of the trinomial. You can still use it for a match of 10 draws, but keep in mind that after that match the probability of W and of L is 1/13 each, and of D it is 11/13; input these into your formula.

Kai
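One way to reproduce Kai's numbers, offered as an assumption since the post does not state the method, is add-one smoothing: each outcome gets (count + 1) / (n + 3), which is what a uniform prior over the three outcomes gives. The function names `smoothed_probs` and `per_game_sd` are made up for this sketch.

```python
import math
from fractions import Fraction

def smoothed_probs(wins: int, draws: int, losses: int):
    """Add-one estimates (assumed method): (count + 1) / (n + 3) for W, D, L."""
    n = wins + draws + losses
    return tuple(Fraction(c + 1, n + 3) for c in (wins, draws, losses))

def per_game_sd(p_win, p_draw, p_loss) -> float:
    """Standard deviation of one game's score under the given outcome probabilities."""
    m = p_win + p_draw / 2  # expected score per game
    var = (p_win * (1 - m) ** 2
           + p_draw * (Fraction(1, 2) - m) ** 2
           + p_loss * (0 - m) ** 2)
    return math.sqrt(var)

p_win, p_draw, p_loss = smoothed_probs(wins=0, draws=10, losses=0)
print(p_win, p_draw, p_loss)  # 1/13 11/13 1/13
print(per_game_sd(p_win, p_draw, p_loss))
```

With these probabilities the per-game sd after ten draws is about 0.2 rather than 0, which is the sense in which "0 sd" holds only in the plain sample approximation.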

Daniel Shawul
Posts: 3804
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

That thing you wrote is the error (SD) in the normal approximation of the trinomial. You can still use it for a match of 10 draws, but keep in mind that after that match the probability of W and of L is 1/13 each, and of D it is 11/13; input these into your formula.

Kai
The probability of a W or L would surely be lowered a lot after that odd observation. That formula calculates the standard deviation of a given sample and does not assume probabilities for W, D and L. The rewards are fixed at 0, 0.5 and 1, of course. This was just a quick example, but I know it does not translate directly to Elo, because there you have a mix of players of different strengths, white's Elo advantage, etc.

Posts: 9725
Joined: Wed Jul 26, 2006 8:21 pm