Discussion of chess software programming and technical issues.

Rémi Coulom
 Posts: 434
 Joined: Mon Apr 24, 2006 6:06 pm

by Rémi Coulom » Thu Jan 23, 2014 3:08 pm
Gerd Isenberg wrote:Thanks for the correction. Can you review it again to check whether it is now correct? I am illiterate in statistics, and Edmund seems to be currently inactive as cpw editor.
I don't have time to edit the wiki, but I'd like to make some remarks.
wiki wrote:This calculation becomes very inefficient for larger number of games. In this case the Normal Distribution can give a good approximation.
Well, the calculation can be done cleverly, and the Normal approximation is not really required.
http://www.talkchess.com/forum/viewtopi ... 82&t=30624
http://www.talkchess.com/forum/viewtopi ... 05&t=30624
http://www.talkchess.com/forum/viewtopi ... 30&t=30624
It is important to note that LOS does not depend on the number of draws.
These calculations do not distinguish whether the game results were obtained playing Black or White. That is a good approximation when the two players played the same number of games with each color.
Rémi
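For readers who want to see what an exact calculation could look like, here is a minimal sketch. It is not taken from the linked posts; it assumes a uniform prior on the win probability (draws discarded), so that LOS is the probability mass above 1/2 of a Beta(wins + 1, losses + 1) posterior. For integer counts that equals a binomial tail, which can be summed with exact integer arithmetic. The function name los_exact is made up for this illustration.

```python
def los_exact(wins: int, losses: int) -> float:
    """Exact LOS under an ASSUMED uniform prior on the win probability.

    LOS = P(p > 1/2) for a Beta(wins + 1, losses + 1) posterior, which for
    integer counts equals P(Binomial(wins + losses + 1, 1/2) <= wins).
    Draws do not appear anywhere in the computation.
    """
    n = wins + losses + 1
    term = 1       # C(n, 0)
    total = 0
    for k in range(wins + 1):
        total += term                       # add C(n, k)
        term = term * (n - k) // (k + 1)    # C(n, k + 1), exact integer step
    return total / 2**n                     # big-int true division


print(los_exact(1, 0))         # 0.75
print(los_exact(5975, 6077))   # fishtest example discussed in this thread
```

Python integers are arbitrary precision, so the sum stays exact until the final division; that is one way to avoid the Normal approximation even for tens of thousands of games.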

Laskos
 Posts: 9725
 Joined: Wed Jul 26, 2006 8:21 pm
 Full name: Kai Laskos
by Laskos » Thu Jan 23, 2014 7:27 pm
Gerd Isenberg wrote:Laskos wrote:mohzus wrote:Hi guys,
I found a formula to calculate the LOS at
http://chessprogramming.wikispaces.com/Match+Statistics, basically it's LOS = (1/2)*(1 + erf(x/sqrt(2))), where x is defined as score_difference/(N/(1-draw_ratio)) and N is the total number of games. This formula is supposed to be a good approximation when the number of games is large.
So I tested this formula a bit with very few games (no more than 300), and then I tried it once for a large number of games, around 40k games.
The data is the following (taken directly from fishtest; the LOS is supposed to be close to 17% according to the fishtest calculations):
Total: 38946 W: 5975 L: 6077 D: 26894
From this I calculated:
Draw rate: 0.69
Score difference: -102
x = -102/(38946*(1 - 0.69)) = -0.0084633
LOS = (1/2)*(1 + erf(-0.0084633/sqrt(2))) = 0.4966
Note that in order to perform the calculations, I kept more digits than the ones I've written here.
Conclusion: the LOS that I've calculated is totally off from 0.17...
What am I doing wrong?
They forgot a sqrt there in the wiki, and added a useless draw ratio.
"score_difference/(N/(1-draw_ratio))" should be
score_difference/sqrt(N*(1-draw_ratio)). The number of draws is irrelevant:
LOS = (1 + erf[(wins - losses)/(2*(wins + losses))^0.5])/2
By the way, it's my formula.
In your particular case, the exact LOS is 0.176424329128482367, while the erf approximation gives 0.176414.
Thanks for the correction. Can you review it again to check whether it is now correct? I am illiterate in statistics, and Edmund seems to be currently inactive as cpw editor.
Yes, now it seems correct, although I would explicitly show that the number of draws doesn't matter.
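To make the point about draws explicit, here is a small sketch of the erf formula quoted above, applied to the fishtest numbers from this thread. The function name los_erf and the handling of the zero-games case are choices made for this illustration only.

```python
import math


def los_erf(wins: int, losses: int) -> float:
    """erf approximation of LOS: (1 + erf((W - L) / sqrt(2 * (W + L)))) / 2."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5   # no decisive games: no evidence either way (a convention chosen here)
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


wins, losses, draws = 5975, 6077, 26894    # draws is deliberately never used
print(round(los_erf(wins, losses), 4))     # 0.1764
```

Since N*(1-draw_ratio) is just wins + losses, the draw count cancels out of the formula, which is why it is not even a parameter here.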

Gerd Isenberg
 Posts: 2130
 Joined: Wed Mar 08, 2006 7:47 pm
 Location: Hattingen, Germany
by Gerd Isenberg » Fri Jan 24, 2014 7:46 am
I will try to do that in the next few days.

Isaac
 Posts: 265
 Joined: Sat Feb 22, 2014 7:37 pm
by Isaac » Sun May 25, 2014 2:01 am
I would like to mention that the formula given on the wiki page (
http://chessprogramming.wikispaces.com/Match+Statistics , which can be written as los=(1+erf((wins-losses)/(2*(wins+losses))**0.5))/2. in a Fortran program) is not equivalent to the LOS given by Rémi Coulom in the first link he gives in this thread, namely
http://www.talkchess.com/forum/viewtopi ... 82&t=30624.
For example, if you tell the program that there is 1 win and 0 losses, Rémi's program gives the value 0.75, while the formula on the wiki gives 0.84.

AlvaroBegue
 Posts: 922
 Joined: Tue Mar 09, 2010 2:46 pm
 Location: New York
 Full name: Álvaro Begué (RuyDos)
by AlvaroBegue » Sun May 25, 2014 2:25 am
The erf formula is an approximation that works well with large numbers of games. Using it with 1 win and 0 losses is not a good idea. But we are trying to answer questions about whether 10,000 games or 30,000 games is enough to get statistical significance, and that's done perfectly well by the erf formula.
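A quick way to see how fast the two values converge is to print them side by side. This is only a sketch: the exact value assumes a uniform prior (computed as a binomial tail), which reproduces the 0.75 reported for 1 win and 0 losses, but it is not claimed to be the code from the linked posts. The names los_erf and los_exact and the intermediate sample sizes are made up for illustration.

```python
import math


def los_erf(wins: int, losses: int) -> float:
    """Normal (erf) approximation; draws do not enter."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def los_exact(wins: int, losses: int) -> float:
    """Exact value under an ASSUMED uniform prior: P(Bin(W + L + 1, 1/2) <= W)."""
    n = wins + losses + 1
    term, total = 1, 0
    for k in range(wins + 1):
        total += term                       # add C(n, k)
        term = term * (n - k) // (k + 1)    # next binomial coefficient, exact
    return total / 2**n


# Only (1, 0) and (5975, 6077) come from the thread; the others are arbitrary.
for wins, losses in [(1, 0), (10, 5), (100, 80), (5975, 6077)]:
    exact = los_exact(wins, losses)
    approx = los_erf(wins, losses)
    print(f"W={wins:5d} L={losses:5d} exact={exact:.6f} "
          f"erf={approx:.6f} diff={approx - exact:+.6f}")
```

With one decisive game the two values differ by roughly 0.09; for the fishtest-sized sample they should agree to about four decimal places, in line with the numbers quoted earlier in the thread.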

Isaac
 Posts: 265
 Joined: Sat Feb 22, 2014 7:37 pm
by Isaac » Sun May 25, 2014 4:12 am
I see, thank you. I knew they were asymptotically equivalent, but I didn't realize that the approximation was not suitable for a low number of games.