Hello Dann:
AFAIK, LOS is intended to be applied to a single n-game match, not across k n-game matches (you are usually running k = n, that is, n matches of n games each). I think this is the key. If you average the n LOS results from the n matches, you will get <LOS> = (1/n)*SUM(LOS_i ; i = 1, ..., n) = 50% (or very close to 50%), since the score distribution is symmetric about 50%.
As some people explained before, LOS does not quantify an Elo difference, just the probability that Elo > 0 (whether it is +0.01 or +9000; that is, likelihood of superiority, not how much superiority). In other words, LOS is the probability of the score being greater than 0.5 (50%), as in the image I posted before in this thread:
This follows the classical definition of probability: the cases we are interested in (superiority, score from 0.5 to 1) divided by the whole range of scores (from 0 to 1). Just look at the limits of the definite integrals in the last equation of the image.
------------
Regarding tails, your examples follow the binomial distribution, which approaches the normal distribution for large samples. I wrote a little about the binomial distribution some days ago, in an answer to a recent post by Dann:
Re: How good is your engine?
With dimensionless numbers (just dividing the mean and the standard deviation by the number of games), the score at z standard deviations from the mean will be [0.5*n + z*0.5*sqrt(n)]/n = 0.5 + 0.5*z/sqrt(n), where z is the z-score: z = (2*score - 1)*sqrt(n). As the number of games increases, the tails are expected to be closer to the central point, and so is the Elo difference:
Elo = 400*log[score/(1 - score)] = 400*log( [ 0.5 + 0.5*z/sqrt(n) ] / { 1 - [ 0.5 + 0.5*z/sqrt(n) ] } )
Elo = 400*log[0.5 + 0.5*z/sqrt(n)] - 400*log[0.5 - 0.5*z/sqrt(n)]
Elo = 400*log{[1 + z/sqrt(n)]/2} - 400*log{[1 - z/sqrt(n)]/2} = 400*log[1 + z/sqrt(n)] - 400*log(2) - 400*log[1 - z/sqrt(n)] + 400*log(2)
Elo = [400/ln(10)]*ln[1 + z/sqrt(n)] - [400/ln(10)]*ln[1 - z/sqrt(n)]
n >> 1 ; 1/sqrt(n) << 1
Taylor series (|x| << 1): ln(1 + x) ~ x - x²/2 + x³/3 - ... ~ x
Elo ~ [400/ln(10)]*z/sqrt(n) - [400/ln(10)]*[-z/sqrt(n)] = [800/ln(10)]*z/sqrt(n)
------------
Another form:
Elo = 400*log(wins/losses) since draws do not exist in this example (n = wins + losses).
Elo = 400*log( [ 0.5 + 0.5*(wins - losses)/(wins + losses) ] / { 1 - [ 0.5 + 0.5*(wins - losses)/(wins + losses) ] } )
Elo = 400*log{ [ 1 + (wins - losses)/n ] / [ 1 - (wins - losses)/n ] }
Elo = [400/ln(10)]*ln[ 1 + (wins - losses)/n ] - [400/ln(10)]*ln[ 1 - (wins - losses)/n ]
Taylor series again because |wins - losses|/n << 1:
Elo ~ [800/ln(10)]*(wins - losses)/n
------------
Another form is Elo ~ [1600/ln(10)]*(score - 0.5) when the score is close to 50%.
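As a quick numerical check of the approximations above, here is a minimal Python sketch. The function names (elo_exact, elo_linear, score_at_z) are just illustrative choices of mine, and the loop uses z = 2 as an arbitrary example of a tail. It compares the exact logistic Elo with the first-order approximation while n grows:

```python
import math


def elo_exact(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_linear(score):
    """First-order (Taylor) approximation, valid when score is close to 0.5."""
    return 1600.0 / math.log(10.0) * (score - 0.5)


def score_at_z(z, n):
    """Score fraction located z standard deviations from 50% in n drawless games."""
    return 0.5 + 0.5 * z / math.sqrt(n)


if __name__ == "__main__":
    z = 2.0  # arbitrary example tail; any fixed z shows the same trend
    for n in (100, 10_000, 1_000_000):
        s = score_at_z(z, n)
        print(f"n = {n:>9}: score = {s:.5f}, "
              f"exact Elo = {elo_exact(s):.4f}, linear Elo = {elo_linear(s):.4f}")
```

For a fixed z, both Elo values shrink like 1/sqrt(n), and the two columns get closer to each other as the score approaches 50%.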
I hope there are no typos. Elo differences tend to zero, as expected. Why is LOS not close to 50% then? Because LOS = 0.5*{1 + erf[z/sqrt(2)]} by definition, where erf is the error function. It was seen before that z = (2*score - 1)*sqrt(n), that is, proportional to sqrt(n) for a given score, so knowing the plot of erf, it is expected that LOS is far from 50% with such an unbelievable number of games, even if scores are close to 50%. How large can |z| = abs(z) be in n-game matches?
This can give a hint. Another definition of LOS with wins and losses is 0.5*( 1 + erf{ (wins - losses)/sqrt[2*(wins + losses)] } ), reinforcing the latter statement.
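The two LOS expressions above can be tried with a short Python sketch. The function names and the 50.5% example score are my own assumptions, used only to show how LOS moves away from 50% when the score is held fixed and n grows:

```python
import math


def los_from_z(z):
    """LOS = 0.5*{1 + erf[z/sqrt(2)]} for a given z-score."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def los(wins, losses):
    """LOS from wins and losses only (draws do not exist in this example)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


if __name__ == "__main__":
    # Same score (50.5%) with more and more games: z grows like sqrt(n).
    for n in (1_000, 100_000, 10_000_000):
        wins = round(0.505 * n)
        losses = n - wins
        z = (wins - losses) / math.sqrt(n)
        print(f"n = {n:>10}: z = {z:.3f}, LOS = {los(wins, losses):.6f}")
```

Both functions agree because z = (wins - losses)/sqrt(wins + losses) when there are no draws.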
Last but not least, your desired LOS = 50% (other than by averaging n LOS values, as at the start of this post) would be reached when wins = losses, which is less and less probable as n grows. Using the binomial distribution once again, and letting the number of games be even (otherwise wins <> losses when draws are not a possible result), the central binomial coefficient plays a role:
Prob.(wins) = W = 0.5
Prob.(losses) = L = 1 - W = 0.5
Prob.(wins = losses) = (2*wins over wins) * W^wins * L^losses
Prob.(wins = losses) = {[(2*wins)!]/[(wins!)²]} * 0.5^wins * 0.5^wins = {[(2*wins)!]/[(wins!)²]} * 0.5^(wins + wins) = {[(2*wins)!]/[(wins!)²]} * 0.25^wins
{[(2*wins)!]/[(wins!)²]} ~ (4^wins)/sqrt(pi*wins) applying Stirling's formula of x! ~ x^x * exp(-x) * sqrt(2*pi*x).
Prob.(wins = losses) ~ [(4^wins)/sqrt(pi*wins)] * 0.25^wins = 1/sqrt(pi*wins)
A very small value, even though you would expect n*Prob.(wins = losses) = n*[1/sqrt(pi*wins)] = 2*sqrt(wins/pi) = sqrt(2*n/pi) such outcomes (or nearly this value), with n = 2*wins, if you run n simulations of n games each.
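The Stirling approximation above can be compared with the exact binomial probability in a few lines of Python. This is only a sketch with function names of my own; it assumes Python 3.8 or later for math.comb:

```python
import math


def prob_tie_exact(wins):
    """Exact Prob.(wins = losses) in 2*wins drawless games with W = L = 0.5."""
    n = 2 * wins
    # Integer binomial coefficient divided by 2^n; Python handles the big integers.
    return math.comb(n, wins) / 2 ** n


def prob_tie_approx(wins):
    """Stirling-based approximation: 1/sqrt(pi*wins)."""
    return 1.0 / math.sqrt(math.pi * wins)


if __name__ == "__main__":
    for wins in (5, 50, 500, 5_000):
        exact = prob_tie_exact(wins)
        approx = prob_tie_approx(wins)
        print(f"games = {2 * wins:>6}: exact = {exact:.6f}, approx = {approx:.6f}, "
              f"ratio = {exact / approx:.6f}")
```

The ratio column should approach 1 as the number of games grows, while both probabilities fall towards zero.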
Summary:
a) LOS cares about the sign of the Elo difference: sign(LOS - 0.5) = sign(Elo difference).
b) Since LOS can be related to the z-score, which is proportional to sqrt(n) for a given score, and sqrt(n) tends to infinity, the z-score tends to ±infinity and LOS tends to 0 or 1.
c) The case with wins = losses (LOS = 50%) is less and less probable with more games: probability = 1/sqrt(pi*wins) = sqrt[2/(pi*games)].
d) LOS is intended to be applied to a single n-game match, not to k n-game matches (in your case, n n-game matches). If you want to obtain 50% by any means, try to average those n LOS values. I expect <LOS> = (1/n)*SUM(LOS_i ; i = 1, ..., n) = 50% or very close, given the symmetric distribution of scores with respect to score = 50% (the score that brings LOS = 50%).
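Point d) can be tried with a small simulation. This is only a sketch under my own assumptions: two equal engines, no draws, Python's standard pseudo-random generator with a fixed seed, and small example values for the number of games and matches. It averages the LOS of many independent matches:

```python
import math
import random


def los(wins, losses):
    """LOS = 0.5*( 1 + erf{ (wins - losses)/sqrt[2*(wins + losses)] } )."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def simulate_mean_los(games, matches, seed=0):
    """Average LOS over `matches` matches of `games` games between equal engines."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    total = 0.0
    for _ in range(matches):
        # Each game is a fair coin flip: win with probability 0.5, otherwise loss.
        wins = sum(rng.random() < 0.5 for _ in range(games))
        total += los(wins, games - wins)
    return total / matches


if __name__ == "__main__":
    mean_los = simulate_mean_los(games=1_000, matches=1_000)
    print(f"<LOS> over 1000 matches of 1000 games: {mean_los:.4f}")
```

Individual LOS_i values are spread over the whole range from 0 to 1, but their average should stay close to 50% because of the symmetry of the score distribution.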
Regards from Spain.
Ajedrecista.