Chess Statistics

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Milos wrote:
Edmund wrote:This LOS calculation uses a normal distribution to approximate the win distribution. The multinomial distribution that would otherwise be needed would take ages to compute an exact value for your request with > 10000 games.
Of course you use the normal distribution approximation. There is nothing wrong in using it per se.
However, even there the draw ratio is implicitly included in the variance.
The problem is calculating LOS from difference tables. That is wrong.
Since you have the actual normal distribution approximation of ELO for both engines (let's call the random variables for the corresponding PDFs X and Y), you could easily calculate the PDF of the random variable Z=X-Y (just a simple convolution). And LOS would be Pr(Z>0).
the variance in my calculation = SQRT((1 - Draws/N) * N)

So in your example:
1/10000/0: SQRT((1 - 10000/10001) * 10001) vs
1/0/0: SQRT((1 - 0/1) * 1)

the difference is very low
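
As a quick check of the two cases above, here is a minimal Python sketch of that formula. It assumes results are written wins/draws/losses; the function name spread is hypothetical and only used for illustration.

```python
import math

def spread(wins, draws, losses):
    """SQRT((1 - Draws/N) * N) as written in the post; N is the total game count."""
    n = wins + draws + losses
    return math.sqrt((1 - draws / n) * n)

many_draws = spread(1, 10000, 0)  # the 1/10000/0 case
no_draws = spread(1, 0, 0)        # the 1/0/0 case
print(many_draws, no_draws)
```

Algebraically (1 - Draws/N) * N is just N - Draws, the number of decisive games, which is 1 in both cases.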
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Laskos wrote:Likelihood of Success that one engine is better than another. It does not depend on the number of draws. Error intervals yes, depend. If you want a precise formula for LOS, I can give it.
Would you please give it, since just expanding the acronym doesn't tell much.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Laskos wrote:
Edmund wrote: you will notice that the draw rate is very much considered.
It shouldn't be considered for LOS.

Kai
Kai, some months ago we were discussing how to generate a LOS table. For this we took the draw rate as constant because it shouldn't change much from engine to engine (somewhere around 0.3). However, if the draw rate is actually given, we can generate even more accurate results.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Edmund wrote:the variance in my calculation = SQRT((1 - Draws/N) * N)

So in your example:
1/10000/0: SQRT((1 - 10000/10001) * 10001) vs
1/0/0: SQRT((1 - 0/1) * 1)

the difference is very low
OK, that explains it. Your variance approximation is not accurate enough for these cases.
Why not use SQRT((win_ratio*loss_ratio-0.25*draw_ratio)/num_games) instead?
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

Milos wrote:
Laskos wrote:Likelihood of Success that one engine is better than another. It does not depend on the number of draws. Error intervals yes, depend. If you want a precise formula for LOS, I can give it.
Would you please give it, since just expanding the acronym doesn't tell much.
YOU asked it, now take that:

For any match +a =b -c, the formula which gives the exact LOS

LOS = 1 - ( binomial( a+c+2,0) + binomial( a+c+2,1) + ... + binomial( a+c+2, c) + binomial(a+c+1,c) ) / 2^(a+c+2)

and nothing depends on the number of draws.

Kai
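
For anyone who wants to try the formula, a minimal Python sketch that follows it term by term. The function name los_exact is hypothetical; exact fractions are used here only to avoid rounding, which is an implementation choice and not part of the formula.

```python
from fractions import Fraction
from math import comb

def los_exact(a, c):
    """LOS for a wins and c losses, following the formula in the post.

    The number of draws b does not appear anywhere.
    """
    n = a + c + 2
    # binomial(a+c+2, 0) + ... + binomial(a+c+2, c) + binomial(a+c+1, c)
    total = sum(comb(n, k) for k in range(c + 1)) + comb(a + c + 1, c)
    return 1 - Fraction(total, 2 ** n)

print(float(los_exact(10, 5)))
```

With equal wins and losses the result is exactly 1/2, and swapping a and c gives the complementary value.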
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Milos wrote:
Edmund wrote:the variance in my calculation = SQRT((1 - Draws/N) * N)

So in your example:
1/10000/0: SQRT((1 - 10000/10001) * 10001) vs
1/0/0: SQRT((1 - 0/1) * 1)

the difference is very low
OK, that explains it. Your variance approximation is not accurate enough for these cases.
Why not use SQRT((win_ratio*loss_ratio-0.25*draw_ratio)/num_games) instead?
in your example loss_ratio = 0
thus the expression under the square root is negative
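
To make this concrete, a small Python sketch that evaluates only the expression under the square root in the suggested formula for the 1/10000/0 example. Wins/draws/losses ordering is assumed, and radicand is a hypothetical helper name.

```python
def radicand(wins, draws, losses):
    """(win_ratio*loss_ratio - 0.25*draw_ratio) / num_games, without the SQRT."""
    n = wins + draws + losses
    win_ratio, draw_ratio, loss_ratio = wins / n, draws / n, losses / n
    return (win_ratio * loss_ratio - 0.25 * draw_ratio) / n

value = radicand(1, 10000, 0)
print(value)  # negative here, so math.sqrt(value) would raise ValueError
```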
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Edmund wrote:in your example loss_ratio = 0
thus the expression under the square root is negative
I agree it doesn't work for zero, so you can't use it per se in my example, but it works for any positive number ;).
However, you can use the same "trick" as for elo calculations when there are zero points for one opponent.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Laskos wrote:YOU asked it, now take that:

For any match +a =b -c, the formula which gives the exact LOS

LOS = 1 - ( binomial( a+c+2,0) + binomial( a+c+2,1) + ... + binomial( a+c+2, c) + binomial(a+c+1,c) ) / 2^(a+c+2)

and nothing depends on the number of draws.
Seems you don't understand the problem from the start. You simply ignore draws as if they never happened. That's wrong. It's not a binomial but a multinomial distribution.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Laskos wrote:
Milos wrote:
Laskos wrote:Likelihood of Success that one engine is better than another. It does not depend on the number of draws. Error intervals yes, depend. If you want a precise formula for LOS, I can give it.
Would you please give it, since just expanding the acronym doesn't tell much.
YOU asked it, now take that:

For any match +a =b -c, the formula which gives the exact LOS

LOS = 1 - ( binomial( a+c+2,0) + binomial( a+c+2,1) + ... + binomial( a+c+2, c) + binomial(a+c+1,c) ) / 2^(a+c+2)

and nothing depends on the number of draws.

Kai
In my interpretation, the LOS is the probability that two equally strong engines do not reach >= the given score. IOW LOS = P(x < Score)

To calculate this, one needs the sum of P(w,d,l) over all combinations of w,d,l that fulfill the requirement (x < Score)

and in P(w,d,l) the draw rate does play a role, as
P(w,d,l) = N! / (w! d! l!) * prob_w^w * prob_d^d * prob_w^l

where prob_d = draw rate
and prob_w = (1 - draw rate) / 2 (losses share prob_w because the engines are assumed equally strong)
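
A minimal Python sketch of this sum, assuming the score is wins + draws/2 and that the draw rate is supplied by the caller. The names trinomial_pmf and los_equal_engines are hypothetical. It enumerates every (w, d, l) with w + d + l = N, so it is only practical for small N.

```python
from math import factorial

def trinomial_pmf(w, d, l, draw_rate):
    """P(w,d,l) for two equally strong engines with the given draw rate."""
    n = w + d + l
    prob_d = draw_rate
    prob_w = (1 - draw_rate) / 2  # also used as the loss probability
    coeff = factorial(n) // (factorial(w) * factorial(d) * factorial(l))
    return coeff * prob_w**w * prob_d**d * prob_w**l

def los_equal_engines(wins, draws, losses, draw_rate):
    """Sum of P(w,d,l) over all outcomes scoring strictly below the observed score."""
    n = wins + draws + losses
    observed = 2 * wins + draws  # score in half-points, avoids float comparison
    total = 0.0
    for w in range(n + 1):
        for d in range(n - w + 1):
            l = n - w - d
            if 2 * w + d < observed:
                total += trinomial_pmf(w, d, l, draw_rate)
    return total

print(los_equal_engines(1, 0, 0, 0.3))
```

For a single win with a draw rate of 0.3, the outcomes below the observed score are the draw (0.3) and the loss (0.35).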
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

Milos wrote:
Laskos wrote:YOU asked it, now take that:

For any match +a =b -c, the formula which gives the exact LOS

LOS = 1 - ( binomial( a+c+2,0) + binomial( a+c+2,1) + ... + binomial( a+c+2, c) + binomial(a+c+1,c) ) / 2^(a+c+2)

and nothing depends on the number of draws.
Seems you don't understand the problem from the start. You simply ignore draws as if they never happened. That's wrong. It's not a binomial but a multinomial distribution.
Seems you do not understand the result. I know that it is a trinomial distribution, used to calculate the error intervals; there the number of draws is very important. But for LOS it is not. Can you understand that, or is your brain just flat?

Kai