An objective test process for the rest of us?

Rob

Re: An objective test process for the rest of us?

Post by Rob »

hgm wrote:Yes, variance does not always have to be finite. But to have infinite variance, a quantity should be able to attain arbitrarily large values (implying that it should also be able to attain infinitely many different values).

For chess scores a game can result in only 3 scores, 0, 1/2 or 1. That means that the variance can be at most 1/2 squared = 1/4. Standard deviation is always limited to half the range between the minimum and the maximum outcome, and pathological cases can only occur when this range is infinite. It can then even happen that the expectation value does not exist. But no such thing in chess scores.
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?

Edit: Nicolai, I missed your post while writing this post.
nczempin

Re: An objective test process for the rest of us?

Post by nczempin »

Rob wrote:
hgm wrote:Yes, variance does not always have to be finite. But to have infinite variance, a quantity should be able to attain arbitrarily large values (implying that it should also be able to attain infinitely many different values).

For chess scores a game can result in only 3 scores, 0, 1/2 or 1. That means that the variance can be at most 1/2 squared = 1/4. Standard deviation is always limited to half the range between the minimum and the maximum outcome, and pathological cases can only occur when this range is infinite. It can then even happen that the expectation value does not exist. But no such thing in chess scores.
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?

Edit: Nicolai, I missed your post while writing this post.
I don't understand. What is the difference between "fluctuations" and variance?
nczempin

Re: An objective test process for the rest of us?

Post by nczempin »

Rob wrote:
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?
I think hgm answered this particular question already.

Even without knowing the exact distribution model, we can deduce a few properties that this distribution must have. Maybe you are talking about something way too advanced for me, but given that a single chess result is only 1, 1/2 or 0 (or +, 0 and -), I can't see how any strange distribution is even possible for any summation of these individual results.
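
To make the point about summed results concrete, here is a minimal Python sketch that builds the exact distribution of the total score after a number of games. It assumes the games are independent and share the same win and draw probabilities, which real engine matches need not do; the function names and the probabilities in the usage note are made up for illustration.

```python
from collections import defaultdict


def total_score_distribution(n_games, p_win, p_draw):
    """Exact distribution of the total score after n_games games.

    Assumes independent games with identical win/draw/loss probabilities.
    Returns a dict mapping total score -> probability.
    """
    p_loss = 1.0 - p_win - p_draw
    single = {0.0: p_loss, 0.5: p_draw, 1.0: p_win}
    dist = {0.0: 1.0}  # before any game is played the total is 0
    for _ in range(n_games):
        nxt = defaultdict(float)
        for total, p in dist.items():
            for score, q in single.items():
                nxt[total + score] += p * q
        dist = dict(nxt)
    return dist


def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var
```

Calling `mean_and_variance(total_score_distribution(10, 0.4, 0.3))` gives a total that can only lie between 0 and 10, with a variance equal to ten times the single-game variance and therefore never above 10/4. Whatever the probabilities are, the sum stays bounded and its variance stays finite.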
nczempin

Re: An objective test process for the rest of us?

Post by nczempin »

bob wrote:
nczempin wrote: Could you give me the correct way to analyse my example statistically?
I don't think we have such a methodology yet. At least I have not found one after all the games I have played using our cluster. This is really a difficult problem to address. I'm simply trying to point out to everyone that a few dozen games are a poor indicator. Again the easiest way to see why is to run the same "thing" more than once and look at how unstable the results are.
But how unstable can it get?

Suppose you play a match of two games (black and white), and your engine wins both games. Then you play another match, and your engine loses both games.

That is the maximum variance you will ever be able to get. Correct me if I'm wrong.
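
A brute-force check of that claim, as a small Python sketch: it lists every possible result of a two-game match as a fraction of the available points, then looks for the pair of match results with the largest variance. The helper names are hypothetical, and "variance" here means the population variance of the two observed match scores.

```python
from itertools import product
from statistics import pvariance

GAME_SCORES = (0.0, 0.5, 1.0)  # loss, draw, win


def match_fractions(games_per_match=2):
    """All possible match results, as a fraction of the available points."""
    return sorted({sum(games) / games_per_match
                   for games in product(GAME_SCORES, repeat=games_per_match)})


def most_unstable_pair(games_per_match=2):
    """Pair of match results with the largest population variance."""
    fractions = match_fractions(games_per_match)
    return max(product(fractions, repeat=2), key=pvariance)
```

`most_unstable_pair()` picks one match scored 0 and one scored 1, and `pvariance` of that pair is 0.25, the same 1/4 bound hgm gave for single games.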
Last edited by nczempin on Sat Sep 15, 2007 2:53 pm, edited 1 time in total.
Rob

Re: An objective test process for the rest of us?

Post by Rob »

nczempin wrote:
Rob wrote:
hgm wrote:Yes, variance does not always have to be finite. But to have infinite variance, a quantity should be able to attain arbitrarily large values (implying that it should also be able to attain infinitely many different values).

For chess scores a game can result in only 3 scores, 0, 1/2 or 1. That means that the variance can be at most 1/2 squared = 1/4. Standard deviation is always limited to half the range between the minimum and the maximum outcome, and pathological cases can only occur when this range is infinite. It can then even happen that the expectation value does not exist. But no such thing in chess scores.
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?

Edit: Nicolai, I missed your post while writing this post.
I don't understand. What is the difference between "fluctuations" and variance?
Some distributions have no defined variance, but measured values fluctuate around some median value.
nczempin

Re: An objective test process for the rest of us?

Post by nczempin »

Rob wrote:
nczempin wrote:
Rob wrote:
hgm wrote:Yes, variance does not always have to be finite. But to have infinite variance, a quantity should be able to attain arbitrarily large values (implying that it should also be able to attain infinitely many different values).

For chess scores a game can result in only 3 scores, 0, 1/2 or 1. That means that the variance can be at most 1/2 squared = 1/4. Standard deviation is always limited to half the range between the minimum and the maximum outcome, and pathological cases can only occur when this range is infinite. It can then even happen that the expectation value does not exist. But no such thing in chess scores.
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?

Edit: Nicolai, I missed your post while writing this post.
I don't understand. What is the difference between "fluctuations" and variance?
Some distributions have no defined variance, but measured values fluctuate around some median value.
Okay, I will rephrase:

In what way are the fluctuations that R. Hyatt describes different from "variance"?
Rob

Re: An objective test process for the rest of us?

Post by Rob »

nczempin wrote:
Rob wrote:
nczempin wrote:
Rob wrote:
hgm wrote:Yes, variance does not always have to be finite. But to have infinite variance, a quantity should be able to attain arbitrarily large values (implying that it should also be able to attain infinitely many different values).

For chess scores a game can result in only 3 scores, 0, 1/2 or 1. That means that the variance can be at most 1/2 squared = 1/4. Standard deviation is always limited to half the range between the minimum and the maximum outcome, and pathological cases can only occur when this range is infinite. It can then even happen that the expectation value does not exist. But no such thing in chess scores.
I meant could it be there's no variance in computer chess tournament results? R.Hyatt describes large fluctuations so the question is, is it correct to calculate standard deviation and variance without knowing the distribution model?

Edit: Nicolai, I missed your post while writing this post.
I don't understand. What is the difference between "fluctuations" and variance?
Some distributions have no defined variance, but measured values fluctuate around some median value.
Okay, I will rephrase:

In what way are the fluctuations that R. Hyatt describes different from "variance"?
Variance and standard deviation are mathematical definitions that tell you something about the distribution. With fluctuations I mean measurements. They don't tell you anything predefined about the distribution.
I know next to nothing about this, it's just what I read on Wikipedia :)
hgm
Full name: H G Muller

Re: An objective test process for the rest of us?

Post by hgm »

Rob wrote:I know next to nothing about this, it's just what I read on Wikipedia :)
Well, better not interfere, then, because you are only causing confusion. What you read on Wikipedia pertains to processes with an infinite range of outcomes, and is totally irrelevant here (although very interesting in itself).

The concept of variance is not only defined for probability distributions, but also for finite data sets, as the average of squared deviations from the mean. The variance of a distribution is merely a special case of that, as the variance of a data set drawn by sampling a process with the given distribution will, for large samples, tend to the variance of the distribution. This is simply because, according to the law of large numbers, the occurrence frequencies of the values in the sample approach their probabilities, so the sample's empirical distribution approaches the distribution from which it is drawn.
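
As an illustration of that convergence, the following Python sketch compares the variance of a simulated set of game results with the variance of the distribution it was drawn from. It assumes independent games with fixed win and draw probabilities; the probabilities, the seed and the function names are arbitrary choices for this example, not anything measured.

```python
import random


def distribution_variance(p_win, p_draw):
    """Variance of a single game score (1, 1/2 or 0) under the model."""
    mean = p_win + 0.5 * p_draw
    second_moment = p_win * 1.0 + p_draw * 0.25
    return second_moment - mean ** 2


def sample_variance(scores):
    """Average of squared deviations from the mean of a finite data set."""
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / len(scores)


def simulate_games(n, p_win, p_draw, rng):
    """Draw n independent game scores with the given probabilities."""
    p_loss = 1.0 - p_win - p_draw
    return rng.choices((1.0, 0.5, 0.0), weights=(p_win, p_draw, p_loss), k=n)


if __name__ == "__main__":
    rng = random.Random(2007)  # fixed seed so the run is repeatable
    target = distribution_variance(0.4, 0.3)
    for n in (10, 100, 10_000, 1_000_000):
        scores = simulate_games(n, 0.4, 0.3, rng)
        print(n, sample_variance(scores), target)
```

For small samples the two numbers can differ noticeably, which is the instability seen in short matches. For large samples they agree closely, and neither can exceed 1/4.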
nczempin

Re: An objective test process for the rest of us?

Post by nczempin »

Rob wrote:
nczempin wrote: Okay, I will rephrase:

In what way are the fluctuations that R. Hyatt describes different from "variance"?
Variance and standard deviation are mathematical definitions that tell you something about the distribution. With fluctuations I mean measurements. They don't tell you anything predefined about the distribution.
I know next to nothing about this, it's just what I read on Wikipedia :)
Okay, while I think these Cauchy distributions are interesting and I will surely check them out some time, I think I can safely say that they have nothing to do with computer chess.
hgm

Re: An objective test process for the rest of us?

Post by hgm »

Indeed, Cauchy distributions have the very interesting property that the average of a number of draws from a variable distributed like that has exactly the same distribution as a single draw. So you gain nothing by averaging over a longer measurement, and might as well use the first measurement you make. (Properly weighted averaging can do better, though.) Longer averaging only increases the chance of a fluke that spoils everything you averaged before.
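
The averaging property can be seen in a small simulation. This Python sketch measures the interquartile range of averages of Cauchy draws and, for contrast, of normal draws. The sample sizes and helper names are assumptions made for the example; the Cauchy draws use the standard inverse-CDF formula tan(pi * (u - 1/2)).

```python
import math
import random
from statistics import quantiles


def cauchy_draw(rng):
    """One draw from a standard Cauchy distribution (inverse-CDF method)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def normal_draw(rng):
    """One draw from a standard normal distribution, for comparison."""
    return rng.gauss(0.0, 1.0)


def iqr_of_averages(draw, n_per_average, n_averages, rng):
    """Interquartile range of n_averages averages of n_per_average draws."""
    averages = [sum(draw(rng) for _ in range(n_per_average)) / n_per_average
                for _ in range(n_averages)]
    q1, _, q3 = quantiles(averages, n=4)
    return q3 - q1


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 10, 100):
        print(n,
              iqr_of_averages(cauchy_draw, n, 2000, rng),
              iqr_of_averages(normal_draw, n, 2000, rng))
```

The spread of the normal averages shrinks roughly as one over the square root of the number of draws, while the spread of the Cauchy averages stays near 2 however many draws go into each average.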

But such pathological behavior can only occur when measuring an unbounded quantity, e.g. if you wanted to determine the average game length. (Remember, claiming 50-move or repetition draws is not obligatory!) Then your average would probably quickly converge to something between 40 and 100 moves, but if you continued measuring long enough you would sooner or later run into that million-move game where neither engine could claim, and you only came back a day later to reset your computer. That single game would then have totally screwed up your average, and to get it back down to something realistic you would have to collect so many 'normal' games that you would encounter an even worse fluke before you could do it.