This is a topic I had already mentioned. In a perfect world, if A plays B, and A is better than B, then A will always win. There will then be perfect correlation between games, since any game can be used to predict the outcome of any other, even though there is none of the "causal dependency" present, since the games do not affect each other in any fashion.
Here's the next segment, which again is a rehash of what has been explained previously:
=====================================================
Let's consider a series of hypothetical trial runs. I assume that you are as capable as anyone in the industry of preventing any causal
dependence between various games in the trials, so causal dependence will not factor in my calculations at all. I believe you when you
say that you have solved that problem.
Trial A: Crafty plays forty positions against each of five opponents with colors each way for a total of 400 games. The engines are each
limited to a node count of 10,000,000. Crafty wins 198 games.
Trial B: Same as Trial A, except the node count limit is changed to 10,010,000. Crafty wins 190 games.
Now we compare these two results to see if anything extraordinary has happened. In 400 games, the standard deviation is 10, and the
difference in results was only 8, so we are well within expected bounds. There's nothing to get excited about, and we move on to the
next experiment.
Trial C: Same as Trial A, except that each position-opponent-color combination is played out 64 times. Yes, this is a silly experiment,
because we know that repeated playouts with a fixed node count give identical results, but bear with me. Crafty wins (as expected)
exactly 12672 games.
Trial D: Same as Trial B, except that each position-opponent-color combination is played out 64 times. Crafty wins 12160, as we knew it
would.
Now we compare the latter two trials. In 25,600 games the standard deviation is 80, and our difference in result was 512, so we are more
than six sigmas out. Holy cow! Run out and buy lottery tickets!
In this deterministic case it is easy to see what happened. The perfect correlation of the sixty-four repeats of each combination meant
that we were gaining no new information by expanding the trial. The calculation of standard deviation, however, assumes no correlation
whatsoever, i.e. perfect mathematical independence. Since the statistical assumption was not met, the statistical result is absurd.
=====================================================
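
For anyone who wants to check the arithmetic in the quoted segment, here is a minimal sketch in Python. It assumes, as the quoted numbers imply, that each game is a plain win/loss with a 50% win probability (so the standard deviation of the win count over n independent games is sqrt(n * 0.5 * 0.5)), and it ignores draws. The function names are my own, purely for illustration. The last step is not in the quote: if the 64 repeats are perfectly correlated, the total is just 64 times the 400-game result, so the standard deviation scales by 64 rather than by sqrt(64).

```python
import math

def binomial_sd(n_games, p_win=0.5):
    """SD of the win count over n_games INDEPENDENT win/loss games.
    p_win=0.5 is an assumption matching the quoted figures."""
    return math.sqrt(n_games * p_win * (1.0 - p_win))

def sigmas(wins_x, wins_y, sd):
    """Difference between two results, measured in standard deviations."""
    return abs(wins_x - wins_y) / sd

# Trials A and B: 40 positions x 5 opponents x 2 colors = 400 games
games = 40 * 5 * 2
sd_400 = binomial_sd(games)               # 10.0
z_ab = sigmas(198, 190, sd_400)           # 0.8 sigma: nothing unusual

# Trials C and D: every combination replayed 64 times, identical outcomes
repeats = 64
wins_c = 198 * repeats                    # 12672
wins_d = 190 * repeats                    # 12160

# Naive calculation: pretends all 25,600 games are independent
naive_sd = binomial_sd(games * repeats)   # 80.0
z_naive = sigmas(wins_c, wins_d, naive_sd)            # 6.4 sigma

# Perfectly correlated repeats: total = 64 * (400-game result),
# so the SD is 64 * 10, not sqrt(64) * 10
correlated_sd = repeats * sd_400          # 640.0
z_correlated = sigmas(wins_c, wins_d, correlated_sd)  # 0.8 sigma again

print(z_ab, z_naive, z_correlated)
```

With the correlation accounted for, Trials C and D are exactly as unremarkable as Trials A and B, which is the point of the quoted example: the repeats added games but no information.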