Dann Corbit wrote:When the math says something utterly stupid, then the math is wrong.

Not always. Math can be proven. Stupidity is just an opinion. Math can be wrong, but only if it can be proven wrong.

Dann Corbit wrote:It would be utterly unsurprising if after a million and two games the next time, the one who lost won three games.

That depends on your 'surprise threshold'. It is exactly as surprising as the loser taking the next three games after a 2-0 start (making it 2-3). Perhaps not very surprising, but the likelihood is nowhere near 50%.

The proper math is this: if you know nothing about the players, the probability p that A beats B gets a flat ('homogeneous') prior, i.e. the distribution P(p) = 1 for all p in [0,1]. The likelihood of the hypothesis "the win probability for A is p" is then weighted by a factor equal to the probability with which it predicted the observed outcome. We have two outcomes, both wins, and the predicted probability of a single win is p. So the prior (1) is multiplied by p*p. After normalization the posterior for p becomes 3*p*p. (The integral of 3*p*p from p=0 to p=1 equals 1, so 3 was the correct normalization factor.)

The probability that the next game will also be a win (i.e. 3-0) is the average of p under this posterior distribution, the integral of (3*p*p)*p from 0 to 1, which equals 3/4. Of all cases where you see two players totally unknown to you reach a 2-0 score in their first two games, only 25% will see the score rebound to 2-1 (if draws are ignored), and in 75% of the cases the score will go to 3-0. The chance that it gets to 2-3 is not (1/4)^3, though, because the games are not independent: every further loss shifts the posterior towards lower p. The joint probability that the loser wins the next three games is the integral of 3*p*p*(1-p)^3 from 0 to 1, which is 1/20. Whether it is 'utterly unsurprising' that something with a likelihood of 1 in 20 happens is up to you. I for one would not be willing to bet my life savings on it...
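Since this is all just integrals, it is easy to check numerically. A minimal sketch (plain Python, midpoint rule, nothing assumed beyond the formulas above):

```python
# Numerically integrate over p in [0,1] with a simple midpoint rule.
N = 100_000
dp = 1.0 / N
ps = [(i + 0.5) * dp for i in range(N)]

# Normalization: the integral of p^2 is 1/3, so the posterior is 3*p^2.
norm = sum(p * p for p in ps) * dp

# Predicted probability of a third win: the integral of 3*p^3, which is 3/4.
third_win = sum(3 * p ** 3 for p in ps) * dp

print(round(1.0 / norm, 3))   # ~3.0
print(round(third_win, 3))    # ~0.75
```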

The 'likelihood of superiority' can be calculated from the posterior distribution; the likelihood that the losing player is in fact the better one encompasses all cases where 0 <= p < 0.5, so it is the integral of 3*p*p from 0 to 0.5. As the antiderivative of 3*p*p is p*p*p (p cubed), this is 1/2^3 = 1/8. So as Rein already pointed out, the LOS of the winning player is 7/8. We see that the likelihood that the winner loses the next game (25%) is larger than the likelihood that he is the weaker player (12.5%). This is of course to be expected; even if it were 100% certain that he is better, he could still lose a substantial fraction of the games. (As long as that fraction is less than 50%.)
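The antiderivative argument already settles it, but the same numerical check works here too (again just a sketch):

```python
# Probability mass of the posterior 3*p^2 below p = 0.5, i.e. the
# likelihood that the player who lost both games is really the stronger one.
N = 100_000
dp = 0.5 / N
mass = sum(3 * ((i + 0.5) * dp) ** 2 for i in range(N)) * dp

print(round(mass, 4))      # ~0.125  (= 1/8)
print(round(1 - mass, 4))  # ~0.875  (LOS of the winner, 7/8)
```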

There is nothing wrong with this math. If you don't believe it, just try a simulation: pick a random number P between 0 and 1, representing the probability that A beats B. Pick two random numbers x1 and x2 between 0 and 1, and consider them wins for A when xi < P. Now pick another random number x3, and consider the third game a win for A when x3 < P.

Repeat this a few million times, taking statistics as follows:

* keep separate histograms for the cases P>0.5 and P<0.5 (A superior vs B superior), say A[][] and B[][]

* for each case keep a 2-dimensional array of counters, using the score from the first two games (0-2, 1-1 or 2-0) as first index, and the score of the third game (0 or 1) as second index.

At the end, compare:

* the number of times B wins the 3rd game when A won the first two vs the number of times he lost it (A[2][0]+B[2][0] vs A[2][1]+B[2][1]).

* The number of times A was superior when he won the first two games (irrespective of third), vs the number of times B was superior (A[2][0] + A[2][1] vs B[2][0] + B[2][1]).

Finally, tell us what you observed.
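For what it's worth, the recipe above can be sketched in a few lines (Python chosen purely for illustration; the array layout follows the description, with A's score after two games as first index and A's result in game 3 as second):

```python
import random

random.seed(1)  # only to make the sketch reproducible

# A[s][g]: runs where A was superior (P > 0.5); B[s][g]: runs where B was.
# s = A's score after the first two games (0, 1 or 2), g = A's result in game 3.
A = [[0, 0], [0, 0], [0, 0]]
B = [[0, 0], [0, 0], [0, 0]]

for _ in range(500_000):
    P = random.random()                              # chance that A beats B
    s = sum(random.random() < P for _ in range(2))   # first two games
    g = int(random.random() < P)                     # third game
    (A if P > 0.5 else B)[s][g] += 1

# After a 2-0 start: how often does B take the 3rd game?
b_wins = A[2][0] + B[2][0]
b_loses = A[2][1] + B[2][1]
frac_b_wins_third = b_wins / (b_wins + b_loses)      # should land near 1/4

# After a 2-0 start: how often was A really the stronger player?
a_sup = A[2][0] + A[2][1]
b_sup = B[2][0] + B[2][1]
frac_a_superior = a_sup / (a_sup + b_sup)            # should land near 7/8

print(frac_b_wins_third, frac_a_superior)
```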

If you think the probability to draw matters, you can also simulate a flat prior by calling the PRNG 3 times, for the win, draw and loss probability, rejecting the attempt if the sum of those three is larger than one, and normalizing by dividing all three by their sum if it was smaller than one. The random number drawn for each game should then be counted as a win for A when it is smaller than Pwin, a loss for A when it is larger than 1-Ploss, and a draw otherwise. You can then discard any run of 1M+2 games that did not produce exactly 1M draws, but you would have to make billions of runs of a million games to find a significant number of those. So perhaps it would be more productive to first consider the case where there were 1000 draws in 1002 games, so that you already have some such runs after 1000 tries, and then do a few million tries to get statistically significant numbers. This way you can easily convince yourself that it does not matter at all for the likelihood that the 3rd win goes to B whether it took 100, 1000 or 10,000 draws to get those two wins.
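The rejection step could look like this (a sketch; the function names are my own, purely illustrative):

```python
import random

random.seed(1)  # only to make the sketch reproducible

def sample_rates():
    # Draw three uniforms for (Pwin, Pdraw, Ploss); reject if their sum
    # exceeds 1, otherwise divide all three by the sum so they add to 1.
    while True:
        w, d, l = random.random(), random.random(), random.random()
        s = w + d + l
        if s <= 1.0:
            return w / s, d / s, l / s

def play_game(pwin, ploss):
    # One game under the sampled rates: win for A, loss for A, or draw.
    x = random.random()
    if x < pwin:
        return 'A'
    if x > 1.0 - ploss:
        return 'B'
    return 'draw'

pwin, pdraw, ploss = sample_rates()
results = [play_game(pwin, ploss) for _ in range(1002)]
```

A run of 1002 games like this can then be kept or discarded depending on whether it produced the required number of draws, as described above.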