hgm wrote: Oops! You got me there!
What I intended to write was 'positions', not 'games'. That a small number of games gives results that are typically far from the truth is of course a no-brainer.
Actually, it isn't. If you play the same position 10 times, and notice that the outcome varies from win to loss to draw and that the games are not duplicated, then it is pretty reasonable to conclude that repeating the same position multiple times will be as good as playing multiple positions just once each. CT fell into this. _Many_ others have been using the Nunn, Noomen, and two versions of the Silver positions in this same way. Once the issue is exposed, it seems pretty obvious. But not initially. Or at least not to most of us doing this.
The point I intended to summarize in (1) was that even with an infinite number of games, the results would be far from the truth if these games were only played from a small number of positions.
I agree with that after Karl's explanation, and after then playing multiple runs with about the same number of games but far more positions. With this number of positions, the statistical results from BayesElo seem to be perfectly reasonable each and every run, which is a change from the previous approach.
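For anyone who wants to see point (1) in numbers, here is a rough sketch. It is a toy model, not anything Karl or BayesElo actually does: I assume each starting position has its own expected score for the engine, drawn from a normal distribution around the "true" strength (the names true_mean and position_sd are made up for illustration). With infinitely many games per position, the match score converges to the average over the positions used, so the remaining error depends only on how many positions there are.

```python
import random
import statistics


def asymptotic_score(num_positions, rng, true_mean=0.5, position_sd=0.1):
    """Score the match converges to if every position is replayed forever.

    Hypothetical model: each position has its own expected score, scattered
    around true_mean with standard deviation position_sd.
    """
    expected = [
        min(1.0, max(0.0, rng.gauss(true_mean, position_sd)))
        for _ in range(num_positions)
    ]
    # With unlimited games per position, game-to-game noise averages out and
    # only the choice of positions is left.
    return statistics.fmean(expected)


def rms_error(num_positions, trials=200, seed=1, true_mean=0.5):
    """RMS distance from the truth over many random choices of position set."""
    rng = random.Random(seed)
    errors = [
        asymptotic_score(num_positions, rng, true_mean) - true_mean
        for _ in range(trials)
    ]
    return (sum(e * e for e in errors) / trials) ** 0.5


if __name__ == "__main__":
    for n in (40, 400, 4000):
        print(n, "positions: RMS error", round(rms_error(n), 4))
```

Under these assumptions the error shrinks roughly as one over the square root of the number of positions, no matter how many games are played from each.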
The point you deny is (3), btw, not (1).
The point we have been discussing the past month seems a moving target. At the very beginning of this discussion I already brought up the fact that the results were far from the truth, due to the small number of games and opponents. But at the time I accepted your dismissal of that, when you said we were not discussing the difference of your results with the truth, but from each other.
OK. So far so good. That was _the_ issue. I never cared, and stated so, whether the ratings were accurate or not. Just that they were repeatable so that I could test and draw conclusions...
But then Karl showed up, and he was only interested in the difference with the truth, and did not want to offer anything on the original problem. So then playing more positions suddenly became the hype of the day...
Not IMHO. He was interested, specifically, in first explaining the "six sigma event" that happened on back-to-back runs, and then suggesting a solution that would prevent this from happening in the future.
But, elaborating on (3), the fact remains that:
3a) Playing from more positions could only help to get the results closer to the truth, but does nothing for their variability.
Sorry, but Karl did _not_ say that. He was specifically addressing the "position played N times correlation" that was wrecking the statistical assumptions. And he suggested a way to eliminate that. I can again quote from his original email to clearly show that he first addressed the variable results that were outside normal statistical bounds, and then offered a solution. The "truth" was a different issue.
3b) I said this
3c) Karl said this
3d) I said it again
3e) You keep denying it.
Nope, I just keep quoting what Karl said, namely that playing the same position multiple times using the same opponents violates the independent trial requirement, and that while more games appeared to reduce the standard deviation, they did not. I understood that. Do you not?
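To make the independence point concrete, here is a small simulation sketch. It is only a toy model under an assumption of mine: within one run, all replays of a given position share a single random win probability (shared_sd controls how much it wanders), which is one simple way to get correlated games. Draws are ignored to keep it short. It compares the run-to-run standard deviation actually observed with the "naive" standard error you would compute by treating every game as an independent trial.

```python
import random
import statistics


def run_match(num_positions, replays, rng, shared_sd=0.15):
    """Simulate one run; return (score, naive standard error).

    Toy assumption: within a run, every replay of a position shares one
    random win probability, so games from that position are correlated.
    """
    results = []
    for _ in range(num_positions):
        p = min(1.0, max(0.0, rng.gauss(0.5, shared_sd)))
        results.extend(1.0 if rng.random() < p else 0.0 for _ in range(replays))
    n = len(results)
    score = sum(results) / n
    # Binomial standard error, valid only if all n games are independent.
    naive_se = (score * (1.0 - score) / n) ** 0.5
    return score, naive_se


def compare(num_positions, replays, runs=200, seed=7):
    """Return (observed run-to-run SD, average naive SE) over many runs."""
    rng = random.Random(seed)
    outcomes = [run_match(num_positions, replays, rng) for _ in range(runs)]
    observed_sd = statistics.pstdev([score for score, _ in outcomes])
    mean_naive_se = statistics.fmean([se for _, se in outcomes])
    return observed_sd, mean_naive_se


if __name__ == "__main__":
    # Same total of 4000 games per run, split two different ways.
    print("40 positions x 100 replays:", compare(40, 100))
    print("4000 positions x 1 game:   ", compare(4000, 1))
```

In this model the few-positions setup shows run-to-run scatter several times larger than the naive error bar suggests, while one game per position brings the two back into agreement. How strong the effect is in real testing depends on how correlated the replays really are, which this sketch does not try to measure.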