Do You really need 1000s of games for testing?

Post by **hgm** » Mon Nov 08, 2010 1:29 pm

abulmo wrote: If you mix 1 liter of water with 1 liter of alcohol, you get 1.98 l. of alcoholic solution, not 2 l. Real life may be more complicated than simple mathematics.

Indeed, but we are talking not about real life but about the statistics of Chess games. And statistics is mathematics.

Your example, btw, just proves my point: the fact that you get 1.98 l in no way proves that 1+1=1.98. What it does prove is that mixing liquids is not a good technology for building an adding machine. Similarly here. Of course it is possible that these rating testers get ratings whose accuracy is completely independent of the number of games they play. But that would only show that they do not know how to test engines.

abulmo wrote: Or maybe the mathematics behind Elo needs to be more sophisticated, for example when dealing with drawn games, correlation between players, etc.

This has nothing to do with the mathematics behind Elo. It has to do with statistical fluctuations in the scores. If the scores are subject to statistical errors, any method of calculating Elo from them will produce Elo ratings with errors. Garbage in, garbage out...
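To put rough numbers on how a fluctuation in the score carries over into the Elo figure, here is a minimal Python sketch. It rests on assumptions that are not stated above: games are treated as independent, the score is converted to Elo with the usual logistic formula, and the error bar is a plain normal approximation at about 95% (z = 1.96). The function names are made up for illustration and do not come from any rating tool.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction, logistic model (assumed)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_mean_and_error(wins, draws, losses):
    """Mean score per game and its standard error, assuming independent games."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

def elo_interval(wins, draws, losses, z=1.96):
    """Elo estimate with a rough interval obtained by mapping the score bounds."""
    mean, err = score_mean_and_error(wins, draws, losses)
    # Clamp so the logarithm stays defined for very lopsided results.
    low = min(max(mean - z * err, 1e-6), 1.0 - 1e-6)
    high = min(max(mean + z * err, 1e-6), 1.0 - 1e-6)
    centre = min(max(mean, 1e-6), 1.0 - 1e-6)
    return elo_from_score(centre), elo_from_score(low), elo_from_score(high)

if __name__ == "__main__":
    # Same win/draw/loss proportions, 100 times as many games in the second run.
    for w, d, l in [(40, 30, 30), (4000, 3000, 3000)]:
        elo, lo, hi = elo_interval(w, d, l)
        print(f"{w + d + l:6d} games: Elo {elo:+.1f}  [{lo:+.1f}, {hi:+.1f}]")
```

Under these assumptions the error in the score shrinks as 1/sqrt(N), so 100 times as many games narrows the interval by roughly a factor of 10. Whatever formula then turns the score into Elo only passes that error along.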
