Do You really need 1000s of games for testing?

hgm
Full name: H G Muller

Re: Do You really need 1000s of games for testing?

Post by hgm

abulmo wrote: If you mix 1 liter of water with 1 liter of alcohol, you get 1.98 l of alcoholic solution, not 2 l. Real life may be more complicated than simple mathematics.
Indeed, but we are not talking about real life; we are talking about the statistics of Chess games. And statistics is mathematics.

Your example, btw, just proves my point: the fact that you get 1.98 l in no way proves that 1+1=1.98. What it does prove is that mixing liquids is not a good technology for building an adding machine. Similarly here. Of course it is possible that these rating testers get ratings that are completely independent of the number of games they play. But that would only show that they do not know how to test engines.
Or maybe the mathematics behind Elo needs to be more sophisticated, for example when dealing with draw games, correlation between players, etc.
This has nothing to do with the mathematics behind Elo. It has to do with statistical fluctuations in the scores. If the scores are subject to statistical errors, any method of calculating Elo from them will produce Elo values with errors. Garbage in, garbage out...
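
To put rough numbers on those fluctuations, here is a minimal Python sketch. It assumes independent games, the usual logistic Elo formula, and a fixed draw ratio; the function names and parameters are made up for illustration and are not taken from any rating tool.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))


def score_std_error(n_games, score=0.5, draw_ratio=0.0):
    """Standard error of the score fraction after n_games independent games.

    Assumption: every game has the same win/draw/loss probabilities.
    """
    win = score - draw_ratio / 2.0
    if win < 0.0 or win + draw_ratio > 1.0:
        raise ValueError("score and draw_ratio are inconsistent")
    # Per-game variance of a result worth 1 (win), 0.5 (draw) or 0 (loss).
    variance = win + draw_ratio / 4.0 - score ** 2
    return math.sqrt(variance / n_games)


def elo_margin_95(n_games, score=0.5, draw_ratio=0.0):
    """Approximate 95% half-width of the Elo estimate (normal approximation)."""
    half = 1.96 * score_std_error(n_games, score, draw_ratio)
    low, high = score - half, score + half
    if low <= 0.0 or high >= 1.0:
        raise ValueError("too few games for the normal approximation")
    return (elo_from_score(high) - elo_from_score(low)) / 2.0


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, "games: +/-", round(elo_margin_95(n), 1), "Elo")
```

The error in the score shrinks only as one over the square root of the number of games, so under these assumptions cutting the Elo margin by a factor of 10 takes 100 times as many games. Draws reduce the per-game variance, but they do not change that scaling.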