A 20000-game STC test will give a better idea of LTC performance than a 20-game LTC test. With only 20 games, the confidence interval (CI) is massive, and any result short of an extreme one is meaningless. Electoral surveys consist of hundreds to thousands of participants, not tens. The high variance of the sampling distribution for small N is inescapable.
amchess wrote: ↑Sun Sep 05, 2021 3:37 pm
No, since it is an LTC test.
If I were to run 1000 games at long times (at least half an hour each), maybe on completely random opening positions, it would be the equivalent of at least 41 days!
With so many patches to test, the development of a chess engine would be impossible.
Since the purpose of Shashchess is to be the best at long time controls, the criterion used, as in other branches of software engineering (testing strategy), is to test on particularly significant samples. The same concept is used in statistics when we talk about "projections", for example in electoral surveys.
In our case, the chess concept of a characteristic fits this purpose perfectly: 20 games are played in one day, based on the 10 most common characteristics in chess (to simplify, center types). Since positions that share the same feature also share game plans, we can cover the range of possible situations without wasting too much time. We have explained this in the wiki.
Anyone with an iota of understanding of statistics will tell you the same thing.
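To put rough numbers on the CI point, here is a minimal sketch that turns a win/draw/loss record into an approximate 95% interval for the Elo difference. It assumes the usual logistic Elo formula and a normal approximation to the mean score (which is itself shaky at N = 20), and the W/D/L records are hypothetical, chosen only to keep the score at 55% in both cases. The function names are mine, not from any testing framework.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Approximate interval for the Elo difference from a W/D/L record.

    Assumption: normal approximation to the mean score; z=1.96 gives ~95%.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score, with outcomes 1, 0.5 and 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half = z * math.sqrt(var / n)
    # Clamp so the logistic formula stays defined for extreme records.
    lo = max(mean - half, 1e-9)
    hi = min(mean + half, 1.0 - 1e-9)
    return elo_from_score(lo), elo_from_score(hi)

if __name__ == "__main__":
    # Hypothetical records with the same 55% score at two sample sizes.
    for record in [(6, 10, 4), (6000, 10000, 4000)]:
        lo, hi = elo_interval(*record)
        print(f"{sum(record):>6} games: Elo in [{lo:+.1f}, {hi:+.1f}]")
```

With the same 55% score, the 20-game interval comes out a couple of hundred Elo wide and straddles zero, while the 20000-game interval is only a few Elo wide. A typical patch is worth a handful of Elo at best, so the small test cannot tell a gain from a regression.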