Re: What can be said about 1 - 0 score?

Posted: Fri Aug 25, 2017 7:18 am
by Laskos
Laskos wrote: I decided this week to do a more thorough study: I played a dozen or so matches (without adjudication) between engines of wildly differing strength, then analyzed their game lengths. One useful engine is the very first Zurichess, Zurichess Aargau, which is very stable and very weak (1730 CCRL 40/4). One analyzed match, plotted as a normalized histogram, looks like this:

The most likely game lengths (the modes of the histograms) in these 3 matches:

Stockfish - Zurichess: 25 moves
Zurichess - Zurichess: 68 moves
Stockfish - Stockfish: 75 moves

Here too the strength difference is the main factor determining the mode of the game length. Based on the dozen matches played like this one, I derived a rule-of-thumb 1-standard-deviation band of Elo difference versus game length:

We see that the band is very broad, and 2 standard deviations cover almost the entire range. So one would say that little can be inferred about the strength difference from the length of a single game.

But as broad and inconclusive as this result seems, it is information we have before playing that single game. Assuming normal distributions with the center value given by the black line and the standard deviation given by the red region, I built symmetric priors, dependent on game length, to be used in calculating the Likelihood of Superiority (LOS) for a 1-0 result. With a uniform prior, LOS = 75%.
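
The 75% figure can be checked with a minimal sketch. It assumes (this is not stated in the post) that draws are ignored and that "uniform prior" means a flat prior on the win probability p of the winning engine, so that after one win the posterior is proportional to p. The function name and the grid size are illustrative only.

```python
def los_uniform_prior(wins, losses, n=100_000):
    """P(p > 0.5 | data) for a flat prior on the win probability p.

    Assumption: draws are ignored, so the likelihood is p^wins * (1-p)^losses.
    The integrals are done with a simple midpoint rule on [0, 1].
    """
    step = 1.0 / n
    above = total = 0.0
    for i in range(n):
        p = (i + 0.5) * step                      # midpoint of the i-th cell
        weight = p ** wins * (1.0 - p) ** losses  # unnormalized posterior
        total += weight
        if p > 0.5:
            above += weight
    return above / total


print(los_uniform_prior(1, 0))  # approximately 0.75 for a single 1-0 result
```

Analytically, the same number comes from integrating p over [0.5, 1] and dividing by the integral over [0, 1]: 0.375 / 0.5 = 0.75.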

Here are the priors derived from the previous plot, for game lengths of 45, 55, 65 and 75 moves:

With these priors, I computed the LOS for a 1-0 score as a function of the length of that single game:

We see that for game lengths above ~55 moves, the priors and the LOS are very close to the regular uniform prior and its LOS for a 1-0 result, so not much information is gained. But below that length, going down towards 28-30 moves, the LOS increases pretty dramatically, to 0.9999. So, if this single game is shorter than 50-55 moves, we do gain some information, and for very short games one can be almost sure that the winning engine is better (this can be used as a stopping rule too).
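
To illustrate how a length-dependent symmetric prior pushes the LOS above 75%, here is a simplified sketch. It is not the computation used above (which involves double integrals). It assumes a draw-free logistic Elo model and a prior on the Elo difference d that is an equal mix of two normals centered at +mu and -mu. The values passed for mu and sigma are hypothetical, not read off the plot.

```python
import math


def win_prob(elo_diff):
    """Expected score under the usual logistic Elo model (draws ignored)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def los_symmetric_prior(mu, sigma, lo=-4000.0, hi=4000.0, n=40_000):
    """P(d > 0 | one win) for the prior 0.5*N(+mu, sigma) + 0.5*N(-mu, sigma).

    mu and sigma are in Elo; in the post they would depend on game length.
    Midpoint-rule integration over [lo, hi]; n is even, so no node sits at 0.
    """
    step = (hi - lo) / n
    above = total = 0.0
    for i in range(n):
        d = lo + (i + 0.5) * step
        prior = 0.5 * normal_pdf(d, mu, sigma) + 0.5 * normal_pdf(d, -mu, sigma)
        weight = prior * win_prob(d)   # unnormalized posterior after a 1-0
        total += weight
        if d > 0.0:
            above += weight
    return above / total


# Hypothetical (mu, sigma) pairs: a short game suggests a large |Elo difference|.
for mu, sigma in [(100.0, 200.0), (600.0, 200.0)]:
    print(mu, sigma, round(los_symmetric_prior(mu, sigma), 4))
```

In this toy model the prior is symmetric, so it never favors either engine before the game. The single win then selects the positive lobe more strongly the further the two lobes sit from zero, which is why short games give a higher LOS.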
It is also useful to see the t-value for a 1 - 0 result (directly related to LOS) as a function of game length (because the priors depend on game length). One can see that at lengths of 30 and 55 moves the slope decreases pretty abruptly. The computation involves some double integrals, and the precision is not sufficient to go below a length of 28 moves for that single game.
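
For reference, a small sketch of the link between t-value and LOS. It assumes the common convention that LOS is the standard normal CDF evaluated at t, which is not spelled out above. The helper name is illustrative.

```python
from statistics import NormalDist


def t_from_los(los):
    """Invert LOS = Phi(t), with Phi the standard normal CDF (assumed convention)."""
    return NormalDist().inv_cdf(los)


# The uniform-prior value and the short-game value quoted above.
for los in (0.75, 0.9999):
    print(los, round(t_from_los(los), 3))
```

Under that assumption, LOS = 75% corresponds to t of about 0.67, and LOS = 0.9999 to t of about 3.72.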