Understanding Bayeselo results

Post by Eric Stock »

Hi, I ran a set of 1000 games between two engines, then loaded the .pgn file into Bayeselo and executed the command mm, followed by ratings.

Here is what I got:


ResultSet-EloRating>ratings
Rank Name   Elo    +    - games score oppo. draws
   1 A       14   17   17  1000   54%   -14   43%
   2 B      -14   17   17  1000   46%    14   43%

Ok, so a couple of questions:
1. Are the + and - columns the 95% confidence interval for the rating, i.e. 95% of the time the true rating lies within [-17, +17] of the listed Elo?

2. Why might the draw % be so high? What does this mean for my results? I am testing with Arena, using the opening book "little main book".

Thanks,
Eric Stock

Re: Understanding Bayeselo results

Post by Edmund »

Hello Eric,

Firstly, 1) is correct. The + and - values give the bounds of the confidence interval at a confidence level of 95% (alpha = 0.05).
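To get a feeling for where an interval of this size comes from, here is a rough sketch in Python. It is not Bayeselo's own method: it uses the plain logistic Elo formula and a normal approximation on the match score. The win/draw/loss counts (325/430/245) are reconstructed from the rounded 54% score and 43% draw rate, so they are an assumption, and the function names are made up for this example.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_diff_interval(wins, draws, losses, z=1.96):
    """Normal-approximation interval for the Elo difference (z=1.96 is about 95%)."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))

# Counts reconstructed from the rounded 54% score and 43% draws (an assumption).
diff, low, high = elo_diff_interval(wins=325, draws=430, losses=245)
print(f"Elo difference {diff:.1f}, approx. 95% interval [{low:.1f}, {high:.1f}]")
```

With these assumed counts the difference comes out at roughly 28 Elo, which agrees with 14 - (-14) in the table, and the interval is about 16 Elo wide on each side. Note that this interval is for the difference between the two engines, while Bayeselo lists an interval for each rating separately, so the numbers need not match exactly.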

Secondly, 2) the draw rate depends on a couple of factors:
a) The general level of play. The longer the time control or the higher the average Elo of the engines, the higher the draw rate will be.
b) The Elo difference between the players. The smaller the difference, the higher the chance of a draw.
c) The similarity of the engines' playing styles. The more similar they are, the higher the chance of a draw, so matches between two versions of almost the same engine (testing a change, for example) can also produce higher draw rates.
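Point b) can be illustrated with a small sketch. It uses one common way of adding draws to the logistic Elo curve, where a draw parameter shifts the win and loss curves apart. Whether this matches what Bayeselo does internally is an assumption on my part, and the value draw_elo=150 is purely hypothetical, chosen only to give a draw rate in a plausible range.

```python
def expected(delta):
    """Logistic Elo curve: expected score for a rating advantage of delta."""
    return 1.0 / (1.0 + 10.0 ** (-delta / 400.0))

def win_draw_loss(delta, draw_elo):
    """Win/draw/loss probabilities for the side that is delta Elo stronger.

    draw_elo is a hypothetical parameter: larger values mean more draws.
    """
    p_win = expected(delta - draw_elo)
    p_loss = expected(-delta - draw_elo)
    return p_win, 1.0 - p_win - p_loss, p_loss

# Print how the draw share changes as the Elo difference grows.
for delta in (0, 28, 100, 200, 400):
    p_win, p_draw, p_loss = win_draw_loss(delta, draw_elo=150.0)
    print(f"diff {delta:3d}: win {p_win:.1%}  draw {p_draw:.1%}  loss {p_loss:.1%}")
```

In this model the draw share is largest at an Elo difference of zero and shrinks as the gap grows, which is the effect described in b).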

There are some other factors as well, such as choosing a drawish opening book. Furthermore, I am not quite sure about the impact of endgame tablebases, but I could imagine a decreased draw rate there as well.

You might want to take a look at Kirill's draw statistics: http://kirill-kryukov.com/chess/kcec/draw_rate.html There you will find that 43% is not so unusual for pairs this close in strength.