Severe bug found in the LittleBlitzerGUI

pohl4711
Posts: 2444
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Severe bug found in the LittleBlitzerGUI

Post by pohl4711 »

Thomas Zipproth spotted a severe bug in the LittleBlitzerGUI: the 50-move draw rule detection doesn't work properly! Some games that should be drawn by the 50-move rule can instead be won or lost. After investigating the 55000 games of my LS top10 tournament (a big thanks to Thomas Zipproth!), we found that the bug distorts the engines' Elo ratings by somewhere between 0 and 2 Elo in either direction... From now on, all testing for the LS rating list will be done with cutechess-cli.

It is strongly recommended not to use the LittleBlitzerGUI for testing anymore!!!

Stefan
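
For readers who want to spot-check positions for this kind of draw themselves, here is a minimal sketch of a 50-move check based on the halfmove clock stored in a FEN string (the fifth field, which counts halfmoves since the last capture or pawn move). The function names are hypothetical, and this is not how LittleBlitzer or cutechess-cli implement the rule; it only illustrates the condition a GUI is expected to test.

```python
def halfmove_clock(fen: str) -> int:
    """Return the halfmove clock (5th FEN field): halfmoves since the
    last capture or pawn move."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError("expected a full FEN string with 6 fields")
    return int(fields[4])


def is_fifty_move_draw(fen: str) -> bool:
    """True once 50 full moves (100 halfmoves) have passed without a
    capture or pawn move.

    Assumption: the caller has already checked that the side to move is
    not checkmated, because a mate delivered by the 100th halfmove takes
    precedence over the draw.
    """
    return halfmove_clock(fen) >= 100
```

A game whose final position has a halfmove clock of 100 or more, is not checkmate, and is still recorded as a win or loss would be a candidate for the kind of mis-adjudication described above.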

Re: Severe bug found in the LittleBlitzerGUI

Post by pohl4711 »

SzG wrote:
pohl4711 wrote: After investigating the 55000 games of my LS top10 tournament (a big thanks to Thomas Zipproth!), we found that the bug distorts the engines' Elo ratings by somewhere between 0 and 2 Elo in either direction... From now on, all testing for the LS rating list will be done with cutechess-cli.

It is strongly recommended not to use the LittleBlitzerGUI for testing anymore!!!

Stefan
That 2 Elo distortion does not seem unbearable to me. If we wanted our ranking lists to be as accurate as that, we would have to play at least 100000 games for each engine, a task we surely could not undertake. Our best-tested engine, SlowChess Blitz, has 11000 games and is still only at a +/- 7 Elo error margin.
For me, two engines with a 5 Elo difference, maybe even more, are equal.
Correct. And I will use cutechess-cli in the future. But in engine tests with fewer games, the distortion can be bigger... So I wanted to post this warning here for other testers.

Stefan
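
The relation between game count and error margin mentioned in the quote can be approximated with a short calculation. This is a rough sketch under stated assumptions: opponents of roughly equal strength (expected score near 50%), independent games, a normal approximation, and a 95% interval. The function name and parameters are hypothetical and do not come from any rating tool.

```python
import math


def elo_error_margin(n_games: int, draw_ratio: float = 0.0, z: float = 1.96) -> float:
    """Approximate +/- Elo margin for a score near 50% after n_games.

    draw_ratio is the fraction of drawn games; draws reduce the
    per-game variance and therefore the margin.
    """
    # Per-game score variance at a 50% expected score:
    # wins and losses each have probability (1 - draw_ratio) / 2.
    variance = 0.25 * (1.0 - draw_ratio)
    std_error_score = math.sqrt(variance / n_games)
    # Slope of the logistic Elo curve at a 50% score:
    # dElo/dscore = 400 / (ln(10) * s * (1 - s)) with s = 0.5.
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return z * elo_per_score * std_error_score
```

With no draws, `elo_error_margin(11000)` comes out at roughly 6.5 Elo and `elo_error_margin(100000)` at roughly 2.2 Elo, which is in the same ballpark as the figures quoted above. A realistic draw ratio makes both margins somewhat smaller.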
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Severe bug found in the LittleBlitzerGUI

Post by hgm »

Note that when the Elo ratings calculated with and without this bug differ by 2 Elo, it doesn't imply at all that the ratings from the buggy games are 2 Elo further from the truth. Nearly as often they will be 2 Elo better, because the correct calculation has a statistical error that is much larger, and that error may happen to have the opposite sign.

In fact, if I ran a test gauntlet, randomly selected as many as 10% of the games, and replaced their results by a randomly chosen win or loss, this would only marginally drive up the error. If you did very asymmetric testing (on average much better or much worse opponents) there would be a small systematic effect too, but such testing would be unreliable anyway.
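
The thought experiment above can be tried out with a small Monte Carlo simulation. This is only a sketch with made-up parameters (win and draw probabilities, game counts, and the function name are all assumptions): it plays many simulated gauntlets, overwrites a fraction of the results with a coin-flip win or loss, and compares the RMS error of the measured score against the true expected score with and without that corruption.

```python
import math
import random


def simulate_rms_error(n_games, n_trials, p_win, p_draw, corrupt_frac, seed=1):
    """Return (rms_clean, rms_corrupt): RMS error of the average score
    versus the true expected score, without and with corrupted results."""
    rng = random.Random(seed)
    expected = p_win + 0.5 * p_draw
    sq_clean = 0.0
    sq_corrupt = 0.0
    for _ in range(n_trials):
        clean_total = 0.0
        corrupt_total = 0.0
        for _ in range(n_games):
            # Draw the true game result.
            r = rng.random()
            if r < p_win:
                result = 1.0
            elif r < p_win + p_draw:
                result = 0.5
            else:
                result = 0.0
            clean_total += result
            # With probability corrupt_frac, overwrite the result
            # with a randomly chosen win or loss.
            if rng.random() < corrupt_frac:
                result = 1.0 if rng.random() < 0.5 else 0.0
            corrupt_total += result
        sq_clean += (clean_total / n_games - expected) ** 2
        sq_corrupt += (corrupt_total / n_games - expected) ** 2
    return math.sqrt(sq_clean / n_trials), math.sqrt(sq_corrupt / n_trials)


if __name__ == "__main__":
    clean, corrupt = simulate_rms_error(500, 400, 0.3, 0.4, 0.10)
    print(f"RMS score error, clean: {clean:.4f}, corrupted: {corrupt:.4f}")
```

For a symmetric setup like this one (expected score 50%), the corruption adds no bias and only slightly raises the per-game variance, so the two RMS values should end up close together. Moving `p_win` well away from the loss probability introduces the small systematic shift toward 50% mentioned for asymmetric testing.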
phenri
Posts: 284
Joined: Tue Aug 13, 2013 9:44 am

Re: Severe bug found in the LittleBlitzerGUI

Post by phenri »

pohl4711 wrote: Thomas Zipproth spotted a severe bug in the LittleBlitzerGUI: the 50-move draw rule detection doesn't work properly! Some games that should be drawn by the 50-move rule can instead be won or lost. After investigating the 55000 games of my LS top10 tournament (a big thanks to Thomas Zipproth!), we found that the bug distorts the engines' Elo ratings by somewhere between 0 and 2 Elo in either direction... From now on, all testing for the LS rating list will be done with cutechess-cli.

It is strongly recommended not to use the LittleBlitzerGUI for testing anymore!!!

Stefan
Are you sure? Dariusz Orzechowski implicitly reports that it works:
Dariusz Orzechowski wrote: Virtually all these "missing" repetition draws were transferred to fifty moves category when contempt was enabled. https://groups.google.com/d/msg/fishcoo ... RqRmM6pasJ