My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 9:12 am
by Rebel
My wish list for cutechess-cli to speed up testing.

1. The ability for an engine to terminate the game and save the game in the PGN with a "*" game score.

Why?

2 examples:

1a) Say you are testing king safety. Once the queens are off the board, the rest of the game in an equal position is of no interest; in fact it only adds unwanted randomness to a system in which we are trying to minimize the effect of randomness by playing many games. As such, I would like to terminate a game with a "*" score if (say) the score is within a -0.25/+0.25 window.

1b) Same story if I want to tune the bishop_pair: I want to terminate the game when there is no bishop_pair any longer.

I am currently writing some stuff for the knight_pair, the kind of minor stuff that gives little Elo and is hardly measurable without 50,000+ games. An option to terminate a game once no knight pair is left would surely help a lot to suppress the noise.
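A minimal sketch of the kind of engine-side check examples 1a and 1b describe, assuming the engine can see the current position as a FEN string and its own score in centipawns. The function names and the 25 cp default are hypothetical, and the bishop test only counts bishops (it does not check that they stand on opposite-coloured squares):

```python
def piece_counts(fen: str) -> dict:
    """Count each piece letter in the piece-placement field of a FEN string."""
    placement = fen.split()[0]
    counts = {}
    for ch in placement:
        if ch.isalpha():
            counts[ch] = counts.get(ch, 0) + 1
    return counts


def should_abort_king_safety_test(fen: str, score_cp: int, window_cp: int = 25) -> bool:
    """Example 1a: both queens are gone and the score is inside the window."""
    counts = piece_counts(fen)
    queens_off = counts.get("Q", 0) == 0 and counts.get("q", 0) == 0
    return queens_off and abs(score_cp) <= window_cp


def should_abort_bishop_pair_test(fen: str) -> bool:
    """Example 1b: neither side has two or more bishops left."""
    counts = piece_counts(fen)
    return counts.get("B", 0) < 2 and counts.get("b", 0) < 2
```

An engine would call such a check after each move and, when it returns true, ask the interface to end the game by whatever mechanism the interface accepts.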

----------

2. The ability for an engine to declare the game a draw and save the game in the PGN accordingly.

Basically to speed up senseless draws in pawnless endings such as KQKQ | RKRK | RKNKN etc. I am sure there are more tricks.

-----------

Introducing cheating? Oh well... I suggest the default setting is "off" (don't allow engine interference). When it is "on", show that on the screen (in UPPERCASE or so) after each game, together with a count of how many times an engine made use of it.

Re: My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 9:15 am
by ZirconiumX
1. is already done.
2. I think it already has this, but I cannot remember precisely.

Matthew:out

Re: My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 9:44 am
by Daniel White
For 1. you could just make an illegal move or something similar. Might cause you to miss bugs but if you're at the stage of playing mass games to tune things this likely isn't an issue anyway.

Re: My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 10:16 am
by hgm
This is basically a protocol matter. WinBoard protocol does allow the engine to declare game end, by sending a result (1-0, 0-1 or 1/2-1/2) together with a result-comment. Like:

1/2-1/2 {trivial draw}

So your last request should already be granted if cutechess-cli has a proper implementation of WB protocol. (And I cannot imagine it would not!) In WinBoard there is a setting whether such engine claims are unconditionally obeyed (the result comment ending up in the PGN), or whether they are verified, and false claims lead to forfeit ('Verify engine claims' in the adjudications menu).
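As an illustration of the claim format shown above, here is a rough sketch of how an interface might split such a line into result and comment. The regular expression and the function name are my own assumptions, not code from WinBoard or cutechess-cli:

```python
import re

# A result token, optionally followed by a {comment}, as in "1/2-1/2 {trivial draw}".
RESULT_RE = re.compile(r"^(1-0|0-1|1/2-1/2)\s*(?:\{(.*)\})?\s*$")


def parse_result_claim(line: str):
    """Return (result, comment) for a result claim, or None for any other line."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    comment = (match.group(2) or "").strip()
    return match.group(1), comment
```

Whether the claim is then obeyed or verified is a separate policy decision, as with the 'Verify engine claims' setting.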

Now unfortunately the protocol does not support any '*' result. I sorely missed this when writing the peer-to-peer pseudo-engine, and I am considering adding it to the protocol. But, as the previous poster mentioned, for your application there seem to be plenty of alternatives. Just let the engine send "1/2-1/2 {abort}". The aborted games then won't contribute noise, because they are all draws. And when you really want to know how many there were, to know how significant the rest of the test was, you just select the games with result comment "abort" from the PGN.
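Selecting the aborted games afterwards could look roughly like this. It is only a sketch: it assumes every game in the file starts with an [Event tag and that the result comment ends up as a {...} comment in the game text, and the function names are made up for the example:

```python
import re


def split_games(pgn_text: str) -> list:
    """Split PGN text into games, assuming each game starts with an [Event tag."""
    parts = re.split(r"\n(?=\[Event )", pgn_text)
    return [part.strip() for part in parts if part.strip()]


def partition_aborted(pgn_text: str, marker: str = "abort"):
    """Return (kept, aborted): games without / with the marker in a {comment}."""
    kept, aborted = [], []
    for game in split_games(pgn_text):
        comments = re.findall(r"\{([^}]*)\}", game)
        if any(marker in comment for comment in comments):
            aborted.append(game)
        else:
            kept.append(game)
    return kept, aborted
```

The length of the aborted list tells you how many games were cut short, and the kept list is what you would feed to your rating tool if you prefer to drop them instead of counting them as draws.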

Note that WinBoard already can do the trivial-draw adjudication for you (tick 'Trivial draws' in the adjudication dialog, or use -trivialDraws true on the command line). This would take care of KQKQ, KRKR etc. So there is no need for the engine to bother with that. Perhaps cutechess-cli has a similar option.

Re: My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 12:32 pm
by ilari
ZirconiumX wrote: 1. is already done.
2. I think it already has this, but I cannot remember precisely.

Matthew:out
Actually neither of those features is implemented. Cutechess-cli will only adjudicate a game with a "*" result if the entire match is terminated (e.g. by pressing CTRL+C). If an engine quits, crashes, makes an illegal move or makes an invalid win or draw claim, it will lose the game.

I could easily add an engine option that would make cutechess-cli always accept any result claim from the engine. I think that would satisfy both requests (1. and 2.). Of course that option would only work for Xboard/Winboard engines.
hgm wrote:This is basically a protocol matter. WinBoard protocol does allow the engine to declare game end, by sending a result (1-0, 0-1 or 1/2-1/2) together with a result-comment. Like:

1/2-1/2 {trivial draw}

So your last request should already be granted if cutechess-cli has a proper implementation of WB protocol. (And I cannot imagine it would not!)
I think it's a proper implementation, meaning that FIDE rules are obeyed when it comes to draw claims. But at the moment positions such as KQKQ or KRKR can't be claimed to be drawn. But like I said earlier, this can be changed easily.

Re: My wish list for cutechess-cli

Posted: Sat Jul 21, 2012 1:32 pm
by hgm
I think it would be useful to add that option. (Which is why I did it in WinBoard. :wink: )

For UCI engines you could apply the rule that giving a bestmove 0000 together with a 0 cp score will be taken as a draw claim. (I mean: what else could you possibly want the interface to do if it receives that input? You might as well use it for this purpose.)
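A sketch of how an interface could apply that rule, assuming it keeps the last "info" line the engine sent before its "bestmove". This is a non-standard convention, not part of UCI, and the function names are hypothetical:

```python
def parse_score_cp(info_line: str):
    """Return the centipawn score from a UCI info line, or None if there is none."""
    tokens = info_line.split()
    for i in range(len(tokens) - 2):
        if tokens[i] == "score" and tokens[i + 1] == "cp":
            try:
                return int(tokens[i + 2])
            except ValueError:
                return None
    return None


def is_draw_claim(last_info_line: str, bestmove_line: str) -> bool:
    """Treat 'bestmove 0000' preceded by a 0 cp score as a draw claim."""
    tokens = bestmove_line.split()
    if len(tokens) < 2 or tokens[0] != "bestmove" or tokens[1] != "0000":
        return False
    return parse_score_cp(last_info_line) == 0
```

A mate score ("score mate N") has no "cp" field, so it would never be read as a draw claim here.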

Re: My wish list for cutechess-cli

Posted: Sun Jul 22, 2012 6:21 am
by lucasart
I agree with you that this would be the right way to implement such a feature (but it would be a non-standard UCI extension).
But I don't think the feature makes sense: engine A may think it's a draw, but if engine B disagrees, it should be given a chance to prove it over the board and play on. Also this feature could be abused by malicious developers who want to cheat and declare lost games as draws...

Re: My wish list for cutechess-cli

Posted: Sun Jul 22, 2012 8:33 am
by hgm
Well, I think Ed clearly explains why the feature would be useful to him. Fairness is not a priority in every application. If you are interested in collecting win/loss statistics on games where a R vs 2N imbalance occurs, it makes no sense to continue playing after all Knights are traded. The opponents will not know that, because they only care about winning, not about statistics.

Re: My wish list for cutechess-cli

Posted: Sun Jul 22, 2012 10:06 am
by Rebel
Tx HGM, I will try to get cutechess-cli to work with xboard and investigate what's already possible.

Re: My wish list for cutechess-cli

Posted: Sun Jul 22, 2012 10:23 am
by hgm
Beware that Ilari said cutechess-cli currently cannot be set to a mode where it obeys the 1/2-1/2 command. But even when it declares a forfeit in response to it, you can discount all forfeited games from the data set.