My wish list for cutechess-cli

Rebel
Posts: 4658
Joined: Thu Aug 18, 2011 10:04 am

My wish list for cutechess-cli

Post by Rebel » Sat Jul 21, 2012 7:12 am

My wish list for cutechess-cli to speed up testing.

1. The ability for an engine to terminate the game and save the game in the PGN with a "*" game score.

Why?

2 examples:

1a) Say you are testing king safety. Once the queens are off the board, the rest of a game in an equal position is of no interest; in fact, it only adds unwanted randomness to a system in which we are trying to minimize randomness by playing many games. So I would like to terminate the game with a "*" score if (say) the score is within a -0.25/+0.25 window.

1b) The same goes for tuning the bishop_pair: I want to terminate the game when there is no bishop pair any longer.

I am currently writing some stuff for the knight_pair, the kind of minor stuff that gives little Elo and is hardly measurable without 50,000+ games. An option to terminate a game once no knight pair remains would surely help a lot to suppress the noise.
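A rough sketch of the engine-side test described in 1a) and 1b), in Python. The function names, the 25 centipawn window and the idea of reading material straight from a FEN string are assumptions for illustration only; a real engine would use its own board representation. "Pair" here simply means two or more pieces of that type and ignores square colour.

```python
def placement(fen: str) -> str:
    """Return the piece-placement field (first field) of a FEN string."""
    return fen.split()[0]


def queens_on_board(fen: str) -> int:
    """Count the queens of both sides."""
    field = placement(fen)
    return field.count("Q") + field.count("q")


def has_pair(fen: str, piece: str) -> bool:
    """True if at least two pieces with this FEN letter are on the board.

    Use 'B'/'N' for White and 'b'/'n' for Black.
    """
    return placement(fen).count(piece) >= 2


def should_abort(fen: str, score_cp: int, window_cp: int = 25) -> bool:
    """King-safety test: stop once the queens are gone and the score is
    inside the +/- window (hypothetical default of 0.25 pawns)."""
    return queens_on_board(fen) == 0 and abs(score_cp) <= window_cp
```

When `should_abort` returns true, the engine would ask the interface to end the game, provided the interface offers a way to do so.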

----------

2. The ability for an engine to declare the game a draw and save the game in the PGN accordingly.

Basically to speed up the senseless draws in pawnless endings such as KQKQ | RKRK | RKNKN etc. I am sure there are more tricks.
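A minimal sketch of how such a material-based draw test could look, in Python. The table `TRIVIAL_DRAWS` and both function names are made up for this example, and only the two unambiguous cases (KQ vs KQ, KR vs KR) are listed. The check looks at material only, so a position where one side can immediately win the opposing piece would still be flagged.

```python
PIECE_ORDER = "KQRBNP"

# Hypothetical table of (white material, black material) pairs to treat as drawn.
TRIVIAL_DRAWS = {("KQ", "KQ"), ("KR", "KR")}


def material_signature(fen: str) -> tuple[str, str]:
    """Return (white, black) material strings such as ('KQ', 'KR')."""
    field = fen.split()[0]
    white = sorted((c for c in field if c.isupper()), key=PIECE_ORDER.index)
    black = sorted((c.upper() for c in field if c.islower()), key=PIECE_ORDER.index)
    return "".join(white), "".join(black)


def is_trivial_draw(fen: str) -> bool:
    """True if the material balance is listed in TRIVIAL_DRAWS."""
    return material_signature(fen) in TRIVIAL_DRAWS
```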

-----------

Introducing cheating? Oh well... I suggest the default setting is "off" (don't allow engine interference), and when it is "on", show that on the screen (in UPPERCASE or so) after each game, together with a statistic of how many times an engine made use of it.

ZirconiumX
Posts: 1327
Joined: Sun Jul 17, 2011 9:14 am

Re: My wish list for cutechess-cli

Post by ZirconiumX » Sat Jul 21, 2012 7:15 am

Rebel wrote: My wish list for cutechess-cli to speed up testing. [...]
1. is already done.
2. I think it already has this, but I cannot remember precisely.

Matthew:out
Some believe in the almighty dollar.

I believe in the almighty printf statement.

Daniel White
Posts: 33
Joined: Wed Mar 07, 2012 3:15 pm
Location: England

Re: My wish list for cutechess-cli

Post by Daniel White » Sat Jul 21, 2012 7:44 am

Rebel wrote: My wish list for cutechess-cli to speed up testing. [...]
For 1. you could just make an illegal move or something similar. It might cause you to miss bugs, but if you're at the stage of playing mass games to tune things, this likely isn't an issue anyway.

hgm
Posts: 23604
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: My wish list for cutechess-cli

Post by hgm » Sat Jul 21, 2012 8:16 am

This is basically a protocol matter. WinBoard protocol does allow the engine to declare game end, by sending a result (1-0, 0-1 or 1/2-1/2) together with a result-comment. Like:

1/2-1/2 {trivial draw}

So your last request should already be granted if cutechess-cli has a proper implementation of WB protocol. (And I cannot imagine it would not!) In WinBoard there is a setting whether such engine claims are unconditionally obeyed (the result comment ending up in the PGN), or whether they are verified, and false claims lead to forfeit ('Verify engine claims' in the adjudications menu).
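For illustration, a small Python sketch of building and recognising such a result line. It only covers the "result {comment}" shape shown above; the function names and the regular expression are assumptions for this example and are not taken from WinBoard or cutechess-cli.

```python
import re

# One of the three results, optionally followed by a {comment}.
RESULT_RE = re.compile(r"^(1-0|0-1|1/2-1/2)\s*(?:\{(.*)\})?\s*$")


def format_result(result: str, comment: str) -> str:
    """Build the line an engine would send, e.g. '1/2-1/2 {trivial draw}'."""
    return f"{result} {{{comment}}}"


def parse_result(line: str):
    """Return (result, comment) if the line is a result claim, else None."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2) or ""
```

An interface that verifies claims would still have to check the board position before accepting the parsed result.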

Now unfortunately the protocol does not support any '*' result. I sorely missed this when writing the peer-to-peer pseudo-engine, and am considering adding it to the protocol. But, like the previous poster mentioned, for your application there seem to be plenty of alternatives. Just let the engine send "1/2-1/2 {abort}". The aborted games then won't contribute noise, because they are all draws. And when you really want to know how many there were, to know how significant the rest of the test was, you just select the games with result comment "abort" from the PGN.

Note that WinBoard already can do the trivial-draw adjudication for you (tick 'Trivial draws' in the adjudication dialog, or use -trivialDraws true on the command line). This would take care of KQKQ, KRKR etc. So there is no need for the engine to bother with that. Perhaps cutechess-cli has a similar option.

ilari
Posts: 750
Joined: Mon Mar 27, 2006 5:45 pm
Location: Finland

Re: My wish list for cutechess-cli

Post by ilari » Sat Jul 21, 2012 10:32 am

ZirconiumX wrote: 1. is already done.
2. I think it already has this, but I cannot remember precisely.

Matthew:out
Actually neither of those features is implemented. Cutechess-cli will only adjudicate a game with a "*" result if the entire match is terminated (e.g. by pressing CTRL+C). If an engine quits, crashes, makes an illegal move or makes an invalid win or draw claim, it will lose the game.

I could easily add an engine option that would make cutechess-cli always accept any result claim from the engine. I think that would satisfy both requests (1. and 2.). Of course that option would only work for Xboard/Winboard engines.
hgm wrote:This is basically a protocol matter. WinBoard protocol does allow the engine to declare game end, by sending a result (1-0, 0-1 or 1/2-1/2) together with a result-comment. Like:

1/2-1/2 {trivial draw}

So your last request should already be granted if cutechess-cli has a proper implementation of WB protocol. (And I cannot imagine it would not!)
I think it's a proper implementation, meaning that FIDE rules are obeyed when it comes to draw claims. But at the moment positions such as KQKQ or KRKR can't be claimed to be drawn. But like I said earlier, this can be changed easily.

hgm
Posts: 23604
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: My wish list for cutechess-cli

Post by hgm » Sat Jul 21, 2012 11:32 am

I think it would be useful to add that option. (Which is why I did it in WinBoard. :wink: )

For UCI engines you could apply the rule that giving a bestmove 0000 together with a 0 cp score will be taken as a draw claim. (I mean: what else could you possibly want the interface to do if it receives that input? You might as well use it for this purpose.)
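On the interface side, that rule might be sketched as follows (Python). This is not part of the UCI standard and not something cutechess-cli is known to do; `is_draw_claim` is a hypothetical helper that scans the engine output for one search and treats "bestmove 0000" after a last reported score of 0 cp as a draw claim.

```python
from typing import Iterable, Optional


def is_draw_claim(engine_lines: Iterable[str]) -> bool:
    """True if the search ends with 'bestmove 0000' and the last 'info'
    line that carried a score reported exactly 'score cp 0'."""
    last_score_cp: Optional[int] = None
    for line in engine_lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "info" and "score" in tokens:
            i = tokens.index("score")
            if i + 2 < len(tokens) and tokens[i + 1] == "cp":
                try:
                    last_score_cp = int(tokens[i + 2])
                except ValueError:
                    last_score_cp = None
            else:
                # Mate scores or malformed lines never count as 0 cp.
                last_score_cp = None
        elif tokens[0] == "bestmove":
            return len(tokens) > 1 and tokens[1] == "0000" and last_score_cp == 0
    return False
```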

lucasart
Posts: 3037
Joined: Mon May 31, 2010 11:29 am
Full name: lucasart

Re: My wish list for cutechess-cli

Post by lucasart » Sun Jul 22, 2012 4:21 am

hgm wrote:I think it would be useful to add that option. (Which is why I did it in WinBoard. :wink: )

For UCI engines you could apply the rule that giving a bestmove 0000 together with a 0 cp score will be taken as a draw claim. (I mean: what else could you possibly want the interface to do if it receives that input? You might as well use it for this purpose.)
I agree with you that this would be the right way to implement such a feature (but it would be a non-standard UCI extension).
But I don't think the feature makes sense: engine A may think it's a draw, but if engine B disagrees, it should be given a chance to prove it over the board and play on. Also, this feature could be abused by malicious developers who want to cheat and declare lost games as draws...

hgm
Posts: 23604
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: My wish list for cutechess-cli

Post by hgm » Sun Jul 22, 2012 6:33 am

Well, I think Ed clearly explains why the feature would be useful to him. Fairness is not a priority in every application. If you are interested in collecting win/loss statistics on games where an R vs 2N imbalance occurs, it makes no sense to continue playing after all Knights are traded. The opponents will not know that, because they only care about winning, not about statistics.

Rebel
Posts: 4658
Joined: Thu Aug 18, 2011 10:04 am

Re: My wish list for cutechess-cli

Post by Rebel » Sun Jul 22, 2012 8:06 am

Tx HGM, I will try to get cutechess-cli to work with xboard and investigate what's already possible.

hgm
Posts: 23604
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: My wish list for cutechess-cli

Post by hgm » Sun Jul 22, 2012 8:23 am

Beware that Ilari said cutechess-cli currently cannot be set to a mode where it obeys the 1/2-1/2 command. But even when it declares a forfeit in response to it, you can discount all forfeited games from the data set.
