Testing engines in tournaments

eligolf · Post by **eligolf** » Thu Feb 24, 2022 9:30 am

Hello everyone,

Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play; they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play against my engine.

mvanthoor · Post by **mvanthoor** » Thu Feb 24, 2022 10:01 am

eligolf wrote: ↑Thu Feb 24, 2022 9:30 am Hello everyone,

Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play; they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play against my engine.

Arena can only do one game at a time and the GUI is too slow to handle engines with lots of output.

CuteChess allows you to run a game per core, which makes testing much faster. It supports .bin (polyglot) opening books that can be found around the internet.

yeni_sekme · Post by **yeni_sekme** » Thu Feb 24, 2022 10:31 am

https://github.com/official-stockfish/b ... v3.pgn.zip
You can use this opening book for your testing.

eligolf · Post by **eligolf** » Thu Feb 24, 2022 11:11 am

Thanks guys!

lithander · Post by **lithander** » Thu Feb 24, 2022 11:47 am

When I started with chess programming last year, I was also confused that playing the opening is not something the engine does; instead, the GUI usually chooses the opening and then lets the engines continue from there. It makes sense in hindsight, though.

I can recommend the command-line version of CuteChess (cutechess-cli) to run tests. Because the terminal remembers your previous commands, it's super easy to repeat tests or rerun them with slightly changed parameters. Everything from self-play to running gauntlets is equally easy.
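As a rough illustration of what such a command can look like, here is a minimal Python sketch that assembles a cutechess-cli self-play invocation with an opening file. The engine paths, the names `new` and `old`, the file names, and the default values are placeholders for illustration, not settings anyone in this thread posted; check the option list of your installed cutechess-cli version before relying on it.

```python
from typing import List


def build_match_command(engine_a: str, engine_b: str, openings_pgn: str,
                        tc: str = "10+0.1", rounds: int = 500,
                        concurrency: int = 1,
                        pgn_out: str = "games.pgn") -> List[str]:
    """Assemble a cutechess-cli argument list for a two-engine match.

    All paths and defaults are illustrative assumptions.
    """
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_a}", "name=new",
        "-engine", f"cmd={engine_b}", "name=old",
        # Settings shared by both engines: protocol and time control.
        "-each", "proto=uci", f"tc={tc}",
        # Start every game from a position taken from the opening file.
        "-openings", f"file={openings_pgn}", "format=pgn", "order=random",
        # Two games per opening, with colors swapped for the second one.
        "-games", "2", "-repeat",
        "-rounds", str(rounds),
        "-concurrency", str(concurrency),
        # Save all games so the match can be inspected or re-rated later.
        "-pgnout", pgn_out,
    ]


command = build_match_command("./engine_new", "./engine_old", "book.pgn",
                              concurrency=3)
print(" ".join(command))
```

The printed line can be pasted into a terminal, or the list can be handed to `subprocess.run(command)`. Playing each opening twice with reversed colors keeps an unbalanced opening from favoring one engine.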

I have a long text file where I track the results of my tests. I usually store the CLI command, the name of the PGN file containing the games, and the match summary printed to the terminal after the tournament finishes.

The summary is enough to tell you who won the match and how close it was, but if you want to guesstimate the actual CCRL Elo (for example) I can recommend https://github.com/michiguel/Ordo
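For a quick sanity check on a two-engine match summary, the standard logistic Elo formula can be applied to the win/loss/draw counts directly. The sketch below is not what Ordo does (Ordo fits ratings for a whole pool of engines from a PGN file); it only estimates the relative difference between two engines, with an approximate 95% interval based on a normal approximation. The function names are made up for this example.

```python
import math
from typing import Tuple


def score_to_elo(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference."""
    eps = 1e-9  # clamp so that 0% or 100% scores do not divide by zero
    score = min(max(score, eps), 1.0 - eps)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_estimate(wins: int, losses: int, draws: int) -> Tuple[float, float, float]:
    """Return (elo, low, high) for the first engine, from its point of view."""
    games = wins + losses + draws
    if games <= 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = 1.96 * math.sqrt(variance / games)  # ~95% normal interval
    return (score_to_elo(score),
            score_to_elo(score - margin),
            score_to_elo(score + margin))


elo, low, high = elo_estimate(wins=150, losses=50, draws=0)
print(f"Elo difference: {elo:.1f} (95% interval {low:.1f} to {high:.1f})")
```

If the interval still contains zero, the match has not yet shown which version is stronger, and more games are needed.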

JVMerlino · Post by **JVMerlino** » Thu Feb 24, 2022 6:24 pm

eligolf wrote: ↑Thu Feb 24, 2022 9:30 am Hello everyone,

Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play; they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play against my engine.

There is no "established approach", other than that your test environment must remain consistent. I use CuteChess, and I let the engines use their own default settings as delivered in their download packages. In other words, I download an engine and give it a quick test. If it works and is in the Elo range that I'm looking for, it goes into my 12-engine gauntlet. I only swap engines out when they get too weak for my engine, but otherwise everything else stays the same.

Some have criticized CCRL for their approach (pre-defined openings, adjudicated results, etc.). But the point is that their testing environment is consistent, so their rating list is well-respected by the vast majority.

KhepriChess · Post by **KhepriChess** » Fri Feb 25, 2022 6:05 am

Definitely take a look at using Cutechess's CLI: https://github.com/cutechess/cutechess

Since their documentation on how to use the CLI is rather...thin, take a look at Leela's guide: https://lczero.org/dev/wiki/testing-guide/

Follow those directions carefully and it should get you up and running. You'll want to set the number after "concurrency" in the command to roughly your number of physical cores (or cores - 1, to leave one free for the system). I've seen some people say they run it closer to the number of available threads their CPU has (instead of cores). For time controls, you can have it play at very fast time controls to get through more games; some people run at time controls like 10+0.1 or even 1+0.01. You might want to play around and see what's best for your engine (some engines can't really play effectively at ultra-fast time controls).
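To make the concurrency and time-control trade-off concrete, here is a small Python sketch. Note that `os.cpu_count()` reports logical CPUs (hardware threads), not physical cores, so on a CPU with hyper-threading the value is closer to the "threads" suggestion above. The 60-moves-per-side figure is purely an assumption for a back-of-the-envelope upper bound, and the parser only understands the simple `base+increment` form, not moves-per-period time controls.

```python
import os
from typing import Tuple


def pick_concurrency(reserve: int = 1) -> int:
    """Logical CPU count minus a reserve for the OS, never below 1."""
    logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus - reserve)


def parse_tc(tc: str) -> Tuple[float, float]:
    """Split a 'base+increment' time control (in seconds) into two floats."""
    base, _, increment = tc.partition("+")
    return float(base), float(increment) if increment else 0.0


def worst_case_seconds(tc: str, games: int, concurrency: int,
                       moves_per_side: int = 60) -> float:
    """Upper bound on wall-clock time if both sides use all their time.

    moves_per_side is an assumed average game length, not a measured value.
    """
    base, increment = parse_tc(tc)
    per_game = 2 * (base + increment * moves_per_side)
    return games * per_game / concurrency


print("concurrency:", pick_concurrency())
print("hours at 10+0.1:", worst_case_seconds("10+0.1", 1000, 4) / 3600)
print("hours at 1+0.01:", worst_case_seconds("1+0.01", 1000, 4) / 3600)
```

Real matches usually finish well under this bound, because engines rarely use their whole clock and many games end early.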

You can browse engines on CCRL and try to find ones close to what you think your engine's rating is.

dangi12012 · Post by **dangi12012** » Fri Feb 25, 2022 5:32 pm

If it's really self-play you are after, then you can get away with no GUI at all.
If you don't reprint the stdout of the engines in the console, the overhead should be minimal.

Of course, no GUI still means that a piece of software needs to be written that parses the UCI output and sends the correct commands.
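The core of such a driver is small. The Python sketch below sends the standard UCI `position` and `go movetime` commands and waits for the `bestmove` reply, ignoring `info` lines. It is deliberately incomplete: the `uci`/`uciok` and `isready`/`readyok` handshake, clocks, and detection of mate, stalemate, and repetition (which needs a move generator) are left out, and the function names are invented for this example. A scripted `io.StringIO` stands in for the engine so the sketch runs without one.

```python
import io
from typing import Iterable, List, TextIO


def position_command(moves: List[str]) -> str:
    """Build the UCI 'position' command for a game from the start position."""
    if moves:
        return "position startpos moves " + " ".join(moves)
    return "position startpos"


def request_move(to_engine: TextIO, from_engine: Iterable[str],
                 moves: List[str], movetime_ms: int = 100) -> str:
    """Ask the engine for a move and return it in UCI notation (e.g. e2e4)."""
    to_engine.write(position_command(moves) + "\n")
    to_engine.write(f"go movetime {movetime_ms}\n")
    to_engine.flush()
    for line in from_engine:
        parts = line.split()
        # Skip 'info ...' search output; only the final answer matters here.
        if len(parts) >= 2 and parts[0] == "bestmove":
            return parts[1]
    raise RuntimeError("engine closed its output before sending bestmove")


# Scripted stand-in for an engine's stdout, so this runs without an engine.
fake_out = io.StringIO("info depth 1 score cp 20\nbestmove e2e4 ponder e7e5\n")
fake_in = io.StringIO()
first_move = request_move(fake_in, fake_out, [])
print(first_move)
```

With a real engine, the two streams would be the `stdin` and `stdout` of a `subprocess.Popen(..., stdin=PIPE, stdout=PIPE, text=True)` process, and a self-play loop would alternate between two such processes, appending each returned move to the move list.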

JVMerlino · Post by **JVMerlino** » Fri Feb 25, 2022 7:35 pm

dangi12012 wrote: ↑Fri Feb 25, 2022 5:32 pm If it's really self-play you are after, then you can get away with no GUI at all.
If you don't reprint the stdout of the engines in the console, the overhead should be minimal.

Of course, no GUI still means that a piece of software needs to be written that parses the UCI output and sends the correct commands.

YMMV, but I've never found self-play to be conclusive. You might have found a way to exploit the older version of your engine while actually losing Elo. Others feel differently, of course, but I stand by my 12-opponent gauntlet as getting results that end up being very close to those of other testers.