Hello everyone,
Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play, they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play vs my engine.
Testing engines in tournaments
Moderator: Ras
-
- Posts: 114
- Joined: Sat Nov 14, 2020 12:49 pm
- Full name: Elias Nilsson
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Testing engines in tournaments
eligolf wrote: ↑Thu Feb 24, 2022 9:30 am
> Hello everyone,
> Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play, they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play vs my engine.
Arena can only do one game at a time, and the GUI is too slow to handle engines with lots of output.
CuteChess allows you to run one game per core, which makes testing much faster. It supports .bin (Polyglot) opening books, which can be found around the internet.
-
- Posts: 40
- Joined: Mon Mar 01, 2021 7:51 pm
- Location: İstanbul, Turkey
- Full name: Ömer Faruk Tutkun
Re: Testing engines in tournaments
https://github.com/official-stockfish/b ... v3.pgn.zip
You can use this opening book for your testing.
-
- Posts: 114
- Joined: Sat Nov 14, 2020 12:49 pm
- Full name: Elias Nilsson
Re: Testing engines in tournaments
Thanks guys!
-
- Posts: 915
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Testing engines in tournaments
When I started with chess programming last year, I was also confused that playing the opening is not something the engine does; instead, the GUI usually chooses the opening and then lets the engines continue from there. It makes sense in hindsight, though.
I can recommend the command-line version of cutechess (cutechess-cli) to run tests. Because the terminal remembers your previous commands, it's super easy to repeat tests or repeat them with slightly changed parameters. Everything from self-play to running gauntlets is equally easy.
I have a long text file where I track the results of my tests. I usually store the CLI command, the name of the PGN file containing the games, and the match summary printed to the terminal after the tournament has finished.
The summary is enough to tell you who won the match and how close it was, but if you want to guesstimate the actual CCRL Elo (for example), I can recommend https://github.com/michiguel/Ordo
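For a quick sanity check of a match summary before reaching for Ordo, the usual logistic Elo model can be applied to the win/loss/draw counts. The following is only a minimal sketch under that assumption: the function names are made up for illustration, the error bar is a rough normal approximation, and the result is a relative difference between the two players in that match, not a CCRL rating.

```python
import math
from typing import Tuple


def _score_to_elo(score: float) -> float:
    """Logistic Elo model: map an expected score (0..1) to a rating difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_difference(wins: int, losses: int, draws: int) -> Tuple[float, float]:
    """Return (elo_diff, margin) from the first engine's point of view.

    margin is an approximate 95% error bar in Elo, based on the
    per-game variance of the results (win = 1, draw = 0.5, loss = 0).
    """
    games = wins + losses + draws
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("a 0% or 100% score gives no finite Elo estimate")

    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)

    # Clamp the score interval so the logarithm stays defined.
    low = max(score - 1.96 * stderr, 1e-6)
    high = min(score + 1.96 * stderr, 1.0 - 1e-6)
    margin = (_score_to_elo(high) - _score_to_elo(low)) / 2.0
    return _score_to_elo(score), margin


if __name__ == "__main__":
    diff, margin = elo_difference(wins=120, losses=100, draws=180)
    print(f"Elo difference: {diff:+.1f} +/- {margin:.1f}")
```

If the margin is larger than the difference itself, the match was probably too short to say which version is stronger.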
-
- Posts: 1397
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: Testing engines in tournaments
eligolf wrote: ↑Thu Feb 24, 2022 9:30 am
> Hello everyone,
> Since I have started implementing more special features into my engine, I want to test the versions against each other, and in the future against other engines. However, none of them have an opening book, so it doesn't matter how many games I play, they will always play the same moves. Is there any established approach on how to test, on what openings to use, or any other parameters to set? Currently I am using Arena to play vs my engine.
There is no "established approach", other than that your test environment must remain consistent. I use CuteChess, and I let the engines use their own default settings as delivered in their download packages. In other words, I download an engine and give it a quick test. If it works and is in the Elo range that I'm looking for, it goes into my 12-engine gauntlet. I only swap engines out when they get too weak for my engine; everything else stays the same.
Some have criticized CCRL for their approach (pre-defined openings, adjudicated results, etc). But the point is that their testing environment is consistent, so their rating list is well-respected by the vast majority.
-
- Posts: 93
- Joined: Sun Aug 08, 2021 9:14 pm
- Full name: Kurt Peters
Re: Testing engines in tournaments
Definitely take a look at using Cutechess's CLI: https://github.com/cutechess/cutechess
Since their documentation on how to use the CLI is rather...thin, take a look at Leela's guide: https://lczero.org/dev/wiki/testing-guide/
Follow those directions carefully, and it should get you up and running. You'll want to set the number after "concurrency" in the command to something that's at most equal to your number of cores (or cores - 1). I've seen some people say they run it closer to the number of available threads their CPU has (instead of cores). For time controls, you can have it play at very fast time controls to get through more games; some people run at time controls like 10+0.1 or even 1+0.01. You might want to play around and see what's best for your engine (some engines can't really play effectively at ultra-fast time controls).
You can browse engines on CCRL and try to find ones close to what you think your engine's rating is.
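To make the concurrency and time-control advice concrete, here is a small sketch that only assembles a cutechess-cli command line; it does not run anything. The engine paths, engine names, file names and the helper function are hypothetical, and the flags are the ones I understand cutechess-cli to accept, so check them against `cutechess-cli -help` for your version. Note that `os.cpu_count()` reports logical CPUs (threads), not physical cores.

```python
import os
import shlex
from typing import List, Optional


def build_cutechess_command(
    engine_a: str,
    engine_b: str,
    openings_pgn: str,
    pgn_out: str,
    tc: str = "10+0.1",
    rounds: int = 500,
    concurrency: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for a two-engine match (hypothetical helper)."""
    if concurrency is None:
        # Leave one logical CPU free for the OS and cutechess itself.
        concurrency = max(1, (os.cpu_count() or 2) - 1)
    return [
        "cutechess-cli",
        "-engine", f"cmd={engine_a}", "name=dev",
        "-engine", f"cmd={engine_b}", "name=base",
        # Settings shared by both engines: protocol and time control.
        "-each", "proto=uci", f"tc={tc}",
        # Start games from book positions so they do not repeat.
        "-openings", f"file={openings_pgn}", "format=pgn", "order=random",
        # Play each opening twice with colors reversed.
        "-games", "2", "-repeat",
        "-rounds", str(rounds),
        "-concurrency", str(concurrency),
        "-pgnout", pgn_out,
    ]


if __name__ == "__main__":
    cmd = build_cutechess_command(
        "./myengine_new", "./myengine_old", "openings.pgn", "match.pgn"
    )
    # Print a copy-pasteable shell command.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command rather than launching it makes it easy to paste into a results log alongside the PGN file name and the match summary.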
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Testing engines in tournaments
If it's really self-play you are after, then you can get away with no GUI at all.
If you don't reprint the output of the engines in the console, the overhead should be very minimal.
Of course, no GUI still means that a piece of software needs to be written that parses the UCI output and sends the correct commands.
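As a rough illustration of what that piece of software has to do, the sketch below covers only the text side of UCI: building the `position` and `go` commands and picking the move out of a `bestmove` line. The helper names are made up for this example. A real match runner would also need to start the engine processes, do the `uci`/`isready` handshake, keep the clocks, and detect when the game is over, none of which is shown here.

```python
from typing import List, Optional


def position_command(moves: List[str]) -> str:
    """Build the UCI 'position' command from the moves played so far."""
    if not moves:
        return "position startpos"
    return "position startpos moves " + " ".join(moves)


def go_command(wtime_ms: int, btime_ms: int, winc_ms: int = 0, binc_ms: int = 0) -> str:
    """Build a UCI 'go' command with the remaining clock times in milliseconds."""
    return f"go wtime {wtime_ms} btime {btime_ms} winc {winc_ms} binc {binc_ms}"


def parse_bestmove(line: str) -> Optional[str]:
    """Return the move from a 'bestmove' line, or None for any other line.

    Engines with no legal move may answer 'bestmove (none)' or
    'bestmove 0000'; both are treated as "no move" here.
    """
    parts = line.split()
    if len(parts) >= 2 and parts[0] == "bestmove":
        if parts[1] in ("(none)", "0000"):
            return None
        return parts[1]
    return None


if __name__ == "__main__":
    played = ["e2e4", "e7e5"]
    print(position_command(played))
    print(go_command(10_000, 10_000, 100, 100))
    print(parse_bestmove("bestmove g1f3 ponder b8c6"))
```

Ignoring every engine line that is not `bestmove`, instead of echoing it, is what keeps the overhead low.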
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
-
- Posts: 1397
- Joined: Wed Mar 08, 2006 10:15 pm
- Location: San Francisco, California
Re: Testing engines in tournaments
dangi12012 wrote: ↑Fri Feb 25, 2022 5:32 pm
> If it's really self-play you are after, then you can get away with no GUI at all.
> If you don't reprint the output of the engines in the console, the overhead should be very minimal.
> Of course, no GUI still means that a piece of software needs to be written that parses the UCI output and sends the correct commands.
YMMV, but I've never found self-play to be conclusive. You might have found a way to exploit the older version of your engine while actually losing Elo. Others feel differently, of course, but I stand by my 12-opponent gauntlet as getting results that end up being very close to those of other testers.