Looking for some help re: SPRT. I've been happily using cutechess-cli for automated testing, and it works great.
I'm now trying to use SPRT to terminate matches early, and save time. I'm using the following to test a new version of my engine against a very old version (several hundred ELO weaker). I'd have expected the SPRT bit to terminate the match quite early, but it seems to play on and on and not terminate. Here's my code:
A 5 Elo range is quite small to detect (with the 95% error margin) without thousands of games, unless there is a huge difference in the strength of the engines.
You mentioned expecting hundreds of Elo, but it might be less.
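To put a rough number on "thousands of games", here is a minimal sketch that estimates how many games it takes before the 95% error bar shrinks to a given Elo margin. It assumes the logistic Elo model, two roughly equal engines, and a draw ratio that is my own placeholder (0.5); the function names are hypothetical and this is not how cutechess-cli or Ordo compute anything.

```python
import math


def elo_to_score(elo: float) -> float:
    """Expected score for an Elo difference under the logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def games_for_margin(elo_margin: float, draw_ratio: float = 0.5,
                     z: float = 1.96) -> int:
    """Approximate games needed for a +/- elo_margin error bar.

    Assumes two roughly equal engines, so wins and losses each occur
    with probability (1 - draw_ratio) / 2. The draw ratio is an assumed
    input, not something taken from the thread.
    """
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    variance = (1.0 - draw_ratio) / 4.0
    # Score difference that corresponds to elo_margin near a 50% score.
    score_margin = elo_to_score(elo_margin) - 0.5
    # Solve z * sqrt(variance / n) <= score_margin for n.
    return math.ceil((z * math.sqrt(variance) / score_margin) ** 2)


if __name__ == "__main__":
    for margin in (5.0, 10.0, 20.0):
        print(f"+/- {margin:g} Elo: about {games_for_margin(margin)} games")
```

With these assumptions a 5 Elo margin needs several thousand games, and a lower draw ratio pushes the number higher because each game carries more variance.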
Also, the time controls are very long for what is typically used with SPRT.
Draw movenumber 100 seems like more than what many people use.
Depending on how many games are anticipated, the 2 move book seems rather shallow.
SCID can be used to scan for duplicate games.
You can try to reduce the number of draws with a slightly imbalanced book too.
Finally, using Syzygy tablebase adjudication will also speed things up somewhat.
Perhaps trying a "fast" sort of test run with much faster time controls would show a difference.
If they are traditional A/B engines, something like 0:10+0.1 would be where I would start.
For timed games, concurrency 10 seems fine with 12+ cores.
Others may have better suggestions. I don't actually use SPRT very much and prefer Ordo myself with cutechess-cli.
silentshark wrote: ↑Tue Jan 19, 2021 6:37 pm
Hi all,
Looking for some help re: SPRT. I've been happily using cutechess-cli for automated testing, and it works great.
I'm now trying to use SPRT to terminate matches early, and save time. I'm using the following to test a new version of my engine against a very old version (several hundred ELO weaker). I'd have expected the SPRT bit to terminate the match quite early, but it seems to play on and on and not terminate. Here's my code:
I'm probably missing something stupid, so please shout
Running cutechess 1.20 under win10, fyi.
Thanks in advance!
You should be aware that SPRT is only efficient if the Elo difference is comparable to the bounds. If this is not the case then SPRT is very inefficient (compared to a standard fixed length test).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Michel wrote: ↑Tue Jan 19, 2021 8:52 pm
You should be aware that SPRT is only efficient if the Elo difference is comparable to the bounds. If this is not the case then SPRT is very inefficient (compared to a standard fixed length test).
Interesting... why would that be? So the parameters I'm using would be more efficient if there is only a small difference in ELO?
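One way to see why, as a rough sketch only: under a normal approximation, each game adds a small expected amount to the log-likelihood ratio (LLR), and that amount is scaled by the width of the bounds. With bounds as narrow as 0 to 5 Elo, even a lopsided match moves the LLR slowly. The code below assumes the logistic Elo model, Wald's thresholds with alpha = beta = 0.05, and a draw ratio of 0.2 that I picked for illustration. All function names are hypothetical, overshoot is ignored, and cutechess-cli's actual SPRT implementation may differ from this.

```python
import math


def elo_to_score(elo: float) -> float:
    """Expected score for an Elo difference under the logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def score_variance(score: float, draw_ratio: float) -> float:
    """Per-game score variance for a given mean score and draw ratio."""
    win = score - draw_ratio / 2.0
    loss = 1.0 - score - draw_ratio / 2.0
    if win < 0.0 or loss < 0.0:
        raise ValueError("draw_ratio is not compatible with this score")
    # E[X^2] - E[X]^2 with win = 1, draw = 0.5, loss = 0.
    return win + draw_ratio / 4.0 - score * score


def sprt_games(true_elo: float, elo0: float = 0.0, elo1: float = 5.0,
               draw_ratio: float = 0.2, alpha: float = 0.05,
               beta: float = 0.05) -> float:
    """Rough expected games until the LLR crosses a Wald threshold.

    Normal approximation, no overshoot correction. Not meaningful when
    true_elo sits near the midpoint of the bounds (drift close to zero).
    """
    s, s0, s1 = (elo_to_score(e) for e in (true_elo, elo0, elo1))
    drift = (s1 - s0) * (2.0 * s - s0 - s1) / (2.0 * score_variance(s, draw_ratio))
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    return (upper if drift > 0.0 else lower) / drift


def fixed_length_games(true_elo: float, draw_ratio: float = 0.2,
                       z: float = 1.96) -> int:
    """Games until a fixed-length test's 95% error bar excludes a 50% score."""
    s = elo_to_score(true_elo)
    sd = math.sqrt(score_variance(s, draw_ratio))
    return math.ceil((z * sd / (s - 0.5)) ** 2)


if __name__ == "__main__":
    print("SPRT [0, 5], true +300 Elo:", round(sprt_games(300.0)), "games")
    print("Fixed length, true +300 Elo:", fixed_length_games(300.0), "games")
```

Under these assumptions, a +300 Elo engine still needs on the order of a hundred games to cross the upper threshold with bounds of 0 and 5, while a fixed-length test would separate the engines after a handful of games. Widening the bounds toward the expected difference makes each game count for more.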