Easy engine to use for testing

stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Easy engine to use for testing

Post by stevemulligan »

I've been working on my standard chess engine (C#) for what feels like a very long time and I'm ready to start testing against other engines. What is a good choice of opponents for a beginner like me? I'm at the point where I want to try to make my eval a bit smarter.

I started with Warrior but I'm wondering if there are easier engines for me to play against? Against Warrior I get about 10% wins, 40% draws, 50% losses. Maybe that W/L ratio is OK for testing eval changes? I'm not sure...

Also any tips on how to make my eval smarter would be much appreciated.
emadsen
Posts: 434
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Easy engine to use for testing

Post by emadsen »

You could test against my C# engine, RumbleMinze. Rated about 1880 in 2 min + 1 sec.
My C# chess engine: https://www.madchess.net
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Easy engine to use for testing

Post by Adam Hair »

Here is a list of engines: http://adamsccpages.blogspot.com/p/also ... t.html?m=1
The engines with more than 300 games have relatively stable ratings. TSCP is a common engine to measure against at this level. MSCP is also a good engine. Perhaps Crafty, set at an appropriate skill level, would be a good choice.

Most authors, if they are not conducting self-testing, will use a pool of 8 to 10 reliable opponents. I can go over my notes and see which of these engines appear to be most stable.

Adam
Ajedrecista
Posts: 1971
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Easy engine to use for testing.

Post by Ajedrecista »

Hello Steve:
Congrats on developing a chess engine! I would be unable to write even a move generator.

I remember that Lucas Braesch (the author of the DoubleCheck/DiscoCheck chess engine) said some months ago that Faile 1.4 was very stable... but I have just seen on Adam's excellent website that Faile is even stronger than Warrior, so it is probably not the best engine for you to play against. I also agree with Adam that using a bunch of engines may be a little better than using only one engine.

I have calculated some error bars (rounded to 0.01 Elo) at 2-sigma confidence (~95.45%) for 10% wins, 40% draws and 50% losses:

Code:

n = number of games:

n =  100 ---> ]-206.33,  -95.19[
n =  200 ---> ]-187.99, -109.91[
n =  300 ---> ]-180.17, -116.55[
n =  500 ---> ]-172.49, -123.29[
n = 1000 ---> ]-164.91, -130.16[
n = 2000 ---> ]-159.64, -135.09[
n = 3000 ---> ]-157.33, -137.28[
n = 5000 ---> ]-155.02, -139.50[

With a score of 30%, the Elo difference is more or less -147.19 Elo.
I used my own clumsy programme, which I think uses an algorithm similar to EloSTAT's.
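
For anyone who wants to redo these numbers with their own results, here is a minimal Python sketch. It is a guess at the method, not the programme mentioned above: it assumes the usual logistic Elo formula and a normal approximation for the mean score, with the interval endpoints converted to Elo afterwards. The function names are made up for this example. Under those assumptions it gives the same values as the n = 100 row and the -147.19 figure.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(win: float, draw: float, loss: float, games: int,
                 sigmas: float = 2.0):
    """Return (low, mid, high) Elo for the given result fractions.

    Assumption: the mean score is treated as normally distributed, and the
    score interval is mapped to Elo at its two endpoints.
    """
    mu = win + 0.5 * draw
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = win * (1.0 - mu) ** 2 + draw * (0.5 - mu) ** 2 + loss * (0.0 - mu) ** 2
    half_width = sigmas * math.sqrt(var / games)
    return (elo_from_score(mu - half_width),
            elo_from_score(mu),
            elo_from_score(mu + half_width))


for n in (100, 200, 300, 500, 1000, 2000, 3000, 5000):
    low, mid, high = elo_interval(0.10, 0.40, 0.50, n)
    print(f"n = {n:4d} ---> ]{low:.2f}, {high:.2f}[")
```

The interval is not symmetric around the central value because the score-to-Elo mapping is non-linear, and it breaks down if an endpoint reaches 0% or 100%.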

Maybe this W/L ratio is a bit unbalanced for testing (small) eval changes... but I am a newbie, so please take my words with a lot of care. Of course, I cannot help you with tips regarding evals. I wish you good luck with the development of your engine.

Regards from Spain.

Ajedrecista.
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Easy engine to use for testing

Post by jdart »

Yes, the key thing is stability if you are going to run long matches.

I am not familiar with Warrior. Is this CM 10 Warrior?

Some older and weaker (by 2012 standards) but stable engines include Crafty 20.14 and Arasan 11.7.

Stockfish has variable strength and you might find a setting that makes it a suitable opponent.

I have also used: N2, Daydreamer, Gaviota, Spike, Fruit, Umko, Toga (although these are pretty strong, especially the last 2).

--Jon
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Easy engine to use for testing

Post by Adam Hair »

Warrior was a free engine that was released in 2006 by Aleksandrs Saveljevs of Latvia.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Easy engine to use for testing

Post by bob »

Ideally, you want 3 classes of engines. Those significantly stronger, those about the same level, and those that are weaker. If you are not careful, it is easy to close the gap on the stronger group a bit, but damage performance against weaker engines... You can look at overall results, but with the above "groups" you can also look at them individually to understand what is actually going on...
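
A small Python sketch of the per-group bookkeeping described above. The opponent names, the group assignments and all the win/draw/loss counts are hypothetical placeholders, not real results; the point is only to show the overall score next to the per-group scores.

```python
from collections import defaultdict

# Hypothetical data: opponent -> (group, wins, draws, losses).
# These numbers are invented purely for illustration.
RESULTS = {
    "OpponentA": ("stronger", 10, 40, 50),
    "OpponentB": ("stronger", 20, 40, 40),
    "OpponentC": ("similar", 35, 30, 35),
    "OpponentD": ("weaker", 60, 30, 10),
}


def score_pct(wins: int, draws: int, losses: int) -> float:
    """Score percentage, counting a draw as half a point."""
    return 100.0 * (wins + 0.5 * draws) / (wins + draws + losses)


def scores_by_group(results):
    """Sum W/D/L per group and return {group: score percentage}."""
    totals = defaultdict(lambda: [0, 0, 0])
    for group, wins, draws, losses in results.values():
        totals[group][0] += wins
        totals[group][1] += draws
        totals[group][2] += losses
    return {group: score_pct(*wdl) for group, wdl in totals.items()}


def overall_score(results) -> float:
    """Score percentage over all opponents pooled together."""
    wins = sum(r[1] for r in results.values())
    draws = sum(r[2] for r in results.values())
    losses = sum(r[3] for r in results.values())
    return score_pct(wins, draws, losses)


print(f"overall: {overall_score(RESULTS):.1f}%")
for group, pct in scores_by_group(RESULTS).items():
    print(f"{group}: {pct:.1f}%")
```

Running this before and after an eval change would show whether a gain against the stronger group came at the cost of the weaker group, even when the overall number barely moves.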
stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Re: Easy engine to use for testing

Post by stevemulligan »

Adam, once my engine is ready, how can I apply to get on the Also-Rans rating list?

I'm running 1000 games against Erik's RumbleMinze since it's a C# engine like mine. I'm not even sure if my engine will play 1000 games without crashing.

Code:

cutechess-cli.exe -engine conf=Pwned1 -engine conf=RumbleMinze -each tc=40/120+1 -rounds 1000 -debug -pgnin in.pgn -pgndepth 4 -repeat -pgnout rumble2.pgn
It looks like it's going to take over 24 hours to finish (only 1 free machine to test on). After that I'll do as Bob suggests and get a mix of different engines to test against. Many thanks for the suggestions :)
stevemulligan
Posts: 117
Joined: Wed Jul 20, 2011 2:54 pm
Location: Ottawa, Canada

Re: Easy engine to use for testing

Post by stevemulligan »

If I have a bunch of old laptops, can I run a few games on those machines and then merge all the PGN files together to save time? Or do the specs on the machines need to be identical for that?
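
On the mechanical side of the question, merging is just concatenation, since a PGN file is a sequence of games. Below is a minimal Python sketch; the function name and the file names in the usage note are hypothetical, and it assumes every game starts with an `[Event ...]` tag, as in the standard PGN tag roster. It does not address whether results from machines with different specs are comparable.

```python
from pathlib import Path


def merge_pgn(inputs, output) -> int:
    """Concatenate PGN files into one file and return the number of games.

    Assumption: each game begins with an [Event "..."] tag line, which is
    what is counted here. Empty input files are skipped.
    """
    games = 0
    with open(output, "w", encoding="utf-8") as out:
        for path in inputs:
            text = Path(path).read_text(encoding="utf-8", errors="replace").strip()
            if not text:
                continue
            games += sum(1 for line in text.splitlines()
                         if line.startswith("[Event "))
            # Keep one blank line between the games of consecutive files.
            out.write(text + "\n\n")
    return games
```

Usage would look like `merge_pgn(["laptop1.pgn", "laptop2.pgn"], "all.pgn")`. The output file should not also be one of the inputs, because it is truncated when opened.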
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Easy engine to use for testing

Post by Adam Hair »

stevemulligan wrote: Adam, once my engine is ready, how can I apply to get on the Also-Rans rating list?

All you have to do is make it publicly available. I tend to switch back and forth between testing for the Also-Ran/CCRL lists and other sorts of testing/experimentation (which is why I have not yet tested RumbleMinze), so I may or may not jump on it immediately when it is available. But I do make an effort to test anybody's engine when they ask. The reason for my unstated (until now) requirement that an engine be publicly available is that it is a CCRL requirement, and the intended goal for any engine I test is to place it on the CCRL 40/4 lists.