Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Tony Thomas

Post by Tony Thomas »

Since my computer's overheating problem has become less frequent, I was able to play a few games with the above engines. As always, I didn't observe any crashes or time losses from Crafty. However, I decided to take Sos out of my future tests, as it loses on time almost every single game, mostly in won positions. Crafty 22.0 didn't perform as well as its predecessor.


Excuses from a crafty fan

1) The number of games is far too low to draw any kind of decent conclusion.
2) The newer version played against engines that are slightly stronger.
3) Maybe the black and white special cases helped at fast time controls.
4) My neighbor's wife gained weight, which made a lot of things slower, including Crafty.

Code:

 # Name                           Elo    +    - games score oppo. draws
49 Crafty 21.6 JA                2653   35   35   299   52%  2641   23%
57 Crafty 22.0 JA                2638   53   54   128   41%  2706   23%
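
As a rough check on excuse 1, the two Crafty rows can be compared by their error margins. The sketch below assumes the three numbers after each engine name are the rating and its plus/minus margins, and simply tests whether the two resulting ranges overlap. The function names are made up for illustration.

```python
def rating_range(elo, plus, minus):
    """Return the (low, high) rating range implied by the error margins."""
    return elo - minus, elo + plus

def ranges_overlap(a, b):
    """True if two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Values copied from the two Crafty rows; column meaning is an assumption.
crafty_216 = rating_range(2653, 35, 35)
crafty_220 = rating_range(2638, 53, 54)

print(crafty_216, crafty_220, ranges_overlap(crafty_216, crafty_220))
```

The two ranges overlap, which is consistent with the sample being too small to separate the two versions.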

Sloppy did perform better, but not as well as I expected. The author had said that it would perform much better at my time controls, but I didn't observe anything that spectacular. Still a very good engine overall.

Code:

 # Name                           Elo    +    - games score oppo. draws
89 Sloppy 0.2.0 JA               2563   54   54   120   50%  2559   21%
98 Sloppy-0.1.1 JA               2548   49   50   142   48%  2563   23%

I do not have any previous experience with Atlanchess. It is a great engine: no crashes, no losses on time, and best of all, it can be defeated by a decently strong human.

Code:

  # Name                           Elo    +    - games score oppo. draws
383 Atlanchess 4.1                1886  105  112    45   37%  2010    2%
swami
Posts: 6640
Joined: Thu Mar 09, 2006 4:21 am

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by swami »

I thought Sloppy was rated higher than Crafty?
Tony Thomas

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by Tony Thomas »

Nope, not under my conditions.
ilari
Posts: 750
Joined: Mon Mar 27, 2006 7:45 pm
Location: Finland

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by ilari »

Tony Thomas wrote:
Sloppy did perform better, but not as well as I expected. The author had said that it would perform much better at my time controls, but I didn't observe anything that spectacular. Still a very good engine overall.
Thanks for testing the new Sloppy. I have to say I'm a bit surprised as well. Sure, I only tested your time control with a couple hundred games (mostly against Bugchess), but the results looked pretty conclusive.

To be sure, I just ran another test (1 min + 1 sec increment) against Sloppy 0.1.1, and this happened: Match Sloppy-0.2.0 vs. Sloppy-0.1.1: final score 46-20-34. I do all my testing in 64-bit Linux using Xboard as the GUI.
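
For a rough idea of what that 46-20-34 result means in rating terms, here is a minimal sketch. It assumes the score reads as wins-losses-draws from Sloppy 0.2.0's side, uses the standard logistic Elo formula, and takes a normal approximation for the 95% range. The function names are illustrative only.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_elo(wins, losses, draws, z=1.96):
    """Return (elo, low, high) for a match result, using a normal approximation."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * (0.0 - score) ** 2
           + draws * (0.5 - score) ** 2) / games
    margin = z * math.sqrt(var / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))

# Assumed reading of the result: 46 wins, 20 losses, 34 draws.
elo, low, high = match_elo(46, 20, 34)
print(f"{elo:+.0f} Elo, 95% range {low:+.0f} to {high:+.0f}")
```

Under these assumptions the estimate comes out at roughly +90 Elo, but with only 100 games the 95% range is several dozen Elo wide on either side.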

Could you give some info about your testing, mainly operating system, 32-bit or 64-bit, GUI (Arena, Winboard, etc.), GUI settings (pondering, show thinking) and Sloppy's configuration (hash size, opening book, egbbs)? Then I might be able to reproduce your results better.
grant
Posts: 67
Joined: Mon Aug 06, 2007 4:42 pm
Location: London, England

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by grant »

Tony

Thanks for your kind comments on Atlanchess 4.1

Grant
Tony Thomas

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by Tony Thomas »

It is possible that you created a Bugchess destroyer. Sloppy did score 75% against Bugchess in my four games.

Conditions

Intel Celeron D, 2800 MHz
Memory: about 16-50 MB total per engine
Time control: 1 min + 1 sec increment

Here is Sloppy's config, though I doubt that you really want to test under my conditions. I am using the 32-bit version and I am not using any kind of endgame bases.

Code:

# Sloppy's config file

# Hash table size in megabytes
hash = 16

# Use 5-men bitbases (on/off)
egbb_5men = off

# Endgame bitbase load type (4men/5men/smart/none/off)
# 4men: load 3-men and 4-men bitbases to RAM
# 5men: load 3-men, 4-men and 5-men bitbases to RAM
# smart: load a smart selection of bitbases to RAM
# none: load nothing to RAM
# off: disable bitbases completely
egbb_load_type = 4men

# Endgame bitbase cache size in megabytes
egbb_cache = 

# Endgame bitbase path
egbb_path = bitbases

# Book mode (disk/mem/off)
bookmode = mem

# Book learning (on/off)
learn = off

# Write logfile(s) (on/off)
logfile = off

# The number of threads Sloppy may use (currently for perft only).
# Comment it out if you want Sloppy to autodetect the best value.
# threads = 1

# End of config file
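
To see at a glance which settings in a file like this are actually active, a small reader can skip the comment lines and collect the `key = value` pairs. This is only a sketch of the file format as it appears above, not Sloppy's own parser; `parse_config` and the shortened sample text are made up for illustration.

```python
def parse_config(text):
    """Collect key = value pairs, ignoring blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

# Shortened excerpt of the config above, kept as posted (including the empty value).
sample = """\
# Hash table size in megabytes
hash = 16
egbb_5men = off
egbb_load_type = 4men
egbb_cache =
bookmode = mem
# threads = 1
"""

cfg = parse_config(sample)
for key, value in cfg.items():
    print(f"{key:16s} {value!r}")
```

Values stay as strings here, so an empty setting such as `egbb_cache` shows up as an empty string rather than being guessed at.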
Tony Thomas

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by Tony Thomas »

grant wrote:
Tony

Thanks for your kind comments on Atlanchess 4.1

Grant
Sorry that I couldn't come up with anything better. I wasn't going to knock your engine just because it is weak; I still can't get a draw against it. Also, many others are starting to be fascinated by relatively weak engines due to the high draw percentage.
ilari
Posts: 750
Joined: Mon Mar 27, 2006 7:45 pm
Location: Finland

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by ilari »

Tony Thomas wrote:
Here is Sloppy's config, though I doubt that you really want to test under my conditions. I am using the 32-bit version and I am not using any kind of endgame bases.
Thanks. Actually, testing under your conditions may expose some problems. Btw, I don't generally test with egbbs either.

Can you also tell which GUI you use, and whether or not you have pondering and "Show Thinking" (that's what it's called in Winboard) on? It's interesting because Sloppy doesn't have a pondering mode, and I've heard that Arena punishes (by taking time away) engines that show their thinking frequently.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by Michael Sherwin »

I have found out the hard way that testing two versions of the same engine against each other can be very misleading. The newer 'better' version can even be worse overall. And just because the new version kills a particular engine worse than it has ever done before does not mean that it will play better against a weaker engine.

Lots of games against a lot of opponents is what is needed. If there is no time for that, then I suggest a gauntlet of 100 games against 50 engines. And run it twice. If the score differs much between the two gauntlets, then run it again.
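
One way to put a number on "differs much" is to compare the gap between the two gauntlet scores with the noise expected from the number of games. The sketch below is only an illustration: it uses the binomial formula for the standard error (an upper bound, since draws reduce the variance), a threshold of two standard errors, and made-up scores and game counts.

```python
import math

def score_se(score, games):
    """Upper-bound standard error of a score fraction over a number of games."""
    return math.sqrt(score * (1.0 - score) / games)

def runs_differ(score_a, games_a, score_b, games_b, z=2.0):
    """True if two gauntlet scores differ by more than z combined standard errors."""
    combined = math.sqrt(score_se(score_a, games_a) ** 2
                         + score_se(score_b, games_b) ** 2)
    return abs(score_a - score_b) > z * combined

# Hypothetical example: 50% in the first run, 54% in the second.
print(runs_differ(0.50, 100, 0.54, 100))    # small gauntlets
print(runs_differ(0.50, 5000, 0.54, 5000))  # large gauntlets
```

With these made-up numbers, a four-point gap is within the noise for two 100-game runs but not for two 5000-game runs.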
Tony Thomas

Re: Results of Crafty 22.0, Sloppy 0.2.0 and Atlanchess 4.1

Post by Tony Thomas »

ilari wrote:
Tony Thomas wrote:
Here is Sloppy's config, though I doubt that you really want to test under my conditions. I am using the 32-bit version and I am not using any kind of endgame bases.
Thanks. Actually, testing under your conditions may expose some problems. Btw, I don't generally test with egbbs either.

Can you also tell which GUI you use, and whether or not you have pondering and "Show Thinking" (that's what it's called in Winboard) on? It's interesting because Sloppy doesn't have a pondering mode, and I've heard that Arena punishes (by taking time away) engines that show their thinking frequently.
I don't have pondering on; with a single CPU, I can't do it. I exclusively use Arena 1.1, with no adapters and own books. I heard Uri Blass saying that Arena steals time from engines, but I am not sure whether those two or three seconds over the whole game are significant. Uri did gain a decent amount of strength by telling the engine not to print the PV in fast games, but I am not sure how effective that is.
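
For scale, the two or three seconds can be set against the total clock at 1 min + 1 sec. The sketch below assumes a hypothetical 60-move game and that the increment is credited once per move played; the function name is made up for illustration.

```python
def time_lost_fraction(seconds_lost, moves, base=60.0, increment=1.0):
    """Fraction of one side's total clock budget lost to GUI overhead."""
    budget = base + increment * moves  # 1 min base plus 1 sec per move played
    return seconds_lost / budget

# Hypothetical 60-move game; 2 and 3 seconds are the figures mentioned in the post.
for lost in (2.0, 3.0):
    share = time_lost_fraction(lost, moves=60)
    print(f"{lost:.0f} s lost: {share:.1%} of the budget")
```

Under those assumptions the loss is on the order of 2% of the clock. Whether a loss of that size matters for playing strength is not something this arithmetic can answer.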