engine-engine testing issues


JBNielsen
Posts: 250
Joined: Thu Jul 07, 2011 8:31 pm
Location: Denmark

engine-engine testing issues

Post by JBNielsen » Sun Jan 20, 2013 6:18 pm

I run engine-engine tests with Arena 1.1 and its mainbook.

It is very stable, but these things could be better:

1) Many opening lines are much too long, up to about 30 moves. 8-10 moves would be fine.
2) Some lines give White or Black a big or even winning advantage. Roughly even positions should be played.
3) Arena takes 30 seconds from the end of one game until it starts computing a move in the next game.

As I only have limited computer capacity, I rarely play much more than 100 games (yes, I know that is far too few).
But to get a better comparison of the 2 engines, and comparisons with earlier matches, I would like:
4) The same openings (in the same order) to be played in every match.
5) Every opening to be played twice, with reversed colours.
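
Points 4 and 5 amount to a fixed, paired schedule. A minimal sketch of that idea follows; the function name and the engine names in the usage line are hypothetical, and no particular GUI is assumed to work this way internally.

```python
def build_schedule(openings, engine_a, engine_b):
    """Return a list of (opening, white, black) games.

    The openings are used in the order given, so every match built from
    the same list replays the same lines in the same order, and each
    opening is played twice with the colours reversed.
    """
    schedule = []
    for opening in openings:
        schedule.append((opening, engine_a, engine_b))  # first game of the pair
        schedule.append((opening, engine_b, engine_a))  # same line, colours swapped
    return schedule


if __name__ == "__main__":
    # Hypothetical opening lines and engine names, for illustration only.
    lines = ["e2e4 e7e5 g1f3", "d2d4 d7d5 c2c4"]
    for opening, white, black in build_schedule(lines, "EngineA", "EngineB"):
        print(f"{white} (White) vs {black} (Black): {opening}")
```

With 50 openings this gives the 100 games mentioned above, and a later match against a new engine version can reuse exactly the same list.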

Any suggestions for better testing?
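
On the remark that 100 games is too few, a rough calculation shows the scale of the problem. This is a sketch under stated assumptions: it uses the standard logistic Elo formula and a normal approximation for the 95% interval, and the win/draw/loss numbers in the usage line are made up for illustration.

```python
import math


def score_to_elo(score):
    """Convert a score fraction to an Elo difference (logistic model)."""
    # Clamp so that a 0% or 100% score does not cause a math error.
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    z=1.96 gives an approximate 95% interval under a normal approximation.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_error = math.sqrt(variance / games)
    low = score_to_elo(score - z * std_error)
    high = score_to_elo(score + z * std_error)
    return score_to_elo(score), low, high


if __name__ == "__main__":
    # Hypothetical 100-game match: 40 wins, 30 draws, 30 losses.
    elo, low, high = elo_with_margin(40, 30, 30)
    print(f"Elo difference {elo:+.0f}, 95% interval {low:+.0f} to {high:+.0f}")
```

For a result like this the interval spans more than 100 Elo and includes zero, so a 55% score over 100 games does not yet show which engine is stronger.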

jdart
Posts: 3824
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: engine-engine testing issues

Post by jdart » Sun Jan 20, 2013 6:46 pm

I don't know why you are on Arena 1.1 when 3.0 is the current version (last I checked). Arena 3 can use a PGN file for a book and play positions sequentially from that file, alternating colors. You could also of course use a different Arena book file instead of the mainbook.

--Jon

zullil
Posts: 5668
Joined: Mon Jan 08, 2007 11:31 pm
Location: PA USA
Full name: Louis Zulli

Re: engine-engine testing issues

Post by zullil » Sun Jan 20, 2013 6:47 pm

JBNielsen wrote:
Any suggestions for better testing?
You may wish to look at Cute Chess.

Graham Banks
Posts: 33145
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: engine-engine testing issues

Post by Graham Banks » Sun Jan 20, 2013 7:17 pm

JBNielsen wrote: Any suggestions for better testing?
Have you tried ChessGUI?

You can limit the number of opening moves, and you can have each engine play every line once as White and once as Black.
I can suggest some good opening books that will serve your purpose too - ones with fair lines.
On the off chance that you strike an unfair opening line, ChessGUI can be set to auto restart with a different line before the game goes too far.
Debug files are also produced for each game, which might help you as an engine author.
My email addresses:
gbanksnz at gmail.com
gbanksnz at yahoo.co.nz

Richard Allbert
Posts: 767
Joined: Wed Jul 19, 2006 7:58 am

Re: engine-engine testing issues

Post by Richard Allbert » Sun Jan 20, 2013 7:46 pm

In the tournament dialog, on the second tab, there is an option to choose a PGN or EPD position file. There is also an option to reverse the colours for each position, and to load the positions sequentially.

If you are using an opening book, go into the book menu, where you can set the move limit to whatever you want, e.g. 8 moves.

Regarding the delay between games, I've never seen this!

As others have written, there are other good options too. Cutechess, although you can't see the games. Winboard: a little more work with the initial setup, otherwise good, and you see the games. ChessGUI is also good.

I wrote a little tool for creating batch files for running Cutechess tournaments, you can find it on rja-software.com
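
For readers who want to see roughly what such a generated command looks like, here is a minimal sketch that builds a cutechess-cli argument list. It is not the tool mentioned above. The option names (-engine, -each, -openings, -games, -rounds, -repeat, -pgnout) are taken from cutechess-cli's help text as I understand it, and their exact meaning differs between versions, so check `cutechess-cli -help` for your build. All file and engine names are hypothetical.

```python
def build_cutechess_args(engine_a, engine_b, openings_file, rounds,
                         max_plies, time_control, pgn_out):
    """Build an argument list for a two-engine cutechess-cli match.

    engine_a and engine_b are dicts with the keys 'cmd', 'name' and
    'proto' ('uci' or 'xboard'). Option names are assumptions to be
    verified against the installed cutechess-cli version.
    """
    args = ["cutechess-cli"]
    for engine in (engine_a, engine_b):
        args += ["-engine", f"cmd={engine['cmd']}",
                 f"name={engine['name']}", f"proto={engine['proto']}"]
    args += ["-each", f"tc={time_control}"]
    # Read openings in file order and cut each line after max_plies plies.
    args += ["-openings", f"file={openings_file}", "format=pgn",
             "order=sequential", f"plies={max_plies}"]
    # Two games per round plus -repeat: each opening is played with both colours.
    args += ["-games", "2", "-rounds", str(rounds), "-repeat"]
    args += ["-pgnout", pgn_out]
    return args


if __name__ == "__main__":
    # Hypothetical engines and files, for illustration only.
    a = {"cmd": "engine_a.exe", "name": "EngineA", "proto": "uci"}
    b = {"cmd": "engine_b.exe", "name": "EngineB", "proto": "xboard"}
    print(" ".join(build_cutechess_args(a, b, "book.pgn", 50, 16,
                                        "40/60", "match.pgn")))
```

The resulting list can be passed to `subprocess.run`, or joined into one line and written to a batch file.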

Regards

Richard

Lavir
Posts: 263
Joined: Sun Oct 28, 2012 10:45 am

Re: engine-engine testing issues

Post by Lavir » Sun Jan 20, 2013 8:08 pm

Richard Allbert wrote: I wrote a little tool for creating batch files for running Cutechess tournaments, you can find it on rja-software.com

Regards

Richard
I took a look at your program, and it's very well done. Thank you for it.

The only thing I think you could add is the possibility of adding engines from within the program, without needing an existing engines.json (i.e. creating or modifying the file via the program); you could do it in the same way as standard GUIs install engines.

If you add that possibility, it would become an extremely powerful way to use Cutechess, even for those who are not very tech-savvy.
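
A minimal sketch of what "creating or modifying the file via the program" could look like. It assumes that engines.json is a JSON array of objects with the keys "name", "command" and "protocol", which is how I understand the Cutechess format; verify this against the Cutechess documentation before relying on it. The function names and the engine in the usage line are hypothetical.

```python
import json


def add_engine(engines, name, command, protocol):
    """Return a new engine list with the given engine added.

    An existing entry with the same name is replaced, so installing an
    engine twice does not create a duplicate.
    """
    if protocol not in ("uci", "xboard"):
        raise ValueError(f"unknown protocol: {protocol!r}")
    updated = [entry for entry in engines if entry.get("name") != name]
    updated.append({"name": name, "command": command, "protocol": protocol})
    return updated


def save_engines(engines, path="engines.json"):
    """Write the engine list to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(engines, handle, indent=4)


if __name__ == "__main__":
    # Hypothetical engine, for illustration only.
    engines = add_engine([], "EngineA", "engine_a.exe", "uci")
    save_engines(engines)
```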

Richard Allbert
Posts: 767
Joined: Wed Jul 19, 2006 7:58 am

Re: engine-engine testing issues

Post by Richard Allbert » Sun Jan 20, 2013 8:36 pm

You're welcome!

Using the .json was the lazy option; I'll see about adding a GUI function for engines as well.

Also, I noticed the epd->pgn is still not activated
:oops:

I'll add the engine feature in the next couple of days.

Richard

Don
Posts: 5106
Joined: Tue Apr 29, 2008 2:27 pm

Re: engine-engine testing issues

Post by Don » Sun Jan 20, 2013 8:56 pm

JBNielsen wrote: I run engine-engine tests with Arena 1.1 and its mainbook.

It is very stable, but these things could be better:

1) Many opening lines are much too long, up to about 30 moves. 8-10 moves would be fine.
2) Some lines give White or Black a big or even winning advantage. Roughly even positions should be played.
3) Arena takes 30 seconds from the end of one game until it starts computing a move in the next game.

As I only have limited computer capacity, I rarely play much more than 100 games (yes, I know that is far too few).
But to get a better comparison of the 2 engines, and comparisons with earlier matches, I would like:
4) The same openings (in the same order) to be played in every match.
5) Every opening to be played twice, with reversed colours.

Any suggestions for better testing?
For engine-engine testing I think having a very large book with very shallow lines is important. After all, you are testing an engine, not a book and not a book/engine combination. Of course some people test that too, for example Komodo with its book vs. some other program with the book that comes with it. That has its place too, but not for testing the quality of the engine in isolation, which is what 90% of people are really interested in.

I also don't like going 20 moves into the game before the engines take over. What is the point in having the book play most of the game?

So we have a variety of books that we use, and I am happy to let anyone have them who wants them. We mostly use V10 (see the list below), but when you require a higher frequency of occurrence, you don't see very many lines that go very deep.

The way I produced these books was to take all the data in TWIC and, using various tools, cut each line off once a position has been seen fewer than N times.
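
The frequency cutoff described here can be sketched as follows. This is not the tooling actually used for these books. As a simplifying assumption it counts move-sequence prefixes rather than positions, so transpositions are not merged, and the games in the usage lines are made up.

```python
from collections import Counter


def build_book(games, min_count):
    """Cut each game off where its line has been seen fewer than min_count times.

    games is a list of move lists such as ["e2e4", "e7e5", ...].
    Returns the sorted, de-duplicated list of truncated lines.
    """
    # Count how often every opening prefix occurs across all games.
    prefix_counts = Counter()
    for moves in games:
        for depth in range(1, len(moves) + 1):
            prefix_counts[tuple(moves[:depth])] += 1

    book = set()
    for moves in games:
        depth = 0
        # Extend the line for as long as the next prefix is frequent enough.
        while (depth < len(moves)
               and prefix_counts[tuple(moves[:depth + 1])] >= min_count):
            depth += 1
        if depth > 0:
            book.add(tuple(moves[:depth]))
    return sorted(book)


if __name__ == "__main__":
    sample = [
        ["e2e4", "e7e5", "g1f3"],
        ["e2e4", "e7e5", "f1c4"],
        ["e2e4", "c7c5"],
        ["d2d4"],
    ]
    for line in build_book(sample, 2):
        print(" ".join(line))
```

A higher cutoff gives fewer and shallower lines, which matches the drop in the number of openings from V10 to V50 in the list below. Note that this sketch can emit a short line that is also the start of a longer one; it does not try to remove those.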

Each of my books is just a big text file, one line per opening, with the moves in long algebraic format. That is the most convenient format for me, since it can be fed into my own tester without additional processing. Here is an example:

e2e4 e7e5 b1c3 g8f6 f1c4 b8c6 g1f3
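
A line in this format is easy to read from a script. The sketch below validates the moves and optionally cuts the line after a given number of plies. The optional promotion letter (as in e7e8q) is an assumption borrowed from UCI-style notation; the book format described here does not say how promotions are written.

```python
import re

# Coordinate move: from-square, to-square, optional promotion piece (assumed).
MOVE_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def parse_book_line(line, max_plies=None):
    """Split one book line into moves, optionally truncated to max_plies."""
    moves = line.split()
    for move in moves:
        if not MOVE_RE.match(move):
            raise ValueError(f"not a long algebraic move: {move!r}")
    if max_plies is not None:
        moves = moves[:max_plies]
    return moves


if __name__ == "__main__":
    example = "e2e4 e7e5 b1c3 g8f6 f1c4 b8c6 g1f3"
    print(parse_book_line(example))
    print(parse_book_line(example, max_plies=4))
```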



Books available:

Note: FOQ is the minimum number of times a position appears in TWIC.

standard - 35,533 openings, 10 ply (default book)
V10 - 28,936 openings, variable depth, FOQ 10, unique
V20 - 14,703 openings, variable depth, FOQ 20, unique
V50 - 7,229 openings, variable depth, FOQ 50
F10_50 - 2,017 openings, 10 ply, FOQ 50
F20_50 - 437 openings, 20 ply, FOQ 50
F20_25 - 1,100 openings, 20 ply, FOQ 25
ending - 1,339 openings, variable depth

JBNielsen
Posts: 250
Joined: Thu Jul 07, 2011 8:31 pm
Location: Denmark

Re: engine-engine testing issues

Post by JBNielsen » Sun Jan 20, 2013 10:45 pm

jdart wrote:I don't know why you are on Arena 1.1 when 3.0 is the current version (last I checked). Arena 3 can use a PGN file for a book and play positions sequentially from that file, alternating colors. You could also of course use a different Arena book file instead of the mainbook.

--Jon
I do have Arena 3.0 on a PC that my 15-year-old son has in his room now...
It is a more powerful PC, so I used it to analyze games and feed the output into my Dabbaba engine's blundergraph function, mentioned in other posts.

But I have used Arena 1.1 instead of Arena 3.0 for these reasons:
1) I was familiar with 1.1, and so far I have been satisfied with it.
2) I did not like the way version 3.0 presented analyzed games with the graphs.
3) I had some difficulties installing my own engine and a few others in version 3.0.

Thanks for suggesting other alternatives in this thread. I wonder if they require my engine to understand some commands it does not understand today...

Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 4:33 am

Re: engine-engine testing issues

Post by Mike S. » Mon Jan 21, 2013 1:11 am

(As for the Arena version, I stick to 2.0.1 and I can recommend it. I did find small bugs in 3.0 a long time ago. I forget what they were, but I decided to wait for 3.0.1 :mrgreen: )

With Arena books(*), settings like "up to halfmove", percentages etc. can be adjusted:

Image

*) You can load any .abk book and use it as the Arena mainbook, regardless of its name. For example, I can offer my Balanced-14.abk. Maybe it is suitable for you.

http://members.aon.at/computerschach/li ... #downloads

The example settings above are from this book and are stored with the book.
Regards, Mike
