engine-engine testing issues

JBNielsen
Posts: 267
Joined: Thu Jul 07, 2011 10:31 pm
Location: Denmark

engine-engine testing issues

Post by JBNielsen »

I run engine-engine tests with Arena 1.1 and its mainbook.

It is very stable, but these things could be better:

1) Many opening lines are much too long, up to about 30 moves. 8-10 moves would be fine.
2) Some lines lead to a big or even winning advantage for White or Black. Only roughly even positions should be played.
3) The system takes 30 seconds from the end of one game until it starts computing a move in the next game.

As I only have limited computer capacity, I rarely play much more than 100 games (yes, I know that is far too few).
But to get a better comparison of the 2 engines, and comparisons with earlier matches, I would like:
4) The same openings (in the same order) to be played in every match.
5) Every opening to be played twice, with colours reversed.

Any suggestions that would give better testing?
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: engine-engine testing issues

Post by jdart »

I don't know why you are on Arena 1.1 when 3.0 is the current version (last I checked). Arena 3 can use a PGN file for a book and play positions sequentially from that file, alternating colors. You could also of course use a different Arena book file instead of the mainbook.

--Jon
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: engine-engine testing issues

Post by zullil »

JBNielsen wrote:
Any suggestions that would give better testing?
You may wish to look at Cute Chess.
Graham Banks
Posts: 41433
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: engine-engine testing issues

Post by Graham Banks »

JBNielsen wrote:Any suggestions that would give better testing?
Have you tried ChessGUI?

You can limit the number of opening moves and you can get each side to play a line as White and as Black.
I can suggest some good opening books that will serve your purpose too - ones with fair lines.
On the off chance that you strike an unfair opening line, ChessGUI can be set to auto restart with a different line before the game goes too far.
Debug files are also produced for each game, which might help you as an engine author.
gbanksnz at gmail.com
Richard Allbert
Posts: 792
Joined: Wed Jul 19, 2006 9:58 am

Re: engine-engine testing issues

Post by Richard Allbert »

In the tournament dialog, on the second tab, there is an option to choose a PGN or EPD position file. There is also an option to reverse the colours for each position, and to load positions sequentially.

If you are using an opening book, go into the book menu; there you can set the move limit to whatever you want, e.g. 8 moves.

Regarding the delay between games, I've never seen this!

As others have written, there are other good options too: Cutechess, although you can't see the games, and Winboard, which takes a little more work for the initial setup but is otherwise good, and you see the games. ChessGUI is also good.

I wrote a little tool for creating batch files for running Cutechess tournaments, you can find it on rja-software.com

Regards

Richard
Lavir
Posts: 263
Joined: Sun Oct 28, 2012 11:45 am

Re: engine-engine testing issues

Post by Lavir »

Richard Allbert wrote: I wrote a little tool for creating batch files for running Cutechess tournaments, you can find it on rja-software.com

Regards

Richard
I took a look at your program, and it's very well done. Thank you for it.

The only thing I think you could add is the ability to add engines from within the program, without needing an existing engines.json (i.e. creating or modifying the file via the program); you could do it the same way standard GUIs install engines.

With that addition it would become an extremely powerful tool that also lets those who are not very tech-savvy use cutechess.
Richard Allbert
Posts: 792
Joined: Wed Jul 19, 2006 9:58 am

Re: engine-engine testing issues

Post by Richard Allbert »

You're welcome!

Using the .json was the lazy option; I'll see about adding GUI engine functions as well.

Also, I noticed the epd->pgn is still not activated
:oops:

I'll add the engine feature in the next couple of days.

Richard
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: engine-engine testing issues

Post by Don »

JBNielsen wrote:I run engine-engine tests with Arena 1.1 and its mainbook.

It is very stable, but these things could be better:

1) Many opening lines are much too long, up to about 30 moves. 8-10 moves would be fine.
2) Some lines lead to a big or even winning advantage for White or Black. Only roughly even positions should be played.
3) The system takes 30 seconds from the end of one game until it starts computing a move in the next game.

As I only have limited computer capacity, I rarely play much more than 100 games (yes, I know that is far too few).
But to get a better comparison of the 2 engines, and comparisons with earlier matches, I would like:
4) The same openings (in the same order) to be played in every match.
5) Every opening to be played twice, with colours reversed.

Any suggestions that would give better testing?
For engine-engine testing I think having a very large book with very shallow lines is important. After all, you are testing an engine, not a book and not a book/engine combination. Of course some people test that too, for example Komodo with its book vs. some other program with the book that comes with it. That has its place too, but not for testing the quality of the engine in isolation, which is what 90% of people are really interested in.

I also don't like going 20 moves into the game before the engines take over. What is the point in having the book play most of the game?

So we have a variety of books that we use, and I am happy to let anyone who wants them have them. We mostly use V10 (see table below), but when you require positions with a higher frequency of occurrence, you don't see very many lines that go very deep.

The way I produced these books was to take all the games in TWIC and, using various tools, cut each line off once a position has been seen fewer than N times.
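The cutoff idea can be illustrated with a minimal Python sketch. This is not the actual tooling used for these books; the function name `build_book` and the `max_plies` parameter are made up for illustration. It also assumes, as a simplification, that a "position" is identified by the sequence of moves leading to it, so transpositions are not merged.

```python
from collections import Counter


def build_book(games, min_count, max_plies=40):
    """Cut each game's opening where its frequency drops below min_count.

    games: list of games, each a list of moves in long algebraic notation.
    Assumption: a position is identified by the move sequence that reaches
    it, so transpositions are counted separately.
    Returns a sorted list of unique lines, one string per opening.
    """
    # Count how many games pass through each move-sequence prefix.
    prefix_counts = Counter()
    for moves in games:
        for n in range(1, min(len(moves), max_plies) + 1):
            prefix_counts[tuple(moves[:n])] += 1

    # Keep the longest prefix of each game that was seen often enough.
    book = set()
    for moves in games:
        line = ()
        for n in range(1, min(len(moves), max_plies) + 1):
            prefix = tuple(moves[:n])
            if prefix_counts[prefix] < min_count:
                break
            line = prefix
        if line:
            book.add(line)

    # Drop lines that are merely the start of a longer kept line.
    proper_prefixes = {line[:n] for line in book for n in range(1, len(line))}
    return sorted(" ".join(line) for line in book if line not in proper_prefixes)


games = [
    ["e2e4", "e7e5", "g1f3", "b8c6"],
    ["e2e4", "e7e5", "g1f3", "g8f6"],
    ["e2e4", "c7c5", "g1f3"],
    ["d2d4", "d7d5"],
]
print(build_book(games, min_count=2))  # ['e2e4 e7e5 g1f3']
```

A higher `min_count` gives fewer and shallower lines, which matches the pattern in the book sizes listed further down.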

My books are just a big text file, one line per opening, with the moves in long algebraic format. This is the most convenient format for me, since it can be fed into my own tester without additional processing. Here is an example:

e2e4 e7e5 b1c3 g8f6 f1c4 b8c6 g1f3
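A file in this format is easy to turn into a repeatable match schedule. The Python sketch below is only an illustration under stated assumptions: the function names and the engine labels "A" and "B" are hypothetical, and it does not describe how any particular tester works. It reads openings in file order and plays each one twice with colours reversed, so every match uses the same openings in the same sequence.

```python
def parse_book(text):
    """Parse one-opening-per-line text into lists of long-algebraic moves."""
    return [line.split() for line in text.splitlines() if line.strip()]


def make_schedule(openings, engine_a, engine_b, num_openings):
    """Return (white, black, moves) tuples for a fixed, repeatable match.

    The first num_openings lines are used in file order, and each opening
    is played twice with colours reversed.
    """
    schedule = []
    for moves in openings[:num_openings]:
        schedule.append((engine_a, engine_b, moves))
        schedule.append((engine_b, engine_a, moves))
    return schedule


book_text = """e2e4 e7e5 b1c3 g8f6 f1c4 b8c6 g1f3
d2d4 d7d5 c2c4
"""
for white, black, moves in make_schedule(parse_book(book_text), "A", "B", 2):
    print(white, "vs", black, ":", " ".join(moves))
```

With this layout a 100-game match would use the first 50 openings of the book.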



Books available:

Note: FOQ is the minimum number of times a position appears in TWIC.

standard - 35,533 openings, 10 ply (default book)
V10 - 28,936 openings, variable depth, FOQ 10, unique
V20 - 14,703 openings, variable depth, FOQ 20, unique
V50 - 7,229 openings, variable depth, FOQ 50
F10_50 - 2,017 openings, 10 ply, FOQ 50
F20_50 - 437 openings, 20 ply, FOQ 50
F20_25 - 1,100 openings, 20 ply, FOQ 25
ending - 1,339 openings, variable depth
JBNielsen
Posts: 267
Joined: Thu Jul 07, 2011 10:31 pm
Location: Denmark

Re: engine-engine testing issues

Post by JBNielsen »

jdart wrote:I don't know why you are on Arena 1.1 when 3.0 is the current version (last I checked). Arena 3 can use a PGN file for a book and play positions sequentially from that file, alternating colors. You could also of course use a different Arena book file instead of the mainbook.

--Jon
I do have Arena 3.0 on a PC that my 15-year-old son has in his room now...
It is a more powerful PC, so I used it to analyze games and feed the output into the blundergraph function of my Dabbaba engine, mentioned in other posts.

But I have used Arena 1.1 instead of Arena 3.0 for these reasons:
1) I was familiar with 1.1 and so far I was satisfied with it.
2) I did not like the way version 3.0 presented analyzed games with the graphs.
3) I had some difficulties installing my own engine and a few others in version 3.0.

Thanks for suggesting other alternatives in this thread. I wonder if they require my engine to understand some commands it does not understand today...
Mike S.
Posts: 1480
Joined: Thu Mar 09, 2006 5:33 am

Re: engine-engine testing issues

Post by Mike S. »

(As for the Arena version, I stick to 2.0.1 and I can recommend it. I did find small bugs in 3.0 a long time ago; I forget what they were, but I decided to wait for 3.0.1 :mrgreen: )

With Arena books(*), settings like "up to halfmove", percentages etc. can be adjusted:

[Image: screenshot of the Arena book settings]

*) You can load any .abk book and use it as the Arena mainbook, regardless of its name. For example, I can offer my Balanced-14.abk. Maybe it is suitable for you.

http://members.aon.at/computerschach/li ... #downloads

The example settings above are from this book and are stored with the book.
Regards, Mike