How to test the evaluation function

jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

How to test the evaluation function

Post by jacobbl »

I was wondering if anyone has made an "EPD" file which, instead of stating the best moves, gives an approximate static evaluation. This can only be done on quiet positions. I know Dann Corbit has made an EPD file called Silent but Deadly; it would be interesting to have an approximate static evaluation of these positions. I know there will be different opinions about what the evaluation should be, but for my engine Sjakk (pretty basic) it would be nice to compare my evaluation with Rybka's evaluation or a GM's evaluation on different positions. This would help me find what kinds of positions my program is evaluating badly. Has anyone done anything like this, or found some other way to test one's evaluation function?

Regards
Jacob
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: How to test the evaluation function

Post by Dann Corbit »

There are several programs that will provide evaluation of a position or set of positions.

You could just feed any EPD test set to any of these engines and compare the result against your engine. Some engines will even break down the eval terms.
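
For storing such reference evaluations alongside the positions, the EPD format already defines a `ce` (centipawn evaluation) opcode, with the score given from the point of view of the side to move. Below is a minimal sketch in C of pulling that value out of an EPD line. The function name `epd_get_ce` is made up for illustration, and the scan is deliberately simple: it assumes the text "ce " does not appear after a space inside a quoted operand such as an `id` string.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Look for a "ce <number>" operation in an EPD line.
   Returns 1 and stores the value in *score if found, 0 otherwise.
   (Hypothetical helper, not part of any engine mentioned here.) */
static int epd_get_ce(const char *line, int *score)
{
    const char *p = line;

    while ((p = strstr(p, "ce ")) != NULL) {
        /* Accept the match only at the start of the line or right after a
           space or ';', so that a longer word ending in "ce" is skipped. */
        if (p == line || p[-1] == ' ' || p[-1] == ';') {
            char *end;
            long v = strtol(p + 3, &end, 10);
            if (end != p + 3) {      /* at least one digit was parsed */
                *score = (int)v;
                return 1;
            }
        }
        p += 3;                      /* keep searching after this match */
    }
    return 0;
}
```

Called on a line such as `<fen fields> ce 33; id "example";` the function stores 33 in `*score` and returns 1; on a line that only carries a `bm` operation it returns 0.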
jacobbl
Posts: 80
Joined: Wed Feb 17, 2010 3:57 pm

Re: How to test the evaluation function

Post by jacobbl »

Thanks

Do you have the name of any of these programs, and the names of some of the engines that provide a break down of the eval terms?

Regards
Jacob
yoshiharu
Posts: 56
Joined: Sat Nov 11, 2006 11:14 pm

Re: How to test the evaluation function

Post by yoshiharu »

jacobbl wrote:
Do you have the name of any of these programs, and the names of some of the engines that provide a break down of the eval terms?
Crafty does that. IIRC you use the command 'score', and it answers with the evaluation broken down, for colours and components.

Unfortunately I don't know about the others.

Cheers, Mauro
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: How to test the evaluation function

Post by Kempelen »

jacobbl wrote:Thanks

Do you have the name of any of these programs, and the names of some of the engines that provide a break down of the eval terms?

Regards
Jacob
My engine Rodin, which is not very strong, does it (the 'eval' command). The next release, which I am working on, will do that better.

Anyway, I consider comparing eval scores to be very inexact, because different engines value things differently and both could be right. You can evaluate a position with Rybka and with Fritz, and you will see quiet positions where the scores differ by more than 0.50 pawns... it is all a question of style...
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: How to test the evaluation function

Post by rvida »

jacobbl wrote:Thanks

Do you have the name of any of these programs, and the names of some of the engines that provide a break down of the eval terms?

Regards
Jacob
Critter can show its static evaluation (use the 'eval' command)

Code:

position fen 1rbqr1k1/1ppn1pbp/p2p1np1/8/2PNP3/2N3PP/PP3PB1/R1BQR1K1 w - -

eval

             | Score | White     Mg     Eg | Black     Mg     Eg
-------------+-------+---------------------+--------------------
    material | +0.00 |+39.48 +39.48 +39.70 |-39.48 -39.48 -39.70
        pawn | +0.06 | +0.11  +0.11  -0.08 | -0.05  -0.05  +0.09
      knight | +0.02 | +0.38  +0.38  +0.24 | -0.35  -0.35  -0.37
      bishop | +0.22 | +0.06  +0.06  +0.12 | +0.16  +0.16  +0.10
        rook | -0.18 | +0.13  +0.13  +0.22 | -0.32  -0.32  -0.39
       queen | +0.18 | +0.20  +0.20  +0.12 | -0.01  -0.01  +0.05
        king | -0.05 | +0.28  +0.28  -0.50 | -0.33  -0.33  +0.50
passed pawns | +0.00 | +0.00  +0.00  +0.00 | +0.00  +0.00  +0.00
 development | +0.00 | -0.01  -0.01  +0.00 | +0.01  +0.01  +0.00
       tempo | +0.07 | +0.07  +0.07  +0.05 | +0.00  +0.00  +0.00
-------------+-------+---------------------+--------------------
       total | +0.33 |+40.73 +40.73 +39.89 |-40.40 -40.40 -39.72
  mtrl bonus | +0.00 |
 final score | +0.33 | (Mg: 100% , Eg:   0%)
Richard
Antonio Torrecillas
Posts: 92
Joined: Sun Nov 02, 2008 4:43 pm
Location: Madrid
Full name: Antonio Torrecillas

Re: How to test the evaluation function

Post by Antonio Torrecillas »

You can also take an open-source engine and add a UCI command for this purpose.
For example, for a Glaurung-type engine, locate uci_handle_command in uci.cpp and modify it:

Code:

extern void epdval(char *name);

static void uci_handle_command(char *command) {
  if(strncasecmp(command, "ucinewgame", 10) == 0) return; 
  else if(strncasecmp(command, "uci", 3) == 0) uci_start();
  else if(strncasecmp(command, "isready", 7) == 0) printf("readyok\n");
  else if(strncasecmp(command, "position", 8) == 0) 
    uci_set_position(command + 9);
  else if(strncasecmp(command, "quit", 4) == 0) quit();
  else if(strncasecmp(command, "go", 2) == 0) uci_go(command + 3);
  else if(strncasecmp(command, "setoption name", 14) == 0)
    uci_setoption(command + 15);
  else if(strncasecmp(command, "epd ", 4) == 0)
    epdval(command + 4);  // custom command: "epd <file name>"
}
Then just add a source file epdeval.cpp containing:

Code:

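// Walk an EPD/FEN file and write the static evaluation of each
// position to output.epd as "FEN;score;"
#include <stdio.h>
#include <string.h>

#include "glaurung.h"

void epdval(char *name)
{
	static position_t RootPosition[1];
	search_stack_t sstack[MAX_DEPTH];

	char buffer[1024];
	char Fen[1024];
	char *aux;
	FILE *fdi;
	FILE *fdo;
	int Value;

	// Strip the trailing newline from the file name, if there is one
	name[strcspn(name,"\r\n")] = '\0';
	fdi = fopen(name,"r");
	if(fdi)
	{
		fdo = fopen("output.epd","w");
		if(fdo)
		{
			while(fgets(buffer,sizeof(buffer),fdi))
			{
				// Extract the FEN: cut the line at "bm ", else at the
				// first ';', else at the end of the line
				strcpy(Fen,buffer);
				aux = strstr(Fen,"bm ");
				if(!aux)
					aux = strchr(Fen,';');
				if(!aux)
					aux = Fen + strcspn(Fen,"\r\n");
				*aux = '\0';
				if(Fen[0] == '\0')
					continue;	// skip blank lines

				set_position(&RootPosition[0], Fen);
				Value = evaluate(&RootPosition[0], &(sstack[0].eval_vector), 0);
				// Rescale so that one pawn (P_VALUE) equals 100
				Value = (Value*100)/P_VALUE;

				// Write directly to the file; no intermediate buffer to overflow
				fprintf(fdo,"%s;%d;\n",Fen,Value);
			}
			fclose(fdo);
		}
		fclose(fdi);
	}
}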
Once you have evaluated a large EPD file, you can perform the same operation with your own engine
and extract the positions with the largest differences for subsequent step-by-step analysis.
I hope this helps.
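
The comparison step could look roughly like the following sketch in C. It assumes both engines were run over the same EPD file and each wrote one "FEN;score;" line per position, in the same order and with identical FEN text, as epdval does. The names `split_result` and `report_differences` and the extra score column in the output are illustrative choices, not part of Glaurung.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split a "FEN;score;" line in place. On success the buffer holds only the
   FEN, *score holds the evaluation, and 1 is returned; 0 if malformed. */
static int split_result(char *line, int *score)
{
    char *sep = strchr(line, ';');
    char *end;
    long v;

    if (!sep)
        return 0;
    *sep = '\0';
    v = strtol(sep + 1, &end, 10);
    if (end == sep + 1)              /* no number after the first ';' */
        return 0;
    *score = (int)v;
    return 1;
}

/* Read two result files in parallel and write every position whose two
   evaluations differ by at least `threshold` (in centipawns) to `out`.
   Returns the number of positions reported, or -1 if a file cannot be opened. */
static int report_differences(const char *ref_path, const char *own_path,
                              int threshold, FILE *out)
{
    char ref_line[1024], own_line[1024];
    int ref_score, own_score, count = 0;
    FILE *ref = fopen(ref_path, "r");
    FILE *own = fopen(own_path, "r");

    if (!ref || !own) {
        if (ref) fclose(ref);
        if (own) fclose(own);
        return -1;
    }
    while (fgets(ref_line, sizeof ref_line, ref) &&
           fgets(own_line, sizeof own_line, own)) {
        if (!split_result(ref_line, &ref_score) ||
            !split_result(own_line, &own_score))
            continue;                /* skip malformed lines */
        if (strcmp(ref_line, own_line) != 0)
            continue;                /* FEN text differs: not the same position */
        if (abs(ref_score - own_score) >= threshold) {
            fprintf(out, "%s;%d;%d;\n", ref_line, ref_score, own_score);
            count++;
        }
    }
    fclose(ref);
    fclose(own);
    return count;
}
```

For example, calling `report_differences("output.epd", "myengine.epd", 50, stdout)` from a small main function would list every position where the two engines disagree by half a pawn or more; the second file name and the threshold are placeholders. Both scores need to use the same sign convention (side to move or White) for the difference to be meaningful.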