Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: The Art of Evaluation (long)

Post by Zach Wegner »

Tord Romstad wrote:This is best understood by an example: Assume that you have written a basic chess engine where the evaluation consists of material and piece square tables only. In this program, you add a single new evaluation term: A small bonus for rooks on open files. By doing this seemingly minor and innocent change, you are actually doing something potentially very dangerous: You are increasing the average value of a rook over the set of all possible chess positions. This has some probably undesired side-effects. The program will, for instance, be slightly more inclined than before to exchange two minor pieces for a rook and a pawn. In order to prevent this, you should either give the rook a small penalty for being on a closed file, or (equivalently) slightly lower the base material value of a rook.
This effect has got me thinking about a normalizing process. When you want the values of pieces to average out to their correct values after bonuses/penalties, the important metric is not the average over all possible chess positions, but rather the average over all positions actually visited in the search tree. A way to measure this is simply to count how many times each evaluation feature is added to the score during a search.
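The bookkeeping for this can be as simple as a hit counter per feature. A minimal sketch of the idea (hypothetical names, not my actual code):

Code: Select all

/* Every time an evaluation feature adds its value to the score, bump
   a counter for it.  After the search, the dot product of counts and
   values divided by the total number of hits is the average value the
   feature contributed per evaluation over the visited tree. */

#include <stdio.h>

#define BUCKETS 32

static int       feature_value[BUCKETS];  /* bonus per bucket, e.g. centipawns */
static long long feature_count[BUCKETS];  /* hits per bucket during the search */

/* Called from the eval: return the bonus and record that it fired. */
static int eval_feature(int bucket)
{
    feature_count[bucket]++;
    return feature_value[bucket];
}

/* Called after the search: average contribution per evaluation. */
static double average_contribution(void)
{
    long long hits = 0, dot = 0;
    int i;
    for (i = 0; i < BUCKETS; i++) {
        hits += feature_count[i];
        dot  += feature_count[i] * (long long)feature_value[i];
    }
    return hits ? (double)dot / (double)hits : 0.0;
}

int main(void)
{
    /* toy run: pretend the search hit bucket 0 twice and bucket 1 once */
    feature_value[0] = -10;
    feature_value[1] = 5;
    eval_feature(0); eval_feature(0); eval_feature(1);
    printf("average contribution = %+.2f\n", average_contribution());
    return 0;
}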

I took a look at my evaluation function, and a nice feature to track this with is mobility. There are many values for each piece besides the pawn and king (the pawn's value is more or less fixed, and the king's doesn't matter). I have two arrays, mobility_value[6][32] and safe_mobility_value[6][32]. The first is indexed by the number of squares attacked, and the second by the number of safe squares attacked (based on the pawn structure). I added counters to my mobility function to see how many times each array element is accessed throughout the search, and printed them at the end along with a dot product giving the average value awarded to each piece based on mobility. Here's some data based on a quick search in the starting position:

Code: Select all

knight mob=1456,86323,1011144,1370258,154199,654391,397846,110123,1348 total=-18330108 average=-0.04
bishop mob=1581290,226192,136491,57120,66162,1289180,151250,221012,171913,136229,26085,1254,2,0 total=-26363713 average=-0.06
rook mob=2200415,1651499,403335,114653,12922,3832,11610,6317,2619,1894,1696,487,203,1535,74 total=-81303669 average=-0.18
queen mob=61717,92330,185246,157586,293761,141933,170241,107869,67019,62311,80066,75987,78987,83177,76947,65669,57453,39175,28847,22115,15353,7710,2368,331,29,2,0 total=-28307591 average=-0.14
knight smob=0,6430,31877,2390601,124380,94991,300432,319745,652179 total=4654849 average=+0.01
bishop smob=8690,3911,1603229,363498,408448,1007407,304421,146356,174310,34477,9009,421,2,1 total=-34562663 average=-0.08
rook smob=0,115,2220836,1641710,401477,108608,21641,9871,3005,1667,2172,389,1476,110,14 total=-57825802 average=-0.13
queen smob=0,0,0,67,1305,65287,121080,242827,336590,246081,222199,145640,144559,119033,87631,76384,65547,43522,25850,17970,9228,2793,581,55,0,0,0 total=-12605802 average=-0.06
Or more condensed:

Code: Select all

knight = -0.04 + +0.01 = -0.03
bishop = -0.06 + -0.08 = -0.14
rook   = -0.18 + -0.13 = -0.31
queen  = -0.14 + -0.06 = -0.20
These numbers show a clear negative bias, which looking at my mobility values alone would not have revealed. But the values do seem to correlate well with what is expected in the starting position:
  • knights are developed early and can reach maximum mobility fairly easily
  • bishops are also developed early, but the position needs to be opened up for them to have much movement
  • rooks are stuck in the corners for quite a while during the opening, with practically minimum mobility
  • queens are not moved much at all early on, but can reach high mobility quickly (as opposed to rooks)
Looking at my values alone, it seems they would be badly biased: there are more negative values than positive, and all of them are just untested guesses based on what seemed right. But when I ran this same test on ECM at 0.1 seconds per position (I really am that impatient), the results surprised me:

Code: Select all

knight = +0.00 + +0.05 = +0.05
bishop = +0.00 + +0.00 = +0.00
rook   = -0.02 + +0.01 = -0.01
queen  = -0.02 + +0.00 = -0.02
Interestingly enough, it seems my values are in harmony with each other to a remarkable degree (over this set of positions, at least). However, this still gives us some useful data: the mobility tables can be normalized to remove this bias. Each element of safe_mobility_value[KNIGHT] is decremented by 5, each element of mobility_value[QUEEN] is incremented by 2, and so on. This takes away the bias mobility gives towards any particular piece. Of course, it says nothing about whether the values for any particular piece are anywhere close to correct. The tables can be normalized in any way such that the sum of counts*values = 0 for each piece.
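The simplest such normalization is to subtract the weighted average from each table. A sketch (normalize_table is a hypothetical helper; it assumes centipawn table values and the per-element access counters gathered above):

Code: Select all

/* Shift every element of one piece's value table by the weighted
   average, so that sum(count[i] * value[i]) becomes zero (up to
   integer rounding) and the feature no longer biases that piece's
   effective material value. */
static void normalize_table(int value[], const long long count[], int n)
{
    long long hits = 0, dot = 0;
    int i, shift;

    for (i = 0; i < n; i++) {
        hits += count[i];
        dot  += count[i] * (long long)value[i];
    }
    if (hits == 0)
        return;
    shift = (int)(dot / hits);   /* weighted average of the table */
    for (i = 0; i < n; i++)
        value[i] -= shift;
}
After a test run you would call something like normalize_table(mobility_value[KNIGHT], mobility_count[KNIGHT], 32) for each piece and each table (again, hypothetical names).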

Obviously, whether this normalizing would make a program play any better remains to be seen. But it does make an evaluation more "modular": each feature becomes more independent, and corrections do not need to be applied outside of it (i.e. the piece values do not need to be adjusted to compensate for each feature's average contribution to the overall eval). One goal I have in this is to eliminate piece values in the eval altogether, and instead evaluate the reasons why each piece is worth what it is.

Zach
Jose Carlos
Posts: 151
Joined: Wed Mar 08, 2006 10:09 pm
Location: Murcia (Spain)

Re: The Art of Evaluation (long)

Post by Jose Carlos »

Great post, Tord. It reawakens my long-dormant interest in computer programming.
About continuity: discontinuity happens all the time when pieces are captured, and chess programs handle captures, and the discontinuity they create, in the quiescence search. So how about trying all eval-discontinuous moves in qsearch? It looks expensive, because you have to generate and evaluate every move before deciding whether to try it, but would it be worth it in terms of search tree stability?
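Roughly like this sketch, where Position, MoveList, evaluate(), the move functions and the margin are all hypothetical placeholders (and "discontinuous" is taken to mean the move raises the static eval by more than some margin):

Code: Select all

#define DISCONTINUITY_MARGIN 100  /* eval jump, in centipawns, that counts
                                     as discontinuous; just a guess */

int qsearch(Position *pos, int alpha, int beta)
{
    MoveList moves;
    int stand_pat, eval_after, score, i;

    stand_pat = evaluate(pos);
    if (stand_pat >= beta)
        return beta;
    if (stand_pat > alpha)
        alpha = stand_pat;

    generate_moves(pos, &moves);    /* all moves, not just captures;
                                       legality checks omitted */
    for (i = 0; i < moves.count; i++) {
        make_move(pos, moves.move[i]);
        /* the expensive part: evaluate after every single move;
           negate because eval is from the side to move's view */
        eval_after = -evaluate(pos);

        /* skip moves that do not create a big jump in the eval */
        if (eval_after - stand_pat < DISCONTINUITY_MARGIN) {
            unmake_move(pos, moves.move[i]);
            continue;
        }

        score = -qsearch(pos, -beta, -alpha);
        unmake_move(pos, moves.move[i]);

        if (score >= beta)
            return beta;
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}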
Some day, when I have some spare time, I'll do some testing.
__________________________
José Carlos Martínez Galán