tactical play or positional play for chess engine

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

kaustubh

tactical play or positional play for chess engine

Post by kaustubh »

What is more important for a chess engine: being tactically strong, or being positionally better?
MartinBryant

Re: tactical play or positional play for chess engine

Post by MartinBryant »

In the end you'll need both.
Tactics are probably the easiest to implement well initially and the most objective to test.
However, you'll never get into the top ranks unless your engine plays well positionally too.
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: tactical play or positional play for chess engine

Post by Michael Sherwin »

IMO

Positional play is virtually a function of the eval. Even if the program has an aggressive eval that leads to lots of tactics, those are tactics rooted in strategy.

Tactics that take advantage of an opponent's mistakes are found at some point by the search. All the search can do is help the eval exploit the opponent's mistakes and avoid its own.

If a program has a super good eval (super good does not necessarily mean complicated), it can still play very strongly, even if the search is not quite on the same level. That is because the super good eval will not lead the program into positions where it is highly susceptible to being forced into compromises that will eventually lead to a loss.

The program with the worse eval will more often be forced to make small compromises that will eventually lead to its defeat. That is true even if it outsearches the other engine.

One absolute truth is that it does not matter how deep program A searches if program B does not make any losing mistakes. On the other hand, if program A does not have a very good eval, it will make plenty of positional mistakes that will lead to its defeat.

The eval is therefore a little more important than the search, although both are important.

However, there are programmers of some very strong engines who believe that the search is more important.
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: tactical play or positional play for chess engine

Post by Uri Blass »

Michael Sherwin wrote:The eval is therefore a little more important than the search, although both are important.

However, there are programmers of some very strong engines who believe that the search is more important.
I wonder how you can compare the importance of the two parts.
I think it is simply undefined.

Uri
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: tactical play or positional play for chess engine

Post by Michael Sherwin »

To me it seems obvious.

The eval can "put a hook into the jaw" of any chess program and lead it straight to defeat! And all the 'weapons' of the search will not save it.
Uri Blass
Posts: 10268
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: tactical play or positional play for chess engine

Post by Uri Blass »

Michael Sherwin wrote:To me it seems obvious.

The eval can "put a hook into the jaw" of any chess program and lead it straight to defeat! And all the 'weapons' of the search will not save it.
Of course a chain is only as strong as its weakest link, but the search can also lead a program to lose.

Fruit with the same evaluation at a fixed depth of 8 plies is going to be very weak at tournament time control.

If you change Fruit's search by removing null-move pruning and the other pruning, not using hash tables, and having worse move ordering, then you can practically reach only depth 8 at tournament time control.

Uri
Stan Arts

Re: tactical play or positional play for chess engine

Post by Stan Arts »

What I see in computer chess is that there's some bare minimum of evaluation you need to have: you need to know a few things about pawn structure, king safety and piece mobility, plus some simple endgame knowledge. With that you are hard to beat in computer chess, as long as you have a super search.

Unfortunately I see this with my own engine, where most of my work has gone into writing a detailed evaluation function. It can hold its own against almost anything in fixed-depth matches, but it gets beaten terribly in normal matches because of my crappy search. It does beat others from time to time thanks to some "deep" evaluation ideas, but those occasions are rare enough that a simple evaluation with a super search wins you many more games in computer chess. Unfortunately.
Even against new-ish amateur engines with simple evals that search very deep (with LMR and whatnot): they simply find the good moves and refutations, my engine's eval cannot find a way through at its low search depths, and it actually gets kicked off the board by tactics some 5 plies deeper (and even many positional lines are tactically forced/driven), which a reasonable eval really won't help you stay out of.

(A good example is Fruit. (I think Fruit's evaluation is not as amazing as some claim, simply because my engine can knock off plenty of games at fixed depth. If it is amazing, it is only amazing in how well it works with its search.) There are a great many programs with more knowledgeable evals, yet Fruit outsearches everything and ends up at the top. It is terribly hard to compensate for 1-2 extra plies with evaluation.)

Unfortunately. :(

Stan
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

The Art of Evaluation (long)

Post by Tord Romstad »

In the days when I knew even less about chess programming than I do now, I used to think that writing a good evaluation function was simply about identifying pieces of knowledge that my program lacked, finding good ways to quantify the missing knowledge, choosing good weights, and implementing it all without bugs.

But over the years, it slowly became clear to me that there must be something more to writing a good evaluation function. A very frustrating experience repeated itself over and over: I noticed some obvious evaluation weakness in my program, and attempted to fix it (using more or less the procedure described above). Initially, the new evaluation term usually seemed to work: The previously misevaluated positions were understood better, and I saw plenty of games where the new knowledge helped my program to win. Nevertheless, the results in long matches were very often worse with the new evaluation term implemented, even when there was no significant slowdown, and no implementation bugs.

I think there are two main reasons for this phenomenon. The first one is that whenever you add a new term to the evaluation function, you have to carefully analyse how it interacts with all the previously existing evaluation terms. It is possible that some of the existing terms should be rewritten, re-tuned, or removed entirely.

This is best understood by an example: Assume that you have written a basic chess engine where the evaluation consists of material and piece square tables only. In this program, you add a single new evaluation term: A small bonus for rooks on open files. By doing this seemingly minor and innocent change, you are actually doing something potentially very dangerous: You are increasing the average value of a rook over the set of all possible chess positions. This has some probably undesired side-effects. The program will, for instance, be slightly more inclined than before to exchange two minor pieces for a rook and a pawn. In order to prevent this, you should either give the rook a small penalty for being on a closed file, or (equivalently) slightly lower the base material value of a rook.
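As a rough illustration (not from any actual engine; the values and the helper name are invented for the example), the compensation might look like this in C:

```c
/* Sketch of the compensated rook-file term. Splitting the old
   one-sided bonus into a bonus and a penalty keeps the rook's average
   value, taken over all positions, roughly where it was before. */
#define ROOK_OPEN_FILE_BONUS     20  /* centipawns */
#define ROOK_CLOSED_FILE_PENALTY 10  /* offsets the average bonus */

int rook_file_term(int file_is_open)
{
    return file_is_open ? ROOK_OPEN_FILE_BONUS
                        : -ROOK_CLOSED_FILE_PENALTY;
}
```

Equivalently, one could keep the one-sided bonus and subtract its average from the rook's base material value.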

Similar effects occur in almost every case when you add a new evaluation term, but usually the interactions between the new and old evaluation terms are much more subtle and hard to spot than in the simple example above. Identifying and fixing the potential problems is generally very difficult.

The second reason is that evaluation rules which are correct only in the vast majority of chess positions (as opposed to all chess positions) can very often make the program play worse in practice. The point is that a chess player's strength is mainly determined by the quality of his/her/its worst moves, and not by the best moves. Therefore, the price of misevaluating 1% of the nodes will sometimes more than outweigh the benefits of evaluating the remaining 99% of all nodes more precisely. This is particularly obvious for evaluation terms which can take very big absolute values, but the same thing can happen even for small bonuses and penalties.

In summary, an evaluation function is not just the sum of its parts. You can't just heap lots of correct (in isolation) pieces of chess knowledge on top of each other and expect to end up with a good evaluation function. You need patience, restraint, thorough testing, and a sound basic philosophy to succeed.

The following are, in my opinion, the most important properties of a good evaluation function:
  • Orthogonality. When it is possible, it is better to avoid having two different evaluation components which to some extent quantify the same aspect of the position. When adding a new evaluation component which has a non-zero "orthogonal projection" (in a metaphorical sense, of course) on a previously existing component, try to adjust the two components in such a way that the projection is minimized, or to generalize and combine the two components into one.
  • Continuity. If two positions X and Y are "close to each other", in the sense that it is possible to get from position X to position Y by a short sequence of good moves, the two positions should ideally have similar evaluations. As a corollary, when one adds a big bonus or penalty for some particular pattern, one should also consider introducing a smaller bonus or penalty for getting close to this pattern. For instance, when adding a big bonus for a knight on an outpost square, it might be a good idea to add a smaller bonus for a knight attacking an outpost square (see the sketch after this list).
  • Sense of progress. It is much more important that the evaluation function is able to accurately judge which of two very similar positions is better than that it is able to judge which of two totally different positions is better. The evaluation function doesn't need to be able to answer questions like whether a certain classical King's Indian middlegame is better than an endgame arising from a Richter-Rauzer Sicilian. What it needs to be able to decide is things like whether one side should try to exchange a bishop for a knight, or whether it is better to castle kingside or queenside.
  • Good worst case behavior. It is better to be wrong by 10 centipawns all the time than to be completely correct 99.9% of the time and wrong by 300 centipawns 0.1% of the time.
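A minimal sketch of the continuity idea for the outpost example above, in C (the helper names and values are invented, not taken from any engine):

```c
/* Graded outpost term: full bonus for the pattern itself, partial
   credit for being one good move away from it, so the evaluation
   changes smoothly as the pattern is approached. */
int knight_outpost_term(int on_outpost, int attacks_outpost)
{
    if (on_outpost)
        return 30;   /* big bonus for the knight on the outpost */
    if (attacks_outpost)
        return 10;   /* smaller bonus for attacking the outpost square */
    return 0;
}
```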
Stan Arts wrote:(A good example is Fruit. (I think Fruit's evaluation is not as amazing as some claim, simply because my engine can knock off plenty of games at fixed depth. If it is amazing, it is only amazing in how well it works with its search.) There are a great many programs with more knowledgeable evals, yet Fruit outsearches everything and ends up at the top. It is terribly hard to compensate for 1-2 extra plies with evaluation.)
Fruit's evaluation function is actually very good. It is true that there are many programs with more knowledgeable evals, but as explained above, this is not the same as better evals. Fruit's evaluation is founded on a sound philosophy, and has very few bugs. This is far more important than how much knowledge it contains.

BTW: Using fixed-depth search games to compare the quality of evaluation functions isn't very meaningful, but that's a topic for another post (and probably a different day): This post is already far too long.

Tord
hgm
Posts: 27789
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The Art of Evaluation (long)

Post by hgm »

Great post. I fell exactly into the trap with the open-line bonus you describe, in the early development of Joker. :oops: I increased the open-line bonus, and suddenly it started hopelessly losing many games through unprovoked exchanges of two minor pieces for R+P.

Joker also suffers from discontinuity, in particular w.r.t. the castling bonus. Lack of continuity is a major source of horizon effect. When Joker gets a large positional advantage early in the opening, it often permanently wrecks its position with moves it knows to be bad, just to push the castling of its opponent over the horizon. I have considered shifting part of the castling bonus to clearing the squares between the uncastled K and R, to improve continuity.
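A sketch of that graded castling term (all names and numbers here are invented for illustration, not Joker's actual values):

```c
/* Pay out part of the castling bonus for every cleared square between
   the uncastled king and rook, so the score rises smoothly as castling
   comes closer instead of jumping only when it finally happens. */
int castling_term(int has_castled, int empty_squares_between_k_and_r)
{
    if (has_castled)
        return 40;                             /* full bonus */
    return 8 * empty_squares_between_k_and_r;  /* partial credit, 0-3 squares */
}
```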
Jan Brouwer
Posts: 201
Joined: Thu Mar 22, 2007 7:12 pm
Location: Netherlands

Re: The Art of Evaluation (long)

Post by Jan Brouwer »

Posts like Tord's cannot easily be too long, in my opinion :-).

The present evaluation function of my chess program (Rotor) is really a hack, and I am looking for just this kind of discussion on how to develop a good evaluation function.
Searching through the archives, I found the following ideas:
  • use symmetries to catch bugs (see the sketch after this list)
  • divide evaluation features into "static" and "dynamic" features; static features (e.g. material imbalance, pawn structure) change less often, are important, can be determined quite reliably, and cannot easily be replaced by deeper search (Don Dailey)
  • consider non-linear functions
  • battle horizon effects by avoiding "step" functions; try to "smell" big changes in evaluation far in advance
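For the symmetry idea, a minimal sanity check might look like this in C (Position, evaluate() and mirror_position() stand for whatever the engine under test provides; the names are assumptions):

```c
#include <assert.h>

/* Assumed engine interfaces, declared here only to make the sketch
   self-contained; substitute the engine's real ones. */
typedef struct Position Position;
int evaluate(const Position *pos);               /* white-relative score */
Position *mirror_position(const Position *pos);  /* flip ranks, swap colors */

/* A white-relative eval must give a position and its color-flipped
   mirror exactly opposite scores; any difference is an asymmetry bug. */
void check_eval_symmetry(const Position *pos)
{
    assert(evaluate(pos) == -evaluate(mirror_position(pos)));
}
```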
One idea I had was to consider an evaluation feature as consisting of the feature proper and a noise component.
Is it possible to measure how large the noisy part of a particular feature is?

And how do you measure the goodness of an evaluation function in general? By playing many games?
Vasik Rajlich wrote: “The key to having a good evaluation is coming up with some way to test it, piece by piece.
Self-play is not enough, you'll never play enough games to show a 10-point improvement.”