tactical play or positional play for chess engine


Roman Hartmann
Posts: 295
Joined: Wed Mar 08, 2006 8:29 pm

Re: The Art of Evaluation (long)

Post by Roman Hartmann »

Thanks a lot for this great post, Tord. I too made (or still make) some of the mistakes you mention.
It also took me quite some time to figure out that giving a bonus for something doesn't necessarily result in the same behaviour as giving a penalty for not doing it. Good explanation of that phenomenon on your side.
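To make the asymmetry concrete, here is a toy sketch in C (all names and values are illustrative, not from any actual engine): a bonus for rooks on open files and a penalty for rooks off open files differ only by a constant while the rook count is fixed, so the search sees the same preferences; as soon as rooks can be traded, the bonus version wants to keep a well-placed rook, while the penalty version is rewarded for trading away a badly placed one.

Code: Select all

static const int ROOK_OPEN_FILE = 15;   /* centipawns, illustrative */

/* rook_on_open_file[i] is 1 if the side's i-th rook sits on an open file */
int rook_file_score(const int rook_on_open_file[], int num_rooks,
                    int use_penalty)
{
    int score = 0;
    for (int i = 0; i < num_rooks; i++) {
        if (use_penalty)
            score -= rook_on_open_file[i] ? 0 : ROOK_OPEN_FILE;  /* penalty for not-X */
        else
            score += rook_on_open_file[i] ? ROOK_OPEN_FILE : 0;  /* bonus for X */
    }
    /* The two versions differ by ROOK_OPEN_FILE * num_rooks, which is a
       constant only while num_rooks is fixed; a rook trade changes it,
       and that is exactly where the two behaviours diverge. */
    return score;
}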

Roman
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: The Art of Evaluation (long)

Post by Tord Romstad »

Hello Jan,

Thanks to you, Harm Geert and Roman for your kind words! While writing long posts like the one you replied to, I sometimes wonder why I bother writing them, and fear that no one is going to make the effort to read them. That intelligent people like you not only read what I write, but even appreciate it, is very encouraging. :)
Jan Brouwer wrote:One idea I had was to consider an evaluation feature to consist of the feature proper, and of a noise component.
Is it possible to measure how large the noisy part of a particular feature is?
I haven't tried, and I am not even sure I understand the idea correctly. Do you mean that each evaluation term should not only consist of a value, but also an estimate of its possible inaccuracy? Perhaps this might be useful, but I am not quite sure how I would use the information.
And how do you measure the goodness of an evaluation function in general? By playing many games?
Vasik Rajlich wrote: “The key to having a good evaluation is coming up with some way to test it, piece by piece.
Self-play is not enough, you'll never play enough games to show a 10-point improvement.”
This is a very difficult question, and I'm afraid I don't have any good answer. Quite often, I have to trust my intuition, and I am sure my intuition is very often wrong with respect to the evaluation function. It is quite common that some new piece of knowledge makes the program play "optically" better than before, in the sense that its play looks more intelligent and purposeful, even if the practical strength drops by a few Elo points. You can often see clearly that your program wins a few games because of the newly added knowledge, but it is not so easy to notice the many unexpected ways the newly added knowledge causes your program to lose games.
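To put rough numbers on Vas's remark (a back-of-the-envelope sketch; the constants are illustrative, not from him): a 10 Elo edge is only about a 51.4% expected score, and detecting it at the 2-sigma level in self-play takes on the order of 5000 games.

Code: Select all

#include <math.h>
#include <stdio.h>

int main(void)
{
    double elo = 10.0;
    /* expected score from the Elo logistic formula */
    double p = 1.0 / (1.0 + pow(10.0, -elo / 400.0));   /* ~0.514 */
    double edge = p - 0.5;
    double sigma = 0.5;   /* rough per-game standard deviation of the score */
    /* a 2-sigma (95%) test needs the standard error below edge/2 */
    double games = pow(2.0 * sigma / edge, 2.0);        /* ~4800 */
    printf("10 Elo ~ %.1f%% score; need ~%.0f games\n", 100.0 * p, games);
    return 0;
}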

Tord
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: The Art of Evaluation (long)

Post by Michael Sherwin »

Thank you, Tord, for this reply to Stan Arts, as I did not have the time to reply until now. You did a better job than I would have done anyway. I would just add that the eval cannot be divorced from the time element. If the eval has fantastic and correct chess knowledge but is too slow, then it is not a super good eval. It is one thing to have a slightly under-par search, and quite another to have a slightly under-par search that is also crippled by a too-slow eval.

Edit: If anyone thinks that I am contradicting myself to some degree, go back and look at my original post and you will see that I indicated that a super good eval does not necessarily mean a complicated one. :)
Harald
Posts: 318
Joined: Thu Mar 09, 2006 1:07 am

Re: The Art of Evaluation (long)

Post by Harald »

Tord Romstad wrote: Thanks to you, Harm Geert and Roman for your kind words! While writing long posts like the one you replied to, I sometimes wonder why I bother writing them, and fear that no one is going to make the effort to read them. That intelligent people like you not only read what I write, but even appreciate it, is very encouraging. :)
I like to read posts with good technical information in them. Even if I do not answer them, I store them in a big chess-programming folder on my hard disk, in this case in the subfolder 'evaluation'. In case I want to improve or rewrite my own engine some day, I have a lot of ideas to think about. There are about 800 MB of stuff in that folder, and then there are others with chess-related papers, chess engines, sources and so on. I do not believe I can read the 2.5 GB in 23000 files in the rest of my life, but collecting is fun, too. :-)

Harald
Jan Brouwer
Posts: 201
Joined: Thu Mar 22, 2007 7:12 pm
Location: Netherlands

Re: The Art of Evaluation (long)

Post by Jan Brouwer »

Hi Tord,

I'm sure that quite a few amateur chess engine authors like myself are eager to understand why Glaurung is so strong, and searches so efficiently!
Tord Romstad wrote:
Jan Brouwer wrote:One idea I had was to consider an evaluation feature to consist of the feature proper, and of a noise component.
Is it possible to measure how large the noisy part of a particular feature is?
I haven't tried, and I am not even sure I understand the idea correctly. Do you mean that each evaluation term should not only consist of a value, but also an estimate of its possible inaccuracy? Perhaps this might be useful, but I am not quite sure how I would use the information.
I was thinking about treating an evaluation feature as a "black box", applying a lot of smart number crunching to it, and getting out an answer like "this feature correlates 60% with playing strength; 40% is random noise". Now the only tricky part that remains is defining the number crunching needed :-). If this were possible, it would provide a way of optimizing evaluation features. Anyway, it is just a vague idea :-(.
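One concrete way to read the "black box" idea (just a sketch; the function and data layout are made up for illustration): compute the Pearson correlation between a feature's value and the game result over a large set of positions. The resulting r would play the role of the "60%" in the post, with the rest attributed to noise.

Code: Select all

#include <math.h>

/* feature[i]: value of the evaluation feature in position i
   result[i]:  game result from that position (0, 0.5, 1) */
double feature_correlation(const double feature[], const double result[], int n)
{
    double sf = 0, sr = 0, sff = 0, srr = 0, sfr = 0;
    for (int i = 0; i < n; i++) {
        sf  += feature[i];
        sr  += result[i];
        sff += feature[i] * feature[i];
        srr += result[i] * result[i];
        sfr += feature[i] * result[i];
    }
    double cov = sfr - sf * sr / n;   /* n * covariance          */
    double vf  = sff - sf * sf / n;   /* n * variance of feature */
    double vr  = srr - sr * sr / n;   /* n * variance of result  */
    return cov / sqrt(vf * vr);       /* Pearson r in [-1, 1]    */
}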
Tord Romstad wrote:
And how do you measure the goodness of an evaluation function in general? By playing many games?
Vasik Rajlich wrote: “The key to having a good evaluation is coming up with some way to test it, piece by piece.
Self-play is not enough, you'll never play enough games to show a 10-point improvement.”
This is a very difficult question, and I'm afraid I don't have any good answer. Quite often, I have to trust my intuition, and I am sure my intuition is very often wrong with respect to the evaluation function. It is quite common that some new piece of knowledge makes the program play "optically" better than before, in the sense that its play looks more intelligent and purposeful, even if the practical strength drops by a few Elo points. You can often see clearly that your program wins a few games because of the newly added knowledge, but it is not so easy to notice the many unexpected ways the newly added knowledge causes your program to lose games.
Here I am at a disadvantage: I know next to nothing about playing chess. It is only recently that I learned about the importance of (candidate) passed pawns. Playing chess, for me, is mainly about concentrating enough not to give away a piece!

Btw, I found that Wikipedia contains quite a few good articles (as far as I can judge) about chess.
Dan Andersson
Posts: 442
Joined: Wed Mar 08, 2006 8:54 pm

Re: The Art of Evaluation (long)

Post by Dan Andersson »

For tuning evaluation terms there are publications on reinforcement learning for chess. Temporal difference learning is one useful technique; there are papers on reinforcement learning of proper values for evaluation terms and for extension policies (a minimal sketch of the idea follows the links below).
Automatically acquiring new knowledge is a harder field. There you get into Markov chains, Monte Carlo simulation, Bayesian inference, genetic algorithms ...
Citeseer link:
http://citeseer.ist.psu.edu/cs
Good search subjects:
Temporal difference learning
Yngvi Björnsson
Reinforcement learning
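For a flavour of what those papers do, here is a minimal TD(lambda) weight update for a linear evaluation (a sketch only; the names, sizes and the linear-eval assumption are mine, not from any specific paper):

Code: Select all

#define NUM_FEATURES 64

/* features[t][i] = value of feature i at move t, scores[t] = engine eval
   at move t; for eval(pos) = sum_i w[i] * f_i(pos) the gradient of the
   eval with respect to w[i] is just f_i */
void td_lambda_update(double w[NUM_FEATURES],
                      double features[][NUM_FEATURES],
                      const double scores[], int n,
                      double alpha, double lambda)
{
    double trace[NUM_FEATURES] = {0};   /* eligibility trace */
    for (int t = 0; t < n - 1; t++) {
        double delta = scores[t + 1] - scores[t];   /* temporal difference */
        for (int i = 0; i < NUM_FEATURES; i++) {
            /* decay old credit, add the gradient of the current position */
            trace[i] = lambda * trace[i] + features[t][i];
            w[i] += alpha * delta * trace[i];
        }
    }
}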


MvH Dan Andersson
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: The Art of Evaluation (long)

Post by wgarvin »

Tord Romstad wrote:
Jan Brouwer wrote:One idea I had was to consider an evaluation feature to consist of the feature proper, and of a noise component.
Is it possible to measure how large the noisy part of a particular feature is?
I haven't tried, and I am not even sure I understand the idea correctly. Do you mean that each evaluation term should not only consist of a value, but also an estimate of its possible inaccuracy? Perhaps this might be useful, but I am not quite sure how I would use the information.
One possibility would be to use a sort of "fuzzy eval". Eval could produce both an upper and a lower bound on the score, with the size of the gap between them representing the uncertainty in the evaluation. Some features of the eval might introduce more uncertainty into the position than others. Both of these values would be backed up through the search, and maybe you'd do something clever when comparing them (e.g. if the bounds of one evaluation completely enclose the bounds of the other, you could do some probabilistic thing; otherwise the one with the highest upper bound wins). Maybe you could think of some tricks to make the program favor positions whose eval it has a high degree of certainty about (lowerBound - (difference/4), or something...)

I have no idea if this sort of thing works well or not, and it might not fit nicely into most programs.
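A sketch of what the data types and comparison might look like, taking the rules straight from the post above (all of it illustrative, not a tested scheme):

Code: Select all

typedef struct {
    int lower;   /* pessimistic bound, centipawns */
    int upper;   /* optimistic bound, centipawns  */
} FuzzyScore;

/* the post's "lowerBound - (difference/4)": collapse an interval to a
   single value, penalizing uncertainty */
int fuzzy_collapse(FuzzyScore s)
{
    return s.lower - (s.upper - s.lower) / 4;
}

/* highest upper bound wins; if one interval encloses the other, fall
   back on the collapsed values as the "probabilistic thing" */
int fuzzy_better(FuzzyScore a, FuzzyScore b)
{
    int a_in_b = b.lower <= a.lower && a.upper <= b.upper;
    int b_in_a = a.lower <= b.lower && b.upper <= a.upper;
    if (a_in_b || b_in_a)
        return fuzzy_collapse(a) > fuzzy_collapse(b);
    return a.upper > b.upper;
}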
Tord Romstad
Posts: 1808
Joined: Wed Mar 08, 2006 9:19 pm
Location: Oslo, Norway

Re: The Art of Evaluation (long)

Post by Tord Romstad »

wgarvin wrote:One possibility would be to use a sort of "fuzzy eval". Eval could produce both an upper and a lower bound on the score, with the size of the gap between them representing the uncertainty in the evaluation. Some features of the eval might introduce more uncertainty into the position than others.
Yes, there are some programs which do (or at least did) something like this. Instead of an exact score, the evaluation function returns a probability distribution. This allows some interesting search algorithms quite different from classical alpha-beta. I am not aware of any current top programs which use such techniques, but I think Hans Berliner's old program Hitech did.
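For comparing two such distribution-valued scores, the Gaussian case at least has a closed form (a sketch assuming independent normally distributed scores; nothing here is taken from Hitech itself):

Code: Select all

#include <math.h>

/* P(X > Y) for independent X ~ N(mu_x, s_x^2) and Y ~ N(mu_y, s_y^2):
   X - Y is N(mu_x - mu_y, s_x^2 + s_y^2), so evaluate the normal CDF */
double prob_x_beats_y(double mu_x, double s_x, double mu_y, double s_y)
{
    double mu = mu_x - mu_y;
    double sigma = sqrt(s_x * s_x + s_y * s_y);
    return 0.5 * (1.0 + erf(mu / (sigma * sqrt(2.0))));
}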

Tord
BBauer
Posts: 658
Joined: Wed Mar 08, 2006 8:58 pm

Re: The Art of Evaluation (long)

Post by BBauer »

Hi Tord,

thank you for your very good and clear post.
Many of your thoughts can be found in an article written by C. Donninger in the Swiss magazine KARL.
For example, he finds that removing a bug may make the program play weaker, because the bug is in some sense a part of the program.

In KARL he reports an experiment with programs which he calls OLA_n.
OLA is the name of an ape in Sweden.
OLA_m is a program which searches m moves deep; its eval just returns random values. OLA_n is a program which searches n moves deep, again with a random eval.
He finds that OLA_n is significantly better than OLA_m for n > m.
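The experiment is easy to sketch (hypothetical interface; Position, Move and the move functions stand in for the host engine's own): a fixed-depth negamax whose leaf evaluation is pure noise. That the deeper searcher still wins is often called the Beal effect: maximizing over more random leaves implicitly rewards mobility.

Code: Select all

#include <stdlib.h>

typedef struct Position Position;   /* placeholders for the host engine */
typedef int Move;
int  generate_moves(Position *pos, Move moves[]);
void make_move(Position *pos, Move m);
void unmake_move(Position *pos, Move m);

static int random_eval(void)
{
    return rand() % 2001 - 1000;   /* uniform noise in [-1000, +1000] */
}

int ola_negamax(Position *pos, int depth)
{
    if (depth == 0)
        return random_eval();

    Move moves[256];
    int n = generate_moves(pos, moves);
    int best = -1000000;
    for (int i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        int score = -ola_negamax(pos, depth - 1);
        unmake_move(pos, moves[i]);
        if (score > best)
            best = score;
    }
    return best;
}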

As we do not fully understand a position, we cannot know its exact evaluation. Therefore we have to live with some randomness.
Special evaluation knowledge which should help may lead to special stupidity in other cases. Therefore it may be impossible to make a program significantly stronger just by adding something. IMHO some programmers have noticed this and therefore stopped the development of their engine and started a new program.

kind regards
Bernhard
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: The Art of Evaluation (long)

Post by wgarvin »

Perfection is achieved not when there is nothing left to add, but when there is nothing left to take away.