Rebel wrote: Don wrote: I personally believe that Komodo has the best evaluation function of any chess program in the world.
I see a new form of (fun!) competition arising on the horizon: who has the best eval?
Its basic framework:
1. Root search (1-ply) only with standard QS.
2. QS needs to be defined by mutual agreement.
3. No extensions allowed.
25,000 - 50,000 games (or so) to weed out most of the noise caused by the lack of search.
Details to be worked out of course.
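As a rough illustration, the frame proposed above (a 1-ply root search where each root move is answered only by QS) might look like the following toy sketch. The positions, move generator and capture generator here are placeholders, not real chess; the stand-pat QS shape is one common formulation, assumed rather than taken from the post:

```python
# Toy sketch of the proposed test frame, under the assumption that
# "1-ply root search with standard QS" means: try each root move once,
# then answer it with a captures-only quiescence search.

def quiescence(pos, alpha, beta, evaluate, captures):
    """Stand-pat QS: take the static eval, then try only captures."""
    stand_pat = evaluate(pos)
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for child in captures(pos):
        score = -quiescence(child, -beta, -alpha, evaluate, captures)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

def root_search_1ply(pos, moves, evaluate, captures):
    """1-ply root search: each root move is scored by QS only."""
    best = (None, -10**9)
    for move, child in moves(pos):
        score = -quiescence(child, -10**9, 10**9, evaluate, captures)
        if score > best[1]:
            best = (move, score)
    return best

# Demo on a fake 3-move "position": with no captures, QS reduces to the
# static eval of each child (from the opponent's point of view).
EVAL = {'a': 5, 'b': -3, 'c': 1}
demo_moves = lambda pos: [('ma', 'a'), ('mb', 'b'), ('mc', 'c')]
demo_captures = lambda pos: []
best = root_search_1ply('root', demo_moves, EVAL.get, demo_captures)
```

Since there are no captures in the demo, the best root move is simply the one whose child position is worst for the opponent.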
Here are just some thoughts on that:
What does "best" mean?
Here are some qualities I believe a good evaluation function should include,
which are somehow essential to any definition of "best",
but which would not be measured in such a test frame.
* The evaluation "score" is not the only search controller.
Many evaluations may set flags to allow nullmove, LMR or whatever,
controlled by small parts (e.g. kingSafety) of the complete evaluation.
Such things won't be measured in that frame (one-ply search).
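A hedged sketch of what such eval-driven search control could look like: besides the score, the eval returns flags that the search would use to gate null move and LMR. The field names and thresholds (king_safety, -50, 300) are invented for illustration; the point is that a 1-ply frame never consults these flags at all.

```python
# Hypothetical: an evaluation that also acts as a search controller.
# A 1-ply test frame would only ever see the score, never the flags.

def evaluate_with_flags(pos):
    material = pos["material"]
    king_safety = pos["king_safety"]        # partial eval term
    score = material + king_safety
    # Side effects on search control, invisible to a 1-ply frame:
    allow_null = king_safety > -50          # no null move when in danger
    allow_lmr = abs(score) < 300            # no reductions in sharp spots
    return score, allow_null, allow_lmr
```

Here an exposed king would switch off null-move pruning even though the returned score alone looks harmless.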
* Another point is move ordering. Ordering the moves at the root
is simply done differently than in the complete search. In a search
tree, heuristics like history are of course affected by the evaluation, too.
It doesn't matter whether the influence is direct or somehow indirect.
This test would fail here too.
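For readers unfamiliar with it, the history heuristic mentioned here can be sketched like this (the depth-squared bonus is one common choice, not a fixed rule): moves that cause beta cutoffs get a counter raised, and later ordering sorts by that counter. Which moves cut off depends on the evaluation, so the eval shapes the ordering indirectly.

```python
# Minimal history-heuristic sketch. The eval decides (via cutoffs)
# which moves get rewarded, so it indirectly drives move ordering.

history = {}

def record_cutoff(move, depth):
    """Reward a cutoff move; deeper cutoffs get a bigger bonus."""
    history[move] = history.get(move, 0) + depth * depth

def order_moves(moves):
    """Sort moves by accumulated history score, best first."""
    return sorted(moves, key=lambda m: history.get(m, 0), reverse=True)

record_cutoff('e2e4', 3)   # cutoff at depth 3 -> bonus 9
record_cutoff('d2d4', 2)   # cutoff at depth 2 -> bonus 4
```

None of this machinery is exercised by a root-only search, which is exactly the objection.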
* Not to mention the implementation of features with respect to their time
cost...
* Having a common QS is by far not enough. There are many other "little"
things which have to be considered, IMO, e.g.: searching all moves with an
open window versus using null-window search.
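The open-window versus null-window distinction can be illustrated with a toy root loop over precomputed "true" scores (a stand-in for a real search). The PVS-style version sketched here probes later moves with a zero-width window and re-searches only on a fail-high, so the amount of work done, and sometimes the exact scores seen, differ between the two schemes; this is my own illustrative construction, not anything from the post:

```python
# Toy PVS-style root loop. search() fakes a fail-soft search by
# clamping a precomputed true score into the given window, and we
# count calls to show the extra work structure of null windows.

def pvs_root(moves, true_score):
    searches = 0
    def search(move, alpha, beta):
        nonlocal searches
        searches += 1
        return max(alpha, min(beta, true_score[move]))
    best, alpha = None, float("-inf")
    for i, move in enumerate(moves):
        if i == 0:
            s = search(move, float("-inf"), float("inf"))
        else:
            s = search(move, alpha, alpha + 1)     # null-window probe
            if s > alpha:                          # fail high: re-search
                s = search(move, s, float("inf"))
        if s > alpha:
            best, alpha = move, s
    return best, alpha, searches
```

An open-window loop would search every move with the full window; here the first and second moves each cost a full search, but the third is dismissed by a single cheap probe.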
Finally, IMO the better solution would be to have, let's say, a
common engine with an "interface" for the evaluation.
That way, we would come closer to measuring what the effect of the evaluation finally is.
(But again, this would not address some of the thoughts I already
mentioned above.)
Well, the intention to know which evaluation represents the "best
chess knowledge" is only one attribute, but unfortunately it does not
suffice to tell us anything about the evaluation's real target(s):
namely, to guide the search in the right direction as fast as possible,
or simply to transform a tree into a perfectly ordered tree (as well as
possible). I am simply not sure that the best chess knowledge and guiding
the search in the right direction are really congruent aims.
I really would like to see such a "fun" test, BUT concluding from it which
is the "best" evaluation will simply fail.
If all these points don't matter at all, I would suggest creating a
common engine (EvalTester)
with an interface for the evaluation components.
Then you could say we are as close as possible, with the only
difference produced by the evaluation score.
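A minimal sketch of that EvalTester idea, assuming a plug-in interface whose only job is to return a score (the class names and the dict-based position are invented for illustration; the shared search is stubbed out):

```python
# Hypothetical EvalTester skeleton: one common engine, evaluations
# supplied as interchangeable plug-ins, so the only difference between
# contestants is the score their eval returns.
from abc import ABC, abstractmethod

class Evaluation(ABC):
    @abstractmethod
    def score(self, pos):
        """Static score from the side to move's point of view."""

class MaterialOnly(Evaluation):
    """Trivial example contestant: counts material only."""
    def score(self, pos):
        return pos["material"]

def run_engine(pos, evaluation):
    # A real EvalTester would run the common search here; this stub
    # just forwards the position to the plugged-in evaluation.
    return evaluation.score(pos)
```

Swapping in a different `Evaluation` subclass would then be the only variable between two test runs.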
Just some ad-hoc thoughts...
Michael