I am trying to find a formula to measure move-ordering as best as possible and when agreed upon hold an educational (and fun) contest between engine authors.
I can think of 2 systems.
#1. the percentage when the first move of a move-list in the tree remains the best move at the end of the move-list. This might require an extra entry in the stack initialized with -infinity before the first move is searched.
#2. Similar to (#1) but the normal ALPHA check is used.
I simply introduced a third killer slot to do the job and at the end of the move-list check if there has been a change.
I took the first 100 positions of the STS test and the results (at 1 sec) are:
In my experience such statistics is totally meaningless unless you split it up by (remaining) depth. Otherwise it would be totally dominated by QS. While having good move ordering in nodes close to the root is just as important (and much more difficult, as you also have to consider non-captures there). Also, the fact that you usually have a hash move tends to obscure the importance of move ordering; at some point you must have found this move, and it is much more important how much effort that took.
Finally, it might make sense to distinguish between in-check nodes and other nodes, as the usual heuristics for ordering usually do not work very whel when in check. And a lot there depends on whether you have legal or pseudo-legal generation; if you would count all the pseudo-legal moves that are not valid evasions, in-check nodes would seem very bad. But of course all these moves require at most a 1-node tree to refute, no matter what the requested depth is. OTOH, trying an inadequate legal move when in check is very expensive, because it is extended.
hgm wrote:In my experience such statistics is totally meaningless unless you split it up by (remaining) depth. Otherwise it would be totally dominated by QS. While having good move ordering in nodes close to the root is just as important (and much more difficult, as you also have to consider non-captures there). Also, the fact that you usually have a hash move tends to obscure the importance of move ordering; at some point you must have found this move, and it is much more important how much effort that took.
Finally, it might make sense to distinguish between in-check nodes and other nodes, as the usual heuristics for ordering usually do not work very whel when in check. And a lot there depends on whether you have legal or pseudo-legal generation; if you would count all the pseudo-legal moves that are not valid evasions, in-check nodes would seem very bad. But of course all these moves require at most a 1-node tree to refute, no matter what the requested depth is. OTOH, trying an inadequate legal move when in check is very expensive, because it is extended.
Forgot to say (indeed) that QS nodes doesn't count, only the main search. Of course the hash table is part of the quality of the move ordering, I don't see a problem.
some other questions:
1) are you considering only cutoff node or every node in your search?
2) in system 1, are you testing all the move-list disabling beta cutoff or other prunings?
3) system2 means how many times 1° move is enough to cause a cutoff?
Rebel wrote:Forgot to say (indeed) that QS nodes doesn't count, only the main search. Of course the hash table is part of the quality of the move ordering, I don't see a problem.
Well, d=1 is usually also completely different from higher depths.
Rebel wrote:I am trying to find a formula to measure move-ordering as best as possible and when agreed upon hold an educational (and fun) contest between engine authors.
I can think of 2 systems.
#1. the percentage when the first move of a move-list in the tree remains the best move at the end of the move-list. This might require an extra entry in the stack initialized with -infinity before the first move is searched.
#2. Similar to (#1) but the normal ALPHA check is used.
I simply introduced a third killer slot to do the job and at the end of the move-list check if there has been a change.
I took the first 100 positions of the STS test and the results (at 1 sec) are:
Additionally I can think of a formula to measure the efficiency of the hash table.
Your thoughts please.
There are two other measures I have thought about.
1. Average number of moves looked at before the best move is found, e.g. if the best move is found first 90% and second 10%, the average is 1.1 or if the best move is found first 90% and tenth 10%, the average is 2.
2. Number of nodes searched when the best move is searched / number of nodes searched when the all moves are searched. This seems most directly correlated with improvements in move ordering though the effect of the hash table is difficult to quantify.
I have some instrumentation I can turn on that will print out the % of times the 1st, 2nd, 3rd or 4th move was returned from the search and improved the initial alpha value (node->best_score > node->alpha).
Usually after the 4th move you are looking at small percentages.
I only tested the percentage of fail highs for the first 4 moves and returning immediately on a fail high.
So I can have 87% 5% 2% and 1% for the first 4 moves.
However these readings can change drastically if I only measure this if the remaining depth > 5*PLY
I'm at work so I can't test to see the exact differences but if I only measure this at depths > 5*PLY I can get something like:
95% 2% 1% and 1% for the first 4 moves.
And this makes sense to me, depth=0 dominates the statistics, and how good can be an hash move (killers/counter moves/history moves) at depth=0?
Test this yourself and see the differences.
And if you only measure at depths > 10*PLY then you will get even better move ordering, so depth really influences the readings.