[Discussion] - Measuring move ordering

Rebel · Post by **Rebel** » Sat Feb 24, 2018 9:35 pm

I am trying to find a formula to measure move-ordering as best as possible and when agreed upon hold an educational (and fun) contest between engine authors.

I can think of 2 systems.

#1. the percentage when the first move of a move-list in the tree remains the best move at the end of the move-list. This might require an extra entry in the stack initialized with -infinity before the first move is searched.

#2. Similar to (#1) but the normal ALPHA check is used.

I simply introduced a third killer slot to do the job and at the end of the move-list check if there has been a change.

I took the first 100 positions of the STS test and the results (at 1 sec) are:

Code: Select all

       system #1                system #2
&#91;1&#93; - 79.30% &#40;79.30%)    &#91;1&#93; - 95.31% &#40;95.31%)
&#91;2&#93; - 77.54% &#40;78.42%)    &#91;2&#93; - 93.17% &#40;94.24%)
&#91;3&#93; - 73.41% &#40;76.75%)    &#91;3&#93; - 86.09% &#40;91.52%)
.....................    .....................
&#91;100&#93; 75.61% &#40;77.07%)  &#91;100&#93; - 91.36% &#40;93.06%)

Additionally I can think of a formula to measure the efficiency of the hash table.

Your thoughts please.

elcabesa · Post by **elcabesa** » Sun Feb 25, 2018 11:34 am

I don't understand what each line of your output means.

Code: Select all

       system #1                system #2
&#91;1&#93; - 79.30% &#40;79.30%)    &#91;1&#93; - 95.31% &#40;95.31%)
&#91;2&#93; - 77.54% &#40;78.42%)    &#91;2&#93; - 93.17% &#40;94.24%)
&#91;3&#93; - 73.41% &#40;76.75%)    &#91;3&#93; - 86.09% &#40;91.52%)
.....................    .....................
&#91;100&#93; 75.61% &#40;77.07%)  &#91;100&#93; - 91.36% &#40;93.06%)

hgm · Post by **hgm** » Sun Feb 25, 2018 1:04 pm

In my experience such statistics is totally meaningless unless you split it up by (remaining) depth. Otherwise it would be totally dominated by QS. While having good move ordering in nodes close to the root is just as important (and much more difficult, as you also have to consider non-captures there). Also, the fact that you usually have a hash move tends to obscure the importance of move ordering; at some point you must have found this move, and it is much more important how much effort that took.

Finally, it might make sense to distinguish between in-check nodes and other nodes, as the usual heuristics for ordering usually do not work very whel when in check. And a lot there depends on whether you have legal or pseudo-legal generation; if you would count all the pseudo-legal moves that are not valid evasions, in-check nodes would seem very bad. But of course all these moves require at most a 1-node tree to refute, no matter what the requested depth is. OTOH, trying an inadequate legal move when in check is very expensive, because it is extended.

Rebel · Post by **Rebel** » Sun Feb 25, 2018 4:51 pm

elcabesa wrote:I don't understand what each line of your output means.

Code: Select all

       system #1                system #2
&#91;1&#93; - 79.30% &#40;79.30%)    &#91;1&#93; - 95.31% &#40;95.31%)
&#91;2&#93; - 77.54% &#40;78.42%)    &#91;2&#93; - 93.17% &#40;94.24%)
&#91;3&#93; - 73.41% &#40;76.75%)    &#91;3&#93; - 86.09% &#40;91.52%)
.....................    .....................
&#91;100&#93; 75.61% &#40;77.07%)  &#91;100&#93; - 91.36% &#40;93.06%) 
           
 |      |       |
 |      |       |
 |      |       --> average percentage pos 1-100
 |      |
 |      --> percentage the first move was the best move
 |
 ---> position

Rebel · Post by **Rebel** » Sun Feb 25, 2018 5:00 pm

hgm wrote:In my experience such statistics is totally meaningless unless you split it up by (remaining) depth. Otherwise it would be totally dominated by QS. While having good move ordering in nodes close to the root is just as important (and much more difficult, as you also have to consider non-captures there). Also, the fact that you usually have a hash move tends to obscure the importance of move ordering; at some point you must have found this move, and it is much more important how much effort that took.

Finally, it might make sense to distinguish between in-check nodes and other nodes, as the usual heuristics for ordering usually do not work very whel when in check. And a lot there depends on whether you have legal or pseudo-legal generation; if you would count all the pseudo-legal moves that are not valid evasions, in-check nodes would seem very bad. But of course all these moves require at most a 1-node tree to refute, no matter what the requested depth is. OTOH, trying an inadequate legal move when in check is very expensive, because it is extended.

Forgot to say (indeed) that QS nodes doesn't count, only the main search. Of course the hash table is part of the quality of the move ordering, I don't see a problem.

elcabesa · Post by **elcabesa** » Sun Feb 25, 2018 6:27 pm

ops sorry, it was easy to understand.

some other questions:
1) are you considering only cutoff node or every node in your search?
2) in system 1, are you testing all the move-list disabling beta cutoff or other prunings?
3) system2 means how many times 1° move is enough to cause a cutoff?

thank you

hgm · Post by **hgm** » Sun Feb 25, 2018 7:13 pm

Rebel wrote:Forgot to say (indeed) that QS nodes doesn't count, only the main search. Of course the hash table is part of the quality of the move ordering, I don't see a problem.

Well, d=1 is usually also completely different from higher depths.

jwes · Post by **jwes** » Sun Feb 25, 2018 7:35 pm

Rebel wrote:I am trying to find a formula to measure move-ordering as best as possible and when agreed upon hold an educational (and fun) contest between engine authors.

I can think of 2 systems.

#1. the percentage when the first move of a move-list in the tree remains the best move at the end of the move-list. This might require an extra entry in the stack initialized with -infinity before the first move is searched.

#2. Similar to (#1) but the normal ALPHA check is used.

I simply introduced a third killer slot to do the job and at the end of the move-list check if there has been a change.

I took the first 100 positions of the STS test and the results (at 1 sec) are:
Code: Select all
       system #1                system #2
&#91;1&#93; - 79.30% &#40;79.30%)    &#91;1&#93; - 95.31% &#40;95.31%)
&#91;2&#93; - 77.54% &#40;78.42%)    &#91;2&#93; - 93.17% &#40;94.24%)
&#91;3&#93; - 73.41% &#40;76.75%)    &#91;3&#93; - 86.09% &#40;91.52%)
.....................    .....................
&#91;100&#93; 75.61% &#40;77.07%)  &#91;100&#93; - 91.36% &#40;93.06%)
Additionally I can think of a formula to measure the efficiency of the hash table.

Your thoughts please.

There are two other measures I have thought about.
1. Average number of moves looked at before the best move is found, e.g. if the best move is found first 90% and second 10%, the average is 1.1 or if the best move is found first 90% and tenth 10%, the average is 2.
2. Number of nodes searched when the best move is searched / number of nodes searched when the all moves are searched. This seems most directly correlated with improvements in move ordering though the effect of the hash table is difficult to quantify.

jdart · Post by **jdart** » Mon Feb 26, 2018 1:53 am

I have some instrumentation I can turn on that will print out the % of times the 1st, 2nd, 3rd or 4th move was returned from the search and improved the initial alpha value (node->best_score > node->alpha).

Usually after the 4th move you are looking at small percentages.

--Jon

Cardoso · Post by **Cardoso** » Fri Mar 02, 2018 6:46 pm

I only tested the percentage of fail highs for the first 4 moves and returning immediately on a fail high.
So I can have 87% 5% 2% and 1% for the first 4 moves.
However these readings can change drastically if I only measure this if the remaining depth > 5*PLY
I'm at work so I can't test to see the exact differences but if I only measure this at depths > 5*PLY I can get something like:
95% 2% 1% and 1% for the first 4 moves.
And this makes sense to me, depth=0 dominates the statistics, and how good can be an hash move (killers/counter moves/history moves) at depth=0?
Test this yourself and see the differences.
And if you only measure at depths > 10*PLY then you will get even better move ordering, so depth really influences the readings.

[Discussion] - Measuring move ordering

[Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering

Re: [Discussion] - Measuring move ordering