I hope this means that Komodo's move ordering could be improved. This would be really good news for me. It looks like your numbers are better than mine. But somehow I expect that it will not turn out that way. Probably some detail of how pruning or LMR is done will make this invalid for comparison from one program to another.
I think we are overlooking a critical issue. To obtain a cut off, you can spend one move or more, and you are measuring the ratio between those two options. However, in many cases you can spend ZERO moves to obtain a cut off, and that is not being taken into account. When? Hash table cut offs, null move cut offs, and pruning methods based on eval (stand-pat-like methods). The better the engine, the more cut offs are obtained with ZERO moves. That of course comes at the expense of the number of cut offs with ONE move, so it is reasonable that Komodo gets a lower efficiency ratio. We should include a ratio of cut offs with ZERO moves compared to cut offs with one or more.
If we have
Engine A (cut offs at moves)
0: **********************
1: ***************
2: ***
3: *
4: *
etc
Engine B (cut offs at moves)
0: *****
1: **************************
2: ***
3: *
4: *
etc
Engine A is better than B but with the ratio we are talking about, it looks worse since the column at "0" is not taken into account.
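To make the point concrete, here is a minimal Python sketch of the two ratios. The counts are invented to mimic the shape of the two histograms above, not measured from any engine, and the function names `first_move_ratio` and `zero_move_share` are hypothetical.

```python
# Cutoff histograms: key = number of moves searched before the cutoff.
# Key 0 stands for hash, null-move and stand-pat style cutoffs.
# The counts are illustrative only, loosely following the sketch above.
engine_a = {0: 22, 1: 15, 2: 3, 3: 1, 4: 1}
engine_b = {0: 5, 1: 26, 2: 3, 3: 1, 4: 1}


def first_move_ratio(hist):
    """Share of first-move cutoffs among cutoffs that needed >= 1 move."""
    with_moves = sum(n for moves, n in hist.items() if moves >= 1)
    return hist.get(1, 0) / with_moves if with_moves else 0.0


def zero_move_share(hist):
    """Share of all cutoffs that were obtained without searching a move."""
    total = sum(hist.values())
    return hist.get(0, 0) / total if total else 0.0


for name, hist in (("A", engine_a), ("B", engine_b)):
    print(name, round(first_move_ratio(hist), 3), round(zero_move_share(hist), 3))
```

With these made-up counts, B has the higher first-move ratio while A gets a much larger share of its cutoffs for free, which is exactly the effect the first-move ratio alone hides.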
Don wrote:I will propose a really simple alternative. Suppose we simply average the move number of the cutoff? Komodo doesn't know how many legal moves there are in a position because we don't generate them unless we have to, so HG's idea is not very workable. I could still do it the HG way, but I would have to generate and count, which would slow down the program a lot and be a little messy.
So I propose we try just averaging the number of moves we had to play to get a cutoff - if no cutoff we don't average anything.
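A minimal sketch of such a counter, in Python for brevity. The class and method names are hypothetical; in a real engine the `record_cutoff` call would sit at the point in the search where a move fails high.

```python
class CutoffStats:
    """Running average of the move number that produced a beta cutoff."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Reset between positions so runs do not contaminate each other.
        self.cutoffs = 0
        self.move_sum = 0

    def record_cutoff(self, move_number):
        # move_number is 1-based: 1 means the first move searched cut off.
        self.cutoffs += 1
        self.move_sum += move_number

    def average(self):
        # Nodes without a cutoff never call record_cutoff, so they are
        # simply left out of the average. Returns None if nothing was recorded.
        return self.move_sum / self.cutoffs if self.cutoffs else None


stats = CutoffStats()
for move_number in (1, 1, 2, 4):
    stats.record_cutoff(move_number)
print(stats.average())  # 2.0
```

No legal-move count is needed here, only the index of the move being searched, so it fits an engine that generates moves lazily.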
Here are those same 10 positions when I take the AVERAGE move number of the cutoff. I think this will be much more meaningful. (I also clear the data between runs.)
I would expect any reasonable program to get an average cutoff of less than 2 easily. And I'm not sure even this method is ideal but it's surely better. Note: the best you could score on this is 1.0 and only if you always get a cutoff on the first move:
michiguel wrote:I think we are overlooking a critical issue. To obtain a cut off, you can spend one move or more, and you are measuring the ratio between those two options. However, in many cases you can spend ZERO moves to obtain a cut off, and that is not being taken into account. When? Hash table cut offs, null move cut offs, and pruning methods based on eval (stand-pat-like methods). The better the engine, the more cut offs are obtained with ZERO moves. That of course comes at the expense of the number of cut offs with ONE move, so it is reasonable that Komodo gets a lower efficiency ratio. We should include a ratio of cut offs with ZERO moves compared to cut offs with one or more.
Suggestion, we only measure till move_count=5. If we are at the end of the move list we add 6.
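One possible reading of this suggestion, sketched in Python: cutoffs on moves 1 to 5 count as they are, and anything later, or a node that runs out of moves without a cutoff, counts as 6. The function names and the use of `None` for "no cutoff" are assumptions made for illustration.

```python
def capped_sample(cutoff_move, cap=5):
    """Value contributed by one node.

    cutoff_move is the 1-based move number that cut off, or None if the
    move list was exhausted without a cutoff (an assumed convention).
    """
    if cutoff_move is None or cutoff_move > cap:
        return cap + 1
    return cutoff_move


def capped_average(samples, cap=5):
    """Average of the capped values, or None if there are no samples."""
    values = [capped_sample(move, cap) for move in samples]
    return sum(values) / len(values) if values else None


print(capped_average([1, 2, 20, None]))  # 3.75
```

Under this reading a cutoff at move 8 and one at move 20 both contribute 6, so the measure cannot tell them apart.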
This just measures the % of cutoffs caused by the first move, right? That's part of Jazz' standard output (suppressed under XBoard though). For this set of positions the lowest I get is 87.6% (position 10) and the highest is 94.4% (position 8). Most of the other ones are 89% or better.
I have to ask though: how serious is the "best move" listed here? Jazz only "gets" 2/10 of them.
Why would you do that? Seems to me this completely spoils any significance of the measurement. What makes the difference between a good and a very poor move ordering is whether in nodes where null, hash and killers do not work you will get your cutoff on average after 8 moves or after 20. Almost every engine does the first 6 moves in exactly the same order anyway.
Some other points of concern:
Engines might do null-move pruning in a different part of the code than beta cutoffs on normal moves (as you need a different MakeMove routine for null move, and you don't have to increase alpha if the null move scores above it, etc.). Should this code patch also count cutoffs due to null move, and should the null move be counted in the node's move_count?
How about stand-pat cutoffs? Should these be counted as a cutoff after 0 moves?
Evert wrote:That's part of Jazz' standard output (suppressed under XBoard though).
Note that if you consider this important info, there is nothing against printing it in the Thinking Output as part of the PV field (as long as you enclose it in braces, so it will look like a PGN comment when XBoard tries to parse the PV).
If you don't want to print it with every new PV, but (say) only after the search, you can emit a dummy Thinking Output line that has score = nodes = time = 0 (so just the latest depth and the PV field), and XBoard will treat the PV field as an 'info string', displaying it in the Engine Output window over the full width, ignoring it for the purpose of PV parsing.
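A small Python sketch of both variants, assuming the usual XBoard Thinking Output layout of depth, score, time (centiseconds), nodes and PV on one line. The helper names and all the numbers are made up for illustration.

```python
def thinking_line(depth, score_cp, time_cs, nodes, pv, comment=None):
    """Build one Thinking Output line; an optional comment goes in braces."""
    fields = [str(depth), str(score_cp), str(time_cs), str(nodes), pv]
    if comment:
        # Braces make the text look like a PGN comment inside the PV field.
        fields.append("{" + comment + "}")
    return " ".join(fields)


def info_line(depth, text):
    """Dummy line with score = time = nodes = 0, carrying free text as the PV."""
    return thinking_line(depth, 0, 0, 0, text)


# Statistics attached to a normal PV line (values are invented).
print(thinking_line(12, 35, 148, 903211, "e2e4 e7e5", "first-move cutoffs 91.2%"))
# Statistics printed once after the search as a dummy line.
print(info_line(12, "avg cutoff move 1.31"))
```

The first form repeats the statistics with every new PV; the second prints them once, after the search.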
I have to ask though: how serious is the "best move" listed here? Jazz only "gets" 2/10 of them.
iCE finds 5, Dirty 6 and Texel (I think) 7.
In one of the unsolved positions I gave iCE a bit more time and it found the move with an additional ply (48 seconds). So I guess the given best moves are somehow correct.
I suspect that an engine with only material evaluation is going to have better move ordering by your definition because in most cases it is going to start with a move that does not lose material based on the depth that it can see so the move is going to create a cutoff.
A more extreme example is an engine that evaluates every position that is not checkmate as a draw.
I suspect that this engine would almost always show a cutoff on the first move, and that does not mean it has good move ordering.