kbhearn wrote:The problem with fixed depth testing is it bears no relation to time at all. i.e. an engine that does a relatively wide depth 12 search no matter what the time it takes will have an advantage over an engine that would get to depth 24 in the same time using aggressive pruning because the depth 24 engine would have to stop at 12.
Hello Kevin,
The fixed depth method is best used for testing an engine against itself. Fixed depth is a very good way of exposing bad bugs in engine patches, not of producing good games.
While this is a little extreme, Null Move Pruning for instance would be a loss under fixed depth.
Good point. It depends on how the engine treats fixed depth. In my engines, the fixed depth is treated as a limit on the root iterative deepening level and is tested only at the root. The depth limit does not affect extensions or the null move search, and there is no depth test inside the search itself (except for the maximum depth of 96).
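A minimal Python sketch of that arrangement follows. It is not taken from any real engine: the toy position model (`evaluate`, `children`, `is_forcing`) and the extension rule are hypothetical stand-ins, and null move search is left out for brevity. Only the two ideas from the description above are meant literally: the fixed depth is checked in the root iteration loop alone, and the recursive search knows nothing about it apart from a hard cap of 96 plies.

```python
INF = float("inf")
MAX_PLY = 96  # hard cap inside the search, as described above


def evaluate(state):
    # Hypothetical static evaluation: a deterministic pseudo-score in [-100, 100].
    return (state * 2654435761) % 201 - 100


def children(state):
    # Hypothetical move generator: every position has exactly three successors.
    return [state * 3 + k for k in (1, 2, 3)]


def is_forcing(state):
    # Hypothetical extension trigger, standing in for checks, recaptures, etc.
    return state % 5 == 0


def search(state, depth, alpha, beta, ply, stats):
    # Negamax with alpha-beta. Note that the fixed depth setting never appears here.
    stats["nodes"] += 1
    stats["max_ply"] = max(stats["max_ply"], ply)
    if ply >= MAX_PLY:
        return evaluate(state)
    if is_forcing(state):
        depth += 1  # extension, applied regardless of the fixed depth setting
    if depth <= 0:
        return evaluate(state)
    best = -INF
    for child in children(state):
        score = -search(child, depth - 1, -beta, -alpha, ply + 1, stats)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best


def iterate(root, fixed_depth):
    # The fixed depth is tested only here, as the bound of the root iteration loop.
    stats = {"nodes": 0, "max_ply": 0}
    score, completed = None, 0
    for depth in range(1, fixed_depth + 1):
        score = search(root, depth, -INF, INF, 0, stats)
        completed = depth
    return score, completed, stats


if __name__ == "__main__":
    score, completed, stats = iterate(root=1, fixed_depth=4)
    print("completed depth:", completed)
    print("deepest ply reached:", stats["max_ply"])
    print("nodes searched:", stats["nodes"])
```

With this structure, a "depth 4" run still reaches plies beyond 4 wherever an extension fires, so pruning and extension logic are exercised the same way they would be under a time control.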
Furthermore, positions where you'd usually get more depth than others, such as some endgames, don't get to take advantage of that, so you don't get a realistic view of how strong your engine would actually be at a time control where it reaches the fixed depth you picked in the middlegame.
True, but of course a fixed depth search should never be expected to test anything related to time control. The games produced are corrupt. The purpose of the test is not to produce strong games, but to expose bad bugs. The games should never be entered into a database for openings or used for calculating ratings. However, small fixed depths (depth=4) are much easier to trace for errors. It is easy to see where the engine's "plan" went wrong.
Using nodes as a time control is a bit better. It still can't be used fairly between different engines, but changes in your own engine won't typically affect NPS much, and it still has the same advantage for non-SMP tests of being able to compare results generated on different hardware or under unfair running conditions. It also wouldn't reflect NPS speedups/slowdowns in various phases of the game and would thus create a bit of skew that way, but less so than ignoring the branching factor changes in phases of the game. Both the WinBoard and UCI protocols have extensions to support such a mode, I believe.
I have not experimented with a node count limit yet. Why would it expose the differences between two engines any better than a fixed depth search? A node count cutoff means that up to 90% of the search could be wasted if the search cuts off before the current iteration is completed. A fixed depth search uses 100% of the nodes and there is no waste.
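The waste in question can be measured rather than guessed. The following Python sketch is a toy, not engine code: the position model (`evaluate`, `children`), the `NodeLimit` exception and the function names are all hypothetical, and it uses plain negamax without pruning so the node counts are easy to check by hand. It runs iterative deepening under a node budget, discards the unfinished iteration, and reports how many nodes that iteration consumed.

```python
class NodeLimit(Exception):
    """Raised when the node budget is exhausted in the middle of a search."""


def evaluate(state):
    # Hypothetical static evaluation: a deterministic pseudo-score in [-100, 100].
    return (state * 2654435761) % 201 - 100


def children(state):
    # Hypothetical move generator: every position has exactly three successors.
    return [state * 3 + k for k in (1, 2, 3)]


def search(state, depth, budget, stats):
    # Plain negamax (no pruning) that aborts once the node budget is used up.
    if stats["nodes"] >= budget:
        raise NodeLimit
    stats["nodes"] += 1
    if depth == 0:
        return evaluate(state)
    return max(-search(c, depth - 1, budget, stats) for c in children(state))


def iterate_with_node_limit(root, budget, max_depth=96):
    stats = {"nodes": 0}
    completed_depth, best_score, nodes_at_completion = 0, None, 0
    for depth in range(1, max_depth + 1):
        try:
            score = search(root, depth, budget, stats)
        except NodeLimit:
            break  # the unfinished iteration is thrown away in this sketch
        completed_depth, best_score = depth, score
        nodes_at_completion = stats["nodes"]
    wasted = stats["nodes"] - nodes_at_completion
    return completed_depth, best_score, wasted, stats["nodes"]


if __name__ == "__main__":
    depth, score, wasted, total = iterate_with_node_limit(root=1, budget=100)
    print(f"completed depth {depth}, wasted {wasted} of {total} nodes")
```

In this toy tree, iterations 1 to 3 cost 4 + 13 + 40 = 57 nodes, so a budget of 100 leaves 43 nodes spent on an iteration that never finishes. How much is really lost in a given engine depends on its design: this sketch discards the partial iteration entirely, whereas an engine that keeps a best move found before the cutoff would lose less.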