bob wrote:
It makes testing far more intractable than if you just accurately test each change the first time. Because now you are faced with rerunning past tests by removing one feature at a time. And hoping your inaccuracies won't cover up the problem a second time.

Well, performance usually doesn't come for free. It sounds like a small price to pay for greatly accelerating the development rate. Like I said, one would have to re-test accepted changes after some time anyway, if the engine changes so much that non-linear interaction between the features becomes important.
In uMax 4.0 the recapture extension gave an improvement. After I added an all-capture QS, instead of a recapture-only QS, the recapture extension was suddenly counter-productive. One can never take anything for granted.
Anyway, I am just pointing out that methods that take backward steps can be very viable, and for many problems superior to methods that try to prevent such steps at any cost. There is a continuous range of confidence, and there is an optimum along the confidence scale. Being satisfied with too low a confidence is counterproductive because there are too many backward steps. Asking for too much confidence is counterproductive because individual steps take too long. The optimum is a compromise, and where exactly that compromise lies is a function of both the relative cost of testing and implementing new ideas (very different for a 256 Xeon cluster and a single laptop) and the steepness of the learning curve to which it is applied (very different for a 1600 Elo engine than for a 2700 Elo engine).
For people on one end of the scale, 100-game tests can easily be the best solution, provided you are aware of the risks and adopt a strategy to live with them. A 100-Elo improvement scores nominally 64%, which is 2.5 sigma for a difference between two 100-game gauntlets. That means you will pick it out with 99% confidence. You are not interested in 1-Elo improvements anyway; they won't bring you from 1600 to 2700...
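For anyone who wants to check that arithmetic, here is a rough sketch in Python. Two things in it are my assumptions, not part of the argument above: a per-game standard deviation of 0.4 (a typical value when a fair share of the games are draws), and an old version that scores 50% against the gauntlet. The function names are made up for the illustration.

```python
import math

def expected_score(elo_diff):
    """Nominal score fraction for a given Elo advantage (logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def sigma_of_difference(games, per_game_sd=0.4):
    """Standard deviation of the difference between two independent
    gauntlet results of `games` games each.
    per_game_sd=0.4 is an assumed value, not a measured one."""
    standard_error = per_game_sd / math.sqrt(games)
    return math.sqrt(2.0) * standard_error

def one_sided_confidence(z):
    """Probability that a normal variable falls below z standard deviations."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

baseline = 0.5                    # assumed: old version scores 50%
improved = expected_score(100)    # new version, 100 Elo stronger
z = (improved - baseline) / sigma_of_difference(100)

print(f"expected score: {improved:.3f}")
print(f"difference in sigma: {z:.2f}")
print(f"one-sided confidence: {one_sided_confidence(z):.4f}")
```

Under those assumptions the difference comes out close to 2.5 sigma and the one-sided confidence a little above 99%, in line with the numbers quoted. With a different draw rate the per-game standard deviation shifts, and the sigma count shifts with it.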
Always optimize your methodology to the problem at hand.