jdart wrote: That sort of makes sense, although I have found that STC results are a bad predictor of LTC results, so using them to select a patch is iffy.
Hi Jon,
Yes, basically I agree: you have to be aware of scaling. As an example, take the recent test of asmFish by Stefan Pohl. The pure speed advantage of asmFish has less Elo impact under his new test conditions: longer time controls, but I think the book was different as well. If we ignore the book, I think it is true that any patch that differs only in speed would have more trouble passing LTC.
In the case I remember, several versions were tested: that was done by Marco to tune some numerical parameters of the search patch that was later called 'improving'. There was a clear pattern visible at STC, and you can play many more games in the same time at STC. Another property of Fishtest is that it runs on many different machines, so testing at STC actually covers a range of time controls; that reduces possible effects of, for instance, one very specific time control just reaching or not reaching a certain depth.
Statistically, the time control should not matter as far as the confidence intervals are concerned. But as the number of draws increases, test results profit from that increase, at least under the theory of Bayeselo, which puts more value on draws. That is more theoretical, though, and some people would question it as a reason to trust long time controls more. I just thought this up; I'm still not clear myself what is involved here (other than some things scaling or not scaling, of course).
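To make the point about draws a bit more concrete, here is a rough sketch. Note that it does not use the Bayeselo model; it only uses the plain trinomial (win/draw/loss) variance and the usual logistic Elo formula, and the game counts are made up purely for illustration. It compares two hypothetical matches with the same number of games and the same score, but different draw ratios.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, half-width of the ~95% interval) for a W/D/L result.

    Assumption: games are independent, so the per-game variance of the
    score is E[X^2] - E[X]^2 with X in {1, 0.5, 0}.
    """
    n = wins + draws + losses
    w, d = wins / n, draws / n
    score = w + 0.5 * d
    variance = w + 0.25 * d - score ** 2
    stderr = math.sqrt(variance / n)
    low = elo_from_score(score - z * stderr)
    high = elo_from_score(score + z * stderr)
    return elo_from_score(score), (high - low) / 2.0

# Hypothetical results: both matches have 10000 games and a 51% score.
elo_a, margin_a = elo_with_margin(3100, 4000, 2900)  # 40% draws
elo_b, margin_b = elo_with_margin(1600, 7000, 1400)  # 70% draws

print(f"40% draws: {elo_a:+.2f} Elo, +/- {margin_a:.2f}")
print(f"70% draws: {elo_b:+.2f} Elo, +/- {margin_b:.2f}")
```

With the score held fixed, the match with more draws should show the narrower error bar, because a draw contributes less variance than a decisive game. This says nothing about whether the Elo difference itself stays the same at longer time controls; that is the scaling question again.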
Well, under that tab are mainly tests from Jean-Francois Romang, but the number of games he could play on his own computer was not enough. Later Stefan Pohl came along with his very valuable testing!
I was referring to the regression tests done against the official version of Stockfish, now Stockfish 8. The idea is that it is a bit weaker, so it can simulate weaker engines, and it acts as a fixed point in time. The test results are a bit difficult to find, but the tests are done regularly. Testing against other engines is not done on the framework; one reason is that people object to running other programs on their machines when it is not very clear what code is inside them.