GUI idea: Testing until certainty
Posted: Tue Dec 07, 2010 3:52 pm
Here is an idea for the GUI designers here:
In backgammon, rollouts (Monte Carlo simulations with variance reduction) are used to find the truth of a position. The number of trials needed to reach certainty varies from position to position (because of volatility), so a common setting is to have the rollout continue until one move has been established as the best with statistical certainty.
It occurred to me this might be an interesting feature (with a twist) for engine testers working on settings and features. Suppose you have a feature or setting that you believe might be an improvement, but you are not sure by how much. You could set up ultrafast games to run until the result was certain one way or the other, and the GUI would stop only when a conclusion was reached. There could be added options, such as extending the match when an improvement is found, to try to figure out how large it is. The GUI might also offer the option to test until the strength is known within a specific Elo range, for example within 5, 10, or 20 Elo. Of course, the latter corresponds to a known, fixed number of games, so when this option is chosen the GUI could tell the user how many games to expect, and even give a rough estimate of how long the testing would take (based on average game time, of course).
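To make the two modes concrete, here is a rough sketch in Python. Nothing here comes from an existing GUI: the function names (`games_for_elo_margin`, `estimated_hours`, `verdict`) and the defaults (95% confidence via `z=1.96`, a 30-game minimum) are my own assumptions. The first function uses the usual logistic Elo curve and assumes the two versions are close in strength with independent games. The second is a naive "stop when the confidence interval excludes 50%" rule; checking it after every game inflates the false-positive rate, so a real implementation would more likely use a proper sequential test such as SPRT.

```python
import math

# Slope of the logistic Elo curve at a 50% score (Elo points per unit of score).
ELO_PER_SCORE_AT_EVEN = 1600 / math.log(10)


def games_for_elo_margin(margin_elo, draw_ratio=0.0, z=1.96):
    """Approximate games needed to know the Elo difference within +/- margin_elo.

    Assumes near-equal engines and independent games (illustrative only).
    """
    per_game_variance = (1.0 - draw_ratio) / 4.0  # variance of one game's score
    return math.ceil(per_game_variance * (z * ELO_PER_SCORE_AT_EVEN / margin_elo) ** 2)


def estimated_hours(games, avg_seconds_per_game):
    """Rough wall-clock estimate from the average game length."""
    return games * avg_seconds_per_game / 3600.0


def verdict(wins, draws, losses, z=1.96, min_games=30):
    """Return 'better', 'worse', or None (keep playing) for the new version.

    Naive rule: stop once the confidence interval of the mean score excludes 0.5.
    """
    n = wins + draws + losses
    if n < min_games:
        return None
    mean = (wins + 0.5 * draws) / n
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / n
    half_width = z * math.sqrt(variance / n)
    if mean - half_width > 0.5:
        return "better"
    if mean + half_width < 0.5:
        return "worse"
    return None
```

With these assumptions, halving the Elo margin roughly quadruples the number of games, which is why the GUI would want to warn the user up front before starting a tight-margin test.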