Error margins via resampling (jackknifing)

Toadofsky · Post by **Toadofsky** » Wed Feb 01, 2017 1:31 pm

Pardon my ignorance here, even after having read this thread and related threads. I am advised:

It's essential to keep results per side (if playing without a book) or results of game pairs with the same start position (if using a large set of start positions). Not doing so overestimates the statistical error (see http://talkchess.com/forum/viewtopic.ph ... ight=gsprt ) and wastes testing resources. Why doesn't mainline Stockfish testing do that? Well, they found enough dupes to contribute their resources.
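To make the advice concrete, here is a minimal sketch of a delete-one jackknife on the mean score, run once over single games and once over game pairs. The `pairs` data and the helper name `jackknife_se` are made up for illustration and do not come from any testing framework. When the White and Black results from the same start position are negatively correlated, as in this toy data, resampling whole pairs gives the smaller error margin.

```python
import math

def jackknife_se(values):
    """Delete-one jackknife standard error of the mean of `values`."""
    n = len(values)
    total = sum(values)
    # Leave-one-out estimates of the mean score.
    loo = [(total - v) / (n - 1) for v in values]
    loo_mean = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))

# Hypothetical results: each tuple is (score as White, score as Black)
# for one start position, from the tested engine's point of view.
pairs = [(1.0, 0.0), (0.5, 0.5), (1.0, 0.5), (1.0, 0.0), (0.5, 0.0), (1.0, 0.5)]

games = [score for pair in pairs for score in pair]
pair_means = [(white + black) / 2 for white, black in pairs]

se_games = jackknife_se(games)       # resampling unit: a single game
se_pairs = jackknife_se(pair_means)  # resampling unit: a whole game pair

print(f"SE treating games as independent: {se_games:.4f}")
print(f"SE resampling game pairs:         {se_pairs:.4f}")
```

Both runs estimate the same mean score; only the resampling unit differs, which is why the per-pair results have to be stored in the first place.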

At the same time, I am aware that there exists a win expectancy estimation model and that Lichess uses a similar model. Rather than relying on historical games, couldn't one run both engines from the start position and use the average evaluation to calculate a win expectancy?

Ferdy · Post by **Ferdy** » Thu Feb 02, 2017 4:54 am

Toadofsky wrote:Pardon my ignorance here, even after having read this thread and related threads. I am advised:

It's essential to keep results per side (if playing without a book) or results of game pairs with the same start position (if using a large set of start positions). Not doing so overestimates the statistical error (see http://talkchess.com/forum/viewtopic.ph ... ight=gsprt ) and wastes testing resources. Why doesn't mainline Stockfish testing do that? Well, they found enough dupes to contribute their resources.
At the same time, I am aware that there exists a win expectancy estimation model and that Lichess uses a similar model. Rather than relying on historical games, couldn't one run both engines from the start position and use the average evaluation to calculate a win expectancy?

An engine's win expectancy should be derived from its own evaluations and the results of its own games, not from other players' games. Every engine has its own evaluation scale, and therefore its own mapping from evaluation to win expectancy.
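As a rough sketch of what a per-engine mapping could look like, the snippet below fits the scale of a logistic curve to (evaluation, result) samples taken from one engine's own games. The logistic form, the `samples` data, the candidate grid, and the names `win_expectancy` and `fit_scale` are all assumptions for illustration. This is not Lichess's model or any engine's actual method.

```python
def win_expectancy(eval_pawns, scale):
    """Assumed logistic mapping from an evaluation in pawns to expected score."""
    return 1.0 / (1.0 + 10.0 ** (-eval_pawns / scale))

def fit_scale(samples, candidates):
    """Pick the candidate scale with the smallest squared error on the samples."""
    def squared_error(scale):
        return sum((win_expectancy(ev, scale) - result) ** 2
                   for ev, result in samples)
    return min(candidates, key=squared_error)

# Hypothetical (evaluation in pawns, game result) samples from ONE engine's games.
samples = [(-2.0, 0.0), (-0.5, 0.5), (0.0, 0.5), (0.3, 0.5), (1.0, 1.0), (2.5, 1.0)]
candidates = [0.5 * k for k in range(1, 21)]  # scales 0.5, 1.0, ..., 10.0

best_scale = fit_scale(samples, candidates)
print(f"fitted scale: {best_scale}")
print(f"expected score at +1.0: {win_expectancy(1.0, best_scale):.3f}")
```

A second engine with a different evaluation scale would get a different fitted value from its own games, which is the reason for not borrowing a curve fitted to other players' games.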
