Ali Baba and the 40 positions
Posted: Mon Sep 17, 2007 11:05 am
If you were to use 40 positions and define a match between two engines (regardless of level, whether very high or potato-like) to encompass one game with each color:
My guess is that there will be many positions that show very similar results. For an extreme illustration of the principle, just assume that two of the positions differ only in that in one of them Nf3 and Nf6 have been played, and in the other they haven't.
Shouldn't it be possible to find such correlations by playing enough games with a wide enough selection of engines? And if such correlations exist, one of the two correlated positions could be removed, giving a result very similar to the previous one with less effort.
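To make the idea concrete, here is a minimal sketch of how such a redundancy check might look. Everything in it is an assumption for illustration: the position names, the made-up scores, the 0.95 threshold, and the helper names `pearson` and `redundant_positions`. Each position gets a list of scores (out of 2, one game per color), one entry per engine pairing, and a position is flagged if it correlates strongly with one that is already kept.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / sqrt(vx * vy)

def redundant_positions(scores, threshold=0.95):
    """scores[p] = list of scores for position p, one per engine pairing.
    Returns the positions that correlate at or above the threshold with a
    position already kept (positions are visited in sorted name order)."""
    kept, dropped = [], []
    for p in sorted(scores):
        if any(pearson(scores[p], scores[k]) >= threshold for k in kept):
            dropped.append(p)
        else:
            kept.append(p)
    return dropped

# Made-up data (hypothetical): score out of 2 games for 5 engine pairings.
example = {
    "pos_a": [2.0, 1.5, 1.0, 0.5, 0.0],
    "pos_b": [2.0, 1.5, 1.0, 0.5, 0.5],  # nearly the same pattern as pos_a
    "pos_c": [1.0, 0.0, 2.0, 1.0, 1.5],
}
dropped = redundant_positions(example)
print(dropped)
```

With only two games per position and pairing, a score can take just a handful of values, so the correlations would be noisy unless many engine pairings are included.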
It would be ideal if the positions are as independent as possible, say one highly tactical position and one that involves the finer points of knight maneuvering and/or rook endgames.
Has this kind of analysis been done (mathematically, not intuitively like I assume it has been) for the Nunn positions or that set of 40 positions that Bob uses for his tests?
In addition, weighting the results of those 40 positions equally will likely skew the outcome away from the underlying proposition that is to be proven.
For example, if (another extreme example, for illustration purposes only; I have not looked at the actual positions) one of the positions were a pawn ending, and such pawn endings occur less frequently in actual games than 1/40, the result of evaluating that position will be over-represented.
Could one also somehow (theoretically; I have not examined how this would be possible in practice) find the contribution of each position to the underlying set of all games, and again either take the position out of the test suite if the contribution is insignificant, or at least reduce its weight in the analysis?
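A small sketch of what the reweighting could look like, again purely for illustration. The position labels, the result fractions, and the frequency weights are all invented, and `weighted_score` is a hypothetical helper; how to actually estimate the weights from real games is the open question.

```python
def weighted_score(results, weights):
    """results[p] = score fraction (0..1) achieved on position p;
    weights[p] = assumed share of real games that position stands for.
    Weights are normalized, so they need not sum to 1."""
    total = sum(weights[p] for p in results)
    return sum(results[p] * weights[p] for p in results) / total

# Invented numbers: the pawn ending is rare in practice but scores badly.
results = {"middlegame": 0.60, "rook_ending": 0.55, "pawn_ending": 0.20}
weights = {"middlegame": 0.70, "rook_ending": 0.25, "pawn_ending": 0.05}

equal = sum(results.values()) / len(results)   # every position counts the same
weighted = weighted_score(results, weights)    # rare positions count less

print(f"equal weighting:    {equal:.4f}")
print(f"frequency weighted: {weighted:.4f}")
```

In this toy case the equally weighted result is pulled down by the rare pawn ending, which is exactly the over-representation described above.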