syzygy wrote:Rein Halbersma wrote:Here's some further optimized code that repeatedly scans the binomial table from below, does not need knowledge of N (board size) and does a minimal amount of subtractions of binomial coefficients.
I think this is my "modified" approach. Unless I am mistaken, it takes more operations on average than the scanning from above approach. Since one cannot avoid extracting the highest square first, the "right" way to scan seems to be from the top down.
Why does your upward scanning code do the binom() subtraction from the index from within the while() loop? That seems quite expensive.
Even for K = 2, the highest square of the two will on average be closer to h8 than to a1, so searching for it from above is faster. The remaining square will on average be as close to a1 as to the highest square minus 1. (Of course the remaining square does not need to be searched for anymore, because idx equals it.)
A binary search for each next highest square would be possible, but is probably not worth it.
The difference is quite large. I did some benchmarking with 2 optimized versions. I changed to using std::array so that I can return a tuple of squares, this make composing ranking and unranking functions a bit easier. With C++11 move semantics, the returned array is guaranteed to be constructed in place (so no extra copying).
Code: Select all
#if IMP == 0
template<size_t N, size_t K>
std::array<size_t, K> comb_unrank(size_t index)
{
std::array<size_t, K> a;
for (size_t n = N, k = K; k > 1; --n, --k) {
while (binom(n, k) > index)
--n;
a[k - 1] = n;
index -= binom(n, k);
}
a[0] = index;
assert(std::is_sorted(a.begin(), a.end()));
return a;
}
#elif IMP == 1
template<size_t N, size_t K>
std::array<size_t, K> comb_unrank(size_t index)
{
std::array<size_t, K> a;
for (size_t n = K, k = K; k > 1; n = --k) {
while (binom(n, k) <= index)
++n;
a[k - 1] = --n;
index -= binom(n, k, false);
}
a[0] = index;
assert(std::is_sorted(a.begin(), a.end()));
return a;
}
#endif
See the code here:
http://coliru.stacked-crooked.com/a/29cec910f4783947
Click "Edit" and change the command line to -DIMP=1 to get a comparison between methods. I count calls to binom() for the 2 roundtrips of all Binom(64, 3) distinct ways to place 3 identical pieces on a chessboard. I assert that rank/unrank are each other's inverse for all indices and all tuples.
Method 0 (scanning down from above) takes 3,12 million calls to binom() and method 1 (scanning up from below) takes 6,85 million calls. Quite a difference!
On an *unrank per position** basis (there are B(64,3)=41664 positions), it is the difference between 35 and 80 calls to binom(). I corrected for the fact that there are 2 calls to unrank() and 2 calls to rank() in the loop. Each *rank per position* takes exactly 2 calls to binom().
Method 1 also does some extra work because it can overshoot its target (if the remaining index is equal to a binom coefficient. The overhead is 2.5% extra calls to binom().