It's quite compact and pleasing to the eye:
Code:
#include <cstdint>

namespace Bitrotation {
#define BitFunction __inline__ __device__ uint64_t

    // Shift a diagonal mask by whole ranks: right for ranks > 0, left otherwise
    template<uint64_t bb>
    BitFunction mask_shift(int ranks) {
        return ranks > 0 ? bb >> (ranks << 3) : bb << -(ranks << 3);
    }

    // Line masks through a square (each mask includes the square itself)
    BitFunction dir_HO(int square) { return 0xFFull << (square & 56); }                // rank
    BitFunction dir_VE(int square) { return 0x0101010101010101ull << (square & 7); }   // file
    BitFunction dir_D1(int square) { return mask_shift<0x8040201008040201ull>((square & 7) - (square >> 3)); }     // diagonal
    BitFunction dir_D2(int square) { return mask_shift<0x0102040810204080ull>(7 - (square & 7) - (square >> 3)); } // anti-diagonal

    // __brevll reverses the bit order of a 64-bit value
    BitFunction bit_reverse(uint64_t x) { return __brevll(x); }

    /* Generate attack using the hyperbola quintessence approach */
    BitFunction attack(uint64_t pieces, uint32_t x, uint64_t mask) {
        uint64_t o = pieces & mask;
        // x ^ 63 is the index of square x after bit reversal
        return ((o - (1ull << x)) ^ bit_reverse(bit_reverse(o) - (1ull << (x ^ 63)))) & mask;
    }

    // The four line attacks share no squares, so XOR combines them like OR
    BitFunction Queen(int s, uint64_t occ) {
        return attack(occ, s, dir_HO(s))
             ^ attack(occ, s, dir_VE(s))
             ^ attack(occ, s, dir_D1(s))
             ^ attack(occ, s, dir_D2(s));
    }

#undef BitFunction
}
No x86-64 algorithm has ever come close, and no known ASIC or FPGA implementation has reached this performance.
We are approaching the 100 billion lookups per second mark here, which is insane given that the best CPU algorithm currently manages 10 gigalookups per second on 16 cores.
Code:
Bitrotation o^(o-2r) : 91.89 GigaQueens/s
Black Magic - Fixed shift: 7.41 GigaQueens/s
QBB Algo : 59.07 GigaQueens/s
Bob Lookup : 1.63 GigaQueens/s
Kogge Stone : 40.20 GigaQueens/s
Hyperbola Quintessence : 17.59 GigaQueens/s
Switch Lookup : 4.22 GigaQueens/s
Slide Arithm : 18.39 GigaQueens/s
Pext Lookup : 16.74 GigaQueens/s
SISSY Lookup : 8.03 GigaQueens/s
Hypercube Alg : 1.28 GigaQueens/s
Dumb 7 Fill : 25.01 GigaQueens/s
Obstruction Difference : 59.78 GigaQueens/s
Leorik : 55.59 GigaQueens/s
SBAMG o^(o-3cbn) : 58.15 GigaQueens/s
NO HEADACHE : 27.53 GigaQueens/s
AVX Branchless Shift : 27.21 GigaQueens/s
Slide Arithmetic Inline : 59.86 GigaQueens/s
Greetings - Daniel
Special thanks to: