I wrote a optimized sifting tool that enumerates all possible C++ Syntax trees with a given node count in order starting from the smallest tree (less code) up to bigger trees.
16 tokens are supported (binary functions, unary functions, constants, paramters) but in theory there is no limit.
Here is the list of 16 tokens used for this attempt at the optimal sliding piece algorithm:
Code: Select all
1,2,56,
square, occ, raymask
std::popcount, std::countl_zero, std::countr_zero, bit_reverse,
(arg0 | arg1) (arg0 & ~arg1) (arg0 ^ arg1) (arg0 - arg1) (arg0 << arg1) (arg0 >> arg1)
That is any solution with 8 tokens or less are basically solved instantly and bigger trees (16^11 = 1.7592186e+13) can be done overnight.
I am very proud of this achievement since this tool can solve basically any bit transformation problem with a handful of instructions. But skill and intuition is still required to know which kind of binary operations and functions can solve a problem.
Here is the proven minimum solution for the 8 ray problem (with the tokens above):
Code: Select all
//HO UPPER: (occ ^ (occ - 1ull))
//HO LOWER: (occ ^ bit_reverse((bit_reverse(occ) - 1ull)))
//VE UPPER: ((occ - 1ull) << 1ull)
//VE LOWER: (bit_reverse((bit_reverse(occ) - 1ull)) >> 1ull)
//D1 UPPER: ((occ - 1ull) << 1ull)
//D1 LOWER: (bit_reverse((bit_reverse(occ) - 1ull)) >> 1ull)
//D2 UPPER: ((occ - 1ull) << 1ull)
//D2 LOWER: (bit_reverse((bit_reverse(occ) - 1ull)) >> 1ull)
All rays share the same solution! - except the horizontal rays. The inputs parameters were pruned and in a sense this algorithm does not need to know which square we are operating on. This is a really good thing because enumerating set bits in a Bitboards yields sing bit masks and not the squares. This would be an algorithm to work without the source square (just its bitmask) and thus shaving off another countLZero instruction in real movegen code.
Now lets compare the 6 solution rays to hyperbola quiescence:
Code: Select all
/* Generate attack using the sifted 8 ray approach */
static constexpr uint64_t SolveGenetic(uint64_t occ, uint64_t mask1, uint64_t mask2)
{
uint64_t occ1 = occ & mask1; uint64_t occ2 = occ & mask2;
return ((occ1 - 1ull) << 1ull) & mask1 | (bit_bswap((bit_bswap(occ) - 1ull)) >> 1ull) & mask2
}
/* Generate attack using the hyperbola quintessence approach */
static constexpr uint64_t SolveHyperbola(uint64_t pieces, uint32_t x, uint64_t mask) {
uint64_t o = pieces & mask;
return ((o - (1ull << x)) ^ bit_bswap(bit_bswap(o) - (1ull << (x ^ 56)))) & mask;
}
https://godbolt.org/z/MW1vo5Ed6
They are different but somehow weirdly identical. Not to mention its two totally different problems. Mine is working with 8 rays - and hyperbola quiescense is solving 4 rays but the number of instructions is really the same (since I use one extra mask with one & instuction extra) and godbolt gives my algorithm one instruction less compared to hyperbola.
You can try out the new algorithm today! - Github release
https://github.com/Gigantua/Chess_Moveg ... ic8Ray.hpp
What is there to do next?
Of course taking inputs from other very fast non table lookup algorithms like QBB or SBAMG and seeing if a uniform smaller solution exists.
This tool was created out of pure lazyness. Of course bithacks are widely known but putting them together in bigger expressions gives me headaches. Also there was no way to proof if an algorithm is optimal. UNTIL NOW.
Greetings - Daniel