Final Algorithmic Updates!
Code: Select all
AMD Ryzen 9 5950X 16-Core Processor
Million Lookups/s Random Squares, Random Occupation/s:
Name Performance [MQueens/s] Tablesize Dependencies Template Author Reference
SBAMG o^(o-3cbn) 268.565725 576 [4kb] countl_zero, bswap yes Syed Fahad http://www.talkchess.com/forum3/viewtopic.php?t=59845
SBAMG Inline 405.486229 0 [0kb] countl_zero, bswap yes Syed Fahad and Daniel Inf�hr http://www.talkchess.com/forum3/viewtopic.php?t=59845
Hyperbola Quintessence o^(o-2r) 626.543516 256 [2kb] bswap no Ryan Mack https://www.chessprogramming.org/Hyperbola_Quintessence
Hyperbola Quintessence Inline 572.912688 0 [0kb] bswap yes Ryan Mack https://www.chessprogramming.org/Hyperbola_Quintessence
Genetic 8 Ray 161.060638 0 [0kb] bswap no Daniel Inf�hr (dangi12012) Abstract C++ Syntax Tree Sifter (c) Daniel Infuehr
Bitrotation 117.578817 0 [0kb] ReverseBits no Daniel Inf�hr (dangi12012) http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078&start=20
Binary Neural Network 36.595017 5852 [45kb] pdep_u64, AVX2 no Daniel Inf�hr (dangi12012) http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79332
Exploding Bitboards 74.412098 768 [6kb] imul64 no Harald L��en http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=4523&start=80
Reference (Switch Lookup) 83.331655 0 [0kb] none yes Daniel Inf�hr (dangi12012) http://www.talkchess.com/forum3/viewtopic.php?f=7&t=78235&p=907362&hilit=espresso#p907362
AVX Branchless Shift 278.273130 0 [0kb] AVX2 no Daniel Inf�hr (dangi12012) http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79005&start=60
Pext Emulated 66.443490 107904 [843kb] none no Zach Wegner https://randombit.net/bitbashing/posts/haswell_bit_permutations.html
Dumb7 Fill 298.717010 0 [0kb] none no Gunnar Andersson https://www.chessprogramming.org/Dumb7Fill
Kogge-Stone 498.975022 0 [0kb] none no Peter M. Kogge, Harold S. Stone https://www.chessprogramming.org/Kogge-Stone_Algorithm
Rotated Bitboards 46.388734 1848 [14kb] none no Robert Hyatt https://www.chessprogramming.org/Rotated_Bitboards
QBBEngine 355.518687 0 [0kb] countr_zero, countl_zero yes Fabio Gobbato https://www.chessprogramming.org/QBBEngine
QBBEngine - Shifted Mask 343.973583 0 [0kb] countr_zero, countl_zero no Fabio Gobbato http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79005&start=90#p924623
Classical Bob-Mike 300.460456 1024 [8kb] countr_zero, countl_zero yes Robert Hyatt and Michael Sherwin https://www.chessprogramming.org/Classical_Approach
Advanced Bob-Mike 378.937396 520 [4kb] countr_zero, countl_zero no Michael Sherwin and Daniel Inf�hr http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078&start=50#p924653
Leorik 333.512133 128 [1kb] countl_zero no Thomas Jahn (lithander) https://github.com/lithander/MinimalChessEngine
Leorik Inline 351.574173 0 [0kb] countl_zero no Thomas Jahn (lithander) https://github.com/lithander/MinimalChessEngine
Obstruction Difference 309.071241 768 [6kb] countl_zero no Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?t=29087
Obstruction Difference Inline 328.424302 0 [0kb] countl_zero yes Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?t=29087
Genetic Obstruction Difference 389.031894 384 [3kb] countl_zero yes Daniel Inf�hr and Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79701
Slide Arithmetic 258.543795 256 [2kb] bzhi_u64, blsmsk_u64 no Jakob Progsch and Daniel Inf�hr http://www.talkchess.com/forum3/viewtopic.php?f=7&t=78693&p=914767&hilit=SlideArithm#p914767
Slide Arithmetic Inline 149.183407 0 [0kb] bzhi_u64, blsmsk_u64 no Jakob Progsch and Daniel Inf�hr http://www.talkchess.com/forum3/viewtopic.php?f=7&t=78693&p=914767&hilit=SlideArithm#p914767
Kindergarten 599.784078 16640 [130kb] imul64 no Urban Koistinen https://www.chessprogramming.org/Kindergarten_Bitboards
SISSY Bitboards 253.911292 180416 [1409kb] none no Michael Sherwin http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73083
Fancy Magic BB - Variable shift 522.231845 93376 [729kb] imul64 yes Pradu Kannan https://www.chessprogramming.org/Magic_Bitboards#Fancy
Plain Magic BB 552.234479 295168 [2306kb] imul64 no Lasse Hansen https://www.chessprogramming.org/Magic_Bitboards#Plain
Black Magic BB - Fixed shift 633.927817 88891 [694kb] imul64 no Onno Garms and Volker Annuss https://www.chessprogramming.org/Magic_Bitboards#Fixed_shift_Fancy
Pext constexpr 781.127949 107904 [843kb] pext_u64 yes Zach Wegner https://www.chessprogramming.org/BMI2#PEXTBitboards
HyperCube 43.105213 107680 [841kb] none yes Daniel Inf�hr (dangi12012) http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79004&p=916723&hilit=hypercube#p916723
There remains only 1 non implemented idea: Solving by linear algebra.
The input is an 8x8 binary matrix (the occupancy) which is an unknown variable (A).
Multiplied with a mask matrix M (which is known per square).
Multiplied with an unknown solve matrix S (which is unknown) - gives the result R which is the solution.
This can be tried out and solved in eigen or wolfram alpha. If it turns out that the occupancy is non reversible than its proven that there is no such solution - if its reversible then we get access to a solution working with linear algebra which was hardware and software optimised for the last 60 years.
Of course with 2 unknowns its not solvable - but we can enumerate all 2^14 * 64 occupancies and put them on a diagonal of a matrix. It would look similar to this:
The formula is simply AMS = R. Inverse (AM) and multiply it with R and you get the solution if it exists.
The dimensions of A would be 1048576 * 1048576. But since its very sparse that should be no problem at all. I will not put in the work there to solve it since I am sure that the resulting 64 8x8 matrices are interesting for Tensor/Matlab but the bottleneck in chessprogramming for sure is not the speed at which the queen moves can be calculated (at the moment). Maybe it interests you?
Final improvements
https://github.com/Gigantua/Chess_Moveg ... f6ff502eR1
New Algo - Bob Advanced
I saw the the initialisation is just the raymasks - so I made the constepr init code very small. Also looking up a shifted value is slower than calulating it. So that together with the reverse trick michael mentioned this i a new version that is 20% faster with half the memory footprint.
Genetic 8Ray
This is a completely new algorithm that was found last week by sifting the sytax tree of bitrotation. It works with 8 masks instead of 4 and in this commit I made smarter shift constants eliminating 4 instructions.
Genetic QBB
Quite pleasing to the eye its the same as QBB but with the elimination of one constant in the result calcuation. QBB is proven by the sifter to be optimized fully. I tried looking up intermediates which made it faster on msvc but slower on clang. So no lookups since clang is the better compiler.
Genetic Obstruction Difference
This one makes me proud the most. Its a quite pleasing faster mask lookup together with a removal of a mask (just xor two lines automatically eliminates the source bit) compared to Obstruction Difference. A faster line_attack method was found and implemented by c++ tree sifting. (Now the core algorithm is just 1 line of code like bitrotation). Also much smarter masking constants also made the table half as big and eliminated more unnecessary masking. So all together I found and implemented 4 improvements: (Core algo improvement, XOR, Masking, Smaller Table)
Keep in mind that the original Obstruction Difference was already optimized once when moving it into my repo.
Code: Select all
Obstruction Difference 309.071241 768 [6kb] countl_zero no Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?t=29087
Obstruction Difference Inline 328.424302 0 [0kb] countl_zero yes Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?t=29087
Genetic Obstruction Difference 389.031894 384 [3kb] countl_zero yes Daniel Inführ and Michael Hoffmann http://www.talkchess.com/forum3/viewtopic.php?f=7&t=79701
Important Preview
The next post will be the final post since I think I found everything I wanted to find. It will either be a nice scientific paper on arxiv or a post with a shared link to a pdf file since I have the feeling that I know and understand all of the above algorithms by heart - and could improve many of those way beyond the original authors.
I can see a nice conceptual correletion of the inner workings when comparing algorithms which gives rise to distinct idea groups clustered in a picture which should be made public. These are the groups: (Shifting, 4 Mask, 8 Mask, Hashing)
It now is an assortment of the best versions of all the known algorithms.