Move Generation with Perfect Hash Functions
by Trevor Fenner and Mark Levene
Issue is sliding piece attacks. The paper is quite interesting, stunning modulo math. The authors map masked and normalized (shifted to bit zero) files, diagonal or antidiagonal occupancies to vectors of eight (or nine) consecutive bits, quite similar as kindergarten bitboards does.
They use modulo 514 for the diagonals and modulo 257 for the anti-diagonals to get 8 (or even nine for the diagonals) bits as an index. That can be done most efficiently by reciprocal multiplication - this is how vc2005 implements the mod for x64 processors. One 64*64=128 bit mul, one shift and one further imul, sub. Of course using 64-bit div to get the remainder burns even more cycles.
Code: Select all
% 514 mov r11d, r10 ; masked diagonal mov rax, ff00ff00ff00ff01H mul r10 shr rdx, 9 imul edx, 514 ; 00000202H sub r11d, edx % 257 mov r11d, r10 ; masked diagonal mov rax, ff00ff00ff00ff01H mul r10 shr rdx, 8 imul edx, 257 ; 00000101H sub r11d, edx
Code: Select all
mov rax, 0101010101010101H imul rdx, rax shr rdx, 56
Despite huge rook-tables most prefer magic-bitboards over kindergarten, since it covers both lines of a rook or bishop in one run with only one and, mul, shift->lookup.
My question is - how can they conclude they are competitive over magic-botboards - if they even can not beat kindergarten-bitboards by margins - not to mention table size and inner six bit only? To compare a Matlab 32-bit implementation with a loop-version to get the sliding attacks? Because Sam Tannous claimed small improvement over rotated with some high level hashing directories - and Bob Hyatt reported similar performance of rotated versus magic in crafty?
Strange conclusions, for Reference see Bob's post the authors mention in their paper:
http://220.127.116.11/forum/viewtopic.php ... 41&t=16002
http://chessprogramming.wikispaces.com/ ... ctionaries
http://chessprogramming.wikispaces.com/ ... +Bitboards
Did I miss something on superfast matlab modulo?