bit boards - what is betwen

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: bit boards - what is betwen

Post by Don »

The approach works, but it's not faster for me. It's a very close call but the 64x64 array lookup is still slightly faster.

Don

Gerd Isenberg wrote:
gladius wrote:
Gerd Isenberg wrote:Indexing an array of 240 bitboards by 0x88 square difference and rotate might we worth a try.
Thanks for the idea Gerd! I ended up trying this in Stockfish, and got a small speed win. I ended up using an array of size 120, as I couldn't see how to avoid a conditional on the shift even when using size 240.

Github commit here: https://github.com/glinscott/Stockfish/ ... b79b7606c4.
Its a matter of correct initialization, to make the rotate left approach with 240 element lookup work, f.i.:

Code: Select all

       . . . . . . . .    . . . . . . . .
       . . . . . . . .    . . . . . . . 1
       . . . . . . . .    . . . . . . 1 .
       . . . . . . . .    . . . . . 1 . .
       . . . 1 . . . .    . . . . . . . .
       . . 1 . . . . .    . . . . . . . .
       . 1 . . . . . .    . . . . . . . .
       . . . . . . . .    . . . . . . . .

between   (0, 36) 		  (36, 0) 	
lookup -> { 9, 18, 27}    {37, 46, 55}
rotl   -> { 9, 18, 27}    { 9, 18, 27}  

between   (10, 46)        (46, 10) 
lookup -> { 9, 18, 27}    {37, 46, 55}
rotl   -> {19, 28, 37}    {19, 28, 37}

between   (27, 63)        (63, 27)
lookup -> { 9, 18, 27}    {37, 46, 55}
rotl   -> {36, 45, 54}    {36, 45, 54}
Otherwise, with 120, to avoid that branch, better use std:abs of the +/- difference.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: bit boards - what is betwen

Post by gladius »

wgarvin wrote:Why not AND together two masks? Even if they both come from lookup tables, those will be really small (e.g. 512 bytes each) and you ought to be able to keep them in L1 all the time.

With a bit of care, you can probably get routine(s) that are inclusive or exclusive at each end, or produce the opposite mask, etc. all with the same cost of two loads (or two shifts) and one logic-op.
Interesting, I'm not sure I understand though. What two masks would be anded together?
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: bit boards - what is betwen

Post by gladius »

Don wrote:The approach works, but it's not faster for me. It's a very close call but the 64x64 array lookup is still slightly faster.

Don
I tried a few variations, and only the final version in the commit was any faster. Interestingly, the branchless versions, and 240 size array with rotate left were about the same speed as the [64][64] array. I double checked and GCC was correctly optimizing the (bb << s) | (bb >> (64 - s)) correctly into a rol as well. So, I was surprised it still ended up being a bit slower than the swap method.

The vagaries of performance optimization on modern CPUs I guess :).
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: bit boards - what is betwen

Post by wgarvin »

gladius wrote:
wgarvin wrote:Why not AND together two masks? Even if they both come from lookup tables, those will be really small (e.g. 512 bytes each) and you ought to be able to keep them in L1 all the time.

With a bit of care, you can probably get routine(s) that are inclusive or exclusive at each end, or produce the opposite mask, etc. all with the same cost of two loads (or two shifts) and one logic-op.
Interesting, I'm not sure I understand though. What two masks would be anded together?
Ahhh, I misunderstood the original question. I was thinking "bitboard with all bits between N and M set" which is obviously not too hard.

You actually want that bitboard ANDed with the appropriate ray mask, and the simple 64x64 table lookup does sound pretty hard to beat.

64x64x8 is only a 32KB table. To beat that, you'd need something with (1) a much smaller table, (2) one or two table lookups and only a very small amount of math before/after, and (3) no dependent lookup.