[Question] Efficiently generate ray masks?

Discussion of chess software programming and technical issues.

Moderator: Ras

dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: [Question] Efficiently generate ray masks?

Post by dangi12012 »

Mike Sherwin wrote: Mon Mar 07, 2022 12:29 am

Code: Select all

uint64_t mask_shift_down(int ranks) {
    return bb >> (ranks << 3);
}
:?:
Interesting - but how do you know from the arguments:

Code: Select all

(X & 7) - (X >> 3)
7 - (X & 7) - (X >> 3)
Which of both functions to call?

x & 7 = modulus 8 = X
x >> 8 = div 8 = Y

Both bitmasks are this:

Code: Select all

//0x8040201008040201ull
X.......
.X......
..X.....
...X....
....X...
.....X..
......X.
.......X

//0x0102040810204080ull
.......X
......X.
.....X..
....X...
...X....
..X.....
.X......
X.......
So the code currently checks where to shift towards.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: [Question] Efficiently generate ray masks?

Post by Mike Sherwin »

dangi12012 wrote: Mon Mar 07, 2022 12:42 am
Mike Sherwin wrote: Mon Mar 07, 2022 12:29 am

Code: Select all

uint64_t mask_shift_down(int ranks) {
    return bb >> (ranks << 3);
}
:?:
Interesting - but how do you know from the arguments:

Code: Select all

(X & 7) - (X >> 3)
7 - (X & 7) - (X >> 3)
Which of both functions to call?

x & 7 = modulus 8 = X
x >> 8 = div 8 = Y

Both bitmasks are this:

Code: Select all

//0x8040201008040201ull
X.......
.X......
..X.....
...X....
....X...
.....X..
......X.
.......X

//0x0102040810204080ull
.......X
......X.
.....X..
....X...
...X....
..X.....
.X......
X.......
So the code currently checks where to shift towards.
The programmer knows because both diagonal and anti-diagonal need to be generated. So the diagonal and anti-diagonal bits would both be in the appropriate function as constants. I hope that makes sense?

Edit: I'll try to get them working in my engine Bricabrac for the bishop. But as usual I'm not very speedy though I will get started on it today.
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: [Question] Efficiently generate ray masks?

Post by Mike Sherwin »

I guess that I just do not understand templates. So let's see how much I can deduce.
sq is passed to dir_D1(sq)
in the macro dir_D1(X) ((X & 7) - (X >> 3)) can be either negative or positive which creates a decision that needs to be made. So at this point the programmer does not know which function to call. Is this correct? So maybe this?
ranks = ((fs & 7) - (fs >> 3))
mask_shiftD1[ranks > 0](ranks);
Chessnut1071
Posts: 313
Joined: Tue Aug 03, 2021 2:41 pm
Full name: Bill Beame

Re: [Question] Efficiently generate ray masks?

Post by Chessnut1071 »

dangi12012 wrote: Sun Jan 16, 2022 2:23 pm

Code: Select all

struct Rays{
   uint64_t Horizontal, 
   uint64_t Vertical,
   uint64_t Diag1,
   uint64_t Diag2

   Rays(int sq){
        //??
   }
}
What is the most efficient C++ code to generate all 4 rays inside the constructor? (no external lookup is allowed)
what's faster than simply packing the vectors north_east, north_west, south_east, south_west, or, north, south, east & west, with the 64-bit continuation for each of the 64 squares? Once computed it's a simple reference. How could anything be faster?
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: [Question] Efficiently generate ray masks?

Post by dangi12012 »

Chessnut1071 wrote: Mon Mar 07, 2022 6:15 am
dangi12012 wrote: Sun Jan 16, 2022 2:23 pm

Code: Select all

struct Rays{
   uint64_t Horizontal, 
   uint64_t Vertical,
   uint64_t Diag1,
   uint64_t Diag2

   Rays(int sq){
        //??
   }
}
What is the most efficient C++ code to generate all 4 rays inside the constructor? (no external lookup is allowed)
what's faster than simply packing the vectors north_east, north_west, south_east, south_west, or, north, south, east & west, with the 64-bit continuation for each of the 64 squares? Once computed it's a simple reference. How could anything be faster?
On some architectures a simple lookup is really slow compared to calcuation. Cuda proof:
33.74 Billion [lookup]
92.32 Billion [direct calculation]

So if there was a way to optimize the calculation it would be really great - but I dont see one either!
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: [Question] Efficiently generate ray masks?

Post by dangi12012 »

Mike Sherwin - Question:

Couldnt I remove the second 7 -
and call another function that has the mask_shift implemented in reverse?

Code: Select all

#	define dir_D1(X) (mask_shift<0x8040201008040201ull>(     (X & 7) - (X >> 3)))
#	define dir_D2(X) (mask_shift<0x0102040810204080ull>(7 - (X & 7) - (X >> 3)))
Could be one OP less?
https://godbolt.org/z/qKfWerq45
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: Mateo

Re: [Question] Efficiently generate ray masks?

Post by tcusr »

dangi12012 wrote: Mon Mar 07, 2022 12:56 pm Mike Sherwin - Question:

Couldnt I remove the second 7 -
and call another function that has the mask_shift implemented in reverse?

Code: Select all

#	define dir_D1(X) (mask_shift<0x8040201008040201ull>(     (X & 7) - (X >> 3)))
#	define dir_D2(X) (mask_shift<0x0102040810204080ull>(7 - (X & 7) - (X >> 3)))
Could be one OP less?
https://godbolt.org/z/qKfWerq45
i don't think you can because that "7-" gives the right number for the shift
https://www.chessprogramming.org/Anti-D ... numeration
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: [Question] Efficiently generate ray masks?

Post by Mike Sherwin »

dangi12012 wrote: Mon Mar 07, 2022 12:56 pm Mike Sherwin - Question:

Couldnt I remove the second 7 -
and call another function that has the mask_shift implemented in reverse?

Code: Select all

#	define dir_D1(X) (mask_shift<0x8040201008040201ull>(     (X & 7) - (X >> 3)))
#	define dir_D2(X) (mask_shift<0x0102040810204080ull>(7 - (X & 7) - (X >> 3)))
Could be one OP less?
https://godbolt.org/z/qKfWerq45
I see that occ is passed to Optimise() but I do not see where occ is used in the following code.

Code: Select all

template<uint64_t bb>
uint64_t mask_shift(int ranks) {
	return ranks > 0 ? bb >> (ranks << 3) : bb << -(ranks << 3);
}

#	define dir_HO(X) (0xFFull << (X & 56))
#	define dir_VE(X) (0x0101010101010101ull << (X & 7))
#	define dir_D1(X) (mask_shift<0x8040201008040201ull>((X & 7) - (X >> 3)))
#	define dir_D2(X) (mask_shift<0x0102040810204080ull>(7 - (X & 7) - (X >> 3)))

uint64_t Optimize(int sq, uint64_t occ)
{
    return dir_HO(sq) ^ dir_VE(sq) ^  dir_D1(sq) ^ dir_D2(sq);
}
So now I do not see how this code can even work? Since I do not understand templates well enough I tried to translate the algo to straight C code.

Code: Select all

      ranks = ((fs & 7) - (fs >> 3));
      bb = ranks > 0 ? 0x8040201008040201 >> (ranks << 3) : 0x8040201008040201 << -(ranks << 3);
      ranks = (7 - (fs & 7) - (fs >> 3));
      bb ^= ranks > 0 ? 0x0102040810204080 >> (ranks << 3) : 0x0102040810204080ull << -(ranks << 3);
Is this a good translation? And if it is how is occ used? And if is it is not a good translation then can you post the complete bishop code in standard C? Maybe then I can try to help! 8-)
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: Mateo

Re: [Question] Efficiently generate ray masks?

Post by tcusr »

the translation is correct.
this is how the code is used so you can understand better https://talkchess.com/forum3/viewtopic. ... 20#p922270
Mike Sherwin
Posts: 965
Joined: Fri Aug 21, 2020 1:25 am
Location: Planet Earth, Sol system
Full name: Michael J Sherwin

Re: [Question] Efficiently generate ray masks?

Post by Mike Sherwin »

tcusr wrote: Mon Mar 07, 2022 6:28 pm the translation is correct.
this is how the code is used so you can understand better https://talkchess.com/forum3/viewtopic. ... 20#p922270
Oh, thank you very much!