[Question] Efficiently generate ray masks?

Discussion of chess software programming and technical issues.

Moderator: Ras

dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

[Question] Efficiently generate ray masks?

Post by dangi12012 »

Code: Select all

struct Rays{
   uint64_t Horizontal, 
   uint64_t Vertical,
   uint64_t Diag1,
   uint64_t Diag2

   Rays(int sq){
        //??
   }
}
What is the most efficient C++ code to generate all 4 rays inside the constructor? (no external lookup is allowed)
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: [Question] Efficiently generate ray masks?

Post by Desperado »

dangi12012 wrote: Sun Jan 16, 2022 2:23 pm

Code: Select all

struct Rays{
   uint64_t Horizontal, 
   uint64_t Vertical,
   uint64_t Diag1,
   uint64_t Diag2

   Rays(int sq){
        //??
   }
}
What is the most efficient C++ code to generate all 4 rays inside the constructor? (no external lookup is allowed)
Let us start with the public solutions like

Code: Select all


#define FileOf(S) ((S) & 7)

uint64_t M64::setup_rank(int s)
{
    static const uint64_t C = 0xFF;
    return C << (s & 56);
}

uint64_t M64::setup_file(int s)
{
    static const uint64_t C = 0x0101010101010101;
    return C << FileOf(s);
}

uint64_t M64::setup_diag(int sq)
{
    static const uint64_t C = 0x8040201008040201;
    int d = 8 * FileOf(sq) - (sq & 56);
    int n = -d & (d >> 31);
    int s = d & (-d >> 31);
    return (C >> s) << n;
}

uint64_t M64::setup_anti(int sq)
{
    static const uint64_t C = 0x0102040810204080;
    int d = 56 - 8 * FileOf(sq) - (sq & 56);
    int n = -d & (d >> 31);
    int s = d & (-d >> 31);
    return (C >> s) << n;
}
I think file and rank are optimal. I am not sure for the diagonals.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: [Question] Efficiently generate ray masks?

Post by Sven »

Maybe you like short but still readable code, so the following would be equivalent to Michael's solution:

Code: Select all

#include <cstdint>

#define FileOf(sq)     ((sq) & 7)
#define RankMaskOf(sq) ((sq) & 56)

inline uint64_t setup_d(uint64_t C, int d)
{
    int n = -d & (d >> 31);
    int s = d & (-d >> 31);
    return (C >> s) << n;
}

struct Rays {
    uint64_t m_horizontal;
    uint64_t m_vertical;
    uint64_t m_diag1;
    uint64_t m_diag2;

    Rays(int sq)
        : m_horizontal   (uint64_t(0xff)               << RankMaskOf(sq)),
          m_vertical     (uint64_t(0x0101010101010101) << FileOf(sq)),
          m_diag1(setup_d(uint64_t(0x8040201008040201),      8 * FileOf(sq) - RankMaskOf(sq))),
          m_diag2(setup_d(uint64_t(0x0102040810204080), 56 - 8 * FileOf(sq) - RankMaskOf(sq)))
    {
    }
};
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: Mateo

Re: [Question] Efficiently generate ray masks?

Post by tcusr »

this is the simplest code for anti/diagonals
https://github.com/Luecx/Koivisto/blob/ ... ard.h#L178
i don't know how you can get a formula out of this, i don't know where ANTI_DIAGONAL_7_BB is...
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: [Question] Efficiently generate ray masks?

Post by Desperado »

To be a bit nitpicking, the multiplication of 8 might be replaced with the shift operator. That is of course not relevant for a one time initialization call, but if the constructor is called a lot the shift might be faster than the multiplication i guess. Maybe the compiler is doing sth. like that anyway.
mar
Posts: 2645
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: [Question] Efficiently generate ray masks?

Post by mar »

the compiler will generate the shift for you, alternatively there are instructions like lea where multiply by 8 is part of SIB - so the optimizer will generate whichever it deems faster for your code, like multiplying by 9 can be as simple as lea reg, [reg*8+reg] on x86 and so on
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: Mateo

Re: [Question] Efficiently generate ray masks?

Post by tcusr »

tcusr wrote: Sun Jan 16, 2022 5:46 pm this is the simplest code for anti/diagonals
https://github.com/Luecx/Koivisto/blob/ ... ard.h#L178
i don't know how you can get a formula out of this, i don't know where ANTI_DIAGONAL_7_BB is...
https://www.chessprogramming.org/Diagonals
koivisto seems to use 7 + rank - file enumeration for diagonals and rank + file for anti-diagonals
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: [Question] Efficiently generate ray masks?

Post by dangi12012 »

Thanks everyone this is what I ended up with - and it works fine!
The constexpr and template stuff is just so I have fewer lines of code compared to 4 different functions with the same signature.

Code: Select all

	#define FileOf(S) ((S) & 7)
template<int dir>
static constexpr uint64_t dirMask(int sq)
{
	if constexpr (dir == 0) {
		uint64_t C = 0xFFull;
		return (C << (sq & 56)) & ~(1ull << sq);
	}
	else if constexpr (dir == 1) {
		uint64_t C = 0x0101010101010101ull;
		return (C << FileOf(sq)) & ~(1ull << sq);
	}
	else if constexpr (dir == 2) {
		uint64_t C = 0x8040201008040201ull;
		int d = 8 * FileOf(sq) - (sq & 56);
		int n = -d & (d >> 31);
		int s = d & (-d >> 31);
		return ((C >> s) << n) & ~(1ull << sq);
	}
	else if constexpr (dir == 3) {
		uint64_t C = 0x0102040810204080ull;
		int d = 56 - 8 * FileOf(sq) - (sq & 56);
		int n = -d & (d >> 31);
		int s = d & (-d >> 31);
		return ((C >> s) << n) & ~(1ull << sq);
	}
}
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: [Question] Efficiently generate ray masks?

Post by Sven »

dangi12012 wrote: Sun Jan 16, 2022 8:02 pm Thanks everyone this is what I ended up with - and it works fine!
The constexpr and template stuff is just so I have fewer lines of code compared to 4 different functions with the same signature.

Code: Select all

	#define FileOf(S) ((S) & 7)
template<int dir>
static constexpr uint64_t dirMask(int sq)
{
	if constexpr (dir == 0) {
		uint64_t C = 0xFFull;
		return (C << (sq & 56)) & ~(1ull << sq);
	}
	else if constexpr (dir == 1) {
		uint64_t C = 0x0101010101010101ull;
		return (C << FileOf(sq)) & ~(1ull << sq);
	}
	else if constexpr (dir == 2) {
		uint64_t C = 0x8040201008040201ull;
		int d = 8 * FileOf(sq) - (sq & 56);
		int n = -d & (d >> 31);
		int s = d & (-d >> 31);
		return ((C >> s) << n) & ~(1ull << sq);
	}
	else if constexpr (dir == 3) {
		uint64_t C = 0x0102040810204080ull;
		int d = 56 - 8 * FileOf(sq) - (sq & 56);
		int n = -d & (d >> 31);
		int s = d & (-d >> 31);
		return ((C >> s) << n) & ~(1ull << sq);
	}
}
Why is that shorter in lines of code than my proposal above?
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
Luecx
Posts: 138
Joined: Thu Jun 18, 2020 9:20 pm
Full name: Finn Eggers

Re: [Question] Efficiently generate ray masks?

Post by Luecx »

tcusr wrote: Sun Jan 16, 2022 7:52 pm
tcusr wrote: Sun Jan 16, 2022 5:46 pm this is the simplest code for anti/diagonals
https://github.com/Luecx/Koivisto/blob/ ... ard.h#L178
i don't know how you can get a formula out of this, i don't know where ANTI_DIAGONAL_7_BB is...
https://www.chessprogramming.org/Diagonals
koivisto seems to use 7 + rank - file enumeration for diagonals and rank + file for anti-diagonals
True but in fact we got rid of the usage of diagonals at the same time we got rid of HCE.
The ability to speak does not make you intelligent. https://github.com/Luecx/Koivisto

Image