Oh my. Take care of yourself Mike. That sounds really scary. Glad to see you came through it.Mike Sherwin wrote: ↑Mon Feb 13, 2023 3:29 am I just got out of the hospital after having a hemorrhagic stroke, in the right side of my brain, almost two weeks ago. Everyday was the decision to operate or not to operate. Fortunately for me it stopped bleeding on its own and most of the symptoms have lessened or gone away. I hope to get back to work on Quixotic soon.
If I were a Professor of Computer Science
Moderator: Ras
- 
				JoAnnP38
- Posts: 253
- Joined: Mon Aug 26, 2019 4:34 pm
- Location: Clearwater, Florida USA
- Full name: JoAnn Peeler
Re: If I were a Professor of Computer Science
- 
				Mike Sherwin
- Posts: 965
- Joined: Fri Aug 21, 2020 1:25 am
- Location: Planet Earth, Sol system
- Full name: Michael J Sherwin
Re: If I were a Professor of Computer Science
JoAnnP38 wrote: ↑Mon Feb 13, 2023 4:46 amOh my. Take care of yourself Mike. That sounds really scary. Glad to see you came through it.Mike Sherwin wrote: ↑Mon Feb 13, 2023 3:29 am I just got out of the hospital after having a hemorrhagic stroke, in the right side of my brain, almost two weeks ago. Everyday was the decision to operate or not to operate. Fortunately for me it stopped bleeding on its own and most of the symptoms have lessened or gone away. I hope to get back to work on Quixotic soon.

- 
				dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: If I were a Professor of Computer Science
Get well friend!Mike Sherwin wrote: ↑Mon Feb 13, 2023 3:29 am I just got out of the hospital after having a hemorrhagic stroke, in the right side of my brain, almost two weeks ago. Everyday was the decision to operate or not to operate. Fortunately for me it stopped bleeding on its own and most of the symptoms have lessened or gone away. I hope to get back to work on Quixotic soon.
Worlds-fastest-Bitboard-Chess-Movegenerator 
Daniel Inführ - Software Developer
			
						Daniel Inführ - Software Developer
- 
				Mike Sherwin
- Posts: 965
- Joined: Fri Aug 21, 2020 1:25 am
- Location: Planet Earth, Sol system
- Full name: Michael J Sherwin
Re: If I were a Professor of Computer Science
It is going to take some time but the doctor said I should make a full recovery!dangi12012 wrote: ↑Mon Feb 13, 2023 7:15 pmGet well friend!Mike Sherwin wrote: ↑Mon Feb 13, 2023 3:29 am I just got out of the hospital after having a hemorrhagic stroke, in the right side of my brain, almost two weeks ago. Everyday was the decision to operate or not to operate. Fortunately for me it stopped bleeding on its own and most of the symptoms have lessened or gone away. I hope to get back to work on Quixotic soon.
- 
				lithander  
- Posts: 915
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: If I were a Professor of Computer Science
That's great news! I wish you a swift recovery!Mike Sherwin wrote: ↑Mon Feb 13, 2023 7:33 pm It is going to take some time but the doctor said I should make a full recovery!
- 
				algerbrex  
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: If I were a Professor of Computer Science
Awesome news! Wishing you the best!Mike Sherwin wrote: ↑Mon Feb 13, 2023 7:33 pmIt is going to take some time but the doctor said I should make a full recovery!dangi12012 wrote: ↑Mon Feb 13, 2023 7:15 pmGet well friend!Mike Sherwin wrote: ↑Mon Feb 13, 2023 3:29 am I just got out of the hospital after having a hemorrhagic stroke, in the right side of my brain, almost two weeks ago. Everyday was the decision to operate or not to operate. Fortunately for me it stopped bleeding on its own and most of the symptoms have lessened or gone away. I hope to get back to work on Quixotic soon.
- 
				martinn
- Posts: 20
- Joined: Fri Mar 10, 2023 9:33 am
- Full name: Martin Novák
Re: If I were a Professor of Computer Science
Hello, 
I really like the chess generator you made. It's easy to understand even for a complete chess programming beginner like me.
I tried to speed up your move generator. I think it is possible to improve calculating hIndex there. I think it is good to precompute the integer for bitwise shift in this part: occ >> ((sq & 0b11111000u) + 1) with occ >> shift_horizontal_table[sq].
To Init method i simply added this for cycle.
I replaced the number 248 with 56 (removed first 2 bits)  because square cannot be bigger than 63. 
I also noticed that in Bishop function it is not necessary to multiply occupied squares on the masks with whole bfile but it is enough to multiply only with b2b7 file. In the Rook we can also replace whole afile which works there as mask with a2a7 mask. I have no idea if this change has some influence on speed.
Here is my modification for Rook and Bishop functions.
I tried to use Daniel's Chess Movegen comparison and with this modification i am getting better results than Black Magic bitboards. But I had big problems compiling the program so I hope I didnt break it while trying to run it. I added it as new generator so it could be compared with the version before my modification. Here are most significant results on my computer. 
AMD Ryzen 7 5800H with Radeon Graphics
Million Lookups/s Random Squares, Random Occupation/s:
Name Performance [MQueens/s] Tablesize Dependencies Template
Author Reference
Kindergarten Super SISSY Bitboards 89.717466 16640 [130kb] imul64 no
Michael Sherwin https://www.talkchess.com/forum3/viewto ... 4&start=30
Modified KGSS 102.305105 16672 [130kb] imul64 no
https://www.talkchess.com/forum3/viewto ... 4&start=30
Fancy Magic BB - Variable shift 81.528829 93376 [729kb] imul64 yes
Pradu Kannan https://www.chessprogramming.org/Magic_Bitboards#Fancy
Plain Magic BB 88.170178 295168 [2306kb] imul64 no
Lasse Hansen https://www.chessprogramming.org/Magic_Bitboards#Plain
Black Magic BB - Fixed shift 90.846201 88891 [694kb] imul64 no
Onno Garms and Volker Annuss https://www.chessprogramming.org/Magic_ ... hift_Fancy
I think that Pext doesn't work on my CPU because results were same as Kogge Stone. So I would like if anyone else could try to run it.
Here is whole code with all changes.
			
			
									
						
										
						I really like the chess generator you made. It's easy to understand even for a complete chess programming beginner like me.
I tried to speed up your move generator. I think it is possible to improve calculating hIndex there. I think it is good to precompute the integer for bitwise shift in this part: occ >> ((sq & 0b11111000u) + 1) with occ >> shift_horizontal_table[sq].
To Init method i simply added this for cycle.
Code: Select all
        for (int i = 0; i < 64; i++) {
            shift_horizontal_table[i] = (i & 56) + 1;
        }
I also noticed that in Bishop function it is not necessary to multiply occupied squares on the masks with whole bfile but it is enough to multiply only with b2b7 file. In the Rook we can also replace whole afile which works there as mask with a2a7 mask. I have no idea if this change has some influence on speed.
Here is my modification for Rook and Bishop functions.
Code: Select all
    static constexpr uint64_t file_b2_b7 = 0x0002020202020200ull;
    static constexpr uint64_t file_a2_a7 = 0x0001010101010100ull;
    static constexpr uint64_t diag_c2h7  = 0x0080402010080400ull;
    static uint64_t Bishop(int sq, uint64_t occ)
    {
        return  dSubset[sq][(((occ & dMask[sq]) * file_b2_b7) >> 58)] +
                aSubset[sq][(((occ & aMask[sq]) * file_b2_b7) >> 58)];
    }
    static uint64_t Rook(int sq, uint64_t occ)
    {
        return hSubset[sq][(occ >> shift_horizontal_table[sq]) & 63] +
               vSubset[sq][((((occ >> (sq & 7)) & file_a2_a7) * diag_c2h7) >> 58)];
    }
    AMD Ryzen 7 5800H with Radeon Graphics
Million Lookups/s Random Squares, Random Occupation/s:
Name Performance [MQueens/s] Tablesize Dependencies Template
Author Reference
Kindergarten Super SISSY Bitboards 89.717466 16640 [130kb] imul64 no
Michael Sherwin https://www.talkchess.com/forum3/viewto ... 4&start=30
Modified KGSS 102.305105 16672 [130kb] imul64 no
https://www.talkchess.com/forum3/viewto ... 4&start=30
Fancy Magic BB - Variable shift 81.528829 93376 [729kb] imul64 yes
Pradu Kannan https://www.chessprogramming.org/Magic_Bitboards#Fancy
Plain Magic BB 88.170178 295168 [2306kb] imul64 no
Lasse Hansen https://www.chessprogramming.org/Magic_Bitboards#Plain
Black Magic BB - Fixed shift 90.846201 88891 [694kb] imul64 no
Onno Garms and Volker Annuss https://www.chessprogramming.org/Magic_ ... hift_Fancy
I think that Pext doesn't work on my CPU because results were same as Kogge Stone. So I would like if anyone else could try to run it.
Here is whole code with all changes.
Code: Select all
    static uint64_t hMask[64];
    static uint64_t dMask[64];
    static uint64_t aMask[64];
    static uint64_t vSubset[64][64];
    static uint64_t hSubset[64][64];
    static uint64_t dSubset[64][64];
    static uint64_t aSubset[64][64];
    static unsigned int shift_horizontal_table[64]; // new lookup table for shifts in calculation of hIndex
    static constexpr uint64_t Size = (64 * 64 * 4 + 64 * 4) * sizeof(uint64_t) + 64 * sizeof(unsigned int);
    enum { FILEa = 0, FILEh = 7};
    enum { RANK1 = 0, RANK8 = 7};
    static void InitSquare(int sq)
    {
        int ts, dx, dy;
        uint64_t occ, index;
        int x = sq % 8;
        int y = sq / 8;
        // Initialize Kindergarten Super SISSY Bitboards
        // diagonals
        for (ts = sq + 9, dx = x + 1, dy = y + 1; dx < FILEh && dy < RANK8; dMask[sq] |= 1ull << ts, ts += 9, dx++, dy++);
        for (ts = sq - 9, dx = x - 1, dy = y - 1; dx > FILEa && dy > RANK1; dMask[sq] |= 1ull << ts, ts -= 9, dx--, dy--);
        // anti diagonals
        for (ts = sq + 7, dx = x - 1, dy = y + 1; dx > FILEa && dy < RANK8; aMask[sq] |= 1ull << ts, ts += 7, dx--, dy++);
        for (ts = sq - 7, dx = x + 1, dy = y - 1; dx < FILEh && dy > RANK1; aMask[sq] |= 1ull << ts, ts -= 7, dx++, dy--);
        // diagonal indexes
        for (index = 0; index < 64; index++)
        {
            dSubset[sq][index] = 0;
            occ = index << 1;
            if ((sq & 7) != FILEh && (sq >> 3) != RANK8)
            {
                for (ts = sq + 9; ; ts += 9)
                {
                    dSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEh || (ts >> 3) == RANK8) break;
                }
            }
            if ((sq & 7) != FILEa && (sq >> 3) != RANK1)
            {
                for (ts = sq - 9; ; ts -= 9)
                {
                    dSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEa || (ts >> 3) == RANK1) break;
                }
            }
        }
        // anti diagonal indexes
        for (index = 0; index < 64; index++)
        {
            aSubset[sq][index] = 0;
            occ = index << 1;
            if ((sq & 7) != FILEa && (sq >> 3) != RANK8)
            {
                for (ts = sq + 7; ; ts += 7) {
                    aSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEa || (ts >> 3) == RANK8) break;
                }
            }
            if ((sq & 7) != FILEh && (sq >> 3) != RANK1)
            {
                for (ts = sq - 7; ; ts -= 7)
                {
                    aSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEh || (ts >> 3) == RANK1) break;
                }
            }
        }
        // horizontal lines
        for (ts = sq + 1, dx = x + 1; dx < FILEh; hMask[sq] |= 1ull << ts, ts += 1, dx++);
        for (ts = sq - 1, dx = x - 1; dx > FILEa; hMask[sq] |= 1ull << ts, ts -= 1, dx--);
        // vertical indexes
        for (index = 0; index < 64; index++) {
            vSubset[sq][index] = 0;
            uint64_t blockers = 0;
            for (int i = 0; i <= 5; i++) {
                if (index & (1ull << i)) {
                    blockers |= (1ull << (((5 - i) << 3) + 8));
                }
            }
            if ((sq >> 3) != RANK8) {
                for (ts = sq + 8; ; ts += 8) {
                    vSubset[sq][index] |= (1ull << ts);
                    if (blockers & (1ull << (ts - (ts & 7)))) break;
                    if ((ts >> 3) == RANK8) break;
                }
            }
            if ((sq >> 3) != RANK1) {
                for (ts = sq - 8; ; ts -= 8) {
                    vSubset[sq][index] |= (1ull << ts);
                    if (blockers & (1ull << (ts - (ts & 7)))) break;
                    if ((ts >> 3) == RANK1) break;
                }
            }
        }
        // horizontal indexes
        for (index = 0; index < 64; index++) {
            hSubset[sq][index] = 0;
            occ = index << 1;
            if ((sq & 7) != FILEh) {
                for (ts = sq + 1; ; ts += 1) {
                    hSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEh) break;
                }
            }
            if ((sq & 7) != FILEa) {
                for (ts = sq - 1; ; ts -= 1) {
                    hSubset[sq][index] |= (1ull << ts);
                    if (occ & (1ull << (ts & 7))) break;
                    if ((ts & 7) == FILEa) break;
                }
            }
        }
    }
    static void Init()
    {
        for (int i = 0; i < 64; i++) {
            InitSquare(i);
        }
        
        for (int i = 0; i < 64; i++) {
            shift_horizontal_table[i] = (i & 56) + 1;
        }
    }
    static constexpr uint64_t file_b2_b7 = 0x0002020202020200ull;
    static constexpr uint64_t file_a2_a7 = 0x0001010101010100ull;
    static constexpr uint64_t diag_c2h7  = 0x0080402010080400ull;
    static uint64_t Bishop(int sq, uint64_t occ)
    {
        return  dSubset[sq][(((occ & dMask[sq]) * file_b2_b7) >> 58)] +
                aSubset[sq][(((occ & aMask[sq]) * file_b2_b7) >> 58)];
    }
    static uint64_t Rook(int sq, uint64_t occ)
    {
        return hSubset[sq][(occ >> shift_horizontal_table[sq]) & 63] +
               vSubset[sq][((((occ >> (sq & 7)) & file_a2_a7) * diag_c2h7) >> 58)];
    }
    static uint64_t Queen(int sq, uint64_t occ)
    {
        return Rook(sq, occ) | Bishop(sq, occ);
    }
- 
				Mike Sherwin
- Posts: 965
- Joined: Fri Aug 21, 2020 1:25 am
- Location: Planet Earth, Sol system
- Full name: Michael J Sherwin
Re: If I were a Professor of Computer Science
Thanks Martinn that is really cool.
 
 I hope you continue to work with it and demonstrate its worth in a real world chess engine!
The fact that you got results better than Black Magic Bitboards on a AMD processor should start sparking some interest.

- 
				gxchess1
- Posts: 29
- Joined: Thu Sep 15, 2022 4:51 pm
- Full name: Elad Yosef Cohen
Re: If I were a Professor of Computer Science
So what is the difference between this and a regular kindergarten approach?
			
			
									
						
										
						- 
				Mike Sherwin
- Posts: 965
- Joined: Fri Aug 21, 2020 1:25 am
- Location: Planet Earth, Sol system
- Full name: Michael J Sherwin
Re: If I were a Professor of Computer Science
In the following code snippet
hSubset[sq][(occ >> shift_horizontal_table[sq]) & 63]
I count
a memory fetch
a shift
a memory fetch
an and
for 4 instructions
in the next snippet not counting the return
occ = (rankMaskEx[sq] & occ) * bFile >> 58;
return rankMaskEx[sq] & fillUpAttacks[sq&7][occ];
i count
a move
a memory fetch
an and
a multiplication
a shift
an and
a memory fetch
an and
So roughly twice as many instructions for regular kindergarten bitboards.
It is similar for diagonals
dSubset[sq][(((occ & dMask[sq]) * file_b2_b7) >> 58)]
vs
occ = (diagonalMaskEx[sq] & occ) * bFile >> 58;
return diagonalMaskEx[sq] & fillUpAttacks[sq&7][occ];
The differences are quite large. Basically this code uses the same idea as SISSY bitboards but uses some techniques from Kindergarten bitboards to get better (super) performance than either by themselves. Hence the name Kindergarten Super SISSY bitboards.