Someone will have to explain to me how this works, because I just can't see it:
"I think part of what allows for it to be fast is that I designed it in a way that allows for (some/most/all of) the bound checking to not need to happen at runtime at all by inlining."
For the knight I see the inline request (no guarantee the compiler will actually inline the code).
Code:
inline void coa_move_knight(struct coa_board *board, int turn, int x0, int y0)
{
    coa_play(board, turn, x0, y0, x0 + 1, y0 + 2);
    coa_play(board, turn, x0, y0, x0 + 2, y0 + 1);
    coa_play(board, turn, x0, y0, x0 - 1, y0 + 2);
    coa_play(board, turn, x0, y0, x0 - 2, y0 + 1);
    coa_play(board, turn, x0, y0, x0 + 1, y0 - 2);
    coa_play(board, turn, x0, y0, x0 + 2, y0 - 1);
    coa_play(board, turn, x0, y0, x0 - 1, y0 - 2);
    coa_play(board, turn, x0, y0, x0 - 2, y0 - 1);
}
Each of those lines then calls coa_play(), which is where the bounds checking is done.
Code:
inline int coa_play(struct coa_board *board, int turn, int x0, int y0, int x1, int y1)
{
    if (x1 < 0) return 1; // this is runtime code iiuc and these are bounds checks
    if (y1 < 0) return 1;
    if (x1 >= 8) return 1;
    if (y1 >= 8) return 1;
    unsigned char piece = board->pieces[x0 + y0 * 8];
    unsigned char other = board->pieces[x1 + y1 * 8];
    if ((other & coa_color) == (piece & coa_color)) return 1;
    board->pieces[x0 + y0 * 8] = coa_none;
    board->pieces[x1 + y1 * 8] = piece;
    coa_perft(board, turn ^ 1);
    board->pieces[x0 + y0 * 8] = piece;
    board->pieces[x1 + y1 * 8] = other;
    if (other != coa_none) return 1;
    return 0;
}
"Likewise, the computations of the array indices are also meant to be removed from the loop altogether. When you see ‘x + y * 8’, no multiplication or addition should need to be performed at runtime, because it was already performed at compile time in each of those 64 generated inline functions."
When x and y are variables, ‘x + y * 8’ cannot be computed at compile time.
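The only way I can picture the quoted claim holding is if the coordinates are literals at the call site, so that after inlining every argument is a constant. Below is a minimal sketch of that case. The names square_index and knight_targets_b1 are made up by me for illustration and are not in the original code, and whether a given compiler actually inlines and folds this is an assumption on my part, not something the C standard guarantees.

```c
/* Hypothetical helper, not part of the original code: the same four
   bounds tests and the same index arithmetic as coa_play(). */
static inline int square_index(int x1, int y1)
{
    if (x1 < 0) return -1;   /* off the board */
    if (y1 < 0) return -1;
    if (x1 >= 8) return -1;
    if (y1 >= 8) return -1;
    return x1 + y1 * 8;
}

/* Hypothetical: knight targets from x0 = 1, y0 = 0, written as literals.
   If square_index() is inlined here, all of its arguments are constants,
   so an optimizing compiler may fold each call down to a plain number. */
static inline void knight_targets_b1(int out[8])
{
    out[0] = square_index(1 + 1, 0 + 2);
    out[1] = square_index(1 + 2, 0 + 1);
    out[2] = square_index(1 - 1, 0 + 2);
    out[3] = square_index(1 - 2, 0 + 1);
    out[4] = square_index(1 + 1, 0 - 2);
    out[5] = square_index(1 + 2, 0 - 1);
    out[6] = square_index(1 - 1, 0 - 2);
    out[7] = square_index(1 - 2, 0 - 1);
}

/* Same moves with runtime coordinates: here nothing is known at compile
   time unless the caller is itself inlined with constant arguments. */
static inline void knight_targets(int x0, int y0, int out[8])
{
    out[0] = square_index(x0 + 1, y0 + 2);
    out[1] = square_index(x0 + 2, y0 + 1);
    out[2] = square_index(x0 - 1, y0 + 2);
    out[3] = square_index(x0 - 2, y0 + 1);
    out[4] = square_index(x0 + 1, y0 - 2);
    out[5] = square_index(x0 + 2, y0 - 1);
    out[6] = square_index(x0 - 1, y0 - 2);
    out[7] = square_index(x0 - 2, y0 - 1);
}
```

Both versions give the same results for x0 = 1, y0 = 0. The difference would only show up in the generated code, which is exactly the part I cannot see happening when x0 and y0 arrive as ordinary function parameters.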
Someone please explain, my head is turning in circles.
