kbhearn wrote:that however costs you an extra indirection on lookup where adding an offset is free as part of memory indexing as far as i understand.
But the point of one of my previous postings was that it isn't: it needs an extra addition. Memory indexing can at most do A+8*B, and what you need here is 8*A+8*B or 8*(A+B), and there are no addressing modes for either of those in i386/x64 assembler. So t[index + rookOffset[fromSqr]] needs one (32-bit) memory LOAD for rookOffset[fromSqr] with an addressing mode that multiplies fromSqr by 4, an ADD instruction to add it to the calculated index, and then a 64-bit memory LOAD with an addressing mode that multiplies the result of that ADD by 8. The 'extra indirection' can do with just the two LOADs: load the pointer table[fromSqr] with an addressing mode that multiplies fromSqr by 8, and use it to access t with an addressing mode that also multiplies by 8.
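To make the comparison concrete, here is a minimal sketch of the two access patterns in C. The table sizes (64 squares, 16 entries per square), the fill values and the function names are made up for illustration, and the comments describe what a compiler would typically emit, not guaranteed output.

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Hypothetical sizes, for illustration only. */
enum { SQUARES = 64, ENTRIES = 16 };

static BitBoard t[SQUARES * ENTRIES];   /* shared attack table */
static int rookOffset[SQUARES];         /* element offsets into t (4 bytes each) */
static BitBoard *rookTable[SQUARES];    /* pointers into t (8 bytes each on x64) */

static void init_tables(void)
{
    for (int i = 0; i < SQUARES * ENTRIES; i++)
        t[i] = (BitBoard)i;             /* dummy contents so lookups can be checked */
    for (int sqr = 0; sqr < SQUARES; sqr++) {
        rookOffset[sqr] = sqr * ENTRIES;
        rookTable[sqr] = t + rookOffset[sqr];
    }
}

/* Offset method: LOAD rookOffset[sqr] (scale 4), ADD index, LOAD t[sum] (scale 8). */
static BitBoard attacks_by_offset(int sqr, int index)
{
    return t[index + rookOffset[sqr]];
}

/* Pointer method: LOAD rookTable[sqr] (scale 8), LOAD pointer[index] (scale 8). */
static BitBoard attacks_by_pointer(int sqr, int index)
{
    return rookTable[sqr][index];
}
```

After calling init_tables(), both functions return the same entry for any sqr and index; only the instruction sequence needed to reach it differs.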
So the proper way to do it is the way I wrote it; adding an offset to the index is just an inefficient way to do it. It would be very hard for an optimizing compiler to realize that it can optimize the redundant ADD away by pre-multiplying the tabulated offsets by 8: these tables are likely in a global array, which could be used in other parts of the program outside the scope of the optimizer, so it cannot touch them. You could only achieve that by telling the compiler you want to store pre-multiplied offsets, and that it doesn't have to multiply them during the access, by casting the access to an array of characters:
#define K 1024
int rookOffset[64]; // byte offsets: pre-multiplied by 8 during initialization
attacks = *(BitBoard*) ((char*)t + 8*index + rookOffset[sqr]);
Then it could use the A+8*B addressing mode after calculating index in B and loading rookOffset into A, to load the bitboard. This would have the advantage over the method with
BitBoard *rookTable[64]; // pointers into t
attacks = rookTable[sqr][index];
that rookOffset needs only 4 bytes per element (so 256 bytes in total) while rookTable would need twice that with 8-byte pointers. But the price is awful code.
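A self-contained sketch of the pre-multiplied variant, next to the pointer table for comparison. The name rookByteOffset, the table sizes and the dummy fill are assumptions made for this example; the cast relies on the byte offsets always being multiples of 8, so the access stays aligned.

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Hypothetical sizes, for illustration only. */
enum { SQUARES = 64, ENTRIES = 16 };

static BitBoard t[SQUARES * ENTRIES];
static int rookByteOffset[SQUARES];     /* offsets in bytes, i.e. pre-multiplied by 8 */
static BitBoard *rookTable[SQUARES];    /* pointers into t */

static void init_tables(void)
{
    for (int i = 0; i < SQUARES * ENTRIES; i++)
        t[i] = (BitBoard)i;             /* dummy contents so lookups can be checked */
    for (int sqr = 0; sqr < SQUARES; sqr++) {
        rookByteOffset[sqr] = sqr * ENTRIES * (int)sizeof(BitBoard);
        rookTable[sqr] = t + sqr * ENTRIES;
    }
}

/* Byte-offset method: the address is base + offset + 8*index, which fits A+8*B. */
static BitBoard attacks_by_byte_offset(int sqr, int index)
{
    const char *base = (const char *)t;
    return *(const BitBoard *)(base + 8 * index + rookByteOffset[sqr]);
}

/* Pointer method, for comparison. */
static BitBoard attacks_by_pointer(int sqr, int index)
{
    return rookTable[sqr][index];
}
```

With a 4-byte int and 8-byte pointers, sizeof rookByteOffset is 256 and sizeof rookTable is 512, which is the space difference mentioned above.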
@Gerd: I hope that the above also answers your question. The table[pieceType] that I wrote before was just intended to symbolize hard-coded arrays rookTable and bishopTable, as of course in orthodox Chess those are the only two piece types for which it works this way, and Queens would have to be synthesized from them, while the leapers are just a lookup and don't have to bother with 'occupied'. But when you have more slider types you might actually want to do it that way. Although for slider-leaper compounds such as Chancellor, Archbishop, Dragon King or Dragon Horse you would probably just want to OR the Rook and Bishop attack-getter with the tabulated Knight or King moves to save space. But for Nightriders or asymmetric sliders like the Tori-Shogi Left and Right Quail you might want to have separate tables.
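A rough sketch of how such compounds could be synthesized by ORing. Note that the slider getters here walk the rays square by square purely as a stand-in, to keep the example short and runnable; in a real engine they would be the table lookups discussed above. All function names and the a1 = 0, h8 = 63 square numbering are assumptions made for this example.

```c
#include <stdint.h>

typedef uint64_t BitBoard;

static const int rookDirs[4][2]   = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
static const int bishopDirs[4][2] = {{1, 1}, {1, -1}, {-1, 1}, {-1, -1}};

/* Stand-in for a table-driven slider lookup: follow each ray up to the first blocker. */
static BitBoard slide(int sqr, BitBoard occupied, const int (*dirs)[2])
{
    BitBoard attacks = 0;
    for (int d = 0; d < 4; d++) {
        int file = sqr % 8 + dirs[d][0];
        int rank = sqr / 8 + dirs[d][1];
        while (file >= 0 && file < 8 && rank >= 0 && rank < 8) {
            BitBoard bit = 1ULL << (rank * 8 + file);
            attacks |= bit;             /* the blocker square itself is attacked */
            if (occupied & bit) break;
            file += dirs[d][0];
            rank += dirs[d][1];
        }
    }
    return attacks;
}

static BitBoard rookAttacks(int sqr, BitBoard occ)   { return slide(sqr, occ, rookDirs); }
static BitBoard bishopAttacks(int sqr, BitBoard occ) { return slide(sqr, occ, bishopDirs); }

/* Leapers are a plain lookup and do not depend on 'occupied'. */
static BitBoard knightMoves[64];

static void initKnightMoves(void)
{
    static const int jumps[8][2] = {{1, 2}, {2, 1}, {2, -1}, {1, -2},
                                    {-1, -2}, {-2, -1}, {-2, 1}, {-1, 2}};
    for (int sqr = 0; sqr < 64; sqr++) {
        knightMoves[sqr] = 0;
        for (int j = 0; j < 8; j++) {
            int file = sqr % 8 + jumps[j][0];
            int rank = sqr / 8 + jumps[j][1];
            if (file >= 0 && file < 8 && rank >= 0 && rank < 8)
                knightMoves[sqr] |= 1ULL << (rank * 8 + file);
        }
    }
}

/* Compounds are synthesized by ORing the component attack sets. */
static BitBoard queenAttacks(int sqr, BitBoard occ)
{
    return rookAttacks(sqr, occ) | bishopAttacks(sqr, occ);
}

static BitBoard chancellorAttacks(int sqr, BitBoard occ)
{
    return rookAttacks(sqr, occ) | knightMoves[sqr];
}

static BitBoard archbishopAttacks(int sqr, BitBoard occ)
{
    return bishopAttacks(sqr, occ) | knightMoves[sqr];
}
```

initKnightMoves() has to be called once before the Chancellor or Archbishop getters are used. A Dragon King or Dragon Horse would follow the same pattern with a tabulated King-move array in place of knightMoves.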