hgm wrote:
When data is packed to reduce the entry size below 16 bytes, to squeeze more than 4 in a cache line, the following scheme is very competitive:
typedef struct { // 8 bytes
    unsigned int lock;
    short int score;
    short int moveAndBoundType;
} HashData;

typedef struct { // 32 bytes
    unsigned char lockExt[3];
    char filler1;
    HashData data[3];
    unsigned int depth[3];
    char filler2;
} HashBucket;
You should really express such things with the stdint types instead; the above relies on sizes that vary between platforms. Plus, I think you chose the wrong type for your depth: the fields don't add up to 32 bytes even if you count int as always 32 bits.
Oops, you are right. I intended the depth to be (unsigned char). For now it seems we will have difficulty searching deeper than 255 ply...
Btw, I like this way of packing the move and the flags; it doesn't seem to cause any overhead. The & operations that test the flags don't care that the move is also in the word; only to get the hashMove do the flag bits have to be stripped off with an extra &. But the moveAndBoundType word appears in a register anyway for the bound-type testing, whereas a move stored in a separate word would have to be loaded there explicitly. Likewise, in the hash store you replace a separate store operation by an |.
Sven Schüle wrote:
So you might consider changing your algorithm for determining the hash table size in a way that also works for an entry size of, say, 24 bytes.