Sven Schüle wrote:In "Plain" the square is actually used as an array index. None of these implementations uses HGM's proposal of a pointer array, though.
The code Gerd posts here uses a pointer rather than an integer offset...
Note that it is quite understandable/expected that it doesn't make much difference whether you use a compacted 800KB table through an extra level of irregular indirection, or a 2MB regular 2-dimensional array. The latter only wastes
address space. It doesn't occupy any extra
memory; it just leaves it unused. Unused memory does not cause cache pressure. As your program will hardly ever exactly need the available 4GB (or whatever people have nowadays), there will always be a lot of unused memory, and it doesn't make any difference whether all that unused memory is spread out in chunks within your table, or a contiguous range at the end of your program. Even a small Rook table for a single square takes 8KB, which completely fills two cache ways, so it is not like your array maps only in a sub-set of the cache, and leaves the rest unused. You don't even put more pressure on the TLB, when the table is page-aligned, as the boundaries between tables for squares coincide with page boundaries (for 4KB 'small' pages).
So the only difference is how to calculate the start address of the tables: load it from memory from the indirection table, or multiply. I think both operations have a maximum throughput of 1 per cycle, and a latency of 3 or 4 clocks. So it is really "lead for old iron", to use a Dutch expression.