I did some more work with the table driven move generation thing and managed do some more improvement.
Here are the ideas:
1. Pawn attack masks are hashed per two ranks, which means you require 16 bits for two ranks. Captures are excluded because capture attack masks are sparsely populated, and need to be separately hashed. Here's the pseudocode for white (note that I'm using uint16_t for indexing):
Code: Select all
uint16_t pawn_attacks_rank_3_4_idx = (single_pushes | double_pushes) >> 16;
uint16_t pawn_attacks_rank_5_6_idx = single_pushes >> 32;
uint16_t pawn_attacks_rank_7_8_idx = (single_pushes | non_cap_promotion) >> 48;
add_moves(PawnHashTable1[WHITE][pawn_attacks_rank_3_4_idx]);
// ...
Table sizes
---
For pawn attacks masks on rank 3 and 4, max possible moves are 16, so the table size is ~4.35MB.
For pawn attacks masks on rank 5 and 6, max possible moves are 8, so the table size is ~2.25MB.
For pawn attacks masks on rank 7 and 8, max possible moves are 32, so the table size is ~8.25MB.
Therefore the total table size of the three tables amounts to ~14.75MB.
Table sizes can be drastically brought down by using additional redirection to allocate only the required memory for each movelist.
Results
---
I must say, table-driven move generation paid off more than I expected it to. With no other changes in the source code except using this technique (and using movelists for knights and sliders as well), I got a 25-30% speedup compared to my implementation of magic bitboards, when there are all pawns on the board. The speedup also touched 39% for some positions! Of course you can test for yourself, the source code and a lot of test positions are provided.
I also want you to note how 32-bit friendly this approach is, since there are no multiplications involved, and the shifts are quite 32-bit friendly as well. It'd be interesting to test this technique for 32-bits and see the results.
Source Code
---
Here you go: https://sites.google.com/site/sydfhd/projects/yaka
Download the file "YakaPawnMoveLists.zip", be careful because there are some similar looking names.
See files attacks.cpp (lines 62-173) for initialization code, and also movegen.h (lines 216-288) for seeing it all in action.
Also, you can go "bench 6 ToughGuys ToughGuysOut.txt" in each executable to test the speed.
Postscript
---
Be alarmed people, there is something seriously faster than magic bitboards in town!