Jacob wrote:Had I a dime for each little mistake I've made, I could buy myself my own computer >_<.
Sure, it'd be fun to look through the source of the hashing versions. There used to be a site that listed perft values for several positions along with check counts, captures, e.p., etc. and it was very useful for debugging. Unfortunately it seems to be gone now, but if you combine the versions I could use it to validate my results, and I'd make a web page of my own.
Yes, this was the SMIRF website. I verified all my perfts against that, also the move-type specifications (except that my perft did not report checks and mates).
OK, I put a version with hash on my website, now, and it seems to work. I use the full 64 bits of the key, so although key collisions are possible in theory, they should be very unlikely. The opening perfts seem to be OK upto d=8. (I did not test deeper.) Hashing with a reasonable size table (say 128MB) speeds up the perft(7) by more than a factor 5 (53 sec to 9.6 sec on a 2.8GHz Pentium IV).
Funny thing is that I started using a very simplistic hashing scheme, (replace always) with the aim to first achieve correctness first, before optimizing, and then it turned out I could hardly improve on it by smarter replacement.
So it still uses replace always. Depth does not provide upward compatibility for perfting, so I just work the depth into the key. The same position can then be in the table several times, with different depth. As a consequence I don't have to store the depth in the entry, except for implementing things as depth-preferred replacement. So I decided not to bother.
There is some implicit depth-prefference, though: The d=1 nodes (which count the moves to determine how many d=0 nodes there would be, but except for a few exceptions were legality checking was difficult otherwise, it does not really make these moves) can be calculated so fast, that probing the main hash for them is actually counterproductive. The average DRAM access takes far longer than running the move generator. So I have a separate small hash table, of only 128KB (8K entries) that is so frequently probed that it will always remain in L2 (which was 512KB on my machine). All d=1 probes go there, and as they are always L2 hits, this really pays off (despite the lower hash hit rate).
The lowest depth to go into the main hash is thus d=2. I confined these probes to half the hash table (the odd entries), to protect the deeper stuff that goes all into the other half. Trying to give d=4 or d=5 more protection than d=3 did not seem to speed things up.
If you find any errors, let me know. 
