I've just implemented a fully legal move generator in Colossus and have been using PERFT to verify the correctness. All the numbers are good now but I noticed a weird effect with increasing transposition table size. As the size increased the time taken reduced as you would expect. However, when I got to 256MB table size the time taken started increasing!
(I was running a PERFT 7 search from the starting position. All searches returned the correct 3,195,901,860 count. The time taken to allocate and initialise the table memory is not included in the timing.)
MB Time(ms)
0 30890
1 11016
2 10328
4 9828
8 9046
16 8250
32 7734
64 7657
128 7579
256 7703 (!)
512 7985
1024 8484
2048 8594
4096 8734
Processor AMD Athlon(tm) X4 860K Quad Core Processor 3.70 GHz
Installed RAM 16.0 GB
System type 64-bit operating system, x64-based processor
Does anybody have any suggestions as to why this might be happening?
The only thing I could think of was an L1/L2 cache issue where it's filling them with rarely used entries from the larger tranposition table in preference to more commonly used stuff?!
(Windows task manager showed the total memory load never exceeded about 9GB, well below the 16GB installed.)
PERFT transposition table funny?!
Moderators: hgm, Rebel, chrisw
-
- Posts: 69
- Joined: Thu Nov 21, 2013 12:37 am
- Location: Manchester, UK
- Full name: Martin Bryant
-
- Posts: 300
- Joined: Mon Apr 30, 2018 11:51 pm
Re: PERFT transposition table funny?!
Check performance counters, but it sounds like a case of TLB trashing (ie., your hash table is so large that the CPU can no longer keep efficiently track of where its memory pages are). Huge pages would help if that's the issue, but I have no idea whether such an old CPU supports it.
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: PERFT transposition table funny?!
In my engine, I observed the same effect, but later, at 1024 MB:MartinBryant wrote: ↑Sat Apr 10, 2021 2:25 pm I've just implemented a fully legal move generator in Colossus and have been using PERFT to verify the correctness. All the numbers are good now but I noticed a weird effect with increasing transposition table size. As the size increased the time taken reduced as you would expect. However, when I got to 256MB table size the time taken started increasing!
Code: Select all
MB ms
0 93557
1 67264
2 56872
4 44286
8 35723
16 29090
32 23101
64 19454
128 17848
256 16547
512 15872
1024 16270 (!)
2048 16530
4096 16990
8192 17388
16384 18070
Maybe supporting huge pages solves this issue, but I haven't looked into that yet. As long as I don't do multi-threading, and the CPU the engine is running on is not one of the very latest models, it can't fill up the 256 MB hash table in my testing games. As I prune more moves, I expect even less moves will end up in the hash table, so it becomes less of an issue. I'll probably start looking into that in the (far) future when I'm going to test with 8+ cores and I need 256 MB of hash per core.
PS: Rustic has a pseudo-legal move generator (for now), and the CPU is an i7-6700K. The system has 32GB RAM.
-
- Posts: 69
- Joined: Thu Nov 21, 2013 12:37 am
- Location: Manchester, UK
- Full name: Martin Bryant
Re: PERFT transposition table funny?!
I think you could be right! I just had a quick look and with a 4GB table about 99% of the entries aren't used
Not very good for caching
Thank you for your help!
Not very good for caching
Thank you for your help!
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: PERFT transposition table funny?!
By the way: Rustic is using a hash table that has a bucket with four slots at each index. Using buckets in the hash table makes the thing a lot more efficient, because you can actually store multiple positions at the same index, which you can't do if you don't use buckets. That means that without buckets, you need to have a much bigger hash table to be able to store the same number of positions. (Excuse me if your engine already uses buckets; in that case, there isn't an improvement option there.)
-
- Posts: 160
- Joined: Tue Jan 23, 2018 10:18 am
- Location: Rotterdam
- Full name: Ronald Friederich
Re: PERFT transposition table funny?!
Using a bucket in the hashtable doesn't give you more entries, if you don't use a bucket in your case it will just get you 4 times as much entries. The advantage of using a bucket system is that you have more influence on the replacement of already used entries (although replacements will occur 4 times as much)..