Typical data appear below. Stockfish's ratio of local to remote DRAM accesses was about 1.0 (155 M local, 154 M remote), while Cfish's was about 1.8 (113 M local, 63 M remote).
$ sudo ./pcm-numa.x -- Stockfish/src/stockfish bench 32768 20 15
Processor Counter Monitor: NUMA monitoring utility
IBRS and IBPB supported : yes
STIBP supported : yes
Spec arch caps supported : no
Number of physical cores: 20
Number of logical cores: 40
Number of online logical cores: 40
Threads (logical cores) per physical core: 2
Num sockets: 2
Physical cores per socket: 10
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 3100000000 Hz
IBRS enabled in the kernel : no
STIBP enabled in the kernel : no
Disabling NMI watchdog since it consumes one hw-PMU counter.
Package thermal spec power: 160 Watt; Package minimum power: 51 Watt; Package maximum power: 320 Watt;
Socket 0: 2 memory controllers detected with total number of 4 channels. 2 QPI ports detected. 0 M2M (mesh to memory) blocks detected.
Socket 1: 2 memory controllers detected with total number of 4 channels. 2 QPI ports detected. 0 M2M (mesh to memory) blocks detected.
Trying to use Linux perf events...
Successfully programmed on-core PMU using Linux perf
Socket 0
Max QPI link 0 speed: 19.2 GBytes/second (9.6 GT/second)
Max QPI link 1 speed: 19.2 GBytes/second (9.6 GT/second)
Socket 1
Max QPI link 0 speed: 19.2 GBytes/second (9.6 GT/second)
Max QPI link 1 speed: 19.2 GBytes/second (9.6 GT/second)
Detected Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX" stepping 2 microcode level 0x43
Update every 1.0 seconds
Executing "Stockfish/src/stockfish" command:
Stockfish 140619 64 BMI2 by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
[bench output snipped]
===========================
Total time (ms) : 7092
Nodes searched : 68440190
Nodes/second : 9650336
DEBUG: caught signal to interrupt (Child exited).
Program Stockfish/src/stockfish launched with PID: 4f8a
Program exited with status 0
Time elapsed: 9803 ms
Core | IPC | Instructions | Cycles | Local DRAM accesses | Remote DRAM Accesses
0 0.58 11 G 20 G 7190 K 8045 K
1 0.61 11 G 18 G 5861 K 7463 K
2 0.61 10 G 17 G 6128 K 8213 K
3 0.61 10 G 16 G 5554 K 7458 K
4 0.61 9512 M 15 G 4589 K 6401 K
5 0.64 8872 M 13 G 4508 K 7049 K
6 0.69 7852 M 11 G 3631 K 6492 K
7 0.67 7468 M 11 G 3747 K 5454 K
8 0.75 6727 M 8926 M 3103 K 5360 K
9 0.16 699 M 4254 M 1773 K 223 K
10 0.31 371 M 1179 M 337 K 258 K
11 0.29 872 M 2995 M 784 K 611 K
12 0.38 1151 M 3026 M 892 K 776 K
13 0.32 1605 M 5032 M 1277 K 1056 K
14 0.62 8970 M 14 G 5169 K 5108 K
15 0.28 2852 M 10 G 2402 K 2199 K
16 0.29 3433 M 11 G 2915 K 2347 K
17 0.31 3981 M 12 G 3208 K 2719 K
18 0.38 3798 M 10 G 2560 K 2218 K
19 0.42 8157 M 19 G 4689 K 4433 K
20 0.94 941 M 1006 M 1800 K 210 K
21 0.18 5459 K 29 M 39 K 29 K
22 0.27 15 M 57 M 74 K 70 K
23 0.26 26 M 99 M 148 K 70 K
24 0.31 8031 K 25 M 20 K 3740
25 0.17 685 M 4135 M 2038 K 36 K
26 0.25 163 M 654 M 1897 K 560 K
27 0.33 23 M 71 M 99 K 49 K
28 0.20 1387 M 6780 M 3821 K 376 K
29 0.46 111 M 244 M 621 K 203 K
30 0.57 12 G 21 G 7062 K 6910 K
31 0.58 12 G 21 G 6876 K 6867 K
32 0.58 12 G 21 G 6820 K 6717 K
33 0.57 12 G 21 G 6912 K 6463 K
34 0.35 4503 M 12 G 3546 K 2914 K
35 0.91 17 G 19 G 20 M 16 M
36 0.77 10 G 14 G 5006 K 5596 K
37 0.74 10 G 14 G 5357 K 5402 K
38 0.51 11 G 21 G 7316 K 6494 K
39 0.39 8709 M 22 G 5458 K 4940 K
-------------------------------------------------------------------------------------------------------------------
* 0.55 236 G 433 G 155 M 154 M
Cleaning up
Zeroed uncore PMU registers
Freeing up all RMIDs
Re-enabling NMI watchdog.
$ sudo ./pcm-numa.x -- Cfish/src/cfish bench 32768 20 15
Processor Counter Monitor: NUMA monitoring utility
IBRS and IBPB supported : yes
STIBP supported : yes
Spec arch caps supported : no
Number of physical cores: 20
Number of logical cores: 40
Number of online logical cores: 40
Threads (logical cores) per physical core: 2
Num sockets: 2
Physical cores per socket: 10
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 3100000000 Hz
IBRS enabled in the kernel : no
STIBP enabled in the kernel : no
Disabling NMI watchdog since it consumes one hw-PMU counter.
Package thermal spec power: 160 Watt; Package minimum power: 51 Watt; Package maximum power: 320 Watt;
Socket 0: 2 memory controllers detected with total number of 4 channels. 2 QPI ports detected. 0 M2M (mesh to memory) blocks detected.
Socket 1: 2 memory controllers detected with total number of 4 channels. 2 QPI ports detected. 0 M2M (mesh to memory) blocks detected.
Trying to use Linux perf events...
Successfully programmed on-core PMU using Linux perf
Socket 0
Max QPI link 0 speed: 19.2 GBytes/second (9.6 GT/second)
Max QPI link 1 speed: 19.2 GBytes/second (9.6 GT/second)
Socket 1
Max QPI link 0 speed: 19.2 GBytes/second (9.6 GT/second)
Max QPI link 1 speed: 19.2 GBytes/second (9.6 GT/second)
Detected Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX" stepping 2 microcode level 0x43
Update every 1.0 seconds
Executing "Cfish/src/cfish" command:
Cfish 160619 64 BMI2 NUMA by Syzygy based on Stockfish
info string NUMA enabled.
info string Binding thread 0 to node 0.
info string Binding thread 1 to node 0.
info string Binding thread 2 to node 0.
info string Binding thread 3 to node 0.
info string Binding thread 4 to node 0.
info string Binding thread 5 to node 0.
info string Binding thread 6 to node 0.
info string Binding thread 7 to node 0.
info string Binding thread 8 to node 0.
info string Binding thread 9 to node 0.
info string Binding thread 10 to node 1.
info string Binding thread 11 to node 1.
info string Binding thread 12 to node 1.
info string Binding thread 13 to node 1.
info string Binding thread 14 to node 1.
info string Binding thread 15 to node 1.
info string Binding thread 16 to node 1.
info string Binding thread 17 to node 1.
info string Binding thread 18 to node 1.
info string Binding thread 19 to node 1.
[bench output snipped]
===========================
Total time (ms) : 1664
Nodes searched : 55178288
Nodes/second : 33160028
DEBUG: caught signal to interrupt (Child exited).
Program Cfish/src/cfish launched with PID: 530c
Program exited with status 0
Time elapsed: 7283 ms
Core | IPC | Instructions | Cycles | Local DRAM accesses | Remote DRAM Accesses
0 0.77 6462 M 8391 M 3564 K 2585 K
1 0.11 913 M 8350 M 2224 K 174 K
2 0.46 7125 M 15 G 4889 K 2205 K
3 0.47 7255 M 15 G 5099 K 2382 K
4 0.45 7028 M 15 G 4877 K 2198 K
5 0.47 7203 M 15 G 4857 K 2223 K
6 0.46 7169 M 15 G 4823 K 2189 K
7 0.46 7212 M 15 G 4691 K 2036 K
8 0.48 7232 M 15 G 5040 K 2326 K
9 0.46 7226 M 15 G 5092 K 2247 K
10 0.14 49 M 351 M 196 K 52 K
11 0.13 40 M 314 M 122 K 66 K
12 0.46 7233 M 15 G 4214 K 2619 K
13 0.20 15 M 76 M 132 K 87 K
14 0.46 7273 M 15 G 4324 K 2803 K
15 0.46 7286 M 15 G 4321 K 2759 K
16 0.46 7292 M 15 G 4205 K 2637 K
17 0.46 7181 M 15 G 4110 K 2557 K
18 0.45 7207 M 15 G 4178 K 2616 K
19 0.45 7198 M 15 G 4197 K 2681 K
20 0.11 911 M 8044 M 2230 K 107 K
21 1.08 12 G 11 G 13 M 12 M
22 0.19 130 M 677 M 1814 K 576 K
23 0.13 72 M 544 M 1130 K 590 K
24 0.78 612 M 787 M 1583 K 290 K
25 0.36 79 M 223 M 269 K 92 K
26 0.39 75 M 195 M 198 K 55 K
27 0.38 34 M 88 M 95 K 24 K
28 0.22 270 M 1242 M 2350 K 996 K
29 0.18 72 M 412 M 1023 K 312 K
30 0.46 7255 M 15 G 4196 K 2615 K
31 0.46 7257 M 15 G 4181 K 2607 K
32 0.21 63 M 300 M 604 K 468 K
33 0.46 7262 M 15 G 4163 K 2603 K
34 0.20 16 M 80 M 141 K 104 K
35 0.21 14 M 70 M 122 K 80 K
36 0.18 18 M 102 M 214 K 107 K
37 0.15 27 M 182 M 348 K 253 K
38 0.19 15 M 79 M 156 K 62 K
39 0.35 56 M 162 M 370 K 189 K
-------------------------------------------------------------------------------------------------------------------
* 0.47 152 G 324 G 113 M 63 M
Cleaning up
Zeroed uncore PMU registers
Freeing up all RMIDs
Re-enabling NMI watchdog.
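For anyone who wants to check the ratios, the totals row (the one starting with *) of each table gives them directly. Here is a minimal Python sketch that parses a pcm-numa row and divides local by remote DRAM accesses. The function names are mine, and I am assuming the K/M/G suffixes are decimal multipliers; the two totals rows are copied from the runs above.

```python
# Assumption: pcm-numa prints K/M/G as decimal multipliers (1e3, 1e6, 1e9).
UNITS = {"K": 1e3, "M": 1e6, "G": 1e9}

def parse_row(line):
    """Split one pcm-numa table row into (core, ipc, instr, cycles, local, remote)."""
    tokens = line.split()
    core, ipc = tokens[0], float(tokens[1])
    values = []
    i = 2
    while i < len(tokens):
        number = float(tokens[i])
        i += 1
        # A unit suffix is optional: small counts are printed as bare numbers.
        if i < len(tokens) and tokens[i] in UNITS:
            number *= UNITS[tokens[i]]
            i += 1
        values.append(number)
    instructions, cycles, local, remote = values
    return core, ipc, instructions, cycles, local, remote

def local_remote_ratio(line):
    """Local DRAM accesses divided by remote DRAM accesses for one row."""
    *_, local, remote = parse_row(line)
    return local / remote

# Totals rows copied from the two runs above.
STOCKFISH_TOTALS = "* 0.55 236 G 433 G 155 M 154 M"
CFISH_TOTALS = "* 0.47 152 G 324 G 113 M 63 M"

if __name__ == "__main__":
    print(f"Stockfish: {local_remote_ratio(STOCKFISH_TOTALS):.1f}")
    print(f"Cfish:     {local_remote_ratio(CFISH_TOTALS):.1f}")
```

The same parse_row helper works on the per-core rows, including ones where a count has no unit suffix. Keep in mind that the totals are already rounded to whole M, so the ratios are only good to about one decimal place.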
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 20 21 22 23 24 25 26 27 28 29
node 0 size: 32112 MB
node 0 free: 30551 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 30 31 32 33 34 35 36 37 38 39
node 1 size: 32230 MB
node 1 free: 30749 MB
node distances:
node 0 1
0: 10 21
1: 21 10
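The numactl -H listing is what tells you which logical cores sit on which node. A small sketch, again in Python with hypothetical helper names, that turns the "node N cpus:" lines into a lookup table. It assumes the core numbers in the pcm-numa tables are the same CPU ids that numactl reports.

```python
import re

# Relevant part of the numactl -H output shown above.
NUMACTL_OUTPUT = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 20 21 22 23 24 25 26 27 28 29
node 0 size: 32112 MB
node 0 free: 30551 MB
node 1 cpus: 10 11 12 13 14 15 16 17 18 19 30 31 32 33 34 35 36 37 38 39
node 1 size: 32230 MB
node 1 free: 30749 MB
"""

def cpus_by_node(text):
    """Return {node: [cpu, ...]} from the 'node N cpus:' lines."""
    nodes = {}
    for line in text.splitlines():
        match = re.match(r"node\s+(\d+)\s+cpus:\s*(.*)", line.strip())
        if match:
            nodes[int(match.group(1))] = [int(c) for c in match.group(2).split()]
    return nodes

def node_of_cpu(text):
    """Invert the mapping: {cpu: node}."""
    return {cpu: node
            for node, cpus in cpus_by_node(text).items()
            for cpu in cpus}

NODE_OF = node_of_cpu(NUMACTL_OUTPUT)

if __name__ == "__main__":
    for cpu in (9, 10, 29, 30):
        print(f"cpu {cpu} -> node {NODE_OF[cpu]}")
```

With this table you can group the per-core rows by node, e.g. cores 0-9 and 20-29 belong to node 0 on this machine, and cores 10-19 and 30-39 to node 1.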