hgm wrote: How does that compare to qperft?
Quick Perft by H.G. Muller
Perft mode: No hashing, bulk counting in horizon nodes
perft( 1)= 20 ( 0.000 sec)
perft( 2)= 400 ( 0.000 sec)
perft( 3)= 8902 ( 0.000 sec)
perft( 4)= 197281 ( 0.000 sec)
perft( 5)= 4865609 ( 0.020 sec)
perft( 6)= 119060324 ( 0.580 sec)
perft( 7)= 3195901860 (15.420 sec)
perft( 8)= 84998978956 (415.940 sec)
real 7m13.652s
user 7m11.965s
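For readers who have not met the term, "bulk counting in horizon nodes" means that at depth 1 the program just counts the generated moves instead of making and unmaking each one. Below is a minimal sketch of the idea, not qperft's actual code. The "game" is a made-up toy (players alternately mark one empty cell of a 9-cell board, no win rule), and `legal_moves`, `perft_naive` and `perft_bulk` are hypothetical names chosen for illustration.

```python
# Toy perft with and without bulk counting.
# Hypothetical game: mark any empty cell of a 9-cell board, no win rule,
# so perft(n) = 9 * 8 * ... * (9 - n + 1).

CELLS = 9


def legal_moves(occupied):
    """Return the empty cell indices for a bitmask of occupied cells."""
    return [c for c in range(CELLS) if not occupied & (1 << c)]


def perft_naive(occupied, depth):
    """Count leaf nodes by making every move all the way down to depth 0."""
    if depth == 0:
        return 1
    return sum(perft_naive(occupied | (1 << m), depth - 1)
               for m in legal_moves(occupied))


def perft_bulk(occupied, depth):
    """Same count, but at the horizon (depth 1) only count the move list."""
    if depth == 0:
        return 1
    moves = legal_moves(occupied)
    if depth == 1:
        return len(moves)  # bulk count: no make/unmake for the last ply
    return sum(perft_bulk(occupied | (1 << m), depth - 1) for m in moves)


if __name__ == "__main__":
    for d in range(1, 5):
        print(d, perft_naive(0, d), perft_bulk(0, d))
```

Note that in chess this shortcut only works if the move generator produces strictly legal moves; with pseudo-legal generation each move at the horizon still has to be checked.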
Is frcperft-win64 really orders of magnitude faster than frcperft-win32?
Almost 3 times (x2.75), and even almost 4 times (x3.77) if you add SSE4.2 (i.e. popcount) to the 64-bit version.
Here are some frcperft results for perft 7:
single threaded, no hashing, mode=FAST, extract=BSF32, count=LOOP (i.e. the lux32 binary):
perft 7 3195901860 29.75s 107.4 mnps 31.7 ticks/move
single threaded, no hashing, mode=FAST, extract=BSF64, count=LOOP (i.e. the lux64 binary):
perft 7 3195901860 10.81s 295.6 mnps 11.5 ticks/move
single threaded, no hashing, mode=FAST, extract=BSF64, count=POPCNT (recompiled):
perft 7 3195901860 7.88s 405.6 mnps 8.4 ticks/move
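The ratios quoted above can be checked directly from the three timings. This is a small sketch of that arithmetic. The helper names are made up, and the clock estimate assumes that "ticks" means CPU clock cycles, which the output does not state.

```python
# Recompute mnps, speedup and implied clock from the frcperft perft(7) runs.

PERFT7 = 3_195_901_860  # node count reported for perft 7

# (seconds, ticks/move) as printed in the results above
RUNS = {
    "BSF32/LOOP": (29.75, 31.7),
    "BSF64/LOOP": (10.81, 11.5),
    "BSF64/POPCNT": (7.88, 8.4),
}


def mnps(nodes, seconds):
    """Million nodes per second."""
    return nodes / seconds / 1e6


def speedup(base_seconds, seconds):
    """How many times faster a run is than the baseline."""
    return base_seconds / seconds


def implied_clock_ghz(ticks_per_move, rate_mnps):
    """Clock rate implied by ticks/move, assuming one tick is one CPU cycle."""
    return ticks_per_move * rate_mnps / 1000.0


if __name__ == "__main__":
    base = RUNS["BSF32/LOOP"][0]
    for name, (secs, ticks) in RUNS.items():
        rate = mnps(PERFT7, secs)
        print(f"{name:13s} {rate:6.1f} mnps  x{speedup(base, secs):.2f}  "
              f"~{implied_clock_ghz(ticks, rate):.1f} GHz")
```

Under that assumption all three runs point to roughly the same clock, around 3.4 GHz, so the ticks/move column is consistent with the timings.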
oliperft is even more impressive here:
oliperft 7
7 0 668 3195901860
and with bsf64 and popcount in assembly language:
7 0 563 3195901860
i.e. only 5.63 sec for perft(7) without hashing (the third column of the oliperft output is the time in hundredths of a second).
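To make the extract=BSF and count=LOOP/POPCNT options a bit more concrete, here is a rough sketch of the bitboard operations involved. It is written in Python only to show the logic. The real programs use the hardware BSF and POPCNT instructions, and the assumption that count=LOOP is a clear-lowest-bit loop like the one below is mine, not something taken from the frcperft source.

```python
# Bitboard bit counting and bit extraction, logic only.

MASK64 = (1 << 64) - 1  # a bitboard is treated as a 64-bit unsigned value


def count_loop(bb):
    """Count set bits by clearing the lowest set bit once per iteration."""
    n = 0
    while bb:
        bb &= bb - 1  # clears the lowest set bit
        n += 1
    return n


def count_popcnt(bb):
    """Stand-in for a single hardware POPCNT on a 64-bit word."""
    return bin(bb & MASK64).count("1")


def lsb_index(bb):
    """Index of the lowest set bit (what BSF returns); bb must be nonzero."""
    return (bb & -bb).bit_length() - 1


def squares(bb):
    """Serialise a bitboard into square indices, lowest bit first."""
    out = []
    while bb:
        out.append(lsb_index(bb))
        bb &= bb - 1
    return out


if __name__ == "__main__":
    corners = 0x8100000000000081  # a1, h1, a8, h8 with a1 = bit 0
    print(count_loop(corners), count_popcnt(corners), squares(corners))
```

The loop costs one iteration per set bit, while POPCNT is a single instruction. On a 32-bit CPU each 64-bit bitboard also has to be handled as two 32-bit halves, which is one likely reason the BSF32 build is so much slower.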
When I translate your time of 216 sec for perft(8) to perft(7), which should be some 30 times faster, it would be something like 7.5 sec, while I get 110 sec with frcperft-win32. Of course my machine is pretty slow, but not 15 times slower than yours...
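The scaling estimate can be made a little tighter by using the actual perft(8)/perft(7) node ratio from the qperft output instead of a round factor of 30. This is a quick sketch of that arithmetic, assuming the time per node is the same at both depths. The function name is made up for illustration.

```python
# Scale a perft(8) time down to perft(7) using the real node-count ratio.

PERFT7 = 3_195_901_860
PERFT8 = 84_998_978_956


def scale_to_perft7(perft8_seconds):
    """Estimate the perft(7) time, assuming constant time per node."""
    return perft8_seconds * PERFT7 / PERFT8


if __name__ == "__main__":
    ratio = PERFT8 / PERFT7
    est = scale_to_perft7(216.0)
    print(f"node ratio perft(8)/perft(7): {ratio:.1f}")
    print(f"estimated perft(7) time:      {est:.1f} sec")
    print(f"110 sec is slower by a factor of {110.0 / est:.1f}")
```

The node ratio is closer to 27 than to 30, so the estimate comes out at about 8 sec and the gap to 110 sec is about 13.5 times. The conclusion is the same either way.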
If your CPU supports neither 64-bit mode nor the popcount instruction, it is possible.
I really should start working on getting that extra factor 10 on qperft...
I wonder what performance one could achieve by adding a hash table to oliperft, or by recompiling JetChess to use 64-bit functionality and popcount?