Computer: I use my old PC with an AMD Ryzen 7 1700 eight-core processor (3.00 GHz), 16.0 GB RAM, and Windows 10 Pro. I guess it can run AVX2 code fine, but not at high performance. I also have a newer laptop (MacBook Pro M1 Max), but some programs, such as chessbit and Gigantua, are optimised specifically for Intel PCs and cannot be compiled for my laptop.
Perft depth 8 for start position: I don’t want to wait too long, and that depth is reasonable for all programs. It looks like a sweet spot where elapsed times are just right (neither too short nor too long). I focused on computing Perft for the start position only and ignored all other positions. Later, I may make another comparison using multiple positions. The good news is that all programs produce the correct node count for that Perft.
Perft speed-up techniques: So far, I know these techniques to speed up Perft: 1) bulk counting, 2) hashing, 3) threading, 4) C++ templates (used to expand the code at compile time, reducing conditions/branches to gain speed), 5) optimisations using bit-manipulation instructions (BMI) on modern hardware to speed up move generators, especially magic-bitboard ones. Typically, the first three methods improve Perft speed significantly, while the last two may bring some gains. All programs in this comparison are open source. I quickly studied the code of each one to understand how it implements Perft.
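To make the first two techniques concrete, here is a minimal Python sketch. It is not taken from any of the programs in this comparison. A real chess move generator is far too long to show, so a hypothetical toy game (a pile of counters, each move removes 1, 2 or 3) stands in for it; legal_moves, make_move and the three perft_* functions are names made up for illustration.

```python
from functools import lru_cache

# ASSUMPTION: a toy "position" is just a pile size n, and a move removes
# 1, 2 or 3 counters. This stands in for a real chess move generator.
def legal_moves(n):
    return [k for k in (1, 2, 3) if k <= n]

def make_move(n, k):
    return n - k

def perft_plain(n, depth):
    """Reference version: walk down to depth 0 and count leaves one by one."""
    if depth == 0:
        return 1
    return sum(perft_plain(make_move(n, k), depth - 1) for k in legal_moves(n))

def perft_bulk(n, depth):
    """Bulk counting: at depth 1 the leaf count is the number of legal
    moves, so the last ply is never actually played."""
    if depth == 0:
        return 1
    moves = legal_moves(n)
    if depth == 1:
        return len(moves)
    return sum(perft_bulk(make_move(n, k), depth - 1) for k in moves)

@lru_cache(maxsize=None)
def perft_hashed(n, depth):
    """Bulk counting plus hashing: results are cached per (position, depth),
    so a position reached by different move orders is counted only once."""
    if depth == 0:
        return 1
    moves = legal_moves(n)
    if depth == 1:
        return len(moves)
    return sum(perft_hashed(make_move(n, k), depth - 1) for k in moves)

if __name__ == "__main__":
    for fn in (perft_plain, perft_bulk, perft_hashed):
        print(fn.__name__, fn(30, 8))
```

Starting from a pile of 30, every position within 8 plies still has three moves, so all three versions return 3^8 = 6561. The hashed version reaches that count with far fewer calls, because many move orders lead to the same pile size (the toy equivalent of chess transpositions).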
1) Stockfish 17.1 by Stockfish team
https://github.com/official-stockfish/Stockfish
Stockfish is the only chess engine in this comparison; all the other programs are dedicated Perft tools. I can’t compile the latest development version on my PC. Thus, I downloaded and ran the exe file of version 17.1 (the file stockfish-windows-x86-64-avx2.exe).
Stockfish uses straightforward Perft code with bulk counting, but without hashing or multithreading. It is also well optimised for modern hardware and uses a lot of C++ templates to gain speed. I use Stockfish as the baseline. Surprisingly, Stockfish is only the second slowest, not at the bottom of the comparison list.
2) qperft (Quick Perft) by H.G. Muller
https://home.hccnet.nl/h.g.muller/dwnldpage.html
The program was released perhaps 20 years ago, and the latest update was about 12 years ago (based on some forum discussions).
I downloaded and ran the exe file (qperft.exe). Surprisingly, the oldest program still runs fine and takes a good position in the list, above two newer programs, including Stockfish. It seems to rely mainly on bulk counting and hashing for speed. However, it is not much faster than Stockfish, probably because it uses a mailbox board representation. From my experience, mailbox move generators are not easy to optimise for speed. It is the only program in the list that uses a mailbox.
3) BBPerft by Manik Charan
https://github.com/Mk-Chan/BBPerft
The project was updated 6 years ago. I downloaded it and compiled it using the make command.
From a glance at the code, it uses templates but none of the high-gain techniques (bulk counting, hashing, or multithreading). It isn’t optimised for modern hardware either. Thus, it is amazing that it is (a bit) faster than Stockfish.
4) Juddperft by Judd Niemann
https://github.com/jniemann66/juddperft
The project was updated recently. I downloaded and ran the exe file for 64-bit.
It is significantly faster than Stockfish, about 9 times faster. From a glance at the code, it uses all the high-gain techniques (bulk counting, hashing, and multithreading) to speed up Perft. However, it doesn’t use templates, and it is not optimised for modern hardware.
5) chessbit by Thomas Albert
https://github.com/thuijbregts/chessbit
The project was updated recently. I downloaded and ran the file chessbit_fastest.exe.
The program is faster than Stockfish. It is also faster than Gigantua, as the author claimed, although the gap is not large. His other claim, "The fastest Perft engine", did not hold up, at least in this test. From a glance at the code, the program is too complicated for me because it uses templates heavily. It is optimised for modern hardware. However, it looks like it doesn't use hashing or multithreading, which is why it cannot match the faster ones.
6) Gigantua by Daniel Infuehr
https://github.com/Gigantua/Gigantua
The project was updated 6 years ago. I downloaded and ran its exe file.
I knew there were some long, heated forum discussions about this program and the author's strong claims, such as "Worlds-fastest-Bitboard-Chess-Movegenerator" and "Worlds Fastest CPU Movegenerator". However, it is slower than Stockfish on my PC, and it is at the bottom of the list. All those claims did not hold up, at least in this test. It looks like the author tried to gain speed mostly through heavy templates and hardware optimisations. Perhaps my computer is too old for those optimisations. According to one of his forum posts, the program uses bulk counting but not hashing or multithreading, which may be the main reason it lags behind all the others. It is somewhat similar to BBPerft in the techniques it uses, but surprisingly, it is slower.
7) MPerft by Richard Delorme
https://github.com/abulmo/MPerft
The project was updated 6 years ago. I downloaded it and compiled it via the make command.
It is the fastest, significantly faster than the other programs in this comparison and about 17 times faster than Stockfish. On the screen, it clearly prints that it uses hashing and multithreading. A quick code study reveals that it doesn’t use templates and is not optimised for modern hardware.
The author has another, newer dedicated Perft program named hqperft, but it runs a bit slower on my computer.
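For readers unfamiliar with multithreaded Perft, one simple scheme is to split the moves at the root and count each subtree in a separate worker. The Python sketch below shows only that structure; it is not necessarily what MPerft or Juddperft do. It again uses a hypothetical toy game (a pile of counters, each move removes 1, 2 or 3) instead of chess, and perft_parallel is a made-up name. Note that in standard CPython, threads do not speed up pure-Python computation because of the global interpreter lock, so this is about the shape of the idea, not the gain.

```python
from concurrent.futures import ThreadPoolExecutor

# ASSUMPTION: toy game standing in for a chess move generator.
def legal_moves(n):
    return [k for k in (1, 2, 3) if k <= n]

def perft(n, depth):
    """Single-threaded Perft with bulk counting at the last ply."""
    if depth == 0:
        return 1
    moves = legal_moves(n)
    if depth == 1:
        return len(moves)
    return sum(perft(n - k, depth - 1) for k in moves)

def perft_parallel(n, depth, workers=4):
    """Root splitting: each root move's subtree is counted by its own task,
    and the partial counts are summed at the end."""
    if depth == 0:
        return 1
    children = [n - k for k in legal_moves(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda child: perft(child, depth - 1), children)
        return sum(partial_counts)

if __name__ == "__main__":
    print(perft_parallel(30, 8))
```

The subtrees are independent, so no locking is needed in this version. Combining it with a shared hash table is where real implementations have to deal with synchronisation.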
The table below lists the programs and their elapsed times to complete Perft 8 of the start position, tested on a PC with an AMD Ryzen 7 1700 eight-core processor (3.00 GHz), 16 GB RAM, and Windows 10 Pro:
#  program         elapsed (ms)
1  MPerft                 32032
2  Juddperft              62094
3  BBPerft               431687
4  chessbit              441761
5  qperft                516060
6  Stockfish 17.1        543070
7  Gigantua              636927
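The "times faster than Stockfish" figures can be rechecked from the table with a few lines of Python. The elapsed times are copied from the table; the node count for Perft 8 of the start position (84,998,978,956) is the well-known published value, not something printed in this post. The Mnps column is only an "effective" rate: programs that use hashing do not actually visit every node.

```python
# Elapsed times in milliseconds, copied from the table above.
elapsed_ms = {
    "MPerft": 32032,
    "Juddperft": 62094,
    "BBPerft": 431687,
    "chessbit": 441761,
    "qperft": 516060,
    "Stockfish 17.1": 543070,
    "Gigantua": 636927,
}

# ASSUMPTION: published node count of Perft 8 from the start position.
PERFT8_NODES = 84_998_978_956

baseline = elapsed_ms["Stockfish 17.1"]

# Speed-up of each program relative to Stockfish (values > 1 mean faster).
speedups = {name: baseline / ms for name, ms in elapsed_ms.items()}

# Effective million nodes per second: nodes per ms divided by 1000.
mnps = {name: PERFT8_NODES / ms / 1000 for name, ms in elapsed_ms.items()}

if __name__ == "__main__":
    for name in elapsed_ms:
        print(f"{name:<15} {speedups[name]:6.2f}x {mnps[name]:9.1f} Mnps")
```

Rounded, this gives about 17x for MPerft and about 9x for Juddperft, while Gigantua comes out below 1x (slower than the baseline).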


