JarvisXerxes wrote:Hi everyone,
It's my first post here ... Recently I've started my chess engine project in C++ (from scratch).
I just completed the very basics - the magic bitboards, movegen and make/unmake moves.
I choose to generate only pseudo-legal moves in movegen(), and leave the legality check to an external bool function.
Therefore my perft is something like: (U64 is unsigned long long)
Code: Select all
U64 perft(depth)
{
U64 nodes = 0;
if (depth == 0)
return 1;
moveGen(); // to an external buffer, all pseudo moves
for (int i = currentPointer; i < nexPointer; i++)
{
Move m = moveBuf[i];
makeMove(m);
if (!isOppositeKingAttacked()) // external legality check
nodes += perft(depth - 1);
unmakeMove(m); // restore everything exactly
}
return nodes;
}
Checking the perft tables on chess-pedia, I find that my code works accurately.
But the speed is extremely disappointing. Only around 3500 kn/s on my i7 Dell Inspiron with 8G RAM. I also compiled the qperft.c by H.G. Muller: his perft runs at a stunning 55000 kn/s, almost 1500% my speed.
I tried for two days in vain. I'd appreciate any advice the experts on this forum might give me.
Thanks in advance.
From my experience, I can give you the following advice:
1/ be organized. do not just modify the code like that. use a source code management system like git, and use patches and branches, as well as unit tests (anytime you modify the code check that your perft results *stay* correct). This may seem like overkill at first, but really pays off massively.
2/ do not try to guess, but always measure. this is true in general, but in the specific case of speed optimization, it means that you need to use a code profiler, and identify the bottlenecks. Only by doing that you will understand where the inefficiencies are.
3/ remember what matters, and do not spend too much time on unimportant things. you can always come back to those later. in the present case, you may not realize it yet, but perft raw speed has little impact on program strength and even speed in NPS. At the beginning it will, but the more things your engine does at each node (eval + complex search tricks) the less the perft raw speed will matter, and it eventually becomes a drop in the ocean. That being said, 3500 nodes per second is really too low. There must be some serious inefficiencies, and only 1/ will help you fix them (my crystal cannot tell what you're doing wrong).
PS: keep it clean and keep things independant and with no side effects (as much as possible). think of it this way: if I remove this line of code, do I have to read the whole program to understand what side effect it will have and what else will break? For example, writing code like this is a shootable offense
Code: Select all
moveGen(); // to an external buffer, all pseudo moves
For now, you know that this function has a side effect on a global array. But once you start to multiply things like that, even you will get lost in your own code, and make mistakes. Not to mention the day when you want to implement SMP, and global variables and side effects will be your worst nightmare.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.