asanjuan wrote:Hi all.
My engine is bitboard-based, but not using magics, nor rotated BB. I think it searches with a reasonable branching factor, but to my surprise and after reading some posts, i found that is quite slow in nps (2.5 Mnps without eval in a single intel core2).
Currently, my search only reaches 270knps. And as i said, it seems to me that is quite slow.
Maybe i'm using too much the well known msb and lsb routines to generate moves for sliding pieces, even in eval.
After reading the wiki, finally understood how magic bitboard works, and i'm planning to implement it into Rhetoric (my engine).
any advice from the experts?
(and please, don't say patience
)
Compared to magics, I've found rotated BB to be almost as fast. If you're just looking at a board + move generation code (ie. a pure perft) magics won't make a big difference. The only advantage of magics is that the occupancy is a single bitboard (i/o 4 different ones to maintain dynamically with rotated BB). It is not only faster but more elegant, and allows for set operations in O(1): example you want to remove all white pawns from thr occupance, just a single &~ and that's it, while rotated BB would inevitably end up in an ugly loop... I found that useful for the eval, hence the switch from rotated BB to magics in my engine.
If your engine performs perft calculations at an average of 2.5 Mnps, then that's quite slow indeed (mine averages around 25-30 Mnps and it's surely far from optimal). I don't think the bottle neck is that you don't use magics. I can't give you any precise advice, as I haven't looked at your code, but the best way to improve it is to run the compiler profiler and see what you can optimize. Of course it's good to rely on your intuition when you code to make sure your code is efficient, but when you have such a complex combination of interacting functions to optimize, forget your intuition and let the profiler teach you
. I've spent a fair amount of time profilng my perft, and it was definitely worth it.
Basically what I'm saying is that magics or rotated BB or whatever else is just a set of primitive operations, and the inefficieny is probably not in these operations but in the conception of the board and/or move generation that calls these primitive operations. Perhaps you should parse through the Stockfish code to see how a real artist does this! It should give you some idea on how to organize the board/movegen code efficiently.
Another area of improvement is the move sorting machinery. Let's say that your move ordering is the following:
1/ hash move
2/ non losing captures(*) by MVV/LVA or SEE
3/ killer moves
4/ losing captures by MVV/LVA or SEE
5/ quiet moves by history score
(*) "captures" include en passant captures and promotions
You need to make use of early beta cutoffs to implement a lazy move generation scheme. Examples:
* if the hash move triggers a beta cutoff, you can save the whole move generation (just have a function that tests the legality of the hash move). when the hash move exist it should be legal (with infinitesimal chances that it's not) and it will very often be the best move.
* if a non losing capture triggers a beta cutoff, you'll only have generated captures, and you'll save all the quiet move generation
* if a killer move generates a beta cutoff (again test the killer legality before, as it will often be illegal), same story, you can save the quiet move generation
* same story if a losing capture generates the cutoff
My movesorting machinery is the real bottleneck of my engine, and I plan to do what I describe above at some point, when the program is mature enough that raw speed really matters (if I ever reach that stage).