Are Bitboards More Intoxicating Than They Are Good?

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 24, 2021 7:49 am

Around the year 2004 I started writing my first real and complete chess engine. Then just after it was up and running I discovered bitboards. I thought wow they are so in tune with the 64 squares that they must be way more efficient. Especially in Qsearch. So, I started from scratch and wrote a bitboard engine (RomiChess) using classic style bitboards. It was disappointing that it ran so much slower than my first start which was named Carnivor. But Romi also did more like hash tables and some endleaf pawn structure evaluation. Carnivor only evaluated material and piece square table lookups. So the fact that on today's hardware (3950x, 4.2 GH, single threaded) Carnivor searches 25 million nodes per second in the original position and RomiChess searches only 7.5 million nodes per second should not bother me because Romi does more. Okay. but now fast forward 17 years and my new bitboard engine, Bricabrac, is also right now only material and piece square tables just like Carnivor. They should be kinda close in performance, right? But they are not close at all. Their perft nodes per second are very close within a couple million nodes per second. It is in a real search that the node rates are far away from each other. Like was said above in the original position Carnivor searches 25 million nodes per second. Bricabrac only searches just 10 million nodes per second. In Carnivor there is a GNUChess4 (and below) style move generator. It is very fast. Here is the code for the queen.

Code: Select all

case QUEEN:
             sbns=qns+qol[fs];
             sbnd=qnd+qol[fs];
             ts=*(sbns+fs);
             while(ts<64)
             {
               switch(wtrgt[brd[ts]])
               {
                 case VACANT:
                        link(MOVE);
                        ts=*(sbns+ts);
                        continue;
                 case FRIEND:             
                        ts=*(sbnd+ts);
                        continue;
                 case ENEMY:
                        link(CAPTURE);
                        ts=*(sbnd+ts);
                        continue;
                 default:
                        return FALSE;
               }                            
             }  
             continue;

It has been said many times by many people that bitboards are faster for two reasons. They are faster in Qsearch because they do not generate moves to unoccupied squares. And they are faster in the evaluation like for masking off a group of squares to check for passed pawns, etc. The question then is, are they so much faster in special cases that it makes up for their slow speed in generating moves during normal search. Or am I just doing something wrong in the Bricabrac search? Or are bitboards just a little too intoxicating?

Sven · Post by **Sven** » Wed Feb 24, 2021 9:17 am

When comparing the raw speed of two different engines, there may be a lot more to consider than just mailbox vs bitboards. So is everything else really "the same" between Carnivor and Bricabrac? E.g. search features, move ordering, node counting, PST implementation, legality checking, incremental vs. from scratch calculations, data structures, piece encoding, programming style, compiler, compiler options, ...?

hgm · Post by **hgm** » Wed Feb 24, 2021 9:39 am

You can always make something look fast by comparing it to something that is unnecessary slow. My mailbox engine Inferno is about 100 times faster than the competing bitboard engine mShogi (500knps vs 5knps). But of course Inferno doesn't generate any moves to unoccupied squares at all, in QS. While mShogi loops over all squares of its (256-square) bitboard to extract the few moves that are in there. So the verdict 'fast' or 'slow' is mainly determined by what you use as a reference.

So many of the arguments used to argue why bitboards should be fast are typical 'straw men'. Of course it slows you down a lot when you generate all non-captures in QS. But there is no need to do that with mailbox, any more than there is a need to loop over all squares of a bitboard to extract the 1 bits. And as for determining passers: in a fast implementation of an engine this info comes from the Pawn hash table, and is reused from there so often that the effort to calculate it has zero impact in any case.

To make bitboards look fast you have to compare them to really poor and simplistic mailbox implementations.

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 24, 2021 9:48 am

Sven wrote: ↑Wed Feb 24, 2021 9:17 am When comparing the raw speed of two different engines, there may be a lot more to consider than just mailbox vs bitboards. So is everything else really "the same" between Carnivor and Bricabrac? E.g. search features, move ordering, node counting, PST implementation, legality checking, incremental vs. from scratch calculations, data structures, piece encoding, programming style, compiler, compiler options, ...?

Very similar. Except Carnivor was compiled with Turbo C 1.0 and Bricabrac was compiled with MSVC 2019 Community. They both generate moves and put them in a huge stack with begin and end kept in arrays indexed by ply. The biggest difference is Bric is already set up for mp and therefore all data is in a thread structure whereas Carnivor's data is all global arrays. I get what you are saying, 'there are too many variables of which you have no knowledge of so how could you possibly answer the question'. But you also do not add an opinion based on your own experience! But, there are some people that do have opinions based on experience that they can give that might help form a consensus.

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 24, 2021 9:48 am

Sven wrote: ↑Wed Feb 24, 2021 9:17 am When comparing the raw speed of two different engines, there may be a lot more to consider than just mailbox vs bitboards. So is everything else really "the same" between Carnivor and Bricabrac? E.g. search features, move ordering, node counting, PST implementation, legality checking, incremental vs. from scratch calculations, data structures, piece encoding, programming style, compiler, compiler options, ...?

Very similar. Except Carnivor was compiled with Turbo C 1.0 and Bricabrac was compiled with MSVC 2019 Community. They both generate moves and put them in a huge stack with begin and end kept in arrays indexed by ply. The biggest difference is Bric is already set up for mp and therefore all data is in a thread structure whereas Carnivor's data is all global arrays. I get what you are saying, 'there are too many variables of which you have no knowledge of so how could you possibly answer the question'. But you also do not add an opinion based on your own experience! But, there are some people that do have opinions based on experience that they can give that might help form a consensus.

Mike Sherwin · Post by **Mike Sherwin** » Wed Feb 24, 2021 10:03 am

hgm wrote: ↑Wed Feb 24, 2021 9:39 am You can always make something look fast by comparing it to something that is unnecessary slow. My mailbox engine Inferno is about 100 times faster than the competing bitboard engine mShogi (500knps vs 5knps). But of course Inferno doesn't generate any moves to unoccupied squares at all, in QS. While mShogi loops over all squares of its (256-square) bitboard to extract the few moves that are in there. So the verdict 'fast' or 'slow' is mainly determined by what you use as a reference.

So may of the arguments used to argue why bitboards should be fast are typical 'straw men'. Of course slows you down a lot when you generate all non-captures in QS. But there is no need to do that with mailbox, any more than there is a need to loop over all squares of a bitboard to extract the 1 bits. And as for determining passers: in a fast implementation of an engine this info comes from the Pawn hash table, and is reused from there so often that the effort to calculate it has zero impact in any case.

To make bitboards look fast you have to compare them to really poor and simplistic mailbox implementations.

I agree. It is my experience that bitboard is not fast.

hgm · Post by **hgm** » Wed Feb 24, 2021 10:23 am

That of course doesn't mean that any mailbox implementation will be fast automatically; you really have to work hard to make it fast. And you should not just worry about move generation, which in a complete engine only takes a minor fraction of the time.

E.g. I also do not see the need to put moves on a stack, in QS. (And most nodes are QS...). You could store them as a bitmap of 16 (your own pieces) x 16 (possible victims) = 256 bits, divided over four 64-bit words. Even when you would generate the captures from scratch in every node, you could start clearing the words, and then set the bit corresponding to every capture you find (sth like captureSet[victim2word[victim]] |= attacker2bit[piece][victim]). If you assigned the bits so that they get extracted in MVV/LVA order, you don't need any capture sorting either.

mvanthoor · Post by **mvanthoor** » Wed Feb 24, 2021 10:49 am

Mike Sherwin wrote: ↑Wed Feb 24, 2021 10:03 am I agree. It is my experience that bitboard is not fast.

In my (limited) experience, they are easier to make fast, though.

When programming an engine that uses an array with numbers in it for the board representation, you're always looping around the board and pieces; you'll have to create extra features to optimize this, such as piece lists.

With a bitboard engine, I find things to be easier, at least for my brain. If I want to know the moves of a knight on a certain square, I can just look it up in the knight/square lookup table, and it'll return a bitboard with all valid moves for that knight on that square. If I want to know if a file is open, I can just do "if occupancy & <mask_for_file> == 0 { ... }", and I'll know.

You can cache/store many things for lookup in both bitboard and mailbox engines, but in my experience, when working with bitboards, it's easier to keep things fast.

PS: I think you're using some strange way of measuring nodes/sec. 25 MNps is REALLY fast. Not even engines like Stockfish can reach that (or have a very hard time reaching it) on current-day hardware when running on a single thread. My own engine runs perft at 36-40 million leaves/sec on my computer, but it "only" searches at 3-5 M nodes/sec depending on the position. Search is about 10 times slower than perft. (And I only do material counting and psqt evaluation, and both are even incremental.)

Henk · Post by **Henk** » Wed Feb 24, 2021 10:57 am

My bitboard move generation costs at least 600 lines of code (to worry about). The less code the better.
No doubt I hate code. Always looks ugly.

mvanthoor · Post by **mvanthoor** » Wed Feb 24, 2021 11:17 am

Henk wrote: ↑Wed Feb 24, 2021 10:57 am My bitboard move generation costs at least 600 lines of code (to worry about). The less code the better.
No doubt I hate code. Always looks ugly.

That's bullocks. Less code is not automatically better. My engine has a lot of lines of code for what it does. The reason is that I work like this:

Code: Select all

let x = some_useful_value();
let y = this_is_whatever();
let a = get_some_stuff();
let special = (x * y) + a;
let result = use_this_somewhere(special);

return result;

I could also have written:

Code: Select all

return use_this_somewhere((some_useful_value() * this_is_whatever()) + get_some_stuff());

That's less code, but it's also unreadable if you keep this up for some time. It's like writing a book without interpunction or paragraphs. I write the first version above; the compiler will condense it into the second version, all on its own.

Are Bitboards More Intoxicating Than They Are Good?

Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?

Re: Are Bitboards More Intoxicating Than They Are Good?