Comparison of bitboard attack-getter variants

Henk · Post by **Henk** » Tue Jan 05, 2016 3:06 pm

If it costs a huge amount of memory them it is not worth to try. There are 8 * 128 = 1024 combinations for horizontal moves on a row. That means 16 * 1024 bit boards for all horizontal and vertical moves.

If say each bit board would contain 10 moves then it would be about 160000 moves. If a move is represented by a 32 bits reference then it costs about 5 megabyte for straight moves.

hgm · Post by **hgm** » Tue Jan 05, 2016 3:56 pm

I don't think Sven implies that you have to pre-calculate the moves. You tabulate the bitboards, select the one you need depending on from-square and board occupancy, possibly masking away the moves you are not interested in (e.g. non-captures, or futile victims), and then you extract the remaining moves and store those. None of the mentioned techniques gives you any savings on move extraction, other than the masking away the uninteresting ones. Every move you are going to search will have to be extracted.

As Sven mentions, there is no acceptable way to assign a different sort key to moves that reside in the same bitboard. You have to split them into individual moves. And a bitboard is not really an efficient way to encode a single to-square, so you usually would convert it to a square number.

Of course if you are not interested in sorting the moves, but want to play them in move-generation order anyway (which you might for non-captures is you don't use the history heuristic), then you could skip storing and sorting the moves, and just play them as you extract them, bitboard after bitboard. For captures that is not acceptable, as this would play the moves ordered by attacker (e.g. LVA), while you primarily want to order them by victim (MVV). As any piece can attack any victim this forces you to first extract all captures before you can pick the first one you want to play.

This is why I prefer a representation where the bits representing moves are already residing in their word in MVV/LVA order. E.g. in a 64-bit word you could store 4 x 16 bits for all 16 possible attackers of four equi-valued victims (NNBB). An attackers set on a single victim would then be a 'comb' like 0x1111111111111111 x 1, 2, 4 or 8, with attackers in LVA extraction order. Pieces could easily inherit such sets of attackers from each other when an attacker replaces a victim during a capture, and pieces that have been captured can be ignored by masking attacker sets with a 'presence' mask like 0xFFFF0FFFFFF0FFF0F before using them. You can then just loop through the attackers sets and extract the moves in MVV/LVA order, so that you can immediately play them. With a slight refinement you could do that in two passes, masking away any H x protected L captures in the first pass, and playing those in a second pass.

Henk · Post by **Henk** » Tue Jan 05, 2016 4:19 pm

I first try something simple. But if I would look up the horizontal captures then the look up operation must be cheaper than computing 12 shifts, 10 bit wise or and 2 exclusive or operations. So if the table has a more expensive hash key computation it will fail.

Sven · Post by **Sven** » Tue Jan 05, 2016 4:51 pm

Henk wrote:I first try something simple. But if I would look up the horizontal captures then the look up operation must be cheaper than computing 12 shifts, 10 bit wise or and 2 exclusive or operations. So if the table has a more expensive hash key computation it will fail.

I think we are talking about very different things. Maybe I misunderstood your original question. I thought you were talking about move generation and storing its results in the move list. But now it appears to me you are talking about precalculation of move information to be used during move generation. Please correct me if I am wrong here.

Otherwise, please explain why you want to precalculate move information. Move generation inherently deals with a "from" and a "to" square, and there is almost no other information that you can store as a "move" entity in the move list, except for the very rare exceptional move types like ep, castling, promotion (but rare cases do not justify a process used for all cases).

So I wonder what you would save by precalculating something here. Which information would you want to store, then? The MVV/LVA score? That is about the only thing I can imagine that is actually calculated during move generation. But hey, what would you save then? Just a few multiplications and additions, that's it.

Henk · Post by **Henk** » Tue Jan 05, 2016 5:23 pm

I don't know much about magic bit boards.

But I thought it was something like looking up pre- calculated moves when you store all moves for each possible occupancy of say a row or column or diagonal. So perhaps I'm talking non sense here. Forget about it.

hgm · Post by **hgm** » Tue Jan 05, 2016 5:37 pm

What magic bitboards do is deliver bitboards (from a hash table) that contain sets of all to-squares for a given from-square, piece type and board occupancy. This usually includes non-captures, captures and pseudo-captures (i.e. of own pieces), as you can easily mask away those that you are not interested in. That is all. The calculation of the hash key is very cheap:

index = (mask[pieceType][fromSqr] & occupied) * magic[pieceType][fromSqr] >> 52;

Then you use that to fetch the pre-calculated board:

toSet = table[pieceType][fromSqr][index];

That is the magic of it. The mask only leaves the squares of the 'occupied' bitboard that the piece could reach on an empty board. The multiply constructs a hash key of those in the upper part of the uint64, which is then shifted to the lower part to index a contiguous array.

Sven · Post by **Sven** » Tue Jan 05, 2016 6:35 pm

Henk wrote:I don't know much about magic bit boards.

But I thought it was something like looking up pre- calculated moves when you store all moves for each possible occupancy of say a row or column or diagonal. So perhaps I'm talking non sense here. Forget about it.

Right in principle, only that all ray occupancies are considered together, i.e. all rook rays at once or all bishop rays at once, so no separation into different directories. Apart from that you are right and it is no nonsense at all.

So now I understand better your initial question on that topic (I hope so at least ...). You asked:

How many moves do you need to store to make magic bit boards work ? I think only storing the bit boards that contains the moves will be too slow. For you still have to extract the moves out of these bit boards.

In each table element (which is one bitboard) you store all rook attacks from a given square related to one of all relevant occupancies on all rook rays for that square. And the same for bishop attacks. Extracting the moves from that information is necessary during move generation but that is not very expensive. It would probably be much more expensive to store all moves instead, since you would either need a complex "dynamic" data structure for that which is prepared for variable-sized lists of precalculated moves, or a very huge data structure which can hold the maximum possible number of moves, i.e. 14 for rooks and 13 for bishops. 14 moves x 16 bits per move would require 28 bytes, compare that to 8 bytes for one bitboard. A typical size of the lookup table for magics is about 800 kB, storing 16-bit moves instead of bitboards would therefore require around 2.8 MB. I think that would slow down the program significantly since you would have to access much more memory during move generation. Furthermore this data structure would not be suitable for other purposes like evaluation where you do not focus on single moves but on dealing with attack sets represented by one 64 bit word. For that reason you would need the bitboards as well, for a total of about 3.6 MB instead of 800 kB lookup table. I think that's not appropriate.

hgm wrote:What magic bitboards do is deliver bitboards (from a hash table) that contain sets of all to-squares for a given from-square, piece type and board occupancy. This usually includes non-captures, captures and pseudo-captures (i.e. of own pieces), as you can easily mask away those that you are not interested in. That is all. The calculation of the hash key is very cheap:

index = (mask[pieceType][fromSqr] & occupied) * magic[pieceType][fromSqr] >> 52;

Then you use that to fetch the pre-calculated board:

toSet = table[pieceType][fromSqr][index];

That is the magic of it. The mask only leaves the squares of the 'occupied' bitboard that the piece could reach on an empty board. The multiply constructs a hash key of those in the upper part of the uint64, which is then shifted to the lower part to index a contiguous array.

Typically there are only two one-dimensional lookup tables of different size, one for rook attacks and one for bishop attacks (and they are concatenated in memory). The "fromSqr" is not used for indexing, it is already given implicitly by "index". The sizes differ due to different properties of rooks and bishops in combination with different squares on the board which lead to different numbers of required bits for the (perfect) hashing.

hgm · Post by **hgm** » Tue Jan 05, 2016 7:03 pm

Sven Schüle wrote:The "fromSqr" is not used for indexing, it is already given implicitly by "index".

I don't get that. The shift I use leaves only 12 index bits, which is not nearly enough to index all possible different sets of Rook attacks. It was adapted to the worst case of the corner (from-)squares; most squares would have to leave fewer bits, so that the shift could be square dependent, but I did not want to bother Henk with that refinement yet. Without pushing things you would have 4 corner squares that need 4K(-entry) tables, 24 edge squares that need 2K tables and 36 central squares that need 1K tables, 4*4+2*24+36=100K entries in total, times 8 (bytes/bitboard) = 800K. So each square does have its own table. As attack sets for one square are almost never usable for another square, there does not seem much advantage in merging everything into one big table and work the from-square in the index to address it.

Sven · Post by **Sven** » Tue Jan 05, 2016 8:21 pm

hgm wrote:
Sven Schüle wrote:The "fromSqr" is not used for indexing, it is already given implicitly by "index".
I don't get that. The shift I use leaves only 12 index bits, which is not nearly enough to index all possible different sets of Rook attacks. It was adapted to the worst case of the corner (from-)squares; most squares would have to leave fewer bits, so that the shift could be square dependent, but I did not want to bother Henk with that refinement yet. Without pushing things you would have 4 corner squares that need 4K(-entry) tables, 24 edge squares that need 2K tables and 36 central squares that need 1K tables, 4*4+2*24+36=100K entries in total, times 8 (bytes/bitboard) = 800K. So each square does have its own table. As attack sets for one square are almost never usable for another square, there does not seem much advantage in merging everything into one big table and work the from-square in the index to address it.

You have described almost perfectly how it works. In practice most people maintain a 64-entry offset table (one for rook and one for bishop) into the lookup table. Of course there are also other ways but as far as I know most people do it this way.

Also your example is for rooks. For bishops the amount of data is much smaller.

hgm · Post by **hgm** » Tue Jan 05, 2016 8:48 pm

OK, you mean that table[fromSqr][index] on a 64-bit architecture could be more expensive than t[index + offset[fromSqr]], because int offset[] takes only 4 bytes per entry while BitBoard *table[] takes 8 byte per entry for the pointers. It does seem to take an extra addition, though, because you need to scale the index by 8 (the size of the BitBoard elements), and only one of the two index registers can do that in scaled-indexed mode. You could of course store the offset in bytes, and write something like (BitBoard*)(((char*) t)[8*index + offset[fromSqr]]) to prevent that.

Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants

Re: Comparison of bitboard attack-getter variants