Updating castling rights

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Updating castling rights

Post by bob »

wgarvin wrote:
bob wrote:
hgm wrote:That is exactly what he does, but with the quirk that the masks deciding which bits to clear are coming from the piece codes themselves, rather than from a separate table.
Not quite. I don't use anything but the square for this.

I have a separate piece of code for moving pawns, knights, etc...

in the rook move code:

if square == a1 castle&=0x0d;
if square == h1 castle&=0x0e;

in king move code:

castle &= 0xc0;

Of course the three AND masks are different for black and white. And square is the "from" square assuming white is moving, or "to" for black, since capturing either white rook ends castling to that side from here on...

In a typical program, make/unmake is < 10% of total time, so worrying about saving 0.1% of less than 10% is not exactly productively spent time.
Better if it can be done without if-tests though. Using a table indexed by source square and destination square, to mask off the appropriate rights, is branchless, but it uses a couple L1 cache lines. For bitboard programs, the "bitboard of squares that don't have their castling right" is also branchless and extremely lightweight. When generating the castling moves, you just have to test one byte of the bitboard with a mask containing the two bits (one for a rook, one for a king) and if either of them is set you know the move is not legal anymore. So its cheap in makemove and in the move generation.
I just ran a 30 min test, 15 different positions. 10 of them had not castled yet, 5 already had. I did it my current way (if-test) and using a simple table which dumps the branch. Could measure no speed difference whatsoever. The castle branch is probably predicted 99.99999% of the time in positions where you start off not yet castled, and is predicted 100% of the time when you start out with no castle rights at all. In real games, the latter is far more common, thanks to book lines...

I tend to do what is simplest in these kinds of cases...
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Updating castling rights

Post by Sven »

wgarvin wrote:Better if it can be done without if-tests though. Using a table indexed by source square and destination square, to mask off the appropriate rights, is branchless, but it uses a couple L1 cache lines. For bitboard programs, the "bitboard of squares that don't have their castling right" is also branchless and extremely lightweight. When generating the castling moves, you just have to test one byte of the bitboard with a mask containing the two bits (one for a rook, one for a king) and if either of them is set you know the move is not legal anymore. So its cheap in makemove and in the move generation.
You describe how simple *using* that bitboard would be. Agreed, but the classical way of maintaining four bits indicating availability of the four types of castling is even simpler.

Code: Select all

if (moveIsWhiteOO(move)  && (castlingRights & WhiteOORight)  && "F1, G1 are empty")
    generateMove(move);
if (moveIsWhiteOOO(move) && (castlingRights & WhiteOOORight) && "D1, C1, B1 are empty")
    generateMove(move);
if (moveIsBlackOO(move)  && (castlingRights & BlackOORight)  && "F8, G8 are empty")
    generateMove(move);
if (moveIsBlackOOO(move) && (castlingRights & BlackOOORight) && "D8, C8, B8 are empty")
    generateMove(move);
And you don't describe how simple *updating* that bitboard would be, or did I miss that? If I understand correctly this includes performing copy-and-update, and therefore keeping one instance of that bitboard per ply, since updating the bitboard by setting the 'from' and 'to' bits is not reversible (you don't know when to set bits back to 0) so make+unmake would not work. 8 extra bytes per ply on the stack, instead of just 4 bits (needing 1 byte), is the price.

Is this correct?

Sven
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Updating castling rights

Post by wgarvin »

Sven Schüle wrote:
wgarvin wrote:Better if it can be done without if-tests though. Using a table indexed by source square and destination square, to mask off the appropriate rights, is branchless, but it uses a couple L1 cache lines. For bitboard programs, the "bitboard of squares that don't have their castling right" is also branchless and extremely lightweight. When generating the castling moves, you just have to test one byte of the bitboard with a mask containing the two bits (one for a rook, one for a king) and if either of them is set you know the move is not legal anymore. So its cheap in makemove and in the move generation.
You describe how simple *using* that bitboard would be. Agreed, but the classical way of maintaining four bits indicating availability of the four types of castling is even simpler.

Code: Select all

if (moveIsWhiteOO(move)  && (castlingRights & WhiteOORight)  && "F1, G1 are empty")
    generateMove(move);
if (moveIsWhiteOOO(move) && (castlingRights & WhiteOOORight) && "D1, C1, B1 are empty")
    generateMove(move);
if (moveIsBlackOO(move)  && (castlingRights & BlackOORight)  && "F8, G8 are empty")
    generateMove(move);
if (moveIsBlackOOO(move) && (castlingRights & BlackOOORight) && "D8, C8, B8 are empty")
    generateMove(move);
And you don't describe how simple *updating* that bitboard would be, or did I miss that? If I understand correctly this includes performing copy-and-update, and therefore keeping one instance of that bitboard per ply, since updating the bitboard by setting the 'from' and 'to' bits is not reversible (you don't know when to set bits back to 0) so make+unmake would not work. 8 extra bytes per ply on the stack, instead of just 4 bits (needing 1 byte), is the price.

Is this correct?

Sven
More or less. Its simple to update the bitboard because you just do this:

Code: Select all

board.noCastleBB |= (BITBOARD(1)<<fromSq) | (BITBOARD(1)<<toSq);
Notice that in a bitboard engine you'll have to compute (BITBOARD(1)<<fromSq) and (BITBOARD(1)<<toSq) anyway so that you can update your bitboards for the moving piece. So this update is as cheap as anything I can imagine.

Of course it adds 8 bytes to the board representation instead of just 4 bits. I was assuming that you back up the board struct in makemove (by copying its contents into an array for example) and restore it when undoing the move, so yes you'd have to copy the 8 bytes instead of 4 bits. Given what Dr. Hyatt pointed out about predictability of the branches, maybe its simpler to just use 4 bits and run a few extra instructions during makemove. I just like the idea because the update in makemove is extremely cheap.

As for "using" the bitboard (in the generation of moves), its exactly as simple as using the 4 bits would be.

instead of something like this:

Code: Select all

struct Board
{
    U8 castlingRights;
} board;

if (sideToMove == WHITE)
{
    if (board.castlingRights & 0x01)  GenerateMove(WHITE_CASTLE_KS);
    if (board.castlingRights & 0x02)  GenerateMove(WHITE_CASTLE_QS);
    if (board.castlingRights & 0x04)  GenerateMove(BLACK_CASTLE_KS);
    if (board.castlingRights & 0x08)  GenerateMove(BLACK_CASTLE_QS);
}
You end up with something like this:

Code: Select all

struct Board
{
    union
    {
        U8       u;
        BITBOARD bb;
    } noCastle;
} board;

if (sideToMove == WHITE)
{
    if ((board.noCastle.u[0] & KS_MASK) != 0)  GenerateMove(WHITE_CASTLE_KS);
    if ((board.noCastle.u[0] & QS_MASK) != 0)  GenerateMove(WHITE_CASTLE_QS);
    if ((board.noCastle.u[7] & KS_MASK) != 0)  GenerateMove(BLACK_CASTLE_KS);
    if ((board.noCastle.u[7] & QS_MASK) != 0)  GenerateMove(BLACK_CASTLE_QS);
}
Of course that union is funky because which end of the bitboard you access for White and for Black is going to depend on whether you're compiling for a big-endian or little-endian machine. If you already have #ifdefs for this, you can use those. Or if you want to keep it simple, you can just use a normal 64-bit masks and let the compiler worry about optimizing it.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: Updating castling rights

Post by wgarvin »

Also, even though it takes up 8 bytes in the board struct, only 6 of the 64 bits in it are interesting.

Suppose the update step were changed to this instead:

Code: Select all

board.noCastleBB |= (BITBOARD(0xFF000000000000FF) & ((BITBOARD(1)<<fromSq) | (BITBOARD(1)<<toSq)));
That's still extremely cheap, only about 2 extra instructions and no table and no branch. And it means it will never update 6 of the 8 bytes in this bitboard in the board struct. So you could store other data in those bytes (e.g. half-move counter, flags, enpassant... whatever.) [Edit: you'd want to be careful about load-hit-stores though. Hmm.]

If you have flag bits that don't need a full byte, you could actually use a more selective mask and mix those flags in with the 2 remaining bytes.

Just to be clear, I'm not proposing that anybody should actually do this. Except maybe Gerd. =)
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Updating castling rights

Post by Sven »

wgarvin wrote:Also, even though it takes up 8 bytes in the board struct, only 6 of the 64 bits in it are interesting.

Suppose the update step were changed to this instead:

Code: Select all

board.noCastleBB |= (BITBOARD(0xFF000000000000FF) & ((BITBOARD(1)<<fromSq) | (BITBOARD(1)<<toSq)));
That's still extremely cheap, only about 2 extra instructions and no table and no branch. And it means it will never update 6 of the 8 bytes in this bitboard in the board struct. So you could store other data in those bytes (e.g. half-move counter, flags, enpassant... whatever.) [Edit: you'd want to be careful about load-hit-stores though. Hmm.]

If you have flag bits that don't need a full byte, you could actually use a more selective mask and mix those flags in with the 2 remaining bytes.

Just to be clear, I'm not proposing that anybody should actually do this. Except maybe Gerd. =)
I fully agree to that one sentence which I emphasized above :-) ... just because that proposal you are making only to Gerd ;-) adds to the complexity of a program without really showing a *substantial* improvement. Nevertheless, I cannot say that it is uninteresting.

Sven