Bitboards / move generation on larger boards

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
Greg Strong
Posts: 388
Joined: Sun Dec 21, 2008 6:57 pm
Location: Washington, DC

Re: Bitboards / move generation on larger boards

Post by Greg Strong »

Gerd Isenberg wrote:Yes, please post it.
Ok. To reiterate, it requires the squares being mapped to the bits in a bitboard such that the bits of a file are stored consecutively, not the more common orientation where the bits of a rank are consecutive.

Then, you incrementally maintain, either in the search stack or your board structure, a byte[2][16] array which is a special pawn mask. The first index is the player, and the second index is the file, with some extras. It is necessary to have two extra files, one off of each edge of the board, so really only 10 files are necessary, but we use 16 so that the value can easily be popped into and out of the 128-bit XMM registers. You'll note, though, that except for the code for finding isolated pawns, nothing really needs the XMM registers for an 8x8 board because we're really only doing bit-wise operations of 64-bit groups. This array should be aligned to a 16-byte boundary for easy loading into XMM registers. For 8x8 board, I store the files for the actual board in bytes 4-12 so that it is quadword alligned. What is stored in this array is a mask which, for each player, has the bit set representing the back-most pawn, and all squares behind that pawn. For empty files, including the phantom files off the edges of the board, all bits are set.

Now that we have this information, to use it to find pawn information, you only need: the bitboard for the main board (bytes 4-12), the bitboard for the board shifted one file to the left (bytes 3-11), and the bitboard for the board shifted one file to the right (bytes 5-13.) You can do this by shifting the whole 128-bit mess in an XMM register, but it's just as easy to get them as follows (called center, left, and right, for simplicity)...

Code: Select all

byte behindPawns[2][16];
BitBoard64 center = *((BitBoard64 *) &behindPawns[0][4]);
BitBoard64 left = *((BitBoard64 *) &behindPawns[0][3]);
BitBoard64 right = *((BitBoard64 *) &behindPawns[0][5]);
Passed pawns: It's as simple as...

Code: Select all

enemyPawns & center & left & right
Outpost squares: Finding knights that can't be driven off their square by an enemy pawn is as simple as...

Code: Select all

enemyKnights & left & right
Backward pawns: To find the player's pawns behind their own pawn line...

Code: Select all

playerPawns & ((left & right) >> 1)
The >> 1 shifts the left and right pawn mask back one rank.

Doubled/tripled pawns:

Code: Select all

playerPawns & ~center
Isolated pawns: Ok, here's where it gets interesting, and the XMM registers come into play. First, you load up an XMM register with all bytes set to 255 (or, in other words, all 128 bits are set.) Then you compare it with your magic pawn mask (called "behindPawns" above) with the PCMPEQB instruction (compiler intrinsic _mm_cmpeq_epi8). What that leaves you with is a 128-bit value where, for each file, the bits are all set only if the file has no pawns, and all clear otherwise. Now, you make a new "left" and "right" by shifting this new value one file left and right. Then, you find isolated pawns...

Code: Select all

playerPawns & left & right
See how easy and fast? Notice that we found all that without doing a single branch! Maintaining your "behindPawns" array is easy, too. You just have a lookup table that you consult whenever a pawn moves. This table takes a byte representing the pawns of the file to be updated (thus having 256 entries) and returns a byte that marks the backmost pawn and the squares behind it. The table would be so small as to always be in cache. If it's a pawn capture you have two entries to update. If an enemy pawn is captured, that's another update. However, since the table-lookups are so fast, and the branches to check whether or not it's a pawn move, capture, etc are slow, probably best to just update the move-from-file and move-to-file on every move (unless your code is already performing these tests for other reasons.)

Changing the bitboard square mapping may seem pretty radical, but, since all the pieces in chess have symmetric movements except the pawns, there is no practical difference between rank-major mapping and file-major mapping except where pawns are concerned. Finding pawn formations, however, is all about examining files, not ranks, so I respectfully submit that a mapping that stores the bits of the files consecutively is clearly superior. Additionally, if you use the XMM registers for all these operations, it works for boards of sizes up to 14x8 with little or no penalty!
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Bitboards / move generation on larger boards

Post by Gerd Isenberg »

Greg Strong wrote:
Gerd Isenberg wrote:Yes, please post it.
Ok. To reiterate, it requires the squares being mapped to the bits in a bitboard such that the bits of a file are stored consecutively, not the more common orientation where the bits of a rank are consecutive.

Then, you incrementally maintain, either in the search stack or your board structure, a byte[2][16] array which is a special pawn mask. The first index is the player, and the second index is the file, with some extras. It is necessary to have two extra files, one off of each edge of the board, so really only 10 files are necessary, but we use 16 so that the value can easily be popped into and out of the 128-bit XMM registers. You'll note, though, that except for the code for finding isolated pawns, nothing really needs the XMM registers for an 8x8 board because we're really only doing bit-wise operations of 64-bit groups. This array should be aligned to a 16-byte boundary for easy loading into XMM registers. For 8x8 board, I store the files for the actual board in bytes 4-12 so that it is quadword alligned. What is stored in this array is a mask which, for each player, has the bit set representing the back-most pawn, and all squares behind that pawn. For empty files, including the phantom files off the edges of the board, all bits are set.

Now that we have this information, to use it to find pawn information, you only need: the bitboard for the main board (bytes 4-12), the bitboard for the board shifted one file to the left (bytes 3-11), and the bitboard for the board shifted one file to the right (bytes 5-13.) You can do this by shifting the whole 128-bit mess in an XMM register, but it's just as easy to get them as follows (called center, left, and right, for simplicity)...

Code: Select all

byte behindPawns[2][16];
BitBoard64 center = *((BitBoard64 *) &behindPawns[0][4]);
BitBoard64 left = *((BitBoard64 *) &behindPawns[0][3]);
BitBoard64 right = *((BitBoard64 *) &behindPawns[0][5]);
Passed pawns: It's as simple as...

Code: Select all

enemyPawns & center & left & right
Outpost squares: Finding knights that can't be driven off their square by an enemy pawn is as simple as...

Code: Select all

enemyKnights & left & right
Backward pawns: To find the player's pawns behind their own pawn line...

Code: Select all

playerPawns & ((left & right) >> 1)
The >> 1 shifts the left and right pawn mask back one rank.

Doubled/tripled pawns:

Code: Select all

playerPawns & ~center
Isolated pawns: Ok, here's where it gets interesting, and the XMM registers come into play. First, you load up an XMM register with all bytes set to 255 (or, in other words, all 128 bits are set.) Then you compare it with your magic pawn mask (called "behindPawns" above) with the PCMPEQB instruction (compiler intrinsic _mm_cmpeq_epi8). What that leaves you with is a 128-bit value where, for each file, the bits are all set only if the file has no pawns, and all clear otherwise. Now, you make a new "left" and "right" by shifting this new value one file left and right. Then, you find isolated pawns...

Code: Select all

playerPawns & left & right
See how easy and fast? Notice that we found all that without doing a single branch! Maintaining your "behindPawns" array is easy, too. You just have a lookup table that you consult whenever a pawn moves. This table takes a byte representing the pawns of the file to be updated (thus having 256 entries) and returns a byte that marks the backmost pawn and the squares behind it. The table would be so small as to always be in cache. If it's a pawn capture you have two entries to update. If an enemy pawn is captured, that's another update. However, since the table-lookups are so fast, and the branches to check whether or not it's a pawn move, capture, etc are slow, probably best to just update the move-from-file and move-to-file on every move (unless your code is already performing these tests for other reasons.)

Changing the bitboard square mapping may seem pretty radical, but, since all the pieces in chess have symmetric movements except the pawns, there is no practical difference between rank-major mapping and file-major mapping except where pawns are concerned. Finding pawn formations, however, is all about examining files, not ranks, so I respectfully submit that a mapping that stores the bits of the files consecutively is clearly superior. Additionally, if you use the XMM registers for all these operations, it works for boards of sizes up to 14x8 with little or no penalty!
Thanks! Interesting, looks great. I agree on the mapping issue. Whether file or rank is the big end doesn't cares - your mapping allows very cheap determinism, specially of passers, which is crucial to decide about lazy eval. Unaligned 64-bit or 128-bit access is a problem, of course you may use movdqu or the SSE3 equivalent to load an xmm register from unaligned overlapped memory. I would take one aligned load, two movdqa xmm, xmm and two 128-bit shifts by one byte.

The additional incremental update seems negligible in your case. Good luck and fun with your promising approach!

Gerd
User avatar
smrf
Posts: 484
Joined: Mon Mar 13, 2006 11:08 am
Location: Klein-Gerau, Germany

Re: Bitboards / move generation / ChessV

Post by smrf »

Here I still have your ChessV version 0.9.2, which sometimes is switching the player when starting to lose a game. Is there any new program /engine of yours testable for 10x8 chess games?
User avatar
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Bitboards / move generation / ChessV

Post by hgm »

smrf wrote:..., which sometimes is switching the player when starting to lose a game.
That is smart! Why didn't I think of that? :lol: :lol: :lol:
User avatar
Greg Strong
Posts: 388
Joined: Sun Dec 21, 2008 6:57 pm
Location: Washington, DC

Re: Bitboards / move generation / ChessV

Post by Greg Strong »

My new engine, Quadrox, isn't doing 10x8 yet, and still needs a lot of basic things added (like null move.) It will probably be a month of two before I've got something to release.

If you can save the game in ChessV when it switches players and let me know at which move things go bad, I'll be happy to look into the problem!
User avatar
smrf
Posts: 484
Joined: Mon Mar 13, 2006 11:08 am
Location: Klein-Gerau, Germany

Re: Bitboards / move generation / ChessV

Post by smrf »

Here is SMIRF's PGN of the matches. At 1 min/move everything worked well, but at 45 sec/move the strange behaviour was seen at the end.

The games are done playing Embassy Chess. But ChessV did such switches also using other starting arrays.

Moreover ChessV's position evaluation mostly is obscure to me, while often seeing itself more than two pawn units better than SMIRF does.

Code: Select all

[Event "SmirfGUI Embassy-Chess Game"]
[Site "Mac Os X VM-Ware"]
[Date "2009.02.07"]
[Time "18:04:19"]
[Round "Test-Game Serie 1 min / move"]
[White "ChessV 0.9.2"]
[Black "Smirf MS-175j"]
[Result "0-1"]
[Annotator "RS"]
[SetUp "1"]
[FEN "rnbqkmcbnr/pppppppppp/10/10/10/10/PPPPPPPPPP/RNBQKMCBNR w mKQkq - 0 1"]

{no problem within this time-frame} 1. Nc3 g6 {(09.01) -0.234} 2. g3 Nc6
{(10.01=) -0.131} 3. Nh3 Nh6 {(09.14=) -0.139} 4. d3 Bxc3+ {(09.07) -0.080} 5.
bxc3 Cf6 {(09.16) -0.168} 6. Bd2 d5 {(09.01) -0.146} 7. Rb1 Qd6 {(09.03=)
-0.113} 8. Nf4 e6 {(10.38) -0.145} 9. Bg2 Qa3 {(08.29) +0.461} 10. c4 Qxa2
{(10.06=) -0.441} 11. cxd5 Nd4 {(10.46=) -0.566} 12. Rc1 e5 {(10.00) +0.092}
13. Bb4 Mg8 {(10.01) +0.305} 14. c3 Nb3 {(11.31) +0.408} 15. Rc2 Qa4 {(11.01=)
+0.820} 16. Rb2 a5 {(11.17) +1.389} 17. Ba3 Qxa3 {(12.15) +0.723} 18. Rxb3 Qe7
{(11.02) +0.543} 19. Nh3 e4 {(10.01=) +0.521} 20. dxe4 a4 {(10.33) +1.666} 21.
e5 Cxe5 {(11.01) +0.973} 22. f4 Cg7 {(11.01=) +1.912} 23. d6 axb3 {(12.01=)
+2.504} 24. dxe7 Cxc3+ {(13.01=) +2.119} 25. Kf2 Ng4+ {(11.01=) +19.05} 26. Kf3
Mxe7 {(11.01=) +21.20} 27. Ng5 Me5+ {(04.01=) +M~006} 28. fxe5 Cxe5+ {(02.01=)
+M~005} 29. Ke4 f5+ {(02.01=) +M~004} 30. Mxf5 gxf5+ {(02.01=) +M~003} 31. Kd5
Ra5+ {(02.01=) +M~002} 32. Cc5 c6# {(02.00?) +M~001} 0-1

[Event "SmirfGUI Embassy-Chess Game"]
[Site "Mac Os X VM-Ware"]
[Date "2009.02.08"]
[Time "12:57:41"]
[Round "Test-Game Serie 45 sec / move"]
[White "ChessV 0.9.2"]
[Black "Smirf MS-175j"]
[Result "*"]
[Annotator "RS"]
[SetUp "1"]
[FEN "rnbqkmcbnr/pppppppppp/10/10/10/10/PPPPPPPPPP/RNBQKMCBNR w mKQkq - 0 1"]

{finally ChessV tries to promote Black's pawn} 1. Nc3 g6 {(09.01) -0.234} 2. g3
Nc6 {(09.18) -0.180} 3. Nh3 Nh6 {(09.03=) -0.145} 4. d3 Bxc3+ {(09.10) -0.080}
5. bxc3 Cf6 {(09.11=) -0.168} 6. Bd2 d5 {(08.26) -0.133} 7. Rb1 Qd6 {(09.02+)
-0.209} 8. Nf4 e6 {(10.47) +0.018} 9. Bg2 Qa3 {(08.02) +0.461} 10. c4 Qxa2
{(10.01) -0.441} 11. cxd5 Nd4 {(10.39) -0.566} 12. Rc1 e5 {(09.01=) +0.092} 13.
Bb4 Mg8 {(10.01) +0.305} 14. c3 Nb3 {(11.05) +0.408} 15. Rc2 Qa4 {(10.56)
+0.318} 16. Rb2 a5 {(11.01=) +1.389} 17. Ba3 Qxa3 {(12.03) +0.723} 18. Rxb3 Qe7
{(11.02) +0.543} 19. Nh3 g5 {(09.25) +0.320} 20. Be4 a4 {(09.09=) +0.082} 21.
Rb4 a3 {(09.01) -0.066} 22. f3 Nf5 {(10.32) +0.146} 23. Ni5 j6 {(09.38=)
+0.311} 24. g4 jxi5 {(10.03) +0.283} 25. gxf5 Mh6 {(09.19=) +0.340} 26. Ch3 i4
{(10.01=) +0.512} 27. Ci5 Mi6 {(09.56) +0.334} 28. Cg3 Ra6 {(09.01) +0.682} 29.
Mf2 a2 {(09.02) +1.150} *
User avatar
hgm
Posts: 27787
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Bitboards / move generation / ChessV

Post by hgm »

smrf wrote:Moreover ChessV's position evaluation mostly is obscure to me, while often seeing itself more than two pawn units better than SMIRF does.
This does not only apply to ChessV, but to most 10x8 engines. They must all have very obscure evluation! :lol: :lol: :lol:
User avatar
smrf
Posts: 484
Joined: Mon Mar 13, 2006 11:08 am
Location: Klein-Gerau, Germany

Re: Bitboards / move generation / ChessV

Post by smrf »

hgm wrote:
smrf wrote:Moreover ChessV's position evaluation mostly is obscure to me, while often seeing itself more than two pawn units better than SMIRF does.
This does not only apply to ChessV, but to most 10x8 engines. They must all have very obscure evluation! :lol: :lol: :lol:
Not really - it has not been obscure to me because of being different from SMIRF (which is neither a miracle nor surprising), but because of claiming clear advantage in an obviously lost position.
User avatar
smrf
Posts: 484
Joined: Mon Mar 13, 2006 11:08 am
Location: Klein-Gerau, Germany

Re: Bitboards / move generation / ChessV

Post by smrf »

Another game of such:

Code: Select all

[Event "SmirfGUI Optimized Chess Game"]
[Site "Mac VM-Ware"]
[Date "2009.02.10"]
[Time "15:01:43"]
[Round "1 min / move - 25 min + 10 sec"]
[White "ChessV"]
[Black "SmiRF"]
[Result "*"]
[Annotator "RS"]
[SetUp "1"]
[FEN "nrcbqkbarn/pppppppppp/10/10/10/10/PPPPPPPPPP/NRCBQKBARN w sKQkq - 0 1"]

{ChessV finally tried to take over.} 1. Nb3 Ni6 {(10.01) -0.309} 2. e4 e6
{(10.01=) -0.207} 3. Ni3 d6 {(10.01) -0.297} 4. f4 Nb6 {(09.14) -0.311} 5. Bf2
Cg6 {(10.14) +0.037} 6. Bg3 f5 {(10.01) -0.291} 7. Bh5 Cxh5 {(12.01) -0.102} 8.
Nxh5 Qxh5 {(10.01) -0.121} 9. i4 Qf7 {(10.46+) -0.205} 10. j3 h6 {(09.02)
+0.088} 11. Nd4 c5 {(10.01) +0.613} 12. Nf3 g5 {(10.01=) +0.670} 13. exf5 exf5
{(10.01) +0.678} 14. Cf2 g4 {(10.01=) +0.906} 15. Nh4 Bh7 {(10.01=) +1.180} *
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Bitboards / move generation on larger boards

Post by Rein Halbersma »

Gerd Isenberg wrote:
Greg Strong wrote:Since we're talking about bitboard techniques (such as magic bitboards and hyperbola quintessence) anyway, I'd like to take the opportunity to throw another variable into the discussion...

Chess variants with larger boards, in particular the 80-square 10x8, being as it seems to be the most common right now. ChessV uses rotated bitboards, and has a BitBoard96 class that allocates a 64-bit and a 32-bit value so it's capable of handling any board with up to 96 squares. (And there's also a BitBoard128 class for even larger boards.) It all works fine, but I'm now in the process of doing a complete re-write of ChessV, and I'm wondering if other techniques should be considered. The use of 128-bit SSE registers is appealing, and certainly useful for some aspects, but it has limitations - for example, you can't shift an entire 128-bit register by an arbitrary number of bits, nor can you perform any arithmetic with any values larger than 64 bits.

Anyone have any ideas they want to kick around?
I think those board sizes are the domain of mailbox approaches, specially for "variable" board sizes. For 10x8 you may use 128-bit xmm-registers. With the right mapping, one step shift is a combination of (left/right) bit-shift and byte-shift. So you may use something like this (saving the wrap ands), but replace bitwise shift 8 by byte-wise shift one. So quite nice for pawn attacks and dump-9 fill.
Does anyone have experience with using the C++ STL bitset<N> template as a bitboard type for arbitrary sizes? One advantage of this would be that you don't need to change any of your code when the only thing that changes is the board size.