Bitbases

frankp · Post by **frankp** » Sun May 03, 2009 1:52 pm

Would someone point me in the direction of a _primer_ on bitbases, please. Preferably one for idiots.

wgarvin · Post by **wgarvin** » Wed May 06, 2009 2:45 am

I assume you're asking about how to use them? (Or are you looking for info on the algorithms used to generate them?)

I haven't seen much info about bitbases out there, myself. Maybe I've just been looking in the wrong places.

Dann Corbit · Post by **Dann Corbit** » Wed May 06, 2009 3:29 am

Do you need something more than this:
http://dcorbit2008.corporate.connx.com/ ... readme.txt

Dann Corbit · Post by **Dann Corbit** » Wed May 06, 2009 3:40 am

See also:
http://horizonchess.com/FAQ/Winboard/egtb.html

frankp · Post by **frankp** » Wed May 06, 2009 8:04 pm

Thanks for the replies. I had googled but not found what I wanted.

I am looking for basic programming info. My aim was to generate and use my own.

Thanks for the links - although I cannot seem to connect to .....connx.com

wgarvin · Post by **wgarvin** » Wed May 06, 2009 10:04 pm

You might want to look for info here:
http://kirill-kryukov.com/chess/discuss ... um.php?f=6

I've thought a lot over the last 10 years about what it would take to generate a set of bitbases from scratch, but I am naturally lazy and its such a big project that I haven't actually done it. Good luck with it though!

The way to make bitbases is probably to first generate tablebases and then convert those into bitbases. (There are ways of generating them directly, but you will not be able to handle DTZ50 or things like that if the generator uses just 2 bits per position. So it might as well use a full byte and be a DTC or DTZ50 tablebase).

As for tablebases, they are generated by retrograde analysis. You initialize all of the illegal positions and positions where the game is already decided (checkmate, stalemate). Then you do a pass to incorporate the results of any capture moves from a position. Then, you do repeated passes that expand the amount of resolved positions by noticing that all of the moves in a position lead to other positions whose outcome is now known, and so the game-theoretic value of that position is also known too.

One way is to use a regular (forward) move generator, to find all the positions that come "after" this position in a game tree. However, that can get slow because you have to repeatedly scan the whole database for changes but only a tiny fraction of the nodes change on each pass. So an alternative is to record the indices of positions that changed, and on the next pass, start from those ones and work backwards to the positions that lead to them.

One way to do that, you need to keep a count of "unresolved successors" in the byte for each position, and decrement it each time you resolve one of the "after" positions. Which means you need a "reverse move generator" that can produce all the "before" positions of a position! So that when you resolve the result of a position, you can generate all the "before" positions and decrement their counters. There might be other ways that are better though, I don't know.

There are also some subtleties to do with indexing. For example, how will you index the positions (for some collection of pieces, e.g. KBBKNP) in a way that doesn't waste too much index space on illegal or duplicated positions?

For pawnless endgames, you can exploit horizontal and vertical symmetry but also one of the two diagonal symmetries. What that means is, you can flip any board position horizontally or vertically without affecting the outcome of the game in any way. You can also flip it diagonally. So rather than store these 2*2*2 = 8 positions that are reflections of each other along one of these 3 axes, it should only store 1 of them. This paper by Heinz has a bunch more info about that.
Space-Efficient Indexing of Chess Endgame Tables (Postscript)

Enumerating legal placements of white king and black king in combination with these three symmetries, gives 468 "king pair" indices for pawnless endgames. Usually they confine one of the two kings to one triangle of the board (only 10 squares), thats how the range is so small (4096 -> 468). However, if both kings are right on the diagonal of the triangle, you have the strange situation where their positions are a reflection of themselves and so there are TWO entries in the database rather than one, for each placement of the other pieces. So you have to pick another piece (next most significant) and confine it below the same diagonal, and so on. Either that or the move generators must know and account for, the fact that two indices can map to the same position (so e.g. if a reverse move generator produces one of them, it has to produce the other one too).

For endgames with pawns, you can only exploit one board symmetry, horizontal symmetry (the other ones are unavailable because flipping the board in those ways would change the movements of the pawns). You can still enumerate legal king positions with one king confined to one side of the board, and get 1806 combinations. Since there is no diagonal symmetry, the case with TWO entries instead of one does not occur (phew!)

With pawns, if you are trying to respect the 50-move rule, then pawns need to be handled specially because pushing a pawn is a converting move.

You can "slice" the databases into smaller sections to generate them (though this is less important nowadays with the huge amounts of RAM we all have... unless you want to generate tablebases for 6 or more pieces). For example, since each king can only move to a max of 8 squares from its current square... Out of the 468 or 1806 king combinations, if you pick a certain value, then there are at most 8 other values that can be reached by moving one of the kings. So if you slice the database up according to those, you only need a minimum of 9 of the slices loaded at any one time. (Loading and unloading them might be very slow though, because it would have to be done on each pass).

Another example of slicing: for endgames with pawns, you can index all of the pawns separately and chop the databases into slices using the pawn index. Since pawns can only move forwards, there is a natural chain of dependencies between the various slices. You could exploit this by completely generating the ones where all of the pawns are near the end first, and then generating the slices that depend on those ones, and then the ones that depend on those, and so on. That sort of thing may make the indexing more complicated though.

Its important to handle en-passant properly, so you need a way to encode that. Its not important to handle castling, but if you are a completionist you may want to--and so you'll also need a way to encode that. Keep them in mind when desigining the indexing.

[Edit: there are other aspects of tablebases/bitbases that are interesting, too. How will you identify positions in your search tree whose GTV can be answered from the bitbases? (answer: piece count, then Interior-node recognizers.) How will you compress the bitbases to make them as small as possible? (that's the question that's kept me interested in the topic for 10 years...) How will you map board positions to a database ID and an index within that database? (that can be complicated if you use a lot of different indexing schemes. Notice that your database generator tool will need to do this too, in order to resolve moves which are captures, promotions, or both).]

frankp · Post by **frankp** » Wed May 06, 2009 10:30 pm

Thank you very much for taking the time to post this.

Edmund · Post by **Edmund** » Thu May 07, 2009 1:00 am

wgarvin wrote:You might want to look for info here:
http://kirill-kryukov.com/chess/discuss ... um.php?f=6

I've thought a lot over the last 10 years about what it would take to generate a set of bitbases from scratch, but I am naturally lazy and its such a big project that I haven't actually done it. Good luck with it though!

The way to make bitbases is [...]

Some additions:

First you should decide whether you want your bitbases to consider the 50 moves rule at all. Not doing so is easier, though you do not have to first generate a DTC egtb. You can limit the generation to 50 iterations and make sure that you are not using values from the same iteration to calculate the value for a node. You might have to use a flag to ensure that.

Taking advantage of symmetry and the fact that the two kings may not attack each other on a board where there are no pawns, the two kings have 462 possible configurations.

Better start with something easier. You should start worrying about pawn slices if you have 6 or more pieces. For everything else, even if you used a very naive indexing approach you would only have 462 * 64 ^3 = 115m positions for a 5 men.. So that easily fits into ram.

Furthermore efficient index schemes for bitbases are not so important, as 'broken positions' are very well compressible.

You can theoretically skip the implementation of en passant squares. A 1 ply search will resolve the issue too for these rare cases.

wgarvin · Post by **wgarvin** » Fri May 08, 2009 3:19 am

Edmund Moshammer wrote:Taking advantage of symmetry and the fact that the two kings may not attack each other on a board where there are no pawns, the two kings have 462 possible configurations.

Oops, I think my memory failed me yesterday when I typed 468... the value 462 sounds a lot more correct, thanks!

Edmund Moshammer wrote:You can theoretically skip the implementation of en passant squares. A 1 ply search will resolve the issue too for these rare cases.

I am not sure if I believe that. If there was a position which was winnable in a universe with no en passant rule, but the only winning move was to push the pawn by two squares, then the generator would think that position was won, neglecting the opponent's opportunity to capture it. If capturing it converted a won position into a draw, then that would be bad. There might be other positions in the tablebase, which were also considered won, that depended on this position being won.

Such positions (if they exist at all??) are probably very rare. So it probably only matters in a tiny subset of all positions. But I don't want bitbases that are "very nearly" accurate, but rather I want ones that are 100% accurate, so I would make a generator that knows about en passant.

The 100% accuracy is actually one of my goals if I ever get back to work on a bitbase project: to make an "endgame player" that can assess the GTV of any and all positions covered by the bitbases/heuristics/whatever with 100% accuracy, and can also play out those endings and can provably win every winnable position (i.e. not fall foul of the 50-move rule) without the use of actual tablebases. I am not interested in the "chess theoretic" uses of tablebases--only in making engines that can play chess endings according to FIDE rules and win every winnable ending. The other goal is to find highly compressible representations that support fast random access, with the goal of keeping them all in RAM (and the less of it the better).

I think it would now be plausible to dedicate 256 MB of RAM to "endgame knowledge" in the form of some sort of highly-compressed, randomly accessible GTV knowledge for all 5-man endings (and maybe some 6-man ?). I know some people don't think that's a good use of RAM, and maybe with the current bitbase representations it isn't... but if they could be made 10x smaller, then we might have something useful!

Decompressing pages of data compressed with RLE or something into a cache, also seems unpalatable to me. I want a static compressed representation that directly supports random access. Prediction combined with sparse array compression might work. Of course, the longer I wait, the larger RAM becomes, making the experiments potentially easier.

Edmund · Post by **Edmund** » Fri May 08, 2009 9:36 am

wgarvin wrote:
Edmund Moshammer wrote:You can theoretically skip the implementation of en passant squares. A 1 ply search will resolve the issue too for these rare cases.
I am not sure if I believe that. If there was a position which was winnable in a universe with no en passant rule, but the only winning move was to push the pawn by two squares, then the generator would think that position was won, neglecting the opponent's opportunity to capture it. If capturing it converted a won position into a draw, then that would be bad. There might be other positions in the tablebase, which were also considered won, that depended on this position being won.

For probing the bitbase you, when entering a position with an ep-capture available, you mustn't use the value at the position itself, but check whether the opponent, after getting its pawn captured has a winning/drawing or only losing moves. Thus you can tell the outcome even if you didn't have the value of the position with the e.p. square set.

Now coming to generation. Lets assume you are using the retro-move generation type, where you store per position, if white-to-move white wins/not wins and black-to-move number of evasion moves not checked yet.
1) white side can capture ep:
if the move after the capture is won for white, make one retro move for white, and now decrease the evasion counter for black before the 2step move. This can easily be done in the first iteration and you never have to deal with it again.

2) black side can capture ep:
First of all, you just decrement the counter for the position before the capture as usual. Note that the counter doesn't include the ep capture as a move. Once it reaches 0 (black has no moves left that would save the position), as usual you set all retro-moves for white as wtm-white wins. And concerning the retro 2 step move of the white pawn you only set it won if the position after black's capture is lost for black

Before I thought it was always an advantage of having the right to do e.p.capture, but I just found a zugzwang position, where black would be happy not to have to do e.p. capture:
[d]8/8/8/8/1pP5/1P6/5K1p/2N4k b - - 0 1

wgarvin wrote:I think it would now be plausible to dedicate 256 MB of RAM to "endgame knowledge" in the form of some sort of highly-compressed, randomly accessible GTV knowledge for all 5-man endings (and maybe some 6-man ?). I know some people don't think that's a good use of RAM, and maybe with the current bitbase representations it isn't... but if they could be made 10x smaller, then we might have something useful!

Note that Scorpio Bitbases 3-5men occupy 361mb of disk space. And they are already pretty well compressed.

Bitbases

Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases

Re: Bitbases