New 6-piece tablebases

syzygy · Post by **syzygy** » Tue Apr 02, 2013 12:03 am

I have just released my tablebase generator for up to 6 pieces on github:
https://github.com/syzygy1/tb

It generates two sets of files:
- WDL files (extension: .rtbw) storing win/draw/loss information for access during search.
- DTZ files (extension: .rtbz) storing distance-to-zero information for access at the root.

In addition to win/draw/loss information, the WDL files also store whether the win or loss can be enforced within 50 moves.

The tables use custom compression. Total compressed size:

Code: Select all

                       WDL          DTZ
up to 5 pieces       378 MB        561 MB
up to 6 pieces       68.3 GB       81.9 GB

Generation is done (almost) completely in RAM and is fully multithreaded. For 6 pieces, it requires a system with at least 32 GB of RAM. On my system (6-core i3930K @ 4.2Ghz, 64 GB) all 6-piece tables were generated in less than 5 days. At least for now the generator requires Linux and gcc.

Probing code is included, but it requires some work to add it to an engine. As a proof of concept I have adapted the probing code to Stockfish. The probing code is fully multithreaded as well.

My engine (playing on FICS as TrojanKnight(C)) has been using these tables since a couple of months quite successfully. With the WDL tables stored on SSD, it is possible to probe the tables at all depths without much slowdown (I don't probe in qsearch).

The license for the generator is GPLv2. The probing code is released without restrictions. So anyone wanting to add support for these tables to his/her engine is free to do so.

nanochess · Post by **nanochess** » Fri Apr 05, 2013 5:46 am

Very interesting! I'll give a look

Thanks for sharing

lucasart · Post by **lucasart** » Fri Apr 05, 2013 7:14 am

syzygy wrote:I have just released my tablebase generator for up to 6 pieces on github:
https://github.com/syzygy1/tb

It generates two sets of files:
- WDL files (extension: .rtbw) storing win/draw/loss information for access during search.
- DTZ files (extension: .rtbz) storing distance-to-zero information for access at the root.

In addition to win/draw/loss information, the WDL files also store whether the win or loss can be enforced within 50 moves.

The tables use custom compression. Total compressed size:
Code: Select all
                       WDL          DTZ
up to 5 pieces       378 MB        561 MB
up to 6 pieces       68.3 GB       81.9 GB
Generation is done (almost) completely in RAM and is fully multithreaded. For 6 pieces, it requires a system with at least 32 GB of RAM. On my system (6-core i3930K @ 4.2Ghz, 64 GB) all 6-piece tables were generated in less than 5 days. At least for now the generator requires Linux and gcc.

Probing code is included, but it requires some work to add it to an engine. As a proof of concept I have adapted the probing code to Stockfish. The probing code is fully multithreaded as well.

My engine (playing on FICS as TrojanKnight(C)) has been using these tables since a couple of months quite successfully. With the WDL tables stored on SSD, it is possible to probe the tables at all depths without much slowdown (I don't probe in qsearch).

The license for the generator is GPLv2. The probing code is released without restrictions. So anyone wanting to add support for these tables to his/her engine is free to do so.

Impressive work. Congratulations!

How does the size of 5-men table size base compare with: the Nalimov TB ? the Gaviota TB ? And in terms of "average probing speed" ?

I'm wondering, if I ever implement TB probing in DiscoCheck, which one to choose from. The smaller/faster the better

mar · Post by **mar** » Fri Apr 05, 2013 10:01 am

Truly impressive Ronald.
If I understand correctly you first probe WDL and then DTZ table to make progress?
Having such 5-men info fit below 1 gigabyte is awesome, DTM can't beat that for sure, but it would be unfair to compare since DTM contains more information.
Thanks for sharing.

rvida · Post by **rvida** » Fri Apr 05, 2013 1:57 pm

syzygy wrote: The tables use custom compression. Total compressed size:
Code: Select all
                       WDL          DTZ
up to 5 pieces       378 MB        561 MB
up to 6 pieces       68.3 GB       81.9 GB

Is the custom compression scheme any better than LZMA (which is used in Gaviota) ?

AlvaroBegue · Post by **AlvaroBegue** » Fri Apr 05, 2013 4:11 pm

Perhaps a bit more googling on my part could answer this, but I couldn't find it after a few minutes: What do people mean by "distance to zero"? Is it what I would call "distance to capture or pawn move"?

hgm · Post by **hgm** » Fri Apr 05, 2013 4:17 pm

Indeed. Zero refers to zeroing of the 50-move counter.

syzygy · Post by **syzygy** » Fri Apr 05, 2013 9:09 pm

lucasart wrote:How does the size of 5-men table size base compare with: the Nalimov TB ? the Gaviota TB ? And in terms of "average probing speed" ?

The 5-piece Nalimov TB are over 7 GB. I believe the 6-piece are over 1.2 TB (and those lack the 5v1 tables).
The 5-piece Gaviota TB are 6.5GB with LZMA compression, see here.

Average probing speed of my tables should be considerably better than that of Nalimov tables, but to be honest I have not measured it. But probing 1.2 TB Nalimov 6-piece tables will of course lead to much more disk trashing than probing 68.3 GB of tables. And 68.3 GB on SSD is much more affordable than 1.2 TB on SSD.

syzygy · Post by **syzygy** » Fri Apr 05, 2013 10:00 pm

mar wrote:If I understand correctly you first probe WDL and then DTZ table to make progress?

Correct. During the search only the WDL tables are accessed. Once a position with 6 or fewer pieces is on the board, the DTZ tables are used to make progress.

To make sure the engine makes progress towards a tablebase win, the search should treat tablebase positions that are won or lost similar to mate positions, but using a different range of values.

Of course pure DTZ-optimal play (i.e. minimaxing the number of moves to the next capture or pawn move by either side) can be very unnatural, so it might be desirable to let the engine do searches on the winning moves until it becomes clear that insufficient progress is being made and only then switch to DTZ-optimal play (e.g. by detecting repetitions and monitoring the 50-move counter).

The main reason for DTZ is clearly size, but also important is the fact that DTZ plays well with the 50-move rule. Actually it is not really DTZ but what I would call DTZ50+. For a position that is won or lost under the 50-move rule, my tables give the number of moves to the next zeroing move with optimal play respecting the 50-move rule (the "DTZ50" value). For a position that is drawn under the 50-move rule, but won or lost without the 50-move rule, my tables give the number of moves either to a position that is won or lost under the 50-move rule, or to a zeroing move leading to a position that is drawn under the 50-move rule but won or lost without it.

This may look a bit complicated, but the result is that when ignoring the 50-move rule, a winning path can be found for all positions that are won, and when respecting the 50-move rule, all positions that are won under the 50-move rule can be won and all positions that are drawn under the 50-move rule (but otherwise lost) can be drawn.

Similarly, the WDL tables are in fact WDL50+ tables. The probe returns -2,-1,0,1,2 depending on whether the position is lost under the 50-move rule, lost but drawn under the 50-move rule, drawn, won but drawn under the 50-move rule, won under the 50-move rule.

syzygy · Post by **syzygy** » Fri Apr 05, 2013 11:03 pm

rvida wrote:
syzygy wrote: The tables use custom compression. Total compressed size:
Code: Select all
                       WDL          DTZ
up to 5 pieces       378 MB        561 MB
up to 6 pieces       68.3 GB       81.9 GB
Is the custom compression scheme any better than LZMA (which is used in Gaviota) ?

My compression scheme allows for compressed blocks of just 64 bytes that can be decompressed quickly, which means that there is no need to cache uncompressed data.

In addition, my compression scheme allows me more control on what to do with "don't care" values. The value of illegal positions can be set to whatever compresses best (which is not necessarily the value occurring with the highest frequency but depends on the neighbouring values). To further improve compression, the WDL probing code does a round of captures and probes the resulting positions (which should be fast because tables with 1 piece less are much smaller and may be assumed to be cached in RAM). This means that for positions with a winning capture I can store whatever I want. For drawn positions with a drawing capture I can store either a loss value or a draw value. (And since WDL50+ is 5-valued there are even more possibilities.) Theoretically this is possible with other compression schemes as well, but it is an integral part of my compression scheme.

Getting DTZ small was done by making them 1-sided (only wtm or only btm, whichever is smaller), storing draws as don't cares, and not storing whether a position wins or loses (since that information is available from the WDL tables). In addition, if a capture or pawn move is winning I store a don't care. Where possible I store distance in number of moves, but where necessary for accurate 50-move rule play I store distance in plies.

I believe the capture resolution idea is from Chinook. It is also used for Robbobases and I would be surprised if the Shredderbases (5-piece WDL, 441 MB 2-sided and 157 MB 1-sided) did not use it as well. Chinook uses a run-length encoding scheme and sets the don't care value to the most frequent of W, D, L (Robbobases is said to do the same). My scheme is a bit more complicated.

New 6-piece tablebases

New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases