disk-based 8-piece TB generator

Ajedrecista
Posts: 2248
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Disk-based 8-piece TB generator.

Post by Ajedrecista »

Hello Ronald:
syzygy wrote: Tue Jun 02, 2026 12:00 am[...]

It seems he did not generate KQvKNNNNN, or at least I can't find his statistics for this (white wins only) table.
https://drive.google.com/drive/folders/ ... PuGCkwXDTI

[...]
They are there under the name knnnnnkq instead of kqknnnnn:

Code: Select all

=====================================================
Name                  Modification date     File size
=====================================================
knnnnnkq.txt           1 Apr 2021           2 kB
knnnnnkq.w.47.pgn     14 May 2020           805 bytes
knnnnnkq.zz.epd        2 Nov 2020           935 kB
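
As far as I understand the naming, both names describe the same material with the two sides swapped, and the repository lists the ending from the side of the five knights. A minimal Python sketch of that renaming (the helper name flip_ending is my own and does not come from the repository):

```python
def flip_ending(name: str) -> str:
    """Swap the two sides of an ending name such as 'kqknnnnn'.

    Assumes the name is lowercase and holds exactly two kings ('k'),
    the first one opening the name.
    """
    if not name.startswith("k") or name.count("k") != 2:
        raise ValueError(f"expected exactly two kings in {name!r}")
    second_king = name.index("k", 1)  # where the second side starts
    return name[second_king:] + name[:second_king]


print(flip_ending("kqknnnnn"))  # knnnnnkq
print(flip_ending("krbnkbnn"))  # kbnnkrbn
```

The second call reproduces the krbnkbnn / kbnnkrbn pair that the readme below uses for its zugzwang example.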

The current download links for me are:

knnnnnkq.zz.epd is too large to copy its content here (13,140 EPD lines with c0 tag comments), but I can copy the full content of the other two files:

knnnnnkq.txt

Code: Select all

Ending knnnnnkq
WTM max win:  46
BTM max loss: 46
WTM wins:    99,652,226,474 (92.2%)
BTM loses:   64,590,726,084 (56.1%)
White wins: 164,242,952,558 (73.6%)
WTM legal:  108,036,115,779
BTM legal:  115,237,449,483

BTM stalemated: 0
WTM winning captures: 44,855,044,362
WTM win percent without captures: 86.73
BTM saving captures: 41,354,699,417
BTM loss percent without captures: 87.42

Depth         Wins
    1  45422519536
    2   7022225065
    3   1704270969
    4   1862781638
    5   2112454381
    6   2671179459
    7   3607919010
    8   4794589531
    9   5910518876
   10   6490630156
   11   6095903150
   12   4744610858
   13   3077255245
   14   1783439646
   15   1022218928
   16    600723603
   17    349730622
   18    191858310
   19     97599779
   20     47434384
   21     22460661
   22     10430712
   23      4828067
   24      2279083
   25      1118338
   26       570054
   27       303350
   28       163864
   29        90679
   30        50550
   31        29166
   32        16530
   33         9323
   34         5402
   35         2892
   36         1572
   37          880
   38          582
   39          382
   40          322
   41          301
   42          234
   43          248
   44          106
   45           26
   46            4

Depth       Losses
    0    274585268
    1  12071718313
    2   1104631493
    3   1255364446
    4   1545334763
    5   2037757025
    6   2866424923
    7   4057604569
    8   5454129559
    9   6714312909
   10   7300099434
   11   6749066949
   12   5193579353
   13   3389130466
   14   2006871805
   15   1162195160
   16    667623357
   17    369008299
   18    191471605
   19     94054623
   20     44866373
   21     21088174
   22      9935350
   23      4767167
   24      2375778
   25      1228045
   26       658589
   27       362268
   28       202434
   29       114865
   30        67463
   31        39011
   32        22250
   33        12995
   34         7767
   35         4556
   36         2850
   37         1736
   38         1114
   39          819
   40          670
   41          605
   42          451
   43          272
   44          150
   45           11
   46            2
I realise that you copied the beginning of these statistics in your post, so I do not know exactly what you are referring to. I guess it has to do with '(white wins only) table', whatever this means.

------------

knnnnnkq.w.47.pgn

Code: Select all

[Event "W +46"]
[Date "2020.05.14"]
[Site "?"]
[Round "?"]
[White "NNNNN"]
[Black "Q"]
[Result "1-0"]
[SetUp "1"]
[FEN "2Nq4/N7/8/8/8/8/7N/1K2N1Nk w - - 0 1"]

1. Nef3! Qd3+ 2. Kc1! {zz} Qc3+ 3. Kd1! Kg2 4. Ke2! Qb3 5. Kd2! Qc4 6. Ke3!
Qe6+ 7. Kd4 Qd7+ 8. Ke5 Qg7+ 9. Ke6 Qg6+ 10. Kd7 Qf5+ 11. Ke7 Qe4+ 12. Kf6 Qa4
13. Ke6 Qa6+ 14. Kd7 Qb7+ 15. Kd6 Qb4+ 16. Kc7 Qa5+ 17. Kc6 Qa6+ 18. Kc5 Qe6
19. Kb4 Qe4+ 20. Ka5 Qf5+ 21. Ka6 Qc5 22. Nb6 Qa3+ 23. Kb7! Qe7+ 24. Ka8 Qe4+
25. Kb8! Qf4+ 26. Kb7 Qf7+ 27. Ka8 Qe8+ 28. Nac8 Qc6+ 29. Kb8 Qe6 30. Kc7 Qf7+
31. Kc6 Qe8+ 32. Kc5 Qe3+ 33. Kb4 Qc1 34. Nd6 Qb2+ 35. Kc5! Qa3+ 36. Kc6 Qb2
37. Nbc4 Qh8 38. Ng4 Qa8+ 39. Kd7 Kh1 40. Ng5 Qb8 41. Nge4 Qa7+ 42. Ke6 Qa2 43.
Ng3+ Kg2 44. Kf5 Qb1+ 45. Kg5 Qb8 46. Nce3+ Kxg1 1-0 

%Time taken: 27 seconds
------------

There is a readme file at the end of the repository:

Code: Select all

This folder contains results for 8-man chess tablebases, as of April 12, 2021

Currently, only endings without pawns are considered.

Below is a description of the files, followed by more technical details:

(1) 8man_20200412.xlsx
    This spreadsheet summarizes all the results. I hope the meanings of the
    columns are fairly obvious.

(2) 8man_200.pgn
    A PGN file containing 22 endings with winning lines >= 200. I believe
    those are all such cases for 8-man pawnless endings.

(3) kqrrkqrr.fp.pgn
    Detailed analysis of 3R4/8/2q5/8/8/2k1r3/2r5/1R1K1Q2, a full point zugzwang
    without pawns, knights, and bishops. For 7-man endings there is only
    one fp zugzwang without pawns and knights: 8/8/8/8/2B2Q2/b7/1r3b2/2K1k3
    where Black unfortunately has two bishops of the same color.

(4) hhdbvi_8man_sample.pgn
    A small selection of studies from Harold van der Heijden's study database
    which are refuted by the 8-man tablebases.
    
(5) PGN files with longest lines
    For example, the file krrbnkqr.w.360.pgn contains a line where white wins
    in 360 moves. In this case, the line happens to end in checkmate, but in
    most endings the lines end upon a winning capture. This is the DTC metric,
    described in more detail below. Reciprocal zugzwangs are labeled as "zz".
    For endings with multiple bishops different color combinations are
    separated, using a notation described further below. For example,
    krrnkrbb_0020.w.400.pgn contains the longest winning line when Black has
    same-colored bishops, while krrnkrbb_0011.w.87.pgn contains the longest
    line when Black has opposite-colored bishops.

(6) Text files with ending statistics
    For example, krrbnkqr.txt contains basic win/loss statistics, the exact
    count of White wins for each depth, and the Black losses for each depth.
    The files also include statistics without winning or saving captures, which
    tend to provide a slightly better indication of whether the ending is a
    general win or not.
    For endings with multiple bishops additional files are provided, for
    example krrnkrbb_0020.stats.txt and krrnkrbb_011.stats.txt. Since this
    information is extracted after the complete tablebase has been created which
    does not separate the different bishop parities, there are fewer details.
    For example, the number of winning or saving captures is not available.

(7) EPD files with reciprocal zugzwangs
    For each position, the comment field (indicated by c0 following the EPD
    standard) lists the results for wtm and btm. Because of the one-sided
    nature of the tablebases, the zugzwangs for a given ending kxky are limited
    to those where btm loses, and wtm draws or loses. Zugzwangs where wtm loses
    and btm draws can be obtained from the "flipped" ending kykx by reversing
    colors in btm lost, wtm not won positions. For example, krbnkbnn.zz.epd
    contains 1,157,948 zugzwangs of the form wtm draws, btm loses, and one
    zugzwang of the form wtm loses, btm loses. kbnnkrbn.zz.epd contains 777
    zugzwangs where wtm draws, btm loses, and one position where wtm loses
    and btm loses. This single full-point (fp) zugzwang is the same as in 
    krbnkbnn.zz.epd. This way all zugzwangs for this material configuration
    are obtained. If a flipped ending is not available, it is not possible
    to resolve wtm draw from wtm loses, and so the comment field in the .epd
    file will read "wtm not won, btm lost in x".


Technical details
-----------------

Tablebases are created by retrograde analysis, using the DTC (distance to
conversion) metric, meaning the winning side tries to minimize the number of
moves to achieve checkmate or a winning capture. This is not the same as 
minimizing the distance to mate (DTM), so if any DTC minimizing lines end in
mate that is probably a coincidence and does not mean that is the true DTM,
but it is always true that DTC <= DTM. For finding the truth in a position
there is no difference between DTC and DTM, but DTC is computationally more
efficient.

Castling rights are not included, and the 50-move rule is ignored. While this
rule is very important for play between humans, it is less relevant from an
analysis perspective and does not even apply to chess studies. Ignoring the
50-move rule also allows the slight simplification of bundling together
checkmates and positions where a side is forced to make a losing capture as
"lost in 0". If a side is forced to make a losing capture but 50 quiet moves
have been played, it can claim a draw before having to execute the losing
capture. To correctly capture this situation would require measuring distances
in half-moves (ply), rather than full moves as is done here.

The tablebases are "one-sided", meaning they indicate whether a position is
won for wtm or lost for btm, and if so indicate the number of moves required to
win or lose. This idea can be a bit confusing, so here is an example. The
ending kqkr is almost always won for White, but there are positions where Black
not only does not lose, but even wins. For example, for the position Kc1, Qh1/
Ka1, Rb3 Black to move can play 1...Rb1+ and win the queen on the next move. In
the kqkr tablebase this position would be tagged as not lost for btm, and won in
2 for wtm (wtm can win in 2 with 1.Qh1+). To fully resolve the btm score 
requires the "flipped" tablebase krkq. If we "flip" colors and the side to move
in the original position we obtain Ka1, Rb3/Kc1, Qh1 which in the krkq
tablebase is scored as wtm wins in 2. This translates to btm wins in 2 in the
original kqkr tablebase. This works for endgames with pawns as well, if
flipping includes a reflection of the position about the horizontal. There
seems to be no major reason to prefer one type of tablebase to the other. I
like the one-sided form because in hunting for long lines I'm often not even
interested in the "flipped" tablebase.

Whether a tablebase is one- or two-sided will obviously affect the count of the
total number of tablebases. These counts can be derived from elementary
combinatorics. Consider endings with N pieces (not counting kings), and P piece
types, i.e., P = 4 for pawnless endings, and P = 5 for endings with pawns. If
we consider 2*P distinguishable containers labeled WN, BN, WB, BB, etc. for the
different piece types and piece colors, then the raw count R is the number of
ways in which N indistinguishable balls can be distributed among these 2*P
containers. This is a well-known result:

R = Combin(N + 2*P - 1, N) = (N + 2*P - 1)! / N! / (2*P - 1)!

R needs to be adjusted depending on whether the tablebases are one- or 
two-sided. For one-sided tablebases we don't need kkx for any x, because the
number of wtm wins and btm losses are exactly zero. This number U is obviously
the number of ways N balls can be distributed over P, rather than 2*P,
containers:

U = Combin(N + P - 1, N) = (N + P - 1)! / N! / (P - 1)!

So the number of one-sided tablebases E_1 is:

E_1 = R - U

For two-sided tablebases, like the Nalimov and Syzygy tablebases, the count E_2
depends on whether N is even or odd. If N is odd, then to avoid double counting
kxky and kykx we just get E_2 = R/2. However, if N is even this would
undercount symmetric endings S of the form kxkx. S is obviously equivalent to
distributing N/2 balls over P containers:

S = Combin(N/2 + P - 1, N/2) = (N/2 + P - 1)! / (N/2)! / (P - 1)! if N is even
  = 0 if N is odd.

Therefore:

E_2 = (R + S) / 2

For 8-man endings, these formulas give E_1 = 1,632 for pawnless, and 4,795 for
endings with pawns, and E_2 = 868 for pawnless, and 2,520 for pawnful endings.

A final technical point relates to endings with multiple bishops. For a single
bishop there is no point considering different colors. For an 8x8 board, both
pawnless and pawnful endings are symmetric with respect to a reflection about
the vertical, which flips the colors of all squares (for a 7x7 board there is
an intrinsic difference between bishop colors, which is an interesting
discussion for another day). However, the relative colors of multiple bishops
remain the same for this and other symmetry transformations. Ken Thompson
introduced a notation to capture bishop colors: the 4 digit number "abcd"
denotes a configuration where White has "a" white-colored bishops and "b"
black-colored bishops, and Black has "c" white-colored and "d" black-colored
bishops. For example, krbnkbnn_1010 denotes same-colored bishops, and
krbnkbnn_1001 opposite-colored bishops. kbbbnkrb_2101 denotes the case where
White has two bishops of one color, and one bishop of another color, and Black's
bishop has the same color as that other color. Because of the even dimensions
of the chess board, "abcd" is equivalent to "badc".
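
The counts quoted near the end of the readme can be reproduced with a few lines of Python. This is only a sketch of the readme's formulas: the function name tablebase_counts is mine, and I take N = 6 for 8-man endings (eight men minus the two kings).

```python
from math import comb


def tablebase_counts(n: int, p: int) -> tuple[int, int]:
    """Return (E_1, E_2) for n non-king pieces and p piece types.

    E_1 counts one-sided tablebases, E_2 counts two-sided ones,
    following the formulas in the readme.
    """
    r = comb(n + 2 * p - 1, n)  # raw count: n balls into 2*p containers
    u = comb(n + p - 1, n)      # bare white king (kkx): never won for White
    s = comb(n // 2 + p - 1, n // 2) if n % 2 == 0 else 0  # symmetric kxkx
    e1 = r - u
    e2 = (r + s) // 2
    return e1, e2


print(tablebase_counts(6, 4))  # pawnless 8-man: (1632, 868)
print(tablebase_counts(6, 5))  # 8-man with pawns: (4795, 2520)
```

For P = 4 this gives R = 1716, U = 84 and S = 20; for P = 5 it gives R = 5005, U = 210 and S = 35, which leads to the four figures in the readme.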
Please note the 4,795 figure (Urban's) at the end of the readme and how it is explained, as well as 868 and 2,520. This comment also applies to Urban. Last but not least, what about this possible explanation of 'saving captures' in the readme?
readme wrote:[...]

[...] If a side is forced to make a losing capture but 50 quiet moves
have been played, it can claim a draw before having to execute the losing
capture. To correctly capture this situation would require measuring distances
in half-moves (ply), rather than full moves as is done here.

[...]
'Saving' as an adjective, in the sense of claiming a draw instead of going into a sure loss; or 'saving' as a verb, in the sense that the capture is not finally made.
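
There is also an arithmetic hint in knnnnnkq.txt itself. Both 'without captures' percentages come out right if the winning captures are removed from the WTM wins and from the WTM legal positions, while the saving captures are removed only from the BTM legal positions. That would fit a third reading, in which a 'saving capture' is a BTM position where Black avoids the loss by capturing. This is only my interpretation of the numbers; the readme does not state it. A quick check in Python (the variable names are mine, the figures are copied from the file above):

```python
# Figures copied from knnnnnkq.txt.
wtm_wins = 99_652_226_474
wtm_legal = 108_036_115_779
btm_loses = 64_590_726_084
btm_legal = 115_237_449_483
wtm_winning_captures = 44_855_044_362
btm_saving_captures = 41_354_699_417


def percent(part: int, whole: int) -> float:
    return 100.0 * part / whole


# Assumption: winning captures count as WTM wins, so they leave both terms.
wtm_pct = percent(wtm_wins - wtm_winning_captures,
                  wtm_legal - wtm_winning_captures)
# Assumption: saving captures are not BTM losses, so only the total shrinks.
btm_pct = percent(btm_loses, btm_legal - btm_saving_captures)

print(f"{wtm_pct:.2f}")  # 86.73
print(f"{btm_pct:.2f}")  # 87.42
```

Both results match the 'WTM win percent without captures' and 'BTM loss percent without captures' lines of the file.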

Good luck when comparing!

Regards from Spain.

Ajedrecista.

Re: Disk-based 8-piece TB generator.

Post by Ajedrecista »

Ajedrecista wrote: Tue Jun 02, 2026 6:27 pm[...]

knnnnnkq.zz.epd is too large to copy its content there (13,140 EPD lines with c0 tag comments), [...]

[...]
Edit time over. I have just learned that the limit is half an hour.

I only wanted to add that 'zz' in the EPD file names stands for 'reciprocal zugzwangs', as can be read in the readme.
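
According to the readme, the zugzwangs that a one-sided file cannot show are obtained from the flipped ending by reversing colours. A minimal Python sketch of that reversal on a FEN string follows. The function name flip_fen is my own, and I assume no castling rights and no en passant square (both fields are '-'), as in these pawnless tablebases. The sketch always reflects the board about the horizontal, which the readme says is needed for endings with pawns; for pawnless endings the result is a mirror image of the position obtained by swapping colours alone.

```python
def flip_fen(fen: str) -> str:
    """Swap colours and side to move, reflecting the board top to bottom.

    Assumes the castling and en passant fields are '-'; they are copied
    through unchanged.
    """
    fields = fen.split()
    placement, side = fields[0], fields[1]
    ranks = placement.split("/")
    # Reverse the rank order (reflection) and swap piece colours (case).
    flipped = "/".join(rank.swapcase() for rank in reversed(ranks))
    new_side = "b" if side == "w" else "w"
    return " ".join([flipped, new_side] + fields[2:])


# The kqkr example from the readme: Kc1, Qh1 / Ka1, Rb3 with Black to move.
print(flip_fen("8/8/8/8/8/1r6/8/k1K4Q b - - 0 1"))
# K1k4q/8/1R6/8/8/8/8/8 w - - 0 1
```

In the flipped position White, to move, has the rook check that wins the queen, so it is a krkq position with wtm winning, as the readme describes.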

I hope that all this info will be useful for you.

Regards from Spain.

Ajedrecista.