Syzygy bases from memory

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Pio
Posts: 335
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Syzygy bases from memory

Post by Pio »

hgm wrote: Sat Jun 26, 2021 11:31 am OK, so your idea is that this is a fight. That explains a lot.

I, however, am interested in chess-programming facts. And it should be clear to all readers by now that for perfect evaluation of draws in end-games that have on the order of 10% (or 90%) draws, by far the best method is to just store the draw/non-draw bitbase in memory and probe the relevant bit with a simple memory access. Other methods are of course possible, but they require many times the amount of memory and are easily an order of magnitude slower.

Especially just storing a table of only the draws (or wins) is not competitive, because 10% is just not rare enough to earn back the position encoding this requires.

Don't be fooled by persons who claim the contrary and are evasive when asked for details or throw smoke. They are wrong, and they know it.
If you have 10% draws you can store them using a Bloom filter (see https://en.m.wikipedia.org/wiki/Bloom_filter). You could then encode the draws in less memory than a bitbase, with only a one-percent false-positive rate. If you only store the roughly one percent of draws that your engine badly mispredicts as non-draws, and scale down the evaluation whenever you get a hit in the Bloom filter, you get a size reduction of more than a factor of a hundred compared to a bitbase, with most of its benefits.

You can do the same for the wins your engine mispredicts, and also for the losses, in two additional Bloom filters.
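
A minimal sketch of the Bloom-filter idea described here, assuming 64-bit position keys; the table size, probe count and hash mixing are illustrative choices (roughly 10 bits per stored key with 7 probes gives about a 1% false-positive rate), not anything specified in this thread.

Code: Select all

#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal Bloom filter over 64-bit position keys (illustrative sketch).
// 'bits' and 'probes' trade memory against the false-positive rate;
// about 10 bits per stored key with 7 probes gives roughly 1% false positives.
class BloomFilter {
public:
    BloomFilter(std::size_t bits, int probes)
        : bits_(bits), probes_(probes), table_((bits + 63) / 64, 0) {}

    void add(uint64_t key) {
        for (int i = 0; i < probes_; ++i) set(index(key, i));
    }

    // Never gives a false negative; false positives occur at the chosen rate.
    bool maybeContains(uint64_t key) const {
        for (int i = 0; i < probes_; ++i)
            if (!get(index(key, i))) return false;
        return true;
    }

private:
    std::size_t index(uint64_t key, int i) const {
        // SplitMix64-style mixing, salted per probe.
        uint64_t x = key + 0x9E3779B97F4A7C15ULL * (uint64_t)(i + 1);
        x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
        x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
        return (std::size_t)((x ^ (x >> 31)) % bits_);
    }
    void set(std::size_t b)       { table_[b >> 6] |= 1ULL << (b & 63); }
    bool get(std::size_t b) const { return (table_[b >> 6] >> (b & 63)) & 1; }

    std::size_t bits_;
    int probes_;
    std::vector<uint64_t> table_;
};

Such a filter would be filled once with the mispredicted draws; on a hit during evaluation the score is scaled down, accepting the occasional false positive.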
User avatar
hgm
Posts: 27841
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Syzygy bases from memory

Post by hgm »

Sure, if you have easy rules for determining the game result, the number of exceptions that need to be tabulated can be much reduced. I think this was the original idea with KPK in the first posting. You can use the 'rule of the square' for recognizing definite wins, and 'King in the Pawn's path, close to the Pawn, or with the opposition' for recognizing definite draws. By using more and more complex rules you can even reduce the table to nothing.
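
The 'rule of the square' part of this can be tested in a few lines. A minimal sketch, assuming squares are indexed 0..63 with a1 = 0; it covers only the pure pawn race (definite win when the defending king is outside the square), not the king-in-path/opposition draw rules, and it ignores the strong king obstructing its own pawn.

Code: Select all

#include <algorithm>
#include <cstdlib>

// Chebyshev (king-step) distance between two squares, 0..63 with a1 = 0.
static int kingDistance(int a, int b) {
    return std::max(std::abs(a % 8 - b % 8), std::abs(a / 8 - b / 8));
}

// 'Rule of the square' for a white pawn racing to promote: the defending
// (black) king catches the pawn only if it can enter the square of the pawn.
// Only the pure race is covered; the king-in-path / opposition draw rules and
// the strong king obstructing its own pawn are not handled here.
static bool defenderInSquare(int pawnSq, int defKingSq, bool defenderToMove) {
    int rank = pawnSq / 8;
    if (rank == 1) rank = 2;                 // account for the double step
    int stepsToPromote = 7 - rank;           // pawn moves needed to promote
    int promotionSq = (pawnSq % 8) + 56;     // same file, 8th rank
    int dist = kingDistance(defKingSq, promotionSq);
    return dist <= stepsToPromote - (defenderToMove ? 0 : 1);
}

// Definite win for the pawn's side (with the caveats above):
// !defenderInSquare(pawnSq, defKingSq, defenderToMove).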

The rules in this case would be more interesting than how exactly you organize the exception table. Without knowing the rules that are employed, any exception table would be completely useless. I don't know any rules for KRKB or KRKN.

Partial EGTs can also be used with such rules; e.g., when you are only interested in KBPK positions with an edge Pawn of the wrong promotion color, you can test that first, and only store the part of the EGT with such an edge Pawn for probing. If KRKN is only lost with the weak King on the edge (I don't know whether that is true), you can store an EGT that only contains such positions, and declare all other positions draws without lookup.

KingSlayer has a dedicated KPK evaluator that is purely rule based; the rules say 'win/loss', 'draw' or 'search on/normal evaluation'. The win/loss and draw rules cover the cases where the actual result would otherwise take many moves to materialize. So a shallow search is always enough to resolve the 'search on' positions. But you still need some search, so in that respect it is inferior to a WDL EGT. In general, I am happy if I can make things such that a shallow search is guaranteed to get the right result. (No 'persistent' mis-evaluations, such as you would get when you think KQKP is +8 but in fact it is a draw.) There will always be positions where the static eval is wrong because of some shallow tactics, and it doesn't make much sense to try to remove that for the particular case of entering a simple end-game.

When KBPK was mentioned I wondered how useful it would be (Elo-wise) to also recognize draws when the Pawn is not an edge Pawn. There are some of those, like

[d]1k6/8/8/6p1/8/8/5K2/7b w

Perhaps KBPK should be heavily discounted when the Pawn is not protected.
User avatar
Rebel
Posts: 7025
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Syzygy bases from memory

Post by Rebel »

EPD release :

Code: Select all

kpk.epd  :    662.705
kbpk.epd :  4.729.762
kpkp.epd : 14.872.178
kqkp.epd :  3.170.189
krkb.epd : 33.754.513
krkn.epd : 34.793.857
krkp.epd : 36.126.097
About 128 million in total.

http://rebel13.nl/4-man.7z [453 Mb]

For ProDeo I am using a selection of 6.3 million positions to fix some holes in my HCE. Probably an Elo gain of 1-3, if measurable at all. For starter engines, as an educated guess, ~20-30 Elo.

Remarks:
1. Positions are analyzed with Komodo Dragon 2 with the Syzygy bases installed.
2. There are some quirks: a score of -32000 means stalemate, and if there is no "bm" that also means stalemate.
3. The KBPK set only contains positions with the wrong-colored bishop and an A- or H-pawn.
4. The KQKP set only contains positions where the white pawn has already reached the 6th row or the black pawn has advanced to the 3rd row.
90% of coding is debugging, the other 10% is writing bugs.
jonkr
Posts: 178
Joined: Wed Nov 13, 2019 1:36 am
Full name: Jonathan Kreuzer

Re: Syzygy bases from memory

Post by jonkr »

Rebel wrote: Mon Jun 28, 2021 11:24 am
Thanks for the code!

Compiler needs "board.h"

Maybe you can post the necessary part of it?
The BitBase probing itself doesn't require board.h, but I included the functions for probing from my engine's board format. (Partially because that's where the code was, partially because it might be helpful when trying to plug in a different board format and use the probe.) I think a conversion from the engine board to the probe function inputs will always be needed, but my code wasn't set up as a library, so it's probably trickier than it needs to be. Also, I think programs nowadays all use a board flipped from what I used (I had to vertically flip my board to send in the Syzygy base inputs).
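
For the vertical flip mentioned here: with 0..63 square indexing, mirroring between a1-based and a8-based conventions is just an XOR of the square index with 56. This is a generic sketch, not jonkr's code; the whole-bitboard version relies on the GCC/Clang byte-swap builtin.

Code: Select all

#include <cstdint>

// Mirror a 0..63 square index vertically (a1 <-> a8): XOR with 56 flips the
// three rank bits and leaves the file bits alone.
inline int flipSquare(int sq) { return sq ^ 56; }

// The same flip for a whole bitboard: reverse the byte order, since each
// byte of a 64-bit bitboard holds one rank (GCC/Clang builtin).
inline uint64_t flipBitboard(uint64_t bb) { return __builtin_bswap64(bb); }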
User avatar
hgm
Posts: 27841
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Syzygy bases from memory

Post by hgm »

Interesting observation: in KRKB only 2.5% of the positions with the weak side to move are lost. By far most of these (2%) are lost-in-1 (meaning the reply move will either checkmate or capture the Bishop). But in most of those the Bishop cannot be saved because the weak side is in check. Only 0.58% of the positions are lost when the Bishop player has the move and is not in check (about 1/3 of those lost-in-1).

That is quite a low percentage, for which it would save space to store only these as exceptions, e.g. in a hash table.

So a resource-economical approach would be this:

Always search (even at d=0) when the strong side is on move, but only search with the weak side to move when it is his first move in this end-game (i.e. ply counter <= 2) and he is in check, and probe the exception list in all other cases. If the position is not on the list, it is evaluated as draw (irrespective of remaining search depth). If it is listed as a loss, and the root is not yet in the end-game (ply counter < ply level), it is evaluated as mate-in-many. If the root is in the end-game you just search on if there is depth left, and evaluate as mate-in-many otherwise.
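
A sketch of how this probing scheme might sit in an engine, for a KRKB-like end-game; Position and all the helper functions are hypothetical engine hooks, not code from this thread.

Code: Select all

// Sketch only: Position and the helpers below stand in for whatever the
// engine actually provides.
enum ProbeResult { SEARCH_ON, DRAW_SCORE, MATE_IN_MANY };

struct Position;
bool strongSideToMove(const Position&);
bool inCheck(const Position&);
bool firstWeakSideMoveInEndgame(const Position&);   // ply counter <= 2 in this end-game
bool probeExceptionList(const Position&);           // hash table of the lost positions

ProbeResult probeKRKB(const Position& pos, int depth, bool rootInEndgame) {
    if (strongSideToMove(pos))
        return SEARCH_ON;                            // always search, even at d = 0

    // Weak side to move: only search its first move in this end-game
    // when it is in check; otherwise trust the exception list.
    if (firstWeakSideMoveInEndgame(pos) && inCheck(pos))
        return SEARCH_ON;

    if (!probeExceptionList(pos))
        return DRAW_SCORE;                           // not listed => draw, at any depth

    if (!rootInEndgame)                              // end-game only reached inside the tree
        return MATE_IN_MANY;
    return depth > 0 ? SEARCH_ON : MATE_IN_MANY;     // root already in the end-game
}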
Pio
Posts: 335
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Syzygy bases from memory

Post by Pio »

hgm wrote: Tue Jun 29, 2021 11:59 am Interesting observation: in KRKB only 2.5% of the positions with the weak side to move are lost. By far most of these (2%) are lost-in-1 (meaning the reply move will either checkmate or capture the Bishop). But in most of those the Bishop cannot be saved because the weak side is in check. Only 0.58% of the positions are lost when the Bishop player has the move and is not in check (about 1/3 of those lost-in-1).

That is quite a low percentage, for which it would save space to store only these as exceptions, e.g. in a hash table.

So a resource-economical approach would be this:

Always search (even at d=0) when the strong side is on move, but only search with the weak side to move when it is his first move in this end-game (i.e. ply counter <= 2) and he is in check, and probe the exception list in all other cases. If the position is not on the list, it is evaluated as draw (irrespective of remaining search depth). If it is listed as a loss, and the root is not yet in the end-game (ply counter < ply level), it is evaluated as mate-in-many. If the root is in the end-game you just search on if there is depth left, and evaluate as mate-in-many otherwise.
This observation was also something I had in mind. Most endgames will either quickly lead to a win for the stronger side or be a draw. Usually the most common exceptions are when the weaker side has the move and can give a check or a fork.

I guess the most resource-friendly way to store the exceptions is to use a couple of Bloom filters. The first Bloom filter registers the exceptions, but unfortunately also matches some false positives. The false positives could then be stored in a second Bloom filter, and so on, to generate a perfect hash function.
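
A sketch of such a cascade, reusing the BloomFilter sketch from the earlier post; the parity rule for deciding membership and the treatment of the final level follow the usual filter-cascade construction and are assumptions, not something specified here.

Code: Select all

#include <cstddef>
#include <cstdint>
#include <vector>

// Cascade of Bloom filters as described above: level 0 holds the exceptions,
// level 1 holds the false positives of level 0, level 2 the exceptions that
// level 1 still wrongly matches, and so on. Uses the BloomFilter class
// sketched earlier in the thread.
struct FilterCascade {
    std::vector<BloomFilter> levels;

    // A key is an exception iff the first level that rejects it has an odd
    // index (level 0 accepted it, level 1 rejected it as a false positive, ...).
    bool isException(uint64_t key) const {
        for (std::size_t i = 0; i < levels.size(); ++i)
            if (!levels[i].maybeContains(key))
                return (i % 2) == 1;
        // Only reached if the cascade was cut short; a fully built cascade (or
        // a small exact list of leftovers) makes this case impossible.
        return levels.size() % 2 == 1;
    }
};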
User avatar
hgm
Posts: 27841
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Syzygy bases from memory

Post by hgm »

'Bloom filter' isn't really a description of any algorithm; it just describes the purpose. There is no guarantee whatsoever that an actual implementation that performs the desired operation exists that is an improvement on an exhaustive tabulation of all possibilities.

Also note that won positions with a pretty large DTC do exist, both in KRKB (18 moves) and KRKN (27 moves). In an engine Quiescence Search would usually be enough (when the first two ply also search checks) to make sure you would not evaluate the tactical positions that you mention.
Pio
Posts: 335
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Syzygy bases from memory

Post by Pio »

hgm wrote: Tue Jun 29, 2021 2:02 pm 'Bloom filter' isn't really a description of any algorithm; it just describes the purpose. There is no guarantee whatsoever that an actual implementation that performs the desired operation exists that is an improvement on an exhaustive tabulation of all possibilities.

Also note that won positions with a pretty large DTC do exist, both in KRKB (18 moves) and KRKN (27 moves). In an engine Quiescence Search would usually be enough (when the first two ply also search checks) to make sure you would not evaluate the tactical positions that you mention.
The main advantage I see with using Bloom filters as I described is that the lookups can be parallelised on disk. If you just tabulate the exceptions and do a binary search on the ordered exceptions, you will have to do logarithmically many (w.r.t. the bit-length of the position index) disk accesses that cannot be parallelised.

If you for example have approximately 4 billion positions, each position will require 32 bits in the exception list. If 1 percent of the 4 billion positions are exceptions, you will need approximately 40 million * 32 bits to tabulate the exceptions.

If you on the other hand use a couple of Bloom filters with, for example, a 1 percent false-positive rate, you could store the exceptions in 40 million * 10 bits + 40 million * 10 bits + 400 thousand * 10 bits + … which leads to less than 40 million times 21 bits. With my way of storing the exceptions you only need about 2/3 of the disk space you proposed, and at the same time it will be fast, since you can parallelise the disk accesses.

Maybe I have done something wrong in the calculations. Please correct me if so. Maybe you have a better idea.
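
As a back-of-the-envelope check of these numbers (assuming ~10 bits per stored key per level and a 1% false-positive rate at every level, as in the post), the cascade indeed comes out at roughly 800 Mbit against 1280 Mbit for the plain 32-bit list:

Code: Select all

#include <cstdio>

// Back-of-the-envelope check of the sizes above: 4e9 positions, 1% exceptions,
// ~10 bits per stored key per cascade level, ~1% false positives per level.
int main() {
    const double positions  = 4e9;
    const double bitsPerKey = 10.0;
    const double fpRate     = 0.01;

    double inSet     = 0.01 * positions;     // 40 million exceptions, level 0
    double colliders = positions - inSet;    // keys a level may falsely accept
    double totalBits = 0;

    while (inSet >= 1.0) {
        totalBits += inSet * bitsPerKey;     // storage for this level
        double fp  = fpRate * colliders;     // its expected false positives
        colliders  = inSet;                  // this level's keys collide with the next level,
        inSet      = fp;                     // which must store the false positives
    }
    std::printf("cascade: ~%.0f Mbit, plain 32-bit list: %.0f Mbit\n",
                totalBits / 1e6, 0.01 * positions * 32 / 1e6);
    // Prints roughly 804 Mbit vs 1280 Mbit, i.e. about 2/3 of the plain list.
    return 0;
}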
User avatar
hgm
Posts: 27841
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Syzygy bases from memory

Post by hgm »

I must admit that you lost me completely. You seem to make all kinds of assumptions that just pop out of nowhere.

Let us take a practical case like KRKN, which has about 9% wins even with the weak side on move (which is 16M positions, reduced by symmetry to approximately 2M 'canonical' ones, which could be tabulated in 256KB), and only 2.8% lost-in-1. Now what exactly would the bloom filter do, and how would it do that?

[Edit] One of the problems is that I don't see how you could beat the information content: with 9% wins this would be -(0.09*ln(0.09) + 0.91*ln(0.91))/ln(2) = 0.436 bit per position. So the draw/non-draw EGT should not be compressible to smaller than 43% of its uncompressed size, especially not by methods that randomize their storage location (and thus destroy any correlation patterns between the bits) like the bloom filter. And a bloom filter of half the uncompressed size of the EGT doesn't do very well: the number of false positives there is about the same as the number of real positives for a reasonable number of probes, so that applying it really doesn't gain you anything.

So the Bloom filter seems a poor algorithm for compressing the amount of data that is required. I can imagine much better methods for the EGT, which require fewer or more localized accesses to the data set, and that achieve a factor of 2. One could for example think of run-length encoding w.r.t. the draws, and then Huffman-encoding these run lengths.
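
A sketch of the run-length idea: walk the bitbase in index order and record the length of each run of draws before a win; a Huffman or similar entropy coder over these run lengths would then do the actual compression. Random access would additionally need some block indexing, which is not discussed here.

Code: Select all

#include <cstdint>
#include <vector>

// Record the length of each run of draws preceding a win, walking the WDL
// bitbase in index order. An entropy coder over these run lengths (Huffman,
// as suggested) would then do the actual compression.
std::vector<uint32_t> drawRunsBeforeWins(const std::vector<bool>& isWin) {
    std::vector<uint32_t> runs;
    uint32_t run = 0;
    for (bool w : isWin) {
        if (w) { runs.push_back(run); run = 0; }   // run of draws ends at this win
        else   { ++run; }
    }
    return runs;
}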
Pio
Posts: 335
Joined: Sat Feb 25, 2012 10:42 pm
Location: Stockholm

Re: Syzygy bases from memory

Post by Pio »

hgm wrote: Tue Jun 29, 2021 3:14 pm I must admit that you lost me completely. You seem to make all kinds of assumptions that just pop out of nowhere.

Let us take a practical case like KRKN, which has about 9% wins even with the weak side on move (which is 16M positions, reduced by symmetry to approximately 2M 'canonical' ones, which could be tabulated in 256KB), and only 2.8% lost-in-1. Now what exactly would the bloom filter do, and how would it do that?

[Edit] One of the problems is that I don't see how you could beat the information content: with 9% wins this would be -(0.09*ln(0.09) + 0.91*ln(0.91))/ln(2) = 0.436 bit per position. So the draw/non-draw EGT should not be compressible to smaller than 43% of its uncompressed size, especially not by methods that randomize their storage location (and thus destroy any correlation patterns between the bits) like the bloom filter. And a bloom filter of half the uncompressed size of the EGT doesn't do very well: the number of false positives there is about the same as the number of real positives for a reasonable number of probes, so that applying it really doesn't gain you anything.

So the Bloom filter seems a poor algorithm for compressing the amount of data that is required. I can imagine much better methods for the EGT, which require fewer or more localized accesses to the data set, and that achieve a factor of 2. One could for example think of run-length encoding w.r.t. the draws, and then Huffman-encoding these run lengths.
My simplified example was a table of four billion positions of which we are only interested in one percent, of a certain type (let's say they are draws). In this simplified example we are, for simplicity, interested in neither the wins nor the losses (but they could of course be saved in exactly the same way).

My way of using Bloom filters will need approximately 20% (which is 800 million bits) of the original size of 4 billion bits. If I am not mistaken, that should be approximately a factor of 6 worse than optimal encoding (but of course I can be wrong, since I calculated it in my head).

The number of file accesses with my method will be approximately 20 for looking up whether a position is a draw or not, but each read is independent of the others and they could be done in parallel.

I know how Huffman encoding works (I have implemented it myself using a min-heap and a binary tree) and run-length encoding is trivial, but I do not see how you would use it here.

Please tell me how much you can compress the 4 billion positions, of which we are only interested in the 1% that are draws, how many accesses to the file system you need, and whether they can be done in parallel.

Many thanks!!!