lucasart wrote: ↑Thu Mar 18, 2021 5:36 am
I think this algorithm is provably optimal, at least when the metric is the worst case. The proof is basically obvious by just following each step of the algorithm.
I am pretty sure there are better algorithms for the worst case. If we just look at the piece placements for now, the pext format takes
7*cnt(kings) + 7*cnt(queens) + 6*cnt(rooks) + 5*cnt(bishops) + 4*cnt(knights) + 3*cnt(pawns) + 1*cnt(empty)
bits, which for the starting position results in 168 bits.
With a huffman coding you can for example achieve
6*cnt(kings) + 6*cnt(queens) + 5*cnt(rooks) + 5*cnt(bishops) + 5*cnt(knights) + 3*cnt(pawns) + 1*cnt(empty)
bits, which in the starting position results in 164 bits. In the worst case the difference is even greater, because queening a pawn results in a greater increase in bits. Keep in mind that you have to capture a piece before pawns can promote. So for example you can have white's a pawn capture the black's b pawn and then both sides can promote one pawn, which will increase the number of bits by 2*6-3*3+1=4 for huffman coding and by 2*7-3*3+1=6 for pext encoding.
That said, I really like the idea of decompressing multiple pieces in parallel with a single instruction instead of decoding them square by square. If you are willing to slightly complicate the algorithm you can optimize the pext format even further:
There can never be any pawn in the first or last rank => save pext(pawns, occupied & ~rank1 & ~rank8) instead of pext(pawns, occupied). This will save 16 bit in the starting position and is probably most effective in the opening.
There will always be exactly one king of each color => save tzcnt(pext(white_king, occupied)) and tzcnt(pext(black_king, occupied)) and remove pext(queens, remaining). This will take ceil(log2(cnt(occupied)-1)) bits per king, but will save 12+cnt(queens) bits in the following pext compressions.
There will be at most one ep square => save tzcnt(pext(epSquare, candidates)) instead of pext(epSquare, candidates). This will take ceil(log2(cnt(candidates))) bits instead of cnt(candidates) bits. This optimization is only helpful, if cnt(candidates) is greater than 2, which should be quite rare.
lucasart wrote: ↑Wed Mar 17, 2021 1:35 am
Is this useful? Absolutely not. But it's fun!
I guess really useful is a format like Stockfish's binpack, which saves compressed moves instead of compressed positions for related positions (positions from the same game). I am not aware, of an application for saving a collection of unrelated FENs, were one of the "established" formats like .bin or .binpack is too slow or needs a better compression ratio, but I'd really like to see how low we can get the bitcount of the average case with this format.