Question to syzygy author

hgm · Post by **hgm** » Sat May 07, 2016 3:11 pm

syzygy wrote:..., but does not help so much in finding an optimal swindling strategy (whatever that might be... if the opponent uses the same tables nothing will help anyway).

Indeed, optimizing swindling chances requires opponent modelling. But basically all of Chess is swindling, because the initial position is (believed to be) a draw. So the heuristic evaluation is pretty good at it.

Usually the chances for a swindle are better in complex positions than in simple ones. This argues for taking as much of the pain as you can in the current phase, rather than deferring it to a later phase. If the opponent uses the same tables as you, nothing will matter. But if he doesn't, it is much more likely that he will lack the 6-men tables than that he will lack the 4-men tables, etc.

mcostalba · Post by **mcostalba** » Sun May 08, 2016 12:25 pm

syzygy wrote: You're welcome

Thanks again for your clear and detailed answers!

In probe_dtz_table I have found this code (reformatted by me, but same functionality than original):

Code: Select all

    if &#40;DTZTable.front&#40;).key != key && DTZTable.front&#40;).key2 != key&#41; &#123;
        
        // Enforce "Most Recently Used" &#40;MRU&#41; order for DTZ_list
        for &#40;auto it = DTZTable.begin&#40;); it != DTZTable.end&#40;); ++it&#41;
            if &#40;it->key == key&#41; &#123;
                // Move to front without deleting the element
                DTZTable.splice&#40;DTZTable.begin&#40;), DTZTable, it&#41;;
                break;
            &#125;

        // If still not found, add a new one
        if &#40;DTZTable.front&#40;).key != key&#41; &#123;

            WDLEntry* wdlEntry = WDLHash&#91;key&#93;;
            if (!wdlEntry&#41; &#123;
                *success = 0;
                return 0;
            &#125;

            .....   add new entry ....

In particular we check if current first DTZ entry in the list matches current position, and we test again both keys in this case, but if comparison fails, we continue searching across the DTZ list but in this case testing only first key. There is a reason for this? I have measured about 2% of cases where first key is wrong, but second key matches the position's key.

Also a side question: what is the rationale to limit the DTZ list to 64 entries?

syzygy · Post by **syzygy** » Sun May 08, 2016 2:38 pm

mcostalba wrote:In particular we check if current first DTZ entry in the list matches current position, and we test again both keys in this case, but if comparison fails, we continue searching across the DTZ list but in this case testing only first key. There is a reason for this? I have measured about 2% of cases where first key is wrong, but second key matches the position's key.

I can't think of a reason, so this seems to be a bug. Not critical, but it should be fixed. Thanks for finding it

Also a side question: what is the rationale to limit the DTZ list to 64 entries?

If the engine just plays a single game, the limit will not be reached.
If it plays many games, or if the engine is used to analyse many endgames, many DTZ files could be opened and mmap()ed to memory that are not useful for the current position. Because opening them is not time critical, it seemed sensible to place a limit on the number of DTZ files open at the same time.

64 is just a round number. It could probably be lowered to 16 or even 8. If there is no memory pressure, any cached data will remain cached in RAM anyway. If there is memory pressure, then data in tables that are no longer mmap()ed is probably released more quickly, which is good. On the other hand, I don't expect large parts of the DTZ files to be cached in RAM. Only relatively few positions are probed, so the OS should only cache a few disk pages. So difficult to say if changing it will have any good or bad impact at all...

yurikvelo · Post by **yurikvelo** » Sun May 08, 2016 2:42 pm

Onegin wrote:No, I'm not a developer, I'm a user. ... I couldn't find an explanation for such a behavior and started thinking about EGTBs

ask your questions in topic "Syzygy vs Nalimov" etc and I will explain

velmarin · Post by **velmarin** » Sun May 08, 2016 2:55 pm

yurikvelo wrote:
Onegin wrote:No, I'm not a developer, I'm a user. ... I couldn't find an explanation for such a behavior and started thinking about EGTBs
ask your questions in topic "Syzygy vs Nalimov" etc and I will explain

New moderator.

mcostalba · Post by **mcostalba** » Mon May 09, 2016 11:21 am

syzygy wrote: 64 is just a round number. It could probably be lowered to 16 or even 8. If there is no memory pressure, any cached data will remain cached in RAM anyway. If there is memory pressure, then data in tables that are no longer mmap()ed is probably released more quickly, which is good. .

I don't think you can map more files than available virtual address space, so maybe the limit becomes sensible with 32 bit architectures and many big DTZ files open.

Coming back to source code

I am trying to understand decompress_pairs() but it seems another very complicated part of syzygy. Care to please walk through that code enlightening me? Thanks.

syzygy · Post by **syzygy** » Mon May 09, 2016 11:50 pm

mcostalba wrote:
syzygy wrote: 64 is just a round number. It could probably be lowered to 16 or even 8. If there is no memory pressure, any cached data will remain cached in RAM anyway. If there is memory pressure, then data in tables that are no longer mmap()ed is probably released more quickly, which is good. .
I don't think you can map more files than available virtual address space, so maybe the limit becomes sensible with 32 bit architectures and many big DTZ files open.

Yes, but on a 32-bit architecture the 6-piece WDL files will also quickly run out of address space. (In theory you could also unmmap() WDL files, but that is a lot of hassle and you would need to add a lot of locking. As you see, things are already quite complicated, so it is nice that we can leave memory management to the OS.)

Coming back to source code I am trying to understand decompress_pairs() but it seems another very complicated part of syzygy. Care to please walk through that code enlightening me? Thanks.

The first two lines deal with the special case that all table positions store the same value (this happens in a few cases because of the don't cares, but one could wonder whether it was a good idea to make this a special case; anyway, that's what I did). Reusing d->idxbits and d->min_len for this is rather ugly.

To understand the rest, you first have to know that the compressed data (to which d->data points) is divided into blocks of size 1 << d->blocksize.

Each block stores a number of symbols compressed using Huffman coding. Each symbol represents either a WDL or (remapped, see above) DTZ value, or a pair of other symbols (recursively). If you keep expanding the symbols in a block, you end up with up to 65536 WDL or DTZ values.

Each symbol represents up to 256 values and will correspond after Huffman coding to at least 1 bit. So a block of 32 bytes corresponds to at most 32 x 8 x 256 = 65536 values. This maximum is only reached for tables that consist mostly of draws or mostly of wins, but such tables are actually quite common.

In principle, the blocks in WDL tables are 64 bytes long (and will be aligned on cache lines). But for mostly-draw or mostly-win tables this can leave many 64-byte blocks only half-filled, so in such cases blocks are 32 bytes long.

The blocks of DTZ tables are up to 1024 bytes long. The generator picks the size that leads to the smallest table (too big blocks means too many are not full, too small blocks means too much overhead is lost on indexing etc.).

The "book" of symbols and Huffman codes is the same for all blocks in the table (a non-symmetric pawnless TB file will have one table for wtm and one for btm, a TB file with pawns will have tables per file a,b,c,d). Everything is in the PairsData struct to which "d" points.

So now the code. First we need to locate the right block that stores the value at index "idx".

Since each block stores a variable number of values (up to 65536), we have an array that stores the number of values for each block: d->sizetable[]. Each entry is 2 bytes and you have to add 1 to the value stored in it to get the real size (so a stored value of 65535 means 65536 symbols).

We could start at block 0 and keep skipping blocks one by one, each time subtracting d->sizetable[current_block] from idx, until we are in the block that contains the value we want. But that is too much work. Instead, we have an index structure d->indextable with entries that point into d->sizetable.

This is slightly complicated. We have about one such entry per 1 << d->idxbits index values. Each entry consists of a block number (of 4 bytes) and a (sub)index into that block (of 2 bytes). The block number and (sub)index into the block for entry k corresponds to the location of the following index (let's call it N(k)):

Code: Select all

  N&#40;k&#41; = k * &#40;1 << d->idxbits&#41; + &#40;1 << &#40;d->idxbits - 1&#41;)

Given our index "idx" we calculate this value "k":

Code: Select all

  uint32 mainidx = idx >> d->idxbits;

We also calculate how far our "idx" is from N(k):

Code: Select all

  int litidx = &#40;idx & (&#40;1 << d->idxbits&#41; - 1&#41;) - &#40;1 << &#40;d->idxbits - 1&#41;);

This simply calculates litidx = idx - N(k).

We read out the number of the block that stores N(k):

Code: Select all

  uint32 block = *&#40;uint32 *)&#40;d->indextable + 6 * mainidx&#41;;

We add the (sub)index into that block:

Code: Select all

  litidx += *&#40;ushort *)&#40;d->indextable + 6 * mainidx + 4&#41;;

If idx happened to be equal to N(k), then litidx is now equal to the (sub)index into the block that corresponds to N(k), which is what we want.

If idx was smaller or bigger than N(k), then litidx will now be as much smaller or bigger than the (sub)index into the block that corresponds to N(k), so IF litidx points into the block, then we know precisely which value we need.

However, usually we need one of the preceding blocks and we have to roll back a bit:

Code: Select all

  if &#40;litidx < 0&#41; &#123;
    do &#123;
      litidx += d->sizetable&#91;--block&#93; + 1;
    &#125; while &#40;litidx < 0&#41;;
  &#125;

or we need one of the succeeding block and we have to go forward a bit:

Code: Select all

  else &#123;
    while &#40;litidx > d->sizetable&#91;block&#93;)
      litidx -= d->sizetable&#91;block++&#93; + 1;
  &#125;

(The reason for letting the entries point into the "middle" of a series of blocks instead of to the beginning is that this reduces the average number of iterations of these two loops by half compared to a single loop that always goes forward. I think I aim for about one "mainidx" per about 16 blocks, so max about 8 blocks backward or forward.)

Finally, we find the address of our block:

Code: Select all

  uint32 *ptr = &#40;uint32 *)&#40;d->data + &#40;block << d->blocksize&#41;);

The variable "litidx" is the (sub)index into this block of the value we need.

(I just noticed I have been copying and pasting from my own code instead of that of Stockfish, which also deals with endianness.)

Now comes Huffman decoding. We decode symbol by symbol until we are at the symbol we need. After each symbol we subtract from litidx the number of values the symbol represents, which is d->symlen[sym] + 1 (so at least 1 and at most 256). We use a canonical Huffman code. The first part of the loop decodes the next symbol:

Code: Select all

    int l = m;
    while &#40;code < base&#91;l&#93;) l++;
    sym = offset&#91;l&#93;;
    if (!LittleEndian&#41;
      sym = (&#40;sym & 0xff&#41; << 8&#41; | &#40;sym >> 8&#41;;
    sym += static_cast<int>(&#40;code - base&#91;l&#93;) >> &#40;64 - l&#41;);

The variable l is the number of bits of the symbol. The canonical code is ordered such that longer symbols (in terms of the number of bits of their Huffman code) come first. So if base[l] <= code < base[l - 1] where "code" contains the next 64 bits of the compressed data, then the first l (most significant) bits of "code" encode the next symbol.

Then the length (in values) of the symbol is subtracted and, if necessary, the next l bits are fetched. This could run into the next block, but that is OK. We cannot run beyond the end of the mmap()ed file, because there is still a 16-byte checksum at the end.

Code: Select all

    if &#40;litidx < &#40;int&#41;symlen&#91;sym&#93; + 1&#41; break;
    litidx -= &#40;int&#41;symlen&#91;sym&#93; + 1;
    code <<= l;
    bitcnt += l;
    if &#40;bitcnt >= 32&#41; &#123;
      bitcnt -= 32;
      uint32 tmp = *ptr++;
      if &#40;LittleEndian&#41;
        tmp = BSWAP32&#40;tmp&#41;;
      code |= (&#40;uint64&#41;tmp&#41; << bitcnt;
     &#125;

When we have the right symbol, we expand it into two symbols, choose the left or right symbol of the pair as appropriate (the one that contains the value we are looking for) and keep expanding until we are at a symbol of length 1 (symlen[sym] +1 == 1), which is the value we need:

Code: Select all

  ubyte *sympat = d->sympat;
  while &#40;symlen&#91;sym&#93; != 0&#41; &#123;
    ubyte* w = sympat + &#40;3 * sym&#41;;
    int s1 = (&#40;w&#91;1&#93; & 0xf&#41; << 8&#41; | w&#91;0&#93;;
    if &#40;litidx < &#40;int&#41;symlen&#91;s1&#93; + 1&#41;
      sym = s1;
    else &#123;
      litidx -= &#40;int&#41;symlen&#91;s1&#93; + 1;
      sym = &#40;w&#91;2&#93; << 4&#41; | &#40;w&#91;1&#93; >> 4&#41;;
    &#125;
  &#125;

Well, it's not yet the value we need. There is one last mapping step:

Code: Select all

 return sympat&#91;3 * sym&#93;;

Each sympat[] entry has 3 bytes or rather 2 x 12 bits. The first 12 bits is the left-hand symbol of the pair, the second 12 bits is the right-hand symbol. If the symbol has length 1, then it corresponds to a WDL or DTZ value which is held in the first byte of sympat.

mcostalba · Post by **mcostalba** » Sat May 14, 2016 12:31 pm

syzygy wrote: So now the code.

That was hard

This one is the most complex so far (and I think in general because I have more or less looked at all the code base now, apart the high level probing).

I have fully documented and also reformatted a bit (well, not just a bit

) all this part . I hope it is more clear now, at least it is for me.

Thanks for your support!

Full code is here: https://github.com/official-stockfish/S ... bprobe.cpp

velmarin · Post by **velmarin** » Sat May 14, 2016 2:19 pm

mcostalba wrote:
That was hard

This one is the most complex so far (and I think in general because I have more or less looked at all the code base now, apart the high level probing).

I have fully documented and also reformatted a bit (well, not just a bit ) all this part . I hope it is more clear now, at least it is for me.

Thanks for your support!

Full code is here: https://github.com/official-stockfish/S ... bprobe.cpp

I really like your work, very interesting.
I have compiled and I have been an exception.
Perhaps mine is a problem, but I don't think so.
[pgn]
[Event "DESKTOP-QFSOBQ2, Blitz 1m"]
[Site "DESKTOP-QFSOBQ2"]
[Date "2016.05.14"]
[Round "2"]
[White "Stockfish MC 64"]
[Black "Gull.3.0_JVx32"]
[Result "*"]
[ECO "A37"]
[Annotator "0.00;-0.05"]
[PlyCount "118"]
[TimeControl "60"]

{AMD Phenom(tm) II X4 B50 Processor 3215 MHz W=16.5 plies; 795kN/s; 39.627
TBAs B=12.5 plies; 726kN/s; 64.597 TBAs} 1. Nf3 g6 2. c4 Bg7 3. Nc3 c5 4. g3
Nc6 5. Bg2 e6 6. O-O Nge7 7. d3 O-O 8. Bd2 d6 {Ambos última jugada del libro}
9. Qc1 {0.00/17 3} d5 {-0.05/11 1 (Cf5)} 10. Bh6 {0.14/18 1} Bxh6 {-0.04/12 1}
11. Qxh6 {0.04/18 1} f6 {0.00/14 2} 12. Rad1 {0.18/18 2} Nf5 {-0.08/15 3 (Tf7)}
13. Qh3 {0.07/17 1} h5 {-0.02/14 4 (d4)} 14. g4 {0.00/17 2} hxg4 {-0.18/14 2}
15. Qxg4 {0.03/18 3} Kg7 {-0.34/14 4} 16. e4 {0.10/17 3} d4 {-0.29/13 2} 17.
Na4 {0.12/17 1 (Ce2)} Nh6 {-0.59/12 1} 18. Qg3 {0.09/16 0} e5 {-0.62/13 3 (b6)}
19. Nxc5 {-0.35/17 2 (h4)} Qa5 {-0.54/13 0} 20. b4 {-0.34/19 4 (Cb3)} Qxb4 {-0.
63/12 1} 21. Nb3 {-0.43/18 1} Qe7 {-0.71/13 2 (Th8)} 22. h4 {-0.36/17 4 (h3)}
Bg4 {-0.63/12 2 (Rh7)} 23. Rd2 {-0.31/16 1} Kh7 {-0.64/13 1 (Ah5)} 24. Nh2 {-0.
31/15 1 (Tb1)} Be6 {-0.57/12 1 (Ah5)} 25. f4 {-0.13/14 1} exf4 {-0.51/12 1
(Tad8)} 26. Rxf4 {0.00/17 2} Rad8 {-0.58/12 1 (Cf7)} 27. Bh3 {0.00/14 1 (Cf3)}
Bxh3 {-0.59/10 1} 28. Qxh3 {-0.14/11 0} a5 {-0.65/11 3 (Rh8)} 29. Nf3 {0.00/14
2 (Cc1)} Rg8 {-0.97/9 0 (a4)} 30. Rg2 {0.00/15 4} a4 {-0.54/9 1 (Rh8)} 31. Nc1
{0.20/16 1} Kh8 {-0.42/12 1 (Td6)} 32. Ne2 {0.41/13 0} b6 {-0.30/12 2 (Cf7)}
33. Nd2 {0.53/16 2 (Rh1)} f5 {0.22/13 4} 34. Rf1 {0.85/15 2} Ne5 {0.22/12 0
(Dd6)} 35. Nf4 {1.16/13 0} Nhg4 {0.22/12 0 (Dd6)} 36. Nf3 {0.63/14 1 (h5)}
Nxf3+ {0.36/12 2} 37. Rxf3 {0.91/13 0} fxe4 {0.36/11 0 (Dh7)} 38. Qxg4 {1.10/
15 1} exf3 {0.28/12 0} 39. Nxg6+ {0.52/13 0} Rxg6 {0.28/12 0} 40. Qh5+ {0.39/
19 1} Qh7 {0.28/11 0} 41. Qxg6 {0.39/16 0} Qxg6 {0.43/12 0 (Tf8)} 42. Rxg6 {0.
54/16 0} Re8 {0.43/12 0} 43. Rxb6 {0.35/18 1} Re2 {0.27/13 0 (Te3)} 44. a3 {0.
67/18 0} Rg2+ {0.15/13 0 (Te3)} 45. Kf1 {0.10/18 1} Ra2 {0.15/12 0} 46. c5 {0.
51/18 1} Rxa3 {0.15/11 0} 47. c6 {0.37/19 1} Kg7 {0.18/13 1} 48. Rb7+ {0.31/16
0 (c7)} Kf6 {0.24/14 0} 49. c7 {0.13/17 0} Rc3 {0.24/13 0} 50. Kf2 {0.11/16 0}
Ke5 {0.24/12 0} 51. Kg3 {0.13/18 0} a3 {0.24/14 1} 52. Ra7 {0.14/15 0} Kd6 {0.
24/13 0 (f2)} 53. h5 {0.29/15 0} Rxc7 {0.24/11 0} 54. Rxa3 {0.22/13 0} Rc3 {0.
16/12 0} 55. Ra6+ {0.00/17 1 (Ta1)} Ke5 {0.14/10 0 (Re7)} 56. h6 {0.00/18 0
(Ta1)} Rxd3 {0.00/9 0} 57. h7 {0.00/19 0 (Ta1)} f2+ {0.00/10 0} 58. Kxf2 {0.00/
20 0 (Rg2)} Rh3 {0.00/16 0} 59. h8=Q+ {0.00/26 0 (Ta5+)} Rxh8 {0.00/27 0
Stockfish MC 64: no move Stockfish MC 64 caused an exception, game stopped.} *

[/pgn]
[D]8/7P/R7/4k3/3p4/7r/5K2/8 w - - 0 59

Code: Select all

1935&#58; Stockfish MC 64 - Gull.3.0_JVx32, DESKTOP-QFSOBQ2, Blitz 1m
8/7P/R7/4k3/3p4/7r/5K2/8 w - - 0 1

Analysis by Stockfish MC 64&#58;

59.h8D+ Txh8 
  =  &#40;0.00&#41;   Profundidad&#58; 7/3   00&#58;00&#58;00  0kN, tb=5
59.h8D+ Txh8 
  =  &#40;0.00&#41;   Profundidad&#58; 8/3   00&#58;00&#58;00  0kN, tb=6
59.h8D+ Txh8 
ect,ect
  =  &#40;0.00&#41;   Profundidad&#58; 72/3   00&#58;02&#58;08  119mN, tb=2854604

Next move, no move Stockfish MC 64 caused an exception,

syzygy · Post by **syzygy** » Sat May 14, 2016 2:34 pm

mcostalba wrote:Full code is here: https://github.com/official-stockfish/S ... bprobe.cpp

That's looking very good!

If you like, feel free to add a copyright notice to the top of the file and replace the "without restrictions" license with the GPL. (The point of "without restrictions" is that you already have permission for that.)

Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author

Re: Question to syzygy author