Performance of Syzygy and Scorpio

Laskos · Post by **Laskos** » Mon Feb 10, 2014 11:37 am

Daniel Shawul wrote:
Note that SF Scorpio had a dramatic loss in NPS, on average by a factor of 1.7 or so.
Well, as I already mentioned to Adam, Syzygy is probed much less frequently than my bitbases because it probes 6 plies away from the root, while mine can probe in qsearch at 0 plies. If I used 6 plies, my nps more than doubles, but that doesn't work for Scorpio, which reaches much less depth and has no endgame knowledge. I will fix it for Stockfish of course.
Code:
  1 SF 05.02 Syzygy                : 3109.5/6000  51.8   10    7   7   41.8 %
  2 SF 05.02 Syzygy WDL            : 3019.0/6000  50.3    2    7   7   42.6 %
  3 SF 05.02 NO Syzygy             : 2981.0/6000  49.7   -2    7   7   42.9 %
  4 SF Scorpio                     : 2957.0/6000  49.3   -4    7   7   43.1 %
  5 SF NO Scorpio                  : 2933.5/6000  48.9   -6    7   7   43.3 %
"SF 05.02 NO Syzygy" scored more than 50 points higher than "SF NO Scorpio" to begin with, so it is hard to compare, but at least now Scorpio EGBBs show as much improvement as Syzygy WDL. Also, why does the "NO Syzygy" version score much better here than in the previous case anyway?
Code:
  1 SF 05.02 Syzygy                : 2071.0/4000
  2 SF 05.02 Syzygy WDL            : 2042.0/4000
  3 SF NO Scorpio                  : 1978.5/4000
  4 SF 05.02 NO Syzygy             : 1966.5/4000
  5 SF Scorpio                     : 1942.0/4000
Another significant difference is that Syzygy WDL also scored significantly less than in the previous test. The updated version of Syzygy WDL, which probes right after captures or pawn moves (i.e. fifty=0), is an improvement, but it muddies the comparison because it now uses progress-making code, so I would have preferred that you test the older version. Measuring the other differences, such as KBNK or KBBKN wins that don't involve captures or pawn pushes, is very difficult because MES and other general EPD suites do not contain many of them. I believe here we are comparing nps differences due to probe settings more than anything else.

I guess the suites differ: the 6-7 men positions I used previously are very different from the MES positions. There are also statistical flukes: the variance of a round robin (RR) on unclear positions is higher than the variance of a gauntlet on a 4-5 men TB-wins test suite with 5-men bases. It is expected that SF no Scorpio scores lower than SF no Syzygy, as your compile without EGBBs is ~10% slower than Ronald's without Syzygy. I continued the test using MES.EPD, and SF Scorpio shows a larger gain over its no-TB build than SF Syzygy WDL does over its own.

Code:

10'' + 0.1''

    Program                          Score           %   Av.Op.   Elo    +   -   Draws

  1 SF 05.02 Syzygy                : 5178.0/10000  51.8     -2     10    5   5   41.6 %
  2 SF 05.02 Syzygy WDL            : 5003.0/10000  50.0     -0      0    5   5   42.9 %
  3 SF 05.02 NO Syzygy             : 4986.5/10000  49.9      0     -1    5   5   42.8 %
  4 SF Scorpio                     : 4952.0/10000  49.5      1     -3    5   5   43.0 %
  5 SF NO Scorpio                  : 4880.5/10000  48.8      2     -7    5   5   43.0 %

So, SF Scorpio adds a significant number of points, and I hope you will adapt EGBBs for SF, which has much endgame knowledge and reaches large depths. SF Syzygy (WDL+DTZ) shows even larger improvement over SF no TB.
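As a rough cross-check of the table above, the score percentages can be turned into Elo differences with the usual logistic formula. This is only a sketch: the function names are made up for illustration, and the rating tool that produced the table may use a somewhat different model, so expect agreement with the Elo column only to within about a point.

```python
import math

def elo_diff_from_score(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def expected_score(elo_diff: float) -> float:
    """Inverse mapping: expected score fraction for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Rows copied from the 10000-game table: (name, points, games, Av.Op.)
rows = [
    ("SF 05.02 Syzygy", 5178.0, 10000, -2),
    ("SF Scorpio", 4952.0, 10000, 1),
    ("SF NO Scorpio", 4880.5, 10000, 2),
]

for name, points, games, av_op in rows:
    # Performance rating = average opponent rating + implied Elo difference.
    performance = av_op + elo_diff_from_score(points / games)
    print(f"{name:<20} {performance:+.1f}")
```

The Av.Op. column is rounded to whole Elo in the table, which alone accounts for part of any mismatch.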

Daniel Shawul · Post by **Daniel Shawul** » Mon Feb 10, 2014 3:27 pm

Ok that seems reasonable. Indeed my compiles are slower because I didn't use SF makefile, but an MSVC project. I don't think I have defined the necessary pre-processor macro to enable PopCount and such.

I wouldn't bet on Syzygy+DTZ or any DTM base sustaining its lead the further one starts from the endgame, because EGTBs then tend to be probed farther and farther from the root, at the leaves. So the games will be decided before we get to see whether it can finish them off. This is similar to the Nalimov EGTB case, which scored perfectly in the 5-men tests, but whose effect gradually wanes, especially if it is probed only at the root.

Update:
Another bug I fixed today reduced the memory required to pre-load 5-men by half! I don't know how many gigantic mistakes I have been making in egbbdll. A week ago, loading 5-men used to take 2 minutes and 370MB; now it takes 0 sec and 211MB. I guess this saving could go to a bigger default cache of 128 MB.

I have also enabled selective pre-loading of the most important tables according to this frequency/blunder table: http://chess-db.com/public/research/end ... stics.html . Now with load_type=2, it loads all 4-men + KQPkq, KRpkr, KBpkb, KNpkn, KPPkp, KPPkq, KPPkr, KPPkb, KPPkn in just about 70MB. So with that option there will be more space for the cache (say 256MB), but I am not sure which cache should get more of that space (compressed or uncompressed blocks).

My previous mistake was to allocate the space needed to store Huffman-decoded but still LZ-encoded blocks, when it should have been the space for blocks compressed with both Huffman and LZ. This mistake gave me the idea that I could pre-load Huffman-decoded but LZ-encoded data instead. But I am not sure whether decoding LZ alone is any faster than decoding LZ + Huffman. Maybe there is an online decoder for LZ ...
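The load_type=2 selection described above can be pictured as a simple filter over table names. This is a hypothetical sketch, not egbbdll code: `select_preload`, `men_count` and the idea of counting men by name length are assumptions for illustration, and only the list of 5-men tables comes from the post.

```python
# Five-men tables named in the post for load_type=2 (spelled as in the post).
PRIORITY_FIVE_MEN = (
    "KQPkq", "KRpkr", "KBpkb", "KNpkn",
    "KPPkp", "KPPkq", "KPPkr", "KPPkb", "KPPkn",
)

def men_count(table_name: str) -> int:
    """Assumes one letter per piece, kings included (e.g. 'KQPkq' has 5 men)."""
    return len(table_name)

def select_preload(available, priority=PRIORITY_FIVE_MEN, max_full_men=4):
    """Keep every table with at most max_full_men pieces, plus the priority list."""
    # Compare case-insensitively because the post's spelling is not uniform.
    wanted = {name.lower() for name in priority}
    return [
        name for name in available
        if men_count(name) <= max_full_men or name.lower() in wanted
    ]

available = ["KQk", "KRkr", "KQPkq", "KRRkq", "KRPkr"]
print(select_preload(available))
```

Everything outside the filter would still be served on demand through the cache, which is why a smaller pre-load leaves more room for it.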

Adam Hair · Post by **Adam Hair** » Sat Feb 15, 2014 3:09 am

I extracted 700 positions containing 6 to 10 men from a database of chess engine matches I have. These 700 positions are judged by Stockfish (with no endgame bases) at depth 20 to be between +75 cp and -75 cp. I went ahead and did some testing using the compiles from my previous test and the compiles of the latest sources from GitHub for Syzygy and Scorpio. Every version of Stockfish with Scorpio bases and Stockfish with Syzygy bases played 1400 games (reverse colors) against Stockfish 090214 without endgame bases. Here are the results (time control was 20"+0.2"):

Code:

   # PLAYER                                     : RATING    POINTS  PLAYED    (%)
   1 Stockfish Syzygy 130214 DTZ                :   16.9     739.0    1400   52.8%
   2 Stockfish Syzygy 040214 DTZ                :   15.3     736.0    1400   52.6%
   3 Stockfish Syzygy 130214 WDL                :   12.3     730.0    1400   52.1%
   4 Stockfish Syzygy 040214 WDL                :    9.1     723.5    1400   51.7%
   5 Stockfish Scorpio 130214 EGBB              :   -0.5     704.5    1400   50.3%
   6 Stockfish 090214 no TB                     :   -2.7    6940.5   14000   49.6%
   7 Stockfish Scorpio 040214 EGBB cache 256    :   -3.5     698.5    1400   49.9%
   8 Stockfish Syzygy 040214 none               :   -8.0     689.5    1400   49.2%
   9 Stockfish Scorpio 040214 none              :   -9.2     687.0    1400   49.1%
  10 Stockfish Syzygy 130214 none               :  -13.2     679.0    1400   48.5%
  11 Stockfish Scorpio 130214 none              :  -16.5     672.5    1400   48.0%

Here are the increases as compared to using no endgame bases:

Code:

                                             EGBB
  Stockfish Scorpio 040214                    5.7
  Stockfish Scorpio 130214                   16.0


                                             WDL          DTZ
  Stockfish Syzygy 040214                    17.1         23.3
  Stockfish Syzygy 130214                    25.5         30.1

I am not certain just how variable these ratings are, so I do not think the two endgame bases can be compared with each other from these results. But I do think the results show that both bases help Elo-wise in the endgame.
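A back-of-the-envelope estimate of that variability is possible under some assumptions. The sketch below treats each match as independent games near a 50% score and borrows the roughly 43% draw rate from Laskos's tables earlier in the thread; this 1400-game test does not report draws, so that figure is an assumption here. The function name is hypothetical.

```python
import math

def elo_margin_95(games: int, draw_rate: float, score: float = 0.5) -> float:
    """Approximate 95% Elo error margin for a match of independent games."""
    win_rate = score - draw_rate / 2.0
    # Variance of one game's result (win = 1, draw = 0.5, loss = 0).
    variance = win_rate + 0.25 * draw_rate - score ** 2
    std_error = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at this score (Elo per unit of score).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return 1.96 * std_error * slope

for games in (10000, 6000, 1400):
    print(games, round(elo_margin_95(games, draw_rate=0.43), 1))
```

With these inputs the estimate comes out near ±5 Elo for 10000 games and near ±7 for 6000, in line with the +/- columns in Laskos's tables, and roughly ±14 Elo for a single 1400-game match. A difference between two such ratings is noisier still, so under these assumptions gaps of a few Elo between the two bases would not be resolvable.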

I forgot to record the tbhits information, so I will do that this weekend.

Daniel Shawul · Post by **Daniel Shawul** » Sat Feb 15, 2014 4:11 am

Thanks for the tests. Because Stockfish is a speed monster and already has endgame knowledge, it needs different parameter tuning than Toga/Scorpio. I had to disable probing in qsearch and restrict it further to get the nps up by a good amount. The thing is, speed is not everything, and it may actually be better to hit TBs from qsearch even at a lower nps. Scorpio now also uses a very limited probing scheme, but I am not sure whether that will improve its performance.
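The trade-off described here can be pictured as a small gate that decides whether a node is allowed to probe. This is purely illustrative: `ProbePolicy`, `should_probe` and the choice to express the limit as remaining search depth are assumptions, not code from Stockfish, Scorpio or either probing library. The values 6 and 0 are borrowed from the figures quoted earlier in the thread only as example settings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProbePolicy:
    max_men: int         # largest tablebase set available, e.g. 5
    min_depth: int       # smallest remaining depth at which main search may probe
    allow_qsearch: bool  # whether quiescence nodes (depth <= 0) may probe

def should_probe(policy: ProbePolicy, men: int, depth: int) -> bool:
    """Return True if a node with this many men and remaining depth may probe."""
    if men > policy.max_men:
        return False  # no table covers this position
    if depth <= 0:
        return policy.allow_qsearch  # quiescence node
    return depth >= policy.min_depth

# Probe everywhere, including qsearch: more TB hits, lower nps.
aggressive = ProbePolicy(max_men=5, min_depth=0, allow_qsearch=True)
# Probe only with plenty of depth left: fewer TB hits, higher nps.
conservative = ProbePolicy(max_men=5, min_depth=6, allow_qsearch=False)

for depth in (0, 3, 6):
    print(depth, should_probe(aggressive, 5, depth), should_probe(conservative, 5, depth))
```

Which setting wins is exactly the open question in the post: the conservative gate buys speed, the aggressive one buys exact results at the leaves.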

Anyway, I also tried an easier compression method, but that absolutely did not improve things even though it decompresses 2-3x faster! Decompression simply was not a bottleneck. Another improvement I made just today uses unordered_map for the cache, and it seems promising for the bigger cache sizes. Though it works now, I am not sure if you can even compile it on your system (it requires the MSVC class unordered_map, but maybe GCC has it too). I will not have time in the next month because I will be going back home after a long, long time and am already too excited about it. But if I get bored midway through my break, I may do something with it and visit CCC too (hope that doesn't happen!)
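For readers who want a picture of a hash-map-backed block cache, here is a minimal sketch. It is not the egbbdll implementation: Python's `OrderedDict` stands in for `unordered_map`, the least-recently-used eviction policy is an assumption (the post does not say how entries are replaced), and `BlockCache` and `load_block` are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable

class BlockCache:
    """Byte-budgeted cache of decompressed blocks keyed by block id."""

    def __init__(self, capacity_bytes: int, load_block: Callable[[int], bytes]):
        self.capacity_bytes = capacity_bytes
        self.load_block = load_block  # called on a miss to fetch and decompress
        self.used_bytes = 0
        self.blocks: "OrderedDict[int, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, block_id: int) -> bytes:
        block = self.blocks.get(block_id)
        if block is not None:
            self.hits += 1
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return block
        self.misses += 1
        block = self.load_block(block_id)
        self.blocks[block_id] = block
        self.used_bytes += len(block)
        # Evict least recently used blocks until the budget is respected,
        # always keeping the block that was just loaded.
        while self.used_bytes > self.capacity_bytes and len(self.blocks) > 1:
            _, evicted = self.blocks.popitem(last=False)
            self.used_bytes -= len(evicted)
        return block

# Toy loader: each "block" is 4 bytes filled with its own id.
cache = BlockCache(capacity_bytes=8, load_block=lambda i: bytes([i]) * 4)
for block_id in (1, 2, 1, 3):
    cache.get(block_id)
print(cache.hits, cache.misses, list(cache.blocks))
```

A hash map gives constant-time lookups on average regardless of cache size, which fits the observation that it looks promising for the bigger caches.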

Adam Hair · Post by **Adam Hair** » Sat Feb 15, 2014 1:27 pm

I hope that you have a great time back home. If/when you come back to this in the future, I will do some more tests.

syzygy · Post by **syzygy** » Sat Feb 15, 2014 2:15 pm

Adam Hair wrote:I extracted 700 positions containing 6 to 10 men from a database of chess engine matches I have. These 700 positions are judged by Stockfish (with no endgame bases) at depth 20 to be between +75 cp and -75 cp. I went ahead and did some testing using the compiles from my previous test and the compiles of the latest sources from GitHub for Syzygy and Scorpio. Every version of Stockfish with Scorpio bases and Stockfish with Syzygy bases played 1400 games (reverse colors) against Stockfish 090214 without endgame bases. Here are the results (time control was 20"+0.2"):

Impressive testing effort!

It is interesting that SF 130214 does worse than SF 040214 without TBs but does better with TBs.

Mike S. · Post by **Mike S.** » Sat Feb 15, 2014 4:41 pm

Thanks for this test!

It confirms that the Syzygys are a breakthrough in endgame tablebases technology. It is not too long ago that +30 Elo were the result after a year of hard programming work.

P.S. I assume 5-men tables have been used, only?

Jouni · Post by **Jouni** » Sat Feb 15, 2014 4:50 pm

But remember +30 in endings is 15 or less in whole games and 15 is 10 or less against other engines. So total improvement is 5 - 10 ELO.

Daniel Shawul · Post by **Daniel Shawul** » Sat Feb 15, 2014 5:00 pm

Mike S. wrote:Thanks for this test! It confirms that the Syzygys are a breakthrough in endgame tablebases technology. It is not too long ago that +30 Elo were the result after a year of hard programming work.

P.S. I assume 5-men tables have been used, only?

Breakthrough of what exactly? Let's not draw conclusions that are not there... All that this test showed is that DTZ performed only marginally better in late endgames with 6-10 pieces. Remember Nalimov also scores 100% with 6 men, so what to say of that? DTM/DTZ advantages will probably evaporate in real games started from opening positions. Then all you will be left with is WDL, which has been here for a long time...

Daniel Shawul · Post by **Daniel Shawul** » Sat Feb 15, 2014 5:01 pm

Adam Hair wrote:I hope that you have a great time back home. If/when you come back to this in the future, I will do some more tests.

Thanks! I plan to enjoy myself.
