Stockfish Natural TB loses heavily to Stockfish master

Laskos · Post by **Laskos** » Tue Oct 03, 2017 9:28 am

Nordlandia wrote:In general, how often do 5-men convert 6-men positions?

How often does Stockfish dev equipped with 5-men convert 6-men positions?

It depends on the positions and the time control. On moderately hard 6-men positions, as you see here, about 90% at 0.25s/move and a bit less at 0.10s/move. But in real games things are different. About 70-75%, maybe more, of the 6-men positions occurring in normal games are so unbalanced or so balanced that there is no need for any TBs: the games are pretty much decided as wins or draws, and Stockfish converts almost all of them to the correct outcome without any TBs. The remaining, maybe 20-25%, of 6-men positions occurring in normal games are trickier, and Stockfish by itself (no TB) might fail to convert a significant part of them, say 50% on average. This is where Syzygy-5 starts helping greatly, reducing that failure rate to, say, 10%.

That's why it's important to have good test suites. I have 6-men suites of many grades of difficulty, from the regular, easy ones occurring in normal games to ones that are very hard for engines (and usually for humans).

So, from regular games, 6-men positions with 5-men Syzygy are converted in more than 95% of cases. From my hardest 6-men suites, the conversion rate with 5-men Syzygy is below 60%.
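The rough figures above can be combined into an overall conversion rate. The following is only a back-of-the-envelope sketch: the post gives ranges, so the exact 75/25 split and the 100% conversion of easy positions are illustrative assumptions, and `overall_conversion` is a hypothetical helper name.

```python
# Back-of-the-envelope check of the estimates above.
# All inputs are rough, illustrative values picked from the ranges in the post.

def overall_conversion(easy_share, easy_rate, tricky_fail_rate):
    """Share of 6-men positions converted to the correct outcome."""
    tricky_share = 1.0 - easy_share
    return easy_share * easy_rate + tricky_share * (1.0 - tricky_fail_rate)

EASY_SHARE = 0.75  # assumed: "70-75%, maybe more" are decided without TBs
EASY_RATE = 1.00   # assumed: "almost all" treated as all

# Tricky positions: about 50% failures without TBs, about 10% with Syzygy-5.
no_tb = overall_conversion(EASY_SHARE, EASY_RATE, tricky_fail_rate=0.50)
syzygy5 = overall_conversion(EASY_SHARE, EASY_RATE, tricky_fail_rate=0.10)

print(f"no TB:    {no_tb:.1%}")
print(f"Syzygy-5: {syzygy5:.1%}")
```

Under these assumptions the Syzygy-5 figure comes out above 95%, in line with the estimate for positions from regular games.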

I use the harder ones as a magnifying glass for the TB implementation. The only snag is that the hardest suites are not very long (not so many positions) and might be biased toward only a few types of positions.

If one uses regular openings, say 2moves_v1.epd, to test a TB implementation, as Fishtest is dumbly doing, not only is one dealing with the easiest, regular 6-men positions occurring in games (those 95% solved), but 80+% of the games never even reach a phase where TBs matter. Their resolution power is orders of magnitude lower than mine.

mcostalba · Post by **mcostalba** » Tue Oct 03, 2017 9:30 am

Laskos wrote:The state of SF NTB4 as of today:

Thanks for testing correctness of NTB4 on 6-men: this is a valuable contribution.

Instead, your metric of converting x-men wins with (x-1)-men TBs installed is a metric that makes no sense at all. Of course you are free to test with whatever you want; you can also find how many times NTB4 gives a mate with a knight instead of a queen and then compare to early in that way, for example.

The way NTB4 is implemented for the middle game makes it prone to be 'weak' in made-up stupid tests, but this means nothing regarding engine Elo as it is normally understood.

If you want to compare Elo in a sound way, then you need to test from the starting position, possibly with a book, in full real games.

Nordlandia · Post by **Nordlandia** » Tue Oct 03, 2017 9:39 am

Kai Laskos: an interesting idea is 6-men Nalimov stored on a fast SSD as a provisional alternative to 7-men Lomonosov tablebases.

Houdini 6.02 equipped with 6-men Nalimov through PCIe might be able to solve the majority of 7-men positions.

syzygy · Post by **syzygy** » Tue Oct 03, 2017 10:12 am

mcostalba wrote:The way NTB4 is implemented for the middle game makes it prone to be 'weak' in made-up stupid tests, but this means nothing regarding engine Elo as it is normally understood.

Your implementation is going to lose Elo no matter whether you start from the opening position, the middle game or a 6- or 7-piece position. Starting from the opening position will just make it harder to measure. Kai's tests are far from stupid. You could as well say that testing at 10+0.1 is stupid and that only 40 moves in 2 hours counts.

Still, it might be good to start with a few more pieces on the board than 6 or 7. Often winning the endgame will not be about forcing a piece capture from 7 to 6 pieces, but about forcing an exchange from 7 to 5 or from 8 to 6. In other words, 6-piece TBs might be more important for 8-piece endgames than for 7-piece endgames.

MikeB · Post by **MikeB** » Tue Oct 03, 2017 5:23 pm

Nordlandia wrote:Kai Laskos: an interesting idea is 6-men Nalimov stored on a fast SSD as a provisional alternative to 7-men Lomonosov tablebases.

Houdini 6.02 equipped with 6-men Nalimov through PCIe might be able to solve the majority of 7-men positions.

An interesting thought, but in the end you will need 7-men TBs to solve the majority of 7-men positions.

Nordlandia · Post by **Nordlandia** » Tue Oct 03, 2017 5:58 pm

Michael B: if you mean cursed wins which require tens if not hundreds of accurate plies, then yes.

Nalimov 6-men is under one percent of the size of Lomonosov 7-men:

1.11 TB / 140 TB = 0.8%

Additionally, Houdini 6.02 is going to find an EGTB win considerably faster when probing 6-men rather than 5-men Nalimov.

syzygy · Post by **syzygy** » Tue Oct 03, 2017 11:25 pm

Nordlandia wrote:Michael B: if you mean cursed wins which require tens if not hundreds of accurate plies, then yes.

Nalimov 6-men is under one percent of the size of Lomonosov 7-men:

1.11 TB / 140 TB = 0.8%

Additionally, Houdini 6.02 is going to find an EGTB win considerably faster when probing 6-men rather than 5-men Nalimov.

6-piece Nalimov cannot solve any position that cannot be solved by 6-piece Syzygy. The only difference is that Nalimov may lead to a "mate-in-N" announcement (and that there is no guarantee that a mate-in-51 and higher can be converted within the 50-move rule).
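The 50-move caveat can be illustrated with a simplified sketch. It assumes a distance-to-zeroing value in plies (the kind of metric Syzygy's DTZ tables store), not a Nalimov mate distance; `classify_win` and its labels are hypothetical names for illustration, and real probing code has off-by-one subtleties at the edge of the rule that are ignored here.

```python
# Simplified sketch: does a won position survive the 50-move rule?
# A mate distance alone cannot answer this; a distance to the next
# capture or pawn move (a "zeroing" move) can.

FIFTY_MOVE_LIMIT_PLIES = 100  # 50 moves by each side

def classify_win(dtz_plies, halfmove_clock=0):
    """Classify a winning position (hypothetical helper).

    dtz_plies: plies until the winner can force a capture or pawn move.
    halfmove_clock: plies already played since the last zeroing move.
    """
    if dtz_plies <= 0:
        raise ValueError("expected a positive distance for a winning position")
    if dtz_plies + halfmove_clock <= FIFTY_MOVE_LIMIT_PLIES:
        return "win"
    return "cursed win"  # winning in theory, drawn under the 50-move rule

print(classify_win(60))                     # fits within the 100-ply budget
print(classify_win(101))                    # exceeds the budget
print(classify_win(60, halfmove_clock=50))  # budget already partly used
```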

Nordlandia · Post by **Nordlandia** » Wed Oct 04, 2017 5:27 am

syzygy wrote:The only difference is that Nalimov may lead to a "mate-in-N" announcement.

I find Nalimov appealing for Chessbase's Let's Check Database.

Mate-in-N looks better than 137.XXX & 250.00 in my book. Moreover, mate is the ultimate goal of the game; #22 gives more detailed information than 137.XXX.

http://www.lets-check.info/

syzygy · Post by **syzygy** » Wed Oct 04, 2017 8:25 am

Nordlandia wrote:Mate-in-N looks better than 137.XXX & 250.00 in my book.

Sure, but it's cosmetic. A certain win is a certain win. Nalimov cannot find more certain wins than other 6-piece TBs just because it is DTM. Nalimov 6-piece TBs do not get closer to 7-piece TBs than other 6-piece TBs. 6 is 6, 7 is 7.

Nordlandia · Post by **Nordlandia** » Wed Oct 04, 2017 9:02 am

syzygy wrote:
Nordlandia wrote:Mate-in-N looks better than 137.XXX & 250.00 in my book.
Sure, but it's cosmetic. A certain win is a certain win. Nalimov cannot find more certain wins than other 6-piece TBs just because it is DTM. Nalimov 6-piece TBs do not get closer to 7-piece TBs than other 6-piece TBs. 6 is 6, 7 is 7.

Mate-in-N is a constant, whereas 137.XXX is superficial, without depth. The latter is a sign that carries no information other than that it's an EGTB win.

My interest is only in the Let's Check database, with exact distance to mate.
