Stockfish Natural TB loses heavily to Stockfish master

Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Nordlandia »

What is the exact difference between 5-men Lomonosov and 5-men Syzygy?

In size they are on par with each other.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

Joerg Oster wrote:Thanks, Kai.

So SF NTB not only produces longer lasting games than Master,
but also resolves mate later on a regular basis.
This is quite unexpected.

Another ugly thing is the drop of the score from one move to the other:

32. Rf4+ {+132.75/19 0.26s} 32. ... Kxf4 {-132.79/20 0.25s} 33. d8=Q {+5.88/35 0.25s}
(This is from the first game you posted.)
OK, I seem (a bit indirectly) to confirm that on the whole database. I relied on mate scores being written as +M and -M.

In 500 SFM White 5-men Wins:
SFM showed 5664 +M
SFNTB showed 4604 -M

In 496 SFNTB White 5-men Wins:
SFM showed 7097 -M
SFNTB showed 5803 +M

So, SF NTB resolves Mate later than SF master, and significantly so.
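
For anyone who wants to repeat this kind of count, here is a minimal sketch in Python. It assumes the engine comments look like {score/depth time} as in the snippet quoted above, with mate scores written as +M<n> or -M<n>. The exact comment layout, the function name and the sample line are my assumptions, not taken from the actual database.

```python
import re

# Assumed comment layout: {<score>/<depth> <time>s}, where a mate score
# is written as +M<n> or -M<n>. This is inferred, not confirmed.
MATE_SCORE = re.compile(r"\{\s*([+-])M\d+/")

def count_mate_scores(pgn_text):
    """Return (plus_mates, minus_mates) found in engine comments."""
    signs = MATE_SCORE.findall(pgn_text)
    return signs.count("+"), signs.count("-")

# Made-up sample line, only to show the call.
sample = "40. Qg7+ {+M5/30 0.25s} 40. ... Ke8 {-M4/28 0.25s} 41. Rb8# {+M1/99 0.01s}"
plus_mates, minus_mates = count_mate_scores(sample)
print(plus_mates, minus_mates)
```

Splitting the PGN by winner first and then running the count on each half would give the four numbers in the layout used above.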
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by whereagles »

tpoppins wrote:
whereagles wrote:hmmm.. to alternate natural TB/normal TB, does one need to have two TB sets installed?
The difference is in the probing algorithm; so no, you don't need two TB sets installed.
Thx for clarifying!!
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

And the difference is more significant in "early Mates". Towards the end of the games, both engines usually resolve Mates, so the difference is not that big. But during the first 10 plies of the games, "early Mates" were resolved by Stockfish Master 970 times and by Stockfish Natural2 TB 555 times. Stockfish Master shows significantly more "early Mates" than Stockfish Natural (almost double). So Stockfish Master both resolves Mates significantly earlier and reaches Mate in games that are almost half as long.
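
A rough sketch of the "early Mates" count, under one reading of it: every mate score an engine reports within its first 10 entries of a game counts once. The per-game score lists, the function name and the sample data are hypothetical; the real scores would have to be extracted from the PGN first.

```python
import re

# A score string counts as a mate score if it looks like +M<n> or -M<n>.
MATE = re.compile(r"[+-]M\d+")

def count_early_mates(games, max_ply=10):
    """Count mate scores among the first max_ply scores of each game."""
    return sum(
        1
        for scores in games
        for score in scores[:max_ply]
        if MATE.fullmatch(score)
    )

# Made-up data: one engine's reported scores, one list per game.
games = [
    ["+5.88", "+M12", "+M11"],      # mate seen early
    ["+3.10"] * 12 + ["+M4"],       # mate seen only after the cutoff
]
print(count_early_mates(games))
```

Changing max_ply shows how sensitive the comparison is to where "early" is cut off.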
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

Another possibly interesting statistic: in 996 won games starting from easy 5-men TB Wins at the root, the games in which Mate was already resolved at the first move the engine played are:

Stockfish Master: 330/996 = 33.13%
Stockfish Final2 NTB: 132/996 = 13.25%

A significant difference. Both the much faster resolution of Mates and the much shorter path to the Win by Stockfish Master defeat the whole purpose even of "naturalness", unless "naturalness" is a synonym for "dumb".
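
To put a rough number on "significant", here is a sketch of a pooled two-proportion z-test on the two rates above. It treats the two counts as independent samples, which is only an approximation here because both engines were run on the same kind of positions; the function name is mine.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Counts taken from the post: 330/996 for Master, 132/996 for Natural.
z = two_proportion_z(330, 996, 132, 996)
print(f"{330 / 996:.2%} vs {132 / 996:.2%}, z = {z:.1f}")
```

A z value far above 2 or 3 means the gap is very unlikely to be sampling noise under these assumptions.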
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

Marco posted a new update to his Natural and a PGN of 100 games, and the first results and stats are very promising. He finally forces DTZ-optimal moves, probably achieving perfect play from root TB positions (I will check later on 6-men). For now, I will post results of SF master against Houdini 5 with both Syzygy (6-men) and Nalimov (5-men) enabled. First, their combined implementation in Houdini seems not entirely theoretically sound: Nalimov tables are DTM and not DTM50, so Houdini sometimes fails to convert easy 5-men Wins at the root if left to deal with Nalimov alone.

1000 games
Suite: Easy 5-men positions at root:
TC: 0.25s/move

Score of SF Master vs Houdini: 500 - 488 - 12 [0.506] 1000
ELO difference: 4.17 +/- 21.40
Finished match
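
As a cross-check of the numbers above, here is a sketch of how an Elo difference and an approximate 95% margin can be derived from a win/loss/draw count. It uses the usual logistic Elo formula and a normal approximation for the margin. The match tool may compute its margin slightly differently, so small deviations from the reported +/- value are expected. Function names are mine.

```python
import math
from statistics import NormalDist

def elo_diff(score):
    """Logistic Elo difference implied by a score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, confidence=0.95):
    """Return (score, elo, margin) using a normal approximation."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, which is 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = NormalDist().inv_cdf(0.5 + confidence / 2) * math.sqrt(var / n)
    margin = (elo_diff(score + half_width) - elo_diff(score - half_width)) / 2
    return score, elo_diff(score), margin

# Counts from the post, read as wins - losses - draws for SF Master.
score, elo, margin = elo_with_margin(500, 488, 12)
print(f"score {score:.3f}, Elo {elo:+.2f} +/- {margin:.2f}")
```

With 500 - 488 - 12 the score is 0.506 and the Elo difference rounds to 4.17, as reported. The sketch breaks down for scores of exactly 0 or 1, where the logistic formula is undefined.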

Houdini fails in 12 out of 500 easy 5-men Wins due to the 50-move rule. But considering their Wins only, Houdini plays optimally and Stockfish Master close to optimally, while the previous Natural was way off, both in length of the Wins and in Mates resolved.

Length of the 5-men Wins at the root:

SF master:
Mean: 20.72
Median: 18

Houdini (optimal play):
Mean: 17.41
Median: 15


Mate resolutions in these 988 games: Houdini resolves all moves as Mates (whether M or 298.XX or 299.XX), while SF Master resolves only part of them.
Counts in 988 games:

Mates resolved
Houdini: 18356
SF master: 11447
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

Laskos wrote:Marco posted a new update to his Natural and a PGN of 100 games, and the first results and stats are very promising,
First results with this database. I will call this new SF Natural "SF Natural DTZ".

100 games
Suite: Hard 5-men positions at the root
TC: 10''+ 0.1''
Score of Stockfish Natural DTZ vs Stockfish master: 50 - 50 - 0 [0.500]
Elo difference: 0.00 +/- 68.89
100 of 100 games finished.

Length of the 5-men Wins at the root:

SF master:
Mean: 37.2 moves
Median: 37 moves

SF Natural DTZ:
Mean: 45.8 moves
Median: 48 moves


Still longer Wins with Natural, but not by much.


Mate resolutions in these 100 games are now radically different: Natural has drastically improved.
Counts in 100 games:

Mates resolved:
Natural: 1859
Master: 1238

The situation was the opposite before, and there was no explanation of why Natural played very long Wins.

"Early Mate resolutions" (in the first 10 moves of the games) are drastically in favor of Natural: 48 to 4 of Master.

I will check later with easy and hard 6-men positions for the perfect play.
Last edited by Laskos on Fri Sep 08, 2017 11:21 am, edited 1 time in total.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Michel »

Hi Kai,

Do you have Gaviota? I think Peter claimed that Texel combines both Syzygy (DTZ50) and Gaviota (DTM) in a game theoretically correct way. It might be an interesting comparison.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Nordlandia »

Michel wrote:Hi Kai,

Do you have Gaviota? I think Peter claimed that Texel combines both Syzygy (DTZ50) and Gaviota (DTM) in a game theoretically correct way. It might be an interesting comparison.
The DTM format in Texel only kicks in once 5 men or fewer remain. Otherwise Syzygy is probed elsewhere in the game.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish Natural TB loses heavily to Stockfish master

Post by Laskos »

Michel wrote:Hi Kai,

Do you have Gaviota? I think Peter claimed that Texel combines both Syzygy (DTZ50) and Gaviota (DTM) in a game theoretically correct way. It might be an interesting comparison.
I don't have Gaviota TBs, but when I find time, I will download them. I got a bit hooked on this issue when I saw that the "Natural" often fails to convert root TB positions; this was unexpected to me. I was initially trying to see their behavior with more pieces on sensitive endgame suites, but when I saw that, for the sake of a vague "naturalness", it fails in 90% of hard root TB positions and in 6% of usual root TB positions, I was :shock:. And those "Natural" versions had good chances of passing the regression test. Thanks for the Texel+Gaviota+Syzygy tip.