Stockfish Natural TB loses heavily to Stockfish master
-
- Posts: 3226
- Joined: Wed May 06, 2009 10:31 pm
- Location: Fuquay-Varina, North Carolina
Re: Stockfish Natural TB loses heavily to Stockfish master
Let me know if Gaviota 5men TBs are not available online. I can upload them if needed.
-
- Posts: 2821
- Joined: Fri Sep 25, 2015 9:38 pm
- Location: Sortland, Norway
Re: Stockfish Natural TB loses heavily to Stockfish master
Adam Hair wrote: Let me know if Gaviota 5men TBs are not available online. I can upload them if needed.
Available here ->
http://oics.olympuschess.com/tracker/index.php
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Stockfish Natural TB loses heavily to Stockfish master
Adam Hair wrote: Let me know if Gaviota 5men TBs are not available online. I can upload them if needed.
Thanks, I downloaded them in about 20 minutes from the site Jon gave in an earlier post.
-
- Posts: 162
- Joined: Thu Dec 17, 2009 10:46 am
Re: Stockfish Natural TB loses heavily to Stockfish master
mcostalba wrote: So I will add an UCI option "Natural TB" by which users can toggle between natural and traditional behavior. This of course is just a placebo knob, but it takes me much less to add it than to convince people otherwise. Of course during real analysis I expect Natural TB to be always enabled because people wants to see good analysis lines, in multi PV, no odd sacrifices and with proper scores to understand the difference between competing PV lines (and of course wants to see a mate if engine finds it): all things that Natural TB does....well...naturally
I applaud your decision to add a UCI option. I still feel your whole concept of what constitutes "good analysis" lines is deeply flawed and skewed by looking only at some pet examples without considering which positions you might break: positions where trading down and/or sacrificing material to reach a won ending is completely natural, instead of going for a complicated mate. Your argument that "with reasonable time" it will eventually find the right move is wrong on so many levels that I do not really want to comment. But as long as I can turn it off, I am happy.
I do see some value in your efforts though. If you ditch the concept of "naturalness" and concentrate on maximizing play without DTZ, using WDL only (or, stated differently, minimizing the ELO loss if no DTZ tables are installed), I see that this might be preparation for including the 6-piece WDL bases in the SF testing framework (if one considers DTZ too large). This would enable people to finally throw out all these endgame crutches and heuristics and, even more important, to better tune middlegame SF parameters (as there will be fewer interactions, since the WDL bases can deal with a lot of cases). So I hope your project will evolve in that direction.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Stockfish Natural TB loses heavily to Stockfish master
Michel wrote: Hi Kai,
Do you have Gaviota? I think Peter claimed that Texel combines both Syzygy (DTZ50) and Gaviota (DTM) in a game theoretically correct way. It might be an interesting comparison.
The claim has some empirical grounds.
I checked Texel 1.07a + Syzygy + Gaviota against Houdini 5 + Syzygy + Nalimov.
1000 games
Suite: Easy 5-men positions at the root:
TC: 1s/move
Score of Texel vs Houdini: 500 - 478 - 22 [0.511] 1000
ELO difference: 7.64 +/- 21.30
Finished match
Houdini fails in 22 out of 500 easy 5-men Wins due to the 50-move rule. That is probably significantly more than the 12 failures against SF Master Syzygy. Texel, with DTZ50 and Gaviota, seems to trick Houdini (on 5-men only Nalimov DTM kicks in) into cursed Wins. Robert Houdart should be worried about a 4.4% failure rate on easy 5-men Wins against Texel. But considering the Wins only, I expected both Houdini and Texel to play optimally. Not so.
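For anyone who wants to check the ELO figures quoted in this thread, here is a minimal Python sketch of how a W - L - D result maps to a score and an ELO difference. The function names are made up for illustration, and the error margin uses a plain normal approximation at 95%, which is an assumption and may differ slightly from what the match tool prints.

```python
import math


def elo_from_score(score: float) -> float:
    """Logistic ELO difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (score, elo, margin) for a W-L-D result, first engine's view."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * (0.0 - score) ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    # Map the score interval to ELO and report half its width as the margin.
    low = elo_from_score(score - half_width)
    high = elo_from_score(score + half_width)
    return score, elo_from_score(score), (high - low) / 2.0


score, elo, margin = match_elo(500, 478, 22)
print(f"score={score:.3f} elo={elo:+.2f} +/- {margin:.2f}")
```

For 500 - 478 - 22 the score comes out at 0.511 and the difference at roughly 7.6 ELO, in line with the match output above.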
Lengths of the 5-men Wins at the root:
Houdini: 478 Wins
Length of a Win:
Mean: 20.41 moves
Median: 18 moves
Texel: 500 Wins
Length of a Win:
Mean: 18.55 moves
Median: 17 moves
The difference in the mean length of the Win from 5-men positions at the root is significant.
The histograms of lengths of the Wins for the two engines are here:
They are similar to each other, and similar in shape to Stockfish Master's, but not at all to the old "Natural". Observe the longer tail of Houdini Wins compared to Texel Wins. Therefore, the Texel TB implementation checks out as the best around, and the claim that it is the theoretically correct way may stand.
-
- Posts: 3286
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Stockfish Natural TB loses heavily to Stockfish master
Houdini 5's Syzygy implementation is buggy (almost no benefit from 5-piece tables, and a negative one from WDL only!). Hope version 6 is better.
Jouni
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Stockfish Natural TB loses heavily to Stockfish master
Laskos wrote: Marco posted a new update to his Natural and a PGN of 100 games, and the first results and stats are very promising, he finally forces DTZ optimal moves, probably achieving a perfect play from root TB positions (will check later on 6-men).
Well, either I have a buggy compile, or the latest commit is itself buggy, but:
From easy 5-men positions at 0.25s/move with 6-men on SSD:
Score of SF_Master vs SF_NTB_DTZ: 530 - 469 - 1 [0.530] 1000
ELO difference: 21.22 +/- 21.56
Finished match
30 losses and 1 draw in 5-men positions at root.
From easy 6-men positions at root at 0.25s/move:
Score of SF_Master vs SF_NTB_DTZ: 611 - 382 - 7 [0.615] 1000
ELO difference: 81.00 +/- 22.04
Finished match
Completely off.
No time losses in these matches.
Here are 4 lost games of NTB from easy 5-men TB Wins. I don't see any malfunctioning of the engines or Cutechess-Cli, or any failure in my TBs.
[Event "?"]
[Site "?"]
[Date "2017.09.08"]
[Round "46"]
[White "SF_NTB_DTZ"]
[Black "SF_Master"]
[Result "0-1"]
[FEN "5k2/7R/6r1/2K5/8/2P5/8/8 w - - 0 1"]
[PlyCount "180"]
[SetUp "1"]
[TimeControl "0.25/move"]
1. Rd7 {+99.00/19 0.25s} Ke8 {-132.79/18 0.25s} 2. Rd1 {+132.70/18 0.25s}
Rg5+ {-132.79/23 0.25s} 3. Kb6 {+132.63/20 0.26s} Rg4 {-132.79/26 0.25s}
4. Kb5 {+132.64/18 0.25s} Rg5+ {-132.79/27 0.25s} 5. Kb4 {+99.00/20 0.25s}
Rg2 {-132.79/27 0.25s} 6. c4 {+132.69/23 0.25s} Rb2+ {-132.79/26 0.25s}
7. Kc5 {+132.70/20 0.25s} Ra2 {-132.79/24 0.25s} 8. Kb6 {+99.00/19 0.25s}
Rb2+ {-132.79/21 0.25s} 9. Kc6 {+132.70/29 0.25s} Kf7 {-132.79/20 0.25s}
10. Re1 {+132.69/19 0.25s} Rd2 {-132.79/19 0.25s} 11. c5 {+132.70/27 0.25s}
Kf6 {-132.79/20 0.25s} 12. Kb6 {+132.70/15 0.25s} Rb2+ {-132.79/21 0.25s}
13. Kc7 {+132.72/16 0.25s} Rb4 {-132.79/20 0.25s} 14. Rd1 {+51.26/17 0.25s}
Ke6 {-132.79/18 0.25s} 15. Rd7 {+132.72/16 0.25s} Rc4 {-132.79/19 0.25s}
16. Rd6+ {+132.78/17 0.26s} Ke7 {-132.79/19 0.25s} 17. Rd7+ {+9.38/17 0.25s}
Ke6 {-132.79/20 0.25s} 18. Rd6+ {+132.78/19 0.25s} Ke7 {-132.79/21 0.25s}
19. c6 {+132.66/20 0.25s} Rc2 {-132.79/19 0.25s} 20. Rd1 {+132.70/19 0.25s}
Rc3 {-132.79/18 0.25s} 21. Kb7 {+132.70/18 0.25s} Rb3+ {-132.79/18 0.25s}
22. Kc8 {+132.66/20 0.25s} Rb4 {-132.79/18 0.25s} 23. Rd3 {+132.71/29 0.25s}
Rb1 {-132.79/16 0.25s} 24. Rd7+ {+132.77/17 0.25s} Ke6 {-132.79/17 0.25s}
25. Rd8 {+132.69/25 0.25s} Rb3 {-132.79/18 0.25s} 26. Rd1 {+132.77/21 0.25s}
Ke7 {-132.79/17 0.25s} 27. Rd7+ {+132.73/25 0.25s} Ke8 {-132.79/17 0.25s}
28. Rd2 {+132.70/17 0.25s} Rb4 {-132.79/18 0.25s} 29. Re2+ {+132.72/23 0.25s}
Kf7 {-132.79/16 0.26s} 30. Kd7 {+132.75/15 0.26s} Rd4+ {-132.79/15 0.25s}
31. Kc7 {+132.71/28 0.25s} Kf6 {-132.79/16 0.25s} 32. Re1 {+132.71/26 0.25s}
Rb4 {-132.79/16 0.25s} 33. Kd6 {+132.70/16 0.25s} Rd4+ {-132.79/16 0.25s}
34. Kc7 {+132.69/24 0.25s} Rc4 {-132.79/18 0.25s} 35. Rd1 {+132.67/19 0.25s}
Ke6 {-132.79/19 0.25s} 36. Kb7 {+132.40/19 0.25s} Rb4+ {-132.79/17 0.25s}
37. Kc8 {+132.76/19 0.25s} Rb2 {-132.79/20 0.25s} 38. Rd3 {+132.77/21 0.25s}
Rb1 {-132.79/16 0.25s} 39. c7 {+132.54/19 0.25s} Ke7 {-132.79/16 0.25s}
40. Re3+ {+132.77/19 0.25s} Kf7 {-132.79/16 0.25s} 41. Ra3 {+132.68/18 0.25s}
Rb2 {-132.79/17 0.25s} 42. Rf3+ {+132.69/18 0.25s} Ke7 {-132.79/20 0.25s}
43. Ra3 {+132.66/19 0.25s} Rb1 {-132.79/17 0.25s} 44. Re3+ {+132.75/14 0.25s}
Kf7 {-132.79/18 0.25s} 45. Rf3+ {+132.75/21 0.25s} Ke7 {-132.79/19 0.25s}
46. Ra3 {+99.00/17 0.25s} Ke8 {-132.79/21 0.25s} 47. Ra2 {+132.61/19 0.25s}
Kf7 {-132.79/18 0.25s} 48. Kd7 {+132.75/16 0.25s} Rd1+ {-132.79/17 0.25s}
49. Kc8 {+99.00/15 0.25s} Rb1 {-132.79/18 0.25s} 50. Rc2 {+132.39/16 0.26s}
Ke7 {-132.79/21 0.25s} 51. Re2+ {+132.72/18 0.25s} Kf7 {-132.79/20 0.25s}
52. Ra2 {+99.00/18 0.26s} Rb4 {-132.79/19 0.25s} 53. Rf2+ {+132.77/17 0.26s}
Ke6 {-132.79/19 0.25s} 54. Re2+ {+132.72/19 0.25s} Kf6 {-132.79/21 0.25s}
55. Re1 {+132.51/17 0.25s} Rb3 {-132.79/16 0.25s} 56. Kd8 {+132.76/15 0.26s}
Rd3+ {-132.79/22 0.25s} 57. Ke8 {+132.65/18 0.25s} Rc3 {-132.79/27 0.25s}
58. Kd7 {+132.71/30 0.25s} Rd3+ {-132.79/28 0.25s} 59. Ke8 {+132.64/18 0.26s}
Rc3 {-132.79/24 0.25s} 60. Kd8 {+132.66/18 0.25s} Rd3+ {-132.79/20 0.25s}
61. Kc8 {+132.64/19 0.25s} Kg5 {-132.79/21 0.25s} 62. Rb1 {+99.00/16 0.25s}
Kf4 {-132.79/25 0.25s} 63. Kb8 {+132.45/16 0.25s} Rc3 {-132.79/27 0.25s}
64. Rb4+ {+11.61/23 0.25s} Ke3 {-132.79/28 0.25s} 65. Rb2 {+132.58/16 0.25s}
Kd4 {-132.79/22 0.25s} 66. Rb4+ {+47.63/26 0.25s} Kc5 {-132.79/19 0.25s}
67. Rb1 {+132.27/17 0.25s} Kd4 {-M52/25 0.25s} 68. Rb4+ {+132.77/19 0.25s}
Kc5 {-M46/26 0.25s} 69. Rb2 {+132.65/21 0.25s} Kd4 {-M42/28 0.25s}
70. Rd2+ {+117.90/33 0.25s} Ke3 {-132.79/16 0.25s} 71. Rd6 {+132.70/24 0.25s}
Rb3+ {-132.79/21 0.25s} 72. Kc8 {+132.65/20 0.25s} Kf4 {-132.79/15 0.25s}
73. Rd1 {+132.70/16 0.25s} Rg3 {-132.79/15 0.25s} 74. Kb7 {+132.65/17 0.25s}
Rg7 {-298.86/13 0.25s} 75. Kb6 {+132.76/17 0.25s} Rg8 {-298.96/17 0.25s}
76. Kb7 {+99.00/17 0.26s} Rg7 {-M62/20 0.25s} 77. Rf1+ {+132.74/18 0.25s}
Ke4 {-M54/21 0.25s} 78. Rg1 {+132.77/16 0.26s} Rxg1 {-132.79/14 0.25s}
79. c8=Q {+5.80/34 0.25s} Rg7+ {-132.79/19 0.25s} 80. Qd7 {+99.00/18 0.25s}
Rxd7+ {+M23/31 0.25s} 81. Kc6 {-132.67/18 0.25s} Rd5 {+M19/35 0.25s}
82. Kb7 {-M22/27 0.25s} Kd4 {+M17/38 0.25s} 83. Kb6 {-M18/32 0.25s}
Rc5 {+M15/42 0.25s} 84. Ka6 {-M14/38 0.25s} Kc4 {+M13/47 0.25s}
85. Kb6 {-M12/46 0.25s} Kb4 {+M11/56 0.25s} 86. Ka7 {-M10/54 0.25s}
Kb5 {+M9/68 0.25s} 87. Kb7 {-M8/82 0.25s} Rc4 {+M7/82 0.25s}
88. Ka7 {-M6/71 0.25s} Kc6 {+M5/88 0.25s} 89. Ka8 {-M4/81 0.25s}
Kc7 {+M3/127 0.005s} 90. Ka7 {-M2/127 0.003s} Ra4# {+M1/127 0.002s, Black mates}
0-1
[Event "?"]
[Site "?"]
[Date "2017.09.08"]
[Round "64"]
[White "SF_NTB_DTZ"]
[Black "SF_Master"]
[Result "0-1"]
[FEN "K7/8/P4k2/8/8/3R4/r7/8 w - - 0 1"]
[PlyCount "46"]
[SetUp "1"]
[TimeControl "0.25/move"]
1. a7 {+10.35/27 0.25s} Kg5 {-132.79/15 0.25s} 2. Rd5+ {+99.00/19 0.25s}
Kf6 {-132.79/12 0.25s} 3. Kb7 {+99.00/17 0.26s} Ke6 {-132.79/19 0.25s}
4. Rd2 {+99.00/17 0.25s} Rxd2 {-132.79/17 0.25s} 5. a8=Q {+5.84/28 0.25s}
Ke5 {-132.79/17 0.25s} 6. Qa1+ {+99.00/14 0.25s} Kf4 {-132.79/19 0.25s}
7. Qc1 {+99.00/19 0.26s} Ke3 {-132.79/24 0.25s} 8. Kc6 {+99.00/21 0.26s}
Ke2 {-132.79/24 0.25s} 9. Qc2 {+99.00/20 0.25s} Rxc2+ {+M39/23 0.25s}
10. Kd7 {-132.61/20 0.25s} Kf3 {+M27/32 0.25s} 11. Ke6 {-M32/23 0.25s}
Ke4 {+M25/35 0.25s} 12. Kd6 {-M26/29 0.25s} Rc3 {+M23/37 0.25s}
13. Ke6 {-M22/32 0.25s} Rd3 {+M21/41 0.25s} 14. Kf6 {-M20/35 0.25s}
Rd6+ {+M19/43 0.25s} 15. Ke7 {-M18/36 0.25s} Ke5 {+M17/45 0.25s}
16. Kf7 {-M16/40 0.25s} Kf5 {+M15/47 0.25s} 17. Ke7 {-M14/44 0.25s}
Rd5 {+M13/51 0.25s} 18. Kf7 {-M12/51 0.25s} Rd7+ {+M11/57 0.25s}
19. Ke8 {-M10/63 0.25s} Ke6 {+M9/67 0.25s} 20. Kf8 {-M8/112 0.25s}
Kf6 {+M7/77 0.25s} 21. Kg8 {-M6/89 0.25s} Rd8+ {+M5/70 0.25s}
22. Kh7 {-M4/127 0.060s} Rc8 {+M3/102 0.25s} 23. Kh6 {-M2/127 0.15s}
Rh8# {+M1/127 0.003s, Black mates} 0-1
[Event "?"]
[Site "?"]
[Date "2017.09.08"]
[Round "90"]
[White "SF_NTB_DTZ"]
[Black "SF_Master"]
[Result "0-1"]
[FEN "8/4K3/8/r7/8/3R4/2P5/2k5 w - - 0 1"]
[PlyCount "106"]
[SetUp "1"]
[TimeControl "0.25/move"]
1. c4 {+132.73/25 0.25s} Kc2 {-132.79/17 0.25s} 2. Rd5 {+99.00/19 0.25s}
Ra6 {-132.79/19 0.25s} 3. Kd7 {+132.74/18 0.25s} Ra7+ {-132.79/19 0.25s}
4. Kd6 {+132.75/32 0.25s} Ra6+ {-132.79/19 0.25s} 5. Kd7 {+132.74/19 0.25s}
Kc3 {-132.79/19 0.25s} 6. c5 {+8.56/35 0.25s} Kc4 {-132.79/18 0.25s}
7. Re5 {+132.68/29 0.25s} Kd4 {-132.79/18 0.25s} 8. Rh5 {+99.00/18 0.25s}
Ra5 {-132.79/19 0.25s} 9. Kd6 {+132.76/18 0.26s} Kc4 {-132.79/18 0.25s}
10. Rf5 {+132.74/33 0.26s} Kd3 {-132.79/18 0.25s} 11. Rf1 {+132.76/19 0.25s}
Ke4 {-132.79/16 0.25s} 12. Re1+ {+132.77/16 0.26s} Kd4 {-132.79/16 0.25s}
13. Rf1 {+132.70/18 0.26s} Ke4 {-132.79/16 0.25s} 14. Re1+ {+132.68/17 0.25s}
Kd4 {-132.79/16 0.25s} 15. Rd1+ {+132.67/19 0.25s} Ke3 {-132.79/16 0.25s}
16. c6 {+132.68/28 0.25s} Ke2 {-132.79/15 0.25s} 17. Rc1 {+57.90/27 0.25s}
Ra6 {-132.79/16 0.25s} 18. Ke7 {+132.75/17 0.25s} Ra7+ {-132.79/20 0.25s}
19. Kd6 {+132.71/29 0.26s} Kd2 {-132.79/16 0.25s} 20. Rc5 {+132.71/16 0.25s}
Ra6 {-132.79/21 0.25s} 21. Kd7 {+99.00/18 0.25s} Ra4 {-132.79/22 0.25s}
22. Rd5+ {+132.74/21 0.25s} Kc3 {-132.79/21 0.25s} 23. Rc5+ {+132.76/20 0.26s}
Kd4 {-132.79/20 0.25s} 24. Kd6 {+132.66/19 0.25s} Ra6 {-132.79/15 0.25s}
25. Rg5 {+132.77/17 0.26s} Ra8 {-132.79/14 0.25s} 26. Rg4+ {+132.78/22 0.25s}
Kc3 {-132.79/19 0.25s} 27. Kd7 {+132.68/16 0.25s} Kd2 {-132.79/23 0.25s}
28. Rg2+ {+132.78/17 0.25s} Kd3 {-132.79/22 0.25s} 29. Rg5 {+132.69/18 0.25s}
Rh8 {-132.79/23 0.25s} 30. Rg3+ {+53.63/19 0.25s} Ke4 {-132.79/19 0.25s}
31. Rg7 {+132.78/17 0.25s} Rh2 {-132.79/14 0.25s} 32. c7 {+132.41/21 0.25s}
Rd2+ {-132.79/23 0.25s} 33. Ke7 {+132.36/16 0.25s} Rc2 {-132.79/27 0.25s}
34. Kd8 {+99.00/16 0.25s} Rc1 {-132.79/28 0.25s} 35. Re7+ {+132.67/20 0.25s}
Kd5 {-132.79/26 0.25s} 36. Rg7 {+132.39/22 0.25s} Ke5 {-132.79/20 0.25s}
37. Re7+ {+132.20/19 0.25s} Kf6 {-M86/19 0.25s} 38. Re1 {+99.00/17 0.25s}
Rxe1 {-132.79/21 0.25s} 39. c8=Q {+5.84/41 0.25s} Rd1+ {-132.79/21 0.25s}
40. Ke8 {+99.00/22 0.25s} Re1+ {-132.79/23 0.25s} 41. Kf8 {+99.00/21 0.25s}
Re6 {-132.79/24 0.25s} 42. Qc3+ {+99.00/22 0.26s} Re5 {-132.79/26 0.25s}
43. Qc6+ {+99.00/24 0.25s} Kf5 {-132.79/25 0.25s} 44. Kf7 {+99.00/20 0.25s}
Kf4 {-132.79/24 0.25s} 45. Qc3 {+99.00/19 0.25s} Re3 {-132.79/23 0.25s}
46. Qe1 {+99.00/28 0.25s} Rxe1 {+M19/36 0.25s} 47. Kg6 {-M16/26 0.25s}
Re6+ {+M13/41 0.25s} 48. Kh7 {-M12/40 0.25s} Kf5 {+M11/49 0.25s}
49. Kg7 {-M10/58 0.25s} Re7+ {+M9/71 0.25s} 50. Kf8 {-M8/86 0.25s}
Kf6 {+M7/118 0.25s} 51. Kg8 {-M6/127 0.043s} Re8+ {+M5/121 0.25s}
52. Kh7 {-M4/84 0.25s} Rd8 {+M3/127 0.10s} 53. Kh6 {-M2/127 0.001s}
Rh8# {+M1/127 0.002s, Black mates} 0-1
[Event "?"]
[Site "?"]
[Date "2017.09.08"]
[Round "106"]
[White "SF_NTB_DTZ"]
[Black "SF_Master"]
[Result "0-1"]
[FEN "8/8/r3PR2/8/1k6/8/1K6/8 w - - 0 1"]
[PlyCount "12"]
[SetUp "1"]
[TimeControl "0.25/move"]
1. e7 {+132.77/14 0.25s} Rxf6 {-132.79/20 0.25s} 2. e8=Q {+5.92/39 0.25s}
Rd6 {-132.79/22 0.25s} 3. Qc6 {+99.00/36 0.25s} Rxc6 {+M7/91 0.25s}
4. Ka2 {-M6/80 0.25s} Kc3 {+M5/109 0.25s} 5. Ka1 {-M4/85 0.25s}
Kc2 {+M3/127 0.003s} 6. Ka2 {-M2/127 0.003s} Ra6# {+M1/127 0.003s, Black mates}
0-1
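To find where a game like these goes wrong without replaying it, one can scan the engine comments for the point where White's own score turns negative. The following is a rough Python sketch under a few assumptions: the comment format is the "{score/depth time}" one seen in the PGN above, White moves first, and mate scores are replaced by an arbitrary large value. The function names are hypothetical.

```python
import re

# One ply as written in the PGN above: optional move number, SAN move,
# then a "{score/depth time}" comment, e.g. "3. Qc6 {+99.00/36 0.25s}".
PLY_RE = re.compile(
    r"(?:\d+\.\s*)?(\S+)\s*\{([+-])(M?)(\d+(?:\.\d+)?)/\d+\s[^}]*\}"
)

MATE_VALUE = 10_000.0  # arbitrary stand-in so mate scores exceed any eval


def parse_plies(movetext):
    """Return a list of (san, score), score from the mover's own view."""
    plies = []
    for san, sign, mate, number in PLY_RE.findall(movetext):
        value = MATE_VALUE if mate else float(number)
        plies.append((san, value if sign == "+" else -value))
    return plies


def last_move_before_flip(movetext):
    """Find White's last move played while its own score was not negative.

    Assumes White moves first. Returns (move_number, san), or None if
    White's score never turns negative.
    """
    plies = parse_plies(movetext)
    for i in range(2, len(plies), 2):  # White's plies are 0, 2, 4, ...
        if plies[i][1] < 0 <= plies[i - 2][1]:
            return (i - 2) // 2 + 1, plies[i - 2][0]
    return None


ROUND_106 = """
1. e7 {+132.77/14 0.25s} Rxf6 {-132.79/20 0.25s} 2. e8=Q {+5.92/39 0.25s}
Rd6 {-132.79/22 0.25s} 3. Qc6 {+99.00/36 0.25s} Rxc6 {+M7/91 0.25s}
4. Ka2 {-M6/80 0.25s} Kc3 {+M5/109 0.25s} 5. Ka1 {-M4/85 0.25s}
Kc2 {+M3/127 0.003s} 6. Ka2 {-M2/127 0.003s} Ra6# {+M1/127 0.003s, Black mates}
0-1
"""

print(last_move_before_flip(ROUND_106))
```

On the Round 106 game this points at 3. Qc6, the move after which the queen is captured. The other three games show the same pattern: 80. Qd7, 9. Qc2 and 46. Qe1, each scored +99.00 by NTB and each answered by a rook capture.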
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Stockfish Natural TB loses heavily to Stockfish master
mcostalba wrote: So I will add an UCI option "Natural TB" by which users can toggle between natural and traditional behavior. This of course is just a placebo knob, but it takes me much less to add it than to convince people otherwise. Of course during real analysis I expect Natural TB to be always enabled because people wants to see good analysis lines, in multi PV, no odd sacrifices and with proper scores to understand the difference between competing PV lines (and of course wants to see a mate if engine finds it): all things that Natural TB does....well...naturally
IQ wrote: I applaud your decision to add a UCI option.
It's gone already (for the moment... I cannot predict what happens next).
Now, when SF has reached a TB position on the board, it probes DTZ to change its search (in ways that are not very intuitive and judging from Kai's results seem to be counterproductive) and if no mate was found it overrides not just the search score but also the search move and plays a DTZ-optimal move (meaning it is likely to sac a piece in a relatively easy 6-piece ending to trade down to a far more difficult to win 5-piece ending).
That it plays DTZ-optimal moves explains that it now at least plays TB endings perfectly. But "unnatural" is the proper term here.
For the sake of completeness, current SF treats this case (root position is in the TBs) as follows:
- first it selects those moves that preserve the win or draw;
- then it performs a regular search on those moves;
- the move played is the move with the highest search score. This move necessarily preserves the win or draw and is as "natural" as SF's regular search can be called natural;
- the score displayed in all PV lines is not the score returned by the search but a score corresponding to the value of the root position as determined by TBs.
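The four steps above can be sketched in a few lines of Python. This is only an illustration of the described logic, not Stockfish code: the callables `wdl_after`, `search_score` and `tb_score`, the move labels and the numbers are all hypothetical stand-ins, and 50-move-rule subtleties (cursed wins, blessed losses) are ignored.

```python
from typing import Callable, Iterable, Optional, Tuple

WIN, DRAW, LOSS = 1, 0, -1


def choose_root_move(
    moves: Iterable[str],
    root_wdl: int,
    wdl_after: Callable[[str], int],
    search_score: Callable[[str], float],
    tb_score: Callable[[int], float],
) -> Tuple[Optional[str], float]:
    """Pick a root move following the four steps listed above.

    wdl_after(m) is the tablebase result for the side to move at the
    root once m has been played. All callables are hypothetical.
    """
    # Step 1: keep only the moves that preserve the root's win or draw.
    candidates = [m for m in moves if wdl_after(m) >= root_wdl]
    if not candidates:
        return None, tb_score(root_wdl)
    # Steps 2 and 3: search only those moves, play the best-scoring one.
    best = max(candidates, key=search_score)
    # Step 4: show the tablebase value of the root, not the search score.
    return best, tb_score(root_wdl)


# Toy data: the search likes the sacrifice most, but it throws away the win.
toy_wdl = {"sac": LOSS, "quiet": WIN, "check": WIN}
toy_search = {"sac": 9.0, "quiet": 3.5, "check": 5.0}

move, shown = choose_root_move(
    toy_wdl, WIN, toy_wdl.get, toy_search.get, lambda wdl: 132.79 * wdl
)
print(move, shown)
```

The point of the design is that the tablebase only narrows the move list and fixes the displayed score, while the choice among the remaining moves is left entirely to the regular search.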
IQ wrote: I do see some value in your efforts though. If you ditch the concept of "naturalness" and concentrate on maximizing play without DTZ, using WDL only (or, stated differently, minimizing the ELO loss if no DTZ tables are installed), I see that this might be preparation for including the 6-piece WDL bases in the SF testing framework (if one considers DTZ too large). This would enable people to finally throw out all these endgame crutches and heuristics and, even more important, to better tune middlegame SF parameters (as there will be fewer interactions, since the WDL bases can deal with a lot of cases). So I hope your project will evolve in that direction.
If that were the goal, then all that's needed is to ask me for a patch. (I don't necessarily agree that throwing out the endgame code is a good idea, but I won't enter that discussion now.)
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Stockfish Natural TB loses heavily to Stockfish master
syzygy wrote: That it plays DTZ-optimal moves explains that it now at least plays TB endings perfectly.
If it had been implemented correctly... It now goes for the quickest sac, winning or losing.
But "unnatural" is the proper term here.
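To make "quickest sac, winning or losing" concrete, here is a toy Python sketch of one way a DTZ-based pick can go wrong: choosing the shortest distance to a zeroing move without first restricting to moves that keep the win. This is only a guess at the kind of mistake meant, not a diagnosis of the actual commit. The function, the move labels and the numbers are hypothetical, and the distances are simplified to plain ply counts rather than real signed DTZ values.

```python
WIN, DRAW, LOSS = 1, 0, -1


def dtz_pick(moves, require_win):
    """Pick the move with the smallest distance to a zeroing move.

    moves maps a move label to (wdl, plies_to_zeroing), both from the
    mover's point of view. With require_win=False the result filter is
    skipped, which is the suspected mistake, not the intended rule.
    """
    if require_win:
        pool = {m: v for m, v in moves.items() if v[0] == WIN}
    else:
        pool = moves
    return min(pool, key=lambda m: pool[m][1]) if pool else None


# Hypothetical root position that is a tablebase win for the mover.
toy = {
    "losing capture": (LOSS, 1),   # zeroes the counter at once, but loses
    "winning line A": (WIN, 3),    # keeps the win, zeroing comes soon
    "winning line B": (WIN, 9),    # keeps the win, zeroing is further off
}

print(dtz_pick(toy, require_win=True))
print(dtz_pick(toy, require_win=False))
```

With the filter the pick is "winning line A"; without it the losing capture comes out on top simply because it zeroes the counter soonest.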
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Stockfish Natural TB loses heavily to Stockfish master
Laskos wrote: At this time control, SF FNTB fails in 11 conversions out of 1000 Draws against SF master.
mcostalba wrote: Thanks Kay for testing NTB!
Could you please post the pgn of some game where SF fails to keep the draw? This should not happen. Never.
Instead converting the win is another story. Let me clarify.
SF NTB is always able to convert a win, but because of the way it is designed, finding the winning move is not immediate. We are talking of few seconds, not hours. It is very difficult to give a general rule but you can assume that within 1 minute of search it is able to find anything that there is to find in a position (but in the most cases we are talking of just fractions of a second).
Nevertheless when I read "a reasonable 400ms per move" and then in the same line "expect blunder at TCEC" and "I use it for analysis" I understand to make people change their minds is mission impossible.
So I will add an UCI option "Natural TB" by which users can toggle between natural and traditional behavior. This of course is just a placebo knob, but it takes me much less to add it than to convince people otherwise. Of course during real analysis I expect Natural TB to be always enabled because people wants to see good analysis lines, in multi PV, no odd sacrifices and with proper scores to understand the difference between competing PV lines (and of course wants to see a mate if engine finds it): all things that Natural TB does....well...naturally
One last thing. I have further improved the way DTZ is able to steer the engine in finding the winning line. I have done it in a way to preserve all the good properties of Natural TB, but now winning line is found on average in much shorter time.
I have tested it on your 2 reported games with 5-men where SF failed to convert the wins. Now it does.
Marco, your commit "Force DTZ just before sending bestmove" is probably buggy. It sometimes plays suicide chess in TB positions at root. I checked my compile, PGN output, test conditions, varied time controls, etc. It still plays suicide chess at the 60s+0.6s time control. What time control is "reasonable" for your NTB to work, now even when forcing DTZ? In 4 out of 100 5-men White Wins at the 60s+0.6s time control (LTC at Fishtest), it loses as White. Check the PGN for those 4 Black Wins. I uploaded the PGN here:
http://s000.tinyupload.com/?file_id=343 ... 9656686888
Score of SF_Master vs SF_NTB_DTZ: 104 - 96 - 0 [0.520] 200
ELO difference: 13.90 +/- 48.43
Finished match
With 6-men it gets even worse.
Do you have testers out there? Unless I am doing something completely wrong, this SF_NTB_DTZ, as I call it, will not pass even the regression test from 2moves_v1.epd. IIRC the window there was [-4,0], right? The regression seems to be above 10 ELO points even from 2moves_v1.epd, which is inadequate for testing endgames. On my TB-sensitive endgame suites, the results are simply ridiculous, and are conclusive even after 100-200 games. Here is a partial result from one such endgame suite at 0.25s per move (no time losses):
Finished game 785 (SF_Master vs SF_NTB_DTZ): 0-1 {Black mates}
Score of SF_Master vs SF_NTB_DTZ: 409 - 203 - 173 [0.631] 785
ELO difference: 93.36 +/- 22.01
I stopped it; it's a waste of my time and of CPU time.
If you are so talibanized about pushing your NTB, at least check it for many possible issues.