

Moderator: Ras

Laskos wrote: ↑Tue Oct 02, 2018 2:33 pm
I tested yesterday on 100 long 6-men draws (not sure if they are exactly hard, but many of them are), no losses. Avoiding TB losses against a non-TB engine needs comparison with another Syzygy-enabled engine against that non-TB engine. Might compare Lc0_Syzygy against SF8_no_TB with SF8_Syzygy against SF8_no_TB on 6-men hard wins. Will depend on the time control used too.
Out of 50 6-men TB losses, Lc0_Syzygy (the fixed one) saves 13 against SF8_no_TB, SF8_Syzygy saves 15 against SF8_no_TB. So it behaves similarly to SF8, which has a good TB implementation (and good endgame eval, which Lc0 doesn't). Might check how many SF dev saves.
SF_dev_Syzygy saves 23 out of 50 TB losses. Might be due to different Syzygy implementation, stronger play, etc., I don't know.

Nice experiment, thanks! If Leela can play TB wins and draws perfectly and avoid some TB losses against no-TB opponents, the implementation can probably be considered a success. All engines strong enough for a reasonable chance to reach a TB win against Leela have TB themselves, so the loss avoidance likely isn't that important.

jkiliani wrote: ↑Tue Oct 02, 2018 5:11 pm
Nice experiment, thanks! If Leela can play TB wins and draws perfectly and avoid some TB losses against no-TB opponents, the implementation can probably be considered a success. All engines strong enough for a reasonable chance to reach a TB win against Leela have TB themselves, so the loss avoidance likely isn't that important.
In a few months when we have strong Leela networks based on TB rescoring, it will be interesting to see how those score in these benchmarks without using TB themselves...

I checked for another set of 400 won 6-men positions against SF8_Syzygy, not a single miss.

Laskos wrote: ↑Tue Oct 02, 2018 8:23 pm
I checked for another set of 400 won 6-men positions against SF8_Syzygy, not a single miss.
TBs are only up to 7 men, and Leela often heavily misevaluates earlier endgames with many more pieces. I am a bit skeptical that, based on TB rescoring, Leela will become comparable in endgames to top engines with their hand-crafted knowledge, which is often general and pretty abstract.

Your scepticism is well placed. 6-men at root solves almost nothing for Lc0.

Milos wrote: ↑Tue Oct 02, 2018 9:26 pm
Your scepticism is well placed. 6-men at root solves almost nothing for Lc0.
The problem is that adding 6-men properly to leaves also doesn't work, because MCTS uses averaging as its operator. There is no easy solution. Maybe some hybrid approach that would switch the operator to minimax once you start reaching TB positions at leaves, but that is not easy to implement and even harder to tune.

it's a relatively simple modification to the backed up value. if you ask for a value at a leaf, and the value is an absolute value (e.g. comes from the egtb or actually any terminal position) then:

chrisw wrote: ↑Tue Oct 02, 2018 9:58 pm
it's a relatively simple modification to the backed up value. if you ask for a value at a leaf, and the value is an absolute value (e.g. comes from the egtb or actually any terminal position) then:
if win, don't average it in, but back up the absolute win value
if draw, back up the max(0.5, average so far)
if loss, back up the average so far
you can get extra clever and flag terminal nodes, then test status from the ply below. If there's one terminal win above then this ply is a terminal loss in all cases. if all terminal losses above, then this ply is a terminal win. and so on.

Again, not so easy. You back up, let's say, a win, move one ply back towards the root, and do what? Since that becomes the worst move for your opponent, you just average it in and lose any kind of information. It doesn't propagate further up the tree.
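
To make the two positions concrete, here is a minimal Python sketch of the rule as chrisw states it, plus the dilution Milos is pointing at. It is only an illustration under assumptions: values are in [0, 1] with 0.5 as a draw (taken from the "max(0.5, average so far)" line), the names `parent_value` and `averaged_in` are made up for this example, and none of this is Lc0's actual code.

```python
WIN, DRAW, LOSS = 1.0, 0.5, 0.0  # assumed value scale, draw = 0.5


def parent_value(average_so_far: float, absolute_leaf: float) -> float:
    """Value the parent takes when a leaf returns an absolute (TB/terminal)
    result, seen from the side that can choose to play into that leaf."""
    if absolute_leaf == WIN:
        return WIN                         # don't average, take the win
    if absolute_leaf == DRAW:
        return max(DRAW, average_so_far)   # at least a draw is guaranteed
    return average_so_far                  # a losing move is simply ignored


def averaged_in(average_so_far: float, visits: int, sample: float) -> float:
    """Plain MCTS running mean after adding one more sample."""
    return (average_so_far * visits + sample) / (visits + 1)


# One ply below the opponent: the proven win replaces the average.
mine = parent_value(0.55, WIN)

# One ply further up it is the opponent's node. The same result is a loss
# for them, and with ordinary averaging over 99 earlier visits it barely
# moves their mean.
opponent_mean = averaged_in(0.6, 99, 1.0 - mine)
print(mine, round(opponent_mean, 3))
```

With these made-up numbers the opponent's mean only drops from 0.6 to about 0.594, which is the "it doesn't propagate further up the tree" objection in arithmetic form.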

Milos wrote: ↑Tue Oct 02, 2018 10:07 pm
Again, not so easy. You back up, let's say, a win, move one ply back towards the root, and do what? Since that becomes the worst move for your opponent, you just average it in and lose any kind of information. It doesn't propagate further up the tree.

it is a bit of a hassle to get it to work, you are kind of minimaxing, but only in terminal situations, and you have to treat win nodes differently to loss nodes.
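
The "flag terminal nodes" idea can be sketched roughly like this in Python. The assumptions are mine: each flag is stored from the viewpoint of the player who moved into the node (that is the reading under which "one terminal win above means this ply is a terminal loss" works out), the draw case is my guess at what "and so on" covers, and `Status` and `status_from_children` are hypothetical names, not anything from Lc0.

```python
from enum import Enum
from typing import Iterable, Optional


class Status(Enum):
    WIN = 1
    DRAW = 0
    LOSS = -1


def status_from_children(
    children: Iterable[Optional[Status]],
) -> Optional[Status]:
    """Derive a node's terminal flag from its children's flags.

    Each child flag is from the viewpoint of the player who moved into
    that child, i.e. the side to move at this node. The result is from
    the viewpoint of the player who moved into this node. None means
    "not proven yet".
    """
    flags = list(children)
    if not flags:
        return None
    if any(f is Status.WIN for f in flags):
        return Status.LOSS   # side to move has a winning reply
    if all(f is Status.LOSS for f in flags):
        return Status.WIN    # every reply loses
    if all(f is not None for f in flags):
        return Status.DRAW   # all proven, best reply is a draw (assumed case)
    return None              # some replies still unproven


print(status_from_children([Status.WIN, None]))
print(status_from_children([Status.LOSS, Status.LOSS]))
```

Note the asymmetry chrisw mentions: a single proven win among the children settles the node at once, while a loss only settles it when every child is proven lost.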

jkiliani wrote: ↑Tue Oct 02, 2018 5:11 pm
Nice experiment, thanks! If Leela can play TB wins and draws perfectly and avoid some TB losses against no-TB opponents, the implementation can probably be considered a success. All engines strong enough for a reasonable chance to reach a TB win against Leela have TB themselves, so the loss avoidance likely isn't that important.

Yes, the implementation is fine and does help a lot, especially the DTZ part. Just compare to what Lc0 does without Syzygy on 6-men TB positions at root:

chrisw wrote: ↑Tue Oct 02, 2018 9:58 pm
it's a relatively simple modification to the backed up value. if you ask for a value at a leaf, and the value is an absolute value (e.g. comes from the egtb or actually any terminal position) then:
if win, don't average it in, but back up the absolute win value
if draw, back up the max(0.5, average so far)
if loss, back up the average so far
you can get extra clever and flag terminal nodes, then test status from the ply below. If there's one terminal win above then this ply is a terminal loss in all cases. if all terminal losses above, then this ply is a terminal win. and so on.

Maybe it's worth trying to back up TB wins/losses not only once, but several times?