NO! I'm saying that Leela NOT winning season 16 wasn't a fluke.
Regards,
Zenmastur
NO! I'm saying that Leela NOT winning season 16 wasn't a fluke.
I think you need to re-parse that, because it's a double negative, so if you remove the "NOT"s you're left with "I'm saying that Leela winning season 16 was a fluke", but Leela isn't winning season 16. Can you reword what you're saying without using any negatives?
I think advertisers have paid for 100 games.
Dann Corbit wrote: ↑Sat Oct 12, 2019 2:59 am
Chessqueen wrote: ↑Sat Oct 12, 2019 12:16 am
I do not know if it was luck that AllieStein v0.5-dev_7b41f8c-n11 got a better score than LC0, but AS did NOT do as well as LC0 against Stockfish 19092522. Probably next time around AllieStein with an update might be as strong as Stockfish, unless there is something better than the RTX 2080 waiting around the corner. Anyway, in 100 games, if SF reaches 51, will the match be stopped, or will they continue it anyway? www.tcec-chess.com/
From TCEC 16 Rules and Information:
"Superfinal
The Superfinal consists of 100 games at TC 120+10, with 50 different openings, among them once the normal start position, so that each engine plays both black and white of the same opening position. The match will be presented with opening 1 used in games 1 and 2, then opening 2 used in games 3 and 4 etc.
If the match is theoretically won for one side before game 100, the match will still continue until all 100 games have been played."
SF has already won 51 games, and they are playing on, so the rules are being followed.
I like that, because the games produce really interesting data.
The two negatives aren't referring to the same subject, i.e. "NOT winning" and "WASN'T a fluke" don't cancel out since the subject isn't the same.
Ovyron wrote: ↑Sat Oct 12, 2019 6:43 pm
I think you need to re-parse that, because it's a double negative, so if you remove the "NOT"s you're left with "I'm saying that Leela winning season 16 was a fluke", but Leela isn't winning season 16. Can you reword what you're saying without using any negatives?
First, it's possible that three programs can be equal and yet A beats B, B beats C, and C beats A.
Ovyron wrote: ↑Sat Oct 12, 2019 7:03 pm
But that leads to a contradiction:
A. Leela beat Stockfish in TCEC 15 because it was better (it wasn't a fluke.)
B. Allie advanced to the TCEC 16 superfinal because it was better than Leela (Leela finished third or fourth.)
C. Stockfish beat Allie because it is better (because NNs aren't mature enough, etc.)
D. Stockfish hasn't been improved significantly since TCEC 15.
So how did Stockfish become better than Allie and Leela without improving much? To resolve this contradiction, one of these must be true:
a. Stockfish improved and is now better than Allie and Leela
b. Leela is still better than those but by a fluke it ended third before TCEC 16 superfinal.
c. Leela was never better than Stockfish and it won TCEC 15 by a fluke.
d. (something else that you're saying that I don't get)
In this case whoever wins wins by a fluke, not by being better than the others.
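The A beats B, B beats C, C beats A cycle can be illustrated with a small sketch. The 55-45 head-to-head scores below are made-up numbers chosen only to show the pattern, not TCEC results, and `match_totals` is a hypothetical helper name.

```python
# Hypothetical head-to-head results: points scored by the FIRST engine
# of each pair over a 100-game match. These figures are invented.
GAMES_PER_MATCH = 100
head_to_head = {
    ("A", "B"): 55.0,  # A beats B 55-45
    ("B", "C"): 55.0,  # B beats C 55-45
    ("C", "A"): 55.0,  # C beats A 55-45
}


def match_totals(results, games_per_match):
    """Sum each engine's points over all of its matches."""
    totals = {}
    for (first, second), points in results.items():
        totals[first] = totals.get(first, 0.0) + points
        totals[second] = totals.get(second, 0.0) + (games_per_match - points)
    return totals


totals = match_totals(head_to_head, GAMES_PER_MATCH)
for engine in sorted(totals):
    print(engine, totals[engine])
```

Every engine wins one match and loses one, yet all three finish on the same total, so the round-robin table alone cannot rank them.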
Okay, but if they still play at the level of A/B engines it means they blunder less generally, to compensate for the gross blunders. Any time Stockfish lost in the TCEC 15 final it was because it blundered, and it blundered more often than Leela, so I'd say it'd be more fruitful to reduce the number of Stockfish's blunders than the number from NNs, even if they're not as gross.
If that's what you want to call it. With a set Elo difference between two opponents, you can statistically predict how often each should win over a given match length. I don't call that a fluke. It's just the way it is.
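As a rough illustration of that prediction, here is a minimal sketch using the standard logistic Elo expected-score formula. The fixed per-game draw rate is an assumption added for this example (nobody in the thread gives one), and the function names are hypothetical.

```python
def expected_score(elo_diff):
    """Expected points per game (win=1, draw=0.5) for the side that is
    elo_diff points stronger, from the logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def match_win_probability(elo_diff, games, draw_rate):
    """Probability that the side with the elo_diff edge finishes with
    more wins than losses. ASSUMPTION: every game is independent and
    has the same fixed draw_rate."""
    e = expected_score(elo_diff)
    p_win = e - draw_rate / 2.0
    p_loss = 1.0 - e - draw_rate / 2.0
    if p_win < 0.0 or p_loss < 0.0:
        raise ValueError("draw_rate is too high for this Elo difference")
    # dist[i] = probability that (wins - losses) == i - games so far
    dist = [0.0] * (2 * games + 1)
    dist[games] = 1.0
    for _ in range(games):
        new = [0.0] * (2 * games + 1)
        for i, p in enumerate(dist):
            if p == 0.0:
                continue
            new[i] += p * draw_rate
            if i + 1 <= 2 * games:
                new[i + 1] += p * p_win
            if i - 1 >= 0:
                new[i - 1] += p * p_loss
        dist = new
    return sum(dist[games + 1:])


# Example: chance that a side 50 Elo weaker still wins a 100-game match,
# under an assumed 80% draw rate.
upset_chance = match_win_probability(-50, 100, 0.80)
print(round(upset_chance, 4))
```

Under these assumptions the weaker side's chance of taking the match is small but not zero, which is the point being made: an upset at a known Elo gap is an expected outcome with a computable probability. The real figure depends heavily on the draw rate and on the openings used.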
A/B engines tend more towards micro-blunders when they don't "understand" the position. And they generally produce many micro blunders to lose the game. Death by a thousand small cuts. NN engines can produce gross blunders at ANY time, EVEN when they fully "understand" the position and/or have a commanding lead. When an A/B engine has a commanding lead it has an attainable goal and will rarely blunder it. When an A/B engine "thinks" the position is about even OR there is no clear goal for it, they tend to micro-blunder much more often. The nature, magnitude and number of the blunders tend to differ between the two types of engines.
Okay, but if they still play at the level of A/B engines it means they blunder less generally, to compensate for the gross blunders. Any time Stockfish lost in the TCEC 15 final it was because it blundered, and it blundered more often than Leela, so I'd say it'd be more fruitful to reduce the number of Stockfish's blunders than the number from NNs, even if they're not as gross.
Because a blunder is a blunder, it'll lose you the game even if it's not a gross one, so I don't see the difference between Stockfish blunders and NN blunders (?? is bad enough to lose.)
IMO you defined the behavior of the two paradigms well. So, you do agree that in most tactically quiet, fairly balanced positions Leela is better (possibly much better)? Doesn't this lead to "take Leela as the base engine, and SF as tactical backup" for analysis? We disagreed on that IIRC.
Zenmastur wrote: ↑Sat Oct 12, 2019 8:39 pm
A/B engines tend more towards micro-blunders when they don't "understand" the position. And they generally produce many micro blunders to lose the game. Death by a thousand small cuts. NN engines can produce gross blunders at ANY time, EVEN when they fully "understand" the position and/or have a commanding lead. When an A/B engine has a commanding lead it has an attainable goal and will rarely blunder it. When an A/B engine "thinks" the position is about even OR there is no clear goal for it, they tend to micro-blunder much more often. The nature, magnitude and number of the blunders tend to differ between the two types of engines.
Regards,
Zenmastur