Vinvin wrote: ↑Wed Sep 30, 2020 11:51 am
"As chess is drawish by nature."
Note 1: this is a belief, not a proven fact. We have seen this kind of claim at different times:
- In the 1970s, when Russian GMs made more and more draws.
- When Rybka was dominating the chess scene (around 2010), the number of draws was rising so much that many people said we had reached a limit where only draws were possible.
- and so on ...
But now top engines are 500 Elo above Rybka and 1000 Elo above the level of play of the 1970s.
Chess is drawish at the highest level with very long time controls. See correspondence chess.
Chess in fast time controls is not drawish at all, filled with many wins and losses. So is chess with weak players, or chess with a huge strength differential between the players.
DrCliche wrote: ↑Wed Sep 02, 2020 8:46 am
Sure enough that three draws (lol) at TCEC shouldn't measurably change our extreme confidence, yes.
Just a different worldview and perception. I guess the priors of both of you have a similar mean but very different widths. You both expected SF NNUE to be stronger than SF Classic by a similar margin, say 124 Elo points, but with a different emotional and mental attitude about your confidence in that.
I pictured the priors here:
After these 3 TCEC draws, the posterior estimate of the difference between SF NNUE and SF Classic dropped, in his case, from the prior 124 Elo points to 32, and in your case from the same prior 124 Elo points to 113. So he is justified in asking "Are we sure that Stockfish NNUE is better than the Normal Stockfish ?" and you are right in replying "Sure enough that three draws (lol) at TCEC shouldn't measurably change our extreme confidence, yes."
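To make the "same mean, different width" point concrete, here is a rough sketch using a plain normal-normal update. This is not necessarily the model behind the numbers above, which the post does not spell out. Each draw is treated as a noisy observation of a 0 Elo difference. The prior widths (35 and 200) and the per-game noise (200 Elo) are hypothetical values, picked only so the output lands in the same ballpark as the figures quoted.

```python
def posterior_mean(prior_mean, prior_sd, n_games, observed_elo, obs_sd):
    """Posterior mean of a normal prior updated with n normal observations.

    Assumption (not from the thread): each game is an independent noisy
    measurement of the Elo difference with standard deviation obs_sd.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_games / obs_sd ** 2
    weighted_sum = prior_mean * prior_precision + observed_elo * data_precision
    return weighted_sum / (prior_precision + data_precision)


PRIOR_MEAN = 124.0   # shared prior expectation, from the post
OBS_SD = 200.0       # hypothetical per-game noise, in Elo
N_DRAWS = 3          # three draws, each read as "0 Elo difference"

# Hypothetical widths: a confident poster and an unsure one.
confident = posterior_mean(PRIOR_MEAN, 35.0, N_DRAWS, 0.0, OBS_SD)
unsure = posterior_mean(PRIOR_MEAN, 200.0, N_DRAWS, 0.0, OBS_SD)

print(f"narrow prior -> posterior mean {confident:.1f} Elo")
print(f"wide prior   -> posterior mean {unsure:.1f} Elo")
```

With a narrow prior the three draws barely move the estimate; with a wide prior they pull it most of the way toward zero. Same data, same prior mean, very different conclusions.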
Can you post a better photo of yourself? You can probably find a better one from when you were a child.
Dann Corbit wrote: ↑Thu Sep 03, 2020 1:55 am
Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
At the very end it will be LCZero vs Stockfish NNUE, but I predict a very close encounter of the 3rd kind: LCZero from Planet 1140b vs Stockfish NNUE from Planet Earth. Now I am more convinced than ever: https://tcec-chess.com/live.html
I agree. I just played 200 games with Stockfish 12 vs Lc0 v0.26.2. Stockfish 12 won by only 24 Elo in 200 games at 3m+2s. And in past testing we can see how badly Stockfish NNUE has scaled at longer time controls.
Both are the best chess engines, and the winner may only be decided by hardware and time controls.
The sprinter Stockfish 12 vs. the marathon runner Lc0. Who wins the race may depend on the distance of the race!
Lc0 is clearly improving faster than Stockfish at this point in time, even at 3m+2s, compared with past matches at the same time control.
Result:
--------------------------------------------------------------------------
#  name          games  wins  draws  losses  score   los%  elo+/-
1. Stockfish 12    200    16    182       2  107.0  100.0    24.4
2. Lc0 v0.26.2     200     2    182      16   93.0    0.0   -24.4
Cross table:
--------------------------------------------------------------------------
# name score games 1 2
1. Stockfish 12 107.0 200 x =====1==1===1====================1======1========11========================1==========================1============================================1====1===1=========1==0================1===1=1==0====
2. Lc0 v0.26.2 93.0 200 =====0==0===0====================0======0========00========================0==========================0============================================0====0===0=========0==1================0===0=0==1==== x
Tech:
--------------------------------------------------------------------------
Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
#  name          nodes/m       NPS  depth/m  time/m  moves   time
1. Stockfish 12  125173K  26565996     42.5     4.7   54.1  255.1
2. Lc0 v0.26.2      101K     20342     10.0     4.9   54.1  267.2
   all ---        61216K  12984844     26.3     4.8   54.1  261.2
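For anyone who wants to check the Elo figure in the result table, here is a minimal sketch that recomputes it from the win/draw/loss counts with the standard logistic Elo formula, plus a rough 95% interval from the per-game score variance. The interval method (normal approximation on the score fraction) is my assumption; the tool that produced the table may compute its numbers differently.

```python
import math


def elo_from_score(score_fraction):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def elo_with_interval(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a W/D/L record.

    The interval uses a normal approximation on the mean score per game,
    which is an assumption and only reasonable for a decent number of games.
    """
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Variance of the per-game score (1, 0.5 or 0) around the mean.
    variance = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / games
    std_error = math.sqrt(variance / games)
    low = mean - z * std_error
    high = mean + z * std_error
    return elo_from_score(mean), elo_from_score(low), elo_from_score(high)


# Stockfish 12 record from the table above: 16 wins, 182 draws, 2 losses.
elo, elo_low, elo_high = elo_with_interval(16, 182, 2)
print(f"Elo difference: {elo:.1f} (approx. 95% interval {elo_low:.1f} to {elo_high:.1f})")
```

The point estimate matches the 24.4 in the table. The interval is wide relative to the estimate, which is worth keeping in mind when arguing from a single 200-game match.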
For years ppl have come up with the BS theory that A/B engines tuned at micro-bullet would be weak at LTC, and for years they have been bluntly proven wrong. The impact of eval on horizon effects is minimal, and it doesn't change whether you search to depth 20 or depth 100. SF-NN's search is SF's search, and SF is proven to scale better than Lc0 (and, as a matter of fact, any MCTS engine) at LTC. Ergo SF-NN scales better than Lc0 at LTC.
Your claims are simply BS reflecting your cluelessness in the matter. You are effectively drawing conclusions from STC (just because it's blitz instead of micro-bullet) with a sample size that is a joke.
The result in the superfinal will be a much worse sweep than last year. And then ppl like you will be astonished and will come up with all kinds of ridiculous excuses to justify what is basically their cluelessness.
The only one that is clueless here is you. I test at longer time controls as well as short time controls, and with anything from 1 core up to 32 threads.
And I am not talking about A/B engines tested only at micro-bullet, and I never have. I am talking about NNUE! And my sample size is huge. This is not my only test. I test non-stop.
My conclusion is what the data is showing us, and if it changes, everyone will see that too. I test openly, and on video.
"SF-NN search is SF and SF is proven to scale better than Lc0"
Milos, Lc0 looks good so far, as expected.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Lc0 takes the lead after 33 games.
And as quick as the blink of an eye, order is restored... see game 34.
I know. But that is not the point. Milos said SF would crush Lc0, worse than last year. My testing said no.
Stockfish and Lc0, despite the hype, are very close in strength at LTC.
And the reason for the wins is that TCEC uses biased openings. If a match shows a win for both engines in the same line, it is a busted opening. But we already know, from TCEC's own words, that wins are more important for views. So this is expected.
It has become pretty normal these days.
The public can easily follow Stockfish's progress from fishtest and the regression tests.
Meanwhile, Leela's progress is available only on the Leela Discord, and the results from various individual testers are hard to interpret.
"If you believe SF had +100 Elo recently, you have to believe Leela had +100 Elo recently".
Madeleine Birchfield wrote: ↑Wed Sep 30, 2020 3:40 pm
...
Chess is drawish at the highest level with very long time controls. See correspondence chess.
Chess in fast time controls is not drawish at all, filled with many wins and losses. So is chess with weak players, or chess with a huge strength differential between the players.