Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
TCEC S15, END of an ERA event is much more Brutal than I thought!
-
- Posts: 1534
- Joined: Sun Oct 25, 2009 2:30 am
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
Ozymandias wrote: ↑Tue May 28, 2019 9:54 am
> Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
Yes, that's true, but I had TCEC's imbalanced openings in mind when I predicted, months ago, a 2:1 Win/Loss ratio along the lines of +14 -7 =79 for the S15 SuFi. It is now ending (the 100th game is finishing with a draw) exactly as I predicted.
With balanced openings, yes, I would guess the Win/Loss ratio would be 3:1 or so, but the draw rate would be higher too.
-
- Posts: 1534
- Joined: Sun Oct 25, 2009 2:30 am
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
Laskos wrote: ↑Tue May 28, 2019 10:04 am
> Ozymandias wrote: ↑Tue May 28, 2019 9:54 am
> > Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
> Yes, that's true, but I had TCEC's imbalanced openings in mind when I predicted, months ago, a 2:1 Win/Loss ratio along the lines of +14 -7 =79 for the S15 SuFi. It is now ending (the 100th game is finishing with a draw) exactly as I predicted.
> With balanced openings, yes, I would guess the Win/Loss ratio would be 3:1 or so, but the draw rate would be higher too.
Imbalance is good, lopsidedness is horrible. Ideally, you'd want openings that ensure 1.5/2-0.5/2 or (even better) 2/2-0/2, not 1 out of 2 versus 1 out of 2 (it doesn't matter whether they're lopsided or drawish, those openings tell you very little, if anything).
-
- Posts: 3186
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
Ozymandias wrote: ↑Tue May 28, 2019 9:54 am
> Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
I'd simply count every 1:1 pair as a draw too; then I'd get an 87% draw rate instead of 79%.
Considering this and a 26 Elo difference for the SuFi (of course only as correct as the absolute Elo numbers taken as starting points may have been), is that, for 100 games with this "corrected" draw rate, within or outside the error bar of a 95% confidence interval?
Last edited by peter on Tue May 28, 2019 12:51 pm, edited 1 time in total.
Peter.
-
- Posts: 51
- Joined: Mon Feb 20, 2017 8:29 am
- Location: Rialto, Venice
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
A new possible challenge for LC0.
In his latest book, Garry Kasparov still states that “Centaur mode” is the best expression of strength in chess games.
I am doubtful after witnessing how LC0 defeated Stockfish.
In order to test Kasparov's hypothesis, I would suggest the following challenge:
GM + the best collection of alpha-beta (AB) programs available at the time vs. the best neural network program (LC0 or AlphaZero)
Main additional rules:
- The GM in centaur mode has access to the analysis screen of the AB programs, but may not expand the analysis tree by hand
- On the other hand, the GM in centaur mode may take back a move after LC0's reply and play a single substitute move, with a time penalty.
In your opinion, who would win a match of 16-24 games under these conditions?
The GM in centaur mode or LC0?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
peter wrote: ↑Tue May 28, 2019 12:47 pm
> Ozymandias wrote: ↑Tue May 28, 2019 9:54 am
> > Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
> I'd simply count every 1:1 pair as a draw too; then I'd get an 87% draw rate instead of 79%.
> Considering this and a 26 Elo difference for the SuFi (of course only as correct as the absolute Elo numbers taken as starting points may have been), is that, for 100 games with this "corrected" draw rate, within or outside the error bar of a 95% confidence interval?
LOS (likelihood of superiority) for a 10-3 score is 97%, and the two-sided confidence level is 94%, a bit lower than 2 standard deviations.
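For readers who want to check the LOS figure, here is a minimal Python sketch. It is not necessarily the calculation used in the post above; it shows two common ways to get an LOS from decisive games only (draws are ignored). The function names `los_normal` and `los_uniform_prior` are made up for this illustration. The first uses the usual normal approximation, the second assumes a uniform prior on the win probability and evaluates the resulting Beta posterior through the standard binomial identity.

```python
from math import comb, erf, sqrt


def los_normal(wins, losses):
    """LOS from the normal approximation; draws carry no information here."""
    return 0.5 * (1.0 + erf((wins - losses) / sqrt(2.0 * (wins + losses))))


def los_uniform_prior(wins, losses):
    """P(p > 0.5) for a Beta(wins + 1, losses + 1) posterior (uniform prior).

    Uses the identity P(Beta(a, b) > x) = P(Binomial(a + b - 1, x) <= a - 1)
    with x = 0.5, so only integer arithmetic on binomial coefficients is needed.
    """
    n = wins + losses + 1
    upper_tail = sum(comb(n, k) for k in range(wins + 1, n + 1))
    return 1.0 - upper_tail / 2 ** n


if __name__ == "__main__":
    for name, fn in (("normal", los_normal), ("uniform prior", los_uniform_prior)):
        los = fn(10, 3)
        # A two-sided confidence level follows as 2 * LOS - 1.
        print(f"{name}: LOS = {los:.3f}, two-sided level = {2 * los - 1:.3f}")
```

For +10 -3 both variants land at roughly 97% LOS; the uniform-prior version gives a two-sided level of about 94%, which is in line with the numbers quoted above.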
-
- Posts: 3186
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
Ah, I see, but that wasn't my question, was it?
Superiority could also hold for a single game won out of 100, with 99 draws.
I thought we would measure engine strength only in centi-Elo nowadays, so my question was whether, for an 87% draw rate and 100 games, 26 Elo would be within or outside the error bar of 95% confidence.
Peter.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: TCEC S15, END of an ERA event is much more Brutal than I thought!
peter wrote: ↑Tue May 28, 2019 2:52 pm
> Ah, I see, but that wasn't my question, was it?
> Superiority could be for a single game out of 100 won with 99 draws too.
> I thought, we would measure engine- strength in centi- Elo only nowadays, so my question was, if for 87% draw rate and 100 games 26 Elo would be within or without the error bar of 95% confidence.
Not within 95%, but within 94%. I already wrote that. So the result is inside the usual 2-standard-deviation error margins, but barely so.
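The error-bar question can also be put in Elo terms. The sketch below is one possible way to do it, not a reconstruction of anyone's calculation in this thread: it assumes the logistic Elo model, a normal approximation for the mean score, and independent games (which paired openings do not strictly satisfy). The names `elo_from_score` and `elo_interval` are hypothetical, and the Elo estimate comes from the match score itself, not from the 26 Elo rating difference mentioned above.

```python
from math import log10, sqrt


def elo_from_score(score):
    """Elo difference implied by an expected score under the logistic model."""
    return 400.0 * log10(score / (1.0 - score))


def elo_interval(wins, losses, draws, z=1.96):
    """Return (elo, lower, upper) for a W/L/D tally; z = 1.96 is about 95%."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    half_width = z * sqrt(var / n)  # half-width of the score interval
    return (elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))


if __name__ == "__main__":
    for label, tally in (("corrected +10 -3 =87", (10, 3, 87)),
                         ("raw +14 -7 =79", (14, 7, 79))):
        elo, lower, upper = elo_interval(*tally)
        print(f"{label}: {elo:.1f} Elo, ~95% interval [{lower:.1f}, {upper:.1f}]")
```

Under these assumptions the corrected tally gives an interval whose lower bound sits only marginally above zero, while the raw tally has the same score but more decisive games, hence a wider interval that includes zero. That fits the "but barely so" above.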