TCEC S15, END of an ERA event is much more Brutal than I thought!

Ozymandias
Posts: 1102
Joined: Sun Oct 25, 2009 12:30 am

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Ozymandias » Tue May 28, 2019 7:54 am

Laskos wrote:
Tue May 28, 2019 6:24 am
After 99/100 games in TCEC S15 SuFi the score is indeed +14 -7 =78. Sometimes I have an inhuman ability to predict the outcome months in advance
Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Laskos » Tue May 28, 2019 8:04 am

Ozymandias wrote:
Tue May 28, 2019 7:54 am
Laskos wrote:
Tue May 28, 2019 6:24 am
After 99/100 games in TCEC S15 SuFi the score is indeed +14 -7 =78. Sometimes I have an inhuman ability to predict the outcome months in advance
Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
Yes, that's true, but I had TCEC's imbalanced openings in mind when I predicted, months ago, a 2:1 win/loss ratio along the lines of +14 -7 =79 for the S15 SuFi. It is now ending exactly as I predicted (the 100th game is finishing now with a draw).

With balanced openings, yes, I would guess the win/loss ratio would be 3:1 or so, but the draw rate would be higher too.

Ozymandias
Posts: 1102
Joined: Sun Oct 25, 2009 12:30 am

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Ozymandias » Tue May 28, 2019 9:42 am

Laskos wrote:
Tue May 28, 2019 8:04 am
Ozymandias wrote:
Tue May 28, 2019 7:54 am
Laskos wrote:
Tue May 28, 2019 6:24 am
After 99/100 games in TCEC S15 SuFi the score is indeed +14 -7 =78. Sometimes I have an inhuman ability to predict the outcome months in advance
Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
Yes, that's true, but I had TCEC's imbalanced openings in mind when I predicted, months ago, a 2:1 win/loss ratio along the lines of +14 -7 =79 for the S15 SuFi. It is now ending exactly as I predicted (the 100th game is finishing now with a draw).

With balanced openings, yes, I would guess the win/loss ratio would be 3:1 or so, but the draw rate would be higher too.
Imbalance is good, lopsidedness is horrible. Ideally, you'd want openings that produce 1.5/2-0.5/2 or (even better) 2/2-0/2 over the game pair, not 1 out of 2 versus 1 out of 2 (it doesn't matter whether they're lopsided or drawish, those openings tell you very little, if anything).

peter
Posts: 1775
Joined: Sat Feb 16, 2008 6:38 am
Full name: Peter Martan

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by peter » Tue May 28, 2019 10:47 am

Ozymandias wrote:
Tue May 28, 2019 7:54 am
Laskos wrote:
Tue May 28, 2019 6:24 am
After 99/100 games in TCEC S15 SuFi the score is indeed +14 -7 =78. Sometimes I have an inhuman ability to predict the outcome months in advance
Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
I'd simply count every 1:1 pair as drawn too; then I'd get an 87% draw rate instead of 79%.
Considering this and a 26 Elo difference for the SuFi (of course only as accurate as the absolute Elo numbers taken as starting points may have been :)), is that, for 100 games with this "corrected" draw rate, within or outside the error bar of a 95% confidence interval?
Last edited by peter on Tue May 28, 2019 10:51 am, edited 1 time in total.
Peter.

Kanizsa
Posts: 32
Joined: Mon Feb 20, 2017 7:29 am
Location: Rialto, Venice

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Kanizsa » Tue May 28, 2019 11:05 am

A possible new challenge for LC0.

In his latest book, Garry Kasparov still states that "Centaur mode" is the best expression of strength in chess games.
I am doubtful after witnessing how LC0 defeated Stockfish.

To test Kasparov's hypothesis, I would suggest the following challenge:
GM + the best collection of alpha-beta programs available at the time vs. the best neural network program (LC0 or AlphaZero)

Main additional rules:
- The GM in Centaur mode has access to the analysis screen of the AB programs, but may not expand the analysis tree by hand
- On the other hand, the GM in Centaur mode may take back a move after LC0's reply and play a single substitute move, with a time penalty.

In your opinion, who would win a match of 16-24 games under these conditions?
GM in Centaur mode or LC0?

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Laskos » Tue May 28, 2019 11:48 am

peter wrote:
Tue May 28, 2019 10:47 am
Ozymandias wrote:
Tue May 28, 2019 7:54 am
Laskos wrote:
Tue May 28, 2019 6:24 am
After 99/100 games in TCEC S15 SuFi the score is indeed +14 -7 =78. Sometimes I have an inhuman ability to predict the outcome months in advance
Four openings were a double 1-0, so it could be argued that those are losing positions and should not count. In that case we'd be looking at +10 -3 =79.
I'd simply count every 1:1 pair as drawn too; then I'd get an 87% draw rate instead of 79%.
Considering this and a 26 Elo difference for the SuFi (of course only as accurate as the absolute Elo numbers taken as starting points may have been :)), is that, for 100 games with this "corrected" draw rate, within or outside the error bar of a 95% confidence interval?
LOS for 10-3 score is 97%, and the confidence interval is 94%, a bit lower than 2 standard deviations.

peter
Posts: 1775
Joined: Sat Feb 16, 2008 6:38 am
Full name: Peter Martan

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by peter » Tue May 28, 2019 12:09 pm

Laskos wrote:
Tue May 28, 2019 11:48 am
LOS for 10-3 score is 97%, and the confidence interval is 94%, a bit lower than 2 standard deviations.
LOS?
Peter.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Laskos » Tue May 28, 2019 12:39 pm

peter wrote:
Tue May 28, 2019 12:09 pm
Laskos wrote:
Tue May 28, 2019 11:48 am
LOS for 10-3 score is 97%, and the confidence interval is 94%, a bit lower than 2 standard deviations.
LOS?
Likelihood of Superiority.
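
For readers who want to check numbers like these, here is a minimal sketch. It assumes the commonly used normal approximation LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))), which ignores draws entirely. The thread does not say which formula was actually used, so the function names and the approximation itself are assumptions for illustration.

```python
import math


def z_score(wins: int, losses: int) -> float:
    """Win/loss difference in standard deviations (draws ignored)."""
    decisive = wins + losses
    if decisive == 0:
        return 0.0
    return (wins - losses) / math.sqrt(decisive)


def los(wins: int, losses: int) -> float:
    """One-sided likelihood of superiority under a normal approximation."""
    return 0.5 * (1.0 + math.erf(z_score(wins, losses) / math.sqrt(2.0)))


def two_sided_confidence(wins: int, losses: int) -> float:
    """Two-sided confidence level at which the difference becomes significant."""
    return math.erf(abs(z_score(wins, losses)) / math.sqrt(2.0))


if __name__ == "__main__":
    wins, losses = 10, 3  # the adjusted score discussed in the thread
    print(f"z          = {z_score(wins, losses):.2f} standard deviations")
    print(f"LOS        = {los(wins, losses):.1%}")
    print(f"two-sided  = {two_sided_confidence(wins, losses):.1%}")
```

Under this approximation, a 10-3 score gives an LOS of roughly 97% and a two-sided confidence level just under 95%, a little short of 2 standard deviations. Because draws drop out of the formula, the draw rate does not change these values.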

peter
Posts: 1775
Joined: Sat Feb 16, 2008 6:38 am
Full name: Peter Martan

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by peter » Tue May 28, 2019 12:52 pm

Laskos wrote:
Tue May 28, 2019 12:39 pm
peter wrote:
Tue May 28, 2019 12:09 pm
Laskos wrote:
Tue May 28, 2019 11:48 am
LOS for 10-3 score is 97%, and the confidence interval is 94%, a bit lower than 2 standard deviations.
LOS?
Likelihood of Superiority.
Ah, I see, but that wasn't my question, was it?
Superiority could also come from a single game won out of 100, with 99 draws.
:)
I thought we only measure engine strength in centi-Elo nowadays, so my question was whether, for an 87% draw rate and 100 games, 26 Elo would be within or outside the error bar at 95% confidence.
Peter.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: TCEC S15, END of an ERA event is much more Brutal than I thought!

Post by Laskos » Tue May 28, 2019 12:57 pm

peter wrote:
Tue May 28, 2019 12:52 pm
Laskos wrote:
Tue May 28, 2019 12:39 pm
peter wrote:
Tue May 28, 2019 12:09 pm
Laskos wrote:
Tue May 28, 2019 11:48 am
LOS for 10-3 score is 97%, and the confidence interval is 94%, a bit lower than 2 standard deviations.
LOS?
Likelihood of Superiority.
Ah, I see, but that wasn't my question, was it?
Superiority could also come from a single game won out of 100, with 99 draws.
:)
I thought we only measure engine strength in centi-Elo nowadays, so my question was whether, for an 87% draw rate and 100 games, 26 Elo would be within or outside the error bar at 95% confidence.
Not within 95%, but within 94%. I already wrote that. So the result is inside the usual 2-standard-deviation error margins, but only barely.
