Komodo strikes first at the TCEC Superfinal

syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Komodo strikes first at the TCEC Superfinal

Post by syzygy »

AdminX wrote:
M ANSARI wrote:Yes the +6.5 score needs a re-think. I don't see the value of not letting that go to say when both sides see mate. If one engine has a +6.5 score then in general the other side will be mated quite quickly.
I disagree, I think it should be left the way it is. I think doing what you suggest would just encourage developers to lower their evals to beat the TCEC rule on this.
Isn't that exactly the (potential) problem of the current TCEC rule?

I would not for a moment want to suggest that the Komodo developers (or any of the other developers) have done this (clearly they did not), but reporting relatively low scores is what saved game 22 for Komodo. Had it reported +6.5 or higher for a couple of moves, SF would have won the game. Knowing how the game continued, an SF win would have been completely undeserved, but the adjudication would have prevented anyone from knowing about it.

In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
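
To make the incentive concrete, here is a minimal sketch of an adjudication check in which both engines must agree on the score. The function names, the 4-move requirement and the score lists are assumptions made up for illustration; they are not TCEC's actual code or the real game 22 data.

```python
def adjudicated_win(winner_scores, loser_scores, threshold=6.5, moves_required=4):
    """Return the 0-based move index at which a win is adjudicated, or None.

    winner_scores: evals reported by the side that is ahead (positive is good for it).
    loser_scores:  evals reported by the side that is behind (negative is bad for it).
    Both engines must agree for moves_required consecutive moves.
    moves_required=4 is an assumed value, not taken from the rules.
    """
    streak = 0
    for i, (w, l) in enumerate(zip(winner_scores, loser_scores)):
        if w >= threshold and -l >= threshold:
            streak += 1
            if streak == moves_required:
                return i
        else:
            streak = 0
    return None


def cap_negative(score, cap=6.49):
    """Hypothetical reporting tweak: never report a score worse than -cap."""
    return max(score, -cap)


# Made-up score series, one entry per move.
sf_scores = [6.8, 7.1, 7.4, 7.9, 8.3]
komodo_scores = [-6.6, -7.0, -7.2, -7.5, -8.0]

honest = adjudicated_win(sf_scores, komodo_scores)
capped = adjudicated_win(sf_scores, [cap_negative(s) for s in komodo_scores])
print(honest, capped)
```

With honest reporting the game is stopped; with the capped scores the losing side never confirms the threshold, so the game has to be played out.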
AdminX
Posts: 6340
Joined: Mon Mar 13, 2006 2:34 pm
Location: Acworth, GA

Re: Komodo strikes first at the TCEC Superfinal

Post by AdminX »

syzygy wrote:
AdminX wrote:
M ANSARI wrote:Yes the +6.5 score needs a re-think. I don't see the value of not letting that go to say when both sides see mate. If one engine has a +6.5 score then in general the other side will be mated quite quickly.
I disagree, I think it should be left the way it is. I think doing what you suggest would just encourage developers to lower their evals to beat the TCEC rule on this.
Isn't that exactly the (potential) problem of the current TCEC rule?

I would not for a moment want to suggest that the Komodo developers (or any of the other developers) have done this (clearly they did not), but reporting relatively low scores is what saved game 22 for Komodo. Had it reported +6.5 or higher for a couple of moves, SF would have won the game. Knowing how the game continued, an SF win would have been completely undeserved, but the adjudication would have prevented anyone from knowing about it.

In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
True, I see your point here.
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
Modern Times
Posts: 3548
Joined: Thu Jun 07, 2012 11:02 pm

Re: Komodo strikes first at the TCEC Superfinal

Post by Modern Times »

Adjudication is always a small risk, but usually a necessary one when running a decent number of longer time control games.

Personally I think 6.50 is not high enough. I normally use 8.50 in my matches and for 8 consecutive moves. If I'm watching, I like to see more of the game.
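
A rough sketch of how the threshold and the number of consecutive moves change how much of the game is seen. The helper name and the score series are invented for illustration, and the 4-move count paired with 6.50 is an assumption; only the 8.50 / 8 moves setting comes from the post above.

```python
def first_adjudication_move(scores, threshold, consecutive):
    """Return the 1-based move number at which the game would be stopped, or None.

    scores are evals from the leading side's point of view, one per move.
    """
    streak = 0
    for move_no, score in enumerate(scores, start=1):
        # Extend the streak while the score stays at or above the threshold.
        streak = streak + 1 if score >= threshold else 0
        if streak >= consecutive:
            return move_no
    return None


# Made-up, steadily rising scores.
scores = [5.9, 6.6, 6.9, 7.2, 7.8, 8.1, 8.6, 8.9,
          9.4, 9.9, 10.5, 11.2, 12.0, 13.1, 14.5]

early = first_adjudication_move(scores, threshold=6.5, consecutive=4)
late = first_adjudication_move(scores, threshold=8.5, consecutive=8)
print(early, late)
```

On this series the stricter setting lets the game run nine moves longer before it is stopped.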
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Komodo strikes first at the TCEC Superfinal

Post by syzygy »

Hai wrote:It's simple: Stockfish didn't win game 22 because it evaluated Kh3 as a draw by 3-fold repetition, but it was only a 2-fold repetition.
But Stockfish treats a 2-fold repetition the same as a 3-fold repetition.
And that's why Stockfish played Kf3??.

So you need to fix the handling of 2-fold repetition and evaluate it correctly.
No, the problem is simply that it missed Bc2 and played Kg4 instead (due to a bug).

After Kg4 Rg2+ most engines, including SF and Komodo, evaluate Kh3 as a draw because it allows Rb2 resulting in a first repetition. This has always tested better in terms of Elo, so that's why they do that.
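
A simplified sketch of what "evaluating a first repetition as a draw" means inside a search. This is not SF's or Komodo's actual code; the position keys, function names and the `min_repeats` parameter are hypothetical, and real engines compare hash keys and apply further conditions.

```python
DRAW_SCORE = 0.0


def repetition_count(earlier_keys, current_key):
    """Count how often current_key already occurred earlier in the line."""
    return sum(1 for key in earlier_keys if key == current_key)


def search_score(earlier_keys, current_key, static_eval, min_repeats=1):
    """Return DRAW_SCORE once the position has already occurred min_repeats times.

    min_repeats=1 scores the first repetition (second occurrence) as a draw.
    min_repeats=2 waits for a real threefold repetition.
    """
    if repetition_count(earlier_keys, current_key) >= min_repeats:
        return DRAW_SCORE
    return static_eval


# Position "B" is reached a second time at the end of the line.
line_so_far = ["A", "B", "C"]

first_rep_as_draw = search_score(line_so_far, "B", static_eval=7.0, min_repeats=1)
wait_for_threefold = search_score(line_so_far, "B", static_eval=7.0, min_repeats=2)
print(first_rep_as_draw, wait_for_threefold)
```

With `min_repeats=1` the winning side steers away from any line that lets the opponent repeat even once, which is the behaviour described above.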
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Komodo strikes first at the TCEC Superfinal

Post by Milos »

syzygy wrote:In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
That's just BS. If an engine is not capable of converting an evaluation higher than 6.5 into a win, then it doesn't deserve to win, period.
It has nothing to do with what the other engine reports.
Here, 62. Bc2 is a very easy winning move that engines a few hundred Elo points weaker than SF quickly see, pick up and do not let go. For example, H4 with Syz needs only 1M nodes to find Bc2.
SF's SMP search was terrible on that move. The kind of search instability that SF showed is expected when the score is above 10-15, not as early as it appeared here.
The same SF without Syz and on a single core pretty quickly picks up Bc2 (after 20M nodes) and never lets it go (even though the search is a bit unstable), so it is clearly a bug, either in lazy SMP or in the probing code.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Komodo strikes first at the TCEC Superfinal

Post by syzygy »

Milos wrote:
syzygy wrote:In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
That's just BS. If an engine is not capable of converting an evaluation higher than 6.5 into a win, then it doesn't deserve to win, period.
You again with your usual foul language.

What is there not to understand... can't help you. If Komodo had reported +7 scores it would have lost here by adjudication. SF would not have reached the point where it blundered. Not difficult at all for the rest of us.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Komodo strikes first at the TCEC Superfinal

Post by Milos »

syzygy wrote:
Milos wrote:
syzygy wrote:In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
That's just BS. If an engine is not capable of converting an evaluation higher than 6.5 into a win, then it doesn't deserve to win, period.
You again with your usual foul language.

What is there not to understand... can't help you. If Komodo had reported +7 scores it would have lost here by adjudication. SF would not have reached the point where it blundered. Not difficult at all for the rest of us.
How sensitive you become when you have no arguments.
Why would Komodo help SF to cover its own bugs???
SF's SMP search is clearly buggy and SF deserves to suffer for it. If I designed an engine for TCEC I would intentionally scale down reported scores when they are negative and scale them up when they are positive just to benefit from bugs of other engines.
sedicla
Posts: 178
Joined: Sat Jan 08, 2011 12:51 am
Location: USA
Full name: Alcides Schulz

Re: Komodo strikes first at the TCEC Superfinal

Post by sedicla »

They could let the engines play until checkmate.
I think it would be interesting to see how engines finish the game.
syzygy
Posts: 5566
Joined: Tue Feb 28, 2012 11:56 pm

Re: Komodo strikes first at the TCEC Superfinal

Post by syzygy »

Milos wrote:Why would Komodo help SF to cover its own bugs???
So you cannot read simple sentences.

This is what you reacted to. You quoted it yourself:
In short, a TCEC competitor can only benefit from capping its (negative) evaluations to 6.49 and such benefit has now been proven to be more than theoretical.
This is simply a correct statement.

If anything, I am saying that Komodo (and all other engines) SHOULD cap its (negative) evaluations if only to avoid helping SF to cover its bugs...

:roll:
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Komodo strikes first at the TCEC Superfinal

Post by michiguel »

Modern Times wrote:Adjudication is always a small risk, but usually a necessary one when running a decent number of longer time control games.

Personally I think 6.50 is not high enough. I normally use 8.50 in my matches and for 8 consecutive moves. If I'm watching, I like to see more of the game.
I think the problem is that a high eval score may just mean the engine is seeing very deep, while the position has not yet reached a point at which a normal human would resign. Even if the score is correct, the spectator misses the chance to see the execution.

I always said that what is necessary is both a high score and a clearly positive material value. For instance, +6.5 AND at least two pawns up or equivalent. Here, there would not have been any adjudication.
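
A minimal sketch of such a combined rule, assuming conventional piece values (P=1, N=B=3, R=5, Q=9) and positions given as strings of piece letters with kings omitted. The function names and example positions are made up for illustration.

```python
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_balance(white_pieces, black_pieces):
    """Material in pawn units from White's point of view (kings omitted)."""
    white = sum(PIECE_VALUES[p] for p in white_pieces)
    black = sum(PIECE_VALUES[p] for p in black_pieces)
    return white - black


def may_adjudicate(score, white_pieces, black_pieces, min_score=6.5, min_material=2):
    """Allow adjudication only if the eval AND the material both favour the same side."""
    balance = material_balance(white_pieces, black_pieces)
    if score >= min_score:
        return balance >= min_material
    if score <= -min_score:
        return -balance >= min_material
    return False


# High score but level material: keep playing.
level = may_adjudicate(7.0, "RBPPP", "RBPPP")
# High score and White is a bishop and a pawn up: may be adjudicated.
ahead = may_adjudicate(7.0, "RBPPP", "RPP")
print(level, ahead)
```

The material test is deliberately crude; it only stops a deep tactical eval from ending the game before the advantage is visible on the board.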

Miguel