TCEC stage 3 , New Houdini starts with a bang

lantonov
Posts: 216
Joined: Sun Apr 13, 2014 5:19 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by lantonov »

Daniel Anulliero wrote: And at the very end, I'm very happy that hgm's monthly tournament exists. Here is the real fun, no "head crash" kind of guys, etc. ... Thanks Harm :wink: !
Dany
Can you give a link to the hgm tournament? I admire Harm not only as an extremely able programmer but as a kind and helpful person as well.
Frank Brenner
Posts: 34
Joined: Sat Jul 02, 2016 1:47 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Frank Brenner »

> b) to see which program wins the game, using optimal settings? I've always gone for (b).

We all vote for (b). And in order to vote for (b), the first game should not be repeated:

Imagine what would have happened if Houdini had won the second game as well and the Komodo team claimed "Oh, here is one more absolute-best setting 5 / 107" ... we could repeat this situation until Komodo wins the n-th repetition ...
Branko Radovanovic
Posts: 89
Joined: Sat Sep 13, 2014 4:12 pm
Location: Zagreb, Croatia
Full name: Branko Radovanović

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Branko Radovanovic »

syzygy wrote:
Branko Radovanovic wrote: The game which was played in violation of the rules was null and void at the moment it started, null and void while it was played, and null and void after it ended - there is no difference. That's a "legalist" point of view, so to say, and I don't think it can be reasonably disputed.
If the rules had been so clear, I would have agreed. (But I wonder how the rules would have had to be formulated. What do you do if the error is detected only very late in the stage and affects many games? Replay everything?)
That's a good point. Even worse, the error could have been detected after the stage had ended and the next stage had begun. What to do then, replay tainted games from the previous stage and void all games from the current stage? Or void them only if the qualifiers change as a result of the replay? Imagine, for example, that tomorrow someone discovers that in Stage 1a one engine played with the wrong parameters - the entire Stage 2 would possibly need to be replayed, even if the handicapped engine qualified from 1a regardless, and if none of the authors raised any objection!

The current written rules don't say a thing about replays, but it is clear that - short of using a time machine - it is the only way to preserve adherence to the rules (and thus fairness) in case of accidental deviations. Still, TCEC is no longer a one-man tournament, it carries significant weight now, and needs a much better and more complete ruleset (e.g. the one that provides an unambiguous definition of a "serious, play-limiting bug" and describes when exactly an engine can be disqualified).
Ralf Müller
Posts: 127
Joined: Sat Dec 29, 2012 12:07 am

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Ralf Müller »

Imagine what would have happened if Houdini had won the second game as well and the Komodo team claimed "Oh, here is one more absolute-best setting 5 / 107" ... we could repeat this situation until Komodo wins the n-th repetition ...
The Komodo team gave their settings before the game to the TCEC operator and he took the wrong settings. So your argument has no point.
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by whereagles »

Graham Banks wrote:
whereagles wrote: This sort of incident is embarrassing and must be avoided.

Still, the solution is sensible.
Nobody is perfect.

I'd have made exactly the same decision if I did this in one of my tournaments.
As I would, if I were running one. Cheers :)
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: TCEC stage 3 , New Houdini starts with a bang

Post by bob »

Frank Brenner wrote:
> b) to see which program wins the game, using optimal settings? I've always gone for (b).

We all vote for (b). And in order to vote for (b), the first game should not be repeated:

Imagine what would have happened if Houdini had won the second game as well and the Komodo team claimed "Oh, here is one more absolute-best setting 5 / 107" ... we could repeat this situation until Komodo wins the n-th repetition ...
Your comment makes no sense. The first game was played with incorrect settings, not something caused by the program, but something caused by a mistake the operator made. The very idea of favoring the engine over the operator requires that any game played with an operator mistake present should be replayed, no matter what the original outcome was.

Otherwise you introduce random noise into a pretty tightly controlled experimental setup.
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by mjlef »

IGarcia wrote:
mjlef wrote:
syzygy wrote:
Uri Blass wrote:
Laskos wrote:
IanO wrote:
carldaman wrote:What was wrong with Komodo's settings? People should at least be entitled to a proper explanation by the organizers.
The differences can be found in the initial game comment.

Before:

BlackEngineOptions: Use Syzygy=true; Syzygy Probe Depth=2; Syzygy Probe Limit=6; Dynamism=110;
After:

BlackEngineOptions: Contempt=7;
To summarize, removing custom Syzygy probe options and "Dynamism" (whatever that is), and adding Contempt 7.
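
As a rough illustration of that summary, the two option strings from the game comments can be parsed and compared mechanically. This is only a sketch: `parse_options` is a hypothetical helper written for this post, not part of any TCEC tooling, and it assumes the comment format is always `Name=value;` pairs after the first colon. Whether options missing from a comment fall back to engine defaults is also an assumption.

```python
def parse_options(line):
    """Parse 'Label: Name=value; Name=value;' into a dict (hypothetical helper)."""
    _, _, body = line.partition(":")
    options = {}
    for item in body.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing ';'
        name, _, value = item.partition("=")
        options[name.strip()] = value.strip()
    return options


before = parse_options(
    "BlackEngineOptions: Use Syzygy=true; Syzygy Probe Depth=2; "
    "Syzygy Probe Limit=6; Dynamism=110;"
)
after = parse_options("BlackEngineOptions: Contempt=7;")

# Options present only in the first game, only in the replay, or in both
# with different values.
removed = sorted(set(before) - set(after))
added = sorted(set(after) - set(before))
changed = sorted(k for k in before.keys() & after.keys() if before[k] != after[k])

print("removed:", removed)
print("added:", added)
print("changed:", changed)
```

Nothing is shared between the two strings, so every difference shows up as either a removed or an added option.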

It would be nice to hear from Mark what the effects of these changes are likely to be.
The values "Before" were close to optimal. The ELO loss was no larger than 5-10 ELO points and hardly the reason for the lost game. Seems unfair to Houdini to replay the game.
I do not think that we know the elo loss at long time control and there is no basis to claim that the wrong setting is not the reason for the lost game.

I see nothing unfair for Houdini.
It is only fair if the game had been replayed no matter what.

That this game would have been replayed no matter what is far from clear. The rules are silent on this situation and what if it had been discovered only much later?

What if Komodo had drawn or won "despite its disadvantage"? I think most people would have considered it fair not to replay the game in that case.
Although I did not make the decision to replay the game, Anton made clear in the email the game would have been replayed whatever the result of the first game is. That is only fair. I would have actually insisted on it especially if Komodo was winning but he already said it would be replayed. Fair is fair. I do not want Komodo to get an unfair advantage due to a simple error in setup.
To prove that, we would need to know the time of the email and compare it with the game time; that would make clear whether or not the game had a balanced score when the replay was decided.

Besides that, TCEC is wrong to allow different settings for every stage. This lets Komodo take risks in the early stages and switch to a conservative (draw-all) mode in the last stages, winning only the games where other engines make small mistakes, probably because they are not in the same conservative mode.

TCEC also allows engine version changes between stages, so you can compile custom engines tailored to a rival (for the final stage).

All that is very unfair and makes the TCEC title worth nothing (for me).

Martin Thoresen should take note of this for the sake of his tournament's PRESTIGE and run the tournament with the latest publicly available / released engine versions, not all these custom settings and unreleased engine versions.

Regards.


PS: the ONLY fair thing to do now is, in the future reverse-color game Komodo-Houdini, to REPLAY it if Komodo wins, giving Houdini a second chance to draw, as was given to Komodo.
I have the email. I did not say the evals were balanced. I said the Komodo eval was a bit above 1.0 (so in Houdini's favor). Email times will confirm all I said. In any case the score does not matter. Incorrect settings must be fixed. If a company sponsoring a race between two cars forgot to fill the tank on one, they would of course restart the race once both cars had been filled.

Humans change their internal settings when playing against other humans, so the programs can do the same (but only between stages). I think allowing updates between stages improves the quality of the chess played, and I suspect most TCEC viewers would disagree with your proposal of using an old version throughout the multi-month season.
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by mjlef »

syzygy wrote:
Ralf Müller wrote:If Houdini is equal, then both engines will now finish equally (without random factors). So that's completely fair, isn't it?
By that reasoning it would always be fair to replay any game. So the reasoning is flawed.

If after flipping a coin you decide to flip it again on the basis of a condition linked, however remotely, to the outcome of the first flip, then you have made the experiment unfair. This is elementary statistics.

If, before the first flip, it was already 100% certain that the first flip would not count and a second flip would be taken, then there is no problem (apart from the waste of time for the first flip).
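
The two cases above can be checked with a small simulation. This is a minimal sketch under strong assumptions: the game is modelled as a perfectly fair coin between equal engines, and the misconfiguration itself is not modelled at all, so it only illustrates the replay-rule argument. The function `score_rate` and the three rules are made up for this illustration.

```python
import random


def score_rate(replay_rule, trials=200_000, seed=1):
    """Fraction of counted flips that come up heads for a fair coin.

    replay_rule(first) returns True when the first flip is discarded
    and the coin is flipped again.
    """
    rng = random.Random(seed)
    heads = 0
    for _ in range(trials):
        first = rng.random() < 0.5
        if replay_rule(first):
            result = rng.random() < 0.5  # first flip discarded, second counts
        else:
            result = first
        heads += result  # True counts as 1
    return heads / trials


never = score_rate(lambda first: False)              # no replays at all
always = score_rate(lambda first: True)              # replay decided in advance
only_if_tails = score_rate(lambda first: not first)  # replay depends on outcome

print(f"never replay:         {never:.3f}")
print(f"always replay:        {always:.3f}")
print(f"replay only on tails: {only_if_tails:.3f}")
```

With a replay that is fixed in advance the heads rate stays near one half, while a replay triggered only by tails pushes it toward three quarters, which is the asymmetry the coin argument is about.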

In the present case the rules are silent on the matter. In such a case it seems fairer to at least collect the participants' views on the matter before taking a decision.

Of course from now on it is clear that any misconfiguration, however insignificant, will result in cancellation of all affected games regardless of their number and the time that has passed. And, most importantly, regardless of the wishes of the affected participants.
I must disagree with the coin analogy. A proper analogy would be that an unbalanced coin was flipped and then replaced with a fair coin. The first flip was not correct since the parameters of the coin were not correct.
mjlef
Posts: 1494
Joined: Thu Mar 30, 2006 2:08 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by mjlef »

syzygy wrote:
Ralf Müller wrote: It's not about replaying a game, it's about restarting the whole tournament. All engines start at 0 points and have the same possibilities.

Also it would be fair to restart a whole coin flip tournament (if there are enough rounds to go).
You are making a basic error.

If someone thinks he might have a patch that improves SF, he can run it on fishtest. If the patch passes all tests, it will normally be added to the official SF.

Now suppose you try this and the patch fails some of the tests.
You try again... you "restart the whole tournament". Now it passes.

What do you expect to happen? Will your patch be added to the official Stockfish?
No, your patch will not be added and for very good reasons.

Restarting the tournament for whatever reason (provided it is linked, however remotely, to the outcome of the earlier games) is inherently unfair. (And in any event, it is absolutely irrelevant that the game that was restarted was the first game of the series.)
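
The effect of "try again until it passes" can be put into numbers. The sketch below assumes a patch with no real benefit and a hypothetical chance `alpha` of passing a single test run by luck; the value 0.05 is picked only for illustration and is not a fishtest parameter. With independent reruns, the chance of at least one pass in n attempts is 1 - (1 - alpha)^n.

```python
def pass_probability(alpha, attempts):
    """Chance that a neutral patch passes at least once in `attempts`
    independent runs, each with false-pass probability `alpha`."""
    return 1.0 - (1.0 - alpha) ** attempts


# alpha = 0.05 is an assumed, illustrative value.
for attempts in (1, 2, 5, 10):
    print(attempts, round(pass_probability(0.05, attempts), 3))
```

The probability grows with every restart, which is why a pass obtained after a failed run carries less evidence than a pass on the first and only run.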


Completely separate question: what if Komodo had won the game despite its disadvantage and the stage would have continued without replay. Would you have objected?
I disagree. The test run is not the same. A better analogy would be that you had a bug in the first run (you set some numbers wrong), so you fixed them and restarted the test. This has happened several times in testing Komodo. We always disregard the faulty data, clear it and run again with proper settings.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Michel »

I must disagree with the coin analogy. A proper analogy would be that an unbalanced coin was flipped and then replaced with a fair coin. The first flip was not correct since the parameters of the coin were not correct.
The coin analogy is correct (since the bias was very small).

If the game would also have been replayed in the case of a draw/loss for Houdini, then the decision was fair.

However a more likely scenario in case of a win for K would have been that the game would not have been replayed, the reason being that it was K that was handicapped, not Houdini.... (and it would have been hard to argue with this).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.