TCEC stage 3 , New Houdini starts with a bang

Laskos
Posts: 9982
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Laskos » Sun Jul 24, 2016 6:47 pm

Branko Radovanovic wrote:
Laskos wrote:With a probability of 98%, the 5 ELO points difference in Komodo's settings had no impact on the outcome of the first game, and 5 ELO points in a single game are negligible. It is the replaying of the game that seriously affected the stage. It flipped the fortunes of a single game by approximately the value of the drawelo, 250 ELO points. Replaying the game had a pretty devastating, and a priori expected, effect on the outcome of a single game previously won by Houdini. Come on, humans have pretty good intuition on these matters; everybody knew that the decision massively favors Komodo for that game.
"Someone stole a lottery ticket from Komodo and gave it to Houdini. Then, the ticket won the jackpot. Taking the jackpot from Houdini and giving it to Komodo would have a devastating effect on Houdini because it would cost him millions, while the value of the lottery ticket in the moment it was stolen was pretty much negligible, being a million-to-one longshot. So, this decision favors Komodo massively."

That's really just a rephrasing of my earlier reply to Ronald de Man: one either compares a priori with a priori or a posteriori with a posteriori - one cannot mix the two, because it leads to faulty conclusions like the one above.
You simply don't realize a fact observed intuitively by almost everybody (but you): flipping a coin until it gives a desired result is not science about the coin, and I am afraid it makes for a somewhat flawed stage of TCEC.
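To put rough numbers on the "5 ELO points" versus "drawelo of 250" comparison above, here is a minimal sketch. It assumes a BayesElo-style win/draw/loss model, takes the drawelo value of 250 from the post, and ignores the first-move advantage; the function names are made up for illustration and the model choice is an assumption, not something TCEC prescribes.

```python
def bayeselo_probs(elo_diff, draw_elo=250.0):
    """Return (win, draw, loss) for the side that is elo_diff points stronger.

    Assumed BayesElo-style model: a logistic curve shifted by draw_elo.
    First-move advantage is ignored (assumption for illustration).
    """
    def logistic(delta):
        return 1.0 / (1.0 + 10.0 ** (delta / 400.0))

    win = logistic(draw_elo - elo_diff)
    loss = logistic(draw_elo + elo_diff)
    draw = 1.0 - win - loss
    return win, draw, loss


def expected_score(elo_diff, draw_elo=250.0):
    """Expected score per game: a win counts 1, a draw counts 1/2."""
    win, draw, _ = bayeselo_probs(elo_diff, draw_elo)
    return win + 0.5 * draw


if __name__ == "__main__":
    # Compare equal engines with a 5 and a 10 point edge.
    for diff in (0, 5, 10):
        win, draw, loss = bayeselo_probs(diff)
        print(f"diff={diff:2d}  win={win:.4f}  draw={draw:.4f}  "
              f"loss={loss:.4f}  score={expected_score(diff):.4f}")
```

Under these assumptions a 5 point edge moves the expected score of a single game by well under one percentage point, which is the sense in which it is called negligible here. Whether that settles the replay question is a separate matter.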

Uri Blass
Posts: 8730
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Uri Blass » Sun Jul 24, 2016 7:13 pm

syzygy wrote:
Uri Blass wrote:
Laskos wrote:
IanO wrote:
carldaman wrote:What was wrong with Komodo's settings? People should at least be entitled to a proper explanation by the organizers.
The differences can be found in the initial game comment.

Before:

BlackEngineOptions: Use Syzygy=true; Syzygy Probe Depth=2; Syzygy Probe Limit=6; Dynamism=110;

After:

BlackEngineOptions: Contempt=7;

To summarize, removing custom Syzygy probe options and "Dynamism" (whatever that is), and adding Contempt 7.

It would be nice to hear from Mark what the effects of these changes are likely to be.
The values "Before" were close to optimal. The ELO loss was no larger than 5-10 ELO points and was hardly the reason for the lost game. It seems unfair to Houdini to replay the game.
I do not think that we know the elo loss at long time control and there is no basis to claim that the wrong setting is not the reason for the lost game.

I see nothing unfair for Houdini.
It is only fair if the game would have been replayed no matter what.

That this game would have been replayed no matter what is far from clear. The rules are silent on this situation and what if it had been discovered only much later?

What if Komodo had drawn or won "despite its disadvantage"? I think most people would have considered it fair not to replay the game in that case.
Replying to:
"What if Komodo had drawn or won "despite its disadvantage"?"

I do not know the opinion of most people, but my point of view is that it would also be unfair not to replay the game in this case.

I suggest that in the future everybody should be able to see the default settings of the programs before the tournament (so that everybody, and not only the programmers, can complain and ask for games to be replayed if they see that something is wrong).

Ralf Müller
Posts: 127
Joined: Fri Dec 28, 2012 11:07 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Ralf Müller » Sun Jul 24, 2016 7:50 pm

It's not about replaying a game, it's about restarting the whole tournament. All engines start at 0 points and have the same chances.

Also it would be fair to restart a whole coin flip tournament (if there are enough rounds to go).

Branko Radovanovic
Posts: 64
Joined: Sat Sep 13, 2014 2:12 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Branko Radovanovic » Sun Jul 24, 2016 8:02 pm

Laskos wrote:
Branko Radovanovic wrote:
Laskos wrote:With a probability of 98%, the 5 ELO points difference in Komodo's settings had no impact on the outcome of the first game, and 5 ELO points in a single game are negligible. It is the replaying of the game that seriously affected the stage. It flipped the fortunes of a single game by approximately the value of the drawelo, 250 ELO points. Replaying the game had a pretty devastating, and a priori expected, effect on the outcome of a single game previously won by Houdini. Come on, humans have pretty good intuition on these matters; everybody knew that the decision massively favors Komodo for that game.
"Someone stole a lottery ticket from Komodo and gave it to Houdini. Then, the ticket won the jackpot. Taking the jackpot from Houdini and giving it to Komodo would have a devastating effect on Houdini because it would cost him millions, while the value of the lottery ticket in the moment it was stolen was pretty much negligible, being a million-to-one longshot. So, this decision favors Komodo massively."

That's really just a rephrasing of my earlier reply to Ronald de Man: one either compares a priori with a priori or a posteriori with a posteriori - one cannot mix the two, because it leads to faulty conclusions like the one above.
You simply don't realize a fact observed intuitively by almost everybody (but you): flipping a coin until it gives a desired result is not science about the coin, and I am afraid it makes for a somewhat flawed stage of TCEC.
Flipping a coin until it gives a desired result is indeed not science, but that's not at all what happened here. The game which was played in violation of the rules was null and void in the moment when it started, null and void while it was played, and null and void after it ended - there is no difference. That's a "legalist" point of view, so to say, and I don't think it can be reasonably disputed.

Here is a real question: what if Komodo won the game with wrong settings? Would it be fair then to: a) accept this result as valid, since the engine won while handicapped, or b) replay the game regardless? Mathematically, it's both intuitive and easy to show that the correct answer is b), even if this could lead to a seemingly grossly unfair outcome, such as Komodo losing with the correct settings afterwards. In fact, only if one decides to replay or not based on the outcome of the tainted game, the thing becomes equivalent to what you've described as "flipping a coin until it gives a desired result".

Unconditional replay is both "legally" right and mathematically fair, as it results in the same expectancy as if everything was correct from the start.
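The expectancy argument above can be checked with a few lines of exact arithmetic. This is only a sketch: the win/draw/loss numbers below are invented for illustration (they are not measured TCEC or Komodo statistics), scores are from Komodo's point of view, and the function names are hypothetical.

```python
from fractions import Fraction


def expected(dist):
    """Expected score of a (win, draw, loss) distribution."""
    win, draw, _loss = dist
    return win + Fraction(1, 2) * draw


def unconditional_replay(tainted, fair):
    """The tainted game is always discarded, so only the fair game counts."""
    return expected(fair)


def replay_only_after_loss(tainted, fair):
    """Keep a tainted win or draw, replay only if the tainted game was lost."""
    win, draw, loss = tainted
    return win + Fraction(1, 2) * draw + loss * expected(fair)


# Hypothetical distributions (assumptions, not real data).
fair = (Fraction(20, 100), Fraction(60, 100), Fraction(20, 100))
tainted = (Fraction(19, 100), Fraction(60, 100), Fraction(21, 100))

if __name__ == "__main__":
    print("fair game expectancy:     ", float(expected(fair)))
    print("unconditional replay:     ", float(unconditional_replay(tainted, fair)))
    print("replay only after a loss: ", float(replay_only_after_loss(tainted, fair)))
```

With these made-up numbers, unconditional replay gives exactly the fair-game expectancy, while replaying only after a loss pushes the handicapped side's expectancy above it. That is the "flip until the desired result" effect: the bias comes from making the replay decision depend on the outcome, not from the replay itself.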

syzygy
Posts: 4476
Joined: Tue Feb 28, 2012 10:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy » Sun Jul 24, 2016 9:25 pm

Ralf Müller wrote:It's not about replaying a game, it's about restarting the whole tournament. All engines start at 0 points and have the same chances.

Also it would be fair to restart a whole coin flip tournament (if there are enough rounds to go).
You are making a basic error.

If someone thinks he might have a patch that improves SF, he can run it on fishtest. If the patch passes all tests, it will normally be added to the official SF.

Now suppose you try this and the patch fails some of the tests.
You try again... you "restart the whole tournament". Now it passes.

What do you expect to happen? Will your patch be added to the official Stockfish?
No, your patch will not be added and for very good reasons.

Restarting the tournament for whatever reason (provided it is linked, however remotely, to the outcome of the earlier games) is inherently unfair. (And in any event, it is absolutely irrelevant that the game that was restarted was the first game of the series.)


Completely separate question: what if Komodo had won the game despite its disadvantage and the stage had continued without a replay? Would you have objected?
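The fishtest analogy above (rerun a failed test until it passes) can be quantified with a small sketch. It assumes each run is independent and that a useless patch passes a single run with some fixed false-positive probability; the 5% figure is an illustrative assumption, not fishtest's actual setting, and the function name is hypothetical.

```python
def pass_probability_with_retries(false_positive_rate, attempts):
    """Chance that a patch with no real benefit passes at least once.

    Assumes independent runs with the same per-run false-positive rate.
    """
    return 1.0 - (1.0 - false_positive_rate) ** attempts


if __name__ == "__main__":
    # 0.05 is an assumed per-run false-positive rate, for illustration only.
    for attempts in (1, 2, 5, 10):
        p = pass_probability_with_retries(0.05, attempts)
        print(f"attempts={attempts:2d}  chance of a bogus pass={p:.3f}")
```

Every extra restart that is triggered by an unwanted result raises the chance of accepting a patch that does nothing, which is why a retried pass would not be accepted.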

syzygy
Posts: 4476
Joined: Tue Feb 28, 2012 10:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by syzygy » Sun Jul 24, 2016 9:33 pm

Branko Radovanovic wrote:The game which was played in violation of the rules was null and void in the moment when it started, null and void while it was played, and null and void after it ended - there is no difference. That's a "legalist" point of view, so to say, and I don't think it can be reasonably disputed.
If the rules had been so clear, I would have agreed. (But I wonder how the rules would have had to be formulated. What do you do if the error is detected only very late in the stage and affects many games? Replay everything?)
Here is a real question: what if Komodo won the game with wrong settings? Would it be fair then to: a) accept this result as valid, since the engine won while handicapped, or b) replay the game regardless? Mathematically, it's both intuitive and easy to show that the correct answer is b), even if this could lead to a seemingly grossly unfair outcome, such as Komodo losing with the correct settings afterwards. In fact, only if one decides to replay or not based on the outcome of the tainted game, the thing becomes equivalent to what you've described as "flipping a coin until it gives a desired result".
I fully agree. Unfortunately I could not avoid seeing phrases such as "Komodo has the right to a replay". (Please note: not coming from Mark.)

I know this is speculative, but if Komodo had won (or drawn or even lost) and the game had not been replayed, I don't think any discussion would be taking place now. Of course everybody would have agreed that Komodo should play its remaining games with the correct settings.

Did you check the written rules?

MikeB
Posts: 3830
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: TCEC stage 3 , New Houdini starts with a bang

Post by MikeB » Sun Jul 24, 2016 9:58 pm

syzygy wrote:
Ralf Müller wrote:It's not about replaying a game, it's about restarting the whole tournament. All engines start at 0 points and have the same chances.

Also it would be fair to restart a whole coin flip tournament (if there are enough rounds to go).
You are making a basic error.

If someone thinks he might have a patch that improves SF, he can run it on fishtest. If the patch passes all tests, it will normally be added to the official SF.

Now suppose you try this and the patch fails some of the tests.
You try again... you "restart the whole tournament". Now it passes.

What do you expect to happen? Will your patch be added to the official Stockfish?
No, your patch will not be added and for very good reasons.

Restarting the tournament for whatever reason (provided it is linked, however remotely, to the outcome of the earlier games) is inherently unfair. (And in any event, it is absolutely irrelevant that the game that was restarted was the first game of the series.)


Completely separate question: what if Komodo had won the game despite its disadvantage and the stage had continued without a replay? Would you have objected?
+1 Absolutely correct. It's so unfortunate that such a poor decision was made, and this illustrates the downfall of a single organization (or person) making all the decisions. Going forward, they should have an independent body of seasoned chess programmers who do not have an engine involved to make such decisions. They might have come to the same conclusion, but unfortunately even the slightest hint of perceived favoritism becomes reality in cases like this.

Peter Berger
Posts: 403
Joined: Thu Mar 09, 2006 1:56 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Peter Berger » Sun Jul 24, 2016 10:24 pm

You have to realize that these situations happen ALL THE TIME in engine tournaments. They often just go unnoticed.

I attended 3 WCCCs as an operator. Each year I observed unexpected situations that demanded a judgement by an official. Some I agreed to, some I didn't.

E.g. Crafty-Shredder, Ramat-Gan, 1st round, when Crafty crashed and wasn't set back up in time. It should have been a loss for Crafty in a perfect world IMHO. The game ended as a draw mainly because SMK was very friendly and resisted the visible temptation to insist on a win by default. The scoresheet eventually just showed a pretty boring draw.

It is the same with ALL basement tourneys. I have run quite a few myself, and OFTEN something unexpected happened at some point, despite utmost care with the setup.

How do you judge the case when, at some point in the game, some monthly scheduled process you have completely forgotten about fires off and steals processor time from an engine?

How do you judge crashes when it can't be worked out who is to blame (engine, setup, machine)? Every judgement changes the probabilities of the final result.

You can try to do your best though.

In case the one to blame is the operator as he set up the engine incorrectly, the only reasonable decision is to ALWAYS replay the game/s though IMHO. No matter the result of the game.

So I agree with the TCEC decision.

Peter

Ralf Müller
Posts: 127
Joined: Fri Dec 28, 2012 11:07 pm

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Ralf Müller » Sun Jul 24, 2016 11:10 pm

Your patch example misses one important thing: if you have completed a test series with a new patch and the patch passed all tests, but beforehand you had played one other SINGLE game under slightly OTHER circumstances, which the patch lost, then you'll of course add the patch to the official Stockfish, for good reasons.

And yes, I would also insist on a replay if Komodo had won this game. If the result is to be worth anything, all games must be played under similar circumstances.

Rochester
Posts: 55
Joined: Sat Feb 20, 2016 5:11 am

Re: TCEC stage 3 , New Houdini starts with a bang

Post by Rochester » Sun Jul 24, 2016 11:21 pm

The sensible thing is to play only with the default settings. Then the programmer must always code them correctly. He can make the program use the settings he wants, and make a new program when he changes a setting.

And if the programmer goes to the movies, he can't complain when he returns. Is the movie more important? NO!
