2009 WCRCC: Bright/Spark issue

CRoberson · Post by **CRoberson** » Thu Aug 13, 2009 7:03 pm

The rules for entering a second program state that it must be significantly different. If we look at the previous programs entered by the same
authors, we see plenty of evidence that they are different. My programs (Telepath and NoonianChess) don't even finish at the same end of the crosstables.
In fact, some have joked that they have a practical bye when paired with NoonianChess. The other programs are by HGM. It is clear that they are different.
They even have a different purpose. micro-Max isn't even meant to be competitive; its goal is to be small in lines of code.

We have little data on Spark. I'll review what we do have. Spark is in OpenWar. http://www.open-aurec.com/chesswar/Open ... penwar.htm

When Allard first emailed me, Spark was in 32nd position and Bright was in the top 4. Now, Bright is in 4th and Spark is tied for 14th in
round 40. Now, for the details.
1) Bright and Spark have been paired with a draw result.
2) Spark has played all of the top 4.
3) Spark has a score of 32/40 and Bright has 36/40
4) 10 of the programs above Spark have not played the top 4. Given this and how Spark has done thus far, it will move up the list and close
the gap on Bright.
5) The Spark version in OpenWar is single proc while Bright is running on two processors.
6) Spark has beaten the 64 bit versions of Frenzee (not shabby).
7) Spark has beaten the 64 bit version of LearningLemming (not shabby).
8) Both programs are currently ranked above Spark.
9) LL hasn't played the top 4, thus it could fall in the rankings relative to Spark.

This data suggests that single proc Spark is very close in strength to dual proc Bright. In fact, there is virtually no data to suggest that
one is stronger than the other. Also, I am told that the new Spark version is SMP and Kenny has an 8 proc machine.

The data clearly shows that Spark and Bright are not sufficiently different in playing strength, and therefore I cannot allow both in the tournament.

So, one of two things will happen now:
1) Allard and team will decide which engine to drop.
2) I will decide which engine to drop.

I prefer that option 1 is followed. If we have to go with option 2 and I get too much feedback from the team,
then I will drop both. I prefer that this does not happen.

I have clearly shown that this decision is based on the data at hand and a rule that has been in the ACCA tournaments since their beginning.
This decision doesn't have any personal issues in it at all.

Please, Allard and team take option 1 so that I don't have to be involved with this issue anymore.

Christopher Conkie · Post by **Christopher Conkie** » Thu Aug 13, 2009 7:05 pm

Good luck with the tournament Charles.

Christopher

hgm · Post by **hgm** » Thu Aug 13, 2009 7:30 pm

This is really strange. The rules published in advance say that you can have multiple entries (making this a tournament for engines, rather than authors). It said nothing about a minimal Elo difference, or a maximum Elo for the second program. Using that now as an argument is the same as when we would refuse Rybka to enter as an after-thought because it is too strong... That would also be a good move to enhance the winning probability of other participants, which seems the only reason that drives this decision.

The key should be how different the programs are. Who wrote them should not play any role whatsoever, because of the announced rules. That two programs cannot participate if they are too similar is obvious, and also true if they are by different authors. If I would enter with something that is 90% identical to Fruit, it would be refused (and I would be accused of cloning on top of that). That is how it should be.

But an engine that would have been allowed as a participant when it was written by someone else, because it was not sufficiently similar to any open-source program to be considered a clone, should also be allowed to enter if it was written by the same author. The judgement must be purely made on the similarity. Is Spark a clone of Bright or not?

bob · Post by **bob** » Thu Aug 13, 2009 7:46 pm

My take is this is all irrelevant. It doesn't matter _how_ different the two programs are, that is a subjective measurement that leads to pointless arguments.

What possible justification is there for entering two programs? If someone wants to write 2 or more, that's fine. But they should enter the best, or the one they want to test, and not cloud the issue with more than one. I have several hundred old versions of Crafty. It would be easy to make the case that Crafty version 13.10 (I think that was the Jakarta WMCCC version) is _far_ different from version 23.1 in use today. Ditto for the Cray Blitz version we have available. Ditto for Crafty version 8.x and so forth.

What is the reasoning to allow more than one entrant per author? Yes, one extra program is nice to avoid byes, but other than that? What is the limit on the number? How is this decided? What are the criteria for determining the "significantly different" requirement? How is it measured? Etc.

Big can of worms. Serving no useful purpose.

For derivatives, one should enter. Fruit or Toga are OK, so long as just one enters. Otherwise this degenerates into a quantity above quality event which is not, IMHO, very interesting. If I enter a hundred versions of Crafty, my chances for winning will be astronomical. Does a 3-sigma event mean anything, however??? Only if you do enough trials so that one is likely. We ought to be avoiding that rather than encouraging it.
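To put rough numbers on the multiple-trials point, here is a minimal sketch. It assumes each entry's tournament performance is an independent, normally distributed result, which is a simplification (versions of the same engine are certainly correlated). The function names `upper_tail` and `chance_of_at_least_one` are illustrative only.

```python
import math


def upper_tail(sigmas: float) -> float:
    """One-sided probability that a normal variable lands more than
    `sigmas` standard deviations above its mean."""
    return 0.5 * math.erfc(sigmas / math.sqrt(2.0))


def chance_of_at_least_one(p_single: float, entries: int) -> float:
    """Probability that at least one of `entries` independent entries
    produces an event that each one produces with probability p_single."""
    return 1.0 - (1.0 - p_single) ** entries


if __name__ == "__main__":
    p = upper_tail(3.0)  # chance of a single 3-sigma over-performance
    for n in (1, 2, 10, 100):
        print(f"{n:3d} entries: {chance_of_at_least_one(p, n):.4f}")
```

Under these assumptions a single entry has roughly a 0.1% chance of a 3-sigma result, while a hundred entries together have a chance of about one in eight that some entry gets one.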

Christopher Conkie · Post by **Christopher Conkie** » Thu Aug 13, 2009 7:51 pm

bob wrote:My take is this is all irrelevant. It doesn't matter _how_ different the two programs are, that is a subjective measurement that leads to pointless arguments.

What possible justification is there for entering two programs? If someone wants to write 2 or more, that's fine. But they should enter the best, or the one they want to test, and not cloud the issue with more than one. I have several hundred old versions of Crafty. It would be easy to make the case that Crafty version 13.10 (I think that was the Jakarta WMCCC version) is _far_ different from version 23.1 in use today. Ditto for the Cray Blitz version we have available. Ditto for Crafty version 8.x and so forth.

What is the reasoning to allow more than one entrant per author? Yes, one extra program is nice to avoid byes, but other than that? What is the limit on the number? How is this decided? What are the criteria for determining the "significantly different" requirement? How is it measured? Etc.

Big can of worms. Serving no useful purpose.

For derivatives, one should enter. Fruit or Toga are OK, so long as just one enters. Otherwise this degenerates into a quantity above quality event which is not, IMHO, very interesting. If I enter a hundred versions of Crafty, my chances for winning will be astronomical. Does a 3-sigma event mean anything, however??? Only if you do enough trials so that one is likely. We ought to be avoiding that rather than encouraging it.

Does that mean I can operate Glaurung 2.7?

(not trying to make your job harder btw)

Christopher

michiguel · Post by **michiguel** » Thu Aug 13, 2009 7:55 pm

hgm wrote:This is really strange. The rules published in advance say that you can have multiple entries (making this a tournament for engines, rather than authors). It said nothing about a minimal Elo difference, or a maximum Elo for the second program. Using that now as an argument is the same as when we would refuse Rybka to enter as an after-thought because it is too strong... That would also be a good move to enhance the winning probability of other participants, which seems the only reason that drives this decision.

The key should be how different the programs are. Who wrote them should not play any role whatsoever, because of the announced rules. That two programs cannot participate if they are too similar is obvious, and also true if they are by different authors. If I would enter with something that is 90% identical to Fruit, it would be refused (and I would be accused of cloning on top of that). That is how it should be.

But an engine that would have been allowed as a participant when it was written by someone else, because it was not sufficiently similar to any open-source program to be considered a clone, should also be allowed to enter if it was written by the same author. The judgement must be purely made on the similarity. Is Spark a clone of Bright or not?

The rules are also clear that the organizer's decision is final. There is no way to come to a perfect decision about this and in borderline cases it may require a judgment call. I do not believe it really deserves a huge deal. If Charles has any doubt in this case and likes to err on the safe side of making sure that the programs are more different, then so be it.

Congratulations Charles, all this is a signal of your success.

It means people are starting to show interest in the ACCA tournaments.

"Ladran Sancho, señal que cabalgamos" Don Quijote de la Mancha.

("Sancho, they are barking; it means we are riding")

Miguel
PS: I wish I could participate, but I cannot guarantee to be in front of the computer all the time

hgm · Post by **hgm** » Thu Aug 13, 2009 8:09 pm

michiguel wrote:The rules are also clear that the organizer's decision is final. ...

You left out the qualification "in case of a dispute". I don't think there can be any dispute that the rules allow this?

I don't think it is very good for the credibility of a tournament when the TD violates _any_ rule. That Bob thinks the rules are bad is as relevant as a murderer defending himself in court with the argument that he thinks the law should be changed to allow killing.

CRoberson · Post by **CRoberson** » Thu Aug 13, 2009 8:11 pm

michiguel wrote:
hgm wrote:This is really strange. The rules published in advance say that you can have multiple entries (making this a tournament for engines, rather than authors). It said nothing about a minimal Elo difference, or a maximum Elo for the second program. Using that now as an argument is the same as when we would refuse Rybka to enter as an after-thought because it is too strong... That would also be a good move to enhance the winning probability of other participants, which seems the only reason that drives this decision.

The key should be how different the programs are. Who wrote them should not play any role whatsoever, because of the announced rules. That two programs cannot participate if they are too similar is obvious, and also true if they are by different authors. If I would enter with something that is 90% identical to Fruit, it would be refused (and I would be accused of cloning on top of that). That is how it should be.

But an engine that would have been allowed as a participant when it was written by someone else, because it was not sufficiently similar to any open-source program to be considered a clone, should also be allowed to enter if it was written by the same author. The judgement must be purely made on the similarity. Is Spark a clone of Bright or not?
The rules are also clear that the organizer's decision is final. There is no way to come to a perfect decision about this and in borderline cases it may require a judgment call. I do not believe it really deserves a huge deal. If Charles has any doubt in this case and likes to err on the safe side of making sure that the programs are more different, then so be it.

Congratulations Charles, all this is a signal of your success. It means people are starting to show interest in the ACCA tournaments.

"Ladran Sancho, señal que cabalgamos" Don Quijote de la Mancha.

("Sancho, they are barking; it means we are riding")

Miguel
PS: I wish I could participate, but I cannot guarantee to be in front of the computer all the time

Hi Miguel,

One doesn't have to stay in front of the computer. The requirement is that you can get the computer or program back up quickly if there is a problem. Thus, you only need to periodically check it and possibly be there at the beginning of each round.

Zach Wegner · Post by **Zach Wegner** » Thu Aug 13, 2009 8:11 pm

Two points:

1. I don't really agree with this decision. I don't think the strength of an engine really has much to do with its similarity to another. The rules were clear, and I think it is also clear that Bright and Spark are substantially different. Changing it at the last minute is unfair (though of course entirely within Charles' authority as TD).

2. That being said, I don't really agree with the rule in the first place. It's sometimes fun to beat up on Noonian or umax, but most of the time it's just boring. Also, as GCP pointed out, it doesn't increase the number of authors participating, which is IMO the most important part of the tournament. Add to that the fact that the author of the two programs in question won't be there at all, which is quite a shame.

Zach Wegner · Post by **Zach Wegner** » Thu Aug 13, 2009 8:18 pm

CRoberson wrote: Hi Miguel,

One doesn't have to stay in front of the computer. The requirement is that you can get the computer or program back up quickly if there is a problem. Thus, you only need to periodically check it and possibly be there at the beginning of each round.

Actually, I will be gone for a couple of hours on Saturday. I'm not sure how I'll handle this. I think starting the rounds should be fine, as either the opponent can match or the ICC admin can spoof me. I'm a bit concerned about crashes though. ZCT has a tendency to lock up every once in a while. One possibility is to write a script that detects when fewer than 4 CPUs are running and, if so, kicks the session and logs back in. Not exactly easy to test though, since I don't want to run on ICC and kick myself, possibly getting banned (again).

Also, if it were to re-login, wouldn't it have to send "resume" to the server? Or can an admin handle this?
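A rough sketch of what such a watchdog could look like, under several assumptions: the machine is Unix-like, the engine keeps about four cores busy while it is running, and the 1-minute load average is a good enough proxy for "CPUs running". The names `looks_hung` and `watch` and the `restart` callback are hypothetical. The actual kick and re-login step is left as a placeholder because it depends on the ICC interface.

```python
import os
import time
from typing import Callable, Sequence


def looks_hung(busy_cpu_samples: Sequence[float],
               expected_cpus: int = 4,
               slack: float = 0.5) -> bool:
    """Return True only if every recent sample shows clearly fewer busy
    CPUs than expected, so a single quiet reading does not trigger a restart."""
    if not busy_cpu_samples:
        return False
    return all(s < expected_cpus - slack for s in busy_cpu_samples)


def watch(restart: Callable[[], None],
          samples_needed: int = 3,
          interval_s: float = 60.0) -> None:
    """Poll the load average forever and call `restart` (a hypothetical
    kick/re-login hook supplied by the caller) when the engine looks hung."""
    recent = []
    while True:
        recent.append(os.getloadavg()[0])  # 1-minute load average, Unix only
        recent = recent[-samples_needed:]
        if len(recent) == samples_needed and looks_hung(recent):
            restart()
            recent.clear()  # start a fresh window after restarting
        time.sleep(interval_s)
```

Keeping the decision in `looks_hung` separate from the polling loop means the logic can be checked offline with made-up samples, without logging in to ICC at all. Load average also counts other processes and would drop whenever the engine is legitimately idle, so the thresholds here are guesses that would need tuning.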
