Adam Hair's article on Pairwise comparison of engines

CRoberson
Posts: 1988
Joined: Mon Mar 13, 2006 1:31 am
Location: North Carolina, USA

Adam Hair's article on Pairwise comparison of engines

Post by CRoberson » Tue May 19, 2015 6:29 pm

Adam,

I scanned the article. Nice.

My thoughts concern: "There are 4851 pairs. The average matched move percentage was 45.16, the standard deviation of the data was 2.86. "

Looks like you included known clones/derivatives in the calculation of the standard deviation. I think their inclusion biases the standard deviation. Of course, it makes it smaller. Shouldn't the standard deviation be calculated without the known clones? It would make the number larger (maybe only slightly).

Dann Corbit
Posts: 9894
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA

Re: Adam Hair's article on Pairwise comparison of engines

Post by Dann Corbit » Wed May 20, 2015 3:14 am

CRoberson wrote:Adam,

I scanned the article. Nice.

My thoughts concern: "There are 4851 pairs. The average matched move percentage was 45.16, the standard deviation of the data was 2.86. "

Looks like you included known clones/derivatives in the calculation of the standard deviation. I think their inclusion biases the standard deviation. Of course, it makes it smaller. Shouldn't the standard deviation be calculated without the known clones? It would make the number larger (maybe only slightly).
In order to do that you have to presuppose the result you hope to establish (assuming that you do not really know every single clone).

An interesting experiment would be to take known clones and see the match rate and standard deviations.

A further interesting experiment would be to find the highest match rate and standard deviations for engines known definitely not to be clones.

Without doing these experiments, I wonder what the controls are.
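The two control experiments suggested above could be sketched roughly as follows. This is only an illustration: the engine names, the match percentages, and the `known_clones` set are all made up, and `summarize` is a hypothetical helper, not part of any tool discussed in this thread.

```python
import statistics

def summarize(rates):
    """Return basic statistics for a list of match percentages."""
    return {
        "n": len(rates),
        "mean": statistics.mean(rates),
        "sd": statistics.pstdev(rates),
        "max": max(rates),
    }

# Hypothetical pairwise match percentages, keyed by unordered engine pair.
match_rates = {
    frozenset(("EngineA", "EngineB")): 44.8,
    frozenset(("EngineA", "EngineC")): 46.1,
    frozenset(("EngineB", "EngineC")): 43.9,
    frozenset(("EngineA", "EngineA_clone")): 68.5,
    frozenset(("EngineB", "EngineB_clone")): 72.0,
}

# Pairs assumed (for this sketch only) to be established clones.
known_clones = {
    frozenset(("EngineA", "EngineA_clone")),
    frozenset(("EngineB", "EngineB_clone")),
}

clone_rates = [r for pair, r in match_rates.items() if pair in known_clones]
other_rates = [r for pair, r in match_rates.items() if pair not in known_clones]

clone_summary = summarize(clone_rates)   # control 1: known clones
other_summary = summarize(other_rates)   # control 2: pairs treated as non-clones

print("known clones:", clone_summary)
print("other pairs: ", other_summary)
```

The highest match rate among the non-clone pairs (`other_summary["max"]`) is the figure the second experiment asks for. Note that the "other" group here is simply every pair not listed as a clone, which is weaker than "known definitely not to be clones".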

Adam Hair
Posts: 3201
Joined: Wed May 06, 2009 8:31 pm
Location: Fuquay-Varina, North Carolina

Re: Adam Hair's article on Pairwise comparison of engines

Post by Adam Hair » Fri May 22, 2015 12:07 am

CRoberson wrote:Adam,

I scanned the article. Nice.
Thanks, Charles.
CRoberson wrote: My thoughts concern: "There are 4851 pairs. The average matched move percentage was 45.16, the standard deviation of the data was 2.86. "

Looks like you included known clones/derivatives in the calculation of the standard deviation. I think their inclusion biases the standard deviation. Of course, it makes it smaller. Shouldn't the standard deviation be calculated without the known clones? It would make the number larger (maybe only slightly).
Actually, their inclusion makes the standard deviation from the mean larger.
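A small numeric sketch of why that is: clone pairs sit far above the bulk of the data, so they add large squared deviations from the mean. The percentages below are made up for illustration and are not taken from the article. The only fact used from the thread is the count of 4851 pairs, which equals the number of pairs among 99 engines.

```python
import math
import statistics

# 4851 pairs is what you get from comparing 99 engines pairwise: C(99, 2).
pairs_among_99 = math.comb(99, 2)

# Made-up match percentages for unrelated engine pairs (illustrative only).
unrelated = [42.0, 43.5, 44.0, 45.0, 45.5, 46.0, 47.0, 48.0]

# Made-up match percentages for clone/derivative pairs, far above the bulk.
clones = [62.0, 71.0]

# Population standard deviation of each data set.
sd_without = statistics.pstdev(unrelated)
sd_with = statistics.pstdev(unrelated + clones)

print(f"pairs among 99 engines: {pairs_among_99}")
print(f"SD without clone pairs: {sd_without:.2f}")
print(f"SD with clone pairs:    {sd_with:.2f}")
```

In general, adding a data point increases the standard deviation when it lies more than roughly one standard deviation from the current mean, which is the case for pairs with unusually high match rates.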

ernest
Posts: 1851
Joined: Wed Mar 08, 2006 7:30 pm

Re: Adam Hair's article on Pairwise comparison of engines

Post by ernest » Fri May 22, 2015 4:58 pm

CRoberson wrote: I scanned the article.
Would be nice to know what article !!!

Robert Pope
Posts: 499
Joined: Sat Mar 25, 2006 7:27 pm

Re: Adam Hair's article on Pairwise comparison of engines

Post by Robert Pope » Fri May 22, 2015 5:59 pm

I believe it is: http://www.top-5000.nl/clone.htm


ernest
Posts: 1851
Joined: Wed Mar 08, 2006 7:30 pm

Re: Adam Hair's article on Pairwise comparison of engines

Post by ernest » Fri May 22, 2015 7:04 pm

Thanks !

Graham Banks
Posts: 32881
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: Adam Hair's article on Pairwise comparison of engines

Post by Graham Banks » Fri May 22, 2015 8:53 pm

Robert Pope wrote:I believe it is: http://www.top-5000.nl/clone.htm
Interesting to note that Rybka and Fruit don't show up as a pair to be suspicious about.
My email addresses:
gbanksnz at gmail.com
gbanksnz at yahoo.co.nz

Rebel
Posts: 4541
Joined: Thu Aug 18, 2011 10:04 am

Re: Adam Hair's article on Pairwise comparison of engines

Post by Rebel » Fri May 22, 2015 11:08 pm

Graham Banks wrote:
Robert Pope wrote:I believe it is: http://www.top-5000.nl/clone.htm
Interesting to note that Rybka and Fruit don't show up as a pair to be suspicious about.
It's an indication, but not more than that. The SYM tool can detect a clone, and even a close derivative, with pretty good precision. It cannot prove an engine is clean.

Roger Brown
Posts: 782
Joined: Wed Mar 08, 2006 8:22 pm

Re: Adam Hair's article on Pairwise comparison of engines

Post by Roger Brown » Sat May 23, 2015 3:53 am

Graham Banks wrote:
Robert Pope wrote:I believe it is: http://www.top-5000.nl/clone.htm
Interesting to note that Rybka and Fruit don't show up as a pair to be suspicious about.

Hello Graham,

You should be cautious when referring to other works.

The information presented doesn't attempt to validate or invalidate suspicions, so it really isn't as interesting as you think it is.

Later


The need to avoid false accusation is greater than the need to determine authors who break the rules slightly. In other words, it is better to let lesser offenders slip through than to make accusations against innocent authors.

This tool should not be used solely for determining derivatives and clones. Other methods should be used in conjunction with this tool. Ultimately, any accusation of cloning requires an examination of the code of the accused author.

Norm Pollock
Posts: 1017
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

Re: Adam Hair's article on Pairwise comparison of engines

Post by Norm Pollock » Sat May 23, 2015 3:54 am

Adam,

Would it be unreasonable to test a 2nd copy of each engine to establish consistency?

I think you need to have some assurance that each engine is strongly consistent and will produce the same move from the same position the great majority of the time if given a 2nd chance.

After running a second copy through the 8000+ positions, the two runs of each engine can be compared. If the runs don't have at least 95% matched moves, then I would not consider the engine consistent and would possibly disqualify the engine from the test.
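A minimal sketch of such a self-consistency check, assuming each run is stored as a list of chosen moves in the same position order. The function names, the sample moves, and the data layout are hypothetical. Only the 95% threshold comes from the proposal above.

```python
def match_percentage(moves_a, moves_b):
    """Percentage of positions where both runs chose the same move."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both runs must cover the same positions")
    if not moves_a:
        raise ValueError("no positions to compare")
    same = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * same / len(moves_a)

def is_consistent(run1, run2, threshold=95.0):
    """True if two runs of the same engine agree on at least `threshold` percent."""
    return match_percentage(run1, run2) >= threshold

# Tiny made-up example: two runs of one engine over four positions.
run1 = ["e2e4", "g1f3", "d2d4", "c2c4"]
run2 = ["e2e4", "g1f3", "d2d4", "b1c3"]

print(match_percentage(run1, run2))
print(is_consistent(run1, run2))
```

With only four positions, a single differing move already drops the match rate well below the threshold. Over 8000+ positions the percentage would be far less sensitive to individual moves.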

-Norm
