Pairwise Analysis of Chess Engine Move Selections

Dann Corbit · Post by **Dann Corbit** » Tue May 17, 2011 6:42 am

Adam Hair wrote: {snip}
Sven, could I ask you how you did that?

He either hit the Code button twice to surround the table or he manually pasted it in.

I think it would be interesting to see how well an engine correlates against itself. IOW, how often does Fruit guess Fruit's moves and how often does Rybka guess Rybka's moves, etc.

Sven · Post by **Sven** » Tue May 17, 2011 9:20 am

Adam Hair wrote:Sven, could I ask you how you did that?

I clicked "quote" and copied the table into "vi", an old but powerful UNIX text editor.

Sven

Adam Hair · Post by **Adam Hair** » Tue May 17, 2011 1:34 pm

Dann Corbit wrote:
Adam Hair wrote: {snip}
Sven, could I ask you how you did that?
He either hit the Code button twice to surround the table or he manually pasted it in.

I think it would be interesting to see how well an engine correlates against itself. IOW, how often does Fruit guess Fruit's moves and how often does Rybka guess Rybka's moves, etc.

I haven't fully investigated this, but I do know that Fruit picks the same moves about
95%, while for Rybka and Houdini it is about 75%. I can post more numbers this
evening. I have two sets of data where the engines had 100ms per position.

Adam Hair · Post by **Adam Hair** » Tue May 17, 2011 1:37 pm

Sven Schüle wrote:
Adam Hair wrote:Sven, could I ask you how you did that?
I clicked "quote" and copied the table into "vi", an old but powerful UNIX text editor.

Sven

I use Notepad++. I have not figured out how to align the columns before I post.

bob · Post by **bob** » Tue May 17, 2011 5:29 pm

Sven Schüle wrote:
Graham Banks wrote:
Adam Hair wrote:........ so it seems to me the fairest thing would be to test every engine above 2600 Elo (CCRL). Unfortunately, that is not proving to be possible at the moment.
Well your analysis has certainly changed my opinion to a degree.
In light of the fact that it seems that many engines might have heavily borrowed ideas from strong open source engines, I'm more amenable now to the the idea of testing Houdini as the Ippo family representative, and also to continuing testing Naraku.
It would seem pretty difficult now to tar one or two engines without having lingering doubts about others.
Must have a think about all this.
I must admit that this analysis is indeed very interesting, and I highly appreciate the effort. But I don't see how it changes anything substantial for the "engine origins" discussion (and as I understand Adam's statements this was also not his main intention). The key point of this analysis is that certain engines behave similar to each other regarding their move selection. There are several possible explanations for this similarity. A common explanation is that these engines are partially based on the same ideas and algorithms.

Some people may think that selecting the same moves as another engine means that code must have been copied from that other engine. But "borrowing ideas" should not be mixed up with "borrowing implementation (code)", and it is important to understand that selecting the same moves as another program *may* imply reuse of algorithms but *does not* imply reuse of code. (And, for those not being familiar with formal logic, it is also important to understand the meaning of "to imply" )

The former, "borrowing ideas", is fully o.k., even to a high extent, although one could argue that having two engines sharing the same ideas but implementing them differently is somewhat less "interesting" than having two engines with different ideas.

The latter, "borrowing implementation", is o.k. as long as copyright and license are not violated.

If your rule of thumb were that there should be only one representative from a "family" being included into testing activities, and if it is defined that an engine B belongs to the "family" of another (original) engine A if it is based on at least a certain percentage X of ideas and algorithms taken from A, then you would face these two problems:

- you would have to define the value of X, which is practically impossible;

- you would have to exclude a lot of engines from testing, including all Ippo* engines, Houdini, Fire, Strelka, Rybka, but - according to the data presented here - perhaps also Onno, Loop, Critter, Thinker, Delfi, Naraku, TwistedLogic, SmarThink, Hamsters, Pupsi, Daydreamer, Philou, Cyrano, ... - which obviously goes into the wrong direction.

Another problem would be, for consistency you would also have to exclude those open source engines which are legally derived from another engine in full GPL compliance, like Toga. But for which reason? Legally sharing implementation does of course imply sharing ideas and algorithms. This shows that the notion of "borrowing ideas" does not qualify as a guideline to decide about testing or not testing an engine.

In my opinion a better rule of thumb would be to continue concentrating on the cases where copyright/license violation is proven - difficult enough as it is, but looking more consistent to me at least.

This rule does not help much for the Ippolit case itself, though, since there is no clear agreement yet about the question whether heavily reusing code by reverse engineering a commercial program is considered as a copyright violation, and also whether such a heavy code reuse actually happened in case of Ippolit. In my opinion both get a "yes" but certainly others disagree.

As a second rule, I would still insist on not testing engines by anonymous authors.

Sven

I have said this previously. Now, Adam is exposing just the "tip" of the iceberg. I didn't see any significant surprises in the above. There are more. Many more.

bob · Post by **bob** » Tue May 17, 2011 5:31 pm

Dann Corbit wrote:
Graham Banks wrote: {snip other stuff}
I didn't mean to imply any copying of code by any of these engine authors. I agree that the use of ideas is obviously fine.
The main point that I was trying to make was that I now find it difficult to single out one or two engines in particular, given that Adam's analysis seems to show that the implementation of ideas from open source engines is fairly widespead.
Please correct me if you think I'm wrong in this assumption.

Cheers,
Graham.
In my opinion, there is nothing at all bad implied from using ideas from Fruit and Glaurung/Stockfish (so long as the borrowing of ideas is done legally). In fact, it is even virtuous to do so. What is the purpose of open source software if it is not to share ideas? If, in fact, you have an open source program and someone uses one of your algorithms and then you feel miffed, I puzzle at your attitude. Why open source then?

There is clearly virtue in sharing ideas. This can be through code, or through papers or discussions on chess programming boards or conversations at events or many other methods.

There is also nothing wrong with keeping your ideas to yourself. When the source is closed, it is up to the author to decide if:
1. He is going to expose his source code to the world
2. He is going to discuss his chess ideas with the world
3. He is going to keep things private for himself
4. He is going to give the engine away for free
5. He is going to sell the engine
etc.

On the other hand, there are right ways to do things and wrong ways to do things. There is nothing at all wrong with reading code, understanding the idea, and then making your own version [*]. There is something wrong with taking or using code in a way that specifically violates the license of the code. In my opinion, it is very hard to tell if this was done in a specific case since algorithms are not protected.

Specifically, this is NOT protected:

al·go·rithm [ álgə rìəm ] (plural al·go·rithms)

noun

Definition:

1. problem-solving procedure: a logical step-by-step procedure for solving a mathematical problem in a finite number of steps, often involving repetition of the same basic operation

[*] OK, patents, but fortunately, I do not know of any chess authors who have patented their ideas.

Now, with the so-called controversial engines, it is possible that the authors *may* have done something wrong. However, I think that we should abide by the principle "innocent until proven guilty" for all of them. We are not a formal body entrusted with power of law.

On the other, other hand, I do think that it is possible to form our own *opinions* about whether someone has *probably* done something wrong. There is nothing wrong with that either. So I can dictate my own personal choices according to opinions (guesses) that I have formed if I feel that they are factual enough to have merit. But if I tell others to do the same as my choice, I think that is an error.

IMO-YMMV

I know you really wish that we were talking about "borrowing ideas" which I agree is fine. But in reality, we are talking about something much closer to outright copying than borrowing ideas. Unfortunately. Borrowing "ideas" does not lead to identical results. The devil is in the implementation details, not in the idea itself.

"You can wish in one hand, crap in the other, and see which one fills up first."

--Burgess Meridith, Grumpy old men...

Jan Brouwer · Post by **Jan Brouwer** » Tue May 17, 2011 8:11 pm

Adam Hair wrote:Yes. Though, many of the other engines I have tested are actually earlier versions of engines listed above.

My main focus at this point is to see if every group of engines that tend to choose similar moves at a higher rate include an open source engine that preceded the other engines.

It may be interesting to include Rebel in your comparison because it is an engine with a detailed description of the ideas used on Ed's web pages, but with no access to the source code. If there happens to be an engine with a high similarity to Rebel, it may indicate a lower bound to the similarity achievable with copying code.

PS: there are methods which allow "cloning" of software where it can be proven to a certain degree that no copyright violation occurred, such as clean room design. It is possible that a (commercial) engine is developed by carefully observing the behaviour of a specific competitor without looking at its source code or without the source code even being available. This would almost by definition optimize the similarity of move selection without any copyright violation.
Could this happened with in the case of the Rybka clones?

Jan

Adam Hair · Post by **Adam Hair** » Tue May 17, 2011 11:25 pm

Jan Brouwer wrote:
Adam Hair wrote:Yes. Though, many of the other engines I have tested are actually earlier versions of engines listed above.

My main focus at this point is to see if every group of engines that tend to choose similar moves at a higher rate include an open source engine that preceded the other engines.
It may be interesting to include Rebel in your comparison because it is an engine with a detailed description of the ideas used on Ed's web pages, but with no access to the source code. If there happens to be an engine with a high similarity to Rebel, it may indicate a lower bound to the similarity achievable with copying code.

ProDeo is one of the engines that I am trying to include. Many winboard
engines are proving difficult to test. But, I think I know a way around the
problems I am having.

If only two or more closed source engines showed a high level of similarity
it would be an interesting result that would highlight an additional limitation
of this sort of comparison. Unfortunately, every such case that I have
found so far also involves an older open source engine.

Jan Brouwer wrote: PS: there are methods which allow "cloning" of software where it can be proven to a certain degree that no copyright violation occurred, such as clean room design. It is possible that a (commercial) engine is developed by carefully observing the behaviour of a specific competitor without looking at its source code or without the source code even being available. This would almost by definition optimize the similarity of move selection without any copyright violation.
Could this happened with in the case of the Rybka clones?

Jan

Dann Corbit · Post by **Dann Corbit** » Wed May 18, 2011 12:24 am

Adam Hair wrote:
Dann Corbit wrote:
Adam Hair wrote: {snip}
Sven, could I ask you how you did that?
He either hit the Code button twice to surround the table or he manually pasted it in.

I think it would be interesting to see how well an engine correlates against itself. IOW, how often does Fruit guess Fruit's moves and how often does Rybka guess Rybka's moves, etc.
I haven't fully investigated this, but I do know that Fruit picks the same moves about
95%, while for Rybka and Houdini it is about 75%. I can post more numbers this
evening. I have two sets of data where the engines had 100ms per position.

I think that will perhaps be very interesting (well, at least to me).
If we know how often (on average) an identical engine picks identical moves I think that will be something like a control.

I would also like to see a correlation based upon strength. For instance, suppose that engine A is 50 Elo stronger than engine B... Then run an experiment where A has one thread and B has two threads (or some similar way to try to match impedence). Perhaps correlations are *largely* a function of strength, and perhaps not.

Adam Hair · Post by **Adam Hair** » Wed May 18, 2011 4:38 am

Dann Corbit wrote:
Adam Hair wrote:
Dann Corbit wrote:
Adam Hair wrote: {snip}
Sven, could I ask you how you did that?
He either hit the Code button twice to surround the table or he manually pasted it in.

I think it would be interesting to see how well an engine correlates against itself. IOW, how often does Fruit guess Fruit's moves and how often does Rybka guess Rybka's moves, etc.
I haven't fully investigated this, but I do know that Fruit picks the same moves about
95%, while for Rybka and Houdini it is about 75%. I can post more numbers this
evening. I have two sets of data where the engines had 100ms per position.
I think that will perhaps be very interesting (well, at least to me).
If we know how often (on average) an identical engine picks identical moves I think that will be something like a control.

I would also like to see a correlation based upon strength. For instance, suppose that engine A is 50 Elo stronger than engine B... Then run an experiment where A has one thread and B has two threads (or some similar way to try to match impedence). Perhaps correlations are *largely* a function of strength, and perhaps not.

I tried to make an allowance for differences in strength by allocating the
amount of time each engine has for each position based on its estimated
Elo difference from Houdini 1.5. Time = 10ms * 2^(Elo diff/100) with
Houdini 1.5 getting 10ms for each position. The formula is a guesstimate
based on what I have read on CCC and the CCRL 40/4 rating list. In
previous tests, where each engine got the same amount of time, I noted
that there seemed to be some correlation based on strength (but not
much). So, for the data I am presenting now, I decided to use the
formula above. At some point, I will do regression analysis to determine
if there is any dependence on strength. But it does not seem to be evident
that there is a dependence in my data.

By the way, I've started a test run check self-similarity for some of the
engines. 11 engines, tested repeated 10 times for each engine. If I had
enough time and resources, I would do this for every engine tested. I will
share the results when it is done, possibly tomorrow night.

Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections

Re: Pairwise Analysis of Chess Engine Move Selections