More proof of Strelka cloning

MartinBryant

More proof of Strelka cloning

Post by MartinBryant »

On Sunday I first saw Uri's positional posts about Strelka reproducing Rybka analysis.
I was impressed by the objective work he had done and found his conclusions logical and convincing.

I then spent some time looking at other areas of Strelka and Rybka before making my own original post agreeing with Uri's conclusion.
I have now documented my own findings below.

The area I focussed on was nothing to do with position analyses but something more intimate to an engine writer.
I was particularly interested in the FORMAT of the UCI messages being generated by the engines.
Although these messages are not usually seen by most people, they are second nature to the engine writer.
Apologies if the data below then looks like gobbledegook to most people but it is what is given in the standard and I will try to explain the data with some examples as I go on. (Apologies to the engine writers who know all this stuff backwards.)

Now, although UCI is a 'standard', it is very loose in places, allowing the engine writer much variety in interpretation.
This leads to a wonderful diversity of engine output that GUIs have to cope with as writers 'express' their individuality.

The particular message I will focus on here is the UCI 'info' message which an engine sends repeatedly (sometimes hundreds of times) to the GUI as its search progresses. The info message contains a number of key/value pairs, telling the GUI things like how many nodes the engine has searched, what move it is currently searching, the best line found so far etc.

An example...
info depth 2 score cp 11 nodes 259 time 1 pv e2e4 e7e5

Here the engine is telling the GUI that it is currently on iteration 2, has found a best line of e2e4 e7e5 with a score of +11 centipawns, using 259 nodes and taking 1 millisecond.
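
To make the key/value structure concrete, here is a minimal Python sketch of how a GUI might split such a line. The names parse_info and INFO_KEYS are illustrative only and are not taken from any engine or GUI; the key names are the ones the UCI standard defines for 'info'.

```python
# Key names defined for the UCI "info" command.
INFO_KEYS = {
    "depth", "seldepth", "time", "nodes", "pv", "multipv", "score",
    "currmove", "currmovenumber", "hashfull", "nps", "tbhits", "sbhits",
    "cpuload", "string", "refutation", "currline",
}


def parse_info(line):
    """Split a UCI info line into {key: [value tokens]}, in the order sent."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        raise ValueError("not an info line: %r" % line)
    fields = {}
    key = None
    for i, tok in enumerate(tokens[1:], start=1):
        if tok == "string":
            # "string" swallows the rest of the line as free text.
            fields["string"] = tokens[i + 1:]
            break
        if tok in INFO_KEYS:
            key = tok
            fields[key] = []
        elif key is not None:
            fields[key].append(tok)
    return fields


example = parse_info("info depth 2 score cp 11 nodes 259 time 1 pv e2e4 e7e5")
print(example)
```

Because Python dictionaries keep insertion order, the parsed result also remembers the order in which the engine sent its keys, which is the property the rest of this post is about.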

There are many other keys available, such as currmove, nps, hashfull, etc., all documented in the UCI standard.

However, the UCI standard does NOT specify what ORDER the keys should be sent to the GUI. Nor does the engine have to send all keys; it can send just those it chooses to, depending usually on the stage of the search.

Now the info messages generally sent can be put quite easily into five categories...
1) Intro message
Often sent at the start of an iteration, usually something simple like...
e.g. info depth 10
"I am starting iteration 10"

2) Progress message
Sent at unspecified intervals during the search, usually something about the current move...
e.g. info currmove g1f3 currmovenumber 14
"I am starting to search move g1f3 which is 14th in my root movelist"

3) PV message
The engine is announcing a new best move...
e.g. info depth 2 score cp 11 nodes 259 time 1 pv e2e4 e7e5
(Explanation above)

4) Outro message
Sent at the end of an iteration, usually some summary info...
e.g. info depth 10 nodes 456789 time 1234
"I have finished iteration 10 and I searched 456789 nodes in 1234 milliseconds"

5) Miscellaneous messages
These can vary GREATLY and are entirely down to the programmer.
Some engines print a special message only after the last iteration just before announcing their best move.
Some engines print extra progress messages maybe using some of the more esoteric keys like cpuload.


Now below I have documented the messages produced by a number of established engines, listing their name followed by the keys they send for their messages in the first four categories (documenting the fifth category would be nigh on impossible).

As a first example...

Alaric
depth
currmove currmovenumber
depth score nodes time pv
nodes time nps

So Alaric produces an intro message in the format "info depth X" to start an iteration, occasionally producing progress messages in the format "info currmove XXXX currmovenumber X". When it finds a best move it sends a message in the format "info depth X score cp X nodes X time X pv X X X X" and ends its iterations with an "info nodes X time X nps X"
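
For anyone wanting to reproduce this kind of listing, the format of a message can be pulled out of a logged line by keeping the key names and throwing the values away. This is only a sketch: key_order is a made-up helper name, and the sample lines use invented values in Alaric's four formats rather than real engine output.

```python
# Key names defined for the UCI "info" command.
INFO_KEYS = {
    "depth", "seldepth", "time", "nodes", "pv", "multipv", "score",
    "currmove", "currmovenumber", "hashfull", "nps", "tbhits", "sbhits",
    "cpuload", "string", "refutation", "currline",
}


def key_order(line):
    """Return the info keys of one message, in the order they were sent."""
    order = []
    for tok in line.split()[1:]:  # skip the leading "info"
        if tok in INFO_KEYS:
            order.append(tok)
            if tok == "string":
                break  # the rest of the line is free text
    return " ".join(order)


# Invented sample lines, one per category (intro, progress, PV, outro).
samples = [
    "info depth 10",
    "info currmove g1f3 currmovenumber 14",
    "info depth 2 score cp 11 nodes 259 time 1 pv e2e4 e7e5",
    "info nodes 456789 time 1234 nps 370169",
]
for line in samples:
    print(key_order(line))
```

Run over one line per category, this yields the same four-line summary as the Alaric entry above.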

Below are another 21 engines' outputs...
(I have written 'none' where an engine produces no message in that category.)
Note that it's not the data values that are of any relevance here, just the message FORMATS.

AnMon
depth
currmove currmovenumber
score depth nodes time pv
none

Booot
currmove currmovenumber
none
depth time nodes score pv
none

Colossus
depth seldepth
depth currmove currmovenumber nodes time
depth time nodes score pv
depth nps nodes time

Delfi
none
none
depth score time nodes pv
none

Dragon
depth nodes
currmovenumber currmove
score depth nodes time pv
none

Fruit
depth
currmove currmovenumber
depth seldepth score time nodes pv
depth seldepth time nodes nps

Glaurung
depth
currmove currmovenumber
multipv depth score time nodes nps pv
none

Hermann
none
currmove currmovenumber
depth seldepth time nodes pv score hashfull nps
none

Ktulu
none
currmove currmovenumber depth nodes time nps
depth score time nodes nps pv
none

List
none
depth nodes currmove currmovenumber
score depth nodes time pv
none

Movei
none
currmove currmovenumber
depth score time nodes pv
repeats last pv message with latest data

Muse
none
currmove currmovenumber
none
depth seldepth score nodes nps time pv full-stop!

Naum
depth seldepth
currmove currmovenumber nodes nps
score depth seldepth time nodes nps pv
none

Nejmet
none
currmove currmovenumber
depth seldepth score time nodes nps pv
repeats last pv message with latest data

Pepito
none
currmove currmovenumber
score depth nodes time nps hashfull tbhits pv
score depth nodes time tbhits pv

Pharaon
depth seldepth
nps hashfull nodes currmove currmovenumber
score depth nodes nps time pv hashfull
none

Ruffian
none
depth seldepth nodes hashfull nps currmove currmovenumber
depth seldepth nodes nps time score pv
none

Rybka
depth
currmove currmovenumber
depth score time nodes nps pv
depth time nodes nps

SOS
depth seldepth
currmove currmovenumber
score depth seldepth nodes time pv
depth seldepth

Ufim
none
depth nodes time nps currmove currmovenumber hashfull
depth score time nodes pv
none

Yace
none
currmove currmovenumber depth seldepth
depth score nodes time pv
repeats last pv message with latest data


As you can see the total outputs are ALL DIFFERENT, yet they are all valid to the UCI standard!
(Well, actually apart from Muse, which sends a spurious full-stop at the end of its pv?!)
No engine produces exactly the same format messages in all categories as any other!

Now this is EXACTLY to be expected.
The old IT manager's adage... "You give a hundred programmers the same spec and you get back a hundred different programs!" is absolutely true.
Programming is more an art form than an exact science. You would EXPECT every engine's message formats to be different.


So let's finally look at Strelka.

Strelka
depth
currmove currmovenumber
depth score time nodes nps pv
depth time nodes nps

Oh, but now... this isn't unique?! But it isn't like open-source Fruit or Glaurung, or any of the other engines, is it?
As you can see, the basic format of all of its UCI info messages is identical to that of none other than Rybka!
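
The comparison can also be done mechanically. The sketch below is a hypothetical helper, not something used to produce the lists: it holds a handful of the entries transcribed from the lists above (not all 22, to keep it short), each as a tuple of the four categories, and groups engines whose tuples are identical.

```python
from collections import defaultdict

# (intro, progress, PV, outro) key orders, transcribed from the lists above.
FORMATS = {
    "Alaric": ("depth", "currmove currmovenumber",
               "depth score nodes time pv", "nodes time nps"),
    "Glaurung": ("depth", "currmove currmovenumber",
                 "multipv depth score time nodes nps pv", "none"),
    "Ktulu": ("none", "currmove currmovenumber depth nodes time nps",
              "depth score time nodes nps pv", "none"),
    "Rybka": ("depth", "currmove currmovenumber",
              "depth score time nodes nps pv", "depth time nodes nps"),
    "Strelka": ("depth", "currmove currmovenumber",
                "depth score time nodes nps pv", "depth time nodes nps"),
}


def identical_groups(formats):
    """Return groups of engines whose formats match in all four categories."""
    groups = defaultdict(list)
    for name, fmt in formats.items():
        groups[fmt].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def same_pv_as(formats, engine):
    """Return engines that share only the PV message format with `engine`."""
    pv = formats[engine][2]
    return sorted(n for n, f in formats.items() if f[2] == pv and n != engine)


print(identical_groups(FORMATS))     # [['Rybka', 'Strelka']]
print(same_pv_as(FORMATS, "Rybka"))  # ['Ktulu', 'Strelka']
```

Note that a match in a single category does happen between unrelated engines (Ktulu's PV message has the same key order as Rybka's); it is the match across all four categories that singles out Strelka.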

Not only is this thing definitely a clone of Rybka, it's not even a well-disguised clone of Rybka!!
fern
Posts: 8755
Joined: Sun Feb 26, 2006 4:07 pm

Re: More proof of Strelka cloning

Post by fern »

Interesting, but let me ask about and/or comment on a couple of things:
a) There are not 12, but hundreds of engines. What if you examine all of them? Are you sure none of them will show the same output of UCI messages to the GUI?
b) In fact, in your examination there are not many messages. Typically score, number of nodes, time spent, iteration depth, PV line and two or three more. With so few variables, is a coincidence in the pattern really so difficult or impossible? With five elements you have 5x4x3x2x1 = 120 possible orders. The chance of getting the same one is then 1/120 if there are only two engines to compare. But if we have hundreds of chances, then what....?
Even more: what if, as it happens, every new engine is made taking into account what its predecessors have done?
Please correct me if my maths are preposterous or I missed something. I do not have a settled position; I want to know for certain.
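
The arithmetic in (b) can be checked with a short Python sketch. It rests on a strong assumption that is itself in question in this thread: that every author picks one of the orders uniformly at random and independently of everyone else. The function names are made up for illustration, and the calculation covers a single five-key message, not all four message categories together.

```python
import math


def orders(n_keys):
    """Number of ways to order n_keys distinct keys."""
    return math.factorial(n_keys)


def p_any_match(n_orders, n_engines):
    """Chance that at least one of n_engines hits one given order,
    ASSUMING each picks uniformly at random and independently."""
    return 1.0 - (1.0 - 1.0 / n_orders) ** n_engines


n = orders(5)
print(n)
for engines in (1, 21, 200):
    print(engines, round(p_any_match(n, engines), 3))
```

Under that assumption the chance of an accidental match for one five-key message comes out at roughly 16% across 21 other engines and roughly 81% across 200. The question being asked is a narrower one, though: whether one particular engine matches one particular other engine in every category.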

My best and still waiting....
Fernando
Christopher Conkie
Posts: 6073
Joined: Sat Apr 01, 2006 9:34 pm
Location: Scotland

Re: More proof of Strelka cloning

Post by Christopher Conkie »

MartinBryant wrote: On Sunday I first saw Uri's positional posts about Strelka reproducing Rybka analysis. [...]
I have saved your post.

8-)

Christopher
MartinBryant

Re: More proof of Strelka cloning

Post by MartinBryant »

To an extent you are right fernando.
It is not impossible that another engine that I have not looked at could, completely by chance, use the same format as Rybka. However, we are not trying to prove some other engine is a clone, just Strelka. And lo and behold, Strelka DOES produce the same format messages. A coincidence? I think not. How many similarities do there have to be before someone accepts it's a clone? I guess it depends on the individual.
But if it looks like a duck, it walks like a duck and it quacks like a duck, it's probably a duck!

Also there are actually now 17 possible keys on the info command making a staggering 355687428096000 permutations (I think Windows calculator has actually overflowed there?), although admittedly some of them are very obscure. Also there are the 4 categories considered so x4 too.
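
The figure is easy to verify in Python, which has arbitrary-precision integers. The sketch below also counts a smaller quantity that may be more relevant, on the assumption that what matters is the ordering of the keys a message actually uses rather than of all 17; the six-key list is Rybka's PV message as listed earlier in the thread.

```python
import math

ALL_KEYS = 17  # number of keys defined for the info command

# Orderings of all 17 keys: this is the figure quoted above.
print(math.factorial(ALL_KEYS))  # 355687428096000

# Ordered selections of just the keys one message uses (Python 3.8+).
rybka_pv = "depth score time nodes nps pv".split()
print(math.perm(ALL_KEYS, len(rybka_pv)))  # 8910720
```

So the calculator did not overflow: 355687428096000 is exactly 17!. Even the much smaller count for a six-key message still runs to several million possible orderings.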

I can understand someone wanting to learn actual playing code techniques from others but there is nothing to be learned from simple messages.
AGove

Re: More proof of Strelka cloning

Post by AGove »

Let's suppose that I write the code for a new engine. I take as my inspiration the best engines around, and I deliberately copy the way that UCI info messages are displayed from someone else's engine - perhaps because I think it's a good exemplar. So what? Does that make my engine a clone? No. In this example I haven't done anything remotely wrong so far.

If I were a cloner it would be for reasons other than copying a UCI message format, and for which evidence would be different to the mere display of info messages.

By all means let's share evidence of engine cloning, but it has to be of reasonable quality.
Christopher Conkie
Posts: 6073
Joined: Sat Apr 01, 2006 9:34 pm
Location: Scotland

Re: More proof of Strelka cloning

Post by Christopher Conkie »

AGove wrote:Let's suppose that I write the code for a new engine. I take as my inspiration the best engines around, and I deliberately copy the way that UCI info messages are displayed from someone else's engine - perhaps because I think it's a good exemplar. So what? Does that make my engine a clone? No. In this example I haven't done anything remotely wrong so far.

If I were a cloner it would be for reasons other than copying a UCI message format, and for which evidence would be different to the mere display of info messages.

By all means let's share evidence of engine cloning, but it has to be of reasonable quality.
You can't be serious.

:lol:

Christopher
Dave McClain
Posts: 1018
Joined: Fri Mar 10, 2006 12:56 am
Location: Major, 45 Commando, Royal Marines, Condor Barracks, Arbroath, Scotland
Full name: Dave MCClain

Re: More proof of Strelka cloning

Post by Dave McClain »

MartinBryant wrote:To an extent you are right fernando.
It is not impossible that another engine that I have not looked at could, completely by chance, use the same format as Rybka. However, we are not trying to prove some other engine is a clone, just Strelka. And lo and behold, Strelka DOES produce the same format messages. A coincidence? I think not. How many similarities do there have to be before someone accepts it's a clone? I guess it depends on the individual.
But if it looks like a duck, it walks like a duck and it quacks like a duck, it's probably a duck!

Also there are actually now 17 possible keys on the info command making a staggering 355687428096000 permutations (I think Windows calculator has actually overflowed there?), although admittedly some of them are very obscure. Also there are the 4 categories considered so x4 too.

I can understand someone wanting to learn actual playing code techniques from others but there is nothing to be learned from simple messages.
Martin,

This is not being critical of your analysis.............

If someone was going to clone another person's chess engine, would it be too obvious to clone it too closely in strength? I'm using Strelka / Rybka as an example.

In other words, if Strelka is a clone of Rybka, why is there such a large discrepancy in strength? Strelka isn't close to Rybka. Was this done purposely to not direct attention or is there a difference in programming skill?

At what point does using another's code make a program a clone? Any?
fern
Posts: 8755
Joined: Sun Feb 26, 2006 4:07 pm

Re: More proof of Strelka cloning

Post by fern »

OK, but then, why would a programmer capable of cloning something as complex as Rybka leave a feature as easily changed as the messages in place, as glaring proof of his wrongdoing?

Even I could at least make the effort to hide that. I mean, at least by switching OFF those functions.

Still bewildered....
Fernando
AGove

Re: More proof of Strelka cloning

Post by AGove »

The reasoning is seriously at fault thrice over:
  • It is mistaken to assume that the selection by an author of a UCI message format will be at random.
  • It is mistaken to assume that evidence of copying a UCI message format is evidence of copying anything else.
  • It is mistaken to assume that copying a UCI message format amounts to cloning.
Uri Blass
Posts: 10296
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: More proof of Strelka cloning

Post by Uri Blass »

Dave McClain wrote:
MartinBryant wrote:To an extent you are right fernando.
It is not impossible that another engine that I have not looked at could, completely by chance, use the same format as Rybka. However, we are not trying to prove some other engine is a clone, just Strelka. And lo and behold, Strelka DOES produce the same format messages. A coincidence? I think not. How many similarities do there have to be before someone accepts it's a clone? I guess it depends on the individual.
But if it looks like a duck, it walks like a duck and it quacks like a duck, it's probably a duck!

Also there are actually now 17 possible keys on the info command making a staggering 355687428096000 permutations (I think Windows calculator has actually overflowed there?), although admittedly some of them are very obscure. Also there are the 4 categories considered so x4 too.

I can understand someone wanting to learn actual playing code techniques from others but there is nothing to be learned from simple messages.
Martin,

This is not being critical of your analysis.............

If someone was going to clone another person's chess engine, would it be too obvious to clone it too closely in strength? I'm using Strelka / Rybka as an example.

In other words, if Strelka is a clone of Rybka, why is there such a large discrepancy in strength? Strelka isn't close to Rybka. Was this done purposely to not direct attention or is there a difference in programming skill?

At what point does using another's code make a program a clone? Any?
This is simply wrong; there is no big difference in strength between the engines.

Strelka is slightly weaker, but the difference is small.

From the CCRL list:


http://computerchess.org.uk/ccrl/4040/r ... t_all.html
Rybka 1.0 32-bit 2882 +16 −16 60.8% −71.9 40.4% 1269
66.6%
Strelka 1.0b 2872 +39 −38 57.5% −47.9 40.8% 213

From the CEGT list:
http://www.husvankempen.de/nunn/40_40%2 ... liste.html

52 Rybka 1.0 Beta w32 2821 8 8 4891 65.1 % 2713 33.0 %
67 Strelka 1.8 2796 34 34 249 52.2 % 2781 38.6 %
86 Strelka 1.0 Beta 2758 17 17 1030 51.9 % 2744 34.7 %
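
As a rough way to read these numbers, the Python sketch below computes each rating gap to Rybka, the expected score that gap implies under the usual logistic Elo formula, and whether the quoted error margins overlap. It assumes that the two figures following each rating are the plus and minus error margins; the function names are illustrative only.

```python
# (name, rating, plus, minus) copied from the two lists above.
# ASSUMPTION: the two numbers after each rating are its error margins.
RATINGS = {
    "CCRL": [("Rybka 1.0 32-bit", 2882, 16, 16),
             ("Strelka 1.0b", 2872, 39, 38)],
    "CEGT": [("Rybka 1.0 Beta w32", 2821, 8, 8),
             ("Strelka 1.8", 2796, 34, 34),
             ("Strelka 1.0 Beta", 2758, 17, 17)],
}


def expected_score(diff):
    """Expected score of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def intervals_overlap(a, b):
    """True if the [rating - minus, rating + plus] ranges of a and b overlap."""
    a_lo, a_hi = a[1] - a[3], a[1] + a[2]
    b_lo, b_hi = b[1] - b[3], b[1] + b[2]
    return a_lo <= b_hi and b_lo <= a_hi


def compare(entries):
    """Compare every later entry against the first one (Rybka)."""
    rybka, rest = entries[0], entries[1:]
    return [(s[0], rybka[1] - s[1], intervals_overlap(rybka, s)) for s in rest]


for name, entries in RATINGS.items():
    for engine, gap, overlap in compare(entries):
        print(name, engine, gap, round(expected_score(gap), 3), overlap)
```

On these figures the gap is 10 points on CCRL and 25 points for Strelka 1.8 on CEGT, both with overlapping margins, while Strelka 1.0 Beta on CEGT is 63 points behind and its margins do not overlap Rybka's.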