Clone detection test

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Clone detection test

Post by michiguel »

Hart wrote:from the id: "Strelka R-3-E (Rybka-3-Eval)", by you know who.

It's a couple of years old and the source was not included. Can't even remember where I downloaded it from.
There is no doubt that R 3 eval was not a joke.

Miguel
Osipov Jury
Posts: 186
Joined: Mon Jan 21, 2008 2:07 pm
Location: Russia

Re: Clone detection test

Post by Osipov Jury »

Dann Corbit wrote:What's strelka_3r?
I've never heard of it.
It was my first experiment with sources of Rybka 3.
I posted the link here:

http://talkchess.com/forum/viewtopic.ph ... ht=#321347
Rstchess

Clone Engine List

Post by Rstchess »

"This is a fairly complete list of all Winboard/UCI chess engines that were proven to be unauthorized copies of legitimate engines. I may remove some clones that did not get distributed widely. I am not going to list fake "upgrade versions" of legitimate engines. There are too many clones to list them all, so I want to list those that fooled us long enough to play in a major tournament. Many of the programmer names shown under 'Cloner' are not real, but some of them are. This list is intended to be an "avoid list". I feel strongly that Winboard people should not play clone engines in any tournament! The only exception being the engines marked with an asterisk (*) indicating that the engine is no longer a clone."
Computer-Chess Wiki

Clone engine list

http://computer-chess.org/doku.php?id=c ... ngine_list
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Clone detection test

Post by michiguel »

Hart wrote:
M ANSARI wrote: That is interesting. I do think, though, that an MPV output of, say, the best 2 moves is even better, or, as someone mentioned, to include the second ponder move. Also, the positions should not be ones where several moves can have equivalent scores, so the move chosen should be much better than the second best. With regards to Rybka, the contempt value set should not corrupt the engine output, so analyse contempt should be used. I would be very interested to see how different R3 human, default and dynamic are to say RL or other IPPOLIT based engines.
As far as RL/FB is concerned, they are most similar to Rybka 3, followed very closely by Rybka 3 Human, and the least like Rybka 3 Dynamic, when only comparing Rybka 3 flavors.

       +------------------------------------------------------zappa_1.1 
  +---14  
  !    +------------------------------------------------zpa_mexico
  !  
  !            +---------------------------------------fruit_21  
  !     +------8 
  !  +-16      +----------------------------------------toga_1.3.1
  !  !  !  
  !  !  +-------------------------------------------------ptor_1.3.2
  !  !  
  !  !                       +-----------------------------------dh64_1.3.3
  !  !        +--------------3 
  !  !        !              +----------------------------------komodo_1.0
  !  !        !  
  !  !        !                   +--------------------------firebird_1
  !  !     +-15       +-----------1 
  !  !     !  !       !           +-------------------------rlito_85g3
  !  !     !  !    +-12  
  !  !     !  !    !  !        +------------------------------rybka_3   
 18-20     !  !    !  !      +-4 
  !  !     !  +---13  +------5 +-------------------------------rybka_3H  
  !  !     !       !         ! 
  !  !     !       !         +---------------------------------rybka_3D  
  !  !  +-17       !  
  !  !  !  !       +-----------------------------------------strelka_3r
  !  !  !  !  
  !  !  !  !                     +---------------------------naum_4.0  
  !  !  !  !             +-------2 
  !  !  !  !         +---6       +-------------------------naum_4.1  
  !  !  !  !         !   ! 
  !  !  !  +---------9   +---------------------------------rybka_22n2
  !  +-21            ! 
  !     !            !  +------------------------------------rybka_1w32
  !     !            +--7 
  !     !               +-----------------------------------strelka_2 
  !     !  
  !     !             +-----------------------------------------gurung_2.2
  !     !          +-10  
  !     !  +------11  +-------------------------------------skfish_1.4
  !     !  !       !  
  !     +-19       +-----------------------------------------skfish_1.6
  !        !  
  !        +-------------------------------------------------spark_0.3a
  !  
  +-------------------------------------------------------bright_04a
Naum stands out again. Interestingly, Rybka 2.2n2 is "closer" to Naum 4 than Rybka 1 or Rybka 3.
Just a confirmation of your data.
Your 6400 positions per engine were re-sampled, taking only ~10% of the data randomly (about 640 positions per tree), and the trees were recalculated. This was done 1000 times. The numbers you see are the number of times that particular branch showed up in the tree (%). If the signal is strong, the branch will appear most of the time, because it won't depend too much on which positions were chosen randomly. The branches that are outside the light blue area have an extremely strong signal.
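
A minimal Python sketch of the resampling idea, not the Ruby script mentioned in this post. It assumes each engine's chosen moves are stored as a list indexed by position, draws the ~10% subsample without replacement, and replaces the tree building with a much simpler stand-in: it only counts how often each engine's nearest neighbour (highest move-match rate) comes out the same across subsamples. The function names and the engine data are hypothetical.

```python
import random
from collections import Counter

def match_rate(moves_a, moves_b, idx):
    """Share of the sampled positions on which both engines chose the same move."""
    return sum(moves_a[i] == moves_b[i] for i in idx) / len(idx)

def nearest_neighbour_support(choices, fraction=0.10, rounds=1000, seed=0):
    """Percentage of subsamples in which engine `a` has `b` as nearest neighbour.

    This is a simplified stand-in for branch support on a tree: a grouping
    that survives most subsamples does not depend on which positions were drawn.
    """
    rng = random.Random(seed)
    names = sorted(choices)
    n_positions = len(choices[names[0]])
    sample_size = max(1, int(n_positions * fraction))
    counts = Counter()
    for _ in range(rounds):
        # Assumption: subsample without replacement, about 10% of the positions.
        idx = rng.sample(range(n_positions), sample_size)
        for a in names:
            others = [b for b in names if b != a]
            best = max(others, key=lambda b: match_rate(choices[a], choices[b], idx))
            counts[(a, best)] += 1
    return {pair: 100.0 * n / rounds for pair, n in counts.items()}

# Hypothetical data: engine_b agrees with engine_a on 90 of 100 positions,
# engine_c never agrees with either.
base = ["m%d" % i for i in range(100)]
choices = {
    "engine_a": base,
    "engine_b": base[:90] + ["y%d" % i for i in range(90, 100)],
    "engine_c": ["x%d" % i for i in range(100)],
}
support = nearest_neighbour_support(choices)
print(support[("engine_a", "engine_b")])
```

With real data the support values would fall anywhere between 0 and 100, and the low ones would mark groupings that depend on the particular positions sampled.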

Michael, I will send you the Ruby script I wrote to resample your data later. I now have everything automated on my computer.

Interestingly, your positions match Don's results extremely well. Considering that and the statistical analysis, I think there is very little doubt about the stylistic similarities. The only curiosity now is what happens if more engines are included.

Miguel

Image

The raw data with the bootstrap numbers (1000 max):

Extended majority rule consensus tree

CONSENSUS TREE:
the numbers on the branches indicate the number
of times the partition of the species into the two sets
which are separated by that branch occurred
among the trees, out of 1000.00 trees

                                                                 +------naum 4.1
                                                          +995.0-|
                                                   +910.0-|      +------naum 4.0
                                                   |      |
                                     +--------1000-|      +-------------rybka 22n2
                                     |             |
                                     |             +--------------------strelka 2
                                     |
                              +885.0-|                           +------doch64 1.3
                              |      |      +---------------1000-|
                              |      |      |                    +------komodo 1.0
                              |      |      |
                              |      +828.0-|             +-------------rybka 3
                              |             |      +681.0-|
                              |             |      |      |      +------firebird 1
                              |             +964.0-|      +-1000-|
                       +416.0-|                    |             +------robbolito 
                       |      |                    |
                       |      |                    +--------------------strelka 3r
                       |      |
                       |      |                                  +------glaurung 2
                       |      |                           +677.0-|
                +794.0-|      |                    +998.0-|      +------sf  1.4
                |      |      |                    |      |
                |      |      +--------------488.0-|      +-------------sf  1.6
                |      |                           |
                |      |                           +--------------------spark 0.3a
                |      |
         +997.0-|      |                                  +-------------bright 04a
         |      |      +----------------------------553.0-|
         |      |                                         |      +------zappa mexi
         |      |                                         +939.0-|
  +------|      |                                                +------zappa 1.1
  |      |      |
  |      |      +-------------------------------------------------------protector 
  |      |
  |      +--------------------------------------------------------------toga 1.3.1
  |
  +---------------------------------------------------------------------fruit 21


  remember: this is an unrooted tree!

Spacious_Mind
Posts: 317
Joined: Mon Nov 02, 2009 12:05 am
Location: Alabama

Re: Clone detection test

Post by Spacious_Mind »

Osipov Jury wrote:
Dann Corbit wrote:What's strelka_3r?
I've never heard of it.
It was my first experiment with sources of Rybka 3.
I posted the link here:

http://talkchess.com/forum/viewtopic.ph ... ht=#321347
A question Yuri, does Strelka_3r search correctly with ply games or do you also have to deduct 3 ply?

Thanks and regards

Nick
Osipov Jury
Posts: 186
Joined: Mon Jan 21, 2008 2:07 pm
Location: Russia

Re: Clone detection test

Post by Osipov Jury »

Strelka_3R has the same search as Strelka 2.0B with correct ply, nodes and PV.
Spacious_Mind
Posts: 317
Joined: Mon Nov 02, 2009 12:05 am
Location: Alabama

Re: Clone detection test

Post by Spacious_Mind »

Osipov Jury wrote:Strelka_3R has the same search as Strelka 2.0B with correct ply, nodes and PV.
Thanks Jury

The reason I asked is some ply tests: I wanted to make absolutely sure that Strelka is not at 8 ply instead of 5. It is way too early to say, because the number of games is not high, but so far at 5 ply your Strelka_3R looks good:

    Engine                Score
01: Strelka3_R            141.0/208
02: Ippolit               113.5/209
03: IvanHoe_v81_w32       111.0/209
04: FireBird 1.0 beta w32 109.5/209
05: Igorrit_0086v         107.5/208
06: Robbolito_0085f1_IA32 105.5/208
07: RobboLito_0085g3_w32  104.5/208
08: Igorrit_0086v9_w32    103.5/208
09: Igorrit_0086v_Plus    103.5/208
10: Igorrit_0086v2        102.5/208
11: IvanHoe_v68_w32_JR    102.0/209
12: RobboLito_0085e1_w32  102.0/209
13: IvanHoe_v73_w32       100.0/209
14: IvanHoe999970          94.0/209
15: Tankist 1.2 32-bit     94.0/208
regards

Nick
swami
Posts: 6662
Joined: Thu Mar 09, 2006 4:21 am

Re: Clone detection test

Post by swami »

Hi Don and others,
Don wrote: Suppose you ran 1000 random positions on many different versions of
the same program, then ran the same positions on many versions of
other programs. What could be deduced statistically from how often
the various program versions picked the same move?

The 1000 positions are from a set of positions that Larry Kaufman and
I created long ago that are designed to compare chess programs to
humans in playing style. So few problems are blatantly tactical and
in many of these positions the choice of moves is going to be based on
preference more than raw strength.

The test compares any two programs by how often they pick the same
move, out of a sample of 1000 positions. I run each program to the
same time limit which in this case is 1/10 of a second.
I hope STS can be used for the clone detection test, since it also comes with partial-credit moves. That allows more thorough probing, so the engines' choices can be assessed more easily and comprehensively.

Next version will be released probably by the end of this month, and we will have 1000 positions.
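
The move-matching test Don describes in the quote above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each engine's chosen move per position has already been collected (for example at a fixed time limit), and the engine names and moves below are hypothetical, not data from this thread.

```python
def match_rate(moves_a, moves_b):
    """Fraction of positions on which two engines chose the same move."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be run on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    same = sum(1 for a, b in zip(moves_a, moves_b) if a == b)
    return same / len(moves_a)

def similarity_matrix(choices):
    """Match rate for every ordered pair of engines."""
    names = sorted(choices)
    return {(x, y): match_rate(choices[x], choices[y]) for x in names for y in names}

# Hypothetical move choices on four positions (coordinate notation).
choices = {
    "engine_a": ["e2e4", "g1f3", "d2d4", "c2c4"],
    "engine_b": ["e2e4", "g1f3", "d2d4", "b1c3"],
    "engine_c": ["d2d4", "g1f3", "e2e3", "b1c3"],
}
matrix = similarity_matrix(choices)
for pair in [("engine_a", "engine_b"), ("engine_a", "engine_c"), ("engine_b", "engine_c")]:
    print(pair, matrix[pair])
```

A matrix like this is the kind of input a clustering or tree-building step would start from, typically after converting each rate to a distance such as 1 minus the match rate.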
swami
Posts: 6662
Joined: Thu Mar 09, 2006 4:21 am

Re: Clone detection test

Post by swami »

El Chinito was considered a clone of Crafty. I guess that El Chinito would pass this test without suspicion, as its style and strength were completely different from Crafty's at the time.

It was one of the hardest clones to detect, ever.

It'd be interesting if someone could compare El Chinito with various versions of Crafty released around that time and before.

I believe there were about two versions of Chinito.
BubbaTough
Posts: 1154
Joined: Fri Jun 23, 2006 5:18 am

Re: Clone detection test

Post by BubbaTough »

swami wrote:
I hope STS can be used for the clone detection test, since it also comes with partial-credit moves. That allows more thorough probing, so the engines' choices can be assessed more easily and comprehensively.

Next version will be released probably by the end of this month, and we will have 1000 positions.
I thought the point of STS was that there was an objectively best move (as well as a possible 2nd or 3rd best for partial credit). If this is the case, then the better the programs are, the more they would look like each other in terms of STS results. If anything, the positions you have rejected are more likely to be good test candidates, because presumably you rejected them as not having a clear best move. Or better yet, the positions you did not even consider using, because it is completely unclear what the best move might be.

-Sam
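
One way to act on the point above is to keep only positions without a clear best move. The Python sketch below assumes a reference engine has already produced centipawn scores for several candidate moves per position (for example from multi-PV output). The 20-centipawn threshold, the function name, and the data are hypothetical choices for illustration.

```python
def ambiguous_positions(scored, max_gap_cp=20):
    """Return ids of positions whose two best moves are within max_gap_cp centipawns."""
    keep = []
    for position_id, scores in scored.items():
        if len(scores) < 2:
            continue  # a single candidate tells us nothing about the gap
        top = sorted(scores.values(), reverse=True)
        if top[0] - top[1] <= max_gap_cp:
            keep.append(position_id)
    return sorted(keep)

# Hypothetical scores: move -> centipawns from the side to move's view.
scored = {
    "pos1": {"e2e4": 30, "d2d4": 25},             # close call, keep
    "pos2": {"d1h5": 350, "g1f3": 10},            # clear tactical best move, drop
    "pos3": {"c2c4": 12, "g1f3": 12, "e2e4": -5}, # tie at the top, keep
}
selected = ambiguous_positions(scored)
print(selected)
```

Positions that pass this filter are the ones where the choice reflects preference rather than strength, which is where engines of similar strength have room to differ.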