Clone detection test

Discussion of chess software programming and technical issues.

Moderator: Ras

M ANSARI
Posts: 3726
Joined: Thu Mar 16, 2006 7:10 pm

Re: Clone detection test

Post by M ANSARI »

That is interesting. I do think, though, that an MPV (MultiPV) output of, say, the best 2 moves would be even better, or, as someone mentioned, to include the second (ponder) move. Also, the test positions should not be ones where several moves can have equivalent scores, so the move chosen should be much better than the second best. With regard to Rybka, the contempt value should not corrupt the engine output, so analyse contempt should be used. I would be very interested to see how different R3 Human, default, and Dynamic are from, say, RL or other IPPOLIT-based engines.
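A minimal sketch of the position filter suggested above, assuming each test position already has a 2-line multi-PV result stored as (move, score in centipawns) pairs sorted best-first. The function name, the data layout, and the 50 cp margin are illustrative assumptions, not anything specified in this thread.

```python
def filter_clear_positions(results, min_gap_cp=50):
    """Keep positions whose best move beats the second best by min_gap_cp or more.

    results maps a position id to a list of (move, score_cp) pairs,
    assumed to be sorted best-first (hypothetical layout).
    Returns {position id: best move} for the positions that pass.
    """
    kept = {}
    for pos_id, lines in results.items():
        if len(lines) < 2:
            continue  # no second line to compare against; skipped by assumption
        (best_move, best_cp), (_, second_cp) = lines[0], lines[1]
        if best_cp - second_cp >= min_gap_cp:
            kept[pos_id] = best_move
    return kept


# Made-up sample data, only to show the call.
sample = {
    "p1": [("e2e4", 80), ("d2d4", 75)],   # near-equal alternatives: dropped
    "p2": [("g1f3", 120), ("b1c3", 20)],  # clear best move: kept
    "p3": [("a2a3", 0)],                  # only one line: dropped
}
clear = filter_clear_positions(sample)
```

Only positions that survive this filter would then be used when comparing which move each engine picks.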
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: Clone detection test

Post by Kempelen »

I have another idea that could be interesting: develop a little tool that takes two parameters, the executable name and path of each engine. The tool would search for byte sequences of one inside the other. The minimum sequence length would be, e.g., 10 bytes.

So, if I compare rybka.exe and fruit.exe and the output says there are 20 sequences of at least 15 bytes that are identical in both, that would be very suspicious, while 1 or 2 would be normal.

I am still developing the idea, but it would be very interesting to see how many bytes two engines share.

What is your opinion?
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
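A rough sketch of the tool described in the post above, not anyone's actual implementation. It indexes every min_len-byte window of the first file and reports maximal runs that also occur in the second. The function name and return format are assumptions made for illustration.

```python
def shared_runs(a: bytes, b: bytes, min_len: int = 10):
    """Return (offset_in_a, offset_in_b, length) for maximal shared runs >= min_len."""
    # Index every min_len-byte window of a by its content.
    index = {}
    for i in range(len(a) - min_len + 1):
        index.setdefault(a[i:i + min_len], []).append(i)

    runs = []
    for j in range(len(b) - min_len + 1):
        for i in index.get(b[j:j + min_len], ()):
            # Skip matches that only continue a run starting one byte earlier.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the match to the right while the bytes agree.
            n = min_len
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            runs.append((i, j, n))
    return runs


# Usage (file names taken from the post; adjust the paths as needed):
# from pathlib import Path
# runs = shared_runs(Path("rybka.exe").read_bytes(),
#                    Path("fruit.exe").read_bytes(), min_len=15)
# print(len(runs), "shared runs of 15+ bytes")
```

The window index holds one entry per byte offset, so memory use grows quickly with file size, and long stretches of zero padding will show up as many matches unless they are filtered out first.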
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Clone detection test

Post by Milos »

Kempelen wrote: I have another idea that could be interesting: develop a little tool that takes two parameters, the executable name and path of each engine. The tool would search for byte sequences of one inside the other. The minimum sequence length would be, e.g., 10 bytes.

So, if I compare rybka.exe and fruit.exe and the output says there are 20 sequences of at least 15 bytes that are identical in both, that would be very suspicious, while 1 or 2 would be normal.

I am still developing the idea, but it would be very interesting to see how many bytes two engines share.

What is your opinion?
It won't work. Binaries are strongly compiler dependent.
Kempelen
Posts: 620
Joined: Fri Feb 08, 2008 10:44 am
Location: Madrid - Spain

Re: Clone detection test

Post by Kempelen »

Milos wrote:
It won't work. Binaries are strongly compiler dependent.
I was thinking more of data than code. An 8-int vector (like a passed-pawn evaluation by file) would be similar across most compilers.
Fermin Serrano
Author of 'Rodin' engine
http://sites.google.com/site/clonfsp/
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Clone detection test

Post by Milos »

Kempelen wrote:
I was thinking more of data than code. An 8-int vector (like a passed-pawn evaluation by file) would be similar across most compilers.
You could certainly identify tables this way. However, it is always easy to defeat this kind of check by slightly changing the table values (e.g. by 1 cp) from the original ones.
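To make the table idea and the 1 cp objection concrete, here is a small sketch that scans a binary for a known 8-value table, with an optional tolerance per entry. It assumes the table is stored as consecutive little-endian 32-bit signed ints on a 4-byte boundary, which is an assumption about the compiler output, and the table values used below are invented for the example.

```python
import struct


def find_table(blob: bytes, table, tol: int = 0):
    """Return byte offsets where `table` appears as consecutive little-endian int32s.

    A candidate matches when every value is within +/- tol of the reference.
    """
    n = len(table)
    size = 4 * n
    hits = []
    # Step by 4 on the assumption that int arrays are 4-byte aligned.
    for off in range(0, len(blob) - size + 1, 4):
        values = struct.unpack_from(f"<{n}i", blob, off)
        if all(abs(v - t) <= tol for v, t in zip(values, table)):
            hits.append(off)
    return hits


# Invented reference table and a copy with each entry nudged by at most 1.
reference = [0, 10, 20, 40, 60, 90, 130, 0]
tweaked = [1, 11, 19, 41, 60, 91, 129, 0]
blob = b"\x00" * 8 + struct.pack("<8i", *tweaked) + b"\xff" * 4

exact_hits = find_table(blob, reference, tol=0)
fuzzy_hits = find_table(blob, reference, tol=1)
```

An exact search misses the tweaked copy while a tolerance of 1 finds it, but a wider tolerance also raises the chance of false matches, and larger changes or a reordered table would still get past it.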
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Clone detection test

Post by Don »

I would argue that testing for data in common is not a very good clone detection algorithm. It does not distinguish between what is important and key and what is mundane and not important.

Suppose I have the 1-bit-per-entry king and pawn vs. king database compiled into my program. I did my own a few years ago and gave John Stanback the technology, but that does not make Zarkov anything like my program.

Also, the magic bitboard data for move generation in chess programs is pretty mundane; it's a pretty big chunk of data that has nothing to do with how a program behaves or how skilled the program author is.
Kempelen wrote:
I was thinking more of data than code. An 8-int vector (like a passed-pawn evaluation by file) would be similar across most compilers.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Clone detection test

Post by michiguel »

Kempelen wrote:
I was thinking more of data than code. An 8-int vector (like a passed-pawn evaluation by file) would be similar across most compilers.
Hi Fermin,

There is extensive literature in biology on what you describe (for gene comparison). The simplest technique is called a "dot plot". In fact, it has been used for clone detection in software, using the same ideas from the natural sciences. Look at the plot on page 2.

Hope the link works

http://docs.google.com/viewer?a=v&q=cac ... YnzoKirW1Q

google "dot plot clone detection"

Miguel
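For readers who have not seen one, a dot plot can be sketched in a few lines. This toy version marks cell (i, j) when the window starting at position i of one sequence equals the window starting at position j of the other, so a shared region shows up as a diagonal run of marks. The function name and the window size are arbitrary choices for illustration, and the quadratic cost means it only suits small inputs.

```python
def dot_plot(a: bytes, b: bytes, window: int = 4):
    """Return text rows where '*' at (i, j) means a[i:i+window] == b[j:j+window]."""
    rows = []
    for i in range(len(a) - window + 1):
        row = "".join(
            "*" if a[i:i + window] == b[j:j + window] else "."
            for j in range(len(b) - window + 1)
        )
        rows.append(row)
    return rows


# Two short sequences sharing the stretch "CDEFGH".
plot = dot_plot(b"ABCDEFGH", b"xxCDEFGHyy", window=4)
for line in plot:
    print(line)
```

With real binaries one would typically plot only a sampled or hashed version of the windows, since the full matrix has one cell per pair of offsets.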
Hart

Re: Clone detection test

Post by Hart »

M ANSARI wrote: That is interesting. I do think, though, that an MPV (MultiPV) output of, say, the best 2 moves would be even better, or, as someone mentioned, to include the second (ponder) move. Also, the test positions should not be ones where several moves can have equivalent scores, so the move chosen should be much better than the second best. With regard to Rybka, the contempt value should not corrupt the engine output, so analyse contempt should be used. I would be very interested to see how different R3 Human, default, and Dynamic are from, say, RL or other IPPOLIT-based engines.
As far as RL/FB are concerned, they are most similar to Rybka 3, followed very closely by Rybka 3 Human, and least like Rybka 3 Dynamic, when comparing only the Rybka 3 flavors.

Code:

       +------------------------------------------------------zappa_1.1 
  +---14  
  !    +------------------------------------------------zpa_mexico
  !  
  !            +---------------------------------------fruit_21  
  !     +------8 
  !  +-16      +----------------------------------------toga_1.3.1
  !  !  !  
  !  !  +-------------------------------------------------ptor_1.3.2
  !  !  
  !  !                       +-----------------------------------dh64_1.3.3
  !  !        +--------------3 
  !  !        !              +----------------------------------komodo_1.0
  !  !        !  
  !  !        !                   +--------------------------firebird_1
  !  !     +-15       +-----------1 
  !  !     !  !       !           +-------------------------rlito_85g3
  !  !     !  !    +-12  
  !  !     !  !    !  !        +------------------------------rybka_3   
 18-20     !  !    !  !      +-4 
  !  !     !  +---13  +------5 +-------------------------------rybka_3H  
  !  !     !       !         ! 
  !  !     !       !         +---------------------------------rybka_3D  
  !  !  +-17       !  
  !  !  !  !       +-----------------------------------------strelka_3r
  !  !  !  !  
  !  !  !  !                     +---------------------------naum_4.0  
  !  !  !  !             +-------2 
  !  !  !  !         +---6       +-------------------------naum_4.1  
  !  !  !  !         !   ! 
  !  !  !  +---------9   +---------------------------------rybka_22n2
  !  +-21            ! 
  !     !            !  +------------------------------------rybka_1w32
  !     !            +--7 
  !     !               +-----------------------------------strelka_2 
  !     !  
  !     !             +-----------------------------------------gurung_2.2
  !     !          +-10  
  !     !  +------11  +-------------------------------------skfish_1.4
  !     !  !       !  
  !     +-19       +-----------------------------------------skfish_1.6
  !        !  
  !        +-------------------------------------------------spark_0.3a
  !  
  +-------------------------------------------------------bright_04a
Naum stands out again. Interestingly, Rybka 2.2n2 is "closer" to Naum 4 than to Rybka 1 or Rybka 3.
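The post does not say how the tree was produced, so the following is only a guess at the general approach: treat the fraction of test positions on which two engines pick different moves as a distance, then merge the closest groups step by step (average linkage). All names and the sample moves below are invented for the example.

```python
from itertools import combinations


def move_distance(moves_a, moves_b):
    """Fraction of positions on which two engines chose different moves."""
    differing = sum(1 for x, y in zip(moves_a, moves_b) if x != y)
    return differing / len(moves_a)


def cluster(engine_moves):
    """Average-linkage agglomerative clustering; returns the merges in order.

    engine_moves maps engine name -> list of chosen moves, one per test
    position, with every list in the same position order.
    """
    clusters = [(name,) for name in engine_moves]  # each cluster is a tuple of names
    merges = []

    def linkage(c1, c2):
        # Mean pairwise distance between the members of the two clusters.
        total = sum(move_distance(engine_moves[a], engine_moves[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Invented data: A and B agree on 3 of 4 positions, C agrees with neither.
picks = {
    "A": ["e4", "d4", "Nf3", "c4"],
    "B": ["e4", "d4", "Nf3", "g3"],
    "C": ["a3", "h3", "a4", "h4"],
}
merges = cluster(picks)
```

Engines that merge early with a small distance are the ones a tree like the one above would draw as close neighbours.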
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Clone detection test

Post by Dann Corbit »

What's strelka_3r?
I've never heard of it.
Hart

Re: Clone detection test

Post by Hart »

from the id: "Strelka R-3-E (Rybka-3-Eval)", by you know who.

It's (correction) 14 months old and the source was not included. Can't even remember where I downloaded it from.
Last edited by Hart on Wed Feb 03, 2010 1:32 am, edited 1 time in total.