Uri's Challenge : TwinFish

JVMerlino
Posts: 1404
Joined: Wed Mar 08, 2006 10:15 pm
Location: San Francisco, California

Re: Uri's Challenge : TwinFish

Post by JVMerlino »

Andres Valverde wrote:
Steve Maughan wrote:
tpetzke wrote:(...) I just like the concept of property (...)
+1

Indeed!

Steve
+2
+3

I have sent my code to roughly 8-10 people privately over the five years I've worked on Myrddin. One of those ended up cloning it, literally by changing only the engine name. And this was a version that, at the time, was rated maybe 1500. So you can guess the lengths a motivated cloner will go to for an engine that's 2500+ or even 3000+.

I've spent a lot of hours on Myrddin, and who cares if it is still only about 2300? Why should somebody else benefit from all of that work practically instantly?

jm
pilgrimdan
Posts: 405
Joined: Sat Jul 02, 2011 10:49 pm

Re: Uri's Challenge : TwinFish

Post by pilgrimdan »

JVMerlino wrote:
Andres Valverde wrote:
Steve Maughan wrote:
tpetzke wrote:(...) I just like the concept of property (...)
+1

Indeed!

Steve
+2
+3

I have sent my code to roughly 8-10 people privately over the five years I've worked on Myrddin. One of those ended up cloning it, literally by changing only the engine name. And this was a version that, at the time, was rated maybe 1500. So you can guess the lengths a motivated cloner will go to for an engine that's 2500+ or even 3000+.

I've spent a lot of hours on Myrddin, and who cares if it is still only about 2300? Why should somebody else benefit from all of that work practically instantly?

jm
agree... nobody should benefit from that... unless the author allows it... and that should be documented...
Uri Blass
Posts: 10905
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Uri's Challenge : TwinFish

Post by Uri Blass »

Milos wrote:
Tennison wrote:The only changes made to reach a "<55%" similarity are a complete asymmetric PST (based on Adam Hair values).

If you want to see the changes just search for "Robber" in the source files.
It has been well known for a long time that the similarity test actually measures PST matching. All other eval terms are completely irrelevant.
It's a totally unscientific thing, made to look like science.


It is not logical that the other terms are completely irrelevant (perhaps they are relatively irrelevant if you have crazy high values in the piece-square table).
If you change the mobility evaluation or the king safety evaluation, you change the choice of move, so mobility and king safety should be relevant.

Also, if you change the search, you change the choice of moves, so search should also be relevant.

Note that I expect a crazy asymmetric PST to be relatively weaker at longer time controls, so some questions:
1) At what time control did you measure the 70-80 Elo difference, and what happens at a time control that is 3 times slower?
2) What happens to the similarity at longer time controls, and do you find relatively greater similarity to Stockfish (compared with other engines) if you use a longer time control?
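
To make the argument about eval terms concrete, here is a minimal toy sketch. It is not how any real engine works: there is no search, and the candidate moves, term names, scores and weights are all hypothetical numbers chosen for illustration. It only shows that changing a non-PST term can change the chosen move, unless the PST weight is so large that it swamps everything else.

```python
# Hypothetical per-term scores (in centipawns) for three candidate moves.
# None of these numbers come from a real engine.
CANDIDATES = {
    "Nf3": {"pst": 30, "mobility": 10, "king_safety": 0},
    "Nh3": {"pst": 5, "mobility": 4, "king_safety": 0},
    "O-O": {"pst": 20, "mobility": 2, "king_safety": 25},
}


def best_move(weights):
    """Return the candidate with the highest weighted sum of eval terms."""
    def score(terms):
        return sum(weights[name] * value for name, value in terms.items())
    return max(CANDIDATES, key=lambda move: score(CANDIDATES[move]))


normal = {"pst": 1, "mobility": 1, "king_safety": 1}
no_king_safety = {"pst": 1, "mobility": 1, "king_safety": 0}
huge_pst = {"pst": 10, "mobility": 1, "king_safety": 1}
huge_pst_no_king_safety = {"pst": 10, "mobility": 1, "king_safety": 0}

# With ordinary weights, switching king safety off changes the choice.
print(best_move(normal), best_move(no_king_safety))

# With a tenfold PST weight, the same switch no longer matters.
print(best_move(huge_pst), best_move(huge_pst_no_king_safety))
```

In this toy setup the first line prints two different moves and the second prints the same move twice, which matches the point that other terms become "relatively irrelevant" only when the PST values are very large.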
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Uri's Challenge : TwinFish

Post by Adam Hair »

Milos wrote:
Tennison wrote:The only changes made to reach a "<55%" similarity are a complete asymmetric PST (based on Adam Hair values).

If you want to see the changes just search for "Robber" in the source files.
It has been well known for a long time that the similarity test actually measures PST matching. All other eval terms are completely irrelevant.
It's a totally unscientific thing, made to look like science.
As I recall, you are the only one who has espoused the idea that the similarity test basically only measures PST matching. Are you sure that this is a position that you want to take? Yes, PSTs have a definite influence on move selection. However, it has been shown more than once (and soon, once again) that simply using the same PSTs does not make two engines highly similar. This negates the statement "All other eval terms are completely irrelevant".
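
For readers unfamiliar with what a similarity percentage means, here is a minimal sketch. It assumes the figure is simply the share of test positions on which two engines choose the same move; that is an assumption about the tester, not a description of its actual code, and the move lists below are made-up examples.

```python
def similarity(moves_a, moves_b):
    """Percentage of positions on which both engines chose the same move.

    moves_a and moves_b hold one move per test position, in the same order.
    """
    if not moves_a or len(moves_a) != len(moves_b):
        raise ValueError("need two non-empty move lists of equal length")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * matches / len(moves_a)


# Hypothetical moves chosen by two engines on four test positions.
engine_a = ["e2e4", "g1f3", "d2d4", "c2c4"]
engine_b = ["e2e4", "g1f3", "d2d4", "b1c3"]

print(similarity(engine_a, engine_b))  # 75.0
```

Under that assumption, anything that changes which move an engine picks (PSTs, other eval terms, or search) can move the number, which is why identical PSTs alone need not produce a high score.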
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Uri's Challenge : TwinFish

Post by Adam Hair »

velmarin wrote:I've never seen the positions used in SIM, although it seems they can be changed.

I would guess they are middlegame positions.
What would happen if you changed them to positions near the opening, or to endgame positions?
Larry may recall the nature of the positions. I do know that they include opening and midgame positions. Possibly early endgame, but I have not looked that closely.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Uri's Challenge : TwinFish

Post by Adam Hair »

Laskos wrote:
Rebel wrote:And so we are witnessing the death of the similarity tester. Now that the cat is out of the bag I can confirm Ben's findings. During the PST thread in the programmers forum I did some experiments with the several posted PSTs and piece values, and indeed they dreadfully bring down the similarity percentage without too much Elo loss (20-30).

So folks be aware, cloners will find out anyway.
Still, no false positives with Sim, only false negatives.
Yes. 2 or 3 years ago I did some experiments with Fruit, and substantially changing the piece/square values would fool the test. However, it appears that the changes do not have to be as drastic as what I used.
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Uri's Challenge : TwinFish

Post by Sedat Canbaz »

Hello dear Adam,

Right now I am testing TwinFish and so far the results are very good. I hope to publish them soon.

It seems we need other positions for Simtest; otherwise many new engine releases will appear as derivative work or clones :)

Best,
Sedat
RoadWarrior
Posts: 73
Joined: Fri Jan 13, 2012 12:39 am
Location: London, England
Full name: Mark Pearce

Re: Uri's Challenge : TwinFish

Post by RoadWarrior »

lucasart wrote:At least going open source means you have nothing to hide. It still puzzles me why people develop private engines (so you can't even run the similarity test?) or closed source engines when they are hundreds of elo below the top engines. Why do they fear to show us their code?
As others have remarked, this has nothing to do with fear. My chess engine is my IP, my sweat and effort, my time away from my family, and my "baby". Turning the question around, what advantage does anybody gain if I opened the source, and why should I care?

And even if you ignore those arguments and have the source code in front of you, there are limits to what a human reader can absorb from thousands of lines of text designed primarily to function, not to convey meaning. When knowledge passes into code, it changes state; like water turned to ice, it becomes a new thing, with new properties. That's one reason why transplanting code from one program to another doesn't usually have the desired effect.
There are two types of people in the world: Avoid them both.
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Uri's Challenge : TwinFish

Post by Sedat Canbaz »

TwinFish 0.07's simtest results

As we see, TwinFish has a similarity of less than 40% compared to Fruit 2.1

Well done to the creator of TwinFish!
It seems they hacked the Simtest tool :)

And now I have a question for all:
- I wonder how many of the engines we are testing were created in a similar way to TwinFish?

Btw, if there is such a tournament, I will probably include TwinFish in my next Non-Fruit style tournament :)

Code:

sim version 3
------ TwinFish 0.07 (time: 100 ms  scale: 1.0) ------
 52.25  Stockfish 070114 64 SSE4.2 (time: 100 ms  scale: 1.0)
 47.54  Houdini 4 x64 (time: 100 ms  scale: 1.0)
 47.51  Bouquet 1.8 x64 (time: 100 ms  scale: 1.0)
 47.20  Komodo TCECr 64-bit  (time: 100 ms  scale: 1.0)
 47.17  Fire 3.0 x64 (time: 100 ms  scale: 1.0)
 46.71  IvanHoe-Beta 999946h6 x64 Tr (time: 100 ms  scale: 1.0)
 45.65  Chiron 2 64bit (time: 100 ms  scale: 1.0)
 44.62  Protector 1.6.0 x64 (time: 100 ms  scale: 1.0)
 44.11  DiscoCheck 5.2 (time: 100 ms  scale: 1.0)
 39.80  Fruit 2.1 (time: 100 ms  scale: 1.0)
Uri Blass
Posts: 10905
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Uri's Challenge : TwinFish

Post by Uri Blass »

Sedat Canbaz wrote:TwinFish 0.07's simtest results

As we see, TwinFish has a similarity of less than 40% compared to Fruit 2.1

Well done to the creator of TwinFish!
It seems they hacked the Simtest tool :)

And now I have a question for all:
- I wonder how many of the engines we are testing were created in a similar way to TwinFish?

Btw, if there is such a tournament, I will probably include TwinFish in my next Non-Fruit style tournament :)

Code:

sim version 3
------ TwinFish 0.07 (time: 100 ms  scale: 1.0) ------
 52.25  Stockfish 070114 64 SSE4.2 (time: 100 ms  scale: 1.0)
 47.54  Houdini 4 x64 (time: 100 ms  scale: 1.0)
 47.51  Bouquet 1.8 x64 (time: 100 ms  scale: 1.0)
 47.20  Komodo TCECr 64-bit  (time: 100 ms  scale: 1.0)
 47.17  Fire 3.0 x64 (time: 100 ms  scale: 1.0)
 46.71  IvanHoe-Beta 999946h6 x64 Tr (time: 100 ms  scale: 1.0)
 45.65  Chiron 2 64bit (time: 100 ms  scale: 1.0)
 44.62  Protector 1.6.0 x64 (time: 100 ms  scale: 1.0)
 44.11  DiscoCheck 5.2 (time: 100 ms  scale: 1.0)
 39.80  Fruit 2.1 (time: 100 ms  scale: 1.0)
I see a difference of 4.71 between Stockfish and the second most similar engine at 100 ms:
52.25 - 47.54 = 4.71
I think that search is relatively more significant at longer time controls, so I wonder if the difference between Stockfish and the second most similar engine is bigger at 500 ms.

Maybe it is possible to use simtest to flag suspect programs based not on the single number but on whether the 4.71 gap grows at longer time controls (note that I do not know whether the 4.71 actually goes up).
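
The gap calculation can be scripted so that the same check is easy to repeat at other time controls. This is only a sketch: the regular expression assumes the line layout shown in the sim output quoted in this post, and the function names are made up for illustration. Only the 100 ms figures posted in this thread are used; no 500 ms data is assumed.

```python
import re

# The 100 ms sim output quoted in this thread.
SIM_OUTPUT = """\
sim version 3
------ TwinFish 0.07 (time: 100 ms  scale: 1.0) ------
 52.25  Stockfish 070114 64 SSE4.2 (time: 100 ms  scale: 1.0)
 47.54  Houdini 4 x64 (time: 100 ms  scale: 1.0)
 47.51  Bouquet 1.8 x64 (time: 100 ms  scale: 1.0)
 47.20  Komodo TCECr 64-bit  (time: 100 ms  scale: 1.0)
 47.17  Fire 3.0 x64 (time: 100 ms  scale: 1.0)
 46.71  IvanHoe-Beta 999946h6 x64 Tr (time: 100 ms  scale: 1.0)
 45.65  Chiron 2 64bit (time: 100 ms  scale: 1.0)
 44.62  Protector 1.6.0 x64 (time: 100 ms  scale: 1.0)
 44.11  DiscoCheck 5.2 (time: 100 ms  scale: 1.0)
 39.80  Fruit 2.1 (time: 100 ms  scale: 1.0)
"""

# A result line is: similarity, engine name, then "(time: N ms ...".
RESULT_LINE = re.compile(r"^\s*(\d+\.\d+)\s+(.+?)\s*\(time:")


def parse_sim(text):
    """Return (engine, similarity) pairs sorted from most to least similar."""
    rows = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            rows.append((match.group(2), float(match.group(1))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def top_gap(rows):
    """Difference between the most similar and second most similar engine."""
    return round(rows[0][1] - rows[1][1], 2)


rows = parse_sim(SIM_OUTPUT)
gap_100ms = top_gap(rows)
print(rows[0][0], "vs", rows[1][0], "gap:", gap_100ms)
```

Running the same two functions on the output of a 500 ms run would give the second number needed for the comparison suggested above.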