tpetzke wrote:(...) I just like the concept of property (...)
+1
Indeed!
Steve
+2
+3
I have sent my code to roughly 8-10 people privately over the five years I've worked on Myrddin. One of those ended up cloning it, literally by changing only the engine name. And this was a version that, at the time, was rated maybe 1500. So you can guess the lengths a motivated cloner will go to for an engine that's 2500+ or even 3000+.
I've spent a lot of hours on Myrddin, and who cares if it is still only about 2300? Why should somebody else benefit from all of that work practically instantly?
jm
Agreed... nobody should benefit from that unless the author allows it, and that permission should be documented.
Tennison wrote:The only changes made to reach a "<55%" similarity are completely asymmetric PSTs (based on Adam Hair's values).
If you want to see the changes, just search for "Robber" in the source files.
It has been well known for a long time that the similarity test actually measures PST matching; all other eval terms are completely irrelevant.
It's a totally unscientific thing, made to look like science.
It is not logical that other terms are completely irrelevant (perhaps they are relatively unimportant if you have crazy high values in the piece-square tables).
If you change the mobility evaluation or the king safety evaluation, you change the choice of move, so mobility and king safety should be relevant.
Also, if you change the search, you change the choice of moves, so search should be relevant too.
Note that I expect a crazy asymmetric PST to be relatively weaker at longer time controls, so some questions:
1) At what time control do you measure the 70-80 Elo difference, and what happens at a time control that is 3 times slower?
2) What happens to the similarity at a longer time control, and do you find relatively bigger similarity to Stockfish (when you compare with other engines) if you use a longer time control?
Tennison wrote:The only changes made to reach a "<55%" similarity are completely asymmetric PSTs (based on Adam Hair's values).
If you want to see the changes, just search for "Robber" in the source files.
It has been well known for a long time that the similarity test actually measures PST matching; all other eval terms are completely irrelevant.
It's a totally unscientific thing, made to look like science.
As I recall, you are the only one who has espoused the idea that the similarity test basically only measures PST matching. Are you sure that this is a position that you want to take? Yes, PSTs have a definite influence on move selection. However, it has been shown more than once (and soon, once again) that simply using the same PSTs does not make two engines highly similar. This negates the statement "All other eval terms are completely irrelevant".
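For readers who have not run the tool: at its core, a similarity tester of this kind is just a move-matching count over a fixed position set. Here is a minimal sketch of that idea in Python; the function name and the toy move lists are my own illustration, not the actual sim tool, which runs each engine on thousands of positions at a fixed time per move.

```python
def similarity(moves_a, moves_b):
    """Percentage of positions on which two engines chose the same move.

    moves_a and moves_b are the best moves (UCI strings) that engines A
    and B returned for the same list of test positions.
    """
    assert len(moves_a) == len(moves_b), "one move per engine per position"
    matches = sum(1 for a, b in zip(moves_a, moves_b) if a == b)
    return 100.0 * matches / len(moves_a)

# Toy example: two engines agreeing on 3 of 4 positions.
engine_a = ["e2e4", "g1f3", "d2d4", "c2c4"]
engine_b = ["e2e4", "g1f3", "d2d4", "b1c3"]
print(similarity(engine_a, engine_b))  # 75.0
```

This also makes the debate above concrete: anything that shifts move choice (PSTs, mobility, king safety, search) shifts this percentage, so the question is only how strongly each term does so.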
velmarin wrote:I've never seen the positions used in SIM, although it seems they can be changed.
My guess is that they are middlegame positions.
What would happen if you changed them to positions near the opening, or to endgame positions?
Larry may recall the nature of the positions. I do know that they include opening and midgame positions. Possibly early endgame, but I have not looked that closely.
Rebel wrote:And so we are witnessing the death of the similarity tester. Now that the cat is out of the bag, I can confirm Ben's findings. During the PST thread in the programmers' forum I did some experiments with the various posted PSTs and piece values, and indeed they bring the similarity percentage down dreadfully without too much Elo loss (20-30).
So folks be aware, cloners will find out anyway.
Still, no false positives with Sim, only false negatives.
Yes. Two or three years ago I did some experiments with Fruit, and substantially changing the piece-square values would fool the test. However, it appears that the changes do not have to be as drastic as the ones I used.
lucasart wrote:At least going open source means you have nothing to hide. It still puzzles me why people develop private engines (so you can't even run the similarity test?) or closed source engines when they are hundreds of elo below the top engines. Why do they fear to show us their code?
As others have remarked, this has nothing to do with fear. My chess engine is my IP, my sweat and effort, my time away from my family, and my "baby". Turning the question around, what advantage does anybody gain if I opened the source, and why should I care?
And even if you ignore those arguments and have the source code in front of you, there are limits to what a human reader can absorb from thousands of lines of text designed primarily to function, not to convey meaning. When knowledge passes into code, it changes state; like water turned to ice, it becomes a new thing, with new properties. That's one reason why transplanting code from one program to another doesn't usually have the desired effect.
There are two types of people in the world: Avoid them both.
sim version 3
------ TwinFish 0.07 (time: 100 ms scale: 1.0) ------
52.25 Stockfish 070114 64 SSE4.2 (time: 100 ms scale: 1.0)
47.54 Houdini 4 x64 (time: 100 ms scale: 1.0)
47.51 Bouquet 1.8 x64 (time: 100 ms scale: 1.0)
47.20 Komodo TCECr 64-bit (time: 100 ms scale: 1.0)
47.17 Fire 3.0 x64 (time: 100 ms scale: 1.0)
46.71 IvanHoe-Beta 999946h6 x64 Tr (time: 100 ms scale: 1.0)
45.65 Chiron 2 64bit (time: 100 ms scale: 1.0)
44.62 Protector 1.6.0 x64 (time: 100 ms scale: 1.0)
44.11 DiscoCheck 5.2 (time: 100 ms scale: 1.0)
39.80 Fruit 2.1 (time: 100 ms scale: 1.0)
I see a difference of 4.71 between Stockfish and the second most similar engine at 100 ms:
52.25 - 47.54 = 4.71
I think search is relatively more significant at longer time controls, so I wonder whether the difference between Stockfish and the second most similar engine is bigger at 500 ms.
Maybe it is possible to use the sim test to identify suspect programs not from the single number alone, but from the finding that the 4.71 gap increases at a longer time control (note that I do not know whether the 4.71 actually goes up).