Here is the critical question, however. Tuning against another engine is absolutely within ICGA tournament rules; the rules are about copied code. So one has to ask: how similar is the Naum CODE to Rybka or Strelka or whatever? Right now, the observation that it was tuned over several versions neither confirms nor refutes anything about its originality or lack thereof.

Adam Hair wrote:
If you trace the changes in Naum with the similarity tool, you will see a relatively unique engine (v1.91, v2.0) start displaying increased similarity with Strelka 2.0B (v2.1, v2.2), then high similarity with Rybka 2.x (v3.1, v4.2). I do believe that Strelka was essential for tuning Naum to Rybka.

Laskos wrote:
I don't believe this Naum story. Try to optimize an engine's strength via a test suite instead of pure Elo-wise tests: you will get a weaker engine. Trying to optimize toward playing these neutral positions from Sim the way a stronger engine does, when your own engine has a very different eval, will only wreck the engine. There are hundreds of parameters to tune; it would be a miracle for a completely different eval to be tunable to the same parameters and still produce a stronger engine.

Uri Blass wrote:
The incentive is to have a stronger engine, and as far as I remember the programmer of Naum said he already did this with Rybka (I think with Rybka 2.3.2a, but I am not sure about the exact version).

I think we can at least agree that a big similarity is not something that can happen by accident: the engine is derived either from the code or from the output of another engine.
Code: Select all
Key:  1) Fruit 2.1          (time: 290 ms scale: 1.0)
      2) Naum 1.91          (time: 502 ms scale: 1.0)
      3) Naum 2.0           (time: 290 ms scale: 1.0)
      4) Naum 2.1           (time: 217 ms scale: 1.0)
      5) Naum 2.2           (time: 180 ms scale: 1.0)
      6) Naum 3.1           (time: 114 ms scale: 1.0)
      7) Naum 4.2           (time:  58 ms scale: 1.0)
      8) Rybka 1.0 Beta     (time: 171 ms scale: 1.0)
      9) Rybka 1.1          (time: 121 ms scale: 1.0)
     10) Rybka 1.2f         (time: 114 ms scale: 1.0)
     11) Rybka 2.1o         (time: 116 ms scale: 1.0)
     12) Rybka 2.2n2        (time:  76 ms scale: 1.0)
     13) Rybka 2.3.2a       (time:  60 ms scale: 1.0)
     14) Strelka 2.0 B      (time: 114 ms scale: 1.0)
     15) Thinker 5.4c Inert (time: 102 ms scale: 1.0)

         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
 1.  ----- 48.51 47.32 53.12 52.33 54.38 54.92 55.75 56.00 55.32 55.00 55.16 56.71 57.60 53.82
 2.  48.51 ----- 68.44 51.30 51.75 44.53 46.03 45.63 46.10 46.53 46.04 45.61 47.48 48.35 46.54
 3.  47.32 68.44 ----- 52.49 53.69 43.93 45.36 45.31 45.62 45.42 44.93 45.44 46.97 46.78 45.91
 4.  53.12 51.30 52.49 ----- 71.11 54.33 55.54 54.81 56.06 55.29 55.06 55.41 54.92 57.71 53.56
 5.  52.33 51.75 53.69 71.11 ----- 53.30 54.99 53.74 54.71 54.45 53.65 54.75 53.92 56.55 52.76
 6.  54.38 44.53 43.93 54.33 53.30 ----- 67.01 59.48 65.06 67.84 68.66 63.22 60.33 61.51 57.48
 7.  54.92 46.03 45.36 55.54 54.99 67.01 ----- 60.37 64.20 64.25 64.13 64.54 62.01 62.99 58.24
 8.  55.75 45.63 45.31 54.81 53.74 59.48 60.37 ----- 67.30 65.32 64.72 65.25 62.19 68.52 58.76
 9.  56.00 46.10 45.62 56.06 54.71 65.06 64.20 67.30 ----- 73.61 72.51 69.66 65.11 68.57 60.34
10.  55.32 46.53 45.42 55.29 54.45 67.84 64.25 65.32 73.61 ----- 87.14 71.78 66.16 66.58 60.67
11.  55.00 46.04 44.93 55.06 53.65 68.66 64.13 64.72 72.51 87.14 ----- 72.07 64.91 65.96 59.86
12.  55.16 45.61 45.44 55.41 54.75 63.22 64.54 65.25 69.66 71.78 72.07 ----- 66.39 66.31 59.84
13.  56.71 47.48 46.97 54.92 53.92 60.33 62.01 62.19 65.11 66.16 64.91 66.39 ----- 65.19 59.59
14.  57.60 48.35 46.78 57.71 56.55 61.51 62.99 68.52 68.57 66.58 65.96 66.31 65.19 ----- 63.26
15.  53.82 46.54 45.91 53.56 52.76 57.48 58.24 58.76 60.34 60.67 59.86 59.84 59.59 63.26 -----
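The numbers above come from a move-matching test: each engine is run for a fixed time on a set of neutral positions, and the similarity score for a pair of engines is the percentage of positions on which both choose the same move. A minimal sketch of that idea (not the actual tool's code; real UCI engine interfacing is omitted, and the stand-in "engines" and position names below are invented for illustration):

```python
# Sketch of a move-matching similarity test: run two engines over the
# same position set and report the percentage of identical move choices.
# Real engine interfacing (UCI, fixed time per move) is omitted; the
# "engines" here are plain callables mapping a position to a move.

def similarity(engine_a, engine_b, positions):
    """Percentage of positions on which both engines pick the same move."""
    same = sum(1 for pos in positions if engine_a(pos) == engine_b(pos))
    return 100.0 * same / len(positions)

# Toy stand-ins: two 'engines' that agree on 3 of 4 positions.
positions = ["pos1", "pos2", "pos3", "pos4"]
engine_a = {"pos1": "e4", "pos2": "d4", "pos3": "Nf3", "pos4": "c4"}.get
engine_b = {"pos1": "e4", "pos2": "d4", "pos3": "Nf3", "pos4": "g3"}.get
print(similarity(engine_a, engine_b, positions))  # 75.0
```

An engine compared against itself scores 100, and two unrelated engines on a large neutral position set typically land in the 50s, which is why the 65+ scores in the matrix stand out.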
Uri's Challenge : TwinFish
Re: Uri's Challenge : TwinFish (Uri Blass)
If you tune based on the output of another engine, then I expect it to cause the code also to have a bigger similarity, because part of the code is the numbers in the various tables that many engines share, like piece-square tables, and I expect those numbers to be closer after you tune against another engine.
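Uri's claim, that tuning against another engine's output pulls shared tables toward that engine's numbers, can be illustrated with a toy fit. This is only a sketch with invented values, not anyone's actual tuning method: a single piece-square-table-style weight is adjusted by gradient descent to shrink the evaluation gap against a reference engine, and it converges to the reference engine's own internal value.

```python
# Toy illustration: tune one eval parameter (think: a piece-square-table
# entry) to minimize the squared difference between our evaluation and a
# reference engine's evaluation over sample positions.  All numbers are
# invented; the point is only that output-matching drags the parameter
# toward the reference engine's internal value.

def tune(reference_evals, features, value, lr=0.01, steps=2000):
    """Gradient descent on sum((value*f - ref)^2) over the samples."""
    n = len(features)
    for _ in range(steps):
        grad = sum(2.0 * (value * f - ref) * f
                   for ref, f in zip(reference_evals, features))
        value -= lr * grad / n
    return value

# The 'reference engine' secretly weights this feature at 30; tuning on
# its output recovers ~30, so the tables end up matching.
features = [1.0, 2.0, -1.0, 0.5]
reference = [30.0 * f for f in features]
print(round(tune(reference, features, value=10.0), 2))  # 30.0
```

Whether such convergence helps or, as Laskos argues, wrecks a structurally different eval is exactly what is in dispute; the sketch only shows why tuned numbers would end up looking alike.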
Re: Uri's Challenge : TwinFish (bob)
The PST values might be the same. Or everything could be multiplied by some odd number so that they match without looking the same. But PSTs are a tiny part of a program: search, evaluation, move generation, make/unmake, hashing, you name it. The programs would not have to look anything alike, IMHO. The only problem is that I am not willing to try something like that and waste all the time required. For example, the Rybka numbers don't look anything like Fruit's if you just look at the raw values; it required the work Zach did to see what actually happened, since Rybka used a really strange value for a pawn compared to Fruit, making all the numbers look different at first glance.
Re: Uri's Challenge : TwinFish (Rebel)
bob wrote:
If source comparison "can fail", then the similarity test is hopeless from the get-go, because source comparison is about 100x more accurate.

1. You are entitled to your (BTW circular) bold statement, but ever since some past cases and the controversy among programmers that followed, I would say source comparison can be the ultimate authority only given that all the programmers agree. If you can accept that not everybody agrees with you, then perhaps we can make progress.

2. OTOH, the similarity tester is unbiased: no emotional strings attached, no human errors, no tunnel vision, no like or dislike of persons, just cold numbers, and no false positives at 65+, which is the (tolerant) line I draw in the sand; so far I have been proven right after every source-code comparison.

3. If you realize what the similarity tester measures (see my post to Milos), then this should be obvious to an experienced programmer such as yourself with some basic understanding of statistics. False positives (AKA exceptions) do happen in an environment with millions of random variables; here we are dealing with just a couple of hundred engines.
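The scale of point 3 can be made concrete: with a couple of hundred engines there are tens of thousands of pairwise comparisons, so even a tiny per-pair false-positive rate predicts a few spurious matches. The 1-in-10,000 rate below is purely an assumed figure for illustration, not a measured property of the tester.

```python
from math import comb

# Expected number of spurious high-similarity matches among all
# pairings of a pool of engines.  The per-pair false-positive rate
# is an assumed illustration value, not a measured one.
engines = 200
pairs = comb(engines, 2)          # C(200, 2) = 19900 pairings
fp_rate = 1e-4                    # assumed chance a pair falsely scores 65+
expected = pairs * fp_rate
print(pairs, round(expected, 2))  # 19900 1.99
```

So the two positions are compatible: a very low per-pair error rate (Rebel's "no false positives" so far) still leaves a nonzero expected count over the whole pool (bob's "low != 0.0").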
Re: Uri's Challenge : TwinFish (Rebel)
bob wrote:
There is no such term as "statistically out of the question". It might have a low probability of happening, but low != 0.0. So there is absolutely room for a false positive, just as there is room for a false negative, as already shown.

And neither is there in court; but the accused goes to jail if there is a DNA match, which, as you know, is also not 100% reliable. In the end it is a matter of statistics.
Re: Uri's Challenge : TwinFish (bob)
My only point was that both false positives and false negatives will occur, which is THE reason this cannot be considered "proof" of either innocence or copying. It can be used as a filter: pass the test and chances are pretty good the program is original; fail it and chances are pretty good the program is a derivative. But that is really all. If you run as many tests as I run in a year, even a 30K-game match, which produces an error bar of +/-4 Elo, will occasionally produce a bad result. I had one a couple of weeks ago that puzzled me (a simple change dropped the Elo by 13). I couldn't see anything wrong, so I re-ran the 30K-game match three times, and all three came back within what I expected: back to normal rather than that odd drop.

It doesn't happen often, but it happens often enough to show that this IS statistical in nature: a 95% confidence interval still has a 1-in-20 chance of breaking. For the test I ran, an unchanged Elo was expected. Most of the time on such tests I get what I expect, just a validation that I didn't break something. Occasionally a change has an unexpected side effect that drops the Elo significantly (as above), or on occasion shows an unexpected gain; those get a further look. And on occasion the test turns out to be a statistical anomaly. Most of the time not.

Ergo, positives can be used to trigger further investigation, and negatives can (with more risk) be used to avoid triggering further digging; either can be wrong with a probability much greater than zero. A code comparison, by contrast, is as accurate as one chooses to make it; there is no statistical analysis involved.
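The +/-4 Elo error bar on a 30,000-game match follows from ordinary sampling statistics. A sketch of that arithmetic, assuming a per-game score standard deviation of about 0.5 (an upper bound; heavy draw rates shrink it) and the usual 95% normal interval:

```python
import math

# 95% error bar on measured Elo after N games, via the normal
# approximation.  Near a 50% score the Elo curve
# elo = -400*log10(1/s - 1) has slope 400 / (ln(10) * s * (1-s))
# with s = 0.5, i.e. roughly 695 Elo per unit of score.
# sigma = 0.5 per game is an assumed upper bound on the score spread.

def elo_error_bar(games, sigma=0.5, z=1.96):
    se_score = sigma / math.sqrt(games)          # std. error of mean score
    slope = 400.0 / (math.log(10) * 0.5 * 0.5)   # Elo per unit score at 50%
    return z * se_score * slope

print(round(elo_error_bar(30000), 1))  # 3.9 -- matching the quoted +/-4
```

Since the bar shrinks only with the square root of the game count, quadrupling the games merely halves it, which is why occasional out-of-bounds results are unavoidable at any practical match length.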
Re: Uri's Challenge : TwinFish (bob)
Quite a bit of difference between a simtest with a 95% confidence interval and a DNA test that is usually quoted as a 1-in-100,000,000 chance of being wrong (depending on the number of matching markers): 5 out of 100 is way bigger than 1 out of 100 million. But even DNA, by itself, is not an automatic conviction; there has to be other evidence to go along with it. Just proving I was somewhere does not, by itself, prove I committed a crime.
Re: Uri's Challenge : TwinFish (Uri Blass)
1) I do not think we get 5 wrong out of 100 when the similarity test shows more than 65% similarity; the probability is significantly smaller.

2) The probability of a wrongful conviction is clearly bigger than 1 in 100,000,000, based on what I know about the Innocence Project.
Re: Uri's Challenge : TwinFish (bob)
You are not following. The typical confidence interval used in computer testing is 95%; BayesElo, for example.

The 1 in 100,000,000 is a common statistic representing the probability of a false DNA match, nothing to do with convictions or anything else. They have DNA from a suspect and DNA from the crime scene (rape is by far the most common case), and the results come back with a 1-in-100,000,000 chance of a false match.

It seems those estimates are flawed after lots of additional investigation: there are several cases of two different people matching at the common "9 loci" level that is frequently used, and several cases with a match at 10 loci, etc. Still far better than the usual 95% confidence interval we use to establish a program's Elo.

That is the problem with a statistical answer: there is no 100% accurate answer, only some error bar that is considered acceptable. I don't consider the similarity tester bad, but I also do not consider it "proof" of anything whatsoever. It is just a suggestion, which some take with more weight than others. There is a danger in beginning to believe it is nearly perfect: nobody knows how many false matches there can be, but to assume there are none is certainly a bit off the wall.
Re: Uri's Challenge : TwinFish (Guenther Simon)
Does someone still have the source or a binary of it? (Wayback and web search no longer return anything for this.)

Tennison wrote (Fri Jan 31, 2014 9:46 am):

lucasart wrote:
I can make a few trivial changes to Stockfish and pass the similarity tests, any day!

Uri Blass wrote:
Then please do it and release the source. It may be interesting to know how much Elo you lose for it, and whether the engine you get is stronger than DiscoCheck (note that 60% is not enough; you need similarity smaller than 55%).

TwinFish 0.07

The similarity is less than 55%, and the Elo loss is only about 70-80.

This version of TwinFish is based on Stockfish dev 14 01 29 6:02PM (TimeStamp: 1391014933).

The only changes made to reach "<55%" similarity are completely asymmetric PSTs (based on Adam Hair's values). If you want to see the changes, just search for "Robber" in the source files.

There is only the source code, no binary. If someone wants to compile good binaries, that would be nice.

Don't forget: this version is only a joke, and I didn't steal Stockfish! ;-)

I'm very interested to see the result in the similarity dendrogram now!

TwinFish 0.07 is more related to Toga Hair than to Stockfish with Don's similarity tester, and there is no code from Toga in it!!! ;-)
I would like to do some experiments with it.
Thanks.