When will we see HOUDINI in official tournaments?

Houdini · Post by **Houdini** » Sat May 05, 2012 11:18 am

neelbasant wrote:Robert, I have immense respect for you as a programmer (Houdini 1.5a).
But making a serious accusation against a respected person and questioning the originality of Critter is a waste of time.
Would anyone in the whole world, except you, agree that Critter is REed from Houdini 1.5a?

Poll " Critter is a REed engine from Houdini"

No.................1000000000000000000
Yes................1 (it is you)

Welcome to Talkchess.
Please read my answers on the German forum CSS http://forum.computerschach.de/cgi-bin/ ... l?tid=4623 in reply to a person making the same over-simplification as you did.

Robert

Rebel · Post by **Rebel** » Sat May 05, 2012 11:47 am

Houdini wrote:
michiguel wrote:I suggest to wind this Critter issue down in this thread. Mentioning the similarity is a fact, but everything else... we are walking on thin ice.

Miguel
The point of my post was not "accusing" Critter (I don't really care about that).
My point was about the "60% rule" which may prove not to be very useful for tournaments.

Maybe not for you because it shows the origin of Houdini.

RobboLito 0.085d1 vs Houdini 1.00   70.37%

Is your current version below the tolerated 60%?

Houdini · Post by **Houdini** » Sat May 05, 2012 11:57 am

Ed, I don't know and I don't care. Success of Houdini 3 will not be measured by its similarity score but by its 40+ Elo increase over the previous version.

Can you please return to the case at hand?
What is the relevance of the "60% similarity" rule you advocate, when the first tournament that is supposed to apply the rule accepts an engine that has a similarity score with Houdini 1.5a that is at 62%?

Robert

Graham Banks · Post by **Graham Banks** » Sat May 05, 2012 12:03 pm

Houdini wrote:Ed, I don't know and I don't care. Success of Houdini 3 will not be measured by its similarity score but by its 40+ Elo increase over the previous version.

Can you please return to the case at hand?
What is the relevance of the "60% similarity" rule you advocate, when the first tournament that is supposed to apply the rule accepts an engine that has a similarity score with Houdini 1.5a that is at 62%?

Robert

I think Ed was insinuating that perhaps Critter 1.5 will be participating and that it passes the test. Just a guess.

Rebel · Post by **Rebel** » Sat May 05, 2012 12:27 pm

Houdini wrote:Ed, I don't know and I don't care. Success of Houdini 3 will not be measured by its similarity score but by its 40+ Elo increase over the previous version.

It's important because people would like to see the strongest engines in the world to meet and compete.

Can you please return to the case at hand?
What is the relevance of the "60% similarity" rule you advocate, when the first tournament that is supposed to apply the rule accepts an engine that has a similarity score with Houdini 1.5a that is at 62%?

I am not advocating 60%; personally I would draw the line at 55% to be on the safe side. But I do expect the CSVN to stick to their own rules, and I expect the Critter version that will play to be below 60%.

Living in a post-ICGA-Rybka world, it is perhaps a way to pick up the broken pieces and move on. The REAL debate among programmers about this new development still has to start: to define a percentage everybody can live with and to make suggestions to improve the current system, because there certainly is room for improvement.

So Robert, this is not only about Houdini and Critter but about the CC community as a whole defining new rules for fair competition, rules that can endure the pressure that strong open sources (Ippolit especially, since it is freeware) have put upon the CC community since 2009.

FriedmannC · Post by **FriedmannC** » Sat May 05, 2012 12:27 pm

RobboLito 0.085d1 vs Houdini 1.00 70.37% strange indeed - I didn't know that the greatest magician's family name was IPPOLIT.....

Mike S. · Post by **Mike S.** » Sat May 05, 2012 12:42 pm

As someone who is not a scientific expert on the methodology of the similarity tool test, I would like to ask:

Is this method capable (enough) to spot differences of search code, not of evaluation only?

Especially the fact that the calculation times per position were varied (*) depending on the engines' strengths has led me to the assumption that differences in search quality and effectiveness cannot be found that way.

*) As found in the very comprehensive documentation on your website, thanks! Anyone interested in this matter should take a look at that, at least.

I mean, if "similarity" is simplified down to a percentage of X%, it suggests an "overall" identity of X%. But depending on the testing method, it could be that only some parts are X% similar while other parts, which the testing method may fail to spot, are much more different...

Rebel · Post by **Rebel** » Sat May 05, 2012 1:41 pm

Mike S. wrote:As someone who is not a scientific expert on the methodology of the similarity tool test, I would like to ask:

Is this method capable (enough) to spot differences of search code, not of evaluation only?

Especially the fact that the calculation times per position were varied (*) depending on the engines' strengths has led me to the assumption that differences in search quality and effectiveness cannot be found that way.

*) As found in the very comprehensive documentation on your website, thanks! Anyone interested in this matter should take a look at that, at least.

I mean, if "similarity" is simplified down to a percentage of X%, it suggests an "overall" identity of X%. But depending on the testing method, it could be that only some parts are X% similar while other parts, which the testing method may fail to spot, are much more different...

Adam has tried various time controls and the overall picture remains the same. In my own tests I deliberately use 0.1 second, just to do it differently from Adam and so test the reliability of the system, and here too the picture doesn't change much: small fluctuations of 1 to 1.5 percent.

One improvement I suggest is to run a second-opinion test based on a fixed depth in suspect cases. A fixed depth will limit the influence of the search even more; after all, the similarity software is meant to measure the similarity of the evaluation function.

Another way to test similarity is the ponder-hit system. Its obvious disadvantage is the greater influence of the search, as games (such as those from CCRL and CEGT) are played at much longer time controls than 0.1 second, and yet the ponder-hit system tells the same story. I am referring to the study by Kai Laskos here in CCC, later used in the Soren Riis article at Chessbase.
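
For readers unfamiliar with the idea, here is a minimal sketch of how a move-agreement percentage could be computed. It assumes that the similarity score is simply the share of test positions on which two engines choose the same move at the same time limit; the thread does not spell out the tool's exact formula, so the function name `similarity_percent` and the sample moves are hypothetical.

```python
def similarity_percent(moves_a, moves_b):
    """Percentage of positions on which both engines chose the same move.

    Assumption: each list holds one chosen move per test position,
    in the same position order for both engines.
    """
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be run on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    same = sum(1 for a, b in zip(moves_a, moves_b) if a == b)
    return 100.0 * same / len(moves_a)


# Made-up moves for four positions, not real engine output.
engine_a = ["e2e4", "g1f3", "d2d4", "c2c4"]
engine_b = ["e2e4", "g1f3", "d2d4", "b1c3"]
print(similarity_percent(engine_a, engine_b))  # 3 of 4 moves match: 75.0
```

Under this assumption, a fixed-depth run would only change how the move lists are produced, not how they are compared.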

IWB · Post by **IWB** » Sat May 05, 2012 1:42 pm

Hi,

Actually, the whole 60% rule and the similarity test for a tourney are flawed!

They have a Rybka on a cluster participating in that tourney. How will it be possible to really test THAT with the tool against a bunch of other engines? Whatever they get (if they get something) is for sure not the version which is playing!

According to the list, RobboLito 085 is allowed to participate, as the similarity to R3 is below 60% ... I guess VR would not like that.

What are the comparisons of R4/4.1 to Robo0.85, R9? I am missing those as well. Both Robos were released before R4 ...? Does anyone know?

I am sorry to say this, but this rule is rubbish and the CSVN opened the gate to hell with their Rybka decision. With their current reasoning they either have to allow every engine or close the tournament completely. Everything else is arbitrary!

Bye
Ingo

Rebel · Post by **Rebel** » Sat May 05, 2012 2:08 pm

Ingo, RobboLito is an Ippolit clone, thus not allowed. If an engine gives just one 60% hit with any other engine, that engine is not allowed to play.

1) Houdini 1.0 (time: 100 ms scale: 1.0)
2) Houdini 1.5 (time: 100 ms scale: 1.0)
3) IPPOLIT 0.080a (time: 100 ms scale: 1.0)
4) RobboLito 0.09 (time: 100 ms scale: 1.0)
5) Strelka 5 (time: 100 ms scale: 1.0)

        1     2     3     4     5
 1.  ----- 63.27 67.27 71.07 63.66
 2.  63.27 ----- 58.87 61.68 66.79
 3.  67.27 58.87 ----- 69.31 60.56
 4.  71.07 61.68 69.31 ----- 62.37
 5.  63.66 66.79 60.56 62.37 -----
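
The "one 60% hit" rule can be applied to this table mechanically. The sketch below copies the scores from the table above and assumes that a hit means a score of 60% or higher (whether exactly 60.00 counts is not stated in the thread); the helper names `hits` and `pairs_below` are made up for illustration.

```python
# Similarity scores (percent) copied from the table above.
ENGINES = ["Houdini 1.0", "Houdini 1.5", "IPPOLIT 0.080a",
           "RobboLito 0.09", "Strelka 5"]
SCORES = [
    [None, 63.27, 67.27, 71.07, 63.66],
    [63.27, None, 58.87, 61.68, 66.79],
    [67.27, 58.87, None, 69.31, 60.56],
    [71.07, 61.68, 69.31, None, 62.37],
    [63.66, 66.79, 60.56, 62.37, None],
]
THRESHOLD = 60.0  # assumption: a "hit" is a score >= 60%


def hits(scores, names, threshold=THRESHOLD):
    """Map each engine to the (other engine, score) pairs at or above threshold."""
    return {
        name: [(names[j], s) for j, s in enumerate(scores[i])
               if s is not None and s >= threshold]
        for i, name in enumerate(names)
    }


def pairs_below(scores, names, threshold=THRESHOLD):
    """List each unordered pair whose score is below the threshold."""
    return [(names[i], names[j], scores[i][j])
            for i in range(len(names))
            for j in range(i + 1, len(names))
            if scores[i][j] < threshold]


for engine, found in hits(SCORES, ENGINES).items():
    status = "excluded" if found else "allowed"
    print(f"{engine}: {len(found)} hit(s), {status}")

print(pairs_below(SCORES, ENGINES))
```

With these numbers every engine in the table has at least one hit, and the only pair below the threshold is Houdini 1.5 vs IPPOLIT 0.080a at 58.87.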
