Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match results

ivoryknight · Post by **ivoryknight** » Mon Mar 19, 2012 1:33 am

Houdini wrote:Hello Brent,
...
Why are you surprised or "quite shocked" to find a 60-40 that lies well within the 95% confidence interval of the expected outcome?

Robert

Hi, Robert. I've tested your engines many times: formally for the Power Chess Project and the CCRL (almost all of the blitz 6CPU & more), for my own fun, and for the Westport Chess Club. I've had various versions of Critter play your versions of Houdini many times as well. I've never had a Critter x.x vs Houdini x.x match end with this large a difference in the final score. I knew it was statistically possible, but I had never seen it. Your engine dominated at that quicker time control. Right now, I am re-running the match with a 10' + 10" control, and after 78 games, Critter 1.4a is scoring +18/=33/-27 vs Houdini 2.0c, which in my experience is more like a "usual" Critter-Houdini score.

Houdini · Post by **Houdini** » Mon Mar 19, 2012 12:38 pm

Hello Brent, you should expect this kind of 60-40 result about once every 10 matches (it's at about the 90% confidence level). At the other side of the scale you'll also find cases in which Critter ties with or even beats Houdini in a 100-game match.
From your blog post you appeared to be quite upset with the result (up to the point that you questioned the validity of your setup) but there's no reason to be so - it's all just part of the large variability of chess engine matches.
Thanks for playing the matches and sharing the results!

Robert
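
The variability Robert describes can be estimated with a rough sketch. The per-game win and draw probabilities below (30% wins, 50% draws, so an expected score of 55%) are hypothetical values chosen only for illustration; they are not figures from this thread. The function name is also made up, and the calculation assumes independent games and a normal approximation.

```python
import math

def prob_score_at_least(n_games, p_win, p_draw, threshold):
    """Approximate chance that the match score reaches `threshold` points.

    Assumes independent games and a normal approximation to the total score.
    """
    p_loss = 1.0 - p_win - p_draw
    mean = p_win + 0.5 * p_draw  # expected points per game
    # Per-game variance of a result worth 1, 0.5, or 0 points.
    var = (p_win * (1.0 - mean) ** 2
           + p_draw * (0.5 - mean) ** 2
           + p_loss * (0.0 - mean) ** 2)
    sd_total = math.sqrt(n_games * var)  # std. deviation of the match total
    z = (threshold - n_games * mean) / sd_total
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of the normal

# Hypothetical inputs: 100 games, 30% wins, 50% draws, 20% losses.
p = prob_score_at_least(100, 0.30, 0.50, 60)
print(f"P(score >= 60 of 100) is about {p:.3f}")
```

Under those assumed inputs the standard deviation of a 100-game match is 3.5 points, so a 60-40 result sits about 1.4 standard deviations above the expected 55-45. A different assumed strength gap or draw rate changes the answer noticeably.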

ivoryknight · Post by **ivoryknight** » Mon Mar 19, 2012 1:24 pm

Houdini wrote:Hello Brent, ...
From your blog post you appeared to be quite upset with the result (up to the point that you questioned the validity of your setup) ...

Robert

This result coincided with my first test of Critter 1.4a (I only test with 1.4 for the CCRL), so I wondered if it had something to do with that version of Critter. I was surprised, yes, but not upset. Nice talking with you; for future reference, however, it's pointless to question how someone feels about something. You can't argue w/ feelings.

Keep up the good work.

rvida · Post by **rvida** » Mon Mar 19, 2012 1:57 pm

ivoryknight wrote: This result coincided with my first test of Critter 1.4a (I only test with 1.4 for the CCRL), so I wondered if it had something to do with that version of Critter.

There were only very minor changes between versions 1.4 and 1.4a. They should perform almost identically (within ±3 Elo).

ATOMICC · Post by **ATOMICC** » Mon Mar 19, 2012 2:03 pm

Thank you, Mr. Vida. Keep up the good work, as well.

PawnStormZ · Post by **PawnStormZ** » Wed Mar 21, 2012 4:07 am

Hi Richard.
I recently ran 300 games at 1 min + 1 sec between 1.4 and 1.4a using the default settings. 1.4a won the match by 13 points; I'm not sure if that is enough games to mean anything. I am sure that you have enough games of your own, but if you want me to post the games I will.

I noticed that you have the minimum split depth parameter. Is there some automated way to "tune" that to a particular pc like what Houdini does? If not "auto", what is the best way for me to know what depth is best here without running full matches?

1: Critter_1.4a_64bit_sse4 156.5 / 300 +58 =197 -45
2: Critter_1.4_64bit_sse4 143.5 / 300 +45 =197 -58
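
Whether 300 games are "enough to mean anything" can be checked with a small sketch. It converts the +58 =197 -45 result into an Elo difference with an approximate 95% interval, assuming the usual logistic Elo model, independent games, and a normal approximation; the function names are illustrative only.

```python
import math

def score_to_elo(score):
    """Elo difference implied by an expected score under the logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_with_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result, with an approximate 95% interval."""
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n  # observed score fraction
    # Sample variance of the per-game result (1, 0.5, or 0 points).
    var = (wins * (1.0 - p) ** 2
           + draws * (0.5 - p) ** 2
           + losses * (0.0 - p) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the score fraction
    return score_to_elo(p), score_to_elo(p - z * se), score_to_elo(p + z * se)

# Critter 1.4a vs Critter 1.4, from the result lines above.
elo, low, high = elo_with_interval(58, 197, 45)
print(f"Elo difference: {elo:+.1f} (approx. 95% interval {low:+.1f} to {high:+.1f})")
```

With these numbers the interval spans zero, so under the stated assumptions the 13-point margin does not by itself show that either version is stronger.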
