Houdini wrote:
Hello Brent,
...
Why are you surprised or "quite shocked" to find a 60-40 that lies well within the 95% confidence interval of the expected outcome?
Robert
Hi, Robert. I've tested your engines many times, formally for the Power Chess Project, the CCRL (almost all of the blitz 6CPU & more), for my own fun, and for the Westport Chess Club. I've had various versions of Critter play your versions of Houdini many times as well. I've never had a Critter x.x vs Houdini x.x match end with this large a difference in the final score. I knew it was statistically possible, but I had never seen it. Your engine dominated at that quicker time control. Right now, I am re-running the match with a 10' + 10" control, and after 78 games, Critter 1.4a is scoring +18/=33/-27 vs Houdini 2.0c, which is, in my experience, more like a "usual" Critter-Houdini score.
Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match results
Moderators: hgm, Rebel, chrisw
-
- Posts: 117
- Joined: Fri Mar 25, 2011 10:40 pm
- Location: USA
Re: Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
Hello Brent, you should expect this kind of 60-40 result about once every 10 matches (it's at about the 90% confidence level). At the other end of the scale you'll also find cases in which Critter ties with or even beats Houdini in a 100-game match.
From your blog post you appeared to be quite upset with the result (up to the point that you questioned the validity of your setup) but there's no reason to be so - it's all just part of the large variability of chess engine matches.
Thanks for playing the matches and sharing the results!
Robert
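For readers who want to check this kind of claim themselves, here is a rough sketch of the arithmetic. It is not Robert's calculation: the expected score (55%) and draw rate (50%) are assumed values picked only for illustration, the function name is hypothetical, and it treats games as independent and uses a normal approximation.

```python
import math

def match_score_spread(n_games, expected_score, draw_rate, z=1.96):
    """Return (mean, sd, low, high) of one side's total points in a match.

    expected_score and draw_rate are per-game assumptions supplied by the
    caller; they are NOT measured values from this thread.
    """
    # Per-game results are 1, 0.5 or 0. With mean mu and draw probability d,
    # P(win) = mu - d/2, so Var = P(win) + d/4 - mu^2 = mu*(1 - mu) - d/4.
    win_rate = expected_score - draw_rate / 2
    if win_rate < 0 or win_rate + draw_rate > 1:
        raise ValueError("expected_score and draw_rate are inconsistent")
    per_game_var = expected_score * (1 - expected_score) - draw_rate / 4
    mean = n_games * expected_score
    sd = math.sqrt(n_games * per_game_var)
    return mean, sd, mean - z * sd, mean + z * sd

# Assumed inputs: 100 games, 55% expected score, 50% draws.
mean, sd, low, high = match_score_spread(100, 0.55, 0.50)

# One-sided chance of scoring 60 points or more (normal approximation).
z_obs = (60 - mean) / sd
tail = 0.5 * math.erfc(z_obs / math.sqrt(2))

print(f"expected {mean:.1f} points, sd {sd:.2f}")
print(f"95% range: {low:.1f} to {high:.1f}")
print(f"P(score >= 60) ~ {tail:.3f}")
```

With these assumed inputs a 60-40 score falls inside the 95% range. Different assumptions about the expected score or the draw rate will move the numbers.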
-
- Posts: 117
- Joined: Fri Mar 25, 2011 10:40 pm
- Location: USA
Re: Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
Houdini wrote:
Hello Brent, ...
From your blog post you appeared to be quite upset with the result (up to the point that you questioned the validity of your setup) ...
Robert
This result coincided with my first test of Critter 1.4a (I only test with 1.4 for the CCRL), so I wondered if it had something to do with that version of Critter. I was surprised, yes, but not upset. Nice talking with you; for future reference, however, it's pointless to question how someone feels about something. You can't argue with feelings.
Keep up the good work.
-
- Posts: 481
- Joined: Thu Apr 16, 2009 12:00 pm
- Location: Slovakia, EU
Re: Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
ivoryknight wrote:
This result coincided with my first test of Critter 1.4a (I only test with 1.4 for the CCRL), so I wondered if it had something to do with that version of Critter.
There were only very minor changes between versions 1.4 and 1.4a. They should perform almost identically (within ±3 Elo).
-
- Posts: 150
- Joined: Sat Mar 10, 2012 11:50 pm
- Location: USA
Re: Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
Thank you, Mr. Vida. Keep up the good work, as well.
Happy chessing!
-
- Posts: 880
- Joined: Mon Feb 15, 2010 6:43 am
Critter 1.4a x64 SSE4 vs Houdini 2.0c Pro x64 match resu
Hi Richard.
I recently ran 300 games at 1 min + 1 sec between 1.4 and 1.4a using the default settings. 1.4a won the match by 13 points; not sure if that is enough games to mean anything. I am sure that you have enough games of your own, but if you want me to post the games I will.
I noticed that you have the minimum split depth parameter. Is there some automated way to "tune" that to a particular PC, like what Houdini does? If not "auto", what is the best way for me to know what depth is best here without running full matches?
1: Critter_1.4a_64bit_sse4 156.5 / 300 +58 =197 -45
2: Critter_1.4_64bit_sse4 143.5 / 300 +45 =197 -58
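On whether 300 games is "enough to mean anything", one rough way to look at it is to turn the +58 =197 -45 result into an Elo difference with an error margin. The sketch below uses the standard logistic Elo formula, treats games as independent, and uses a normal approximation; the function names are hypothetical and this is not how any particular rating list computes its numbers.

```python
import math

def elo_from_score(p):
    """Elo difference implied by a score fraction p (logistic model)."""
    return 400 * math.log10(p / (1 - p))

def elo_estimate(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) from a win/draw/loss record."""
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n           # mean score per game
    var = (wins + 0.25 * draws) / n - p * p  # per-game variance of the score
    se = math.sqrt(var / n)                # standard error of the mean score
    return (elo_from_score(p),
            elo_from_score(p - z * se),
            elo_from_score(p + z * se))

# Critter 1.4a's record against 1.4 from the table above.
elo, elo_low, elo_high = elo_estimate(58, 197, 45)
print(f"Elo difference: {elo:+.1f} (95% range {elo_low:+.1f} to {elo_high:+.1f})")
```

Under these assumptions the 95% range includes zero, so a 13-point margin over 300 games does not by itself separate the two versions.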