100 long games Rybka 4 vs Houdini 1.03a


IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

100 long games Rybka 4 vs Houdini 1.03a

Post by IWB »

Hello all

In the recent discussion about Jeroen Noomen's 60-game match I was missing some "meat" behind the result: basically all the games and the full PGN with evals and times. So I decided to play my own 100-game set with the same time control. I will not comment on the result for a while so that those interested can judge for themselves. Nevertheless, I will comment tomorrow!

Conditions:

GUI: Shredder Classic
CPU: AMD Phenom II 6 core @ 3.120 GHz (this is approx. a 3GHz Q6600)
Threads: 3(!) (~equal to a 4 core Q6600 @ 2.4GHz)
Hash: 1GB each engine
Opening Positions: FQ 50 Position set
Colours: changing, every engine has to play White and Black, 100 games overall
Time control: 60 minutes per game for each side, played on 3 identical computers.
Ponder: ON (of course, I play real matches!)
Tablebases: 4-piece Nalimov (only Rybka can make use of them)
Etiquette: ON (three consecutive 0.00 evals from both engines is a draw; three evals in a row of -5.xx or worse from the losing engine is a loss!)
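
The etiquette rule above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this post, not Shredder Classic's actual implementation: the function name `adjudicate`, its parameters, and the assumption that each engine reports its eval in pawns from its own point of view are all hypothetical.

```python
def adjudicate(white_evals, black_evals, streak=3, resign_at=5.0):
    """Return '1/2-1/2', '1-0', '0-1', or None if the game should continue.

    white_evals / black_evals: evals in pawns reported by each engine so far,
    each from that engine's own point of view (assumption, not from the GUI docs).
    """
    if len(white_evals) < streak or len(black_evals) < streak:
        return None  # not enough moves played to apply the rule
    w = white_evals[-streak:]
    b = black_evals[-streak:]
    # Draw: both engines showed exactly 0.00 for the last `streak` moves.
    if all(e == 0.0 for e in w + b):
        return "1/2-1/2"
    # Loss: the losing engine itself showed -5.xx or worse `streak` times in a row.
    if all(e <= -resign_at for e in w):
        return "0-1"
    if all(e <= -resign_at for e in b):
        return "1-0"
    return None
```

Note that in this sketch only the losing engine's own evals trigger a loss, which is how I read "+/- 5.xx of the losing engine".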

Result: +23 =60 -17 in favor of Rybka 4

Download all games as PGN: http://www.mediafire.com/?9xmbb835fi7k61k

The games in the PGN file are listed in the order in which they finished. No engine crashed during the 100 games.

Have fun
Ingo
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by S.Taylor »

I think that at longer TCs, Rybka would score slightly better against Houdini than here.
Kurt Utzinger
Posts: 169
Joined: Sun May 11, 2008 10:31 pm
Location: Switzerland

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by Kurt Utzinger »

Hi Ingo
No surprise that Rybka remains the best engine at longer time controls.
Kurt
M ANSARI
Posts: 3707
Joined: Thu Mar 16, 2006 7:10 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by M ANSARI »

I think this is not a bad result for R4, as only 3 cores were used. If more cores per engine were used, with ponder ON and LTC, I would expect R4 to do better, as it seems to scale much better than non-Rybka engines.
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by IWB »

Hello
M ANSARI wrote:I think this is not a bad result for R4, as only 3 cores were used. If more cores per engine were used, with ponder ON and LTC, I would expect R4 to do better, as it seems to scale much better than non-Rybka engines.
Do you have any numbers, or is this just a belief?

The only real test I have seen with 4 cores and a significant number of games and opponents (but with ponder off) shows something different, even if it is not finished yet. (http://www.pcschach.de/ check the QBRL)

The other idea, that some engines gain so much more playing strength than average from more time that an extra Elo increase becomes visible, is not backed up by any available rating list. They all have more or less the same engines as neighbours ...
Actually, we do not have any data that shows even ONE engine outside its expected rating when you compare CEGT 40/4, IPON, SWCR, CEGT 40/20, CCRL 40/40 and SSDF. If you check the engines that appear in more than one list (and have a reasonable number of games), they ALL have the same neighbours (they might swap ranks if they are very close, but then the difference is within the error margin).

So there is no proof that any engine gains more from more time than its direct competitors! Actually, I think this is just an often-repeated opinion that arises because people see that humans play better with more time. (And with humans that really can make a difference!)

(There might be one exception, but this is a hypothesis: there is a certain lower time limit for a reasonable rating list. I don't know exactly where it is, but all the lists mentioned above use time controls long enough for modern computers. This lower limit of course does not mean that you can't see an Elo increase if you test a particular engine at very short time controls ...)

Bye
Ingo
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by IWB »

Hello Kurt,
Kurt Utzinger wrote:Hi Ingo
No surprise that Rybka remains the best engine at longer time controls.
Kurt
I strongly assume you have read my much longer comment in the CSS about this, but just to be sure: a 100-game match between two engines, where just 3 wins separate the result from an even score, cannot be enough to judge that ANY engine is, was or 'remains' better than another. This is just not enough data.
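
To put a rough number on "not enough data", here is a minimal sketch that turns the +23 =60 -17 result into an Elo estimate with an approximate 95% interval. It assumes the usual logistic Elo formula, a normal approximation, and independent games (which is not strictly true here, since each opening is played with both colours). The function names are mine, not from any rating tool.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) in Elo for a win/draw/loss record.

    z=1.96 gives an approximate 95% interval via the normal approximation.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    low, high = score - z * stderr, score + z * stderr
    return elo_diff(score), elo_diff(low), elo_diff(high)

if __name__ == "__main__":
    est, low, high = match_elo_interval(23, 60, 17)
    print(f"Elo estimate {est:+.0f}, approx. 95% interval [{low:+.0f}, {high:+.0f}]")
```

Under these assumptions the estimate is roughly 20 Elo for Rybka 4, but the interval extends below zero, so the match alone does not separate the two engines.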

Bye
Ingo
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by IWB »

Hi
S.Taylor wrote:I think that at longer TCs, Rybka would score slightly better against Houdini than here.
I think that in this field there is too much thinking and not enough data. I look at my list and see how often engines with a better rating lose small 100-game matches against a weaker opponent. That's why I don't think one can draw any conclusion from a 100-game match (and between just 2 opponents at that!).

Bye
Ingo
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by Dr.Wael Deeb »

IWB wrote:Hi
S.Taylor wrote:I think that at longer TCs, Rybka would score slightly better against Houdini than here.
I think that in this field there is too much thinking and not enough data. I look at my list and see how often engines with a better rating lose small 100-game matches against a weaker opponent. That's why I don't think one can draw any conclusion from a 100-game match (and between just 2 opponents at that!).

Bye
Ingo
Hi Ingo,
Totally agreed here... that's why I play tournaments with a lot of opponents facing each other in an RR (round-robin) system...
Dr.D
M ANSARI
Posts: 3707
Joined: Thu Mar 16, 2006 7:10 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by M ANSARI »

Hi Ingo,

Yes, I tested this very extensively during beta testing of R4, and it was a problem that we really couldn't figure out. R4 simply performs much better as you go up in hardware, and it became dramatically better on overclocked 8 cores with ponder ON. I was testing it against many engines, but was specifically interested in how it would do against the then-available R3 clones. To give you an idea, Rybka 4 would consistently score much higher on big hardware, to a point where it became very alarming (sometimes a 100 Elo difference between small and big hardware against identical opponents). I have a theory about this, but it is only my own theory: Rybka 4 has a lot of communication lag because it uses processes instead of threads. That seems to hurt performance at short time controls or when ponder is OFF, but this drop in performance seems to reverse when more cores are used, as the communication lag caused by processes is more than compensated for by better scalability. Again, this is only my own theory, and it is the only way all this makes sense to me.
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: 100 long games Rybka 4 vs Houdini 1.03a

Post by IWB »

Hello

1. Actually, I don't buy any of this.

Why? Because this would mean that in 2 years R4 will gain strength (maybe 100 Elo?) relative to the others just because of better hardware, even though that better hardware is available to the others as well.
There was no engine like this in the past, and there is no working theory of why this should happen in the future.
Today I have tested some games at long time control. The 3 cores are as fast as a C2Q at 3GHz, and there is basically no statistically relevant change compared to a 5+3 match on just one core, despite the fact that 60+0 on 4 cores is MUCH more 'sophisticated' than the one-core setup. So no data supports your statement!
Of course one can always say "more time, faster hardware ...", but what good is that if it is simply not practical (and unproven)?

What exactly is an "extensive test" on extremely fast hardware AND long time controls for you? How many thousands of games against different opponents did you play? Actually, I have learned (as a beta tester as well) that one has to be extremely careful not to find what one wants to find!

You as a beta tester claim something unlikely, under conditions which are impossible to check. I am beta testing as well, but whatever I find that others can't check I will not publish, simply because I would consider myself an untrustworthy source. I do not think that Rybka deserves that kind of dubious advertisement!

Let's conclude: none of the data we have from any of the sources mentioned above supports your claim, neither for Rybka nor for any other engine!

Bye
Ingo