Houdini 3 running for the IPON

M ANSARI
Posts: 3726
Joined: Thu Mar 16, 2006 7:10 pm

Re: Houdini 3 running for the IPON

Post by M ANSARI »

IWB wrote:
Now it is even worse: 65 Elo plus from H2 to H3!

Houdini wrote:
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert

IWB wrote:
No no, I meant worse!

I do not envy the success, but we had a situation like this for 5 years and it was boring! Until 2 days ago we had a No.1 with one or two strong contenders very close behind, and that was interesting... basically there is now just one engine to use.

It is not your fault, but I am more worried about computer chess overall.

In other words, and with all respect for you, your engine and the achievement: I do not mind the No.1 (I am not one of those fanboys), but I do mind the competition. If we had a No.1 engine 200 Elo weaker than yours but a competition between 5 engines, I would be happy :-)

Anyway, congrats again, nice jump for your engine.

Bye
Ingo

S.Taylor wrote:
I MUCH prefer there being only one engine that I would want to use.
I waited for this many years.

Then came the Rybkas and now the Houdinis.

For some reason, I didn't enjoy everything about Rybka, but if Houdini holds on to a clear edge in all aspects of the game, then I am quite happy about it.

If another program makes Houdini look bad and shows up its weaknesses, then I would feel unsettled until I have a clear thing.

But I already had the feeling that Houdini is good enough, and that now I can, at long last, get satisfaction from its opinions on all positions I like to speculate about.

If another one goes WELL OVER that, then I would get that one instead.

But I don't enjoy the constant closeness between so many programs.

Uri Blass wrote:
Houdini certainly has weaknesses, and it is not better than everything in every type of position.

The same was true of Rybka.
If people think that there is basically only one engine to use just because that engine scores at least 60% against everything, then they are clearly wrong.

I don't know. I am running an H3 match against Rybka 4.1 on 8 cores and things look pretty impressive. Houdini did initially have some weaknesses, most notably an overly high score for a queen and an incorrect assumption that all rook endgames are drawn. These two things seem to have been corrected, and judging by some of the games played, the engine looks extremely fast and efficient, with a very good evaluation and excellent time management. I have not looked closely at the games yet, but so far I couldn't see any weaknesses in evaluation. At the moment it is scoring more than 70 Elo points above Rybka 4.1, on a platform where Rybka 4.1 was extremely strong.
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Houdini 3 running for the IPON

Post by Dr.Wael Deeb »

Laskos wrote:
+42 now, after 1,000 games. Don't forget that the final result is calculated with Bayeselo and could differ from the value against the average (as it is estimated now). And H3 performs ~80 points above K5 in the direct match. If these numbers hold, I don't see Komodo "catching in a few months".

Houdini wrote:
Currently 3083 (+46) after 1088 games.
I would be disappointed with less than 40 Elo improvement for Houdini 3 in IPON.
Either way, we're very close to the 50 Elo gain I "officially" announced. You cannot expect any more precision from any rating list, nor from my own development testing gauntlets (which measured 50 to 55 Elo).

Robert

Dr.Wael Deeb wrote:
Many moons ago I predicted +40 Elo at best and it seems that my prediction is turning into reality...
Dr.D

Waschbaer wrote:
IWB (Ingo Bauer) wrote:
Hi,

H3 after 2700 games is finished. The final result is online at http://www.inwoba.de.

+62 Elo over H2! (72 with Elostat!).

Unbelievable.

Hm ...

You were wrong, Mr. Dr.D.

THAT is the reality, take it easy

That's Dr.D Wittwar....

Indeed it's an unbelievable fact...

But remember, we have yet to see the results of Houdini 3 under long time controls...

IPON is fine, but I want to see games at longer time controls, where my prediction of around 40 Elo improvement will translate to reality...

Nevertheless, Houdini 3 is a real monster. Regards,
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Houdini 3 running for the IPON

Post by Dr.Wael Deeb »

Thanks Majd for sharing your impressions with us.....

Best regards,
Dr.D
Uri Blass
Posts: 10893
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Houdini 3 running for the IPON

Post by Uri Blass »

Around 40 Elo is not going to happen (except maybe at very fast time controls).
Initial CEGT results suggest a 70 Elo improvement relative to Houdini 2,
and they use 40/20 and 4 CPUs.


Houdini 2.0c x64 4CPU = 3086

Houdini 3.0 x64 4CPU against

Critter 1.6 x64 4CPU 3137
Stockfish 2.3 x64 4CPU 3155
Equinox 1.50 x64 4CPU 3179
Komodo 4.0 x64 1CPU 3170
...looks around +70 too

source
http://cegt.siteboard.eu/f5t411-coordin ... ini-3.html
pohl4711
Posts: 2807
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: Houdini 3 running for the IPON

Post by pohl4711 »

Houdini wrote: Stefan, thanks for the testing.
Not that it's really important, but the jump from 62.0% (+85 Elo) to 68.2% (+133 Elo) is 48 Elo.

The "1% = 7 Elo" rule is only valid up to about 60% performance, see for example the Elo table http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html which I used for the numbers above.

Robert
Yes, that's correct. I used 1% = 7 Elo, which is too simple for such a strong engine. By the way: at the moment 4100 of 10000 games have been played and the score is now 67.9% (= +45 Elo improvement). Final result on Sunday evening or Monday morning, stay tuned.

Best - Stefan
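
For anyone who wants to check the numbers above, here is a minimal sketch of both conversions. It assumes the linked Elo table follows the usual logistic formula, D = -400 * log10(1/p - 1), which does reproduce the +85 and +133 quoted by Robert; the function names are made up for illustration only.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_linear_rule(p: float) -> float:
    """The '1% = 7 Elo' shortcut, measured from a 50% score."""
    return 7.0 * (p - 0.5) * 100.0

# Scores mentioned in the posts above: 62.0%, 67.9% and 68.2%.
for p in (0.620, 0.679, 0.682):
    print(f"{p:.1%}: logistic {elo_from_score(p):+.1f}, "
          f"linear {elo_linear_rule(p):+.1f}")

# Size of the jump from 62.0% to 68.2% under each rule.
jump_logistic = elo_from_score(0.682) - elo_from_score(0.620)
jump_linear = elo_linear_rule(0.682) - elo_linear_rule(0.620)
print(f"jump: logistic {jump_logistic:.1f}, linear {jump_linear:.1f}")
```

The linear shortcut stays close to the logistic value near 60% but falls behind as the score climbs, which is why it understates the jump at these percentages.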
Wolfgang
Posts: 989
Joined: Sat May 13, 2006 1:08 am

Re: Houdini 3 running for the IPON

Post by Wolfgang »

Hi Uri,
Uri Blass wrote:....
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.
"40/20" is a theoretical value, because on our 40/20 list the real time control is adapted to hardware power by a benchmark. Real 40/20 would only be played on an AMD-X2-4200 @ 2.4 GHz (our hardware reference), but the hardware used by Werner and Johan is much faster. IIRC they play at 40/10 (Werner) and 40/8 (?!, Johan), but I don't know exactly.

This is slower than IPON but still far away from "real slow" time controls.

These will be played at:

- CEGT 40/20 with PB = on (starts as soon as Critter 1.6 test is finished, approx. beginning/middle next week)
- CEGT 40/120 (starts this evening)
- CCRL 40/40

But this will take some time.

And don't forget: only 140 (!!) games have been played so far! By next weekend we will have relatively reliable ratings at least for 40/3 and hopefully for 40/"20" too.
Best
Wolfgang
CEGT-Team
www.cegt.net
www.cegt.forumieren.com
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Houdini 3 running for the IPON

Post by IWB »

The IPON-RRRL is updated as well.

85 Elo now from the first to the second place!

http://www.inwoba.de

Bye
Ingo
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Houdini 3 running for the IPON

Post by lkaufman »

OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.
Wolfgang
Posts: 989
Joined: Sat May 13, 2006 1:08 am

Re: Houdini 3 running for the IPON

Post by Wolfgang »

lkaufman wrote: OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.
That's correct: our 40/20 PB=off (which is effectively 40/10 or less) is rather close to IPON. But our 40/20 PB=on is played without adapting, so it is effectively 40/20 and is ~5 times slower than IPON. Mean game length is around 80 minutes, versus 16 minutes for IPON. Hope to have first results next week.

Our 40/120 results will take some time, as on one quad a 50-game match lasts 3-4 days when the Shredder Classic interface is started four times (12 to 16 games per day). So it will take at least one month to have 400-500 games.
Best
Wolfgang
CEGT-Team
www.cegt.net
www.cegt.forumieren.com
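
The timing estimates above can be checked with a few lines of arithmetic. This is only a sketch: it assumes a single quad running continuously at the quoted 12 to 16 games per day, and the helper name is invented for illustration.

```python
def days_needed(total_games: int, games_per_day: float) -> float:
    """Days to finish total_games at a steady games_per_day rate."""
    return total_games / games_per_day

# Mean game length: about 80 minutes (CEGT 40/20 PB=on) vs 16 minutes (IPON).
slowdown = 80 / 16

# 40/120 on one quad: best case is 400 games at 16 per day,
# worst case is 500 games at 12 per day.
best_case = days_needed(400, 16)
worst_case = days_needed(500, 12)

print(f"slowdown vs IPON: {slowdown:.0f}x")
print(f"40/120 duration: {best_case:.0f} to {worst_case:.0f} days")
```

That gives roughly 25 to 42 days, in line with the "at least one month" estimate.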
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: Houdini 3 running for the IPON

Post by beram »

And these are the individual statistics (thx Ingo).
Wow... what else can be said?

Individual statistics:

1 Houdini 3 STD             : 3086  3150 (+2146,=818,-186), 81.1 %

Deep Junior 13.3              : 150 (+116,= 28,-  6), 86.7 %
Deep Fritz 13 32b             : 150 (+103,= 37,- 10), 81.0 %
spark-1.0                     : 150 (+116,= 28,-  6), 86.7 %
Zappa Mexico II               : 150 (+127,= 19,-  4), 91.0 %
Stockfish 2.2.2 JA            : 150 (+ 75,= 59,- 16), 69.7 %
Spike 1.4 32b                 : 150 (+125,= 20,-  5), 90.0 %
Quazar 0.4                    : 150 (+123,= 19,-  8), 88.3 %
Protector 1.4.0               : 150 (+111,= 34,-  5), 85.3 %
Naum 4.2                      : 150 (+107,= 37,-  6), 83.7 %
MinkoChess 1.3                : 150 (+135,= 12,-  3), 94.0 %
Hannibal 1.2                  : 150 (+114,= 29,-  7), 85.7 %
Gull 1.2                      : 150 (+119,= 29,-  2), 89.0 %
Deep Sjeng c't 2010 32b       : 150 (+124,= 21,-  5), 89.7 %
Deep Shredder 12              : 150 (+108,= 32,- 10), 82.7 %
Deep Rybka 4.1                : 150 (+ 70,= 63,- 17), 67.7 %
Critter 1.6a                  : 150 (+ 56,= 80,- 14), 64.0 %
Critter 1.4a                  : 150 (+ 57,= 73,- 20), 62.3 %
Chiron 1.1a                   : 150 (+117,= 29,-  4), 87.7 %
Komodo 5                      : 150 (+ 61,= 69,- 20), 63.7 %
HIARCS 14 WCSC 32b            : 150 (+108,= 34,-  8), 83.3 %
Stockfish 2.3.1 JA            : 150 (+ 74,= 66,- 10), 71.3 %
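
The percentages in the table follow from the win/draw/loss counts, with a draw counted as half a point. Here is a minimal sketch that recomputes a few rows copied from the table; the function name and the dictionary layout are illustrative choices, not part of the IPON output.

```python
def score_pct(wins: int, draws: int, losses: int) -> float:
    """Score percentage, counting each draw as half a point."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games

# (wins, draws, losses) from Houdini 3's point of view, taken from the table.
rows = {
    "Total (3150 games)": (2146, 818, 186),
    "Deep Rybka 4.1": (70, 63, 17),
    "Critter 1.6a": (56, 80, 14),
    "Komodo 5": (61, 69, 20),
}

for name, (w, d, l) in rows.items():
    print(f"{name:<20}: {w + d + l:4d} games, {score_pct(w, d, l):.1f} %")
```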