Houdini 3 running for the IPON

M ANSARI · Post by **M ANSARI** » Wed Oct 17, 2012 10:28 am

Uri Blass wrote:
S.Taylor wrote:
IWB wrote:
Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse.
Thanks for running the test, I think it was really worth the while.

Robert
No no, I meant worse!

I do not envy the success, but we had a situation like this for 5 years and this was boring! Until 2 days ago we had a No. 1 with one or two strong contenders very close behind, and that was interesting... Basically there is now just one engine to use.

It is not your fault, but I am more worried about computer chess overall.

In other words, and with all respect for you, your engine and the achievement: I do not mind the No. 1 (I am not one of those fanboys), but I mind the competition. If we had a No. 1 engine that is 200 Elo worse than yours but a competition between 5 engines, I would be happy.

Anyway, congrats again, nice jump for your engine.

Bye
Ingo
I MUCH prefer there being only one engine that I would want to use.
I waited many years for this.

Then came the Rybkas and now the Houdinis.

For some reason I didn't enjoy everything about Rybka, but if Houdini holds on to a clear edge in all aspects of the game, then I am quite happy about it.

If another program makes Houdini look bad and shows up its weaknesses, then I would feel unsettled until I have a clear picture.

But I already had the feeling that Houdini is good enough, and that now I can at long last get satisfaction from its opinions on all the positions I like to speculate about.

If another one goes WELL OVER that, then I would get that one instead.

But I don't enjoy the constant closeness between so many programs.
Houdini certainly has weaknesses, and it is not better than everything in every type of position.

The same was true for Rybka.
If people think that there is basically only one engine to use just because that engine scores at least 60% against everything, then they are clearly wrong.

I don't know, I am running an H3 match against Rybka 4.1 on 8 cores and things look pretty impressive. Houdini did initially have some weaknesses, most notably an overly high score for a queen and the incorrect assumption that all rook endgames are drawn. These two things seem to have been corrected, and from looking at some of the games played, the engine looks extremely fast and efficient, with a very good evaluation and excellent time management. I have not looked closely at the games yet, but so far I couldn't see any weaknesses in evaluation. At the moment it is scoring more than 70 Elo points over Rybka 4.1 on a platform where Rybka 4.1 was extremely strong.

Dr.Wael Deeb · Post by **Dr.Wael Deeb** » Wed Oct 17, 2012 10:32 am

Waschbaer wrote:
Dr.Wael Deeb wrote:
Houdini wrote:
Laskos wrote: +42 now, after 1,000 games. Don't forget that the final result is calculated with Bayeselo and could be off the value against the average (as it is estimated now). And H3 performs ~80 points above K5 in the direct match. If these numbers hold, I don't see Komodo "catching up in a few months".
Currently 3083 (+46) after 1088 games.
I would be disappointed with less than 40 Elo improvement for Houdini 3 in IPON.
Either way, we're very close to the 50 Elo gain I "officially" announced; you cannot expect any more precision from any rating list or from my own development testing gauntlets (which measured 50 to 55 Elo).

Robert
Many moons ago I predicted +40 Elo at best, and it seems that my prediction is turning into reality...
Dr.D
IWB (Ingo Bauer) wrote:
Hi,

H3 after 2700 games is finished. The final result is online at http://www.inwoba.de.

+62 Elo over H2! (72 with Elostat!).

Unbelievable.

Hm ...

You were wrong, Mr. Dr.D.

THAT is the reality, take it easy

That's Dr.D Wittwar....

Indeed it's an unbelievable fact........

But remember, we have yet to see the results of Houdini 3 under long time controls...

IPON is fine, but I want to see games at longer time controls, where my prediction of around 40 Elo improvement will turn into reality...

Nevertheless, Houdini 3 is a real monster regards,
Dr.D

Dr.Wael Deeb · Post by **Dr.Wael Deeb** » Wed Oct 17, 2012 10:43 am

M ANSARI wrote:
I don't know, I am running an H3 match against Rybka 4.1 on 8 cores and things look pretty impressive. Houdini did initially have some weaknesses, most notably an overly high score for a queen and the incorrect assumption that all rook endgames are drawn. These two things seem to have been corrected, and from looking at some of the games played, the engine looks extremely fast and efficient, with a very good evaluation and excellent time management. I have not looked closely at the games yet, but so far I couldn't see any weaknesses in evaluation. At the moment it is scoring more than 70 Elo points over Rybka 4.1 on a platform where Rybka 4.1 was extremely strong.

Thanks Majd for sharing your impressions with us.....

Best regards,
Dr.D

Uri Blass · Post by **Uri Blass** » Wed Oct 17, 2012 11:35 am

Dr.Wael Deeb wrote:
That's Dr.D Wittwar....

Indeed it's an unbelievable fact........

But remember, we have yet to see the results of Houdini 3 under long time controls...

IPON is fine, but I want to see games at longer time controls, where my prediction of around 40 Elo improvement will turn into reality...

Nevertheless, Houdini 3 is a real monster regards,
Dr.D

Around 40 Elo is not going to happen (except maybe at very fast time controls).
Initial CEGT results suggest a 70 Elo improvement relative to Houdini 2,
and they use 40/20 and 4 CPUs.

Houdini 2.0c x64 4CPU = 3086

Houdini 3.0 x64 4CPU against

Critter 1.6 x64 4CPU 3137
Stockfish 2.3 x64 4CPU 3155
Equinox 1.50 x64 4CPU 3179
Komodo 4.0 x64 1CPU 3170
...looks around +70 too

source
http://cegt.siteboard.eu/f5t411-coordin ... ini-3.html

pohl4711 · Post by **pohl4711** » Wed Oct 17, 2012 12:00 pm

Houdini wrote: Stefan, thanks for the testing.
Not that it's really important, but the jump from 62.0% (+85 Elo) to 68.2% (+133 Elo) is 48 Elo.

The "1% = 7 Elo" rule is only valid up to about 60% performance, see for example the Elo table http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html which I used for the numbers above.

Robert

Yes, that's correct. I use 1% = 7 Elo - that's too simple for such a strong engine. By the way: at the moment 4100 of 10000 games have been played and the score is now 67.9% (= +45 Elo). Final result on Sunday evening or Monday morning - stay tuned.

Best - Stefan
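
For anyone who wants to check the percentages against Elo without the table, here is a minimal sketch. It assumes the usual logistic Elo model (score = 1 / (1 + 10^(-d/400))), which reproduces the figures quoted above when rounded to whole Elo; the linked table may round or be built slightly differently. The function names are made up for this illustration.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def elo_linear(score: float) -> float:
    """The '1% = 7 Elo' rule of thumb, measured from a 50% score."""
    return 7.0 * (score - 0.5) * 100.0


if __name__ == "__main__":
    # Percentages mentioned in the posts above, plus two lower ones for contrast.
    for pct in (55.0, 60.0, 62.0, 67.9, 68.2):
        s = pct / 100.0
        print(f"{pct:5.1f}%  logistic {elo_from_score(s):6.1f}  "
              f"1%=7 rule {elo_linear(s):6.1f}")

    # Table-style lookup: round each value first, then take the difference.
    before = round(elo_from_score(0.620))
    after = round(elo_from_score(0.682))
    print("gain from 62.0% to 68.2%:", after - before, "Elo")
```

The two columns stay close up to about 60% and drift apart above it, which is the point Robert makes about the rule of thumb.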

Wolfgang · Post by **Wolfgang** » Wed Oct 17, 2012 5:18 pm

Hi Uri,

Uri Blass wrote:....
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.

"40/20" is a theoretical value, because on our 40/20 list the real time control is adapted to hardware power by a benchmark. Real 40/20 would only be played on an AMD X2 4200 @ 2.4 GHz (our hardware reference), but the hardware used by Werner and Johan is much faster. IIRC they play with 40/10 (Werner) and 40/8 (?!, Johan), but I don't know exactly.

This is slower than IPON but still far away from "real slow" time controls.

These will be played at:

- CEGT 40/20 with PB = on (starts as soon as Critter 1.6 test is finished, approx. beginning/middle next week)
- CEGT 40/120 (starts this evening)
and CCRL 40/40

But this will take some time.

and don't forget: only 140 (!!) games played so far! Next weekend we will have relatively reliable ratings at least for 40/3 and hopefully for 40/"20" too.

IWB · Post by **IWB** » Wed Oct 17, 2012 5:20 pm

The IPON-RRRL is updated as well.

85 Elo now from the first to the second place!

http://www.inwoba.de

Bye
Ingo

lkaufman · Post by **lkaufman** » Wed Oct 17, 2012 5:41 pm

Wolfgang wrote:Hi Uri,

Uri Blass wrote:....
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.
"40/20" is a theoretical value, because on our 40/20 list the real time control is adapted to hardware power by a benchmark. Real 40/20 would only be played on an AMD X2 4200 @ 2.4 GHz (our hardware reference), but the hardware used by Werner and Johan is much faster. IIRC they play with 40/10 (Werner) and 40/8 (?!, Johan), but I don't know exactly.

This is slower than IPON but still far away from "real slow" time controls.

These will be played at:

- CEGT 40/20 with PB = on (starts as soon as Critter 1.6 test is finished, approx. beginning/middle next week)
- CEGT 40/120 (starts this evening)
and CCRL 40/40

But this will take some time.

and don't forget: only 140 (!!) games played so far! Next weekend we will have relatively reliable ratings at least for 40/3 and hopefully for 40/"20" too.

OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.

Wolfgang · Post by **Wolfgang** » Wed Oct 17, 2012 6:03 pm

lkaufman wrote: OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.

That's correct: our 40/20 PB=off (which is effectively 40/10 or less) is rather close to IPON. But our 40/20 PB=on is played without adapting, so it is effectively 40/20 and about 5 times slower than IPON: mean game length is around 80 minutes, versus 16 minutes for IPON. I hope to have first results next week.

Our 40/120 results will take some time, as on one quad a 50-game match lasts 3-4 days when the Shredder Classic interface is started four times (12 to 16 games per day). So it will take at least one month to have 400-500 games.
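
The arithmetic behind these estimates can be checked with a few lines. This is only a rough sketch using the numbers given in this post (12 to 16 games per day on one quad, 80 versus 16 minutes mean game length); the helper name is made up and nothing here reflects how CEGT actually schedules its games.

```python
def days_needed(target_games: int, games_per_day: float) -> float:
    """Days required to finish target_games at a constant daily rate."""
    return target_games / games_per_day


if __name__ == "__main__":
    # 40/120 on one quad: 12 to 16 games per day, as stated in the post.
    for target in (400, 500):
        fastest = days_needed(target, 16)
        slowest = days_needed(target, 12)
        print(f"{target} games: {fastest:.0f} to {slowest:.0f} days")

    # Mean game length: 40/20 PB=on versus IPON, in minutes.
    print("slowdown factor vs IPON:", 80 / 16)
```

So 400 games at the faster rate take a bit under a month, and 500 games at the slower rate take closer to six weeks.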

beram · Post by **beram** » Wed Oct 17, 2012 6:16 pm

and these are the individual statistics (Thx Ingo)
Wow.......what else can be said

Individual statistics:

1 Houdini 3 STD             : 3086  3150 (+2146,=818,-186), 81.1 %

Deep Junior 13.3              : 150 (+116,= 28,-  6), 86.7 %
Deep Fritz 13 32b             : 150 (+103,= 37,- 10), 81.0 %
spark-1.0                     : 150 (+116,= 28,-  6), 86.7 %
Zappa Mexico II               : 150 (+127,= 19,-  4), 91.0 %
Stockfish 2.2.2 JA            : 150 (+ 75,= 59,- 16), 69.7 %
Spike 1.4 32b                 : 150 (+125,= 20,-  5), 90.0 %
Quazar 0.4                    : 150 (+123,= 19,-  8), 88.3 %
Protector 1.4.0               : 150 (+111,= 34,-  5), 85.3 %
Naum 4.2                      : 150 (+107,= 37,-  6), 83.7 %
MinkoChess 1.3                : 150 (+135,= 12,-  3), 94.0 %
Hannibal 1.2                  : 150 (+114,= 29,-  7), 85.7 %
Gull 1.2                      : 150 (+119,= 29,-  2), 89.0 %
Deep Sjeng c't 2010 32b       : 150 (+124,= 21,-  5), 89.7 %
Deep Shredder 12              : 150 (+108,= 32,- 10), 82.7 %
Deep Rybka 4.1                : 150 (+ 70,= 63,- 17), 67.7 %
Critter 1.6a                  : 150 (+ 56,= 80,- 14), 64.0 %
Critter 1.4a                  : 150 (+ 57,= 73,- 20), 62.3 %
Chiron 1.1a                   : 150 (+117,= 29,-  4), 87.7 %
Komodo 5                      : 150 (+ 61,= 69,- 20), 63.7 %
HIARCS 14 WCSC 32b            : 150 (+108,= 34,-  8), 83.3 %
Stockfish 2.3.1 JA            : 150 (+ 74,= 66,- 10), 71.3 %
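
The percentage column can be reproduced from the win/draw/loss counts, counting a draw as half a point. A small sketch with a few rows copied from the table above; the function name is just for illustration.

```python
def score_percent(wins: int, draws: int, losses: int) -> float:
    """Score in percent, with a win worth 1 point and a draw worth 0.5."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games


if __name__ == "__main__":
    # (wins, draws, losses) taken from the individual statistics above.
    rows = {
        "Total": (2146, 818, 186),
        "Deep Rybka 4.1": (70, 63, 17),
        "Critter 1.6a": (56, 80, 14),
        "Komodo 5": (61, 69, 20),
    }
    for name, (w, d, l) in rows.items():
        print(f"{name:<16} {w + d + l:5d} games  {score_percent(w, d, l):5.1f} %")
```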
