IWB wrote:
Now it is even worse: 65 Elo plus from H2 to H3!

Houdini wrote:
You're supposed to say better, not worse.
Thanks for running the test, I think it was really worth the while.
Robert

IWB wrote:
No no, I meant worse!
I do not envy the success, but we had a situation like this for 5 years and this was boring! Until 2 days ago we had a No.1 with one or two strong contenders very close, that was interesting... basically there is now just one engine to use.
It is not your fault but I am more worried about computer chess overall.
In other words and with all respect for you and your engine and the achievement: I do not mind the No.1 (I am not one of those fanboys) but I mind the competition. If we had a No.1 engine which is 200 Elo worse than yours but a competition between 5 engines I would be happy.
Anyway, congrats again, nice jump for your engine.
Bye
Ingo

S.Taylor wrote:
I MUCH prefer there being only one engine that i would want to use.
I waited for this many years.
Then came the Rybkas and now Houdinis.
For some reason, i didn't enjoy everything about Rybka, but if Houdini holds on to a clear edge, in all aspects of the game, then i am quite happy about it.
If another program makes Houdini look bad, and shows up its weaknesses, then i would feel unsettled, until i have a clear thing.
But, i already had the feeling that Houdini is good enough, and that now i can at long last get satisfaction from its opinions on all positions i like to speculate about.
If another one goes WELL OVER that, then i would get that one instead.
But i don't enjoy the constant closeness between so many programs.
Same was for Rybka.

Uri Blass wrote:
Houdini certainly has weaknesses and it is not better than everything in every type of position.
If people think that there is basically only one engine to use only because that engine gets at least 60% against everything then they are clearly wrong.

I don't know, I am doing a H3 match against Rybka 4.1 on 8 cores and things look pretty impressive. Houdini did initially have some weaknesses, most notably an overly high score for a queen and incorrectly assuming that all rook endgames are drawn. These 2 things seem to have been corrected, and looking at some of the games played, the engine really looks extremely fast and efficient, with a very good evaluation and excellent time management. I did not look closely at the games yet, but so far I couldn't see any weaknesses in evaluation. At the moment it is scoring more than 70 Elo points over Rybka 4.1 on a platform where Rybka 4.1 was extremely strong.
Houdini 3 running for the IPON
Moderator: Ras
-
- Posts: 3726
- Joined: Thu Mar 16, 2006 7:10 pm
Re: Houdini 3 running for the IPON
-
- Posts: 9773
- Joined: Wed Mar 08, 2006 8:44 pm
- Location: Amman,Jordan
Re: Houdini 3 running for the IPON
Laskos wrote:
+42 now, after 1,000 games. Don't forget that the final result is calculated with Bayeselo and could be off the value against the average (as it is estimated now). And H3 performs ~80 points above K5 in the direct match. If these numbers will stay, I don't see Komodo "catching in a few months".

Houdini wrote:
Currently 3083 (+46) after 1088 games.
I would be disappointed with less than 40 Elo improvement for Houdini 3 in IPON.
Either way, we're very close to the 50 Elo gain I "officially" announced, you cannot expect any more precision from any rating list nor from my own development testing gauntlets (which measured 50 to 55 Elo).
Robert

Dr.Wael Deeb wrote:
Many moons ago I predicted +40 Elo at best and it seems that my prediction is transforming to reality.........
Dr.D

IWB (Ingo Bauer) wrote:
Hi,
H3 after 2700 games is finished. The final result is online at http://www.inwoba.de.
+62 Elo over H2! (72 with Elostat!).
Unbelievable.

Waschbaer wrote:
Hm ...
You were wrong, Mr. Dr.D.
THAT is the reality, take it easy

That's Dr.D Wittwar....
Indeed it's an unbelievable fact........
But remember, we are yet to see the results of Houdini 3 under long time controls.....
IPON is fine, but I want to see games at longer time controls where my prediction of around 40 Elo improvement will translate to reality....
Nevertheless, Houdini 3 is a real monster.
Regards,
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward….
-
- Posts: 9773
- Joined: Wed Mar 08, 2006 8:44 pm
- Location: Amman,Jordan
Re: Houdini 3 running for the IPON
M ANSARI wrote:
I don't know, I am doing a H3 match against Rybka 4.1 on 8 cores and things look pretty impressive. Houdini did initially have some weaknesses, most notably an overly high score for a queen and incorrectly assuming that all rook endgames are drawn. These 2 things seem to have been corrected, and looking at some of the games played, the engine really looks extremely fast and efficient, with a very good evaluation and excellent time management. I did not look closely at the games yet, but so far I couldn't see any weaknesses in evaluation. At the moment it is scoring more than 70 Elo points over Rybka 4.1 on a platform where Rybka 4.1 was extremely strong.

Thanks Majd for sharing your impressions with us.....
Best regards,
Dr.D
-
- Posts: 10893
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Houdini 3 running for the IPON
Dr.Wael Deeb wrote:
That's Dr.D Wittwar....
Indeed it's an unbelievable fact........
But remember, we are yet to see the results of Houdini 3 under long time controls.....
IPON is fine, but I want to see games at longer time controls where my prediction of around 40 Elo improvement will translate to reality....
Nevertheless, Houdini 3 is a real monster.
Regards,
Dr.D

around 40 elo is not going to happen (except maybe very fast time control).
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.
Houdini 2.0c x64 4CPU = 3086
Houdini 3.0 x64 4CPU against
Critter 1.6 x64 4CPU 3137
Stockfish 2.3 x64 4CPU 3155
Equinox 1.50 x64 4CPU 3179
Komodo 4.0 x64 1CPU 3170
...looks around +70 too
source
http://cegt.siteboard.eu/f5t411-coordin ... ini-3.html
-
- Posts: 2807
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: Houdini 3 running for the IPON
Houdini wrote:
Stefan, thanks for the testing.
Not that it's really important, but the jump from 62.0% (+85 Elo) to 68.2% (+133 Elo) is 48 Elo.
The "1% = 7 Elo" rule is only valid up to about 60% performance, see for example the Elo table http://www.pradu.us/old/Nov27_2008/Buzz/elotable.html which I used for the numbers above.
Robert

Yes, that's correct. I use 1% = 7 Elo, which is too simple for such a strong engine. By the way: at the moment 4100 of 10000 games are played and the score is now 67.9% (= +45 Elo). Final result on Sunday evening or Monday morning, stay tuned.
Best - Stefan
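For readers who want to check the numbers in this exchange, here is a minimal sketch comparing the "1% = 7 Elo" rule of thumb with the standard logistic Elo formula. The function names are made up for illustration, and it is only an assumption that the linked Elo table is built from this formula, although the logistic values do round to the +85 and +133 quoted above.

```python
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_linear(p: float) -> float:
    """The '1% = 7 Elo' rule of thumb, measured from a 50% score."""
    return 7.0 * (p * 100.0 - 50.0)

# Compare both estimates at the two scores discussed in the thread.
for p in (0.620, 0.682):
    print(f"{p:.1%}: logistic {elo_from_score(p):+.0f}, linear {elo_linear(p):+.0f}")

# Jump between the two scores, using the rounded table-style values.
jump_rounded = round(elo_from_score(0.682)) - round(elo_from_score(0.620))
# Same jump under the linear rule of thumb.
jump_linear = elo_linear(0.682) - elo_linear(0.620)
print(jump_rounded, round(jump_linear, 1))
```

The linear rule undershoots more and more as the score moves away from 50%, which is why the two estimates of the jump differ by several Elo here.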
-
- Posts: 989
- Joined: Sat May 13, 2006 1:08 am
Re: Houdini 3 running for the IPON
Uri Blass wrote:
....
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.

Hi Uri,
"40/20" is a theoretical value, because at our 40/20-list the real time control is adapted to hardware power by a benchmark. Real 40/20 would only be played on an AMD-X2-4200 @ 2,4 GHZ (our hardware reference) but the hardware used by Werner and Johan is very much faster. IIRC they play with 40/10 (Werner) and 40/8 (?!, Johan), but I don't know exactly.
This is slower than IPON but still far away from "real slow" time controls.
These will be played at:
- CEGT 40/20 with PB = on (starts as soon as Critter 1.6 test is finished, approx. beginning/middle next week)
- CEGT 40/120 (starts this evening)
and CCRL 40/40
But this will take some time.
and don't forget: only 140 (!!) games played so far! Next weekend we will have relatively reliable ratings at least for 40/3 and hopefully for 40/"20" too.
-
- Posts: 1539
- Joined: Thu Mar 09, 2006 2:02 pm
Re: Houdini 3 running for the IPON
The IPON-RRRL is updated as well.
85 Elo now from the first to the second place!
http://www.inwoba.de
Bye
Ingo
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Houdini 3 running for the IPON
Uri Blass wrote:
....
Initial CEGT results suggest 70 elo improvement relative to houdini2
and they use 40/20 and 4 cpu.

Wolfgang wrote:
Hi Uri,
"40/20" is a theoretical value, because at our 40/20-list the real time control is adapted to hardware power by a benchmark. Real 40/20 would only be played on an AMD-X2-4200 @ 2,4 GHZ (our hardware reference) but the hardware used by Werner and Johan is very much faster. IIRC they play with 40/10 (Werner) and 40/8 (?!, Johan), but I don't know exactly.
This is slower than IPON but still far away from "real slow" time controls.
These will be played at:
- CEGT 40/20 with PB = on (starts as soon as Critter 1.6 test is finished, approx. beginning/middle next week)
- CEGT 40/120 (starts this evening)
and CCRL 40/40
But this will take some time.
and don't forget: only 140 (!!) games played so far! Next weekend we will have relatively reliable ratings at least for 40/3 and hopefully for 40/"20" too.

OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.
-
- Posts: 989
- Joined: Sat May 13, 2006 1:08 am
Re: Houdini 3 running for the IPON
lkaufman wrote:
OK, so 40/20 CEGT (no PB) is too close to IPON levels to provide scaling info, so the CCRL 40/40 will be the first substantial test that sheds light on scaling.

That's correct, our 40/20 PB=off (which effectively is 40/10 or less) is rather close to IPON. But our 40/20 PB=on is played without adapting, so it is effectively 40/20 and ~5 times slower than IPON. Mean game length is around 80 minutes, IPON 16 minutes. Hope to have first results next week.
Our 40/120 results will take some time, as on one quad a 50-game match lasts 3-4 days when the Shredder Classic interface is started four times (12 to 16 games per day). So it will take at least one month to have 400-500 games.
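As a rough check of the arithmetic in this post, the sketch below works out the testing time from the figures given (12 to 16 games per day on one quad, a target of 400 to 500 games, mean game lengths of 80 and 16 minutes). The helper name is hypothetical and the calculation assumes a single machine running without interruption.

```python
import math

def days_needed(target_games: int, games_per_day: float) -> int:
    """Whole days required to reach target_games at a steady daily rate."""
    return math.ceil(target_games / games_per_day)

# Best case: fewest games wanted, fastest daily rate.
best_case_days = days_needed(400, 16)
# Worst case: most games wanted, slowest daily rate.
worst_case_days = days_needed(500, 12)

# Ratio of mean game lengths, 40/20 PB=on versus IPON (minutes).
slowdown = 80 / 16

print(best_case_days, worst_case_days, slowdown)
```

Under these assumptions the range is about 25 to 42 days, in line with "at least one month", and the game-length ratio matches the "~5 times slower" estimate.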
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: Houdini 3 running for the IPON
and these are the individual statistics (Thx Ingo)
Wow.......what else can be said
Individual statistics:
1 Houdini 3 STD : 3086 3150 (+2146,=818,-186), 81.1 %
Deep Junior 13.3 : 150 (+116,= 28,- 6), 86.7 %
Deep Fritz 13 32b : 150 (+103,= 37,- 10), 81.0 %
spark-1.0 : 150 (+116,= 28,- 6), 86.7 %
Zappa Mexico II : 150 (+127,= 19,- 4), 91.0 %
Stockfish 2.2.2 JA : 150 (+ 75,= 59,- 16), 69.7 %
Spike 1.4 32b : 150 (+125,= 20,- 5), 90.0 %
Quazar 0.4 : 150 (+123,= 19,- 8), 88.3 %
Protector 1.4.0 : 150 (+111,= 34,- 5), 85.3 %
Naum 4.2 : 150 (+107,= 37,- 6), 83.7 %
MinkoChess 1.3 : 150 (+135,= 12,- 3), 94.0 %
Hannibal 1.2 : 150 (+114,= 29,- 7), 85.7 %
Gull 1.2 : 150 (+119,= 29,- 2), 89.0 %
Deep Sjeng c't 2010 32b : 150 (+124,= 21,- 5), 89.7 %
Deep Shredder 12 : 150 (+108,= 32,- 10), 82.7 %
Deep Rybka 4.1 : 150 (+ 70,= 63,- 17), 67.7 %
Critter 1.6a : 150 (+ 56,= 80,- 14), 64.0 %
Critter 1.4a : 150 (+ 57,= 73,- 20), 62.3 %
Chiron 1.1a : 150 (+117,= 29,- 4), 87.7 %
Komodo 5 : 150 (+ 61,= 69,- 20), 63.7 %
HIARCS 14 WCSC 32b : 150 (+108,= 34,- 8), 83.3 %
Stockfish 2.3.1 JA : 150 (+ 74,= 66,- 10), 71.3 %
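The percentages in the listing can be rechecked from the win/draw/loss counts, counting a draw as half a point. The following is a small sketch, with a hypothetical helper and regular expression written for the "(+W,=D,-L)" layout shown above; it is not part of any rating tool.

```python
import re

# Hypothetical pattern for the "(+W,=D,-L)" triple as printed in the listing.
RESULT = re.compile(r"\(\+\s*(\d+),=\s*(\d+),-\s*(\d+)\)")

def score_percent(line: str) -> float:
    """Score in percent for one listing line: wins plus half the draws, over games."""
    match = RESULT.search(line)
    if match is None:
        raise ValueError("no (+W,=D,-L) triple found")
    wins, draws, losses = (int(group) for group in match.groups())
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games

# Two lines copied from the listing: the overall result and one opponent.
total = score_percent("Houdini 3 STD : 3086 3150 (+2146,=818,-186), 81.1 %")
critter = score_percent("Critter 1.6a : 150 (+ 56,= 80,- 14), 64.0 %")
print(round(total, 1), round(critter, 1))
```

Both values agree with the percentages printed in the listing (81.1 % overall, 64.0 % against Critter 1.6a).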