Houdini 3 running for the IPON

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 3 running for the IPON

Post by Houdini »

IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Houdini 3 running for the IPON

Post by MM »

Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert
Meant worse for the opponents.
MM
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Houdini 3 running for the IPON

Post by lkaufman »

IWB wrote:I am sorry but I have to correct myself.

Accidentally I added a wrong game set of 150 Chiron games (Chiron vs Stockfish 2 times) ...

Now it is even worse: 65 Elo plus from H2 to H3!

Web site is already corrected.

Bye
Ingo
I must congratulate Robert on this achievement. My own testing against the latest Komodo at 5' + 3" (no ponder) implies that we are about 60 Elo behind H3, which is pretty consistent with these results and with my statement that we are now on a par with H2. It looks like we will have to improve Komodo by about 60 Elo at blitz to catch Houdini 3. A tall order, but we intend to do it somehow.
I would also like to say that while I did not consider Robert to be the real author of Houdini 1.5 or 2, because they were basically just tweaked versions of Ippolit (Ivanhoe), it seems that his contribution to H3 is significant enough to warrant equal credit along with the unknown "Mr. Ippolit". Aside from the strength improvement, the rescaling of scores makes the program more attractive for use in Aquarium IDeA: the tiny scores in previous Houdini versions, together with the rounding done by IDeA, made the combination pretty useless, and that is no longer an issue. Congrats again!
Now it only remains to be seen whether these huge Elo gains hold up at longer time limits like the 40/20 of CEGT or the 40/40 of CCRL.
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Houdini 3 running for the IPON

Post by S.Taylor »

Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert

Worse for chess backwardness.

The lowest rated program in the world would be best, for that.
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Houdini 3 running for the IPON

Post by S.Taylor »

lkaufman wrote:
IWB wrote:I am sorry but I have to correct myself.

Accidentally I added a wrong game set of 150 Chiron games (Chiron vs Stockfish 2 times) ...

Now it is even worse: 65 Elo plus from H2 to H3!

Web site is already corrected.

Bye
Ingo
I must congratulate Robert on this achievement. My own testing against the latest Komodo at 5' + 3" (no ponder) implies that we are about 60 Elo behind H3, which is pretty consistent with these results and with my statement that we are now on a par with H2. It looks like we will have to improve Komodo by about 60 Elo at blitz to catch Houdini 3. A tall order, but we intend to do it somehow.
I would also like to say that while I did not consider Robert to be the real author of Houdini 1.5 or 2, because they were basically just tweaked versions of Ippolit (Ivanhoe), it seems that his contribution to H3 is significant enough to warrant equal credit along with the unknown "Mr. Ippolit". Aside from the strength improvement, the rescaling of scores makes the program more attractive for use in Aquarium IDeA: the tiny scores in previous Houdini versions, together with the rounding done by IDeA, made the combination pretty useless, and that is no longer an issue. Congrats again!
Now it only remains to be seen whether these huge Elo gains hold up at longer time limits like the 40/20 of CEGT or the 40/40 of CCRL.
I thought that Houdini 1.5 and 2.0 were already enough proof that Robert was on par with the best (except maybe still below the real Mr. Ippolit [Vas?]). So maybe he has now done the double Ippolit.

Longer time limits? My g.... I really hope so. If not, what's it worth?

And Komodo? Already at Houdini 2 level? If so, I'd love to see the test results. Then Komodo should be made into something like a Houdini 3 plus.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Houdini 3 running for the IPON

Post by Laskos »

IPON results using Ordo v0.6 with Deep Shredder 12 fixed at 2800

C:\Users\Ordo>ordo-win64 -a 2800 -A "Deep Shredder 12" -W -s 1000 -p IPON.pgn -o IPON.txt

That is a +74 Elo improvement of Houdini 3 over Houdini 2. Bayeselo compresses the ratings a bit over a large range of strengths.

   # ENGINE                     : RATING  ERROR   POINTS  PLAYED    (%)
   1 Houdini 3 STD              : 3117.5   17.0   2230.5    2700   82.6%
   2 Houdini 2.0 STD            : 3043.4   11.9   4562.5    5850   78.0%
   3 Houdini 1.5a               : 3036.1   14.2   3162.5    4000   79.1%
   4 Komodo 5                   : 3023.1   14.5   2594.0    3600   72.1%
   5 Critter 1.4a               : 3002.5   12.3   3982.0    5350   74.4%
   6 Komodo 4                   : 2999.9   12.3   3653.0    4850   75.3%
   7 Critter 1.6a               : 2992.0   14.5   2329.0    3300   70.6%
   8 Komodo 3                   : 2988.4   16.0   2075.5    2800   74.1%
   9 Stockfish 2.3.1 JA         : 2978.8   15.5   1969.0    2850   69.1%
  10 Deep Rybka 4.1             : 2978.8   11.2   4775.5    6800   70.2%
  11 Stockfish 2.2.2 JA         : 2978.6   11.9   3797.0    5250   72.3%
  12 Deep Rybka 4               : 2978.2   13.0   3627.0    4900   74.0%
  13 Critter 1.2                : 2978.0   14.9   2232.0    3100   72.0%
  14 Houdini 1.03a              : 2976.5   15.1   2520.0    3200   78.8%
  15 Komodo 2.03 DC             : 2970.9   15.7   1985.5    2700   73.5%
  16 Stockfish 2.1.1 JA         : 2961.9   14.3   2426.5    3500   69.3%
  17 Critter 1.01               : 2942.2   15.5   1970.0    2800   70.4%
  18 Stockfish 2.01 JA          : 2941.9   14.8   2246.0    3100   72.5%
  19 Stockfish 1.9.1 JA         : 2919.9   15.1   2131.0    3000   71.0%
  20 Rybka 3 mp                 : 2919.4   13.0   3228.0    4200   76.9%
  21 Critter 0.90               : 2911.8   13.8   2327.5    3400   68.5%
  22 Stockfish 1.7.1 JA         : 2904.1   14.6   2131.0    2900   73.5%
  23 Rybka 3 32b                : 2859.7   18.5   1191.5    1700   70.1%
  24 Stockfish 1.6.x JA         : 2843.8   15.0   1792.5    2600   68.9%
  25 Komodo 1.3 JA              : 2839.0   13.5   1946.0    3300   59.0%
  26 Naum 4.2                   : 2833.2    8.9   5536.0    9900   55.9%
  27 Chiron 1.1a                : 2831.9   11.4   2811.5    5400   52.1%
  28 Deep Fritz 13 32b          : 2831.2   13.4   1739.0    3600   48.3%
  29 Critter 0.80               : 2824.6   14.6   1795.5    2800   64.1%
  30 HIARCS 14 WCSC 32b         : 2819.7   13.7   1599.0    3450   46.3%
  31 Fritz 13 32b               : 2818.7   12.3   2308.0    4300   53.7%
  32 Komodo 1.2 JA              : 2809.1   13.2   2175.0    3700   58.8%
  33 Rybka 2.3.2a mp            : 2804.6   13.3   2172.5    3500   62.1%
  34 Deep Shredder 12           : 2800.0   ----   5820.0   11000   52.9%
  35 Hannibal 1.2               : 2795.6   12.6   1869.0    4200   44.5%
  36 Gull 1.2                   : 2793.2   10.4   3302.5    6900   47.9%
  37 Critter 0.70               : 2791.1   16.8   1107.0    1900   58.3%
  38 Gull 1.1                   : 2790.7   13.9   1675.5    3100   54.0%
  39 Naum 4.1                   : 2789.1   15.3   1465.0    2300   63.7%
  40 Deep Sjeng c't 2010 32b    : 2787.5   10.1   3767.5    7900   47.7%
  41 Komodo 1.0 JA              : 2784.4   15.1   1756.5    2900   60.6%
  42 Spike 1.4 32b              : 2780.0   10.7   3237.5    7000   46.3%
  43 Deep Fritz 12 32b          : 2777.3   10.6   3268.5    6300   51.9%
  44 Naum 4                     : 2775.2   14.9   1628.5    2700   60.3%
  45 Rybka 2.2n2 mp             : 2774.6   16.6   1311.5    2100   62.5%
  46 Gull 1.0a                  : 2765.9   15.5   1254.0    2300   54.5%
  47 Stockfish 1.5.1 JA         : 2761.3   16.8   1128.5    1900   59.4%
  48 Rybka 1.2f                 : 2760.3   16.0   1578.5    2400   65.8%
  49 Protector 1.4.0            : 2757.2   10.5   3115.5    7100   43.9%
  50 spark-1.0                  : 2756.7   10.6   3318.0    7600   43.7%
  51 Hannibal 1.1               : 2746.4   12.2   2142.0    4900   43.7%
  52 Deep Junior 13.3           : 2745.0   13.3   1373.0    3750   36.6%
  53 HIARCS 13.2 MP 32b         : 2744.5   10.7   2922.5    6800   43.0%
  54 Deep Junior 13             : 2743.6   13.1   1452.5    3600   40.3%
  55 Fritz 12 32b               : 2739.2   16.4   1091.0    2000   54.5%
  56 Quazar 0.4                 : 2734.6   12.3   1709.0    4500   38.0%
  57 HIARCS 13.1 MP 32b         : 2726.4   12.7   1734.5    3600   48.2%
  58 Deep Junior 12.5           : 2724.9   12.1   1963.0    4850   40.5%
  59 Deep Fritz 11 32b          : 2719.2   20.2    744.5    1300   57.3%
  60 Doch64 1.2 JA              : 2708.9   18.1    820.5    1600   51.3%
  61 spark-0.4                  : 2707.4   13.3   1458.0    3100   47.0%
  62 Stockfish 1.4 JA           : 2706.6   17.6    849.0    1700   49.9%
  63 Zappa Mexico II            : 2704.8    9.0   5260.0   12300   42.8%
  64 Shredder Bonn 32b          : 2703.4   15.6   1119.0    2200   50.9%
  65 Critter 0.60               : 2692.0   15.6   1072.0    2200   48.7%
  66 Protector 1.3.2 JA         : 2691.8   11.2   2361.5    5300   44.6%
  67 MinkoChess 1.3             : 2686.3   13.0   1301.0    4200   31.0%
  68 Deep Shredder 11           : 2683.0   15.0   1412.0    2700   52.3%
  69 Doch64 09.980 JA           : 2680.2   18.1    710.0    1500   47.3%
  70 Deep Junior 12             : 2672.4   13.4   1356.0    3600   37.7%
  71 Onno-1-1-1                 : 2672.3   12.6   1923.0    4300   44.7%
  72 Hannibal 1.0a              : 2671.8   12.2   1600.0    4200   38.1%
  73 Deep Onno 1-2-70           : 2670.6   10.0   2806.5    7700   36.4%
  74 Naum 3.1                   : 2670.6   14.3   1514.5    3000   50.5%
  75 Zappa Mexico I             : 2670.1   16.0   1221.0    2200   55.5%
  76 Rybka 1.0 Beta             : 2669.1   15.9   1023.5    2300   44.5%
  77 Spark-0.3 VC(a)            : 2665.9   13.6   1625.0    3600   45.1%
  78 Onno-1-0-0                 : 2663.1   20.7    594.5    1200   49.5%
  79 Deep Sjeng WC2008          : 2660.8   11.1   2434.5    5600   43.5%
  80 Toga II 1.4 beta5c BB      : 2657.0    9.7   3255.5    8300   39.2%
  81 Deep Junior 11.2           : 2655.6   13.9   1176.0    2900   40.6%
  82 Strelka 2.0 B              : 2651.2   11.4   1778.5    5500   32.3%
  83 Hiarcs 12.1 MP 32b         : 2647.7   11.2   2427.5    5600   43.3%
  84 Tornado 4.88               : 2645.3   15.1    803.0    2400   33.5%
  85 Deep Sjeng 3.0             : 2645.1   19.6    601.5    1400   43.0%
  86 Umko 1.2                   : 2644.6   13.9   1016.5    3300   30.8%
  87 Critter 0.52b              : 2633.9   14.8   1097.0    2600   42.2%
  88 Shredder Classic 4 32b     : 2633.7   17.5    922.5    1800   51.3%
  89 Deep Junior 11.1a          : 2623.6   14.3   1153.0    2800   41.2%
  90 Naum 2.2 32b               : 2621.9   20.0    614.0    1300   47.2%
  91 Crafty 23.5                : 2620.1   15.3    670.5    2700   24.8%
  92 Umko 1.1                   : 2616.7   13.3   1146.0    3900   29.4%
  93 Nemo 1.0.1                 : 2616.6   15.2    708.0    2700   26.2%
  94 Deep Junior 2010           : 2614.6   13.7   1210.0    3100   39.0%
  95 Glaurung 2.2 JA            : 2613.7   15.2   1027.5    2600   39.5%
  96 Rybka 1.0 Beta 32b         : 2613.6   21.5    506.0    1100   46.0%
  97 HIARCS 11.2 32b            : 2608.6   17.5    827.0    1900   43.5%
  98 Fruit 05/11/03 32b         : 2605.9   12.7   1774.0    4400   40.3%
  99 Loop 2007                  : 2598.7   10.4   2456.0    7900   31.1%
 100 Toga II 1.2.1a             : 2595.6   18.6    716.5    1600   44.8%
 101 Jonny 4.00 32b             : 2594.8   12.1   1389.5    5200   26.7%
 102 ListMP 11                  : 2591.2   14.8    987.5    2600   38.0%
 103 LoopMP 12 32b              : 2589.0   18.5    635.0    1500   42.3%
 104 Tornado 4.80               : 2586.9   15.4    681.5    2700   25.2%
 105 Deep Shredder 10           : 2585.3   12.5   1754.0    4400   39.9%
 106 Twisted Logic 20100131x    : 2580.8   13.9   1140.0    3500   32.6%
 107 Crafty 23.3 JA             : 2575.9   12.4   1290.5    5200   24.8%
 108 Spike 1.2 Turin 32b        : 2558.1   10.6   2349.5    7700   30.5%
 109 Deep Sjeng 2.7 32b         : 2533.7   20.3    465.5    1400   33.3%
 110 Crafty 23.1 JA             : 2522.2   13.3   1002.0    3800   26.4%
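
As a rough illustration of what the Houdini 3 vs Houdini 2 gap in the list above means in practice, here is a minimal Python sketch. It assumes the standard logistic Elo curve, expected score = 1 / (1 + 10^(-d/400)); Ordo's internal model may differ in detail, and the helper name expected_score is made up for this example. The two ratings are copied from rows 1 and 2 of the table.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Ratings taken from the Ordo list above (Deep Shredder 12 anchored at 2800).
houdini_3 = 3117.5
houdini_2 = 3043.4

diff = houdini_3 - houdini_2
print(f"{diff:.1f}")                  # 74.1
print(f"{expected_score(diff):.3f}")  # 0.605
```

So under this assumption a 74 Elo lead corresponds to scoring roughly 60% in a direct match, draws counted as half a point.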
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Houdini 3 running for the IPON

Post by IWB »

Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert
No no, I meant worse!

I do not envy the success, but we had a situation like this for 5 years and it was boring! Until 2 days ago we had a No. 1 with one or two strong contenders very close, and that was interesting... Basically, there is now just one engine to use.

It is not your fault, but I am more worried about computer chess overall.

In other words, and with all respect for you, your engine and the achievement: I do not mind the No. 1 (I am not one of those fanboys), but I do care about the competition. If we had a No. 1 engine that was 200 Elo worse than yours but a competition between 5 engines, I would be happy :-)

Anyway, congrats again, nice jump for your engine.

Bye
Ingo
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Houdini 3 running for the IPON

Post by S.Taylor »

IWB wrote:
Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert
No no, I meant worse!

I do not envy the success, but we had a situation like this for 5 years and it was boring! Until 2 days ago we had a No. 1 with one or two strong contenders very close, and that was interesting... Basically, there is now just one engine to use.

It is not your fault, but I am more worried about computer chess overall.

In other words, and with all respect for you, your engine and the achievement: I do not mind the No. 1 (I am not one of those fanboys), but I do care about the competition. If we had a No. 1 engine that was 200 Elo worse than yours but a competition between 5 engines, I would be happy :-)

Anyway, congrats again, nice jump for your engine.

Bye
Ingo
I MUCH prefer there being only one engine that I would want to use.
I waited for this for many years.

Then came the Rybkas and now the Houdinis.

For some reason, I didn't enjoy everything about Rybka, but if Houdini holds on to a clear edge in all aspects of the game, then I am quite happy about it.

If another program makes Houdini look bad and shows up its weaknesses, then I would feel unsettled until things are clear again.

But I already had the feeling that Houdini is good enough, and that now I can, at long last, get satisfaction from its opinions on all the positions I like to speculate about.

If another one goes WELL OVER that, then I would get that one instead.

But I don't enjoy the constant closeness between so many programs.
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Houdini 3 running for the IPON

Post by S.Taylor »

Still, if you tell me that a program is deeper and more intuitive than Houdini, even though it has a slightly lower Elo, then I would need to have that one as well.

I appreciate a program risking a lot to get stunning results, even if it has a weak side and sometimes fails, as it may have ideas about positions which the No. 1 machine might not have, and it will sometimes be right, and stunningly so.
Uri Blass
Posts: 10893
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Houdini 3 running for the IPON

Post by Uri Blass »

S.Taylor wrote:
IWB wrote:
Houdini wrote:
IWB wrote: Now it is even worse: 65 Elo plus from H2 to H3!
You're supposed to say better, not worse. :lol:
Thanks for running the test, I think it was really worth the while.

Robert
No no, I meant worse!

I do not envy the success, but we had a situation like this for 5 years and it was boring! Until 2 days ago we had a No. 1 with one or two strong contenders very close, and that was interesting... Basically, there is now just one engine to use.

It is not your fault, but I am more worried about computer chess overall.

In other words, and with all respect for you, your engine and the achievement: I do not mind the No. 1 (I am not one of those fanboys), but I do care about the competition. If we had a No. 1 engine that was 200 Elo worse than yours but a competition between 5 engines, I would be happy :-)

Anyway, congrats again, nice jump for your engine.

Bye
Ingo
I MUCH prefer there being only one engine that I would want to use.
I waited for this for many years.

Then came the Rybkas and now the Houdinis.

For some reason, I didn't enjoy everything about Rybka, but if Houdini holds on to a clear edge in all aspects of the game, then I am quite happy about it.

If another program makes Houdini look bad and shows up its weaknesses, then I would feel unsettled until things are clear again.

But I already had the feeling that Houdini is good enough, and that now I can, at long last, get satisfaction from its opinions on all the positions I like to speculate about.

If another one goes WELL OVER that, then I would get that one instead.

But I don't enjoy the constant closeness between so many programs.
Houdini certainly has weaknesses, and it is not better than everything in every type of position.

The same was true of Rybka.
If people think that there is basically only one engine to use only because that engine gets at least 60% against everything, then they are clearly wrong.
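
For a sense of scale on the 60% figure, the following Python sketch converts a match score into a rating difference. It assumes the standard logistic Elo formula, d = 400 * log10(s / (1 - s)); the function name elo_from_score is invented for this illustration and is not taken from any rating tool.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction, under the logistic model (assumed here)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


print(f"{elo_from_score(0.60):.1f}")  # 70.4
```

Under that assumption, scoring 60% against an opponent corresponds to a lead of about 70 Elo. This is an average over many games and says nothing about which engine is better in a particular type of position.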