Houdini 2.0 running for the IPON

Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Houdini 2.0 running for the IPON

Post by Adam Hair »

Here are the three methods side-by-side:

         Name           Miguel         Name          Bayeselo       Name         EloStat 
    Houdini 2.0 Pro x64  3036   Houdini 2.0 Pro x64   3021   Houdini 2.0 Pro x64   3012  
           Houdini 1.5a  3030      Houdini 1.5a       3015      Houdini 1.5a       3007  
       Komodo64 3 SSE42  2983    Komodo64 3 SSE42     2969    Komodo64 3 SSE42     2960  
   Deep Rybka 4.1 SSE42  2977  Deep Rybka 4.1 SSE42   2960      Houdini 1.03a      2959  
            Critter 1.2  2976       Critter 1.2       2958  Deep Rybka 4.1 SSE42   2957  
           Deep Rybka 4  2973      Deep Rybka 4       2958      Deep Rybka 4       2954  
          Houdini 1.03a  2971  Komodo 2.03 DC SSE42   2956       Critter 1.2       2954  
   Komodo 2.03 DC SSE42  2967      Houdini 1.03a      2956  Komodo 2.03 DC SSE42   2949  
     Stockfish 2.1.1 JA  2960   Stockfish 2.1.1 JA    2947   Stockfish 2.1.1 JA    2940  
     Critter 1.01 SSE42  2938   Critter 1.01 SSE42    2927    Stockfish 2.01 JA    2925  
      Stockfish 2.01 JA  2938    Stockfish 2.01 JA    2926   Critter 1.01 SSE42    2924  
     Stockfish 1.9.1 JA  2916       Rybka 3 mp        2906       Rybka 3 mp        2914  
             Rybka 3 mp  2916   Stockfish 1.9.1 JA    2905   Stockfish 1.9.1 JA    2907  
     Critter 0.90 SSE42  2908   Critter 0.90 SSE42    2899   Critter 0.90 SSE42    2898  
     Stockfish 1.7.1 JA  2901   Stockfish 1.7.1 JA    2891   Stockfish 1.7.1 JA    2893  
            Rybka 3 32b  2857       Rybka 3 32b       2854       Rybka 3 32b       2858  
     Stockfish 1.6.x JA  2842   Stockfish 1.6.x JA    2836   Stockfish 1.6.x JA    2841  
        Komodo64 1.3 JA  2837     Komodo64 1.3 JA     2835     Komodo64 1.3 JA     2833  
               Naum 4.2  2831        Naum 4.2         2826        Naum 4.2         2827  
           Critter 0.80  2823      Critter 0.80       2822      Critter 0.80       2822  
          Komodo 1.2 JA  2808      Komodo 1.2 JA      2807      Komodo 1.2 JA      2808  
        Rybka 2.3.2a mp  2804     Rybka 2.3.2a mp     2801     Rybka 2.3.2a mp     2804  
       Deep Shredder 12  2800    Deep Shredder 12     2800    Deep Shredder 12     2800  
               Gull 1.2  2794        Gull 1.2         2795        Gull 1.2         2794  
           Critter 0.70  2790        Gull 1.1         2791        Naum 4.1         2792  
               Gull 1.1  2790      Critter 0.70       2791        Gull 1.1         2791  
               Naum 4.1  2788        Naum 4.1         2788      Critter 0.70       2791  
Deep Sjeng c't 2010 32b  2785  Deep Sjeng c't 2010 3  2787  Deep Sjeng c't 2010 3  2787  
          Komodo 1.0 JA  2784      Komodo 1.0 JA      2783      Komodo 1.0 JA      2786  
          Spike 1.4 32b  2779      Spike 1.4 32b      2781      Spike 1.4 32b      2782  
      Deep Fritz 12 32b  2777    Deep Fritz 12 32b    2778     Rybka 2.2n2 mp      2780  
                 Naum 4  2775     Rybka 2.2n2 mp      2776    Deep Fritz 12 32b    2779  
         Rybka 2.2n2 mp  2774         Naum 4          2776         Naum 4          2779  
              Gull 1.0a  2766        Gull 1.0a        2768        Gull 1.0a        2770  
     Stockfish 1.5.1 JA  2761       Rybka 1.2f        2764       Rybka 1.2f        2768  
             Rybka 1.2f  2760   Stockfish 1.5.1 JA    2763   Stockfish 1.5.1 JA    2767  
    Protector 1.4.0 x64  2756   Protector 1.4.0 x64   2758   Protector 1.4.0 x64   2761  
        spark-1.0 SSE42  2752     spark-1.0 SSE42     2755     spark-1.0 SSE42     2757  
           Hannibal 1.1  2751      Hannibal 1.1       2754      Hannibal 1.1       2757  
     HIARCS 13.2 MP 32b  2747   HIARCS 13.2 MP 32b    2750   HIARCS 13.2 MP 32b    2753  
           Fritz 12 32b  2740      Fritz 12 32b       2744      Fritz 12 32b       2746  
     HIARCS 13.1 MP 32b  2727   HIARCS 13.1 MP 32b    2730   HIARCS 13.1 MP 32b    2735  
       Deep Junior 12.5  2725    Deep Junior 12.5     2728    Deep Junior 12.5     2734  
      Deep Fritz 11 32b  2720    Deep Fritz 11 32b    2727    Deep Fritz 11 32b    2729  
          Doch64 1.2 JA  2710     Protector 1.4.0     2716      Doch64 1.2 JA      2718  
              spark-0.4  2709      Doch64 1.2 JA      2715        spark-0.4        2717  
       Stockfish 1.4 JA  2708        spark-0.4        2713     Zappa Mexico II     2716  
        Protector 1.4.0  2708    Stockfish 1.4 JA     2713    Stockfish 1.4 JA     2716  
        Zappa Mexico II  2707     Zappa Mexico II     2712    Shredder Bonn 32b    2714  
      Shredder Bonn 32b  2705    Shredder Bonn 32b    2711   Protector 1.3.2 JA    2704  
           Critter 0.60  2694   Protector 1.3.2 JA    2699      Critter 0.60       2703  
     Protector 1.3.2 JA  2694      Critter 0.60       2699    Deep Shredder 11     2695  
       Deep Shredder 11  2685    Deep Shredder 11     2692    Doch64 09.980 JA     2691  
       Doch64 09.980 JA  2682    Doch64 09.980 JA     2688    Deep Onno 1-2-70     2688  
         Deep Junior 12  2675        Naum 3.1         2682      Hannibal 1.0a      2687  
             Onno-1-1-1  2675    Deep Onno 1-2-70     2681     Deep Junior 12      2687  
          Hannibal 1.0a  2674      Hannibal 1.0a      2680       Onno-1-1-1        2685  
       Deep Onno 1-2-70  2674       Onno-1-1-1        2680     Rybka 1.0 Beta      2684  
               Naum 3.1  2673     Deep Junior 12      2679     Protector 1.4.0     2683  
         Zappa Mexico I  2672     Zappa Mexico I      2679        Naum 3.1         2683  
         Rybka 1.0 Beta  2671     Rybka 1.0 Beta      2677     Zappa Mexico I      2683  
        Spark-0.3 VC(a)  2668     Spark-0.3 VC(a)     2674     Spark-0.3 VC(a)     2679  
             Onno-1-0-0  2666       Onno-1-0-0        2673       Onno-1-0-0        2675  
      Deep Sjeng WC2008  2663    Deep Sjeng WC2008    2670    Deep Sjeng WC2008    2675  
  Toga II 1.4 beta5c BB  2660      Strelka 2.0 B      2668      Strelka 2.0 B      2675  
          Strelka 2.0 B  2658  Toga II 1.4 beta5c BB  2667  Toga II 1.4 beta5c BB  2673  
       Deep Junior 11.2  2658    Deep Junior 11.2     2662    Deep Junior 11.2     2672  
     Hiarcs 12.1 MP 32b  2651     Umko 1.2 SSE42      2660     Umko 1.2 SSE42      2668  
         Umko 1.2 SSE42  2650   Hiarcs 12.1 MP 32b    2656   Hiarcs 12.1 MP 32b    2662  
         Deep Sjeng 3.0  2648     Deep Sjeng 3.0      2654     Deep Sjeng 3.0      2660  
          Critter 0.52b  2637  Shredder Classic 4 32  2645      Critter 0.52b      2650  
 Shredder Classic 4 32b  2637      Critter 0.52b      2644  Shredder Classic 4 32  2649  
      Deep Junior 11.1a  2627      Naum 2.2 32b       2636    Deep Junior 11.1a    2641  
           Naum 2.2 32b  2625    Deep Junior 11.1a    2635     Umko 1.1 SSE42      2638  
         Umko 1.1 SSE42  2621     Umko 1.1 SSE42      2630      Naum 2.2 32b       2637  
       Deep Junior 2010  2618   Rybka 1.0 Beta 32b    2627    Deep Junior 2010     2632  
        Glaurung 2.2 JA  2618     Glaurung 2.2 JA     2627   Rybka 1.0 Beta 32b    2629  
     Rybka 1.0 Beta 32b  2617    Deep Junior 2010     2625     Glaurung 2.2 JA     2628  
        HIARCS 11.2 32b  2612     HIARCS 11.2 32b     2621     HIARCS 11.2 32b     2624  
     Fruit 05/11/03 32b  2610   Fruit 05/11/03 32b    2620   Fruit 05/11/03 32b    2622  
         Loop 13.6/2007  2601     Loop 13.6/2007      2612     Loop 13.6/2007      2621  
         Toga II 1.2.1a  2600     Jonny 4.00 32b      2608     Jonny 4.00 32b      2620  
         Jonny 4.00 32b  2600     Toga II 1.2.1a      2608     Toga II 1.2.1a      2612  
              ListMP 11  2596        ListMP 11        2605        ListMP 11        2611  
          LoopMP 12 32b  2594      LoopMP 12 32b      2603      LoopMP 12 32b      2605  
       Deep Shredder 10  2590    Deep Shredder 10     2597  Twisted Logic 2010013  2603  
Twisted Logic 20100131x  2585  Twisted Logic 2010013  2593    Deep Shredder 10     2603  
         Crafty 23.3 JA  2580     Crafty 23.3 JA      2589     Crafty 23.3 JA      2601  
    Spike 1.2 Turin 32b  2563   Spike 1.2 Turin 32b   2574   Spike 1.2 Turin 32b   2582  
     Deep Sjeng 2.7 32b  2539   Deep Sjeng 2.7 32b    2551   Deep Sjeng 2.7 32b    2552  
         Crafty 23.1 JA  2528     Crafty 23.1 JA      2539     Crafty 23.1 JA      2549  
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Houdini 2.0 running for the IPON

Post by IWB »

Finally finished: http://www.inwoba.de

The "official initial" result is online now!

In the download I added a "game.pgn" with the results (and only the results!) of all the IPON games, in case someone wants to do some statistical tricks of their own.

Bye
Ingo
ernest
Posts: 2045
Joined: Wed Mar 08, 2006 8:30 pm

Re: Houdini 2.0 running for the IPON

Post by ernest »

Hi Ingo,

As you can see in
http://www.talkchess.com/forum/viewtopi ... 330#422330
I am completely satisfied by your ratings.
Keep it up!
Jouni
Posts: 3617
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Houdini 2.0 running for the IPON

Post by Jouni »

Hmm, 40€ is too expensive for a mere 10 Elo. Besides, I have a feeling we will soon have a new Critter or Stockfish.

Jouni
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 2.0 running for the IPON

Post by Houdini »

Jouni wrote:Hmm, 40€ is too expensive for a mere 10 Elo. Besides, I have a feeling we will soon have a new Critter or Stockfish.

Jouni
There's so much more in Houdini 2 than its 25 Elo increase (when you look at more test results than just the IPON list you'll find that this is about the average gain that is found).
A price of 40€ for the strongest chess engine on the planet doesn't really feel exaggerated; it's about the cost of your average Wii or PlayStation game, which sell by the millions. But of course the free Houdini 1.5a remains a pretty good performer ;).

Robert
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Houdini 2.0 running for the IPON

Post by Houdini »

Houdini wrote:There's so much more in Houdini 2 than its 25 Elo increase (when you look at more test results than just the IPON list you'll find that this is about the average gain that is found).
A recent illustration of what I was saying, Ahmed Kamal's Top 10 Rating List at the Chess2U forum:
http://www.chess2u.com/t3992-top10-rati ... ember-2011

Houdini 2.0 (1530 games) +54 ELO improvement over Houdini 1.5a.
It's amazing how much the results from the various rating lists differ...

Robert
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Houdini 2.0 running for the IPON

Post by mwyoung »

Houdini wrote:
Houdini wrote:There's so much more in Houdini 2 than its 25 Elo increase (when you look at more test results than just the IPON list you'll find that this is about the average gain that is found).
A recent illustration of what I was saying, Ahmed Kamal's Top 10 Rating List at the Chess2U forum:
http://www.chess2u.com/t3992-top10-rati ... ember-2011

Houdini 2.0 (1530 games) +54 ELO improvement over Houdini 1.5a.
It's amazing how much the results from the various rating lists differ...

Robert
Houdini 2.0 is a beast.

I have done my own testing since the early 80's, back when I had to do it by hand. I can say Houdini 2.0 is much more than 10 Elo stronger than Houdini 1.5a.

If you want to see what advantage Houdini 2.0 will give you over Houdini 1.5a or other programs, I suggest running them side by side while looking at your favorite GM games. Houdini 2.0's analysis is much faster than Houdini 1.5a's. Not the same program, for sure.

My testing is still showing it around 40 Elo better than version 1.5a of Houdini.

Houdini was tested with multiple cores in use, not one. I don't know if this is part of the reason why IPON is showing a lower rating gain of 10 Elo.

But what is clear is that IPON has shown the lowest rating gain for Houdini that I have seen. The other rating lists I have seen show a gain of around 25 Elo or better for Houdini 2.0. Time will tell whether IPON has it right or wrong as more testing is done with Houdini 2.0.
mhull
Posts: 13447
Joined: Wed Mar 08, 2006 9:02 pm
Location: Dallas, Texas
Full name: Matthew Hull

Re: Houdini 2.0 running for the IPON

Post by mhull »

mwyoung wrote: But what is clear is that IPON has shown the lowest rating gain for Houdini that I have seen. The other rating lists I have seen show a gain of around 25 Elo or better for Houdini 2.0. Time will tell whether IPON has it right or wrong as more testing is done with Houdini 2.0.
Firstly, it's way too few games to nail down such a small Elo gain/loss.

Secondly, I wouldn't expect the rating lists to agree, since they are forcing different, limited opening suites on the engine and its competitors in a misguided attempt at reducing variability, not realizing they are choosing arbitrary game tree "topologies". One would expect different relative strength depending on the topology delineated by the non-dynamic book and/or set of start positions.
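
As a rough illustration of the first point, here is a minimal Python sketch of how wide the error bar on an Elo estimate is for a given number of games. It assumes independent games, the usual logistic Elo curve, and a normal approximation to the score; the function names `elo_from_score` and `elo_margin` are made up for this example and are not taken from any rating tool.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, score, draw_ratio, z=1.96):
    """Approximate half-width (in Elo) of the confidence interval.

    games      -- number of games played
    score      -- overall score fraction (wins + draws / 2)
    draw_ratio -- fraction of games drawn
    z          -- normal quantile (1.96 is roughly 95 %)
    """
    wins = score - draw_ratio / 2.0
    losses = 1.0 - wins - draw_ratio
    if wins < 0 or losses < 0:
        raise ValueError("score and draw_ratio are inconsistent")
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    half_width = z * math.sqrt(variance / games)
    low = elo_from_score(score - half_width)
    high = elo_from_score(score + half_width)
    return (high - low) / 2.0


# Hypothetical match, numbers invented for illustration only.
print(round(elo_margin(1500, 0.52, 0.35), 1))
```

Under these assumptions the margin shrinks only with the square root of the number of games, so roughly four times as many games are needed to halve it.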
Matthew Hull
lkaufman
Posts: 6215
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Houdini 2.0 running for the IPON

Post by lkaufman »

IWB wrote:Finally finished: http://www.inwoba.de

The "official initial" result is online now!

In the download I added a "game.pgn" with the results (and only the results!) of all the IPON games, in case someone wants to do some statistical tricks of their own.

Bye
Ingo
I notice that you made a comment about the huge difference of the top engine on Elostat vs. Bayeselo. I think you made a huge mistake. The Bayeselo list is based on Shredder 12 = 2800, but the Elostat list is based on Shredder 12 = 3000! So all the ratings are about 200 higher! Once you correct this, Houdini 2.0 is five elo lower on Elostat than on Bayeselo, no big deal.
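
The correction described here is just a constant shift: an Elo list only fixes rating differences, so each list can be moved until its anchor engine sits at the same value. A minimal Python sketch, where `rebase` is a hypothetical helper and the "Engine X" numbers are invented purely for illustration (only the Shredder 12 anchors of 3000 and 2800 come from the post):

```python
def rebase(ratings, anchor, target):
    """Shift all ratings by one constant so that `anchor` is rated `target`."""
    offset = target - ratings[anchor]
    return {name: elo + offset for name, elo in ratings.items()}


# Invented ratings for illustration only; "Engine X" is not a real list entry.
elostat = {"Engine X": 3216, "Deep Shredder 12": 3000}
bayeselo = {"Engine X": 3021, "Deep Shredder 12": 2800}

# Put the Elostat list on the same anchor as the Bayeselo list.
elostat_rebased = rebase(elostat, "Deep Shredder 12", 2800)
gap = elostat_rebased["Engine X"] - bayeselo["Engine X"]
print(elostat_rebased["Engine X"], gap)  # 3016 -5
```

Differences between engines within one list are unchanged by the shift, which is why only the anchor matters when two lists are compared.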
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Houdini 2.0 running for the IPON

Post by Don »

Houdini wrote:
Houdini wrote:There's so much more in Houdini 2 than its 25 Elo increase (when you look at more test results than just the IPON list you'll find that this is about the average gain that is found).
A recent illustration of what I was saying, Ahmed Kamal's Top 10 Rating List at the Chess2U forum:
http://www.chess2u.com/t3992-top10-rati ... ember-2011

Houdini 2.0 (1530 games) +54 ELO improvement over Houdini 1.5a.
It's amazing how much the results from the various rating lists differ...

Robert
That list is completely broken, you cannot use it for making a point about anything if you want to be taken seriously.

Shredder is rated 2734 on that list - so to compare it to Ingo's list you must add 66 ELO which makes Houdini 1.5 3017 + 66 = 3083 and Houdini 2.0 would be 3137. Even WITHOUT the 66 ELO adjustment Houdini 1.5 comes out stronger on that list - so it's a ridiculous list.

Critter is rated 52 ELO above Komodo on this list, but on Ingo's list Komodo is 12 ELO higher. That is a discrepancy of 64 ELO.

Some possible explanations:

1. The tester does not know how to do scientific testing.
2. The tester has introduced his own biases (perhaps inadvertently).
3. The test is done at a very fast level.

If you test at really fast time controls, these results might make more sense, but they still seem pretty far off. One thing we have discovered is that all of the Ippo-based programs get a fast start: their ratings are off the charts at fast time controls (a minute or less) but decline with increasing depth. We are not sure whether this represents a permanent scalability bug or whether it's localized to levels below 5 or 10 minutes.

    -------------------------------------------------------------------------------
       Program                         Elo    +    -  Games   Score  Av.Op.   Draws
    -------------------------------------------------------------------------------
     1 Houdini 2.0 x64              : 3071   16   16   1530  75.3 %    2878  27.3 %
     2 Critter 1.2 x64              : 2964    7    7   5820  61.3 %    2884  38.9 %
     3 Ivanhoe B47cB x64            : 2957    8    8   3820  56.2 %    2914  46.6 %
     4 Fire 2.2 xTreme x64          : 2956    8    8   3670  56.2 %    2913  45.8 %
     5 Deep Rybka 4.1 x64           : 2917    8    8   3820  49.6 %    2920  43.1 %
     6 komodo 3 x64                 : 2907   14   14   1520  49.5 %    2910  34.1 %
     7 Stockfish 2.1.1 JA x64       : 2895    9    9   3820  45.9 %    2923  39.1 %
     8 Gull 1.2 x64                 : 2786    9    9   3820  34.6 %    2896  30.3 %
     9 Naum 4.2 x64                 : 2785    9    9   3820  34.4 %    2896  30.8 %
    10 Deep Shredder 12 x64         : 2734   10   10   3820  27.2 %    2905  27.3 %