New Houdini


stavros
Posts: 165
Joined: Tue Dec 02, 2014 1:29 am

Re: re: win adjudication/Re: New Houdini

Post by stavros »

Laskos wrote:Still 2 games to play in round 58/62 TCEC Rapid, the Ordo logistic standings:

Code: Select all

   # PLAYER                  : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)

   1 Houdini 200716          : 3341.8  117.4      52.5      58    90.5      90    
   2 Komodo 1692.19          : 3244.4  102.1      49.5      58    85.3      55    
   3 Stockfish 030916        : 3234.8  100.9      49.0      58    84.5      93    
   4 Fire 5                  : 3135.6   92.6      46.0      58    79.3      76    
   5 Jonny 8                 : 3091.7   85.2      43.0      58    74.1      64    
   6 Ginkgo 1.9h             : 3069.7   84.2      42.0      58    72.4      75    
   7 Gull 3                  : 3030.3   82.8      40.0      58    69.0      60    
   8 Andscacs 0.872b         : 3015.0   82.8      39.0      58    67.2      54    
   9 Rybka 4.1               : 3009.7   76.0      39.0      58    67.2      60    
  10 Nirvana 010916          : 2996.3   79.1      38.0      58    65.5      63    
  11 Protector 1.9           : 2977.6   78.7      37.5      58    64.7      56    
  12 Chiron 030916           : 2969.9   78.3      36.5      58    62.9      64    
  13 Texel 1.07a6            : 2950.7   76.9      36.0      58    62.1      62    
  14 Naum 4.6                : 2934.3   76.3      33.5      58    57.8      53    
  15 Critter 1.6a            : 2930.2   77.0      34.5      58    59.5      57    
  16 Hannibal 1.7            : 2920.1   79.1      34.5      58    59.5      65    
  17 Fizbo 1.8               : 2900.0   76.0      33.5      58    57.8      89    
  18 Bobcat 070916           : 2832.4   78.4      28.0      58    48.3      79    
  19 Raptor 2.3              : 2788.3   77.9      27.5      58    47.4      80    
  20 Vajolet2 2.2.15         : 2739.8   83.2      22.5      57    39.5      76    
  21 Fruit 070916            : 2698.5   83.6      22.0      57    38.6      53    
  22 Laser 280816            : 2693.9   86.3      22.0      58    37.9      57    
  23 Arasan 19.1             : 2683.2   86.9      20.0      58    34.5      73    
  24 Gaviota 1.01            : 2647.7   81.2      19.0      58    32.8      64    
  25 The Baron 3.40b         : 2626.0   87.1      19.5      58    33.6      51    
  26 DisasterArea 1.63       : 2625.0   91.1      17.5      57    30.7      74    
  27 Hakkapeliitta 210416    : 2585.0   90.3      17.5      58    30.2     100    
  28 Jellyfish 1.1           : 2340.7  123.1       9.5      58    16.4      95    
  29 Myrddin 0.87            : 2180.9  156.4       6.0      58    10.3      69    
  30 Delphil 3.3b2           : 2130.2  170.0       5.0      58     8.6      69    
  31 Firefly 2.7.0           : 2075.6  182.8       4.0      57     7.0      88    
  32 Fridolin 2              : 1928.6  232.0       2.0      58     3.4     ---    

White advantage = 35.28 +/- 11.48
Draw rate (equal opponents) = 55.17 % +/- 2.99
Houdini is about 100 Elo points above Komodo and Stockfish. Also, the confidence of superiority of Houdini over Komodo is 90% (and higher over Stockfish) in this Rapid.

The most striking peculiarity is that Houdini keeps more Queens on the board till the Win adjudication than Stockfish and Komodo. Average number of Queens at the Win adjudication:

Houdini: 1.06
Stockfish: 0.70
Komodo: 0.68

The confidence is about 97.5%, or 2 standard deviations, that this is not a statistical fluke.
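To make the "2 standard deviations" figure concrete, here is a minimal sketch that converts between a z-score and a one-sided confidence level. It assumes the quantity in question (the difference in average queen counts) is roughly normally distributed; the post does not say how the number was actually derived, and the function names are illustrative only.

```python
from statistics import NormalDist


def one_sided_confidence(z: float) -> float:
    """One-sided confidence for a z-score under a standard normal (assumed)."""
    return NormalDist().cdf(z)


def z_for_confidence(conf: float) -> float:
    """z-score needed to reach a given one-sided confidence level."""
    return NormalDist().inv_cdf(conf)


# 2 standard deviations one-sided is a little under 98%.
print(f"{one_sided_confidence(2.0):.4f}")
# 97.5% one-sided corresponds to a z-score close to 1.96.
print(f"{z_for_confidence(0.975):.3f}")
```

Under this assumption, 97.5% and 2 standard deviations are close but not identical, which is consistent with the rounded figure quoted above.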

Right, at only 58 games. Also, you "forgot" the games among Houdini, Komodo, and SF alone, and who comes first there. You also "forgot" that the latest SF is far ahead of Komodo; the TCEC Rapid shows them as equal, which means it doesn't represent the true SF-Komodo difference. As for the SF-Houdini games, SF is +2 wins. I know you will say the sample is small. By the way, the latest SF with contempt +10 shows SF 90-100 Elo above Komodo, not to mention asmFish.
I know Houdini's positional play might make many people overestimate it, as happened with Komodo 10 when it came out, but the "crap" SF soon proved to be the more "pragmatic" chess engine.
As for Houdini 5 being 100 Elo better than SF based on your sample (with weak opponents), the only sample I trust is a head-to-head match. So far SF-Houdini is +2 =8 -0; a small sample, OK. Oh, and as I always say, give some credit to o p e n source SF :) Got the meaning?
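For a sense of how much a +2 =8 -0 head-to-head result says, here is a rough sketch of the commonly used likelihood-of-superiority (LOS) approximation from wins and losses only (draws do not enter the formula). This is a normal approximation and is crude with so few decisive games; it is not how Ordo or bayeselo compute their figures, and the function name is illustrative.

```python
import math


def los_from_wins_losses(wins: int, losses: int) -> float:
    """Approximate LOS from decisive games only (normal approximation)."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games: no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


# SF vs Houdini head to head as quoted in the post: +2 =8 -0
print(f"{los_from_wins_losses(2, 0):.3f}")
```

With only two decisive games, the estimate is very sensitive to a single result: one loss would change it substantially.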
stavros
Posts: 165
Joined: Tue Dec 02, 2014 1:29 am

Re: re: win adjudication/Re: New Houdini

Post by stavros »

clumma wrote:Amazing. How can someone leave the field for years and come back with this kind of performance straight away?
He just took some ideas from open st.. oops, sorry, from his open mind :)
clumma
Posts: 186
Joined: Fri Oct 10, 2014 10:05 pm
Location: Berkeley, CA

Re: re: win adjudication/Re: New Houdini

Post by clumma »

stavros wrote:He just took some ideas from open st.. oops, sorry, from his open mind :)
Maybe, but anyone can do that, including the Komodo team.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: re: win adjudication/Re: New Houdini

Post by MikeB »

Laskos wrote:After 54-55 rounds out of 62 in the TCEC Rapid:

Code: Select all

   # PLAYER                  : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Houdini 200716          : 3341.7  121.5      50.0      55    90.9      88    
   2 Stockfish 030916        : 3247.0  104.6      47.0      55    85.5      53    
   3 Komodo 1692.19          : 3241.6  106.0      47.5      55    86.4      94    
   4 Fire 5                  : 3127.7   95.0      42.0      54    77.8      71    
   5 Jonny 8                 : 3094.0   89.8      40.0      54    74.1      52    
   6 Ginkgo 1.9h             : 3091.3   85.5      41.5      55    75.5      76    
   7 Gull 3                  : 3048.6   87.1      38.0      54    70.4      68    
   8 Rybka 4.1               : 3021.0   80.1      37.5      55    68.2      53    
   9 Andscacs 0.872b         : 3016.4   84.3      36.5      54    67.6      54    
  10 Nirvana 010916          : 3010.2   85.3      37.0      55    67.3      55    
  11 Chiron 030916           : 3002.3   82.4      35.5      54    65.7      59    
  12 Protector 1.9           : 2989.6   82.1      35.5      55    64.5      71    
  13 Texel 1.07a6            : 2958.1   83.2      34.0      54    63.0      70    
  14 Naum 4.6                : 2928.5   79.4      31.0      55    56.4      53    
  15 Hannibal 1.7            : 2925.0   78.7      32.0      55    58.2      52    
  16 Fizbo 1.8               : 2922.8   80.5      32.5      55    59.1      53    
  17 Critter 1.6a            : 2918.5   81.8      31.5      54    58.3      90    
  18 Bobcat 070916           : 2845.8   81.0      26.0      55    47.3      80    
  19 Raptor 2.3              : 2797.4   82.2      26.5      55    48.2      79    
  20 Vajolet2 2.2.15         : 2751.6   83.4      21.0      55    38.2      75    
  21 Fruit 070916            : 2710.5   87.2      20.5      55    37.3      56    
  22 Laser 280816            : 2700.8   87.7      20.5      55    37.3      66    
  23 Arasan 19.1             : 2675.1   91.6      17.5      55    31.8      59    
  24 Gaviota 1.01            : 2659.8   87.9      18.5      55    33.6      60    
  25 The Baron 3.40b         : 2643.7   91.4      18.5      54    34.3      64    
  26 DisasterArea 1.63       : 2619.7   97.4      16.0      55    29.1      65    
  27 Hakkapeliitta 210416    : 2592.0   96.1      16.5      55    30.0     100    
  28 Jellyfish 1.1           : 2354.1  131.5       9.5      55    17.3      91    
  29 Myrddin 0.87            : 2226.1  158.3       6.0      55    10.9      85    
  30 Delphil 3.3b2           : 2106.3  193.7       4.0      55     7.3      55    
  31 Firefly 2.7.0           : 2091.2  194.0       4.0      55     7.3      87    
  32 Fridolin 2              : 1941.5  238.0       2.0      55     3.6     ---    

White advantage = 32.74 +/- 11.28
Draw rate (equal opponents) = 54.94 % +/- 3.17
Houdini performs about 100 Elo points above SF and Komodo, with SF a bit above Komodo although it has fewer points. Houdini is 88% likely to be stronger than SF in the Rapids. That is not very high confidence, but I predict that the Superfinal won't be a walkover for SF over Houdini.
A slightly different look, using bayeselo (mm 1 1, covariance):

Code: Select all

Rank Name                  Rating   Δ     +    -     #     Σ    Σ%     W    L    D   W%    =%   OppR 
------------------------------------------------------------------------------------------------------
   1 Houdini 200716         3509   0.0   77   77    59   53.5  90.7   48    0   11  81.4  18.6  3098 
   2 Komodo 1692.19         3436  72.9   69   69    58   49.5  85.3   41    0   17  70.7  29.3  3100 
   3 Stockfish 030916       3423  12.9   67   67    59   49.5  83.9   40    0   19  67.8  32.2  3103 
   4 Fire 5                 3361  62.0   66   66    58   46.0  79.3   37    3   18  63.8  31.0  3077 
   5 Jonny 8                3326  35.1   65   65    58   43.0  74.1   34    6   18  58.6  31.0  3090 
   6 Ginkgo 1.9h            3305  21.1   63   63    58   42.0  72.4   32    6   20  55.2  34.5  3099 
   7 Gull 3                 3279  25.3   62   62    58   40.0  69.0   29    7   22  50.0  37.9  3094 
   8 Andscacs 0.872b        3271   8.0   62   62    58   39.0  67.2   28    8   22  48.3  37.9  3098 
   9 Rybka 4.1              3262   8.9   61   61    58   39.0  67.2   28    8   22  48.3  37.9  3097 
  10 Nirvana 010916         3250  12.8   61   61    58   38.0  65.5   25    7   26  43.1  44.8  3102 
  11 Protector 1.9          3237  12.8   61   61    58   37.5  64.7   27   10   21  46.6  36.2  3089 
  12 Chiron 030916          3229   7.4   62   62    58   36.5  62.9   27   12   19  46.6  32.8  3099 
  13 Texel 1.07a6           3216  13.4   63   63    58   36.0  62.1   29   15   14  50.0  24.1  3090 
  14 Naum 4.6               3196  20.1   59   59    59   34.5  58.5   22   12   25  37.3  42.4  3109 
  15 Hannibal 1.7           3195   0.9   61   61    58   34.5  59.5   24   13   21  41.4  36.2  3089 
  16 Critter 1.6a           3195   0.6   61   61    58   34.5  59.5   23   12   23  39.7  39.7  3098 
  17 Fizbo 1.8              3181  13.3   62   62    59   34.0  57.6   25   16   18  42.4  30.5  3093 
  18 Bobcat 070916          3124  57.8   59   59    59   29.0  49.2   19   20   20  32.2  33.9  3111 
  19 Raptor 2.3             3081  42.8   62   62    58   27.5  47.4   18   21   19  31.0  32.8  3090 
  20 Vajolet2 2.2.15        3049  31.6   61   61    59   24.0  40.7   14   25   20  23.7  33.9  3105 
  21 Laser 280816           3021  27.9   63   63    59   23.0  39.0   15   28   16  25.4  27.1  3098 
  22 Fruit 070916           3013   8.5   64   64    59   22.5  38.1   16   30   13  27.1  22.0  3097 
  23 Arasan 19.1            3003   9.6   62   62    59   20.0  33.9   11   30   18  18.6  30.5  3116 
  24 Gaviota 1.01           2983  20.4   64   64    59   19.0  32.2   12   33   14  20.3  23.7  3115 
  25 The Baron 3.40b        2963  19.5   66   66    58   19.5  33.6   14   33   11  24.1  19.0  3092 
  26 DisasterArea 1.63      2956   7.2   67   67    59   18.5  31.4   13   35   11  22.0  18.6  3107 
  27 Hakkapeliitta 210416   2947   8.8   65   65    59   18.5  31.4   11   33   15  18.6  25.4  3091 
  28 Jellyfish 1.1          2781 165.9   78   78    58    9.5  16.4    6   45    7  10.3  12.1  3102 
  29 Myrddin 0.87           2676 105.0   93   93    58    6.0  10.3    5   51    2   8.6   3.4  3105 
  30 Delphil 3.3b2          2631  45.3  109  109    58    5.0   8.6    4   52    2   6.9   3.4  3106 
  31 Firefly 2.7.0          2603  28.4  112  112    59    4.0   6.8    4   55    0   6.8   0.0  3116 
  32 Fridolin 2             2498 104.9  129  129    59    2.0   3.4    1   56    2   1.7   3.4  3105 
------------------------------------------------------------------------------------------------------
  Δ = delta from the next higher rated opponent
  # = number of games played
  Σ = total score, 1 point for win, 1/2 point for draw
The error bars are huge, but it's still interesting nonetheless.

And the likelihood of superiority (LOS):

Code: Select all

ResultSet-EloRating>los
                      Ho Ko St Fi Jo Gi Gu An Ry Ni Pr Ch Te Na Ha Cr Fi Bo Ra Va La Fr Ar Ga Th Di Ha Je My De Fi Fr
Houdini 200716           92 95 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100100100100100100100100100100
Komodo 1692.19         7    60 93 98 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100100100100100100100100
Stockfish 030916       4 39    90 98 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100100100100100100100100
Fire 5                 0  6  9    77 88 96 97 98 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100100
Jonny 8                0  1  1 22    67 84 88 91 95 97 98 99 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100
Ginkgo 1.9h            0  0  0 11 32    71 76 82 88 93 95 97 99 99 99 99 99 99 99 99 99 99 99 99 99 99100100100100100
Gull 3                 0  0  0  3 15 28    57 64 74 82 86 91 97 96 97 98 99 99 99 99 99 99 99 99 99 99100100100100100
Andscacs 0.872b        0  0  0  2 11 23 42    57 68 78 82 88 95 95 95 97 99 99 99 99 99 99 99 99 99 99100100100100100
Rybka 4.1              0  0  0  1  8 17 35 42    61 71 77 85 93 93 93 96 99 99 99 99 99 99 99 99 99 99100100100100100
Nirvana 010916         0  0  0  0  4 11 25 31 38    61 67 77 89 89 89 93 99 99 99 99 99 99 99 99 99 99100100100100100
Protector 1.9          0  0  0  0  2  6 17 21 28 38    56 67 82 82 83 89 99 99 99 99 99 99 99 99 99 99100100100100100
Chiron 030916          0  0  0  0  1  4 13 17 22 32 43    61 77 77 78 85 99 99 99 99 99 99 99 99 99 99100100100100100
Texel 1.07a6           0  0  0  0  0  2  8 11 14 22 32 38    67 67 68 78 98 99 99 99 99 99 99 99 99 99 99100100100100
Naum 4.6               0  0  0  0  0  0  2  4  6 10 17 22 32    50 51 63 95 99 99 99 99 99 99 99 99 99 99100100100100
Hannibal 1.7           0  0  0  0  0  0  3  4  6 10 17 22 32 49    50 62 94 99 99 99 99 99 99 99 99 99 99100100100100
Critter 1.6a           0  0  0  0  0  0  2  4  6 10 16 21 31 48 49    61 94 99 99 99 99 99 99 99 99 99 99100100100100
Fizbo 1.8              0  0  0  0  0  0  1  2  3  6 10 14 21 36 37 38    90 98 99 99 99 99 99 99 99 99 99100100100100
Bobcat 070916          0  0  0  0  0  0  0  0  0  0  0  0  1  4  5  5  9    83 95 98 99 99 99 99 99 99 99 99 99 99100
Raptor 2.3             0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1 16    76 90 93 95 98 99 99 99 99 99 99 99 99
Vajolet2 2.2.15        0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  4 23    72 78 84 92 96 97 98 99 99 99 99 99
Laser 280816           0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  9 27    57 65 79 89 91 94 99 99 99 99 99
Fruit 070916           0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  6 21 42    58 74 85 88 92 99 99 99 99 99
Arasan 19.1            0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  4 15 34 41    67 80 83 88 99 99 99 99 99
Gaviota 1.01           0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  7 20 25 32    66 71 77 99 99 99 99 99
The Baron 3.40b        0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3 10 14 19 33    55 63 99 99 99 99 99
DisasterArea 1.63      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  2  8 11 16 28 44    57 99 99 99 99 99
Hakkapeliitta 210416   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  5  7 11 22 36 42    99 99 99 99 99
Jellyfish 1.1          0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0    95 98 99 99
Myrddin 0.87           0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  4    73 84 98
Delphil 3.3b2          0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1 26    64 95
Firefly 2.7.0          0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 15 35    89
Fridolin 2             0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  4 10   
ResultSet-EloRating>
edit: I should say a slightly modified version of bayeselo

source: https://github.com/MichaelB7/bayeselo
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: re: win adjudication/Re: New Houdini

Post by Laskos »

MikeB wrote:
edit: I should say a slightly modified version of bayeselo

source: https://github.com/MichaelB7/bayeselo
BayesElo uses "BayesElos" and not logistic Elos. Without re-scaling, this usually means a compression of the logistic Elos. But LOS should be fine, and we see you get an LOS of 95% for Houdini against Stockfish, which is not extreme confidence, but still serious confidence that Houdini is not significantly weaker. By mining these 50 or 60 games per engine, I tried to see whether Houdini's qualification for the Superfinal, which came as a surprise, was a fluke, and it seems it was not. Also, Houdini's style seems pretty peculiar: it seems to care about its Queen more than other engines do.

All in all, it's very hard to predict the Superfinal outcome.
reflectionofpower
Posts: 1610
Joined: Fri Mar 01, 2013 5:28 pm
Location: USA

Re: re: win adjudication/Re: New Houdini

Post by reflectionofpower »

Yes, that is amazing. I always did like Houdini's style.
"Without change, something sleeps inside us, and seldom awakens. The sleeper must awaken." (Dune - 1984)

Lonnie
Ozymandias
Posts: 1535
Joined: Sun Oct 25, 2009 2:30 am

Re: re: win adjudication/Re: New Houdini

Post by Ozymandias »

Nay Lin Tun wrote:This Houdini scored 90 percent, 48/53 so far. So, Houdini is nearly 400 Elo above the average Elo. Impressive...
56/62 in the end, still over 90%. Sample size aside, it IS impressive.
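The "nearly 400 Elo above average" figure can be checked with the standard logistic Elo relation between score fraction and rating difference. This is only a sketch: it treats the whole field as a single opponent of average strength, whereas Ordo and bayeselo fit all games jointly, and the function name is illustrative.

```python
import math


def elo_diff_from_score(points: float, games: int) -> float:
    """Logistic Elo difference implied by scoring points out of games."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


# Scores quoted in the thread: 48/53 earlier, 56/62 at the end.
print(round(elo_diff_from_score(48, 53)))  # about 393
print(round(elo_diff_from_score(56, 62)))  # about 388
```

Both scores land a little under 400 Elo above the average opponent, in line with the estimate quoted above.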
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: re: win adjudication/Re: New Houdini

Post by Dr.Wael Deeb »

clumma wrote:Amazing. How can someone leave the field for years and come back with this kind of performance straight away?
Robert is extremely talented....
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
reflectionofpower
Posts: 1610
Joined: Fri Mar 01, 2013 5:28 pm
Location: USA

Re: re: win adjudication/Re: New Houdini

Post by reflectionofpower »

Dr.Wael Deeb wrote:
clumma wrote:Amazing. How can someone leave the field for years and come back with this kind of performance straight away?
Robert is extremely talented....
Dr.D
I like Houdini, but everyone copies from everyone nowadays. Years ago every program had a different style, and programmers never openly talked about how they did this or that in their programs.

Mark Zuckerberg - Facebook - took the idea of the Facebook concept from his colleagues when he was in college (HarvardConnection)

Steve Jobs - took the idea of an icon-based GUI from Xerox - became the Apple concept

Bill Gates - took the idea of Windows from Steve Jobs, in collaboration with him, one year later, I believe.

It's a rarity for someone to come up with an idea or a device that is original.

I am actually in the process of developing an invisible condom. The tagline is, "Not seeing is believing." I would be a god among mere mortals. :lol:
"Without change, something sleeps inside us, and seldom awakens. The sleeper must awaken." (Dune - 1984)

Lonnie