Long TC matches with Houdini 3 Beta
Moderators: hgm, Rebel, chrisw
-
- Posts: 1833
- Joined: Thu Jun 22, 2006 12:07 am
Re: Long TC matches with Houdini 3 Beta
Impressive results, Robert. I can only say congratulations!
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Long TC matches with Houdini 3 Beta
MM wrote:
> Laskos wrote:
>> Seems a 50-60 point improvement at long TC, which is impressive. Also, in tactical mode it beats everything on test suites.
> Pity that the tactical mode weakens the engine (overall), otherwise it would be the default.
> +50/60 Elo? Maybe, but I would be very prudent about it. I think it's too soon to estimate an improvement. Note that many opening lines of the matches are pretty long, and that probably favors one side or the other (usually Houdini adapts to a wider range of middlegame positions better than other engines).
> Note also that 360 games are still not a large enough sample to jump to conclusions.
> On the other hand, considering that H3 is probably not tactically stronger than H2, or at least not much stronger, if H3 really is so strong overall at long time control, it would mean that Robert Houdart has done superb work on the positional play.
> I would be curious to see some results of H3 at chess960, in which the influence of the opening book is zero.
> Best Regards

The Noomen test positions with reversed colours are pretty fair and representative, I guess. Not only do these positions not favour one side, but favouring one side heavily would also only compress the differences when played with reversed colours. 360 games give an error of about 25 points, so let's see, but for now it is very impressive: the W/L ratio against the top engines is 2-3 for H3.
Kai
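As a rough check on the "about 25 points" error quoted for 360 games, here is a minimal sketch of the usual normal-approximation error margin. The 50% score, the 50% draw ratio, and the function name `elo_error_margin` are assumptions for illustration, not figures or code from the thread.

```python
import math

def elo_error_margin(games, score=0.5, draw_ratio=0.5, z=1.96):
    """Approximate 95% error margin (in Elo) of a match result.

    score and draw_ratio are assumed values, not data from the thread.
    """
    wins = score - draw_ratio / 2.0      # fraction of games won
    losses = 1.0 - wins - draw_ratio     # fraction of games lost
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0)
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    std_error = math.sqrt(variance / games)
    # Slope of Elo(score) = -400 * log10(1/score - 1) at the given score
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * std_error * slope

margin_360 = elo_error_margin(360)
print(f"{margin_360:.1f} Elo")
```

Under these assumptions the margin comes out at roughly 25 Elo. A lower draw ratio would widen it, and a higher one would narrow it.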
-
- Posts: 766
- Joined: Sun Oct 16, 2011 11:25 am
Re: Long TC matches with Houdini 3 Beta
Laskos wrote:
> The Noomen test positions with reversed colours are pretty fair and representative, I guess. Not only do these positions not favour one side, but favouring one side heavily would also only compress the differences when played with reversed colours. 360 games give an error of about 25 points, so let's see, but for now it is very impressive: the W/L ratio against the top engines is 2-3 for H3.
> Kai

I didn't say that "positions" favor one side; I said that medium-long opening book lines can favor one side (or the other). I think Houdini adapts better than other engines to the different middlegame positions, even when it doesn't like them, even the same position with white and black.
That's why, when I can, I prefer to test at chess960. I don't like to see engines skip the opening and play already prepared positions.
Best Regards
MM
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Long TC matches with Houdini 3 Beta
MM wrote:
> I didn't say that "positions" favor one side; I said that medium-long opening book lines can favor one side (or the other). I think Houdini adapts better than other engines to the different middlegame positions, even when it doesn't like them, even the same position with white and black.
> That's why, when I can, I prefer to test at chess960. I don't like to see engines skip the opening and play already prepared positions.
> Best Regards

Chess960 is a somewhat different game; I think Critter is optimized for it. Maybe it's better to test with shorter lines (testing groups use book lines or opening positions of different lengths), but to me those Noomen positions are good enough. Maybe Houdini adapts to middlegames better, but theoretically, the longer the (balanced) lines, the more balanced the results.
Kai
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Long TC matches with Houdini 3 Beta
MM wrote:
> That's why, when I can, I prefer to test at chess960. I don't like to see engines skip the opening and play already prepared positions.

I expect the improvement for Chess960 to be slightly larger than for normal chess.
Two months ago I ran Chess960 matches pitting Houdini 2.0c and Houdini 3 DEV against Critter 1.6a - see my post http://www.talkchess.com/forum/viewtopic.php?p=476331 and following. 1920 games at 2'+2", single thread.
Houdini 2.0c  - Critter 1.6a :  910-1010 (-18 Elo ± 12 Elo)
Houdini 3 DEV - Critter 1.6a : 1134-786  (+64 Elo ± 12 Elo)
Measured gain was (82 Elo ± 17 Elo).
The current Houdini 3 is approx. 10 Elo stronger than the DEV version of July.
Robert
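For readers who want to check the arithmetic behind the match results above, here is a minimal sketch using the standard logistic Elo formula. It reads "910-1010" and "1134-786" as points scored by each side over 1920 games, which is an interpretation (though it reproduces the posted figures), and it assumes the two matches are independent so that their error margins add in quadrature. The function name `elo_from_score` is illustrative only.

```python
import math

def elo_from_score(points, games):
    """Elo difference implied by a match score under the logistic model."""
    p = points / games
    return -400.0 * math.log10(1.0 / p - 1.0)

games = 1920
h2_elo = elo_from_score(910, games)    # Houdini 2.0c vs Critter 1.6a
h3_elo = elo_from_score(1134, games)   # Houdini 3 DEV vs Critter 1.6a

gain = h3_elo - h2_elo
# Assumption: independent matches, so the two ±12 margins add in quadrature
gain_error = math.sqrt(12.0 ** 2 + 12.0 ** 2)

print(f"H2: {h2_elo:+.0f}, H3: {h3_elo:+.0f}, gain: {gain:.0f} ± {gain_error:.0f}")
# prints "H2: -18, H3: +64, gain: 82 ± 17"
```

The ±12 margins themselves depend on the draw ratio of the matches, which is not given in the thread, so they are taken as posted rather than recomputed.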
-
- Posts: 6081
- Joined: Fri Mar 10, 2006 11:14 pm
- Location: Munster, Nuremberg, Princeton
Re: Long TC matches with Houdini 3 Beta
Could you please publish these 1920 games? Thanks
-Popper and Lakatos are good but I'm stuck on Leibowitz
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Long TC matches with Houdini 3 Beta
carldaman wrote:
> Nice result, Robert. Also, ironic and interesting that the win % =~ phi, wonder if there is a significance
> Regards,
> CL

you mean 1/phi ?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 766
- Joined: Sun Oct 16, 2011 11:25 am
Re: Long TC matches with Houdini 3 Beta
Houdini wrote:
> I expect the improvement for Chess960 to be slightly larger than for normal chess.
> Two months ago I ran Chess960 matches pitting Houdini 2.0c and Houdini 3 DEV against Critter 1.6a - see my post http://www.talkchess.com/forum/viewtopic.php?p=476331 and following. 1920 games at 2'+2", single thread.
> Houdini 2.0c  - Critter 1.6a :  910-1010 (-18 Elo ± 12 Elo)
> Houdini 3 DEV - Critter 1.6a : 1134-786  (+64 Elo ± 12 Elo)
> Measured gain was (82 Elo ± 17 Elo).
> The current Houdini 3 is approx. 10 Elo stronger than the DEV version of July.
> Robert

Hi Robert, thank you. I was aware of that match, and I'm glad to hear that the current Houdini 3 is about 10 Elo stronger than that. Don't you think it would be interesting to run some other quick matches (2'+2'' is good) against other engines (e.g. Stockfish, Rybka 4.1)?
Best Regards
MM
-
- Posts: 10317
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: Long TC matches with Houdini 3 Beta
MM wrote:
> Hi Robert, thank you. I was aware of that match, and I'm glad to hear that the current Houdini 3 is about 10 Elo stronger than that. Don't you think it would be interesting to run some other quick matches (2'+2'' is good) against other engines (e.g. Stockfish, Rybka 4.1)?
> Best Regards

I think that for a comparison between long and short time controls it is better to use the same type of time control, the same positions, and the same opponents.
This is the reason that I suggested 6'+2'' (90/15+30/15) or 3'+1'' (90/30+30/30).
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: Long TC matches with Houdini 3 Beta
Uri Blass wrote:
> 90+1
> Rank Name                       Elo    +    - games score  oppo. draws
>    1 Komodo 4471.02 64 bit   3060.9 12.9 12.9  2530 57.9% 2990.0 35.4%
>    2 Komodo 4467.01 64 bit   3027.6  8.7  8.7  5300 54.4% 2990.0 42.0%
>    3 Houdini 1.5a x64        3025.2  7.2  7.2  7884 49.7% 3027.6 39.1%
>    4 Komodo 4468.00 64 bit   3024.8  8.7  8.7  5298 54.1% 2990.1 42.4%
>    5 Komodo 4471.01 64 bit   3021.6  8.6  8.6  5321 53.7% 2990.0 43.5%
>    6 Komodo 5 64 bit dev     3020.7  8.7  8.7  5313 53.7% 2990.1 43.1%
>    7 Critter 1.4 64-bit SSE4 3000.0  7.1  7.1  7957 46.7% 3027.6 44.4%
>    8 Stockfish 2.2.2 JA      2945.0  7.1  7.1  7921 40.4% 3027.6 42.4%
>
> 120+2
> Rank Name                       Elo    +    - games score  oppo. draws
>    1 Komodo 4467.01 64 bit   3036.1  8.5  8.5  5594 55.0% 2992.1 43.9%
>    2 Houdini 1.5a x64        3030.6  6.0  6.0 11500 50.4% 3027.7 42.1%
>    3 Komodo 4463.00 64 bit   3029.7  6.1  6.1 10939 54.3% 2992.3 44.6%
>    4 Komodo 4466.02 64 bit   3027.1  7.5  7.5  7127 53.9% 2992.2 45.0%
>    5 Komodo 5 64 bit dev     3021.9  6.1  6.1 10906 53.4% 2992.2 44.4%
>    6 Critter 1.4 64-bit SSE4 3000.0  5.9  5.9 11530 46.8% 3027.7 45.5%
>    7 Stockfish 2.2.2 JA      2946.2  5.9  5.9 11536 40.7% 3027.7 45.9%
>
> It seems based on the results that Komodo 4471.02 64 bit is significantly stronger than other versions but for some reason you tested it only in the 90+1 list.
> I wonder if it was really a big improvement relative to other versions of komodo or maybe there is some mistake in the data or some problem in the machine that tested it.

It's not, and the result is invalid. There was a bug in the tester, and it gave this bogus result.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."