The influence of the length of openings

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

The influence of the length of openings

Post by Laskos »

With Houdini 3 at 2.5''+0.05'', Komodo 5.1 at 5''+0.1'', Rybka 4.1 at 5''+0.1'', and Stockfish 3 at 7.5''+0.15'' to roughly equalize the strengths, I had them play from opening lines of different lengths (4, 16 and 30 plies) taken from the SWCR.PGN and GM2006.PGN game collections. I expected a more spectacular result, with Houdini performing better from the 4-ply openings, since it is very good at Chess960, but that is not the case: all four engines show a similar dependence on opening length. The longer the opening lines, the more draws there are (about 9 percentage points more from 4 to 30 plies), and the more the ratings are compressed (the gap between first and last shrinks from 76 points at 4 plies to 56 points at 30 plies). Apart from this compression, the relative strengths remain very stable, so there is little danger that testers and testing groups distort the order of the rankings by using longer or shorter openings.


Openings 4 plies 

    Program                            Score       %     Elo    +   -    Draws

  1 Komodo 5.1                     : 1699.0/3000  56.6    35   10  10   31.3 %
  2 Stockfish 3                    : 1602.5/3000  53.4    18   10  10   32.4 %
  3 Houdini 3                      : 1430.0/3000  47.7   -12   11  11   28.4 %
  4 Rybka 4.1                      : 1268.5/3000  42.3   -41   10  10   31.1 %



Openings 16 plies

    Program                            Score       %     Elo    +   -    Draws

  1 Komodo 5.1                     : 1662.5/3000  55.4    28   10  10   36.4 %
  2 Stockfish 3                    : 1597.5/3000  53.2    17   10  10   34.4 %
  3 Houdini 3                      : 1429.5/3000  47.6   -12   10  10   33.7 %
  4 Rybka 4.1                      : 1310.5/3000  43.7   -33   10  10   36.4 %



Openings 30 plies

    Program                            Score       %     Elo    +   -    Draws

  1 Komodo 5.1                     : 1646.0/3016  54.6    24   10  10   38.7 %
  2 Stockfish 3                    : 1605.5/3015  53.3    17   10  10   41.0 %
  3 Houdini 3                      : 1457.5/3017  48.3    -9   10  10   38.9 %
  4 Rybka 4.1                      : 1322.0/3014  43.9   -32   10  10   41.0 %
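
The compression and the draw increase can be read straight off the three tables. The following is a minimal sketch that only re-uses the Elo and draw columns shown above; the names `tables` and `summarize` are made up for illustration, and the mean draw rate is a plain average over the four engines.

```python
# (Elo, draw %) per engine, copied from the three tables above.
tables = {
    4: {
        "Komodo 5.1": (35, 31.3),
        "Stockfish 3": (18, 32.4),
        "Houdini 3": (-12, 28.4),
        "Rybka 4.1": (-41, 31.1),
    },
    16: {
        "Komodo 5.1": (28, 36.4),
        "Stockfish 3": (17, 34.4),
        "Houdini 3": (-12, 33.7),
        "Rybka 4.1": (-33, 36.4),
    },
    30: {
        "Komodo 5.1": (24, 38.7),
        "Stockfish 3": (17, 41.0),
        "Houdini 3": (-9, 38.9),
        "Rybka 4.1": (-32, 41.0),
    },
}


def summarize(table):
    """Return (Elo spread, mean draw %, ranking by Elo) for one table."""
    elos = [elo for elo, _ in table.values()]
    draws = [draw for _, draw in table.values()]
    ranking = sorted(table, key=lambda name: table[name][0], reverse=True)
    return max(elos) - min(elos), sum(draws) / len(draws), ranking


summary = {plies: summarize(table) for plies, table in tables.items()}

for plies, (spread, mean_draw, ranking) in summary.items():
    print(f"{plies:2d} plies: spread {spread} Elo, "
          f"mean draws {mean_draw:.1f} %, order {ranking}")
```

The spread goes 76, 61, 56 Elo as the openings get longer, while the order of the four engines is the same in all three tables.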
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: The influence of the length of openings

Post by IWB »

Laskos wrote:...(from 76 points difference at 4 plies to 56 points at 30 plies openings...
What is the average game length for 4 and 30 plies? If there is a bigger difference in the number of "self-calculated" moves, it acts like a longer or shorter time control, which might increase the draw rate.
But it is interesting that, besides the higher draw rate, nothing changes in the ranking ...

Interesting, thx
Ingo
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: The influence of the length of openings

Post by Laskos »

I looked up the game lengths in SCID: the average was about 70 for 4 plies, about 74 for 16, and about 77 for 30. The time control had an increment, so the total effective TC increased by no more than 2-3% with the longer openings. That cannot account for such an increase in the draw rate (from 31% to 40%). Yes, I found it surprising that the ratings just compress due to the higher draw rate while the relative ratings remain almost unchanged. Of course, this is with many games and neutral openings; if one starts setting all sorts of traps in the openings, the picture might look different, but I was not interested in that. My gut feeling was that Houdini handles the opening phase better on its own, without a book, but as it turns out, all four top engines behave the same. I was also somewhat suspicious of tests with openings longer than 12 moves; it turns out that they are valid if the openings are neutral, at least for these four engines (we should just expect a higher draw rate).
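
How a higher draw rate by itself pulls scores toward 50% can be shown with a toy calculation. This is only a sketch under a stated assumption that is not measured anywhere in this thread: the extra draws are taken proportionally from wins and losses, so the win/loss ratio among decisive games stays fixed. The function name `score_after_draw_shift` is hypothetical.

```python
def score_after_draw_shift(score, draw_rate, new_draw_rate):
    """Score fraction after changing the draw rate.

    Assumption (illustrative only): the share of wins among decisive
    games stays the same when the draw rate changes.
    """
    wins = score - draw_rate / 2.0        # a draw is worth half a point
    losses = 1.0 - draw_rate - wins
    win_share = wins / (wins + losses)    # wins as a share of decisive games
    return (1.0 - new_draw_rate) * win_share + new_draw_rate / 2.0


# Komodo 5.1 at 4 plies: 56.6 % score with 31.3 % draws.
komodo_same = score_after_draw_shift(0.566, 0.313, 0.313)
# Same win/loss ratio, but with Komodo's 30-ply draw rate of 38.7 %.
komodo_more_draws = score_after_draw_shift(0.566, 0.313, 0.387)

print(f"unchanged draw rate: {komodo_same:.3f}")
print(f"30-ply draw rate:    {komodo_more_draws:.3f}")
```

With Komodo's 4-ply figures this gives a score a little under 56% at the 30-ply draw rate, against 54.6% actually observed, so the toy model reproduces the direction of the compression but not its full size.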
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: The influence of the length of openings

Post by IWB »

If the average game length is basically unaffected by the length of the opening, the only cause of the higher draw rate I can think of is that a longer but equal opening simply leaves less room for the engines, as the position IS already more drawish ...
For a tester this means that he has to find openings which are
1. as short as possible (to let the engine decide and not a book or position)
2. as different as possible (to have a variety of chess openings)

Thx for the info
Ingo
Nelson Hernandez
Posts: 101
Joined: Sun Nov 14, 2010 9:36 pm
Location: U.S.

Re: The influence of the length of openings

Post by Nelson Hernandez »

Games are won when one player or the other makes a blunder or inaccuracy which, with accurate play, can be converted to a win. In long openings that exit into "neutral" positions you have substantially reduced the number of move-opportunities where an engine could potentially make such an inaccurate move, and the achievement of a neutral position after so many moves is inherently very drawish.

It turns out that in a surprising number of games one side is already losing by the 10th or 15th move and never equalizes. You eliminate such cases by exiting to neutral positions.
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: The influence of the length of openings

Post by lucasart »

It is a well-known fact that opening books are mostly useless Elo-wise past the first 3-4 moves.

They are only useful for bringing diversity to the games (without an opening book an engine would keep playing the same game, as the algorithm is deterministic).

In fact, the real risk is that a large opening book (compiled automatically from a large database of unverified moves) forces mistakes that the engine wouldn't play. In other words, engines generally play better than large books (which doesn't mean that every move they play is better; remember that Elo is determined more by your 1% worst moves than by your 1% best ones...)

Personally, I use 8 moves, to strike a good balance between diversity and engine creativity. There's nothing more boring than those extremely long book lines that effectively start the game in an already drawn endgame.

An engine that I really hated for that was Spike 1.2: you cannot disable the book, and it's hardcoded inside the executable! And the book is just endless and always tries to play for boring, blocked, symmetric, drawish positions... It really killed the creativity and the fun.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: The influence of the length of openings

Post by Don »

We do all our testing with very shallow and wide opening books: only 5 moves (10 plies).

Obviously, if the opening book is playing most of the game, you are not really testing the engine as much, and you give much greater chances to the weaker engine.

Some people use highly developed opening books that go very deep. That is fine for maximizing strength on-line or in a tournament setting, but not for objective computer vs computer testing.
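
For anyone who wants to cut an existing PGN collection down to a fixed opening depth such as the 10 plies mentioned here, the truncation step could look roughly like the sketch below. It is not how any poster in this thread built their books; `truncate_movetext` is a hypothetical helper that works on a single game's movetext string, handles only simple comments, NAGs and non-nested variations, and does not check move legality.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def truncate_movetext(movetext, plies):
    """Return the first `plies` half-moves of a PGN movetext string."""
    # Remove {comments}, simple (non-nested) variations and NAGs like $1.
    text = re.sub(r"\{[^}]*\}", " ", movetext)
    text = re.sub(r"\([^()]*\)", " ", text)
    text = re.sub(r"\$\d+", " ", text)

    moves = []
    for token in text.split():
        # Strip move numbers such as "1." or "12...", also when glued to a move.
        token = re.sub(r"^\d+\.+", "", token)
        if token and token not in RESULTS:
            moves.append(token)

    # Rebuild the movetext with fresh move numbers.
    out = []
    for i, san in enumerate(moves[:plies]):
        if i % 2 == 0:
            out.append(f"{i // 2 + 1}.")
        out.append(san)
    return " ".join(out)


game = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 {Ruy Lopez} 4. Ba4 Nf6 5. O-O Be7 6. Re1 b5 1/2-1/2"
print(truncate_movetext(game, 4))
print(truncate_movetext(game, 10))
```

A real book builder would also want to drop duplicate lines and check that the resulting positions are reasonably balanced, which this sketch does not attempt.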
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Modern Times
Posts: 3546
Joined: Thu Jun 07, 2012 11:02 pm

Re: The influence of the length of openings

Post by Modern Times »

I prefer short (8 or less) but even better, chess960 with no book at all.
PaulieD
Posts: 211
Joined: Tue Jun 25, 2013 8:19 pm

Re: The influence of the length of openings

Post by PaulieD »

This is why engine test suites with colors reversing cannot be beat for engine testing.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: The influence of the length of openings

Post by Laskos »

Yes, it seems that a shallow but still diverse book is optimal. What is curious is that I expected some engines to overperform or underperform with only a 4-ply book compared to 16- or 30-ply books. That is not the case; it seems that engines play the openings by themselves at a level similar to their overall strength. For openings you have to encode many things in the evaluation, similarly to endgames. But in endgames the engines' strength does shift relative to their overall play, and it is well known that some engines under- or over-perform in endgames.

Balanced endgame suite at the same TC as in the first post:


    Program                              Score     %      Elo    +   -    Draws

  1 Stockfish 3                    : 1651.0/3021  54.7     24    8   8   61.7 %
  2 Komodo 5.1                     : 1598.5/3025  52.8     15    8   8   60.3 %
  3 Rybka 4.1                      : 1436.5/3023  47.5    -13    8   8   60.8 %
  4 Houdini 3                      : 1362.0/3027  45.0    -26    8   8   57.5 %
The ratings are even more compressed and the draw ratio is very high, but one can say that in endgames Stockfish overperforms and Houdini underperforms compared to their overall ratings. No such thing happens in the openings.
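
As a cross-check on how the score percentages relate to the Elo column, the standard logistic Elo formula can be applied to the endgame scores. This is a sketch only: the rating tool behind the tables is not named in the thread, so the values below are differences against the average opposition and need not match the table's Elo column exactly. The function name `elo_diff_from_score` is made up for illustration.

```python
import math


def elo_diff_from_score(score):
    """Elo difference implied by a score fraction (standard logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


# Score fractions from the endgame table above.
endgame_scores = {
    "Stockfish 3": 0.547,
    "Komodo 5.1": 0.528,
    "Rybka 4.1": 0.475,
    "Houdini 3": 0.450,
}

endgame_diffs = {name: elo_diff_from_score(s) for name, s in endgame_scores.items()}

for name, diff in endgame_diffs.items():
    print(f"{name:12s} {diff:+6.1f} Elo vs. average opposition")
```

For Stockfish's 54.7% this gives roughly +33 Elo against the average of its three opponents, which is consistent with its table rating of 24 against opponents averaging about -8.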