Low impact of opening phase in engine play?

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Post by Laskos » Tue Apr 18, 2017 1:51 am

With ChessCombi by Mark Alba, a UCI chess engine that combines two UCI chess engines into one, I performed the following experiment:

First, play 1000 games at 10''+0.1'' between Stockfish dev and Andscacs 0.90 from 2moves_v1.epd. The difference was 280 Elo points.
Second, play 1000 games at 10''+0.1'' between Stockfish and ChessCombi, which plays Stockfish's moves for moves 1-12 (from 2moves_v1.epd) and Andscacs 0.90's moves for the rest. The difference was 238 Elo points.

So, the contribution of moves 1-12 was 42 Elo points (280 - 238), or 15% of the total game-play difference between Stockfish dev and Andscacs 0.90. This is lower than I expected (30-35% maybe). Human play at high level seems to put the emphasis on openings.
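
The arithmetic behind the 15% figure can be written out as a minimal sketch. It assumes, as the post implicitly does, that Elo differences add up across game phases; the function name `opening_share` is made up for illustration.

```python
def opening_share(full_gap_elo, hybrid_gap_elo):
    """Return (Elo attributed to the opening moves, that Elo as a fraction of the full gap).

    full_gap_elo:   Elo gap when the weaker engine plays the whole game itself.
    hybrid_gap_elo: Elo gap when the stronger engine plays the opening moves
                    for the weaker one (the ChessCombi setup).
    Assumption: Elo contributions of game phases are additive.
    """
    opening_elo = full_gap_elo - hybrid_gap_elo
    return opening_elo, opening_elo / full_gap_elo


# Numbers from the 2moves_v1.epd runs quoted in the post.
elo, share = opening_share(280, 238)
print(f"{elo} Elo, {share:.0%} of the gap")  # 42 Elo, 15% of the gap
```

Each Elo figure comes from a 1000-game match and carries its own error bar, so the share computed from their difference is only a rough estimate.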

The cut at moves 1-12 for the opening is somewhat ad hoc, but that is roughly the opening phase. Also, maybe the time control is too short to reveal the impact of openings. Or maybe Andscacs 0.90 is particularly strong in openings, matching Stockfish more closely there than in general play. Or maybe 2moves_v1.epd is so scrambled that very few reasonable openings are left to play after 2 moves.

If the result stands, it might be something worth noting.

Dann Corbit
Posts: 9977
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA

Re: Low impact of opening phase in engine play?

Post by Dann Corbit » Tue Apr 18, 2017 1:57 am

There are some openings where the lines are fairly forced. So all good engines will follow the pv to some degree.

There are other openings where the best move is very much up for grabs and engines are very likely to play different moves.

I think it matters a lot what 2moves_v1.epd contains.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 10:41 am

Re: Low impact of opening phase in engine play?

Post by Lyudmil Tsvetkov » Tue Apr 18, 2017 5:03 am

or, most probably, both engines do not know how to play the openings, so they squeeze the same amount of Elo out of them as they do out of the other game phases.

also, there is a large, and important, degree of discontinuity when you switch from one engine to another. how would their searches be synchronised?

Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 10:41 am

Re: Low impact of opening phase in engine play?

Post by Lyudmil Tsvetkov » Tue Apr 18, 2017 5:05 am

also, Andscacs is likely to spoil any good positions it gets from SF.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Low impact of opening phase in engine play?

Post by Laskos » Tue Apr 18, 2017 5:30 am

Dann Corbit wrote:
There are some openings where the lines are fairly forced. So all good engines will follow the pv to some degree.

There are other openings where the best move is very much up for grabs and engines are very likely to play different moves.

I think it matters a lot what 2moves_v1.epd contains.
That would mean that Stockfish development, which uses the 2moves_v1.epd openings, goes astray in testing the opening phase, and that Stockfish should be particularly weak in this department, which doesn't seem to be true. I will try to build a 3-mover or so opening book from the GM games in GM2600.pgn to re-test.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Low impact of opening phase in engine play?

Post by Laskos » Tue Apr 18, 2017 6:39 am

Laskos wrote:
That would mean that Stockfish development, which uses the 2moves_v1.epd openings, goes astray in testing the opening phase, and that Stockfish should be particularly weak in this department, which doesn't seem to be true. I will try to build a 3-mover or so opening book from the GM games in GM2600.pgn to re-test.
I built the 3moves_GM.epd opening file from the GM2600.pgn file of GM games. It has 1170 unique opening positions and can be downloaded here: http://s000.tinyupload.com/index.php?fi ... 3625088497

Michel
Posts: 2038
Joined: Sun Sep 28, 2008 11:50 pm

Re: Low impact of opening phase in engine play?

Post by Michel » Tue Apr 18, 2017 6:46 am

Laskos wrote:
That would mean that Stockfish development, which uses the 2moves_v1.epd openings, goes astray in testing the opening phase, and that Stockfish should be particularly weak in this department, which doesn't seem to be true. I will try to build a 3-mover or so opening book from the GM games in GM2600.pgn to re-test.
The 2moves_v1.epd book was chosen in SF development not to develop opening play but to increase the sensitivity of the testing process. IMHO it has not been conclusively shown that it does (I am not saying it is wrong!). Some tests showed that 2moves_v1.epd indeed increases Elo differences, but no attention was paid to the width of the error bars, which increases as well (due to the reduced draw ratio).

Also I believe that the positions in 2moves_v1.epd are generally a bit unbalanced (I may be wrong on this though) so that one needs to use the pentanomial model to get precise error bars (the trinomial ones are too big).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Low impact of opening phase in engine play?

Post by Laskos » Tue Apr 18, 2017 8:24 am

Michel wrote:
The 2moves_v1.epd book was chosen in SF development not to develop opening play but to increase the sensitivity of the testing process. IMHO it has not been conclusively shown that it does (I am not saying it is wrong!). Some tests showed that 2moves_v1.epd indeed increases Elo differences, but no attention was paid to the width of the error bars, which increases as well (due to the reduced draw ratio).
I remember when they tried to gauge the Elo differences from 2moves_v1.epd and from longer books. I guess it does magnify the Elo differences, but whether it increases sensitivity is unclear. Judging from my new results here, it seems to damage opening testing with its random 2-movers. My first results with 3moves_GM.epd (reasonable moves) seem a bit different (and the Elo differences are compressed).

Also I believe that the positions in 2moves_v1.epd are generally a bit unbalanced (I may be wrong on this though) so that one needs to use the pentanomial model to get precise error bars (the trinomial ones are too big).
Yes, they are indeed a bit unbalanced. I remember I had about 10-15% of openings above the 0.6 threshold. Using the pentanomial model in SPRT with 2moves_v1.epd decreased the number of games needed to stop by 20-30% IIRC (standard deviation by 10-15%).
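
To illustrate the trinomial versus pentanomial point, here is a minimal sketch of the two standard-error estimates. The trinomial one treats every game as independent; the pentanomial one treats each pair of games (same opening, colors reversed) as the unit. The function names and all counts below are made up for illustration and are not results from these tests.

```python
import math


def trinomial_se(wins, draws, losses):
    """Standard error of the mean score per game, treating games as independent."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return math.sqrt(var / n)


def pentanomial_se(pair_counts):
    """Standard error of the mean score per game, computed from game pairs.

    pair_counts[k] is the number of pairs in which the engine scored
    k/2 points out of 2 (k = 0..4).
    """
    n_pairs = sum(pair_counts)
    mean = sum(c * k / 2 for k, c in enumerate(pair_counts)) / n_pairs
    var = sum(c * (k / 2 - mean) ** 2 for k, c in enumerate(pair_counts)) / n_pairs
    # The pair score runs from 0 to 2, so halve it to get the per-game scale.
    return math.sqrt(var / n_pairs) / 2


# Hypothetical 1000 games = 500 pairs. The 300 pairs scoring 1 point are
# assumed to be 120 win+loss pairs and 180 draw+draw pairs, which gives
# 320 wins, 500 draws and 180 losses overall.
pairs = [10, 40, 300, 100, 50]
print("trinomial SE:  ", trinomial_se(320, 500, 180))
print("pentanomial SE:", pentanomial_se(pairs))
```

When unbalanced openings produce many win+loss pairs, the two results within a pair are negatively correlated, and the pentanomial estimate comes out smaller than the trinomial one, as it does for these invented counts.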

Laskos
Posts: 9408
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: Low impact of opening phase in engine play?

Post by Laskos » Tue Apr 18, 2017 9:32 am

Dann Corbit wrote:
There are some openings where the lines are fairly forced. So all good engines will follow the pv to some degree.

There are other openings where the best move is very much up for grabs and engines are very likely to play different moves.

I think it matters a lot what 2moves_v1.epd contains.
There is some point in what you stated. I repeated the test with 3moves_GM.epd, and the result is a bit different, although the opening phase contribution is still smaller than I expected. From this file of openings, under the same conditions, the difference SF - Andscacs is 211 Elo points, and SF - ChessCombi (12 moves SF, the rest Andscacs) is 165 Elo points. So the opening phase contributes 46 Elo points, or 22%. I expected more.
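
For a sense of scale, the Elo gaps from both opening files can be converted to expected scores. This is a minimal sketch assuming the standard logistic Elo model; the thread does not say which rating tool produced the numbers, so the conversion is an assumption.

```python
def expected_score(elo_diff):
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Elo gaps reported in the thread for the two opening files.
matches = [
    ("2moves_v1: SF vs Andscacs", 280),
    ("2moves_v1: SF vs ChessCombi", 238),
    ("3moves_GM: SF vs Andscacs", 211),
    ("3moves_GM: SF vs ChessCombi", 165),
]
for label, elo in matches:
    print(f"{label}: {elo} Elo -> expected score {expected_score(elo):.3f}")
```

The mapping from Elo to score is nonlinear, so a similar Elo gap corresponds to a smaller score gap at the lopsided end of the scale than it does closer to 50%.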

jdart
Posts: 3813
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: Low impact of opening phase in engine play?

Post by jdart » Tue Apr 18, 2017 4:18 pm

42 Elo points on an absolute scale is pretty significant.

--Jon
