With ChessCombi by Mark Alba, a UCI chess engine that combines two UCI chess engines into one, I performed the following experiment:
First, I played 1000 games at 10''+0.1'' between Stockfish dev and Andscacs 0.90 from 2moves_v1.epd. The difference was 280 Elo points.
Second, I played 1000 games at 10''+0.1'' between Stockfish and ChessCombi, which used Stockfish for moves 1-12 (from 2moves_v1.epd) and Andscacs 0.90 for the rest. The difference was 238 Elo points.
So the contribution of moves 1-12 was 42 Elo points (280 - 238), or 15% of the total game-play difference between Stockfish dev and Andscacs 0.90. This is lower than I expected (30-35% maybe). Human play at high level seems to put the emphasis on openings.
The separation of moves 1-12 as the opening is somewhat ad hoc, but roughly these are the openings. Also, maybe the time control is too short to reveal the weight of the openings. Also, maybe Andscacs 0.90 is particularly strong in openings, so that it matches Stockfish more closely there than in general play. Or maybe 2moves_v1.epd is so scrambled that there are very few reasonable openings left to play after 2 moves.
If the result stands, it might be something worth noting.
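The 15% figure is plain arithmetic on the two measured Elo gaps. A minimal sketch of that calculation follows; it assumes that Elo gaps can be subtracted linearly across game phases (which is the premise of the experiment, not an established fact), and the function name `opening_share` is hypothetical:

```python
def opening_share(full_gap_elo: float, hybrid_gap_elo: float) -> tuple[float, float]:
    """Return (Elo attributed to the opening moves, share of the full gap).

    full_gap_elo:   gap between the stronger engine and the pure weaker engine
    hybrid_gap_elo: gap between the stronger engine and the hybrid that plays
                    the stronger engine's moves in the opening
    Assumes Elo gaps subtract linearly, which is only an approximation.
    """
    if full_gap_elo <= 0:
        raise ValueError("full_gap_elo must be positive")
    opening_elo = full_gap_elo - hybrid_gap_elo
    return opening_elo, opening_elo / full_gap_elo


# Figures from the post: 280 Elo (Stockfish vs Andscacs 0.90)
# and 238 Elo (Stockfish vs ChessCombi).
elo, share = opening_share(280, 238)
print(f"{elo:.0f} Elo, {share:.0%} of the gap")
```

Note that each gap comes from only 1000 games and carries its own error margin, so the share is a rough estimate rather than a precise figure.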
Low impact of opening phase in engine play?
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
-
- Posts: 12538
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Low impact of opening phase in engine play?
There are some openings where the lines are fairly forced, so all good engines will follow the PV to some degree.
There are other openings where the best move is very much up for grabs and engines are very likely to play different moves.
I think it matters a lot what 2moves_v1.epd contains.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: Low impact of opening phase in engine play?
Or, most probably, neither engine knows how to play the openings, so they are able to squeeze out of them only the same amount of Elo that they squeeze out of the other game phases.
Also, there is a large and important degree of discontinuity when you switch from one engine to another. How would their searches be synchronised?
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: Low impact of opening phase in engine play?
also, Andscacs is likely to spoil any good positions it gets from SF.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Low impact of opening phase in engine play?
Dann Corbit wrote: There are some openings where the lines are fairly forced. So all good engines will follow the pv to some degree. There are other openings where the best move is very much up for grabs and engines are very likely to play different moves. I think it matters a lot what 2moves_v1.epd contains.
That would mean that Stockfish development, which uses 2moves_v1.epd openings, goes astray in testing the opening phase, and that Stockfish should be particularly weak in this department, which doesn't seem to be true. I will try to build a 3-mover or so opening book from GM games in GM2600.pgn to re-test.
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Low impact of opening phase in engine play?
I built the 3moves_GM.epd opening file from the GM2600.pgn file of GM games. It has 1170 unique opening positions and can be downloaded here: http://s000.tinyupload.com/index.php?fi ... 3625088497
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Low impact of opening phase in engine play?
Laskos wrote: That would mean that Stockfish development, which uses 2moves_v1.epd openings, goes astray in testing the opening phase, and that Stockfish should be particularly weak in this department, which doesn't seem to be true. I will try to build a 3-mover or so opening book from GM games in GM2600.pgn to re-test.
The 2moves_v1.epd book was chosen in SF development not for developing opening play, but to increase the sensitivity of their testing process. IMHO it has not been conclusively shown that it does (I am not saying it is wrong!). There were some tests that showed that 2moves_v1.epd indeed increases Elo differences, but no attention was paid to the width of the error bars, which increases as well (due to the reduced draw ratio).
Also I believe that the positions in 2moves_v1.epd are generally a bit unbalanced (I may be wrong on this though) so that one needs to use the pentanomial model to get precise error bars (the trinomial ones are too big).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
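The point about trinomial versus pentanomial error bars can be illustrated with a small sketch. It assumes games are played in pairs from the same opening with colors reversed, and that the pentanomial model counts pair outcomes (0, 0.5, 1, 1.5 or 2 points out of 2). The function names are hypothetical, and the input counts below are an artificial extreme case, not measured data:

```python
import math


def trinomial_stderr(wins: int, draws: int, losses: int) -> float:
    """Standard error of the mean score per game, treating games as independent."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return math.sqrt(var / n)


def pentanomial_stderr(pair_counts: list[int]) -> float:
    """Standard error of the mean score per game, computed from game pairs.

    pair_counts[i] is the number of pairs (same opening, colors reversed)
    in which the engine scored i/2 points out of 2, for i = 0..4.
    """
    n_pairs = sum(pair_counts)
    scores = [i / 4 for i in range(5)]  # per-game score of a pair: (i/2) / 2
    mean = sum(c * s for c, s in zip(pair_counts, scores)) / n_pairs
    var = sum(c * (s - mean) ** 2 for c, s in zip(pair_counts, scores)) / n_pairs
    return math.sqrt(var / n_pairs)


# Artificial extreme: every opening is so unbalanced that White always wins,
# so each of 100 pairs ends 1-1 (100 wins and 100 losses for either engine).
print(trinomial_stderr(100, 0, 100))         # treats the 200 games as independent
print(pentanomial_stderr([0, 0, 100, 0, 0]))  # sees that every pair scored exactly 1/2 per game
```

In this contrived case the trinomial estimate reports a sizeable error bar while the pair-based estimate reports none, because the two games of a pair are perfectly anti-correlated. With balanced openings and independent games, the two estimates coincide.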
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Low impact of opening phase in engine play?
Michel wrote: The 2moves_v1.epd book was chosen in SF development, not for developing opening play, but to increase the sensitivity of their testing process. IMHO it has not been conclusively shown this is true (I am not saying it is wrong!). There were some tests that showed that 2moves_v1.epd indeed increases elo differences but no attention was paid to the width of the error bars which increases as well (due to the reduced draw ratio).
I remember when they tried to gauge the Elo differences between 2moves_v1.epd and longer books. I guess it does magnify the Elo differences, but the increase in sensitivity is unclear. Judging from my new results here, it damages opening testing with its random 2-movers. My first results with 3moves_GM.epd (reasonable moves) seem a bit different (and the Elo differences are compressed).
Michel wrote: Also I believe that the positions in 2moves_v1.epd are generally a bit unbalanced (I may be wrong on this though) so that one needs to use the pentanomial model to get precise error bars (the trinomial ones are too big).
Yes, they are indeed a bit unbalanced. I remember I had about 10-15% of openings above the 0.6 threshold. Using the pentanomial model in SPRT with 2moves_v1.epd decreased the number of games to stop by 20-30% IIRC (standard deviation by 10-15%).
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: Low impact of opening phase in engine play?
Dann Corbit wrote: There are some openings where the lines are fairly forced. So all good engines will follow the pv to some degree. There are other openings where the best move is very much up for grabs and engines are very likely to play different moves. I think it matters a lot what 2moves_v1.epd contains.
There is some point in what you stated. I repeated the test from 3moves_GM.epd, and the result is a bit different, although the opening phase contribution is still smaller than I expected. From this file of openings, in the same conditions, the difference SF - Andscacs is 211 Elo points, and SF - ChessCombi (12 moves SF, the rest Andscacs) is 165 Elo points. So, a 22% contribution of the opening phase (46 of 211 Elo points). I expected more.
-
- Posts: 4366
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Low impact of opening phase in engine play?
42 Elo points on an absolute scale is pretty significant.
--Jon
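One way to read "42 Elo points on an absolute scale" is to convert the Elo differences into expected scores. A minimal sketch, assuming the standard logistic Elo model (the thread does not say which model the testing tool used, and the function name `expected_score` is hypothetical):

```python
def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# 42 Elo on its own, and the two gaps measured in the first experiment.
for diff in (42, 238, 280):
    print(f"{diff:+d} Elo -> expected score {expected_score(diff):.3f}")
```

Between otherwise equal engines, 42 Elo corresponds to an expected score of about 56%, which is a large edge by engine-testing standards even though it is a small fraction of a 280 Elo gap.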