With the release of a new official version, I was curious to see how well Stockfish would perform against other top engines using a well-balanced opening book like Perfect Book 2023. So I ran a 100-pair gauntlet at 30s + 1s against each of the latest Obsidian, Berserk, and Dragon. A small sample size (SSS) may be at play here, but the fact that Stockfish won 10% or more of the games against each engine (over 20% vs Dragon 3.3) was a bit unexpected. Does this indicate that Perfect Book 2023 is no longer as bulletproof, or has Stockfish made a major leap?
GUI: Cutechess
CPU: Ryzen 9 7950X
Threads: 8
Hash: 4 GB each
Tablebase: 5-men Syzygy
Ponder OFF
Time Control: 30 sec + 1 sec/move
Book: Perfect Book 2023
Perfect Book 2023 is just a name. Here is a quote from Sedat's site: "Balsa, Perfect or Unique book's openings would never gain higher points vs Current Top trend books, because I've used to play many various openings.."
It is not clear if SF used the book or not.
At very fast time controls, all engines, including SF, will make bad decisions.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
I have extracted the top 104 lines (at 8 ply) by frequency from the 40h database. Each line appears at least 1000 times in the database. What would an opening book based on them be like? Would it be reasonably balanced? The first field on each line is the frequency of that line.
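For anyone who wants to try a similar extraction, here is a minimal Python sketch of the counting step. It is not the script used for the list above. It assumes the games have already been parsed from PGN into lists of SAN moves, and the function name `top_opening_lines` and the toy `games` data are made up for illustration.

```python
from collections import Counter

def top_opening_lines(games, ply=8, min_count=1000, limit=104):
    """Count the first `ply` half-moves of each game and return the most frequent lines."""
    counts = Counter(
        tuple(moves[:ply]) for moves in games if len(moves) >= ply
    )
    # Keep only lines seen at least `min_count` times, most frequent first.
    frequent = [(n, line) for line, n in counts.most_common() if n >= min_count]
    return frequent[:limit]

# Hypothetical toy input: each game is a list of SAN moves.
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6", "O-O"],
    ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6", "Ba4", "Nf6", "d3"],
    ["d4", "d5", "c4", "c6", "Nf3", "Nf6", "Nc3", "e6"],
    ["e4", "c5"],  # shorter than 8 ply, so it is skipped
]

# min_count is lowered to 1 only because the toy sample is tiny.
for count, line in top_opening_lines(games, min_count=1):
    print(count, " ".join(line))
```

Note that this counts move sequences, not positions, so lines that transpose into the same position are counted separately.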
nmcrazyim5 wrote: ↑Tue Apr 01, 2025 8:18 am
With the release of a new official version, I was curious to see how well Stockfish would perform against other top engines using a well-balanced opening book like Perfect Book 2023. So I ran a 100-pair gauntlet at 30s + 1s against each of the latest Obsidian, Berserk, and Dragon. A small sample size (SSS) may be at play here, but the fact that Stockfish won 10% or more of the games against each engine (over 20% vs Dragon 3.3) was a bit unexpected. Does this indicate that Perfect Book 2023 is no longer as bulletproof, or has Stockfish made a major leap?
GUI: Cutechess
CPU: Ryzen 9 7950X
Threads: 8
Hash: 4 GB each
Tablebase: 5-men Syzygy
Ponder OFF
Time Control: 30 sec + 1 sec/move
Book: Perfect Book 2023
Rank Name                  Elo   +/-  Games  Score   Draw
   0 Stockfish 17.1         43    16    600  57.3%  85.3%
   1 Obsidian 15.09        -35    21    200  45.0%  90.0%
   2 Berserk dev_250307    -38    22    200  44.5%  89.0%
   3 Dragon 3.3            -69    36    200  38.5%  77.0%
600 of 600 games finished.
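As a rough cross-check of the "10%" and "20%" figures, the win/draw/loss counts can be recovered from the Score and Draw columns. The sketch below assumes the percentages are simply rounded to one decimal and that a win is worth 1 point and a draw 0.5. The helper name `wdl_counts` is made up for illustration.

```python
# Rows copied from the results table: (name, games, score %, draw %).
ROWS = [
    ("Stockfish 17.1", 600, 57.3, 85.3),
    ("Obsidian 15.09", 200, 45.0, 90.0),
    ("Berserk dev_250307", 200, 44.5, 89.0),
    ("Dragon 3.3", 200, 38.5, 77.0),
]

def wdl_counts(games, score_pct, draw_pct):
    """Recover (wins, draws, losses) from rounded score and draw percentages."""
    half_points = round(score_pct * games * 2 / 100)  # total score in half-points
    draws = round(draw_pct * games / 100)
    wins = (half_points - draws) // 2  # each win is worth two half-points
    losses = games - draws - wins
    return wins, draws, losses

for name, games, score_pct, draw_pct in ROWS:
    wins, draws, losses = wdl_counts(games, score_pct, draw_pct)
    print(f"{name}: +{wins} ={draws} -{losses}")
```

Under those assumptions, the table is consistent with Stockfish scoring +88 =512 -0 overall, that is 20, 22, and 46 wins against Obsidian, Berserk, and Dragon respectively, with no losses.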
I suspect the problem is that Stockfish's opponents are not perfect.
I have 2 ideas to test if you are interested.
1) Stockfish with the Perfect Book against the same opponents you chose with no book, to see whether no book is better for the opponents than the Perfect Book.
2) Stockfish against Stockfish, with both using the Perfect Book, to see what the draw percentage is.
Ciekce wrote: ↑Thu Apr 03, 2025 6:24 am
30+1 is far from "very fast" for engines
essentially, even though this book is pre-drawn, sf is Just Better™
Exactly my conclusion too. 30s + 1s is 3x slower than ipman, for instance, and ipman uses UHO. Perfect Book has all low exits (<0.4) and is very much like a top GM repertoire. As chess evolves, books and opening theory have to change too, and it seems like SF is breaking away from the rest a bit.
I remember L. Kaufman mentioning a draw rate of over 95% around the time of the Dragon 3 release, which was quite true then. I even ran similar tests against SF (v14?), and nearly all balanced book exits were drawn (~97-98% draws). It seems chess has more to offer (which is great news for both the players and the devs).
chesskobra wrote: ↑Thu Apr 03, 2025 2:41 am
I have extracted the top 104 lines (at 8 ply) by frequency from the 40h database. Each line appears at least 1000 times in the database. What would an opening book based on them be like? Would it be reasonably balanced? The first field on each line is the frequency of that line.
Most lines you found through aggregation are the first few plies of standard openings like the Spanish, Open Sicilian, Queen's Gambit, Slav, etc. This makes sense because those positions are solid for Black, which makes them good candidates for a balanced book with low exits.
In fact, they are the most frequent lines from a database of human games (the 40h db of Norman Pollock), so they are standard openings. My thought is that if a line has been played thousands of times by GMs, it is unlikely to give a significant advantage to one side. Isn't this the objective of the Perfect Book?