I'm starting to work on a self-building opening book, and I want to roll my own instead of making a generic PolyGlot book. So I'm trying to get an idea ahead of time how many unique positions might reasonably be generated in a 8-move vs 10-move vs 15-move book, etc. , assuming only actual games get saved.
Is that a question that somebody with a large database of games can easily answer? Obviously there's no hard upper limit, but I'm guessing that for comp/comp games, there's a value that gets reached quickly by a million games, but grows much more slowly after that point.
"Perft" of an opening book
Moderator: Ras
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
-
- Posts: 2098
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: "Perft" of an opening book.
Hello Robert:
You can see the pattern: a(12) should be around 8.51 times a(11) at most (almost 6.18e+12), then slowly decreasing, so I expect that the branching factor of your problem will behave in a similar way.
Regards from Spain.
Ajedrecista.
I will not answer your question exactly, but OEIS A083276 sequence is Number of distinct chess positions after n plies including differences due to availability and possibility of castling and en passant captures, that is, a kind of perft with unique positions, and could be helpful in your question. The highest known value nowadays is for 11 plies = 5.5 moves and that shall be understood as an upper bound to your question IMHO. The branching factor BF(n) = [a(n)]/[a(n-1)] is:Robert Pope wrote: ↑Mon Jul 10, 2023 2:07 am I'm starting to work on a self-building opening book, and I want to roll my own instead of making a generic PolyGlot book. So I'm trying to get an idea ahead of time how many unique positions might reasonably be generated in a 8-move vs 10-move vs 15-move book, etc. , assuming only actual games get saved.
Is that a question that somebody with a large database of games can easily answer? Obviously there's no hard upper limit, but I'm guessing that for comp/comp games, there's a value that gets reached quickly by a million games, but grows much more slowly after that point.
Code: Select all
n a(n) BF(n)
------------------------------------
1 20 20.000000
2 400 20.000000
3 5,362 13.405000
4 72,078 13.442372
5 822,518 11.411499
6 9,417,681 11.449818
7 96,400,068 10.236073
8 988,187,354 10.250899
9 9,183,421,888 9.293199
10 85,375,278,064 9.296674
11 726,155,461,002 8.505454
Regards from Spain.
Ajedrecista.
-
- Posts: 563
- Joined: Sat Mar 25, 2006 8:27 pm
- Location: USA
- Full name: Robert Pope
Re: "Perft" of an opening book
Just to wrap up this thread, I built a 20 ply book using the "CCRL Training" dataset of 2 million games from lczero.org. For the first 800K games, I was adding on average 2.7 new positions per game. The next 800K added 1.95 positions/game, and the final 400K added 1.7 positions/game.
The growth in my opening book really didn't slow enough to be able to put a figure on a reasonable upper bound, but I can at least assume it will grow no faster than 1.7 additional positions/game on average (for a 20 ply book).
The growth in my opening book really didn't slow enough to be able to put a figure on a reasonable upper bound, but I can at least assume it will grow no faster than 1.7 additional positions/game on average (for a 20 ply book).