Getting book lines from a polyglot book

chesskobra
Posts: 348
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Getting book lines from a polyglot book

Post by chesskobra »

How can I get all book lines (i.e., lines in which both Black and White make book moves) from a Polyglot book? Ignoring isolated positions, I imagine a Polyglot bin book to be a tree (or a DAG), and I would like to get all lines that start at the root and end at a leaf, with both White and Black making book moves. Also, just curious: why does polyglot dump-book output what it does? E.g., with -color white, it prints for each white move all possible black responses, not just the ones in the book.

Can Banksia GUI do what I want?
Jonathan003
Posts: 243
Joined: Fri Jul 06, 2018 4:23 pm
Full name: Jonathan Cremers

Re: Getting book lines from a polyglot book

Post by Jonathan003 »

Interesting question. Lucas Chess has an option to import all best moves from a bin book, but that's not what you are asking. SCID also has an option to export a bin book to a merged PGN, but it is a long process and it only works with small bin books.
I have tried it with older versions of Banksia, but it didn't work: not all lines from the bin book were exported, and lines that were not in the bin book also appeared in the exported PGN.
chesskobra
Posts: 348
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Re: Getting book lines from a polyglot book

Post by chesskobra »

I would consider doing the following (at least for the books that I build myself): every time a line is added, I can output it. I believe this will not require major changes to the code. I would appreciate if someone here could advise me as to what part of the code to look at. I have a snapshot (polyglot-5904a29.tar.gz) from HGM's website. I have made many books by setting different parameters such as -min-book, -min-score, and would like to know what lines are going into these books.
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: Getting book lines from a polyglot book

Post by chrisw »

chesskobra wrote: Sun Mar 05, 2023 3:42 pm I would consider doing the following (at least for the books that I build myself): every time a line is added, I can output it. I believe this will not require major changes to the code. I would appreciate if someone here could advise me as to what part of the code to look at. I have a snapshot (polyglot-5904a29.tar.gz) from HGM's website. I have made many books by setting different parameters such as -min-book, -min-score, and would like to know what lines are going into these books.
I believe it doesn't contain "lines", it contains positions with hash indexes.
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: Getting book lines from a polyglot book

Post by JoAnnP38 »

chrisw wrote: Sun Mar 05, 2023 7:43 pm I believe it doesn't contain "lines", it contains positions with hash indexes.
+1

However, you could reconstruct lines with your own code if you have a board structure that can be manipulated and you can calculate Polyglot-compatible Zobrist keys (hashes). You start at the normal start position, look up all the moves associated with the current position hash, make one of those moves, and repeat (i.e., look up the new hash and all the moves associated with it). You can output the lines as you find them.
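The walk described above can be sketched roughly as follows. This is only a minimal sketch, not Polyglot code: `book_lines` and its callback parameters (`book_moves`, `play`, `key`) are hypothetical names, and the board and the book lookup are left abstract so that any move generator and any .bin reader can be plugged in. A set of position keys on the current path stops the walk when a line repeats a position, and `max_ply` is an extra safety limit.

```python
from typing import Callable, Hashable, Iterable, Iterator, List, Set, TypeVar

P = TypeVar("P")  # position type (assumption: whatever your board code uses)
M = TypeVar("M")  # move type


def book_lines(
    start: P,
    book_moves: Callable[[P], Iterable[M]],  # book moves stored for a position
    play: Callable[[P, M], P],               # position reached after a move
    key: Callable[[P], Hashable],            # e.g. the position's hash key
    max_ply: int = 60,
) -> Iterator[List[M]]:
    """Yield every maximal sequence in which each move is a book move."""
    path: List[M] = []             # moves on the line being explored
    on_path: Set[Hashable] = set()  # keys of positions on that line

    def walk(pos: P) -> Iterator[List[M]]:
        k = key(pos)
        if k in on_path or len(path) >= max_ply:
            moves: List[M] = []    # repetition or depth limit: stop here
        else:
            moves = list(book_moves(pos))
        if not moves:
            if path:               # a leaf: report the line that led here
                yield list(path)
            return
        on_path.add(k)
        for move in moves:
            path.append(move)
            yield from walk(play(pos, move))
            path.pop()
        on_path.discard(k)

    yield from walk(start)


# Toy "book": a small graph in which the last move returns to the root.
graph = {
    "root": {"e4": "P1", "d4": "P2"},
    "P1": {"e5": "P3"},
    "P2": {},
    "P3": {"Ng1": "root"},
}
for line in book_lines("root", lambda p: graph[p].keys(),
                       lambda p, m: graph[p][m], lambda p: p):
    print(" ".join(line))
```

With a real book, `book_moves` would look up the position's key in the .bin file (in python-chess, for instance, I believe the `chess.polyglot` reader's `find_all(board)` does this). Note that transpositions are expanded in full every time they are reached, so the output can be much larger than the book itself.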
chesskobra
Posts: 348
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Re: Getting book lines from a polyglot book

Post by chesskobra »

I didn't find any hash. Here are some sample lines from the output of dump-book -color white:
  • 1: 1. Na3
  • 2: 1. Nc3
  • 3: 1. Nf3 c5 2. Na3
  • ...
  • 100: 1. Nf3 c5 2. c4 Nc6 3. Nc3 Nf6 4. Nb1 Nb8 5. Nc3 e6 6. d3
  • 101: 1. Nf3 c5 2. c4 Nc6 3. Nc3 Nf6 4. Nb1 Nb8 5. Nc3 e6 6. d4 d5 {trans: line=40, ply=12}
  • 102: 1. Nf3 c5 2. c4 Nc6 3. Nc3 Nf6 4. Nb1 Nb8 5. Nc3 e6 6. e3
  • ...
  • 165: 1. Nf3 c5 2. c4 Nc6 3. Nc3 Nf6 4. Nb1 Nb8 5. Nc3 Nc6 {cycle: ply=6}
  • ...
  • 13487: 1. h3
  • 13488: 1. h4
My book has 264 lines for white and 280 for black, 388 positions on white lines, 361 positions on black lines, 18 isolated positions. I did grep on the output text file, and it has 342 lines with {trans: ...} at the end.
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: Getting book lines from a polyglot book

Post by JoAnnP38 »

chesskobra wrote: Sun Mar 05, 2023 8:42 pm I didn't find any hash.
The hash is part of the polyglot .bin file record.
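For reference, here is a minimal sketch of reading those records. It assumes the usual published Polyglot layout: 16-byte big-endian entries made of a 64-bit key, a 16-bit move, a 16-bit weight and a 32-bit learn field. The names `BookEntry`, `decode_move` and `read_entries` are made up for this example.

```python
import struct
from typing import Iterator, NamedTuple

# One entry: key (uint64), move (uint16), weight (uint16), learn (uint32),
# all big-endian, 16 bytes in total.
ENTRY = struct.Struct(">QHHI")
FILES = "abcdefgh"
PROMOTION = ("", "n", "b", "r", "q")


class BookEntry(NamedTuple):
    key: int     # hash of the position the move is played from
    move: str    # coordinate notation, e.g. "e2e4"
    weight: int
    learn: int


def decode_move(raw: int) -> str:
    """Unpack the 16-bit move field into coordinate notation."""
    to_file, to_rank = raw & 7, (raw >> 3) & 7
    from_file, from_rank = (raw >> 6) & 7, (raw >> 9) & 7
    promo = (raw >> 12) & 7
    suffix = PROMOTION[promo] if promo < len(PROMOTION) else "?"
    return (FILES[from_file] + str(from_rank + 1)
            + FILES[to_file] + str(to_rank + 1) + suffix)


def read_entries(data: bytes) -> Iterator[BookEntry]:
    """Yield all complete 16-byte entries found in the raw file contents."""
    usable = len(data) - len(data) % ENTRY.size
    for offset in range(0, usable, ENTRY.size):
        key, raw, weight, learn = ENTRY.unpack_from(data, offset)
        yield BookEntry(key, decode_move(raw), weight, learn)


# Usage with a hand-built record instead of a real file:
sample = ENTRY.pack(0x463B96181691FC9C, 796, 10, 0)
for entry in read_entries(sample):
    print(f"{entry.key:016x} {entry.move} weight={entry.weight}")
```

For a real book you would pass `open(path, "rb").read()` to `read_entries`. As far as I know, castling is stored as the king moving to the rook's square (e.g. e1h1), so those moves need translating before they are given to a board.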
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Getting book lines from a polyglot book

Post by hgm »

To get the lines in a Polyglot book you basically have to run a perft that prunes when it reaches a position that is not in the book. I am not sure whether Polyglot can do this by itself; I once read that it did have a perft function. I was never much interested in Polyglot's book functions; I only use it as a UCI adapter.

[Edit] About why dump-book works the way it does:

I suppose the relevant thing to know is which positions a book user could end up in. In general you have no control over what your opponent plays, so you consider all opponent moves, even those not in the book, as long as they lead to a position that again has a book move.

It seems that what you want is the (non-existing) option '-color both', that would only show moves in the book. But on most books this would be a pretty useless option. Most 'lines' in a strong book are 'interrupted'. You don't want poor moves to be in the book, as you don't want the book user to ever play these moves. But in case your opponent plays such a poor move, you want to have the moves that punish it in the book, rather than being out of book completely. So you will have many 'one-sided' lines in the book, which only contain moves of a single player, because the other one would never be interested in following that line. And you would not get to see those unless you followed moves that were not in the book.
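The one-sided traversal described above could look roughly like this. It is a sketch of the idea only, not Polyglot's actual dump-book code: `one_sided_lines` and its callbacks are hypothetical names, and the toy position is simply the tuple of moves played so far.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Pos = Tuple[str, ...]  # toy position: the moves played so far


def one_sided_lines(
    pos: Pos,
    book_side: int,                             # 0 = White, 1 = Black
    side_to_move: Callable[[Pos], int],
    legal_moves: Callable[[Pos], Iterable[str]],
    book_moves: Callable[[Pos], Iterable[str]],
    play: Callable[[Pos, str], Pos],
    line: Tuple[str, ...] = (),
    max_ply: int = 40,
) -> Iterator[List[str]]:
    """Yield the lines a book user playing `book_side` could be led through."""
    if len(line) >= max_ply:
        candidates: List[str] = []
    elif side_to_move(pos) == book_side:
        # Book user's turn: only moves stored in the book are followed.
        candidates = list(book_moves(pos))
    else:
        # Opponent's turn: any legal move, provided the book has an answer.
        candidates = [m for m in legal_moves(pos)
                      if list(book_moves(play(pos, m)))]
    if not candidates:
        if line:
            yield list(line)
        return
    for move in candidates:
        yield from one_sided_lines(play(pos, move), book_side, side_to_move,
                                   legal_moves, book_moves, play,
                                   line + (move,), max_ply)


# Toy data: White's book answers 1... c5 but has nothing against 1... e5.
legal = {(): ["e4", "d4"], ("e4",): ["e5", "c5"]}
book = {(): ["e4"], ("e4", "c5"): ["Nf3"]}
for found in one_sided_lines(
        (), 0,
        side_to_move=lambda p: len(p) % 2,
        legal_moves=lambda p: legal.get(p, []),
        book_moves=lambda p: book.get(p, []),
        play=lambda p, m: p + (m,)):
    print(" ".join(found))
```

In the toy data 1... c5 is not a book move for Black, yet the line through it is still followed because White has a book reply. That is the 'interrupted' kind of line that a '-color both' traversal would never reach.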
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Getting book lines from a polyglot book

Post by Michel »

chesskobra wrote: Sun Mar 05, 2023 12:51 pm Also, just curious, why does polyglot dump-book output what it does, e.g., with -color white, it prints for each white move all possible black responses, not just the ones in the book.
dump-book -color white gives you what you get when you use the book as an engine book with the engine playing white (obviously the engine has no control over the moves played by black).
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
chesskobra
Posts: 348
Joined: Thu Jul 21, 2022 12:30 am
Full name: Chesskobra

Re: Getting book lines from a polyglot book

Post by chesskobra »

Yes, something like -color both, as suggested by HGM, would be useful. But I am surprised that others have not felt a need for such a thing. I made a book from high-level games with the polyglot option '-min-game 512', and now I want to know which opening lines have appeared at least 512 times at that level. To get approximately what I want, I can of course run 'pgn-extract -E3' on the game collection and see which ECO codes contain at least 512 games, or order the files A00 to D99 by the number of games in each. But I don't see a way to get a finer answer than that. So I guess my question is partly about polyglot and partly about using existing tools to get information about popular opening lines, that is, maximal move sequences that have each appeared at least N times in a game collection.
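To make the last part concrete, here is a rough sketch of "maximal move sequences that appeared at least N times". It assumes the games have already been reduced to plain lists of moves (for example with a PGN parser) and it counts move sequences rather than positions, so transpositions are not merged the way they are in a Polyglot book. `popular_lines` is just an illustrative name.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def popular_lines(games: Sequence[Sequence[str]],
                  min_games: int) -> List[Tuple[Tuple[str, ...], int]]:
    """Return the maximal opening prefixes played in at least min_games games."""
    counts: Counter = Counter()
    for moves in games:
        # Every prefix of a game counts once for that game.
        for ply in range(1, len(moves) + 1):
            counts[tuple(moves[:ply])] += 1
    frequent = {prefix for prefix, n in counts.items() if n >= min_games}
    # A frequent prefix is maximal if no one-move extension is also frequent.
    extended = {prefix[:-1] for prefix in frequent if len(prefix) > 1}
    return sorted((prefix, counts[prefix])
                  for prefix in frequent if prefix not in extended)


games = [
    ["e4", "e5", "Nf3"],
    ["e4", "e5", "Nc3"],
    ["e4", "c5"],
    ["d4", "d5"],
]
for prefix, n in popular_lines(games, min_games=2):
    print(n, " ".join(prefix))
```

On a large collection it would be sensible to cap the number of plies counted per game, since the counter holds one entry per distinct prefix.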