Page 1 of 4

Some opening book design questions

Posted: Wed Feb 21, 2018 1:07 pm
by phhnguyen
I have implemented my own opening book for a while - it worked well with my old engine (Xiangqi variant). After a long break from computer chess I have almost forgotten everything. Thus it’s a good time to re-design and implement a new one :)

I have searched and read to update knowledge, studied some book formats / user guide such as Polyglot one. I have been considering to write a total new one (as an opening source project) or modify Polyglot to meet what I need.

So far I have not looked into any code since it’s hard work (I’m lazy) and want to keep my brain out of being affected by other codes/ designs. Thus I still have some gaps / misunderstandings, regarding opening books in general and Polyglot in particular as followings:

1) Data of one side:
Suppose I want to build an opening book for black side. Should I store white positions or discard them all?
In my previous implementation I keep data of both sides for being easy to search/verify. But now I think that’s redundant. Is there something I missed?

2) Flip board:
In my previous implementation if I cannot find out a new opening move, I will flip current board horizontally (Xiangqi is horizontally symmetrical) and vertically then check again. For chess do you check with vertical flipping? Any drawback?

3) From CPW (https://chessprogramming.wikispaces.com/CTG) I have read that ChessBase builds opening books from white side only and flip the board for black side (looks similar to my question 2). However, without real data of the black side, how it could hit? For example, after the 1st move e4, if we flip the board, the board become as below with white side in turn. I don’t think we have any board starts like that.

[d]rnbqkbnr/pppp1ppp/8/4p3/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 2[/d]


4) Weight size:
Polyglot uses 16 bit (my previous one used 8 bit). Look like I will use 16 or 8 bits too. My dilemma is what or how to store data for weights. If I calculate and store percentages, it is fine for any number of bits. However if later I let users to add few more opening lines into an existing book I have no clue to change those percentages to right values. If I store number of hits - it is easier for both creating and adding later but it may be overflowed when creating huge opening books - in that case the app cannot know which move are better than others since all overflowing values will be cut into a constant. I may do some tricks, say divide them for a constant number but still see some drawbacks. Any advice?

5) Searching:
In my previous implementation I did binary search based on sorted positions. That required to search whole data every time. Any tips/tricks to reduce search scope?


6) Moves:
My previous one did not store moves in data. Only positions in form of Zobrist hash key. From a given position I generate and make all legal moves, search new positions from books so I can get which moves are still in opening and their weights. I don’t understand why Polyglot, ChessBase store moves in their opening positions?


Thanks in advance for any help

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 3:36 pm
by jdart
For chess, piece setup + castling status + en passant status + side to move define the position (this is what FEN stores).

The simplest approach is just to compute a hash code based on those factors and use it to look up the position.

My book, which is a custom format, stores a persistent hash table. Each hash table entry is an index to a storage block that contains a move list for the position. Each move has an associated weight. Multiple move lists are packed into each storage block.

I have thought about using a simple key-value store, which can store variable-length arrays, as the book format. One interesting one is LMBD - https://en.wikipedia.org/wiki/Lightning ... d_Database.

Right now I only store a weight value. It is computed from some other factors such as win/loss/draw counts, but I don't store those other factors in the book itself (some book formats do).

If you allow editing the book you should probably allow editing the weights, too.

I think Polyglot does most of what I have described above and just adopting that might not be a bad choice.

Some other things you might consider: IMO a book should flag somehow positions that lead to a more or less forced draw. A draw has a value that is not fixed but depends on the relative opponent rating, or the contempt value if you don't have a rating available. So the program should be more willing to play into a draw if the opponent is rated higher.

Book learning is also an interesting topic. I won't go into that here but it has been discussed on this forum. I have a simple form of book learning but it is external to the book itself.

--Jon

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 5:33 pm
by hgm
Note that WinBoard uses Polyglot format for its opening books, and supports Xiangqi. It just defines a new set of piece-square keys for the pieces and squares that do not exist in Chess (as rotated versions of the keys in the list it uses for Chess). If you want to be compatible with something else, this is one of the options. WinBoard has support for building a book from a game file, or hand-editing an existing book.

For Xiangqi you might want to adapt the probing code to also probe for mirror-image positions, so that you can reduce the book size by a factor two.

The XQWizzard / Elephant Eye project also has an open-source book format, similar to Polyglot format (i.e. key + move + weight), but with about half the size (4-byte key instead of 8-byte, and no learning info).

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 7:53 pm
by noobpwnftw
If you are meant for Xiangqi:

1. You can have a index scheme that stores only one of the mirrored positions (LR/BW/LRBW). Scores should remain the same with current moving player's perspective.
Sample probing code here:
https://github.com/noobpwnftw/chessdb/b ... update.php

2. Use binary encoding of FEN would benefit you in case of hash collisions, it happens quite often in my case.
I have some code shared for my board functions with binary FEN format here:
https://github.com/noobpwnftw/chessdb/t ... il/ccboard

3. Searching is inevitable if you want to avoid repetition rule problems.

4. To avoid probing the book multiple times on one position you can use a key-value store of which key(board representation) maps to scores of known moves.

5. To further reduce size you can use radix trees.


Borrowing your thread I also want to ask about whether a proper method exists to use book knowledge to assist search, I've tried to store book PV into TT and/or sorting PV moves at new depth with book scores, they don't seem to be effective.

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 8:16 pm
by hgm
I wondered about this ymmetry business. In Chess you could also do BW mirroring. But no one I know does this, because in good Chess openings the mirrored positions just never occur. To reach them white would have to lose a tempo, and this is considered very bad in Chess.

Is it really an advantage to do the BW (and LRBW) mirroring in Xiangqi?

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 8:42 pm
by noobpwnftw
Maybe that is because no one has ever built an opening database large enough for BW symmetry to matter. At least EGTBs implement them, so maybe we should do the same for openings, and doing so is not expensive.

BTW, losing a tempo in the openings is never considered bad in Xiangqi, in fact, it is very frequently used to get out of book moves because playing almost any "good" openings now would lead to book hits all the way to the endgame.

If I somehow achieve convergence on Xiangqi opening variations, I will port it to Chess and see how it ends up.

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 9:32 pm
by Gerd Isenberg
hgm wrote:I wondered about this ymmetry business. In Chess you could also do BW mirroring. But no one I know does this, because in good Chess openings the mirrored positions just never occur. To reach them white would have to lose a tempo, and this is considered very bad in Chess.

Is it really an advantage to do the BW (and LRBW) mirroring in Xiangqi?
Nope, in closed and half-open openings where a tempo loss is not that critical, it is quite common to transpose to color-flipped positions, where in the original opening black has some better play (but now white). Also of course to throw programs out of book, as used by humans versus programs (Aegon) but also famous the WMCCC 1988 with Pandix and Rebel versus Y! even with the color reversed Ruy Lopez:
https://chessprogramming.wikispaces.com/WMCCC%201988

After that, and due to the simplicity to implement it with Zobrist keys, many chess programs probed color-flipped positions during the opening. IsiChess had it from the very beginning, and iirc many others as well ...

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 9:55 pm
by hgm
OK, I stand corrected. Strange that the Polyglot probing code doesn't do this, then.

Re: Some opening book design questions

Posted: Wed Feb 21, 2018 11:48 pm
by D Sceviour
phhnguyen wrote:6) Moves:
My previous one did not store moves in data. Only positions in form of Zobrist hash key. From a given position I generate and make all legal moves, search new positions from books so I can get which moves are still in opening and their weights. I dont understand why Polyglot, ChessBase store moves in their opening positions?
Currently, polyglot book moves are valued based on win-loss-draw ratios from games. The book move can be selected from either a "pickbest" or random percentage value. However, different positions have different complexities. For example, the Stockfish Cerebellum book was produced using long thinking times, but the results are complex positions that are difficult to solve in very fast time control.

An idea might be to consider a depth-to-solve number (DTS) for possible moves in a position rather than moves based on win-loss-draw percentages. Thus when playing very fast blitz, the engine could select a move that averages no more than 10 ply to solve, and avoid a move that requires 30 ply of thinking to solve. Perhaps select a move that minimizes the depth-to-solve for the current position, and maximizes the depth-to-solve for the opponent responses.

Re: Some opening book design questions

Posted: Thu Feb 22, 2018 1:22 am
by phhnguyen
jdart wrote:For chess, piece setup + castling status + en passant status + side to move define the position (this is what FEN stores).

The simplest approach is just to compute a hash code based on those factors and use it to look up the position.

My book, which is a custom format, stores a persistent hash table. Each hash table entry is an index to a storage block that contains a move list for the position. Each move has an associated weight. Multiple move lists are packed into each storage block.

I have thought about using a simple key-value store, which can store variable-length arrays, as the book format. One interesting one is LMBD - https://en.wikipedia.org/wiki/Lightning ... d_Database.
I see your book should work faster than mine since it can avoid generating and making all legal moves. However look like you have to use much more memory to store moves and pointers to those moves.
jdart wrote: Right now I only store a weight value. It is computed from some other factors such as win/loss/draw counts, but I don't store those other factors in the book itself (some book formats do).

If you allow editing the book you should probably allow editing the weights, too.

I think Polyglot does most of what I have described above and just adopting that might not be a bad choice.

Some other things you might consider: IMO a book should flag somehow positions that lead to a more or less forced draw. A draw has a value that is not fixed but depends on the relative opponent rating, or the contempt value if you don't have a rating available. So the program should be more willing to play into a draw if the opponent is rated higher.
It is good idea to store some extra information. However, they may be totally useless if we let GUIs manage and make opening books instead of engines.
jdart wrote: Book learning is also an interesting topic. I won't go into that here but it has been discussed on this forum. I have a simple form of book learning but it is external to the book itself.

--Jon
I have read a Hyatt's paper and implemented his method of book learning already. Is there any new method / update?