Opening book read speed

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Opening book read speed

Post by xr_a_y »

I am currently trying to use a bigger opening book in Weini.
I am using my own binary book format (that's probably a first mistake, but anyway ...), and the book is a 19 MB file.

It takes 10 seconds to read that book (performing some small verifications on the fly), which is, I think, really too much. I guess you all have books this big, surely in a better format, or let the GUI handle the book.

This book contains 397,457 lines of 24 moves each, for a total of 9,538,968 moves. So Weini can read something like 1M moves per second from a file (with a little validation); that's not so bad.

What is your experience with the time needed to read an opening book?
Last edited by xr_a_y on Sun Aug 12, 2018 8:49 pm, edited 1 time in total.
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Opening read speed

Post by Sesse »

Why would you want to read the entire book?
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Opening read speed

Post by xr_a_y »

Of course, good question ... because my book is not in the "position hash -> list of good moves" format but in a "line" format...

Good point !
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Opening read speed

Post by Sven »

xr_a_y wrote: Sun Aug 12, 2018 8:48 pm Of course, good question ... because my book is not in the "position hash -> list of good moves" format but in a "line" format...
Even then it should not take 10 seconds to read and parse the book. Reading the file takes milliseconds, and parsing the text should not take that long either. So you are doing something very costly, but what?
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
User avatar
xr_a_y
Posts: 1871
Joined: Sat Nov 25, 2017 2:28 pm
Location: France

Re: Opening read speed

Post by xr_a_y »

Sven wrote: Sun Aug 12, 2018 9:10 pm
xr_a_y wrote: Sun Aug 12, 2018 8:48 pm Of course, good question ... because my book is not in the "position hash -> list of good moves" format but in a "line" format...
Even then it should not take 10 seconds to read and parse the book. Reading the file takes milliseconds, and parsing the text should not take that long either. So you are doing something very costly, but what?
1M moves per second is not so slow, is it?
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Opening book read speed

Post by Sesse »

At 3 GHz, it's 3000 cycles/move. That's a fair amount of time for parsing a five-byte string.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Opening book read speed

Post by Sven »

Ok, if you do move generation and all that stuff once per move string when reading the book into memory, then it is quite reasonable. But I would not do that at all with a book in a line-oriented text format. Instead I would only build a tree of move strings in memory; that should be pretty fast, and so would looking up a variation in it.
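Sven's tree-of-move-strings idea could be sketched roughly like this (the class and member names here are illustrative, not from Weini): each node maps a move string to a child node, so loading is just string splitting, and probing a variation is a walk down the tree with no move generation at all.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// One node per position reached by a book line; children are keyed
// by the move string that leads to them.
struct BookNode {
    std::map<std::string, std::unique_ptr<BookNode>> children;
};

class LineBook {
public:
    // Insert one book line, e.g. "e2e4 e7e5 g1f3".
    void AddLine(const std::string& line) {
        std::istringstream iss(line);
        BookNode* node = &root_;
        std::string mv;
        while (iss >> mv) {
            auto& child = node->children[mv];
            if (!child) child = std::make_unique<BookNode>();
            node = child.get();
        }
    }

    // Return the book moves available after the given sequence of
    // played moves, or an empty vector if we are out of book.
    std::vector<std::string> Continuations(
            const std::vector<std::string>& played) const {
        const BookNode* node = &root_;
        for (const auto& mv : played) {
            auto it = node->children.find(mv);
            if (it == node->children.end()) return {};
            node = it->second.get();
        }
        std::vector<std::string> out;
        for (const auto& kv : node->children) out.push_back(kv.first);
        return out;
    }

private:
    BookNode root_;
};
```

The move strings are only validated when one is actually about to be played, which matches Sven's point that per-move generation at load time is the expensive part.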
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Opening book read speed

Post by jdart »

I think you should consider converting your current format into something more efficient.

See for example https://github.com/sshivaji/polyglot - supports reading/writing Polyglot book format (GPL).

--Jon
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Opening book read speed

Post by mar »

First of all, you don't have to read the whole book into memory; that's a waste of RAM and CPU resources, as others have already pointed out.

Back to your (I assume binary) book reader.

A couple of things:
1) use unordered_map instead of map; I got a 1.5x perf boost
2) in ReadBinary book you reconstruct the position from FEN each time you encounter a zero hash (new line?) instead of copying a local pre-initialized startpos
3) in ContainsMove you look up twice, you may try this instead:

Code: Select all

bool Book::ContainsMove(const Move & m, const Position & p)
{
   const auto it = _book.find(p.GetZHash());
   if ( it == _book.end() ) return false;
   BookMove bm{m, m.ZHash()};
   return it->second.find(bm) != it->second.end();
}
4) why do you do a heap allocation for the move in Book::Add instead of a simple local variable on the stack? This isn't the bottleneck, but let's call it bad practice; do you come from a Java background?

Even after these changes, the remaining time is split between the first lookup in ContainsMove and some move validation; I'm not sure you can do more here.

Conclusion:
- profile your code
- maybe try to rethink the way you validate moves?
- do you really have to validate moves in your binary format? You already validated them when you generated the binary format from text, and my guess is you'll validate the book moves again when you pick a book move and play it
- as other people said, consider another binary format and book lookup (for example a Polyglot-like format with sorted hashes, then a lower bound on the file records)
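mar's sorted-hashes suggestion could look roughly like this once the records are in a hash-sorted array (the struct and function names are illustrative, not from Weini or any real Polyglot reader): a `std::lower_bound` on the hash field finds the first matching record, and all moves for that position follow contiguously.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One fixed-size book record; the file (or in-memory array) is
// kept sorted by `hash`.
struct BookEntry {
    uint64_t hash;  // position key
    uint16_t move;  // encoded book move
};

// Return all book moves stored for a given position hash.
// O(log n) to find the first record, then a linear scan over the
// handful of entries sharing that hash.
std::vector<uint16_t> Lookup(const std::vector<BookEntry>& book,
                             uint64_t hash) {
    auto lo = std::lower_bound(
        book.begin(), book.end(), hash,
        [](const BookEntry& e, uint64_t h) { return e.hash < h; });
    std::vector<uint16_t> moves;
    for (auto it = lo; it != book.end() && it->hash == hash; ++it)
        moves.push_back(it->move);
    return moves;
}
```

The same lower-bound idea works directly on the file with seeks over fixed-size records, so nothing at all has to be loaded at startup.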
Martin Sedlak
MOBMAT
Posts: 385
Joined: Sat Feb 04, 2017 11:57 pm
Location: USA

Re: Opening book read speed

Post by MOBMAT »

jdart wrote: I think you should consider converting your current format into something more efficient.
See for example https://github.com/sshivaji/polyglot - supports reading/writing Polyglot book format (GPL).
--Jon
I agree. There are many FREE tools out there to create and maintain opening books in the polyglot format.
The book is position based, so you avoid the issues of moves/transpositions.
The format of the book is very simple, though you will have to deal with "endian" issues with binary values.
The book stores positions sorted by a unique polyglot key.
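On the endianness point: a Polyglot entry is 16 bytes with all fields stored big-endian, so a portable reader can assemble each field byte by byte instead of relying on the host byte order. A minimal sketch (the struct and helper names are my own, the field layout is the documented Polyglot one):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One decoded Polyglot book entry (16 bytes on disk, big-endian).
struct PolyglotEntry {
    uint64_t key;    // Zobrist-style position hash
    uint16_t move;   // packed from/to/promotion move
    uint16_t weight; // relative move quality
    uint32_t learn;  // learning data, usually 0
};

// Read an n-byte big-endian unsigned integer from a byte buffer.
static uint64_t ReadBE(const unsigned char* p, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i) v = (v << 8) | p[i];
    return v;
}

// Decode one raw 16-byte entry, independent of host endianness.
PolyglotEntry DecodeEntry(const unsigned char raw[16]) {
    PolyglotEntry e;
    e.key    = ReadBE(raw, 8);
    e.move   = static_cast<uint16_t>(ReadBE(raw + 8, 2));
    e.weight = static_cast<uint16_t>(ReadBE(raw + 10, 2));
    e.learn  = static_cast<uint32_t>(ReadBE(raw + 12, 4));
    return e;
}
```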

It turns out that, to kill two birds with one stone, I used the Polyglot keys (which are used like TT keys) as the basis for my TT. That saved some space and also made it easy to query the book, since my TT key for any position is exactly what Polyglot requires.

I created a C++ class to interface with Polyglot books, which made it easy to add my own custom features, such as retrieving a random, best, or "weighted" random move. Since the Polyglot entries are sorted, I implemented a binary search to zero in on the correct key. This saves a lot of time, especially if you have a huge opening book.

You can find more info here: http://hgm.nubati.net/book_format.html

So, you can probably tell, I'm a fan!
i7-6700K @ 4.00Ghz 32Gb, Win 10 Home, EGTBs on PCI SSD
Benchmark: Stockfish15.1 NNUE x64 bmi2 (nps): 1277K