.abk opening book maxes out at 2gb file size?

Discussion of chess software programming and technical issues.

Moderator: Ras

n4k3dw4ff13s
Posts: 19
Joined: Sat Jan 28, 2023 4:01 am
Full name: Ifti Ram

.abk opening book maxes out at 2gb file size?

Post by n4k3dw4ff13s »

Hello all! I'm trying to make an opening book that can hold my current database of 18 million games. However, I'm running into a problem where the .abk file won't grow beyond 2 GB. I'm not sure why this is, or whether it's a setting I can turn off. I've tried many times to add my PGN files to new opening books, but the file size always gets capped at 2.00 GB (2,147,487,744 bytes).

I have the games stored in a .ctg opening book right now, but that format is proprietary and not easily shareable with people who don't have ChessBase. Any help would be appreciated!
Modern Times
Posts: 3708
Joined: Thu Jun 07, 2012 11:02 pm

Re: .abk opening book maxes out at 2gb file size?

Post by Modern Times »

Possibly something to do with the fact that Arena is a 32-bit program? I can't recall exactly, but that does ring a bell with me in some context.
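
A quick arithmetic check is consistent with that guess. The sketch below assumes (unconfirmed for Arena) that a 32-bit program tracks file offsets in a signed 32-bit integer, and compares that limit with the size reported in the first post.

```python
# Sketch: compare the reported .abk size with the signed 32-bit limit.
# Whether Arena actually stores offsets this way is an assumption.
INT32_MAX = 2**31 - 1          # largest value a signed 32-bit integer can hold
reported_size = 2_147_487_744  # size quoted in the first post

overshoot = reported_size - 2**31  # how far the file extends past 2 GiB

print(f"signed 32-bit max: {INT32_MAX:,}")
print(f"reported size:     {reported_size:,}")
print(f"bytes past 2 GiB:  {overshoot:,}")
```

The reported size lands 4,096 bytes past 2^31, which would fit a writer that stops right after crossing the boundary, though that reading is only a guess.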
JoAnnP38
Posts: 253
Joined: Mon Aug 26, 2019 4:34 pm
Location: Clearwater, Florida USA
Full name: JoAnn Peeler

Re: .abk opening book maxes out at 2gb file size?

Post by JoAnnP38 »

n4k3dw4ff13s wrote: Sun Jan 29, 2023 9:43 pm Hello all! I'm trying to make an opening book that can hold my current database of 18 million games. However, I'm running into a problem where the .abk file won't grow beyond 2 GB. I'm not sure why this is, or whether it's a setting I can turn off. I've tried many times to add my PGN files to new opening books, but the file size always gets capped at 2.00 GB (2,147,487,744 bytes).

I have the games stored in a .ctg opening book right now, but that format is proprietary and not easily shareable with people who don't have ChessBase. Any help would be appreciated!
Have you given any consideration to the Polyglot binary book format? I started with an 8 million game PGN, and after filtering out games with Elo ratings below 2000-2200 I produced a .bin book file of about 8 MB. Since a lot of the games in your 18 million game database overlap through the opening, there is a lot of redundancy. I also don't look past about move 16. My .bin book file is so small (relatively) that I just include it in my .exe as a binary resource; if it were anywhere near 100 MB I wouldn't even think about doing that. If you are already using Zobrist hashes, then using the Polyglot book format is a simple matter of using a binary search to find the group of entries matching your position and then (at least this is what I do) randomly selecting one of the book moves for that position. Good luck.
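
The lookup described above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual engine code. It relies on the published Polyglot layout (16-byte big-endian entries: 64-bit key, 16-bit move, 16-bit weight, 32-bit learn, sorted by key). Computing the Zobrist key for an arbitrary position is left out. The helper names and the tiny in-memory book are made up for the example, and weighting the random pick by the entry weight is my own choice.

```python
import random
import struct

ENTRY = struct.Struct(">QHHI")  # key, move, weight, learn (big-endian, 16 bytes)
START_KEY = 0x463B96181691FC9C  # Polyglot key of the initial position

def find_entries(book: bytes, key: int) -> list[tuple[int, int]]:
    """Return (move, weight) pairs stored for `key` in a key-sorted book."""
    n = len(book) // ENTRY.size
    lo, hi = 0, n
    # Binary search for the first entry whose key is >= the target key.
    while lo < hi:
        mid = (lo + hi) // 2
        if ENTRY.unpack_from(book, mid * ENTRY.size)[0] < key:
            lo = mid + 1
        else:
            hi = mid
    found = []
    # Entries with the same key are adjacent, so collect them in one pass.
    while lo < n:
        k, move, weight, _learn = ENTRY.unpack_from(book, lo * ENTRY.size)
        if k != key:
            break
        found.append((move, weight))
        lo += 1
    return found

def decode_move(move: int) -> str:
    """Decode the 16-bit move field (castling is stored as king-takes-rook)."""
    files = "abcdefgh"
    to_file, to_row = move & 7, (move >> 3) & 7
    from_file, from_row = (move >> 6) & 7, (move >> 9) & 7
    promo = " nbrq"[(move >> 12) & 7].strip()
    return f"{files[from_file]}{from_row + 1}{files[to_file]}{to_row + 1}{promo}"

def pick_move(entries, rng=random):
    """Pick one move at random, weighted by the book weight."""
    moves, weights = zip(*entries)
    if not any(weights):  # all weights zero: fall back to a uniform pick
        return rng.choice(moves)
    return rng.choices(moves, weights=weights, k=1)[0]

# Tiny hypothetical book: two replies from the start position plus two
# unrelated keys, sorted by key as a real .bin file would be.
raw = [(0x1, 796, 1), (START_KEY, 796, 3), (START_KEY, 731, 1), (2**64 - 1, 731, 1)]
book = b"".join(ENTRY.pack(k, m, w, 0) for k, m, w in sorted(raw))

entries = find_entries(book, START_KEY)
print([(decode_move(m), w) for m, w in entries])
print(decode_move(pick_move(entries)))
```

For a real book you would read the .bin file into `book` (or memory-map it) and pass in the key your engine computes for the current position.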
n4k3dw4ff13s
Posts: 19
Joined: Sat Jan 28, 2023 4:01 am
Full name: Ifti Ram

Re: .abk opening book maxes out at 2gb file size?

Post by n4k3dw4ff13s »

Modern Times wrote: Sun Jan 29, 2023 10:06 pm Possibly something to do with the fact that Arena is a 32-bit program? I can't recall exactly, but that does ring a bell with me in some context.
Thanks! I totally didn't think about this.
n4k3dw4ff13s
Posts: 19
Joined: Sat Jan 28, 2023 4:01 am
Full name: Ifti Ram

Re: .abk opening book maxes out at 2gb file size?

Post by n4k3dw4ff13s »

JoAnnP38 wrote: Sun Jan 29, 2023 10:54 pm
Have you given any consideration to the Polyglot binary book format? I started with an 8 million game PGN, and after filtering out games with Elo ratings below 2000-2200 I produced a .bin book file of about 8 MB. Since a lot of the games in your 18 million game database overlap through the opening, there is a lot of redundancy. I also don't look past about move 16. My .bin book file is so small (relatively) that I just include it in my .exe as a binary resource; if it were anywhere near 100 MB I wouldn't even think about doing that. If you are already using Zobrist hashes, then using the Polyglot book format is a simple matter of using a binary search to find the group of entries matching your position and then (at least this is what I do) randomly selecting one of the book moves for that position. Good luck.
The games are all actually 2200 and above, but I haven't strictly looked for duplicates. I'll go try out pgn-extract.
phhnguyen
Posts: 1524
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: .abk opening book maxes out at 2gb file size?

Post by phhnguyen »

n4k3dw4ff13s wrote: Sun Jan 29, 2023 11:27 pm
The games are all actually 2200 and above, but I haven't strictly looked for duplicates. I'll go try out pgn-extract.
I think he doesn't mean game duplicates but position duplicates. Because it stores the opening tree structure, ABK can't avoid node/position duplicates. Trimming duplicate games barely helps.

Just my understanding:
- because it uses much more complicated node structures, plus the duplicates, ABK can eat much more memory/disk space than other book formats for the same number of positions and similar quality
- you need a complicated/sophisticated program to create ABK books from games so that they can be as good as Polyglot ones. The main limitation of ABK comes from storing lines instead of positions: the program needs to find and store all the missing lines that lead from position to position, otherwise the book hit ratio drops significantly

IMHO, Polyglot is much better than ABK since it is supported by almost all chess GUIs, including Arena, while ABK is supported by Arena only. In general, the quality of Polyglot books is better than that of ABK books.
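
To make the lines-versus-positions point concrete, here is a toy sketch. It does not model the real ABK layout (which I'm not describing here). It only counts how many nodes a tree of move sequences needs versus how many distinct positions exist when two move orders transpose. The `toy_position_key` helper is a made-up stand-in for a real position hash and only works because the moves in this example commute.

```python
# Two move orders that transpose into the same position.
lines = [
    ["d4", "Nf6", "c4"],
    ["c4", "Nf6", "d4"],
]

# Line-based storage: one node per distinct move sequence (prefix of a line).
tree_nodes = {tuple(line[:i]) for line in lines for i in range(1, len(line) + 1)}

def toy_position_key(moves):
    """Toy position key: the set of (side, move) pairs played so far.

    A real book would hash the board instead (e.g. a Zobrist key).
    """
    return frozenset((ply % 2, mv) for ply, mv in enumerate(moves))

# Position-based storage: one entry per distinct position.
positions = {
    toy_position_key(line[:i]) for line in lines for i in range(1, len(line) + 1)
}

print(f"tree nodes: {len(tree_nodes)}, distinct positions: {len(positions)}")
```

Even in this two-line example the tree holds one more node than there are positions, and the gap widens as more transposing lines are added.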
https://banksiagui.com
The most feature-rich chess GUI, based on open-source Banksia - the chess tournament manager
n4k3dw4ff13s
Posts: 19
Joined: Sat Jan 28, 2023 4:01 am
Full name: Ifti Ram

Re: .abk opening book maxes out at 2gb file size?

Post by n4k3dw4ff13s »

phhnguyen wrote: Mon Jan 30, 2023 1:48 am
I think he doesn't mean game duplicates but position duplicates. Because it stores the opening tree structure, ABK can't avoid node/position duplicates. Trimming duplicate games barely helps.

Just my understanding:
- because it uses much more complicated node structures, plus the duplicates, ABK can eat much more memory/disk space than other book formats for the same number of positions and similar quality
- you need a complicated/sophisticated program to create ABK books from games so that they can be as good as Polyglot ones. The main limitation of ABK comes from storing lines instead of positions: the program needs to find and store all the missing lines that lead from position to position, otherwise the book hit ratio drops significantly

IMHO, Polyglot is much better than ABK since it is supported by almost all chess GUIs, including Arena, while ABK is supported by Arena only. In general, the quality of Polyglot books is better than that of ABK books.
Thanks! I was looking for a Python library that parses PGN files and lets you build Polyglot books, but I couldn't find one. I see the Banksia GUI can be used for simple .bin creation as well.
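
For what it's worth, the writing side of a Polyglot book is small enough to sketch with the standard library alone. The sketch below assumes you already have (position key, encoded move) pairs from some PGN parser and Zobrist hasher; as far as I know the python-chess package covers both of those parts. `build_book` is a hypothetical helper, and using the raw play count (clamped to 16 bits) as the weight is a simplification rather than what the original polyglot tool does.

```python
import struct
from collections import Counter

ENTRY = struct.Struct(">QHHI")  # key, move, weight, learn (big-endian, 16 bytes)
START_KEY = 0x463B96181691FC9C  # Polyglot key of the initial position

def build_book(observations):
    """Build .bin contents from (position_key, encoded_move) pairs.

    Each pair stands for one time the move was played from that position.
    """
    counts = Counter(observations)
    out = bytearray()
    # Entries must be sorted by key so that readers can binary-search them.
    for (key, move), n in sorted(counts.items()):
        weight = min(n, 0xFFFF)  # the weight field is only 16 bits wide
        out += ENTRY.pack(key, move, weight, 0)
    return bytes(out)

# Hypothetical input: e2e4 (796) seen three times, d2d4 (731) once.
observations = [(START_KEY, 796)] * 3 + [(START_KEY, 731)]
book = build_book(observations)

for offset in range(0, len(book), ENTRY.size):
    print(ENTRY.unpack_from(book, offset))
```

Writing `book` to disk with `open("book.bin", "wb")` would give a file in the same layout that Polyglot-aware GUIs read, provided the keys and move encodings are computed correctly.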
dkl
Posts: 28
Joined: Wed Jan 14, 2015 5:55 pm

Re: .abk opening book maxes out at 2gb file size?

Post by dkl »

Thanks! I was looking for a Python library that parses PGN files and lets you build Polyglot books, but I couldn't find one. I see the Banksia GUI can be used for simple .bin creation as well.
In case Java works for you as well, I've written a small tool that converts PGN files into (extended) Polyglot books, available at https://github.com/asdfjkl/bookmaker
By extended I mean that some stats are included (e.g. win ratio) that are not present in standard Polyglot books (as those are intended for engines).

It would be easy, however, to adapt this to create real Polyglot books. In fact, JChesslib (cf. https://asdfjkl.github.io/jchesslib/io/ ... mmary.html and https://search.maven.org/artifact/io.gi ... ib/1.2/jar) should have all the functionality that you need...
Not Fritz, it's Jerry! Free Chess GUI - https://github.com/asdfjkl/jerry
Free Book about Neural Networks for Chess - https://github.com/asdfjkl/neural_network_chess