Gaviota's EGTBs are only 6.5 Gb now

michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Post by michiguel »

So far, so good.

In order to have a prototype scheme for compressed EGTBs, I implemented a rudimentary compressor based on Huffman coding. Even with this simple scheme, I got the files down to 12.1 GB. That was promising. I designed it as a module with encode()/decode() functions completely isolated from the probing code. Thus, I can easily test several compression algorithms.
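To illustrate the idea of keeping the compressor behind a pair of encode/decode functions, here is a minimal Python sketch. It is not Gaviota's code: the `Codec` type, the `CODECS` registry, the `probe` stand-in, and the fake table data are all hypothetical, and zlib and LZMA from the Python standard library are used only because they are readily available.

```python
import lzma
import zlib
from typing import Callable, NamedTuple


class Codec(NamedTuple):
    """A compression scheme reduced to a pair of bytes-to-bytes functions."""
    name: str
    encode: Callable[[bytes], bytes]
    decode: Callable[[bytes], bytes]


# Hypothetical registry: swapping schemes means changing one dictionary key.
CODECS = {
    "raw": Codec("raw", lambda b: b, lambda b: b),
    "zlib": Codec("zlib", zlib.compress, zlib.decompress),
    "lzma": Codec("lzma", lzma.compress, lzma.decompress),
}


def probe(block: bytes, index: int, codec: Codec) -> int:
    """Stand-in for the probing code: it only ever calls codec.decode()."""
    return codec.decode(block)[index]


# Made-up table data, one byte per position (assumption, not a real EGTB).
table = bytes([0, 1, 1, 2, 0, 0, 3, 1] * 4096)

sizes = {}
for codec in CODECS.values():
    packed = codec.encode(table)
    assert probe(packed, 6, codec) == 3  # same answer under every scheme
    sizes[codec.name] = len(packed)

print(sizes)
```

Because `probe` never looks inside the compressed bytes, trying another algorithm only requires adding one more entry to the registry.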

Based on Dann Corbit's suggestion, the first serious compression scheme I tried was LZMA, and it shrank the files to 6.5 GB. This is really good because it already makes them smaller than Nalimov's (7 GB from what I have read; I do not have them here). There may be room for improvement: Gabor compressed them as a whole with 7z down to 4 GB, but compressing them in chunks, as I do, probably hurts the compression ratio. More experiments are needed to see if this can be improved, but even if it can't, what I have already is at the same level as, or possibly slightly better than, Nalimov's in terms of size.
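The whole-file versus chunked effect can be reproduced on synthetic data. The sketch below is only an illustration under stated assumptions: the data is artificial (a random 2 KiB pattern repeated many times), the 4 KiB chunk size is arbitrary, and Python's `lzma` module stands in for whatever LZMA implementation is actually used. Each chunk is compressed independently, so it cannot reuse matches found in earlier chunks and pays its own container overhead.

```python
import lzma
import random


def compressed_size_whole(data: bytes) -> int:
    """Size of the data compressed as a single LZMA stream."""
    return len(lzma.compress(data))


def compressed_size_chunked(data: bytes, chunk_size: int) -> int:
    """Total size when every chunk is compressed independently."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += len(lzma.compress(data[start:start + chunk_size]))
    return total


# Synthetic stand-in data with long-range repetition (assumption).
rng = random.Random(0)
pattern = bytes(rng.randrange(4) for _ in range(2048))
data = pattern * 256  # 512 KiB

whole = compressed_size_whole(data)
chunked = compressed_size_chunked(data, chunk_size=4096)
print(f"whole: {whole} bytes, chunked: {chunked} bytes")
```

The trade-off is that independent chunks allow random access: a probe only needs to decompress the one chunk holding the position, not the whole file.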

I would like to test the speed of the whole scheme compared to the uncompressed version.

Are there any particular positions where you experienced a noticeable slowdown with previous EGTBs? Is there a test suite? The only thing I remember or could find is that people refrained from probing the EGTBs near the leaves, but that is about it. Were any tests run that I can compare my data against?

Miguel
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by bob »

The main thing is to use a couple of positions with 10-11 pieces left, so that probes are quite frequent. It is better to use positions with few pawns, since pawns get traded or captured less frequently than pairs of rooks or queens.

The next step is to simply probe the same way, but run with either Eugene's stuff, or with yours, everything else being equal, to see the effect.
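A rough sketch of that kind of like-for-like comparison is shown below, in Python. Everything in it is a stand-in: the "tablebase" is a made-up dictionary, `probe_uncompressed` and `probe_compressed` are hypothetical backends, and the compressed one deliberately decompresses on every probe with no cache, so it shows a worst case rather than any real prober. The point is only the shape of the test: same workload, same answers, different backend.

```python
import lzma
import time
from typing import Callable, Dict, List

# Made-up "tablebase": position key -> distance to mate (arbitrary values).
KEYS: List[str] = [f"pos{i}" for i in range(2000)]
RAW: Dict[str, int] = {k: i % 50 for i, k in enumerate(KEYS)}
INDEX: Dict[str, int] = {k: i for i, k in enumerate(KEYS)}
PACKED: bytes = lzma.compress(bytes(RAW[k] for k in KEYS))


def probe_uncompressed(key: str) -> int:
    return RAW[key]


def probe_compressed(key: str) -> int:
    # Worst case on purpose: decompress on every probe, no cache.
    return lzma.decompress(PACKED)[INDEX[key]]


def best_time(probe: Callable[[str], int], keys: List[str], repeats: int = 3) -> float:
    """Best wall-clock time over several runs of the same workload."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for key in keys:
            probe(key)
        best = min(best, time.perf_counter() - start)
    return best


workload = KEYS[::10]  # identical workload for both backends
results_match = all(probe_uncompressed(k) == probe_compressed(k) for k in workload)
t_raw = best_time(probe_uncompressed, workload)
t_packed = best_time(probe_compressed, workload)
print(f"match: {results_match}, raw: {t_raw:.6f}s, compressed: {t_packed:.6f}s")
```

In an engine, the same idea would mean running an identical search on the same positions and swapping only the probing backend.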
BBauer
Posts: 658
Joined: Wed Mar 08, 2006 8:58 pm

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by BBauer »

The following study by Awerbach makes heavy use of tablebases.
It has 8 pieces and is a mate in 46?
kind regards
Bernhard

[D]2k2K2/8/pp6/2p5/2P5/PP6/8/8 w - -
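For reference, a small Python sketch of how the piece count of such a position can be read off the FEN string. The helper names `count_men` and `is_probeable` are hypothetical, and the default limit of 5 men is an assumption based on the 4- and 5-men tables discussed in this thread.

```python
def count_men(fen: str) -> int:
    """Count all pieces (kings included) in the placement field of a FEN."""
    placement = fen.split()[0]
    return sum(1 for ch in placement if ch.isalpha())


def is_probeable(fen: str, max_men: int = 5) -> bool:
    """True if the position has few enough men to be in the tables."""
    return count_men(fen) <= max_men


fen = "2k2K2/8/pp6/2p5/2P5/PP6/8/8 w - -"
print(count_men(fen), is_probeable(fen))
```

With 8 men on the board, the root position itself is outside the 5-men tables; probes only start once the search has removed at least three men.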
jshriver
Posts: 1342
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by jshriver »

If you need or want, I'd be willing to host the egtb dataset for free.

Just let me know so we can arrange a transfer (I will probably set up a temp FTP account for you).

-Josh
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by michiguel »

Thanks Josh, that would be great. Once I settle on one specific compression scheme, I will gladly accept your offer. I figure that many people will be more inclined to download the files than to generate them.

Miguel
jshriver
Posts: 1342
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by jshriver »

No problem. Wonderful engine btw. On my lunch break I had my machine at home start generating 4 and 5 men :)

Do you have any plans for 6-7 men?
-Josh
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by michiguel »

BBauer wrote: The following study by Awerbach makes heavy use of tablebases.
It has 8 pieces and is a mate in 46?
kind regards
Bernhard

[D]2k2K2/8/pp6/2p5/2P5/PP6/8/8 w - -
Thanks, this is a very interesting position, and not only for TBs. I can see two types of positions: 1) *very* heavy probing of a narrow set of TBs (generally when we are very close to converting to a 5-pc TB), or 2) moderate probing of a wider set of TBs, which is what happens when we are close, but not that close, and there is a diversity of pieces.
Yours is case 1, Bob's is case 2. Apparently, case 2 is more challenging even though the number of probes is lower. The problem is that the higher diversity ruins the efficiency of the cache.

In any case, as long as I do not probe too close to the leaves (~2 plies), there is no significant slowdown. I will keep experimenting.
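The cache effect in the two cases can be imitated with a toy simulation. This Python sketch makes several assumptions that are not from Gaviota: an LRU cache of decompressed blocks, uniformly random block accesses, and arbitrary numbers for cache capacity, probe counts, and distinct blocks. It only shows why fewer probes spread over many blocks can cost more than many probes over a few blocks.

```python
import random
from collections import OrderedDict


class BlockCache:
    """Toy LRU cache keyed by block id; values are irrelevant here."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.blocks: "OrderedDict[int, bool]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, block_id: int) -> None:
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1  # a real prober would read and decompress here
            self.blocks[block_id] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(num_probes: int, distinct_blocks: int, capacity: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    cache = BlockCache(capacity)
    for _ in range(num_probes):
        cache.get(rng.randrange(distinct_blocks))
    return cache.hit_rate()


narrow = simulate(num_probes=100_000, distinct_blocks=50, capacity=64)   # case 1
wide = simulate(num_probes=20_000, distinct_blocks=5_000, capacity=64)   # case 2
print(f"narrow hit rate: {narrow:.4f}, wide hit rate: {wide:.4f}")
```

In the narrow case every block fits in the cache, so almost every probe is a hit; in the wide case most probes miss and each miss would pay for disk access plus decompression.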

Miguel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by michiguel »

jshriver wrote:No problem. Wonderful engine btw.
Thanks...

On my lunch break I had my machine at home start generating 4 and 5 men :)

Do you have any plans for 6-7 men?
-Josh
Certainly for 6. I have some ideas, but I do not know how efficient they will be. First, I need to optimize the generation of the 5-men set, both speed- and memory-wise (using whatever scheme I end up using for the 6-men set).

For seven... I need to see how generating the 6-pc TBs goes first.

Miguel
jshriver
Posts: 1342
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by jshriver »

I highly recommend checking out the egtb forum as well.

http://kirill-kryukov.com/chess/discuss ... =6&start=0

There have been some strong attempts, but from my understanding there still isn't a reliable 6-men, and definitely no 7-men, EGTB generator.

*Edit* There is the Nalimov code, which does 6-men, but I'm still fuzzy on whether the core code that was released did 6-men. I know some people modified it to support 6-men so the final bits of the set could be completed a few years back.
*Done edit*

When the time comes and you start working on the 6-men set, I'd be willing to dedicate some CPU time and hosting for them.

Another big issue with EGTBs is the licensing. What do you plan to license the code (reading library) and data files under? BSD, GPL, or something open and not very restrictive would be nice. This would allow people to use it, and even expand it, in the event you choose to leave the chess scene.

-Josh
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Gaviota's EGTBs are only 6.5 Gb now

Post by michiguel »

jshriver wrote:I highly recommend checking out the egtb forum as well.

http://kirill-kryukov.com/chess/discuss ... =6&start=0
Thanks! I just registered.

There have been some strong attempts, but from my understanding there still isn't a reliable 6-men, and definitely no 7-men, EGTB generator.

*Edit* There is the Nalimov code, which does 6-men, but I'm still fuzzy on whether the core code that was released did 6-men. I know some people modified it to support 6-men so the final bits of the set could be completed a few years back.
*Done edit*

When the time comes and you start working on the 6-men set, I'd be willing to dedicate some CPU time and hosting for them.

Another big issue with EGTBs is the licensing. What do you plan to license the code (reading library) and data files under? BSD, GPL, or something open and not very restrictive would be nice. This would allow people to use it, and even expand it, in the event you choose to leave the chess scene.

-Josh
The idea is to release the probing and generation code as open source, most likely under a BSD license without the advertising clause, as was suggested, or an MIT license. I do not know the specifics of which one yet, and I am not an expert, but the idea is to allow everybody to use it, so that testers, GUI writers, and engine writers can share a common resource. For that to happen, it should be open. What people do or use in tournaments is a completely different issue.

Miguel