New 6-piece tablebases

Discussion of chess software programming and technical issues.

Moderator: Ras

Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: New 6-piece tablebases

Post by Edmund »

An advantage of Gaviota Tablebases over the current syzygy implementation is the ability of soft-probing.

This is a probe which returns no result if the relevant block does not lie in the cache, in which case it prefetches the block to return a result on the next probe. Depending on whether you have your WDL tables on a SSD or HD you might want to use this option for probes in the Quiescence search or when you are n plies away from the leafnodes.

Is there any chance of including this feature in the syzygy code?
If I understand correctly, the caching currently is completely left to the OS. Is this only done for simplicity or are there advantages to it that outweigh the benefit from soft-probing?
syzygy
Posts: 5780
Joined: Tue Feb 28, 2012 11:56 pm

Re: New 6-piece tablebases

Post by syzygy »

Edmund wrote:An advantage of Gaviota Tablebases over the current syzygy implementation is the ability of soft-probing.

This is a probe which returns no result if the relevant block does not lie in the cache, in which case it prefetches the block to return a result on the next probe. Depending on whether you have your WDL tables on a SSD or HD you might want to use this option for probes in the Quiescence search or when you are n plies away from the leafnodes.

Is there any chance of including this feature in the syzygy code?
If I understand correctly, the caching currently is completely left to the OS. Is this only done for simplicity or are there advantages to it that outweigh the benefit from soft-probing?
Currently caching is indeed completely left to the OS. This is simple and in my experience works very well, at least on Linux with a decent amount of RAM. A nice advantage is that the data cached in RAM is shared by all engines using the tables. This also make it efficient for engines doing SMP using processes instead of threads: the various processes automatically share the tablebase cache. Another advantage is that the OS caching code is designed to work well with multiple threads (apparently other tablebase implementations suffer from synchronisation problems).

A disadvantage is that the user has no control over the amount of RAM used for caching. With 6-piece tables on a machine with limited RAM this might negatively affect the performance of engines not using the tables. For engine-engine matches on a machine with relatively little RAM (where this problem occurs) I would recommend using 5-piece tables only.

The 5-piece WDL tables take up less than 400MB (even if 10 engines are accessing these tables concurrently). Nowadays this is so little that, when using 5-piece tables only, there is no benefit to be had from a special caching scheme and soft probing. You can effectively assume that everything is in RAM. (To compare with the Gaviota tables, when using them in "bitbase mode" caching the full 5-piece tables in RAM would take up about 9.6GB: 38.5 billion positions, 4 per byte.)

When using 6-piece tables a soft probing feature and more control over RAM usage may have advantages. The implementation would need to be quite sophisticated though, because you will have to do in software what the OS is doing partially in hardware and it is easy to lose the multi-threading efficiency. (And I'm not sure it will really give you much control over RAM usage, since the OS will still be caching large parts of the tablebase files in its page cache in RAM.)

If I ever add a cache to the probing code, I will still use the current mmap()-based mechanism to "load" the indexing tables (which is at the beginning of each file). The cache would be used to store the actual compressed data (still in compressed form), probably in blocks of 4K (or whatever fits best with the OS). Most of the present probing code can be left unchanged.

Regarding soft probing, I have thought a bit about how to do that with the OS-based caching scheme. In theory it should be possible to add a page fault handler that lets probes fail when the data is not cached. Ideally, this page fault handler would then queue access requests and let another (hyper)thread page in the memory (or maybe the OS can be told to do this asynchronously). The probe could then succeed at the next ply.
guyhaw
Posts: 9
Joined: Wed Nov 04, 2015 10:16 pm
Location: Reading
Full name: Guy McCrossan Haworth

Re: New 6-piece tablebases: missing statistics

Post by guyhaw »

The stats files for the syzygy EGTs, at least the ones I came by, do not include exemplar 'longest frustrated wins' for the four cases where the 'longest frustrated win' for Black is just 1 (i.e., '101') ply.

There are examples of 'longest frustrated wins' for White in 1 ply, and for Black in more than 1 ply.

I have found - without too much difficulty - exemplar positions for three of them but have no exemplar positions for KQPKBP/0-1.

Any offers?

Guy Haworth
U of Reading, UK

PS: the stats for KBNKB were also missing (no big deal) but have been created.
User avatar
jshriver
Posts: 1360
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: New 6-piece tablebases

Post by jshriver »

This is a thread I haven't seen in a long time :)

I've missed it.
User avatar
velmarin
Posts: 1600
Joined: Mon Feb 21, 2011 9:48 am

Re: New 6-piece tablebases: missing statistics

Post by velmarin »

guyhaw wrote:The stats files for the syzygy EGTs, at least the ones I came by, do not include exemplar 'longest frustrated wins' for the four cases where the 'longest frustrated win' for Black is just 1 (i.e., '101') ply.
There are examples of 'longest frustrated wins' for White in 1 ply, and for Black in more than 1 ply.
.
Oh, yes, also I have seen, amazing that lack a movement and draw !!...

It looks like how a joke, internally to the engines it must be a rule of 49 movements... :twisted:
guyhaw
Posts: 9
Joined: Wed Nov 04, 2015 10:16 pm
Location: Reading
Full name: Guy McCrossan Haworth

Re: New 6-piece tablebases: missing statistics

Post by guyhaw »

For KQPKBP/0-1, there are only five 'frustrated wins' for Black, all requiring an immediate pawn conversion.

The other three endgames with 0-1 frustrated wins for Black in 1 ply are:
KQNPKQ/0-1: 4K1k1/1N1P4/8/3q4/8/1Q6/8/8 b, 1... Qxb3 (KNPKQ)
KQRPKP/0-1: 8/1K6/8/2P5/2R5/7k/p7/1Q6 b, 1... axb1=Q+ (KRPKQ)
KRRPKP/0-1: 8/1K6/8/2P5/2R5/7k/p7/1R6 b (similar), 1. ... axb1=Q+ (KRPKQ)

I got RdM's code run on the KQPKBP EGT, and on the KBPKQP EGT. In the latter case, the software failed to report any longest cursed win for White - so I think it 'flips colours' for the computation' and then 'flips back' when reporting the results.

g
guyhaw
Posts: 9
Joined: Wed Nov 04, 2015 10:16 pm
Location: Reading
Full name: Guy McCrossan Haworth

Re: New 6-piece tablebases: missing statistics

Post by guyhaw »

I got lucky and found a KQPKBP/0-1 frustrated win in 1, one of the five:

bP wK Position dtm, dtz50'

a2 h7 Q7/6K1/P7/8/1b6/8/p7/3k4 b 113p, 1p
1. a1=Q+''''' Kf7' Qa2+''''' 3. Kg7'''' Qb2+''''' 4. Kg6'' Qc2+'' 5. Kf6'' Qc3+' 6. Kg6'' Qg3+''

a2 h8 Q6K/8/P7/8/1b6/8/p7/3k4 b 113p, 1p
1. ... a1=Q+''''' 2. Kg8' Qa2''''' 3. Kg7'''' {pos. 3b in the a2/h7 line}

a2 f6/e5 draw

a2 d4 Q7/8/P7/8/1b1K4/8/p7/3k4 b 111p, 1p
1. ... a1 = Q+''''' 2. Ke4'''' Qb1+''''' 3. Ke5'' Qb2+'''' 4. Kf5'''' Qf2+''''' 5. Kg6'' Qg3+'' {pos. 7w in the a2/h7 line}

b2 h7 Q7/7K/P7/8/1b6/8/1p6/3k4 b 111p, 1p
1. ... b1=Q+''''' 2. Kg7' Qb2+'' {pos. 4w in the a2/h7 line}

b2 g6 Q7/8/P5K1/8/1b6/8/1p6/3k4 b 111p, 1p
1. ... b1=Q+''''' 2. Kf6' Qb2+''''' 3. Kg6'''' {pos. 4b in the a2/h7 line}

b2 f5/d3 draw

b2 e4 1. ... b1=Q+''''' {a ‘-2’ 100p loss}

g
Rein Halbersma
Posts: 751
Joined: Tue May 22, 2007 11:13 am

Re: New 6-piece tablebases

Post by Rein Halbersma »

syzygy wrote:
jshriver wrote:
syzygy wrote: With some minor changes it should be possible to generate 7-piece tables on a system with 1 TB of shared RAM. If anyone wants to give that a try let me know. ;-)
What about using 1TB of SSD as swap? Know it would be a lot slower, but possible.
SSDs are great for storing tablebases, but terrible for use as swap for the generation of tablebases. Writing to flash means erasing a whole block and rewriting it. The erase block size may be as large as 2MB (OCZ Vertex 3 SSD, says Google). During generation of a tablebase using my generator lots and lots of memory writes are made, many of which are in random order which means that caching won't help much and they have to be physically written to flash. There could easily be 500 writes per block during generation and another 500 during compression. The flash memory of recent SSDs has a maximum of about 3000 rewrites. Cheaper (larger capacity) TLC flash lasts from 500 to 1000 writes.

The firmware algorithms of SSD are probably pretty smart with how they continuously remap 4K "disk sectors" to flash erase blocks, so that it may not be as bad as it seems, but I would not count on it that a 1TB SSD survives the generation of more than a few 7-piece tables.
Reply to 3 year old message: I am contemplating buying an SSD specifically for fast draughts endgame database generation. I expect to be able to do everything in RAM, and write back to disk after everything is finished, so I don't expect to wear out the SSD.

Your algorithm seems to rely on repeated writes during the build, is that the main thing you would worry about? It seems that recent measurements about repeated writes are more promising: http://www.anandtech.com/show/7173/sams ... s-tested/3 Or is this apples and oranges?
syzygy
Posts: 5780
Joined: Tue Feb 28, 2012 11:56 pm

Re: New 6-piece tablebases

Post by syzygy »

Rein Halbersma wrote:Your algorithm seems to rely on repeated writes during the build, is that the main thing you would worry about? It seems that recent measurements about repeated writes are more promising: http://www.anandtech.com/show/7173/sams ... s-tested/3 Or is this apples and oranges?
1129 write/erase cycles is not very promising if the generation of a single table rewrites the full ssd 500 to 1000 times.

I am thinking of tables that don't fit in RAM, so in chess that will be 7-piece tables. The largest ones require about 700 GB. So 500 times writing the full table takes about 350 write/erase cycles of an entire 1 TB ssd, assuming optimal wear-leveling.
noobpwnftw
Posts: 694
Joined: Sun Nov 08, 2015 11:10 pm
Full name: Bojun Guo

Re: New 6-piece tablebases

Post by noobpwnftw »

I've tried build EGTB with SSDs some years ago, they worn out soon after a single run. Just don't do it. :P

Nowadays consumer-class servers can have 1TB of memory, I have some spare ones if you want to do a 7-man run. But even if we build it, its hard to distribute and we will end up with some kind of online query and using local WDL tablebases only.