Gaviota EGTBs, interface proposal for programmers

michiguel · Post by **michiguel** » Tue Dec 15, 2009 7:42 pm

mcostalba wrote:
michiguel wrote:
Code:
		int	ws[17], bs[17]; /* list of squares for white and black */
		int	wp[17], bp[17]; /* what pieces are on those squares */
Why not
Code:
int pieceList[2][8][16]; // [color][pieceType][index]
int index[64]; // [square]
Instead of ws, bs, wp, bp arrays ?

I understand this request is specific to how lists are defined in Stockfish, but if it doesn't hurt performance I would propose that. And I don't think it will hurt performance (at least from the user's point of view), because in a real engine the piece lists will be updated incrementally in do_move(), not rebuilt from scratch.

I will think about whether this can be improved, but the conversion to something like ws[], bs[], wp[], bp[] needs to be done at one point or another. Either the API does it behind the curtains, or the engine does it explicitly. Considering that such a specific format will differ from other engines, it is better if the engine itself does the conversion. You ask for something as low-level as possible, and I think that the format I am requesting is the best I can do.
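To make the conversion under discussion concrete, here is a minimal sketch of how an engine might flatten a [color][pieceType][index] layout into parallel square/piece arrays like ws[]/wp[]. The thread does not say how the lists are terminated or how the engine tracks list lengths, so `NO_SQUARE`, `NO_PIECE`, `pieceCount`, `engine_lists_t` and `flatten_color` are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants: the thread does not define these values. */
enum { WHITE = 0, BLACK = 1 };
enum { NO_PIECE = 0, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, PIECE_TYPES = 8 };
enum { NO_SQUARE = 64, MAX_LIST = 17 };

/* Engine-side layout suggested in the thread, plus an assumed count table. */
typedef struct {
    int pieceList[2][PIECE_TYPES][16];  /* [color][pieceType][index] -> square */
    int pieceCount[2][PIECE_TYPES];     /* [color][pieceType] -> entries in use */
} engine_lists_t;

/* Flatten one color into parallel square/piece arrays.
 * Returns the number of pieces written, not counting the terminator. */
static size_t flatten_color(const engine_lists_t *e, int color,
                            int sq[MAX_LIST], int pc[MAX_LIST])
{
    size_t n = 0;
    for (int type = PAWN; type <= KING; type++) {
        for (int i = 0; i < e->pieceCount[color][type]; i++) {
            assert(n < MAX_LIST - 1);   /* keep one slot for the terminator */
            sq[n] = e->pieceList[color][type][i];
            pc[n] = type;
            n++;
        }
    }
    sq[n] = NO_SQUARE;   /* assumed terminator convention */
    pc[n] = NO_PIECE;
    return n;
}
```

An engine would call this once per color just before probing. The cost is a handful of copies per probe, which is the price of keeping the engine's own layout out of the library.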

Anyhow, if this library is used in SF, the API (and of course the internals) will be changed anyway to be adapted to SF. So if you don't see negative side effects, please consider this list layout in the first instance.

In case the data is on disk, I agree with you that it makes no sense to talk about this, but I am not sure a 6 TB on disk will result in a stronger engine than one using, say, 4 TB in RAM, especially if this engine _already_ has a good handling of endgame positions.

An engine with enough cache for 6-pc TBs, will have all the needed 4-pc information in cache when the time comes; so, it can only be better. The question is how much better.

Regarding the relative strength of an engine with or without TBs, I have seen for years the claims that it does not help. Personally, I do not believe it, because I do not think that the possibilities for using TBs optimally have been exhausted.

Miguel
PS: Let me know whenever you see an engine with a good handling of endgame positions. I think that overall engines are worse at endgames than people think. The main reason is that they are not punished enough because their opponents have the same holes in knowledge. But this is just a theory.

mcostalba · Post by **mcostalba** » Tue Dec 15, 2009 8:30 pm

michiguel wrote: PS: Let me know whenever you see an engine with a good handling of endgame positions. I think that overall engines are worse at endgames than people think. The main reason is that they are not punished enough because their opponents have the same holes in knowledge. But this is just a theory.

This is not true, you just need to pick up a punisher with TB support

....and then to see how much the punisher is able to punish

mcostalba · Post by **mcostalba** » Tue Dec 15, 2009 8:34 pm

michiguel wrote: I will think about whether this can be improved, but the conversion to something like ws[], bs[], wp[], bp[] needs to be done at one point or another. Either the API does it behind the curtains, or the engine does it explicitly.

In this case it is better that the engine does it explicitly. IMHO the glue logic between the interface and the actual EGTB look-up should be reduced to a bare minimum... also to let people easily change the internal implementation if needed.

Dann Corbit · Post by **Dann Corbit** » Tue Dec 15, 2009 8:40 pm

mcostalba wrote:
michiguel wrote: I will think about whether this can be improved, but the conversion to something like ws[], bs[], wp[], bp[] needs to be done at one point or another. Either the API does it behind the curtains, or the engine does it explicitly.
In this case it is better that the engine does it explicitly. IMHO the glue logic between the interface and the actual EGTB look-up should be reduced to a bare minimum... also to let people easily change the internal implementation if needed.

My original suggestion to Miguel was:
Have the customer supply a standard EPD record. Use the EPD record as the key for the lookup. The reason I suggested this was that almost every chess engine can produce an EPD record (and those that can't *ought* to be able to).

If Miguel gives you something more complicated than that, it is for speed. Most chess programs already have things split up into their atoms which are easily supplied to the interface (and the person who wrote the program can supply these basic constituents more efficiently than anyone else can). A simpler interface will work, but it will have a cost in performance.

mcostalba · Post by **mcostalba** » Tue Dec 15, 2009 8:50 pm

Dann Corbit wrote: If Miguel gives you something more complicated than that, it is for speed.

I don't think I have understood this. My suggestion was to _not_ perform any hidden calculation inside the library, which should be as much as possible a simple "transport" channel between the data and the engine that uses it.

So my suggestion was to use an API that is as similar as possible to how the input data are internally used for the actual EGTB probing.

Using a "simplified" interface so that engines are fast, while leaving a hidden decode burden in the library, is not IMHO the way to go: you simply don't know how an engine works, so you cannot just "suppose" that engines are fast if the API is done one way instead of another. What you know is how the library works, so you know what API is best to make the library as fast as possible, and that's the interface I would choose if I were Miguel.

hMx · Post by **hMx** » Tue Dec 15, 2009 10:48 pm

michiguel wrote: Is there anything else that a tablebase interface should have?

I'm missing statistics.
Offer some struct and a function that fills numbers like:
#files opened
#probes
#has file
#disk reads
#bytes read
#cache efficiency
#cache hits
etc
That could help tuning memory sizes.

Dann Corbit · Post by **Dann Corbit** » Wed Dec 16, 2009 12:56 am

mcostalba wrote:
Dann Corbit wrote: If Miguel gives you something more complicated than that, it is for speed.
I don't think I have understood this. My suggestion was to _not_ perform any hidden calculation inside the library, which should be as much as possible a simple "transport" channel between the data and the engine that uses it.

So my suggestion was to use an API that is as similar as possible to how the input data are internally used for the actual EGTB probing.

Using a "simplified" interface so that engines are fast, while leaving a hidden decode burden in the library, is not IMHO the way to go: you simply don't know how an engine works, so you cannot just "suppose" that engines are fast if the API is done one way instead of another. What you know is how the library works, so you know what API is best to make the library as fast as possible, and that's the interface I would choose if I were Miguel.

To be truthful, I think that an optimal interface will be a bad thing to provide. That is because it will be difficult to use, and therefore be very expensive for Miguel.

Imagine 10,000 chess programs all requesting access and Miguel trying to compose 10,000 different letters to explain how to do it to each author. No wonder answers from Eugene sometimes took years to get.

If I were Miguel, here is what I would expose:
One single input to EGTB system:
Standard EPD string

Four distinct outputs from EGTB:
1. Distance to mate -- integer
2. Drawn or not -- char ('0'/'1')
3. Broken position -- char ('0'/'1')
4. Probe failed -- char ('0'/'1')

Or something along those lines. Then, if someone asks him what an EPD string is, he simply points to the PGN standard. If someone asks what the outputs mean, he has a single page that explains them.

Since the system will come with source code, it may be possible to write more efficient access methods.
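As an illustration of the one-input, four-output shape described above, here is a rough C sketch. It does no table access at all. It only shows the result record and a cheap pre-check on the EPD string. The names `egtb_result_t`, `epd_count_men` and `egtb_precheck`, and the `max_men` limit, are hypothetical and not part of any proposed Gaviota API.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result record mirroring the four outputs listed above. */
typedef struct {
    int  distance_to_mate;  /* meaningful only when the probe succeeded */
    char drawn;             /* '0' or '1' */
    char broken;            /* '0' or '1' */
    char probe_failed;      /* '0' or '1' */
} egtb_result_t;

/* Count the men in the piece-placement field (first field) of an EPD string.
 * Letters are pieces; digits and '/' are empty squares and rank separators. */
static int epd_count_men(const char *epd)
{
    int men = 0;
    assert(epd != NULL);
    for (; *epd != '\0' && *epd != ' '; epd++)
        if (isalpha((unsigned char)*epd))
            men++;
    return men;
}

/* Pre-check only: fills the flag fields and returns 1 when a real table
 * lookup would be worth attempting, 0 otherwise. */
static int egtb_precheck(const char *epd, int max_men, egtb_result_t *out)
{
    int men = epd_count_men(epd);
    out->distance_to_mate = 0;
    out->drawn = '0';
    out->broken = (men < 2) ? '1' : '0';   /* fewer than two men: a king is missing */
    out->probe_failed = (men < 2 || men > max_men) ? '1' : '0';
    return out->probe_failed == '0';
}
```

Scanning the string is a few dozen character comparisons, so the parsing overhead is small next to a disk read. It is still extra work on every probe that hits the cache.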

michiguel · Post by **michiguel** » Wed Dec 16, 2009 3:52 am

hMx wrote:
michiguel wrote:Is there anything else that a tablebase interface should have?
I'm missing statistics.
Offer some struct and a function that fills numbers like:
#files opened
#probes
#has file
#disk reads
#bytes read
#cache efficiency
#cache hits
etc
That could help tuning memory sizes.

I have that here, so it won't be much effort to provide it. I thought people might not be interested in it, but you proved me wrong. How should I provide this? One function that outputs a structure with all the info, or functions like this:

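Code:

#include <stdint.h>

typedef uint64_t stat_t;

stat_t   tb_stat_reset (void);       /* clear all counters */
stat_t   tb_stat_probe_hits (void);  /* successful probes */
stat_t   tb_stat_probe_miss (void);  /* failed probes */
stat_t   tb_stat_cache_hits (void);  /* lookups served from the cache */
stat_t   tb_stat_cache_miss (void);  /* lookups that missed the cache */
/* etc. */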

I think I would provide non-redundant information and let the user calculate whatever can be deduced from the values provided (for instance, cache_efficiency = cache_hits / (cache_hits + cache_miss)).
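For comparison, the single-structure alternative could look roughly like the sketch below, with the derived value computed on the user side as suggested. `tb_stats_t` and `tb_cache_efficiency` are hypothetical names. The only detail added beyond the formula is a guard for the case where no cache lookups have happened yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t stat_t;

/* Hypothetical snapshot struct: one call could fill all counters at once. */
typedef struct {
    stat_t probe_hits;
    stat_t probe_miss;
    stat_t cache_hits;
    stat_t cache_miss;
} tb_stats_t;

/* Fraction of cache lookups that hit, in [0, 1].
 * Returns 0.0 when no lookups have been recorded. */
static double tb_cache_efficiency(const tb_stats_t *s)
{
    assert(s != NULL);
    stat_t total = s->cache_hits + s->cache_miss;
    if (total == 0)
        return 0.0;   /* avoid dividing by zero before the first probe */
    return (double)s->cache_hits / (double)total;
}
```

A struct gives a consistent snapshot in one call, while separate functions are easier to extend without breaking existing callers.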

Miguel

michiguel · Post by **michiguel** » Wed Dec 16, 2009 5:21 am

Dann Corbit wrote:
zamar wrote: Interface looks really nice and easy to use!!

I don't understand too well the internals of EGTB, so is there some reason why program using TBs should specify internal memory block size?
I guess because the EGTB designer has no way to know if you are on a Windows CE handheld with 256MB RAM total or on a 64-way server with 256TB RAM.

I think he may have referred to the block sizes and not to the total cache size.

I have been thinking quite a bit about this because there is a distinction between the compressed and uncompressed schemes. Both work in blocks. With uncompressed schemes the user can change the block size. Why is that important? For instance, if uncompressed files are used with solid state disks with very fast access, maybe it will be better to have a very small block size.

But the compressed scheme has a block size that is not up to the user to change. It has been determined when the files were compressed and the block size information is on the files themselves. So, it does not make sense to provide block sizes for compressed schemes. They will be ignored. I do not like that.

Maybe I can remove the block size parameter when the cache is initialized. I could set up a default block size that might be altered by the user with one separate function, only if the scheme is uncompressed. One more extra init function
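A rough sketch of that split, assuming the library knows which scheme is in use: the init call takes only the total cache size, and a separate setter changes the block size only for uncompressed files. Apart from the idea itself, everything here is hypothetical, including the names `tbcache_init_sketch`, `tbcache_set_block_size` and `tb_scheme`, and the 32 KiB default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scheme tag and default; neither value comes from the thread. */
enum tb_scheme { TB_UNCOMPRESSED, TB_COMPRESSED };
#define TB_DEFAULT_BLOCK ((size_t)32 * 1024)

static struct {
    enum tb_scheme scheme;
    size_t cache_mem;   /* total cache size in bytes */
    size_t block_mem;   /* block size in bytes */
} tb_cache;

/* Init without a block size parameter: the default is used. */
static bool tbcache_init_sketch(enum tb_scheme scheme, size_t cache_mem)
{
    if (cache_mem < TB_DEFAULT_BLOCK)
        return false;               /* cache must hold at least one block */
    tb_cache.scheme = scheme;
    tb_cache.cache_mem = cache_mem;
    tb_cache.block_mem = TB_DEFAULT_BLOCK;
    return true;
}

/* Optional extra call: rejected for compressed files, whose block size
 * was fixed when the files were written. */
static bool tbcache_set_block_size(size_t block_mem)
{
    if (tb_cache.scheme != TB_UNCOMPRESSED)
        return false;
    if (block_mem == 0 || block_mem > tb_cache.cache_mem)
        return false;
    tb_cache.block_mem = block_mem;
    return true;
}
```

Returning false for the compressed case, rather than silently ignoring the value, makes the restriction visible to the caller.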

Miguel

extern bool_t tbcache_init (size_t cache_mem, size_t block_mem);

Another thing is that for Nalimov TBs UCI specifies:

* NalimovPath, type string:
this is the path on the hard disk to the Nalimov compressed format.

* NalimovCache, type spin:
this is the size in MB for the cache for the nalimov table bases

I hope that when you release your EGTBs, you could give recommendations for the used UCI option names, so that we can get some standardization in here.
BallicoraPath, type string:
this is the path on the hard disk to the Ballicora compressed format.

BallicoraCache, type spin:
this is the size in MB for the cache for the Ballicora table bases
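For engine authors, declaring such a pair of options to a GUI might look like the sketch below, which formats the two UCI `option` lines for a given name prefix. The helper name `uci_format_tb_options` and the default and maximum cache sizes passed in are assumptions for illustration. Only the option names and types come from the posts above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Write "<prefix>Path" and "<prefix>Cache" option declarations into buf.
 * Returns true if both lines fit in the buffer. */
static bool uci_format_tb_options(char *buf, size_t size, const char *prefix,
                                  int default_mb, int max_mb)
{
    assert(buf != NULL && prefix != NULL && strlen(prefix) > 0);
    int n = snprintf(buf, size,
                     "option name %sPath type string default <empty>\n"
                     "option name %sCache type spin default %d min 1 max %d\n",
                     prefix, prefix, default_mb, max_mb);
    return n >= 0 && (size_t)n < size;   /* snprintf reports truncation via n */
}
```

An engine would print the resulting text in response to the `uci` command, before sending `uciok`.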

Aleks Peshkov · Post by **Aleks Peshkov** » Wed Dec 16, 2009 9:45 am

mcostalba wrote: So my suggestion was to use an API that is as similar as possible to how the input data are internally used for the actual EGTB probing.

I think it is a wrong idea in principle. A good interface should hide the implementation and not be based on volatile early implementation decisions. There are thousands of potential users (chess engines, GUIs, websites) and a good opportunity to unify many alternative tablebases and bitbases.

How much progress would chess programming have made without the UCI/XBoard protocols? Imagine if we had a C-like interface to relational databases instead of SQL. How would the WWW have evolved if we had a binary HTTP protocol?

Can you compare the computational overhead of handling a 100-byte string with that of a database disk access?
