New 6-piece tablebases

Don · Post by **Don** » Sun Jun 16, 2013 3:04 pm

lucasart wrote:
syzygy wrote:
lucasart wrote:How does the size of 5-men table size base compare with: the Nalimov TB ? the Gaviota TB ? And in terms of "average probing speed" ?
The 5-piece Nalimov TB are over 7 GB. I believe the 6-piece are over 1.2 TB (and those lack the 5v1 tables).
The 5-piece Gaviota TB are 6.5GB with LZMA compression, see here.

Average probing speed of my tables should be considerably better than that of Nalimov tables, but to be honest I have not measured it. But probing 1.2 TB Nalimov 6-piece tables will of course lead to much more disk trashing than probing 68.3 GB of tables. And 68.3 GB on SSD is much more affordable than 1.2 TB on SSD.
Woaw! So you're being modest, but in reality, your TB beats the pants out of everything out there. And yes, probing speed will surely be linked to size (disk access much slower than RAM). The size of your TB is more comparable to the size of a bitbase than a TB, so it's really a big improvement over Nalimov and Gaviota.

Would it be possible to modify your code to generate a bitbase only:
- one bit per position (disregarding 50-move counter)
- bit just says if it's a draw or not.

if not a draw, good engines will (in the vast majority of cases) be able to find the win on their own. the real difficulty is to understand that certain positions are a "no progress situation", despite a material advantage, or an apparent advantage in terms of eval score.

A very small bitbase with only draw info, that could fit in RAM for 5-men, would be really nice.

That is an interesting proposal to only specify if the position is a draw. It would of course lead to really frugal use of memory.

I have to wonder if there are some interesting positions that would give chess programs a really difficult time finding with this scheme? A trivial example is king and pawn vs king and pawn - if this is not a draw then it's totally unclear which side has the win (without a deep search.) Of course computers solve this easily but they don't necessarily get it right with a static evaluation - especially if it involved king opposition and tricky tempo stuff.

Of course knowing the ones that are draws would be a big savings even if we could not use the winning/losing results as terminal nodes.

syzygy · Post by **syzygy** » Sun Jun 16, 2013 3:43 pm

Don wrote:That is an interesting proposal to only specify if the position is a draw. It would of course lead to really frugal use of memory.

But see also my reply:

syzygy wrote:That is possible, but I don't think it would reduce their size by much. E.g. with KQvKR the positions where black K or R can immediately capture the white Q are already stored as "don't care". The few positions where black can mate or take the white Q in more than 1 ply don't take up much space, and the space that it takes up seems to be worthwhile since the evaluation will usually not catch these situations. Leaving it out would either result in erroneous results or force the engine to search a few more ply at each node where it could otherwise immediately return a result.

For tables that really have only two values such as KPvK or KBNvK, there would be no difference at all.

It is true that knowing it is a draw allows you to stop searching without a possibility of erroneous results. In the case of KQvKR draws are few, so instead of draw/non-draw it would be better to store win-for-Q/not-win-for-Q and leave the rest to the engine. Not such a bad idea, but I believe the gain is still quite minimal with proper compression. (And for those tables where the gain would be substantial because of winning chances for both sides, the extra information seems to be quite welcome.)

Don · Post by **Don** » Sun Jun 16, 2013 4:23 pm

syzygy wrote:
Don wrote:That is an interesting proposal to only specify if the position is a draw. It would of course lead to really frugal use of memory.
But see also my reply:
syzygy wrote:That is possible, but I don't think it would reduce their size by much. E.g. with KQvKR the positions where black K or R can immediately capture the white Q are already stored as "don't care". The few positions where black can mate or take the white Q in more than 1 ply don't take up much space, and the space that it takes up seems to be worthwhile since the evaluation will usually not catch these situations. Leaving it out would either result in erroneous results or force the engine to search a few more ply at each node where it could otherwise immediately return a result.

For tables that really have only two values such as KPvK or KBNvK, there would be no difference at all.
It is true that knowing it is a draw allows you to stop searching without a possibility of erroneous results. In the case of KQvKR draws are few, so instead of draw/non-draw it would be better to store win-for-Q/not-win-for-Q and leave the rest to the engine. Not such a bad idea, but I believe the gain is still quite minimal with proper compression. (And for those tables where the gain would be substantial because of winning chances for both sides, the extra information seems to be quite welcome.)

Yes, I can see that you cannot really cheat that much because of information theory - very little actual information removed when you do that for most databases.

At MIT we built a rubiks cube solver and we basically worked it from both ends and could find the perfect minimal solution for any cube presented. For positions close to the solved cube, which would correspond to simple endings in chess, I came up with an interesting technique - but I'm not sure it could be applied here - but it was a major gain in performance. Here is a brief outline of how it worked:

Using retrograde analysis I built a HASH table which contained 4 bit records. But each entry did not represent a specific position, it represented ALL the positions that hashed to this entry up to some depth. In the entry we stored the minimum depth of solution of any cube which hashed to this location. The search process itself was iterative so in the tree we could use this table to prune nodes. For example if we had 3 ply left to search and the entry value was 4, we knew that it was not possible to solve the cube from this position given the remaining depth.

The hashing function for the cube however was partial - we just hashed one face of the cube if I remember this correctly (this was 15 years ago give or take) and this had a positive effect on the idea although even a complete hash of the cube worked too. I think the partial hashing allowed a kind of implicit classifying of positions that made the table more effective. (We did other cool stuff too with this such as building a complete table for the 2x2 cube and treating the 3x3 cube as a superset of the 2x2 cube to get lower bounds too.)

Anyway, I have often considered doing some of my own research on more innovative approaches to chess endings such as some type of entry sharing scheme like this and hueristic assist. But good compress like is done now is hard to beat. The heuristic assist idea is to build a domain specific function that tries to solve the position heuristically but imperfectly with a much smaller "correction table" - such as a bloom filter. One problem with this is that the heuristic function must be correct a very high percentage of the time before the break even point. A table of exceptions requires many bits per exception and a good endgame bitbase requires less than 1 bit per position (but covers all positions.) So there is some break even point and due to the slow heuristic function one would want to see a pretty major gain in compression.

I have observed in some endings that just a few really simple rules will cover 99% or more of all positions, IF you can cover shallow tactics. KRPvsKR is dominated by many positions where a rook is captured due to a simple tactic, such as a skewer or fork.

A related idea is to simply build a specialized search for each ending as part of the database system. It's basically the same thing but a properly constructed very shallow deterministic search may get well over 99% of these right and you might be able to cover the rest with a small correction table. The interesting part of this would be to construct the evaluation function of such a search and that would likely need to be done in some automated way such as by simulated annealing. However I think there are some ending so profound it would be difficult constructing an accurate evaluation function that would be workable. Queen vs Rook comes to mind. Of course an automated learning system may find rules that no human would consider and turn this ending into a relatively simple one, who knows?

syzygy · Post by **syzygy** » Sun Jun 16, 2013 5:06 pm

Don wrote:Yes, I can see that you cannot really cheat that much because of information theory - very little actual information removed when you do that for most databases.

With endgame databases the situation is a bit more complex, since it is possible to reduce the size of the table to the size of the generator at the cost of the time it takes to regenerate the table. There is a trade-off between access speed and table size, and there must exist some "law of conservation of misery" that applies here.

I have observed in some endings that just a few really simple rules will cover 99% or more of all positions, IF you can cover shallow tactics. KRPvsKR is dominated by many positions where a rook is captured due to a simple tactic, such as a skewer or fork.

I think simple tactics give by far the best "heuristic" and have the advantage that they are no heuristic but give precise information. If there are no immediate tactics, what is left are simple statistics ("white usually wins") and case-by-case rules of thumb.

Using statistics to improve the compression will not help much, because the compression will find out those statistics by itself anyway. Case-by-case rules of thumb might help a bit more, but would have to be hand-coded for each and every piece combination. (If someone claims they can be automatically extracted from computed tablebases, let him first show an implementation.)

However I think there are some ending so profound it would be difficult constructing an accurate evaluation function that would be workable. Queen vs Rook comes to mind. Of course an automated learning system may find rules that no human would consider and turn this ending into a relatively simple one, who knows?

Queen vs Rook is relatively simple if it is only about win/draw/loss (which is what you need during the search... it is not so bad to have large tablebases to probe at the root once a particular endgame has been reached on the board). Q vs R+N+P is more complex (and is my largest compressed WDL-table at 1.96 GB, including information on which positions are won/lost but affected by the 50-move rule). I don't believe in an effective automated learning system for that endgame until I see one.

Don · Post by **Don** » Sun Jun 16, 2013 5:26 pm

syzygy wrote:
Don wrote:Yes, I can see that you cannot really cheat that much because of information theory - very little actual information removed when you do that for most databases.
With endgame databases the situation is a bit more complex, since it is possible to reduce the size of the table to the size of the generator at the cost of the time it takes to regenerate the table. There is a trade-off between access speed and table size, and there must exist some "law of conservation of misery" that applies here.

I have observed in some endings that just a few really simple rules will cover 99% or more of all positions, IF you can cover shallow tactics. KRPvsKR is dominated by many positions where a rook is captured due to a simple tactic, such as a skewer or fork.
I think simple tactics give by far the best "heuristic" and have the advantage that they are no heuristic but give precise information. If there are no immediate tactics, what is left are simple statistics ("white usually wins") and case-by-case rules of thumb.

Using statistics to improve the compression will not help much, because the compression will find out those statistics by itself anyway. Case-by-case rules of thumb might help a bit more, but would have to be hand-coded for each and every piece combination. (If someone claims they can be automatically extracted from computed tablebases, let him first show an implementation.)

However I think there are some ending so profound it would be difficult constructing an accurate evaluation function that would be workable. Queen vs Rook comes to mind. Of course an automated learning system may find rules that no human would consider and turn this ending into a relatively simple one, who knows?
Queen vs Rook is relatively simple if it is only about win/draw/loss (which is what you need during the search... it is not so bad to have large tablebases to probe at the root once a particular endgame has been reached on the board). Q vs R+N+P is more complex (and is my largest compressed WDL-table at 1.96 GB, including information on which positions are won/lost but affected by the 50-move rule). I don't believe in an effective automated learning system for that endgame until I see one.

I had in mind hand constructing the most dominant endings. It has already been done for the simple ending KPvsK, there exists a set of rules that will give you the right answer 100% of the time. I think there is less than 10 rules but it's been a long time since I have seen them. For that ending Komodo has the database simply compiled in to to program anyway which I generated over 20 years ago for one of my ancient programs.

Can it be done in some automated way? Not clear to me but if I were to try what I would do is provide a lot of domain specific attributes that seem to be important. An attribute might be, "king is on square in from of enemy pawn" and so on. An attribute is basically binary so there would have to be hundreds of them, or we could use pre-computed parameters (king is N squares away from pawn.) You could provide enough of them to capture a lot of the important principles and this would then end up being a kind of optimization problem.

There are already enough attributes provided to identify a position with accuracy, just by considering the bit index of the squares that the pieces are on. For example a pawn on the first 4 ranks will always have a particular bit set or unset. A pawn on the opponent half of the board will be set the other way and so on. If you go into attribute building you would probably want to use the smallest number of attributes that provided a highly expressible meta-language. It would be a good idea to be able (by construction) to express fairly interesting chess concepts with very concise combinations of attributes. I'm quite sure this is more easily said than done of course. On the other hand I'm not sure a lot of research has been done on this either. Komodo is pretty strong primarily because I never quickly dismissed an idea that seems a little far-fetched and came up with some nice surprises along the way.

My intuition on this is that it would very challenging but perhaps possible. I don't really want to build a better mousetrap (as you have done with great success and I thank you for) but to break some new ground if possible.

Anyway, I forgot to congratulation you on this. From what I am hearing you did an excellent job engineering this and thinking things through. This is likely to be a very nice contribution to the computer chess community.

syzygy · Post by **syzygy** » Sun Jun 16, 2013 6:55 pm

Don wrote:Komodo is pretty strong primarily because I never quickly dismissed an idea that seems a little far-fetched and came up with some nice surprises along the way.

My intuition on this is that it would very challenging but perhaps possible. I don't really want to build a better mousetrap (as you have done with great success and I thank you for) but to break some new ground if possible.

I obviously don't know everything that is inside Komodo, but I would be surprised if it is not "just" a better mouse trap in the same way as my generator is a better mouse trap. And I don't think there is anything wrong with that. There are quite a few concrete ideas in it that make it better.

On the other hand, the idea of "come up with a bunch of rules that replace the 2 GB table" might be an appealing idea, but it is also a quite trivial and naive idea and does not give any concrete help in actually implementing it. If I give this idea to a brilliant person that manages to make it work, then that brilliant person deserves 100% of the credit and I deserve 0%. It is a bit comparable to "I have this idea that if you let the engine only look at the good moves, that will allow it to search much deeper" as you'll probably receive by e-mail on a daily basis. Sure, this is a sound principle, but it is trivial and naive and still leaves 100% of the work to the one developing the engine.

Of course now the challenge is on you to come up with say 1 MB of code and data that together accurately encode Q vs R+N+P with 0.1ms access time. The reward is a statue.

Don · Post by **Don** » Sun Jun 16, 2013 8:19 pm

syzygy wrote:
Don wrote:Komodo is pretty strong primarily because I never quickly dismissed an idea that seems a little far-fetched and came up with some nice surprises along the way.

My intuition on this is that it would very challenging but perhaps possible. I don't really want to build a better mousetrap (as you have done with great success and I thank you for) but to break some new ground if possible.
I obviously don't know everything that is inside Komodo, but I would be surprised if it is not "just" a better mouse trap in the same way as my generator is a better mouse trap. And I don't think there is anything wrong with that. There are quite a few concrete ideas in it that make it better.

Yes, that is basically what Komodo is, mostly just a better mousetrap and that is something I want to get beyond.

On the other hand, the idea of "come up with a bunch of rules that replace the 2 GB table" might be an appealing idea, but it is also a quite trivial and naive idea and does not give any concrete help in actually implementing it. If I give this idea to a brilliant person that manages to make it work, then that brilliant person deserves 100% of the credit and I deserve 0%. It is a bit comparable to "I have this idea that if you let the engine only look at the good moves, that will allow it to search much deeper" as you'll probably receive by e-mail on a daily basis. Sure, this is a sound principle, but it is trivial and naive and still leaves 100% of the work to the one developing the engine.

Of course now the challenge is on you to come up with say 1 MB of code and data that together accurately encode Q vs R+N+P with 0.1ms access time. The reward is a statue.

You were offended by my comments and I did not intend to cause you to reveal yourself. You have a great product here and I hope to get this working in Komodo at some point in the near future.

syzygy · Post by **syzygy** » Sun Jun 16, 2013 9:54 pm

Don wrote:You were offended by my comments

Not at all, just surprised that you seemed to consider your engine to be more than a better mouse trap.

Don · Post by **Don** » Mon Jun 17, 2013 2:21 am

syzygy wrote:
Don wrote:You were offended by my comments
Not at all, just surprised that you seemed to consider your engine to be more than a better mouse trap.

Probably because I didn't say that. Do you have a quote?

Kirill Kryukov · Post by **Kirill Kryukov** » Tue Jun 18, 2013 10:25 am

Hi Ronald,

You mention Stockfish integration on your page, is the complete source or Windows binary of this engine available?

Also, are you aware of any other engines that are able to use your tablebases? If yes, do you think it may be a good idea to list such engines on your TB homepage?

Fantastic work and please keep it up!

Best,
Kirill

New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases

Re: New 6-piece tablebases