Forget Syzygy -- Presenting the Emanuel Torresbase!


klx
Posts: 111
Joined: Tue Jun 15, 2021 6:11 pm
Full name: Emanuel Torres

Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx » Thu Jun 24, 2021 4:09 pm

Hi there, I have come up with a pretty revolutionary idea to vastly reduce the size of a DTM endgame database.

Here are the facts that led to my discovery:

1. The vast majority of positions are won within a small number of moves. For example, from the syzygy stats site it seems that close to 99% of positions are won in fewer than 20 plies (depending on the table; for some it is more like 99.95%).

2. We can trivially search to depth 20 plies for endgames with alpha-beta in a fraction of a second.

So, in the Emanuel Torresbase, we store a special identifier instead of the actual DTM for these "easily-won/lost" positions. When a position is queried and this special value is found, we run alpha-beta to find the outcome. In other words, we can cut out up to 99.95% of the table!

The Emanuel Torresbase is fully adjustable. We can configure the threshold up and down to prefer more disk usage and less compute time, or vice versa. With threshold 0, it becomes a plain endgame database. With threshold infinity, it becomes pure alpha-beta search. The threshold can be configured per table, either in plies or as a percentage.
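
A minimal sketch of the threshold idea, on a toy game rather than chess (a pile of stones, each move removes 1 to 3, taking the last stone wins). Every name here (`solve`, `compress`, `search`, `probe`, `SENTINEL`) is made up for illustration, and the search is a plain exhaustive depth-limited one with no pruning, so nothing about chess search speed should be read into it:

```python
MOVES = (1, 2, 3)   # a move removes 1..3 stones; taking the last stone wins
SENTINEL = None     # stands for "not stored, search instead"


def solve(max_n):
    """Exact table: entry n is (result, plies) for the side to move (+1 win, -1 loss)."""
    table = [(-1, 0)]  # no stones left: the side to move has already lost
    for n in range(1, max_n + 1):
        kids = [table[n - k] for k in MOVES if k <= n]
        losses = [plies for result, plies in kids if result == -1]
        if losses:   # some move leaves the opponent lost: win as fast as possible
            table.append((1, 1 + min(losses)))
        else:        # every move leaves the opponent winning: lose as slowly as possible
            table.append((-1, 1 + max(plies for _, plies in kids)))
    return table


def compress(table, threshold):
    """Replace every entry decided within `threshold` plies by the sentinel."""
    return [SENTINEL if plies <= threshold else (result, plies)
            for result, plies in table]


def search(n, depth):
    """Exhaustive depth-limited search; None if undecided within `depth` plies."""
    if n == 0:
        return (-1, 0)
    if depth == 0:
        return None
    kids = [search(n - k, depth - 1) for k in MOVES if k <= n]
    losses = [kid[1] for kid in kids if kid is not None and kid[0] == -1]
    if losses:
        return (1, 1 + min(losses))
    if all(kid is not None for kid in kids):
        return (-1, 1 + max(kid[1] for kid in kids))
    return None


def probe(compact, n, threshold):
    """Return the stored entry, or fall back to search when the sentinel is found."""
    entry = compact[n]
    return search(n, threshold) if entry is SENTINEL else entry


full = solve(40)
for threshold in (0, 4, 8):
    compact = compress(full, threshold)
    stored = sum(entry is not SENTINEL for entry in compact)
    assert all(probe(compact, n, threshold) == full[n] for n in range(len(full)))
    print(f"threshold {threshold}: {stored} of {len(full)} entries stored")
```

The search depth equals the threshold, which is enough because a sentinel is only written for positions decided within that many plies. The cost of each fallback grows exponentially with the threshold, which is the trade-off between disk usage and compute time described above.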

The Emanuel Torresbase reduces the size of existing databases and paves the way for an 8-men database.

Syzygy 7 men: 16.7 TiB
Minus 99.95%: 8.5 GiB

Estimated 8 men size: 8.5 GiB * (16.7 TiB / 149.2 GiB) = 974 GiB
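
The arithmetic behind these figures can be reproduced as below. The reading of 149.2 GiB as the size of the previous (6-men) tables is an assumption, since the post does not label it, and the variable names are made up for this check:

```python
GIB_PER_TIB = 1024

syzygy_7_gib = 16.7 * GIB_PER_TIB   # 7-men size quoted in the post
smaller_set_gib = 149.2             # assumed to be the 6-men size (not labeled in the post)
kept_fraction = 1 - 0.9995          # share of entries still stored

reduced_7_gib = syzygy_7_gib * kept_fraction
growth = syzygy_7_gib / smaller_set_gib

# The post rounds the reduced size to 8.5 GiB before scaling it up.
estimated_8_gib = 8.5 * growth

print(f"reduced 7-men size: {reduced_7_gib:.2f} GiB")
print(f"growth factor:      {growth:.1f}x")
print(f"estimated 8-men:    {estimated_8_gib:.0f} GiB")
```

This only checks the multiplication. It assumes that dropping 99.95% of the entries shrinks the files by the same proportion and that the growth from 7 to 8 men matches the previous step.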

Emanuel Torresbase!

hgm
Posts: 26562
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by hgm » Thu Jun 24, 2021 4:36 pm

Sounds like total nonsense. For one, 'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes.

klx
Posts: 111
Joined: Tue Jun 15, 2021 6:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx » Thu Jun 24, 2021 4:44 pm

hgm wrote:
Thu Jun 24, 2021 4:36 pm
'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
If you are in this position and need to find the best move, sub-second or sub-microsecond performance might not matter.

klx
Posts: 111
Joined: Tue Jun 15, 2021 6:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx » Thu Jun 24, 2021 4:46 pm

hgm wrote:
Thu Jun 24, 2021 4:36 pm
'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
Also, the Emanuel Torresbase is adjustable if you need more speed.

hgm
Posts: 26562
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by hgm » Thu Jun 24, 2021 4:54 pm

klx wrote:
Thu Jun 24, 2021 4:44 pm
hgm wrote:
Thu Jun 24, 2021 4:36 pm
'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
If you are in this position and need to find the best move, sub-second or sub-microsecond performance might not matter.
Indeed. But that is not how EGT are used. Their main use is probing close to the leaves of the search tree, to see whether the position there is won, lost or drawn, so that you can go for the won positions and avoid the others. And if the tree has a million nodes, the difference between using conventional EGT and your method is whether you complete the search in about a second or in more than a week.
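
For scale, a rough calculation with assumed round numbers: one microsecond per table probe and one second per search-based probe, the upper ends of the "fraction of" figures quoted earlier. These are illustrative values, not measurements:

```python
NODES = 1_000_000              # leaf probes in the search tree (assumed)
TABLE_PROBE_SECONDS = 1e-6     # assumed cost of a conventional EGT probe
SEARCH_PROBE_SECONDS = 1.0     # assumed cost of resolving one probe by search

table_total = NODES * TABLE_PROBE_SECONDS
search_total = NODES * SEARCH_PROBE_SECONDS

print(f"table probes:  {table_total:.1f} s")
print(f"search probes: {search_total / 86_400:.1f} days")
```

With these inputs the million-fold gap turns about one second into roughly eleven and a half days per search.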

klx
Posts: 111
Joined: Tue Jun 15, 2021 6:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx » Thu Jun 24, 2021 5:09 pm

hgm wrote:
Thu Jun 24, 2021 4:54 pm
Indeed. But that is not how EGT are used.
Oh ok, didn't know this. I just learned about endgame databases last week, so I'm not up to speed with all the intricacies yet. How can we make best use of the Emanuel Torresbase then? I feel like I'm on to something.

AndrewGrant
Posts: 1392
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by AndrewGrant » Thu Jun 24, 2021 9:30 pm

So this is just a theory? You've not actually done anything to bring it to reality?

klx
Posts: 111
Joined: Tue Jun 15, 2021 6:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx » Thu Jun 24, 2021 10:52 pm

AndrewGrant wrote:
Thu Jun 24, 2021 9:30 pm
So this is just a theory? You've not actually done anything to bring it to reality?
Well, I did the investigation and math in the post above as a kind of proof of concept, but I plan to start the actual implementation this weekend. I mostly wanted to get some feedback and to share the idea, since I had never heard of this concept before, in case it ends up laying the foundation for an 8-men database.

AndrewGrant
Posts: 1392
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by AndrewGrant » Thu Jun 24, 2021 11:04 pm

klx wrote:
Thu Jun 24, 2021 10:52 pm
AndrewGrant wrote:
Thu Jun 24, 2021 9:30 pm
So this is just a theory? You've not actually done anything to bring it to reality?
Well, I did the investigation and math in the post above as a kind of proof of concept, but I plan to start the actual implementation this weekend. I mostly wanted to get some feedback and to share the idea, since I had never heard of this concept before, in case it ends up laying the foundation for an 8-men database.
Engines can reach depth 20 easily, but that search is heavily pruned. If you want a proven tree, alpha-beta without unsafe pruning decisions is very limited in depth.
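
To give a feel for the numbers, here is a sketch using the standard Knuth-Moore result for the minimum number of leaves alpha-beta must visit with perfect move ordering on a uniform tree. The branching factors are arbitrary illustrative values, the function names are made up, and the model ignores transposition tables, which cut the work considerably in endgames:

```python
def minimax_leaves(branching, depth):
    """Leaves visited by plain minimax on a uniform tree."""
    return branching ** depth


def best_case_alpha_beta_leaves(branching, depth):
    """Knuth-Moore minimum leaf count for alpha-beta with perfect move ordering."""
    return branching ** ((depth + 1) // 2) + branching ** (depth // 2) - 1


DEPTH = 20
for branching in (5, 10, 20):  # illustrative values, not measured from real endgames
    full = minimax_leaves(branching, DEPTH)
    best = best_case_alpha_beta_leaves(branching, DEPTH)
    print(f"b={branching:2d}: minimax {full:.2e} leaves, alpha-beta at best {best:.2e}")
```

Even in the best case the leaf count grows like the branching factor raised to half the depth, so a proven 20-ply result is far more expensive than the depth an engine reports with its usual pruning.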

phhnguyen
Posts: 1095
Joined: Wed Apr 21, 2010 2:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by phhnguyen » Fri Jun 25, 2021 12:09 am

So it is just an idea, not real practice and not a real base!

I guess the reality may be much harder than you think. From my experience (I did some studies a few years ago):

1) I agree that a large share of non-draw positions (not 99%, but about 90%) is easy to search out in reasonable time, say within a 3-second threshold
2) However, the other 10% of non-draw positions may take much, much longer, even hours
3) The biggest trouble is draw positions. You can't easily search and conclude that a position is a draw. Sometimes you need to search over 100 plies (because of the 50-move rule) and/or for hours
4) The mix of 2 and 3 leaves you with no good solution

Your idea of collecting data for only some specific positions (then searching or doing whatever else from them) is a bad one, since they will eat a lot of space, memory, and processing time (you may need to load them all into memory and then run binary searches for a given position). Your data may end up quite close in size to Syzygy and/or not be worth the trouble. Note that a standard EGTB doesn't store position identifiers (such as hash keys or FENs) and doesn't require searching: the probe function can locate exactly and instantly (via index functions) where the data it needs sits in the files, without loading them all or doing any search.
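
A small sketch of that difference, using a made-up index function for king and rook versus king (`krk_index`, with no symmetry reduction and no compression, unlike real tablebases). The table values are placeholders, not real DTM data, and the 4-byte key size is an assumption:

```python
from bisect import bisect_left

TABLE_SIZE = 64 * 64 * 64 * 2  # every (wK, wR, bK, side) combination, legal or not


def krk_index(white_king, white_rook, black_king, white_to_move):
    """Hypothetical index function: squares are 0..63, no symmetry reduction."""
    index = (white_king * 64 + white_rook) * 64 + black_king
    return index * 2 + (0 if white_to_move else 1)


def dense_probe(table, index):
    """Conventional layout: the index is the offset of the entry."""
    return table[index]


def sparse_probe(keys, values, index, default=0):
    """Sparse layout: sorted keys must be stored and binary-searched."""
    pos = bisect_left(keys, index)
    if pos < len(keys) and keys[pos] == index:
        return values[pos]
    return default


# Placeholder values, not real DTM data: every 10th index counts as "kept".
dense = bytearray(TABLE_SIZE)
for i in range(0, TABLE_SIZE, 10):
    dense[i] = 1 + i % 200

keys = [i for i in range(TABLE_SIZE) if dense[i]]
values = bytes(dense[i] for i in keys)

idx = krk_index(0, 1, 2, True)
assert dense_probe(dense, idx) == sparse_probe(keys, values, idx)

dense_bytes = TABLE_SIZE             # 1 byte per position, no keys needed
sparse_bytes = len(keys) * (4 + 1)   # assumed 4-byte key plus 1-byte value
print(f"dense: {dense_bytes} bytes, sparse (10% kept): {sparse_bytes} bytes")
```

Under these assumptions, keeping only 10% of the entries still costs about half the dense size, because every kept entry has to carry its own key and each probe needs a binary search instead of a direct offset.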

IMHO, the best solution for a mix of stored data and searching turns out to be something similar to a bitbase (https://www.chessprogramming.org/Endgame_Bitbases).
https://banksiagui.com
A freeware chess GUI, based on opensource Banksia - the chess tournament manager
