Forget Syzygy -- Presenting the Emanuel Torresbase!

klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

Hi there, I have come up with a pretty revolutionary idea to vastly reduce the size of a DTM endgame database.

Here are the facts that led to my discovery:

1. The vast majority of positions are won within a small number of moves. For example, from the Syzygy stats site it seems that close to 99% of positions are won in fewer than 20 plies (depending on the table; for some it is more like 99.95%).

2. We can trivially search to depth 20 plies for endgames with alpha-beta in a fraction of a second.

So, in the Emanuel Torresbase, we store a special identifier instead of the actual DTM for these "easily-won/lost" positions. When a position is queried and this special value is found, we run alpha-beta to find the outcome. In other words, we can cut out up to 99.95% of the table!

The Emanuel Torresbase is fully adjustable. We can configure the threshold up and down to prefer more disk usage and less compute time, or vice versa. With threshold 0, it becomes a plain endgame database. With threshold infinity, it becomes pure alpha-beta search. The threshold can be configured per table, and either in plies or in percent.
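The lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `SENTINEL`, `build_table`, `probe` and `toy_search` are hypothetical names, the positions and DTM values are made up, and a dict stands in for a real indexed file. Note that a dict with sentinel entries saves nothing by itself; any saving would have to come from the sentinel-filled table compressing better.

```python
from typing import Callable, Dict

SENTINEL = -1  # hypothetical marker meaning "DTM omitted, recompute by search"

def build_table(dtm: Dict[str, int], threshold: float) -> Dict[str, int]:
    """Replace every DTM below `threshold` plies with SENTINEL."""
    return {pos: (SENTINEL if d < threshold else d) for pos, d in dtm.items()}

def probe(table: Dict[str, int], pos: str, search: Callable[[str], int]) -> int:
    """Return the stored DTM, or fall back to `search` when it was omitted."""
    value = table[pos]
    return search(pos) if value == SENTINEL else value

# Toy data: made-up position labels mapped to made-up DTM values in plies.
full = {"pos_a": 3, "pos_b": 17, "pos_c": 45}

def toy_search(pos: str) -> int:
    """Stand-in for alpha-beta; it just looks the answer up in `full`."""
    return full[pos]

table = build_table(full, threshold=20)
print(probe(table, "pos_a", toy_search), probe(table, "pos_c", toy_search))
```

With `threshold=0` nothing is replaced (a plain table), and with `threshold=float("inf")` every entry becomes the sentinel (pure search), matching the two extremes described above.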

The Emanuel Torresbase reduces the size of existing databases and paves the way for an 8-men database.

Syzygy 7 men: 16.7 TiB
Minus 99.95%: 8.5 GiB

Estimated 8 men size: 8.5 GiB * (16.7 TiB / 149.2 GiB) = 974 GiB
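The arithmetic above can be checked with a few lines of Python. The sketch takes the figures as quoted and makes the estimate's assumptions explicit: that file size shrinks in proportion to the entries removed, that 149.2 GiB is the 6-men size, and that growth from 7 to 8 men equals growth from 6 to 7 men. None of these assumptions is verified here.

```python
TIB_IN_GIB = 1024

syzygy_7_gib = 16.7 * TIB_IN_GIB   # 7-men size quoted in the post
syzygy_6_gib = 149.2               # assumed to be the 6-men size used in the post
kept_fraction = 1 - 0.9995         # share of entries that would still be stored

reduced_7_gib = syzygy_7_gib * kept_fraction
growth_6_to_7 = syzygy_7_gib / syzygy_6_gib
# The post rounds the reduced 7-men figure to 8.5 GiB before scaling it up.
estimated_8_gib = 8.5 * growth_6_to_7

print(f"{reduced_7_gib:.2f} GiB, x{growth_6_to_7:.1f}, {estimated_8_gib:.0f} GiB")
```

This reproduces roughly 8.5 GiB for the reduced 7-men set and about 974 GiB for the 8-men estimate, with a growth factor of about 115 per added piece.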

Emanuel Torresbase!
[Moderation warning] This signature violated the rule against commercial exhortations.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by hgm »

Sounds like total nonsense. For one, 'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes.
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

hgm wrote: Thu Jun 24, 2021 6:36 pm 'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
If you are in this position and need to find the best move, sub-second or sub-microsecond performance might not matter.
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

hgm wrote: Thu Jun 24, 2021 6:36 pm 'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
Also, the Emanuel Torresbase is adjustable if you need more speed.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by hgm »

klx wrote: Thu Jun 24, 2021 6:44 pm
hgm wrote: Thu Jun 24, 2021 6:36 pm 'a fraction of a second' is about a million times slower than 'a fraction of a micro-second', which is what a conventional EGT probe takes
If you are in this position and need to find the best move, sub-second or sub-microsecond performance might not matter.
Indeed. But that is not how EGTs are used. Their main use is probing close to the leaves of the search tree, to see whether the position there is won, lost or drawn, so that you can go for the won positions and avoid the others. And if the tree has a million nodes, the difference between using a conventional EGT and your method is whether you complete the search in 1 sec or in a year.
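To put numbers on the argument above, here is a back-of-the-envelope calculation in Python. The per-probe costs are assumptions chosen to match the "fraction of a micro-second" versus "fraction of a second" framing earlier in the thread, rounded up to 1 microsecond and 1 second; they are not measurements.

```python
NODES = 1_000_000        # leaf probes in the search tree, as in the post
EGT_PROBE_S = 1e-6       # assumed: about one microsecond per table probe
SEARCH_PROBE_S = 1.0     # assumed: about one second per alpha-beta fallback
SECONDS_PER_DAY = 86_400

def total_seconds(nodes: int, per_probe_s: float) -> float:
    """Total time spent probing if every node costs `per_probe_s` seconds."""
    return nodes * per_probe_s

egt_total = total_seconds(NODES, EGT_PROBE_S)
search_total = total_seconds(NODES, SEARCH_PROBE_S)
search_days = search_total / SECONDS_PER_DAY

print(f"table probes: {egt_total:.1f} s, search fallback: {search_days:.1f} days")
```

Under these assumed costs the fallback search needs about 11.6 days against roughly one second for table probes; a full year would correspond to roughly 30 seconds per fallback. Either way, the factor of a million is what matters.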
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

hgm wrote: Thu Jun 24, 2021 6:54 pm Indeed. But that is not how EGT are used.
Oh ok, I didn't know this. I just learned about endgame databases last week, so I'm not up to speed with all the intricacies yet. How can we make best use of the Emanuel Torresbase then? I feel like I'm on to something.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by AndrewGrant »

So this is just a theory? You've not actually done anything to bring it to reality?
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

AndrewGrant wrote: Thu Jun 24, 2021 11:30 pm So this is just a theory? You've not actually done anything to bring it to reality?
Well, I did the investigation and math in the post above as a kind of proof of concept, but I plan to start the actual implementation this weekend. Mostly I wanted to get some feedback, and to share the idea, since I had never heard of this concept before and in case this ends up laying the foundation for an 8-men database.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by AndrewGrant »

klx wrote: Fri Jun 25, 2021 12:52 am
AndrewGrant wrote: Thu Jun 24, 2021 11:30 pm So this is just a theory? You've not actually done anything to bring it to reality?
Well, I did the investigation and math in the post above as a kind of proof of concept, but I plan to start the actual implementation this weekend. Mostly I wanted to get some feedback, and to share the idea, since I had never heard of this concept before and in case this ends up laying the foundation for an 8-men database.
Engines can achieve depth 20 easily, but it's heavily pruned. If you want a proven tree, your alpha-beta is very limited in depth unless it makes unsafe pruning decisions.
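A rough way to see the limit is to count leaf nodes for an unpruned search. The sketch below uses the standard result that plain minimax visits b^d leaves, while alpha-beta with perfect move ordering visits b^ceil(d/2) + b^floor(d/2) - 1. The branching factor of 10 is an assumed, illustrative value, and the estimate ignores transposition tables, which can help considerably in endgames.

```python
def minimax_leaves(b: int, d: int) -> int:
    """Leaves visited by plain minimax with branching factor b and depth d."""
    return b ** d

def alphabeta_best_case_leaves(b: int, d: int) -> int:
    """Minimum leaves for alpha-beta with perfect move ordering:
    b^ceil(d/2) + b^floor(d/2) - 1."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1

BRANCHING = 10  # assumed average number of legal moves; purely illustrative
DEPTH = 20      # plies, as discussed in the thread

best_case = alphabeta_best_case_leaves(BRANCHING, DEPTH)
worst_case = minimax_leaves(BRANCHING, DEPTH)
print(f"alpha-beta best case: {best_case:.2e} leaves, minimax: {worst_case:.2e} leaves")
```

With b = 10 and d = 20 that is about 2 x 10^10 leaves even in the best case, compared with 10^20 for plain minimax.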
phhnguyen
Posts: 1434
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by phhnguyen »

klx wrote: Thu Jun 24, 2021 6:09 pm Hi there, I have come up with a pretty revolutionary idea to vastly reduce the size of a DTM endgame database.

Here are the facts that led to my discovery:

1. The vast majority of positions are won within a small number of moves. For example, from the Syzygy stats site it seems that close to 99% of positions are won in fewer than 20 plies (depending on the table; for some it is more like 99.95%).

2. We can trivially search to depth 20 plies for endgames with alpha-beta in a fraction of a second.

So, in the Emanuel Torresbase, we store a special identifier instead of the actual DTM for these "easily-won/lost" positions. When a position is queried and this special value is found, we run alpha-beta to find the outcome. In other words, we can cut out up to 99.95% of the table!

The Emanuel Torresbase is fully adjustable. We can configure the threshold up and down to prefer more disk usage and less compute time, or vice versa. With threshold 0, it becomes a plain endgame database. With threshold infinity, it becomes pure alpha-beta search. The threshold can be configured per table, and either in plies or in percent.

The Emanuel Torresbase reduces the size of existing databases and paves the way for an 8-men database.

Syzygy 7 men: 16.7 TiB
Minus 99.95%: 8.5 GiB

Estimated 8 men size: 8.5 GiB * (16.7 TiB / 149.2 GiB) = 974 GiB

Emanuel Torresbase!
So it is just an idea, not something put into practice and not a real base!

I guess the reality may be much harder than you think. From my experience (I did some studies a few years ago):

1) I agree that a large share of non-draw positions (not 99%, but about 90%) are easy to search out in reasonable time, say within a 3-second threshold
2) However, the remaining 10% of non-draw positions may take much, much longer, even hours
3) The biggest trouble is draw positions. You can't easily search and conclude that a position is a draw. Sometimes you need to search over 100 plies (because of the 50-move rule) and/or for hours
4) The mix of 2 and 3 leaves you with no good solution

Your idea of collecting data for some specific positions only (and then searching or doing whatever from there) is a bad one, since those entries will eat a lot of space, memory, and processing time (you may need to load them all into memory and then binary-search for a given position). Your data may end up quite close in size to Syzygy and/or not be worth the trouble. Note that a standard EGTB stores no position identifiers (such as hash keys or FENs) and requires no searching: the probe function can locate the data it needs in the files exactly and instantly (via index functions), without loading everything or doing any search.
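The point about index functions can be illustrated with a toy Python sketch. It assumes a three-piece ending (white king, white queen, black king) and a naive layout with one byte per slot; `index_of`, `probe` and the layout are hypothetical and far simpler than what Syzygy actually does, since real generators typically shrink the index space further using board symmetry and by skipping illegal positions.

```python
SQUARES = 64

def index_of(wk: int, wq: int, bk: int, white_to_move: bool) -> int:
    """Map a toy KQ-vs-K position (squares 0..63) to a unique table slot.
    The position itself is never stored; the slot number encodes it."""
    return ((int(white_to_move) * SQUARES + wk) * SQUARES + wq) * SQUARES + bk

TABLE_SIZE = 2 * SQUARES ** 3      # two sides to move, three pieces
table = bytearray(TABLE_SIZE)      # one byte per position, all zero in this toy

def probe(wk: int, wq: int, bk: int, white_to_move: bool) -> int:
    """Direct lookup: one index computation and one array access, no search."""
    return table[index_of(wk, wq, bk, white_to_move)]

# Store a made-up value for one position and read it back.
table[index_of(4, 27, 60, True)] = 9
print(probe(4, 27, 60, True), TABLE_SIZE)
```

Because every position maps straight to its slot, the file needs no keys and a probe touches only the bytes it needs, which is the contrast with a sparse list of positions that must be searched.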

IMHO, it turns out the best solution for a mix of stored data and searching is something similar to a bitbase (https://www.chessprogramming.org/Endgame_Bitbases).
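As a rough illustration of the bitbase idea, the Python sketch below packs a win/draw/loss result into 2 bits per position, so four positions share one byte. The codes `DRAW`, `WIN`, `LOSS` and the helpers `pack` and `get` are assumptions for this example, not the format of any particular bitbase.

```python
from typing import List

DRAW, WIN, LOSS = 0, 1, 2  # hypothetical 2-bit result codes

def pack(values: List[int]) -> bytearray:
    """Pack 2-bit results, four per byte, lowest bits first."""
    out = bytearray((len(values) + 3) // 4)
    for i, v in enumerate(values):
        out[i // 4] |= (v & 0b11) << (2 * (i % 4))
    return out

def get(packed: bytearray, i: int) -> int:
    """Read back the 2-bit result stored for position index i."""
    return (packed[i // 4] >> (2 * (i % 4))) & 0b11

results = [WIN, DRAW, LOSS, WIN, LOSS]   # made-up results for five positions
packed = pack(results)
print(len(packed), [get(packed, i) for i in range(len(results))])
```

A table like this answers only "won, lost or drawn"; the distance to mate would then have to come from search, which is the mix of data and searching mentioned above.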
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager