Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?


DustyMonkey
Posts: 61
Joined: Wed Feb 19, 2014 10:11 pm

Re: Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?

Post by DustyMonkey »

Why inject leaf removal into this?

A sparse-in-memory table can trivially be converted to a simple compact dense list (just skip the zeros) and can be done in place (no extra memory required.)
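Something like this minimal sketch, assuming the sparse table is an array of 64-bit entries where 0 means "empty" (the names and layout here are only illustrative):

Code: Select all

#include <cstddef>
#include <cstdint>
#include <vector>

// In-place compaction of a sparse table: one linear pass that copies every
// non-zero entry down toward the front. The prefix [0, count) becomes the
// dense list; no extra memory is allocated. (Assumes 0 is reserved as "empty".)
std::size_t compact_in_place(std::vector<std::uint64_t>& table) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i] != 0) {
            table[count++] = table[i];   // surviving entries keep their table order
        }
    }
    return count;   // number of dense entries now at the front of 'table'
}

One pass, no allocation, and the surviving entries keep their table order, which is exactly what sets up the compact encoding in the next step.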

Then the stored N-bit hash codes in that dense list come out mostly sorted, in long ascending runs (because part of each code was used as the index into the sparse table). You can store those ascending values a lot more compactly, and follow that up with the exceptions to the ascending order encoded less compactly. See any research on compactly encoding a sorted list of strings, and translate it to bitstrings instead of ASCII/Unicode.
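As an illustration only (one possible scheme, not a prescription): delta-encode each value against the previous one, spend a byte or two on the common small ascending steps, and fall back to storing the full value whenever the ascending run breaks. The varint format and the function names below are my own assumptions.

Code: Select all

#include <cstdint>
#include <vector>

// LEB128-style variable-length integer: 7 payload bits per byte,
// high bit set on every byte except the last.
static void put_varint(std::vector<std::uint8_t>& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((v & 0x7F) | 0x80));  // more bytes follow
        v >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(v));                      // final byte
}

// Delta-encode a mostly-ascending list of hash codes. Small forward deltas
// cost 1-2 bytes; a break in the ascending run is flagged with a 0 delta
// followed by the full 8-byte value (the "exception" case).
std::vector<std::uint8_t> encode_runs(const std::vector<std::uint64_t>& keys) {
    std::vector<std::uint8_t> out;
    std::uint64_t prev = 0;
    for (std::uint64_t k : keys) {
        if (k > prev) {
            put_varint(out, k - prev);                    // common case: small ascending step
        } else {
            put_varint(out, 0);                           // marker: ascending run broken
            for (int i = 0; i < 8; ++i)                   // store the exception verbatim
                out.push_back(static_cast<std::uint8_t>(k >> (8 * i)));
        }
        prev = k;
    }
    return out;
}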

None of what I just talked about has a large runtime cost... probably amortizing to only a few cycles per entry, essentially free given the asynchronous waiting for more data from the slower storage.

---

And 128-bit hashes are massive overkill given the storage being discussed. The drive storage we are discussing is limited to approximately 2^40 positions, give or take. A quick birthday-bound calculation (probability ≈ n^2 / 2^(b+1) for n keys of b bits) puts the odds of any collision after 2^40 keys at roughly 2^49 to 1 against, i.e. about 1 in 563 trillion. You probably don't live in that universe. In fact, none of the billions of people who have ever existed has lived in that universe.

96 bits should be more than sufficient: after 2^40 keys the same bound gives odds of only about 1 in 131,000, so even if every human being who ever existed ran this workload, only a tiny fraction of them would ever see a single collision.

2^40 is a good number because it's roughly a trillion. An N-terabyte drive can therefore store a trillion positions if each position is N bytes, which is quite convenient for head-calculations (and is why it's obvious that 128-bit hashes are significant overkill). The upper bound for the number of positions that can be stored is approximately 2^40 unless we are talking about massive, extremely expensive storage arrays, and even then the chance of a drive error is still greater than the chance of a collision!
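For anyone who wants to redo the arithmetic, here is a rough birthday-bound check (the approximation P ≈ n^2 / 2^(b+1) is only valid while the result is small):

Code: Select all

#include <cmath>
#include <cstdio>
#include <initializer_list>

int main() {
    // Birthday-bound approximation: with n random b-bit keys,
    // P(at least one collision) ~ n^2 / 2^(b+1) while that value is small.
    const double n = std::pow(2.0, 40);    // ~1.1 trillion stored positions

    for (int bits : {128, 96}) {
        double p = (n * n) / std::pow(2.0, bits + 1);
        std::printf("%3d-bit hash: collision odds ~ 1 in %.3g\n", bits, 1.0 / p);
    }
    // Prints roughly: 128-bit -> 1 in 5.63e+14 (the ~2^49-to-1 figure above)
    //                  96-bit -> 1 in 1.31e+05
}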
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?

Post by dragontamer5788 »

DustyMonkey wrote: Thu Nov 14, 2019 3:14 am Why inject leaf removal into this?
Because at 100 million nodes per second, we're generating far too much data to ever hope to write it all to disk! A hard drive only supports about 200 MB/s of write speed. We have to remove nodes not because of capacity issues, but because of how slowly we can write to disk.

100 million nodes/second x 24 bytes/node is 2.4 GB/s of write traffic. You'll need 12 hard drives in RAID0 to actually support that kind of read or write bandwidth... and faster CPUs are coming.
DustyMonkey wrote: Thu Nov 14, 2019 3:14 am A sparse-in-memory table can trivially be converted to a simple compact dense list (just skip the zeros) and can be done in place (no extra memory required).
Indeed, and the way I visualize it has a similar effect. Instead of skipping over only the zeros... you also skip any nodes that are leaf nodes or otherwise beyond the depth cutoff (a depth cutoff of 1 or 2 is probably all that is needed). So instead of writing 2.4 GB/s (24-byte nodes at 100 million nodes/second), you're writing about 192 MB/s to disk.

Recalculating the leaf nodes wouldn't be a major issue if you needed them.

EDIT: Leaf removal also grossly reduces RAM-capacity requirements. 64 GB of RAM will fill up in under 30 seconds if you store everything, but in about 330 seconds if you ignore the leaf nodes. If you ignore the bottom two layers, your 64 GB of RAM will last for over an hour of analysis.
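Putting the arithmetic in one place (constants as assumed above; the ~12.5x reduction per trimmed layer is simply back-fitted to the 2.4 GB/s → 192 MB/s figures, not a measured branching factor):

Code: Select all

#include <cstdio>

int main() {
    const double nodes_per_sec   = 100e6;   // assumed search speed
    const double bytes_per_node  = 24.0;    // assumed stored entry size
    const double drive_write_bps = 200e6;   // ~200 MB/s per hard drive
    const double layer_factor    = 12.5;    // assumed reduction per trimmed layer
    const double ram_bytes       = 64e9;    // 64 GB of RAM

    double full_bw = nodes_per_sec * bytes_per_node;   // 2.4 GB/s
    double no_leaf = full_bw / layer_factor;           // ~192 MB/s
    double no_two  = no_leaf / layer_factor;           // ~15 MB/s

    std::printf("store everything : %.1f GB/s (~%.0f drives in RAID0), RAM full in %.0f s\n",
                full_bw / 1e9, full_bw / drive_write_bps, ram_bytes / full_bw);
    std::printf("drop leaf layer  : %.0f MB/s (~1 drive), RAM full in %.0f s\n",
                no_leaf / 1e6, ram_bytes / no_leaf);
    std::printf("drop two layers  : %.0f MB/s, RAM lasts %.1f hours\n",
                no_two / 1e6, ram_bytes / no_two / 3600.0);
}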
DustyMonkey wrote: Thu Nov 14, 2019 3:14 am And 128-bit hashes are massive overkill given the storage being discussed. The drive storage we are discussing is limited to approximately 2^40 positions, give or take. A quick birthday-bound calculation (probability ≈ n^2 / 2^(b+1) for n keys of b bits) puts the odds of any collision after 2^40 keys at roughly 2^49 to 1 against, i.e. about 1 in 563 trillion. You probably don't live in that universe. In fact, none of the billions of people who have ever existed has lived in that universe.

96 bits should be more than sufficient: after 2^40 keys the same bound gives odds of only about 1 in 131,000, so even if every human being who ever existed ran this workload, only a tiny fraction of them would ever see a single collision.

2^40 is a good number because it's roughly a trillion. An N-terabyte drive can therefore store a trillion positions if each position is N bytes, which is quite convenient for head-calculations (and is why it's obvious that 128-bit hashes are significant overkill). The upper bound for the number of positions that can be stored is approximately 2^40 unless we are talking about massive, extremely expensive storage arrays, and even then the chance of a drive error is still greater than the chance of a collision!
All good points. I don't think I have a counter-argument to any of them.
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?

Post by Dann Corbit »

Imagine 1000 machines analyzing different EPD positions, charging their hash tables along the way.
The database just of the PV nodes can get pretty big.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?

Post by dragontamer5788 »

Dann Corbit wrote: Thu Nov 14, 2019 4:19 am Imagine 1000 machines analyzing different EPD positions, charging their hash tables along the way.
The database just of the PV nodes can get pretty big.
1000x machines would be 100 racks of 10x 4U servers each. Anyone with that kind of budget would be able to afford 1000x 18TB hard drives: either distributed 1x hard drive per server, or maybe centralized into 23x servers (45 hard drives per 4U storage node).
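The rack math, for what it's worth (the server and chassis sizes are just the round numbers assumed above):

Code: Select all

#include <cstdio>

int main() {
    const int machines         = 1000;
    const int servers_per_rack = 10;   // 10x 4U boxes per rack
    const int drives_per_node  = 45;   // 4U top-loading storage chassis

    std::printf("racks needed          : %d\n", machines / servers_per_rack);        // 100
    std::printf("centralized JBOD nodes: %d\n",
                (machines + drives_per_node - 1) / drives_per_node);                 // ceil(1000/45) = 23
}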

The 1x hard drive per machine would probably be a good way to distribute the 7-man Syzygy Tablebase, so each machine has a local tablebase it could consult without hampering network traffic.
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: Space/Time Tradeoff in year 2020+: What can the community do with 20TB+ Hard Drives?

Post by dragontamer5788 »

dragontamer5788 wrote: Thu Nov 14, 2019 5:13 amThe 1x hard drive per machine would probably be a good way to distribute the 7-man Syzygy Tablebase, so each machine has a local tablebase it could consult without hampering network traffic.
Upon further thought, I disagree with myself. While some degree of replication is probably beneficial, replicating the Syzygy Tablebase across all machines is a stupid idea. Hard drives benefit from consolidated workloads (they deliver higher IOPS when their request queues are kept full), so you'd get better utilization from a degree of consolidation.

1000x Syzygy Tablebases across 1000x servers would take up 18,700 TB of space. Consolidate onto one server and it only takes up 18.7 TB, losing aggregate bandwidth but reducing storage costs significantly. Some degree of replication will probably help (2x the space, but 2x the bandwidth; see the sketch below). Tuning would be required to know the best configuration for sure, but it certainly won't be 1000x servers each loaded with 1x hard drive.
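A rough sketch of that tradeoff, assuming one drive per replica and a ballpark ~150 random reads per second per hard drive (that IOPS figure is my own assumption, not a measurement):

Code: Select all

#include <cstdio>
#include <initializer_list>

int main() {
    const double tb_per_replica = 18.7;    // 7-man Syzygy size assumed above
    const double iops_per_drive = 150.0;   // rough random-read rate of one HDD

    for (int replicas : {1, 2, 1000}) {
        double total_tb = replicas * tb_per_replica;     // storage cost grows linearly
        double agg_iops = replicas * iops_per_drive;     // so does aggregate probe rate
        std::printf("%4d replicas: %8.1f TB of storage, ~%.0f random probes/s\n",
                    replicas, total_tb, agg_iops);
    }
}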

But whatever, I'm not actually an IT guy. I just like playing pretend with imaginary expensive computers. :D :D Realistically, the only machines (or clusters of machines) I'd ever touch would be stuff well under $10,000: a cluster of cheap used eBay boxes, with maybe a few select "new" components to leverage.