top | item 46337479

(no title)

hawk_ | 2 months ago

Durability always has an asterisk i.e. guaranteed up to N number of devices failing. Once that N is set, your durability is out the moment those N devices all fail together. Whether that N counts local disks or remote servers.

discuss

amluto|2 months ago

Interestingly, on bare metal or old-school VMs, durability of local storage was pretty good. If the rack power failed, your data was probably still there. Sure, maybe it was only 99% or 99.9%, but that’s not bad if power failures are rare.

AWS etc, in contrast, have really quite abysmal durability for local disk storage. If you want the performance and cost benefits of using local storage (as opposed to S3, EBS, etc), there are plenty of server failure scenarios where the probability that your data is still there hovers around 0%.

rdtsc|2 months ago

This is about not even trying durability before returning a result ("Commit-to-disk on a single system is [...] unnecessary") it's hoping that servers won't crash and restart together: some might fail but others will eventually commit. However that assumes a subset of random (uncoordinated) hardware failures, maybe a cosmic ray blasts the ssd controller. That's fine, but it fails to account for coordinated failure where, a particular workload leads to the same overflow scenario on all servers the same. They all acknowledge the writes to the client but then all crash and restart.

zrm|2 months ago

To some extent the only way around that is to use non-uniform hardware though.

Suppose you have each server commit the data "to disk" but it's really a RAID controller with a battery-backed write cache or enterprise SSD with a DRAM cache and an internal capacitor to flush the cache on power failure. If they're all the same model and you find a usage pattern that will crash the firmware before it does the write, you lose the data. It's little different than having the storage node do it. If the code has a bug and they all run the same code then they all run the same bug.