I realize you're making a general point about space/IO ratios; the below is orthogonal to that, not a contradiction.

In a large distributed storage system, the user-facing IO you can "sell" per disk is actually a lot less than the disk's raw IO capacity. There's constant maintenance churn to keep data available:
- local hardware failures
- planned larger-scale maintenance
- transient, unplanned larger-scale failures
- (etc.)

In general, you can fall back to reconstructing data from the erasure codes to keep serving during degradation. But that's a) enormously expensive in IO and CPU, and b) riskier for availability and/or durability, because you've lost redundancy.
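
To put rough numbers on a) and b), here is a minimal sketch. It assumes a plain MDS code such as Reed-Solomon with k data and m parity fragments, where rebuilding one missing fragment means reading k surviving ones. The function names and the 10+4 layout are made up for illustration; codes with local parities (LRC-style) need fewer reads.

```python
def degraded_read_ios(k: int) -> int:
    """Fragment reads needed to rebuild ONE missing fragment.

    Assumption: a plain MDS (k data + m parity) code, where any k
    surviving fragments are required to reconstruct a lost one.
    A healthy read of the same fragment costs a single disk read.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k


def remaining_fault_tolerance(m: int, lost: int) -> int:
    """How many MORE fragment losses the stripe can survive."""
    if lost < 0 or lost > m:
        raise ValueError("lost must be between 0 and m")
    return m - lost


# Hypothetical 10+4 stripe with one fragment unavailable.
k, m = 10, 4
print("reads per degraded fragment read:", degraded_read_ios(k))
print("further losses tolerated:", remaining_fault_tolerance(m, lost=1))
```

Under those assumptions a degraded read costs k disk reads instead of one, plus the decode CPU, and the stripe tolerates one fewer additional failure until it is repaired.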

Additionally, it may make sense to rebalance where data lives for optimal read throughput (and other performance reasons).

So in practice, there's constant rebalancing going on in a sophisticated distributed storage system that takes a good chunk of your HDD IOPS.
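
As a back-of-the-envelope sketch of what that means for the IO you can sell: the per-disk IOPS figure and the background fractions below are purely illustrative assumptions, not measurements from any real system, and `sellable_iops` is a hypothetical helper.

```python
from typing import Mapping


def sellable_iops(raw_iops: float, background: Mapping[str, float]) -> float:
    """IOPS left for user traffic after background work takes its share.

    `background` maps a task name to the fraction of raw disk IOPS it
    consumes. All values are assumptions chosen for illustration.
    """
    reserved = sum(background.values())
    if reserved < 0 or reserved > 1:
        raise ValueError("background fractions must sum to between 0 and 1")
    return raw_iops * (1 - reserved)


# Hypothetical HDD and hypothetical background shares.
user_iops = sellable_iops(
    raw_iops=200,
    background={"repair": 0.10, "rebalancing": 0.15, "garbage_collection": 0.05},
)
print(f"user-facing IOPS per disk: {user_iops:.0f}")
```

The point is only the shape of the calculation: every background task comes straight out of the same fixed per-disk IOPS budget.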

This + garbage collection also makes tape really unattractive for all but very static archives.
