RMarcus | 1 year ago

This is my post from 2018 (I didn't submit it to HN), and it could definitely use a "here's what practical systems do" update! I'll put it on the TODO list...

Your point about systems dealing with a relatively small number of large objects vs. small objects also makes sense: this is essentially the "cost" of an overflow (4 KB spills once in a blue moon? Oh well, handle that as a special case. 4 TB spills once in a blue moon? The system might crash). This is more obvious, as you also point out, in load balancing.

One aspect I found very counter-intuitive: before this investigation, I would've guessed that having a large number of large bins makes overflow increasingly unlikely. This is only partially true: having more bins is obviously good, but larger bins are actually more sensitive to changes in load factor!
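
To make the bin-size point concrete, here is a rough sketch of one way to read it. It assumes items land in bins uniformly at random and that a bin overflows when it receives more than its capacity; the function names and the specific numbers (100 bins, capacities 10 and 100, load factors 0.5 and 0.9) are illustrative assumptions, not taken from the post.

```python
from math import exp, lgamma, log, log1p


def bin_overflow_probability(num_bins: int, capacity: int, load_factor: float) -> float:
    """Probability that one given bin receives more than `capacity` items.

    Assumes round(load_factor * num_bins * capacity) items are each placed
    in a uniformly random bin, so the bin's count is Binomial(num_items, 1/num_bins).
    Requires num_bins >= 2.
    """
    num_items = round(load_factor * num_bins * capacity)
    p = 1.0 / num_bins
    tail = 0.0
    # Sum the upper tail directly, in log space, to avoid overflow and cancellation.
    for k in range(capacity + 1, num_items + 1):
        log_pmf = (
            lgamma(num_items + 1) - lgamma(k + 1) - lgamma(num_items - k + 1)
            + k * log(p)
            + (num_items - k) * log1p(-p)
        )
        tail += exp(log_pmf)
    return min(tail, 1.0)


def overflow_growth(num_bins: int, capacity: int, low: float, high: float) -> float:
    """Factor by which the per-bin overflow probability grows from load `low` to `high`."""
    return (bin_overflow_probability(num_bins, capacity, high)
            / bin_overflow_probability(num_bins, capacity, low))


for capacity in (10, 100):
    low_p = bin_overflow_probability(100, capacity, 0.5)
    high_p = bin_overflow_probability(100, capacity, 0.9)
    print(f"capacity={capacity}: load 0.5 -> {low_p:.3g}, load 0.9 -> {high_p:.3g}, "
          f"growth x{overflow_growth(100, capacity, 0.5, 0.9):.3g}")
```

Under these assumptions, the larger bins overflow less often at either load factor, but their overflow probability grows by a much larger factor between 0.5 and 0.9, which is one sense in which they are more sensitive to load factor.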

Overall, I think you are right that this is not really a concern in modern systems. Compared to Dynamo, I still think Vimeo's solution (linked at the bottom of the post) is both intuitive and low-complexity. But regardless, this is more of an interesting mathematical diversion than a practical systems concern these days.
