top | item 30401382

(no title)

aij | 4 years ago

> Step one, double the limit to alleviate immediate customer pain.

I've been oncall for systems where that would not work.

Doubling the memory means you need twice as many machines. Depending on the service, that could require significantly increased network bandwidth. Now the network is saturated and every node needs to queue more data. Now latency and throughput are even worse, and even more requests are being dropped, so you automatically double the limit again...

discuss

order

jedberg|4 years ago

While that all may be true (but are indications of a poorly architected system), my code would still work. It would double the limit and then page someone. If they logged in and saw all those failures, then they could address those issues.

The whole point is that having an around the world follow the sun team would not alleviate those issues or make anything better.