

I'm the author of this post.

Here are a few things I highlight in the post to consider when architecting for failure: Retry, Backoff and Rate Limit; Use a Cache; Add Redundancy; Build a Buffer; Reconsider Dependencies; Introduce Isolation; Improve Test and Release Practices.

Click through to the post for more on how I think about each of these. I think that weighing the cost tradeoffs of each approach is what makes architecting systems so challenging (and interesting).
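To make the first item concrete, here is a minimal sketch of retry with capped exponential backoff and jitter. It is an illustration only, not the code from the post: the function name `retry_with_backoff` and all of its parameters are hypothetical, and a real system would likely retry only on specific, known-transient errors.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep, rng=random.random):
    """Call `call()` until it succeeds or `max_attempts` calls have failed.

    All names and defaults here are assumptions made for illustration.
    `sleep` and `rng` are injectable so the behaviour can be tested
    without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff (base_delay * 2^attempt), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the delay so that many
            # clients retrying at once do not hit the dependency in lockstep.
            sleep(rng() * delay)
```

The cost tradeoff shows up even in something this small: more attempts and longer delays improve the odds of success, but they also add latency for the caller and keep load on a dependency that may already be struggling.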

What else do you do in your systems to handle failures gracefully? Any questions about what we are doing or how we are doing it?
