100% Uptime for Web Applications

[+] Smerity|14 years ago|reply

Luckily, when a client demands something outrageous, they tend to discover that it comes with a formidable price tag. Uptime requires redundancy, and redundancy substantially increases cost, even before factoring in the additional development time.

A theoretical 100% uptime (theoretical because you can't guarantee it anyway) requires a ridiculous amount of redundancy: you now need to factor in every idiot with a backhoe[1], car[2] or boat[3]. At 100% uptime you have to consider not just your site being up, but all the infrastructure between your servers and any given user being up as well.

What's better is discussing with your client what happens when they hit downtime and how the negative impact can be mitigated. Can you run a limited service that covers the core functionality of the website (i.e. only make the core components redundant[4])? What parts of the service are absolutely desired? How much money will the client lose (either directly or through lost customer faith)?

You'll tend to find the client will realise the time is better spent developing features or improving the user experience than worrying over that last 0.1% between 99.9 and 100.

[1]: http://www.wired.com/science/discoveries/news/2006/01/70040

[2]: http://www.datacenterknowledge.com/archives/2010/05/13/car-c...

[3]: http://en.wikipedia.org/wiki/2008_submarine_cable_disruption

[4]: http://techblog.netflix.com/2011/07/netflix-simian-army.html
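To put rough numbers on the redundancy point, here is a minimal sketch of how availability combines across components. It assumes failures are independent, which real outages (a backhoe through a shared conduit, say) often violate, and the function names and the 99.9% figures are made up for illustration.

```python
from math import prod

def series(*availabilities: float) -> float:
    """Everything in the chain must be up (server AND network AND ...)."""
    return prod(availabilities)

def parallel(*availabilities: float) -> float:
    """Redundant copies: the group is down only if every copy is down."""
    return 1 - prod(1 - a for a in availabilities)

# Hypothetical figures: a server and a network path, each up 99.9% of the time.
single_path = series(0.999, 0.999)

# Assumption: two independent servers behind the same network path.
redundant_servers = series(parallel(0.999, 0.999), 0.999)

print(f"single server + network:  {single_path:.6f}")
print(f"redundant servers + network: {redundant_servers:.6f}")
```

Under these assumptions, chaining components lowers availability, and doubling up the servers helps only until the non-redundant network path becomes the limit, which is why every link between server and user has to be counted.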

[+] sicxu|14 years ago|reply

To some extent, 100% uptime is like perfect security. There is no perfect security, and the closer you get to it, the more expensive it is. At some point, you have to draw a line and design your system to deal with the downtime.

[+] davyjones|14 years ago|reply

I read somewhere that for each 9 added after the decimal point, the cost goes up exponentially. Can't recall the link/study.
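The cost claim is hard to check without the source, but the downtime side of it is plain arithmetic: each extra nine cuts the allowed downtime by a factor of ten. A quick sketch, assuming a 365-day year (the function name is just for illustration):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime budget per year for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

Five nines leaves a little over five minutes a year, which is less time than most people need to notice an outage and log in, let alone fix it.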

[+] maytc|14 years ago|reply

I believe every client would like to have a system with 100% uptime. You should instead frame the question as what epsilon of error is acceptable, which in your case is 0.02%.

[+] mVChr|14 years ago|reply

This could easily turn into a thread of "when clients say X what they really mean is Y... and you better clarify it up front!"

[+] unknown|14 years ago|reply

[deleted]

[+] GMali|14 years ago|reply

It says in the article that "In 2007, only 3 of the top 20 websites were able to achieve 5 nines or 99.999% uptime." Among the top 3 was Myspace. Which is funny, because you need traffic to worry about uptime...

[+] SoftwareMaven|14 years ago|reply

Myspace was #3 by traffic, not by uptime. Yahoo, Comcast, and AOL were the three that made it in the top (5 nines) by availability.

Zero downtime by Yahoo is, quite frankly, astounding. Kudos to their devops team! Not even Google made that (though I bet they can now).
