cullenking | 11 months ago
We just replaced our top-of-rack firewall/proxies that were 11 years old and working just fine. We did it for power and reliability concerns, not because there was a problem. App servers get upgraded more often, but that's because of density and performance improvements.
What does cause a service blip fairly regularly is having a single upstream ISP. I will have a second ISP into our rack shortly, which means that whole class of short outage will go away. It's really the only weak spot we've observed. That being said, we are in a nice datacenter that is a critical hub in the Pacific Northwest. I'm sure a budget datacenter will have a different class of reliability problems that I am not familiar with.
But again, an occasional 15m outage is really not a big deal business-wise. Unless you are running a banking service or something, no one cares when something happens for 15m. Heck, all my banks regularly have "maintenance" outages that are unpredictable. I promise, no one really cares about five nines of reliability in the strong majority of services.
Axsuul|11 months ago
sgarland|11 months ago
I used to joke that my homelab almost had better reliability than any company I’d been at, save for my ISP’s spotty availability. Now that I have a failover WAN, it literally is more reliable. In the five years of running a rack, I’ve had precisely one catastrophic hardware failure (mobo died on a Supermicro). Even then, I had a standby node, so it was more of an annoyance (the standby ran hotter and louder) than anything.