Honestly in this day and age rolling your own k8s cluster is negligent. I've worked at multiple companies using EKS, AKS, GKE, and we haven't had 10% of the issues I see people complaining about.
I've had my fair share of outages on managed k8s solutions. The difference there is that once it's hosed, your fate is 100% in the hands of cloud support and well... good luck with that one. The cloud apologists in this thread will ofc try to shame you for not buying into their marketing.
if your fate is in the hands of one of the cloud gods, what right does anyone have to blame you for what transpires?
mere mortals are not privy to all of the internal downstream impacts from that public-facing service outage. it would be like shouting into the void and expecting an answer, and, more, liking it.
no, it is easier to recognize one’s place, pay the tithes, and enjoy one god’s blessings and curses alike. do not stray and attempt to please two, it will only end in misery. (three is right out.)
Once your team has upgrades down, everything is pretty rote. This submission (Urbit, lol) seemed particularly incompetent at managing cert rotation.
The other capital lesson here? Have backups. The team couldn't restore a bunch of their services effectively, because they didn't have the manifests. Sure, a managed provider may have fewer disruptions and avoid some fuckups, but the whole point of Kubernetes is Promise Theory, is Desired State Management. If you can re-state your asks and put the manifests back, most shit should just work again, easy as that. The team seemingly had no operational system, so their whole cluster was one vast special pet. They fucked up. Don't do that.
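To make the "have the manifests" point concrete, here is a rough sketch of a last-resort export script. It assumes `kubectl` is on the PATH and already pointed at the cluster. The `KINDS` list and the helper names (`export_command`, `restore_command`, `export_all`) are made up for illustration and are not anything the team in the submission used.

```python
import subprocess
from pathlib import Path

# Hypothetical selection of resource kinds; adjust to what your cluster actually runs.
KINDS = ["deployments", "services", "configmaps", "ingresses"]


def export_command(kind):
    """Build the kubectl command that dumps every object of one kind as YAML."""
    return ["kubectl", "get", kind, "--all-namespaces", "-o", "yaml"]


def restore_command(backup_dir):
    """Build the kubectl command that re-applies every manifest in a directory."""
    return ["kubectl", "apply", "-f", str(backup_dir)]


def export_all(backup_dir, run=subprocess.run):
    """Write one YAML file per kind into backup_dir.

    `run` is injectable so the function can be exercised without a cluster.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for kind in KINDS:
        # check=True makes a failing kubectl call raise instead of writing an empty file.
        result = run(export_command(kind), capture_output=True, text=True, check=True)
        path = backup_dir / f"{kind}.yaml"
        path.write_text(result.stdout)
        written.append(path)
    return written
```

Dumps of live objects carry server-populated fields (`status`, `resourceVersion`, `uid`) that may need stripping before `kubectl apply` will accept them on a fresh cluster, so treat this as a fallback. Keeping the original manifests in version control is the sturdier way to be able to re-state your asks.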
dilyevsky|2 years ago
catchnear4321|2 years ago
jauntywundrkind|2 years ago
nyolfen|2 years ago
anotherhue|2 years ago