Apocalypse_666|6 years ago
If only I could! That’s exactly the frustrating part: there seems to be no way of grokking what goes on under the hood, there are so many different ways of setting up a cluster, and there is very little information online about most of them.
As a practical example: yesterday, all of a sudden, my pods could no longer resolve DNS lookups (it took a while to figure out that that was even the problem; no fun when all your sites are down and customers are on the phone). Logging into the nodes, we found that about half of them had iptables disabled (but they still worked somehow?). You try to figure out what’s going on, but there are about 12 containers running in tandem just to enable networking in the first place (what’s Calico again? KubeDNS? CoreDNS? I set this up a year ago and can’t remember now...), and Googling is of no avail, because your setup is unique and nobody else was harebrained enough to set up their own cluster and blog about it. Commence the random sequence of commands I’ll never remember, until by some miracle things seem to fix themselves. Now it’s just waiting for this to happen again, and being not one step closer to fixing it.
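The “random sequence of commands” can be made a bit more systematic. A minimal sketch of a first-pass DNS triage, assuming a standard kubectl setup (the image, labels, and chain names below are typical defaults, not taken from the commenter’s cluster):

```shell
# 1. Does DNS resolve from inside a throwaway pod at all?
kubectl run dnstest --image=busybox:1.36 --restart=Never --rm -it -- \
    nslookup kubernetes.default.svc.cluster.local

# 2. Is the cluster DNS deployment healthy? (CoreDNS and kube-dns both
#    usually carry the k8s-app=kube-dns label, depending on the setup)
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

# 3. On a suspect node: are kube-proxy's iptables rules actually present?
sudo iptables-save | grep -c KUBE-SERVICES
```

If step 1 fails but step 2 looks healthy, the problem is usually between the pod and the DNS service (kube-proxy rules, the CNI plugin, or the node’s iptables state), which is where step 3 comes in.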
antocv|6 years ago
If you use a managed Kubernetes (not on AWS, since they suck; EKS is not really managed), like GKE or AKS, then you skip the whole class of "there is a problem in my own cloud of my own making" issues.
Btw, I also encountered DNS problems in Kubernetes, on ACS: lookups took 5-10 minutes to resolve, and the cause was ACS not enabling the DNS service to restart on reboot, lol.
dijit|6 years ago
Reading this comment made me realise that new technology is often adopted because it is optional and promises options, but those options quickly shrink away and suddenly you’re locked in.
Not to invoke a controversial name, but this is what happened with systemd.
cbanek|6 years ago
But I have also had a number of DNS problems that we still haven't resolved, and they sometimes go away on their own. Same for iptables rule issues. This is, of course, on a hosted Kubernetes cluster at a large supercomputing center. (I didn't set it up, I just have to fix it. Ugh.) At Google it's been great and we've had no networking problems, but they almost certainly run their own overlay network driver.
The various networking solutions you can plug into Kubernetes seem pretty spotty, and they are very hard to debug. I still haven't figured it out myself. But I am trying not to throw the baby out with the bathwater. I think the networking (and storage) parts will get better.
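Debugging those pluggable networking solutions usually starts with identifying which CNI plugin the cluster is actually running and whether its agents are healthy. A hedged sketch (plugin names and the `calico-node` DaemonSet name are common defaults and may differ per install):

```shell
# Which CNI plugin is deployed, and are its pods healthy?
kubectl get pods -n kube-system -o wide | grep -Ei 'calico|flannel|weave|cilium'

# On a node: the CNI config directory shows what the kubelet actually uses
ls /etc/cni/net.d/

# Calico-specific example: check the per-node agent DaemonSet rollout
kubectl get daemonset -n kube-system calico-node
```

A DaemonSet with fewer ready pods than desired, or a node whose `/etc/cni/net.d/` is empty, narrows “spotty networking” down to a specific node very quickly.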