top | item 32403746

Falling for Kubernetes

164 points | icyfox | 3 years ago | freeman.vc

184 comments

[+] fuziontech|3 years ago|reply
Anyone managing a k8s cluster who is fatigued with memorizing and reciting kubectl commands should definitely take a look at k9s[0]. It provides a curses-like interface for managing k8s, which makes it really easy to operate and dive into issues when debugging. Move from grabbing logs for a pod, to being at a terminal on the container, and then back out to looking at or editing the YAML for the resource definition in only a few key presses.

[0] https://k9scli.io/
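For a sense of what that saves, the raw kubectl round-trip that k9s collapses into single key presses (l for logs, s for shell, e for edit, by default) looks roughly like this (the pod and namespace names here are hypothetical):

```shell
# The same workflow k9s handles with a few key presses.
kubectl -n web get pods                                   # find the pod
kubectl -n web logs my-app-7d4f9c-xk2p1                   # grab its logs
kubectl -n web exec -it my-app-7d4f9c-xk2p1 -- /bin/sh    # shell into the container
kubectl -n web edit deployment my-app                     # back out, edit the resource YAML
```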

[+] Bayart|3 years ago|reply
I've used k9s every day for the last 6 months and it's really superior to everything if you have any vi-fu. It even plays nice with the terminal emulator's colour scheme. It's simply an all-around pleasant experience in a way no dashboard is.
[+] quickthrower2|3 years ago|reply
I like Lens, as more of a GUIs fan, and very occasional k8s-er. It has saved me a lot of time.
[+] simiones|3 years ago|reply
For those who use emacs, I'd also recommend the very nice `kubel` plugin - an emacs UI for K8S, based on Magit.
[+] _carbyau_|3 years ago|reply
I had to look up k9s because I wondered what you meant by "curses-like interface" - it couldn't be where my mind went: "f*ck_u_k8s -doWhatIwant -notWhatISaid"

And upon lookup I was transported back to my youth of ASCII interfaces.

[+] hhthrowaway1230|3 years ago|reply
K9s made my learning of k8s way, way, way easier. I still use it every single day and I absolutely adore it. The terminal user interface was so absolutely fantastic that it genuinely sparked my motivation to build more TUIs myself.
[+] OscarCunningham|3 years ago|reply
Do you pronounce it 'canines' or 'K-9-S'?
[+] physicles|3 years ago|reply
I use k9s every day, love it. Only problem is that the log viewing interface is buggy and slower than `kubectl logs`. Still love it though.
[+] yuppie_scum|3 years ago|reply
Can’t vouch for k9s enough, it’s great and I think it helped me to gain a much better understanding of the resource/api system.
[+] swozey|3 years ago|reply
Has anyone on AWS gotten k9s to work with Awsume [0] authentication? I miss using it but I can't auth to different AWS accounts and access my EKS clusters with it unfortunately.

[0] https://awsu.me/

edit: I figured it out! You need to use autoawsume which is triggered by awsume $profile -a

[+] suryao|3 years ago|reply
I've been using and recommending k9s to everyone and it just works. I love it and use it enough that I'm a sponsor.

It's an amazing project by a solo dev, please consider sponsoring. My guess is anyone using kubernetes can afford to get their org to shell out $10 for it.

(I'm not affiliated with k9s in any way except as a happy user)

[+] skinnyarms|3 years ago|reply
It's been a smash hit at work. There is a bit of a learning curve, but nothing compared to kubectl.

:        for changing "resource" types (and a few other miscellaneous things)
/        for filtering
shift-?  to see more commands

[+] lbriner|3 years ago|reply
Lots of people complain about Kubernetes complexity, but I have found it is as complex as you make it. If you are running some simple workloads, then once you have the pipeline set up, there is almost no maintenance required.

When people complain and then start talking about super complex configuration, bespoke networking functionality and helm charts that have "too many options" then surely that just means you don't have the skills to use the system to that degree?

I could say that .NET is too complicated because it has MSIL and library binding sequences involving versions and public keys, and the fact that you cannot always link e.g. netfx with netstandard, but these are either just things you need to learn, or things that you can't use until you do learn them.

It's like someone complaining that a Ferrari is too complicated because you can't work out how to change the cylinder head when most people will just drive it.

[+] mrweasel|3 years ago|reply
Where people collide and disagree about complexity depends on their roles.

If you're a consumer, then yes, it's as complex as you make it. If you keep it super simple you may lose out on some features, but that's a reasonable trade-off.

If you're the person responsible for running and maintaining the Kubernetes cluster, then you're kind of out of luck. It's honestly not that bad to install; you can do that in an afternoon. Where I find Kubernetes to be exceedingly complex is in debuggability. I'm not sure there's any way around that; it's basically a software-defined datacenter, with all the complexity that brings... For some of us it's a software-defined datacenter on top of an actual datacenter, just to make things worse.

When I read about a company that just spins up a new Kubernetes cluster because it's quicker than debugging the existing one, I get concerned. For running payloads, absolutely: just use the subset of the features you're comfortable with and build from there. Still, I'd argue that most of us will never have problems large enough or complex enough that Kubernetes is a hard requirement.

[+] jiggawatts|3 years ago|reply
This is a bit like people being apologetic for PHP. Sure, technically, it is possible to write good PHP code. It doesn't have to turn into spaghetti.

I have several issues with Kubernetes that superficially look like I'm just avoiding the complexity, but I've dealt with systems that are much more complex with ease.

1. In most orgs and environments, it's a cloud-on-a-cloud. A few years ago I used to joke with people that the virtualisation in the cloud is 7 layers deep and no human can understand or troubleshoot it any longer. Adding something like Docker adds 1-2 layers. Kubernetes doubles the layers. Everything you do with the underlying cloud is duplicated in Kubernetes, but incompatible. E.g.:

    Azure has:        Kubernetes has:

    Resource Groups   Namespaces
    Tags              Labels
    Disks             PVs & Container Images
    VMs               Nodes
    (various)         Pods
    Load balancers    Ingress
    NSGs & FWs        (various)
    Policy            Policies
    Key Vault         etcd
    ARM Templates     Helm charts
    JSON APIs         gRPC APIs
    Azure AD          Pluggable auth
    Azure Metrics     Prometheus
    Log Analytics     (various)
    PowerShell        cli tool
These interact in weird and wonderful ways. Azure NATs all traffic, and then Kubernetes NATs it again by default. There's "security" at every layer, but just to be irritating, all Kubernetes traffic comes from unpredictable IPs in a single Subnet, making firewalling a nightmare. You're running cloud VMs already, but Windows containers run in nested VMs by default on Kubernetes. Kubernetes has its own internal DNS service for crying out loud!

2. Trying to do everything means being less than optimal for everyone. There are four distinct ways of managing a cluster, and they're not compatible. You can run imperative commands, upload Helm charts, sync the cluster with an entire folder of stuff, or use a plugin like Flux to do GitOps. But if different people in a large team mix these styles, then this causes a giant mess. (To be fair, this is an issue with all of the major public cloud providers also.)

3. Google-isms everywhere. Every aspect of Kubernetes uses their internal shorthand. I'm not an ex-Googler. Nobody at any of my customers is. Nobody around here "speaks this dialect", because we're 12,000 kilometres from Silicon Valley. I'm sure this is not deliberate, but there are definite "cliques" with distinct cultures in the IT world. As FAANG employees flush with cash jump ship to start projects and startups like Kubernetes, they take their culture with them. In my opinion, mixing these together at random into a larger enterprise architecture is generally a mistake.

4. Kubernetes is not much better than bare VMs for developers, especially when compared with something like Azure App Service. The latter will "hold your hand for you" and has dozens of slick diagnostic tools integrated with it. On Kubernetes, if you want to do something as simple as capture crash dumps when there's a memory leak detected, you have to set this up yourself.

5. Microservices and RPC-oriented by default. Sure: you're not forced to implement this pattern, but it's a very steep slippery slope with arrows pointing downhill. In my experience, this is unnecessary complexity 99.99% of the time. Just last week I had to talk a developer out of adding Kubernetes to a trivial web application. Notice that I said "a" developer? Yes, a solo developer was keen on adopting this fad "just because". He was seriously planning on splitting individual REST endpoints out into containerised microservices. He's the third solo developer I've had to talk out of adopting Kubernetes this year.

6. Startup culture. Kubernetes shipped too early in my opinion. Really basic things are still being worked out, and it is already littered with deprecation warnings in the documentation. It's the type of product that should have been fleshed out a bit better at one or two large early adopter customers, and only then released to the general public. But its authors had a lot of competition (Docker Swarm, etc...) so they felt a lot of pressure to ship an MVP and iterate fast. That's fine I suppose, for them, but as an end-user I have to deal with a lot of churn and breakage. A case in point is that the configuration file formats are so inadequate that they have spawned a little ecosystem of config-file-generator-generators. I don't even know how deep those layers go these days. (Again, to be fair, Azure now has Bicep -> ARM as a standard transpilation step.)

7. Weak security by default because containers aren't security boundaries as far as Linus Torvalds or Microsoft Security Response Center are concerned. Everyone I've ever talked to about Kubernetes in the wild assumes the opposite, that it's magically more secure than hypervisors.

I get the purpose of Kubernetes in the same way that I get the purpose of something like Haskell, coq, or hand-rolled cryptography. They all have their legitimate uses, and can be useful in surprising ways. Should they be the norm for typical development teams? I'm not convinced.

Maybe one day Kubernetes v3 will be mature, stable, and useful for a wider audience. Especially once the underlying cloud is removed and there are smaller providers offering "pure Kubernetes clouds" where node pools are bare metal and there's no NAT and there isn't an external load balancer in front of the Kubernetes load balancer to make things extra spicy when diagnosing performance issues late at night across multiple incompatible metrics collector systems...

[+] orthoxerox|3 years ago|reply
k8s is deceptively simple (or is that deceptively complex?). Anyway, what I mean is that spinning up a basic cluster isn't hard. Maintaining a cluster on premises while following every existing infosec and ops guideline is. It's not that you can't do this, it's just a very non-trivial amount of work.
[+] philliphaydon|3 years ago|reply
> once you have the pipeline setup, there is almost no maintenance required.

You could apply this to a traditional deployment. Once you set up all the CI/CD, there's no maintenance required.

But the non-Kubernetes version would probably be cheaper.

[+] stickfigure|3 years ago|reply
...or deploy your code on Google App Engine, Heroku, Elastic Beanstalk, Digital Ocean App Platform, Fly.io (etc etc) and spend all your time implementing user-facing features instead of maintaining infrastructure.

Yeah, I get it, compared to maintaining bare metal, k8s is amazing. But you're still wasting your time working on plumbing.

[+] stn_za|3 years ago|reply
Amazon EKS + Fargate.

No bare metal to manage. Control plane complexity is abstracted away. Fargate namespaces + profiles, no worker node configuration.

EKS cost will be ~$90/m. Thereafter you only pay for whatever CPU/mem limits you assign to your deployments/pods.

Otherwise, why bare metal? For basic needs, bare metal, self-managed control planes, etc. are definitely overcomplicating things.

[+] easton|3 years ago|reply
If you're abstracting away most of the complexity of k8s, why not just go use ECS and spend nothing on the cluster? You will probably have to do some rewriting of your deployment scripts when you move off of EKS anyway (just like when you move off of ECS), so you might as well use ECS and save the $90/m (and it's generally easier to use).
[+] denvrede|3 years ago|reply
Using Fargate for long-running / permanent workloads in EKS is only an option when cost is no concern.
[+] umvi|3 years ago|reply
> This deployment needed to serve docker images on boot-up, which instance templates do support. However they don't support them via API, only in the web console or the CLI.

Not exactly true. I got around this limitation by sending a startup script[1] in API metadata which basically just invokes `docker run ...`, and it works just fine. This allows spinning up/down container-based VMs via API only, which is nice.

[1] https://cloud.google.com/compute/docs/instances/startup-scri...
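A sketch of that workaround, with hypothetical project, image, and template names (shown via gcloud for readability; the same `startup-script` metadata key works through the raw instance-templates API):

```shell
# Pass a startup script as instance metadata; the script just runs the container.
# Template, machine type, and image names below are illustrative.
gcloud compute instance-templates create my-app-template \
  --machine-type=e2-small \
  --image-family=cos-stable --image-project=cos-cloud \
  --metadata=startup-script='#! /bin/bash
docker run -d --restart=always gcr.io/my-project/my-app:latest'
```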

[+] kelnos|3 years ago|reply
> Keep it managed since that legitimately becomes a reliability headache.

This is the thing that I think will always give me pause. If I have to pay a third party to manage my cluster orchestration backplane, that seems like a pretty big piece of overhead.

Sure, I can do it myself, but then I have to deal with said reliability headache. It seems ironic that a cluster management framework -- that touts its ability to reliably keep your applications running -- has its own reliability issues.

[+] NightMKoder|3 years ago|reply
This may not be a surprise to some, but when folks talk about reliability of the control plane, they usually think failure means their web service goes down. That's not true. If you shoot the Kubernetes control plane, the individual servers can't talk to it anymore, so they do nothing. Containers that were running stay running. They even get restarted if they crash (via restartPolicy). Services that had specific other pods they were referencing continue referencing those pods. In net: everything except for kubectl and other Kubernetes internals keeps working.
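For reference, the restartPolicy in question lives in the pod spec, and Always is the default, so even a minimal manifest like this (names are hypothetical) keeps its container restarting locally via the kubelet, with no API server involved:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app           # hypothetical name
spec:
  restartPolicy: Always  # kubelet restarts the container locally; no control plane needed
  containers:
    - name: app
      image: nginx:1.25
```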

That said, one piece that isn’t talked about frequently is the network overlay. Kubernetes virtualizes IPs (so each pod gets an IP), which is awesome to work with when it works. But if your overlay network goes down - god help you. DNS failures are the first to show up, but it’s all downhill from there. Most overlays take great care to degrade well, so they’re not tied to the control plane, but I have yet to find one that’s perfect. The overlay is the part of kube that truly isn’t failure tolerant in my experience.

[+] lmarcos|3 years ago|reply
Exactly. We are building these incredible open source tools... but they grow so complex that we need to pay others in order to use them effectively?

What would you say if you had to pay Google to use Golang effectively (because the language had become so complex that it's difficult to handle on your own)? Crazy.

I wanted to take a look at how to use k8s on my own cluster, and damn it, installing the whole thing is not that straightforward. So now, to keep my sanity, I need to pay a cloud provider to use k8s! I guess that's the trick: build some open source monster that's very hard to install/maintain/use but has cool features. People will love it and they'll pay for it.

[+] lumost|3 years ago|reply
Something that isn't appreciated enough is how reliability issues demolish your team's throughput. Got a random cluster-restart heisenbug taking customers offline? Good luck debugging it for the next 6 weeks. Alternately, ignore the problem while your engineers get woken up at night until they quit...

The progress of software has been towards managed offerings for a reason. A company can make it its entire business to own fixing these reliability issues for you and keeping them at bay. Do you really want to be in the business of managing a cloud on your cloud?

[+] CommanderData|3 years ago|reply
I don't fully understand; there are benefits to using a managed service when the control plane is something you only interact with but don't manage. Not every ops team will have a CKA-certified administrator at hand to delve into etcd or the controller manager. Open a ticket and it's generally fixed.

Then there are situations where you want full control over the control plane itself. I've worked with companies that had clusters installed on bare metal in their stores in the UK. A CKA engineer is essential in this case, but that brings its own reliability headaches.

[+] yunwal|3 years ago|reply
I don't disagree with you, but if you can reliably trade your data plane outages for control plane outages, that's still usually a good tradeoff.
[+] anotherhue|3 years ago|reply
Vanilla k8s is pretty good. But once the 8 trillion vendors have you 'curl | helm'-ing, you end up with a knot of a system.

Keep it simple, use GitOps (ArgoCD is great), and let k8s do what it's good at (managing workloads), not serve as a delivery mechanism for a vendor.

As an aside, the existence of the '{{ | indent 4 }}' function in helm should disqualify it from any serious use. Render, don't template.
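For anyone who hasn't hit this: `indent` pads every line including the first, so the expression only yields valid YAML if it sits at column zero, while `nindent` prepends its own newline. A contrived sketch (the `app.labels` helper is hypothetical):

```yaml
# Fragile: the template expression must sit at column 0 or the output breaks.
metadata:
  labels:
{{ include "app.labels" . | indent 4 }}

# Slightly less fragile: nindent supplies the leading newline itself.
metadata:
  labels: {{- include "app.labels" . | nindent 4 }}
```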

[+] azurezyq|3 years ago|reply
We are currently building a Database-as-a-service platform (tidbcloud.com) using Kubernetes. I have to say it is a love-and-hate story.

On the bright side, k8s is almost the only option for an abstraction layer on top of different clouds for a complex system with tens of components. A database is more than masters and workers; there are so many components you need to take care of. For example, we may need monitoring agents, certificate managers, health checkers, admin proxies, etc. Without k8s, you have to be the owner of a kindergarten.

On the other side, k8s IS complicated. It's like an unexplored garden. People just enter it and try to use whatever they see, and cause all kinds of problems. What we ran into:

* Trying to apply the operator pattern to everything: debugging is really painful, and the learning curve is steep.
* Small services still cost a lot. VPA is not mature enough, and many tiny services may be better off on Lambda.
* k8s is not really designed for small per-tenant clusters. Managing a fleet of clusters is no easy job, but it's something SaaS companies have to deal with.

[+] Beltalowda|3 years ago|reply
> We are currently building a Database-as-a-service platform (tidbcloud.com) using Kubernetes.

I worked for a company that did exactly that (database-as-a-service on k8s); no one in the entire company knew how to run a cluster from scratch. This is a real problem if your developers want to do crazy bizarre advanced stuff like run tests because no one knows how anything fits with anything. At least, I thought it was a real problem as it wrecked any semblance of productivity, but no one else seemed to mind much and thought "it kind-of works on the CI on a good day if the planetary alignments are good" was fine. But hey, VC money so free to piss it all away on a small army of devs being wildly unproductive shrug

Also, the reliability of it all was nothing short of embarrassing, and debugging issues was hard because it was all pretty darn complex. Any technology is fantastic if it works, but I think that's the wrong metric. You need to look at how well things will go for you when it doesn't work – for any reason – because you only set your infrastructure up once, but you will "read" (debug) it countless times.

I hope it works out better for you, but I always felt that the advantages that k8s gave us could have been implemented from scratch significantly better by a few devs working on it for a few months. The other day I spent about an hour writing a shell script (and then spent a bit of time fixing a bug a few days later) to automate a ~15 minute task. k8s kinda felt like that.

[+] nijave|3 years ago|reply
Probably the next closest is just plain VMs (and potentially a backplane/management layer running on k8s or whatever)

But yeah... Even then each cloud has quirks with Kubernetes, and there are still quite a few resources just to stand up a cluster. Kubernetes can partially solve the initial provisioning, but you generally need the cluster running with nodes before you can use something like CAPI or Crossplane (so you still need Terraform or Pulumi or scripts or whatever)

Having worked with a similar system, shared tenancy with tenant per namespace is just as bad but in a different way (if you use the classic operator pattern with 1 operator per cluster, you potentially have a massive blast radius). Then there's security...

[+] gizzlon|3 years ago|reply
Seems like everyone is forgetting about PaaS, and I don't understand why ..

For many use-cases it's going to be much simpler and cheaper than a managed k8s.

There's no more lock-in with Cloud Run than with GKE. (Actual lock-in comes with proprietary databases and the like.)

edit: Missed the GPU part, which might make the OP's project the exception to the rule

People also forget about auto-scaling groups of VM's such as Managed Instance Groups in GCP: https://cloud.google.com/compute/docs/instance-groups/

[+] topspin|3 years ago|reply
"Azure is the only major provider that still has free control panels"

Oracle Cloud Infrastructure does as well. Perhaps it does not yet qualify as major... It's major to Oracle, that's for sure.

[+] lsofzz|3 years ago|reply
This is one of the pet peeves of HN submitters and readers :)

Sure, here's my two cents FWIW: Kubernetes is complex for some sets of folks but not for others. So the answer is: it depends, on a lot of external factors beyond just the technical capabilities of the team.

Kubernetes solves many non-trivial challenges, but it's not a silver bullet. I could certainly learn from the Mercedes platform/infra team's "story from the trenches" (they reportedly run 900+ k8s clusters in production :)

[+] master_crab|3 years ago|reply
The fact that there is a think piece every day either extolling or cursing Kubernetes is a key indicator this bit of tech has some serious teething issues.
[+] woopwoop24|3 years ago|reply
I still believe most companies are better off just deploying containers on VMs with autoscaling groups, an LB in front, and some kind of Terraform + Ansible | CI-pipeline deployment to manage it.

Consider all the complexity you have to buy into when dealing with k8s: you have to update it frequently, update all the charts, fix all the breaking things in a developing ecosystem, and still deal with the hard parts of dynamic certs, storage, and network. And you would still need to dedicate "normal" VMs to host databases, your monitoring, and your storage, because if your cluster goes down, so does your management layer.

I have been with k8s since 1.3 and it is so disproportionate that I will not touch it most of the time

[+] mihaigalos|3 years ago|reply
For very bare-minimum playing around, I use k3s on Linode (1 shared CPU, 1 GB RAM, 25 GB disk for a master, and another one for a worker node) at $10/month.
[+] throwaway787544|3 years ago|reply
> Bare metal is always my first choice both for past projects and startups.

Is this a new Tech Hipster thing? Like writing a letter with a typewriter rather than an ink pen or computer/printer? "You don't understand, man; a virtual machine is, like, not authentic, man."

[+] pm90|3 years ago|reply
There's benefits to using bare metal, but generally only for situations where you really need them. e.g. if you want to run hardware accelerated apps, run your own virtualization/VMM (like firecracker) etc.

So yes, for the most part it is a hipster thing.

[+] EddySchauHai|3 years ago|reply
Bare metal can still have containers, I assumed it just meant on their own hardware?
[+] _jal|3 years ago|reply
Probably.

If you need to run bare metal, you know it, because it is a money question.

The big reasons that make it make sense include performance, storage, connectivity, and security. If you're doing interesting things with any of the first three, owning your own hardware tends to make a lot of economic sense. The fourth is usually less a technical concern than a liability one, and tends to be externally imposed. But sometimes firms really are just that secretive.

[+] nijave|3 years ago|reply
I think all the major cloud providers have bare metal offerings that work similarly to VMs. On the other hand, I think they're all running accelerator hardware to offload all the VM overhead so the performance is basically the same
[+] umvi|3 years ago|reply
My problem with Kubernetes is my problem with front-end web frameworks: they introduce too much complexity, to the point of being esoteric for simple systems.

If you have a simple website built on boring technologies like HTML, CSS, and vanilla JS, then nearly anyone can read, understand, and make changes to it, even backend developers. If you instead wrote it in React/Webpack/etc. then suddenly only frontend experts can understand and contribute and debug.

Same with k8s. If you make a cloud backend using boring technologies like plain old programs calling APIs then nearly anyone can read, understand, and make changes to it. But if you instead make it use a big pile of configuration files with a million switches then suddenly only k8s experts can understand and contribute and debug.

I'm not saying don't use ReactJS or Kubernetes, I'm just saying make sure that the benefits you get from switching to it outweigh the new complexity and therefore expertise required to understand and debug it.