I work for an ISP and we are trying to write another success story ;) As an ISP, we have tons of constraints in terms of infrastructure. We're not allowed to use any public cloud services. At the same time, the in-house infrastructure is either too limited, or managed via spreadsheets by a bunch of dysfunctional teams.
For my team, Kubernetes has been truly a life saver when it comes to deploying applications. We're still working on making our cluster production-ready, but we're getting there very fast. Some people are already queuing up to get to deploy their applications on Kubernetes :D
What I especially love about Kubernetes is how solid the different concepts are and how they make you think differently about (distributed) systems.
It sure takes a lot of time to truly grasp it, and even more so to be confident managing and deploying it as Ops / SRE. But once you get it, it starts to feel like second nature.
Plus the benefits, in almost any possible way, are huge.

Best docs of all open source projects I've seen.
Red Hat's OpenShift makes it a lot easier by providing all of the infrastructure around it (docker registry, docker build from Git, Ansible integration and so on).
From the ops side, I would also suggest taking a look at Mesos, DC/OS and Marathon. Kubernetes, like Docker, is more developer-friendly; DC/OS is more ops-friendly. DC/OS can use Kubernetes natively.
> During this migration, we encountered an issue that persists to this day: during times of high load and/or high rates of container churn, some of our Kubernetes nodes will kernel panic and reboot.
Considering that Kubernetes doesn't modify the kernel, this issue sounds like it is present in the mainline kernel, and kernel devs should be involved.
I would be interested to know what storage driver they're using for their nodes. High container churn puts a lot of stress on the VFS subsystem of Linux, and we've seen cases where customers triggered lots of mounts/umounts, resulting in filesystem code causing panics. At SUSE, we do have some kernel devs debugging the issues, but the workaround is almost always "rate limit all the things". There are a few other kernel areas that are stressed by high container churn (like networking), but VFS is the most likely candidate in my experience.
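The "rate limit all the things" workaround can be sketched as a token bucket placed in front of container create/destroy operations. This is a minimal illustration, not SUSE's actual tooling; the rates and the usage pattern are invented:

```python
import time

class TokenBucket:
    """Simple token bucket: allows about `rate` operations per second,
    with bursts up to `capacity`."""
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Hypothetical usage: gate container churn before it hits the kernel.
bucket = TokenBucket(rate=5, capacity=10)  # ~5 container starts/sec, bursts of 10
allowed = sum(1 for _ in range(100) if bucket.allow())
print(allowed)  # roughly the burst capacity when called in a tight loop
```

Requests that are denied would simply be retried later, smoothing the mount/umount storm the kernel sees.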
While on paper containers are very lightweight, spawning a lot of them exercises kernel codepaths that probably haven't been subjected to that kind of stress during development.
Missing ENA and ixgbevf can be a real performance killer! https://github.com/kubernetes/kops/issues/1558

FWIW, the stock kernel (and HWE/HWE-edge kernels) recently picked up current ENA drivers. ixgbevf, unfortunately, doesn't look like it's been updated in-tree, so it still lags behind Amazon's recommendation (currently 2.14.2, whereas Xenial's in-tree driver claims to be 2.12.1 and Trusty has 2.11.3).
> Several qualities of Kubernetes stood out from the other platforms we evaluated: the vibrant open source community supporting the project, the first run experience (which allowed us to deploy a small cluster and an application in the first few hours of our initial experiment), and a wealth of information available about the experience that motivated its design.
It's interesting that the reasons they cite for choosing Kubernetes over alternatives are entirely driven by 'developer experience' and not at all technical. It shows how critical community development, good documentation, and marketing are to building a successful open source project.
I believe the developer experience on being introduced to a tool is paramount to its success. It gives you a lot of confidence in what you're doing and keeps things moving forward. To me it suggests that the application is built on a solid, simple concept instead of a convoluted, complex architecture. Some tools err in the opposite direction, though: very simple to set up, but very complicated to scale.
Really exciting stuff, happy to see the Github team launch this.
Kubernetes is becoming the go-to for folks needing both their own physical metal presence and a cloud footprint too. And the magic of Kubernetes is that it has APIs that can actually give teams the confidence to run and reuse deployment strategies in all environments, even across clouds.
If you are like GitHub and want to use Kubernetes across clouds (AWS, Azure, etc.) and bare metal, and deploy/customize that infra using Terraform, check out CoreOS Tectonic[1]. It also tackles more of the subtle things that aren't covered in this article, like cluster management, LDAP/SAML authentication, user authorization, etc.

[1] https://coreos.com/tectonic
I'm still utterly perplexed as to what Tectonic actually -is-. I kinda get that it's a kubernetes setup, but is it a GUI over the top of it? The website is pretty confusing and I think I gave up really quickly when trying to set it up.
We're currently looking at moving our applications to k8s, and I was wondering what deployment tools people are using? This week we are evaluating Spinnaker, Helm, and bash wrappers for kubectl. There is concern over adding too many layers of abstraction, and that KISS is the best approach.
At SAP, we're using Helm [1] to deploy OpenStack (plus a bunch of extra services like a custom Rails-based dashboard [2]) on bare-metal Kubernetes. For image building, testing and deployment, we use Concourse CI [3], and OpenStack assets (like service users, service projects and roles) are deployed with a custom Kubernetes operator [4].

Our charts are at [5] if you want to take a look.

[1] https://github.com/kubernetes/helm
[2] https://github.com/sapcc/elektra
[3] https://concourse.ci
[4] https://github.com/sapcc/kubernetes-operators in the "openstack-operator" directory
[5] https://github.com/sapcc/openstack-helm and https://github.com/sapcc/helm-charts (two different repos since we're in the middle of modularizing the original monolithic OpenStack chart in the first repo into per-service charts in the second one)
We also did some evaluation and then decided to stick to KISS and chose kubectl commands combined with cat and kexpand. A really simple approach that allows dynamic Kubernetes deployments.

The service.yml contains the full deployment configuration, service definition, and ingress rules. So this works without preconfiguring anything in Kubernetes when deploying a new service.

An engineer only has to create the service.yml, and Jenkins deploys it automatically on every master build.
*kexpand is a small tool which does something similar to sed, but in a simpler and less powerful way (keep it simple): https://github.com/kopeio/kexpand
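The kexpand-style expansion amounts to simple `{{key}}` substitution before piping the result to `kubectl apply -f -`. As a rough sketch of what such a step does (the template keys and values here are made up, and the real kexpand may differ in syntax details):

```python
import re

def expand(template: str, values: dict) -> str:
    """Replace {{key}} placeholders with values: sed-like, but simpler."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(values[m.group(1)]), template)

# Hypothetical fragment of a service.yml template:
service_yml = """\
image: registry.example.com/myapp:{{tag}}
replicas: {{replicas}}
"""

rendered = expand(service_yml, {"tag": "v1.2.3", "replicas": 3})
print(rendered)
# The rendered manifest would then be piped to `kubectl apply -f -`.
```

Keeping the templating this dumb is the point: no logic, no loops, just values that Jenkins can inject per build.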
We're using helm, but that was chosen mostly based on gut feel. It's an official project, and has momentum. We didn't want to spend too much time choosing a tool until we knew our requirements, and we don't really have a firm grasp of requirements until we've used something for a while. It seems to be working well for us so far, but it's still early.
There are lots of answers here that aren't helm, so I'm curious if there are any particular reasons that people ruled out helm?
A lot of folks are using Helm, but I find it very opaque to debug when templates go wrong (and I feel quite strongly that we shouldn't be writing untyped templates for our models). Also I found writing reusable spec components to be very difficult, e.g. a reverse proxy that I add to a number of pods.
I use pykube (also worth looking at the incubator project client-python) to write deploy scripts in Python; client-python is particularly nice as it uses OpenAPI specs to give you type hinting on the API objects/fields that you're writing. Much more civilized than typing bare yaml into a text editor.
If Python isn't your thing you can generate your own client from the OpenAPI specs, though I've found the client generation process to be a bit buggy.
I've got a simple setup that makes use of YAML files, Rake tasks, and raw kubectl. I've yet to take a look at helm or spinnaker but it's on my list. You really can go a long way just with K8s' own tooling.
I feel you. About a month ago I was fighting with the same feeling. In the end, I decided to use Kubernetes only for a single piece of infrastructure, so it's all pretty manageable through scripts. Managing secrets in particular is a pain in the ass.
One route I started checking but didn't commit to was using Ansible. They have a relatively good Kubernetes playbook and a facility to store secrets. That said, every damn task needs to be pointed to the K8S API endpoint, which is not the greatest.
We wrote an internal tool that wraps Helm and GPG. But we're really using Helm as a glorified templating system; since we deploy from git, Helm's release management is useless to us, and is even somewhat in the way. We might decide to drop Helm at some point, I think.
We are using ecfg (from Shopify) and Jenkins+kubectl. We use ansible for a couple of things but it's largely only because of some parameters of our architecture for a legacy monolith.
I say this in spite of the fact that it was announced last week[1] that the next release of Deis Workflow will be the last (under the current stewardship, and probably under that name).

It's just such a solid system. I would even more strongly recommend the (already EOL'ed early last year)[2] Deis v1 PaaS, except that you've already indicated you're moving to K8s, and Deis v2 is designed for K8s. I still recommend the v1 PaaS for people learning about the principles of HA clusters. (Another disclosure: I have published[3] a piece on how to build a cheap HA cluster using the Deis v1 PaaS.)
I have a strong suspicion that Deis will live on after March under stewardship of new leadership from the community.
In the meantime, you have roughly six months of support from Microsoft. It may be overstating it to say that they have committed to keeping the lights on for that long, but they have committed to merging critical fixes for that long (and we hope that in six months, Kubernetes will have solidified enough that we don't have to worry too much about breaking changes from upstream release mongers anymore).
Personally I don't buy commercial support, and it would not be the deal maker or breaker for me.

[1]: https://deis.com/blog/2017/deis-workflow-final-release/#futu...
[2]: https://deis.com/blog/2016/deis-1-13-lts/
[3]: https://deis.com/blog/2016/cheapest-fault-tolerant-cluster-d...
Any load balancer can be configured or modified to target routable Pod IP addresses and skip node ports altogether. You'll have to integrate with the Kubernetes Endpoints API[1] and support dynamic backends. Another option would be to leverage Kubernetes' DNS and the SRV records[2] backing each service.

[1] Envoy with Kubernetes Endpoints integration: https://github.com/kelseyhightower/kubernetes-envoy-sds
[2] https://kubernetes.io/docs/concepts/services-networking/dns-...
The reason node ports are used in the Cloud today is because most Cloud load balancing solutions only target VMs, not arbitrary endpoints such as containers, a limitation that will go away over time.
> Everyone does this, because Kubernetes' Achilles heel is its ingress. It is still built philosophically as a post-load-balancing system. This is the single biggest reason why using Docker Swarm is so pleasant.

This is an apt point. Kubernetes models Borg, and Borg has no concept of ingress. That's an entirely different problem space.
Obviously that doesn't fly if there isn't an equivalent open solution, so we did what we could with the system to make it not terrible. We can do more.
The point about Swarm is interesting, and has been much on my mind. Some of Kubernetes' perceived complexity is because we go to great lengths to avoid ever having two users collide, with escape hatches for the people who really need "unfriendly" features. This is because, again, Kubernetes models Borg. Borg clusters are giant, shared, multi-user, multi-app animals, where the users are in different business units and chances of collisions are high.
Swarm, on the other hand, thinks of a cluster more as an application construct. Sharing is not a big problem, and coordination is easy and local. This allows them to make different tradeoffs. I doubt very much that you can run a large number of similar apps in a single swarm without having collisions on things like ports.
I still believe the large-shared-cluster model is right in the limit. There are so many efficiencies to be had. But there are legit reasons it is hard to achieve right now.
I'm very interested in ways to make Kubernetes easier to use, ESPECIALLY in this regard. Real user feedback is critical.
Can you elaborate? I found Kubernetes ingresses to be one of the most pleasant parts - we use the `nginx-ingress` from helm and it works very well. Not exactly an industrial github-strength load balancer, but it will get you a long way surely?

https://github.com/unibet/ext_nginx

Basically just handling nginx.conf from information in k8s. We run in production with ECMP in our routers to load balance stateless traffic over any number of nodes. Easy to understand and very scalable.
How hard (and how realistic) is it to actually get a reasonable understanding of Kubernetes internals (and then stay up to date)? Is there any go-to reading material?

I found this one so far: https://classroom.udacity.com/courses/ud615. But any extra courses/trainings are always appreciated.

Pluralsight: https://www.pluralsight.com/courses/getting-started-kubernet...
We had been running another large-footprint container management system (not K8s, but also popular), and when its DNS component started to eat all the CPU on all nodes, the best I was able to do quickly was scrap the whole thing and replace it with some quick-and-dirty Compose files and manual networking. At least we were back to normal in an hour or so. The obvious steps (recreating nodes) failed, logs looked perfectly normal, quick strace/ltrace gave no insights, and debugging the problem in detail would've taken more time.

But that was only possible because all we ran was a small 2.5-node system, not even proper full HA or anything. And it resembled Compose closely enough.

Since then I'm really wary about using larger black boxes for critical parts. Just the Linux kernel and Docker can bring enough headache, and K8s on top of this looks terrifying. Simplicity has value. GitHub can afford to deal with a lot of complexity, but a tiny startup probably can't.

Or am I just unnecessarily scaring myself?
I wouldn't say that you're unnecessarily scaring yourself at all. Kubernetes is extremely complex. I've been running it for a few months and I'm just starting to get my hands around it. Things will just stop working for what seems like no reason, and there are so many places to investigate you can easily burn most of a day troubleshooting.
It's a great system, but it's also relatively new, and most issues aren't well documented. You'll spend a lot of time in github issues or asking for help in the (very active, and often very helpful) community.
If you have a valid use case, I wouldn't steer you away from it, but your fears are well founded.
> Enhancements to our internal deployment application to support deploying Kubernetes resources from a repository into a Kubernetes namespace, as well as the creation of Kubernetes secrets from our internal secret store.
Would love to hear more about how this was accomplished. I'm currently exploring a similar issue (pulling per-namespace Vault secrets into a cluster). From what I've found, it looks like more robust secrets management is scheduled for the next few k8s releases, but in the meantime I have been thinking about a custom solution that would poll Vault and update secrets in k8s when necessary.
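Such a poller can be sketched as a loop that fingerprints the secrets fetched from Vault against what the cluster currently holds and only writes on change. The fetch/read/write functions below are hypothetical stand-ins for the actual Vault and Kubernetes API calls:

```python
import hashlib
import json

def digest(secret: dict) -> str:
    """Stable fingerprint of a secret's contents, used to detect drift."""
    return hashlib.sha256(json.dumps(secret, sort_keys=True).encode()).hexdigest()

def sync_secrets(fetch_from_vault, read_from_k8s, write_to_k8s, names):
    """One polling pass: update only the secrets whose contents changed."""
    updated = []
    for name in names:
        desired = fetch_from_vault(name)
        current = read_from_k8s(name)
        if current is None or digest(current) != digest(desired):
            write_to_k8s(name, desired)
            updated.append(name)
    return updated

# Hypothetical usage with in-memory stand-ins for Vault and the k8s API:
vault = {"db-creds": {"user": "app", "pass": "s3cret"}, "api-key": {"key": "abc"}}
k8s = {"db-creds": {"user": "app", "pass": "old"}}

updated = sync_secrets(vault.__getitem__, k8s.get,
                       lambda n, v: k8s.__setitem__(n, v),
                       ["db-creds", "api-key"])
print(updated)  # -> ['db-creds', 'api-key']
```

Running a pass like this on a timer (or on Vault lease renewal) approximates the "poll and update when necessary" design, without writing on every tick.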
One thing I would have liked to have seen addressed in the article is whether the new architecture requires additional hardware (presumably) to operate and if so how much more.
We're on AWS, so this is tangentially related. At my company, we moved to k8s because we have quite a few low-usage services. Before k8s, each one of those services was getting its own EC2 instance. After k8s, we just have one set of machines which all the services use. If one service is getting more traffic, the resources for that service scale up, but we maintain a low baseline resource usage. In short, it's resulted in a measurable drop in our EC2 usage.
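The savings come from bin-packing many low-usage services onto shared nodes instead of giving each one its own instance, which is essentially what the scheduler does with resource requests. A toy first-fit illustration (the service CPU numbers are invented):

```python
def first_fit(requests, node_capacity):
    """Pack CPU requests onto the fewest nodes using first-fit."""
    nodes = []  # each entry is the remaining capacity on that node
    for req in requests:
        for i, free in enumerate(nodes):
            if free >= req:
                nodes[i] = free - req
                break
        else:
            nodes.append(node_capacity - req)  # open a new node
    return len(nodes)

# Ten low-usage services at ~0.3 CPU each, packed onto 4-CPU nodes:
services = [0.3] * 10
print(first_fit(services, 4.0))  # -> 1 shared node instead of 10 instances
```

The real scheduler considers memory, affinity, and spreading too, but the economics are the same: ten mostly-idle EC2 instances collapse into a small shared pool.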
I've only dabbled in K8s and it strikes me that using it in production is a long term investment and, as it stands currently, a long term project to implement properly. You'll want to do exactly what Github did: setup a "review lab" or similarly comprehensive dev and test environment until you are absolutely comfortable with it in production. This will lead to the provisioning (and cost) of quite a bit of hardware - and when it is finally in production it'll likely be over-provisioned for quite some time until norms can be established and excess cut.
So basically it's a traditional devops migration. But you get quite a few goodies, and arguably much better practices, at the end of it.
I'd be interested in hearing what kind of autoscaling system they use for their Ruby pods.
We're running a few (legacy — we're moving to Go) Ruby apps in production on Kubernetes. We're using Puma, which is very similar to Unicorn, and it's unclear what the optimal strategy here is. I've not benchmarked this in any systematic way.
For example, in theory you could make a single deployment run a single Unicorn worker, then set resources:requests:cpu and resources:limits:cpu both to 1.0, and then add a horizontal pod autoscaler that's set to scale the deployment up on, say, 80% CPU.
But that gives you terrible request rates, and it will be choking long before it reaches 80% CPU. So it's better to give it, say, 4 workers. At the same time, it's counter-productive to allocate it 4 CPUs, because Ruby will generally not be able to utilize them fully. And more workers obviously mean a lot more memory usage.
I did some quick benchmarking, and found I could give them 4 workers but still constrain to 1 CPU, and that would still give me a decent qps.
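For reference, the horizontal pod autoscaler's core scaling rule is roughly a ratio calculation on the observed metric. A sketch of that arithmetic with invented numbers (the real controller adds tolerances, stabilization windows, and readiness handling on top):

```python
import math

def desired_replicas(current_replicas, current_cpu, target_cpu):
    """Roughly how the HPA scales: ceil(current * observed / target)."""
    return math.ceil(current_replicas * current_cpu / target_cpu)

# 4 pods averaging 95% CPU against an 80% target -> scale up to 5:
print(desired_replicas(4, 0.95, 0.80))  # -> 5
```

This is why the worker-count tuning above matters: if a pod saturates its workers well below its CPU limit, the observed CPU ratio never rises enough to trigger the scale-up you actually need.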
Hey! We at Bitmovin have been using k8s for quite a while for our infrastructure and on-premise deployments. In case you're interested in how we do multi-stage canary deployments, check out: http://blog.kubernetes.io/2017/04/multi-stage-canary-deploym...
They could easily still use standalone puppet to handle the config management for individual container images. I currently do this with salt-minion. It reduces the burden on the Dockerfile itself, and lets you embrace a declarative configuration state at build time.