raesene6 | 8 years ago
This guide makes an interesting choice with regard to etcd security, which I'm not sure I'd go with.
etcd stores a load of sensitive cluster information, so unauthorised access to it is a bad thing.
There's an assumption in the guide that you have a "secure network" and therefore don't have to worry about etcd authentication/encryption. The thing is, if you have a compromised container, and that container (which has an in-cluster IP address) can see your etcd server, then it can easily dump the etcd database and get access to the information held in it.
Personally I'd recommend setting up a small CA for etcd and using its authentication features; there's a good guide to this on the CoreOS site: https://coreos.com/etcd/docs/latest/op-guide/security.html
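For illustration, a minimal sketch of what that looks like with cfssl, roughly following the CoreOS guide above (the file names and addresses here are made up for the example):

    # generate a CA, then a server cert signed by it (requires cfssl/cfssljson)
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
      -profile=server server-csr.json | cfssljson -bare server

    # run etcd with TLS, requiring client certificates signed by the CA
    etcd \
      --cert-file=server.pem --key-file=server-key.pem \
      --client-cert-auth --trusted-ca-file=ca.pem \
      --listen-client-urls=https://10.0.0.1:2379 \
      --advertise-client-urls=https://10.0.0.1:2379

With --client-cert-auth enabled, a compromised container that can reach etcd can no longer just dump the database; it would also need a client certificate signed by your CA.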
pstadler | 8 years ago
Related issue on GitHub: https://github.com/hobby-kube/guide/issues/6
_rp6i | 8 years ago
> Choosing a cloud provider
This really annoys me about Kubernetes. Essentially all the official documentation is about how to select a cloud and let a cloud-specific tool magically do everything for you. There's no procedure for setting up a single host for development purposes or for a Dokku-like personal PaaS.
This guide is super useful because it avoids all the magic and lets you set things up properly (despite assuming you're doing it on a cloud), potentially even on a single host.
pstadler | 8 years ago
I've grown quite a thick skin since I exposed the first project of mine to a wider audience. But still, it's feedback like yours that keeps me going.
theptip | 8 years ago
> There's no procedure for setting up a single host for development purposes or for a Dokku-like personal PaaS.
Can you clarify what you mean by this? Since the beginning (pre-1.0) it's been possible to stand up a cluster on bare metal/VMs using a single command. This used to be done with the `kube-up.sh` tool, which has been replaced by kubeadm in 1.6 [1].
If anything, I thought the fair complaint would be that there are too many ways to set up a cluster, and it's confusing to figure out which is the right one [2].
[1]: https://kubernetes.io/docs/getting-started-guides/kubeadm/ [2]: https://kubernetes.io/docs/setup/pick-right-solution/#table-...
drdaeman | 8 years ago
I felt exactly like this, but then found one.
https://kubernetes.io/docs/getting-started-guides/kubeadm/
It requires manual kubelet setup (usually there are OS packages for that, which is the proper way to install software on the host system); it then generates a CA and node certificates, spins up etcd and flanneld containers, and sets up k8s on top of those. It also lets you join nodes to the cluster in a semi-automatic manner, making the whole setup no harder than Docker Swarm.
I think it's a reasonably nice middle point between unexplainable cloud magic and do-it-all-yourself-the-hard-way setups.
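The whole flow is roughly this (a sketch based on the 1.6-era kubeadm docs; the token, addresses and the network manifest are placeholders):

    # on the master
    kubeadm init
    # ... prints a join token when it finishes

    # install a pod network of your choice, e.g. flannel or weave
    kubectl apply -f <network-manifest.yaml>

    # on each worker node, using the token printed by init
    kubeadm join --token <token> <master-ip>:6443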
snuxoll | 8 years ago
Red Hat Container Development Kit is a great way to get started locally. If you want to deploy OpenShift on a VM, there's an Ansible-based installer that will interview you and install it on your VMs (since OpenShift is just a layer on top of k8s, you can use all the underlying primitives if you don't want the additional features).
user5994461 | 8 years ago
It's not worth bothering to make a single-machine setup.
unknown | 8 years ago
[deleted]
erikb | 8 years ago
Yeah, the community follows the wrong approach. What two different services need in terms of features isn't that different; there should be an end-to-end solution. But each tool, including Kubernetes, only delivers something that isn't even 100% of one feature and hopes that someone else comes up with a solution for the rest.
In my day job I usually have to work with these tools, and currently have a stack of 5 that mostly have incomplete documentation, zero explanation of how they actually solve the problem, and nearly zero ability to debug. (For example, what value do Kubernetes logs and events have? Usually the moment you have a problem is when you have no Kubernetes logs yet/anymore, and the events only tell you what you already know.) Now I'll probably need to learn another one, considering these three options for storage.
At the weekend, when I'm mostly trying to relax and physically have only 2/7 of the time to allot, I learn the basics behind containerization, e.g. namespaces, cgroups, virtual network adapters, iptables. And I feel that in this small time slot I make a lot more progress towards the end-to-end solution that people actually need.
The example I'm using is WordPress+MySQL. It's a simple thing that covers >70% of what anybody wants to deploy, and on-premise with Kubernetes+Docker it's still not possible without hacks (e.g. for volume claims and logging), after, what, 4 years of Kubernetes? I bet that by spending 2 years' worth of weekends any normal person could come up with something better.
---
Re missing features, examples for Kubernetes:
A) Why does Kubernetes not solve the networking part? If I have a cluster with containers that may run in different places, then of course the tooling I use to maintain that cluster needs to ensure that those containers can talk to each other. There can be an abstract API and the option for other people to write plugins, but the core needs to come with one solution that mostly works and is debuggable when it doesn't.
B) CrashLoopBackOff. Why did no Kubernetes developer get the idea that this state may require some kind of logging/debugging support?
C) Why does Kubernetes assign random ports to services and not provide a simple way to retrieve them? Of course I can get them once I've learned the JSON API (see the sketch after this list), but usually that is not considered a solution, just hacking. I really don't care what port the service is running on, I just want to use it.
D) Why do I need to manually say how many containers should run for each service? There are a few very distinct options a user may need. Most services should run 1 instance and have it replaced if it dies. If a service needs reliability, I want to define how reliable it should be, and the cluster should accordingly decide whether it needs 2, 3 or 5 instances. And lastly I want to run stuff on all my nodes, one container per node. That is not even possible afaik.
E) Most Kubernetes tools do not work well with environment proxies. For instance, the kube-proxy shell tool will completely bug out. But surprise: clusters make a lot of sense in enterprise environments, and enterprise environments have proxies. It's also not a hard setup for testing; a Raspberry Pi can be your home network's proxy.
F) On-premise storage solutions, considering that a restarted container may not run on the same host.
G) Since not much is really running right now, we haven't run into this problem yet, but it's totally possible for the whole cluster to run out of resources. I haven't seen any piece of info about the general cluster status and when I need to grow or replace infrastructure.
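To illustrate C: this works, but it's the sort of thing I'd file under hacking rather than a solution (the service name is invented):

    # a Service of type NodePort gets a random port in the 30000-32767 range
    kubectl get svc my-service -o jsonpath='{.spec.ports[0].nodePort}'

    # or fish it out of the raw JSON yourself
    kubectl get svc my-service -o json | grep nodePort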
Honestly, if I don't have all these things solved, am I really better off using Kubernetes rather than writing my own scripts? I think the effort is currently about equal. And if that's the case, my own solution has the huge advantage of being under my control and letting me learn a lot of new things.
tyingq | 8 years ago
It seems more like setting up the HA clusters we had in the 90s: things like Veritas, LifeKeeper, ServiceGuard, etc. The parts aren't even that much different, really. They managed a network, services exposed on the network, the health of said services, etc.
And the installation was just as confusing, it seems.
mtve | 8 years ago
Why are you doing all of this stuff manually? There are several providers that will set all of this stuff up automatically for you. I like the Kismatic toolkit (https://github.com/apprenda/kismatic), but there are a bunch of others. Sure, maybe once you go to production you'll want to install manually so that you have everything finely tuned the way you want, but learn it by using it rather than trying to figure everything out up front.
cryptarch | 8 years ago
Because Kubernetes is a complex beast with many moving parts, and learning about all those moving parts becomes more and more important as your usage grows.
bryanlarsen | 8 years ago
Or even better, just use GKE for development/learning purposes. Just stop the cluster when you're not using it, and it'll be a lot cheaper than something you won't want to take down because you spent days installing it.
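Something along these lines, sketched from memory with the gcloud CLI (the cluster name and machine type are arbitrary, and the flags may have changed since):

    # create a small dev cluster and point kubectl at it
    gcloud container clusters create dev --num-nodes=1 --machine-type=g1-small
    gcloud container clusters get-credentials dev

    # scale the node pool to zero when you're done for the day
    gcloud container clusters resize dev --size=0

    # or tear it down entirely
    gcloud container clusters delete dev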
irontoby | 8 years ago
Personally I've used Stackpoint.io to provision some small clusters, but I was very excited to see this project because deploying my own cluster from scratch is next on my to-do list. Kelsey Hightower's "Kubernetes the Hard Way"[1] is the canonical go-to reference here, but it's also very daunting, so this looks like a great middle ground.
Let's face it, even today the k8s docs can be quite sparse or gloss over the details, so knowing how all of the pieces work from the ground up can be a big help. Plus, you prevent vendor lock-in for when whatever automated tool you're using doesn't solve your use case or decides to start charging a lot of money.
[1] https://github.com/kelseyhightower/kubernetes-the-hard-way
pstadler | 8 years ago
There are indeed many options for automated setups. I took my first steps with Kubernetes on GKE and later followed the CoreOS guide to set it up on bare metal. This guide is for creators, a written form of my lessons learned. It should enable people to run secure clusters wherever they want.
In case you missed it, there's a repository[1] in the same org which offers fully automated provisioning using Terraform.
[1] https://github.com/hobby-kube/provisioning
tyingq | 8 years ago
Stackpoint.io is great for just spinning up k8s to get a feel for how it works. No charge, and all web-based. It supports other providers, but I tried it with Digital Ocean. You pass it a DO API token, like this: https://cdn-images-1.medium.com/max/800/1*tcGINse5on6qnbsRYN...
Then it does the installation and gives you URLs for the control panel.
I don't think I'd use it for production, as everything is behind stackpoint.io URLs, but for experimenting, it saves a ton of time.
MattRogish | 8 years ago
We open-sourced our Kube bootstrap toolkit (which uses kops). No one should build this by hand. It would be like, I dunno, setting up a MariaDB Galera Cluster by hand when you intend to use RDS (not a perfect example, please no pedantry!). It's a fun learning experience, I guess, if you're into that sort of thing, but it is not what you want to use in any production context.
https://www.reactiveops.com/blog/kops-102-an-inside-look-at-...
https://www.reactiveops.com/blog/using-k8s-scripts/
Use the automation to boot up clusters, please!
fndrplayer13 | 8 years ago
Great set of resources -- I just went through the process of defining a Terraform cluster in AWS over the past few weeks, though I'm leveraging the k8s Saltbase installer for the master and nodes.
I'm curious, why no mention of AWS as a provider for roll-your-own? Is this a cost thing?
Also, I get the feeling that Ubuntu is _not_ a first-class citizen of the k8s ecosystem, but perhaps my newness to the ecosystem is to blame here. The Saltbase installer, for example, only supports Debian and RHEL distros, `kops` prefers Debian, and the documentation for cluster deployments on kubernetes.io and elsewhere also seems somewhat suggestive of Debian and CoreOS. Perhaps that's just a mistaken interpretation on my part. I'm curious what other people's thoughts on this topic are!
dustinkirkland | 8 years ago
Ubuntu is absolutely a first-class citizen in the k8s ecosystem!
The front page of https://kubernetes.io/docs/ has a bullet that links to a super simple way to deploy Kubernetes to Ubuntu on any of [localhost, baremetal cluster, public cloud, private cloud]!
See:
* Installing Kubernetes on Ubuntu: Deploy a Kubernetes cluster on-premise, baremetal, cloud providers, or localhost with Charms and conjure-up.
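If I remember the incantation right, it's roughly this (assuming a machine with snapd):

    # install conjure-up and run the Kubernetes spell
    sudo snap install conjure-up --classic
    conjure-up kubernetes

conjure-up then interactively walks you through picking a target: localhost via LXD, baremetal, or a public cloud.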
justinsb | 8 years ago
kops doesn't necessarily prefer Debian: we support Ubuntu, Debian, CentOS/RHEL, CoreOS and Google's Container OS. One of the outputs of the Kubernetes-on-AWS efforts is an AMI that is "Kubernetes optimized": a 4.4 kernel, Docker pre-installed, lots of inodes, etc. That AMI _is_ based on Debian, hence the suggestion that if you don't otherwise care (and my hope is that eventually you won't), you should probably just use that AMI. But if you do have a preference, by all means use your distro of choice.
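For example, picking the image is just a flag at cluster creation time (the bucket, domain and AMI below are placeholders):

    # kops stores cluster state in S3
    export KOPS_STATE_STORE=s3://my-kops-state-bucket

    # create a cluster; omit --image to get the default Debian-based AMI
    kops create cluster \
      --name=k8s.example.com \
      --zones=us-east-1a \
      --image=ami-xxxxxxxx \
      --yes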
gtirloni | 8 years ago
https://kubernetes.io/docs/getting-started-guides/kubeadm/
cweagans | 8 years ago
> There are plenty of ways to set up a Kubernetes cluster from scratch. At this point however, we settle on kubeadm. This dramatically simplifies the setup process by automating the creation of certificates, services and configuration files.
empath75 | 8 years ago
I found gluster-kubernetes quite simple to install. But the install instructions do assume that you're going to be giving it its own partition, which you would be doing on any sort of real production deployment anyway.
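For reference, the install is roughly this (from the gluster-kubernetes repo; the topology file describes your nodes and the device you dedicate to it):

    git clone https://github.com/gluster/gluster-kubernetes
    cd gluster-kubernetes/deploy

    # topology.json lists each node plus the bare block device (e.g. /dev/sdb)
    # that GlusterFS gets exclusive use of
    ./gk-deploy -g topology.json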
ralmeida | 8 years ago
There are a couple of reasons to do it manually and/or outside GKE, notably:
1) Cost. VPS providers like Digital Ocean/Scaleway usually include a large outbound network transfer quota in the price, which just isn't there with GCP, where you pay for network usage on a metered basis.
2) Learning. Although you can defer most of the heavy work to GKE, it's still good to understand the moving parts so you can make better choices as you grow.
unknown | 8 years ago
[deleted]