Let’s Encrypt, OAuth 2, and Kubernetes Ingress

[+] andrewstuart2|9 years ago|reply

Suggestion to anybody reading this: don't use a DaemonSet for this. This really ought to be a Deployment of nginx-ingress resources behind a service exposed as `type: LoadBalancer` (if you're in a cloud-provider that supports LoadBalancer services). Then just create DNS aliases and configure nginx to do session affinity if needed, etc. Not only will it be able to scale with your load instead of cluster size, but you can actually update it in a rolling update already; DaemonSets cannot yet do that.

Really the most important part, though, is that DaemonSets are for services that need to run on each host. Like a log collection service [1] or prometheus node exporter [2].

[1] https://github.com/kubernetes/kubernetes/tree/master/cluster...

[2] https://github.com/prometheus/node_exporter

[+] fortytw2|9 years ago|reply

So I (the author) am a bit torn on this - a `type: LoadBalancer` service will create a NodePort underneath (yet another internal loadbalancer) and map those ports to a $cloud-platform-tcp-loadbalancer. By using a DaemonSet with a host port bound, you avoid a layer of internal routing.

I'm not so sure if one approach is _particularly_ better than the other though.

[+] maktouch|9 years ago|reply

We started nginx-ingress as a deployment, and we converted it to a DaemonSet:

- We rarely deploy new versions of the ingress controller

- We can't (or don't know how to) choose in which nodes the pods will go. If I make a deployment with 10 replicas, there's a chance that it'll all go in the same node

- Because we can't choose to distribute the pods, when a node containing the ingress pod went down, there was a noticeable blip of downtime (~27 seconds approx.). That's kinda unacceptable.

- Nginx ingress is pretty light. I don't mind it having just 1 of them in each node.

- Since we put our databases and stateful stuff outside kubernetes, we also decided to separate web facing kubernetes cluster and worker ones. This solves the problem of the "spin up 50 new nodes to handle some batch machine learning job".

So far, so good. I would actually suggest that you use a DaemonSet for this, just like I suggest you convert Kube-DNS to a daemonset (it's not by default on GKE for some obscure reason).

[+] caleblloyd|9 years ago|reply

In order to get the proper Source IP with TCP Load Balancing before Kubernetes 1.5, you used to need to use a Daemon Set with Host Networking.

Kubernetes 1.5 introduced Source IP using Source NAT and Health Checks: https://kubernetes.io/docs/tutorials/services/source-ip/

You still must write scheduling rules so that pods are scheduled to have at most 1 instance running per node.

[+] rusht|9 years ago|reply

It's worth noting that there is a discussion on GitHub [0] about building letsencrypt auto cert creation directly into ingress controller.

[0] https://github.com/kubernetes/kubernetes/issues/19899

[+] zalmoxes|9 years ago|reply

That's cool, I've done pretty much the same thing for our internal services. I noticed you use the github org for oauth2proxy.

In our setup, I wanted to add authentication to a few dozen sub domains, but use a single oauth2proxy instance. Github Oauth makes this kind of gross, the callback must point to the same subdomain you're trying to authenticate. But it does allow something like /oauth2/callback/route.to.this.instead

In the end, to achieve what I wanted (a single oauth2proxy for multiple internal services) I had to - fork oauth2proxy and make a few small changes to the redirect-url implementation - create a small service with takes oauth.acme.co/oauth2/callback/subdomain.acme.co and redirects to subdomain.acme.co to comply with GitHub' oauth requirements - created a small reverse proxy in Go which does something similar to nginx_auth_request. I had a few specific reasons to do this (like proxying websockets and supporting JWT directly) https://gist.github.com/groob/ea563ea1f3092449cd75eeb78213cd...

I hope that someone ends up writing a k8s ingress controller specific to this use case.

[+] aledbf|9 years ago|reply

Please check https://github.com/kubernetes/ingress/pull/190

[+] theptip|9 years ago|reply

Note one significant gotcha with this approach: the Ingress does TLS termination, so the hop from the Ingress to your pod is unencrypted.

That might be OK if 1) your data isn's sensitive or 2) you're running on your own metal (and so you control the network), but in GKE your nodes are on Google's SDN, and so you're sending your traffic across their DCs in the clear.

There are a couple of pieces of hard-to-find config required to achieve TLS-to-the-pod with Ingress:

1) You need to enable ssl-passthrough on your nginx ingress; this is a simple annotation: https://github.com/kubernetes/contrib/issues/1854. This will use nginx's streaming mode to route requests with SNI without terminating the TLS connection.

2) Now you'll need a way of getting your certs into the pod; kube-lego attaches the certs to the Ingress pod, which is not what you want for TLS-to-the-pod. https://github.com/PalmStoneGames/kube-cert-manager/ lets you do this in an automated way, by creating k8s secrets containing the letsencrypt certs.

3) Your pods will need an SSL proxy to terminate the TLS connection. I use a modified version of https://github.com/GoogleCloudPlatform/nginx-ssl-proxy.

4) You'll want a way to dynamically create DNS entries; Mate is a good approach here. Note that once you enable automatic DNS names for your Services, then it becomes less important to share a single public IP using SNI. You can actually abandon the Ingress, and have Mate set up your generated DNS records to point to the Service's LoadBalancer IP.

(As an aside, if you stick with Nginx Ingress, you can connect it to the outside world using a Kubernetes loadbalancer, instead of having to use a Terraform LB; the (hard-to-find and fairly new) config flag for that is `publish-service` (https://github.com/kubernetes/ingress/blob/master/core/pkg/i...).

[+] lobster_johnson|9 years ago|reply

I wonder how much of a vulnerability that really is. The SDN encapsulates everything and is supposedly IP-spoofing-secure, so in principle there's no way for anyone else in the same DC to get your traffic.

Of course, you could have a local attacker get in through other means, and then access local DC traffic within your SDN. But if you get to that point, you probably have bigger problem than terminating SSL.

[+] SEJeff|9 years ago|reply

Yes and no.

You're sending traffic over google's SDN in the clear, which is still encrypted by google if you believe:

https://cloud.google.com/security/security-design/

[+] conradev|9 years ago|reply

Would an overlay network with a shared secret for encryption of pod-to-pod networking be another solution to this problem? I feel like the ideal should involve keeping the key material in as few places as possible.

[+] dkoston|9 years ago|reply

Make sure to use an HPA and set up resource constraints on that ingress controller pod. Unbounded resource utilization may bite you in the a$$.

https://kubernetes.io/docs/user-guide/horizontal-pod-autosca...

Also, you may have redacted it but you don't appear to be adding a service with a static IP:

spec: loadBalancerIP: 1.2.3.4

Not having a global static IP for publicly accessible resources seems risky for uptime.

We've gone away from using ingress controllers and using services with static IPs + HPAs on nginx pods for this reason. Having to add a service + ingress controller adds complexity and doesn't really add value (IMO) since you can easily add nginx.conf as a ConfgMap and get the same ease of configuration as an ingress controller. Your mileage may vary with let's encrypt integrations.

[+] prydonius|9 years ago|reply

This is a really great post, excited to give the OAuth 2 auth a try.

FWIW, an easier way to get started with the NGINX Ingress and kube-lego services is using the official Helm[1] Charts for them (https://github.com/kubernetes/charts/tree/master/stable/ngin... and https://github.com/kubernetes/charts/tree/master/stable/kube...).

[1] https://github.com/kubernetes/helm

[+] blwide|9 years ago|reply

That's impressive but also quite some effort. Feels like premature optimization when looking at the (rather low) traffic of fromAtoB. On the other side, it's always good to have a scalable deployment when dealing with RoR apps.

[+] captn3m0|9 years ago|reply

>On GCP, the HTTP load balancers do not support TLS-SNI, which means you need a new frontend IP address per SSL certificate you have. For internal services, this is a pain, as you cannot point a wildcard DNS entry to a single IP, like * .fromatob.com and then have everything just work.

Wouldn't a wildcard SSL cert + wildcard DNS entry work even without SNI support here? I haven't used the GCP load balancer, but as long as you are serving a single certificate (* .fromatob.com), the client/server don't have to rely on SNI at all.

[+] agentgt|9 years ago|reply

Question for the author: We just migrated some stuff to GCP as well but do not use kubernetes. For managing infrasructure we only use packer, bash, and google cloud deployment yaml files (similar to the kubernetes manifest).

Why do you still need saltstack and how do you find terraform? Why do you need terraform (I suppose it is for your non kubernetes infrastructure?)?

[+] fortytw2|9 years ago|reply

For the moment at least, it's much more comfortable for us to keep our databases outside of kubernetes, so we use saltstack(masterless), packer, and terraform to manage them. We also use terraform to manage all of our DNS, which is split between Route53 and the GCP DNS service.

[+] linkmotif|9 years ago|reply

Discovered kube-lego via Google a few weeks ago and I am really excited to try it with my next product. Thanks for this post.

30 comments