item 11383433

Building highly available applications using Kubernetes' new multi-zone clusters

47 points | boulos | 10 years ago | blog.kubernetes.io

15 comments

TheIronYuppie | 10 years ago
Disclosure: I work at Google on Kubernetes.

We're really excited to announce this support; it makes multi-zone Kubernetes really straightforward. Please let us know what we can do better!

zoomzoom | 10 years ago
Will GKE offer any of this multi-zone capability for nodes, or is the plan to wait for 1.3 to offer any more availability options there?

I imagine there is some way to do this yourself today by adding nodes with the right labels and startup script, or even by referencing the same images as the Google-managed instance group for the cluster. But it feels like that would take away the ability to upgrade the nodes using the Google tooling?

Thanks! Really loving the kubernetes ecosystem and appreciate the activity here on HN from the google team members...

boulos | 10 years ago
And if you'd prefer not to comment publicly, you can send mail to David here (username is aronchick, for his @google.com account) with your comments, questions or complaints.
boulos | 10 years ago
While the support in this release is fairly simple, it already unblocks lots of use cases. Distributed "stateless" services should now be super simple to do in a multi-zone HA manner.
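To make "super simple" concrete: a hedged sketch (not from the thread; the name and image are made up) of what a stateless multi-zone service might look like. With nodes registered in several zones, the default scheduler spreading prefers to place the replicas in different zones, with no extra configuration needed:

```yaml
# Hypothetical example -- the name and image are placeholders.
# Given nodes in multiple zones, the default spreading behavior
# will prefer to schedule these replicas into different zones.
apiVersion: extensions/v1beta1   # Deployment API group in the 1.2 era
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.9
        ports:
        - containerPort: 80
```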

If you find that you're still not able to run your system multi-zone (e.g., because a pod needs to be attached to a particular PersistentVolume), the next releases (coming in just a "few" weeks' time!) will keep building on this. I'd encourage you to either send mail to the Product Manager (TheIronYuppie) or shout at Quinton during the SIG-Federation meetings.

Disclosure: I work at Google on Compute Engine (but Disclaimer: I don't actually work on Kubernetes).

lcalcote | 10 years ago
Are next steps for multi-zone support inclusive of these capabilities?

1) HA for master components
2) Multi-zone support for PersistentVolumes
3) Federated master components governing separate deployments

quintonh | 10 years ago
1) HA for master components is actually supported already, albeit with some additional work required at cluster setup. See http://kubernetes.io/docs/admin/high-availability/ . There's some ongoing work to automate that, but it's not at the top of our priority list, based on user feedback we've received.

2) It's not clear exactly what multi-zone support you're looking for w.r.t. persistent volumes. Right now pods with attached persistent volumes are scheduled into the zone where the volume exists. There is an undesirable limitation that it's not possible to create these volumes in zones other than where the master exists, and that's something near the top of our priority list to fix soon. Is there anything else you're looking for here?

3) It's not clear to me what you're asking here. Are you referring to federating multiple Kubernetes clusters together? In 1.3 we plan to release some aspects of that (multi-cluster GUI, multi-cluster command-line tools, multi-cluster services, etc.).

justinsb | 10 years ago
1) Yes, hopefully in 1.3. I believe all the components are there today, but I'm hoping we'll put together an official HA configuration.

2) It's in 1.2 :-) If you use PersistentVolumes, they will also be tagged with the zone of the volume on AWS & GCE, and there's a scheduler predicate which will enforce the AWS/GCE restrictions that volumes must be mounted in the same AZ: http://kubernetes.io/docs/admin/multiple-zones/
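To illustrate the tagging described above (the volume name and zone are hypothetical): the cloudprovider applies failure-domain labels to both nodes and volumes, and the scheduler predicate then keeps pods in the same zone as their volume. A GCE-backed PersistentVolume might carry labels like:

```yaml
# Hypothetical PersistentVolume -- name, sizes, and disk are made up.
# The failure-domain labels are the ones applied automatically by the
# GCE/AWS cloudprovider integration described in the linked doc.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
  labels:
    failure-domain.beta.kubernetes.io/zone: us-central1-a
    failure-domain.beta.kubernetes.io/region: us-central1
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  gcePersistentDisk:
    pdName: my-disk      # the disk itself must exist in us-central1-a
    fsType: ext4
```

Any pod claiming this volume will only be scheduled onto nodes whose zone label matches the volume's.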

3) I think you mean the full federation between multiple clusters ("Ubernetes")? Yes, work on that is ongoing - this multizone support is a simpler subset of functionality that we carved off.

lcalcote | 10 years ago
Noting that the scheduling spread strategy is centrally configured using SelectorSpreadPriority, can you override the scheduling behavior on a per-service or per-pod basis?
justinsb | 10 years ago
It is a weighting function (not the sole strategy), so you can add additional weightings or constraints. For example, you can target nodes with specific labels, which some people are using to target GPU nodes. What did you have in mind?
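For instance, a hedged sketch of the per-pod constraint mentioned above (the label key and image are made up): once you've labeled the relevant nodes yourself, pinning a pod to them is just a nodeSelector:

```yaml
# Hypothetical pod -- "hardware: gpu" is a label you would have applied
# to the relevant nodes yourself (e.g. via `kubectl label nodes`).
apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  nodeSelector:
    hardware: gpu          # only schedule onto nodes carrying this label
  containers:
  - name: trainer
    image: my-training-image:latest
```

The spread priority still applies among the nodes that satisfy the selector, so you get spreading within the constrained set.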
hanikesn | 10 years ago
What does the configuration look like for a bare-metal setup? E.g., spreading pods over multiple racks?
justinsb | 10 years ago
The functionality is driven by the labels on the nodes & volumes. So if your bare metal setup adds the same labels to the nodes & volumes, it will work identically.

We can apply these labels automatically on GCE & AWS (or any cloudprovider that implements the appropriate interface in the code), but on bare-metal this will rely on your provisioning infrastructure (which presumably understands your racks & datacenters). You can now pass custom labels to kubelet also, so this should be as easy as dealing with physical hardware ever is!
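Concretely (a hedged sketch; the node name and rack/datacenter values are made up), a bare-metal provisioner just needs to register nodes carrying the same failure-domain label keys, treating, say, a rack as a "zone" and a datacenter as a "region":

```yaml
# Hypothetical node object as a bare-metal provisioner might register it.
# The label keys match those the GCE/AWS cloudproviders set automatically;
# the values are whatever your infrastructure uses for racks/datacenters.
apiVersion: v1
kind: Node
metadata:
  name: metal-07
  labels:
    failure-domain.beta.kubernetes.io/zone: rack-b2   # rack as "zone"
    failure-domain.beta.kubernetes.io/region: dc-east # datacenter as "region"
```

With those labels in place, the same spreading and volume-affinity behavior applies as on GCE/AWS.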