jsmthrowaway | 7 years ago

Borg will remain orders of magnitude beyond Kubernetes until Kubernetes is completely rearchitected. It’s not scalability bugs. It’s decisions regarding how the cluster maintains state that hamstring it, and that’s so fundamental to everything it’s not a find/squish loop.

As I said in my comment, those major customers (one personal experience, three anecdotally, eight or nine I’ve consulted with) have quietly ruled out Kubernetes, either by trying it or prying it apart and deciding not to try it. That feedback isn’t coming. At Borg scale, Kubernetes is very much considered a nonstarter.

davidopp__|7 years ago

> Borg will remain orders of magnitude beyond Kubernetes until Kubernetes is completely rearchitected. It’s not scalability bugs. It’s decisions regarding how the cluster maintains state that hamstring it, and that’s so fundamental to everything it’s not a find/squish loop.

Can you say more about this? Borgmaster uses Paxos for replicating checkpoint data, and etcd uses Raft for replicating the equivalent data, but these are really just two flavors of the same algorithm. I don't doubt that there are probably more efficient ways that Kubernetes could handle state (I don't claim to be an expert in that area), but I don't think they're approaches that would look any more like Borg than Kubernetes does.

If you're at liberty to do so, could you say what orchestrators the customers you mentioned chose in lieu of Kubernetes? What scale are they running at for a single cluster?

[Disclaimer: I work on Kubernetes/GKE at Google.]

dilyevsky|7 years ago

> It’s decisions regarding how the cluster maintains state that hamstring it

Jed, you keep repeating this like it's true, but it's not actually so. Here's an excerpt from the Borg paper (which David co-authored btw ;-)):

> A single elected master per cell serves both as the Paxos leader and the state mutator, handling all operations that change the cell’s state, such as submitting a job or terminating a task on a machine.

And while we're at it, I don't know what it has to do with FauxMaster, since it ran as a single replica, and the passage about C++ is just pure FUD.

elvinyung|7 years ago

Just curious, does Borgmaster use Chubby, or is it a completely separate Paxos store?