Building a global deployment platform is hard – a tour of the trade-offs we took

45 points | nicoche | 2 years ago | koyeb.com

13 comments

0xbadcafebee|2 years ago

There's actually no such thing as a multi-region anything. There also isn't such a thing as a single-region anything.

Cloud infrastructure is built and managed as a sort of bizarre multi-layer cake, that is actually several cakes. It's like Escher's Cake. Multiple cakes cut into slices and pushed together, where one cake bleeds into another cake seamlessly. Every layer is actually more than one layer, and not.

The way this works is pretty simple: cloud architecture is built on many different services. Those services may operate either on a "global" level, or "regional", or "zonal". By combining different services together into one solution, your solution will be affected by global changes, and by regional and zonal changes. So there's no way to completely isolate it from other areas, even at the zonal level. At some point, a change to a global, regional, or zonal service will affect whatever zone/region/multi-region thing you have, no matter how you architect it. In the cloud, anyway.
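To make the scoping argument concrete, here is a minimal sketch in Python. The service names, scopes, and region/zone identifiers are hypothetical and do not describe any real provider; the point is only that a workload pinned to one zone is still exposed to changes in every global or regional service it depends on.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Service:
    name: str
    scope: str  # one of "global", "regional", "zonal" (illustrative labels)


@dataclass(frozen=True)
class Workload:
    region: str
    zone: str
    depends_on: Tuple[Service, ...]


def is_affected(workload: Workload, service_name: str,
                location: Optional[str] = None) -> bool:
    """Return True if a change to `service_name` at `location` reaches the workload.

    `location` is a region name for regional services, a zone name for zonal
    services, and ignored for global services.
    """
    for svc in workload.depends_on:
        if svc.name != service_name:
            continue
        if svc.scope == "global":
            return True  # a global change reaches every zone
        if svc.scope == "regional":
            return location == workload.region
        if svc.scope == "zonal":
            return location == workload.zone
    return False  # the workload does not depend on this service


# Hypothetical workload living in a single zone.
app = Workload(
    region="region-1",
    zone="region-1-a",
    depends_on=(
        Service("iam", "global"),
        Service("load-balancer", "regional"),
        Service("disk", "zonal"),
    ),
)

print(is_affected(app, "iam"))                        # global change
print(is_affected(app, "load-balancer", "region-1"))  # regional change, same region
print(is_affected(app, "disk", "region-1-b"))         # zonal change, other zone
```

Only the last case leaves the workload untouched; the "zonal" deployment is still coupled to the other two layers.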

(There would be a way if every single service in a cloud provider could be zonally segregated, but they can't. If you make an IAM change, that's global; there is no way to make it affect only one zone at a time, and no way to ensure it is applied in every zone. DNS is intended to be global; you could technically run your own nameservers and target specific ones in specific zones for a particular update. But even then you could not guarantee an update goes out to all zones, or even one zone.)

If you built an entire cloud provider from scratch, you could try to build literally every service to be zonal, and expand from there. But no cloud provider I'm aware of has that capability, and I'm not aware of anybody who has DIY'd their own PaaS that does it either. So you can claim you have a global platform, or a regional or zonal platform, but in practice, it is almost always an architectural Escher Cake and subject to the random whims of the universe.

nicoche|2 years ago

I completely agree with you. I would even go further: even if you can, you probably do not want to build a set of fully zonal services. As you said, there are some features that you want to provide which should affect "global" entities, like IAM or billing. For this kind of stuff, I believe that it makes sense to have some simple, globally unique components, and to carefully plan around their failure scenarios.

__turbobrew__|2 years ago

You can get a poor man's multi-topology (DC, region, AZ, rack, etc.) system by splitting a Kubernetes control plane and etcd cluster across that topology. For example, if you have 5 DCs you can run a control plane and etcd instance in each DC and survive 2 of the 5 DCs failing.
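The 2-out-of-5 figure follows from majority quorum: an etcd cluster stays writable as long as more than half of its members are reachable. A minimal sketch of that arithmetic, assuming one etcd member per DC (the function names are illustrative):

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority remains."""
    return members - quorum(members)


for n in (3, 4, 5, 7):
    print(f"{n} members: quorum={quorum(n)}, tolerates={tolerated_failures(n)} failures")
```

Note that an even member count buys nothing here: 4 members tolerate the same single failure as 3, which is why odd cluster sizes are the usual choice.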

The performance of running etcd clusters across high latency links (100-200ms) is not as bad as I would have originally thought and scales into tens of thousands of pods without too much trouble.

As long as you label nodes (kubelets) with topology information, deploying selectively in different topologies is as simple as specifying the correct node selector. Workload high availability can be ensured with topology spread constraints, which make sure not too many pods of a single workload are running in the same failure zone.
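As a rough illustration, the sketch below builds a Deployment manifest as a Python dict and prints it as JSON, which `kubectl apply -f` accepts. It assumes nodes carry the well-known `topology.kubernetes.io/region` and `topology.kubernetes.io/zone` labels; the workload name, image, and region value are made up for the example.

```python
import json


def spread_deployment(name: str, image: str, replicas: int,
                      region: str, max_skew: int = 1) -> dict:
    """Build a Deployment pinned to one region and spread evenly across its zones."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Node selector: only schedule onto nodes labelled with this region.
                    "nodeSelector": {"topology.kubernetes.io/region": region},
                    # Spread constraint: pod counts per zone may differ by at most max_skew.
                    "topologySpreadConstraints": [{
                        "maxSkew": max_skew,
                        "topologyKey": "topology.kubernetes.io/zone",
                        "whenUnsatisfiable": "DoNotSchedule",
                        "labelSelector": {"matchLabels": labels},
                    }],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical workload: 6 replicas in a region labelled "eu-west".
manifest = spread_deployment("web", "example.com/web:1.0", 6, "eu-west")
print(json.dumps(manifest, indent=2))
```

With `whenUnsatisfiable` set to `DoNotSchedule`, pods that would violate the skew stay pending; `ScheduleAnyway` is the softer alternative if availability of capacity matters more than even spread.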

If you can go k8s native with your infra this stuff becomes easier, but by no means easy.

Sparkyte|2 years ago

Lots of major companies still don't have a global deployment platform today. A lot of stuff is still very centralized because the complexity of managing infrastructure multiplies with the complexity of the existing infrastructure.

ushakov|2 years ago

Do they actually need one? With replicated deployments you will want a globally replicated database as well. This makes operating your app n times as expensive and time-consuming, where n is the number of locations.

ushakov|2 years ago

Why would I use this instead of Fly.io or AWS Global Accelerator?

nicoche|2 years ago

I think that we provide a higher-level experience than Fly. Regarding AWS Global Accelerator, I haven't tested the product, but from what I see we also provide a higher-level experience: we take as input a GitHub repo or a container image, so we can abstract away the VM layer (EC2). We directly run "applications", not virtual machines.

bithavoc|2 years ago

I feel like Koyeb is bound to make the same mistakes as Fly.io, which is moving away from global CDN-like app deployments to regular 12-factor app hosting.

ushakov|2 years ago

It’s not a mistake, it’s what the customer base wants.

The problem with Koyeb is that their business is essentially a niche feature (multi-region). The main selling point of Fly and Cloudflare Workers is not that they’re multi-region, but that they are cheap and easy for developers. It’s a very bloody business to be in, if you ask me…