freeqaz | 4 months ago

Since I'm 5+ years out from my NDA around this stuff, I'll give some high level details here.

Snapchat heavily used Google AppEngine to scale. This was basically a magical Java runtime that would 'hot path split' the monolithic service into lambda-like worker pools. Pretty crazy, but it worked well.

Snapchat leaned very heavily on this though and basically let Google build the tech that allowed them to scale up instead of dealing with that problem internally. At one point, Snap was >70% of all GCP usage. And this was almost all concentrated on ONE Java service. Nuts stuff.

Anyway, eventually Google was no longer happy with supporting this, and the corporate way of breaking up is "hey we're gonna charge you 10x what we did last year for this, kay?" (I don't know if it was actually 10x. It was just a LOT more.)

So began the migration towards Kubernetes and AWS EKS. Snap was one of the pilot customers for EKS before it was generally available, iirc. (I helped work on this migration in 2018/2019)

Now, 6+ years later, I don't think Snap heavily uses GCP for traffic unless they migrated back. And this outage basically confirms that :P

garbthetill|4 months ago

That's so interesting to me. I always assume companies like Google, who have "unlimited" dollars, will always be happy to eat the cost to keep customers, especially given that GCP usage outside Google's internal services is way smaller than Azure's or AWS's. Also interesting to see that Snapchat had a hacky solution with AppEngine.

freeqaz|4 months ago

These are the best additional bits of information that I can find to share with you if you're curious to read more about Snap and what they did. (They were spending $400m per year on GCP which was famously disclosed in their S-1 when they IPO'd)

0: https://chrpopov.medium.com/scaling-cloud-infrastructure-5c6...

1: https://eng.snap.com/monolith-to-multicloud-microservices-sn...

makeitdouble|4 months ago

The "unlimited dollars" come from somewhere after all.

GCP is behind in market share, but has the incredible cheat advantage of just not being Amazon. Most retailers won't touch Amazon services with a ten-foot pole, so the choice is GCP or Azure. Azure is way more painful for FOSS stacks, so GCP has its own area with only limited competition.

array_key_first|4 months ago

Google does not give even a singular fuck about keeping their customers. They will happily kill products that are actively in use and are low-effort for... convenience? Streamlining? I don't know, but Google loves to do that.

lesuorac|4 months ago

High-margin companies are always looking to cut the lower-margin parts of their business, regardless of whether those parts are profitable.

The general idea being that you're losing money due to opportunity cost.

Personally, I think you're better off just not laying people off and having them work the less (but still) profitable stuff. But I'm not in charge.