top | item 28442369


DaGardner | 4 years ago

Yes it does: https://stackexchange.com/performance

Pretty impressive I think.


chrismorgan | 4 years ago

No it doesn’t. From your link:

• 9 web servers

• 4 SQL servers

• 2 Redis servers

• 3 tag engine servers

• 3 Elasticsearch servers

• 2 HAProxy servers

That comes to 23. I know “a couple” is sometimes used to mean more than two, but… not that much more than two.
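The sum above can be checked trivially (server counts taken from the bulleted list):

```python
# Server counts from the list above (stackexchange.com/performance).
counts = {
    "web": 9,
    "sql": 4,
    "redis": 2,
    "tag engine": 3,
    "elasticsearch": 3,
    "haproxy": 2,
}
print(sum(counts.values()))  # 23
```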

“A couple” is just flat-out wrong; I’d guess that he’s misreading ancient figures, taking numbers from no later than about 2013 for how many web servers they needed to cope with the load, while ignoring the other server types (which now make up more than half of the total) and the many extra servers they keep for headroom, redundancy and future-readiness.

fabian2k | 4 years ago

One interesting aspect is that the number of servers is much higher than what would actually be needed to run the site; most servers run at something like 10% CPU or lower. Most of the duplication is for redundancy. As far as I remember, they could run SO and the entire network on two web servers and one DB server (and, I assume, one of each of the other types as well).

If someone says SO runs on a couple of servers, this might be about the number actually necessary to handle full traffic, not the number of servers they use in production. That's a more useful comparison if the question is only about performance, but not very useful if you're comparing what it takes to operate the entire thing.

lamontcg | 4 years ago

23 is not a lot of servers.

That is still doable with mid-90s-era hand management of servers (all named after characters from The Lord of the Rings).

Not that you should, but you could.

And the growth rate must be very low, so it should be pretty easy to plan out your OS upgrade and hardware refresh tempo.

And it was actually possible to manage tens of thousands of servers before containers. The only thing you really need is what they now call a "cattle not pets" mentality.

What you lose is the flexibility of programmatically shoving software onto other bits of hardware to scale or fail over, and you'll need to overprovision some, but even if half of SO's infrastructure is "wasted", that isn't a lot of money.

And if they're running that hardware lean, in racks in a datacenter that they lease, and they're not writing large checks to VMware/EMC/NetApp for anything, then they'd probably spend 10x the money microservicing everything and shoving it all into someone's Kubernetes cloud.

In most places, though, this will fail due to resume-driven design, and you'll wind up with a lot of sprawl because managers don't say no to overengineering. So at SO there must be at least one person in management with a frugal vision of how to engineer software and hardware. Once they leave, or that culture changes, the footprint will eventually start to explode.

manigandham | 4 years ago

Most of that is extra unused capacity. They've shared their load graphs and past anecdotes where it's clear the entire site runs very lean.

Also, 23 is very much "a couple" for a company and application of that size. It's not uncommon to see several hundred or even thousands of nodes deployed by similar sites.

szszrk | 4 years ago

Exactly. That Twitter thread is just pure rage based on no data. Sum up the resources from that page and we're talking around 6500GB* of RAM worth of servers. That is no homelab.

* Maybe a bit more or less, because it's not clear to me whether the DB RAM figure is per server or per cluster. Likely per server, as with the other server types. There is also no data on how big their HAProxy machines are.
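A rough sketch of that back-of-the-envelope RAM sum. The 1.5TB-per-SQL-server figure comes from this thread; every other per-server RAM amount below is a hypothetical placeholder, not a real number from the performance page, so the total will not exactly match the ~6500GB estimate:

```python
# (count, GB of RAM per server) -- only the SQL figure is from the thread;
# all other RAM amounts are hypothetical placeholders for illustration.
servers = {
    "web":           (9, 64),    # hypothetical
    "sql":           (4, 1536),  # 1.5TB each, per the thread
    "redis":         (2, 256),   # hypothetical
    "tag engine":    (3, 64),    # hypothetical
    "elasticsearch": (3, 192),   # hypothetical
    "haproxy":       (2, 64),    # hypothetical; no data on the page
}

total_count = sum(n for n, _ in servers.values())
total_ram_gb = sum(n * gb for n, gb in servers.values())
print(f"{total_count} servers, {total_ram_gb} GB RAM total")
```

Whatever the exact per-box numbers, the four SQL servers alone account for roughly 6TB of the total, which is why the estimate is dominated by whether that figure is per server or per cluster.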

szszrk | 4 years ago

It is impressive, but it's not a Raspberry Pi kind of setup. Just two of those "couple" are the hot and standby DB servers with 1.5TB of RAM each. That infrastructure is scaled A LOT vertically.