Even though I love their simplicity as an example of how to be pragmatic and not over-engineer, do remember that they've tuned their code to the point that they built an ORM that is one of the fastest in the .NET world. I used it and it was awesomely lightweight.
It’s as much an example of how far world class talent can go, as it is about doing more with less.
Right - Marc Gravell and Nick Craver, who worked on the core architecture of Stack Overflow, were both so obsessive about extracting performance from .NET web applications that when they couldn't do any more from the outside, they both quit and went to work for Microsoft on performance improvements in the framework itself.
I feel like it’s similar to how people point to Craigslist as evidence that you can still build sites in Perl - ignoring the fact that Craigslist has Larry Wall on a retainer.
Running highly scalable monoliths is easy! As long as you’re willing to hire some of the five to ten people in the world who are capable of advancing the state of the art of development on that technology stack…
You can also see this the other way around — it's a testament to how slow some other stuff is.
Which, to be clear, is not intended to be a negative statement about that "other stuff". It really depends. Some is. But I've also seen things just done poorly by applying tools wrong, e.g. ORM misuse leading to thousands of queries that should have been one OUTER JOIN.
But I don't think you need engineers of their unique calibre to get most of what they got. It's probably an exponential thing, if you have some merely good engineers you could maybe achieve 80% of their performance. The last 20% are just much more costly.
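The ORM-misuse point is easy to make concrete. Here's a toy sketch (using SQLite and a made-up two-table schema, not any particular ORM): the first function issues one query per parent row — the classic N+1 pattern — while the second gets the same result in a single join.

```python
import sqlite3

# Hypothetical schema for illustration: authors and their posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'ada'), (2, 'bob');
    INSERT INTO post VALUES (1, 1, 'p1'), (2, 1, 'p2'), (3, 2, 'p3');
""")

# ORM-misuse pattern: one query for the parents, then one query per parent.
# With N parents that's N+1 round trips to the database.
def n_plus_one():
    rows = []
    for (author_id, name) in conn.execute("SELECT id, name FROM author"):
        for (title,) in conn.execute(
                "SELECT title FROM post WHERE author_id = ?", (author_id,)):
            rows.append((name, title))
    return rows

# The same result set in a single query with a join.
def single_join():
    return list(conn.execute("""
        SELECT a.name, p.title
        FROM author a LEFT OUTER JOIN post p ON p.author_id = a.id
        ORDER BY p.id
    """))

assert n_plus_one() == single_join()
```

With two authors the difference is invisible; with ten thousand, the first version makes ten thousand and one round trips where one would do.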
Yep. Following some of the SO folks on Twitter a while back, I remember watching them do all sorts of things with .NET that didn’t feel remotely “necessary” for a Q&A website. It’s not like you can pull people off the street and have them get away with infrastructure this simple.
Not to take anything away from Dapper (it's an excellent library), but it isn't really that much faster than Entity Framework anymore.
> EF Core 6.0 performance is now 70% faster on the industry-standard TechEmpower Fortunes benchmark, compared to 5.0.
> This is the full-stack perf improvement, including improvements in the benchmark code, the .NET runtime, etc. EF Core 6.0 itself is 31% faster executing queries.
> Heap allocations have been reduced by 43%.
> At the end of this iteration, the gap between Dapper and EF Core in the TechEmpower Fortunes benchmark narrowed from 55% to around a little under 5%.

https://devblogs.microsoft.com/dotnet/announcing-entity-fram...
Again, this isn't to take anything away from Dapper. It's a wonderful query library that lets you just write SQL and map your objects in such a simple manner. It's going to be something that a lot of people want. Historically, Entity Framework performance wasn't great and that may have motivated StackOverflow in the past. At this point, I don't think EF's performance is really an issue.
If you look at the TechEmpower Framework Benchmarks, you can see that Dapper and EF performance is basically identical now: https://www.techempower.com/benchmarks/#section=data-r21&l=z.... One fortunes test is 0.8% faster for Dapper and the other is 6.6% faster. For multiple queries, one is 5.6% faster and the other is 3.8% faster. For single queries, one is 12.2% faster and the other 12.9% faster. So yes, Dapper is faster, but there isn't a huge advantage anymore - not to the point that one would say StackOverflow has tuned their code to such an amazing point that they need substantially less hardware. If they swapped EF in, they probably wouldn't notice much of a difference in performance. In fact, in real-world apps, the gap between them is probably going to end up being even smaller.

If we look at some other benchmarks in the community, they tell a similar story: https://github.com/FransBouma/RawDataAccessBencher/blob/mast...
In some tests, EF actually edges past Dapper since it can compile queries in advance (which just means calling `EF.CompileQuery(myQuery)` and assigning the result to a static field that gets reused).
Again, none of this is to take away from Dapper. Dapper is a wonderful, simple library. In a world where there's so many painful database libraries, Dapper is great. It shows wonderful care in its design. Entity Framework is great too and performance isn't really an interesting distinction. I love being able to use both EF and Dapper and having such amazing database access options.
The best cache is the one built into the database. People seem to forget that the major rdbmses have sophisticated cache strategies of their own and that handing them more RAM (and ensuring they are configured to use it for query or other cache) is usually a good first strategy before trying to second guess and reinvent the cache outside the db.
Thread says SO allocates 1.5TB RAM to SQL Server. Sounds wise.
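As a minimal, runnable illustration of "hand the database more RAM" — using SQLite's page cache here as a stand-in for the equivalent knobs on a server RDBMS (e.g. `max server memory` on SQL Server, `shared_buffers` on Postgres):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite's page cache defaults to a few MiB; negative values mean KiB.
(default_cache,) = conn.execute("PRAGMA cache_size").fetchone()

# Let the engine keep ~1 GiB of pages in memory, so hot data is served
# from RAM instead of storage -- the same idea as giving SQL Server 1.5TB.
conn.execute("PRAGMA cache_size = -1048576")
(new_cache,) = conn.execute("PRAGMA cache_size").fetchone()
assert new_cache == -1048576
```

Same principle, wildly different scale: the engine's own buffer cache is the first cache you should be feeding.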
It's all about the load though. SO is probably 95% read-only, which makes skipping a separate cache layer reasonable. If you had more writes, you'd need an external cache to offset the read load.
Microservices remain mostly an organisational pattern for scaling development teams, not necessarily system performance. Microservices add a lot of complexity and overhead.
The main takeaway is that the questions searched for are so widely distributed that there is no need for a cache layer - they are nothing but long tail.
At that point there is no 'cloud' design that can help. It's either one database (or maybe just shard everything onto thousands of distributed nodes).
But the point I am trying to make is that kubernetes and microservices etc are based on the idea of winners - power laws. One tweet everyone wants to read. One search term, one viral video.
Then again, this is just a question of taste - the taste of the dev lead. What they feel is the best approach. Take another company doing the same thing and a different approach might emerge.
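The long-tail-versus-winners distinction can be put into rough numbers. An illustrative sketch (made-up sizes, nothing measured from Stack Overflow): caching the hottest 1% of items absorbs most of the traffic under a power-law (Zipf-like) popularity distribution, and almost none of it under a flat long tail.

```python
# Fraction of requests served by caching the hottest `cached` items out of
# `n`, under Zipf-like (popularity ~ 1/rank) vs uniform popularity.
def hit_rate_zipf(n, cached):
    weights = [1.0 / rank for rank in range(1, n + 1)]
    return sum(weights[:cached]) / sum(weights)

def hit_rate_uniform(n, cached):
    return cached / n

n, cached = 1_000_000, 10_000  # cache the top 1% of a million items

# Under a power law, a tiny cache absorbs most traffic...
assert hit_rate_zipf(n, cached) > 0.6
# ...under a flat long tail, it absorbs almost nothing.
assert hit_rate_uniform(n, cached) == 0.01
```

If your traffic looks like the second case — as the question pages apparently do — a dedicated cache tier buys you very little.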
I mean, kubernetes or microservices don’t care how the data reads are distributed, right? That problem is a database-level thing whereas k8s is infrastructure, you can run any kind of database with any kind of sharding you want on it. I feel like it might be more accurate to say something like “the value of caching is based on the idea of winners” for example
I’m always puzzled when I’m using SO to help diagnose some obscure problem in my tech stack and I see a bunch of “hot questions” in the sidebar about whether dwarf armor can deflect magic bullets, or what the energy capacity of a Stormtrooper’s laser rifle is, etc.
Imagine trying to present this kind of architecture to a room full of executives already sold on the "benefits" of kubernetes, big data, serverless, etc.
Hah. I get your point but it would be an easy sell for them. The impossible sell would be to engineers. Executives would just compare operating cost estimates.
The use case is simple, i.e. web front end, thin app layer, database.
So if you were to implement this same architecture using Kubernetes or Serverless, it would be just as simple as a bunch of Ansible or Puppet scripts.
The folks over at SO picked a stack (C#, SQL Server, IIS), and optimized the heck out of it to keep this "simplicity". Much of SO is custom-built from the ground up to push performance and stay within the purity of the canonical .NET stack.
It isn't clear to me this is a model that would work elsewhere, or should be held up as something to be replicated.
Did they save time? Did they save money? Did this help make SO a wildly successful company? Did it allow them to deliver features to customers faster?
It's worth reminding people what is actually possible with a relatively simple architecture. There's a vast number of websites and services with a very small fraction of the traffic of Stack Overflow with a much more complicated architecture simply because everyone thinks you need Kubernetes etc to scale out.
It's not cacheless. There are countless caches throughout (including what appears to be ~1TB of memory in the database server), just not a dedicated cache machine.
Isn't Stack Overflow, incidentally, one of the websites that would benefit the most from caching, given that its content is supposedly static the majority of the time?
Diagram 1 has the comment "What I think it should be".
It's easy to interpret that as "stackoverflow should change to be like this", but I think it was meant to be more like "If I had to guess how stackoverflow works, this is what I think it would look like".
It's amazing how much performance and scalability you can get out of computers, if you don't burden them with 100x overhead caused by shoveling data between microservices all the time :-)
The word "should" might be confusing here. I didn't read it as the author recommending a change; rather the author first proposes "Given what I know about Stack Overflow, they must be doing something like this, right?" Then boom comes the surprising revelation.
Is there a website that tracks outages of other websites like Stack Overflow over years? I know some that tell you if it's down right now, but not over years.
I have a subjective feeling that Stack Overflow is down a lot more than other websites. I don't see that ever mentioned in the discussion of cloud vs on-prem which makes the discussion seem lacking.
Not caching the questions and answers makes sense to me, as I imagine the hit rate wouldn't be terribly good. I would guess, though, that they somehow cache things like the sidebar list of blog articles, featured items, "Hot Network Questions", etc.
They do in fact cache some things like that; they've had caching issues in the past (and again recently, I think) with the wrong cache being used in some situations:

https://meta.stackexchange.com/a/235277
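For what a per-fragment cache like that might look like, here's a minimal TTL-cache sketch (illustrative Python, not Stack Overflow's actual caching code):

```python
import time

# Minimal TTL cache, the kind of thing one might wrap around expensive,
# rarely-changing page fragments like a "hot questions" sidebar.
class TtlCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[0] > now:
            return entry[1]           # fresh hit
        value = compute()             # miss or expired: recompute
        self._store[key] = (now + self.ttl, value)
        return value

calls = 0
def render_sidebar():
    global calls
    calls += 1
    return "<ul>...hot questions...</ul>"

cache = TtlCache(ttl_seconds=60)
a = cache.get_or_compute("sidebar", render_sidebar)
b = cache.get_or_compute("sidebar", render_sidebar)
assert a == b and calls == 1  # second read served from cache
```

The "wrong cache" bugs mentioned above are exactly the failure mode of this pattern: a key that isn't specific enough serves one user's fragment to another.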
Please ignore my lack of understanding a bit here. I'm genuinely trying to learn.
I've always heard (and it made sense to me) that to reduce latency of requests from across the globe, you might want to have read replicas or caches spread on global infrastructure. Then how is it that stack overflow is fast here when the db is on-prem, 7 seas across from me? Any amount of RAM should not account for the distance, right?
You can put a big dent in the impact of the speed of light if you keep round-trips to a minimum.
This is one advantage of server-rendered HTML (though that's not the only option you have).
It also helps that StackOverflow is light on interactivity. You load a page, read for a minute, then maybe click a vote button or open a textarea to discuss. As long as the text and styles load quickly, you won't notice if progressive enhancement scripts take a little more time to load.
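The round-trip point above is easy to put numbers on. A back-of-the-envelope sketch (the distance and fiber speed are rough assumptions, not measurements):

```python
# Rough lower-bound arithmetic for cross-ocean latency: light in fiber
# travels at about 2/3 of c, roughly 200,000 km/s.
distance_km = 15_000                 # e.g. South Asia <-> US East, one way
speed_km_per_s = 200_000
round_trip_ms = 2 * distance_km / speed_km_per_s * 1000

# One round trip is tolerable; a chatty page needing 30 of them is not.
assert 100 <= round_trip_ms <= 200   # ~150 ms per round trip
assert 30 * round_trip_ms > 4000     # ~4.5 s if you need 30 trips
```

A server-rendered page that arrives in one or two round trips can feel fast from anywhere; an API-chatty SPA pays that ~150 ms tax over and over.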
It's a useful reality check. Dedicated machines are fast and you can do a lot without much software complexity. People mention the StackOverflow guys optimizing their software, but their CPU utilization is 5% so they have a lot of headroom to be less optimized. Probably they just enjoyed it and could spend time on that, so why not?
At KotlinConf in April I'll be giving a talk on two-tier architecture, which is the StackOverflow simplicity concept pushed even further. Although not quite there yet for social "web scale" apps like StackOverflow, it can be useful for many other kinds of database backed services where the users are a bit more committed and you're less dependent on virality. For example apps where users sign a contract, internal apps, etc.
The gist is that you scrap the web stack entirely and have only two tiers: an app that acts as your frontend (desktop, mobile) and an RDBMS. The frontend connects directly to the DB using its native protocols and drivers, the user authentication system is that of the database. There is no REST, no JSON, no GraphQL, no OAuth, no CORS, none of that. If you want to do a query, you do it and connect the resulting result stream directly to your GUI toolkit's widgets or table view controls. If what you want can't be expressed as SQL you use a stored procedure to invoke a DB plugin e.g. implemented with PL/Java or PL/v8. This approach was once common - the thread on Delphi the other day had a few people commenting who still maintain this type of app - but it fell out of favor because Microsoft completely failed to provide good distribution systems, so people went to the web to get that. These days distributing apps outside the browser is a lot easier so it makes sense to start looking at this design again.
The disadvantages are that it requires a couple more clicks up front for end users, and if they have very restrictive IT departments it may be harder for them to get access to your app. In some contexts that doesn't matter much, in others it's fatal. The tech for blocking DoS attacks isn't as good, and you may require a better RDBMS (Postgres is great but just not as scalable as SQL Server/Oracle). There are some others I'll cover in my talk along with proposed solutions.
The big advantage is simplicity with consequent productivity. A lot of stuff devs spend time designing, arguing about, fighting holy wars over etc just disappears. E.g. one of the benefits of GraphQL over plain REST is that it supports batching, but SQL naturally supports even better forms of batching. Results streaming happens for free, there's no need to introduce new data formats and ad-hoc APIs between frontend and DB, stored procedures provide a typed RPC protocol that can integrate properly with the transaction manager. It can also be more secure as SQL injection is impossible by design, and if you don't use HTML as your UI then XSS and XSRF bugs also become impossible. Also because your UI is fully installed locally, it can provide very low latency and other productivity features for end users. In some cases it may even make sense to expose the ability to do direct SQL queries to the end user, e.g. if you have a UI for browsing records then you can allow business analysts to supply their own SQL query rather than flooding the dev's backlog with requests for different ways to slice the data.
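A stripped-down sketch of the two-tier idea (SQLite standing in for a server RDBMS, and the table and function names are invented; a real deployment would lean on the database's own user accounts and permissions):

```python
import sqlite3

# Two-tier sketch: the "frontend" talks straight to the database through
# the DB driver's parameterized queries -- no REST/JSON layer in between.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO record (label) VALUES (?)",
                 [("alpha",), ("beta",), ("gamma",)])

def table_view_rows(search_term):
    # Parameter binding means the user-supplied term is always data, never
    # SQL, so injection is ruled out by construction.
    return list(conn.execute(
        "SELECT id, label FROM record WHERE label LIKE ? ORDER BY id",
        (f"%{search_term}%",)))

# Feed the result stream straight into the GUI's table widget.
rows = table_view_rows("al")
assert rows == [(1, "alpha")]
```

The point isn't the three lines of query code; it's everything that's absent — no serialization format, no endpoint, no API client, just the driver and the widget.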
When my startup was acquired a few years ago, our infra was hosted at AWS, but most of our "cloud features" were used more for monitoring, alerting, and dashboarding. The real work was done by Windows/SQL and .NET app code. Ours was a messaging application that we tested to support about 350 messages/second, and we had to integrate with the "big co" backend after we were acquired. The bigco back-end could handle about 3-5 messages/second.
Our main production "infra" was a load-balanced pair of medium CPU front-end servers and a high-memory back-end for the SQL server. Theirs was approximately 20x the size, and a more "traditional" cloud microservices, etc. infrastructure. Optimization makes all the difference. So many of the "extras" just add unnecessary complexity, just like avoiding those "extras" probably does when they actually are required.
On the topic of Postgres versus MS SQL Server or Oracle, I wonder if any of the newer Postgres-compatible databases, like Cockroach or Materialize, solve the scalability issue you raise with Postgres, while not having quite the stigma of MS SQL Server or (especially) Oracle.
didntreadarticl|3 years ago
Looks like it's expanded a little since then
https://github.com/DapperLib/Dapper
MrFoof|3 years ago
If the data is sitting in memory, and you've tuned extracting the data from memory as fast as possible, job done.
didntreadarticl|3 years ago
This question does not appear to be about programming, Closed.
hyped-up technologies
subjective, Closed
problems caused by over-engineering
Opinion-based, Closed.
tony-allan|3 years ago
Is there a particular reason to suggest a change to the architecture?
[1] https://twitter.com/sahnlam/status/1629713954225405952/photo...
jonas-w|3 years ago
[0] https://stackexchange.com/performance
tiffanyh|3 years ago
A hidden takeaway is that databases on NVMe storage are so fast these days that they're comparable to in-memory (Redis) databases.
didntreadarticl|3 years ago
One of the only well-known sites to do so, I think?
didntreadarticl|3 years ago
https://twitter.com/alexcwatt/status/1544876135711916035?lan...
faizmokhtar|3 years ago
That's a little bit arrogant, no?