We had a critical service that often got overwhelmed, not by one client app but by different apps over time. One week it was app A, the next week app B, each with its own buggy code suddenly spamming the service.
The quick fix suggested was caching, since a lot of requests were for the same query. But after debating, we went with rate limiting instead. Our reasoning: caching would just hide the bad behavior and keep the broken clients alive, only for them to cause failures in other downstream systems later. By rate limiting, we stopped abusive patterns across all apps and forced bugs to surface. In fact, we discovered multiple issues in different apps this way.
Takeaway: caching is good, but it is not a replacement for fixing buggy code or misuse. Sometimes the better fix is to protect the service and let the bugs show up where they belong.
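For anyone wondering what the rate limiting side might look like, here is a minimal sketch of a per-client token bucket. This is an assumption about the general technique, not what was actually deployed in the story above; `TokenBucket`, `RateLimiter`, and the `rate`/`capacity` values are made-up names for illustration.

```python
import time


class TokenBucket:
    """Refills `rate` tokens per second, holding at most `capacity` tokens."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """One bucket per client, so a single misbehaving app only throttles itself."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def check(self, client_id, now=None):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_id] = bucket
        # 429 is what makes the buggy client visible instead of hiding it.
        return 200 if bucket.allow(now) else 429
```

Keying the buckets by client is the part that matters for the "different app every week" problem: app A spamming the service gets 429s while app B keeps working.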
It's funny how I encountered a problem that went exactly the opposite way! We initially introduced a rate limiter that was adequate at the time, but as the product scaled up it stopped being adequate, and any failures with 429 were either ignored or closed as client bugs. Only after some time did we realize that the request rate had scaled up roughly in line with product growth. The quick fix was to simply remove the limiter, but after a couple of times when the DB decided to take a nap after being overwhelmed, we added a caching layer.
Just goes to show that there is no silver bullet - context, experience and a good amount of gut feeling are paramount.
You can never trust clients to behave. If your goal is to reduce infra cost, sure, rate limiting is an acceptable answer. But is it really that hard to throw on a cache and provision your service to be horizontally scalable?
It details Apache Samza, which I didn’t totally grasp, but it seems similar to what you’re talking about here.
He talks about how if you could essentially use an event stream as your source of truth instead of a database, and you had a sufficiently powerful stream processor, you could define views on that data by consuming stream events.
The end result is kind of like an auto-updating cache with no invalidation issues or race conditions. Need a new view on the data? Just define it and run the entire event stream through it. Once the stream is processed, that source of data is perpetually accurate and up-to-date.
I’m not a database guy and most of this stuff is over my head, but I loved this talk and I think you should check it out! It’s the first thing I thought of when I read your post.
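A toy version of the "views over an event stream" idea, as I understand it. This is plain Python with a list standing in for the stream, not Samza's API; the event shapes and the `build_view` helper are hypothetical.

```python
from collections import defaultdict

# The append-only event log is the source of truth.
events = [
    {"type": "deposit", "account": "alice", "amount": 100},
    {"type": "deposit", "account": "bob", "amount": 50},
    {"type": "withdraw", "account": "alice", "amount": 30},
]


def balance_view(state, event):
    """View 1: current balance per account."""
    sign = 1 if event["type"] == "deposit" else -1
    state[event["account"]] += sign * event["amount"]
    return state


def deposit_count_view(state, event):
    """View 2: number of deposits per account."""
    if event["type"] == "deposit":
        state[event["account"]] += 1
    return state


def build_view(reducer, log):
    """Defining a new view means replaying the whole log through its reducer."""
    state = defaultdict(int)
    for event in log:
        state = reducer(state, event)
    return state


balances = build_view(balance_view, events)
deposits = build_view(deposit_count_view, events)

# A new event is appended once and applied to every view; nothing is invalidated.
new_event = {"type": "withdraw", "account": "bob", "amount": 20}
events.append(new_event)
balance_view(balances, new_event)
deposit_count_view(deposits, new_event)
```

The views never need invalidation because they are only ever changed by the same events that change the truth.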
I think a fundamental mistake I see many developers make is that they use caching to try to solve problems rather than to improve efficiency.
It's the equivalent of adding more RAM to fix poor memory management or adding more CPUs/servers to compensate for resource heavy and slow requests and complex queries.
If your application requires caching to function effectively then you have a core issue that needs to be resolved, and if you don't address that issue then caching will become the problem eventually as your application grows more complex and active.
Idk I think caching is a crucial part of many well-designed systems. There’s a lot of very cache-able data out there. If invalidating events are well defined or the data is fine being stale (week/month level dashboards, for example), that’s a fantastic reason to use a cache. I’d much rather just stuff those values in a cache than figure out any other more complicated solution.
I also just think it’s a necessary evil of big systems. Sometimes you need derived data. You can even think about databases as a kind of cache: the “real” data is the stream of every event that ever updated data in the database! (Yes, this is stretching the meaning of cache lol)
However I agree that caching is often an easy bandaid for a bad architecture.
> It's the equivalent of adding more RAM to fix poor memory management
No it’s ten times worse than that. Adding RAM doesn’t make the task of fixing the memory management problems intrinsically harder. It just makes the problem bigger when you do fix it.
Adding caching to your app makes all of the tools used for detecting and categorizing performance issues much harder to use. We already have too many developers and “engineers” who balk at learning more than the basics of using these tools. Caching is like stirring up sediment in a submarine cave. Now only the most disciplined can still function and often just barely.
When you don’t have caches, data has to flow along the call tree. So if you need a user’s data in three places, that data either flows to those three or you have to look it up three times, which can introduce concurrency issues if the user metadata changes in the middle of a request. But because it’s inefficient there is clear incentive to fix the data propagation issues. Fixing those issues will make testing easier because now the data is passed in instead of having to mock the lookup code.
Then you introduce caching. Now the incentive is mostly gone, since you will only improve cold start performance. And now there is a perverse incentive to never propagate the data again. You start moving backward. Soon there are eight places in the code that use that data, because looking it up was “free” and they are all detached from each other. And now you can’t even turn off the cache, and cache traffic doesn’t tell you what your costs are.
And because the lookup is “free” the user lookup code disappears from your perf data and flame graphs. Only a madman like me will still tackle such a mess, and even I have difficulty finding the motivation.
For these reasons I say with great confidence and no small authority: adding caching to your app is the last major performance improvement most teams will ever see. So if you reach for it prematurely, you’re stuck with what you’ve got. Now a more astute competitor can deliver a faster, cheaper, or both product that eats your lunch and your team will swear there is nothing they can do about it because the app is already as fast as they can make it, and here are the statistics that “prove” it.
Friends don’t let friends put caches on immature apps.
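To make the "data flows along the call tree" point concrete, here is a small sketch. All names (`load_user`, `handle_request`, the `USERS` table) are hypothetical; the dict stands in for whatever store the lookup would hit.

```python
USERS = {42: {"name": "Ada", "plan": "pro", "locale": "en"}}
stats = {"lookups": 0}  # counts how often we actually hit the store


def load_user(user_id):
    stats["lookups"] += 1
    return dict(USERS[user_id])


# Each function takes the user as an argument instead of looking it up itself.
def greeting(user):
    return f"Hello, {user['name']}"


def can_export(user):
    return user["plan"] == "pro"


def date_format(user):
    return "%m/%d/%Y" if user["locale"] == "en" else "%d.%m.%Y"


def handle_request(user_id):
    # One lookup per request, and all three consumers see the same snapshot.
    user = load_user(user_id)
    return greeting(user), can_export(user), date_format(user)
```

Note that `can_export({"plan": "free"})` can be tested with a plain dict, no mocking of the lookup code needed, which is the testing benefit mentioned above.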
> If your application requires caching to function effectively then you have a core issue that needs to be resolved, and if you don't address that issue then caching will become the problem eventually as your application grows more complex and active.
I don’t think this is always true. Sometimes your app simply has data that takes a lot of computation to generate but doesn’t need to be generated often. Any way you solve this is going to be able to be described as a ‘cache’ even if you are just storing calculations in your main database. That doesn’t mean your application has a fundamental design flaw, it could mean your use case has a fundamental cache requirement.
If your database is slow because it's on spinning disks, then a cache will speed up access.
That's not a fundamental mistake, and there's very little you can do about that from an efficiency point of view.
It's easy to forget that there was a world without SSDs, high speed pipes, etc - but it actually did exist. And that wasn't so long ago either.
And of course sometimes putting data nearer to the user actually makes sense...like the Netflix movie boxes inside various POPs or CDNs. Bandwidth and latency are actual factors for many applications.
That said, most applications probably should investigate adding indexes to their databases (or noSQL databases) instead of adding a cache layer.
Not to mention latency! Caching does nothing to fix the latency of “misses”, which means any app that uses a caching layer to paper over a bad design will forever have a terrible P99 (or even P90) latency.
“But, but, when I reload the page now it’s fast! I fixed it!”
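A quick back-of-the-envelope illustration of the tail latency point. The numbers are invented (95% hit rate, 5 ms hits, 800 ms misses) and `percentile` is a simple nearest-rank helper, so treat this as a sketch rather than a benchmark.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # ceil(p * n / 100) in integer math
    return ordered[max(rank, 1) - 1]


# 1000 requests: 950 cache hits at 5 ms, 50 misses at 800 ms.
latencies_ms = [5] * 950 + [800] * 50

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

With these made-up numbers the median is the hit latency, the mean looks fine, and P99 is exactly the uncached latency: the cache did nothing for the slow path.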
A friend of mine once argued that adding a cache to a system is almost always an indication that you have an architectural problem further down the stack, and you should try to address that instead.
The more software development experience I gain the more I agree with him on that!
Caches suck because invalidation needs to be sprinkled all over the place in what is often an abstraction-violating way.
Then there's memoization, often a hack for an algorithm problem.
I once "solved" a huge performance problem with a couple of caches. The stain of it lies on my conscience. It was actually admitting defeat in reorganizing the logic to eliminate the need for the cache. I know that the invalidation logic will have caused bugs for years. I'm sure an engineer will curse my name for as long as that code lives.
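The textbook illustration of memoization papering over an algorithm problem is naive recursive Fibonacci. This is just a sketch to show the shape of the trade-off, not the code from the story above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_memo(n):
    """Exponential-time recursion, rescued by a cache that grows with n."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


def fib_iter(n):
    """Same result after restructuring the algorithm: no cache, O(1) memory."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Both return the same values, but only the second one removed the reason the cache was needed in the first place.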
If you have no cache, and your first thought is "this needs a cache", you're probably right. Chances are you need to optimize a query or storage pattern. But you're thinking like an engineer. It may be true that there is a "more correct" engineering solution, but adding a cache might be the most expedient solution.
But after you've done all the optimizations, there is still a use case for caches. The main one being that a cache holds a hot set of data. Databases are getting better at this, and with AI in everything, latency of queries is getting swamped by waiting for the LLM, but I still see caches being important for decades to come.
Most of the time I use caching it's to cut down on network round trips. If I'm fetching data on every end user request that only updates daily or weekly, caching that is a no-brainer. Edge caching for content sites is also a no-brainer. Caching something computationally expensive may be fishy but also may be useful. Even if you are just papering over some inefficient process, that's not necessarily a sin. Sometimes you have to be pragmatic.
The two questions no one seems to ask are 'do I even need a database?', and 'where do I need my database?'
There are alternate data storage 'patterns' that aren't databases. Though ultimately some sort of (Structured) query language gets invented to query them.
Yeah my architecture problem is that Postgres RDS EBS storage is slow as dog. Sure our data won’t go poof if we lose an instance but it’s so slow.
(It’s not really my architecture problem. My architecture problem is that we store pages as grains of sand in a db instead of in a bucket, and that we allow user defined schemas)
If you think of it as a cache, yes. If you think of it as another data layer then no.
For example, let’s say that every web page your CMS produces is created using a computationally expensive compilation. But the final product is more or less static and only gets updated every so often. You can basically have your compilation process pull the data from your source of truth such as your RDBMS but then store the final page (or large fragments of it) in something like MongoDB. In other words the cache replacement happens at generation time and not on demand. This means there is always a cached version available (though possibly slightly stale), and it is always served out of a very fast data store without expensive computation. I prefer this style of caching to on demand caching because it means you avoid cache invalidation issues AND the thundering herd problem.
Of course this doesn’t work for every workflow but it can get you quite far. And yes this example can also be sort of solved with a static site generator but look beyond that at things like document fragments, etc. This works very well for dynamic content where the read to write ratio is high.
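Roughly what "cache replacement at generation time" could look like in code. Plain dicts stand in for the RDBMS and the fast page store here, and `expensive_render`, `save_article`, and `serve` are hypothetical names; a real setup would write to something like MongoDB instead.

```python
source_of_truth = {}  # stands in for the RDBMS
page_store = {}       # stands in for the fast store of finished pages
stats = {"renders": 0}


def expensive_render(record):
    """Placeholder for the costly compilation step."""
    stats["renders"] += 1
    return f"<h1>{record['title']}</h1><p>{record['body']}</p>"


def save_article(page_id, title, body):
    """Write path: update the source of truth, then regenerate the page."""
    source_of_truth[page_id] = {"title": title, "body": body}
    page_store[page_id] = expensive_render(source_of_truth[page_id])


def serve(page_id):
    """Read path: never renders, so a burst of readers cannot stampede."""
    return page_store.get(page_id)
```

Because rendering only happens on writes, a thousand simultaneous readers cost a thousand cheap reads and zero renders, and there is no expiry moment at which they all miss at once.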
I disagree. For large search pages where you're building payloads from multiple records that don't change often, it could be beneficial to use a cache. Your cache ends up helping the most common results to be fetched less often and return data faster.
I've been thinking a lot recently about edge/client layer data sync, interesting to hear where others are at. Noria seems to have got as far as a smart way to store and manage tabular data, however this doesn't seem to help much when the frontend is built on blobs & if one isn't prepared to write the additional layer for read/write on top of the rest of the fetching system.
The dumb/MVP approach I'd like to try sometime is close-to-client read-only SQLite DBs that get managed in the background and neatly handled by wrapper functions around things like fetch. The part I've been slowly thinking about is Noria-style efficient handling of data structures while allowing for 'raw' queries. Ideally I'd like to set this up so the frontend doesn't need an additional layer's worth of read/write functionality just to have CDN-like behaviour. Maybe something like plugins to [de/re]normalise different kinds of blob to tables (from gql, groqd, etc). I'd also like to include a realtime cache invalidation/update system to keep all clients in sync without cache clearing... If I ever get that far.
This got me thinking a bit more. Rest / GraphQL / Groq handled with adapters, flatten anything nested that references an ID to the row level. Opinionated queries (queries only fetch a superset/subset of the same structure). Fetched data 'fans out' the new content into the rows based on ID to fill out/update structure. Lives in a service worker or side by side with frontend. Drops oldest/least fetched data when limits are reached. Would something like that work?
Alternatively just ship an entire shallow copy of least changed / most used data as sqlite db's to the edge, push updates to those, and fetch from source anything that isn't in the DB. Might be simpler.
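A very rough sketch of the "flatten by ID, fan out, drop the oldest" idea, just to see whether the shape holds together. Everything here is hypothetical: `RowCache` is a made-up name, rows are keyed by an `id` field, and least-recently-used eviction is my stand-in for "oldest/least fetched".

```python
from collections import OrderedDict


class RowCache:
    def __init__(self, max_rows):
        self.max_rows = max_rows
        self.rows = OrderedDict()  # row id -> fields, least recently used first

    def ingest(self, obj):
        """Store obj as a row; nested dicts carrying an id become their own rows."""
        row = {}
        for key, value in obj.items():
            if isinstance(value, dict) and "id" in value:
                row[key] = {"ref": self.ingest(value)}  # keep only a reference
            else:
                row[key] = value
        # Fan out: merge new fields into whatever we already know about this id.
        self.rows[obj["id"]] = {**self.rows.get(obj["id"], {}), **row}
        self.rows.move_to_end(obj["id"])
        while len(self.rows) > self.max_rows:
            self.rows.popitem(last=False)  # drop the least recently used row
        return obj["id"]

    def get(self, row_id):
        row = self.rows.get(row_id)
        if row is not None:
            self.rows.move_to_end(row_id)
        return row
```

Any adapter (REST, GraphQL, Groq) would only need to hand `ingest` a dict; a fetch that returns a fresher copy of a nested object updates that row for every query that references it.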
Some of these questions are informed by the Redis/DynamoDB or Postgres/MySQL world the author seems to inhabit.
Why would you want to do this?
"I don’t know of any database built to handle hundreds of thousands of read replicas constantly pulling data."
If you want an open-source database with Redis latencies to handle millions of concurrent reads, you can use RonDB (disclaimer, I work on it).
"Since I’m only interested in a subset of the data, setting up a full read replica feels like overkill. It would be great to have a read replica with just partial data."
This is very unclear. Redis returns complete rows because it does not support pushdown projections or ordered indexes. RonDB supports these, as well as distribution-aware, partition-pruned index scans (start the transaction on the node/partition that contains the rows that are found with the index).
For the type of cache usage described in the article, cache lookups are almost always O(1). This is because a cache value is retrieved for a specific key.
Whereas db queries are often more complicated and therefore take longer. Yes, plenty of db queries are fetching a row by a key, and therefore fast. But many queries use a join and a somewhat complicated WHERE clause.
No, what makes a cache a cache is invalidation. A cache is stale data. It's a latent out of date calculation. It's misinformation that risks surviving until it lies to the user.
Many of these points are not compelling to me when 1) you can filter both rows and columns (in postgres logical replication anyway [0]) and 2) SQL views.
Is it possible to create a filter that can work over a complex join operation?
That's what IVM systems like Noria can do. With application + cache, the application stores the final result in the cache. So, with these new IVM systems, you get that precomputed data directly from the database.
Views in Postgres are not materialized, right? So every small delta would require a refresh of the entire view.
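To show the difference between refreshing a whole view and maintaining it incrementally, here is a toy sketch. It is not how Noria is implemented, and it handles a simple grouped count rather than a join; `OrderCountView` and the row shape are assumptions for illustration.

```python
from collections import Counter


def recompute(rows):
    """Full refresh: scan every row again, cost grows with the table."""
    return Counter(row["customer"] for row in rows)


class OrderCountView:
    """'Orders per customer' kept current by applying each delta as it arrives."""

    def __init__(self):
        self.counts = Counter()

    def apply(self, op, row):
        delta = 1 if op == "insert" else -1
        self.counts[row["customer"]] += delta
        if self.counts[row["customer"]] == 0:
            del self.counts[row["customer"]]


table = []
view = OrderCountView()
changes = [
    ("insert", {"id": 1, "customer": "ann"}),
    ("insert", {"id": 2, "customer": "bob"}),
    ("insert", {"id": 3, "customer": "ann"}),
    ("delete", {"id": 2, "customer": "bob"}),
]
for op, row in changes:
    if op == "insert":
        table.append(row)
    else:
        table.remove(row)
    view.apply(op, row)  # constant work per change, no rescan
```

After every change the incrementally maintained counts match what a full recompute over the table would give, which is the property an IVM system has to guarantee for much more complicated queries.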
Event-sourcing is a powerful tool that helps with exactly this. Why spin up a cache server when you can spin up another read DB instance for the same price and get unlimited capabilities...
miggy|6 months ago
Alex_L_Wood|6 months ago
andersmurphy|6 months ago
In all seriousness sometimes a cache is what you need. Inline caching is a classic example.
spyspy|6 months ago
chamomeal|6 months ago
https://youtu.be/fU9hR3kiOK0?si=t9IhfPtCsSyszscf
ajcp|6 months ago
shivasaxena|6 months ago
[deleted]
zeras|6 months ago
chamomeal|6 months ago
This talk on Apache Samza completely changed how I think about caching and derived data in general: https://youtu.be/fU9hR3kiOK0?si=t9IhfPtCsSyszscf
And this interview has some interesting insights on the problems that caching faces at super large scale systems (twitter specifically): https://softwareengineeringdaily.com/2023/01/12/caching-at-t...
hinkley|6 months ago
cortesoft|6 months ago
mannyv|6 months ago
jiggawatts|6 months ago
simonw|6 months ago
hinkley|6 months ago
barrkel|6 months ago
jedberg|6 months ago
tootie|6 months ago
jmull|6 months ago
Caches have perfectly valid uses, but they are so often used in fundamentally poor ways, especially with databases.
DrBazza|6 months ago
jitl|6 months ago
IgorPartola|6 months ago
AtheistOfFail|6 months ago
tengbretson|6 months ago
jayd16|6 months ago
interstice|6 months ago
interstice|6 months ago
jamesblonde|6 months ago
Reference:
https://www.rondb.com/post/the-process-to-reach-100m-key-loo...
stevoski|6 months ago
hoppp|6 months ago
The difference is in persistence and scaling and read/write permissions
barrkel|6 months ago
Supermancho|6 months ago
eatonphil|6 months ago
[0] https://www.postgresql.org/docs/current/logical-replication-...
avinassh|6 months ago
xixixao|6 months ago
Having caching by default (like in Convex) is a really neat simplification to app development.
phoronixrly|6 months ago
gethly|6 months ago
mannyv|6 months ago
Again, you should test. But the main reason imo for redis is connections and speed, not just speed.
jayd16|6 months ago
cbsmith|6 months ago
valentinammm|6 months ago
[deleted]