
You might not need Redis

241 points | jcartw | 1 year ago | viblo.se | reply

147 comments

[+] g9yuayon|1 year ago|reply
When I was at Uber back in 2015, my org was trying to replace zip-code-based geo partitioning with a hexagon-based scheme. Instead of partitioning a city into tens of zip codes on average, we would partition the city into potentially hundreds of thousands of hexagons and dynamically create areas. The first launch was in Phoenix, and the team responsible for the launch stayed up all night for days because they could barely scale our demand-pricing systems. And then the global launch of the feature was delayed first by days, then by weeks, and then by months.

It turned out Uber engineers just loved Redis. Need to distribute your work? Throw it at Redis. I remember debating with some infra engineers about why we couldn't just throw more redis/memcached nodes at our telemetry system to scale it, but I digress. So, the price service we built was based on Redis. The service fanned out millions of requests per second to Redis clusters to get information about individual hexagons of a given city, and then computed dynamic areas. We would need dozens of servers just to compute a single city. I forget the exact number, but let's say it was 40 servers per average-sized city. Now multiply that by the 200+ cities we had. It was prohibitively expensive, to say nothing of the other scalability bottlenecks that come with managing such scale.

The solution was actually pretty simple. I took a look at the algorithms we used, and it was really just that we needed to compute multiple overlapping shapes. So, I wrote an algorithm that used work-stealing to compute the shapes in parallel per city on a single machine, and used Elasticsearch to retrieve hexagons by a number of attributes -- it was actually a perfect use case for a search engine, because the retrieval requires boolean queries over multiple attributes. The rationale was pretty simple too: we needed to compute repeatedly on the same set of data, so we should retrieve the data only once for multiple computations. The algorithm was merely dozens of lines, and was implemented and deployed to production over the weekend by this amazing engineer Isaac, who happens to be the author of the H3 library. As a result, we were able to compute dynamic areas for 40 cities, give or take, on a single machine, and the launch was unblocked.
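
The retrieve-once, compute-many idea can be sketched roughly like this (a Python stand-in; the hexagon attributes, the predicate-per-shape model, and a shared work queue standing in for true work-stealing are all my assumptions, not Uber's actual code):

```python
import queue
import threading

def compute_shapes(hexagons, shapes, workers=4):
    """Fetch a city's hexagons once, then compute all overlapping
    shapes in parallel against that single in-memory copy."""
    results = {}
    work = queue.Queue()
    for shape_id, predicate in shapes.items():
        work.put((shape_id, predicate))

    def worker():
        while True:
            try:
                shape_id, predicate = work.get_nowait()
            except queue.Empty:
                return  # no more work to take
            # each shape is just a filter over the shared hexagon set,
            # so the data is retrieved once and reused for every shape
            results[shape_id] = [h for h in hexagons if predicate(h)]

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The point is the shape of the fix, not the details: one fetch per city, many computations against it, instead of millions of per-hexagon round trips to Redis.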

[+] ckrapu|1 year ago|reply
I love H3. Isaac and Uber did a real service to the geospatial community with that one.
[+] necubi|1 year ago|reply
Funny enough, exactly the same at Lyft. Redis everywhere. The original version of the dynamic pricing system was a series of cron jobs reading and writing to redis, before it was replaced with a Flink pipeline (which still wrote to redis for serving).
[+] mdaniel|1 year ago|reply
I'm firmly in the "you only need Postgres" camp, so I went into your story thinking it was going to end with you saying that you used PostGIS

Err, now that I think more about that, IIRC Uber is a monster MySQL shop, so it might cause them to break out in hives if someone installed Postgres there

[+] tombert|1 year ago|reply
I have gotten in arguments with people who over-deploy Redis. Redis is cool, I don't dislike it or anything, but a lot of the time when people use it, it actually slows things down.

Using it, you're introducing network latency and serialization overhead. Sometimes that's worth it, especially if your database is falling over, but a lot of the time people use it and it just makes everything more complex and worse.

If you need to share cached data across processes or nodes, sometimes you have to use it, but a lot of the stuff I work with is partitioned anyway. If your data is already partitioned, you know what works well a lot of the time? A boring, regular hashmap.

Pretty much every language has some thread-safe hashmap in there, and a lot of them have pretty decent libraries to handle invalidation and expiration if you need those. In Java, for example, you have ConcurrentHashMap for simple stuff, and Guava Caches or Caffeine Caches for more advanced stuff.
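
For a rough sense of what those libraries give you, here's a toy in-process TTL cache (plain Python; the names and the load-through `compute` callback are illustrative, not any particular library's API):

```python
import threading
import time

class TTLCache:
    """A toy in-process cache with time-based expiry -- the kind of
    thing Guava/Caffeine provide in Java, minus eviction policies."""
    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, compute):
        now = time.monotonic()
        with self._lock:
            hit = self._data.get(key)
            if hit is not None and hit[1] > now:
                return hit[0]          # fresh entry: no recompute
            value = compute()          # miss or expired: recompute
            self._data[key] = (value, now + self._ttl)
            return value
```

No network hop, no serialization; a `get` is a lock acquisition and a dict lookup.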

Even the slowest [1] local caching implementation will almost certainly be faster than anything that hits the network; in my own testing [2] Caffeine caches have sub-microsecond `put` times, and you don't pay any serialization or deserialization cost. I don't think you're likely to get much better than maybe sub-millisecond times with Redis, even in the same data center, not to mention if you're caching locally that's one less service that you have to babysit.

Again, I don't hate Redis, there are absolutely cases where it's a good fit, I just think it's overused.

[1] Realistic ones, I mean; obviously any of us could artificially construct something as slow as we want.

[2] https://blog.tombert.com/posts/2025-03-06-microbenchmark-err... This is my own blog, feel free to not click it. Not trying to plug myself, just citing my data.

[+] ohgr|1 year ago|reply
My trick is saying no to Redis, full stop. Every project where it was used purely as a cache developed retention and backup requirements, and every project where it was a key-value store ended with someone building a relational database on top of it.

There’s nothing worse than when someone does the latter. I had to write a tool to remove deletes from the AOF log because someone fucked up ordering of operations big time trying to pretend they had proper transactions.

[+] fabian2k|1 year ago|reply
I prefer caching in memory, but a major limitation once you have more than one process is invalidation. It's really only easy for stuff you can cache and just expire on a timer, not for data you need to actively invalidate. At that point you need to communicate between your processes (or all of them need to listen to the DB for events).
[+] ozim|1 year ago|reply
I've seen the same: when I merely mentioned caching, a teammate would hear "implement Redis".

Then I would have to explain: "no, we have caching stuff in-process, just use that; our app will use more RAM, but that's what we need".

[+] evil-olive|1 year ago|reply
an antipattern I've observed when giving system design interviews is that a lot of people, when faced with a performance problem, will throw out "we should add a caching layer" as their first instinct, without considering whether it's really appropriate or not.

for example, if the problem we're talking about is related to slow _writes_, not slow reads, the typical usage of a cache isn't going to help you at all. implementing write-through caching is certainly possible, but has additional pitfalls related to things like transactional integrity between your cache and your authoritative data store.
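
a minimal sketch of the write-through shape and its pitfall (plain dicts standing in for the cache and the authoritative store):

```python
class WriteThroughCache:
    """Sketch of write-through caching: every write goes to the
    authoritative store *and* the cache. The pitfall: nothing makes
    the two updates atomic, so a crash between them -- or a write
    that bypasses this class -- leaves the cache stale."""
    def __init__(self, store):
        self.store = store   # authoritative (e.g. your database)
        self.cache = {}

    def write(self, key, value):
        self.store[key] = value   # store first
        self.cache[key] = value   # a crash here leaves them diverged

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store[key]   # cache miss: fall through to store
        self.cache[key] = value
        return value
```

and none of this helps if the writes themselves are the slow part.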

[+] hajimuz|1 year ago|reply
In most cases it's not about the speed, it's about data sharing for containers or distributed systems. Filesystem or in-memory doesn't work. I agree that in most cases a normal database is enough, though.
[+] Salgat|1 year ago|reply
We use an event database (think Kafka) as our source of truth and we've largely shifted away from redis and elasticsearch in favor of local in-memory singletons. These get pretty big too, up to 6GB in some cases for a single mapping. Since it's all event based data, we can serialize the entire thing to json asynchronously along with the stream event numbers specific to that state and save the file to s3. On startup we can restore the state for all instances and catchup on the remaining few events. The best part is that the devs love being able to just use LINQ on all their "database" queries. We do however have to sometimes write these mappings to be lean to fit in memory for tens of millions of entries, such as only one property we use for a query, then we do a GET on the full object in elasticsearch.
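
The snapshot-plus-catch-up restore can be sketched like this (a Python stand-in; the JSON layout and the key/value event format are my assumptions, not Salgat's actual system):

```python
import json

def snapshot(state, last_event_no):
    """Serialize the in-memory state together with the stream
    position it reflects (in production this blob would go to S3)."""
    return json.dumps({"last_event_no": last_event_no, "state": state})

def restore(blob, event_log):
    """Load the snapshot, then replay only the events that arrived
    after it was taken."""
    doc = json.loads(blob)
    state, pos = doc["state"], doc["last_event_no"]
    for event_no, (key, value) in enumerate(event_log, start=1):
        if event_no > pos:
            state[key] = value  # apply the missed event
    return state
```

Recording the event number inside the snapshot is what makes the restore exact: replay resumes precisely where serialization stopped.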
[+] slt2021|1 year ago|reply
Redis is needed to share data with other microservices that are possibly written in different languages.

Polyglot teams: when you have a big data pipeline running in Java but need to share data with services written in Node/Python.

If you don't have multiple isolated microservices, then Redis is not needed.

[+] antirez|1 year ago|reply
I believe the issue is that the culture around Redis usage didn't evolve as much as its popularity. Using it memcached-style has many legitimate use cases, but it's a very reductive way to use it. For instance, sorted-set ranking totally changes the dynamics of what you can and can't do with traditional databases. Similarly, large bitmaps that let you retain one-bit real-time information for very fast analytics that would otherwise be very hard to do are another example. Basically, Redis helps a lot more as the company culture around it grows, more patterns are learned, and so forth. But in this regard a failure on the Redis (and my) side is that there isn't a patterns collection book: interviewing folks who handled important use cases (think of Twitter) to understand the wins, the exact usage details, and the data structure choices. Even just learning the writable cache pattern totally changes the dynamics of your Redis experience.
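
To make those two patterns concrete, here is a pure-Python model of the semantics (Redis itself implements sorted sets as a skiplist so `ZADD`/`ZREVRANK` stay O(log n); this class is just an illustration of the behavior, not of Redis internals), plus the one-bit analytics trick:

```python
class Leaderboard:
    """Models ZADD / ZREVRANK: members with scores, ranked high-to-low."""
    def __init__(self):
        self.scores = {}

    def zadd(self, member, score):
        self.scores[member] = score

    def zrevrank(self, member):
        # rank 0 = highest score; Redis answers this without a full sort
        ordered = sorted(self.scores.values(), reverse=True)
        return ordered.index(self.scores[member])

# The bitmap trick: one bit per user id, population count for analytics
# (SETBIT / BITCOUNT in Redis). Duplicate visits cost nothing.
active = 0
for user_id in (3, 7, 3):
    active |= 1 << user_id
daily_actives = bin(active).count("1")  # distinct users seen today
```

Doing either of these with a traditional row-per-update table tends to mean sorting or aggregating at read time, which is exactly the dynamic the comment is pointing at.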
[+] kshitij_libra|1 year ago|reply
Do you plan to write the book ? I’d like to read it
[+] dinobones|1 year ago|reply
It’s not about Redis vs not Redis, it’s about working with data that does not serialize well or lend itself well to extremely high update velocity.

Things like: counters, news feeds, chat messages, etc

The cost of delivery for doing these things well with a LSM based DB or RDB might actually be higher than Redis. Meaning: you would need more CPUs/memory to deliver this functionality, at scale, than you would with Redis, because of all the overhead of the underlying DB engine.

But for 99% of places that aren’t FAANG, that is fine actually. Anything under like 10k QPS and you can do it in MySQL in the dumbest way possible and no one would ever notice.

[+] daneel_w|1 year ago|reply
"But for 99% of places that aren’t FAANG, that is fine actually. Anything under like 10k QPS and you can do it in MySQL in the dumbest way possible and no one would ever notice."

It's not fine. I feel like you're really stretching it thin here in an almost hand-waving way. There are so many cases at far smaller scale where latency is still a primary bottleneck and a crucial metric for valuable and competitive throughput, where the definitively higher latency of pretty much any comparable set of operations performed in a DBMS (like MySQL) will result in large performance loss when compared to a proper key-value store.

An example I personally ran into a few years ago was a basic antispam mechanism (a dead simple rate-limiter) in a telecoms component seeing far below 10k items per second ("QPS"), fashioned exactly as suggested by using already-available MySQL for the counters' persistence: a fast and easy case of SELECT/UPDATE without any complexity or logic in the DQL/DML. Moving persistence to a proper key-value store cut latency to a fraction and more than doubled throughput, allowing for actually processing many thousands of SMSes per second for only an additional $15/month for the instance running Redis. Small operation, nowhere near "scale", huge impact to performance and ability to process customer requests, increased competitiveness. Every large customer noticed.
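
The rate limiter described is roughly a fixed-window counter; a local-dict sketch of the logic (in Redis this is typically `INCR` plus `EXPIRE`; the dict here is only a stand-in, and the key shape is my assumption):

```python
import time

class RateLimiter:
    """Fixed-window rate limiter: count sends per (sender, window),
    reject once the window's count hits the limit."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}

    def allow(self, sender, now=None):
        now = time.monotonic() if now is None else now
        window_id = int(now // self.window)   # which window are we in?
        key = (sender, window_id)
        count = self.counters.get(key, 0)
        if count >= self.limit:
            return False                      # over the limit: reject
        self.counters[key] = count + 1
        return True
```

Whether this lives in a local dict, Redis, or MySQL, the operation is one read-modify-write per message, which is why per-operation latency dominates throughput at these rates.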

[+] packetlost|1 year ago|reply
I've only ever seen Redis used in two scenarios: storing ephemeral cache data to horizontally scale Django applications and for ephemeral job processing where the metadata about the job was worthless.

I reevaluated it for a job processing context a couple of years ago and opted for websockets instead because what I really needed was something that outlived an HTTP timeout.

I've never actually seen it used in a case where it wasn't an architecture smell. The codebase itself is pretty clean and the ideas it has are good, but the idea of externalizing datastructures like that just doesn't seem that useful if you're building something correctly.

[+] gytisgreitai|1 year ago|reply
Exactly. Lots of people read posts by companies doing millions of QPS and then decide that they need Redis, Kafka, Elastic, NoSQL, etc. right from the start. And that complicates things. We are currently at 500k RPS scale and have maybe a handful of use cases for Redis, and it works great
[+] hinkley|1 year ago|reply
I worked for a company that had enough customers that AWS had to rearrange their backlog for cert management to get us to come on board, and our ingress didn’t see 10,000 req/s. We put a KV store in front of practically all of our backend services though. We could have used Redis, but memcached was so stable and simple that we just manually sharded by service. We flew too close to the sun trying to make the miss rate in one of the stores a little lower and got bit by OOMKiller.

By the time it was clear we would have been better off with Redis’ sharding solution the team was comfortable with the devil they knew.

[+] lr4444lr|1 year ago|reply
100% this. Also, is it data whose scale and speed is more important than its durability?

I actually agree with the author that Redis was not the right solution for the situations he was presented with, but he's far from proving it is not the solution for a whole host of other problems.

[+] karmakaze|1 year ago|reply
Even then you can do a lot of things to spread write contention with an RDBMS.

e.g. MySQL 8.0.1+ adds the SKIP LOCKED modifier to SELECT ... FOR UPDATE.

Then you can increment the first available row, otherwise insert a new row. On read aggregate the values.
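
The slot idea can be illustrated without SKIP LOCKED at all (SQLite here purely so the sketch is self-contained; a random slot pick stands in for "first unlocked row", which is what SKIP LOCKED actually gives you in MySQL/Postgres):

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counter_slots (slot INTEGER PRIMARY KEY, value INTEGER)"
)
SLOTS = 8  # more slots = less contention on any single row
conn.executemany(
    "INSERT INTO counter_slots VALUES (?, 0)", [(i,) for i in range(SLOTS)]
)

def increment():
    # writers spread across slots, so concurrent transactions rarely
    # fight over the same row lock
    slot = random.randrange(SLOTS)
    conn.execute(
        "UPDATE counter_slots SET value = value + 1 WHERE slot = ?", (slot,)
    )

def read_total():
    # reads aggregate across all slots
    return conn.execute("SELECT SUM(value) FROM counter_slots").fetchone()[0]
```

Writes scale out across rows; reads pay a small aggregation cost, which is usually the right trade for a hot counter.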

[+] 0xbadcafebee|1 year ago|reply
Software development today is largely just people repeating what other people do without thinking. Which is how human culture works; we just copy what everyone else is doing, because it's easier, and that becomes "normal", whatever it is.

In the software world in the mid 00's, the trend started to work around the latency, cost and complexity of expensive servers and difficult databases by relying on the speed of modern networks and RAM. This started with Memcached and moved on to other solutions like Redis.

(this later evolved into NoSQL, when developers imagined that simply doing away with the complexity of databases would somehow magically remove their applications' need to do complex things... which of course it didn't, it's the same application, needing to do a complex thing, so it needs a complex solution. computers aren't magic. we have thankfully passed the hype cycle of NoSQL, and moved on to... the hype cycle for SQLite)

But the tradeoff was always working around one limitation by adding another limitation. Specifically it was avoiding the cost of big databases and the expertise to manage them, and accepting the cost of dealing with more complex cache control.

Fast forward to 2025 and databases are faster (but not a ton faster) and cheaper (but not a ton cheaper) and still have many of the same limitations (because dramatically reinventing the database would have been hard and boring, and no software developer wants to do hard and boring things, when they can do hard and fun things, or ignore the hard things with cheap hacks and pretend there is no consequence to that).

So people today just throw a cache in between the database, because 1) databases are still kind of stupid and hard (very very useful, but still stupid and hard) and 2) the problems of cache complexity can be ignored for a while, and putting off something hard/annoying/boring until later is a human's favorite thing.

No, you don't need Redis. Nobody needs Redis. It's a hack to avoid dealing with stateless applications using slow queries on an un-optimized database with no fast read replicas and connection limits. But that's normal now.

[+] edoceo|1 year ago|reply
> hype cycle for SQLite

Drop Redis, replace with in-memory SQLite.

But for real, the :memory: feature is actually pretty awesome!
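
For what it's worth, in Python's stdlib `sqlite3` the `:memory:` mode really is a one-liner:

```python
import sqlite3

# an in-memory SQLite database: no server, no network hop, full SQL
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value TEXT)")
db.execute("INSERT INTO cache VALUES ('greeting', 'hello')")
row = db.execute("SELECT value FROM cache WHERE key = 'greeting'").fetchone()
```

The catch versus Redis is that the database lives inside one process; you get SQL and speed, not cross-service sharing.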

[+] cmbothwell|1 year ago|reply
This hits at the true nature of the problem which has _nothing_ to do with Redis at all (which is a fine piece of technology written by a thoughtful and conscientious creator) and has everything to do with the fact that our industry at large encourages very little thinking about the problems we are trying to solve.

Hence, fads dominate. I hate to sound so cynical but that has been my experience in every instance of commercial software development.

[+] briandear|1 year ago|reply
Recently just left a MongoDB project. A total nightmare.
[+] bassp|1 year ago|reply
I agree with the author 100% (the TanTan anecdote is great, super clever work!), but.... sometimes you do need Redis, because Redis is the only production-ready "data structure server" I'm aware of

If you want to access a bloom filter, cuckoo filter, list, set, bitmap, etc... from multiple instances of the same service, Redis (slash valkey, memorydb, etc...) is really your only option

[+] jasonthorsness|1 year ago|reply
Yes, while the default idea of Redis might be to consider it a key/value cache, the view of the project itself is definitely about being a "data structure server" - it's right at the top of the https://github.com/redis/redis/blob/unstable/README.md and antirez has focused on that (I can't find one quote I am looking for specifically but it's evident for example in discussion on streams https://antirez.com/news/114). Although I've definitely seen it be used just as a key/value store in the deployments I'm familiar with ¯\_(ツ)_/¯
[+] e_hup|1 year ago|reply
All of those can be serialized and stored in an RDBMS. You don't need Redis for that.
[+] kflgkans|1 year ago|reply
You might not need a cache. In my previous company (~7 years) all the teams around me were introducing caches left and right and getting into a lot of complexity and bugs. I persevered and always pushed back on adding caches to my team's apps, instead focusing on improving the architecture and seeking other performance improvements. I can proudly say my teams stayed cache-free for those 7 years.
[+] superq|1 year ago|reply
The issues that I have with Redis are not at all its API (which is elegant and brilliant) or even its serialized, single-core, single-threaded design, but its operational hazards.

As a cache or an ephemeral store for things like throttling/rate limiting, lookup tables, or perhaps even sessions, it's great; but it's impossible to rely on the persistence options (RDB, AOF) for production data stores.

You usually only see this tendency with junior devs, though. It might be a case of "when all you have is a hammer, everything looks like a nail": someone discovers Redis (or MongoDB, during its hype cycle ten years ago) and it seems in perfect alignment with their language's datatypes. But perhaps this is mostly because junior devs don't have as many production-ready databases (from SQL like PostgreSQL, CockroachDB, Yugabyte to New/NoSQL like ScyllaDB, YDB, Aerospike) to fall back on.

Redis shines as a cache for small data values (probably switch to memcached for larger values; it's a simpler key-value store but generally 3 to 10 times faster for that narrower use case, although keep an eye on memory fragmentation and slab allocation).

Just think carefully before storing long-term data in it. Maybe don't store your billing database in it :)

[+] noisy_boy|1 year ago|reply
I have seen horrifying use of Redis. I inherited the maintenance of an application whose original developer implemented his own home-grown design to manage relationships between different types of key-value pairs, pretending they were tables, including cross-referencing logic; it took me a week just to add test cases with sufficient logging to reveal the "schema" and mutation logic. All this with the non-technical manager wondering why it took so long to make a change that directly depended on understanding this. To top it all, the code was barely better than spaghetti, with fewer than ten lines of comments across maybe 5k LOC. The irony was that this was not a latency-sensitive application -- it did data quality checks and could have been implemented in a much cleaner and more flexible way using, e.g., PostgreSQL.
[+] progbits|1 year ago|reply
Redis as an ephemeral cache is OK, but nothing more.

Redis as transactional, distributed and/or durable storage is pretty poor. Their "active-active" docs on conflict resolution, for example, don't fill me with confidence given there is no formalism, just vague examples. And this comes from people who not only don't know how to do distributed locks, but refuse to learn when issues are pointed out to them: https://martin.kleppmann.com/2016/02/08/how-to-do-distribute...

Every time I find code that claims to do something transactional in Redis which is critical for correctness, not just latency optimization, I get worried.

[+] dimgl|1 year ago|reply
I'm really surprised that the pendulum has swung so far in the other direction that people are recommending not to use Redis.

Sure, don't introduce a data store into your stack unless you need it. But if you had to introduce one, Redis still seems like one of the best to introduce? It has fantastic data structures (like sorted sets, hash maps), great performance, robust key expiry, low communication overhead, low runtime overhead... I mean, the list goes on.

[+] hot_gril|1 year ago|reply
Or a simple KV cache, which is pretty often something you need. Sure you can use Postgres for this if you really don't want another dep, but Redis or Memcached is better suited for it and probably cheaper.
[+] ks2048|1 year ago|reply
Yeah, my first thought was, “I don’t need redis, but I want redis”.
[+] igortg|1 year ago|reply
I followed with this rationale in a small project and opted for PostgreSQL pub/sub instead Redis. But I went through so much trouble correctly handling PostgreSQL disconnections that I wonder if Redis wouldn't be the better choice.
[+] bdcravens|1 year ago|reply
Another category is using Redis indirectly via dependencies. For example in Rails, Sidekiq is a common background job library. However, there are now Postgresql-backed options (like GoodJob and the baked-in Solid Queue, which supports other RDBMSes as well)
[+] alberth|1 year ago|reply
> A single beefy Redis (many cores, lots of RAM type of machine) should be able to handle the load

I thought Redis was single-threaded, running on a single core.

Having multiple cores provides no benefit (and arguably could hurt, since large multicore systems typically have lower clock speeds)

[+] mannyv|1 year ago|reply
Caching is a funny thing. Just like anything you need to understand if and why you need it.

And one other thing is you should be able to fall back if your cache is invalidated.

In our case we keep a bunch of metadata in redis that’s relatively expensive (in cost and speed) to pull/walk in realtime. And it needs to be fast and support lots of clients. The latter sort of obviates direct-to-database options.

[+] lukaslalinsky|1 year ago|reply
For years I tried to use Redis as a persistent data store, and I was only ever dissatisfied, with bad experiences with both Sentinel and Cluster. Most of my service outages were linked to Redis replication breaking.

Then I decided to give up and use it only as an ephemeral cache. I have a large number of standalone Redis instances (actually, they are now Valkey), no storage, only memory, with an Envoy proxy on top of them for monitoring and sharding. And I'm really happy with it: I store hundreds of GBs of data there; if one goes down, only a small part of the data needs to be reloaded from the primary source; and with the Envoy proxy, applications see it as a single Redis server. I considered just replacing it with memcached, but the Redis data model is richer, so I kept using it, just without expecting anything put there to actually be stored forever.

[+] sgarland|1 year ago|reply
> 'Why can't we just store this data on the shards in PostgreSQL, next to the swipes?'. The data itself would be microscopic in comparison and the additional load would also be microscopic in comparison to what these servers were already doing.

I'm assuming based on the rest of the article that the author and team knew what they were doing, but if you aren't familiar with Postgres' UPDATE strategy and HOT updates [0], you should familiarize yourself before attempting this, otherwise, you're going to generate a massive amount of WAL traffic and dead tuples at scale.

[0]: https://www.postgresql.org/docs/current/storage-hot.html

[+] simonw|1 year ago|reply
If you work at a company where teams keep on turning to Redis for different features, there's a chance that it's an indication that the process for creating new database tables in your relational store has too much friction!
[+] harrall|1 year ago|reply
Two things:

I think people forget (or don't know) that adding a data storage system to your architecture also involves management, scaling, retention, and backup. It's not free.

And second, sometimes you do need to invest in storage to permit failover or to minimize latency, but people do it for traffic when they really have little traffic. A server from 20 years ago could push a lot of traffic, and systems have only gotten beefier.