Readyset: A MySQL and Postgres wire-compatible caching layer

[+] BrentOzar|2 years ago|reply

In the Microsoft SQL Server space, several of these vendors have come and gone. My clients have been burned badly by 'em, so a few quick lessons learned:

Be aware that there are hundreds of open issues[0] and dozens of pull requests [1], some of which involve clients being unable to connect or not supporting all components of the SQL language. Just because your database supports something, doesn't mean your caching layer will.

It gets really ugly when a new version of your database comes out, with brand new features and language enhancements, and the caching layer doesn't support it. It may take months, or in some cases years, before the caching layer is feature-complete with the underlying database. If you want to use some of those language enhancements, then your app may have to maintain two connection strings - one for the caching layer, and one for direct database queries that the caching layer doesn't support.

Your support teams need to learn how to diagnose problems with the caching layer. For example, this issue [2] talks about the level of work involved with understanding why newly inserted data isn't showing up in selects.

I hope they succeed and deliver the concept, because it's one of the holy grails of databases.

[0]: https://github.com/readysettech/readyset/issues [1]: https://github.com/readysettech/readyset/pulls [2]: https://github.com/readysettech/readyset/issues/39

[+] hobofan|2 years ago|reply

I don't think it's fair to hold the number of open issues and pull requests against them. Looking through them for a minute, it looks like 95%+ are from their own team members, with a good chunk of the issues being "low priority" issues. So you are just seeing the typical ever-growing backlog that is normally in a private JIRA instance.

Having said that, the way they work with pull requests is unlike anything else I've seen. I see that they are using a merge bot, but apart from that all branch names are completely illegible. As a lot of the team seems to be present in this thread, it would be interesting to get some details about that.

[+] gwbas1c|2 years ago|reply

They use the database's replication API. In theory, it should avoid situations where the cache doesn't understand query syntax.

But, what I'd worry about are situations where an application needs to cache an object that's expensive to build. (And perhaps expensive to build because the programmer doesn't understand how to optimize SQL.)

[+] gilbertbw|2 years ago|reply

Regarding the new database version issue, I wonder why the caching layer can't just pass any query it is unable to process on to the underlying database?

This would be more complex if the feature you are using does not return a normal table of results back (e.g. the pub/sub support in Postgres).

[+] timsuchanek|2 years ago|reply

This is one of the deepest deep tech startups I've seen in a long time. I had the pleasure to meet some of the folks at RustConf in Portland.

Readyset is basically reimplementing a full database, at the absolute bleeding edge of db research, enabling global partial replication of any kind of data.

A solution desperately needed, as databases grow.

You can think of it as an intelligent LRU cache in front of your database. An important step towards fast globally distributed applications.

I hope this project will get more publicity and adoption - it's very well deserved.

[+] dpcx|2 years ago|reply

Pretty sure that this is the database that Jon Gjengset[0] was working on as part of his thesis project. There have been several videos shared by him during talks about the system. It's a really interesting concept.

edit: Here's[1] a video where he talks about the concept

[0]: https://www.youtube.com/@jonhoo [1]: https://www.youtube.com/watch?v=GctxvSPIfr8

[+] thom|2 years ago|reply

Someone knowledgeable might know: is this just incremental view updates? To what extent is the cache intelligent if parameters, where clauses, or aggregates change?

I really love this space and have been impressed with Materialize, but even if you can make some intermediate state incremental, if your workload is largely dynamic you end up needing to jump the whole way to OLAP platforms. I’m hopeful that we’re closer and closer to having our cake and eating it here, and that the operational data warehouse is only round the corner.

[+] hobofan|2 years ago|reply

They have a bit about their technical foundation here[0].

Given that Readyset was co-founded by Jon Gjengset (but has apparently since departed the company), who authored the paper on Noria[1], I would assume that Readyset is the continuation of that research. I wouldn't call that "just" incremental view maintenance, as it's a lot more involved than the simplest implementation of IVM (though obviously that is the end-goal).

So it shares some roots with Materialize. They have a common conceptual ancestry in Naiad, where Materialize evolved out of timely-dataflow.

[0]: https://docs.readyset.io/concepts/streaming-dataflow

[1]: https://jon.thesquareplanet.com/papers/osdi18-noria.pdf

[2]: https://dl.acm.org/doi/10.1145/2517349.2522738

[3]: https://github.com/TimelyDataflow/timely-dataflow

[+] lukoktonos|2 years ago|reply

Readyset auto-parameterizes cached queries similar to a prepared statement. If you run the same query with different parameters, it will be routed to the cache. The first time a parameter set is queried, it will be a cache miss and trigger an "upquery" to populate the cache, after which that set of parameters will be served from the cache.

Different where clauses (sets of parameters) would map to different query-caches and currently need to be cached separately.

Aggregates supported by Readyset[1] will also be automatically maintained, but depending on the query, they may be handled by post-processing after retrieving results from a cache

[1] https://docs.readyset.io/reference/features/queries#aggregat...

[+] alwaysrusty|2 years ago|reply

From a tech perspective, this is really cool. From a use case perspective, could someone help me understand why a developer would adopt something like this over a database like Clickhouse, outside of some fintech use cases where milliseconds of latency really matter? I'd be worried about introducing an additional point of failure to the data stack. And, if this is like Materialize, I'd be worried about this not suppporting ad hoc queries -- only precomputed ones.

[+] Culonavirus|2 years ago|reply

> From a use case perspective, could someone help me understand why

Imagine a legacy system that has a method that dynamically query-builds a massive query based on several method parameters (say 10 or 20), this method is used in two dozen places or more. The underlying tables are used in a million other places. Rewriting the query building method or, even worse, changing the underlying data model, would be expensive.

Now imagine that you could speed up some of these queries WITHOUT changing your code or model or rolling your own cache solution (the invalidation of which is always a real PITA). All this basically for free.

I don't think "why a developer would adopt something like this over a database like Clickhouse" is the right take. They do not compete. It's not about "adopting a database", that decision has been typically made a long time ago in a galaxy far far away and by someone else than you. Of course unless you work on green field projects or small enough projects that "adopting a different database" is even a question. I'd love some of that stuff :) ... one of the biggest systems I worked on for several years had close to 700 mysql tables (yea, not colums, tables), basically anything that was anywhere near the core of the system took ages to change and test. I can't possibly imagine the investment it would require to move that system from mysql to something else while not making a billion bugs along the way. I could imagine using something like Readyset, especially if it handles cache invalidation for you based on underlying model data changes.

[+] hodgesrm|2 years ago|reply

> rom a use case perspective, could someone help me understand why a developer would adopt something like this over a database like Clickhouse, outside of some fintech use cases where milliseconds of latency really matter?

Concurrency. ClickHouse works best with a relatively small number of concurrent queries: hundreds to low thousands, not 10s of thousands or more. That allows each query to hog more resources and get done quickly.

[+] larsnystrom|2 years ago|reply

They have a pretty good writeup on why you’d want to use this here: https://blog.readyset.io/dont-use-kv-stores/

[+] darkwater|2 years ago|reply

To be honest, I see more like a write-up about the shortcomings of using a classic cache system in front of your DB rather than about what Ready set does. Yes, it explains it but it could be better (for example, practical example of how you set a query to be cached, an overview how replication stream ingestion works under the hood etc)

[+] 10000truths|2 years ago|reply

This sounds like it has heavy overlap with IVM. How does Readyset distinguish itself from existing solutions like pg_ivm or Materialize?

[+] jitl|2 years ago|reply

ReadySet descends from Noira (https://www.usenix.org/conference/osdi18/presentation/gjengs...), which I view as the next generation of incremental dataflow technology after Naiad/Differential Dataflow/Materialize.

The key difference is that Noira/ReadySet supports partial materialization and can reconstruct data on-demand, whereas Naiad/Differential Dataflow/Materialize must keep a complete materialization up to date. ReadySet can partially evict parts of the data flow and bring them back later if needed.

In practical terms if you have a materialized view you want to maintain, in ReadySet you pay O(part of the materialized data flow you actually need), which is less than O(entire materialized view) you’d pay with Materialize.

[+] exabrial|2 years ago|reply

One of the things we love about using JPA (We use EclipseLink) is it comes with caching, for free, and it’s transparent. You can mark any field as a cache index and it automatically tries the cache first. Updates are published and loaded into every nodes cache automatically, and you get fallback protection in the form of incremental version numbers on rows.

The one thing it can’t handle however is range update queries or native queries that perform updates.

You can just avoid them in your architecture… OR maybe this is the solution we’ve been looking we’ve been looking for! definitely going to give this a spin!

Documentation looks very complete and I like there’s a UI to view the query cache.

[+] zer00eyz|2 years ago|reply

I dont understand what the use case is for this.

If I have a front end, I would hope that the formated response is what were caching. Be that HTML or JSON.

If I cant read from that cache then I should be reading from fresh data all together? right?

[+] gwbas1c|2 years ago|reply

Ever add (or inherit) the server-side part of an application that uses Redis or Memcache? The data is denormalized: When you do an update/insert/delete into the SQL side of things, you need to do a corresponding change in Redis / Memcache. All of your queries end up being something like: Try Redis/Memcache, if the data isn't present, query the database and insert into Redis/Memcache.

It (Redis/Memcache caching) adds a huge amount of complexity to your application, and the risk of defects is very, very real.

ReadySet basically gives you a magic "stick this thing between your database and application and we'll do the caching for you." It totally eliminates a time consuming and error-prone part of your application.

If you want to, they go into details here: https://blog.readyset.io/dont-use-kv-stores/

[+] hobofan|2 years ago|reply

You are reading from "fresh" (though only eventually consistent) data. For many public facing queries that's enough.

After an initial query seen by their proxy, you can configure all future queries of the same kind to be pre-computed.

So I think it makes more sense to think of it like an auto-updating materialized view available with the click of a button rather than a cache.

There are some more under-the hood details here: https://docs.readyset.io/concepts/overview#how-does-readyset...

[+] joeatwork|2 years ago|reply

This isn’t an open source project (which isn’t a bad thing! Just a non-obvious thing if you don’t scroll down their whole readme)

[+] gvsg-rs|2 years ago|reply

[deleted]

[+] ltbarcly3|2 years ago|reply

Trying to add transparent caching to a transactional database is just a bad idea and cannot work. Anyone who says it works for them is just in the period after putting it in place and before when they realize why it cannot work.

If it was possible to just slap a cache in between you and the db and magically make shit fast, DB vendors would have done that 20 years ago. Billions of dollars a year is put into relational db development. Papers are published every week, from theoretical ways to model and interact with data to practical things like optimizing query execution plans.

Unless Readyset can point to a patent or a paper that has fundamentally revolutionized how database will be built from today forward it is going to be crap and will burn you.

[+] KingOfCoders|2 years ago|reply

This might be good tech and a good company.

Once we used a distributed caching system in a startup which was open source. Then the open source version got cut features we needed, so we bought a license. Then the startup was bought up by a large software company and the license costs went 10x YoY with a one week notice. As our migration away from this tech was not done, because it was very complicated and tied into our application we had to pay. Luckily we also had been bought and the very large costs were not a problem. I would never again use something from a company that is crucial to our operations.

[+] marceloaltmann|2 years ago|reply

That is a good point on the application changes. What is appealing from Readyset is that it does not require you to change your application code. You can just change your database connection string to point to it and it will start to proxy your queries to your database. From there you can choose what you want to cache, and everything else (writes, non supported queries, non cached read queries) will be automatically proxied to your database.

[+] qaq|2 years ago|reply

But the read side is already fairly trivial to scale with read replicas

[+] jitl|2 years ago|reply

Some queries might be too slow at p95 even on a read replica with no other clients. Those kinds of queries can benefit greatly from a materialized view, and incremental view maintenance as data changes.

[+] iknownothow|2 years ago|reply

Not denying your point but in some applications (which aren't outliers) a good cache can increase number of requests served 100x. Achieving those with read-replicas is easy and possible but can be quite expensive.

A read-replica doesn't give you perfect transactional guarantees as well. Read operations to the replica after a write operation to the leader might still give you stale data (lagging by a few milliseconds).

[+] eatonphil|2 years ago|reply

Yes to a degree, but not if you require/value consistency.

[+] AYBABTME|2 years ago|reply

Love the work. Looks quite similar to what PlanetScale Boost does[1]. Basically the same but as a front-end to someone's existing database? (disclaimer: I work at PS).

[1]: https://planetscale.com/blog/how-planetscale-boost-serves-yo...

[+] Jonhoo|2 years ago|reply

They're both based on the techniques outlined in the Noria paper (https://www.usenix.org/conference/osdi18/presentation/gjengs...) and my thesis (https://jon.thesquareplanet.com/papers/phd-thesis.pdf), so not terribly surprising they carry some resemblance :p

[+] giovannibonetti|2 years ago|reply

I imagine a good use case for this at its current stage would be for powering up a monitoring dashboard that runs ad-hoc queries against your operational DB. I've seen this situation in a previous Fintech company I worked at, where we had some people staring at dashboards all day long looking for issues in any of the subsystems.

[+] mdaniel|2 years ago|reply

I just wanted to give a high five for having Jepsen tests for this: https://github.com/readysettech/readyset/tree/stable-240117/...

[+] notapoolshark|2 years ago|reply

Slight tangent, but this reminds me of discussions I've seen in the Postgres email servers about native support for real-time materialized views. Does anyone know if we can expect to see something like this in a future version of Postgres?

[+] vira28|2 years ago|reply

I don't know about the support in core but there is this extension https://github.com/sraoss/pg_ivm

[+] redwood|2 years ago|reply

Anyone successfully using? There are a few other services out there like PolyScale. It will be interesting to see if any of these introduce some form of write support over time

[+] fastest963|2 years ago|reply

What are some advantages to ReadySet versus read replicas from YugabyteDB or CockroachDB? A downside is that it appears to require a separate cloud subscription.

[+] marceloaltmann|2 years ago|reply

You can deploy on your own via their .deb packages - https://readyset.io/download

The advantages is that reading from a cache will be faster than from a read replicas. The benefits increase even further if you have to perform computation on the fetched data.

[+] potamic|2 years ago|reply

> ReadySet is licensed under the BSL 1.1 license, converting to the open-source Apache 2.0 license after 4 years.

What does this mean?

[+] martypitt|2 years ago|reply

Not the OP, but BSL is Source License that gives you access to use the source, with specific restrictions.

Most commonly the restrictions prevent you from launching a competing offering. In their case, you can't offer database-as-a-service using their code.

BSL typically also restricts production use - though it looks like ReadySet has relaxed that restriction.

Finally, BSL reverts to a traditional open source license after a set period of time - in their case Apache 2 after 4 years. This means that code written today is licensed under BSL for 4 years, then automatically reverts to Apahce 2 thereafter.

[+] PeterZaitsev|2 years ago|reply

The outdated security hole ridden useless code will eventually become Open Source.

In practice I would count on either using software in compliance with BSL restrictions (which are generous) or seek commercial license

69 comments