
An interactive intro to CRDTs

922 points| jakelazaroff | 2 years ago |jakelazaroff.com

130 comments

[+] insanitybit|2 years ago|reply
So far this is probably the best "intro to CRDTs for a developer" I've read. I built a product around CRDTs, essentially, and my god was it painful trying to engage with. Showing actual code, explaining that `merge` is the fundamental operation, etc, is really all a developer needs to know IMO.

Also, the fact that we always use text editing as the de-facto solution is so weird to me since that problem is both niche and extremely complex. IMO a better example would be something like "Can this person drink alcohol?". Age moves in one direction so it has a simple merge function:

    def set_age(self, new_age: int):
        self.age = max(self.age, new_age)
A property of this is that if I query your age and if you're 21 I can cache that age forever. You'll only ever be >= 21, after all. If I add new queries that care about you being 25 (for a hotel) I can satisfy the "drinking age" queries from a stale cache and then retrieve the true value (<25) when I need to check if you can book a hotel.

This means you can have distributed caches without invalidation logic. A pretty amazing property since cache invalidation is a hugely complex problem and has seriously negative performance/ storage implications.

It also means you can drop writes. If my system gets information that a person was 18, but that information is out of date, I can drop that write, and I can do so by examining the cache and viewing stale information, only checking the real value if the cache value is < 18.

This whole thing lets you push computation to the edge, drop expensive writes, ignore any cache invalidation logic, cache values forever, potentially answer queries from stale cache values, etc.

Anyway, kudos for the writeup. I skimmed the second half but the first half was great and the second half looked legit.
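A minimal sketch of the max-merge idea from the comment above, written as a standalone register. The names `MaxRegister` and `can_answer_from_cache` are hypothetical, chosen only for illustration:

```python
class MaxRegister:
    """Grow-only register whose merge function is max()."""

    def __init__(self, value: int = 0):
        self.value = value

    def merge(self, other_value: int) -> None:
        # max() is commutative, associative and idempotent, so replicas
        # converge regardless of delivery order or duplicate deliveries.
        self.value = max(self.value, other_value)


def can_answer_from_cache(cached_age: int, threshold: int) -> bool:
    # A stale cached value is a lower bound on the true value, so a
    # ">= threshold" query can be answered from cache once the cache passes it.
    return cached_age >= threshold


replica_a = MaxRegister(21)
replica_b = MaxRegister(18)
replica_a.merge(replica_b.value)  # the out-of-date 18 has no effect
replica_b.merge(replica_a.value)  # replica_b catches up to 21
```

With a cached value of 21, `can_answer_from_cache(21, 21)` holds without contacting the source, while `can_answer_from_cache(21, 25)` does not, so only the hotel query needs the true value.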

[+] kevincox|2 years ago|reply
> the fact that we always use text editing as the de-facto solution is so weird to me since that problem is both niche and extremely complex

The reason we use that is because it is complex enough to show the problems that CRDTs solve. I would argue that this painting example is too simple. The core merge loop is:

    if pixel.created_at < newPixel.created_at {
        pixel = newPixel;
    }
This is maybe good as a first step, but I don't think it is enough to even really be called an "Intro". A last-write-wins register is trivial.

Simple text insertion with a simple "insert after" CRDT is not much more complicated, but it involves things like generating unique IDs without communication and resolving conflicts with some sort of globally consistent ordering.
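For reference, the last-write-wins loop quoted above can be sketched as a register in Python. The `peer_id` tie-break is an assumption added here so that equal timestamps resolve identically on every replica; it is one simple way to get a globally consistent ordering, not necessarily what the article does:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: str
    timestamp: int
    peer_id: str  # assumed unique per replica

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Compare (timestamp, peer_id) tuples so that ties on timestamp
        # are broken the same way everywhere.
        if (other.timestamp, other.peer_id) > (self.timestamp, self.peer_id):
            return other
        return self


a = LWWRegister("red", timestamp=5, peer_id="alice")
b = LWWRegister("blue", timestamp=5, peer_id="bob")
merged_ab = a.merge(b)
merged_ba = b.merge(a)
```

Both merge orders produce the same register, which is the property the tie-break exists to guarantee.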

[+] whalesalad|2 years ago|reply
I realize this is a straw man argument/example - but it feels hairy to me. So much fuss about age and cache invalidation ... age should not be persisted anywhere. When you make age a calculated property from birthday it is never inaccurate or stale or wrong. "set age" should not be a possible operation in any system imho.
[+] richardwhiuk|2 years ago|reply
Somebody typos their age to be too high, and then you cache it forever.

You have to be careful here.

[+] alephnan|2 years ago|reply
> text editing as the de-facto solution is so weird to me since that problem is both niche and extremely complex.

My first foray into collaborative editing was for my text editor. Indeed, things get superlinearly harder as you add basic editor functionality such as deleting and replacing, especially when those operations span multiple lines.

Instead, I reached for Fraser’s differential syncing. https://neil.fraser.name/writing/sync/. There’s a lot of ambiguity and nuances in various versions of the prose and white paper that I could never really flesh out.

I think what anyone attempting to relay a collaborative editing algorithm needs to do is start with the simplest scenarios: append-only / monotonically increasing data.

[+] cabalamat|2 years ago|reply
> Age moves in one direction so it has a simple merge function:

Better to use date of birth as that doesn't change at all.

[+] steve_adams_86|2 years ago|reply
The way you describe that makes me wonder if state machines would be natural tools to express CRDT “states”.
[+] Racing0461|2 years ago|reply
What if I set the wrong DOB and I need to change it?
[+] braden-lk|2 years ago|reply
I’ve built a successful business from a TTRPG campaign manager (LegendKeeper) using CRDTs, specifically the Yjs kind. It’s been great, and the UX of CRDT-powered stuff is excellent. Zero latency, eventually consistent; overall users love the performance and offline capability.

That said, there are a lot of trade-offs. Some things that are easy in a traditional server-client model become difficult in the local-first context that CRDTs provide. Role-based authorization is hard, data model changes must be done additive (never mutative), and knowing what state a client is in when debugging is tough too, without a lot of “full-surveillance”-level tooling. Also, with the automatic, bidirectional syncing a lot of CRDT architectures afford you, a bug in production that corrupts data can virally propagate and cause a huge headache.

Investor-funded services like Liveblocks are starting to pop up that promise to make this stuff easier, but as an indie I find them expensive; I’m sure they’re a great value for big corps or funded teams though. Rolling my own infrastructure for Yjs has been taxing, but I’ve learned a lot, and have been able to tailor it exactly to my needs.

[+] paulgb|2 years ago|reply
> knowing for sure what state a user’s client is in when debugging is tough too

My team has built an open-source debugger for Yjs that might interest you (docs: https://y-sweet.cloud/advanced/debugger)

You mention the investor-funded services that pop up to make this stuff easier -- our goal with Y-Sweet is to build the same type of DX you’d get from those services, but build it on a fully open-source (MIT) platform with Yjs at the core: https://github.com/drifting-in-space/y-sweet

[+] doctorpangloss|2 years ago|reply
> data model changes must be done additive (never mutative)

Sounds painful. It means your mutative data model changes, which exist, live somewhere else.

[+] Karrot_Kream|2 years ago|reply
Is LegendKeeper built on top of the Websocket Yjs provider? If so, do you run the Websocket server yourself? If not, do you use WebRTC and have you had any STUN/TURN issues with that?

LegendKeeper looks really awesome btw, I might bring this up for my own campaign use. I've been thinking of using Yjs to build some character sheet builders myself which is why I'm asking.

[+] matlin|2 years ago|reply
Evan Wallace (co-founder of Figma) has one of the best visualizations of CRDTs in action https://madebyevan.com/algos/crdt-fractional-indexing/

In practice, most apps will only need Last-Writer-Wins registers and not the more complicated sequence CRDTs that you find in Y.js and Automerge.

We've built an auto-syncing database that uses CRDTs under the hood but never exposes them through the API. So if you want all of the benefits of CRDTs, e.g. an offline-first user experience, check out our project, Triplit!
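The fractional indexing idea behind the linked visualization can be sketched roughly as follows. This is a simplified illustration using exact fractions, not the implementation from that page; `key_between` is a hypothetical helper:

```python
from fractions import Fraction
from typing import Optional


def key_between(left: Optional[Fraction], right: Optional[Fraction]) -> Fraction:
    """Return a sort key strictly between two neighbours (None = open end)."""
    lo = Fraction(0) if left is None else left
    hi = Fraction(1) if right is None else right
    if not lo < hi:
        raise ValueError("left key must sort before right key")
    return (lo + hi) / 2


first = key_between(None, None)     # 1/2
last = key_between(first, None)     # 3/4
middle = key_between(first, last)   # 5/8, inserted between the two

positions = {first: "a", last: "c", middle: "b"}
ordered = [positions[key] for key in sorted(positions)]
```

Inserting between two items never requires renumbering the others. Two peers inserting at the same spot concurrently would compute the same key in this sketch, so a real system needs an extra tie-break on top of it.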

[+] jongjong|2 years ago|reply
Something important to mention when discussing CRDTs is that they are particularly suited for scenarios where clients may go offline often and where it makes sense to resolve conflicts automatically. Not every kind of data lends itself well to automatic conflict resolution as the merged state may not be desirable when all parts are constructed independently without real-time collaborative feedback.

For example, if I have a field which is "color" and one person writes red and the other writes blue, there is no way to automatically resolve that conflict when they both become reconnected. It's physically impossible since the intent cannot be established without the ability to read the minds of both participants. You can't just merge the letters into the word "reblued" nor can you allow one to completely overwrite the other while letting both participants believe that their change was settled when in fact, only one made it through. Often, it's desirable that both participants must be online and better to show one an error message if they're not so that they are not misled into thinking that they're actually changing the system state when in fact their change hasn't been persisted.

I've worked on realtime systems which don't rely on CRDTs. This was a suitable approach in my case since accuracy of the data was paramount and each section of the data was well isolated from one another and offline editing was not required.

[+] auggierose|2 years ago|reply
I swear, HN somehow tracks what I am doing. The last few days I also looked into CRDTs, Automerge, etc, and here we go. Happens so often, it is uncanny.

Here is a good overview article, which has pointers to other articles: https://cacm.acm.org/magazines/2022/11/265835-research-for-p...

To me it seems that while state-based CRDTs are easy to understand, operation-based CRDTs are actually what is used in practice. Furthermore, it seems to me the difference between operation-based Automerge, and operational transform (OT) is actually not that big.

[+] splashdown5|2 years ago|reply
We used CRDTs to build Pennant notebooks (think Jupyter Notebooks with collaborative Google Docs features https://pennant-notebook.github.io/). Getting Yjs to behave for a multi-editor environment took some doing. I highly recommend building your own interface/library for interacting with Yjs and never touching Yjs directly in React itself. The state management and event handler cascade can be incredibly fussy if you don't have a good handle on the whole system.

We've found most multi-user apps running over websocket experience significant degradation in performance in the high teens and low twenties. Beyond that we were able to update nested CRDTs and all presence/user data in one connection with the backend.

TipTap has a great backend called HocusPocus with a well-documented API. The y-websocket backend is already quite good but the support for user tokens isn't there natively. We were actually able to be backend provider agnostic for well into the project. It's a fun ecosystem.

[+] endisneigh|2 years ago|reply
CRDTs seem like one of these things that are mentioned on here frequently, but I haven't seen that many popular apps that use them. Any examples?
[+] unholiness|2 years ago|reply
Once you start adding enough complexity, there will arise cases where the primitives are an awkward place for the merging to happen. There will arise cases where user expectations and the merge function behavior don't agree. There will arise cases where the server can do a better job than the client at applying the change. There will arise cases where you need to undo but the undo function violates the merge function. And as the author freely states, there will arise cases where sending the whole state is prohibitively slow.

Those are really only issues with state-based CRDTs. The fundamental concepts behind operation-based CRDTs vs operational transforms vs bespoke hybrid approaches aren't really different. It's all about determining an unambiguous order, then getting everyone to update their state as if it had been applied in that order. Much less democratic but much more practical.
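The "determine an unambiguous order, then apply in that order" idea can be sketched as below. The `Op` shape and the `(counter, peer_id)` ordering are assumptions made for illustration, not any particular library's format:

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Op:
    counter: int   # logical clock value attached by the sender (assumed)
    peer_id: str   # assumed unique per peer, used to break counter ties
    key: str
    value: str


def apply_all(ops: Iterable[Op]) -> Dict[str, str]:
    state: Dict[str, str] = {}
    # Deduplicate, then replay in one agreed order so every replica
    # ends in the same state no matter how the ops arrived.
    for op in sorted(set(ops), key=lambda o: (o.counter, o.peer_id)):
        state[op.key] = op.value
    return state


ops = [
    Op(1, "alice", "title", "Draft"),
    Op(2, "bob", "title", "Final"),
    Op(2, "alice", "color", "red"),
]
state_1 = apply_all(ops)
# Different arrival order plus a duplicate delivery of the first op.
state_2 = apply_all(list(reversed(ops)) + ops[:1])
```

Both replicas end with the same state because the order is derived from the ops themselves rather than from arrival time.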

[+] antidnan|2 years ago|reply
The academic version of them is not used that widely AFAIK outside of newer companies using Yjs.

I believe Figma, Notion, Google Docs, etc all use some form of OTs which aren't necessarily a perfect CRDT

[+] tin7in|2 years ago|reply
A lot of the popular document/notes/whiteboard apps would use Yjs or Automerge or even a ready solution like Liveblocks.
[+] arendtio|2 years ago|reply
Well, isn't the point of CRDTs that you don't see them? I mean, traditionally users are asked what the system should do once it finds a lock, but with CRDTs they should never find a lock and therefore the user isn't bothered.
[+] jamil7|2 years ago|reply
Apple Notes.
[+] cdmckay|2 years ago|reply
I believe Notion uses them.
[+] flatline|2 years ago|reply
Google docs, really any online collaborative editor uses them. If you have a distributed system with multiple asynchronous data feeds into the same sink, this is one way of automatically resolving conflicts. A complicated way that most applications probably don’t really need, and that does not guarantee consistency. But they are neat.
[+] rwoerz|2 years ago|reply
I find that "conflict-free" a little overpromising. If two users simultaneously update the same piece of data to different values, then they still have to agree on a common value manually.

CRDTs just provide a common interface for automatic synchronization of replicated data and use metadata (timestamps etc.) to resolve conflicts in a best-effort manner. With CRDTs, you still have to accept that cases may occur where the conflict resolution does not reflect the intersubjective intention of all participating users.

Depending on the use case this may work well, e.g., in simultaneous collaborative editing where you can lose just some of your last keystrokes or mouse clicks, but less well in others like banking applications.

[+] thoughtlede|2 years ago|reply
I have studied CRDTs at a deeper level for a few weeks and implemented several small prototypes. They are fascinating. As an eventual consistency model for data management, CRDT inspired techniques (op-based or state-based) are useful.

However, for building user-facing applications with CRDTs, their importance is unclear.

The question with CRDTs and local-first paradigms has always been the pressing need (or the lack thereof). The only plausible 'need' that CRDTs serve is real-time collaboration, and even that only if you squint.

Real-time collaboration support translates, in practice, to text-editing and picture-editing collaboration. Google Docs and its ilk have solved that problem (using central solutions). A CRDT-inspired central solution like Figma is inspiring, and maybe that's the only place CRDTs fit in their quest to survive against central solutions.

The rest of the claimed advantages do not seem to withstand the test of time. This article talks about 7 features of CRDTs [1].

Fast: Things are already fast with central solutions.

Multi-device: There is multi-device support with almost all solutions (if you decouple the real-time collaboration aspect).

Offline: It's rare, at least in first world countries, to need offline access (except maybe in airplanes).

Longevity: As can be seen from another comment here, longevity is actually a problem with CRDTs because data model updates are not easy.

Privacy: With BYOK encryption pattern, privacy is not as much an issue.

User control: Even with CRDTs, user is not in control of their data - other peers can mess with your data.

[1] https://www.inkandswitch.com/local-first/

[+] danielvaughn|2 years ago|reply
This is absolutely lovely, well done. I've worked with CRDTs a couple of times and it's always mind-bending trying to understand the data flow; these interactive demos make it so much clearer.
[+] tgsovlerkhgsel|2 years ago|reply
"Conflict-free Replicated Data Type", if you don't want to have to read through an entire page of text before knowing what the article is about.
[+] angelmm|2 years ago|reply
I love interactive introductions to complex topics. They are the best way to learn and consolidate their concepts. Great resource!
[+] xmcqdpt2|2 years ago|reply
Something I just thought about:

With a Last-Write-Wins CRDT, can I just set my computer's time to like 100 years in the future, and thus make changes that can never be reverted by anyone?

[+] jamil7|2 years ago|reply
A lot of implementations would favour something like a Lamport clock or counter instead of a timestamp for a few different reasons. You can tamper with it, and it will increment predictably. You don't really need to worry about timestamps if you're only interested in the relative order of the events in the CRDT.
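A minimal sketch of the Lamport clock mentioned above, assuming each peer keeps one counter and attaches it to every message it sends. The class and method names are hypothetical:

```python
class LamportClock:
    """Logical clock: only the relative order of events matters."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Called for every local event, including sending a message.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # Jump past whatever the sender had seen, then count the receive.
        self.time = max(self.time, remote_time) + 1
        return self.time


alice, bob = LamportClock(), LamportClock()
t1 = alice.tick()       # alice makes an edit
t2 = bob.receive(t1)    # bob sees it, so his clock moves past t1
t3 = bob.tick()         # bob edits afterwards
t4 = alice.receive(t3)  # alice's clock moves past t3
```

No wall-clock time is involved, so a skewed system clock has no effect on ordering. In this sketch a peer still jumps forward to any counter it receives, so the counter by itself does not stop a peer from sending an inflated value.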
[+] dgellow|2 years ago|reply
What I’ve been doing for a personal project is that clients use a local id for their events, and the server adds a timestamp and its own id as soon as it receives them. So a client time doesn’t have any influence.
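The server-stamping approach described above might look roughly like this. It is a sketch under assumptions: `EventServer` and the event field names are made up for illustration and are not taken from the commenter's project:

```python
import itertools
import time
from typing import Dict, List


class EventServer:
    """Hypothetical server that stamps each event on arrival."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self.log: List[Dict] = []

    def receive(self, client_id: str, local_id: int, payload: str) -> Dict:
        event = {
            "server_id": next(self._next_id),  # ordering assigned by the server
            "server_time": time.time(),        # server clock, not the client's
            "client_id": client_id,
            "local_id": local_id,              # only meaningful per client
            "payload": payload,
        }
        self.log.append(event)
        return event


server = EventServer()
first = server.receive("alice", local_id=1, payload="draw red")
second = server.receive("bob", local_id=1, payload="draw blue")
```

The client never supplies a timestamp, so a misconfigured or manipulated client clock cannot influence which write wins. The trade-off is that ordering now depends on reaching the server.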
[+] nsonha|2 years ago|reply
Timestamps are written by server, problem solved.
[+] bafe|2 years ago|reply
Thank you, very cool work. A small note: the first example doesn't work on mobile devices. If I drag, it will just scroll the page instead of drawing
[+] dangoodmanUT|2 years ago|reply
Wow the first CRDT post I actually was engaged with! Kept me interested through the end, and very well written!... is this the fly.io font?