
You don't need a CRDT to build a collaborative experience

236 points | zknill | 2 years ago | zknill.io

65 comments

[+] jitl|2 years ago|reply
I agree broadly with the article’s position but I think locks are more harmful than helpful. When I was a Quip user (2018) it was super frustrating to get locked out of a paragraph because someone’s cursor idled there. Instead just allow LWW overwrites. If users have contention and your sync & presence is fast, they’ll figure it out pretty quick, and at most lose 1-2 keystrokes, or one drag gesture, or one color pick.

Notion is “collaborative” and we don’t use a CRDT for text, it’s all last-write-wins decided by the server. However our LWW texts are individually small - one block/paragraph in size - and adding/moving/removing blocks is intention-preserving if not perfectly convergent.

As the article says, the downside for LWW is that “offline” / async collaboration isn’t so great. That’s why we’re working on switching to CRDT for our texts. If you’re interested in bringing CRDTs to a product with a lot of users, consider joining Notion’s Docs team - https://boards.greenhouse.io/notion/jobs/5602426003 / @jitl on Twitter / [email protected]

[+] zknill|2 years ago|reply
So Last-Write-Wins (LWW) basically _is_ a CRDT, but not in the sense that anyone really expects, because they aren't that useful or intention preserving. Especially if the two writes happen in very quick succession / concurrently.

LWW becomes useful if you can:

a) help humans to see who is doing what on a doc

b) reduce the size of the change that is LWW

As you've said:

> However our LWW texts are individually small - one block/paragraph in size

This is really important, because by reducing the size of the individual element that any single user is updating you also reduce the chance of conflicts. With a small amount of contextual feedback (like the Notion icon next to the block), a lot of the problem cases are just avoided.

Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.

(If you've worked on the notion things you're describing then I'm sure you know this better than I do, but just spelling it out really clearly.)
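To make the "small LWW scope" point concrete, here is a minimal sketch, assuming a central server that applies writes in the order it receives them. `BlockStore` and its fields are hypothetical names for illustration, not Notion's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlockStore:
    """Hypothetical server-side store: every block is its own LWW register."""
    blocks: Dict[str, str] = field(default_factory=dict)
    last_writer: Dict[str, str] = field(default_factory=dict)

    def write(self, block_id: str, text: str, user: str) -> None:
        # The server applies writes in arrival order, so the latest write
        # to a block replaces that block's text and nothing else.
        self.blocks[block_id] = text
        self.last_writer[block_id] = user


store = BlockStore()
store.write("p1", "Intro draft", "alice")
store.write("p2", "Second paragraph", "bob")  # different block: no conflict
store.write("p1", "Intro, revised", "bob")    # same block: bob overwrites alice
```

Only the write to `p1` clobbers anything; `last_writer` is the kind of information a presence indicator next to the block could show.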

[+] ccorcos|2 years ago|reply
I agree!

If people clobber each other's updates by typing at the same time in the same input, at least they can understand what happened.

That’s much better than not being able to do something because someone left their cursor on something and walked away…

[+] Summerbud|2 years ago|reply
I feel the same about paragraphs, but in some situations locking seems rational.

Figma still prevents you from editing a component if someone else is on it, and FigJam did that too. In my mind, this is good practice in the realm of collaborative design, because it would be very messy if a single component obeyed the last-write-wins rule.

[+] btown|2 years ago|reply
Are you considering having a CRDT for each text block individually, or moving to a CRDT for the entire data model for a document? Really curious about the design approach here, especially insofar as there's now an external API that the data models need to service!

[+] lewisjoe|2 years ago|reply
> everyone’s gonna say “but hey, google docs uses operational transform not CRDTs”.. OK yes, but you are not google.

Well, google docs works not because they somehow figured out how to converge OT edits with as much precision as CRDTs do, but simply because they have a central server which orders edits anyway and don't need true leader-less convergence.

In fact, I agree that many things don't need a CRDT. CRDTs help with mathematical rigor of convergence when you want true peer-to-peer collaboration which works without any central authority.

However, most apps work on top of a central authority anyway (for example, SaaS apps), so there is no real reason to accommodate all the complexity that comes with CRDTs. They might get far with a simpler OT-based model plus central-server-based ordering.

For example even Figma doesn't call its model a 100% pure CRDT. It's a partial, simpler CRDT implemented with an assumption that there's going to be a server that understands ordering.

It's the same with Google Docs. They don't need a CRDT because it's a cloud app after all, which means OT is more convenient, with the major heavy lifting (ordering and conflict handling) outsourced to the server.

[+] josephg|2 years ago|reply
Yeah. Text based OT is pretty simple to implement, too. It’s about 200 lines of code, plus a simple network protocol. It’s fast and relatively straightforward to code up, and unlike text CRDTs it’s pretty fast by default.

I use it as my standard test when trying out a new programming language. It was unexpectedly ugly in Go because of Go’s lack of enums & unions, and that’s one of the big reasons I never got into programming Go.
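As a rough illustration of the core of text OT, here is a sketch of the transform step for single-character operations. It is far smaller than the roughly 200 lines mentioned above and is not that implementation. The tuple encoding of ops and the names `apply` and `transform` are assumptions made for the example.

```python
# Ops are ("ins", pos, char) or ("del", pos). None means "nothing left to do".

def apply(doc, op):
    """Apply one op to a string and return the new string."""
    if op is None:
        return doc
    if op[0] == "ins":
        return doc[:op[1]] + op[2] + doc[op[1]:]
    return doc[:op[1]] + doc[op[1] + 1:]


def transform(op, other, op_has_priority):
    """Rewrite `op` so it can be applied after the concurrent op `other`."""
    kind, pos = op[0], op[1]
    other_kind, other_pos = other[0], other[1]
    if other_kind == "ins":
        if kind == "ins":
            # Equal positions are a tie; the priority flag breaks it.
            if other_pos < pos or (other_pos == pos and not op_has_priority):
                return ("ins", pos + 1, op[2])
            return op
        # A delete shifts right if the insert landed at or before its target.
        return ("del", pos + 1) if other_pos <= pos else op
    if kind == "ins":
        return ("ins", pos - 1, op[2]) if other_pos < pos else op
    if other_pos < pos:
        return ("del", pos - 1)
    if other_pos == pos:
        return None  # both sides deleted the same character
    return op


doc = "ab"
a = ("ins", 1, "X")  # site A, given priority on ties
b = ("ins", 1, "Y")  # site B, concurrent with A
via_a = apply(apply(doc, a), transform(b, a, False))
via_b = apply(apply(doc, b), transform(a, b, True))
```

Both application orders end in the same text, which is the convergence property the central server relies on when it picks an order for concurrent edits.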

[+] MontagFTB|2 years ago|reply
One of the main points of this article is to "just use locks", which glosses over a lot of technical complications about locking elements within a document. How long is the lock held? Can it be stolen from a user who has gone offline, or is still online but off to lunch, and we _really_ need to make this change before the presentation in an hour? What if the user comes back online and they have changes, but the lock was stolen - how are those changes reconciled to the document?

I am generally in favor of simpler is better, and if there is a way to build a collaborative experience without using CRDTs, then go for it. However, sometimes the cure can be worse than the disease, and solutions like locking may introduce more technical complexity than originally thought.

[+] zknill|2 years ago|reply
You're absolutely right, locking is a hard problem. Especially when you get into the edge cases of clients not disconnecting cleanly.

If you've ever used one of those awful early-2000s Microsoft Word collaboration systems where you have to check out the doc, and remember to check it back in, and no one can use it until you've checked it back in... it's awful!

I'm not directly on this team, but one of the teams at my company has been working on this problem. They call it "Spaces", and one of the features solves this component locking problem.

https://ably.com/examples/component-locking

[+] chii|2 years ago|reply
If you keep exploring that locking and unlocking mechanism, you will find that once enough complexity and edge cases have been covered or fixed as bugs, it logically ends up as a crude form of "CRDT" (one that's not actually consistent, but merges within reason for 99% of use cases).

It might as well have been CRDT from the get go.

[+] dsmmcken|2 years ago|reply
> How long is the lock held?

For however long the user has a browser focus state on the element seems like a reasonable answer, and submit changes as they are made. However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
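One possible answer to the simultaneous-acquire question, sketched under the assumption that a central server handles lock requests one at a time: the first request wins, and the lock is a lease that expires unless renewed. `LockTable`, `ttl_seconds`, and the injected `clock` are hypothetical names for this example, not any product's API.

```python
import time


class LockTable:
    """Hypothetical server-side lock table with expiring leases."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.locks = {}  # element_id -> (holder, expires_at)

    def acquire(self, element_id, user):
        """Take or renew the lock; return False if someone else holds it."""
        now = self.clock()
        held = self.locks.get(element_id)
        if held is not None and held[0] != user and held[1] > now:
            return False  # another user holds an unexpired lease
        self.locks[element_id] = (user, now + self.ttl)
        return True

    def release(self, element_id, user):
        """Drop the lock, but only if `user` is the current holder."""
        held = self.locks.get(element_id)
        if held is not None and held[0] == user:
            del self.locks[element_id]


now = [0.0]  # fake clock so the example is deterministic
table = LockTable(ttl_seconds=10.0, clock=lambda: now[0])
alice_got_it = table.acquire("cell-B2", "alice")
bob_got_it = table.acquire("cell-B2", "bob")   # refused: alice holds the lease
now[0] = 11.0                                  # alice idles past the TTL
bob_retry = table.acquire("cell-B2", "bob")    # lease expired, bob takes over
```

The expiry also covers the idle-cursor and unclean-disconnect cases raised elsewhere in the thread, at the cost of the client having to renew the lease while the element stays focused.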

[+] antidnan|2 years ago|reply
I don't think you need a pure CRDT either but I think locking and presence is a bit of an oversimplification.

LWW is a good place to start, and updating the smallest piece of information possible is the right idea in general but there is a lot more nuance to handling complex applications like a spreadsheet (I'm working on one) and whiteboard apps.

Things like reparenting or grouping shapes [1], or updating elements that aren't at the lowest scale like deleting a row or column in a spreadsheet make locking challenging to implement. Do you lock the entire row while I'm moving it? Do you lock the entire group of shapes?

With the exception of text editing, the popular libraries like Yjs don't just give you a perfect CRDT out of the box. You still have to construct your data model in a way that enables small scale updates [2], and CRDT libraries and literature are the best source of thinking for these problems that I've found.

[1] https://www.figma.com/blog/how-figmas-multiplayer-technology...

[2] https://mattweidner.com/2022/02/10/collaborative-data-design...

[+] charles_f|2 years ago|reply
That's true for collaborative experience. CRDTs are a mechanism to handle eventual consistency (that's even the preface of the paper). If you assume that said collaborative experience is always online, you don't need them, and "using locks" as you described is probably enough.

If you want a mechanism to handle that eventual consistency, it's probably better to reuse their principles rather than reinventing something that will eventually resemble CRDTs.

You mentioned "offline first", so this is probably a good place to plug https://www.inkandswitch.com/local-first/

[+] iamwil|2 years ago|reply
> Ever-growing state: for CRDTs to work well they need to keep a record of both what exists, and what has been deleted (so that the deletes aren’t accidentally added back in later). This means that CRDT state will continually expand.

I guess a couple things:

It depends on the CRDT. Some CRDTs grow with the number of replicas and others with the number of events.

State-based CRDTs don't need to keep history and don't need causal ordering of messages, but internal bookkeeping grows with the number of replicas. And for large states (like sets and maps), it can be prohibitive to send the state all over the wire for an idempotent merge.

That's why in practice, people implement op-based CRDTs, which make the trade: in order to send small ops over the wire, we now need causal ordering of messages. To make sure we can sync with replicas that have been offline for a long time, we keep as much history as they need to catch up.

There are other variations, such as delta-state based CRDTs that send diffs, and merkle CRDTs, which use merkle data structures to calculate diffs and detect concurrency, which have different growth characteristics.
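For a concrete feel of the state-based case, here is a minimal grow-only counter (G-Counter), a standard textbook state-based CRDT. The function names are chosen for this sketch. It keeps no history, its merge is idempotent and order-independent, and its bookkeeping is one entry per replica.

```python
def increment(state, replica_id, amount=1):
    """Return a new state in which this replica's own slot has grown."""
    new_state = dict(state)
    new_state[replica_id] = new_state.get(replica_id, 0) + amount
    return new_state


def merge(a, b):
    """Join two states by taking the per-replica maximum."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(state):
    """The counter's value is the sum over all replica slots."""
    return sum(state.values())


a = increment(increment({}, "A"), "A")  # replica A counted twice
b = increment({}, "B")                  # replica B counted once
merged = merge(a, b)
```

Merging the same state again changes nothing, so messages can be duplicated or reordered freely. The cost is that the whole state travels over the wire, which is the problem for large sets and maps described above.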

---

As for a growing state: Is this actually a concern for devs that aren't using CRDTs for collaborative text? I can see that being an issue with the amount of changes that can happen.

But outside of that, lots of data don't grow that fast. We all regularly use Git and it keeps a history of everything. Our disks are huge, and having an immutable record is great for lots of things (providing you can access it).

> Opaque state: ...you’re generally left with an opaque blob of binary encoded data.

Most CRDT libraries take a document-oriented angle. They assume that you can contain the entire "unit of work", like a document, inside of a CRDT. However, if your data is more relational, it doesn't quite fit. And while there's immutable data in a CRDT, I do wish it was more accessible and queryable. In addition, being a binary blob, it's not exactly composable. I think CRDT libraries should be composable with each other.

[+] earthboundkid|2 years ago|reply
I've seen locks used at the CMSes of large news organizations. It's fine, but they all need a mechanism to kick out an editor who has an idle tab left open. For my own small scale CMS, I just wrapped Google Docs and let them handle all the syncing headaches.

[+] chromatin|2 years ago|reply
We took a super simple (IMO) approach to collaborative editing in my current project:

Each block of text has a version number which must be incremented by one by the client at the time of submission. The database prevents conflicts via a uniqueness constraint, whose violation bubbles up to the API code. The frontend is informed of the conflict, so that the user can be notified and perform the conflict resolution themselves.

Because most concurrent users are working on different blocks, this works great.
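A minimal sketch of this version-number scheme, using SQLite from the Python standard library as a stand-in database. The table layout and the `submit` helper are assumptions made for illustration, not the schema of the project described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE block_versions (
        block_id TEXT NOT NULL,
        version  INTEGER NOT NULL,
        body     TEXT NOT NULL,
        PRIMARY KEY (block_id, version)  -- the uniqueness constraint
    )
    """
)


def submit(block_id, base_version, body):
    """Save an edit made on top of base_version; return False on conflict."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO block_versions VALUES (?, ?, ?)",
                (block_id, base_version + 1, body),
            )
        return True
    except sqlite3.IntegrityError:
        # Someone else already wrote this version number for this block.
        return False


first = submit("b1", 0, "hello")
alice = submit("b1", 1, "hello from alice")  # edit based on version 1
bob = submit("b1", 1, "hello from bob")      # also based on version 1: rejected
```

When `submit` returns `False`, the API layer would report the conflict to the frontend. Note that this sketch does not check that `base_version` is the latest stored version, so a real implementation would probably want that validation too.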

[+] zknill|2 years ago|reply
How do you handle getting the changes that one client makes onto the other clients? Are you pushing it from the server to the clients with websockets, or waiting for the clients to ask for new info, or waiting for the conflict to happen when someone else tries to make a change, or something else?

I'm thinking a lot about keeping server and client data in sync while working on our hopefully-soon-to-be-released LiveSync product[1]

[1] https://ably.com/livesync

[+] namelosw|2 years ago|reply
That's not gonna work for real-world projects. Real-world apps often have edits larger than an individual cell or card, e.g. moving columns or replacing large chunks of a spreadsheet in Google Sheets, or pressing Ctrl-A to select all and then dragging to move.

Also, if you consider latency, locking does not work well because client B might do operations before he/she even acknowledges the lock from client A because of latency.

[+] lmm|2 years ago|reply
> You can’t inspect your model represented by the CRDT without using the CRDT library to decode the blob, and you can’t just store the underlying model state because the CRDT needs its change history also. You’re left with an opaque blob of data in your database. You can’t join on it, you can’t search it, you can’t do much without building extra features around that state blob.

So use the CRDT library when building your indices? Or better yet use a CRDT-aware datastore. This doesn't seem like a real problem.

> Locking for safety

Please don't. You're inevitably going to have lost locks, lost updates, or most likely both.

[+] zknill|2 years ago|reply
> So use the CRDT library when building your indices?

Yeah, sure, you can build a secondary index over the data. But you're still having to decode the blob and index it. There's no version where you can index the data without using the library to expose the underlying model state (like you could if you weren't using a CRDT).

On locking, yes, it's hard. But it's not the same kind of locking that you'd expect in other parts of a system. You're locking the UI, not the actual data, so it's a tiny bit more forgiving. In general the locks aren't trying to force consistency, instead they are trying to prompt the humans to reduce the chance of conflict happening in the first place. Ofc, you still have to care about the locking, unlocking, disconnection problems, etc.

Here's a decent example/demo you can play around with in multiple windows:

https://examples.ably.dev/component-locking?space=W0V-t5oY6A...

https://ably.com/spaces

[+] spion|2 years ago|reply
For offline-first apps, or for applications where a very high degree of control over the content is needed (e.g. legal docs) and realtime collaboration isn't that valuable, there is also the option to use 3-way merge instead.

The benefit is that you can even allow the user to resolve conflicts in a satisfactory way.

Another benefit is that the document doesn't even have to be derived from the original: it could go through exports and re-imports and it will still be possible to run a 3-way merge as long as a common base version is declared. This can be especially convenient for systems that involve e.g. MS Word.
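As a sketch of the idea, here is a 3-way merge over a document modelled as a dict of named sections. That model is a simplifying assumption for the example, since real tools usually merge lines or paragraphs. Changes made on only one side are taken automatically, and anything changed differently on both sides is reported as a conflict for the user to resolve.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies against their common base version.

    Returns (merged, conflicts). A missing key means the section is absent.
    Conflicting sections are left out of `merged` and listed in `conflicts`.
    """
    merged, conflicts = {}, []
    for key in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            chosen = o       # both sides agree (or neither changed it)
        elif o == b:
            chosen = t       # only theirs changed it
        elif t == b:
            chosen = o       # only ours changed it
        else:
            conflicts.append(key)  # both changed it, differently
            continue
        if chosen is not None:
            merged[key] = chosen
    return merged, conflicts


base = {"title": "Contract", "clause1": "A", "clause2": "B"}
ours = {"title": "Contract v2", "clause1": "A", "clause2": "B-ours"}
theirs = {"title": "Contract", "clause1": "A-theirs", "clause2": "B-theirs"}
merged, conflicts = three_way_merge(base, ours, theirs)
```

Here `title` and `clause1` merge cleanly, while `clause2` is handed back for manual resolution. Only the three versions are needed, which is why the approach survives exports and re-imports.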

[+] mweidner|2 years ago|reply
> Opaque State: [...] You can’t inspect your model represented by the CRDT without using the CRDT library to decode the blob, and you can’t just store the underlying model state because the CRDT needs its change history also. You’re left with an opaque blob of data in your database.

As someone who works on a CRDT library with opaque state [1], I agree that this is a big barrier to adoption. Features like partial loading, per-paragraph permissions, and accept/reject suggestions seem pretty easy to implement if each text char is just a row in your server's DB, but I would have trouble implementing them on top of e.g. Yjs.

For text editing, one idea is to separate the CRDT "positions" from the text itself, which you can then store as a map (position -> char) in your own data structures. I've made a simple (but inefficient) library along these lines [2] and would be interested in ideas for further development.

[1] Collabs - https://collabs.readthedocs.io

[2] position-strings - https://www.npmjs.com/package/position-strings

[+] matlin|2 years ago|reply
I think the most important part of designing collaborative software, which this touches on a bit, is having the right granularity and scope for a given change.

Last-writer-wins is only bad when the granularity of what you're editing is too big. E.g. if you're an editor like Figma and each element is a row in a database, a single row is too big. Instead you want attribute level granularity so two users can change the independent properties (like one color and the other size) without bulldozing each other.
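A minimal sketch of attribute-level last-writer-wins, assuming each attribute carries a `(timestamp, writer_id, value)` tuple where the timestamp is some logical counter such as a Lamport timestamp. The encoding and the name `merge_element` are made up for this example.

```python
def merge_element(a, b):
    """Merge two copies of one element, attribute by attribute.

    Each attribute maps to (timestamp, writer_id, value). The entry with
    the higher (timestamp, writer_id) pair wins, so ties break the same
    way on every replica.
    """
    merged = dict(a)
    for attr, entry in b.items():
        if attr not in merged or entry[:2] > merged[attr][:2]:
            merged[attr] = entry
    return merged


base = {"color": (1, "alice", "red"), "size": (1, "alice", 10)}
alice_copy = {**base, "color": (2, "alice", "blue")}  # alice recolors
bob_copy = {**base, "size": (2, "bob", 20)}           # bob resizes
merged = merge_element(alice_copy, bob_copy)
```

With row-level LWW one of the two copies would win outright and the other edit would be lost. Here both the new color and the new size survive, regardless of which copy is merged into which.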

The other key thing is to not only consider realtime collaboration (that's a common mistake). In practice, there's always some delay (maybe just milliseconds, but it could be hours or days) in how events propagate, so solutions like locking don't work.

The reality is that any client-server system that needs to be highly interactive and robust to unreliable network conditions is undeniably a distributed system and therefore warrants using distributed system solutions like vector clocks, Lamport timestamps, CRDTs, etc.

Last thing is that I think many people only think of operation-based CRDTs when they think about CRDTs. You can create (and we have at my company) a fairly traditional-feeling database that relies on a state-based CRDT solution that doesn't need to maintain a log of every operation that has ever happened.

So yes, you might not need to reach for a fancy library like Yjs or Automerge, but it's worth understanding how these things basically work because many of them are extremely simple and easy to grok - the complicated parts of Yjs and Automerge are the sophisticated data structures and algorithms that are pretty much only needed for large document text editing.

[+] socketcluster|2 years ago|reply
The no-code serverless platform I built achieves this behind the scenes via a real-time CRUD API: https://saasufy.com/

The key is to perform updates on fields individually. Normally this would not be viable using HTTP due to headers/overheads (too many fields per resource to dedicate an entire HTTP request per-field) but it is viable over WebSockets as each frame is very lightweight and can even be batched. Also, being able to tie together the life of the connection to the subscription is handy to ensure that no real-time updates can be missed.

I built a chat app with authentication + access control with it (you can log in with GitHub at the bottom):

https://saasufy.github.io/chat-app/

Only 120 lines of HTML markup (web components), no custom JS. See GitHub repo here for the 'source': https://github.com/Saasufy/chat-app

[+] czx111331|2 years ago|reply
We are addressing the CRDT downsides mentioned in the article at Loro:

- Ever-growing state. This is no longer an issue. With OT-like CRDTs, you can discard unnecessary historical data at any time. This is theoretically feasible, and we are moving towards this goal.
- Complex implementation. The complexity is internal within the package, and it's written in Rust, making it universally applicable.
- Opaque state. We aim to expose these internal states through improved DevTools, making them easier to control and observe. This is one of the essential steps in enhancing our DX.

You can visit our blog to learn more: https://www.loro.dev/blog/loro-now-open-source

[+] aboodman|2 years ago|reply
It's true that a CRDT is often not the right thing for a classic client/server application. But this doesn't mean we should just give up on ux and use locking.

There are approaches to multiplayer that are client/server native. By leveraging the authoritative server they can offer features that CRDTs can't, while preserving the great ux.

I'm partial to server reconciliation:

https://www.gabrielgambetta.com/client-side-prediction-serve...

My product, Reflect, implements server reconciliation as a service. You can learn more about how it works here:

https://rocicorp.dev/blog/ready-player-two
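For readers unfamiliar with the general technique, here is a rough sketch of client-side prediction with server reconciliation. It is not Reflect's API, and the `Server` and `Client` classes and their methods are hypothetical. The client shows its pending mutations optimistically, and when the authoritative state arrives it drops the acknowledged mutations and replays the rest on top.

```python
class Server:
    """Authoritative state; mutations are applied in arrival order."""

    def __init__(self):
        self.state = {"count": 0}
        self.last_seq = {}  # client_id -> last mutation sequence applied

    def apply(self, client_id, seq, mutation):
        mutation(self.state)
        self.last_seq[client_id] = seq


class Client:
    def __init__(self, client_id):
        self.client_id = client_id
        self.confirmed = {"count": 0}  # last state received from the server
        self.pending = []              # [(seq, mutation)] not yet acknowledged
        self.next_seq = 1

    def mutate(self, mutation):
        """Queue a mutation locally; the caller also sends it to the server."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, mutation))
        return seq

    def view(self):
        """What the user sees: confirmed state plus pending mutations."""
        state = dict(self.confirmed)
        for _, mutation in self.pending:
            mutation(state)
        return state

    def receive(self, server_state, last_seq):
        """Adopt the server's state and keep only unacknowledged mutations."""
        self.confirmed = dict(server_state)
        self.pending = [(s, m) for s, m in self.pending if s > last_seq]


def increment(state):
    state["count"] += 1


server, alice, bob = Server(), Client("alice"), Client("bob")
a1 = alice.mutate(increment)
predicted = alice.view()              # alice sees her edit immediately
b1 = bob.mutate(increment)
server.apply("bob", b1, increment)    # the server happens to see bob first
server.apply("alice", a1, increment)
a2 = alice.mutate(increment)          # made before the server's reply arrives
alice.receive(server.state, server.last_seq["alice"])
reconciled = alice.view()
```

After reconciliation alice sees bob's increment, her acknowledged one, and her still-pending one, without any locking and without the client needing to decide an order itself.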

[+] zknill|2 years ago|reply
> But this doesn't mean we should just give up on ux

It's a little unfair to describe locking as "[giving] up on UX", especially given some well known collaboration products use it quite successfully; Google Sheets cell locking, Miro element/Text locking, etc.

Ofc, it's going to depend on the scope of what's being locked. These two examples are quite finely scoped element locks.

[+] parhamn|2 years ago|reply
At this point, given the maturity of libraries (I was exploring this recently), I think you'd have to make the case that CRDTs are bad, not just "too much".

Interfacing with the 'blob' is a real thing (Yjs is solving some of this with a Rust implementation that has cross-language bindings), but generally the things they noted (e.g. a Figma canvas) aren't things you commonly do joins across, and if you did you'd have an independent indexing store for that functionality.

With tools like SyncedStore [1] and HocusPocus [2] you end up with a pretty good, well-tested, easy-to-implement base for good collaboration.

[1] syncedstore.org

[2] github.com/ueberdosis/hocuspocus