CRDT resources

119 points | stichers | 4 years ago | wiki.nikitavoloboev.xyz

45 comments

[+] dang|4 years ago|reply
Lists don't make good HN submissions—they're too generic. The only thing to discuss is the lowest common denominator of the items on the list, and that's usually some very general topic about which there's nothing particularly new to say. Also, HN is itself a list. A pointer to a pointer to a pointer is too much indirection!

It's better to post the most interesting item on the list. That increases the chance that there's something specific to talk about.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...

[+] dqpb|4 years ago|reply
As a counterpoint, I upvoted and favorited this post before seeing your comment.
[+] mox111|4 years ago|reply
Most of the CRDT examples I've seen appear to be Electron apps e.g. https://github.com/automerge/pushpin.

My understanding is that CRDTs rely on having a safe place to store data on the user's machine (otherwise it's a bit like doing a `git clone` to receive new data, rather than a `git pull`).

Is this not a major limitation for people hoping to use it for web apps?

[+] cyber_kinetist|4 years ago|reply
> “Distributed state is so fundamentally complex that I think we actually need CRDTs (or something like them) to reason about it effectively.”

Gamedevs working on multiplayer FPSs and MMOs (which requires resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades, and they haven’t been using any fancy CRDTs. Maybe they might have some ideas on how to achieve fast document synchronization as well?

If you forget about P2P and only think about server/client type connections (since P2P doesn’t give you that many advantages in a Google-Docs type service), I think there’s a lot of overlap between multiplayer games and collaborative document editing, and maybe some cross-domain pollination might be needed to solve this problem.

[+] hesdeadjim|4 years ago|reply
Realtime multiplayer state sync isn't remotely like CRDT, it's an endless series of hacks where you attempt to send as little data as possible while giving the best possible player experience. Total smoke and mirrors.

It's fantasy to assume a client has any real "truth" about the world over the span of milliseconds, and in practice you want the client to be as ignorant as possible because cheat programs will let people see through walls, see across the map, etc.

You compensate through speculative prediction to try and make things like movement or gunshots feel lag-free, but again it's smoke and mirrors and a hundred things can cause that to go wrong. How do we resolve mispredictions for the player? Hiding them as well as possible through VFX is a great technique, or you hope the users don't notice that a couple of their bullets had exactly zero effect because the server decided they shot outside the rollback buffer and didn't actually hit that other player.

It's very game-dependent too. Valorant hits a 120 Hz tick rate through insane optimization, and if you have a low-latency connection your view of the world will be much, much more in sync with truth compared to a game like PUBG that early on had a tick rate around 15 Hz (oof).

It's always been a bucket list gamedev goal of mine to work on realtime multiplayer, but after a couple years of doing it I find myself fantasizing about a world where everyone is playing on a computer that isn't a potato with a connection that isn't jittery.

[+] jitl|4 years ago|reply
The major difference in constraint is how much user intention/data it's acceptable to discard, especially when a user message arrives later than expected.

In a multiplayer game, it's fine to discard all user intention that arrives a few minutes "late" - gameplay has moved on, and the client should discard local state and just use the server state. But this is totally unacceptable for a text editing application.

CRDTs or Operational Transforms are strategies that allow accepting user edits and preserving their intention even if the user is days or weeks behind or diverged from the server state. I'm not aware of a real-time multiplayer game that allows such high latency for user input.

I like this talk on Overwatch's data model (entity-component-system model) https://www.gdcvault.com/play/1024001/-Overwatch-Gameplay-Ar... - the discussion of netcode starts at timestamp 22:30

[+] User23|4 years ago|reply
> Gamedevs working on multiplayer FPSs and MMOs (which requires resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades

No they haven't. Servers crash and players drop all the time, and many things otherwise don't always act as intended. But practically it doesn't matter because nobody, not the developers and not the players, cares about whether or not their FPS is byzantine fault tolerant.

[+] dboreham|4 years ago|reply
> they haven’t been using any fancy CRDTs

CRDT is actually "the tricks we have always used + math to prove whether they work or not". Read the papers. You'll find old school stuff like Lamport Clocks.
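
For reference, a Lamport clock fits in a few lines. This is a minimal Python sketch; the class and method names are made up for illustration and are not taken from any particular paper or library.

```python
class LamportClock:
    """Logical clock: a counter that respects the happened-before order."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event or message send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, remote_time):
        # Message receive: jump past both our time and the sender's stamp.
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
t_send = a.tick()            # A stamps an outgoing message
b.tick()
b.tick()                     # B handles two local events meanwhile
t_recv = b.receive(t_send)   # B's clock moves past the message stamp
```

The only guarantee is that a receive is always stamped later than the matching send; timestamps from unrelated events can still tie, which is why CRDT designs usually pair the counter with a replica id as a tie-breaker.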

[+] phailhaus|4 years ago|reply
> Gamedevs working on multiplayer FPSs and MMOs (which requires resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades

...no they haven't? Those aren't distributed systems. There is an authoritative server that holds the source of truth, and players sync up with it.

A truly distributed game would have each game client contain the full world state, and would synchronize with every other player rather than an authoritative server.

[+] remram|4 years ago|reply
Those games have an authoritative central server, so using OT or something akin to OT is trivial.

Even in P2P games or games with a weaker server, a given entity is owned by a single machine. This is a much simpler setup.

[+] dnautics|4 years ago|reply
Not really a CRDT use case. Gamedev needs A<=>S and B<=>S, where S is a central server.

CRDT is:

A<=>B A<=>C B<=>C A<=>D ...
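
A rough Python sketch of that mesh, assuming the simplest possible CRDT (a grow-only set whose merge is set union). The peer names and the `sync` helper are hypothetical; the point is only that the pairwise exchanges can happen in any order and no server is involved.

```python
import itertools
import random

# Four peers, each starting with one local edit (made-up data).
peers = {"A": {"a1"}, "B": {"b1"}, "C": {"c1"}, "D": {"d1"}}


def sync(x, y):
    """Two peers exchange state directly; set union is the merge."""
    merged = peers[x] | peers[y]
    peers[x] = set(merged)
    peers[y] = set(merged)


pairs = list(itertools.combinations(peers, 2))  # A<=>B, A<=>C, B<=>C, ...
random.shuffle(pairs)                           # exchange order is arbitrary
for x, y in pairs:
    sync(x, y)
```

After every pair has synced once, each peer holds all four edits regardless of the shuffle, because state only grows and each pair meets directly at some point.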

[+] globular-toast|4 years ago|reply
At least just write out the name in full before using the acronym. "CRDT" could mean literally anything and it doesn't help when half the links in the list also don't bother to name it before using the acronym.

Resist using acronyms before defining them. Don't just do it because everyone else does it. You are only creating barriers for people who might be interested in what you are talking about. I know people do this intentionally too. Don't be so insecure. You don't need to invent special language to remain relevant.

Another user commented on a similar thing earlier today: https://news.ycombinator.com/item?id=28997945

[+] zffr|4 years ago|reply
Since no one else defined it, CRDT = Conflict-free Replicated Data Type.

There are 2 variants which confusingly have a very similar acronym:

- CmRDT: Commutative Replicated Data Types (also known as operation-based CRDTs). These CRDTs replicate state by transmitting update operations to peers. Peers are always able to apply these operations in any order and without conflicts.

- CvRDT: Convergent Replicated Data Types (also known as state-based CRDTs). These CRDTs replicate state by transmitting the entire object every time an update is made to a local replica. Peers are able to merge the state they receive with their local copy without conflicts.
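
To make the two variants concrete, here is a minimal Python sketch of a counter in each style. The class names are illustrative only, and the operation-based version assumes the network delivers every operation exactly once, which a real system would have to guarantee separately.

```python
class GCounter:
    """State-based (CvRDT) grow-only counter: one slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        # A replica only ever bumps its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot max is commutative, associative, and idempotent.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


class OpCounter:
    """Operation-based (CmRDT) counter: replicas ship the increments."""

    def __init__(self):
        self.total = 0

    def apply(self, delta):
        # Addition commutes, so delivery order does not matter.
        self.total += delta


# State-based: ship whole objects; merging twice is harmless.
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
a.merge(b)

# Operation-based: same operations, different delivery order.
ops = [1, 4, 2]
x, y = OpCounter(), OpCounter()
for delta in ops:
    x.apply(delta)
for delta in reversed(ops):
    y.apply(delta)
```

Both `a` and `b` end up at 5, and both `x` and `y` end up at 7. The trade-off shows even at this size: the state-based counter tolerates duplicated or repeated merges but sends the full object, while the operation-based one sends small messages but would double count if an operation were delivered twice.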

[+] pfraze|4 years ago|reply
Off topic - probably the most successful piece of "content" I've ever made is the CRDT notes [1] item that's nestled in there. I saw this submission and thought, "I wonder if my repo made it in there" and indeed it did.

Why I find that funny is that I made that repo on a whim while I was doing my own reading, and then did nothing with it. Maybe I tweeted it? But somehow it SEOed well with Google for a stretch and I've been getting a steady stream of stars on that repo ever since. I assume it was because I created it when CRDTs were still early and so it got the clicks.

I'm sure a lot of folks here know what it's like to try to put projects out there and go looking for traction. It's always made me chuckle that one of my biggest successes was the unintentional one.

1. https://github.com/pfrazee/crdt_notes

[+] keewee7|4 years ago|reply
What are some interesting use cases for CRDTs beyond collaborative (text) editing?
[+] sunny--tech|4 years ago|reply
Quite a few data stores use CRDTs under the hood.

Redis uses CRDTs for active-active architectures [1] and for some of their native data structures [2].

Riak also uses them in their data store [3].

And it looks like PayPal might use them for consensus purposes (I found this while looking up the Riak talk, so I haven't actually watched it) [4].

1: https://redis.com/blog/diving-into-crdts/

2: https://redis.com/videos/active-active-geo-distribution-redi...

3: https://www.youtube.com/watch?v=f20882ZSdkU&ab_channel=Erlan...

4: https://www.infoq.com/presentations/crdt-production/

[+] lijogdfljk|4 years ago|reply
I mean.. data, right? I'm trying to learn about CRDTs for complex, nested data structures. The use cases are for.. well, nested data, anything you'd use B-trees etc. for over distributed systems.

A big thing i'm currently learning with them is to write a content addressable system with a more forgiving merge policy between parties.

Yea i often see people nitpick CRDT about user intention, and whether it should be `ADBC` or `ABCD`, but in my case i'm focusing on multi-device, not multi-user - and even in multi-user it's still best-in-breed when you are designing away from centralization.

I'm still struggling to learn the more complex approaches to CRDT. So many resources focus on the low hanging fruit of CRDT. Grow counters, basic text editing, etc. I need to build the full suite of data structures; maps, sets, lists, etc.
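
As one step past grow-only counters, here is a minimal Python sketch of an observed-remove set, one common way to get a set that supports removal. It is a simplified state-based version with illustrative names; it keeps tombstones forever, which a production implementation would need to garbage-collect.

```python
import uuid


class ORSet:
    """Observed-remove set: a concurrent add wins over a remove."""

    def __init__(self):
        self.adds = set()      # (element, unique tag) pairs
        self.removes = set()   # tombstoned (element, tag) pairs

    def add(self, element):
        # Every add gets a fresh tag, so re-adds are distinguishable.
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Tombstone only the tags this replica has actually observed.
        self.removes |= {(e, t) for (e, t) in self.adds if e == element}

    def merge(self, other):
        self.adds |= other.adds
        self.removes |= other.removes

    def value(self):
        return {e for (e, _tag) in self.adds - self.removes}


a, b = ORSet(), ORSet()
a.add("x")
b.merge(a)       # b observes x
b.remove("x")    # b removes the x it saw
a.add("x")       # a concurrently re-adds x under a new tag
a.merge(b)
b.merge(a)
```

After the final merges both replicas contain `x`, because the remove only covered the tag that `b` had seen. Maps can be layered on the same idea by keying each entry to a nested CRDT value; lists need ordered identifiers and are considerably more involved.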

[+] nuerow|4 years ago|reply
The content is interesting but the page is totally unusable on mobile.
[+] nikivi|4 years ago|reply
Author of the wiki here, I think GitBook is quite usable on mobile for me. What is the issue?
[+] tomsonj|4 years ago|reply
I've observed that implementing a CRDT system requires blogging about it afterward
[+] sunny--tech|4 years ago|reply
I have implemented a CRDT system and have also blogged about it afterwards. So I can confirm this.

Joking aside though, CRDTs are still a pretty esoteric space and all the various blog posts came in handy when I was researching how to build my own CRDT.

Most of the resources I've seen on CRDTs are from whitepapers and those can be difficult to read if you don't have a math background. I gave a talk at Papers We Love [1] explicitly because I found the academic papers a big turn-off for many people interested in the space.

[1]: https://www.youtube.com/watch?v=1Bs3Fj9rvks&t=2169s&ab_chann...

[+] nosianu|4 years ago|reply
What method did you use to determine which/if anyone implementing CRDTs did not write a blog post?

Or did you observe that 100% of the people who blogged about CRDTs implemented CRDTs and then wrote a blog post?