Lists don't make good HN submissions—they're too generic. The only thing to discuss is the lowest common denominator of the items on the list, and that's usually some very general topic about which there's nothing particularly new to say. Also, HN is itself a list. A pointer to a pointer to a pointer is too much indirection!
It's better to post the most interesting item on the list. That increases the chance that there's something specific to talk about.
My understanding is that CRDTs rely on having a safe place to store data on the user's machine (otherwise it's a bit like doing a `git clone` to receive new data, rather than a `git pull`).
Is this not a major limitation for people hoping to use it for web apps?
> “Distributed state is so fundamentally complex that I think we actually need CRDTs (or something like them) to reason about it effectively.”
Gamedevs working on multiplayer FPSs and MMOs (which require resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades, and they haven’t been using any fancy CRDTs. Maybe they have some ideas on how to achieve fast document synchronization as well?
If you forget about P2P and only think about server/client type connections (since P2P doesn’t give you that many advantages in a Google Docs-type service), I think there’s a lot of overlap between multiplayer games and collaborative document editing, and some cross-domain pollination might be what's needed to solve this problem.
Realtime multiplayer state sync isn't remotely like CRDTs. It's an endless series of hacks where you attempt to send as little data as possible while giving the best player experience possible. Total smoke and mirrors.
It's fantasy to assume a client has any real "truth" about the world over the span of milliseconds, and in practice you want the client to be as ignorant as possible because cheat programs will let people see through walls, see across the map, etc.
You compensate through speculative prediction to try and make things like movement or gunshots feel lag-free, but again it's smoke and mirrors and a hundred things can cause that to go wrong. How do we resolve mis-predictions to the player? Hiding them as best as possible through VFX is a great technique, or you hope the users don't notice that a couple of their bullets had exactly zero effect because the server decided they shot outside the rollback buffer and didn't actually hit that other player.
It's very game-dependent too. Valorant hits a 120 Hz tick rate through insane optimization, and if you have a low-latency connection your view of the world will be much, much more in sync with the truth than in a game like PUBG, which early on had a tick rate around 15 Hz (oof).
It's always been a bucket list gamedev goal of mine to work on realtime multiplayer, but after a couple years of doing it I find myself fantasizing about a world where everyone is playing on a computer that isn't a potato with a connection that isn't jittery.
The major difference in constraint is how much user intention/data it's acceptable to discard, especially when a user message arrives later than expected.
In a multiplayer game, it's fine to discard all user intention that arrives a few minutes "late" - gameplay has moved on, and the client should discard local state and just use the server state. But this is totally unacceptable for a text editing application.
CRDTs or Operational Transforms are strategies that allow accepting user edits and preserving their intention even if the user is days or weeks behind or diverged from the server state. I'm not aware of a real-time multiplayer game that allows such high latency for user input.
> Gamedevs working on multiplayer FPSs and MMOs (which require resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades
No they haven't. Servers crash and players drop all the time, and many other things don't always act as intended. But practically it doesn't matter because nobody, not the developers and not the players, cares about whether or not their FPS is Byzantine fault tolerant.
CRDTs are actually "the tricks we have always used + math to prove whether they work or not". Read the papers. You'll find old school stuff like Lamport clocks.
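For anyone who hasn't run into Lamport clocks before, here is a minimal sketch of the idea in Python. The class and method names (`LamportClock`, `tick`, `receive`) are my own choices for illustration, not taken from any particular paper or library.

```python
class LamportClock:
    """Logical clock: orders events without relying on wall-clock time."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Local event or send: advance the clock and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """On message receipt: jump past whatever the sender had seen."""
        self.time = max(self.time, remote_time) + 1
        return self.time


alice, bob = LamportClock(), LamportClock()
alice.tick()                 # alice: 1
stamp = alice.tick()         # alice: 2, attached to an outgoing message
bob.tick()                   # bob: 1
seen = bob.receive(stamp)    # bob: max(1, 2) + 1 = 3
assert stamp < seen          # the receive is ordered after the send
```

The only guarantee is that if one event causally precedes another, its timestamp is smaller. Many CRDT designs pair a counter like this with a replica ID to break ties deterministically.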
> Gamedevs working on multiplayer FPSs and MMOs (which require resolving incredibly complex state synchronizations at millisecond-scales) have done this for decades
...no they haven't? Those aren't distributed systems. There is an authoritative server that holds the source of truth, and players sync up with it.
A truly distributed game would have each game client contain the full world state, and would synchronize with every other player rather than an authoritative server.
At least just write out the name in full before using the acronym. "CRDT" could mean literally anything and it doesn't help when half the links in the list also don't bother to name it before using the acronym.
Resist using acronyms before defining them. Don't just do it because everyone else does it. You are only creating barriers for people who might be interested in what you are talking about. I know people do this intentionally too. Don't be so insecure. You don't need to invent special language to remain relevant.
Since no one else defined it, CRDT = Conflict-free Replicated Data Type.
There are two variants, which confusingly have very similar acronyms:
- CmRDT: Commutative Replicated Data Types (also known as operation-based CRDTs). These CRDTs replicate state by transmitting update operations to peers. Peers are always able to apply these operations in any order and without conflicts.
- CvRDT: Convergent Replicated Data Types (also known as state-based CRDTs). These CRDTs replicate state by transmitting the entire object every time an update is made to a local replica. Peers are able to merge the state they receive with their local copy without conflicts.
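To make the two variants concrete, here is a rough Python sketch of a counter done both ways. The names `GCounter` and `OpCounter` are just illustrative, and the op-based version assumes the transport delivers every operation exactly once, which a real system would have to arrange.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """State-based (CvRDT) grow-only counter: one slot per replica."""
    replica_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative and idempotent,
        # so full states can arrive in any order, any number of times.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


@dataclass
class OpCounter:
    """Operation-based (CmRDT) counter: replicas ship the operations."""
    total: int = 0

    def apply(self, delta: int) -> None:
        # Addition commutes, so ops can be applied in any order.
        # Assumption: each op is delivered exactly once.
        self.total += delta


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # a duplicate merge is harmless
assert a.value() == b.value() == 5

x, y = OpCounter(), OpCounter()
ops = [+1, -4, +10]
for op in ops:
    x.apply(op)
for op in reversed(ops):
    y.apply(op)
assert x.total == y.total == 7
```

The trade-off shows up even at this size: the state-based version tolerates duplicated and reordered messages but ships the whole state, while the op-based version ships small messages but leans on the delivery layer.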
Off topic - probably the most successful piece of "content" I've ever made is the CRDT notes [1] item that's nestled in there. I saw this submission and thought, "I wonder if my repo made it in there" and indeed it did.
Why I find that funny is that I made that repo on a whim while I was doing my own reading, and then did nothing with it. Maybe I tweeted it? But somehow it SEOed well with Google for a stretch and I've been getting a steady stream of stars on that repo ever since. I assume it was because I created it when CRDTs were still early and so it got the clicks.
I'm sure a lot of folks here know what it's like to try to put projects out there and go looking for traction. It's always made me chuckle that one of my biggest successes was the unintentional one.
I mean... data, right? I'm trying to learn about CRDTs for complex, nested data structures. The use cases are for... well, nested data: anything you'd use B-trees, etc. for, but over distributed systems.
A big thing I'm currently learning to do with them is write a content-addressable system with a more forgiving merge policy between parties.
Yeah, I often see people nitpick CRDTs about user intention, and whether it should be `ADBC` or `ABCD`, but in my case I'm focusing on multi-device, not multi-user, and even in multi-user it's still best-in-breed when you are designing away from centralization.
I'm still struggling to learn the more complex approaches to CRDTs. So many resources focus on the low-hanging fruit: grow-only counters, basic text editing, etc. I need to build the full suite of data structures: maps, sets, lists, etc.
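As one step past counters, here is a hedged sketch of a state-based observed-remove set (often called an OR-Set) in Python. This is one common design, not the only one. The `ORSet` class and its method names are hypothetical, and the tombstone set grows forever here, which a real implementation would need to deal with.

```python
import uuid


class ORSet:
    """Observed-remove set: a remove only cancels the adds it has seen."""

    def __init__(self) -> None:
        self.adds = {}        # element -> set of unique add tags
        self.removed = set()  # tombstoned tags

    def add(self, element) -> None:
        # Every add gets a fresh tag so it can be told apart from earlier adds.
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element) -> None:
        # Tombstone only the tags this replica has observed so far.
        self.removed |= self.adds.get(element, set())

    def contains(self, element) -> bool:
        return bool(self.adds.get(element, set()) - self.removed)

    def elements(self) -> set:
        return {e for e in self.adds if self.contains(e)}

    def merge(self, other: "ORSet") -> None:
        # Union of tags and union of tombstones: order-insensitive and idempotent.
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed


a, b = ORSet(), ORSet()
a.add("x")
b.merge(a)      # both replicas now see "x"
a.remove("x")   # a removes the tag it observed
b.add("x")      # concurrently, b re-adds "x" with a fresh tag
a.merge(b)
b.merge(a)
assert a.elements() == b.elements() == {"x"}  # the concurrent add survives
```

Maps can be built along the same lines by keying a set like this and making each value another mergeable type. Lists are the hard part and need a separate ordering scheme.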
I have implemented a CRDT system and have also blogged about it afterwards. So I can confirm this.
Joking aside though, CRDTs are still a pretty esoteric space and all the various blog posts came in handy when I was researching how to build my own CRDT.
Most of the resources I've seen on CRDTs are whitepapers, and those can be difficult to read if you don't have a math background. I gave a talk at Papers We Love [1] explicitly because I found the academic papers were a big turn-off for many people interested in the space.
[+] [-] dang|4 years ago|reply
https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...
[+] [-] dqpb|4 years ago|reply
[+] [-] conaclos|4 years ago|reply
[+] [-] mox111|4 years ago|reply
[+] [-] nisa|4 years ago|reply
[+] [-] sunny--tech|4 years ago|reply
https://github.com/yjs/yjs
[+] [-] cyber_kinetist|4 years ago|reply
[+] [-] hesdeadjim|4 years ago|reply
[+] [-] jitl|4 years ago|reply
I like this talk on Overwatch's data model (entity-component-system model) https://www.gdcvault.com/play/1024001/-Overwatch-Gameplay-Ar... - the discussion of netcode starts at timestamp 22:30
[+] [-] User23|4 years ago|reply
[+] [-] dboreham|4 years ago|reply
[+] [-] phailhaus|4 years ago|reply
[+] [-] sunny--tech|4 years ago|reply
https://technology.riotgames.com/news/chat-service-architect...
[+] [-] nivenkos|4 years ago|reply
[+] [-] remram|4 years ago|reply
Even in P2P games or games with a weaker server, a given entity is owned by a single machine. This is a much simpler setup.
[+] [-] dnautics|4 years ago|reply
CRDT is:
A<=>B A<=>C B<=>C A<=>D ...
[+] [-] globular-toast|4 years ago|reply
Another user commented on a similar thing earlier today: https://news.ycombinator.com/item?id=28997945
[+] [-] zffr|4 years ago|reply
[+] [-] pfraze|4 years ago|reply
1. https://github.com/pfrazee/crdt_notes
[+] [-] keewee7|4 years ago|reply
[+] [-] sunny--tech|4 years ago|reply
Redis uses CRDTs for active-active architectures [1] and for some of their native data structures [2].
Riak also uses them in their data store [3].
And it looks like PayPal might use them for consensus purposes (I found this while looking up the Riak talk so I haven't actually watched it) [4].
1: https://redis.com/blog/diving-into-crdts/
2: https://redis.com/videos/active-active-geo-distribution-redi...
3: https://www.youtube.com/watch?v=f20882ZSdkU&ab_channel=Erlan...
4: https://www.infoq.com/presentations/crdt-production/
[+] [-] lijogdfljk|4 years ago|reply
[+] [-] nuerow|4 years ago|reply
[+] [-] nikivi|4 years ago|reply
[+] [-] tomsonj|4 years ago|reply
[+] [-] dang|4 years ago|reply
https://news.ycombinator.com/newsguidelines.html
[+] [-] sunny--tech|4 years ago|reply
[1]: https://www.youtube.com/watch?v=1Bs3Fj9rvks&t=2169s&ab_chann...
[+] [-] nosianu|4 years ago|reply
Or did you observe that 100% of the people who blogged about CRDTs implemented CRDTs and then wrote a blog post?
[+] [-] hecturchi|4 years ago|reply
https://arxiv.org/abs/2004.00107#
The link takes you to our Merkle-CRDTs paper, which includes a nice intro to CRDTs in general, so no prior knowledge needed!