zawaideh|8 months ago
If the friend is online then sending operations is possible, because they can be decrypted and merged.
ath92|8 months ago
So instead of merging changes on the server, all you need is some way of knowing which messages you haven’t received yet. Importantly this does not require the server to be able to actually read those messages. All it needs is some metadata (basically just an id per message), and when reconnecting, it needs to send all the not-yet-received messages to the client, so it’s probably useful to keep track of which client has received which messages, to prevent having to figure that out every time a client connects.
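The "blind relay" described above can be sketched in a few lines. This is a hypothetical minimal model, not any particular product's protocol: the server stores opaque (encrypted) payloads it cannot read, assigns each an id, and remembers per client which messages have already been delivered.

```python
import itertools

class BlindRelay:
    """Hypothetical sketch of a sync server that stores opaque
    (encrypted) messages and tracks, per client, which ones that
    client has already received. It never decrypts anything."""

    def __init__(self):
        self._log = []                # list of (msg_id, opaque_bytes)
        self._ids = itertools.count()
        self._cursor = {}             # client_id -> index of next undelivered msg

    def publish(self, opaque_payload):
        msg_id = next(self._ids)
        self._log.append((msg_id, opaque_payload))
        return msg_id

    def fetch_missing(self, client_id):
        """Return every message this client hasn't seen, then advance its cursor."""
        start = self._cursor.get(client_id, 0)
        missing = self._log[start:]
        self._cursor[client_id] = len(self._log)
        return missing

relay = BlindRelay()
relay.publish(b"ciphertext-1")
relay.publish(b"ciphertext-2")
assert relay.fetch_missing("alice") == [(0, b"ciphertext-1"), (1, b"ciphertext-2")]
relay.publish(b"ciphertext-3")
assert relay.fetch_missing("alice") == [(2, b"ciphertext-3")]   # only the new one
```

The per-client cursor is exactly the "metadata (basically just an id per message)" the comment mentions: it lets the server skip straight to the undelivered suffix on reconnect.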
josephg|8 months ago
Generally there are two categories of CRDTs: state-based and operation-based CRDTs.
State-based CRDTs are like a variable which is set to a new value each time it changes (think CouchDB, if you’ve used it). In that case, yes, you generally do update the whole value each time.
Operation-based CRDTs - used in things like text editing - are more complex, but, like the parent said, deal with editing events. So long as a peer eventually gets all the events, it can merge them together into the resulting document state. CRDTs have a correctness criterion that the same set of operations always merges into the same document, on all peers, regardless of the order in which you receive the messages.
Anyway, I think the parent comment is right here. If you want efficient E2E encryption, using an operation-based CRDT is probably a better choice.
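The two categories can be illustrated with toy examples. This is a hedged sketch, not the design of any real library: a state-based last-writer-wins register whose merge is just a commutative `max`, and an op-based grow-only set where applying the same op set in any order converges to the same state.

```python
import itertools

# State-based: replicas exchange whole states; merge must be commutative,
# associative, and idempotent. Here: last-writer-wins by (timestamp, replica_id).
def lww_merge(a, b):
    # state is a tuple (timestamp, replica_id, value)
    return max(a, b)

# Op-based: replicas exchange events; any delivery order of the same set of
# ops must converge. Here: a grow-only set of "add" events.
def apply_ops(ops):
    state = set()
    for op in ops:
        state.add(op)
    return state

ops = [("add", "x"), ("add", "y"), ("add", "z")]
# Every possible delivery order yields the same document state:
states = {frozenset(apply_ops(p)) for p in itertools.permutations(ops)}
assert len(states) == 1

a, b = (5, "alice", "v1a"), (7, "bob", "v1b")
assert lww_merge(a, b) == lww_merge(b, a) == b   # newer timestamp wins either way
```

The convergence check in the middle is the correctness criterion from the comment above, made executable for one tiny op set.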
charcircuit|8 months ago
Joker_vD|8 months ago
This scheme doesn't require the two people to be online simultaneously — all updates are mediated via the sync server, after all. So, where am I wrong?
eightys3v3n|8 months ago
This could be done to reduce the time required for a client to catch up once it comes online (otherwise it would need to replay every change that has happened since it last connected to reach the converged state). But the article also mentions something about keeping the latest version quickly accessible.
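The catch-up shortcut can be sketched as resuming from a flattened snapshot plus only the ops that came after it, instead of replaying the log from the beginning. The document model and op format here are hypothetical (a dict of key/value set-ops), chosen only to make the idea concrete.

```python
# Hypothetical op format: an op is (key, value), meaning "set key to value".
def apply_op(doc, op):
    key, value = op
    doc = dict(doc)          # keep it pure: return a new document
    doc[key] = value
    return doc

def catch_up(snapshot, snapshot_version, op_log):
    """Replay only the ops newer than the snapshot."""
    doc = snapshot
    for op in op_log[snapshot_version:]:
        doc = apply_op(doc, op)
    return doc

op_log = [("title", "draft"), ("title", "final"), ("body", "hello")]
snapshot = {"title": "final"}        # flattened state after the first 2 ops

# Resuming from the snapshot gives the same result as a full replay:
assert catch_up(snapshot, 2, op_log) == {"title": "final", "body": "hello"}
assert catch_up({}, 0, op_log) == catch_up(snapshot, 2, op_log)
```

In the E2E-encrypted setting the snapshot would itself be an encrypted blob produced by a client, since the server can't flatten ciphertexts it cannot read.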
unknown|8 months ago
[deleted]
crdrost|8 months ago
The reason CRDT researchers don't like the sync server is that it's the very thing CRDTs are meant to solve. CRDTs are a building block for theoretically-correct eventual consistency: that's the goal. Which means our one source of truth now exists in N replicas, those replicas are getting updated separately, and now: why choose eventual consistency rather than strong consistency? You always want strong consistency if you can get it, but sometimes the cost of synchronously coordinating the replicas is too high.
So now we have a sync server like you planned? Well, if we're at the scale where CRDTs make sense then presumably we have data races. Let's assume Alice and Bob both read from the sync server and it's a (synchronous, unencrypted!) last-write-wins register, both Alice and Bob pull down "v1" and Alice writes "v1a" to the register and Bob in parallel writes "v1b" as Alice disconnects and Bob wins because he happens to have the higher user-ID. Sync server acknowledged Alice's write but it got lost until she next comes online. OK so new solution, we need a compare-and-swap register, we need Bob to try to write to the server and get rejected. Well, except in the contention regime that we're anticipating, this means that we're running your sync server as a single-point-of-failure strong consistency node, and we're accepting the occasional loss of availability (CAP theorem) when we can't reach the server.
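The Alice/Bob race above can be made concrete. This is a toy sketch of the two register semantics being contrasted, not anyone's real server: the last-write-wins register acknowledges Alice's write and then silently drops it when Bob ties on version and wins on user-ID, while the compare-and-swap register rejects Bob's stale write instead.

```python
class LWWRegister:
    """Last-write-wins: highest (version, writer_id) silently wins."""
    def __init__(self, version, value, writer):
        self.state = (version, writer, value)

    def write(self, version, value, writer):
        self.state = max(self.state, (version, writer, value))
        return True                      # always "acknowledged"

class CASRegister:
    """Compare-and-swap: a write against a stale version is rejected."""
    def __init__(self, version, value):
        self.version, self.value = version, value

    def write(self, expected_version, value):
        if expected_version != self.version:
            return False                 # writer must re-read and retry
        self.version, self.value = self.version + 1, value
        return True

lww = LWWRegister(1, "v1", "")
assert lww.write(2, "v1a", "alice")      # Alice's write is acknowledged...
assert lww.write(2, "v1b", "bob")        # ...but Bob ties on version,
assert lww.state == (2, "bob", "v1b")    # wins on user-ID, and "v1a" is lost

cas = CASRegister(1, "v1")
assert cas.write(1, "v1a") is True       # Alice succeeds
assert cas.write(1, "v1b") is False      # Bob is rejected (stale version)
```

The CAS version surfaces the conflict, but at the cost described above: the register is now a strongly consistent single point of failure.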
Even worse, such a sync server _forces_ you into strong consistency even if you're like "well the replicas can lose connection to the sync server and I'll still let them do stuff, I'll just put up a warning sign that says they're not synced yet." Why? Because they use the sync server as if it is one monolithic thing, but under contention we have to internally scale the sync server to contain multiple replicas so that we can survive crashes etc. ... if the stuff happening inside the sync server is not linearizable (aka strongly consistent) then external systems cannot pretend it is one monolithic thing!
So it's like, the sync server is basically a sort of GitHub, right? It's operating at a massive scale, so internally it presumably needs many Git clones of the data so that if the primary replica goes down it can still serve your repo, merge a pull request, and whatever else. But then it absolutely sucks to merge a PR and find out that afterwards it's not merged, so you go into panic mode and try to fix things, only to discover 5 minutes later that the PR is now merged. And a really active eventually-consistent CRDT system has a lot of potential for exactly that kind of bug.
For the CRDT researcher the idea of "we'll solve this all with a sync server" is a misunderstanding that takes you out of eventual-consistency-land. The CRDT equivalent that lacks this misunderstanding is, "a quorum of nodes will always remain online (or at least will eventually sync up) to make sure that everything eventually gets shared," and your "sync server" is actually just another replica that happens to remain online, but isn't doing anything fundamentally different from any of the other peers in the swarm.
blamestross|8 months ago
Or the user's client can flatten un-acked changes and tell the server to store that instead.
It can just always flatten until it hears back from a peer.
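Client-side flattening can be sketched like this. The op format is hypothetical (key/value set-ops); the point is only that a run of un-acked local ops can be compacted into a smaller, equivalent batch before being handed to the server.

```python
def flatten(ops):
    """Collapse a run of (key, value) set-ops: only the last write per
    key matters, so the flattened batch is equivalent but smaller.
    Relies on dicts preserving insertion order."""
    latest = {}
    for key, value in ops:
        latest[key] = value
    return list(latest.items())

# Three keystrokes on "title" plus one edit to "body" compact to two ops:
unacked = [("title", "a"), ("title", "ab"), ("title", "abc"), ("body", "hi")]
assert flatten(unacked) == [("title", "abc"), ("body", "hi")]
```

This only works for op types whose composition stays expressible as one op; once a peer has acknowledged some prefix, those ops are frozen and can no longer be flattened away.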
The entire scenario is over-contrived. I wish they had just shown it off instead of inventing a justification for it.
clawlor|8 months ago