
How the Dropbox Datastore API Handles Conflicts – Part Two

61 points| llambda | 12 years ago |dropbox.com | reply

19 comments

[+] jchrisa|12 years ago|reply
The problem with this approach is that it requires you to resolve conflicts when you first see them. So you can't do workflows that accumulate more than two versions of what happened. Nor can you resolve the conflict asynchronously.

For an approach that doesn't ever force you to throw away or merge data, see the data structures in my OSCON talk. https://speakerdeck.com/jchris/couchbase-lite-oscon

I owe the world a write-up explaining these slides. Or come see me talk at StrangeLoop in St. Louis or Couchbase SF in September. http://www.couchbase.com/couchbase-sf-2013

[+] erikb|12 years ago|reply
I could interpret it as a multiple-branch model with Git. But it's still rather confusing to me what use it would have for anybody outside of source code repos (meaning: I don't have the knowledge, not that there wouldn't be a use case).

For some end users it's already confusing to have different states in the same linear history on different devices because of upload times and devices being offline. I think giving most users multi-branch capabilities would overpower them and they might just ignore all branches but one, manually merging parts of older branches into the newer ones.

So I guess you assume usecases for application writers? I'm also not deep enough into this topic that I could imagine any usecases. Would you mind giving some examples?

[+] pkj|12 years ago|reply
Trying to wrap my head around this. Seems difficult without clear usecases.

Let's say I have 10 devices d1,d2....d10 that were making updates to "a" on the server and then went offline. a==20 and the last update was by d5 before everyone went offline.

When the devices come back up, the fate of "a" depends on the rulesets. Following are 3 possible high-level combinations.

i) All devices have "remote" rule. On reconnection, everyone rolls back "a" to 20. They are essentially back to the time before going offline. Even the device which did the last update (d5) before going offline is rolled back too, which seems a bit odd. Still simple to reason about..

ii) All devices have "local" rule. On reconnection, the last device to reconnect updates "a". It is then broadcast to all other devices. Note that it is not the last device to update "a". Rather it is the last to reconnect (Now, even if all of them reconnect at the same time, depending on the queueing at the server, the one at the tail wins). Not really simple..

iii) Mix of "remote" and "local". Let's say d1 had "local" rule and all others had "remote". On reconnection, d1's "a" will be propagated to everyone. This is irrespective of the order of reconnection (I am assuming that between reconnections "a" is not modified). This is pretty simple and perfectly predictable. Now, if we have more than one "local", things start getting non-deterministic, and at the extreme we move to case ii)

[+] smarx|12 years ago|reply
Note that every device submits its change with an expected "parent revision." The server checks the change against the server's current revision, and the change is accepted if and only if the server revision matches the parent revision of the submitted change.

So when devices d1 through d10 make a (simultaneous, I assume) change, they all submit their change with the same parent revision. Assuming they were up-to-date before they submitted that change, exactly one of the devices' changes will succeed (whichever reaches the server first). For example, if the previous revision was 100, they'll all submit changes with the parent revision 100, and the first one to reach the server will succeed, at which point the server revision will be increased to 101. When the other changes come in, they'll all fail because their parent revision doesn't match the server.
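The parent-revision check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Datastore API: the `Server` class, its `submit` method, and the starting values are hypothetical names and numbers chosen to mirror the revision-100 example.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical in-memory server holding one value and its revision."""
    revision: int = 100
    value: int = 20

    def submit(self, parent_revision: int, new_value: int) -> bool:
        # Accept the change only if it was made against the current revision.
        if parent_revision != self.revision:
            return False
        self.value = new_value
        self.revision += 1
        return True


server = Server()
# Three devices all submit against parent revision 100; only the first can match.
results = [server.submit(100, device_value) for device_value in (1, 2, 3)]
print(results, server.revision, server.value)  # [True, False, False] 101 1
```

What a rejected device does next is up to its conflict resolution rule; the server itself only compares revisions.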

I have to slightly revise your scenario and say all these devices went offline and then made a local change. This may be what you meant, but I want to clarify that the changes were queued up locally but not yet sent to the server.

So in (i), where the conflict resolution strategy is "remote," what will happen is that one device's change will win (whichever reaches the server first), and all other devices will throw out their change in favor of the change that made it to the server. It's not the case that everybody's changes are rolled back.

In (ii), the first device to connect submits its change, which is accepted by the server. Subsequent devices submit their change (with parent revision of 100), see that they're out of date (server revision is now 101 or higher), and resubmit their change with the new parent revision, effectively clobbering any changes that have been made on the server. So each device in turn clobbers the value on the server, and the ultimate value is whichever change was submitted last.

For (iii), really don't do that. Having different devices use different conflict resolution strategies is a bad idea, and I can't really think of a valid scenario for that. Can you?

So to sum up, (i) would mean "first change to reach the server wins," and (ii) would mean "last change to reach the server wins."
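To make the "first wins" versus "last wins" outcome concrete, here is a small Python simulation under simplifying assumptions: every device queued exactly one change against revision 100 while offline, and changes reach the server one at a time. The `simulate` function, the strategy strings, and the device values are all made up for illustration and are not part of the Datastore API.

```python
def simulate(strategy, arrival_order, start_revision=100, start_value=20):
    """Return the server's final (revision, value) after offline devices reconnect.

    strategy: "remote" (drop the local change on conflict) or
              "local" (resubmit against the new revision on conflict).
    arrival_order: list of (device, value) in the order changes reach the server.
    """
    revision, value = start_revision, start_value
    for device, local_value in arrival_order:
        parent = start_revision      # every device queued its change against this
        if parent != revision:       # conflict: the server has moved on
            if strategy == "remote":
                continue             # discard the local change, keep the server value
            parent = revision        # "local": resubmit with the new parent revision
        value = local_value          # change accepted
        revision += 1
    return revision, value


order = [("d3", 33), ("d1", 11), ("d2", 22)]
print(simulate("remote", order))  # (101, 33): first change to arrive wins
print(simulate("local", order))   # (103, 22): last change to arrive wins
```

Reordering `order` changes which device wins in both cases, which is the point: the result depends on arrival order at the server, not on when each edit was made.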

[+] boomzilla|12 years ago|reply
Awesome, this is one of the best explanations of version conflict resolution in distributed storage.

The best engineers not only need to write great code, they need to write the best documentation too :)

[+] joejohnson|12 years ago|reply
I might not understand what the Datastore API is for, but if I wanted to make a simple app that could sync Word Docs between two clients, would these .doc files get represented in this database structure (tables, records and values)? Is that data structure just for example purposes?

How would this algorithm handle changes to the same text file? Is the file a record, and each row considered a value in that record?

EDIT: Looks like this API is only for "structured data like contacts, to-do items, and game state." https://www.dropbox.com/developers/datastore

[+] joshuak|12 years ago|reply
Why does Chrome prompt me for keychain access on this page?

I'm not logged in, and it doesn't log me in.

[+] minikites|12 years ago|reply
I've heard around (I don't use Chrome personally) that Chrome just prompts for keychain access until you give up and grant it access all the time.