top | item 10230566

(no title)

sijieg | 10 years ago

Ah, just happened to see those two blog posts together in same place. Those are really good posts on explaining replication scheme of Apache BookKeeper.

One thing to add on Flavio's blog post. Readers of a log (ledger) agree on LastAddConfirmed (lac), which LAC could be thought of 'commit' message in most of consensus protocol. In replicated log, commit means making data visible for readers.

BookKeeper doesn't enforce 'commit' like what other consensus protocol does. Instead it exposes the core elements of a consensus protocol as primitives and let applications decide things such as when to commit, how often to commit. Readers could use API (readerLastConfirmed) to catch up to latest 'commit' data. Controlling when to commit is the way how DistributedLog uses BookKeeper to tune end-to-end latency for different types of workloads: for latency-sensitive workloads like database, it does aggressive commits, for analytics workload, it does periodical commits to get benefits (such as reducing bandwidth by compression) by grouping.

discuss

order

No comments yet.