(no title)
mpercy | 5 years ago
As one of the core authors of the Apache Kudu Raft implementation <http://kudu.apache.org/> (which is written in C++) I know that we tried to design it to be a pretty standalone subsystem, but didn't try to actually provide a libraft per se. We wanted to reuse the Raft write-ahead log as the database write-ahead log (as a performance optimization) which is one reason that making the log API completely generic eluded us a little.
That said, I'm currently at Facebook helping to adapt that implementation for another database. We are trying to make it database agnostic, and we continue to find cases where we need some extra metadata from the storage engine, new kinds of callbacks, or hacks to deal with various cases that just work differently than the Kudu storage engine. It would likely take anybody several real world integrations to get the APIs right (I'm hopeful that we eventually will :)
matthewaveryusa|5 years ago
https://github.com/matthewaveryusa/raft.ts/blob/master/src/i...
What kind of metadata do you need exactly from the datastore?
mpercy|5 years ago
1. Kudu uses a special "commit" record that goes into the log for crash consistency, related to storage engine buffer flushes. So we need an API to write those into the log. They don't have a term and an index, since they are a local-engine thing, so they have to be skipped when replicating data to other nodes in the case of the current node being the leader. If we were not sharing the log with the engine, we wouldn't need this.
2. Another database I'm working with requires file format information to be written at the top of every log segment, and it has to match the version of the log events following it. That info has to be communicated to the follower up-front even when the follower resumes replicating from the middle of the leader's log. So we need plugin callbacks on both sides to handle this, in terms of packing this as the leader and unpacking it as a follower into the wire protocol metadata.
Requirements like these will come up and either you hack around them by making some kind of out-of-band call (not ideal for multiple reasons) or you bake the capability into the plugin APIs and the communication protocol.
Frankly, designing generic APIs is also one of the less sexy aspects to consider because we spend so much of our time dreaming about and building all the cool distributed systems capabilities like leader elections, dynamic membership changes, flexible quorums, proxying/forwarding messages, rack/region awareness, etc etc etc. :)
The details of long-tail stuff like this is often hammered out as it comes up during implementation.
mpercy|5 years ago