top | item 37370823

(no title)

uw_rob | 2 years ago

Think databases which run across many different machines.

Distributed databases are often conceptually modeled as a state machine. Writes are then mutations on the state machine.

With a starting state (empty database), if everyone agrees that on a fixed list of mutations which are executed in a rigidity defined ordering, you will get the same final state.

Which makes sense, right? If you run the following commands on an empty database, you would expect the same final state:

1. CREATE TABLE FOO (Columns = A, B) 1. INSERT INTO FOO (1, 2) 1. INSERT INTO FOO (3, 4)

Which would be:

``` FOO: |A|B| |-|-| |1|2| |3|4| ```

So where does "consensus" come into play? Consensus is needed to determine `mutation 4`. If the user can send a request to HOST 1 saying 'Mutation 4. should be `INSERT INTO FOO (5, 6)`' then HOST 1 will need to coordinate together with all of the other hosts and hopefully all of the hosts can agree that this is the 4th mutation and then enact that change on their local replica.

This ordering of mutations is called the transaction log.

So, why is this such a hard problem? Most of the reasons are in [Fallacies of distributed computing](https://en.wikipedia.org/wiki/Fallacies_of_distributed_compu...) but the tl;dr is that everything in distributed computing is hard because hardware is unreliable and anything that can go wrong will go wrong. Also, because multiple things can be happening at the same time in multiple places so it's hard to figure out who came first, etc.

RAFT is such an algorithm to let all of these hosts coordinate together in a fault tolerant way to figure out the 4th mutation.

Disclaimer: The above is just one use of RAFT. Another way RAFT is used in distributed databases is as a mechanism for the hosts to coordinate a hierarchy of communicate among themselves and when a host in the hierarchy is having problems RAFT can be used again to figure out another hierarchy. (Think consensus is reached on leader election to improve throughput)

discuss

order

No comments yet.