top | item 46074562

(no title)

mjb | 3 months ago

There is no reason a database can’t be both strongly consistent (linearizable, or equivalent) and available to clients on the majority side of a partition. This is, by far, the common case of real-world partitions in deployments with 3 data centers. One is disconnected or fails. The other two can continue, offering both strong consistency and availability to clients on their side of the partition.

The Gilbert and Lynch definition of CAP calls this state ‘unavailable’, in that it’s not available to all clients. Practically, though, it’s still available for two thirds of clients (or more, if we can reroute clients from the outside), which seems meaningfully ‘available’ to me!

If you don’t believe me, check out Phil Bernstein’s paper (Bernstein and Das) about this. Or read the Gilbert and Lynch proof carefully.

discuss

order

lmm|3 months ago

> The Gilbert and Lynch definition of CAP calls this state ‘unavailable’, in that it’s not available to all clients. Practically, though, it’s still available for two thirds of clients (or more, if we can reroute clients from the outside), which seems meaningfully ‘available’ to me!

That's great for those two thirds but not for the other one third. (Indeed you will notice that it's "available" precisely to the clients that are not "partitioned").

mjb|3 months ago

When does AP help?

It helps in the case where clients are (a) able to contact a minority partition, and (b) can tolerate eventual consistency, and (c) can’t contact the majority partition. These cases are quite rare in modern internet-connected applications.

Consider a 3AZ cloud deployment with remote clients on the internet, and one AZ partitioned off. Most often, clients from the outside will either be able to contact the remaining majority (the two healthy AZs), or will be able to contact nobody. Rarely, clients from the outside will have a path into the minority partition but not the majority partition, but I don’t think I’ve seen that happen in nearly two decades of watching systems like this.

What about internal clients in the partitioned off DC? Yes, the trade-off is that they won’t be able to make isolated progress. If they’re web servers or whatever, that’s moot because they’re partitioned off and there’s no work to do. Same if they’re a training cluster, or other highly connected workloads. There are workloads that can tolerate a ton of asynchrony where being able to continue while disconnected is interesting, but they’re the exception rather than the rule.

Weak consistency is much more interesting as a mechanism for reducing latency (as DynamoDB does, for example) or increasing scalability (as the typical RDBMS ‘read replicas’ pattern does).