
You Can’t Sacrifice Partition Tolerance (2010)

43 points | xrorre | 10 years ago | codahale.com

21 comments

kerkeslager | 10 years ago
Maybe I'm missing something, but it seems like this is basically just glossing over this part:

> Some systems cannot be partitioned. Single-node systems (e.g., a monolithic Oracle server with no replication) are incapable of experiencing a network partition. But practically speaking these are rare;

I'm not sure how the writer came to the conclusion that these systems are rare. The website I'm working on now has a single monolithic Postgres server, and the majority of systems I've worked on connected to single monolithic datastores, so at the very least these systems exist. These aren't particularly interesting examples, but when talking theoretically like with the CAP theorem, you don't get to just ignore the rare or uninteresting cases. These seem like relatively common examples of choosing CA.

Of course, when you try to shard such databases, you're going to run into serious problems.

I guess the claim I can credit to the author is that CA really really really means no partition tolerance. Even one network partition puts you in CP or AP territory. But I don't think that means CA stores don't exist.

It may be that I'm just not understanding things correctly, though; the CAP paper is definitely a challenging read for me.

chillacy | 10 years ago
If you read the next sentence, he says:

> But practically speaking these are rare; add remote clients to the monolithic Oracle server and you get a distributed system which can experience a network partition (e.g., the Oracle server becomes unavailable).

mason55 | 10 years ago
You skipped the second part of that. Unless your web application is running on the database server, you are no longer running a single node: your web app and db server can become partitioned.

lostcolony | 10 years ago
Also, per the article, a single node system WITHOUT REPLICATION. That would hopefully be quite rare, as it means you have a single point of failure, with no automated recovery plan.

But as soon as you have replication...you have distribution, and what happens when a partition occurs, where the master can no longer talk to the replica(s)? That's where CAP applies; the system -will- make a decision about what to do, and in doing so will trade availability for consistency, or consistency for availability, for any given read/write scenario.
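
To make that trade-off concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration (the `Node` class, the `"CP"`/`"AP"` mode labels, the `peer_reachable` flag); it does not describe how Postgres, Oracle, or any real database behaves. It only shows the shape of the decision a node has to make once it can no longer reach its replica.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a node refuses a write in order to stay consistent."""


@dataclass
class Node:
    # "CP" and "AP" are hypothetical labels for this sketch only.
    mode: str
    peer_reachable: bool = True
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        # During a partition, a CP-style node gives up availability:
        # it rejects the write rather than diverge from its replica.
        if not self.peer_reachable and self.mode == "CP":
            raise Unavailable("cannot reach replica; refusing write")
        # An AP-style node (or any node with a reachable peer) accepts
        # the write; during a partition the replica will not see it.
        self.data[key] = value
        return "ok"


def try_write(node, key, value):
    """Return 'accepted' or 'rejected' instead of raising."""
    try:
        node.write(key, value)
        return "accepted"
    except Unavailable:
        return "rejected"


cp_node = Node(mode="CP")
ap_node = Node(mode="AP")

# Simulate the partition: neither node can reach its replica.
cp_node.peer_reachable = False
ap_node.peer_reachable = False

print("CP node:", try_write(cp_node, "x", 1))
print("AP node:", try_write(ap_node, "x", 1))
```

Either branch is a decision; the point is only that the system cannot avoid making one.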

zzzcpan | 10 years ago
> The website I'm working on now has a single monolithic Postgres server, and the majority of systems I've worked on connected to single monolithic datastores, so at the very least these systems exist.

These systems guarantee neither consistency nor availability during network partitions.

aminorex | 10 years ago
The implication derived from a trivial probability argument is misleading. First, failures are correlated, so 1-(1-P)^n is only an upper bound. Second, and much more importantly, the value being computed is the wrong one: failure of a node is not a partition. To non-trivially partition a cluster of N nodes requires (N-2) failures when the cluster is fully connected (as on any bus network, for example). Handling a trivial partition is, well, trivial, in some important senses.
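
For reference, the independence-based figure under discussion can be computed with a few lines of Python. This is only a sketch: the function name `p_any_failure` and the per-node probability of 0.001 are arbitrary choices for illustration, and the result is the chance that at least one node fails, not the chance of a partition.

```python
def p_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n nodes fails.

    Assumes failures are independent, each with probability p.
    """
    return 1.0 - (1.0 - p) ** n


# p = 0.001 is an arbitrary illustrative value, not a measured rate.
for n in (1, 10, 100, 1000):
    print(n, round(p_any_failure(0.001, n), 4))
```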

zaroth | 10 years ago
Liked everything up to this part:

"When it comes to designing or evaluating distributed systems, then, I think we should focus less on which two of the three Virtues we like most and more on what compromises a system makes as things go bad."

It seemed like the entire point was that there are really only 2 Virtues (Consistency, Availability) and given a partition you will choose one or the other. The only database that doesn't have network partitions also isn't distributed.

zzzcpan | 10 years ago
> and given a partition you will choose one or the other

There is no sacrificing consistency though. It's about whether you use CRDTs or something similar to keep the system working when a network partition happens, or you don't and let the system stop working.

superuser2 | 10 years ago
You can (and a shocking number of distributed database systems in widespread usage do) fail to guarantee either one.

throwaway_exer | 10 years ago
The original CAP paper was intended as a high-level discussion item for students. Brewer has since emphasized that it is more of an academic approach than applicable to real-world distributed databases.

Other computer scientists have either modified or narrowed it to be more useful.

So it's fun to read the original CAP paper, but it's less useful than you would expect. When somebody asks me about CP vs. CA, I realize they in fact don't know anything at all about distributed databases.