Please stop calling databases CP or AP (2015)

[+] sradman|5 years ago|reply

Martin Kleppmann applies critical thinking to the claims of NoSQL marketing; I wish I was aware of his work at the time.

I always found it frustrating that the semantics of the word "Consistent" in ACID has nothing to do with the semantics of CAP "Consistency". ACID Consistency refers to integrity constraints while CAP Consistency seems to refer to Cache Coherence across nodes in a cluster. I was a little surprised that Kleppmann uses the idea of "Linearizability" to explain the discrepancy but it does clearly identify the same mistaken assumptions.

I was also frustrated with the NoSQL movement's focus on Partitioned clusters when my intuition was that it was a very rare failure mode. Perhaps we can now return to the metrics of data loss (RPO - Recovery Point Objective) and Recovery Time Objective (RTO) since we seem no closer to fully transparent failure recovery.

Regardless, the goal was always to have available and scalable databases that operate on clusters of commodity cloud servers and all of the SQL, NoSQL, and NewSQL solutions have converged around that goal. The transition to cloud computing seems to have been more about NoRAID than NoSQL. Some technologies like LSM-Trees address important use-cases in commodity clusters so maybe we/I shouldn't focus on the truthiness of the original rhetoric.

[+] fiedzia|5 years ago|reply

>I was also frustrated with the NoSQL movement's focus on Partitioned clusters when my intuition was that it was a very rare failure mode.

Network errors are very common among errors database has to deal with, and particularly common when we are talking about networks between datacenters.

[+] georgewfraser|5 years ago|reply

The CAP theorem is so dumb and yet so influential. If you want to gain a more useful understanding of the trade offs of distributed systems, I highly highly recommend the “PACELC theorem” by Daniel Abadi:

https://www.cs.umd.edu/~abadi/papers/abadi-pacelc.pdf

[+] maxov|5 years ago|reply

The posted article briefly discusses PACELC and Kleppmann says he doesn't find it any more enlightening than CAP.

PACELC is an improvement over CAP in some sense because it is a finer-grained characterization of distributed systems, but it still suffers from the problem that the stated properties are too strong. As Kleppmann explains in the article, there are systems that strictly only fulfill the partition tolerance part of the equation, yet are still very practically useful. This is because while they may not satisfy strict linearizability and availability, they have softer guarantees (e.g. defining "availability" as "99% of requests complete in 1 second") that we can rely on in practice. This reasoning extends to PACELC as well.

I am inclined to agree with Kleppmann. CAP, and by consequence PACELC are useful mental models in the extreme, but their characterizations are really strong. It seems to me that e.g. probabilistic guarantees based on actual failure statistics are much more useful for the real world, because they aren't as strong as CAP/PACELC guarantees.

Edit: I am also not sure if I agree with calling CAP "dumb". It's a very useful theoretical result in a particular model. I think the problem is when people try to extend the implications of the theorem beyond the restricted model it operates in.

[+] dap|5 years ago|reply

Why dumb? Admittedly I don't like the common phrasing of "pick 2". I much prefer: "in the face of a partition, a system can be consistent or available, but not both."

[+] cheph|5 years ago|reply

> The CAP theorem is so dumb

Why?

[+] genneth|5 years ago|reply

In reality the application running on top of the database wants C-kinda-A. https://research.google/pubs/pub45855/

> Despite being a global distributed system, Spanner claims to be consistent and highly available, which implies there are no partitions and thus many are skeptical. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA.

The point is that if my application will actually go down anyway if the WAN craps out, then really, I don't actually need P. The application would also be significantly simpler if it assumes that if the application can work, then the DB is also up. And one seems that realistic systems built on Spanner simply assume CA and then get on with life...

[+] cthalupa|5 years ago|reply

>The point is that if my application will actually go down anyway if the WAN craps out, then really, I don't actually need P.

Can you guarantee no further modification to the data in the database occurs when the WAN craps out? What happens to in flight requests, batch/scheduled jobs, etc.? There's no situation where your application continues modifying the contents of the database even when there's no WAN link?

I've seen very few real world services where there is a 0% chance of data changing just because the application servers can no longer talk with clients but can still talk with the database. What happens when those databases try to rejoin the cluster and their data is now inconsistent, if you did not take into account partition tolerance?

[+] EGreg|5 years ago|reply

CAP is trivial and doesn’t tell you anything except that you can’t have all three things perfectly. It says nothing about the various trade-offs.

Often you can just let the people choose whether to accept the latest state of a particular partition of the database, or wait until it becomes consistent again.

[+] srtjstjsj|5 years ago|reply

CAP is trivial like Seinfeld isn't funny. It's so fundamental and well taught now that we forget there was a time before it was known.

CAP theorem is older than the career of nearly everyone on HN.

[+] cheph|5 years ago|reply

> CAP is trivial and doesn’t tell you anything except that you can’t have all three things perfectly.

And if someone don't understand it then it usually does not end well if they work with distributed data stores.

[+] Autowired|5 years ago|reply

Always wondered, if the CAP theorem does not account for latency, one assumes infinite latency does not impact availability. Shouldn't then be possible to define a theoretical system with an infinite buffer that, in the event of a network partition, will simply keep all incoming requests on hold (infinitely or until the partition is undone), satisfying CAP?

I know this is a useless construct in practice, and there is probably a flaw in my reasoning, but it seems to me that you have to establish a non-infinite timeout for the proof to be consistent.

[+] ses1984|5 years ago|reply

If there is a partition and you want to execute a write on both sides of the partition, both sides can wait, but eventually one side wins and the other side gets back a failure, so the db turned out not to be available on that side. At least that's my interpretation.

[+] unknown|5 years ago|reply

[deleted]

[+] dang|5 years ago|reply

Discussed at the time: https://news.ycombinator.com/item?id=9525266

[+] exabrial|5 years ago|reply

The best thing ever to happen to interviews was CAP questions to go out of the way of the dodo in terms of Silicon Valley fashion.

[+] cheph|5 years ago|reply

Not sure why you are so happy, every time I worked with people who did not understand it the outcome was really a massive mess of mismatched expectations.

[+] xwdv|5 years ago|reply

[deleted]

[+] GuB-42|5 years ago|reply

So the CAP theorem is a bit like Godel incompleteness theorem.

CAP tells us that databases are fundamentally broken, Godel theorem tells us that math is broken. But besides being important theoretical points, they don't matter that much in practice. In the same way, the two generals problem doesn't prevent TCP from working fine.

[+] burakemir|5 years ago|reply

Nah: the article says the CAP narrative is based on very specific assumptions which make it unfit to talk about actual distributed systems.

Gödel's incompleteness (that, by a diagonalization argument, formal systems of minimal expressivity will contain true statements that cannot be proven) applies and will continue to apply to all such formal systems. The bar is so low that any interesting formal system is affected by it. Also, it does not make math broken at all - those formal systems can and in fact must be consistent for the proof to go through (1st incompleteness theorem), and it's just a pity that whatever formal system one comes up with, if it contains arithmetic, it won't be able to prove it's own consistency (2nd incompleteness theorem).

In the words of von Neumann: Gödel's two past papers ("Widerspruchfreiheit" and Continuum Hypothesis), are in any case worth more than the total literary output, past, present and future, of most mathematicians [...]

51 comments