Of course two systems can only be consistent if they can communicate, so you have to either sacrifice availability until the partition is resolved, or give up on consistency.
I'm sorry, but I can't resist: isn't the CAP theorem irrelevant, because true network partitions never happen in the real world? If a link fails, an administrator will fix it eventually. With any system implementing ACK packets (TCP is one example), a link that fails but is then fixed is the same as a very slow link.
I don't think you should be sorry. I was wondering the same thing. If messages can't get lost (because of good network monitoring), can't you have perfect consistency and availability?
That said, practically speaking your system may have to wait for the partition to be fixed, which would make the system either practically unavailable or practically inconsistent. But not theoretically, at least not as this site describes it.
> With any system implementing ACK packets (TCP is one example), a link that fails but is then fixed is the same as a very slow link.
This is only even theoretically true if a) you have no bound on memory and b) the link will always come back up. In practice, neither of these is true.
Further, if you're relying on never being partitioned, then any network break requires delaying availability until the partition is resolved. CAP is then relevant if you're not willing to do this (and virtually no one is going to be so inclined).
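To make that bounded-memory point concrete, here is a minimal sketch of the trade-off, assuming a single sender with a bounded retry buffer (the BoundedSender class and its flags are invented for illustration, not taken from any real system): while the link is down the sender can pretend it is merely slow, but once the buffer fills it has to either refuse new writes or accept them knowing the peer will diverge.

```python
from collections import deque

class BoundedSender:
    """Queues unacknowledged messages for a peer behind a failed link.

    While the buffer has room, a dead link really does look like a very
    slow link. Once it fills, each new write forces a CAP-style choice:
    reject it (give up availability) or accept it locally and let the
    peer fall behind (give up consistency).
    """

    def __init__(self, max_buffered, prefer_availability):
        self.buffer = deque()
        self.max_buffered = max_buffered
        self.prefer_availability = prefer_availability
        self.link_up = True
        self.dropped = 0  # writes the peer will never see

    def send(self, msg):
        if self.link_up:
            return True                  # assume delivery + ACK succeed
        if len(self.buffer) < self.max_buffered:
            self.buffer.append(msg)      # retry later: "just a slow link"
            return True
        if self.prefer_availability:
            self.dropped += 1            # accept the write, replicas diverge
            return True
        return False                     # reject the write, stay consistent

sender = BoundedSender(max_buffered=2, prefer_availability=False)
sender.link_up = False
print([sender.send(i) for i in range(4)])   # [True, True, False, False]
```

Flipping prefer_availability to True keeps send() returning True, at the cost of buffered-out writes the other side never sees.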
Interesting FAQ... I like the idea of bringing this info together.
I've found that in practice there are plenty of things that cause partitions more often than equipment-in-the-middle failures. Human error is probably the biggest: network configuration changes, fresh bugs in your own software or in your dependencies, and so on.
Also, while a network might be asynchronous in theory, in practice there's usually a limit to how long a message can be delayed. The limit might be how much memory you have to queue up messages, or how long your client-side software (or your end user) is willing to wait for a message when a dialog is more complex than request/response.
When designing distributed software, I've found it helpful to ask: when (not if) a given process, server, cluster, or data center fails or becomes unreachable, temporarily or forever, how should the rest of my system respond?
So perhaps the most important takeaway from the FAQ for designers is #13: that C and A are "spectrums" you tune to meet your own requirements when the various failure scenarios happen.
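One common way to treat C and A as a dial rather than a switch is quorum replication; the sketch below is a toy, single-coordinator model (QuorumRegister and its parameters are invented for illustration, not any particular database's API). With N replicas, a write must reach W of them and a read must contact R; choosing R + W > N keeps every read quorum overlapping every write quorum, while smaller R or W stays available through more failures at the cost of possibly stale reads.

```python
class QuorumRegister:
    """N replicas, each holding a (version, value) pair.

    A write must reach W replicas and a read must contact R replicas,
    returning the value with the highest version it sees. R + W > N means
    read and write quorums always overlap, so reads see the latest
    acknowledged write; lowering R or W tolerates more unreachable
    replicas but can return stale data.
    """

    def __init__(self, n, r, w):
        self.replicas = [(0, None)] * n
        self.n, self.r, self.w = n, r, w
        self.version = 0

    def write(self, value, reachable):
        targets = [i for i in reachable if i < self.n]
        if len(targets) < self.w:
            return False                 # quorum unavailable: refuse the write
        self.version += 1
        for i in targets:
            self.replicas[i] = (self.version, value)
        return True

    def read(self, reachable):
        targets = [i for i in reachable if i < self.n]
        if len(targets) < self.r:
            return None                  # quorum unavailable: refuse the read
        return max(self.replicas[i] for i in targets)[1]

reg = QuorumRegister(n=3, r=2, w=2)          # R + W > N
print(reg.write("a", reachable={0, 1, 2}))   # True
print(reg.read(reachable={1, 2}))            # "a": any 2 replicas overlap the write set
print(reg.write("b", reachable={0}))         # False: below W during a partition, stays consistent
```

With r=1, the same partition would keep reads available but let them return whatever the one reachable replica happens to hold.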
"A partition is when the network fails to deliver some messages to one or more nodes by losing them (not by delaying them - eventual delivery is not a partition)."
That part is confusing to me. Doesn't the term partition have another meaning in distributed system design? For instance, consistent hashing "partitions" keys to multiple nodes. I haven't heard partition used as a term describing data loss.
Consider the network of nodes separating (partitioning) into two groups, each internally connected but disconnected from the other. Then messages from one group never reach the other group.
So partitioning still has the sense of "splitting"; it's just that the explanation focuses on messages rather than the network.
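A tiny model of that reading, just to make the message-centric definition concrete (the node names and the make_network helper here are invented for illustration):

```python
def make_network(groups):
    """Model a partitioned network: nodes in the same group can exchange
    messages; messages between groups are never delivered (not merely delayed)."""
    group_of = {node: i for i, members in enumerate(groups) for node in members}

    def deliver(src, dst):
        return group_of[src] == group_of[dst]

    return deliver

# The cluster splits into {a, b} and {c, d}.
deliver = make_network([{"a", "b"}, {"c", "d"}])
print(deliver("a", "b"))   # True: same side of the partition
print(deliver("a", "c"))   # False: lost across the partition
```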
Sort of. CAP talks about behaviour during potentially ambiguous failures (network partitions), but most of the systems that call themselves "eventually consistent" also sacrifice consistency under normal operation. Examples are Cassandra, Riak and the original Dynamo.
The main tradeoff is that after writing the values 1, 2, 3 in order, reads could return no value at all, or any one of those three values, until the nodes converge.
In a consistent system, if a read happens after those writes and you see 3, you will never see 2, 1, or no value on subsequent reads. In the case of a network partition, the system will prefer to be unavailable rather than return reads older than reads it has already returned.
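A rough simulation of that 1, 2, 3 example (a toy model, not the actual replication protocol of Cassandra, Riak, or Dynamo): each replica applies the write history with a random lag, and every read goes to a random replica, so successive reads can move backwards until the replicas converge.

```python
import random

class EventuallyConsistentRegister:
    """Writes are replicated asynchronously, so each replica holds some
    prefix of the write history; reads are served by a random replica."""

    def __init__(self, n):
        self.logs = [[] for _ in range(n)]   # writes each replica has applied
        self.history = []                    # the full write history

    def write(self, value):
        self.history.append(value)
        for log in self.logs:
            # Each replica may lag arbitrarily far behind (but never rewinds).
            applied = random.randint(len(log), len(self.history))
            log[:] = self.history[:applied]

    def read(self):
        log = random.choice(self.logs)
        return log[-1] if log else None

reg = EventuallyConsistentRegister(n=3)
for v in (1, 2, 3):
    reg.write(v)
print([reg.read() for _ in range(6)])
# Successive reads may return 3, then 1, then None: nothing is lost, but
# until the replicas converge there is no single order clients can rely on.
# A consistent register would never return 2, 1, or no value after some
# read had already returned 3.
```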
> There is another way. You can't avoid the CAP theorem, but you can isolate its complexity and prevent it from sabotaging your ability to reason about your systems.
I don't think he really debunked it there, he just came up with a way around it. If you accept what the theorem assumes about your system, then the CAP theorem applies to you. Any system that sidesteps its assumptions can be said to "beat" it, but only in a practical sense.
nlavezzo:
"16. Have I 'got around' or 'beaten' the CAP theorem?
No. You might have designed a system that is not heavily affected by it. That's good."
Our thoughts on CAP and how we've dealt with it while building a distributed, truly ACID database might also be interesting to some: http://foundationdb.com/white-papers/the-cap-theorem/
habitue:
A) Formulating the theorem in the first place
B) Coming up with the proof yourself
cynicalkane:
But if your definition of "available and consistent" goes beyond this, then you have to start considering the CAP theorem.
saurik:
> There is another way. You can't avoid the CAP theorem, but you can isolate its complexity and prevent it from sabotaging your ability to reason about your systems.
I have a different definition of "debunk".