TL;DR summary (to my understanding; no sane human can ever claim to be able to summarize Paxos):
The claim is that, once a leader is elected (i.e. Q1), it is no longer necessary to attain a majority quorum for actually accepting writes (i.e. Q2). A minority quorum can accept writes, provided the minority contains at least one node that participated in the leader election. By raising the leader election quorum above N/2+1 (so that enough nodes have participated in the Q1 election to guarantee that overlap), the cluster can then operate much faster, because writes require only a minority quorum. The drawback is that it no longer tolerates N/2-1 failures, as N/2-1 failures leave too few electors to choose a new leader in Q1.
NB. the Paxos terminology uses terms like 'decide a value', but practically in clusters this is equivalent to 'accept writes' so I used that instead for easier comprehension.
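To make the quorum arithmetic in the summary concrete, here is a rough sketch in Python. It assumes simple counting quorums (any q1 nodes form an election quorum, any q2 nodes form a write quorum), which the summary implies but does not spell out. The function names are made up for illustration and are not taken from the paper. Under that assumption, every write quorum is guaranteed to contain an elector exactly when q1 + q2 > N.

```python
from itertools import combinations


def quorums_intersect(n: int, q1: int, q2: int) -> bool:
    """Counting rule: any q1 nodes and any q2 nodes out of n must share
    at least one node exactly when q1 + q2 > n (pigeonhole)."""
    return q1 + q2 > n


def brute_force_intersect(n: int, q1: int, q2: int) -> bool:
    """Check the same property by enumerating every pair of quorums.
    Only practical for small n; used here to sanity-check the rule."""
    nodes = range(n)
    return all(
        set(a) & set(b)
        for a in combinations(nodes, q1)
        for b in combinations(nodes, q2)
    )


def tolerated_failures(n: int, q: int) -> int:
    """How many nodes can be down while a quorum of size q is still reachable."""
    return n - q


if __name__ == "__main__":
    n = 10          # hypothetical cluster size
    q1, q2 = 8, 3   # larger election quorum, minority write quorum
    print("quorums always overlap:", quorums_intersect(n, q1, q2))
    print("failures tolerated for writes:", tolerated_failures(n, q2))
    print("failures tolerated for election:", tolerated_failures(n, q1))
```

With N = 10, plain majorities (6 and 6) tolerate 4 failures, i.e. N/2-1, for both elections and writes. Picking q1 = 8 and q2 = 3 lets writes proceed with only 3 nodes, but a new leader can only be elected while at most 2 nodes are down, which is the drawback described above.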
I haven't read it carefully, so I might have missed the reference, but Barbara Liskov has described a very similar optimization for Viewstamped Replication [1].
If your summary is accurate (I haven't read the paper yet), I don't think this works: without a proper quorum, you can't know that the leader is still valid at the time of an event. A new election might have taken place in the meantime. The only way to know is to "check in" with all the other nodes.
rusanu|9 years ago
stelfer|9 years ago
[1] http://www.pmg.csail.mit.edu/papers/vr-to-bft.pdf
btrask|9 years ago
I recently proposed this idea (informally) and had to retract it: https://bentrask.com/?q=hash://sha256/b40971e7b30324fdda15ce...
Disclaimer: totally not an expert.
jpitz|9 years ago
amelius|9 years ago
rusanu|9 years ago
rollulus|9 years ago
TimFogarty|9 years ago
arthursilva|9 years ago