This is good work, and the lock semantics and fencing token (epoch) make a lot of sense. I can't help but think that implementing java.util.concurrent.locks.Lock will turn out to be a liability. The problem here is that the code looks like a Java lock, but the semantics are entirely different with regard to failure. Specifically:
> While we’re on this subject, the same logic applies even to the primary FencedLock.lock() call: at the very next line of code in your program, you may no longer be holding the lock.
That's not, in most programmers' experience, how locks work. This behavior is necessary (at some level) to deal with partial failures and stalls of clients, but it means that if you use this like Lock, your code will be very wrong.
> Note the key message here: all external services must participate in the fencing-token protocol, with guaranteed linearizability, for the whole setup to uphold its invariants.
So this isn't really like a Java lock at all, and instead is a nice convenient way to build part of an epoch/view change implementation. That's useful, but in my mind the API they chose will reduce the likelihood that non-experts will use this correctly.
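To make the fencing-token point concrete, here is a minimal sketch of what "participating in the protocol" could look like on the side of an external service. The names FencedStore and FencingDemo are hypothetical and are not part of Hazelcast or any other library; the only assumption is that each lock acquisition hands out a larger token than the one before it.

```java
/** Hypothetical external service that takes part in the fencing protocol. */
final class FencedStore {
    private long highestToken = -1; // largest fencing token accepted so far
    private String value;

    /** Applies the write only if the token is not older than one already seen. */
    synchronized boolean write(long token, String newValue) {
        if (token < highestToken) {
            return false; // the caller held an older lock epoch: reject
        }
        highestToken = token;
        value = newValue;
        return true;
    }

    synchronized String read() {
        return value;
    }
}

final class FencingDemo {
    public static void main(String[] args) {
        FencedStore store = new FencedStore();
        long oldToken = 1; // client A acquired the lock, then stalled
        long newToken = 2; // the lock was later granted to client B

        System.out.println(store.write(newToken, "from B")); // true
        System.out.println(store.write(oldToken, "from A")); // false
        System.out.println(store.read());                    // from B
    }
}
```

The check lives in the store, not in the client, which is why code that only calls lock() and unlock() gets none of this protection.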
More accurately, these are leases, not locks in the traditional sense. The lease expires when the corresponding client session ends, which is detected by the absence of a heartbeat from the client.
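As a rough illustration of that lease reading, the sketch below models a session that stays valid only while heartbeats keep arriving. SessionLease is a hypothetical class, the timeout value is arbitrary, and time is passed in explicitly so the behaviour is deterministic; a real implementation would use a clock and a server-side session table.

```java
/** Hypothetical session lease: valid while heartbeats keep arriving in time. */
final class SessionLease {
    private final long timeoutMillis;
    private long lastHeartbeatMillis;

    SessionLease(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = nowMillis;
    }

    /** True once more than timeoutMillis has passed since the last heartbeat. */
    boolean isExpired(long nowMillis) {
        return nowMillis - lastHeartbeatMillis > timeoutMillis;
    }

    /** Renews the lease; returns false if it had already expired. */
    boolean heartbeat(long nowMillis) {
        if (isExpired(nowMillis)) {
            return false; // too late: the holder must give up its claim
        }
        lastHeartbeatMillis = nowMillis;
        return true;
    }
}

final class LeaseDemo {
    public static void main(String[] args) {
        SessionLease lease = new SessionLease(5_000, 0);
        System.out.println(lease.heartbeat(4_000));  // true, renewed in time
        System.out.println(lease.isExpired(9_001));  // true, heartbeat missed
        System.out.println(lease.heartbeat(9_500));  // false, cannot renew
    }
}
```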
I think the first time I encountered the idea of leases, it was dabbling in some CORBA code, back when enough people knew what that acronym meant that Sun Microsystems thought they should include an ORB in the dev kit. And then again in their RPC mechanism for Java.
Since then I've encountered the concept from others only a handful of times. Object leases aren't that important if your distributed state is straightforward. Platonic REST with no state doesn't need them, nor really do the fully stateful servers we had for about 20 years. And then there are consensus protocols like Raft which fill in some of the gaps in between.
First, leases aren't mentioned at all. Second, committing changes looks a lot like optimistic locking, due to the version number they're assigning to the lock.
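For comparison, this is roughly what optimistic locking with a version number looks like. VersionedRecord is a hypothetical in-memory stand-in, not the article's API: a commit succeeds only if the version the writer read is still the current one.

```java
/** Hypothetical record guarded by a version number (optimistic locking). */
final class VersionedRecord {
    private long version = 0;
    private String value = "";

    synchronized long version() {
        return version;
    }

    synchronized String value() {
        return value;
    }

    /** Commits only if nobody else has committed since expectedVersion was read. */
    synchronized boolean commit(long expectedVersion, String newValue) {
        if (expectedVersion != version) {
            return false; // someone else got there first: caller must re-read
        }
        value = newValue;
        version++;
        return true;
    }
}

final class OptimisticDemo {
    public static void main(String[] args) {
        VersionedRecord record = new VersionedRecord();
        long seen = record.version(); // both writers read version 0

        System.out.println(record.commit(seen, "first"));  // true
        System.out.println(record.commit(seen, "second")); // false
        System.out.println(record.value());                // first
    }
}
```

The resemblance is in the rejection of stale writers; the difference is that a fencing token is issued by the lock service, while here the version belongs to the record itself.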
It's not a lease: anyone can force-close your session (which releases your locks), and that takes effect immediately rather than after waiting out a lease grace period.
I see they mention Raft in the article. If they are updating a directory of ownership data via Raft then your split brain problem is addressed.
Generally, distributed locks are a fiat-based system. I claim ownership of something and I have indisputable rights to that thing until lease renewal time. If the lease renewal fails for any reason, I have to give up my claim on the object.
I might have an architecture that lets me make forward progress in a split brain scenario because I owned a lease before the split happened. If the recovery is fast enough then everything will be fine.
However, my instincts tell me that it would take a pretty special problem domain and a very assertive dev team to maintain this invariant over a long period of time. Business people see all this data we have and they want to connect it more and more over time. They are not above selling a feature and then cajoling us into implementing it, even if it reduces long-term viability.
In the end, what you are left with is a distributed system with lower overhead per transaction. But that's nothing to sneeze at.
Each lock has an associated name. If you have locks for separate purposes, you have separate names. If you need to grab multiple locks before processing a single task, I would suggest looking for a design that doesn't need it. If there is no other way, then acquiring each lock in a defined order would still work.
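The "defined order" suggestion can be sketched as follows. OrderedLocks is a hypothetical helper, and it uses local ReentrantLock instances purely as stand-ins for named locks; the assumption is only that every caller sorts the lock names the same way before acquiring them, so two callers cannot each hold a lock the other is waiting for.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper that always acquires named locks in sorted order. */
final class OrderedLocks {
    // Local locks stand in for named distributed locks in this sketch.
    private final Map<String, Lock> locks = new ConcurrentHashMap<>();

    /** Sorted, de-duplicated names: the one order every caller must follow. */
    static List<String> acquisitionOrder(Collection<String> names) {
        return new ArrayList<>(new TreeSet<>(names));
    }

    void withLocks(Collection<String> names, Runnable task) {
        List<Lock> held = new ArrayList<>();
        try {
            for (String name : acquisitionOrder(names)) {
                Lock lock = locks.computeIfAbsent(name, n -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            task.run();
        } finally {
            // Release in reverse order of acquisition.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }
}

final class OrderedLocksDemo {
    public static void main(String[] args) {
        System.out.println(OrderedLocks.acquisitionOrder(List.of("orders", "accounts")));
        // prints [accounts, orders]
        new OrderedLocks().withLocks(List.of("orders", "accounts"),
                () -> System.out.println("holding both locks"));
    }
}
```

With distributed locks the ordering still prevents deadlock between callers, but it does not remove the need for fencing: any of the held locks can still be lost while the task is running.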
mjb | 6 years ago
roro159 | 6 years ago
Redlock is a distributed lock using Redis: https://redis.io/topics/distlock
Martin Kleppmann criticized Redlock and mentioned the fencing solution: http://martin.kleppmann.com/2016/02/08/how-to-do-distributed...
Antirez disagrees with the analysis and the HN post has a good discussion: https://news.ycombinator.com/item?id=11065933
sriram_malhar | 6 years ago
hcnews | 6 years ago
hinkley | 6 years ago
hinkley | 6 years ago
grogers | 6 years ago
cangencer | 6 years ago
heavenlyblue | 6 years ago
The fact that split brain is not allowed implies that liveness is given up for it.
More importantly, what can I possibly do in the scenario where I would like to obtain several locks at the same time?
Distributed lock frameworks usually imply that the architecture has some sort of transaction reversal mechanism.
hinkley | 6 years ago
mey | 6 years ago
PaulHoule | 6 years ago
PaulHoule | 6 years ago