
BMarkmann | 7 years ago

I honestly don't know that there are any posts I get more excited about on HN than Jepsen analyses. The number of landmines distributed systems create and the ability to suss them out in such detail / depth is really incredible. Kudos.


amelius|7 years ago

Agree. But ... I sure hope that the new way of developing distributed systems doesn't follow this pattern:

1. write some code

2. submit to Jepsen

3. if errors, goto 1

The problem with this approach is obvious.

namibj|7 years ago

I'd hope they find one or more people with a master's in math who specialize in computer-assisted logic proofs, and actually prove that their code, at least when compiled to LLVM bytecode, adheres to the high-level architectural proof. They'd only need to prove consistency, nothing more. They wouldn't need to cover every little piece of code they write, just the core that handles transactional isolation, and not even that the code actually resolves in all cases (being stranded with a stuck cluster that is built to handle a hard reboot without problems is better than a cluster that silently violates constraints). Given all that, it should not be an infeasible goal.

The main issue is that distributed database consistency, without trivial, performance-killing locking schemes, is too complex to prove with trivial, local methods such as those based on SMT solvers.

If something like CockroachDB were proven consistent at that level, it would be used for applications that currently employ pessimistic locking due to a lack of trust in their database, or that scale vertically without really needing to (there are cases where horizontal scaling is cost-prohibitive due to the dependency chains in the algorithms that can solve them, but they can be replaced most of the time).

yashap|7 years ago

With most of the DBs he tests, the authors have a very good understanding of distributed systems and make very carefully considered design choices. They still make mistakes though, because they’re human. Very thorough testing is simply a great way to make any software more robust, distributed systems or otherwise.

BMarkmann|7 years ago

You forgot step zero, which is "find terrible example code on SO and copy it"... then 3 goes back to step 0.

striking|7 years ago

Totally agreed. My one and only wish is that they might offer an RSS feed.

aphyr|7 years ago

Oh, yeah, that's something I could put together. In the meantime, there's a mailing list and a Twitter account you can subscribe to! https://jepsen.io

danr4|7 years ago

Me too, and I don't know a thing about databases or distributed systems. I guess I still have a soft spot for the snarky ones from a few years back.