GordyMD | 10 years ago

Great to finally see a write-up by Aphyr on RethinkDB. Ever since reading these blog posts and seeing RethinkDB's long-standing GitHub issue [1], I was keen to hear how RethinkDB would hold up to the tests once Raft was implemented.

Given the thoroughness of the Jepsen test suite it is something people want to see these days before being able to choose a database with any confidence. Hopefully this sets an expectation of high transparency.

Kudos to the team at RethinkDB for funding and assisting Aphyr in his work.

[1]: https://github.com/rethinkdb/rethinkdb/issues/1493

lomnakkus | 10 years ago

> Given the thoroughness of the Jepsen test suite it is something people want to see these days before being able to choose a database with any confidence.

Definitely agreed on the "something people want to see" part, but this is the thing... Jepsen isn't actually that thorough[1]. I rather think it is an indictment of the state of "practical" distributed computing as it currently stands that a "simple" test for linearizability (nowadays) or even simple CAS (which I believe Jepsen started out testing) in a partitioned system would turn up such a huge number of badly implemented distributed systems and... frankly dishonest documentation around those systems.
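
To give a feel for what a "simple" linearizability test on a CAS register amounts to, here is a rough sketch. It is not how Jepsen itself is implemented: the Op record, the apply_op helper and is_linearizable are names made up for illustration, and it assumes every operation completed with a known result (no timeouts or indeterminate outcomes, which are exactly what partitions tend to produce). It brute-forces every ordering of a small recorded history and checks whether any of them respects both real-time order and register semantics.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Op:
    """One completed client operation (hypothetical record format)."""
    kind: str       # "read", "write", or "cas"
    arg: object     # read: None; write: value; cas: (expected, new)
    result: object  # read: value observed; write: True; cas: True or False
    start: float    # time the client invoked the operation
    end: float      # time the client received the response


def apply_op(state, op):
    """Return (legal, new_state) for applying op to a register holding state."""
    if op.kind == "read":
        return op.result == state, state
    if op.kind == "write":
        return True, op.arg
    if op.kind == "cas":
        expected, new = op.arg
        if state == expected:
            return op.result is True, new
        return op.result is False, state
    raise ValueError(f"unknown operation kind: {op.kind}")


def respects_real_time(order):
    """If a finished before b started, a must appear before b in the order."""
    for i, a in enumerate(order):
        for b in order[:i]:
            if a.end < b.start:
                return False
    return True


def is_linearizable(history, initial=None):
    """Brute force: try every ordering (factorial cost, tiny histories only)."""
    for order in permutations(history):
        if not respects_real_time(order):
            continue
        state = initial
        for op in order:
            legal, state = apply_op(state, op)
            if not legal:
                break
        else:
            return True
    return False


# A read that returns an overwritten value after the overwrite completed.
stale_read = [
    Op("write", 1, True, 0, 1),
    Op("write", 2, True, 2, 3),
    Op("read", None, 1, 4, 5),
]

# The same read, but now overlapping the second write, so it may order first.
concurrent_read = [
    Op("write", 1, True, 0, 1),
    Op("write", 2, True, 2, 5),
    Op("read", None, 1, 3, 4),
]
```

Under these assumptions is_linearizable(stale_read) comes out False and is_linearizable(concurrent_read) comes out True. The search is factorial in the history length, so it is only good for toy histories; real checkers search far more cleverly and also have to handle operations whose outcome is unknown.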

It's not that I could necessarily do any better -- except maybe the "honesty" part, or at least adding lots of qualifiers -- I just find it a bit... sad in a way that we haven't come farther. Still, it is a young field, so there may be grounds for optimism for the future. (Thinking of e.g. dependent type systems coming together with model checking coming together with verified model->machine translation, chips with verified semantic models, etc. etc.)

[1] As Aphyr explicitly states, it's actually very limited in the state space that it can explore simply because it's constrained to be "external" to the system being tested. Model checkers can do much more -- but then you usually don't get a fully automatic and verified translation to machine code... and who knows if you've modeled the CPU/IOMMU/etc. correctly anyway?

EDIT: Btw, Aphyr deserves a HUGE amount of praise. AFAIK he's the only person so far who has stepped up to the plate and "dared" to actually test this stuff. It's kind of amazing in a way, but I think I'll blame corporate culture for this sort of thing... "Oh, the documentation says $X therefore we'll believe $X. At least they can't fire us for that." Not surprisingly, I was known to be hugely skeptical of any claims made about distributed systems, but I was too much of a coward to embark on "Aphyr's Quest" :). I'm hoping to coin a phrase with that last bit.