This is in such a thick academic style that it is difficult to follow what the problem actually might be and how it would impact someone. This style of writing serves mostly to remind me that I am not a part of the world that writes like this, which makes me a little sad.
glutamate|10 months ago
You may not be part of that world now, but you can be some day.
EDIT: forgot to say, I had to read 6 or 7 books on Bayesian statistics before I understood the most basic concepts. A few years later I wrote a compiler for a statistical programming language.
cr3ative|10 months ago
concerndc1tizen|10 months ago
I somewhat feel that there was a generation that had it easier, because they were pioneers in a new field, allowing them to become experts quickly, while improving year-on-year, being paid well in the process, and having great network and exposure.
Of course, it can be done, but we should at least acknowledge that sometimes the industry is unforgiving and simply doesn't have on-ramps except for the privileged few.
jorams|10 months ago
Essentially: The configuration claims "Snapshot Isolation", which means every transaction looks like it operates on a consistent snapshot of the entire database at its starting timestamp. All transactions starting after a transaction commits will see the changes made by that transaction. Jepsen finds that the snapshot a transaction sees doesn't always contain everything that was committed before its starting timestamp. Transactions A and B can both commit their changes, then transactions C and D can start with C only seeing the change made by A and D only seeing the change made by B.
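To make that last sentence concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model in which a snapshot is just the set of committed transactions a reader can see, and under the guarantee described above every snapshot has to be a prefix of one shared commit order. The function name and the observations are hypothetical, chosen only to mirror the A/B/C/D example; this is not how Jepsen's checker is implemented. (This shape of anomaly is commonly called a "long fork".)

```python
from itertools import permutations

def consistent_with_some_commit_order(commits, snapshots):
    """Return True if one total commit order explains every snapshot.

    In this simplified model each snapshot must be a prefix of the commit
    order: a reader that sees a commit also sees everything before it.
    """
    for order in permutations(commits):
        prefixes = [set(order[:i]) for i in range(len(order) + 1)]
        if all(set(s) in prefixes for s in snapshots):
            return True
    return False

# Hypothetical observations: C saw only A's write, D saw only B's write.
long_fork = consistent_with_some_commit_order(["A", "B"], [{"A"}, {"B"}])

# For contrast: C saw only A, D saw both A and B.
healthy = consistent_with_some_commit_order(["A", "B"], [{"A"}, {"A", "B"}])

print(long_fork, healthy)
```

No ordering of A and B lets one reader see only A while another sees only B, so the first check fails while the second passes.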
deathanatos|10 months ago
For this particular one, the graph under "Results" is the most approachable portion, I think. (Don't skip the top two sections, though … and they're so short.) In the graph, each line is a transaction, and read them left-to-right.
Hopefully I get this right, though if I do not, I'm sure someone will correct me. Our database is a set of ordered lists of integers. Something like,
The first transaction: This is shorthand; means "(a)ppend to list #89 the integer 9" (in SQL, crudely this is perhaps something like … though we'd need to handle the case where the list doesn't exist yet, turning it into an `INSERT … ON CONFLICT … DO UPDATE …`, so it would get gnarlier.[2]); the next: I assume you can `SELECT` ;) That should provide sufficient syntax for one to understand the remainder.
The arrows indicate the dependencies; if you click "read-write dependencies"[1], that page explains it.
Our first transaction appends 9 to list 89. Our second transaction reads that same list, and sees that same 9; thus, it must start after the first transaction has committed. The remaining arrows form similar dependencies, and once you take them all into account, they form a cycle; this should feel problematic. The problem is that they're in a cycle, which snapshot isolation does not permit, so we've observed a contradiction in the system: these transactions cannot be obeying snapshot isolation. (This is what "To understand why this cycle is illegal…" gets at; it is fairly straightforward. T₁ is the first row in the graph, T₂ the second, and so forth. But it is only straightforward once you've understood the graph, I think.)
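The "follow the arrows until you come back to where you started" idea can be sketched as a plain depth-first search. This is an illustration only: the transaction names and edges below are hypothetical, not the graph from the report, and the sketch treats every edge as a generic "must come before" dependency. A real checker distinguishes edge types (write-read, read-write, and so on), and which cycles are illegal depends on the isolation level being claimed.

```python
def find_cycle(graph):
    """Return a list of nodes forming a cycle, or None if there is none.

    `graph` maps each transaction to the transactions that depend on it,
    i.e. an edge T1 -> T2 means "T1 must come before T2".
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path: we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

# Hypothetical dependency edges: each transaction must precede the next,
# and the last one must precede the first.
deps = {"T1": ["T2"], "T2": ["T3"], "T3": ["T4"], "T4": ["T1"]}
cycle = find_cycle(deps)
print(cycle)
```

If every transaction in the loop has to come before the next one, there is no order in which they could all have run, which is the contradiction described above.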
> This is in such a thick academic style that it is difficult to follow what the problem actually might be and how it would impact someone.
I think a lot of this is because it is written with precision, and that precision requires a lot of academic terminology.
Some of it is just syntax peculiar to Jepsen, which I think comes from Clojure, which I think most of us (myself included) are just not familiar with. Hence why I used SQL and comma-sep'd lists in my commentary above; that is likely more widely read. It's a bit rough when you first approach it, but once you get the notation, the payoff is worth it, I guess.
More generally, I think once you grasp the graph syntax & simple operations used here, it becomes easier to read other posts, since they're mostly graphs of transactions that, taken together, make no logical sense at all. Yet they happened!
> This style of writing serves mostly to remind me that I am not a part of the world that writes like this, which makes me a little sad.
I think Jepsen posts, with a little effort, are approachable. This post is a good starter post; normally I'd say Jepsen posts tend to inject faults into the system, as we're testing if the guarantees of the system hold up under stress. This one has no fault injection, though, so it's a bit simpler.
Beware, though: if you learn to read these, you'll never trust a database again.
[1]: https://jepsen.io/consistency/dependencies
[2]: I think this is it? https://github.com/jepsen-io/postgres/blob/225203dd64ad5e5e4... — but this is pushing the limits of my own understanding.
mdaniel|10 months ago
I chuckled, but (while I don't have links to offer) I could have sworn that some of them actually passed, and that a handful of others took the report to heart and fixed the bugs. I similarly recall a product showing up to its Show HN or Launch HN with a Jepsen report in hand, and I was especially in awe of the maturity of that (assuming, of course, I'm not hallucinating such a thing).
rezonant|10 months ago
Sesse__|10 months ago
senderista|10 months ago
belter|10 months ago
bananapub|10 months ago
have some respect for yourself and everyone else, christ.
ZYbCRq22HbJ2y7|10 months ago
Why? Because it has variables and a graph?
What sort of education background do you have?
renewiltord|10 months ago
benatkin|10 months ago
vlovich123|10 months ago
I’ve repeatedly used ChatGPT and Claude to help me understand papers and to cut through the verbiage to the underlying concepts.