knl|4 years ago
Like many research projects, this one will also probably last as long as there is funding. Remember, the goal of PhD students is to publish papers, not develop and maintain software. Thus, without skin in the game, I couldn’t trust my data/workloads to such systems.
dang|4 years ago
https://news.ycombinator.com/newsguidelines.html
We detached this subthread from https://news.ycombinator.com/item?id=29403597.
tytso|4 years ago
The key thing to remember is that THERE IS NOTHING WRONG WITH THIS ACADEMIC PROCESS. I go to the Filesystems and Storage Technology (FAST) conference, where many of these BetrFS papers were published, to harvest ideas which I might use in my production systems, and, of course, to see whether any of the graduate students who have decided that academic life is not for them might come to work for my company[1]. I personally find the FAST conference incredibly useful on both of these fronts, and I think the BetrFS papers are super useful if you approach them as a proving ground for ideas, not as a production file system.
So it's unfortunate that people seem to be judging BetrFS on whether they should "trust my data/workloads to such systems", and complaining that the prototype is based on the 3.11 kernel. That's largely irrelevant from the perspective of proving such ideas. Now, I'm going to be much more harshly critical when someone proposes a new file system for inclusion in the upstream kernel, claims that it is ready for prime time, and then, when I run gce-xfstests on it, we see it crashing right and left[2][3]. But that's a very different situation. You will notice that no one is suggesting that BetrFS be submitted upstream.
A good example of how this works is the iJournaling paper[4], whose ideas were used as the basis for ext4 fast commits[5]. We did not take their implementation; indeed, we simplified their design out of robustness and deployment concerns. This is an example of academic research creating real value, and it shows the process working as intended. It did NOT involve taking the prototype code from the iJournaling research effort and slamming it into ext4; we reimplemented the key ideas from that paper from scratch. And that's as it should be.
[1] Obligatory aside: if you are interested in working on file systems and storage in the Linux kernel, reach out to me --- we're hiring! My contact information should be very easy to find via a Google search, since I'm the ext4 maintainer.
[2] https://lore.kernel.org/r/YQdlJM6ngxPoeq4U@mit.edu
[3] https://lore.kernel.org/all/YQgJrYPphDC4W4Q3@mit.edu/
[4] https://www.usenix.org/conference/atc17/technical-sessions/p...
[5] https://lwn.net/Articles/842385/
throwaway02201|4 years ago
> Like many research projects, this one will also probably last as long as there is funding. Remember, the goal of PhD students is to publish papers, not develop and maintain software. Thus, without skin in the game, I couldn’t trust my data/workloads to such systems.
Sadly true. For-profit companies only care about $$$. Academia only cares about publishing to get funding.
Neither option is ideal for developing trusted, user-focused software in the long term. OpenSSL is a good example.
Non-profits really struggle to get funding. Government grants are a mess.
The world really needs a new approach to R&D.
globular-toast|4 years ago
That's just not true. To do well in academia you have to be truly invested in your field. You can just about get by if you're only in it for the papers, but it's just like getting by in a job that you're only in for the money. At the end of the day, though, in a world where everyone is forced to be productive or be homeless, there are times when publishing becomes a necessity. This doesn't mean they only care about publishing, though.
gnufied|4 years ago
I am not even sure it wants to be production-ready; maybe it is a playground for ideas.
donporter|4 years ago
Part of our challenge is that we are also exploring non-standard extensions to the VFS API - largely supported by kallsyms + copied code to avoid kernel modifications. This makes rolling forward more labor intensive, but we are working to pay down this technical debt over time, or possibly make a broader case for a VFS API change.
klyrs|4 years ago
And the way to get production-ready code is to write a kernel module, in the hope that others in the kernel community will pick it up. Linux certainly didn't start out mature, but you're probably using it now.
_jal|4 years ago
All your written-in-production, battle-hardened code with no effete book-larnin' algorithms aren't going to run very well without a functional electricity grid.
knl|4 years ago
I never assumed that PhD students can't code. They can, and they are pretty good at it. My point is that their incentives lie in writing papers and running experiments that support the claims in those papers, not in producing reliable software. It might be reliable, but mostly it's not. When we use tools built by PhD students, it's usually when there are companies/startups built around them, and that is what I refer to as having skin in the game.
mbreese|4 years ago
I’ve known people with PhDs in computer science (from a top tier school) that couldn’t code. Their research was all done in Matlab for simulations, modeling a biological process. It was a very specific set of skills required. And at the time, this person couldn’t have written a web front end to a database to save their lives.
Just because one is good at the theory behind CS doesn't mean they understand software engineering. Similarly, just because one is good at the theory doesn't mean they can't code.
They are two related, but different, skill sets.
sfink|4 years ago
Average programmer here. PhDs in computer science can't code.
Ok, it's an overgeneralization. And it's probably based on a flawed sample of job applicants that make it past HR screening to get to me. The base rate of applicants who can't code is disturbingly high, probably around 20%. (Not that high numerically, but given that they've passed pre-screening and have something impressive-sounding on their resume, it's too high.) The rate of applicants with a PhD in CS who can't code is way higher, probably around 60%.
Note that these tend to be fresh graduates. And it even makes sense -- most theses require just enough coding to prove a point or test a hypothesis. In fact, the people who like to code tend to get sucked into the coding and have trouble finishing the rest of their thesis work, which may start out interesting but soon gets way less fun than the coding part. Often such people bail out with an MS instead of a PhD.
(Source: personal experience, plus talking to people I've worked with, plus pulling stuff out of my butt.)
At the same time, many of the best coders I know have PhDs.
> Implementation and edge cases are the easy part, the hard part is design and algorithms.
Hahahaha. <Snarky comment suppressed, with difficulty.>
I agree that design and algorithms can be hard. (Though they usually aren't; the vast majority of things being built don't require a whole lot.) But the entire history of our field shows that even a working implementation Just Isn't Good Enough. Especially when what you're writing is exposed in a way that security vulnerabilities matter.
Though it's a bit of a false dichotomy. Handling the edge cases and the interaction with the rest of the system requires design, generally much more so than people give it credit for. Algorithms sometimes too, to avoid spending half your memory or code size on the 1% of edge cases.
gnufied|4 years ago
Most production software (especially low-level stuff like kernels and filesystems) today is written and maintained by people who have that work as their job. I wish it were any other way. Also, what users expect from production software is way different from the situation 30-40 years ago. An operating system must work with different CPUs and GPUs. A bare-bones OS is basically a non-starter. I mean, look at Haiku-OS or any of the other hobby operating system projects; for the most part they have gone nowhere.
A filesystem is also a fairly complicated piece of software, and what we expect from one has changed. Speed is good, but it is not the only criterion, and I am afraid it does take serious engineering effort (edge cases and all) to make one usable on today's hardware.
WastingMyTime89|4 years ago
That doesn't really apply here, obviously. The BetrFS team has experienced members.