

Well, as somebody who has written research software, I don't agree that research software is a "tangled mess". A couple of points:

1. Often when I read software written by professional programmers, I find it very hard to follow because it is too abstract: almost every time I try to figure out how something works, it turns out I need to learn a new framework and API. By contrast, research code tends to be very self-contained.

2. When I first wrote research software I applied all the programming best practices and was told these weren't any good. It turns out that using lots of abstraction to increase modularity makes the code much slower, though this is language dependent of course.

3. You will find it much harder to read research code if you don't understand the math+science behind it.
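As a rough illustration of point 2 (a minimal sketch, not taken from any real research code): the same dot product written once through a small wrapper type and once as a flat loop. The `Scalar` class and both function names are hypothetical, invented for this example. In an interpreted language like Python, one would expect the wrapped version to run slower because every multiply and add allocates an object and goes through a method call; in a language with an optimizing compiler the gap may shrink or vanish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    """Hypothetical wrapper type, invented purely for this illustration."""
    value: float

    def times(self, other: "Scalar") -> "Scalar":
        return Scalar(self.value * other.value)

    def plus(self, other: "Scalar") -> "Scalar":
        return Scalar(self.value + other.value)


def dot_abstracted(xs, ys):
    """Dot product where every multiply and add goes through a method call."""
    total = Scalar(0.0)
    for x, y in zip(xs, ys):
        total = total.plus(Scalar(x).times(Scalar(y)))
    return total.value


def dot_flat(xs, ys):
    """The same dot product as a flat loop over plain floats."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total
```

Both functions perform the same floating-point operations in the same order, so they return the same result. To compare their cost on a given machine, something like `timeit.timeit(lambda: dot_flat(xs, ys), number=1000)` from the standard library can be run against each version.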

> many of those writing software know very little about how to do it

This is just not true. In my experience, people writing research software have a very specific skillset that very few industry programmers are likely to have. They know how to write good numerics code, and they know how to write fast code for supercomputers. Not to mention, interpreting the numerics theory correctly in the first place is not a trivial matter either.


acmj|5 years ago

Quite a few professional programmers evaluate the quality of code by "look": presence of tests, variable length, function length, etc. However, what makes great code is really its structure and the logical flow behind it. In my experience, good industrial programmers are as rare as good academic programmers. Many industrial programmers make a fuss about coding style but are not really good at organizing structured code for a medium-sized project.

deklund|5 years ago

As someone who's worked for a large part of my career as a sort of bridge between academia and industry (working with researchers to implement algorithms in production), both you and the original author are right to an extent.

On one hand, academics I've worked with absolutely undervalue good software engineering practices and the value of experience. They tend to come at professional code from the perspective of "I'm smart, and this abstraction confuses me, so the abstraction must be bad", when really there's good reason for it. Meanwhile they look at their thousands of lines of unstructured code, and the individual bits make sense so it seems good, but it's completely untestable and unmaintainable.

On the other side, a lot of the smartest software engineers I've known have a terrible tendency to over-engineer things. Coming up with clever designs is a fun engineering problem, but then you end up with a system that's too difficult to debug when something goes wrong, and that abstracts the wrong things when the requirements change slightly. And when it comes to scientific software, they want to abstract away mathematical details that don't come as easily to them, but then find that they can't rely on those abstractions in practice: the implementation is buried under so many layers that they can't streamline the algorithm to an acceptable performance standard.

If you really want to learn how to properly marry good software engineering practice with performant numerical routines, I've found the 3D gaming industry to be the most inspirational, though I'd never want to work in it myself. They do some really incredible stuff with millions of lines of code, but I can imagine a lot of my former academic colleagues scoffing at the idea that a bunch of gaming nerds could do something better than they can.

acmj|5 years ago

> a lot of the smartest software engineers I've known have a terrible tendency to over-engineer things.

Your definition of "smartest software engineers" is the opposite of mine. In my view, over-engineering is the symptom of dumb programmers. The best programmers simplify complex problems; they don't complicate simple problems.

disabled|5 years ago

I work on mathematical modeling, dealing with human physiology. Likewise, the software packages used can be esoteric, and the structure of your "code" can look very different, to say the least.

This is certainly a lot of work, and it takes a lot of practice to do efficiently, but no matter what, I comment every single line of code, however mundane. I also cite my sources in the comments themselves, and I keep a bibliography at the bottom of my code.

I organize my code in general with sections and chapters, like a book. I always give an overview for each section and chapter. I make sure that my comments make sense to a novice reading them line by line.

I do not know why I do this. I guess it makes me feel like my code is more meaningful. Of course it makes it easier to come back to things and to reuse old code. I also want people to follow my thought process. But, ultimately, I guess I want people to learn how to do what I have done.

titanomachy|5 years ago

"Esoteric software used for mathematical models of physiology" brought back a strong memory of the xpp software we had to use as undergrads. Apparently it was the best tool available for graphing bifurcations in nonlinear systems... but damn that was some old software.

Writing long descriptions in comments works if you're the only one editing the code, or you supervise all contributions... in a fast-changing industrial codebase, those things go out of date very quickly, so comments are used more sparsely. I document the usage of any classes or functions that my package exports, and I'll write little inline comments explaining lines of code whose purpose or effect is not obvious. Mostly I just try to organize things sensibly and choose descriptive names for variables and functions.
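A minimal sketch of that sparser style, under stated assumptions: the function `mean_ignoring_nan` is hypothetical and invented for this example. The exported function gets a usage docstring, and the only inline comment sits on the one line whose effect is not obvious from the names alone.

```python
def mean_ignoring_nan(samples):
    """Return the arithmetic mean of `samples`, skipping NaN entries.

    Raises ValueError if no non-NaN entries remain.
    """
    # NaN is the only float value that is not equal to itself.
    kept = [s for s in samples if s == s]
    if not kept:
        raise ValueError("no non-NaN samples")
    return sum(kept) / len(kept)
```

The remaining lines rely on descriptive names rather than comments, so there is less prose to fall out of date when the code changes.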

taeric|5 years ago

Your points apply to industry, too. I heretically push flatter code all the time. I'm not against abstraction, but it is easy to fall into the trap of building a solution machine while missing the solution you need.

exdsq|5 years ago

Point 1 is so true. I think it’s why I like Golang without generics: people can’t go crazy with abstractions.