Bad scientific code beats code following "best practices" (2014)

[+] lynndotpy|2 years ago|reply

Scientist and programmer here, and my experiences are the opposite. I value keeping things "boringly simple", but I desperately wish there was any kind of engineering discipline.

First is the reproducibility issue. I think I've spent about as much time simply _trying_ to get the dependencies of research code to run as I have done writing or doing research in my PhD. The simple thing is to write a requirements.txt file! (For Python, at least.)

Second, two anecdotes where not following best practices ruined the correctness of research code:

- Years ago, I was working on research code which simulated a power-grid. We needed to generate randomized load profiles. I noticed that each time it ran, we got the same results. As a software engineer, I figured I had to re-set the `random` seed, but that didn't work. I dug into the code, talked to the researcher, and found the load-profile algorithm: It was not randomly generated, but a hand-coded string of "1" and "0".

- I later had the pleasure of adapting someone's research code. They had essentially hand-engineered IPC. It worked by calling a bash script from Python, which would open other Python processes and generate a random TCP/IP socket, the value of which was saved to an ENV variable. Assuming the socket was open, the Python scripts would then share the socket names of other filenames for the other processes to read and open. To prevent concurrency issues, sleep calls were used throughout the Python and Bash script. This was four Python scripts and two shell scripts, and to this day, I do not understand the reason this wasn't just one Python script.

[+] cauch|2 years ago|reply

My problem with this discussion is that a lot of people just say "I'm a scientist (or I'm working with scientists) and I'm observing X so I can say 'scientists blahblahblah'".

Different scientific research fields are using widely different computer software environment, and have their own habits and traditions. The way a biologist uses programming has no reason to be similar to the way an astrophysicist does: they have not at all experienced the same software environment. It may even be useless to talk about "scientist" in the same field as two different labs working in the same field may have very different approaches (but it's more difficult if there are shared framework).

So, I'm not at all surprised that you observe opposite experience. The same way I'm not surprised to see someone saying they had the opposite experience if someone says "European people are using a lot of 'g' and 'k' in their words" just because they observed what happened in Germany.

[+] pennomi|2 years ago|reply

Absolutely my experience as well. Scientists write code that works, but is a pain to reproduce in any sort of scalable way. However it’s been getting better over time as programming is becoming a less niche skill.

[+] BobbyJo|2 years ago|reply

The problem I've run into over and over with research code is fragility. We ran it on test A, but when we try test B nothing works and we have no idea why because god forbid there is any error handling, validation, or even just comprehensible function names.

[+] astrobe_|2 years ago|reply

This is partly because, in my opinion, some "best practices" are superstitions.

Some practice was best because of some issue with 80s era computing, but is now completely obsolete; problem has been solved in better ways or has completely disappeared thanks e.g. to better tooling or better, well, practices. e.g. Hungarian notation. Yet it is still passed down as a best practice and followed blindly because that's what they teach in schools. But nobody can tell why it is "good", because it is actually not relevant anymore.

Scientific code has no superstitions (as expected I would say), but not for the best reasons; they didn't learn the still relevant good practices either.

[+] tchalla|2 years ago|reply

I wish we communicated the intent of the “best practice” instead of the practice itself.

[+] DragonStrength|2 years ago|reply

Actually, when I’ve followed those guidelines, it’s because the tech lead graduated in the 1980s, almost certainly learned it all on the job, but has always done it that way. Others just do what they’ve done before. School talked about those things, but not in a “this is the right way” sort of thing.

[+] quickthrower2|2 years ago|reply

There is no best practice. It is good to know the tools. In dojo, do that crazy design pattern shit and do crazy one long function. Do some C#, Java, JS, Go, Typescript, Haskell, Ruby, Rust (not necessarily those but a big variety). I want the next person to understand my code - this is very important. Probably more important than time spent or performance. If spending another 10% refactoring to make it easier to understand, even if just adding good comments, it is well worth it. Make illegal state impossible, if you can (e.g. don't store the calculated value, and if you do then design it so it can't be wrong!). Make it robust. Pretend it'll page you at 2am if it breaks!

[+] jayd16|2 years ago|reply

Such as what? I don't really know of any such superstitions that are based on nothing.

I see a lot of opinion/taste presented as something more, but I really can't think of superstitions.

[+] chaxor|2 years ago|reply

It is important to have popular and powerful tools that can reduce amount of code for things like caching and building.

For example, Snakemake (os-independent make) with data version control based off of torrent (removing complication of having to pay for AWS, etc) for the caching of build steps, etc would be a HUGE win in the field. *No one has done it yet* (some have danced around the idea), but if done well and correctly, it could reduce the amount of code and pain in reproducing work by thousands of lines of code in some projects.

It's important for the default of a data version control to be either ipfs or torrent, because it's prohibitive to make everyone set up all these accounts and pay these storage companies to run some package. Ipfs, torrent, or some other centralized solution is the only real solution.

[+] leptons|2 years ago|reply

Today's "best practice" is tomorrow's worst practice.

[+] ohlokkru|2 years ago|reply

[deleted]

[+] jusssi|2 years ago|reply

Two more to the scientists' tab:

1. No tests of any kind. "I know what the output should look like." Over time people who know what it should look like leave, and then it's untouchable.

2. No regard to the physical limits of hardware. "We can always get more RAM on everyone's laptops, right?". (You wouldn't need to if you just processed the JSONs one at a time, instead of first loading all of them to the memory and then processing them one at a time.)

Also the engineers' tab has a strong smell of junior in it. When you have spent some time maintaining such code, you'll learn not to make that same mess yourself. (You'll overcorrect and make another, novel kind of mess; some iterations are required to get it right.)

[+] lozenge|2 years ago|reply

Yes, the claim that the scientists' hacked-together code is well tested and even uses valgrind gave me pause. It's more likely there are no tests at all. They made a change, they saw that a linear graph became exponential, and they went bug hunting. But there's no way they have spotted every regression caused by every change.

[+] Asraelite|2 years ago|reply

Agree with those two problems on the scientist side. I would also add that they often don't use version control.

I think a single semester of learning the basics of software development best practices would save a lot of time and effort in the long term if it was included in physics/maths university courses.

[+] 2devnull|2 years ago|reply

1 and 2 are features. Re 1, if someone doesn’t know what the output should look like they shouldn’t be reusing the code. Re 2, just think a bit more about it and you’ll realize fretting over ram that isn’t needed until it’s needed is actually just premature optimization.

[+] gregopet|2 years ago|reply

Sounds like the non-programmers are good at what they are supposed to be good at (solving the actual problem, if perhaps not always in the most elegant manner) while the programmers should be producing a highly maintainable, understandable, testable and reliable code base (and potentially have problems with advanced algorithms that rely on complicated theorems), but they are not. The OP has a case of bad programmers - the techniques listed as bad can be awesome if used with prudence.

A good programmer has a very deep knowledge of the various techniques they can use and the wisdom to actually choose the right ones in a given situation.

The bad programmers learn a few techniques and apply them everywhere, no matter what they're working on, with whom they are working with. Good programmers learn from their mistakes and adapt, bad programmers blame others.

I've worked with my share of bad programmers and they really suck. A good programmer's code is a joy to work with.

[+] laserbeam|2 years ago|reply

I agree with the feelings of the author, most software is overengineered (including most of my software).

That being said, most scientific code I've encountered doesn't compile/run. It ran once at some point, it produced results, it worked for the authors and published a paper. The goal for that code was satisfied and than that code somehow rusted out (doesn't work with other compilers, hadn't properly documented how it gets build, unclear what dependencies were used, dependencies were preprocessed at some point and you can't find the preprocessed versions anywhere to reproduce the code, has hardcoded data files which are not in the published repos etc.). I wouldn't use THAT as my compass on how to write higher quality code.

[+] jakobnissen|2 years ago|reply

I'm a scientist programmer working in a field comprised by biologists and computer scientists, and what I've experienced is almost exactly the opposite of the author.

I've found the problems that biologists cause are mostly:

* Not understanding dependencies, public/private, SCM or versioning, making their own code uninstallable after a few months

* Writing completely unreadable code, even to themselves, making it impossible to maintain. This means they always restart from zero, and projects grow into folders of a hundred individual scripts with no order, depending on files that no longer exists

* Foregoing any kind of testing or quality control, making real and nasty bugs rampant.

IMO the main issue with the software people in our field (of which I am one, even though I'm formally trained in biology) is that they are less interested in biology than in programming, so they are bad at choosing which scientific problems to solve. They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.

[+] mglz|2 years ago|reply

I just handed in my PhD in computer science. Our department teaches "best practices" but adherence to them is hardly possible in research:

1) Requirements change constantly, since... it's research. We don't know where exactly we're going and what problems we encounter.

2) Buying faster hardware is usually an option.

3) Time spent on documentation, optimization or anything else that does not directly lead to results is directly detrimental to your progress. The published paper counts, nothing else. If a reviewer ask about reproducibility, just add a git repository link.

4) Most PhD students never worked in industry, and directly come from the Master's to the PhD. Hence there is no place where they'd encounter the need to create scalable systems.

I guess Nr. 3 is has the worst impact. I would love to improve my project w.r.t. stability and reusability, but I would shoot myself into the foot: It's no publishable, I can't mention it a lot in my thesis, and the professorship doesn't check.

[+] pfisherman|2 years ago|reply

Putting some effort into (3) can increase your citations (h-index). If people can’t use your software then they will just find some other method to benchmark against or build on.

Here you are not improving your time to get out an article, but reducing it for others - which will make your work more influential.

[+] alexmolas|2 years ago|reply

> 3) Time spent on documentation, optimization or anything else that does not directly lead to results is directly detrimental to your progress.

Here's is where I disagree. It's detrimental in the short term, but to ensure reproducibility and development speed in the future you need to follow best practices. Good science requires good engineering practices.

[+] quickthrower2|2 years ago|reply

I agree. Been doing devops recently but back at some coding at work and I wrote the function as simple as I could, adding complexity but only as needed.

So it started as a MVC controller function that was as long as your arm. Then it got split up into separate functions, and eventually I moved those functions to another file.

I had some genuine need for async, so added some stuff to deal with that, timeouts, error handling etc.

But I hopefully created code that is easy to understand, easy to debug/change.

I think years ago I would have used a design pattern. Definitely a bridge - because that would impress Kent Beck or Martin Fowler! But now I just want to get the job done, and the code to tell a story.

I think I pretend I am a Go programmer even if I am not using Go!

[+] Nursie|2 years ago|reply

Yeah nah.

There was the flawed model out of Imperial College (IIRC) during the early covid days that showed up how wrong this attitude is.

It was so poorly written that the results were effectively useless and non-deterministic. When this news came out, the scientists involved doubled down and instead of admitting that coding might be hard, and getting in a few experts to help out might be useful, actually blamed software engineers for how hard it is to use C++.

[+] dahart|2 years ago|reply

In other words, programmers tend to over-engineer, and non-programmers tend to under-engineer. Despite all the arguments here about who’s making the biggest messes, that part is not surprising at all.

Both are real problems. Over-abstraction and over-engineering can be very expensive up front and along the way, and we do a lot of it, right? Under-engineering is cheaper up front but can cause emergencies or cost a lot later. Just-right engineering is really hard to do and rarely ever happens because we never know in advance exactly what our requirements and data really are.

The big question I have about scientific environments is why there isn’t more pair-programming between a scientist and a programmer? Wouldn’t having both types of expertise vetting every line of code be better than having each person over/under separately? Ultimately software is written by teams, and it’s not fair to point fingers at individuals for doing the wrong amount of engineering, it’s up to the entire team to have a process that catches the wrong abstraction level before it goes too far.

[+] dkarl|2 years ago|reply

Programmers want to embed domain terms everywhere. They look at scientific code and expect to see variables names containing "gravity," "velocity," etc.

Scientists need code to conform to the way they examine, solve, and communicate problems. I asked for an explanation of a particular function and was sent a PDF and was told to look at a certain page, where I found a sequence of formulas. All of the notation matched up, with the exception that superscripts and subscripts could not be distinguished in the code. To a programmer, the code looked like gibberish. To the scientists working on the code, it looked like a standard solution to a problem, or at least the best approximation that could be given in code.

You see the inverse problem when it comes to structuring code and projects: programmers see standard structures, expected and therefore transparent; scientists see gibberish. Scientists look at a directory called "tests" and think of a variety of possible meanings of the word, none of them what the programmer intended.

[+] cnewey|2 years ago|reply

While I think there are a couple of valid points, in general my feeling is that the author is setting up a straw man to attack.

Most of the “programmer sins” are of the type that more seasoned engineers will easily avoid, especially those with experience working with scientific code. Most of these mistakes are traps I see junior developers falling into because of inexperience.

[+] YouWhy|2 years ago|reply

I think we have a case of survivorship bias.

A considerable majority of the science-non-SWE crowd are de facto incapable of writing more than 100 lines of runs-in-my-notebook code.

Hence, if a change/bug is necessary, it is much likelier to fall under a SWE jurisdiction, and hence is much more likely to be industrial code.

Add to that a further confounder (tiptoeing a "no true Scotsman" here): academia is not a first choice of workplace for strong SWEs.

[+] bearsnowstorm|2 years ago|reply

Read this on mobile and the identifier longWindedNameThatYouCantReallyReadBTWProgrammersDoThatALotToo overflowed into the margins - I regard this not as a bug but a feature which helped make the author’s point :-)

[+] _the_inflator|2 years ago|reply

That’s why I fell in love with Objective C. The libraries used a lot of those expressive descriptions for attributes and methods.

I never understood nor understand people who nest their inner loops in an entangled mess of hardly distinguishable digits, which is error prone.

Same for method names.

I try to use speaking out loud to some of my methods: What do you do? And if the answer is getValue I believe it needs renaming.

[+] austin-cheney|2 years ago|reply

> Invariably, the biggest messes are made by the minority of people who do define themselves as programmers.

After 15 years of writing JavaScript professionally I know that is a lie. The biggest messes are made by the majority of people hired that cannot really program.

[+] fsloth|2 years ago|reply

I guess this could be an economics thing. Stereotypically maintenance of scientific codebases in general is not very lucractive, and the mental kick (IMHO you need to enjoy high performance numerical computing to be truly good at it) can be had for much better compensation doing stuff like cad or game engines. So I would imagine if the author has lots of experience of "professional programmers" maintaining their scientific codebase the talent pool from which they are sampled is not necessarily optimal for high output individual contributors.

My intent is not to put down maintainers of scientific software! It's super cool and super important.

I see the damage a person decades in an industry can do when they cluelessly and energetically start to test and implement a new shiny thing on an industrial codebase.

When the product brings in hundreds of millions a year, there is incentive to patch up the damage so you can have future releases and continue the business. I'm not sure how much resources a scientific codebase maintenance could use just to patch up a mountain of architectural and runtime damage.

[+] dahart|2 years ago|reply

Want to give any examples or reasoning rather than state pure opinion? I’m not a fan of using “lie” when you believe something isn’t true. Lie implies intentional dishonesty, and there’s absolutely no reason to suspect the author doesn’t believe what they said. Their experience certainly could have involved larger messes made by programmers than scientists. Just say you think it’s not true, and why, even if lie seems funny or you don’t mean to imply dishonesty.

It appears that you are not even talking about the same problem as the author. You seem to be talking about people who all define themselves as programmers, some of whom have more experience than others. The author wasn’t talking about new-hire programmers, they were talking about experienced physicists, chemists, biologists, etc., who have been doing some programming, possibly for a long time.

Either way, most of my experience is with all-programmer teams, and I have to say I’ve seen the experienced programmers make far bigger and costlier messes. The people who can’t really program might always make a lot of messes, but they make very small messes, and nobody puts them in charge of teams or lets them do that much process critical work without oversight or someone re-writing it. I’ve watched very good very experienced programmers make enormous mistakes such as engaging in system-wide rewrites that turn everything into a mess, and that cost many millions of dollars, only to take years longer than they estimated, and to come out the other end admitting it was a mistake. There was also the time a senior programmer tried to get really clever with his matrix copy constructor, and caused an intermittent crash bug only in release builds that triggered team-wide overtime right before a deadline. He was incredulous at first when we started to suspect his code, and I had to write a small ad-hoc debugger just to catch it. I calculated the dollar cost of his one line of cleverness in the several tens of thousands of dollars.

[+] jurschreuder|2 years ago|reply

This is so true I don't think I ever read something so true.

It's not even scientists vs software developers. It's people who are really into software development and clean code.

They say the program needs a total rewrite and proceed to add 20 layers of inheritance and spreading out every function over 8 files.

Ever since I make sure to repeat my mantra every week to developers:

How maintainable code is is measured in how many files you have to edit to add one feature.

[+] logicchains|2 years ago|reply

>They say the program needs a total rewrite and proceed to add 20 layers of inheritance and spreading out every function over 8 files.

Anyone who in 2023 still thinks inheritance is a good idea for anything other than a few very specialised use-cases is not somebody who seriously cares about the craft of software development, not somebody who's put any effort to study programming theory and move beyond destructive 1990s enterprise Java practices. Widespread usage of inheritance inevitably makes code harder to reason about and refactor, as anyone who's compared code in Java to code for similar functionality in Rust or Go would see (both Rust and Go deliberately eschew support for inheritance due to the nightmares it can cause).

[+] o11c|2 years ago|reply

Wrong.

How maintainable code is is measured in how well you know where to change something, and how certain you are that it did the right thing without side effects.

The fatal error of the linked article is that bad scientific code often suffers from correctness problems - not just theoretical concerns, but the "negates the main point of this paper" kind of thing.

[+] locallost|2 years ago|reply

To be honest it's mostly not their fault. Most people want to do the right thing and that's what they're taught. Doing things differently is frowned upon, and most people don't want to stick their neck out and say the emperor is naked.

Recently at work some people argued "things" (methods, classes, even files) should have a limit in size. I think that's valid thinking because you want to strive to having smaller components that you can reuse and compose, if you are able to do that. But what happened is that people started creating dozens of little files containing a function each and then importing those. To be it's obvious that this is now a lot worse because the complexity is still the same, just spread out across dozens of files. But most people were somehow convinced that they're "refactoring" and that they're doing best practices of keeping things small.

[+] fulafel|2 years ago|reply

> Simple-minded, care-free near-incompetence can be better than industrial-strength good intentions paving a superhighway to hell. The "real world" outside the computer is full of such examples.

Overengineering is insidious - "It is difficult to get a man to understand something, when his salary depends upon his not understanding it". A team can sell a solution better than a single person fixing something without making a big deal out of it. You get organizational clout and inertia on your side when you make something big and expensive.

And then complex systems are by nature hard to reason about and by extension hard to critique.

So many things come down to "complexity is the enemy".

[+] scj|2 years ago|reply

Best practices tend to be overkill for small codebases that have few users, which encompasses the majority of scientific code.

Sheer tenacity is typically sufficient for scientific codebases.

[+] wilg|2 years ago|reply

Is this article just two strawmen fighting?

[+] nickm12|2 years ago|reply

> I've been working, ... in an environment dominated by people with a background in math or physics who often have sparse knowledge of "software engineering". ... Invariably, the biggest messes are made by the minority of people who do define themselves as programmers.

Interesting switch in language here from "software engineering" to "programmers". There is of course a long history of debate on these terms, whether there is a meaningful distinction, and what qualifies as engineering versus programming.

Wherever you stand on this debate, there are a number of practices of software developers that tend to be used more towards the "engineering" side. Two of the most essential in my mind are peer code reviews and automated testing of changes (with tests, linters, type-checkers, code formatters, profilers, fuzzers, etc.).

This post doesn't talk about any of these practices or whether the so-called "programmers" messing up the scientific code are using them. I'd say if the people messing up the code are not actually advocating for using software development tools to write better code they are not actually applying software engineering practices to their code.

324 comments