Why scientific programming does not compute

[+] dasil003|14 years ago|reply

For all the talk of "best practices" and "training" the depressing truth is that guaranteeing correct software is incredibly difficult and expensive. Professional software engineering practices aren't nearly sufficient to guarantee correctness with heavy math. The closest thing we have is NASA where the entire development process is designed and constantly refined in response to individual issues to create the checks and balances with the lofty goal of approaching bug impossibility at an organizational level. Unfortunately this type of evolutionary process is only viable for multi-year projects with 9-figure budgets. It's not going to work for the vast majority of research scientists with limited organizational support.

On the positive side, such difficulty is also in the nature of science itself. Scientists already understand that rigorous peer review is the only way to come to reliable scientific conclusions over time. The only thing they need help with understanding is that the software used to come to these conclusions is as suspect as—if not more so than—the scientific data collection and reasoning itself, and therefore all software must be peer-reviewed as well. This needs to be ingrained culturally into the scientific establishment. In doing so, the scientists can begin to attack the problem from the correct perspective, rather than industry software experts coming in and feeding them a bunch of cargo cult "unit tests" and "best practices" that are no substitute for the deep reasoning in the specific domain in question.

[+] schleyfox|14 years ago|reply

I've spent a bit of time on the inside at NASA, specifically working on earth observing systems. There is a huge difference between the code quality of things that go into control systems for spacecraft (even then, meters vs. feet, really?) and the sort of analysis/theoretical code the article talks about. Spacecraft code gets real programmers and disciplined practices, while scientific code is generally spaghetti IDL/Matlab/Fortran.

There is a huge problem with even getting existing code to run on different machines. My team's work was primarily dealing with taking lots of project code (always emailed around, with versions in the file name) and rewriting it to produce data products that other people could even just view. Generally we'd just pull things like color coding out of the existing code and then write our processors from some combination of specifications and experimentation.

I'd agree that "unit tests" and trendy best practices are probably not the full answer, but the article is correct in emphasizing documentation, modularity, and source control. Source control alone would protect against bugs produced by simply running the wrong version of code.

[+] Confusion|14 years ago|reply

  the depressing truth is that guaranteeing correct software
  is incredibly difficult and expensive

There is a world of difference between the correctness of industrial programs that follow 'cargo cult best practices' and the correctness of scientific programs. This is achieved without incurring incredible expenses. That we can't go all the way by (practical) definition doesn't mean we shouldn't try to get further.

One of the main problems is convincing, especially young, scientists that their code sucks. Young programmers, you can coach. You review their code, teach them what works and what doesn't and they get better. Scientists that happen to write progams, they don't learn to become better programmers: they've got other things to worry about. There's nobody to help them and since they're usually highly intelligent and overestimate their capabilities in things they don't want to spend time on (which is a way of justifying for yourself not to spend time on it), they need all the more guidance to become good.

[+] demian|14 years ago|reply

I agree that correct software can not be achive by industry's practices.

BUT isn't better to use "cargo cult best practices", as you call them, than code-and-fix without any kind of formal test or documentation?

The hole point of these software programming practices is to improve overall quality with limited resources, not to craft perfect code.

[+] gallamine|14 years ago|reply

I'm a PhD student in Electrical Engineering. I'm currently working on a Monte Carlo-type simulation for looking at the underwater light field for underwater optical communication (no sharks!). I'm doing the development in MATLAB and I recently put all my code up on Github (https://github.com/gallamine/Photonator) to help avoid some of these problems (lack of transparency). Even if nobody ever looks/uses the code, I know every time I do a commit there's a change someone MIGHT and I think it helps me write better code.

The problem with doing science via models/simulation is that there just isn't a good way of knowing when it's "right" (well, at least in a lot of cases), so testing and verification are imperative. I can't tell you how many times I've laid awake at night wondering if my code has a bug in it that I can't find and will taint my research results.

I suspect another big problem is that one student writes the code, graduates, then leaves it to future students, or worse, their professor, to figure out what they wrote. Passing on the knowledge takes a heck of a lot of time, especially when you're pressed to graduate and get a paycheck).

There's got to be a market in this somewhere. Even if it was just a volunteer service of "real" programmers who would help scientists out. I spent weeks trying to get my code running on AWS, which probably would have taken a few hours from someone who knew what they were doing. I also suspect that someone with practice could make my simulations run at twice the speed, which really adds up when you're doing hundreds of them and they take hours each.

[+] john_b|14 years ago|reply

I'm a M.S. student in mechanical engineering facing a similar situation, except I haven't put any code on Github (my advisor wants to keep it proprietary, but I probably would not bother putting it up even if he were ok with it).

I've written around 15000 lines of MATLAB for my research and only a handful of people will ever need to see it. Some is well-structured and nicely commented, but other parts are incomprehensible and were written under severe time constraints. My advisor is not much of a programmer and will not be able to figure it out, and I feel bad for leaving a pile of crappy code to the person who inevitably follows in my footsteps, but I ultimately have a choice between writing fully commented, well-tested, and well-structured code and graduating a semester late (at the cost of several thousand dollars to myself), or writing code that's "just good enough" to get results on time. This is a solo project (there is no money for a CS student to intern) and I'm not getting paid to write code unlike a professional programmer, so every second I spend improving my code beyond the bare minimum costs me time and money.

Even if I were able to tidy up and publish all of my code, most mechanical engineers would not be able to understand it because most can't write code. Those who can mostly use FORTRAN, although C is becoming more common. Nonetheless, even those who could understand my code would have little incentive to read through 15000+ lines of code.

Unfortunately, as far as research code is concerned, a lot of trust is still required on the part of the reader of the publication. I agree that the transfer of knowledge should be handled differently, but until there is a strong incentive for researchers to write good code it will continue to be bad. Especially when many research projects only require the code to demonstrate something, after which it can be put in the closet.

[+] pwang|14 years ago|reply

There is a market, and it's called libraries. Eventually you will use a language where software carpentry and code reuse is a core feature, and tested, modular libraries for not only core algorithms, but also deployment and dev-ops stuff (like managing a compute cluster on the cloud) will have standard approaches.

This is starting to shape up on the Python side of things, but it has stagnated a little bit. People who can and do write the foundational code are oftentimes too focused on making the code work, and not at all focused on improving the quality of the ecosystem that their code is part of. Open Source is a great mechanism for many things, but polishing up the last 20% is not one of them.

[+] FrojoS|14 years ago|reply

The people at Willow Garage think there is a market for this [1].

http://www.willowgarage.com/blog/2010/04/27/reinventing-whee...

[+] JabavuAdams|14 years ago|reply

Huh. Cool. Are you familiar with Metropolis light-transport? http://en.wikipedia.org/wiki/Metropolis_light_transport

I've read about it in the context of speeding up global-illumination path-tracing for computer graphics.

I think it's based on work that was originally done for neutron scattering.

[+] xtracto|14 years ago|reply

"Essentially, all models are wrong, but some are useful."

[ http://en.wikiquote.org/wiki/George_E._P._Box ]

[+] gte910h|14 years ago|reply

I want to write a "software style guide" for journalists and their editors.

Software and Code are both mass nouns in technical language.

"Code" can be in programs (aka, things that run), libraries (things that other programmers can use to make programs), or in samples to show people how to do things in their programs or libraries. Some people call short programs scripts.

When you feel you should pluralize "software", you're doing something wrong. You might want to use the word programs, you might want to use the word products, you might want to just use it like a mass noun "It turns out, thieves broke into the facility and stole some of the water", etc when talking about a theft of software "It turns out, thieves broken into the facility and stole some of the software".

[+] ahi|14 years ago|reply

"he attempted to correct a code analysing weather-station data from Mexico."

This annoys me, and it is everywhere. It indicates the writer has no idea what they're writing about and presumes that it's not a process but a matter of getting the right answer. "Hold on a sec, let me get out my Little Orphan Annie's Secret Decoder Ring."

(sibling deleted and moved here)

[+] szany|14 years ago|reply

Actually it's a science thing. In a scientific context "code" is understood to mean "program". For example: http://scholar.google.com/scholar?q=%22population+synthesis+...

I'm not sure why this is though.

[+] neutronicus|14 years ago|reply

I believe "code" meaning "program" is a relic of early military software development, and as such is probably older than your terminology.

[+] losvedir|14 years ago|reply

Is this not simply a British English thing? I assumed it was, like "maths", since Nature is a British publication. HN users from the UK, can you confirm? gte is speaking about constructions in the article like:

"As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists."

"As recognition of these issues has grown, software experts and scientists have started exploring ways to improve the codes used in science."

[+] unknown|14 years ago|reply

[deleted]

[+] jzila|14 years ago|reply

My girlfriend is a PhD student in a pharmacology lab. I'm a software engineer working for an industry leader.

Once, she and the lab tech were having issues with their analysis program for a set of data. It was producing errors randomly for certain inputs, and the data "looked wrong" when it didn't throw an error. I came with her to the lab on a Saturday and looked through the spaghetti code for about 20 minutes. Once I understood what they were trying to do, I noticed that they had forgotten to transpose a matrix at one spot. A simple call to a transposition function fixed everything.

If this had been an issue that wasn't throwing errors, I don't know whether they would have even found the bug. I've been trying to teach my gf a basic understanding of software development from the ground up, and she's getting a lot better. But this does appear to be a systemic problem within the scientific community. As the article notes, more and more complicated programs are needed to perform more detailed analysis than ever before. This problem isn't going to go away, so it's important that scientists realize the shortcoming and take steps to curb it.

[+] andrewcooke|14 years ago|reply

i'm in a similar position to you (although i started out as an academic i've worked in the software industry for ages and so end up helping my astronomer partner).

anyway, i disagree slightly with your analysis. in my experience academics know that they suck at the "engineering" part and, to make up for it, are very diligent in making sure that the results "feel right". so i don't think what you described was luck - that's how they work.

in comparison, what drives me crazy, is that if they learnt to use a few basic tools (scm, libraries, an ide, simple test framework) they could save so much time and frustration.

[related anecdote: last year i rewrote some c code written by a grad student that was taking about 24 hours to run. my python translation finished in 15 minutes and gave the same answer each time it was run (something of a novelty, apparently)].

[+] reinhardt|14 years ago|reply

Not sure how your anecdote relates to the conclusion. Forgetting, or even knowing why, to transpose a matrix is not an example of a problem that can be solved by "a basic understanding of software development". Hell, I'm sure there are many decent hackers that don't know what a matrix is, let alone spot such errors within a long sequence of computations.

[+] xtracto|14 years ago|reply

The problem I see with your girlfriend's program is more of a "verification" issue.

In the simulation sub-field I am there is this "research development process" which includes "verification" and "validation" after the model is performed.

Part of the verification is done by "third party code reviews" in which a party unrleated to the program/project reviews the model description (word document) and does a line-by-line analysis of the code to see that the program matches the code.

I did that during my PhD (a Professor at INSEAD paid me to do a code review of a model).

In the case of your girlfriend's lab, they catched the error via "face validation" (the results looked wrong).

[+] GoogleMeElmo|14 years ago|reply

Yes this is a huge problem. I am a software engineer working at a research institute for bioinformatic. The biggest problem I encounter in my struggle for clean maintainable code, is that management down prioritize this task quite heavily.

The researchers produce code of questionable quality that needs to go into the main branch asap. Those few of the researchers that know how to code (we do a lot of image analysis), don't know anything about keeping it maintainable. There is almost a hostile stance against doing things right, when it comes to best practices.

The "Works on my computer" seal of approval have taken a whole new meaning for me. Things go from prototype to production by a single correct run on a single data set. Sometimes its so bad I don't know if I should laugh or cry.

Since we don't have a single test or, ever take the time to do a proper build system, my job description becomes mostly droning through repetitive tasks and bug hunting. It sucks the life right out of any self respecting developer.

There, I needed that. Feel free to flame my little rant down into the abyss. :)

[+] bh42222|14 years ago|reply

As a general rule, researchers do not test or document their programs rigorously, and they rarely release their codes, making it almost impossible to reproduce and verify published results generated by scientific software, say computer scientists.

Just stop doing that!

Seriously, testing is not wasted effort and for any project that's large enough it's not slowing you down. For a very small and simple project testing might slow you down, for bigger things - testing makes you faster! And the same goes for documentation. And full source code should be part of every paper.

Many programmers in industry are also trained to annotate their code clearly, so that others can understand its function and easily build on it.

No, you document code primarily so YOU can understand it yourself. Debugging is twice as hard as coding, so if you're just smart enough to code it, you have no hope of debugging it.

[+] scott_s|14 years ago|reply

The point is that since software development is not their main goal or background, their practices tend to be ad-hoc. We know the value of testing and documentation, but they do not. People don't know to stop doing something until they know it's a bad practice. And they're not going to know it's a bad practice until they discover that fact on their own (which can be a slow process) or someone teaches them (faster, but potential cultural problems).

That they should is basically a given in the article. The question is how to make it happen.

[+] neutronicus|14 years ago|reply

The thing about scientific code is that it's often a potential dead end. The maintenance phase of the software life cycle is not as assured as it is in industry.

Writing good engineering software is not the scientist's goal so much as demonstrating that someone else with a greater tolerance for tedium (also someone better-paid) could write good engineering software.

[+] pygy_|14 years ago|reply

You won't see any clean code written by scientists until (major) journals make it mandatory to submit code for peer review and publication.

When it happens, I hope that they'll manage to agree on a sensible license (even though I won't set my hopes too high).

[+] notarealname|14 years ago|reply

[New account for anonymity]

An often neglected force in this argument is that many practitioners of "scientific coding" take rapid iteration to its illogical and deleterious conclusion.

I'm often lightly chastised for my tendencies to write maintainable, documented, reusable code. People laugh guiltily when I ask them to try checking out an svn repository, let alone cloning a git repo. It's certain that in my field (ECE and CS) some people are very adamant about clean coding conventions, and we're definitely able to make an impact bringing people to use more high level languages and better documentation practices.

But that doesn't mean an hour goes by without seeing results reverse due to a bug buried deep into 10k lines of undocumented C or Perl or MATLAB full of single letter variables and negligible modularity.

[+] gte910h|14 years ago|reply

What I am hearing from you, is that we need to lobby the US government into open code requirement for grant work done under their purvey.

Also some sort of git front end that unwilling people could use would make things better?

[+] brohee|14 years ago|reply

Next they'll discover than when those scientists leave academia and become quants, they don't magically become any better at coding (but at least they now have access to professionals, if they recognize the need).

[+] gwern|14 years ago|reply

An interesting citation http://portal.acm.org/citation.cfm?id=188228 :

> This paper describes some results of what, to the authors' knowledge, is the largest N-version programming experiment ever performed. The object of this ongoing four-year study is to attempt to determine just how consistent the results of scientific computation really are, and, from this, to estimate accuracy. The experiment is being carried out in a branch of the earth sciences known as seismic data processing, where 15 or so independently developed large commercial packages that implement mathematical algorithms from the same or similar published specifications in the same programming language (Fortran) have been developed over the last 20 years. The results of processing the same input dataset, using the same user-specified parameters, for nine of these packages is reported in this paper. Finally, feedback of obvious flaws was attempted to reduce the overall disagreement. The results are deeply disturbing. Whereas scientists like to think that their code is accurate to the precision of the arithmetic used, in this study, numerical disagreement grows at around the rate of 1% in average absolute difference per 4000 fines of implemented code, and, even worse, the nature of the disagreement is nonrandom. Furthermore, the seismic data processing industry has better than average quality standards for its software development with both identifiable quality assurance functions and substantial test datasets.

[+] gwern|14 years ago|reply

There's a later paper available for reading: http://www.leshatton.org/Documents/Texp_ICSE297.pdf

[+] saulrh|14 years ago|reply

Something I heard from one of my professors once: "A programmer alone has a good chance of getting a good job. A scientist alone has a good chance of getting a good job. A scientist that can program, or a programmer that can do science, is the most valuable person in the building."

[+] enjalot|14 years ago|reply

I'm finishing up a degree in Computational Science where we are essentially trained in computational and mathematical techniques used in the physical sciences, and all I can think about is becoming an artist.

I don't think it's the science that adds value, I think it's the programming. The thing is, programming allows you to automate, simulate, measure and visualize complex processes. Science is all about complex processes, so if you have more powerful tools available to understand them, you will be much more valuable. Add to that, many of the physical sciences are hitting limits of physical experimentation and require simulations for further understanding.

I don't think the power of programming has truly shown itself, it should revolutionize every industry. It brings with it a different attitude towards solving problems and opens up new realms of possibilities. Social sciences are finally starting to look like real science thanks to big data and we have new knowledge industries. I'm personally most interested in how much art and education will change thanks to new powers of interactivity.

[+] MasterScrat|14 years ago|reply

Similarly, from Zed Shaw (http://learnpythonthehardway.org/book/advice.html) :

"People who can code in the world of technology companies are a dime a dozen and get no respect. People who can code in biology, medicine, government, sociology, physics, history, and mathematics are respected and can do amazing things to advance those disciplines."

[+] cosgroveb|14 years ago|reply

This is also true in business. You should always remember that you're not just a developer or a programmer but that you're solving business problems. That's the only way to truly make yourself invaluable.

[+] nkassis|14 years ago|reply

This is so true. I work in a research lab and I'm trying to interest myself more to the science and it's really helping my coding work. It's easier to improve software when you know what your users need.

[+] unknown|14 years ago|reply

[deleted]

[+] a_dy|14 years ago|reply

I can attest to this.

[+] earl|14 years ago|reply

That has not been my experience. I spent years working on DNA aligners then in a wetlab building software for confocal laser microscopy. In both locations, the best paid and most highly valued people were the scientists. If you, say, were a good developer with masters in stats and a strong understanding (somewhere between an undergrad and an MS) of the relevant science... you were paid 1/2 as much as you would be paid if you did computational advertising.

And yet there aren't many good developers doing science. Weird, huh?

[+] ajdecon|14 years ago|reply

(Disclaimer: my background is in materials physics, and it may be different in other fields. But I doubt it.)

Unfortunately there is very little direct incentive for research scientists to write or publish clean, readable code:

- There are no direct rewards, in the tenure process or otherwise, for publishing code and having it used by other scientists. Occasionally code which is widely used will add a little to the prestige of an already-eminent scientist, but even then it rarely matters much.

- Time spent on anything other than direct research or publication is seen as wasted time, and actively selected against. Especially for young scientists trying to make tenure, also the group most likely to write good code. Many departments actually discourage time spent on teaching, and they're paid to do that. Why would they maintain a codebase?

- Most scientific code is written in response to specific problems, usually a body of data or a particular system to be simulated. Because of this, code is often written to the specific problem with little regard for generality, and only rarely re-used. (This leads to lots of wheel re-invention, but it's still done this way.) If you aren't going to re-use your code, why would others?

- If by some miracle a researcher produces code which is high-quality and general enough to be used by others, the competitive atmosphere may cause them to want to keep it to themselves. Not as bad a problem in some fields, but I hear biology can be especially bad here.

- Most importantly, the software is not the goal. The goal is a better understanding of some natural phenomenon, and a publication. (Or in reverse order...) Why spend more time than absolutely necessary on a single part of the process, especially one that's not in your expertise? And why spend 3x-5x the cost of a research student or postdoc to hire a software developer at competitive rates?

I went to grad school in materials science at an R1 institution which was always ranked at 2 or 3 in my field. I wrote a lot of code, mostly image-processing routines for analyzing microscope images. Despite it being essential to understanding my data, the software component of my work was always regarded by my advisor and peers as the least important, most annoying part of the process. Time spent on writing code was seen as wasted, or at best a necessary evil. And it would never be published, so why spend even more time to "make it pretty"?

I'm honestly not sure what could be done to improve this. Journals could require that code be submitted with the paper, but I really doubt they'd be motivated to directly enforce any standards, and I have no faith in scientists being embarrassed by bad code. Anything not in the paper itself is usually of secondary importance. (Seriously, if you can, check out how bad the "Supplementary Information" on some papers is.) But even making bad code available could help... I guess. And institutions could try to more directly reward time put into publishing good code, but without the journals on board it may be seen as just another form of "outreach"--i.e., time you should have been in lab.

I did publish some code, and exactly two people have contacted me about it. That does make me happy. But many, many more people have contacted me to ask about how I solved some problem in lab, or what I'm working on now that they could connect with. (And are always disappointed when I tell them I left the field, and now work in high-performance computing.) Based on the feedback of my peers... well, on what do you think I should've spent my time?

[+] arctangent|14 years ago|reply

I think it is unreasonable to expect that a person will be a good programmer just because (a) they are a scientist and (b) their current project can be assisted by computers.

Is it not sensible, perhaps, to have a dedicated group of programmers (with various specialities) available as a central resource to assist the scientists with their modelling? (I am imagining a central pool whose budget would be spread over several areas.)

I personally love working on toy projects related to science. Maybe we hackers with time for that kind of thing should volunteer in some way to assist with the technical aspects of research that is directed by a scientist? I'm not sure I'd even care about getting a credit on a research paper so long as I could post pretty pictures and graphs on my blog...

[+] ANH|14 years ago|reply

From personal experience, I attest that it can be more difficult than pulling teeth to get a scientist to commit code to a version control system.

[+] pwang|14 years ago|reply

Greg Wilson once commented that the subversive way to get scientists to use source control was not to pitch it as a code history tool, but rather as a nifty way to sync up code between their work machines, home machines, etc. He said he had a lot more traction with that than trying to lecture them about having code history.

[+] scott_s|14 years ago|reply

One of the main sources in the article is a study from the 2009 Workshop on Software Engineering for Computational Science and Engineering. One of the workshop's organizer's has a report of the overall conference which is interesting: http://cs.ua.edu/~carver/Papers/Journal/2009/2009_CiSE.pdf

[+] mclin|14 years ago|reply

Rather than building these data analysis/visualization programs from scratch each time, my thought is that scientists should instead be writing them as modules for a data workflow application like RapidMiner.

If you haven't heard of RapidMiner, you basically edit a flowchart where each step takes inputs and outputs, eg take some data and make a histogram, or perform a clustering analysis.

Video of someone demoing it: http://www.youtube.com/watch?v=TNESlvXp47E

This way, the scientists can focus on the algorithms and not have to worry about all the other details of creating useable, maintainable software.

[+] gwern|14 years ago|reply

There are a lot of suggestions that the code and data be required to publish.

Sorry guys, but that hasn't worked so far: the economics journal _Journal of Money, Credit and Banking _, which required researchers provide the data & software which could replicate their statistical analyses, discovered that <10% of the submitted materials were adequate for repeating the paper (see "Lessons from the JMCB Archive", Volume 38, Number 4, June 2006).

Oops.

[+] sliverstorm|14 years ago|reply

Why not just hire comp scientists or programmers permanently? Adjust the company model, permanently segregate the work?

[+] snissn|14 years ago|reply

there's not nearly enough open source academic projects, nor is there any sort of pervasive culture that encourages one.. besides the litany of examples that could be put together to show that open source + academia does exist and does work, I've read way too many computational physics or computational chemistry or computational anything academic papers that simply do not publish source code, and imo there's no good excuse for it, other than the usual, funding, or copyright / university IP

[+] rflrob|14 years ago|reply

Where do most programmers get this exposure to best practices like version control, unit testing, etc? I took a few early-mid level CS classes, and there was a relatively cursory emphasis on readable code, there was barely any on any of the sorts of things that lead to well-maintained projects. If these are the sorts of things that one learns at your first internship, then it's no wonder that academics in other disciplines don't have any exposure to it.

[+] jleyank|14 years ago|reply

This is a difficult situation. Is it easier to train the domain experts to be competent programmers or train the competent programmers to be domain experts? In a research environment, I worry there's little time or interest in developing specs that can change in an instant or can't be written until the physics is understood.

We find it quite difficult trying to get programming out of people who don't know why Carbon has 4 bonds while Nitrogen has 3, for example.

[+] gammarator|14 years ago|reply

My feeling is that a one-semester required course for students in "software carpentry" [1] (as developed by Greg Wilson and discussed in the article) would cure many of the most serious ills in scientific software development. Students can't know they should be using version control, debuggers, and testing if they don't even know such things exist.

[1] http://software-carpentry.org/

164 comments