J_Thomas's comments

J_Thomas | 13 years ago | on: Practices in source code sharing in astrophysics

"I spent years of my life perfecting these, learning about algorithms, learning the intricacies and quirks of the various datasets we use-- if someone wants to take my place in this community as "that guy" then I expect them to devote as much time as I have to learning these techniques inside out and then to do something better than me. I have no desire to pass my code off to a masters student and let him naively plug some dataset into it-- firstly because no one should ever rely on a black box in this field, and secondly because these codes all need to be tweaked to account for the different instruments and data structures being used."

The result is that you are important and no one else can check your work. That is good for you in a way, but there ought to be an arrangement that serves you better.

In my junior year of college, some psychologists told me about a statistician who, once a year, would write a paper demonstrating ways that psychologists misused some statistical technique. He would quote 20 or 30 psychology papers and explain why their statistics were wrong and why, therefore, their conclusions were wrong. They lived in fear of him.

If there existed robust and reasonably transparent code to do what you do, along with thorough documentation showing users where the pitfalls are, and if you got a valued publication every time you showed that a significant result was done wrong and supplied the better version, you would very likely be better off. I'm pretty sure astronomy would be too.

I don't know how we could get from here to there, but it's something to consider.

J_Thomas | 13 years ago | on: Practices in source code sharing in astrophysics

"Replication -- an exact duplication of a study and hopefully its results -- is a cornerstone of science. And without full disclosure of the original study's method and results, replication isn't possible."

In my first quarter of grad school in statistics, a girl a couple of years ahead of me quit. She was doing an internship with an MD, and every time a newly diagnosed kid entered the project, he pondered which group to put the kid in: "This one's going to die, where do I put him to make the results come out right?"

His institution was considered second-class in that field, so he had gotten money to replicate a Harvard result. If he got the right results, it would make him look more competent and he might get better grants. So he was doing his best to fudge the results so they would come out right.

She tried to argue that he should do his statistics correctly, and he disagreed. She was so upset that her work was pointless, and that her career would be too, that she quit entirely.

Ideally, scientists would be rewarded for doing open science correctly. In many ways that is not the case now, and we should look for ways to fix it.

How can average, mediocre scientists get job security without keeping secrets? Perhaps we give too many people the chance to be professional scientists, so that too many of them must lose out?

J_Thomas | 13 years ago | on: Practices in source code sharing in astrophysics

It's good and useful for independent researchers to collect their own data and analyse it. That's a check on the data collection methods.

It's also useful for others to check the analysis of trusted data. If they do it with their own methods and get a different result, then it's time to compare the methods in earnest.

It's a good thing for the original researchers to release their code too. But looking for bugs in the code is no substitute for doing independent analysis and independent data collection. It's an important supplement, but it isn't enough.
