Roger Boisjoly dies at 73; engineer tried to halt Challenger launch

375 points| pwg | 14 years ago |latimes.com | reply

101 comments

[+] raphman|14 years ago|reply
From the Rogers report [1]:

Engineers at Thiokol also were increasingly concerned about the problem. On July 22, 1985, Roger Boisjoly of the structures section wrote a memorandum predicting NASA might give the motor contract to a competitor or there might be a flight failure if Thiokol did not come up with a timely solution.

Nine days later (July 31) Boisjoly wrote another memorandum titled "O-ring Erosion/Potential Failure Criticality" to R. K. Lund, Thiokol's Vice President of Engineering:

"The mistakenly accepted position on the joint problem was to fly without fear of failure and to run a series of design evaluations which would ultimately lead to a solution or at least a significant reduction of the erosion problem. This position is now changed as a result of the [51-B] nozzle joint erosion which eroded a secondary O-ring with the primary O-ring never sealing. If the same scenario should occur in a field joint (and it could), then it is a jump ball whether as to the success or failure of the joint because the secondary O-ring cannot respond to the clevis opening rate and may not be capable of pressurization. The result would be a catastrophe of the highest order-loss of human life."

Boisjoly recommended setting up a team to solve the O-ring problem, and concluded by stating:

"It is my honest and very real fear that if we do not take immediate action to dedicate a team to solve the problem, with the field joint having the number one priority, then we stand in jeopardy of losing a flight along with all the launch pad facilities."

[1] http://history.nasa.gov/rogersrep/v1ch6.htm

[+] lutorm|14 years ago|reply
Boisjoly recommended setting up a team to solve the O-ring problem

And there was a team working on it. IIRC, a redesign of the joint was being worked on. But no one suggested, until the evening before the launch, that launches should be stopped until the redesign was completed.

[+] dmethvin|14 years ago|reply
The fate of the Challenger engineers is depressing because they were essentially kicked out of their profession for knowing the right and moral answer. Thankfully, very few of us will be faced with these honestly life-or-death technical decisions.

I think there are parallels to right-or-wrong issues that we face in our own industry on a regular basis, for example:

http://johnnye.net/articles/ios-apps-want-your-contacts,-not...

Are phrases like "standard industry practice" and "covered by the click-through license" today's weasel words that rationalize us implementing immoral management demands?

[+] rdtsc|14 years ago|reply
> I think there are parallels to right-or-wrong issues that we face in our own industry on a regular basis, for example:

If our software deals with privacy and personal information then it can even become life or death in some countries. A dissident in a brutal authoritarian regime could lose his life if the software has a bug in it, for example.

The other area where moral choices take place is licensing and patents. This can happen on large scales (as we see when big companies use busybox for their platforms) or on smaller scales (copying code from one project to another without even giving credit).

[+] bwarp|14 years ago|reply
> Thankfully, very few of us will be faced with these honestly life-or-death technical decisions.

You'd be surprised. Check out the computer risks digest - always a sign of how pear shaped stuff can go:

http://catless.ncl.ac.uk/Risks

[+] cellularmitosis|14 years ago|reply
While watching the new iPad UI course from CMU ( http://www.cmu.edu/homepage/computing/2012/winter/ipad-cours... ), the lecturer touched on what happened with the Challenger, using it as an example in a point he was trying to make.

However, I was struck by how wrong he got it. He was making a point about how just having the data available isn't enough - the modern challenge is to make sense of big data in a meaningful way through visualization. And to back this point up, he made it sound like a bunch of engineers didn't look closely at the data sheet for the rubber used in their O-rings and oopsy! The Challenger blew up. Shrug, who knew?

I'll tell you who knew. Roger. "The engineers" knew perfectly well what was going to happen. It was management who refused to listen to them and launched anyway.

Feynman's account of his involvement in investigating this disaster in "The Pleasure of Finding Things Out" was excellent. In one exercise, he had the engineers and management write down what they honestly thought the failure rate of the shuttle was. Management quoted 1 in 100,000, while engineering quoted 1 in 100. The Challenger disaster was a symptom of a complete breakdown in communication between engineering and management at NASA.

[+] krschultz|14 years ago|reply
One of my college mechanical engineering professors was a former NASA engineer involved in determining risk and investigating problems. He gave a lecture & led a discussion on the Challenger O-ring case (and I have since read a lot of the reports). As you mentioned, the CMU lecturer wasn't even close to the real story. But it also isn't so simple as to say 'management didn't listen'.

The engineers knew what was going to happen with a fairly high degree of certainty - enough that the launch shouldn't have happened. Then they turned to management and tried to present their findings. Clearly the business guys didn't want to be the ones to hold up the launch; they had a strong incentive not to. So the engineers had to convince them why the company should basically stick its neck out as one of hundreds and hundreds of vendors and stop this launch. The engineers didn't do the best job of communicating the certainty of the problem, the magnitude of the problem, etc. So it was a little bit of both.

That lecture was probably 8 years ago for me, but it definitely left a lasting impression. Today I work on systems that truthfully are more complicated than the space shuttle and also have a lot more lives at stake. "Intellectual honesty" is a phrase I live by. There have been times I fucked up some math and knew we needed to fix something - even though it would cost money and cause delay - but I stuck to my guns and made sure it happened. The alternative is too likely to lead to disaster. It keeps me up at night, and those are the near misses. I can't imagine what that guy went through knowing they almost stopped it, but didn't.

[+] lutorm|14 years ago|reply
I just read "The Challenger Launch Decision" (after someone here recommended it) and it has a more nuanced view of what went wrong. While there was a long history of concern about the O-rings, there was also a history of getting conflicting data, and of implementing "fixes" that appeared to work, only for the problem to come back later. Even the engineers themselves, while expressing concern as in the memo mentioned in the article, did not, until the launch decision conference the night before, voice anything as strong as "stop the launch".

While reading the book, I was struck by two things. One was the "slippery slope" introduced by first seeing something and arguing it was not a launch risk. With the decision-making system in place, that made it very difficult, when another datum came in, to say that the two data together implied something was wrong, because they had already argued that the first datum was not a risk. This is of course a well-known phenomenon, but it didn't seem like the system was well-equipped to deal with sparse and sometimes contradictory data.

The second was the feeling that what was happening was that the engineers were observing a random process and post-rationalizing cause and effect. (I had just read "Fooled by Randomness", which may have contributed to this.) Every piece of information was made to fit into some model, but it seemed like no one was considering that maybe what they were seeing was just inherent randomness. With that view, the progressively bad outcomes that happened before indicated that what was observed was a poorly-characterized random process with a fat tail. At some point, one rationalization was that they had a "safety factor of 4 left", but if you have indications of a fat-tailed process, that's not much to bank on.

[+] po|14 years ago|reply
The idea that the challenger accident came from poor data analysis comes from Tufte's (well-read) Visual Explanations:

http://www.onlineethics.org/CMS/profpractice/exempindex/RB-i...

His argument (which the linked article takes to task) is that the engineers knew but failed to persuade because they could not put together a convincing case. While Tufte's argument might have some merit, there was probably a lot more going on than just that. Perhaps even with a very compelling case presented, the launch would have gone through.

[+] ThomPete|14 years ago|reply
I think the point with that was that those who needed to know didn't know because they hadn't been presented the data in a meaningful way.

So you are correct that he knew, but you are wrong in claiming the lecturer got it wrong. It is exactly because those who make the decisions don't have the necessary know-how that visualizing the data in a meaningful way is so much more important.

[+] gallamine|14 years ago|reply
I don't dispute what you say, but I think it's too easy to absolve the engineers of any fault. If they knew that the public was being lied to, and that there was a good chance of people being killed, they should have been more vocal. When lives are on the line, you can't just say, "well, I told management ... "
[+] mcdillon|14 years ago|reply
"When he was pressed by NASA the night before the liftoff to sign a written recommendation approving the launch, he refused, and later argued late into the night for a launch cancellation. When McDonald later disclosed the secret debate to accident investigators, he was isolated and his career destroyed."

This is quite simply astonishing; everything I have ever learned in my engineering classes said to do what McDonald did and look what happened to him.

[+] johngalt|14 years ago|reply
Having integrity would be easy if you never had to fight and the results always benefitted you. Life is not a fair thing.

Engineers place a lot of emphasis on having the right answer and almost no emphasis on ensuring their influence. It doesn't matter how right you are if nobody will follow your direction.

[+] ScottBurson|14 years ago|reply
Yes, this to me is the most disturbing aspect of the story. It's one thing that the mistake was made in the first place; no one could be completely sure what would happen. But to then ostracize the people who warned of the possible catastrophe -- that's a corrupt culture.
[+] joering2|14 years ago|reply
This may be slightly off-topic, but this kind of news touches me. It shouldn't, since NASA is a government agency and we all know how bureaucratic it can get, but for some reason NASA always stood out to me as a well-behaved agency with strong morals, an exception among government entities. But this news proves my feeling was wrong.
[+] feralchimp|14 years ago|reply
What's the statute of limitations on negligent homicide?
[+] asynchronous13|14 years ago|reply
Ronald Reagan was supposed to give a state of the union address a couple days after the launch. He wanted to use the success of the space program as part of that speech. While there was no direct order, it's clear that NASA management felt significant political pressure to push the launch forward or risk reduced funding from congress. It's clear they screwed up, just wanted to add some context to the climate when they made these decisions.
[+] damoncali|14 years ago|reply
This comment needs more upvotes. The political pressure on NASA is extreme. The only people doted over more than the astronauts were the politicians. Look into the origins of the Triana satellite and its subsequent fate for a particularly ridiculous example.
[+] 3lit3H4ck3r|14 years ago|reply
Stunning.

"When the space shuttle Columbia burned up on reentry in 2003, killing its crew of seven, the accident was blamed on the same kinds of management failures that occurred with the Challenger. By that time, Boisjoly believed that NASA was beyond reform, some of its officials should be indicted on manslaughter charges and the agency abolished."

"NASA's mismanagement "is not going to stop until somebody gets sent to hard rock hotel," Boisjoly said. "I don't care how many commissions you have. These guys have a way of numbing their brains. They have destroyed $5 billion worth of hardware and 14 lives because of their nonsense." "

[+] afterburner|14 years ago|reply
Yes, nothing has changed. Lessons were not learned. And this was known before the Columbia disaster, in fact not long after the commission on the Challenger. It bears striking resemblances to the financial crisis.
[+] crikli|14 years ago|reply
"Their pleas and technical theories were rejected by senior managers at the company and NASA, who told them they had failed to prove their case and that the shuttle would be launched in freezing temperatures the next morning. It was among the great engineering miscalculations in history."

Horsepucky: the engineers' calculations weren't mis-anything. The historical record has long since proven that the engineers were repeatedly ignored by their management and by bureaucrats at NASA.

[+] damoncali|14 years ago|reply
I spent 6 years as a structural engineer on space shuttle flights. A few thoughts:

While I don't know the people involved with Challenger - I was in 6th grade at the time - it goes well against my own experience that NASA management had anything but the interests of the crew in mind. To a fault. In fact, your average NASA employee doted on astronauts like a star struck little girl. What the crew wanted, the crew got. You could always tell who the astronaut was when you saw a group walking about the centers - he/she was the one whose every semi-whimsical comment extracted voluminous and polite laughter from the others in the group.

I was, however, working when Columbia blew up. In fact, my mission was supposed to fly on it when it got back. Although sad, I feel comfortable saying that most of the people working on these things sort of know it's going to happen from time to time. It wasn't exactly surprising to us or the crew.

Blaming managers and celebrating engineers is overly simplistic. The line is not as well defined as you might think. I had few - if any (I can't think of a single one, actually) - managers (either contractors or NASA employees) who were not experienced engineers.

The safety rules for a shuttle payload, let alone the actual orbiter, are voluminous and arcane. It is the primary reason that very little new technology comes out of the manned space flight program. Everything new is considered too dangerous because it hasn't been flown before.

This stuff is insanely dangerous. It is pretty damn easy to come up with a way some piece of hardware you're working on could kill someone. The complexity is enormous. The number of people involved is in the thousands, and they're spread all over the country. Different centers have different rules.

As a result, you make life-and-death decisions literally every day. It's not such a big deal, because there is a lot of formal process in place to make sure it gets done right. The "standards" are what keep space flight as we know it as safe as it is. Are they or the processes by which they are enforced perfect? Hell no.

The system failed. People failed. But we knew this would happen, and we did it anyway because it's the price of exploring the frontiers. We learned from Challenger. We learned from Columbia. We will learn from the next catastrophic failure. NASA isn't perfect. In fact, you might say the bloated organization and government involvement makes this sort of thing inevitable. But I bet the small privateers exploring manned space flight will run into their own challenges.

Basically, what I'm saying is that we need to keep this in a larger perspective. Obsessing over one failure in what is a centuries-long quest is not helpful. Dissect it, learn from it, and move on.

[+] JGailor|14 years ago|reply
I took a professional ethics class in college, and the professor was a personal friend of Roger. The whole class was about Challenger, and the incredible failure of judgement around its demise. Roger came into one of our classes and spoke to us late in the semester.

After listening to tapes of the trials, interviews, reading transcripts, and reading articles it was very apparent to me that this was a failure of management. The lead engineer, during the discussions of whether to launch the night before, was arguing that the engineering evidence did not support a launch under the temperature conditions projected for the following morning. He was told by Morton-Thiokol business reps to "take off your engineer hat, and put on your manager hat".

Evidence points to this failure happening because NASA needed a PR boost for funding, and M.T. wanted to continue doing business with them delivering solid rocket boosters.

Because Roger Boisjoly spoke to Congress during the hearings he was black-listed from his industry. At no point during the decisions leading up to that disaster did good engineering practices that could have prevented this destruction come into play.

[+] ak217|14 years ago|reply
Thanks for your perspective.

> The system failed. People failed. But we knew this would happen, and we did it anyway because it's the price of exploring the frontiers. ... you might say the bloated organization and government involvement makes this sort of thing inevitable. But I bet the small privateers exploring manned space flight will run into their own challenges.

I bet they will do better, and go farther. They will know a disaster like that will likely ruin their company, so they will make damn sure that the communication process between managers and engineers doesn't break down, and that the process complexity is kept in check.

You're describing (and excusing) a bloated and dysfunctional system that sprang up around the need to manage the complexity of the space shuttle. People tried to fix the organization after Challenger, but the fact that Linda Ham stopped the request for imagery as described by CAIB shows that they failed or it reverted. And as your attitude shows, there is a bit of a fatalist perspective ("bloated organization and government involvement"). In the long term, it has to be fixed, or stuff will keep blowing up.

[+] ArbitraryLimits|14 years ago|reply
> In fact, your average NASA employee doted on astronauts like a star struck little girl. What the crew wanted, the crew got.

This is the question I've always had about the Challenger, but never heard addressed: How much of a factor was the crew's opinion considered to be? I strongly suspect that the astronauts themselves exerted informal but real pressure to prefer flying, since it would be their moment in the sun.

[+] angersock|14 years ago|reply
You raise two main themes here: NASA/space travel has grown into a Byzantine set of rules, and also that it is an accepted cost of doing business that you'll lose lives.

Do you believe that we'd be better served by simpler (but potentially much more dangerous) craft flown by private industry? If not, why is the NASA approach better, seeing as how it is mired in both politics from without and bureaucracy/process from within?

[+] rbanffy|14 years ago|reply
After the Columbia disaster, I confess I was a bit shocked the heat shield was never examined in orbit to assess the damage that could occur during launch. Would it be that hard to do?
[+] vaksel|14 years ago|reply
Perhaps the culture changed because of the Challenger disaster... nothing like having the shuttle blow up to make you reevaluate your priorities for safety.
[+] kahawe|14 years ago|reply
Different teams working on modules is a very different animal than one engineer saying "this item is going to blow up" for years and obviously everyone ignoring him... I find it particularly shocking how your answer suggests a strong "well, shit happens" attitude when clearly the potential AND a strong reason to make things better was right there.

What would really be interesting is why he "failed to make his case" according to executives.

[+] arto|14 years ago|reply
The story in more detail at NPR:

http://www.npr.org/blogs/thetwo-way/2012/02/06/146490064/rem...

[...] "We all knew what the implication was without actually coming out and saying it," a tearful Boisjoly told Zwerdling in 1986. "We all knew if the seals failed the shuttle would blow up."

Armed with the data that described that possibility, Boisjoly and his colleagues argued persistently and vigorously for hours. At first, Thiokol managers agreed with them and formally recommended a launch delay. But NASA officials on a conference call challenged that recommendation.

"I am appalled," said NASA's George Hardy, according to Boisjoly and our other source in the room. "I am appalled by your recommendation."

Another shuttle program manager, Lawrence Mulloy, didn't hide his disdain. "My God, Thiokol," he said. "When do you want me to launch--next April?"

These words and this debate were not known publicly until our interviews with Boisjoly and his colleague. They told us that the NASA pressure caused Thiokol managers to "put their management hats on," as one source told us. They overruled Boisjoly and the other engineers and told NASA to go ahead and launch.

"We thought that if the seals failed the shuttle would never get off the launch pad," Boisjoly told Zwerdling. So, when Challenger lifted off without incident, he and the others watching television screens at Thiokol's Utah plant were relieved.

"And when we were one minute into the launch a friend turned to me and said, 'Oh God. We made it. We made it!'" Boisjoly continued. "Then, a few seconds later, the shuttle blew up. And we all knew exactly what happened."

[+] kevinalexbrown|14 years ago|reply
What strikes me about the episode in terms of "general life lessons" isn't just "Listen to the engineers" (you should, though); it's that under the pressure to Get Stuff Done, there's a huge temptation to brush legitimate concerns under the rug. "These guys tell me this shuttle is unsafe, but space launch is never completely safe" --> "These guys tell me this user data isn't secure but no software is completely safe." Now that the newness of the space program has worn off a bit, it's easy to say "why didn't they just delay the launch?" but back in the day, it was an issue of national pride, and the managers, simple-minded as they may have been, were under an extreme amount of pressure to pull the launch off.

I guess it's just worth remembering that even if you're under pressure to ship, launch, or publish, if the guys whose job it is to know tell you to reconsider, you probably should.

[+] SoftwareMaven|14 years ago|reply
You can go through any catastrophe in a complex system and piece together a chain of events that show how "obvious" it was that it was going to happen. What you miss are all the chains that say every complex system is going to end in catastrophe, because when they don't end, nobody looks. It is kind of an anti-survivor bias.

It is also why it is a bad idea to make policy changes strictly off the cause of a single failure, and that is where things like commissions should help: you can move the focus to looking at the entire problem set for weaknesses instead of just leaving it with "make a better o-ring".

[+] bconway|14 years ago|reply
What I find most interesting is the contrast this article's discussion paints with the one we saw only 2 weeks ago, How Much Is an Astronaut's Life Worth?[1]. Some of the highly-rated HN comments included:

Space is dangerous. We should stop pretending it can be made "safe". It just gives politicians something to wag their tongues at when something inevitably goes wrong.

The problem here is that NASA is a political agency, not a scientific one. Each year, elected politicians sit down and decide how much they're going to get.

This provides thoughtful perspective on policy trade-offs. As Thomas Sowell has written, "The first lesson of economics is scarcity: There is never enough of anything to satisfy all those who want it. The first lesson of politics is to disregard the first lesson of economics."

[1] https://news.ycombinator.com/item?id=3518559

[+] snowpolar|14 years ago|reply
I don't know science so please pardon me if my comment makes no sense in this context.

What if, by some stroke of luck, Challenger had not exploded on that launch... What would have happened to righteous people such as Roger? They would have been condemned by the people around them as over-worrying, insane, self-righteous people who thought they knew everything. If you get what I mean...

It's sad that the challenger explosion happened, but at the same time it helps to highlight an important issue which may otherwise remain buried.

In a software/web development context, it's of course harder to say this thing is going to blow up, because serious technical debt usually only creeps in after a much longer time, by which point the people responsible may have left, leaving the next victim to clean it up. This is also a sad state when it comes to final year projects, where students try to do everything on the outside that could impress the graders to get the top grade, while students who make the extra effort for a clean and maintainable backend do not get the top grades because the lecturer only looks at the outside during the presentation.

[+] mcantelon|14 years ago|reply
What are the names of those above Boisjoly who ignored him and made the call to launch? Good to know the names of the heroes in this story, but good also to know the names of the villains.
[+] dennisgorelik|14 years ago|reply
"Boisjoly could not watch the launch, so certain was he that the shuttle would blow up."

It is one thing to believe that the chance of blowing up is ~1% (which is enough to prevent launch).

It is another thing to be certain that it WILL blow up (I assume 90%+ probability here).

If he was so certain, why could he not convince his management?

[+] coles|14 years ago|reply
It's worth reading the Wikipedia entry on the following report and Feynman's findings. Anonymous polling of engineers showed they estimated a general probability of catastrophic disaster in a shuttle launch between 1% and 2%.
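
To put those per-launch estimates in perspective, here is a rough sketch of how they compound over a program. It assumes every launch fails independently with the same probability, which is a simplification, and the 100-flight program length is a hypothetical number chosen only for illustration. The 1 in 100,000 figure is the management estimate quoted earlier in the thread.

```python
def prob_at_least_one_loss(p_per_flight: float, flights: int) -> float:
    """Chance of at least one catastrophic failure over `flights` launches.

    Assumes (for illustration only) that each launch fails independently
    with the same probability p_per_flight.
    """
    return 1.0 - (1.0 - p_per_flight) ** flights


if __name__ == "__main__":
    FLIGHTS = 100  # hypothetical program length, not a historical figure
    estimates = [
        ("engineers, low end (1%)", 0.01),
        ("engineers, high end (2%)", 0.02),
        ("management (1 in 100,000)", 1e-5),
    ]
    for label, p in estimates:
        risk = prob_at_least_one_loss(p, FLIGHTS)
        print(f"{label}: {risk:.1%} chance of at least one loss in {FLIGHTS} flights")
```

Under these assumptions, the engineers' estimates make losing at least one vehicle over a long program more likely than not, while the management estimate makes it look negligible. The gap between the two views is several orders of magnitude.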