I feel like this developer is trying to present a false dichotomy between his "Sherlock" method of debugging and the scientific debugging method. To be more specific, it seems as though the example he gave exactly fit the definition of the scientific debug method.
1. The author identified a problem he wanted to fix
2. He tried to guess the cause of the problem
3. He tested that guess
4. He analyzed the results of multiple trials.
5. When he was done with analysis, he worked out a fix and then tested again.
The only thing of note to the "Sherlock" method is that the author explicitly decided to spend a great deal of time in the guessing phase. What is my point here? Unless you already have experienced a bug and/or know exactly what is causing it right off the bat, the scientific method is still the most effective tool in your debugging arsenal.
This was my thought as well. Sherlock's method can be seen as a special case of the scientific method that focuses on gathering all the facts before making your HYPOTHESIS. (You can't call it a theory until you've tested it, and even then, the theory explains the facts, not the other way around.)
Also, completely missing from this story is how deductions are made and the difference between deductions, hypotheses, and facts.
For example, fact: The message isn't sent. Deduction: cURL works, so the problem isn't the system. Hypothesis: The problem is in the API.
> I use the Sherlock Holmes method of debugging software.
Everyone does, surprise! Finding a cause from its effect is exactly what debugging is :)
That said, does anyone else remember the "Undo Step Forward" command in TurboPascal 5.5 IDE? You could un-step through the program being debugged, essentially moving back in time. That was an incredible feature, a true engineering curiosity.
Not true. You need a deep understanding of the workings of the system and a logical mind to be able to work backward from the symptom to the cause. (Patience helps too.) Lots of programmers who don't have these traits just try fixes at random. Either they find the solution (after a long time), or they give up and ask someone who is better at debugging to help them. I frequently get asked to debug problems that other people have failed to debug.
Differential Diagnosis is an excellent approach to debugging. Having watched debugging in the wild, I've seen two, maybe three, different approaches:
1) Magical thinking: This is the fear that the computer system has suddenly changed its mind and decided to stop working. The approach here is typically to try some cargo-cult knowledge on how to fix the system without knowing what's wrong, and then just start hitting random things to see if it comes back.
2) Panic / Fixation: These kinds of developers usually get into a state where they are in a panic over what's wrong and fixate on one component of the system where they believe the failure to be. This is slightly more productive than the former, but failing to change your assessment based on new information is also counterproductive.
3) DDX: This is very much the Sherlock / House type of debugging. It tends to require a lot of wide-shallow knowledge, as opposed to in depth knowledge of the system. Typically, you work backwards from the symptoms, and essentially do a binary search based on the widest type of problem sets to the narrowest.
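As a rough illustration of the "binary search from widest to narrowest" idea in 3), here is a minimal Python sketch. The layer names, the `fake_check` function, and the assumption that every layer below the fault passes while everything from the fault upward fails are all hypothetical; a real stack rarely orders this cleanly.

```python
from typing import Callable, Optional, Sequence


def first_failing_layer(
    layers: Sequence[str], works: Callable[[str], bool]
) -> Optional[str]:
    """Binary-search an ordered stack for the first layer whose check fails.

    Assumes every layer below the fault passes and every layer from the
    fault upward fails; without that ordering a bisection is not valid.
    """
    lo, hi = 0, len(layers)  # the fault, if any, lies in layers[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if works(layers[mid]):
            lo = mid + 1  # this layer is fine, so the fault is above it
        else:
            hi = mid  # this layer fails, so the fault is here or below
    return layers[lo] if lo < len(layers) else None


# Hypothetical stack, lowest layer first, with a fake check for illustration.
STACK = ["network link", "TCP connection", "TLS handshake", "HTTP client", "API wrapper"]
BROKEN_FROM = "TLS handshake"


def fake_check(layer: str) -> bool:
    # Pretend everything below BROKEN_FROM works and the rest does not.
    return STACK.index(layer) < STACK.index(BROKEN_FROM)


culprit = first_failing_layer(STACK, fake_check)
print(culprit)
```

The point is only that each check halves the remaining search space, which is why wide, shallow knowledge of every layer matters more here than depth in any one of them.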
DDX is in fact very difficult for most people to do under stress. Nelson from Stripe wrote a post on keeping a notebook to store wide information and enable caching b-tree searches. This is a very good approach.
This was the first programming book I read, back in the eighties. Lately I got a copy for nostalgia's sake, and it's all still spot on good practice today - problem definition, algorithm design, avoiding global state etc - although the language (both the Pascal syntax and the Conan-Doyle-style prose) probably wouldn't suit today's "Dummy's Guide" market.
Sometimes I have tried to use the scientific method to resolve hard to reproduce bugs.
Write a hypothesis or a series of them. Consider what each hypothesis would imply about the behaviour of the bug or code. Design an experiment to test the hypothesis, then continue with more experiments. List the methodology in the bug report so others can repeat the experiments. Finally, a proof of the cause of the bug is found.
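One way to keep that process honest is to record each hypothesis, its prediction, and the method alongside the outcome. The sketch below is only an assumption about how such a log might look; the `Experiment` and `Notebook` names and the two sample experiments are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Experiment:
    hypothesis: str            # what we think is causing the bug
    prediction: str            # what we should observe if the hypothesis holds
    method: str                # how to repeat the experiment
    run: Callable[[], bool]    # returns True if the prediction was observed


@dataclass
class Notebook:
    entries: List[str] = field(default_factory=list)

    def test(self, exp: Experiment) -> bool:
        """Run one experiment and log the result in a repeatable form."""
        held = exp.run()
        verdict = "supported" if held else "refuted"
        self.entries.append(f"{exp.hypothesis} | {exp.method} | {verdict}")
        return held


# Hypothetical experiments; the lambdas stand in for real checks.
notebook = Notebook()
notebook.test(Experiment(
    hypothesis="The bug only appears under concurrent writes",
    prediction="A single-threaded run never fails",
    method="Run the test suite 100 times with one worker",
    run=lambda: False,  # pretend it failed anyway, refuting the hypothesis
))
notebook.test(Experiment(
    hypothesis="The bug depends on a stale cache entry",
    prediction="Clearing the cache before each run removes the failure",
    method="Clear the cache, then run the test suite 100 times",
    run=lambda: True,
))
print("\n".join(notebook.entries))
```

The `entries` list can be pasted into the bug report as the methodology section. Note that "supported" is weaker than "proven"; it only means the prediction was not contradicted.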
While "Sherlock Holmes Debugging" may be redundant, since this is really just "debugging" (though who would have clicked on that link?), I was most excited by this line:
> There is no sin in software engineering more serious than thinking some behavior of a computer system is magical or beyond our understanding.
I've been trying to find a good test case to get my junior engineers to step out of their code and see more of the layers. This is a pretty good mantra for improving your debugging skills.
Only loosely related but for lovers of weird programming texts, there's a bizarre gem called "Elementary Basic - Learning To Program Your Computer In Basic With Sherlock Holmes" by Henry Ledgard and Andrew Singer.
What the article says: Go down the stack in a linear format and check everything
What a Sherlock Holmes story is: Seemingly-unrelated characters could be responsible for the crime
How the story would've been if it was really similar to Sherlock Holmes:
- Check your own software (Already done)
- Go down the stack and check all dependency, platform issues (Already done)
- Check if your clipboard works properly (New)
- Check if the monitor outputs pixels properly (New)
- Check if your eyes see clearly (New)
That would be Sherlock Holmes debugging!
On a serious note, I like the story. I also like the way things are done but I assumed that's how every developer works. I looked at the link to the Scientific approach and IMO, that shouldn't be considered as the de-facto Scientific approach. This should be the Scientific approach.
MTU problems are not unheard of; it's one of the things you always check for when you have this type of network problem. Especially when you're running jumbo frames, which used to be quite troublesome when the technology was new.
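For anyone unfamiliar with why jumbo frames bite, the arithmetic is simple. This is a minimal sketch assuming IPv4 and TCP headers without options (20 bytes each), a standard 1500-byte Ethernet MTU, and a 9000-byte jumbo MTU; the function names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
TCP_HEADER = 20   # bytes, minimum TCP header without options


def max_segment_size(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def fits_without_fragmentation(payload: int, path_mtu: int) -> bool:
    """True if a TCP payload of this size fits in a single packet on the path."""
    return payload <= max_segment_size(path_mtu)


standard_mss = max_segment_size(1500)  # typical Ethernet MTU
jumbo_mss = max_segment_size(9000)     # common jumbo-frame MTU

# A segment sized for a jumbo-frame host does not fit a 1500-byte hop.
print(standard_mss, jumbo_mss, fits_without_fragmentation(jumbo_mss, 1500))
```

Small requests fit everywhere, so the connection looks healthy until a large payload hits a hop with a smaller MTU, which is why these bugs seem to depend on message size.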
Another thing to check for is funny-looking TCP flags. Some firewalls tend to drop such traffic, and it may not end up in the logs you usually check.
That's why the first thing you do when one connection works and one doesn't is to tcpdump them and compare. Just last week I had one application which ran ssl directly and in another environment it did a starttls-type thing just because of the underlying libraries.
It was immediately obvious from looking at it, but it would have been terribly difficult to guess. Don't start with Sherlockian reasoning, start by getting all the data.
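The "capture both and compare" step can be as simple as a textual diff of the two traces. Here is a minimal sketch using Python's standard `difflib`; the two event lists are hypothetical summaries written by hand, not real tcpdump output.

```python
import difflib

# Hypothetical, hand-written summaries of two captured connections.
working = [
    "SYN",
    "SYN-ACK",
    "ACK",
    "TLS ClientHello",
    "TLS ServerHello",
]
failing = [
    "SYN",
    "SYN-ACK",
    "ACK",
    "plaintext greeting",
    "STARTTLS command",
    "TLS ClientHello",
    "TLS ServerHello",
]

diff = list(difflib.unified_diff(
    working, failing, fromfile="working", tofile="failing", lineterm=""
))

# Keep only the lines that appear in the failing trace but not the working one,
# skipping the "+++" file header that unified_diff emits.
only_in_failing = [
    line[1:] for line in diff
    if line.startswith("+") and not line.startswith("+++")
]
print(only_in_failing)
```

With the data side by side, the extra plaintext exchange before the TLS handshake stands out without any guessing.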
It's always bugs like this that end up eating all of your time. That's why I find making small client websites or apps (as opposed to larger software projects) to be so hard to profit on.
this is a nice story, but i think something much more important is missed here... application of the scientific method.
all of this logical deduction is worthless unless you verify it with experiments. this is very much not what Sherlock Holmes does... but it is exactly what enabled the deduction in the story to be cemented into a reliable conclusion.
[+] [-] CephalopodMD|11 years ago|reply
[+] [-] splinterofchaos|11 years ago|reply
[+] [-] huhtenberg|11 years ago|reply
[+] [-] teddyh|11 years ago|reply
https://sourceware.org/gdb/current/onlinedocs/gdb/Reverse-Ex...
[+] [-] greenyoda|11 years ago|reply
[+] [-] kolodny|11 years ago|reply
1: https://github.com/traceglMPL/tracegl
2: https://twitter.com/ebryn/status/443080437485682689
[+] [-] bfwi|11 years ago|reply
[+] [-] sargun|11 years ago|reply
[+] [-] fineline|11 years ago|reply
http://www.amazon.com/Elementary-Learning-Program-Computer-S...
[+] [-] andrewchambers|11 years ago|reply
[+] [-] shocks|11 years ago|reply
Cool story though.
[+] [-] TillE|11 years ago|reply
[+] [-] hox|11 years ago|reply
http://en.wikipedia.org/wiki/5_Whys
[+] [-] dfritsch|11 years ago|reply
[+] [-] daviddaviddavid|11 years ago|reply
http://www.amazon.com/Elementary-Chronicled-Learning-Compute...
[+] [-] anirudh24seven|11 years ago|reply
[+] [-] xorcist|11 years ago|reply
[+] [-] rbosinger|11 years ago|reply
[+] [-] jheriko|11 years ago|reply
[+] [-] chj|11 years ago|reply
[+] [-] placeybordeaux|11 years ago|reply
[+] [-] glibgil|11 years ago|reply