My experience is that you can write an essay like this about any development practice. You can cite your experience ignoring it, you can say some nice things about the importance of experience and priorities and managing trade-offs, and you can satirize people who are dogmatic to the point of becoming cult-zombies.
It appears as if such an essay provides great advice. But does it?
What is the reproducible, objective process or behaviour this essay is advocating? If I give this essay to five teams and then look at what they are doing to follow its advice, will they all be doing the same thing? What results does this essay suggest we will obtain from following its advice? How can we be sure it will be better than the results we would obtain from mindlessly chaining ourselves to the red/green light?
For better or for worse, the red/green light protocol is specific. As such, it can be criticized. A specific protocol has edge cases. You can argue slippery slopes. It's a stationary target. Vague, nonspecific generalities, on the other hand, are more difficult to criticize. There are no details, so how does one identify edge cases?
It's great to hear about this one specific anecdote. Beyond that, what am I supposed to walk away with besides the knowledge that slavishly following a process can be bad?
Today I was working with a new library. I wrote a 30 line sample program just to make sure I understood the library semantics, and that it would at least do what I thought in a situation similar to the actual code base.
I called over another developer to sanity check my thinking, and he spent 10 minutes going on about how I should have just written tests for it in our codebase instead.
He would not take any of these as valid discussion points:
* I spent nearly 7 minutes on my sample code; writing tests, including the appropriate mocks for our codebase, would have taken 70.
* I didn't even know we would be moving forward with that library, so why start integrating it into our infrastructure?
* I'll write tests of our code using the library, when we figure out what the thing we're building actually does.
* I would have to write that same sample program before I can even write tests, to know how to use the library.
I do know now not to include that guy in my thinking process ever again: if my tests aren't perfect, any time I might gain by including someone will be lost to his "TDD is my holy saviour" rants about how I should learn.
Worst part: this guy takes days to write "tested" code that still doesn't actually do what it is supposed to.
I'm not sure that there is a sure-fire way to quantify what tests are or are not necessary. In my opinion, this is something that comes with experience and is more of an art than a science. But I'm okay with that.
If all you got out of the article was that slavishly following a process (like TDD) can be bad, I'll consider that a win for my writing. Because my general feel for the community is that testing is some sacrosanct practice that must be adhered to without question. And I question that.
I write tests. I'm not anti-test. I just imagine I write far fewer tests than most. In my opinion we should constantly question whether a certain practice (in this case, writing a specific test) is worth it. I think that's a healthier approach to development than blind red/green light.
> My experience is that you can write an essay like this about any development practice
Right, but in this case, he's writing about testing. One would argue that's relevant now because the foolishness around how to test and how often has reached ridiculously dogmatic levels.
> What is the reproducible, objective process or behaviour this essay is advocating?
Why does it have to have one? There are many articles that make it to HN that call out software practices that are just ridiculous and help spark a discussion about them, so why can't this be one of those? Why can't it be the article that makes a young developer who's slowly turning into a TDD zombie ("must ... test ... everything ... all ... the ... time") pause for a second and go "hmmmm"?
Any popular development practice is overhyped; many of us know so little about writing good software that we often cargo cult. How does one manage a bunch of inexperienced software developers, when maybe the manager is inexperienced themselves? Well, try to copy what others say has worked!
Really, the only cure is judicious/experienced application of development practice. Is unit testing always appropriate? No! Is it sometimes appropriate? Yes! But that is not an easy message for people to understand.
I've seen unit tests on prototypes before because that was standard practice, only to have the entire prototype scrapped at the end...because that was also standard practice (never ship prototypes :) ). Confusing.
Tests for software development == Crimes for political agenda.
This is one of those issues you are supposed to lie about.
Tests are like crime in politics -- no politician is going to say "I am soft on crime, I think we should reduce sentences." So crazier and crazier laws are created. Mandatory minimum sentences for possessing small amounts of drugs, all kinds of craziness. People know it is crazy, but nobody can speak up and remain standing.
Same thing with tests. Nobody can say "fuck it, stop writing tests, the customer will never run this code, or this is too fucking obvious, test stuff that matters". That is considered irresponsible. Everyone is supposed to worry about not having enough tests.
Now there is a subtle difference when it comes to shipping on a schedule: in many companies tests are ignored -- BUT silently. Everyone loves talking about tests, but try telling your boss you spent 2 weeks adding tests to your code. They might not like it. Try to double the time you promise to do a feature by saying you need to write tests for it. Or, if you are given free rein, try saying you won't do all the new awesome features, you'll write more tests instead -- it will probably be approved, but in the end everyone will praise the guy who chose to work on and deliver features, even though they might be completely buggy and unusable. So that is the other side, if you will.
> Same thing with tests. Nobody can say "fuck it, stop writing tests, the customer will never run this code, or this is too fucking obvious, test stuff that matters". That is considered irresponsible. Everyone is supposed to worry about not having enough tests.
I don't know about everyone else, but I definitely worry about having too many tests. For every test I write, I have to weigh how useful the test is versus how likely it is that the thing it is testing will change. If its usefulness is overshadowed by the likelihood that it will be a burden later on, it does not get written.
> Try to double the time you promise to do a feature by saying you need to write tests for it.
That, IMO, is the best argument for TDD. By having a process that forces tests first, you never get into the situation where you are cutting critical tests for the schedule's sake.
For me, I ask the question, "Is this code adding more value than other code I could be writing?" If the answer is no, I do something else. Of course, this is completely subjective, but my years of experience count for something.
tl;dr: Tests are overhyped because we once wrote tests that became irrelevant when the thing we were testing changed
Really? I mean, tests might be overhyped, but so far it looks like the author draws incredibly general conclusions from some isolated incident. Why did the tests become irrelevant? Is their scoring algorithm now doing something completely different? Do they now have different use cases? Were they hard-coding scores when the actual thing that matters was, say, the ordering of the things they score?
> Only test what needs to be tested
well, thanks for the helpful advice :) Care to share what, in your opinion, needs to be tested?
OP here. My conclusion wasn't drawn from an isolated incident; I shared the incident to highlight what I imagine is a common pitfall in testing.
I used the phrase "only test what needs to be tested" intentionally. How should I know what you need to test? But if you accept that you should only test what needs to be tested, then there is an implication that some (I'd venture to say most) of what you write doesn't need to be tested. And that's a liberating concept. You aren't obligated to test everything, but you should test what really matters.
What needs to be tested is probably directly related to the size and stability of the company (assuming size and stability have a positive correlation). I would venture to surmise that young start-ups have almost no need to test anything. That comment isn't meant to be inflammatory, but I look at testing too early like optimizing too early. There's no reason to shard a database until you absolutely have to, and I don't think you need to test something until you know it's mission critical.
For example, Stripe offers a service where a component failure would jeopardize their entire business. Their code base is obligated, by its nature, to have more stringent tests. But the startup in the garage next to yours who is still trying to determine its MVP and will probably end up pivoting five times in the next two weeks? Save yourself a lot of grief and don't worry about the tests. Once you know what you are going to be, once you know what you can't afford to lose, well, wrap those portions up with some solid tests.
I've heard this saying: "Test until fear turns to boredom." Then you need to make sure you get bored at a properly calibrated rate (i.e. not too soon given your business' criticality, not too late given your need to move quickly). For example, if root cause analysis finds that a lot of expensive bugs were preventable via unit tests, then your calibration is probably tuned to getting bored too quickly.
1) Test your APIs. A public API, especially one that is key to your business, should have near 100% coverage, and should be attacked to look for security, usability, and load problems.
2) Test enough during development to support later regression tests, and to make sure that the design is testable. This can usually be achieved with less than 20% coverage. But if you write production code that's so screwy it can't be regression tested, then you've got big problems.
3) Test any parts that scare you or confuse you or make you nervous. Use "test until you're more bored than scared" here.
>well, thanks for the helpful advice :) Care to share what, in your opinion, needs to be tested?
I am not the OP, but I think unit tests are more useful for code that has a lot of edge cases (such as string parsing) and for code which causes the most bugs (as seen in black box testing or integration tests.)
Also, some code is just easier to develop if you create a harness that runs it directly instead of having to work your way to the point in the program where it would execute it. If you do this, you might as well turn it into a test.
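To make the string-parsing point concrete, here's the sort of edge-case-heavy code that repays unit testing (parse_pairs is a made-up example, not anything from the article):

    import unittest

    def parse_pairs(s):
        # Hypothetical parser: "a=1;b=2" -> {'a': '1', 'b': '2'}
        result = {}
        for part in s.split(";"):
            part = part.strip()
            if not part:
                continue  # tolerate empty segments and trailing semicolons
            key, sep, value = part.partition("=")
            if not sep:
                raise ValueError("missing '=' in segment: %r" % part)
            result[key.strip()] = value.strip()
        return result

    class ParsePairsTest(unittest.TestCase):
        def test_empty_string(self):
            self.assertEqual(parse_pairs(""), {})

        def test_trailing_semicolon(self):
            self.assertEqual(parse_pairs("a=1;"), {"a": "1"})

        def test_whitespace(self):
            self.assertEqual(parse_pairs(" a = 1 ; b = 2 "), {"a": "1", "b": "2"})

        def test_missing_equals(self):
            with self.assertRaises(ValueError):
                parse_pairs("not-a-pair")

Each of those cases is exactly the kind of thing that slips through manual poking but gets caught instantly by a cheap unit test.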
* A test should accept any correct behaviour from the tested code. Anything which is not in the requirements should not be enforced by the test (see the sketch after this list).
* A test should not use the same logic as the code to find the "right" answer.
* A function whose semantics are likely to be changed in the next refactor should not be invoked from a test.
* Whenever a test fails, make a note of whether you got it to pass by changing the test or by changing the code. If it's the test more than 2/3 of the time, it's a bad test.
* If you can't write a good test, don't write a test at all. See if you can write a good test for the code that calls this code instead.
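A minimal sketch of those guidelines in practice, assuming a made-up three_cheapest function whose requirements say nothing about ordering:

    import unittest

    def three_cheapest(prices):
        # Hypothetical code under test: returns the three lowest prices,
        # in no particular order (ordering is not in the requirements).
        return sorted(prices)[:3]

    class ThreeCheapestTest(unittest.TestCase):
        def test_accepts_any_correct_order(self):
            result = three_cheapest([5, 1, 9, 3, 7])
            # Any ordering of the right three values passes, so the test
            # doesn't enforce behaviour the requirements don't ask for.
            self.assertEqual(sorted(result), [1, 3, 5])

        def test_does_not_reuse_the_code_logic(self):
            # The expected values are written down by hand rather than
            # computed with the same sorting logic the implementation uses.
            self.assertEqual(sorted(three_cheapest([10, 2, 8, 4])), [2, 4, 8])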
In my admittedly limited experience, unit tests are way overhyped, especially when things like mocking are brought into the mix. It's easy to end up with a test that is not about correctness, but "did line 1 get called, did line 2 get called, etc.". Then you change the implementation, and 20% of your test cases break. That's not to say that they are valueless, but that I think unit tests should be used pretty sparingly.
Where I've found a ton of value has been in writing almost artificial-intelligence-driven integration tests. Write a bot that uses your service in some way, as fast as possible. Run fifty of these bots simultaneously, and see what happens. Then have some way to validate state at various points (either by tallying the bots' actions, or by sanity checks). Bugs will come falling out of the sky. Then, in the future, when you get a bug, the challenge becomes updating the integration test bots' behavior so that they can (preferably quickly) reproduce the bug.
I mean, I think that this is dependent on the domain of your software, but I think it's a good strategy for many areas.
If we're only using unit tests to see if line N gets called, I think we're doing it wrong. Instead, we want to use unit tests to tell us if the answer we get is correct -- while exercising line N.
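For example, something like this (get_foo and its fields are a hypothetical function under test):

    import unittest
    from myapp.catalog import get_foo  # hypothetical module under test

    class GetFooTest(unittest.TestCase):
        def test_foo_with_bar_12(self):
            # Exercise the bar=12 branch, but assert on the answers it produces.
            result = get_foo(bar=12)
            self.assertEqual(result.category, 'FOO')
            self.assertEqual(result.name, 'my-foo-12')
            self.assertEqual(result.unit_price, 97.12)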
This lets you verify that this particular branch (when bar=12) is executed, and that your results are as you expect. If you change some of the underlying calculations, things can break (as you get different answers), but then you at least have a test that lets you ensure that changing the answers is what you want to do. Sometimes, you want to change the way you calculate something and get the same answer, after all.
I've had great success with this technique. In my book automated and randomized integration stress tests cannot be beaten when it comes to hardening software as efficiently as possible. Randomized stress tests come across pathological conditions quickly, and it only takes a fraction of the imagination that it would take to write a unit test to successfully catch each one. Great for data structures and distributed systems, and probably applicable wherever non-trivial unit tests would also be useful...
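A rough sketch of that style, assuming a toy AccountService standing in for the real system (everything here is hypothetical, but it shows the bots-plus-tally idea):

    import random
    import threading

    class AccountService:
        # Stand-in for the real service under test.
        def __init__(self):
            self.lock = threading.Lock()
            self.balance = 0

        def deposit(self, amount):
            with self.lock:
                self.balance += amount

        def withdraw(self, amount):
            with self.lock:
                if self.balance >= amount:
                    self.balance -= amount
                    return True
                return False

    def bot(service, ledger, steps, seed):
        # Each bot hammers the service with random actions and tallies
        # what it actually did, so state can be validated afterwards.
        rng = random.Random(seed)
        for _ in range(steps):
            amount = rng.randint(1, 100)
            if rng.random() < 0.5:
                service.deposit(amount)
                ledger.append(amount)
            elif service.withdraw(amount):
                ledger.append(-amount)

    def stress_test(num_bots=50, steps=1000):
        service = AccountService()
        ledgers = [[] for _ in range(num_bots)]
        threads = [threading.Thread(target=bot, args=(service, ledgers[i], steps, i))
                   for i in range(num_bots)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Sanity check: the system's state must match the bots' tallied actions.
        expected = sum(sum(ledger) for ledger in ledgers)
        assert service.balance == expected, (service.balance, expected)

    if __name__ == "__main__":
        stress_test()

When the assertion trips, you've usually found a race or a lost update long before a customer does.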
I find it interesting that the premise for this entire diatribe is that once some code was written, a pivot was made, and a number of tests then had to be thrown away (or reworked, or re-factored?).
Recently I had a code base where we decided that, due to a delightful new feature our customers were going to be quite pleased with, we would need to switch out our old queuing system for a new one. In doing so, well more than half of our huge test suite turned red. This told us two important things: 1) the queuing system touched a lot of areas of code we needed to think about, and 2) where in the code the queuing system had touch points.
Ultimately we were able to put in the new queuing system, fix the areas that were broken by the change, and have the confidence at the end of the process that we had not broken any of the areas of the code that were previously under test. (This does not mean that our code was bug free of course, only that the areas under test were still working in the prescribed way, but that is a discussion for a different article.)
I believe that this would have taken a team of people weeks to do previously. I was confident that the change was ready after only 3 days with 2 developers. I would not trade my tests. There is a cost associated with everything, but I believe tests are the least costly way to get highly confident software built.
Get halfway through a project, realize you have to make a big change, then make it. Without tests, I'll guarantee you'll watch your stability plummet. With tests, you might just go home at 5 pm. Alternatively, don't make the change, and deal with a problem in your design. I've seen it enough times to be convinced that there's a lot of value in striving to cover as much of your code as you can.
If you have a statically-typed language - and it doesn't even have to be a good one, with C++ being fine, in my experience - then making such changes is usually a case of making the change, or perhaps removing the bit of code you intend to change, compiling, and seeing what happens.
What happens is usually that you have 14,000,000 compile errors. Well, congratulations! That's the difficult part over. Now it's time to relax! Start at error 1, and fix it. Repeat. Every now and again, build. Once it builds again, it will usually work; if not, it's always something quite simple. If you have enough asserts in your code, then you can be doubly confident that there's nothing wrong with the new code.
I've had good success with this approach, for all sorts of changes, large and small. I've given up being surprised when the result just builds and runs.
I have no real idea how you would do this reliably in a large program written in something like Python. Sure, you'd fix up your tests easily enough... but then what? Don't you have a program to maintain as well? :)
Yes but there are also cases where relatively safe changes break a lot of tests, and it tells you nothing except that you now have a lot of tests to fix-up. I've held off of refactorings that I knew would scrap a bunch of tests. There is no easy answer to this stuff.
Care to comment on your experience with larger code bases? The content of your post seems short-sighted, and there's an exponential function of complexity increase as LOC and developer headcount both go up.
You're right people pay for features, but lagging a little at the beginning to establish good TDD culture pays off in spades later on. Shipping product is something you have to do continuously, and you arguably create more value as time goes on, so ensuring you can continue to ship product in a timely manner is a great thing for organizations.
I'm not the original poster, but I find that integration tests have a far larger payoff than unit tests, in general. A good release process that tests differences in behavior between a test system and the current version of the service in production is also valuable.
Being able to test that the whole system works as intended gives a better return on investment, in my experience, than testing small bits in isolation. The errors are, often as not, in the glue between the small bits.
I've worked on quite a few different large code bases. I agree with the author 100%. Tests are a tool. They are a particularly useful one, but they are just a tool. Unit tests are significantly less useful than TDD folks would have you believe when compared to things like integration tests. And integration tests are vastly more costly than what TDD folks would have you believe. Personally, I much prefer gold file tests with manual inspection when it comes to test automation. I've seen large compiler projects that get by pretty far with just that. Never mind things that make no sense to test: Like "Is this game fun?"
The key, to me, is determining what needs to be tested. That doesn't change with larger code bases. Note that I didn't say that tests were pointless or that you shouldn't write tests. That's incredibly shortsighted.
Originally I was going to title this "Tests are overrated" but that both seemed like linkbait and distorts my actual opinion.
I've been on projects where they tested to make sure that certain values were unique in the database and I couldn't help but think they: didn't understand the framework; didn't understand what tests are meant to do; didn't understand database key constraints; or all of the above.
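The key-constraint point in miniature (a sqlite3 sketch with a made-up schema): once the constraint is declared, the database enforces uniqueness itself, and a test that re-checks it is just re-testing the database.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    try:
        # The database rejects the duplicate on its own.
        conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    except sqlite3.IntegrityError as e:
        print("rejected duplicate:", e)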
Tests have their place. But they are a means, not an end. And I see a lot of people confusing them for the end.
But, again, I don't dislike tests. I just dislike what I perceive to be a current overemphasis.
Behavior/test-driven development needs all the hype it can get. There are so many developers, even entire software development subcommunities (I'm looking at you, video game developers) who haven't written a single test in their lives and don't understand its value.
Yes, it is time-consuming and invisible to your customers, but I imagine so is setting up a frame for a house instead of just stacking bricks. The structure, flexibility and peace of mind you get from a comprehensive test suite pays off when you have a 50-brick tall structure to put a roof on.
In my experience test-driven development is not time consuming, even for small developments. I don't think we even need to say things like "yes it takes time now but it will pay off later". I have found that it saves time right from the beginning. It seems other people have a similar experience : http://blog.8thlight.com/uncle-bob/2013/03/11/TheFrenziedPan...
I'm not necessarily disagreeing with the author, but I think if you are going to 'do agile', especially in a breaky language like C++, you need to do tests. Lots.
The current shop I'm at is maintaining a huge code base, parts of which go back 20 years. Because test coverage is so low, there is a real reluctance to refactor.
The first thing I did when I started here was to clean up the code, renaming misnamed variables to get it in line with the coding standard, adding auto pointers here and there to head off memory leaks. By gosh, I nearly got fired.
If you are going to fearlessly edit your codebase, you need to know that regressions are going to be caught. You need automated testing.
"We wrote the first scoring algorithm at Goalee based on the red-green light. Within a couple of weeks we made our first algorithmic change, and made several quick fix releases to update the scoring methods in the following weeks. By the end of the month, a whole section of our testing suite was almost completely worthless. In a world where cash is king, I wouldn’t mind having those dollars back."
What I don't understand about comments like this is that a whole section of your code, both runtime/deliverable code and test code, had become worthless. But you only seem to view the discarded test code as wasted effort. Either the tests have value or they don't. And, if you write tests and then discard the code they test, you'll likely also discard the tests. But that doesn't change whether or not the tests had value, nor whether the new tests that you'll write for the new code have value.
> What I don't understand about comments like this is that a whole section of your code, both runtime / deliverable code and test code had become worthless
Not so. The code demonstrated that the first algorithm wasn't good enough and provided the experience needed to write the second one. The tests (hopefully) made the first algorithm's code maintainable, but it turns out there was no need to maintain it.
I think the hype around tests is driven by agencies that charge by the hour. One thing I have noticed, especially on existing systems, is that you can sit a very inexperienced developer down and instruct them to "write tests for untested functionality", and they will do so while producing negligible value.
The same thing happens with greenfield development. I've seen steaming piles of shit with huge test suites. Absolutely zero insight into the problem. No craftsmanship at all, nothing interesting about the application. But a set of tests.
It's like the suite is proof enough that there was a job well done. I fear that a lot of development is devolving into nothing more than superstition and hype, backed up by agencies that like to bill a lot and amateurs who need a justification for their timelines and ineptitude.
Amen! I run https://circleci.com, where we make all of our money by actually running tests for people, and I still believe this. The goal of your software is to achieve goals, typically business goals. Often the software itself is already one or two steps removed from the value provided to customers. Tests are an extra step removed at least.
Tests are not overhyped. They are under-understood. Unit tests are not regression tests are not integration tests. I've encountered tons of teams that don't seem to understand this.
Unit tests are, more than any other factor, a design tool. Like any other design tool (UML, specifications, etc.), when the design needs to change, you throw them out. If it takes longer to design a system with unit tests than without them, one of two things is true: 1) you should not write unit tests, or 2) you should learn how to write unit tests.
I think it just goes to the way human beings handle original ideas: first they fight them, then they embrace them, then they take them to ridiculous extremes as they try to substitute rules for common sense in applying them.
You can see it in politics, religion and almost any really popular area of human endeavor.
Testing falls in the same category. I have had interviewers look me in the eye and, in all seriousness, declare that developers who don't write tests for their code should be fired, or that their test suites cannot drop below x% of code coverage. Dogma is a horrible thing to afflict software teams with, whether it is pair programming or mandatory code reviews; if there are no exceptions to the rule or situations where you don't have to apply it, it's probably a bullshit rule IMO.
Me, I like to ship shit, and I like to look at tests as a way to help me ship shit faster, because the less time I spend fixing regressions the more time I can spend actually getting more code/features out that door.
So my only metric for writing tests is this ... "is this going to help (me or a team member) not break something months from now, when I change code somewhere else".
This is not accurate unless you use a nice dependently typed language like Agda or Coq. It is true that you need far fewer tests, but hardly none at all.
This is particularly evidenced by the fact that Haskell--certainly a "strongly typed functional language"--also has some of the best testing facilities of any language I've seen. QuickCheck simply can't be beat, and you can cover other parts of the code with more standard tools like HUnit.
Now, there is some code--only very polymorphic code--where the type system is strong enough to give a proof of correctness. For that sort of code, which you're only likely to encounter if writing a very generic library, you can get away without testing. But that is not even the majority of all Haskell code! And even there you have to be careful of infinite loops (e.g. bottom).
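For anyone who hasn't seen the QuickCheck style: you state a property and let the framework generate the cases. The same idea exists in Python as the hypothesis library; a rough sketch (the encode/decode pair is made up for illustration):

    from hypothesis import given, strategies as st

    def run_length_encode(s):
        # Hypothetical code under test.
        out = []
        for ch in s:
            if out and out[-1][0] == ch:
                out[-1][1] += 1
            else:
                out.append([ch, 1])
        return out

    def run_length_decode(pairs):
        return "".join(ch * n for ch, n in pairs)

    @given(st.text())
    def test_decode_inverts_encode(s):
        # Property: decoding an encoding returns the original string,
        # for arbitrary generated inputs.
        assert run_length_decode(run_length_encode(s)) == s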
Comments like this make functional programmers sound much more arrogant and clueless than they really are.
I have a long experience in enterprise software and I agree with the premise.
There are two kinds of unit tests: workflow and functional.
1 - Workflow unit tests are a waste of time because no single test stays valid when there is a change. In other words, whenever we added/removed steps in the workflow, 99% of the time we had to change the test to fit the new workflow, which breaks the "write once, test all the time" concept. In my experience, having proactive developers who test the areas around the workflow they changed is much faster and more reliable.
2 - Functional unit tests are great. They test one function that takes certain parameters and is expected to spit out a certain output, e.g. a function that calculates dollar amounts or does any kind of mathematical operation (see the sketch below).
However, these functions tend to stay unchanged during the lifetime of a project. Therefore, the tests are rarely run.
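A sketch of the kind of functional unit test meant here (the tax calculation is invented for the example):

    import unittest
    from decimal import Decimal

    def total_with_tax(amount, tax_rate):
        # Hypothetical pure function: dollar amount plus tax, rounded to cents.
        return (Decimal(amount) * (1 + Decimal(tax_rate))).quantize(Decimal("0.01"))

    class TotalWithTaxTest(unittest.TestCase):
        def test_basic_rate(self):
            self.assertEqual(total_with_tax("100.00", "0.08"), Decimal("108.00"))

        def test_rounding(self):
            self.assertEqual(total_with_tax("19.99", "0.0825"), Decimal("21.64"))

Given fixed inputs and a fixed expected output, a test like this stays cheap to write and rarely needs to change.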
From my experience, workflow changes/bugs represent 80% of the problems we face in enterprise software. Functional changes/bugs are rare and can be detected quickly.
This is why I agree with the author's premise that unit testing is overhyped.
I tended to do API-centric functional testing (does it spew JSON with the structure and data I expect?), and found it to be a time-saver in that it was faster to write a test and re-run it to check output than to manually use the web app or make the calls myself.
However if the test exceeded this cost/benefit metric where it wasn't really helping me get the feature written, out the window it went.
Helped when I went to refactor/fix fairly major chunks of the backend as all those tests from back when I did initial development were still there. It wasn't really "test first" because I didn't know what to test for until the basics of the API endpoint were in place.
This was Python if it matters (default unittest2). I do mostly Clojure when I have the choice lately.
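That sort of test might look like this (the widget payload is a made-up stand-in, and it tests the function that builds the JSON rather than spinning up the whole app):

    import json
    import unittest

    def widget_payload(widget_id):
        # Hypothetical handler helper that builds the body for GET /api/widgets/<id>.
        return json.dumps({"id": widget_id,
                           "name": "widget-%d" % widget_id,
                           "tags": ["a", "b"]})

    class WidgetApiTest(unittest.TestCase):
        def test_payload_structure(self):
            body = json.loads(widget_payload(7))
            # Assert on the structure and data we expect, not on how it was produced.
            self.assertEqual(sorted(body.keys()), ["id", "name", "tags"])
            self.assertEqual(body["id"], 7)
            self.assertEqual(body["name"], "widget-7")
            self.assertIsInstance(body["tags"], list)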
It's not even really a matter of "does it need tested?", although you should be asking that question and building up the coverage for the critical bits.
For me it was a question of, "is this going to save me time/sanity?"
I advocated tests to the other engineers at my startup only when they were experiencing regressions/repeat bugs. I left them alone about the lack of test discipline otherwise.
My Clojure code tends to "just work" (after I'm done experiencing insanity and disconnection from reality in my REPL) to the degree that I mostly write tests when I'm making a library or touching somebody else's Git repo.
This is all fitting though. I use Clojure instead of Haskell precisely because I'm impatient, lazy, etc. Would kill for a "quickcheck" for Clojure though.
This whole debate has a whiff of excluded middle (we have 3 UND ONLY 3 TESTS <---> TDD ALL THE TIME!), not to speak of people not mentioning how tests can simply...save time sometimes.
Funnily enough what the article describes is almost the perfect case for testing: "We wrote the first scoring algorithm...based on the red-green light." This sounds like it describes some heuristic weighting, which couldn't have been solved by types, but reasonable tests would have shown if the new algorithm weighted some special cases higher or lower than it should.
The problem with a heuristic weight, though, is that it's a heuristic, judged against other heuristics by taste, not proof.
The obvious testing approach -- ensuring that the score for each test case retains the same order as you tweak the algorithm -- is overtesting. You don't care about this total order; you more likely care about the ordering of classes of things, rather than the ordering within those classes, or simply that 'likely' cases follow an order. Hence, you hit far too many test failures.
I'd agree that it's possible to overtest in general, but it's so easy to overtest heuristics that it needs to be called out as a special case, and it sounds like the problem here.
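A sketch of the narrower assertion being suggested, with an invented score() heuristic: pin down the ordering between classes of things, and leave exact scores and within-class ordering free to change as you tune the weights.

    import unittest

    def score(item):
        # Hypothetical heuristic; the weights get tweaked constantly.
        return 3.0 * item["relevance"] + 0.5 * item["recency"]

    class ScoringHeuristicTest(unittest.TestCase):
        def test_relevant_items_outrank_irrelevant_ones(self):
            relevant = [{"relevance": 0.9, "recency": 0.1},
                        {"relevance": 0.8, "recency": 0.9}]
            irrelevant = [{"relevance": 0.1, "recency": 0.9},
                          {"relevance": 0.0, "recency": 1.0}]
            # Every clearly relevant item should beat every clearly
            # irrelevant one; nothing else about the order is pinned down.
            self.assertGreater(min(score(i) for i in relevant),
                               max(score(i) for i in irrelevant))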
No amount of tests can help you if you don't know how to solve the problem. And the better you know how to solve the problem, the fewer tests you really need. ;)
Some sane amount of testing is good. However, I'm not very convinced writing tests first is a good idea. I saw a few programmers practicing this, and when writing code they often concentrated too much on just passing the damn test cases instead of really solving the problem in a generic way. Their code looked like a set of if-then cases patched one over another. Therefore, if they missed an important case in their testing suite, there was a 99% chance of a bug. Once I ended up writing a separate, more generic implementation to validate the test suite. And it turned out there were a few missed/wrong cases in the test suite.
Seeing how many people agree with the post, I am left wondering who the commenters on HN are. With a large code base and actual customers, comprehensive unit testing is by far the most cost-effective way to create and maintain software. This is not being dogmatic; it's the shared experience of the majority of established software engineering firms. In the case of my company, we experienced so many growing pains that our productivity almost came down to zero before we switched to TDD. Many developers in our group were skeptics (including me), but today you won't find one of us arguing for less testing.
The kind of large pivot that the author refers to is only possible when you don't have established customers and you have a minimal product. You may as well call it prototyping. And prototyping with or without tests is indeed more a matter of taste than effectiveness.