Why Bother With Cucumber Testing?

[+] aslakhellesoy|12 years ago|reply

Creator of Cucumber here.

Cucumber is not a testing tool [1], it is a collaboration and analysis tool.

Use it to document (in very broad strokes) the features of your system.

It was never meant to be used as an exhaustive regression testing tool, or to replace unit testing.

[1] https://cucumber.pro/blog/2014/03/03/the-worlds-most-misunde...

[+] jrochkind1|12 years ago|reply

I'm curious what you think about 'Relish', software for turning cucumber steps into developer documentation. Ruby projects vcr[1] and rspec[2] use it as their exclusive/preferred/canonical developer API documentation.

As a developer-user (not maintainer/creator) of these projects, I've always found their relish-produced api docs to be fairly insufficient, confusing, and frustrating for me as a developer needing API docs.

[1] https://www.relishapp.com/vcr/vcr/docs

[2] https://relishapp.com/rspec/docs

[+] jdlshore|12 years ago|reply

My comment was a bit too late to be seen last time [0], so I'll repeat it here:

---

I was heavily involved with the Fit project for a while. (Fit is Ward Cunningham's predecessor to Cucumber. It used HTML tables rather than ASCII sentences.)

I find it interesting, if not surprising, that the Cucumber community is discovering exactly the same issues that we did with Fit: namely, that it encourages brittle integration tests, and that people don't use it for its intended purpose of collaboration.

I've come to believe that these problems are unsolvable.

I worked on and promoted Fit for several years. Eventually, after some deep soul searching, I concluded that Fit (and by extension, Cucumber) is solving the wrong problem. The value of Fit (and Cucumber) comes from discussions with domain experts. The tools encourage a certain amount of rigor, which is good, but you can get that rigor just as easily by discussing concrete examples at a whiteboard.

The actual automation of those examples, which is where Fit and Cucumber focus your attention, has minimal value. In some cases, the tools have negative value. Their tests are fundamentally more complex and less refactorable than ordinary xUnit tests. If you want to automate the examples, you're better off doing it in a programmer tool.

Some people got a lot of value out of Fit, and I'm sure the same is true for Cucumber. They got that value by using it for collaboration and focusing on domain rules rather than end-to-end scenarios. My experience, though, was that the vast majority used it poorly. When a tool is used so badly, so widely, you have to start questioning whether the problem is in the tool itself.

Ward and I ended up retiring Fit [1]. I've written about my reasons for abandoning it before [2] [3].

[0] https://news.ycombinator.com/item?id=7514651

[1] http://www.hanselminutes.com/151/fit-is-dead-long-live-fitne...

[2] http://www.jamesshore.com/Blog/The-Problems-With-Acceptance-...

[3] http://www.jamesshore.com/Blog/Acceptance-Testing-Revisited....

[+] Morendil|12 years ago|reply

> I've come to believe that these problems are unsolvable.

"Unsolvable" is strong. My take is that these problems stem inevitably from approaching the goal ass-backwards.

You're trying to force a particular testing framework on people who don't care about frameworks, supposedly in the name of "better communication". There's no reason to expect that to work.

The way I've advocated doing it, for years now, is to first sit with the people in question, discover how they communicate about business goals that the programming effort is expected to assist, formalize their notation as little as you can get away with and use that for acceptance testing.

This should be a process of active listening, not passive recording. The client should be gently nudged away from speaking in solution-terms, for instance.

[+] jdlshore|12 years ago|reply

I should mention that Aslak Hellesoy (creator of Cucumber) responded with a well thought-out comment about the issues I raised above: https://news.ycombinator.com/item?id=7519369

[+] dragonwriter|12 years ago|reply

> I find it interesting, if not surprising, that the Cucumber community is discovering exactly the same issues that we did with Fit: namely, that it encourages brittle integration tests, and that people don't use it for its intended purpose of collaboration.

> I've come to believe that these problems are unsolvable.

I think they're quite solvable; it's just that they aren't technical problems. They are social problems, technical solutions fail, and the people who are good at solving technical problems often aren't particularly skilled at solving social problems.

I think Cucumber could be a good tool in the right workflow, but establishing the right cross-functional workflow is a very hard social problem that the people that understand the technology very often don't have the skill to solve (and, generally, don't have the social position to effectively champion solutions even if they had them.)

[+] rurounijones|12 years ago|reply

The only time I have found cucumber testing useful is when I was working in a "corporate" scenario where we had tedious UAT tests written for humans to run manually.

I convinced them to use plain-text cucumber syntax files. We had to use detailed imperative, not declarative, tests with the default steps, which all cucumber gurus hate and which were in fact relegated to a "cucumber-training-wheels" repo. But in this scenario they were the only option (old-school enterprise test thinking). Then I just automated them myself and made them part of the CI build.

The result: a 100% pass on the first UAT run (minus a few % from UAT testers reading tests differently from how everyone else had been reading them up to that point), everyone happy, we looked good, and the client-hired contract testers finished early, saving money.

Without cucumber we would have been in excel spreadsheet "UAT Test file" hell.

I am not sure I would use them in any other scenario though.

[+] nimblegorilla|12 years ago|reply

I used cucumber in the same type of scenario (humans running tests) and it worked great. It works so well it's really tempting to try and use it everywhere, but there are definitely shortcomings when cucumber gets misapplied.

[+] psychometry|12 years ago|reply

I've never quite understood exactly whom Cucumber is for. Clients don't want to read Cucumber specs and programmers don't want to write them.

[+] praptak|12 years ago|reply

> Clients don't want to read Cucumber specs and programmers don't want to write them.

Perhaps it is a general problem? Clients want to stay on their end of the (in-)formality spectrum (narrative) and programmers on theirs (formal spec). Nobody wants to go to the middle.

Yeah, there are analysts. They bridge the gap but are themselves an additional link where transcription errors happen.

[+] marksweston|12 years ago|reply

>Clients don't want to read Cucumber specs

Far too broad a generalization I think. I currently work with a non-technical CEO whose first encounter with Cucumber was when we were trying to re-specify a feature that we'd got wrong at the previous attempt. It was a revelation to him, and his initial burst of enthusiasm resulted in his going away and writing - unknown to us - Cucumber Features for about half of the functionality of the app. He's calmed down since then and leaves the actual writing to us, but Cucumber remains our go-to tool for resolving complicated requirements.

Obviously trying to force Cucumber on an unwilling/uninterested client is going to be difficult and damaging to the relationship. Maybe if their first experience was a positive one in which using Cucumber helped resolve a communications problem rather than a technical ritual that they were forced to repeat for every user story, then attitudes might change?

[+] sokrates|12 years ago|reply

Most points raised by the author are criticisms of specific Cucumber step implementations. I understand that this is not criticism of Cucumber-the-tool but Cucumber-the-process, but of course, if you're using your tools wrong, nobody's going to just fix your process.

The awkward step naming, and the routing issue, are all with the default steps of an integration of something like Webrat or Capybara. In recent projects (and I think this is the default for newer versions of Capybara), no steps are automatically generated for you. You have to choose the level of detail you want to operate at yourself. Comparing character count of a method call and a Gherkin step is also a rather useless metric.

The "doesn't share code with my test env" issue is a trivial fix: move your test-but-also-cucumber-env code from test_helper.rb into another file, then require it from both cucumber's env.rb and test_helper.rb.
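A minimal sketch of that layout, under assumptions: the file name `test/support/shared_helpers.rb`, the `SharedHelpers` module, and its helper method are all hypothetical. `World(...)` is Cucumber's hook for mixing a module into step definitions. The require lines are shown as comments because the paths depend on your project structure.

```ruby
# test/support/shared_helpers.rb  (hypothetical file name)
# Helpers needed by both the Cucumber suite and the regular test suite.
module SharedHelpers
  # Illustrative helper; the values are placeholders, not real card data.
  def default_payment_details
    { number: "1111111222", ccv: "1234" }
  end
end

# features/support/env.rb would then contain something like:
#   require_relative "../../test/support/shared_helpers"
#   World(SharedHelpers)
#
# test/test_helper.rb would contain something like:
#   require_relative "support/shared_helpers"
#   class ActiveSupport::TestCase   # or Minitest::Test outside Rails
#     include SharedHelpers
#   end
```

Both suites then call `default_payment_details` from the same definition, so a change is made in one place.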

Personally, I write Cucumber features exactly because they mean I don't have to think about routing, or paths, or syntax. I try to put myself in the role of the user and write down what I want to accomplish, and how. A key indicator for reasonably abstracted feature files might be the equivalence "I changed something in a .feature file" IFF "I have to communicate a change to my users."

But if you want to shoot yourself in the foot, you really have to do it yourself.

[+] ctolsen|12 years ago|reply

In Python, I use doctests for things like this. Just write the features in plain understandable language, with clear code that serves both as integration/behaviour test and example in between.

Cucumber etc. is usually about coaxing natural language into code, which makes no sense to me. Better to use natural language with useful code examples, which are readable to client, developer, and test software.

[+] berkes|12 years ago|reply

I agree with most of the problems OP describes, but find the article lacking two important things.

It is always easy to say "don't do X". But without explaining how X can be replaced with, or ported to, an alternative, it is rather hollow advice.

For me, Cucumber is not just the Gherkin syntax. It is, first and foremost, a turnkey setup that allows me to have browser interaction (including selenium), organised in a nice and workable way.

How I would implement something like:

  When "I fill in my payment details" do
    # Capybara's fill_in takes the value through the with: option
    fill_in "Creditcard Number", with: "1111111222"
    fill_in "ccv", with: "1234"
  end

In, say, Minitest or MinitestSpec is beyond me.

Are there good resources on doing proper integration tests as OP suggests? There are entire books on Cucumber, yet I cannot find something similar on Minitest.

Is it easy to set up a toolchain that allows me to use Capybara and Selenium (PhantomJS) interchangeably in my integration tests? If so, is there a place where I can find documentation on that?

[+] bodhi|12 years ago|reply

Capybara does pretty much what you are asking for:

https://github.com/jnicklas/capybara

And it has various drivers for e.g. Selenium.
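For the payment-details step asked about above, one possible shape in Minitest is a plain helper method. This is a sketch under assumptions: `PaymentSteps`, `RecordingPage`, and `CheckoutTest` are hypothetical names, and `RecordingPage` is only a stand-in so the example runs without a browser. In a real suite you would likely include `Capybara::DSL` in the test class and call `fill_in` directly.

```ruby
require "minitest/autorun"

# The Cucumber step body becomes an ordinary method (hypothetical module).
module PaymentSteps
  # `page` is anything that responds to fill_in(locator, with:),
  # which mirrors the shape of Capybara's fill_in.
  def fill_in_payment_details(page)
    page.fill_in "Creditcard Number", with: "1111111222"
    page.fill_in "ccv", with: "1234"
  end
end

# Stand-in for a browser session: it only records what was typed.
class RecordingPage
  attr_reader :fields

  def initialize
    @fields = {}
  end

  def fill_in(locator, with:)
    @fields[locator] = with
  end
end

class CheckoutTest < Minitest::Test
  include PaymentSteps

  def test_fills_in_payment_details
    page = RecordingPage.new
    fill_in_payment_details(page)
    assert_equal "1234", page.fields["ccv"]
  end
end
```

The helper name reads much like the Gherkin step did, but it is plain Ruby, so it can be refactored and shared like any other method.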

[+] markov_twain|12 years ago|reply

Here's an entirely working example I just created: https://gist.github.com/benolee/ca49aaa9a0363c18904b

[+] bradgessler|12 years ago|reply

https://github.com/wojtekmach/minitest-capybara

[+] theyCallMeSwift|12 years ago|reply

So it seems like the new hot way to do integration testing is with RSpec + Features [1]. Any sources on doing this well? I find myself writing methods that read like cucumber.

[1] http://pivotallabs.com/getting-by-with-rspec-feature-specs/

[+] foz|12 years ago|reply

In my frontend web team, we used Cucumber for over a year. We slowly came to the same conclusion - Cucumber tests are hard to maintain, run more slowly, and overall take more effort to develop.

As of today, we're in the process of ripping out all of our Cucumber tests and replacing them all with RSpec features.

In our largish company we found that product owners generally didn't care about reading Cucumber features. The definition of the products and how they work are defined in the agile boards and cards we write together, along with documentation and wireframes which live on our intranet.

[+] karmajunkie|12 years ago|reply

I've done integration testing just about every way I can think of. I've "cuked it wrong". I've used plain rspec. I've used rspec with features. I've written an equivalent framework for use in a test::unit shop.

They're all fine. They get the job done.

I've also done acceptance testing with all of the above, and it works great as well. It's a language thing. My ATs are rarely more than half a dozen lines, and many of them are less than 5.

Whether or not you should do ATs at all really comes down to the process culture. If you have someone signing off on whether a feature works, I think it's great to do. If you're the only one signing off, I think it's worth doing as a tool to help implement the feature, but then convert it to an integration test, trim the AT to be part of a very limited suite, or toss it altogether. They're not unit tests, and not meant to be voluminous documentation of your app. That doesn't mean don't do them. It means learn how to do them well.

[+] thisisauserid|12 years ago|reply

I do this with RSpec + PageObject: https://github.com/cheezy/page-object

An example that's similar to what I do: http://youtu.be/e9tfC-gLW8c?t=8m28s

[+] dpeck|12 years ago|reply

I wouldn't say I'm doing it well, but I'm doing it. I resolved to make myself do TDD for a few projects and today signed up for a month of thoughtbot's Learn program. They're big advocates of integration spec testing, and have written quite a bit about it.

I'm not wholly convinced it's the best way to go, but it does offer some advantages.

[+] jsnk|12 years ago|reply

I use both RSpec and Cucumber often, and in terms of ease, I think Cucumber is just easier to write.

RSpec verbs and scoping still mess with my head, and I always have to debug why the tests are breaking.

[+] midas007|12 years ago|reply

Cucumber, especially for move-fast-and-break-things startups, can be extra baggage. It's only needed where there is so much behavior and/or so many stakeholders to please that it would matter. Otherwise, it's probably as useful as repetitive, boilerplate comments that bloat LoC without adding value.

Conversely, if you want to go down the waterfall route, especially for safety / control systems, formal specs can be really, really useful (Z "Zed" for example). It looks like model descriptions for MVC, but it's closer to being readable formal proofs... so it's very useful for engineering / scientific projects. [0]

IOW... What I remind myself: "Stop playing with tech toys, ship and get some customer feedback already."

[0] https://en.wikipedia.org/wiki/Z_notation

[+] stirno|12 years ago|reply

This mirrors my own experiences using Gherkin syntax in other tech stacks. The most workable scenario I've come up with is to focus on simple, terse syntax that is still readable.

Build reusable page objects, keep your actual tests short and things get nice and easy.
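A rough Ruby sketch of the page-object idea, not FluentAutomation's API or any particular library's: `LoginPage` and `FakeSession` are hypothetical names, and `FakeSession` is a stand-in that logs actions so the example runs without a browser. In practice the session would be something like a Capybara session, which exposes `fill_in` and `click_button` with these shapes.

```ruby
# Hypothetical page object: wraps the low-level interactions for one page.
class LoginPage
  def initialize(session)
    @session = session
  end

  # One intention-revealing method instead of three steps in every test.
  def sign_in(email, password)
    @session.fill_in "Email", with: email
    @session.fill_in "Password", with: password
    @session.click_button "Sign in"
    self
  end
end

# Stand-in session that only logs what it was asked to do.
class FakeSession
  attr_reader :actions

  def initialize
    @actions = []
  end

  def fill_in(locator, with:)
    @actions << [:fill_in, locator, with]
  end

  def click_button(locator)
    @actions << [:click_button, locator]
  end
end

session = FakeSession.new
LoginPage.new(session).sign_in("user@example.com", "secret")
p session.actions
```

The test body shrinks to a single call, and if the form's labels change, only `LoginPage` needs updating.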

Lots of good options in this space. A few years ago I built FluentAutomation [1] to solve this issue in the .NET community.

[1] http://fluent.stirno.com/

29 comments