beder | 5 years ago

I was on board for points 1 and 2, but completely disagree with 3:

> The components might look wrong when rendered. Tests are very bad at this. We don't want to go down the esoteric and labor-intensive path of automated image capture and image diffing.

This is the main reason we have tests for our (Angular) TypeScript at all; the tests all roughly look like:

1. set up component with fake data

2. take screenshot and compare with golden

3. (maybe) interact with component and verify change

These are super easy to write, and also easy to review: most review comments are on the screenshots anyways. And since the goldens are checked in, it’s easy to see what different parts of the app look like without starting it up.
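As a rough illustration of step 2 in the list above, here is a minimal sketch of a golden comparison in TypeScript. It is not the tooling the commenter uses: the `Bitmap` shape, `compareWithGolden`, `solid`, and the `channelTolerance` parameter are all hypothetical names, and the screenshot is assumed to already be available as raw RGBA bytes.

```typescript
// Hypothetical shape for a captured screenshot: raw RGBA bytes, row-major.
interface Bitmap {
  width: number;
  height: number;
  data: Uint8Array; // length is width * height * 4
}

interface GoldenResult {
  matches: boolean;
  mismatchedPixels: number;
}

// Compare a fresh screenshot against the checked-in golden.
// channelTolerance is an assumed knob: the largest per-channel difference
// (0-255) that still counts as "the same pixel". 0 means exact match.
function compareWithGolden(
  actual: Bitmap,
  golden: Bitmap,
  channelTolerance: number = 0,
): GoldenResult {
  if (actual.width !== golden.width || actual.height !== golden.height) {
    // A size change is always a failure; report the larger pixel count.
    const mismatchedPixels = Math.max(
      actual.width * actual.height,
      golden.width * golden.height,
    );
    return { matches: false, mismatchedPixels };
  }
  let mismatchedPixels = 0;
  for (let i = 0; i < actual.data.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(actual.data[i + c] - golden.data[i + c]) > channelTolerance) {
        mismatchedPixels++;
        break; // count each pixel at most once
      }
    }
  }
  return { matches: mismatchedPixels === 0, mismatchedPixels };
}

// Helper that stands in for "render the component and take a screenshot":
// it just builds a solid-colour bitmap.
function solid(
  width: number,
  height: number,
  rgba: [number, number, number, number],
): Bitmap {
  const data = new Uint8Array(width * height * 4);
  for (let i = 0; i < data.length; i += 4) {
    data.set(rgba, i);
  }
  return { width, height, data };
}
```

In a real test the `actual` bitmap would come from whatever screenshot mechanism the test runner provides, and the golden would be decoded from the checked-in image file; both of those steps are left out here.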

geoelectric | 5 years ago

In my experience, bitmap comparison testing is pretty hard to keep going unless you have a dedicated service for maintaining the bitmaps, updating them when needed, a nice reporting system that will shade the difference area, and people with enough time to review the results to figure out what the difference means and determine root cause. I'm sure some of that has become easier over time with off-the-shelf tooling, but since I'm still seeing bespoke systems out there doing it I doubt it's become turnkey easy yet.
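To give a sense of the "shade the difference area" piece, here is a minimal sketch, assuming both screenshots are already decoded into same-sized RGBA byte arrays. `Box` and `diffBoundingBox` are hypothetical names, and a real reporting system would do considerably more (overlay rendering, history, triage UI).

```typescript
// Inclusive pixel coordinates of the smallest rectangle covering all changes.
interface Box {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// Return the bounding box of every pixel that differs between two RGBA
// buffers of the same dimensions, or null when they are identical.
function diffBoundingBox(
  a: Uint8Array,
  b: Uint8Array,
  width: number,
  height: number,
): Box | null {
  if (a.length !== width * height * 4 || b.length !== a.length) {
    throw new Error("buffers must both be width * height * 4 bytes");
  }
  let left = width;
  let top = height;
  let right = -1;
  let bottom = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const differs =
        a[i] !== b[i] ||
        a[i + 1] !== b[i + 1] ||
        a[i + 2] !== b[i + 2] ||
        a[i + 3] !== b[i + 3];
      if (differs) {
        if (x < left) left = x;
        if (x > right) right = x;
        if (y < top) top = y;
        if (y > bottom) bottom = y;
      }
    }
  }
  // right stays at -1 only if no pixel differed.
  return right < 0 ? null : { left, top, right, bottom };
}
```

A report could then draw a translucent rectangle over that box on the new screenshot. That only tells a reviewer where to look; working out what the difference means is still the manual part.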

It's also not something you want to do until the UI has solidified for that spot--which sometimes never happens, for some apps. It also has the issue that you can often only pixel-compare between two shots from the same rendering system--that is, the same browser version if you're testing in a browser, the same graphics drivers and rendering subsystem version if you're native, etc. Frequently that means testing against frozen reference environments that become less and less relevant to in-field behavior over time and are also a maintenance load to qualify and update.

When thinking of the "test pyramid" and why, for example, UI tests are at the tip due to high fragility vs. low specificity vs. high effort to triage and maintain, I'd put bitmap testing at the very tippy-top, on an antenna above the pyramid. They're useful, but they have a large hump to set up and a long tail of rather heavy maintenance, a failure could mean almost anything under the hood, and they churn like a mofo during any flow or visual refactor whatsoever, with no possible way to abstract them to mitigate that.

At that point, it's not really about the usefulness anymore, and more about the opportunity cost of not doing something different with the time. I think they're usually pretty questionable unless you're actually testing a rendering system where the bitmap is the primary output. A custom-rendered component ancillary to the application probably wouldn't meet that bar in most cases unless it were complex and central enough to merit the operational risk and stable enough to mitigate the same.