Ask HN: How do you manage flaky E2E tests at scale?
3 points | forcepushed | 19 days ago
I’ve seen a few patterns over the years: retries everywhere, quarantining tests, rewriting flows, adding more waits than anyone feels good about, or just slowly losing trust in CI signal. None of them feel great once you have hundreds or thousands of tests running across multiple environments.
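To make the "retries everywhere" pattern concrete, here's a minimal sketch of the kind of retry wrapper I mean, a decorator that re-runs a flaky test body a few times before giving up. All names here are illustrative, not from any particular test framework:

```python
import functools
import time

def retry(times=3, delay=0.5, exceptions=(AssertionError,)):
    """Re-run a flaky test body up to `times` attempts, sleeping
    `delay` seconds between attempts. Only the listed exception
    types trigger a retry; anything else propagates immediately."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == times:
                        # Out of attempts: surface the real failure.
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator
```

The problem, of course, is that this hides genuine product bugs just as well as it hides infrastructure noise, which is exactly why trust in the CI signal erodes.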
I’m especially interested in how QA and engineering teams split responsibility. Do you treat flakiness as a test problem, a product problem, or infrastructure noise? At what point do you decide a test is no longer worth keeping?
Asking partly out of personal frustration and partly because I’ve been working on tooling around browser automation and want to sanity check the problems I’m seeing against the pains others are feeling day to day.
Would love to hear real stories from people running E2E at scale, what actually worked, and what you wish you had done earlier.
Thanks in advance.
alexandriaeden|5 days ago
benoau|19 days ago
forcepushed|17 days ago
Do you feel yourself wanting to extract this logic (waiting for a selector, a load state, a function evaluation, etc.) into some shared utility, and then push all of your interactions through it as a sort of feedback engine for future problems?
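Something like this is what I have in mind: a hypothetical `wait_until` helper that polls a condition and exposes a hook for recording timeouts as flake telemetry. A minimal sketch, with all names illustrative:

```python
import time

def wait_until(predicate, timeout=5.0, interval=0.1, on_timeout=None):
    """Poll `predicate` until it returns a truthy value or `timeout`
    seconds elapse. On timeout, call the optional `on_timeout` hook
    with the attempt count (e.g. to record flake telemetry), then
    raise TimeoutError."""
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            if on_timeout is not None:
                on_timeout(attempts)  # feedback hook: where/how often waits fail
            raise TimeoutError(f"condition not met after {attempts} attempts")
        time.sleep(interval)
```

If every interaction goes through one choke point like this, the `on_timeout` data tells you which selectors or states are flaky before anyone has to debug a red build.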
alexgandy|15 days ago
For what it's worth, I've been working on a side project to help with almost this exact situation, and I'd be really interested to know whether it could help you: https://gaffer.sh
apothegm|19 days ago
forcepushed|17 days ago
How much time do you think you spend making this decision vs. just fixing the tests, and how often do you see yourself adding back tests you quarantined or deleted?
microseyuyu|19 days ago
[deleted]