top | item 41963833


tsv_ | 1 year ago

Each time a new LLM version comes out, I give it another try at generating tests. However, even with the latest models, tailored GPTs, and well-crafted prompts with code examples, the same issues keep surfacing:

- The models often create several tests within the same equivalence class, which barely expands test coverage

- They either skip parameterization, creating multiple redundant tests, or go overboard with 5+ parameters that make tests hard to read and maintain

- The models seem focused on "writing a test at any cost," often resorting to excessive mocking or monkey-patching without much thought

- The models don’t leverage existing helper functions or classes in the project, requiring me to upload the whole project context each time or customize GPTs for every individual project
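The first point is easiest to see with a concrete case. A minimal sketch (the function `is_adult` and all test names are hypothetical, invented for illustration): several generated tests that all fall in the same equivalence class, versus one test that picks a representative per class plus the boundaries.

```python
def is_adult(age: int) -> bool:
    # Hypothetical function under test.
    if age < 0:
        raise ValueError("age must be non-negative")
    return age >= 18

# What generated suites often look like: three tests, one equivalence
# class (all three ages are well inside the "adult" range), so the
# extra tests add no coverage.
def test_is_adult_25():
    assert is_adult(25)

def test_is_adult_30():
    assert is_adult(30)

def test_is_adult_42():
    assert is_adult(42)

# More useful: one representative per class, plus the boundary values
# on either side of the threshold.
def test_is_adult_classes():
    assert not is_adult(0)    # minor class, lower edge
    assert not is_adult(17)   # boundary just below threshold
    assert is_adult(18)       # boundary at threshold
    assert is_adult(120)      # adult class, representative value
```

Running either suite passes; the difference is that the second version exercises the decision boundary, which is where bugs in this kind of code actually live.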

Given these limitations, I primarily use LLMs for refactoring tests, where the IDE isn’t as efficient:

- Extracting repetitive code in tests into helpers or fixtures

- Merging multiple tests into a single parameterized test

- Breaking up overly complex parameterized tests for readability

- Renaming tests to maintain a consistent style across a module, without getting stuck on names
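The merge-into-parameterized refactor in particular is mechanical enough that LLMs handle it well. A sketch of the before/after shape, assuming pytest (the helper `normalize` and the test names are hypothetical):

```python
import pytest

def normalize(s: str) -> str:
    # Hypothetical helper under test: collapse whitespace, lowercase.
    return " ".join(s.split()).lower()

# Before: near-identical tests that differ only in their data.
def test_normalize_collapses_spaces():
    assert normalize("a  b") == "a b"

def test_normalize_lowercases():
    assert normalize("A B") == "a b"

# After: one parameterized test with a small, readable case table.
@pytest.mark.parametrize(
    "raw, expected",
    [
        ("a  b", "a b"),     # collapse internal whitespace
        ("A B", "a b"),      # lowercase
        ("  a b  ", "a b"),  # strip leading/trailing whitespace
    ],
)
def test_normalize(raw, expected):
    assert normalize(raw) == expected
```

Keeping the case table to a handful of two- or three-element tuples is what preserves readability; once the table needs five-plus columns, the reverse refactor (splitting it back up) is usually the better move, as the parent comment notes.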



deeviant | 1 year ago

I find all of the points you raise to be common in human-written tests as well.