zie1ony | 8 months ago

I find it amazing that the same ideas pop up in the same period of time. For example, I work on test generation and I went down the same path. I tried to find bugs by prompting "Find bugs in this code and implement tests to show it.", but this didn't get me far. Then I switched to property (invariant) testing, like you, but in my case I ask the AI: "Based on the whole codebase, make the property tests." Then I fuzz random actions on the stateful objects and run the property tests over and over again.
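To make the workflow concrete, here is a minimal sketch of fuzzing random actions against a stateful object and re-checking invariants after every step. It uses only the standard library. `BoundedStack`, its capacity of 5, and the 60/40 push/pop split are hypothetical choices for illustration, not anything from the commenter's actual setup.

```python
import random


class BoundedStack:
    """Hypothetical stateful object under test (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, value):
        # Reject the push when the stack is full.
        if len(self.items) < self.capacity:
            self.items.append(value)
            return True
        return False

    def pop(self):
        # Return None instead of raising when the stack is empty.
        return self.items.pop() if self.items else None


def check_invariants(stack):
    # Property: the size always stays within [0, capacity].
    assert 0 <= len(stack.items) <= stack.capacity


def fuzz(seed, steps=200, capacity=5):
    """Apply a random action sequence and compare against a simple list model."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    stack = BoundedStack(capacity)
    model = []
    for _ in range(steps):
        if rng.random() < 0.6:
            value = rng.randint(0, 99)
            accepted = stack.push(value)
            if len(model) < capacity:
                model.append(value)
                assert accepted
            else:
                assert not accepted
        else:
            expected = model.pop() if model else None
            assert stack.pop() == expected
        check_invariants(stack)
        assert stack.items == model
    return True


def run_many(runs=100):
    # "Over and over again": one independent action sequence per seed.
    for seed in range(runs):
        fuzz(seed)
    return runs
```

Calling `run_many()` replays 100 seeded sequences; when an assertion fails, the seed identifies the exact sequence to reproduce.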

At first I also wanted to automate everything, but over time I realized that the best split is about 10% human work to 90% AI work.

Another idea I'm exploring is AI + mutation testing (https://en.wikipedia.org/wiki/Mutation_testing). Mutants that survive the test suite point at behavior the tests don't check yet, which should help the AI generate tests toward full coverage.
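As a rough illustration of the mutation-testing idea, the sketch below scores two test suites by how many mutants they "kill" (a mutant is killed when at least one check fails on it). The `clamp` function, the mutants, and both suites are hypothetical and hand-written here; real mutation tools generate the mutants automatically.

```python
def clamp(x, lo, hi):
    """Original function: restrict x to the range [lo, hi]."""
    return max(lo, min(x, hi))


# Hand-written mutants, each with one small deliberate defect.
def mutant_swapped(x, lo, hi):
    return min(lo, max(x, hi))


def mutant_no_upper(x, lo, hi):
    return max(lo, x)


def mutant_no_lower(x, lo, hi):
    return min(x, hi)


def mutant_off_by_one(x, lo, hi):
    return max(lo, min(x, hi - 1))


MUTANTS = [mutant_swapped, mutant_no_upper, mutant_no_lower, mutant_off_by_one]


def weak_suite(f):
    # Only checks a value in the middle of the range.
    return f(5, 0, 10) == 5


def strong_suite(f):
    # Also checks both bounds and the exact upper edge.
    return (
        f(5, 0, 10) == 5
        and f(-3, 0, 10) == 0
        and f(42, 0, 10) == 10
        and f(10, 0, 10) == 10
    )


def mutation_score(suite, mutants):
    """Fraction of mutants on which the suite fails (that is, kills)."""
    killed = sum(1 for mutant in mutants if not suite(mutant))
    return killed / len(mutants)
```

Both suites pass on the original `clamp`, but `mutation_score(weak_suite, MUTANTS)` is 0.25 while `mutation_score(strong_suite, MUTANTS)` is 1.0. The surviving mutants are the signal one could feed back to the test generator.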

LAC-Tech|8 months ago

I'd have much more confidence in an AI codebase where the human has chosen the property tests than in a human codebase where the AI has chosen the property tests.

Tests are executable specs. That is the last thing you should offload to an LLM.

bccdee|8 months ago

Also, a poorly designed test suite makes your codebase extremely painful to change. A well-designed test suite with good abstractions makes it easy to change code, and on top of that, it makes tests extremely fast to write.

I think the whole idea of getting LLMs to write the tests comes from a pandemic of under-abstracted, labour-intensive test suites. And that just makes the problem worse.

kenjackson|8 months ago

While I agree in theory, the problem I have is that the humans I've worked with are much worse at writing tests than they are at writing the implementation. Maybe it's motivation or experience, but test quality is generally much worse than implementation quality, at least in my experience.

koakuma-chan|8 months ago

How about an LRM?

wahnfrieden|8 months ago

An under-explored approach is to collect data on human usage of the app (from production and from internal testers) and feed that to your generative inputs.
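One possible reading of this, sketched under assumptions: count how often each action appears in recorded usage and sample fuzzing actions with those frequencies as weights, so generated sequences resemble real traffic. The action names and the tiny log below are made up for illustration.

```python
import random
from collections import Counter

# Hypothetical usage log; in practice this would come from production
# telemetry or from sessions recorded by internal testers.
observed_actions = [
    "view", "view", "add_to_cart", "view",
    "checkout", "add_to_cart", "view", "search",
]


def build_sampler(log, seed=0):
    """Return a function that samples actions weighted by observed frequency."""
    counts = Counter(log)
    actions = sorted(counts)                 # stable order for reproducibility
    weights = [counts[a] for a in actions]
    rng = random.Random(seed)

    def sample(n):
        return rng.choices(actions, weights=weights, k=n)

    return sample


sample = build_sampler(observed_actions)
sequence = sample(10)  # an action sequence to drive a fuzz run
```

Sampling single actions ignores ordering; a sketch that keeps realistic order could sample transitions between consecutive actions instead.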