I was going to ask, why aren't they running the code through unit tests if they're committing such shit code? Does it pass the tests, or is it just inefficient code?
First, consider the following: the problem with current LLMs is that they generate code which, in addition to any obvious messiness, tends to look plausible even when it is wrong. That is, the kinds of bugs most often overlooked in reviews are also the kinds of bugs LLMs are most likely to have "accidentally learned as correct behaviour".
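As a toy illustration (my own example, not from any actual model output): a helper that reads as obviously correct on a quick skim, yet silently drops data, which is exactly the kind of bug that slips through review:

```python
def chunk(items, size):
    """Split items into chunks of at most `size` elements."""
    # Looks plausible, but `len(items) // size` rounds down,
    # silently discarding any trailing partial chunk.
    return [items[i * size:(i + 1) * size]
            for i in range(len(items) // size)]

print(chunk([1, 2, 3, 4, 5], 2))  # → [[1, 2], [3, 4]] — the 5 is lost
```

A test that only ever feeds it evenly divisible inputs would pass without complaint; the fix is to iterate `range(0, len(items), size)` instead.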
Then: outside of some extremely complex or absurdly simple cases, it is very often harder to write tests which truly and fully exercise your code than it is to write the code correctly in the first place.
In my experience, correct code is often the product of carefully written tests (which in reality are still imperfect), static analysis (the type system, or external tools) applied to carefully written code, and a proper code review.
So if you bring all of this together, you have:
- AI-supported code which is likely to contain bugs that are really easy to overlook in reviews
- AI-supported test code which has the same issue, i.e. it has gaps that reviewers are really likely to overlook
- more code overall, due to less reuse and it sometimes being easier to generate code than to use a library, leading to more code review work, and in turn more time pressure and lower-quality reviews
Put together: more bugs which are hard to find, tests which are more likely to pass despite subtle bugs, and less time for proper reviews.
So does it pass the tests? Yes, but the tests were AI-written too, so can they be trusted?
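To make the test-gap point concrete, a hypothetical sketch (my example, not from the thread): a subtly wrong function next to a plausible-looking set of tests that never hits the broken case.

```python
def is_leap_year(year: int) -> bool:
    # Plausible but wrong: it misses the divisible-by-400 exception,
    # so the year 2000 is misclassified as a common year.
    return year % 4 == 0 and year % 100 != 0

# A plausible-looking test suite with a gap: it never checks a
# 400-divisible century year such as 2000, the one case that breaks.
assert is_leap_year(2024)
assert not is_leap_year(2023)
assert not is_leap_year(1900)  # right answer, but never probes the 400 rule
```

Every assertion above passes, so the suite is green, yet `is_leap_year(2000)` returns `False` when 2000 is in fact a leap year. A generated test suite and generated code can share the same blind spot.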
dathinab|2 years ago
actionfromafar|2 years ago