top | item 45807720

hatefulmoron | 3 months ago

Calling things "slop" is just begging the question. The real differentiating factor is that, in the past, "human-generated slop" at least took effort to produce. Perhaps, in the process of producing it, the human notices what's happening and reconsiders (or even better, improves it such that it's no longer "slop".) Claude has no such inhibitions. So, when you look at a big bunch of code that you haven't read yet, are you more or less confident when you find out an LLM wrote it?

fragmede|3 months ago

If you try to one-shot it, sure, but not if you question Claude, point out the error of its ways, tell it to refactor and ultrathink, and point out that two things have similar functionality and could be merged. It can write unhinged code with duplicate, unused variable definitions that doesn't work, and it'll fix it up if you call it out, or you can just do it yourself. (Cue questions of whether, in that case, it would just be faster to do it yourself.)

hatefulmoron|3 months ago

I have a Claude Max subscription. When I think of bad Claude code, I'm not thinking about unused variable definitions. I'm thinking about the times you turn on ultrathink, allow it to access tools and negotiate its solution, and it still churns out an overcomplicated yet only partially correct solution that breaks. I totally trust Claude to fix linting errors.

WalterSear|3 months ago

I have pretty much the same amount of confidence when I receive AI-generated or non-AI-generated code to review: my confidence is based on the person guiding the LLM, and their ability to do that.

Much more so than before, I'll comfortably reject a PR that is hard to follow, for any reason, including size. IMHO, the biggest change that LLMs have brought to the table is that clean code and refactoring are no longer expensive, and should no longer be bargained for, neglected or given the lip service that they have received throughout most of my career. Test suites and documentation, too.

(Given the nature of working with LLMs, I also suspect that clean, idiomatic code is more important than ever, since LLMs have presumably been trained on that. This is just a personal superstition, one that is probably increasingly false, but it also feels harmless.)

The only time I think it is appropriate to land a large amount of code at once is if it is a single act of entirely brain dead refactoring, doing nothing new, such as renaming a single variable across an entire codebase, or moving/breaking/consolidating a single module or file. And there better be tests. Otherwise, get an LLM to break things up and make things easier for me to understand, for crying out loud: there are precious few reasons left not to make reviewing PRs as easy as possible.

So, I posit that the emotional reaction from certain audiences is still the largest, most exhausting difference.

grey-area|3 months ago

> clean code and refactoring are no longer expensive

Are you contending that LLMs produce clean code?

hatefulmoron|3 months ago

I don't really understand your point. It reads like you're saying "I like good code, it doesn't matter if it comes from a person or an LLM. If a person is good at using an LLM, it's fine." Sure, but the problem people have with LLMs is their _propensity_ to create slop in comparison to humans. Dismissing other people's observations as purely an emotional reaction just makes it seem like you haven't carefully thought about other people's experiences.

HelloNurse|3 months ago

> Perhaps, in the process of producing it, the human notices what's happening and reconsiders (or even better, improves it such that it's no longer "slop".)

Take the same ridiculously large and complex change. If it is handwritten, only a seriously insensitive and arrogant crackpot could, knowing what's inside, submit it expecting you to accept it without a long and painful process, instead of improving it to the best of their ability. With LLM assistance, on the other hand, even a mildly incompetent but valuable colleague or contributor, someone you care about, might underestimate the complexity and cost of what they didn't actually write and believe that there is nothing to improve.