top | item 47107247

(no title)

jamesmcq | 8 days ago

Trust me I'm very impressed at the progress AI has made, and maybe we'll get to the point where everything is 100% correct all the time and better than any human could write. I'm skeptical we can get there with the LLM approach though.

The problem is LLMs are great at simple implementation, even large amounts of simple implementation, but I've never seen it develop something more than trivial correctly. The larger problem is it's very often subtly but hugely wrong. It makes bad architecture decisions, it breaks things in pursuit of fixing or implementing other things. You can tell it has no concept of the "right" way to implement something. It very obviously lacks the "senior developer insight".

Maybe you can resolve some of these with large amounts of planning or specs, but that's the point of my original comment - at what point is it easier/faster/better to just write the code yourself? You don't get a prize for writing the least amount of code when you're just writing specs instead.

discuss

fourthark|8 days ago

This is exactly what the article is about. The tradeoff is that you have to throughly review the plans and iterate on them, which is tiring. But the LLM will write good code faster than you, if you tell it what good code is.

reg_dunlop|8 days ago

Exactly; the original commenter seems determined to write-off AI as "just not as good as me".

The original article is, to me, seemingly not that novel. Not because it's a trite example, but because I've begun to experience massive gains from following the same basic premise as the article. And I can't believe there's others who aren't using like this.

I iterate the plan until it's seemingly deterministic, then I strip the plan of implementation, and re-write it following a TDD approach. Then I read all specs, and generate all the code to red->green the tests.

If this commenter is too good for that, then it's that attitude that'll keep him stuck. I already feel like my projects backlog is achievable, this year.

Degorath|7 days ago

My experience has so far been similar to the root commenter - at the stage where you need to have a long cycle with planning it's just slower than doing the writing + theory building on my own.

It's an okay mental energy saver for simpler things, but for me the self review in an actual production code context is much more draining than writing is.

I guess we're seeing the split of people for whom reviewing is easy and writing is difficult and vice versa.

hathawsh|7 days ago

Several months ago, just for fun, I asked Claude (the web site, not Claude Code) to build a web page with a little animated cannon that shoots at the mouse cursor with a ballistic trajectory. It built the page in seconds, but the aim was incorrect; it always shot too low. I told it the aim was off. It still got it wrong. I prompted it several times to try to correct it, but it never got it right. In fact, the web page started to break and Claude was introducing nasty bugs.

More recently, I tried the same experiment, again with Claude. I used the exact same prompt. This time, the aim was exactly correct. Instead of spending my time trying to correct it, I was able to ask it to add features. I've spent more time writing this comment on HN than I spent optimizing this toy. https://claude.ai/public/artifacts/d7f1c13c-2423-4f03-9fc4-8...

My point is that AI-assisted coding has improved dramatically in the past few months. I don't know whether it can reason deeply about things, but it can certainly imitate a human who reasons deeply. I've never seen any technology improve at this rate.

Kiro|7 days ago

> but I've never seen it develop something more than trivial correctly.

What are you working on? I personally haven't seen LLMs struggle with any kind of problem in months. Legacy codebase with great complexity and performance-critical code. No issue whatsoever regardless of the size of the task.

nojito|8 days ago

>I've never seen it develop something more than trivial correctly.

This is 100% incorrect, but the real issue is that the people who are using these llms for non-trivial work tend to be extremely secretive about it.

For example, I view my use of LLMs to be a competitive advantage and I will hold on to this for as long as possible.

jamesmcq|8 days ago

The key part of my comment is "correctly".

Does it write maintainable code? Does it write extensible code? Does it write secure code? Does it write performant code?

My experience has been it failing most of these. The code might "work", but it's not good for anything more than trivial, well defined functions (that probably appeared in it's training data written by humans). LLMs have a fundamental lack of understanding of what they're doing, and it's obvious when you look at the finer points of the outcomes.

That said, I'm sure you could write detailed enough specs and provide enough examples to resolve these issues, but that's the point of my original comment - if you're just writing specs instead of code you're not gaining anything.