top | item 47205117

(no title)

ptnpzwqd | 1 day ago

I have used Claude (incl. Opus 4.6) fairly extensively, and Claude still spits out quality that is far below what I would call production ready - both littered with smaller issues, but also the occasional larger blunder. Particularly when doing anything non-trivial, and even when guiding it in detail (although that admittedly reduces the amount of larger structural issues).

Maybe it is tech stack dependent (I have mostly used it with C#/.NET), but I have heard people say the same for C#. The only conclusion I have been able to draw from this, is that people have very different definitions of production ready, but I would really like to see some concrete evidence where Claude one-shots a larger/complex C# feature or the like (with or without detailed guidance).

discuss

KellyCriterion|1 day ago

> C#/.NET

same here :)

> one-shots a larger/complex C# feature

I can show you a timeseries data-renderer which was created with 1 initial very large prompt and then 3 following "change this and that" prompts. The file is around 5000 lines and everything works fine & exactly as specified.

allajfjwbwkwja|23 hours ago

> The file is around 5000 lines

Yep, this is another case of different standards for "production ready."

ptnpzwqd|1 day ago

Feel free to share it, would be very curious - ideally alongside the prompts.

peteforde|1 day ago

I see this over and over again. I don't dispute your experience. My experience with ESP32 development has been unreasonably positive. My codebase is sitting around 600k LoC and is the product of several hundred Opus 4.x Plan -> Agent -> Debug loops. I review everything that goes through, but I'm reviewing the business logic and domain gotchas, not dumb crap like what you and so many others describe.

What is so strange to me is that surely there is more C# out there than ESP-IDF code? I don't have a good explanation beyond saying that my codebase is extensively tested and used; I would know very quickly if it suddenly started shitting the bed in the way you explain.

whaleidk|1 day ago

600k lines of code for anything on the ESP32 sounds like the absolute polar opposite of “good”

ivan_gammel|1 day ago

The more code is out there, the worse is the average in the training dataset. There will be legacy approaches and APIs, poor design choices, popular use cases irrelevant for your context etc that increase the chances of output not matching your expectations. In Java world this is exactly how it works. I need 3-5 iterations with Claude to get things done the way I expect, sometimes jumping straight to manual refactoring and then returning the result to Claude for review and learning. My CLAUDE.md (multiple of them) are growing big with all patterns and anti-patterns identified this way. To overcome this problem model needs specialized training, that I don‘t think the industry knows how to approach (it has to beat the effort put in the education system for humans).

xienze|1 day ago

> My experience with ESP32 development has been unreasonably positive. My codebase is sitting around 600k LoC and is the product of several hundred Opus 4.x Plan -> Agent -> Debug loops.

I feel like this is an example of people having different standards of what “good” code is and hence the differing opinions of how good these tools are. I’m not an embedded developer but 600K LOC seems like a lot in that context, doesn’t it? Again I could be way off base here but that sounds like there must be a lot of spaghetti and copy-paste all over the codebase for it to end up that large.

je42|1 day ago

Interesting - what kind of structural issues have you encountered?

Is these more related to the existing source code or is this a bad pattern thar you would never do regardless of the existing code?

skeledrew|1 day ago

I don't get it though. Why do you expect perfect responses? Humans continually make mistakes, and AI is trained on human data. Yet there seems to be this higher bar of expectation for the latter. Somehow people expect this thing that's been around for a few weeks/months, and cannot learn anything more beyond its training cutoff date, to always do a better job than a human who's been around for 20+ years and is able to learn on their own until death.

ptnpzwqd|1 day ago

I don't expect that - am merely responding to the parent comments claim that Claude consistently one-shots production ready code (which does not at all match my observations).

huflungdung|1 day ago

[deleted]