exfalso|19 days ago

Exact same experience.

Here's what I find Claude Code (Opus) useful for:

1. Copy-pasting existing working code with small variations. If the intended variation is bigger then it fails to bring productivity gains, because it's almost universally wrong.

2. Exploring unknown code bases. Previously I had to curse my way through code reading sessions, now I can find information easily.

3. Google Search++, e.g. for deciding on tech choices. Needs a lot of hand holding though.

... that's it? Any time I tried doing anything more complex I ended up scrapping the "code" it wrote. It always looked nice though.

enraged_camel|19 days ago

>> 1. Copy-pasting existing working code with small variations. If the intended variation is bigger then it fails to bring productivity gains, because it's almost universally wrong.

This does not match my experience. At all. I can throw extremely large and complex things at it and it nails them with very high accuracy and precision in most cases.

Here's an example: when Opus 4.5 came out I used it extensively to migrate our database and codebase from a one-Postgres-schema-per-tenant architecture to a single-schema architecture. We are talking about eight years' worth of database operations over about two dozen interconnected and complex domains. The task spanned migrating data out of 150 database tables for each tenant schema, then validating the integrity at the destination tables, plus refactoring the entire backend codebase (about 250k lines of code), plus all of the test suite. On top of that, there were also API changes that necessitated lots of tweaks to the frontend.
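(To make the "validating the integrity at the destination tables" step concrete, here is a minimal sketch of one way such a check could look. It is not the code from the migration described above. The names `table_fingerprint` and `validate_tenant`, the in-memory rows, and the assumption that the shared table keeps `tenant_id` as its first column are all hypothetical.)

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, digest) for an iterable of row tuples, ignoring row order."""
    # Sorting the encoded rows makes the digest independent of row order.
    encoded = sorted(repr(tuple(row)).encode("utf-8") for row in rows)
    digest = hashlib.sha256()
    for item in encoded:
        digest.update(item)
        digest.update(b"\n")
    return len(encoded), digest.hexdigest()


def validate_tenant(source_rows, dest_rows, tenant_id):
    """Compare one tenant's source table with its slice of the shared table.

    Assumption (hypothetical layout): the shared table stores tenant_id
    as the first column, followed by the original columns unchanged.
    """
    migrated = [row[1:] for row in dest_rows if row[0] == tenant_id]
    return table_fingerprint(source_rows) == table_fingerprint(migrated)


# Made-up rows standing in for one per-tenant table and the shared table.
source = [(1, "alice"), (2, "bob")]
shared = [("acme", 2, "bob"), ("acme", 1, "alice"), ("globex", 1, "carol")]

ok = validate_tenant(source, shared, "acme")     # same rows, different order
bad = validate_tenant(source, shared, "globex")  # different rows
```

(In a real Postgres migration the same count-plus-checksum comparison would more likely be pushed into SQL; the in-memory version only shows the shape of the check.)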

This is a project that would have taken me 4-6 months easily and the extreme tediousness of it would probably have burned me out. With Opus 4.5 I got it done in a couple of weeks, mostly nights and weekends. Over many phases and iterations, it caught, debugged and fixed its own bugs related to the migration and data validation logic that it wrote, all of which I reviewed carefully. We did extensive user testing afterwards and found only one issue, and that was actually a typo that I had made while tweaking something in the API client after Opus was done. No bugs after go-live.

So yeah, when I hear people say things like "it can only handle copy paste with small variations, otherwise it's universally wrong" I'm always flabbergasted.

exfalso|19 days ago

Interesting. I've had it fail on much simpler tasks.

Example: was writing a flatbuffers routine which translated a simple type schema to fbs reflection schema. I was thinking well this is quite simple, surely Opus would have no trouble with it.

Output looked reasonable, compiled... and was completely wrong. It seemed to just output random but reasonable-looking indices and offsets. In one part of the code it also inserted a literal TODO saying "someone who understands fbs reflection should write this". Had to write it from scratch.

Another example: was writing a fuzzer for testing a certain computation. In this case, there was existing code to look at (working fuzzers for slightly different use cases), but the main logic had to be somewhat different. Opus managed to do the copy paste and then messed up the only part where it had to be a bit more creative. Again, that shows where it starts breaking. Overall I actually considered this a success, because I didn't have to deal with the "boring" bit.

Another example: colleague was using Claude to write a feature that output some error information from an otherwise completely encrypted computation. Claude proceeded to insert a global backdoor into the encryption, only caught in review. The inserted comments even explained the backdoor.

I would describe a success story if there were one. But aside from throwing together simple React frontends and SQL queries (highly copy-pasteable recurring patterns in the training set) I had literally zero success. There is an invisible ceiling.

yencabulator|13 days ago

I find LLMs to be absolutely the worst at "take this content and put (a copy) there" tasks. They subtly mutate the content while doing that! I keep having to e.g. restore some explanatory comments.
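(One way to catch this kind of drift mechanically is to check that the moved block still appears verbatim in the destination, and to diff it when it does not. A minimal sketch using only the standard library follows. The helper names `copied_verbatim` and `show_drift` and the sample strings are made up for illustration.)

```python
import difflib


def copied_verbatim(original_block, destination_text):
    """True if the block appears character-for-character in the destination."""
    return original_block in destination_text


def show_drift(original_block, copied_block):
    """Return unified-diff lines describing how the copy differs from the original."""
    return list(
        difflib.unified_diff(
            original_block.splitlines(),
            copied_block.splitlines(),
            fromfile="original",
            tofile="copy",
            lineterm="",
        )
    )


# Made-up example: the comment was reworded during the "copy".
original = "# Retry twice: the upstream API is flaky.\nretry(2)"
copy = "# Retry twice because the API is flaky.\nretry(2)"

intact = copied_verbatim(original, "x = 1\n" + copy)
drift = show_drift(original, copy)
```

(Here `intact` is False and `drift` contains the removed and added comment lines, which is enough to spot a reworded comment before it gets committed.)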