The more code is out there, the worse the average quality of the training dataset. There will be legacy approaches and APIs, poor design choices, popular use cases irrelevant to your context, etc., all of which increase the chances of the output not matching your expectations. In the Java world this is exactly how it works: I need 3-5 iterations with Claude to get things done the way I expect, sometimes jumping straight to manual refactoring and then returning the result to Claude for review and learning. My CLAUDE.md files (multiple of them) are growing big with all the patterns and anti-patterns identified this way. To overcome this problem the model needs specialized training, which I don't think the industry knows how to approach (it has to beat the effort put into the education system for humans).
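To make the pattern/anti-pattern idea concrete, a CLAUDE.md entry accumulated this way might look something like the following (a hypothetical sketch; the specific rules are illustrative, not from my actual files):

```markdown
## Java conventions

- Use `java.time` (`Instant`, `LocalDate`); never `java.util.Date` or `Calendar`.
- Return `Optional<T>` from repository lookups; do not return `null`.
- Anti-pattern: field injection with `@Autowired` — use constructor injection.
```

Each rule exists because the model repeatedly produced the legacy form until it was written down.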
re-thc|14 hours ago
We already have coding-tuned models, e.g. Codex. We should just have language- / technology-specific models with a focus on recent / modern usage.
The problem with something like Java is that it's too old -- too many variants. Make a cutoff, say at least above Java 8, or 17.
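The variant problem is easy to see in practice. A minimal sketch of the same operation in the pre-Java-8 style a model trained on old code may still emit, next to the modern idiom (the class name `IdiomShift` is just for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class IdiomShift {
    // Pre-Java-8 style: explicit loop and manual accumulation.
    static List<String> upperOld(List<String> names) {
        List<String> out = new ArrayList<String>();
        for (String name : names) {
            out.add(name.toUpperCase());
        }
        return out;
    }

    // Java 8+ streams; Stream.toList() requires Java 16+.
    static List<String> upperNew(List<String> names) {
        return names.stream().map(String::toUpperCase).toList();
    }

    public static void main(String[] args) {
        List<String> names = List.of("ada", "alan");
        System.out.println(upperOld(names)); // [ADA, ALAN]
        System.out.println(upperNew(names)); // [ADA, ALAN]
    }
}
```

Both versions are "correct Java", which is exactly why a cutoff by version would change what a tuned model prefers to generate.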
ivan_gammel|4 hours ago
The “just” part is a big assumption. It is far from easy, given that modern best practices are always underspecified. An effective coding model must weight reasoning signals much more strongly than memorized coding patterns, and that, I suspect, requires a very different architecture.