nayroclade | 21 days ago

This isn't really true, though. Pre-training for coding models is just a mass of scraped source code, but post-training is more than simply generating code that compiles. It includes extensive reinforcement learning on curated software-engineering tasks designed to teach what high-quality code looks like and to improve abilities like debugging, refactoring, and tool use. A minimal sketch of the execution-based reward idea is below.
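
To make the post-training part concrete, here's a minimal sketch of that execution-based reward idea, assuming a simple pass/fail signal from running a task's tests. The actual reward models and task curation used by the labs aren't public, and all names here are made up for illustration:

    import subprocess
    import sys
    import tempfile

    # Hypothetical reward for a coding task: score a candidate solution by
    # whether it passes the task's tests. Real pipelines are far richer
    # (curated tasks, reward models, tool-use traces); this only sketches
    # the "reward from execution" part.

    def reward(candidate_code: str, test_code: str) -> float:
        """Return 1.0 if the candidate passes the supplied tests, else 0.0."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(candidate_code + "\n\n" + test_code)
            path = f.name
        result = subprocess.run([sys.executable, path], capture_output=True, timeout=10)
        return 1.0 if result.returncode == 0 else 0.0

    # Example task: implement `add`; the tests define success.
    candidate = "def add(a, b):\n    return a + b\n"
    tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
    print(reward(candidate, tests))  # 1.0 -> this sample would be reinforced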

softwaredoug | 21 days ago

And also a lot of Claude Code users' data. That telemetry is invaluable.

sarchertech | 21 days ago

Yeah, but how is that any different? The vast majority of prompts will be either for failed experiments, for one-off scripts where no one cares about code quality, or from below-average developers who don't understand code quality. Anthropic doesn't know how to filter telemetry for code we want AI to emulate.

sarchertech | 21 days ago

There's no objective measure of high-quality code, so I don't think model creators are going to be particularly good at screening for it.
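
A rough illustration of the kind of proxy filter that is feasible (function names and thresholds are invented here); it shows why "screening for quality" tends to reduce to crude, checkable signals rather than real judgment:

    import ast

    # Hypothetical proxy filter: cheap, checkable signals (parses cleanly,
    # functions have docstrings, functions aren't too long) standing in for
    # "quality". A real pipeline would be far more elaborate, but it would
    # still be built from proxies like these.

    def passes_proxy_filter(source: str, max_func_lines: int = 50) -> bool:
        try:
            tree = ast.parse(source)
        except SyntaxError:
            return False  # doesn't even parse
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                if ast.get_docstring(node) is None:
                    return False  # undocumented function
                if node.end_lineno - node.lineno > max_func_lines:
                    return False  # suspiciously long function
        return True

    print(passes_proxy_filter('def f(x):\n    """Double x."""\n    return 2 * x\n'))  # True
    print(passes_proxy_filter("def g(x): return 2 * x"))  # False: no docstring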