A computer generating a compiler is nothing new. Unzip has done this many, many times. The key difference is that unzip extracts data from an archive in a deterministic way, while LLMs recover data from the training dataset using a lossy statistical model. Add a feedback loop and a rich test suite to that, and you get exactly what Anthropic has achieved.
While I agree that the technology behind this is impressive, the biggest issue is license infringement. Everyone knows there's GPL code in the training data, yet there's no trace of acknowledgment of the original authors.
It's already bad enough that people are using non-GPL compilers like LLVM (which make malicious behavior like proprietary, incompatible forks possible), so yet another compiler not under the GPL, one that even AI-washes GPL code, is not a good thing.
These tools do not compete against the lonely programmer who writes everything from scratch; they compete with the existing tooling. Compiler generators already existed 5 years ago, as they did in the previous decades. That is a solved problem. People still like to hand-roll their parsers, not because generating them wouldn't work, but because hand-rolling has other benefits (maintainability, adaptation, better diagnostics). Perfectly fine working code is routinely thrown away and reimplemented because there are not enough people around anymore who know the code by heart. "The big Rewrite" is a meme for a reason.
That’s not true. It didn’t have access to the internet and no LLM has the fidelity to reproduce code verbatim from its training data at the project level.
In this case, it’s true that compilers were in its training data, but they only helped at the conceptual level; it was not spitting out verbatim gcc code.
Yeah, it's pretty amazing it can do this. The problem is the gaslighting by the companies making this: "See, we can create compilers, we won't need programmers." Programmers: "This is crap, are you insane?" Classic gaslighting.
It’s giving you an idea of what Claude is capable of - creating a project of the complexity of a small compiler. I don’t know if it can replace programmers, but it can definitely handle tasks of smaller complexity autonomously.
eb08a167|21 days ago
m4rtink|20 days ago
vidarh|20 days ago
1718627440|20 days ago
mdavid626|21 days ago
What’s the big deal about that?
simianwords|21 days ago
chadcmulligan|21 days ago
simianwords|21 days ago
player1234|21 days ago
[deleted]