ndesaulniers | 24 days ago
This LLM did it in (checks notes):
> Over nearly 2,000 Claude Code sessions and $20,000 in API costs
It may build, but does it boot? (That was also a significant and distinct milestone. Also, will it blend?) Looks like yes!
> The 100,000-line compiler can build a bootable Linux 6.9 on x86, ARM, and RISC-V.
The next milestone is:
Is the generated code correct? The jury is still out on that one for production compilers. And then you have performance of generated code.
> The generated code is not very efficient. Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.
Still a really cool project!
shakna|24 days ago
Does it really boot...?
ndesaulniers|24 days ago
They don't need 16b x86 support for the RISCV or ARM ports, so yes, but depends on what 'it' we're talking about here.
Also, FWIW, GCC doesn't directly assemble to machine code either; it shells out to GAS (GNU Assembler). This blog post calls it "GCC assembler and linker" but to be more precise the author should edit this to "GNU binutils assembler and linker." Even then GNU binutils contains two linkers (BFD and GOLD), or did they excise GOLD already (IIRC, there was some discussion a few years ago about it)?
TheCondor|24 days ago
brundolf|24 days ago
Not to invalidate this! But it's toward the "well-suited for AI" end of the spectrum
HarHarVeryFunny|23 days ago
It's notable that the article says Claude was unable to build a working assembler (& linker), which is nominally a much simpler task than building a compiler. I wonder if this was at least in part due to not having a test suite, although it seems one could be auto generated during bootstrapping with gas (GNU assembler) by creating gas-generated (asm, ELF) pairs as the necessary test suite.
It does beg the question of how they got the compiler to the point of generating a valid C -> asm mapping before tackling the issue of gcc compatibility, since the generated code apparently has no relation to what gcc generates. I wonder which compilers' source code Claude has been trained on, and how closely this compiler's code generation and attempted optimizations compare to those?
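The gas-generated (asm, ELF) pairs idea could be sketched roughly as below. This is only an illustration, not what the article's authors did: the function names are hypothetical, it assumes a GNU `as` binary on the PATH for the reference side, and it compares object files byte for byte, which is stricter than a real harness would probably want (two valid assemblers can lay out an ELF file differently).

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Assembler = Callable[[str], bytes]  # assembly text in, object file bytes out


def assemble_with_gas(asm_text: str) -> bytes:
    """Reference side: assemble one snippet with the system `as` (assumed to be GNU as)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "case.s"
        obj = Path(tmp) / "case.o"
        src.write_text(asm_text)
        subprocess.run(["as", str(src), "-o", str(obj)], check=True)
        return obj.read_bytes()


def build_golden(cases: Dict[str, str], reference: Assembler) -> Dict[str, str]:
    """Record the SHA-256 of the reference assembler's output for every case."""
    return {name: hashlib.sha256(reference(text)).hexdigest()
            for name, text in cases.items()}


def failing_cases(cases: Dict[str, str], golden: Dict[str, str],
                  candidate: Assembler) -> List[str]:
    """Return the names of cases where the candidate's output differs from the golden hash."""
    return [name for name, text in cases.items()
            if hashlib.sha256(candidate(text)).hexdigest() != golden[name]]
```

In use, `build_golden(cases, assemble_with_gas)` would be run once, and `failing_cases` would then be the pass/fail signal for the assembler under development.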
unknown|23 days ago
[deleted]
qarl|24 days ago
Yeah. This test sorta definitely proves that AI is legit. Despite the millions of people still insisting it's a hoax.
The fact that the optimizations aren't as good as the 40-year gcc project? Eh - I think people who focus on that are probably still in some serious denial.
PostOnce|24 days ago
It cost $20,000 and it worked, but it's also totally possible to spend $20,000 and have Claude shit out a pile of nonsense. You won't know until you've finished spending the money whether it will fail or not. Anthropic doesn't sell a contract that says "We'll only bill you if it works" like you can get from a bunch of humans.
Do catastrophic bugs exist in that code? Who knows, it's 100,000 lines, it'll take a while to review.
On top of that, Anthropic is losing money on it.
All of those things combined, viability remains a serious question.
thesz|24 days ago
The "out of distribution" test would be like "implement (self-bootstrapping, Linux kernel compatible) C compiler in J." J is different enough from C and I know of no such compiler.
LinXitoW|24 days ago
soperj|24 days ago
cardanome|23 days ago
Writing a toy C compiler isn't that hard. Any decent programmer can write one in a few weeks or months. The optimizations are the actually interesting part and Claude fails hard at that.
kvemkon|24 days ago
with all optimizations disabled:
> Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.
dwaite|22 days ago
It is not feasible that someone will use AI to tackle genuinely new software and provide a tenth of the level of guide-rails Anthropic had for this project. They were able to keep the million monkeys on their million typewriters on an extremely short leash, and able to have it do the vast majority of iteration without human intervention.
byzantinegene|24 days ago
wqaatwt|22 days ago
Do we know it just didn’t shuffle gcc’s source code around a bit?
miohtama|24 days ago
qarl|23 days ago
[deleted]
ip26|24 days ago
byzantinegene|24 days ago
9rx|24 days ago
How much of that time was spent writing the tests that they found to use in this experiment? You (or someone like you) were a major contributor to this. All Opus had to do here was keep brute forcing a solution until the tests passed.
It is amazing that it is possible at all, but it remains an impossibility without a heavy human hand. One could easily still spend a good part of their career reproducing this if they first had to rewrite all of the tests from scratch.
beambot|24 days ago
bopbopbop7|24 days ago
ndesaulniers|24 days ago
https://llvm.org/docs/MLGO.html
int_19h|24 days ago
Intuitively it feels like it should be a straightforward training setup - there's lots of code out there, so compile it with various compilers, flags etc and then use those pairs of source+binary to train the model.
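A minimal sketch of the data-generation half of that setup, assuming hypothetical file names and a small set of compilers and flags: it only builds the list of compile commands, one per (source, compiler, flags) combination, and leaves running them and pairing each source with its binary to the caller.

```python
import itertools
from typing import List, Sequence, Tuple

Job = Tuple[List[str], str]  # (command line, output path)


def compile_matrix(sources: Sequence[str],
                   compilers: Sequence[str],
                   flag_sets: Sequence[Tuple[str, ...]]) -> List[Job]:
    """Build one compile command per (source, compiler, flags) combination."""
    jobs = []
    for src, cc, flags in itertools.product(sources, compilers, flag_sets):
        # Encode the compiler and flags in the output name so pairs stay distinguishable.
        tag = "_".join(f.lstrip("-") for f in flags) or "default"
        out = f"{src.rsplit('.', 1)[0]}.{cc}.{tag}.o"
        jobs.append(([cc, *flags, "-c", src, "-o", out], out))
    return jobs


if __name__ == "__main__":
    # Illustrative inputs only; each command could be run with subprocess.run(cmd, check=True).
    for cmd, out in compile_matrix(["foo.c"], ["gcc", "clang"], [("-O0",), ("-O2",)]):
        print(" ".join(cmd))
```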
jojobas|24 days ago
psychoslave|24 days ago
sandinmyjoints|24 days ago
greenavocado|24 days ago
andai|24 days ago
dnautics|24 days ago
iberator|24 days ago
Real usable AI would create it with simple: 'make c compilers c99 faster than GCC'.
AI usage should be banned in general. It takes jobs faster than creating new ones ..
arcanemachiner|24 days ago
embedding-shape|24 days ago
I don't have a strong opinion about that in either direction, but curious: Do you feel the same about everything, or is it just about this specific technology? For example, should the nail gun have been forbidden if it was invented today, as one person with a nail gun could probably replace 3-4 people with normal "manual" hammers?
You feel the same about programmers who are automating others out of work without the use of AI too?
wiseowise|24 days ago
You think compiler engineer from Google gives a single shit about this?
They’ll automate millions out of career existence for their amusement while cashing out stock money and retiring early comfortably.
benterix|24 days ago
I have no problems with tech making some jobs obsolete, that's normal. The problem is, the jobs being done with the current generation of LLMs are, at least for now, mostly of inferior quality.
The tools themselves are quite useful as helpers in several domains if used wisely though.
7thpower|24 days ago
unglaublich|24 days ago
MaskRay|24 days ago
make O=/tmp/linux/x86 ARCH=x86_64 CC=/tmp/p/claudes-c-compiler/target/release/ccc -j30 defconfig all
```
/home/ray/Dev/linux/arch/x86/include/asm/preempt.h:44:184: error: expected ';' after expression before 'pto_tmp__'
do { u32 pto_val__ = ((u32)(((unsigned long) ~0x80000000) & 0xffffffff)); if (0) { __typeof_unqual__((__preempt_count)) pto_tmp__; pto_tmp__ = (~0x80000000); (void)pto_tmp__; } asm ("and" "l " "%[val], " "%" "[var]" : [var] "+m" (((__preempt_count))) : [val] "ri" (pto_val__)); } while (0);
^~~~~~~~~
fix-it hint: insert ';'
/home/ray/Dev/linux/arch/x86/include/asm/preempt.h:49:183: error: expected ';' after expression before 'pto_tmp__'
do { u32 pto_val__ = ((u32)(((unsigned long) 0x80000000) & 0xffffffff)); if (0) { __typeof_unqual__((__preempt_count)) pto_tmp__; pto_tmp__ = (0x80000000); (void)pto_tmp__; } asm ("or" "l " "%[val], " "%" "[var]" : [var] "+m" (((__preempt_count))) : [val] "ri" (pto_val__)); } while (0);
^~~~~~~~~
fix-it hint: insert ';'
/home/ray/Dev/linux/arch/x86/include/asm/preempt.h:61:212: error: expected ';' after expression before 'pao_tmp__'
```
silver_sun|24 days ago
the_jends|24 days ago
ndesaulniers|23 days ago
I had to move teams twice before a third team was able to say: this work is valuable to us, please come work for us and focus just on that.
I had to organize multiple internal teams, then build an external community of contributors to collaborate on this shared common goal.
Having carte blanche to contribute to open source projects made this feasible at all; I can see that being a non-starter at many employers, sadly. Having low friction to change teams also helped a lot.
HarHarVeryFunny|23 days ago
Did this come down to making Clang 100% gcc compatible (extensions, UDB, bugs and all), or were there any issues that might be considered as specific to the linux kernel?
Did you end up building a gcc compatibility test suite as a part of this? Did the gcc project themselves have a regression/test suite that you were able to use as a starting point?
ndesaulniers|23 days ago
Some were necessary (asm goto), some were not (nested functions, flexible array members not at the end of structs).
> UDB, bugs and all
Luckily, the kernel didn't intentionally rely on GCC specifics this way. Where it did unintentionally, we fixed the kernel sources properly with detailed commit messages explaining why.
> or were there any issues that might be considered as specific to the linux kernel?
Yes, https://github.com/ClangBuiltLinux/linux/issues is our issue tracker. We use tags extensively to mark if we triage the issue to be kernel-side vs toolchain-side.
> Did you end up building a gcc compatibility test suite as a part of this?
No, but some tricky cases LLVM got wrong were distilled from kernel sources using either:
- creduce
- cvise (my favorite)
- bugpoint
- llvm-reduce
and then added to LLVM's existing test suite. Many such tests were also simply manually written.
> Did the gcc project themselves have a regression/test suite that you were able to use as a starting point?
GCC and binutils have their own test suites. Folks in the LLVM community have worked on being able to test clang against GCC's test suite. I personally have never run GCC's test suite or looked at its sources.
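For readers unfamiliar with the reduction tools named above (creduce, cvise, bugpoint, llvm-reduce): they repeatedly shrink a failing input while an "interestingness test" keeps confirming the problem still reproduces. The sketch below shows only that general shape with a naive line-based strategy; it is not how any of those tools work internally (they use far more sophisticated, syntax-aware passes), and `reduce_lines` and the example predicate are hypothetical.

```python
from typing import Callable, List


def reduce_lines(lines: List[str],
                 interesting: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop chunks of lines for as long as `interesting` still holds."""
    if not interesting(lines):
        raise ValueError("the starting input must reproduce the problem")
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and interesting(candidate):
                lines = candidate  # the chunk was irrelevant; keep it removed
            else:
                i += chunk         # the chunk is needed; move past it
        chunk //= 2                # retry with finer-grained chunks
    return lines


if __name__ == "__main__":
    source = ["int a;", "int b;", "int bug = 1 +;", "int c;"]
    # Stand-in predicate; a real one would run the compiler and look for the specific error.
    still_fails = lambda ls: any("bug" in line for line in ls)
    print(reduce_lines(source, still_fails))
```

In practice the predicate would invoke the compiler under test and check for the exact miscompile or diagnostic, so that the reducer does not wander off toward an unrelated failure.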
TZubiri|24 days ago
It's worth noting that this was developed by compiling Linux and running tests, so at least that is part of the training set and not the testing set.
But at least for linux, I'm guessing the tests are very robust and I'm guessing that will work correctly. That said, if any bugs pop up, it will show weak points in the linux tests.
VladVladikoff|24 days ago
what is the ecological cost of producing this piece of software that nobody will ever use?
ryanjshaw|24 days ago
If you see this as part of a bigger picture to improve human industrial efficiency and bring us one step closer to the singularity? Most likely net positive.
thefounder|24 days ago
grey-area|24 days ago
I'm curious about your take on the references the GAI might have used to create such a project and whether this matters.
zaphirplane|24 days ago
ndesaulniers|24 days ago
Fixing some UB in the kernel sources, lots of plumbing to the build system (particularly making it more hermetic).
Getting the rest of the LLVM binutils substitutes to work in place of GNU binutils was also challenging. Rewriting a fair amount of 32b ARM assembler to be "unified syntax" in the kernel. Linker bugs are hard to debug. Kernel boot failures are hard to debug (thank god for QEMU+gdb protocol). Lots of people worked on many different parts here, not just me.
Evangelism and convincing upstream kernel developers why clang support was worth anyone's while.
https://github.com/ClangBuiltLinux/linux/issues for a good historical perspective. https://github.com/ClangBuiltLinux/linux/wiki/Talks,-Present... for talks on the subject. Keynoting LLVM conf was a personal highlight (https://www.youtube.com/watch?v=6l4DtR5exwo).
unknown|24 days ago
[deleted]
m463|23 days ago
wonder if clang source is part of its model :)
ur-whale|24 days ago
You do realize the LLM had access (via its training set) and "reused" (not as is, of course) your own work, right?
phillmv|24 days ago
underdeserver|24 days ago
GaggiX|24 days ago
jbjbjbjb|24 days ago
nomel|24 days ago
There's some incredible source available code out there. Statistically, I think there's a LOT more not so great source available code out there, because the majority of output of seasoned/high skill developers is proprietary.
To me, a surprising portion of Claude 4.5 output definitely looks like student homework answers, because I think that's closer to the mean of the code population.
wvenable|24 days ago
But I wonder how it would fare if given a language specification for a non-existent, non-trivial language and asked to build a compiler for that instead?
luke5441|24 days ago
nlawalker|24 days ago
computerex|24 days ago
kreelman|24 days ago
It is standing on the shoulders of giants (all of the compilers of the past, built into its training data... and the recent learnings about getting these agents to break up tasks) to get itself going. Still fairly impressive.
On a side-quest, I wonder where Anthropic is getting their power from. The whole energy debacle in the US at the moment probably means it made some CO2 in the process. Would be hard to avoid?
tdemin|24 days ago
[deleted]
eek2121|24 days ago
Granted, marketing sucks up far too much money for any startup, and again, we don't know the actual numbers in play, however, this is something to keep in mind. (The very same marketing that likely also wrote the blog post, FWIW).
willsmith72|24 days ago
but regardless, hiring is difficult and high-end talent is limited. If the costs were anywhere close to equivalent, the agents are a no-brainer
GorbachevyChase|24 days ago
bloaf|24 days ago