top | item 43715593


HippoBaro | 10 months ago

Eminently pragmatic solution — I like it. In Rust, a crate is a compilation unit, and the compiler has limited parallelism opportunities, especially since rustc offloads much of the work to LLVM, which is largely single-threaded.

It’s not surprising they didn’t see a linear speedup from splitting into so many crates. The compiler now produces a large number of intermediate object files that must be read back and linked into the final binary. On top of that, rustc caches a significant amount of semantic information — lifetimes, trait resolutions, type inference — much of which now has to be recomputed for each crate, including dependencies. That introduces a lot of redundant work.
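For anyone wanting to check where the time actually goes in a setup like this, Cargo's built-in timings report breaks the build down per crate (a sketch; `--timings` is stable in recent Cargo versions):

```shell
# Generate an HTML report of per-crate compile times and build parallelism
cargo build --release --timings
# The report is written under target/cargo-timings/
```

The report makes it easy to see whether the wall-clock time is dominated by a few large crates, by linking, or by poor parallelism across the crate graph.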

I'd also expect this to hurt runtime performance, since it likely reduces inlining opportunities (unless LTO is really good now?).
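For what it's worth, cross-crate inlining can be recovered at link time. A sketch of the relevant Cargo profile settings (assuming the binary is built with the release profile):

```toml
[profile.release]
lto = "fat"        # whole-program LTO across every crate in the graph
codegen-units = 1  # fewer codegen units gives LLVM more inlining scope
```

Small hot functions can also be marked `#[inline]` so they remain inlinable across crate boundaries even without LTO, at the cost of some extra compile time in downstream crates.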



JJJollyjim | 10 months ago

They mention that compiling one crate at a time (-j1) doesn't give the 7x slowdown, which rules out the object-file and caching-in-rustc theories... I think the only explanation is that the rustc processes are sharing limited L3 cache.
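A sketch of the experiment being described, assuming a standard Cargo project:

```shell
cargo clean && time cargo build          # default: up to one rustc per core
cargo clean && time cargo build -j 1     # serialized: one rustc at a time
```

If the per-crate slowdown disappears at -j1, the remaining suspects are resources the concurrent rustc processes share, such as L3 cache and memory bandwidth, rather than per-crate overheads like object-file I/O.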

lsuresh | 10 months ago

The L3 cache angle is one of our hypotheses too. But it doesn't seem like we can do much about it.