
Sccache – Shared Compilation Cache

59 points | gjvc | 3 years ago | github.com

23 comments

meinersbur | 3 years ago
My first contribution to a Rust codebase was <https://github.com/mozilla/sccache/commit/da2934fcc2ed2a4ae2...>. It adds the -fminimize-whitespace flag to the preprocessor command when available (new in Clang 14, <https://releases.llvm.org/14.0.0/tools/clang/docs/ReleaseNot...>). The equivalent ccache patch is still pending: <https://github.com/ccache/ccache/pull/815>.

When using the disk cache, ccache is still faster on cache hits because it also checks the hashes of all input files (recorded in what it calls a manifest) before even executing the preprocessor. It can also just clone or hardlink the file from the cache instead of copying it.
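
A rough sketch of the manifest idea, as a toy model rather than ccache's actual on-disk format or code. The class and function names (`ManifestCache`, `file_hash`, `demo`) are made up for illustration, and the result key here is simply a hash of the stored bytes, which is an assumption of the sketch. The point is that a hit only needs to re-hash the recorded input files, so the preprocessor never runs:

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


class ManifestCache:
    """Toy model (hypothetical): one manifest per (source file, compiler args)."""

    def __init__(self):
        self.manifests = {}  # direct key -> list of (input-hash dict, result key)
        self.results = {}    # result key -> cached object bytes

    def _direct_key(self, source, args):
        # Key built only from the source file's hash and the arguments;
        # no preprocessing is involved.
        h = hashlib.sha256()
        h.update(file_hash(source).encode())
        h.update("\0".join(args).encode())
        return h.hexdigest()

    def lookup(self, source, args):
        """Return cached bytes if every recorded input file still matches."""
        entries = self.manifests.get(self._direct_key(source, args), [])
        for inputs, result_key in entries:
            if all(os.path.exists(p) and file_hash(p) == digest
                   for p, digest in inputs.items()):
                return self.results[result_key]
        return None  # a real tool would now fall back to preprocessing

    def store(self, source, args, includes, obj):
        """Record the hashes of the included files alongside the result."""
        inputs = {p: file_hash(p) for p in includes}
        result_key = hashlib.sha256(obj).hexdigest()  # assumption of this sketch
        self.results[result_key] = obj
        key = self._direct_key(source, args)
        self.manifests.setdefault(key, []).append((inputs, result_key))


def demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "main.c")
        hdr = os.path.join(d, "util.h")
        with open(src, "w") as f:
            f.write('#include "util.h"\nint main(void) { return N; }\n')
        with open(hdr, "w") as f:
            f.write("#define N 0\n")

        cache = ManifestCache()
        args = ["-O2", "-c"]
        first = cache.lookup(src, args)            # nothing stored yet
        cache.store(src, args, [hdr], b"fake-object")
        second = cache.lookup(src, args)           # inputs unchanged
        with open(hdr, "w") as f:
            f.write("#define N 1\n")
        third = cache.lookup(src, args)            # header changed
        return first, second, third
```

In this sketch the second lookup is a hit and the third is a miss, because the header's hash no longer matches the manifest entry.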

maxfan8 | 3 years ago
Hi, Michael! Nice to see you on HN and talking about that PR :).
pianoben | 3 years ago
I was a very happy user of sccache - it took some big CI builds from ~10 to 1.5 minutes on average. We had to add an Azure backend to it, but the code is very well-organized and it was pretty easy to hack on.

I don't work in native languages these days, but if I do again, I'll definitely reach for this.

Dowwie | 3 years ago
There's a growing list of projects written in Rust that can't benefit from sccache. It would be helpful if people made clear what sccache is not good for, so that others can stop spending needless hours rediscovering it on their own.
Kinrany | 3 years ago
Any specific examples?
pabs3 | 3 years ago
It would be interesting to have this at Internet scale: everyone on the planet who is building software would share hashes of the code they built, the binaries they built, and the compiler details.
throwamon | 3 years ago
If you haven't already, check out Nix/Guix.
encryptluks2 | 3 years ago
I had a similar idea in the past of a distributed compiler that scales across multiple machines to improve build times. This is great work, and I'm excited to see it becoming more prominent.
dboreham | 3 years ago
For me the existence of "build caching" schemes is indicative that something's wrong with the tool chain or its users and that modularity hasn't been properly implemented.
kazinator | 3 years ago
While build caching could help mask problems caused by poor modularity, such as the same source file being built multiple times in different subdirectories of a build rather than just once, that's really not what it's for.

It solves the problem that the toolchain doesn't remember what it has already built; if you give it the same inputs, it will compile them every time, taking the same time as before.

Caching lets you do a clean rebuild in a newly spun up environment with a new checkout of the source code, while saving time by re-using pieces that have not changed from another build (not necessarily identical to that one).

Yes, there could be less of a need for caching if incremental builds were rigorously reliable. Every instance of a CI server could then just update the same repository in place with new commits, and run an incremental build. But caching would still help with that. For instance, if a commit happens to revert a file to a prior state, caching will pick up on that and pull out the prior object file for it.

When you use caching in a private repository where you have reliable incremental builds, you still see an improvement. For instance, when you throw away some experimental code, returning files to a prior state, and run a build, the object files just come blazing out of the cache.

When you do a "git bisect" to find a bug, same thing: the old commits build really fast.
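
The revert and bisect cases above can be sketched as a content-addressed cache. This is a minimal illustration, not how sccache or ccache is actually implemented: `CompileCache`, `fake_compile`, and the `"clang-14"` identifier string are all hypothetical, and the "compiler" is a stand-in function. The key is a hash of everything that affects the output, so returning a file to an earlier state reproduces an earlier key:

```python
import hashlib


class CompileCache:
    """Toy cache (hypothetical) keyed on compiler identity, flags, and source text."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn
        self.store = {}    # key -> object bytes
        self.compiles = 0  # how many times the real compile step ran

    def key(self, compiler_id, flags, source_text):
        h = hashlib.sha256()
        for part in (compiler_id, "\0".join(flags), source_text):
            data = part.encode()
            # Length-prefix each part so adjacent parts cannot blur together.
            h.update(len(data).to_bytes(8, "big"))
            h.update(data)
        return h.hexdigest()

    def build(self, compiler_id, flags, source_text):
        k = self.key(compiler_id, flags, source_text)
        if k not in self.store:
            # Cache miss: run the expensive step and remember the result.
            self.compiles += 1
            self.store[k] = self.compile_fn(source_text)
        return self.store[k]


def fake_compile(source_text):
    """Stand-in for a real compiler invocation."""
    return ("obj:" + source_text).encode()


def demo():
    cache = CompileCache(fake_compile)
    v1 = "int f(void) { return 1; }"
    v2 = "int f(void) { return 2; }"
    # Build v1, edit to v2, then revert to v1 (as after discarding an experiment).
    for text in (v1, v2, v1):
        cache.build("clang-14", ["-O2"], text)
    return cache.compiles
```

Three builds trigger only two real compiles here; the third is served from the cache because its inputs hash to the same key as the first.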

saghm | 3 years ago
It's worth noting that sccache is specifically designed to support sharing caches across machines, not just locally. I don't really see what the benefit would be of a language having first-class support for using a cache from S3 or whatever, instead of it just being a third-party tool.
jonstewart | 3 years ago
Have you worked in a compiled language?
krimbus | 3 years ago
Compilation can be a big overhead on C++ codebases even when plenty of care is taken with regard to modularity. Projects that are heavy on templates usually benefit a lot from compile caching mechanisms.