
The Little Things: Speeding up C++ compilation

133 points | ingve | 5 years ago | codingnest.com

82 comments

[+] modeless|5 years ago|reply
I wish someone would pick up the zapcc project. Compilers do an insane amount of duplicated and redundant work and there's tremendous potential for speeding up C++ builds if you are willing to rethink how the compiler works a little bit. https://github.com/yrnkrn/zapcc
[+] wheybags|5 years ago|reply
I work on the video game Factorio, which is a C++ project. On a 9900K a rebuild takes about a minute, so it's pretty sizeable but not something ridiculous like Oracle DB or Unreal Engine. I tried using zapcc on it, and it was a complete failure. I don't have measurements to hand, but IIRC it was actually slower than stock clang. I tested it on a Threadripper 2950X with 64 GB of RAM, running Debian.
[+] usrnm|5 years ago|reply
> Compilers do an insane amount of duplicated and redundant work

This is one of the problems that should be solved by modules in the near future.

[+] gjvc|5 years ago|reply
This is an excellent and thorough review of the details.

An almost-should-be-mandatory tool on projects of any decent size is https://ccache.dev/

[+] flohofwoe|5 years ago|reply
Bruce Dawson's Chromium build time investigation is very enlightening:

https://randomascii.wordpress.com/2020/03/30/big-project-bui...

The main takeaway:

> That is, the main source files represent just 0.32% of the lines of code being processed.

Chromium itself has about 12 million lines of code, but the compiler needs to process 3.6 billion lines of code because it needs to parse the same headers over and over again.

[+] stephc_int13|5 years ago|reply
From my experience, unity builds (a single compilation unit that #includes all source files into one main file) and orthodox C++ (avoiding templates and fancy features) give good results.
[+] barumi|5 years ago|reply
It's reasonable to assume that compilation times can go down if you decide to forego compile-time features (i.e., templates).

However, C++'s standard library is a productivity blessing. I'm not convinced of the advantage of replacing it with ad-hoc components that may or may not be buggy just because you want to shave off a few seconds of each build.

More often than not, all you have to do to cut a double-digit percentage off your compilation time is to modularize your project and follow basic principles like not having includes in your interface headers. This alone, along with PImpl, does far more for your compilation time than well-meaning but poorly thought-out trade-off decisions like not using the STL.

[+] bregma|5 years ago|reply
"Orthodox C++" without templates, language features, or the standard library is also known as "C".

I'd go one step further. Skip the compiler and just write in assembly. Your builds will be lightning fast especially if you use one single source file with no macros and static linking. You'll never ship, but you'll never ship really fast.

[+] ladberg|5 years ago|reply
That sounds like it breaks all incremental builds (correct me if I'm wrong) for relatively little benefit (skipping linking? less header parsing?). Are there scenarios where unity builds are faster than incremental builds? Or is the intended use-case only for from-scratch builds?
[+] zabzonk|5 years ago|reply
One of (many) other ways to speed up C++ (or C) builds is to use multi-processing in your make file - https://stackoverflow.com/questions/414714/compiling-with-g-... - I typically get about a 20% speed up, but obviously YMMV.
[+] bla3|5 years ago|reply
This post here recommends using ninja, which does this automatically.
[+] saagarjha|5 years ago|reply
Careful, this can break certain makefiles that are not written to take this into account.
[+] ephimetheus|5 years ago|reply
Another tip to keep compilation fast: don’t use Eigen. Non-trivial usage absolutely obliterates compile times and blows up the memory consumption by an insane amount.
[+] barumi|5 years ago|reply
Eigen is an absurdly template-intensive library, and people use it because so far there is nothing better or even remotely comparable to it.

If you use Eigen but somehow feel that compilation time is your focus instead of actually doing linear algebra or solving systems of linear equations then you have a few best practices that you can follow and techniques at your disposal to lower compilation times.

[+] happyweasel|5 years ago|reply
Hmm... What about the PImpl idiom? That should also help (and also brings its own costs).
[+] jeffbee|5 years ago|reply
Builds are just like any other software: if it takes too long, profile it. Use bazel build --profile or whatever equivalent your build system offers.
[+] gumby|5 years ago|reply
Modules are supposed to make much of this now quite useful info irrelevant.

Unfortunately modules are several years in the future, both because implementations are still lacking and because few projects are written in C++20 yet. I also harbour a suspicion that tricks like these will still be useful with modules.

[+] tomovo|5 years ago|reply
Last time I checked somebody working on modules said something along the lines of "don't expect modules to speed things up". The goals were different, apparently (stop #defines from leaking everywhere, large-project management, libraries). Has this changed, are there any benchmarks available?
[+] mhh__|5 years ago|reply
I still haven't seen a module in the wild yet; they should've been standardized 20 years ago now (D has had modules from day 1, and there are more people on the C++ committee than working on D at all). I hope they get some traction, but I just can't see it helping for all but the newest codebases.
[+] jart|5 years ago|reply
What I've found wreaks the most havoc on build times is having too many -iquote and -isystem flags. Last time I profiled TensorFlow compiles using strace, about half the gcc wall time was spent inside all the stat() system calls those flags generate.
[+] cgrealy|5 years ago|reply
One thing about forward declarations: they violate DRY.

Now, most of the time, the trade off of repeating

  class Thing;
is more than offset by faster build times.

But if

  class Thing;
is actually

  namespace Bob
  {
    namespace Dave
    {
      template <typename T>
      class ThingTrait;

      template <typename T>
      class Thing : BaseClass<T, allocator<T>, trait<ThingTrait<T> > >;
    }
  }
you really want to put that in its own header, typically "Thing_fwd.h"

(apologies for any syntax errors, been a while since I've written C++)

[+] asdfasgasdgasdg|5 years ago|reply
You're either gonna write

    #include "thing.h"
or

    class Thing;
Either way, as a practical matter, you're saying, "There's something called Thing I'm going to use." C++ forces you to say that before use in each translation unit. It's not really "repeating" yourself to write it, since you haven't "said" it yet in this TU.
[+] steerablesafe|5 years ago|reply
I prefer _fwd headers for a different reason: it's easier to modify without breaking dependencies.

If you change Thing to an alias to a class template specialization then all of the "class Thing;" forward declarations break.

[+] secondcoming|5 years ago|reply
There's a lot of money to be made by someone who writes a tool that takes in a C++ codebase and suggests all the improvements that can be made to speed the build up.
[+] m0zg|5 years ago|reply
Just use a build system in which reliable incremental builds are possible and are the default. Your compilation will take seconds most of the time.
[+] grandinj|5 years ago|reply
If you can get it to work (not guaranteed) icecream can make a very large difference.
[+] kevincox|5 years ago|reply
This appears to be referring to a distcc derivative https://github.com/icecc/icecream

Note that distcc doesn't really solve the header problem, it just throws more compute at it.

[+] tekknolagi|5 years ago|reply
How does the <algorithm> header compare in that list? That's the one I always try and avoid in my projects.
[+] saagarjha|5 years ago|reply
> Remember that std::vector consists of three pointers to a chunk of dynamically allocated memory.

Note that this is not required, and most std::string implementations inline small strings anyways.

[+] nikki93|5 years ago|reply
Was your `std::string` SBO comment meant to relate to the `std::vector` quote? `std::vector` can't use SBO because `std::swap` on `std::vector`s can't invalidate pointers to elements.
[+] adembudak|5 years ago|reply
Great write up. Thanks for sharing.