top | item 35854665


rbrown46 | 2 years ago

I’ve gotten good insight into what takes up space in binaries by profiling with Bloaty McBloatface. My last profiling session showed that clang’s ThinLTO was inlining too aggressively in some cases, causing functions that should be tiny to be 75 kB+.

https://github.com/google/bloaty
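For anyone who hasn't used it, a sketch of what such a session can look like (the binary paths are illustrative, not from the parent comment):

```shell
# Largest 30 symbols, attributed to file and VM size.
bloaty ./mybinary -d symbols -n 30

# Attribute size per translation unit instead of per symbol.
bloaty ./mybinary -d compileunits

# Diff two builds to spot size regressions (e.g. from a ThinLTO change).
bloaty ./mybinary -- ./mybinary.old
```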

discuss

order

ghotli | 2 years ago

I spent a lot of time with bloaty for our embedded application and found I had more actionable output from something like this...

nm -B -l -r --size-sort --print-size -t d ./path/to/compiler/output{.so} | c++filt > /tmp/by_size

Just a lot of flags that show you the size of each symbol in decimal, with names demangled. Run it before you run `strip` in your CI pipeline or whatever preps a build for proper release.

menaerus | 2 years ago

I agree. bloaty is good at giving a quick overview, but the difficult part is drilling through the symbols to find out what the heck is happening. For that, nm/objdump/readelf are irreplaceable.

chc4 | 2 years ago

If you can run PGO, it will take the profiling information into account when doing inlining heuristics, which can help a lot in some cases. Technically that is general optimization for speed and not size, though, so if you really care specifically for binary size you'd probably still have to muck about with noinline attributes and such.
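For clang, the instrumentation-based PGO loop looks roughly like this (the flag names are real clang/LLVM options; the source file, paths, and workload are illustrative):

```shell
# 1. Build with instrumentation; profiles land in /tmp/prof/*.profraw.
clang -O2 -fprofile-generate=/tmp/prof -o app app.c

# 2. Run a representative workload to collect counts.
./app typical-workload

# 3. Merge raw profiles into a single indexed profile.
llvm-profdata merge -o /tmp/app.profdata /tmp/prof/*.profraw

# 4. Rebuild; inlining heuristics now use the measured frequencies.
clang -O2 -fprofile-use=/tmp/app.profdata -o app app.c
```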

mananaysiempre | 2 years ago

Unfortunately, PGO done the default way is antithetical to reproducible builds. You can avoid that by putting the profiling data in your VCS, but then you suffer all the consequences of a version-controlled binary blob, and one heavily dependent on other files at that.

Perhaps it should be possible to use profiling data to keep human-managed {un,}likely or {hot,cold} annotations up to date? How valuable are PGO's frequencies compared to these discrete-valued labels? (I know GCC allows you to specify frequencies in the source, but that sounds less than convenient.)