top | item 39144281

(no title)

jezze | 2 years ago

A linker typically only includes the parts of the library it needs for each binary so some parts will definately have many copies of the same code when you statically link but it will not make complete copies.

But I wouldnt consider this bloat. To me it is just a better seperation of concerns. To me bloat would be to have a system that has to keep track of all library dependencies instead, both from a packaging perspective but also in runtime. I think it depends where you are coming from. To me static linking is just cleaner. I dont care much for the extra memory it might use.

discuss

jvanderbot|2 years ago

Dynamic linking served us when OS upgrades came infrequently, user software was almost never upgraded short of mailing out new disks, and vendors had long lead times to incorporate security fixes.

In the days of fast networks, embedded OSs, emphemeral containers, and big hard drives, a portable static binary is way less complex and only somewhat less secure (unless you're regularly rebuilding your containers/execs in which case it's break even security wise or possibly more secure, simply because each exec may not include vulnerable code)

bscphil|2 years ago

> In the days of fast networks, embedded OSs, emphemeral containers, and big hard drives, a portable static binary is way less complex and only somewhat less secure

If what you're trying to do is run a single program on a server somewhere, then yes absolutely a static binary is the way to go. There are lots of cases, especially end user desktops, where this doesn't really apply though.

In my opinion the debate over static vs dynamic linking is resolved by understanding that they are different tools for different jobs.

gnramires|2 years ago

As far as I can see, it would be unwise to roll back 30 years of (Linux) systems building with dynamic linking in favor of static linking. It mostly works very well and does save some memory, disk, and has nice security properties. Both have significant pros and cons.

I've been thinking (not a Linux expert by any means) the ideal solution would be to have better dependency management: I think a solution could be if say binaries themselves carried dependency information. That way you get the benefits of dynamic and static linking by just distributing binaries with embedded library requirements. Also, I think there should be a change of culture in library development to clearly mark compatibility breaks (I think something like semantic versioning works like that?).

That way, your software could support any newer version up to a compatibility break -- which should be extremely rare. And if you must break compatibility there should be an effort to keep old versions available, secure and bug free (or at least the old versions should be flagged as insecure in some widely accessible database).

Moreover, executing old/historical software should become significantly easier if library information was kept in the executable itself (you'd just have to find the old libraries, which could be kept available in repositories).

I think something like that could finally enable portable Linux software? (Flatpak and AppImage notwithstanding)

nequo|2 years ago

> Dynamic linking served us when OS upgrades came infrequently, user software was almost never upgraded

Even today, dynamic linking is not only a security feature but also serves convenience. A security fix in OpenSSL or libwebp can be applied to everything that uses them by just updating those libraries instead of having to rebuild userland, with Firefox, Emacs, and so on.

teaearlgraycold|2 years ago

Yeah I’d prefer we just use another gigabyte of storage than add so much complexity. Even with what is a modest SSD capacity today I have a hard time imagining how I’d fill my storage. I’m reminded of my old workstation from 8 years ago. It had a 500GB hard drive and a 32GB SSD for caching. I immediately reconfigured to just use the SSD for everything by default. It ended up being plenty.

manmal|2 years ago

Apple has been pushing dynamic libraries for a while, but now realized that they really like static linking better. The result is they found a way to convert dynamic libraries into static ones for release builds, while keeping them dynamic for debug builds: https://developer.apple.com/documentation/xcode/configuring-...

chaxor|2 years ago

I'm not versed in this, so apologies for the stupid question, but wouldn't statically linking be more secure, if anything? Or at least have potentially better security?

I always thought the better security practice is statically linked Go binary in a docker container for namespace isolation.

jhallenworld|2 years ago

>A linker typically only includes the parts of the library it needs for each binary so some parts will definately have many copies of the same code when you statically link but it will not make complete copies.

Just to add to what you said: in the old days the linker would include only the .o files in the .a library that were referenced. Really common libraries like libc should be made to have only a single function per .o for this reason.

But modern compilers have link time optimization, which changes everything. The compiler will automatically leave out any items not referenced without regard to .o file boundaries. But more importantly, it can perform more optimizations. Perhaps for a given program a libc function is always called with a constant for a certain argument. The compiler could use this fact to simplify the function.

I'm thinking that you might be giving up quite a lot of performance by using shared libraries, unless you are willing to run the compiler during actual loading.

Even without lto, you can have the same results in C++ by having your library in the form of a template- so the library is fully in the /usr/include header file, with nothing in /usr/lib.

inkyoto|2 years ago

> Just to add to what you said: in the old days the linker would include only the .o files in the .a library that were referenced.

It was not exactly like that. Yes, the .o file granularity was there but the unused code from that .o file would also get linked in.

The original UNIX linker had a very simple and unsophisticated design (compared to its contemporaries) and would not attempt to optimise the final product being linked. Consider a scenario where the binary being linked references A from an «abcde.o» file, and the «abcde.o» file has A, B, C, D and E defined in it, so the original «ld» would link the entire «abcde.o» into the final product. Advanced optimisations came along much later on.

inkyoto|2 years ago

> A linker typically only includes the parts of the library it needs for each binary […]

It is exactly the same with the dynamic linking due to the demand paging available in all modern UNIX systems: the dynamic library is not loaded into memory in its entirety, it is mapped into the process's virtual address space.

Initially, there is no code from the dynamic library loaded into memory until the process attempts to access the first instruction from the required code at which point a memory fault occurs, and the virtual memory management system loads the required page(s) into the process's memory. A dynamic library can be 10Gb in size and appear as a 10Gb in the process's memory map but only 1 page can be physically present in memory. Moreover, under the heavy memory pressure the kernel can invalidate the memory page(s) (using LRU or a more advanced memory page tracking technique) and the process (especially true for background or idlying processes) will reference zero pages with the code from the dynamic library.

Fundamentally, dynamic linking is the deferred static linking where the linking functions are delegated to the dynamic library loader. Dynamic libraries incur a [relatively] small overhead of slower (compared to statically linked binaries) process startup times due to the dynamic linker having to load the symbol table, the global offset table from the dynamic library and performing the symbol fixup according to the process's own virtual memory layout. It is a one-off step, though. For large, very large and frequently used dynamic libraries, caching can be employed to reduce such overhead.

Dynamic library mapping into the virtual address space != loading the dynamic library into memory, they are two disjoint things. It almost never happens when the entire dynamic library is loaded into memory as the 100% code coverage is exceedingly rare.

akira2501|2 years ago

> It is a one-off step, though.

Yes, but often a one off step that sets all your calls to call through a pointer, so each call site in a dynamic executable is slower due to an extra indirection.

> For large, very large and frequently used dynamic libraries, caching can be employed to reduce such overhead.

The cache is not unlimited nor laid out obviously in userspace, and if you have a bunch of calls into a library that end up spread all over the mapped virtual memory space, sparse or not, you may evict cache lines more than you otherwise would if the functions were statically linked and sequential in memory.

> as the 100% code coverage is exceedingly rare.

So you suffer more page faults than you otherwise have to in order to load one function in a page and ignore the rest.

rwmj|2 years ago

You should be keeping track of those library dependencies anyway if you want to know what you have to recompile when, say, zlib or openssl has a security problem.

ithkuil|2 years ago

Well, you have to do that anyways

giljabeab|2 years ago

Can’t file systems de dupe this now