I wouldn't lead with symbols in an article focused on understandability.
To a first approximation, a linker just concatenates files.
To a second approximation, it splits every input file into N parts (code and data, basically), concatenates the corresponding parts across all input files, and writes the concatenated parts ("sections") to the output file.
It needs to fix up references at the end and it needs symbols for that, but that's kind of an implementation detail.
Even diagnosing duplicate symbols is a convenience feature if you look at things this way.
Symbols become more important if you want to write a "real" linker that supports weak symbols and symbols imported from dynamic libraries. But for understanding the basics, it helps to deemphasize symbols, in my opinion.
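The "concatenate corresponding sections" model above can be sketched in a few lines of Python. The object-file structure here is invented for illustration; real formats like ELF carry much more metadata:

```python
def link(objects):
    # Lay out each named section by concatenating the corresponding
    # parts of every input file, remembering where each chunk landed
    # (those offsets are what the later reference fixups need).
    sections = {}   # section name -> bytearray
    layout = {}     # (input index, section name) -> offset in section
    for i, obj in enumerate(objects):
        for name, data in obj["sections"].items():
            sec = sections.setdefault(name, bytearray())
            layout[(i, name)] = len(sec)
            sec += data
    return sections, layout

o1 = {"sections": {"text": b"\x90\x90", "data": b"\x01"}}
o2 = {"sections": {"text": b"\xC3", "data": b"\x02\x03"}}
secs, layout = link([o1, o2])
print(bytes(secs["text"]))   # b'\x90\x90\xc3'
print(layout[(1, "text")])   # 2: o2's code begins at offset 2 in "text"
```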
The way I would describe a linker is backwards, kind of like what you did. As you say, you start with a linker as concatenating files, which themselves consist of multiple streams to be independently concatenated (i.e., sections). Then you point out that references between sections need to be resolved, and this is where you introduce relocations and symbols. Next you bring up the symbol resolution process, and the fact that the linker will (in some modes) choose to include object files only when they provide a necessary symbol. For bonus points, talk about segments, the dynamic loader, and/or debugging information.
This description runs in reverse of how the linker actually proceeds: first it loads all the input files, then it resolves the symbols to build a link plan, then it fixes up the necessary relocations, and finally it streams it all into the output file. I don't think it helps to omit any part of the process from the description--symbols are not "kind of an implementation detail," they are a core feature of what linkers provide.
Indeed, if I were to write this as a blog post, I see no reason not to reach immediately for the full complexity of ELF and "real linkers" as the basis of the description. You don't really gain anything from a simpler description, and building a linker that can link even relatively complex libraries such as glibc or Firefox's libxul is likely to be pretty simple--the really hard part of linkers is supporting things like linker scripts, which is not a common operation.
The type of linker that was created here is a symbolic linker, whose very purpose is to resolve symbols. It's best not to approximate away the functions of the linker, since its purpose is so very straightforward.
Brian Callahan also wrote a very cool series of blog posts called "Demystifying programs that create programs" in which he builds an assembler (and a disassembler).
I found reading the code cleared up a lot, because it's well commented and is really just two functions plus a bit of boilerplate for reading/writing files.
'collect1' collects pairs of (name, address) and puts them in a table called 'collect'. Then 'process1' finds any names and substitutes the address found in the 'collect' table.
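That two-pass shape can be sketched like this; the function names echo the comment above, but the details here are invented and much simpler than the real code:

```python
def collect1(items):
    # Pass 1: walk the inputs in order, assigning each named item an
    # address and recording (name, address) pairs in a table.
    collect = {}
    addr = 0
    for name, size in items:
        collect[name] = addr
        addr += size
    return collect

def process1(refs, collect):
    # Pass 2: substitute every symbolic name with its collected address.
    return [collect[name] for name in refs]

table = collect1([("main", 16), ("helper", 8), ("data", 4)])
print(process1(["helper", "data"], table))  # [16, 24]
```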
Smart linkers (and most linkers these days are smart linkers) are a lot like tracing, compacting garbage collectors.
A GC has a series of roots which point at blobs of memory. The GC needs to understand where further references are located in the pointed memory; a simple GC follows these references, avoiding cycles, until it has traced through the live set. It calculates new addresses for each blob of memory, then goes through the live set a second time, moving each object into its final position and adjusting all the references to point to their final locations.
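The trace-and-compact description above, as a toy Python sketch over an invented heap representation (addresses map to a size plus the addresses they reference):

```python
def compact(heap, roots):
    # heap: addr -> (size, [addrs of referenced objects])
    # Trace from the roots, avoiding cycles, to find the live set.
    live, stack, seen = [], list(roots), set()
    while stack:
        a = stack.pop()
        if a in seen:
            continue
        seen.add(a)
        live.append(a)
        stack.extend(heap[a][1])
    live.sort()  # compact in address order
    # Assign each live object its new, packed address.
    new_addr, cursor = {}, 0
    for a in live:
        new_addr[a] = cursor
        cursor += heap[a][0]
    # Second pass: "move" objects and rewrite their references.
    return {new_addr[a]: (heap[a][0], [new_addr[r] for r in heap[a][1]])
            for a in live}

heap = {100: (8, [300]), 200: (8, []), 300: (4, [100])}  # 200 is garbage
print(compact(heap, [100]))  # {0: (8, [8]), 8: (4, [0])}
```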
A linker starts out with a main() function or equivalent, which is a blob of code contained in an object file produced by the compiler. The object file contains a table of references to all the external symbols used by the code, as well as the locations in the code where the references are made, classified by type of reference (e.g. absolute vs relative references). The references in the main() function are like the set of GC roots. The object file also contains a table of exports, all the symbols which it exposes externally.
The linker then locates the object code or data for each unresolved external reference (in other object files supplied on the command line, or in archive library files - indexed concatenations of object files). This process is followed recursively until the live set - the set of blobs of code and data - to be assembled into the final executable is determined.
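This recursive live-set determination is a straightforward worklist algorithm. A hedged sketch, with an invented object-file representation (real linkers would report an error for an unresolvable symbol rather than raise a KeyError):

```python
def live_set(objects, root):
    # objects: name -> {"defines": set of symbols, "needs": set of symbols}
    provider = {sym: name
                for name, o in objects.items() for sym in o["defines"]}
    included = {root}
    worklist = set(objects[root]["needs"])
    while worklist:
        sym = worklist.pop()
        owner = provider[sym]   # which object defines this symbol
        if owner not in included:
            included.add(owner)
            worklist |= objects[owner]["needs"]
    return included

objs = {
    "main.o":   {"defines": {"main"},  "needs": {"parse"}},
    "parse.o":  {"defines": {"parse"}, "needs": {"alloc"}},
    "mem.o":    {"defines": {"alloc"}, "needs": set()},
    "unused.o": {"defines": {"extra"}, "needs": set()},
}
print(sorted(live_set(objs, "main.o")))  # ['main.o', 'mem.o', 'parse.o']
```

Note that unused.o is never pulled in, just as a tracing GC never copies unreachable objects.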
Once all these blobs have been found, they can be copied into the final executable, and all the references to symbols can be filled in with appropriate offsets. There are details here: relative offsets may be baked in, but the locations of absolute offsets need to be marked in the final executable so that the operating system loader can adjust them if it needs to load the executable somewhere other than its preferred address.
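A toy fixup pass for the two flavors mentioned above; the relocation tuples and kind names are invented, though the relative-offset arithmetic mirrors how x86 call/jmp displacements work (relative to the end of the 4-byte field):

```python
import struct

def apply_relocs(code, relocs, symbols, load_base=0):
    code = bytearray(code)
    needs_loader_fixup = []
    for offset, kind, sym in relocs:
        target = symbols[sym]
        if kind == "rel32":
            # Relative: baked in for good, position-independent.
            value = target - (offset + 4)
        else:  # "abs32"
            # Absolute: patch it, but record the spot so the loader
            # can re-adjust if the image moves at load time.
            value = load_base + target
            needs_loader_fixup.append(offset)
        struct.pack_into("<i", code, offset, value)
    return bytes(code), needs_loader_fixup

code = b"\xE8\x00\x00\x00\x00" + b"\x00" * 4
patched, fixups = apply_relocs(
    code,
    [(1, "rel32", "callee"), (5, "abs32", "table")],
    {"callee": 0x20, "table": 0x40},
    load_base=0x1000,
)
print(patched.hex())  # e81b00000040100000
print(fixups)         # [5]
```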
The operating system loader is itself a linker. It, too, needs to follow a similar process, but instead of object files, it's dealing with shared objects (DLLs in Windows), and it's chasing down external references, loading things, and adjusting offsets as necessary (address space layout randomization (ASLR), load address conflicts (a lot more common in 32-bit address spaces)). The OS loader can also be lazy; instead of resolving a reference, it can fill it with a stub which is only resolved on first call. This delays linkage failures, but if the application tests for versions etc. before invoking, the developer gets the ergonomics of early bound linking with the features of late-bound linking that would otherwise need manual use of dlsym / GetProcAddress.
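The lazy-stub idea above can be modeled in a few lines: each import starts as a stub that resolves the real symbol on first call and caches it, in the spirit of PLT entries. Everything here is an illustrative stand-in; the "resolver" is just a dict lookup:

```python
def make_lazy_import(name, resolver):
    slot = {}
    def stub(*args):
        if "fn" not in slot:
            # Resolution (and any linkage failure) is delayed until
            # the first call, then the result is cached.
            slot["fn"] = resolver(name)
        return slot["fn"](*args)
    return stub

library = {"square": lambda x: x * x}
square = make_lazy_import("square", library.__getitem__)
print(square(7))  # 49 (resolved on this first call, cached afterwards)
```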
If you're interested in the topic, I highly recommend Linkers and Loaders by John Levine, long-time moderator of comp.compilers.
I don't think it's competing with lld as a general purpose linker. If you're interested in advances in those, I suggest looking at mold and the in-progress work in zig, which both seem to tackle link speed in different and interesting ways.
I always get tripped up with ld failing unless the object files are listed in the right order on the command line. I was curious if this simple linker would overcome this historic limitation, and was happy to see it does.
> Another downside is that you do have to arrange the object files in the command line in an order that makes sense for this linear approach. The first object file must be the entry point of the program. Different arrangements of object files may work for linking your executable but will produce an organizationally different executable. You may or may not care about this.
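For context, the classic `ld` order sensitivity comes from scanning inputs in a single left-to-right pass: an archive member is pulled in only if it satisfies a symbol that is undefined at that point. A toy model of that rule (input structures invented):

```python
def single_pass(inputs):
    # inputs: list of (kind, defines, needs); plain objects are always
    # taken, archive members only if they satisfy a pending undefined.
    undefined, defined = set(), set()
    for kind, defines, needs in inputs:
        if kind == "obj" or (defines & undefined):
            defined |= defines
            undefined |= needs
            undefined -= defined
    return undefined  # symbols still unresolved at the end

main = ("obj", {"main"}, {"helper"})
lib  = ("archive_member", {"helper"}, set())
print(single_pass([main, lib]))  # set(): helper was needed in time
print(single_pass([lib, main]))  # {'helper'}: archive scanned too early
```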
One reason I still like reading programming blog posts is that every now and then I come across a way to implement things that I never could have thought of myself. I understand how it works; it's basically equivalent to storing cross-references inline in a text file, with every verbatim piece of data quoted individually.
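The comment's original sample didn't survive extraction, so this is only a guess at the kind of format meant: quoted runs are verbatim data, and bare tokens are symbolic cross-references substituted at the end. All names here are hypothetical:

```python
import re

# Hypothetical inline format: quoted strings are literal, @name tokens
# are cross-references to be resolved against a symbol table.
doc = '"mov eax, " @counter "; inc " @counter'

def resolve(text, table):
    out = []
    for literal, ref in re.findall(r'"([^"]*)"|@(\w+)', text):
        out.append(table[ref] if ref else literal)
    return "".join(out)

print(resolve(doc, {"counter": "0x1000"}))  # mov eax, 0x1000; inc 0x1000
```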
No problem, it's just... I wouldn't even have thought of trying to store this information inline, and if pressed to, I wouldn't have thought of quoting every single verbatim piece of data individually.
And arguably, it kinda makes more sense than trying to quote the { and [ instead: you've already lost the ability to just copy a blob from one file to another with a couple of places patched at known offsets, so you may as well simplify your life further.
Yeah, I agree that the real insight here is firstly that three datatypes are enough to produce a linker, and secondly that using this crazy serialization format makes life simpler.
It somehow hurts my brain to almost double the file size, but who cares? That's the computer's problem, not yours.
As an iOS dev the hardest part for me to learn has definitely been how all the parts of the build chain work together: linking, symbolication, etc. Swift tries to be high-level but Xcode still relies on a bunch of low-level tools to build the app, and often you'll be diagnosing why `ld` failed for something. Pretty interesting stuff.
Doesn't he need to deal with relocation types? Otherwise you can't do a lot of interesting stuff like relative jumps, or loads computed from a base address. x86 has dozens of different relocation types.
Are you perhaps confusing it with SPARC? That one does have dozens of different relocation types because its instructions can have displacements of about 10 different widths.
As for interesting stuff... relative jumps are normally not used for jumping to a symbol defined in a different object, and base address computations are trivially supported by sticking a special symbol at the beginning of every object file.
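That "special symbol at the beginning of every object file" trick is easy to sketch: give each object an implicit symbol at its offset 0, and base-address computations become ordinary symbol references. The names and layout here are invented:

```python
def lay_out(objects):
    # objects: name -> (size, {symbol: offset within object})
    symbols, addr = {}, 0
    for name, (size, defs) in objects.items():
        symbols[f"{name}.__base"] = addr   # implicit base symbol at offset 0
        for sym, off in defs.items():
            symbols[sym] = addr + off
        addr += size
    return symbols

objs = {"a.o": (16, {"foo": 4}), "b.o": (8, {"bar": 0})}
print(lay_out(objs))
# {'a.o.__base': 0, 'foo': 4, 'b.o.__base': 16, 'bar': 16}
```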
Linkers are the lymph nodes of systems. Developers will wax rhapsodic about the virtues of Rust but know nothing about the (necessary) evils of linkers.
So the OP wrote a linker. But did they write one which understands linker script? Does it understand shared libraries, multiple architectures, multiple object file formats, ...? Is it 'smart'? Can it link itself? Does it support zero pages, ASLR, non-executable stacks+heaps? Is it fast?
Linkers are hard. Relocs and fixups are really hard. The Oracle® Solaris 11.4 Linkers and Libraries Guide is 500+ pages [1]. But if the development environment is done well, you won't have to worry about that, which is pretty cool.
[1] I always wanted a Linker Alien t-shirt. http://www.linker-aliens.org/
Why would "everyone can understand this" be a valuable property for a specialist tool? I would much rather use tools that enable skilled professionals to do their best work than try to appease the layman.
>Why would "everyone can understand this" be a valuable property for a specialist tool? I would much rather use tools that enable skilled professionals to do their best work than try to appease the layman.
First, who said "everyone" here refers to the "layman"?
codethief | 4 years ago
Part 1: https://briancallahan.net/blog/20210407.html
pjmlp | 4 years ago
Chapter 6 of Project Oberon:
http://people.inf.ethz.ch/wirth/ProjectOberon/PO.System.pdf
Or the re-implementation of the Turbo Pascal one for MS-DOS:
http://mail.turbopascal.org/linker
aghastnj | 4 years ago
"I wrote a linker that works!" -> No-one can understand all of it.
corobo | 4 years ago
https://en.wikipedia.org/wiki/Linker_%28computing%29
yakubin | 4 years ago
It's also a breath of fresh air in an industry whose practitioners can't distinguish technology from magic.
scruffyherder | 4 years ago
How does this handle endian, word sizes, and dreaded alignment? Is packing enough?
peterhil | 4 years ago
* { background: #fbfcff; }
Disabling that rule or using reader mode helps to read the article.
ColinWright | 4 years ago
> * { background: #fbfcff; }
You also say:
> ... makes everything (including text) almost white.
Doesn't that rule just force the background to (nearly) white? Why do you say it forces everything including text to white?
scruffyherder | 4 years ago
This elitist UI thing of forcing deliberately blind schemes is trash