
Large single compilation-unit C programs (2006)

50 points | Sadkov | 5 years ago | people.csail.mit.edu | reply

21 comments

[+] lou1306|5 years ago|reply
May be relevant to the discussion: SQLite is also compiled from a single, 220k LOC C file called "the amalgamation".

https://www.sqlite.org/amalgamation.html

[+] dhekir|5 years ago|reply
I also found this "amalgamate" script on GitHub, intended to allow creating such amalgamations from C/C++ projects:

https://github.com/rindeal/Amalgamate

It looks interesting, but when I tried the FreeType example there seemed to be a preprocessing issue: some function definitions are conditionally excluded even though they are called later. I didn't have time to find out whether this was an issue in the original code or was introduced by the amalgamation script.

In any case, such single-file C programs are very useful for quickly testing tools, so having more of them would be great.

[+] klelatti|5 years ago|reply
I've been working on a project that auto-generates C programs, sometimes up to 1.5 million lines of code, in a single file (actually two files, but the second is only 35 lines).

Not open source but happy to share benchmarks if that would be useful.

[+] klelatti|5 years ago|reply
Some compile times for those interested:

Hardware: 2016 12" MacBook (1.1GHz Core m3)
OS: Ubuntu 20.04 running in Docker
Compiler: Clang 9 at -O0 (higher optimisation levels increase the compile times a lot!)

LOC     Size    Compile time
0.53m   41MB    34s
0.99m   76MB    91s
1.44m   110MB   167s

I suspect the code is relatively straightforward to compile - few function calls etc.
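
For what it's worth, the three data points above can be turned into a rough throughput figure. This is only a sketch of the arithmetic: `loc_per_second` and `print_throughput` are made-up helper names, and the inputs are just the numbers quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: lines of code compiled per second. */
double loc_per_second(double loc, double seconds)
{
    assert(seconds > 0.0);  /* a zero duration would divide by zero */
    return loc / seconds;
}

/* Prints the throughput for the three quoted data points. */
void print_throughput(void)
{
    const double loc[]  = { 0.53e6, 0.99e6, 1.44e6 };
    const double secs[] = { 34.0, 91.0, 167.0 };

    for (int i = 0; i < 3; i++)
        printf("%.2fm LOC: about %.0f LOC/s\n",
               loc[i] / 1e6, loc_per_second(loc[i], secs[i]));
}
```

Calling `print_throughput()` from `main` shows throughput falling from roughly 15.6k to 8.6k LOC/s as the file grows, so in these three samples compile time grows faster than linearly with size.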

[+] pulse7|5 years ago|reply
Please share the benchmarks...
[+] enriquto|5 years ago|reply
I like to code this way. You just include "foo.c" instead of "foo.h"; the header does not exist at all. Compilation is really simple, and there are half as many files!
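
A minimal sketch of that layout, squeezed into one listing so it compiles as-is. The file names `foo.c` and `main.c` and the functions `foo_square` and `sum_of_squares` are hypothetical. In a real tree the first part would live in `foo.c`, and the second part would pull it in with `#include "foo.c"`.

```c
/* ---- contents of a hypothetical foo.c ---- */

/* static: everything ends up in one translation unit, so nothing
   needs external linkage and no foo.h prototype is required. */
static int foo_square(int x)
{
    return x * x;
}

/* ---- contents of a hypothetical main.c ---- */

/* In the real layout this spot would hold:  #include "foo.c"  */
int sum_of_squares(int a, int b)
{
    return foo_square(a) + foo_square(b);
}
```

The whole program then builds with a single compiler invocation on `main.c`. The catch is that `foo.c` must not also be compiled and linked as its own object file, or any non-static definitions in it would be defined twice.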
[+] Sadkov|5 years ago|reply
1283 = continue
1432 = license
1766 = gnu

So for every loop continue statement there is a GPL license text :D

[+] dvfjsdhgfv|5 years ago|reply
I know it's only half serious, but it's simply not true, in the same way as grepping for "Stallman" in the leaked Windows source code (nobody actually mentioned RMS there; those were false positives). In this case, some headers contain multiple occurrences of "GNU". Then there are several #ifdefs on macros like "__GNU_LIBRARY__" or "__GNUC__", and e-mail addresses of people in the gnu.org domain.
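
To make the false-positive point concrete, here is a rough sketch of a counter that can either match any substring or skip matches glued to identifier characters. `count_matches` and its helpers are made-up names, not part of any real tool, and this is not how the numbers above were produced.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes of a and b. */
static int equal_ci(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Identifier characters: letters, digits and underscore. */
static int is_ident_char(char c)
{
    return c == '_' || isalnum((unsigned char)c);
}

/* Counts case-insensitive matches of word in text. If whole_word is
   non-zero, matches adjacent to identifier characters are skipped. */
size_t count_matches(const char *text, const char *word, int whole_word)
{
    size_t n = strlen(word), len = strlen(text), count = 0;

    if (n == 0 || n > len)
        return 0;
    for (size_t i = 0; i + n <= len; i++) {
        if (!equal_ci(text + i, word, n))
            continue;
        if (whole_word) {
            if (i > 0 && is_ident_char(text[i - 1]))
                continue;
            if (is_ident_char(text[i + n]))  /* text[len] is '\0' */
                continue;
        }
        count++;
    }
    return count;
}
```

On the sample line `#ifdef __GNUC__ /* GNU C */ bug-foo@gnu.org`, searching for "gnu" gives 3 substring matches and 2 whole-word matches. Neither count says anything about how many license texts there are, since the e-mail address still counts as a match.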

In practice, it doesn't matter at all, as the preprocessor replaces every comment, license headers included, with a single space before the compiler proper has a chance to look at them.
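
A small sketch of that last point. In C, each comment is replaced by one space character early in translation, before macros are expanded, and the stringizing operator makes this visible. The macro `STR`, the function `comment_became_space` and the variable `answer` are made up for illustration.

```c
#include <string.h>

/* Turns its argument's tokens into a string literal. */
#define STR(x) #x

/* The comment between the two tokens has already become a single
   space by the time STR sees its argument, so the result is "a b".
   Returns 1 if that is the case. */
int comment_became_space(void)
{
    return strcmp(STR(a/* imagine a long license header here */b), "a b") == 0;
}

/* A comment can even be the only thing separating two tokens;
   the compiler proper sees this as "int answer = 42;". */
int/* license text */answer = 42;
```

However long the comment is, it costs the later compilation stages exactly one space.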