Cosmopolitan Libc author here. When I saw this article earlier today, it put a big smile on my face, because I would have thought there'd be so many more issues than there turned out to be! For additional context, we've got a GitHub issue that's tracking progress on the changes that need to be made based on what we learned: https://github.com/jart/cosmopolitan/issues/61
So first of all, I just want to echo the general sentiment here and say that all of this is beyond awesome. Cosmopolitan libc seems to have the potential to literally re-define what constitutes portable code.
I'm an old Delphi/Pascal application programmer here, trying to get the lay of the land in the C/C++ world.
Why is there a forest of files in Cosmopolitan Libc? I tried looking at it on GitHub to see how things were done, and there were a lot of .h and .S files. I couldn't actually find the C source, though I'm sure somewhere in there is a .c file.
Doesn't fragmenting the source into thousands of files make compilation far slower than it needs to be?
Also, I wonder if/how functions that aren't called in a program get trimmed away by the linker, and thus don't make the executable larger.
One question: is it possible to load .so files (through dlopen/dlsym) on Linux when compiling to APE?
I was working on something similar (though much smaller in scope), but had to stop when I realised that `ld-linux.so` has some internal APIs that Glibc uses to set up dlopen()/dlsym(), essentially meaning that it is very hard, if not impossible, to load any shared libraries if one does not link to Glibc.
What I wouldn't give for a liblinux, similar to NTDLL.
Have you considered writing an index for Cosmopolitan's API by topic? The current reference documentation is rather daunting, and it is very difficult to determine at a glance what sorts of functionality are available, and thus what kinds of applications could be written or ported.
What would it take to create an analog of SDL: some kind of lowest-common-denominator interface for mouse/keyboard I/O, audio output, and software-rendered graphics?
I missed an opportunity to ask in the previous thread: what would it take to link an app in a different language (say, Rust) with this library? Is it enough to just build an object file, that has libc functions assuming LP64 ABI as unresolved exports?
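For reference, LP64 means `long` and pointers are 64 bits wide while `int` stays at 32. A small check like the one below could confirm that a toolchain agrees on those widths before trying to link objects together. This is only a sketch: `is_lp64` is a made-up name, and matching widths say nothing about the rest of the calling convention.

```c
#include <limits.h>
#include <stdbool.h>

/* True when int is 32 bits and both long and pointers are 64 bits (LP64). */
bool is_lp64(void) {
    return sizeof(int) * CHAR_BIT == 32 &&
           sizeof(long) * CHAR_BIT == 64 &&
           sizeof(void *) * CHAR_BIT == 64;
}
```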
Windows, macOS, Linux, and BSD all (usually) run on Intel processors, so in theory it should be possible to write a program that runs on all four. However, despite ultimately using the same instruction set, the four OSes use different formats to store metadata about the programs, and have different ways for programs to communicate with the operating system. Justine Tunney created two things:
1) Actually Portable Executables, a clever way of formatting a program so that all four OSes interpret it as a valid program in their own format.
2) Cosmopolitan libc, a library for communicating with the OS that handles each OS's interface, allowing programs to work on all four.
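As a rough illustration of item 1: loaders generally decide what a file is by looking at its first few bytes. The sketch below is not Cosmopolitan code. `sniff_format` and the `FMT_*` labels are hypothetical, and it only checks a few well-known signatures (`MZ` for DOS/PE, `\x7fELF` for ELF, `PK\x03\x04` for a zip local file header, `#!` for a script).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for this sketch; not part of any real API. */
enum file_format { FMT_UNKNOWN, FMT_MZ, FMT_ELF, FMT_ZIP, FMT_SCRIPT };

/* Classify a buffer by its leading magic bytes. */
enum file_format sniff_format(const unsigned char *buf, size_t len) {
    /* "\x7f" is split from "ELF" so the hex escape does not swallow the 'E'. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0) return FMT_ELF;
    if (len >= 4 && memcmp(buf, "PK\x03\x04", 4) == 0) return FMT_ZIP;
    if (len >= 2 && memcmp(buf, "MZ", 2) == 0) return FMT_MZ;
    if (len >= 2 && memcmp(buf, "#!", 2) == 0) return FMT_SCRIPT;
    return FMT_UNKNOWN;
}
```

A file can only begin with one of these signatures, so a polyglot has to lean on each loader's quirks for the rest. Zip readers, for example, locate the archive through the end-of-central-directory record near the end of the file, so a zip can ride along behind another format's header; the leading-bytes check above would only recognize a plain archive.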
The author somehow came up with an executable file format that's compatible with everything. It's simultaneously a valid shell script, Unix ELF, Windows PE file, boot sector code and zip file.
Then she wrote a standard C library that detects which OS you're using at runtime and branches to the appropriate implementation, allowing the same x86_64 code to run on any supported system.
    if (IsWindows())
        DoWindowsThing();
    else
        DoUnixThing();
Traditionally, these system differences are resolved at build time: only the code for the target platform is included. Why would anyone want Windows-specific code in a Linux build? It's dead code... right? The new run-anywhere executable format makes that code useful.
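To make the contrast concrete, here is a minimal sketch in plain C. It is not Cosmopolitan's actual implementation; the `host_os` enum and both `path_separator_*` functions are made up for illustration. The build-time version bakes in one answer with `#ifdef`, while the run-time version carries every branch and picks one from a value that a real libc would have to detect at startup.

```c
/* Hypothetical OS tags for this sketch. */
enum host_os { OS_LINUX, OS_WINDOWS, OS_MACOS };

/* Build-time selection: the preprocessor keeps only one branch. */
char path_separator_buildtime(void) {
#ifdef _WIN32
    return '\\';
#else
    return '/';
#endif
}

/* Run-time selection: both branches ship in the same binary. */
char path_separator_runtime(enum host_os os) {
    if (os == OS_WINDOWS)
        return '\\';
    else
        return '/';
}
```

With the run-time version, the "dead" Windows branch is exactly the code that lets one file behave correctly when it is later started on Windows.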
In short, there is a "framework" that lets you compile a single binary that can run on Linux, Windows, bare metal, and many others. Like Apple's universal binaries, but actually universal, not just covering the M1's arm64 and x86_64.
You can compile a program on macOS. The result, an executable binary, can run on Linux, Mac, Windows, FreeBSD, OpenBSD, NetBSD, BIOS... the same EXACT binary!
Something like this could be great for open source projects that target a lot of platforms. For example, Conda-Forge [0] has automated build pipelines for Linux, Mac and Windows. Perhaps something like this could let you reduce that to an x86 build and an Arm build... removing an entire dimension from the build matrix.
Gonna need to solve that 1/2=g problem before Numpy is happy though!
Portable Lua is a great idea. By using Cosmopolitan, does this also run on bare-metal machines?
Portable micropython might also be a low hanging fruit target.
Does anyone know if there is any way to run another exe/binary that is bundled in a cross-platform app, while it's all enclosed in one binary file?
Would these binaries work on Apple Silicon? It seems like no, since it isn't x86? What are the limits to the portability, and is there a path past those limits?
Learned a lot from this and related pages, thanks a lot.
It is one of the best hacks I've ever seen, ever. Although, as others have pointed out, I wouldn't consider it exactly elegant (IIUC it's basically a shell script that morphs the file on first run on *nix), nor exactly groundbreaking with the fat-binary-like concept. If you don't use platform-specific libraries or calls anyway, cross-compiling works just fine? What's the use case here? I mean, for platforms where the performance or size overhead of VM-based languages is a problem, you probably don't want to include extra code for other platforms anyway?
OTOH Blinkenlights looks really nice tho. I've seen, with my own eyes, a dude live hand-editing code page 417 with notepad.exe to change a jmp so it doesn't look for a CD ;)
Always admire hackers of this rank. I think Blinkenlights could help me achieve something similar someday if I allocate bandwidth to it; I would spend lots of time playing with it.
The hard part when making a virus is to exploit the target system to run said virus. And if you have such an exploit, then supplying a compatible binary depending on which OS the target system runs is easy.
Exploit attempts for embedded systems (routers, cameras, etc.) typically start with attempting to execute code in a portable language such as shell scripts (which would be the only thing this new project would replace). Those scripts try to detect the architecture of the system and then download the appropriate binary (the server hosting the malicious binaries provides many variants for different architectures).
In the end I don't see this solving a real problem when it comes to malware - this problem has already been solved through other means.
Cosmopolitan Libc is designed to put power in your hands, and power can be used for good or bad. The best way for us to all keep that power, is to start using Cosmopolitan to do as many good deeds as possible.
This project is going to benefit developers on all platforms, because it supports everyone without bias. Indie developers are going to have more opportunities to be successful writing native apps, since Cosmo helps them reach a broader audience. Before Cosmopolitan, only big companies could introduce new projects (e.g. TensorFlow) that effectively solve the portability problem, since the only way to do it before was brute-force cash. Lastly, Cosmopolitan is going to relieve language authors of many of the portability burdens they've each needed to carry on their own, which means they're going to have more time to focus on their visions.
If we don't use Actually Portable Executable to accomplish grand acts of public service, then operating systems are just going to block it. For example, UPX is a project that does creative things with executable formats. If you read the XNU source code you'll notice that they have explicit source code for blocking those executables and they call out the project by name.
Just because an executable is cross-platform doesn't mean it can exploit issues on other platforms. Those issues have to actually exist there, and most vulnerabilities don't look like that.
The author mentions the portable binary is smaller than the original. It seems Cosmopolitan's output is actually an executable pkzip file: https://justine.lol/ape.html
[+] [-] jart|5 years ago|reply
[+] [-] shawxe|5 years ago|reply
With that out of the way, am I understanding correctly that the way this works on Linux/Unix is that the file modifies itself (by overwriting the EXE file header with an ELF header)? This seems to have the consequence of making that specific file no longer portable. If I'm understanding things correctly, it also looks like the QEMU hack for non-x86_64 architectures will only work once per file, since after the first time running the file it will no longer run as a shell script on Unix, so the QEMU invocation will be unreachable.
Have you considered adding workarounds for this?
[+] [-] ahgamut|5 years ago|reply
As someone who has never messed around with Libc-level programming, it was surprisingly straightforward (and exciting!) to compile Lua all the way.
Cosmopolitan is incredible, thank you so much.
[+] [-] egeozcan|5 years ago|reply
I was wondering if we can accompany runtimes written in C (like lua) with some scripts? So we have cross-platform, double-click-to-run scripting?
For us poor souls who can't write reliable C code, that would be a great thing!
[+] [-] mikewarot|5 years ago|reply
[+] [-] HexDecOctBin|5 years ago|reply
[+] [-] RodgerTheGreat|5 years ago|reply
[+] [-] ducktective|5 years ago|reply
[+] [-] e12e|5 years ago|reply
[1] https://news.ycombinator.com/item?id=26271117
[+] [-] s_Hogg|5 years ago|reply
[+] [-] lostmsu|5 years ago|reply
[+] [-] jll29|5 years ago|reply
[+] [-] stephc_int13|5 years ago|reply
I think your work is outstanding, but it looks a bit childish/bizarre with this name, and thus it might keep some people from trusting its reliability.
[+] [-] krab|5 years ago|reply
Cosmopolitan is going to be a big thing.
[+] [-] TazeTSchnitzel|5 years ago|reply
[+] [-] abhinav22|5 years ago|reply
[+] [-] Gaelan|5 years ago|reply
[+] [-] matheusmoreira|5 years ago|reply
[+] [-] baq|5 years ago|reply
[+] [-] gostsamo|5 years ago|reply
[+] [-] max_|5 years ago|reply
[+] [-] snarfy|5 years ago|reply
[+] [-] pbronez|5 years ago|reply
[0] https://conda-forge.org/
[+] [-] ers35|5 years ago|reply
[+] [-] EamonnMR|5 years ago|reply
[+] [-] zbendefy|5 years ago|reply
[+] [-] antman|5 years ago|reply
[+] [-] dazhbog|5 years ago|reply
Because Cosmo could work as a wrapper around Windows/Mac/Linux binary executables, and then the Cosmo app can decide which one to invoke.
Based on my understanding, if program A is to run program B, program B needs to be addressable in the filesystem. So based on my testing, the bundled executables need to be written to disk first, and then the main program can run them. This also might work with the macOS .app folder format, but I'm not sure how to do it on Linux to avoid chmod +x issues, or how to have one binary file on Windows.
Any ideas?
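One way the "write it to disk first" approach could look on POSIX systems, as a minimal sketch rather than anything Cosmopolitan provides: `extract_executable` is a hypothetical helper that copies an embedded byte array into a fresh temp file and sets the owner-executable bit, which sidesteps a separate chmod +x step. It does not cover Windows, and it will not help if the temp directory is mounted noexec.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Write `len` bytes to a fresh temp file and mark it owner-executable.
 * `path` must hold a mkstemp template ending in "XXXXXX"; on success it
 * contains the name of the created file.
 * Returns 0 on success, -1 on failure. POSIX only.
 */
int extract_executable(const void *data, size_t len, char *path) {
    int fd = mkstemp(path);
    if (fd == -1) return -1;
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) { close(fd); unlink(path); return -1; }
        p += n;
        len -= (size_t)n;
    }
    if (fchmod(fd, 0700) == -1) { close(fd); unlink(path); return -1; }
    /* Close before exec: Linux refuses to run a file still open for writing. */
    return close(fd);
}
```

The caller would then hand `path` to something like `execv` or `posix_spawn` and `unlink` the file once the child no longer needs it.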
[+] [-] ericb|5 years ago|reply
[+] [-] utbabya|5 years ago|reply
[+] [-] benatkin|5 years ago|reply
It is pretty damn impressive, though. I'll give it that.
It's like that cat pushing a watermelon meme. This is a binary that runs on seven operating systems. Your argument is invalid.
[+] [-] temp00345|5 years ago|reply
Not to detract from the technical breakthrough, but won't this be a major gift to virus writers?
[+] [-] Nextgrid|5 years ago|reply
[+] [-] jart|5 years ago|reply
[+] [-] jxf|5 years ago|reply
[+] [-] rahimiali|5 years ago|reply
[+] [-] antman|5 years ago|reply
[+] [-] barneygale|5 years ago|reply
[+] [-] al2o3cr|5 years ago|reply
[+] [-] momothereal|5 years ago|reply
[+] [-] TJSomething|5 years ago|reply