
Bakeware – Compile Elixir applications into single executable binaries

182 points | tempodox | 5 years ago | github.com

50 comments

[+] zapnuk|5 years ago|reply
As a more or less elixir beginner, elixir/mix releases remain confusing to me.

First of all, it took me way longer than it should have to figure out how to set the entrypoint of my application.

More importantly the resulting releases are kinda confusing, at least to me. Why does it need to contain 17 (13 .exe + 4 .bat) executables? Why are there 55 c-header files in the directory of the erts? Why are there so many configuration files?

I'm sure that they are there for a good reason, but the current solution is daunting, such that it lowers my enjoyment with the otherwise very elegant language.

IMO there should be a release approach that is very simple, compact, and extendable. This would allow people like me to hit the compile button and distribute my application in very manageable parts. Through further configuration it should be possible to end up with the very adjustable release that currently exists.

This project seems to go in the direction I would like. I hope the Windows release comes soonish so that I can try it.

[+] brightball|5 years ago|reply
The thing to remember about Elixir/Erlang is that it gives you a lot of deployment options. There's this mix of concerns with deployment tooling to both support all of those options and be simple at the same time, which is tough to achieve.

Just as an example, with many Java App Servers you deploy your packaged java code and it automatically handles seamless deployment across the cluster with zero downtime. And there are lots of different server options.

With languages using K8s and Docker, you're deploying a configured container across a system that will help manage the same thing.

Distillery has been the standard for deployments with Elixir for a while now and supports everything. Mix releases are slowly rolling this functionality into the core, but don't currently support all of the options that you get with Distillery.

But with the BEAM you've got clustering built in. You've got hot reloading options which literally deploy your new code in the middle of running code without so much as dropping a socket connection. You get the ability to roll that back too.

It's a little more complicated because it comes with a lot of very unique options that are built in instead of delegated to another system.

[+] rhizome31|5 years ago|reply
My experience is that preparing a release is much more straightforward since Elixir 1.9, which added release support to the Elixir standard distribution. That said, I tend to treat a release as a black box: I haven't tried to understand what all those files are and I just interact with bin/myapp.

However I still have two complaints.

First, the system on which the release is built and the system where it's going to run need to be similar (same version of the same Linux distribution). In practice this means that these days we build releases in containers matching the target host.

Second, releases can't run mix commands so we have to write boilerplate code specifically for releases for things like database migrations, etc. It would be nice if writing app management commands could be a bit more DRY.

I could be wrong but at first glance it doesn't seem that Bakeware addresses these issues.

[+] kungfooguru|5 years ago|reply
> Why are there 55 c-header files in the directory of the erts?

That shouldn't be the case anymore. It was a bug in rebar3 at one point and I think `mix release` may have copied said bug. If 1.11 still includes those header files then an issue should be opened.

[+] derefr|5 years ago|reply
First, answers to your questions, and then a general addressing of your idea:

1. Having an “entrypoint to your application” isn’t the right way to think about an Erlang release.

An Erlang release is like a virtual appliance: an OS (ERTS) with a set of “service” packages installed in it (your apps and libraries.)

And, just like when assembling a VM instance using Terraform or the like, you can set up/install multiple root-level applications/services within that VM. Like, say, a LAMP stack. Different root-level services, none installed as a dependency of the other; all just running as siblings and configured to talk to one-another; all in one VM image. The VM image, so-composed, would be a “release” of a single networked system component. Not a single service, but a single set of services, deployed and upgraded atomically, along with the virtual-machine OS substrate they run on. A “node”, in Erlang parlance.

So, then, how do you set up an entrypoint for such an OS service? On Linux, you’d use systemd service-units. Each systemd service has its own entrypoint, configured by the unit file. Equivalently, each Erlang app has its own entrypoint, configured by the .app manifest. (Which is, in Elixir, generated from the Mix project file, which is why the application `mod` directive ends up there in Elixir.)
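
To make the last point concrete, here is a minimal sketch of where that per-app entrypoint lives in a Mix project. The app name `:my_app` and the callback module `MyApp.Application` are hypothetical names chosen for illustration; they do not come from the thread.

```elixir
# mix.exs (sketch; :my_app and MyApp.Application are hypothetical names)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      elixir: "~> 1.9",
      deps: []
    ]
  end

  # Mix turns this keyword list into the generated .app manifest.
  # `mod` names the application callback module; its start/2 function
  # is what runs when this application is started inside the node.
  def application do
    [
      extra_applications: [:logger],
      mod: {MyApp.Application, []}
    ]
  end
end
```

Every application in the release carries its own manifest of this kind, so the node as a whole has no single "main"; it starts each application through its own `mod` callback.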

2. Those config files are the config files for the “OS” (ERTS), not for your app. They’re things that — in a different “multitenant” abstract-machine runtime, e.g. Smalltalk — would be hiding within the “image” that the emulator works with. Things that in a regular Linux VM, would be in /etc of the VM’s virtual block-device image.

Why are they there? Because Erlang is not designed under the expectation of heavy dev-ops collaboration. Releases may very well be created by an “upstream” of devs, and then thrown over a very tall wall to a hapless operations staff. The operations staff then has to deal with deploying this thing, where the only things they can tweak to get a release to work on a particular system, are those very config files. If they were inside the image, they’d have to ask the devs to burn them a new release with the fixes in place. As it stands, they can just tweak the deploy themselves.

3. And that’s also why there’s so many executables: a good few of them are different (static-compiled) emulators for the different operational deployment scenarios that won’t be known at build time, e.g. single-core vs. multi-core, where all this detail is abstracted away by detection-steps run just before emulator boot-time by those batch files. (Make no mistake: for any “runtime” package you might install — e.g. the JVM, the CLR, etc. — you get a similar menagerie of executables, just hidden somewhere out-of-sight.)

Oh, and some of them (e.g. EPMD) are just ERTS “daemons” that run as sub-processes of the emulator, rather than “in” the emulator. Since there are several variant emulators shipped in the release, burning this code into each of them and running it with fork(2), would result in more bulk to the release than just factoring it into a separate executable would. (And besides, Windows doesn’t fork(2).)

And also also, a more Erlang-y reason: isolating this code into separate processes, means you can validate it solely in terms of its failure-state IPC behaviour, rather than needing to take into account its failure-state emulator memory-state behaviour. It’s the same reason Erlang encourages the use of port processes over NIFs. It’s the same reason microkernels exist. Isolating failure, so things can crash hard, without the important things crashing.

4. The C header files are something you’ll see with any runtime that both ‘vendors’ the emulator itself; and ships a compiler accessible at runtime; and where that compiler supports FFI/building runtime extensions. In Erlang, you can run relups against a deployed release, that will install new Erlang applications into that release. If those Erlang apps contain native C code that needs to be compiled, the header files need to come from somewhere.

If they came from the host, they’d not be guaranteed to be compatible with the destination. Even if you wanted to set up some sort of cross-compilation toolchain matching the target, “the target” is a moving target, because relups can boot into a new version of the emulator; and because ops staff might independently upgrade/downgrade between relups (think “rolling upgrade failure”), meaning that any one of the set of so-far deployed copies of ERTS/BEAM might be the one running on any given node.

An Erlang node is a stateful, living system. Imagine it like a Windows virtual appliance that’s created as a series of Windows Deployment update files by a dev team; but where any given installation’s ops staff may-or-may-not choose to apply any given update pack. On such an appliance, the OS version isn’t really under the dev team’s control. Despite having atomic upgrades using atomic whole-release patches, it’s still not “immutable infrastructure” in the sense of e.g. a Docker image, where the whole image gets swapped out. And, as such, any given instance of the appliance can’t really be predicted in advance by the dev team, to have a particular OS version running on it. Rather, if the dev team wants generality, they have to build updates for multiple possible “base” versions of the OS; and then the update install system needs to interrogate/verify/select a matching update for the OS version that turns out to be running. And if they want specificity (e.g. to deliver a hot-fix update to a specific client), then they need to find out right before building the update, what OS version their appliance is currently running.

You can’t make the fully-general update-distribution problem any easier; but you can partially automate the hotfix-build-discovery problem. Just set up the virtual appliance so any running prod instance can be interrogated by your dev toolchain, whereupon it will deliver to your toolchain a tiny little cross-compilation toolchain (i.e. C header files et al) precisely matching the running instance.

Which is... precisely what ERTS does. Relups are weird.

—————

Even what I said before (an Erlang release being a VM) is a bad abstraction — an Erlang release is an atomic patch of a VM, that the VM itself can then switch to. Like a base-image in CoreOS... but where the VM can switch to it without needing to reboot. That has a lot of complications.

Some languages (e.g. Go, Rust) are “closed-world”: they assume that, within some boundary (in Rust, a “crate”), everything will become fixed at compile-time, with nothing further able to change or intercede at runtime. These language compilers can thus execute Whole Program Optimizations.

Other languages (e.g. Java) are “open-world”: they assume that code can be loaded at runtime, right into the middle of any boundary you might draw; and, therefore, optimizations can only occur at the level of the code-unit (e.g. module, class), guaranteeing that all replacements will at least happen atomically at the level of the code-unit.

And then there’s Erlang, which takes “open-world” to a whole different level.

What you’re basically imagining here, is a version of ERTS that takes a “closed-world” assumption. No relups, no runtime module loading, maybe even burning the whole system into a single BEAM file with WPO. This would disable much of what makes Erlang, Erlang — but it would be possible. It’s just not possible to build this on top of the current OTP version of ERTS, since the open-world/closed-world assumption of a runtime is baked into basically every implementation decision of a VM and runtime at a deep level. You’d need to write your own (much less complex!) VM and runtime.

[+] triplejjj|5 years ago|reply
It has many files because it brings in Erlang and its executables. Then Elixir and its executables. And then it adds some scripts to invoke those. It may make more sense if you think of a release as more of a bundle that includes all the tools that you have and may need in prod.

Theoretically speaking you shouldn’t need to worry about any scripts except the ones you have in bin/.

[+] arjan_sch|5 years ago|reply
Looks great!

Did not look at it in detail, but I'm wondering about portability, eg how this works with system libraries, like OpenSSL. Is the resulting binary portable to systems with different C libraries than the ones the OTP system was linked against originally?

[+] crusso|5 years ago|reply
I love this idea.

I've been using Elixir for years, on and off. The syntax is approachable, the actor model is solid and easy to reason about with many threads running. The Phoenix ecosystem is fantastic to work with, particularly now with LiveView making quick web UIs so effortless to create.

My main practical problem with Elixir over the years has been handing the tools I've created to others who might find them useful. Bakeware looks like the right way to proceed.

[+] abhijat|5 years ago|reply
Has anyone used elixir for building command line tools? It seems like this would be very useful for distributing them.
[+] nickjj|5 years ago|reply
The readme file in the repo mentions a ~500ms start up time. Maybe it's just me, but I wouldn't want to wait half a second for a tool to start running. For context most popular Unix tools start up in 1-2ms.
[+] freedomben|5 years ago|reply
I've used it to write CLI tools for myself and found it to be great for medium to large tools. For small tools it's still much faster and easier to grab Ruby (assuming a quick bash script isn't appropriate).

But yeah distribution is mostly a pain. I use the escript method but it's far from perfect. I'm excited about Bakeware and plan to try it.

[+] brightball|5 years ago|reply
I've always felt like the ideal way to use Elixir for command line tooling would be to have some way to let a pipe feed into a new BEAM process from an already running application.
[+] jdellinger|5 years ago|reply
In .NET we have a very similar tool for creating such packed binaries, dotnet-warp. I used it in one of my projects and quite liked it, since it's also quite easy to cross compile (cross-pack?) for the 3 major operating systems.

I like the general idea: you're independent of the system-wide framework version and it still has this "one-click" install procedure (dropping the binary in your path). However, I guess this is also the negative part. Users don't expect a single binary to extract itself to somewhere, so uninstalling the binary leaves traces on the system.

Definitely looking forward to trying it out for Elixir, and wondering how fast the Erlang/Elixir startup really is.

[+] at_a_remove|5 years ago|reply
Complete n00b question -- you can do that?

The last time I touched anything .NET was about a decade ago. My somewhat old-school superiors were unimpressed with the ability to come up with a plain .exe at the end of the process. My lack of familiarity with how Visual Studio had evolved certainly played into it; the IDE made me feel like a chimp dropped into an airliner cockpit. I had the worst time trying to figure out how to turn off the "of course you have an enterprise server dedicated to delivering upgrades!" setting. None of it seemed to, uh, scale down for our piddly purposes.

[+] csdreamer7|5 years ago|reply
Is there anything like this in Java?

I am aware of launch4j, but curious if there is a single step approach for all 3 platforms.

[+] crusso|5 years ago|reply
From the page: ~0.5s startup times or better on our computers
[+] tasogare|5 years ago|reply
The lack of single binary output will be fixed in dotnet 5.
[+] sajan45|5 years ago|reply
After using Go, I always felt the need for something like this in the Elixir world.
[+] victor106|5 years ago|reply
Wish something like this existed for Java
[+] ctas|5 years ago|reply
I believe GraalVM[0] makes your wish come true. It allows you to compile your Java code into a single binary and offers other features which make Java a feasible solution for CLI tools.

[0] https://www.graalvm.org/

[+] madduci|5 years ago|reply
There is: you can use jlink/jpackage or GraalVM compiler to produce compiled executables
[+] hauxir|5 years ago|reply
does it support hot code deployment?
[+] joshuakelly|5 years ago|reply
Can vanilla Elixir releases, for that matter? Not from what I've seen - would love to know if I'm wrong.
[+] moocowtruck|5 years ago|reply
you can't have your cake and eat it too