top | item 29388456

cbrewster | 4 years ago

Author here. As with most things, it's all about the trade-offs. Docker has certainly proved itself, and that approach has worked at massive scale. However, it's not a silver bullet. For us at Replit, our Docker approach was causing issues: our base image was large and unmaintainable, and we had almost no way of knowing what changed between subsequent builds of the base image.

We've been able to use Nix to address both of those issues; others in a similar situation might also find Nix valuable.

Of course, Nix comes with its own set of opinions and complexities, but it has been a worthwhile trade-off for us.
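For readers unfamiliar with what this looks like in practice, here is a minimal sketch of a Nix shell environment (the package choices are illustrative, not Replit's actual configuration). Because every input is pinned through nixpkgs, the same expression always produces the same closure, and you can diff closures between builds to see exactly what changed:

```nix
# shell.nix -- hypothetical, minimal development environment.
# All dependencies come from a pinned nixpkgs, so rebuilds are
# reproducible and the difference between two builds is inspectable.
{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  buildInputs = [
    pkgs.python3   # illustrative language runtime
    pkgs.nodejs    # another illustrative runtime
    pkgs.ripgrep   # illustrative CLI tool
  ];
}
```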

mayli | 4 years ago

Correct, that's one of the cases where docker's layered image system doesn't work well. Nix is almost the perfect tool to perform incremental builds and deployments for the Replit requirements.

I wish Docker had the ability to merge multiple parent layers, like Git; then you could build the gigantic image by updating a single layer.

The only workaround Docker offers is multi-stage builds, but that won't work reliably in some cases, such as resolving conflicts between layers.
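For concreteness, the multi-stage workaround usually looks something like the sketch below (stage names and paths are hypothetical). `COPY --from` pulls files out of earlier stages, but overlapping paths are simply overwritten by the later `COPY`; there is no Git-like merge, which is the unreliability mentioned above:

```dockerfile
# Hypothetical multi-stage build combining artifacts from two stages.
FROM golang:1.21 AS build-a
# ... build tool A into /out/a ...

FROM node:20 AS build-b
# ... build tool B into /out/b ...

FROM debian:bookworm-slim
# Later COPY instructions overwrite conflicting paths from earlier
# ones -- conflicts are not merged, just clobbered.
COPY --from=build-a /out/a /usr/local/bin/
COPY --from=build-b /out/b /usr/local/bin/
```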

KronisLV | 4 years ago

Disclaimer: the following is still experimental, and will probably remain so for a while.

There is actually a --squash flag that you can use during builds to compress all of the layers: https://docs.docker.com/engine/reference/commandline/build/#...

For example:

  $ docker build --squash -t my-image .

In practice it can lead to smaller images, though in my experience, as long as you order your layers to make good use of the existing caching, you end up shuffling around less data anyway.

E.g.:

  - layer N: whatever the base image needs
  - layer N+1: whatever system packages your container needs
  - layer N+2: whatever dependencies your application needs
  - layer N+3: your application, after it has been built

With that ordering, I recently got a 300 MB Java app delivery down to only a few dozen MB actually being transferred: since nothing in the dependencies or the base image had changed, only the latest application version, stored in the last layer, was sent.

The above ordering also helps immensely with Docker's build cache. No changes in your pom.xml (or whatever file you use for tracking dependencies)? The cached layers on your CI server can be reused; no need to install everything again. No additional packages to install? Cache hit. That way, you can rebuild just the application and push the new layer to your registry of choice, keeping all of the others in place.

Using that sort of instruction ordering makes for faster builds, less network traffic, and therefore faster redeploys.
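As a sketch, that layer ordering for the Java example might look like the Dockerfile below (the image name, paths, and entry point are illustrative, not the actual setup described above):

```dockerfile
# Hypothetical Dockerfile showing cache-friendly layer ordering:
# the layers that change most often come last.
FROM eclipse-temurin:17-jre            # layer N: the base image

RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*     # layer N+1: system packages

COPY target/lib/ /app/lib/             # layer N+2: application dependencies
COPY target/app.jar /app/app.jar       # layer N+3: the application itself

CMD ["java", "-cp", "/app/app.jar:/app/lib/*", "com.example.Main"]
```

With this ordering, a change to the application code invalidates only the final COPY layer, so only that layer needs to be rebuilt and pushed.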

I even scheduled weekly base image builds and daily dependency builds to have everything ready (though that can largely be done away with by using something like Nexus as a proxy/mirror/cache for the actual dependencies). It's pretty good.

Edit: actually, I think I'm reading the parent comment wrong; maybe they just want to update a layer in the middle? I'm not sure. That would be nice too, to be honest.

AmericanBlarney | 4 years ago

Those sound like issues with your Docker usage; there are options for keeping base images quite streamlined (e.g. Alpine or distroless images).

cbrewster | 4 years ago

For context, I'm referencing our (legacy) base image for projects on Replit: Polygott (https://github.com/replit/polygott/).

The image contains dependencies needed for 50+ languages. This means repls by default are packed with lots of commonly used tools. However, the image is massive, takes a long time to build, and is difficult to deploy.

Unfortunately, slimming the image down is not really an option: people rely on all the tools we provide out of the box.