This work (and related efforts like Img and Buildah) is a big deal.
Right now docker images and Dockerfiles are joined at the hip to the Docker daemon.
It works great for local development, but for hosted systems that run on containers, it's a dire mess. I have personally slammed head-first into Docker-in-Docker quagmires on Kubernetes and Concourse. Not knowing the particular arcane rites and having neither sufficient eye of newt nor sufficient patience to get it to work, I like everyone else in the universe gave up.
Not an acceptable state of affairs, given the many problems of Dockerfiles in themselves. Dockerfiles force an ugly choice. You can have ease of development or you can have fast, safe production images. But you can't really have both.
Kaniko is another step in the direction of divorcing Docker images (a means of distributing bits) from Dockerfiles (a means of describing images) from Docker daemons (a means of assembling those images). All three are different things and should no longer be conflated.
Disclosure: I work for Pivotal, we have a lot of stuff that does stuff with containers.
> Not knowing the particular arcane rites and having neither sufficient eye of newt nor sufficient patience to get it to work, I like everyone else in the universe gave up.
One thing I feel like more people need to know: Docker container-images are really not that hard to build "manually", without using Docker. Just because Docker itself builds images by repeatedly invoking `docker run` and then snapshotting the new layers, people think that's what their build tools need to do as well. No! You just need to have the files you want, and know the config you want, and the ability to build a tar file.
Here's a look inside an average one-layer Docker image:
$ mkdir busybox_image; cd busybox_image
$ docker pull busybox:latest
$ docker save busybox:latest | tar x
$ tree
.
├── 8ac48589692a53a9b8c2d1ceaa6b402665aa7fe667ba51ccc03002300856d8c7.json
├── f4752d3dbb207ca444ab74169ca5e21c5a47085c4aba49e367315bd4ca3a91ba
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── manifest.json
└── repositories
1 directory, 6 files
• `repositories` contains the tag refs that will be imported when you `docker load` this archive;
• `manifest.json` contains the declarations needed for the daemon to unpack the layers into its storage backend (just a listing of the layer.tar files, basically);
• the SHA-named config file specifies how to reconstruct a container from this archive, if you dumped it from a container (and I believe it's optional when constructing a "fresh" archive for `docker load`ing);
Each SHA-named layer directory contains:
• a `layer.tar` file, which is what you'd expect, e.g.:
-rwxr-xr-x 0 0 0 1037528 16 May 2017 bin/bash
• a `json` file, specifying (the patch of!) the container config that that layer creates. (If you're composing a docker image from scratch, you just need the one layer, so you don't have to worry about the patching semantics.)
That's pretty much it. Make a directory that looks like that, tar it up, and `docker load` will accept it and turn it into something you can `docker push` to a registry. No need to have the privileges required to run docker containers (i.e. unshare(2)) in your environment. (And `docker load` and `docker push` work fine without a working Docker execution backend, IIRC.)
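To make that concrete, here's a sketch of that layout built by hand, with no Docker involved. Everything in it (the myimage/layer1 names, the demo rootfs, the sparse config JSON) is illustrative, and which config fields a particular daemon version insists on can vary, so treat it as the shape rather than a spec:

```shell
#!/bin/sh
# Hand-build a docker-load'able single-layer image archive without Docker.
# All names (myimage, layer1) and the demo rootfs are illustrative.
set -eu

# Stand-in for "the files you want" in the image:
mkdir -p rootfs/bin
printf '#!/bin/sh\necho hello\n' > rootfs/bin/hello
chmod +x rootfs/bin/hello

# One layer directory (SHA-named in real archives; friendly-named here):
mkdir -p image/layer1
tar -C rootfs -cf image/layer1/layer.tar .   # the filesystem layer
echo '1.0' > image/layer1/VERSION

# Per-layer config patch; with one layer there are no patching semantics:
cat > image/layer1/json <<'EOF'
{"id": "layer1", "config": {"Cmd": ["/bin/hello"]}}
EOF

# Tag refs imported on `docker load`:
cat > image/repositories <<'EOF'
{"myimage": {"latest": "layer1"}}
EOF

# Listing of the layer.tar files, so the daemon can unpack them:
cat > image/manifest.json <<'EOF'
[{"RepoTags": ["myimage:latest"], "Layers": ["layer1/layer.tar"]}]
EOF

tar -C image -cf myimage.tar .   # feed this to `docker load -i myimage.tar`
```

Whether a given daemon accepts the archive without the top-level SHA-named config file depends on the daemon version, so test against the one you target.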
This is one of the things that has me really interested in ansible-container. I don't like investing in Dockerfiles when there's still so much other stuff that has to go on AROUND the container itself.
The concept behind ansible-container (having the ability to create Docker, LXC, LXD, or any future type or flavor of container from the same Ansible playbooks you're already able to use to configure entire VMs or bare-metal machines) just feels like a much more efficient use of ops resources.
Ansible becomes portable across everything.
https://www.ansible.com/integrations/containers/ansible-cont...
I actually have that working... The trick is that DIND wipes /tmp (for no reason whatsoever) on startup, which also wipes out the Concourse build dir. You need a custom version of DIND with the startup script set up not to wipe /tmp.
We’re using Docker for development, but we still have to take the leap into production. The whole build/push/pull part is rather confusing somehow. I tried Docker Hub (or Docker Cloud build, as it's now called?), but the build itself takes forever... what are people using these days?
Also, for development machines: how do you sync things between developers? I can commit a Dockerfile change, but unless I explicitly tell Docker Compose to rebuild my images and containers, it will happily stick to the old version. I have to keep nagging our (3) developers to do this from time to time... what am I doing wrong? Sorry if these are dumb questions, but we're still stuck with the basics, it seems.
If you're still struggling with the build workflow, it's probably not yet the right time to take that leap.
It's not rocket science, of course. You build an image somewhere (your local machine, a CI server, anywhere), push to a registry, and when you want to run the image, you pull from the registry and run it. ("docker run" will, by default, automatically pull when you ask it to run something.)
I don't quite understand what your Compose problem is. Is the Compose file referencing images published to, say, Docker Hub? If so, the image obviously has to be built and published beforehand. However, it's also possible to run Compose against local checkouts, then run "docker-compose up --build", e.g.:
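A minimal Compose file along those lines might look like this (the service names and paths are made up for illustration):

```yaml
# docker-compose.yml -- services built from local checkouts, not pulled
version: "3"
services:
  web:
    build: ./web        # directory containing the service's Dockerfile
    ports:
      - "8080:8080"
  worker:
    build: ./worker
```

Running "docker-compose up --build" then rebuilds both images from the local checkouts before starting them, and so on.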
There's a whole ecosystem of tools built around Docker for building, testing, deploying and orchestrating Docker applications. Kubernetes is one. If you're having issues with the Docker basics, however, I wouldn't consider any of these systems quite yet, although you should consider automating your building and testing with a CI (continuous integration) system, rather than making your devs build and test on their local machines.
As with anything, to actually use Docker in production you'll need an ops person/team that knows how to run it. That could be something as simple as a manual "docker run" or a manual "docker-compose", to something much more complex such as Kubernetes. This is the complicated part.
We build a microservices-based tool, hosted as containers in AWS, and have a very developer-friendly workflow. My team's workflow might not work well for yours, YMMV, etc, but here's how we do it:
- When we make a PR, we mark it as #PATCH#, #MINOR#, or #MAJOR#.
- Once all tests pass and a PR is merged, CI uses that tag to auto-bump our app version (e.g. `ui:2.39.4`, or `backend:2.104.9`) and update the Changelog. [0]
- CI then updates the Dockerfile, builds a new image, and pushes that new image to our private repo (as well as to our private ECR in AWS).
- CI then updates the repo that represents our cloud solution to use the newest version of the app.
- CI then deploys that solution to our testing site, so that we can run E2E testing on APIs or the UI, and verify that bugs have been fixed.
- We can then manually release the last-known-good deployment to production.
The two main keys to all of this are that our apps all have extensive tests, so we can trust that a PR is not going to break things, and that our CI handles all the inconvenient version-bumping and generation + publication of build artifacts. The best part is, we no longer have to have 5 people getting merge conflicts when we go to update versions of the app, as CI does it for us _after_ things are merged.
0: We use pr-bumper (https://github.com/ciena-blueplanet/pr-bumper), a tool written by my coworkers, for our JS apps and libraries, and a similar Python tool for our non-JS apps.
My first recommendation would be to separate in your head the Docker development environment from the Docker production environment. They can be very different, and that is OK.
For production you want the Docker image to be built when PRs are merged to master (or whatever your flow is). Google Container Builder makes that very easy: you can set up a trigger to build an image and push it to the registry when there are changes in git (code merged to a branch, tag pushed, etc.). Then you need to automate getting that deployed, hopefully to Kubernetes, but that is a different issue.
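For reference, such a trigger just needs a small build config checked into the repo; something like this (the image name is a placeholder):

```yaml
# cloudbuild.yaml -- run by a Container Builder trigger on push
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/myapp:$COMMIT_SHA', '.']
images:
  - 'gcr.io/$PROJECT_ID/myapp:$COMMIT_SHA'
```

The `images` section is what gets the built image pushed to the registry when the build succeeds.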
This is cool! Thanks for posting. I can see how this is useful if building images is part of your CI process.
I’ve been using https://github.com/dminkovsky/kube-cloud-build to build images on Google Cloud Container Builder. It handles generating Cloud Container Builder build requests based on the images specified in my Kubernetes manifests, which was a big deal for me since writing build requests by hand was a total pain.
My understanding is that it is best practice to run your Docker builds and images as a non-root user; OpenShift, for example, will complain if you run as root. Now this kaniko image runs the build as root, contrary to that recommendation, and the post explicitly mentions this difference with Orca.
Why is it okay now for kaniko to run as root user?
With the availability of the free Red Hat tools for building container images (buildah...) and this, it will be interesting to see what remains of Docker (Inc).
It's pretty clear that Docker has been focused on moving downstream. They want to add value by assembling open-source components into a complete platform that they can control and sell. They don't want to be the ones developing all the components themselves - at this level of maturity and sophistication in the container market, they just don't have the manpower to do that. A major benefit of that strategy is that they can use the best component available, regardless of who developed it. I bet they're feeling spread very thin on the open-source side, and would love to redirect some of their resources away from developing a gazillion open-source gadgets on their own, and towards their commercial products (which historically have been not as good in my experience).
Evidence that Docker is doing this:
- They only advertise three things with the name Docker: "Docker for Mac" (a free product that is not open-source), "Docker EE" (an enterprise product), and "Docker Hub" (a cloud service). Those are all downstream products, like RHEL or Openshift.
- The whole "Moby" thing is basically their upstream brand, aka "the things not called Docker".
- They spun out tons of smaller projects like buildkit, linuxkit, containerd, runc, and seem eager to get others to use them and contribute, even competitors.
- They embraced Kubernetes as part of their downstream product, even though they famously did not invent it, and they certainly don't control it.
So I think people saying "these free open-source tools are killing Docker" are missing the point. The real competition for Docker is Openshift vs Docker EE, everything else is implementation details.
If you listen to the sales pitch of these two companies right now, it's an absolute tug of war. Docker focuses on independence and innovation ("we know where containers are going, and we don't force RHEL down your throat"). Red Hat focuses on maturity and upstream control ("We've been by your side for 20 years, are you going to trust us or some Silicon Valley hipster? Also we employ more Kubernetes contributors than anyone else").
That's the real battle. In my experience, on the open-source side you'll find mostly engineers from all sides collaborating peacefully and building whatever they need to get their job done.
Although Docker images are not hard to build (it is just layers of tars with the proper JSON files), it is very nice to see such tools rise. Even though I have a nice Kubernetes cluster (or any orchestrator), for security reasons I have to bring up a new VM with Docker installed and build there, which really sucks. It is sad that Docker did not implement this years ago, even though people wanted it a lot. They were busy deprecating the Swarm Whatever^TM for the 3rd time and not listening, as usual.
That said, Furan isn't suitable for untrusted Dockerfiles (or multi-tenant environments) exactly due to the security implications of access to the Docker engine socket.
The issue I see with Kaniko is drift from upstream Moby/Docker syntax. One of the strengths of Furan is the guarantee that the Docker build you perform locally is exactly what the service performs. When you can't make that guarantee you get into weird situations where "the build works for me locally" but there's some issue when doing a remote build. That's also why we've resisted putting special build magic into Furan (like injecting metadata into the build context, for example).
Does this (or could this) use Buildkit? It seems that Docker themselves are encouraging the development of an ecosystem of third-party container build tools, with buildkit as an interoperability layer. I heard good things about buildkit but haven't tried it yet.
If Kaniko authors are reading this: have you considered buildkit and, if not, would you be open to contributions based on it?
My understanding is that the official 'docker build' itself is based on Buildkit.
https://github.com/moby/buildkit
Kaniko doesn't use buildkit - buildkit still uses containerd/runC under the hood so it can't run inside a container easily.
We are looking at interoperability with buildkit (and the large set of other tooling like this) through the CBI: https://github.com/containerbuilding/cbi which aims to be a neutral interface on top of things like buildkit, buildah, docker and kaniko that build images.
Disclosure: I work on kaniko and other container things at Google.
Kaniko does run in the container as root, but the container doesn't need to be granted any extra privileges when run (you don't need the equivalent of Docker's --privileged flag).
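For illustration, that means kaniko can run as an ordinary unprivileged pod; a sketch like the following (the bucket and image names are placeholders, and registry credential wiring is omitted):

```yaml
# kaniko build as a plain Kubernetes pod -- no privileged flag needed
apiVersion: v1
kind: Pod
metadata:
  name: kaniko-build
spec:
  restartPolicy: Never
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - --dockerfile=Dockerfile
        - --context=gs://my-build-bucket/context.tar.gz
        - --destination=gcr.io/my-project/myapp:latest
```

The executor fetches the build context, runs the Dockerfile stages in its own filesystem, and pushes the result to the destination registry itself.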