tylerl | 4 years ago
This was about the time that Bazel was being open-sourced, and Matt's rules_docker extension was already in there. A solution existed, so to speak, but it would have been nutty to assume that the average project would switch from the straightforward-looking Dockerfile format to using Bazel and BUILD files to construct docker containers. And Docker Inc wasn't going to play along; they were riding a high valuation that depended on them being the final word about containerization, so vocally pretending the problem didn't exist was their safest way forward.
At one point I put together a process and POC for porting the concept of reproducible builds to docker in a user-friendly format -- essentially you'd define a spec that listed your dependencies with no more specificity than you needed. Then tooling would dep-solve that spec and freeze it into a fully-reproducible manifest that encoded all the timestamps, package versions, and other bits that would otherwise have been determined at build time. Then the _actual_ build process left nothing to chance: grab the identified sources and build and assemble in a hermetic environment. You'd attach the manifest to the container, and it gave you a precise bill of materials in a format that you could confidently use for identifying vulnerabilities. Since the builds were fully hermetic, a given manifest would only ever produce one set of bits, which could be reproduced in an automated fashion, allowing you to spot supply chain inconsistencies.
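To make the spec-to-manifest step concrete, here is a rough Python sketch of the idea, not the original tooling. Everything in it is an assumption for illustration: the `INDEX` data, the "minimum version" spec format, and the `freeze` and `manifest_digest` helpers are hypothetical, and a real resolver would lean on the distro's package manager instead of a toy dictionary.

```python
import hashlib
import json

# Hypothetical package index: name -> {version: sha256 of the source archive}.
# The hashes are placeholders, not real digests.
INDEX = {
    "openssl": {"3.0.1": "aa11", "3.0.2": "bb22"},
    "zlib": {"1.2.11": "cc33", "1.2.12": "dd44"},
}


def parse_version(version):
    """Turn '1.2.11' into (1, 2, 11) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def freeze(spec, index, build_timestamp):
    """Resolve a loose spec (name -> minimum version) into a pinned manifest.

    The manifest records everything that would otherwise be decided at
    build time: exact versions, source hashes, and the timestamp to stamp
    into the output.
    """
    packages = []
    for name in sorted(spec):  # sorted so the result ignores spec ordering
        minimum = parse_version(spec[name])
        candidates = [v for v in index[name] if parse_version(v) >= minimum]
        if not candidates:
            raise ValueError(f"no version of {name} satisfies >= {spec[name]}")
        chosen = max(candidates, key=parse_version)
        packages.append(
            {"name": name, "version": chosen, "sha256": index[name][chosen]}
        )
    return {"build_timestamp": build_timestamp, "packages": packages}


def manifest_digest(manifest):
    """Hash a canonical JSON encoding so equal manifests get equal digests."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


spec = {"zlib": "1.2.11", "openssl": "3.0.0"}
manifest = freeze(spec, INDEX, build_timestamp=0)
print(json.dumps(manifest, indent=2))
print(manifest_digest(manifest))
```

The point of the split is that `freeze` is the only step allowed to make choices. The build itself would take the manifest as its sole input, so two builds from the same manifest digest should produce the same bits, and a mismatch is a signal worth investigating.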
In my tooling, I leaned heavily on package providers like Debian as "owning" the upstream software dependency graph, since this was a problem they'd already solved, and Debian in particular was already serious about reproducibility in their packages.
In the end, it didn't go anywhere. There were a LOT of hacks to make it work, since the existing software wasn't designed to allow this kind of integration. For example, the dependency resolution step required splicing in a lot of internal code from package managers, and the docker container format was (and probably still is) a mess that didn't allow the end products to be properly identified as reproducible without breaking other things.
Plus, this is a problem that only people trying to do security at scale even care about. We needed a sea-change of industry thought around verifiability before my solution would seem at all valuable to people outside a few huge tech companies.
dlor|4 years ago
Funny to see you here. Matt and I haven't given up on this; we're giving a lot of that another try at Chainguard.
tylerl|4 years ago