top | item 43030046

(no title)

IHLayman | 1 year ago

How this article discusses reproducibility in NixOS and declines to even mention the intensional model or efforts to implement it are surprising to me, since it appears they have done a lot of research into the matter.

If you don’t know, the intensional model is an alternative way to structure the NixOS store so that components are content-addressable (store hash is based on the targets) as opposed to being addressed based on the build instructions and dependencies. IIUC, the entire purpose of the intensional model is to make Nix stores shareable so that you could just depend on Cachix and such without the worry of a supply-chain attack. This approach was an entire chapter in the Nix thesis paper (chapter 6) and has been worked on recently (see https://github.com/NixOS/rfcs/pull/62 and https://github.com/NixOS/rfcs/pull/17 for current progress).

discuss

mschwaig|1 year ago

I think it would have been a good thing to mention, but difficult to do well in more than a quick reference or sidenote and could easily turn into a extensive detour. I'm saying this as someone who's working on exactly that topic. There is a little bit of overlap between the kind of quantitative work that they do and this design aspect: the extensional model leaves the identity of direct dependencies not entirely certain. In practice that means we don't know if they built direct dependencies from source or substituted them from cache.nixos.org, but this exact concern also applies to cache.nixos.org itself.

The intensional store makes the store shareable without also sharing trust relationships ('kind of trustless' in that sense), but only because it moves trust relationships out of the store, not because it gets rid of them. You still need to trust signatures which map an hash of inputs to a hash of the output, just like in the extensional model. You can however get really powerful properties for supply chain security from the intensional store model (and a few extra things). You can read about that in this recent paper of mine: https://dl.acm.org/doi/10.1145/3689944.3696169. I'm still working on this stuff and trying to find ways to get that work funded (see https://groundry.org/).

mikepurvis|1 year ago

You still need to trust something though. It's just that instead of trusting the signing of the binaries themselves, you trust the metadata that maps input hashes (computed locally) to content hashes (unknown until a build occurs).

The real win with content addressing in Nix is being able to proactively dedupe the store and also cut off rebuild cascades, like if you have dependency chain A -> B -> C, and A changes, but you can demonstrate that the result of B is identical, then there's no longer a need to also rebuild C. With input addressing, you have to rebuild everything downtree of A when it changes, no exceptions.

im3w1l|1 year ago

Is B remaining the same something that happens often enough for it to matter?