top | item 25758863

Debian discusses vendoring again

195 points| Tomte | 5 years ago |lwn.net | reply

159 comments

[+] angrygoat|5 years ago|reply
A snippet of a quote from Pirate Praveen in the article.

> All the current trends are making it easy for developers to ship code directly to users. Which encourages more isolation instead of collaboration between projects.

This is where I would respectfully disagree. As a dev, packaged libraries from the system are often fine – until I hit a snag, and need to work with the devs from another project to work out a fix. With cargo/node/yarn/poetry/gopkg/... I can send a PR to another project, get that merged, and vendor in the fix while all of that is happening.

If I can't do that, I'm left with hacky workarounds, as upstreaming a fix and then waiting up to 12 months (if I'm on a six-month release tempo OS) for the fix to be available to me is just not practical.

Being able to work on a quick turnaround with dependencies to fix stuff is one of the huge wins with modern build tools.

[+] ohazi|5 years ago|reply
Even as an engineer, I usually draw a line between "stuff I like to hack on" and "core system components that I'd rather not touch." I'm fine pulling a dependency from nightly for a project I'm working on, or because some program I use has a cool new feature I want to play with. But I probably wouldn't do that with, say, openssh.

I can certainly sympathize with this:

> and then waiting up to 12 months (if I'm on a six-month release tempo OS) for the fix to be available

but the needs of system administrators are not the same as the needs of developers. That's why my development machine is on a rolling release, but my servers run Debian stable with as few out-of-repository extras as possible.

Those servers are really fucking reliable, and I don't need a massive team to manage them. Maybe this sort of "boring" system administration isn't as popular as it used to be with all of that newfangled container orchestration stuff, but this is the core of the vendoring argument.

Installing who-knows-what from who-knows-where can work if you're Google, but it really sucks if you're one person trying to run a small server and have it not explode every time you poke at it.

[+] z3t4|5 years ago|reply
I've been maintaining a Node.js app for about five years, and almost all dependencies have been "vendored"/locked with forked libraries: some of the dependencies have been abandoned, some have switched owners, where the new owner spends their days adding bugs to perfectly working code in the name of "syntax modernization", and some maintainers didn't accept pull requests for various reasons. Software collaboration is not that easy, especially when it's done by people in their (very little) spare time.
[+] olau|5 years ago|reply
I think you misunderstood what he is talking about.

The issue he's addressing is that you don't care about other projects also using this library.

[+] iforgotpassword|5 years ago|reply
> By trying to shoehorn node/go modules into Debian packages we are creating busy work with almost no value.

Another problem, which I've encountered at least with Python, is that the Debian packages sometimes seem to fight with what you downloaded via pip. They're not made to work together. I'm not a Python dev, so it was very confusing to figure out what was going on, and I wouldn't be surprised if it were similar if you mixed npm and deb packages for JS libs. They don't know of each other, can't know which libs were already provided by the other, the search paths are unknown to the user, etc. I think I went through similar pain when I had to get some Ruby project going.

My gut feeling is that it would be best if Debian only supplied the package of the software in question and let the "native" dependency-management tool handle all the libs, but I guess that would give the Debian folks a feeling of losing control, as it does make it impossible to backport a fix for specific libs; instead you'd have to fiddle with the dependency tree somehow.
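The confusion described above, "which installer owns this module?", can at least be diagnosed. A small sketch using only the Python standard library: the `dist-packages` vs. `site-packages` convention is Debian's way of separating apt-managed modules from pip-managed ones, though the exact paths on any given machine may differ.

```python
from typing import Optional
import importlib.util

def locate_module(name: str) -> Optional[str]:
    """Return the file path a module would be imported from, so you can
    tell whether it lives in /usr/lib/python3/dist-packages (Debian's
    apt-managed location) or in a pip-managed site-packages."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return spec.origin

def classify(path: str) -> str:
    """Rough guess at which installer owns a module, based on its path."""
    if "dist-packages" in path:
        return "apt (Debian system package)"
    if "site-packages" in path:
        return "pip (or another Python installer)"
    return "stdlib or other"

# Example: stdlib modules resolve to neither location.
for mod in ("json", "email"):
    origin = locate_module(mod)
    if origin is not None:
        print(f"{mod}: {origin} -> {classify(origin)}")
```

Running this against a module that shows up twice (once via apt, once via pip) tells you which copy actually wins on the import path.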

[+] viraptor|5 years ago|reply
> the debian packages sometimes seem to fight what you downloaded via pip

It's a bit annoying, but the rules are simple and apply to pip/gem/npm alike (not sure about Go): each runtime installation has a place for global modules. If you installed that runtime from a system package, you don't touch the global modules - they're managed by system packages.

If you install the language runtime on the side (via pyenv, asdf or something else) or use a project-scoped environment (Python venv, Bundler, or a local node_modules), you can install whatever modules you want for that runtime without conflicts.
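The project-scoped option is a one-liner in Python; a minimal sketch using the stdlib `venv` module (the directory name here is arbitrary):

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway project-scoped environment. Packages installed into
# it live entirely under its own directory tree and never touch the
# system-wide modules that apt manages.
env_dir = Path(tempfile.mkdtemp()) / "demo-env"
venv.create(env_dir, with_pip=False)  # with_pip=True would also bootstrap pip

# pyvenv.cfg is the marker that tells the interpreter it is running
# inside a venv, with its own isolated site-packages.
print((env_dir / "pyvenv.cfg").exists())  # True
```

Anything `pip install`-ed after activating such an environment lands inside `demo-env`, so the system package manager and pip stop stepping on each other.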

[+] initplus|5 years ago|reply
Sure they may have to fiddle with the dependency tree, but Node & Go both have well defined dependency formats (go.mod, package.json). It should be relatively easy to record the go.mod/package.json when these applications are built, and issue mass dependency bump & rebuilds if some security issue comes up.

Really seems like the best of both worlds, and less work than trying to wrangle the entire set of node/go deps & a selection of versions into the Debian repos. I mean, Debian apparently has ~160,000 packages, while npm alone has over 1,000,000!
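The "record the manifest at build time, then mass-rebuild on a CVE" idea above is straightforward to sketch. All package names and versions below are made up for illustration; a real implementation would read the actual package.json/go.mod captured when each application was built.

```python
import json

# Hypothetical package.json as it might be captured at build time.
manifest = json.loads("""
{
  "name": "example-app",
  "dependencies": {
    "express": "4.17.1",
    "lodash": "4.17.20"
  }
}
""")

def affected_builds(captured_manifests, vulnerable_pkg, vulnerable_version):
    """Return the names of applications whose recorded dependency
    manifest pins the vulnerable version, i.e. the set that needs a
    dependency bump and rebuild when a security issue comes up."""
    hits = []
    for app, deps in captured_manifests.items():
        if deps.get(vulnerable_pkg) == vulnerable_version:
            hits.append(app)
    return hits

captured = {manifest["name"]: manifest["dependencies"]}
print(affected_builds(captured, "lodash", "4.17.20"))  # ['example-app']
```

The archive-wide version of this is just the same query over every recorded manifest, followed by scheduling rebuilds for the hits.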

[+] prepperdev|5 years ago|reply
Debian policy is very sane (no network access during build), but it does seem like modern software just assumes that the Internet is always available, and all dependencies (including transitive) are out there.

The assumption is a bit fragile, as proven by the left-pad incident ([1]). I hope that whatever the outcome of the discussion in Debian is, it keeps the basic policy in place: not relying on things outside of its immediate control during package builds.

1. https://evertpot.com/npm-revoke-breaks-the-build/

[+] BelenusMordred|5 years ago|reply
Debian is incredibly conservative about versioning/updates and faces a lot of pressure to move faster. I hope they keep the same pace or even slow down.

The world will keep turning.

[+] cbmuser|5 years ago|reply
> Debian policy is very sane (no network access during build)

openSUSE has that policy, too. And I'm pretty sure the same applies to Fedora.

You don’t want to rely on external dependencies during build that you can’t control.

That would be a huge security problem.

[+] arp242|5 years ago|reply
The whole "download during build" thing is a minor issue; k8s, for example, puts all their dependencies in the /vendor/ directory, and AFAIK many toolchains support this or something like it. And even if they don't, this is something that can be worked around in various ways.

The real issue is whether or not to use that vendor directory, or to always use generic Debian-provided versions of those dependencies, or some mix of both. This is less of a purely technical issue like the above, and more of a UX/"how should Debian behave"-kind of issue.

[+] JoshTriplett|5 years ago|reply
I don't think that aspect of Debian Policy is in any danger of changing, nor should it.
[+] initplus|5 years ago|reply
I am worried here that the alternative we'll end up with is that applications relying on vendoring will be distributed entirely outside the Debian repositories... hopefully with go get/npm install, hopefully not with "download it from my website!"... But either way you lose a lot of the benefits that being officially in the Debian repos would bring. Devs want to distribute their software to users, and they aren't going to chase down rabbit holes to get it packaged to comply with every different distribution's set of available dependency versions.

Really, this idea that a distro (even a large, well-maintained one like Debian) has the resources to package a set of known versions of go/node packages for common open source software seems wrong? If they aren't going to package every exact version that's required, how is it going to be possible to test for compatibility? There is no way. And no dev is going to downgrade some random dependency of their app just to comply with Debian's set of available versions.

Developers hate this versioning issue with languages like C/C++ on Linux; it's a huge pain. And that's partially why dependency management in languages like Go/Node works the way it does. A multitude of distros with slightly different versions of every lib you use is a huge headache to develop for, so people have designed languages to avoid that issue.

[+] throwaway098237|5 years ago|reply
There has always been a split between software that is expected to run for 10 or 20 years and software that will be obsolete in 2 years.

https://www.cip-project.org/ aims to backport fixes to released kernel for 25 (twenty-five) years.

Because you don't "npm update" deployed systems in banks, power plants, airplanes and airports, trains, industrial automation, phone exchanges, satellites. Not to mention military stuff.

(And Debian is much more popular in those places than people believe)

> Devs want to distribute their software to users, and they aren't going to chase down rabbit holes to get it packaged to comply with every different distribution's set of available dependency versions.

That's what stable ABIs are for.

> Really this idea that a distro (even a large well maintained one like Debian) has the resources to package a set of known versions of go/node packages for common open source software seems wrong?

Yes, incredibly so. Picking up after lazy developers to unbundle a library can require hours.

Backporting security fixes for hundreds of thousands of libraries, including multiple versions, is practically impossible.

> And no dev is going to downgrade some random dependency of their app just to comply with Debian's set of available versions.

Important systems will keep running in the next decades. Without the work from such developers.

[+] jillesvangurp|5 years ago|reply
That's already the reality for most of this century. Openjdk, go, rust, docker, npm/yarn, etc. all provide up to date Debian, Red Hat, etc. packages for what they offer. There's zero advantage to sticking with the distribution specific versions of those packages which are typically out of date and come with distribution specific issues (including stability and security issues).

Debian's claims of adding value in terms of security and stability over those vendor-provided packages are IMHO dubious. At best they sort of ship security patches with significant delays by trying to keep up with their stable release channels. Worst case, they botch the job, ship them years late, or introduce new bugs repackaging the software (I've experienced all of that at some point).

When it comes to supporting outdated versions of e.g. JDKs, there are several companies specializing in exactly that which actually work with Oracle to provide patched, tested, and certified JDKs (e.g. Amazon Corretto, Azul, or AdoptOpenJDK). Of course for Java, licensing the test suite is also a thing. Debian is probably not a licensee, given the weirdly restrictive licensing for that. Which implies their packages don't actually receive the same level of testing as the aforementioned ways of getting a supported JDK.

On development machines, I tend to use things like pyenv, jenv, sdkman, nvm, etc. to create project specific installations. Installing any project specific stuff globally is just unprofessional at this point and completely unnecessary. Also, aligning the same versions of runtimes, libraries, tools, etc. with your colleagues using mac, windows, and misc. Linux distributions is probably a good thing. Especially when that also lines up with what you are using in production.

Such development tools of course have no reason to exist on a production server. Which is why docker is so nice since you pre-package exactly what you need at build time rather than just in time installing run-time dependencies at deploy time and hoping that will still work the same way five years later. Clean separation of infrastructure deployment and software deployment and understanding that these are two things that happen at separate points in time is core to this. Debian package management is not appropriate for the latter.

Shipping tested, fully integrated, self-contained binary images is the best way to ship software to production these days. You sidestep distribution specific packaging issues entirely that way and all of the subtle issues that happen when these distributions are updated. If you still want Debian package management, you can use it in docker form of course.

[+] npsimons|5 years ago|reply
> But either way you lose a lot of the benefits that being officially in the Debian repos would bring.

The first thing I do when I hear about a new (to me) piece of software is an "apt-cache search $SOFTWARE". If it doesn't show up there, that's a red flag to me: this software isn't mature or stable enough to be trusted on my production machines.

Sure, I might go ahead and download it to play around with on my development machines, but for all the "I'm making it awesome!" arguments of developers, more often than not it's just an excuse for lack of discipline in development process.

[+] JoshTriplett|5 years ago|reply
This was exactly my concern. I believe that Debian packages should avoid vendoring when possible, but that means it must be possible to package the individual modules, even if there are multiple versions, and even if there are many small dependencies.
[+] sorisos|5 years ago|reply
Agreed, no one is going to downgrade, but there is another strategy - always build your app against the package versions that are in Debian stable. Of course it can be problematic, but it has some advantages: they're well tested, and any bugs probably have documented workarounds.
[+] choeger|5 years ago|reply
Personally I believe that vendoring is just the lazy approach of developers who don't want to care about the ecosystem their software runs in. Consequently, their software will probably not be maintained for a long time (Red Hat offers 10 years of support, for instance). It's a shame, but it seems like the cool kids simply tend to ignore sustainability in software development.

Since npm, pip, go, cargo, etc. are open source projects, would it not be simpler to add a "Debian mode" to them? In that mode, the tool could collaborate with the system package manager and follow any policies the distribution might have.

[+] jancsika|5 years ago|reply
If upstream has decided to vendor, I only see two sensible options:

* package the vendored software in Debian, and annotate the category of vendored packages so it's clear to the user they cannot follow the normal Debian policies. I've been bitten by the lack of such feedback wrt Firefox ESR. My frustration would have gone away completely if the package manager told me, "Hey, we don't have the volunteer energy to properly package this complex piece of software and its various dependencies. If you install it, it's your job to deal with any problems arising from the discrepancy between Debian's support period and Mozilla's support period." As it is, Debian's policy advertises a level of stability (or "inertia" as people on this thread seem to refer to it) that isn't supported by the exceptions it makes for what are probably two of the most popular packages-- Chromium and Firefox.

* do not package software that is vendored upstream

I can understand either route, and I'm sure there are reasonable arguments for either side.

What I cannot understand-- and what I find borderline manipulative-- is pretending there's some third option where Debian volunteers roll up their sleeves and spend massive amounts of their limited time/cognitive load manually fudging around with vendored software to get it in a state that matches Debian's general packaging policy. There's already been a story posted about two devs approaching what looked to me like burnout over their failed efforts to package the same piece of vendored software.

Edit: clarification

[+] newpavlov|5 years ago|reply
It's a very reasonable policy to require the ability to build everything offline, without accessing languages' "native" repositories. But I think a big problem is that Debian requires each library to be a separate package.

For classic C/C++ libraries this is not a problem, since for historical reasons (the lack of a good, standard language package manager, and thus the high level of pain caused by additional dependencies) they tended to be relatively big libraries. Meanwhile in newer languages, good tooling (cargo, npm, etc.) makes the "micro-library" approach quite viable and convenient (to the point of abuse, see left-pad). And packaging an application with sometimes several hundred dependencies is clearly a Sisyphean task.

I think that, instead of vendoring, Debian should adopt a different packaging policy, which would allow them to package whole dependency trees into a single package. This should make it much easier for them to package applications written in Rust and similar languages.
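The single-package idea amounts to flattening a lock file into one piece of source-package metadata. A toy sketch, assuming a Cargo.lock-style lock file (the crate names, versions, and the "-with-deps" naming are invented for illustration; real Debian metadata would use the d/control and built-using conventions):

```python
import re

# A fragment in Cargo.lock's [[package]] format (hypothetical pins).
LOCKFILE = """
[[package]]
name = "serde"
version = "1.0.130"

[[package]]
name = "tinyvec"
version = "1.5.0"
"""

def bundle_manifest(lock_text, app_name, app_version):
    """Collapse a whole dependency tree into the metadata for a single
    source package, instead of one distro package per micro-library."""
    deps = re.findall(r'name = "([^"]+)"\nversion = "([^"]+)"', lock_text)
    return {
        "source": f"{app_name}-with-deps",
        "version": app_version,
        # Record every embedded library so security tooling can still
        # see what is bundled inside the single package.
        "embedded": [f"{n} ({v})" for n, v in deps],
    }

print(bundle_manifest(LOCKFILE, "ripgrep", "13.0.0"))
```

The key point is the "embedded" list: the tree is packaged as one unit, but the contents stay machine-readable for audits.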

[+] jillesvangurp|5 years ago|reply
Well, C/C++ historically had no separate dependency management, making Linux distributions effectively the de facto package managers for C/C++.

Other languages do have package managers and not using those is typically not a choice developers make.

I agree that vendoring npm, maven, pip, etc. dependencies for the purpose of reusing them in other packages that need them (as opposed to just vendoring the correct versions with those packages) is something that probably adds negative value. It's just not worth the added complexity of trying to even make that work correctly. Also, package locking is a thing with most of these package managers, meaning that anything else is by definition the wrong version.

[+] giovannibajo1|5 years ago|reply
> I think, that instead of vendoring, Debian should instead adopt a different packaging policy, which would allow them to package whole dependency trees into a single package.

I'm not sure how this is different from what I call vendoring, and I think this is indeed the solution.

In Go, there's "go mod vendor" which automatically creates a tree called "vendor" with a copy of all the sources needed to build the application, and from that moment on, building the application transparently uses the vendored copy of all dependencies.

In my ideal world, Debian would run "go mod vendor" and bundle the resulting tree into a source DEB package (notice that the binary DEB package would still be "vendored", because Go embraces static linking anyway).

If the Debian maintainer of that application wants to "beat upstream" at releasing security fixes, they will have a monitor on those dependencies' security updates, and then whenever they want, update the required dependencies, revendor and ship the security update.

What I totally disagree with is having "go-crc16" as a Debian package. I'm not even sure who would benefit from that - surely not Go developers, who will install packages through the Go package manager and decide and test their own dependencies without even knowing what Debian is shipping.
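Monitoring the vendored dependencies for security updates is made easier by the fact that "go mod vendor" writes an inventory file, vendor/modules.txt. A small sketch that extracts the (module, version) pairs a maintainer would watch; the module paths and versions here are made up, and the parsing is a simplification of the real file format:

```python
# Hypothetical contents of a vendor/modules.txt as written by
# "go mod vendor": "# module version" headers, "## explicit" markers,
# and the vendored package paths underneath each header.
MODULES_TXT = """\
# golang.org/x/text v0.3.7
## explicit; go 1.17
golang.org/x/text/unicode/norm
# gopkg.in/yaml.v2 v2.4.0
## explicit
gopkg.in/yaml.v2
"""

def vendored_modules(text):
    """List (module, version) pairs from a vendor/modules.txt: the
    inventory to cross-check against upstream security advisories."""
    mods = []
    for line in text.splitlines():
        parts = line.split()
        # Module header lines look like: "# <module-path> <version>"
        if len(parts) == 3 and parts[0] == "#" and parts[2].startswith("v"):
            mods.append((parts[1], parts[2]))
    return mods

print(vendored_modules(MODULES_TXT))
```

When an advisory lands for one of the listed modules, the maintainer bumps it in go.mod, re-runs "go mod vendor", and ships the rebuilt package.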

[+] RcouF1uZ4gsC|5 years ago|reply
> For classic C/C++ libraries it's not a problem, since for historical reasons (lack of a good, standard language package manager and thus high-level of pain caused by additional dependencies)

This is also one of the big reasons why header-only C++ libraries are so popular.

[+] rezonant|5 years ago|reply
Speaking as a seasoned Node.js dev: if they think they can handle Node's nested vendored packaging system using flat Debian packaging and guarantee correct behavior of the app, they are sorely mistaken. It's a fool's errand. The sheer amount of effort being proposed here is astounding.
[+] zaarn|5 years ago|reply
If all you have is a hammer...

It's not the first time Debian package policies have seemed backwards, trying to shove a square peg through a round hole. I hope the solution does not end up being "make APT do it", because APT is a terrible package manager to begin with (I hate every second that I had to fight APT over how to handle pip packages that I would very much like installed globally).

[+] superkuh|5 years ago|reply
This futureshock is a result of the rapid pace of new features implemented within commonly used libraries and immediately used by devs. The rapid pace is good for commerce and servers, but it's bad for the desktop. Commerce pays almost all the devs (except those wonderful people at Debian), so the futureshock will continue. The symptoms of this imbalance between development incentives and user incentives express themselves as containerization and vendoring.
[+] IceWreck|5 years ago|reply
Fedora has separate packages for libraries. But for Node.js, packaging individual libs led to a huge clusterfuck of difficult-to-maintain packages. Now they've decided that nodejs-based packages will bundle compiled/binary nodejs modules for now. https://fedoraproject.org/wiki/Changes/NodejsLibrariesBundle...
[+] eclipseo76|5 years ago|reply
And for Golang, we try to unbundle; we have around 1,600 Go libraries packaged. Some packages are still bundled, like k8s, though, due to dependency hell.
[+] zeckalpha|5 years ago|reply
> Kali Linux, which is a Debian derivative that is focused on penetration testing and security auditing. Kali Linux does not have the same restrictions on downloading during builds that Debian has

The security auditing distribution has less auditable requirements around building packages?

[+] rbanffy|5 years ago|reply
I believe that one lesson here is that just because it's now possible to have a thousand dependencies doesn't mean you should have a thousand dependencies. It'll make your sysadmins very sad.

I don't want the latest libraries on my servers. I want my servers to be boring and not change often. I want them to run the time-proven, battle-tested and well-understood software, because I don't want to be the first to debug those. There are people better at that than me.

If, and only if, there's a blocker bug in a distro-provided package, I'll think of vendoring it in. And then only if there is no plausible workaround.

Of course, I also do testing against the latest stuff so I'm not caught off-guard when the future breaks my apps.

[+] PeterisP|5 years ago|reply
IMHO for these ecosystems we're seeing a swap in priorities between OS/distro and app - instead of having the server as the main unit, which provides certain libraries and certain apps, the approach is to have a box (very likely virtualised or containerised) that's essentially "SuperWebApp v1.23" and the server is only there to support that particular single app.

The server/os/distro/admin does not dictate what library version the app should use; the app tells which library version it prefers, and either packages it with itself or pulls it at installation time. If something else needs a different version - then that something else should be somewhere else, isolated from the environment that's tailored for that app only. You don't go looking for the package of that app version for the Debian release that you have; you don't try to run a 2025 version of the app on a 2021 long-term-support version of the distro. Instead, you choose the app version that you want to have, and pick the Debian (or something else) version that the particular version of the app wants.

Also, an app like that does not expect to be treated as a composition of a thousand dependencies; it wants to be treated as a monolithic black box. If there's a bug (security or not) in a dependency of SuperWebApp v1.23, you treat it in exactly the same way as if there's a bug in the app itself - you deploy the update that the app vendor provides. In that context, a long-term-support OS is required for the things that the app itself does not want to support (e.g. kernel and core libraries) - the app developer is not upstream for the distro; instead, the app developer includes a distro (likely a long-term-support version) as an upstream dependency packaged with the app VM or container.

If you need to go from "SuperWebApp v1.23" to "SuperWebApp v1.24", then the server can be treated as disposable, and everything either replaced fully or transformed in a noncompatible way to fit the new requirements - because, after all, that app is the only app that determines what else the server should have. Cattle, not pets; replaceable, not cared for.

[+] viraptor|5 years ago|reply
I haven't seen it mentioned in that discussion, but vendoring is interesting from the reproducible-builds point of view, especially after the recent SolarWinds incident. The dependencies become one step removed from their upstream distribution and are potentially patched. Tracking what you're actually running becomes a harder problem than just looking at package versions.

With vendoring we'll see Debian security bulletins for X-1.2.3 which actually mean that the vendored Y-3.4.5 is vulnerable. And if you're monitoring some other vulnerabilities feed, Y will not show up as a package on your system at all.
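The gap described above is essentially a missing reverse index from vendored libraries to the packages that carry them. A toy sketch, with the X/Y/W names and versions taken from the comment's own hypothetical:

```python
# Which packages vendor which library versions. On a real system this
# index would be built from package metadata, not hardcoded.
vendored_index = {
    "X": {"Y": "3.4.5", "Z": "1.0.2"},
    "W": {"Y": "3.3.0"},
}

def packages_exposed(advisory_lib, advisory_version):
    """Return the installed packages that carry the vulnerable vendored
    copy. A naive "is Y installed as a package?" check would report
    nothing, because Y only exists inside X and W."""
    return sorted(
        pkg for pkg, deps in vendored_index.items()
        if deps.get(advisory_lib) == advisory_version
    )

print(packages_exposed("Y", "3.4.5"))  # ['X'] - W vendors an older Y
```

Without such an index, a vulnerability feed keyed on library names simply cannot see vendored copies, which is exactly the monitoring problem the comment raises.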

[+] npsimons|5 years ago|reply
I haven't been through a lot of comments here or at the link, but I'll bring up something I ran into in what I realize now was an early version of "vendoring": over a decade ago I was playing around with https://www.ros.org/, and there were no distribution packages, so I went with the vendor method, and I distinctly remember it downloading gobs of stuff and building it, only to break here and there. It was fucking terrible to work with and I only did it because it was R&D, not a production grade project, and I was being paid full time for it.

Vendoring "build" processes, IME, are incredibly prone to breakage, and that alone is reason I won't bother with them for a lot of production stuff. Debian is stable - I can "apt install $PACKAGE" and not have to worry about some random library being pulled from the latest GitHub version breaking the whole gorram build.

[+] chalst|5 years ago|reply
I'm surprised the option of moving the package to contrib got so little support. Many of these packages don't seem a good fit for Debian stable and its security-patch model.
[+] nanna|5 years ago|reply
I recently returned to Debian after a long hiatus in Ubuntu. This time, I'm using Guix as my package manager.

It's a wonderful combo. Bleeding edge, reproducible, roll-backable, any version I choose, packages if I want them via Guix. Apt and the occasional .deb file as a fallback or for system services (nginx etc.). And Debian as the no-BS, no-snap, solid foundation of everything.

To me this is the future.

[+] pabs3|5 years ago|reply
Have you considered GuixSD (Guix as an OS instead of just package manager) instead of Debian?
[+] twentydollars|5 years ago|reply
hmmm, this is your desktop or a server?
[+] bfrog|5 years ago|reply
Nix seems to have effectively solved this, by more or less vendoring everything, but in a way that still allows shared usage. Having made a few deb and rpm packages in my life, I don't miss it. At all.
[+] ogre_codes|5 years ago|reply
This is a bit of a pet peeve with Linux packaging systems.

I want application X.

  $ sudo apt-get install appX

  This is going to install 463 packages, do you want to continue (y/n)?

  # HELL NO
Seems like every time a language starts to get popular, this is an issue until you have 8 or 9 sets of language tools piled up that you never use.
[+] Too|5 years ago|reply
Look for an appX-minimal package and add --no-install-recommends.

Don’t ask why these are not the defaults.

[+] tpoacher|5 years ago|reply
Look, I need that leftpad import, ok?
[+] makz|5 years ago|reply
What if linux distributions stop packaging stuff altogether?

Most of the time it seems to create more trouble than it's worth (for the developers and mantainers of such distributions).

Maybe just provide a base system and package management tools but leave the packaging to third parties.

We can see that already with repositories such as EPEL and others more specialized.

[+] 3np|5 years ago|reply
In most any distribution you can configure your own repositories.

Realistically, you could set up a minimal Arch and host your own AUR repo (there are projects for this). This is basically what the AUR is.

Or Debian PPAs if you're looking for more self-contained bundles.

And there's always Gentoo.

I think what you want kind of exists and is in practice already :)

Personally I find A LOT of value in distributions, and it's obvious that others do too - otherwise they wouldn't have the significance they do today.