Watching this discussion makes it painfully obvious that the most vocal opponents of the post are those who have never used unified software package management. Sorry, guys, but complaining about how broken package management is when all you know is the google, browse, download, run, next-next-finish dance doesn't exactly help.
I don't want to know Mozilla released a new Firefox. I don't want to have to run it to find out. I don't want to know there is a new Java available and I don't care which version of OpenOffice is the latest and greatest. Or bzlib, or glibc. Or Python, Ruby, CMUCL... If the packaged version is good enough, I don't want to waste my time managing my computer, not a single minute. I'll let the system-wide updater do its job.
And that's what a unified software package management system (I use APT) does for me. It does its job so that I can do mine.
In the parts where I need more control, I skip the whole thing and install my own programs in whatever way makes sense for the specific product. My distro provides a reasonably up-to-date Python, but if I want to muck around with libraries that conflict with the packages provided by the distro, I use a virtualenv, which is a stand-alone Python environment with its own libraries and easy_install/pip. This is similar to the "bundle your versions" approach, but it's the exception rather than the norm.
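A minimal sketch of that virtualenv workflow (the path is made up; on modern Pythons the stdlib venv module does the same job as the virtualenv tool):

```shell
# Create an isolated Python environment without touching the distro's packages.
python3 -m venv /tmp/scratch-env       # or: virtualenv /tmp/scratch-env
. /tmp/scratch-env/bin/activate        # its bin/ now shadows the system python
python -c 'import sys; print(sys.prefix)'   # prints the env dir, not /usr
deactivate                             # back to the system Python
```

Anything pip-installed while the env is active lands inside /tmp/scratch-env, so conflicting library versions never touch the distro-managed tree.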
"Linux distributions must build everything from source code."
No, they don't have to. That's a self-imposed rule that's beneficial in some cases but not others. When an upstream developer releases nothing but source code, I appreciate the fact that the distros compile it on my behalf. But when an upstream developer (like Mozilla or Google or all these Java developers) releases a fully built and tested binary, I'd prefer the distros to just pass that on to me.
That's impossible, because the distribution is responsible for the updates it provides. If you package libraries with your program à la Firefox, Chrome, or Java, you simply make your program unmaintainable and unsupportable from the distro's point of view. This is true of Ubuntu, as well as Debian or Red Hat. This way of thinking is incompatible with basically all of the major distros.
The next obvious solution is to make separate packages for every version of a library that the software uses. The problem is that there is no real convergence on "commonly used" versions of libraries. There is no ABI protection, nor any general guidelines on versioning. You end up having to package each and every minor version of a library that the software happens to want.
I don't understand why exactly this is a problem. If a dev runs and tests against a particular version of a library, use that version. Even minor updates to libraries can cause problems. If the library has a security issue, blacklist the version and force the upstream devs of both the library and the app to acknowledge the issue and release new versions of their projects.
Force upstream to fix a bug and release a new version? And what if they don't? Quite a few packages in Debian have somewhat dormant upstream authors - the packages work, perhaps needing a few patches to compile with the latest versions of common libraries, but the original author has essentially moved on.
Consider:
libfoobar has a bug in at least one version seen in the wild. Is my system safe?
If your distro packages just one or two stable versions of libfoobar, any package that depends on libfoobar is either OK or not OK, and if it's not OK, you can patch the bug in one place and you're safe again. If upstream is dormant, perhaps the current package maintainer can fix it.
If there are various versions of libfoobar being linked by individual apps, you need to check every app for the flaw and work with both the app author and libfoobar's author to determine whether the flaw exists and how to fix it. Upstream libfoobar might say the bug has been fixed in the latest release, so just upgrade to that. Upstream app then has more work to do, and may be in denial about the importance of the bug. And if the source isn't available for the precise libfoobar bundled with the app, the package maintainer would have to either (a) rework the app to work with a stable system version of libfoobar, (b) package the odd version of libfoobar separately, and link that, or (c) delete the package from the distribution.
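The "patch it in one place" case is easy to check mechanically on a typical Linux box, using libc as a stand-in for libfoobar:

```shell
# Every dynamically linked program resolves the same system-wide C library,
# so a single fixed libc.so benefits all of them at once.
ldd /bin/sh | grep libc
# A privately bundled copy, by contrast, has to be found and audited per app.
```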
This is analogous to Zed's recent Python rant, and to the "DLL hell" mentioned previously.
FreeBSD went through the pain of extracting itself from its Perl dependency years ago, and it's way better for it. There's nothing like a minimal install. A nice, clean, empty slate.
This sounds peculiarly like the issue of "DLL Hell" on the Windows platform. Ultimately, you either hope that the shared libraries match, ship with the DLLs you need, or (as a newer option) throw your chips in with WinSxS -- where the cure might actually be worse than the disease.
Maybe the fact that Java grew out of the proprietary world is part of the reason why it doesn't play nicely with open-source/free operating systems.
It's a little different than DLL hell: most applications provide their own libraries precisely to avoid DLL hell. Rather than relying on (or trying to install) some globally-shared DLL, they include their own versions of libraries, and it doesn't matter if those versions match the versions used by other applications. (As a developer you have to worry about dependency differences between libraries, but as an application installer/user you don't have to worry about it once the developer's got everything working and bundled up). The Linux distro guys are in some ways asking Java applications to go back to a world where DLL Hell is possible, which might be good for distro packagers but would be a disaster for developers.
I think it's because Java sees itself as its own operating system, and because it started out with a pretty crummy module system that's only been incrementally improved over the years.
This problem is different from the Python issue Zed Shaw wrote about.
In Java, developers will bundle dozens of random .jar files with their application. Other versions of these libraries may exist in a Linux distribution already, but specific Java apps aren't linking to those and carefully documenting how to install all the dependencies (or which ones are optional). Instead, from the distro's point of view, these Java apps are being released with a bunch of binary blobs that may or may not contain bugs that need patching later. Which partly defeats the purpose of package management.
Python programmers don't do this. Distros love Python, which is why so many of their system administration/automation scripts depend on it. This puts some tension between packagers, who use it as a bash/perl replacement, and developers like Zed, who want to treat it like an up-to-date library -- but realize that the Java solution of just bundling the whole thing with the main app is ugly. That tension exists because Python developers haven't run amok the way Java developers have.
I've come to the conclusion that every language is broken from the point of view of people actually working with the language rather than relying on tools previously written in it. Ruby has broken out of this with rvm; maven kinda-sorta fixes Java but seems to come with a bundle of baggage; virtualenv with your own Python build works well enough. CLBuild and Quicklisp, the Haskell Platform, and, from what I can tell, Racket automate a similar approach. I don't know much about the Perl ecosystem, but at a guess I'd say that Perl's backwards compatibility saves it better than most. And this is before you start looking at the code quality and release processes of the libraries packaged in the languages' own distribution mechanisms.
Seriously, unless you're working in C/C++, I think it's only worth using the system-bundled languages if you're actually targeting the OS itself. If you're targeting more than one platform, using your own build seems to be the only way to stay sane. If you're only targeting a single platform, then unless you're tracking a bleeding-edge distro like Arch, you can guarantee that all the fun development is going on somewhere you can't reach.
Luckily, we have checkinstall.
"The problem is that Java open source upstream projects do not really release code"
Wait, what? If projects do not release source which you can modify, build, and repackage, do they really deserve to be called open source projects?
And I thought that integrating different projects was a distribution's main task. Isn't that exactly what he is talking about? Distros like Debian already backport massive amounts of code over the lifetime of a release to get everything working with their specific versions. Does Java really differ that much?
It's exactly the same thing and one of the reasons distros should get the hell out of the business of shipping stuff like this.
I'm on a mission. Distros that like to use languages, say like python, for system level stuff should stuff that shit somewhere isolated and ONLY use it for system level stuff.
Then they can provide or not provide a native package for python2.7/python3.1/ruby1.9.2 ... you get the point. Distros like RHEL and Ubuntu LTS lose ALL value as a platform for Ruby or Python development because they don't release often enough, and worry too much about breakage, to keep those languages up to date.
This is why companies like ActiveState are making a KILLING providing supported after-market dynamic language binaries.
What the distros should be doing, besides isolating any dynamic language they use for system-level configuration, is providing, with the support of the language vendors, an installable local package repository. I.e. you should be able to install a base RHEL-provided Python 2.7 RPM plus a local PyPI server and grab whichever packages you want to standardize on. Same goes for Ruby and gems.
This would solve the issue entirely and keep LTS distros like RHEL and Ubuntu from becoming irrelevant two weeks later when a new version of a gem comes out that you have to have for app X.
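The "base interpreter RPM plus local package index" idea could be expressed as a one-line pip configuration (the mirror URL is hypothetical):

```ini
# /etc/pip.conf -- point every install at the blessed internal index
[global]
index-url = http://pypi.internal.example/simple
```

With that in place, `pip install foo` pulls only the versions your local index chooses to publish, which is roughly the curation role the distro plays today.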
Ruby, Perl and Python packages usually come with a README that says:
This depends on these external packages: ...
Java programs usually come with a bunch of .jar files which were once independent packages, but have been dropped into the release itself. No dependency problems!
Then, if someone wants to package a Java application for Debian, the process is:
1. Look through the collection of .jar files in the release
2. Do you recognize one of these as already being packaged for Debian?
3. Work with upstream to delete that .jar from Debian's copy and depend on the system's version instead
4. Repeat for every other .jar in the release, until you hit a wall
5. Upload the package to Debian with an acceptably small number of bundled .jars
6. Time permitting, get someone to package the other .jars that aren't available in Debian yet
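Step 1 of the list above is mechanical; a mock release tree shows the shape of the job (the app and jar names are invented):

```shell
# Inventory the jars an upstream release bundles.
mkdir -p someapp-1.0/lib
touch someapp-1.0/lib/commons-logging.jar someapp-1.0/lib/log4j.jar
find someapp-1.0/lib -name '*.jar' | sort
# Each hit then has to be matched by hand against an existing Debian package
# (e.g. via apt-cache search), which is where the real work starts.
```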
That said, you can cause a similar amount of trouble in other languages; it's just not the convention (thanks to the success of gems, CPAN, and PyPI). For example, Ubuntu appears to have deleted the sagemath package because upstream keeps their own patched copies of dozens of libraries they depend on:
http://packages.ubuntu.com/search?keywords=sagemath&sear...
I think the real problem is that there are too few package maintainers for Java packages, and the upstream binaries are usable enough that there is less incentive to become a Java package maintainer compared to, e.g., C packages.
Edit: also, I don't believe in either of his solutions. The real solution, imho, would be to patch the upstream source to work with distro-provided libraries (of course, in some cases patching the library is also a viable alternative).
I think the real real problem is that assuming a "package maintainer" must exist for every single goddamn software package for every single goddamn Linux distribution has very obvious human scalability problems.
Just because an infinite number of monkeys appeared to create Debian does not mean it is safe to assume that there are many other groups of infinite monkeys out there to support every other software ecosystem.
Eventually, the Linux community will figure out that they've run out of bodies to build the same crap over and over, and "binary compatibility" will stop being a dirty word.
Stop worrying about duplicating files: disk is cheap.
Stop worrying about building everything from source: I'd rather use the official Firefox than the screwy patched version. (I'm looking at you, Debian Iceweasel.)
Just look at the way OS X does things.
I've been using Linux since 1993 and it's amazing this crap hasn't been fixed by now.
"We want to avoid code duplication (so that a security update in a library package benefits all software that uses it)"
The crux of the problem seems to be the Big Brother attitude distros take toward users and apps, specifically protecting users from unpatched vulnerabilities in upstream apps.
While this is crucial on servers and mass deployments, it's entirely possible users find this more of an annoyance than a feature, and thus we may have to wait just a little while longer for the so-called Year of Linux on the Desktop.
Some have suggested maven as a solution. I guess the only part I'm missing is how maven ties into the actual system. From what I've seen, it is always pulling from my ~/.m2 repository or a full repository upstream. Is there a way to have a "system" repository that yum/apt/etc. could install into?
Getting things to build with maven's idea of a build process has traditionally been tough, but it's pretty good for auto-downloading dependencies. Usually I needed a shell script that had maven download some, then wget'd a few more (that, for licensing reasons, couldn't go in an upstream server).
Unless a slightly-customized maven is distributed, you'd have to put a ~/.m2/settings.xml in the new-user template that specified your local repository. Which isn't too bad.
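That per-user template tweak is only a few lines of `~/.m2/settings.xml` (the repository host is hypothetical):

```xml
<!-- ~/.m2/settings.xml: route all artifact downloads through one local repo -->
<settings>
  <mirrors>
    <mirror>
      <id>internal</id>
      <mirrorOf>*</mirrorOf>
      <url>http://repo.internal.example/maven2</url>
    </mirror>
  </mirrors>
</settings>
```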
Anyone with more than a handful of developers should really be running their own maven repository mirror to shield deployments from external outages. http://nexus.sonatype.org/ is stable and simple to get running.
The fact that the libraries get downloaded to ~/.m2 should be irrelevant in production, because your deployments should happen against a deployment-specific role user, and downloads should be super-fast, because they're all from within your colo.
(all this being said, I haven't touched java and don't miss it at all since switching to Ruby on Rails to write AdGrok, and that's coming from more than 10 years soaking in java).
I think we need a radical break with the past to solve the horrible dependency nightmare we're looking at today. These would be my initial principles:
There should be _one_ unified Linux OS that includes only the bare minimum set of applications. It should be very clear what belongs to the OS and what doesn't. The root directory should contain exactly three subdirectories: os, app, and home.
There should be no dependencies on any non-OS software. Each application should live in its own subdirectory of /app. Libraries that are not part of the OS should not be shared across applications. The only external dependency an application should have is the OS.
Whatever the new solution is, it should not include package managers or package maintainers. Their existence is a symptom of an overly complicated system.
I realise that having a central registry of all installed software components does have advantages like being able to fix some security issues in one place. However, I think the idea has failed. It creates too many intractable dependencies, it is too complicated and hence ultimately insecure and unproductive.
One thing that always shocked me is how many Java apps play "library bingo".
There is no reason to use every third-party library available. IDEs make it easy to introduce all kinds of weird dependency in your code, but, please, don't.
There is a "standard" JRE. OpenJDK is already included in any distro.
You think we should include all up-to-date crap with all dependencies? No. It is not our problem.
Actually, the "standard" JRE (as far as most Java apps are concerned) is the official one from java.sun.com (bundled as "sun-java" in Ubuntu, IIRC).
I have encountered numerous instances of OpenJDK not being able to run Java apps that run fine on sun-java, and quite often when an app/applet I support doesn't work on a user's machine, it's because that user has OpenJDK installed. Replacing it normally solves the problem.
I don't really know why Linux distros so often insist on installing OpenJDK as the default when it has so many incompatibility issues.