top | item 35166317

Docker is deleting Open Source organisations - what you need to know

1556 points | alexellisuk | 3 years ago | blog.alexellis.io | reply

738 comments

[+] dbingham|3 years ago|reply
As an SRE Manager, this is causing me a hell of a headache this morning.

In 30 days a bunch of images we depend on may just disappear. We mostly depend on images from relatively large organizations (`alpine`, `node`, `golang`, etc), so one would want to believe that we'll be fine - they're all either in the open source program or will pay. But I can't hang my hat on that. If those images disappear, we lose the ability to release and that's not acceptable.

There's no way for us to see which organizations have paid and which haven't. Which are members of the open source program and which aren't. I can't even tell which images are likely at risk.

The best I can come up with, at the moment, is waiting for each organization to make some sort of announcement with one of "We've paid, don't worry", "We're migrating, here's where", or "We've applied to the open source program". And if organizations don't do that... I mean, 30 days isn't enough time to find alternatives and migrate.

So we're just left basically hoping that nothing blows up in 30 days.

And companies that do that to me give me a very strong incentive to never use their products and tools if I can avoid it.

[+] softfalcon|3 years ago|reply
First of all, I want to say: that sounds deeply frustrating.

Secondly, if this is a serious worry, I would recommend creating your own private Docker registry.

https://docs.docker.com/registry/deploying/

Then I would download all current versions of the images you use within your org and push them up to said registry.

It’s not a perfect solution, but you’ll be able to pull the images if they disappear and considering this will take only a few minutes to set up somewhere, could be a life saver.
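A minimal sketch of that mirroring step. The registry host `registry.internal:5000` is a made-up example, and the script only prints the docker commands so you can review them before piping the output to `sh`:

```shell
#!/bin/sh
# Print the docker commands needed to mirror a list of upstream images
# into a private registry; review the output, then pipe it to sh.
# (The registry itself can be started with: docker run -d -p 5000:5000 registry:2)
# "registry.internal:5000" is a hypothetical in-house hostname.
MIRROR="registry.internal:5000"

mirror_cmds() {
    for image in "$@"; do
        printf 'docker pull %s\n' "$image"
        printf 'docker tag %s %s/%s\n' "$image" "$MIRROR" "$image"
        printf 'docker push %s/%s\n' "$MIRROR" "$image"
    done
}

mirror_cmds alpine:3.17 node:18-slim golang:1.20
```

Run as `./mirror.sh | sh` once the emitted commands look right; your Dockerfiles then reference `registry.internal:5000/alpine:3.17` and friends.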

As well, I should note that most cloud providers also have a container registry service you can use instead of this. We use the google one to back up vital images to in case Docker Hub were to have issues.

Is this a massive pain in the butt? Yup! But it sure beats failed deploys! Good luck out there!

[+] jjav|3 years ago|reply
> If those images disappear, we lose the ability to release and that's not acceptable.

This shines light on why it is so risky (from both availability and security perspectives) to be dependent on any third party for the build pipeline of a product.

I have always insisted that all dependencies must be pulled from a local source even if the ultimate origin is upstream. I am continuously surprised how many groups simply rely on some third party service (or a dozen of them) to be always and perpetually available or their product build goes boom.

[+] aprdm|3 years ago|reply
You can vendor images. Never have your product depend on something that lives on the internet. Spin up Harbor locally and put it in the middle to cache, at the very least.
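For the pull-through-cache setup, a hedged sketch of the daemon-side config. The Harbor URL is hypothetical, and the real file lives at `/etc/docker/daemon.json` (restart dockerd after editing); the sketch writes to `./daemon.json` so it is harmless to run:

```shell
#!/bin/sh
# Sketch: point the Docker daemon at a pull-through cache, e.g. a Harbor
# proxy-cache project. "harbor.internal.example.com" is a made-up host.
# The real file is /etc/docker/daemon.json; written locally here.
cat > ./daemon.json <<'EOF'
{
  "registry-mirrors": ["https://harbor.internal.example.com"]
}
EOF
```

Note that `registry-mirrors` only applies to Docker Hub pulls; images from other registries still need fully qualified names.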
[+] cpitman|3 years ago|reply
Many of the responses here are talking about how to vendor/cache images instead of depending on an online registry, but remember that you also need access to a supply chain for these images. Base images will continue to be patched/updated, and you need those to keep your own images up to date. Unless the suggestion is to build all images, from the bottom up, from scratch.
[+] friendzis|3 years ago|reply
> If those images disappear, we lose the ability to release and that's not acceptable.

left-pad moment once again.

> I mean, 30 days isn't enough time to find alternatives and migrate.

Maybe take control of mission critical dependencies and self-host?

[+] caeril|3 years ago|reply
This whole thing is so weird. Why do so many organizations depend on the internet to function?

It wasn't too long ago that it was standard practice to vendor your dependencies; that is, dump your dependencies into a vendor/ directory and keep that directory updated and backed up.

But now, you all think it's 100% acceptable to just throw your hands up if github is down, or a maven repository is down, or docker hub makes a policy change?

Every year that goes by it becomes clear that we are actually regressing as a profession.

[+] ohgodplsno|3 years ago|reply
Or you could, you know, host a Docker registry and reupload those images to something you control. Worst case scenario, in 30 days, nothing is gone from Docker and you can just spin it down.

Your job as an SRE is not to look at things and go "oh well, nothing we can do lol".

[+] richardwhiuk|3 years ago|reply
You should be escrowing any Docker images you depend on, I'd have thought.
[+] SergeAx|3 years ago|reply
Our organization currently caches each and every external dependency we use: Go, Python, npm, and .NET packages, Docker images, Linux deb packages, so everything is contained inside our perimeter. We did that after our self-hosted GitLab runners were one day throttled and then rate-limited by some package repository, and all CI pipelines halted.
[+] nickcw|3 years ago|reply
> Which are members of the open source program and which aren't.

You can tell which are members of the open source program if you go to their docker hub page and you'll see a banner "SPONSORED OSS"

Here is an example:

https://hub.docker.com/r/rclone/rclone

[+] phpisthebest|3 years ago|reply
Any organization that has the means to pay, should pay for another service that is not openly hostile to users...
[+] yenda|3 years ago|reply
Sounds like you could save yourself some time and budget by offering to pay for those images you are using?
[+] mc4ndr3|3 years ago|reply
That's a fair point, and when someone with a working brain mentions the fallout throughout the Internet that would result, I expect Docker Inc. will reverse course and embark on a PR campaign pretending it was all a mere tawdry joke.
[+] jon-wood|3 years ago|reply
Docker the tool has been a massive benefit to software development, every now and then I have a moan about the hassle of getting something bootstrapped to run on Docker, but it's still worlds better than the old ways of managing dependencies and making sure everyone on a project is aligned on what versions of things are installed.

Unfortunately Docker the company appears to be dying; this is the latest in a long line of decisions that are clearly being made because they can't work out how to build a business around what is, at its core, a nice UI for Linux containers. My hope is that before the inevitable shuttering of Docker Inc another organisation (ideally a co-op of some variety, but that's probably wishful thinking) pops up to take over the bits that matter, and then hopefully we can all stop trying to keep up with the latest way in which our workflows have been broken to try and make a few dollars.

[+] FlyingSnake|3 years ago|reply
> Start publishing images to GitHub

And when GitHub starts similar shenanigans, move out to where? I am old enough to know that we can't trust BigTech and its unpredictable behavior.

Eventually we need to start a Codeberg-like alternative using Prototype Fund money to be self-reliant.

1: https://codeberg.org/ 2: https://prototypefund.de/

[+] davedx|3 years ago|reply
It actually sounds reasonable to me? They have an open source program, the article says its open source definition is "too strict" because it says you must have "no pathway to commercialization".

I mean why should you expect someone to host gigabytes of docker images for you, for free?

[+] JeremyNT|3 years ago|reply
It actually seems pretty reasonable to let BigTech host stuff, so long as you know the rug pull is going to come. Let the VCs light money on fire hosting the stuff we use for free, then once they stop throwing money at it figure out a plan B. Of course you should have a sketch of your plan B ready from the start so you are prepared.

If you view all of this "free" VC subsidized stuff as temporary/ephemeral you can still have a healthy relationship with it.

[+] r3trohack3r|3 years ago|reply
The economics of hosting an image registry are tough. Just mirroring the npm registry can cost $100s per month in storage for tiny little tarballs.

Hosting GB images in an append-only registry, some of which get published weekly or even daily, will burn an incredible amount of money in storage costs. And that’s before talking about ingress and egress.

There will also be a tonne of engineering costs for managing it, especially if you want to explore compression to push down storage costs. A lot of image layers share a lot of files, if you can store the decompressed tarballs in a chunk store with clever chunking you can probably reduce storage costs by an order of magnitude.

But, at the end of the day, expect costs for this to shoot into the 6-7 digit USD range per month in storage and bandwidth as a lower bound for your community hosted image registry.

[+] user3939382|3 years ago|reply
It should use DHT/BitTorrent. Organizations could share magnet links to the official images. OS projects have been doing it for years with ISOs.
[+] JonChesterfield|3 years ago|reply
We need to use a distributed system instead of a centralised one. Probably built on a source control system that can handle that.
[+] delfinom|3 years ago|reply
Yea, people are really spoiled due to more than a decade of VC and general investing cashburn offering tons of services for free. But at the end of the day there are costs and companies will want to recoup their money.

The problem with just replacing GitHub isn't the source code hosting part. There's tons of alternatives both commercial and open source. The problem is the cost of CI infrastructure and CDN/content/release hosting.

Even moderating said CI infrastructure is a nightmare. freedesktop.org, which uses a self-hosted GitLab instance, recently had to shut down CI for everything but official projects because crypto-mining bots attacked hard and fast over the last few days.

[+] maxloh|3 years ago|reply
I don't think we will receive enough donations to cover infrastructure costs, let alone maintainers' salaries.

Even core-js's sole maintainer failed to raise enough in donations to feed his own family, despite the library being used by at least half of the top 1000 Alexa websites. [0]

People (and also big-techs) just won't pay for anything they can get for free.

[0]: https://github.com/zloirock/core-js/blob/master/docs/2023-02...

[+] nindalf|3 years ago|reply
> And when GitHub starts similar shenanigans

The difference between GitHub and Docker is that GitHub is profitable.

[+] syklep|3 years ago|reply
Codeberg is stricter about blocking projects at the moment. Wikiless is blocked by Codeberg for using the Wikipedia puzzle logo but is still up, unchanged, on GitHub.
[+] robotburrito|3 years ago|reply
Maybe we all start hosting this stuff via torrent or something?
[+] quickthrower2|3 years ago|reply
Hosting images can be done via plain HTTP alone. Keep a local copy of your dependencies and back that up. Done.
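A sketch of that plain-HTTP approach, with illustrative image names: `docker save` produces a tarball any static web server or backup system can host, and `docker load` restores it. The script prints the commands rather than running them, so it can be reviewed first:

```shell
#!/bin/sh
# Print docker save commands that archive images as plain tarballs,
# which any static HTTP server or backup system can then host.
save_cmds() {
    for image in "$@"; do
        # derive a filesystem-safe archive name from the image reference
        archive="$(printf '%s' "$image" | tr '/:' '__').tar.gz"
        printf 'docker save %s | gzip > %s\n' "$image" "$archive"
    done
}

save_cmds alpine:3.17 node:18-slim
# restore later with: gunzip -c <archive> | docker load
```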
[+] JonChesterfield|3 years ago|reply
My first thought on this was good riddance. The dev model of "we've lost track of our dependencies so ship Ubuntu and a load of state" never sat well.

However it looks like the main effect is going to be moving more of open source onto GitHub, aka under Microsoft's control, and the level of faith people have in Microsoft not destroying their competitor for profit is surreal.

[+] PedroBatista|3 years ago|reply
The truth is Docker (the company) could never capitalize on the success of its software. They clearly need the money, and I have the impression things have not been "great" in the last couple of years (regardless of the reasons).

The truth is also that most people/organizations never paid a dime for the software or the service, and I'm talking about billion-dollar organizations that paid ridiculous amounts of money for both "DevOps Managers" and consultants, while the actual source of the images they pull is either "some dude" or some open source org.

I get that there will be many "innocent victims" of the circumstances but most people who are crying now are the same ones who previously only took, never gave and are panicking because as Warren Buffett says: "Only when the tide goes out do you discover who's been swimming naked."

And there are a lot of engineering managers and organizations who like to brag with expressions like "Software supply chains" and we'll find out who has been swimming with their willy out.

[+] foxandmouse|3 years ago|reply
I think it's also a product of the larger economic environment. The old model of grow now, profit later seems to be hitting a wall, leaving companies scrambling to find profit streams in their existing customer base, not realizing that doing so will hinder their growth projections, leading to more scrambling for profit.

It's a vicious cycle, but when you don't grow in a sustainable way it seems unavoidable.

[+] tyingq|3 years ago|reply
The only real moat they seem to have here is that "FROM" in a Dockerfile, "image:" in a docker-compose.yml file, and the docker command line all default a bare "somestring" image reference to Docker Hub.
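That defaulting behaviour can be sketched as a shell function. This is a simplified illustration, not the real CLI logic: it ignores registry ports and digests, and the names are made up:

```shell
#!/bin/sh
# Simplified sketch of how a short image name expands to a fully
# qualified reference: default registry docker.io, default namespace
# library/, default tag :latest. (Ignores registry ports and digests.)
qualify() {
    ref="$1"
    case "$ref" in
        *:*) : ;;                 # already has a tag
        *)   ref="$ref:latest" ;;
    esac
    case "$ref" in
        */*) : ;;                 # already has a namespace or registry
        *)   ref="library/$ref" ;;
    esac
    case "$ref" in
        *.*/*|localhost/*) : ;;   # first component looks like a registry host
        *)   ref="docker.io/$ref" ;;
    esac
    printf '%s\n' "$ref"
}

qualify alpine          # docker.io/library/alpine:latest
qualify ubuntu:22.04    # docker.io/library/ubuntu:22.04
```

Which is why fully qualifying references (e.g. `ghcr.io/owner/app:1.0`) in Dockerfiles and compose files is the first step in loosening the Docker Hub dependency.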

They pushed that with the aggressive rate limiting first though, which caused a lot of people to now understand that paragraph above and use proxies, specify a different "hub", etc.

So this move, to me, has less leverage than they might have intended, since the previous move already educated people on how to work around docker hub.

At some point, they force everyone's hand and lose their moat.

[+] jiggywiggy|3 years ago|reply
It was always unbelievable to me how much they hosted for free. I have recklessly pushed over 100 GB of containers over the last few years, all free. It never made sense to me; even Google doesn't do this anymore.
[+] nrvn|3 years ago|reply
After Docker announced rate limiting for the hub, this was an anticipated move. It was just a matter of time.

The only recommendation to everyone: move away or duplicate.

One of the strategies I am yet to test is the synchronization between gitlab and github for protected branches and tags and relying on their container registries. Thus (at least) you provide multiple ways to serve public images for free and with relatively low hassle.

And then for open source projects’ maintainers: provide a one command way to reproducibly build images from scratch to serve them from wherever users want. In production I don’t want to depend on public registries at all and if anything I must be able to build images on my own and expect them to be the same as their publicly built counterparts. Mirroring images is the primary way, reproducing is the fallback option and also helps to verify the integrity.
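One hedged way to get the integrity guarantee described above is to pin base images by digest, so a rebuild pulls byte-identical input even if the tag is re-pushed. `<digest>` below is a placeholder, not a real value:

```dockerfile
# Pin the base image by digest rather than by mutable tag.
# <digest> is a placeholder: obtain the real value with
#   docker buildx imagetools inspect alpine:3.17
FROM alpine@sha256:<digest>
RUN apk add --no-cache ca-certificates
```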

[+] Havoc|3 years ago|reply
To me this smells of VC model issues.

Initially it's great if you can get all the FOSS to play in your technology walled garden. Subsidize it with VC cash.

Downside is it generates a ton of traffic that is hard to monetize. Sooner or later it reaches a point where it can't be subsidized and then you get pay up or get out decisions like this.

One question I haven't seen asked yet: why 420 USD? Is that what it costs to serve the average FOSS project, or is that number a bad Elon-style joke? If they came out with "We've calculated X as actual costs; we're making no margin on this but can't free-lunch it anymore", that would go down a lot better, I think.

[+] ridruejo|3 years ago|reply
Without entering into the specifics of this situation, I don’t understand the hate for Docker the company. They are providing a huge service for the community and looking for ways to make money from it to make it sustainable. I would give them a bit more empathy/benefit of the doubt as they iterate on their approach. Somewhere, somehow, someone has to pay for that storage and bandwidth, whether directly or indirectly (I am old enough to remember what happened with SourceForge, so I'd rather they find a model that works for everyone).
[+] roydivision|3 years ago|reply
Docker should never have become a business. There’s virtually nothing there to make a business around, it’s a suite of useful utilities that should have remained a simple open source project. I switched to podman a while ago and haven’t looked back.
[+] koolba|3 years ago|reply
Can we just get the big three cloud players to make a new public repo? They’ve got oodles of bandwidth and storage, plus the advantage that a lot of access would be local to their private networks.

Set up a non-profit, dedicate resources from each of them spendable as $X of credits, and this problem is solved in a way that works for the real world. Not some federated mess that will never get off the ground.

[+] pimterry|3 years ago|reply
Consensus on a new repo for public community images would help, but it isn't the biggest problem (as the author notes, GHCR does that already, and GitHub seem pretty committed to free hosting for public data, and have the Microsoft money to keep doing so indefinitely if they like).

The issue I worry about is the millions of blog posts, CI builds, docker-compose files, tutorials, and individual user scripts that all reference community images on Docker Hub, a huge percentage of which are about to disappear, apparently all at once, 29 days from now.

From a business perspective particularly, this looks like suicide to me - if you teach everybody "oh this guide uses Docker commands, it must be outdated & broken like all the others" then you're paving a path for everybody to dump the technology entirely. It's the exact opposite of a sensible devrel strategy. And a huge number of their paying customers will be affected too! Most companies invested enough in Docker tech to be paying Docker Inc right now surely use >0 community images in their infrastructure, and they're going to see this breakage. Docker Inc even directly charge for pulling lots of images from Docker Hub right now, and this seems likely to actively stop people doing that (moving them all to GHCR etc) and thereby _reduce_ the offering they're charging for! It's bizarre.

Seems like a bad result for the industry in general, but an even worse result for Docker Inc.

[+] remram|3 years ago|reply
quay.io is a pretty popular general-purpose repo, it replaced docker.io for many projects when they started rate-limiting.
[+] millerm|3 years ago|reply
I suppose BitTorrent for Images should be a thing (again?)

Discussions of decentralization and redundancy always come up in software/system design and development, but we seem to always gravitate to bottlenecks and full dependency on single entities for the tools we "need".

[+] tolmasky|3 years ago|reply
Could IPFS possibly be a good distributed (and free?) storage backing for whatever replaces Docker Hub for open source, as opposed to using something like GitHub? We'd still need a registry for mapping image names to CIDs, along with users/teams/etc., but that simple database should be much cheaper to run than handling the actual storage of images and the bandwidth for downloading them.
[+] foepys|3 years ago|reply
I posted this in the other thread already but will also add it here. https://news.ycombinator.com/item?id=35167136

---

In an ideal world every project would have its own registry. Those centralized registries/package managers that are baked into tools are one of the reasons why hijacking namespaces (and typos of them) is even possible and so bad.

Externalizing hosting costs to other parties is very attractive but if you are truly open source you can tell everybody to build the packages themselves from source and provide a script (or in this case a large Dockerfile) for that. No hosting of binary images necessary for small projects.

Especially since a lot of open source projects are not used by other OSS but by large organizations I don't see the need to burden others with the costs for these businesses. Spinning this into "Docker hates Open Source" is absolutely missing the point.

Linux distributions figured out decades ago that universities are willing to help out with decentralized distribution of their binaries. Why shouldn't this work for other essential OSS as well?

[+] cookiengineer|3 years ago|reply
Does anybody know whether there could be something like an open/libre container registry?

Maybe the cloud native foundation or the linux foundation could provide something like this to prevent vendor lock-ins?

I was coincidentally trying out Harbor again over the last few days, and it seems nice as a managed or self-hosted alternative. [1] After some discussions we'll probably go with that, because we want to prevent another potential lock-in with Sonatype's Nexus.

Does anybody have similar migration plans?

The thing that worries me the most is storage expectations, caching and purging unneeded cache entries.

I have no idea how large/huge a registry can get or what to expect. I imagine alpine images to be much smaller than say, the ubuntu images where the apt caches weren't removed afterwards.

[1] https://goharbor.io

[+] VadimBauer|3 years ago|reply
Many people are quite upset. But on the other hand, how many years could this work? Petabytes of data and traffic.

When we started to offer an alternative to Docker Hub in 2015-2016 with container-registry.com, everyone was laughing at us. Why are you doing that, you are the only one, Docker Hub is free or almost free.

Owning your data and having full control over its distribution is crucial for every project, even open source ones.

[+] scoopr|3 years ago|reply
I'm not too familiar with Docker infra at large, but could Docker Hub in principle act just as a namespace, such that open source projects could have their images hosted elsewhere and Docker Hub just redirects to them, saving the bandwidth?

I suppose there are hand-wavy business reasons not to do that, but somehow I feel that would:

  1. Still keep themselves in the loop and relevant, owning the namespace/hub/main registry
  2. Offset the costs of those that they don't want to deal with (push them to ghcr or whatever)
  3. Preserve some notion of goodwill by not breaking the whole dockerverse
[+] bmitc|3 years ago|reply
It is my understanding that Microsoft has previously tried to purchase Docker. Though I have problems with companies buying each other up, I wouldn't be surprised if Microsoft revisits, or is already revisiting, buying Docker.

Being a heavy Visual Studio Code user, I have centered my personal development around Docker containers using VS Code's Devcontainer feature, which is a very, very nice way of developing. All I need installed is VS Code and Docker, and I can pull down and start developing any project. (I'm not there yet for all my personal projects, but that's where I'm headed.)