top | item 44813397

We shouldn't have needed lockfiles

151 points | tobr | 7 months ago | tonsky.me | reply

287 comments

[+] hyperpape|7 months ago|reply
> But if you want an existence proof: Maven. The Java library ecosystem has been going strong for 20 years, and during that time not once have we needed a lockfile. And we are pulling hundreds of libraries just to log two lines of text, so it is actively used at scale.

Maven, by default, does not check your transitive dependencies for version conflicts. To do that, you need a frustrating plugin that produces much worse error messages than NPM does: https://ourcraft.wordpress.com/2016/08/22/how-to-read-maven-....

How does Maven resolve dependencies when two libraries pull in different versions? It does something insane. https://maven.apache.org/guides/introduction/introduction-to....

Do not pretend, for even half a second, that dependency resolution is not hell in maven (though I do like that packages are namespaced by creators, npm shoulda stolen that).

[+] Karrot_Kream|7 months ago|reply
When I used to lead a Maven project I'd take dependency-upgrade tickets that would just be me bumping up a package version then whack-a-moling overrides and editing callsites to make dependency resolution not pull up conflicting packages until it worked. Probably lost a few days a quarter that way. I even remember the playlists I used to listen to when I was doing that work (:

Lockfiles are great.

[+] adrianmsmith|7 months ago|reply
If you use two dependencies, and one requires Foo 1.2.3 and the other Foo 1.2.4 then 99% of the time including either version of Foo will work fine. (I was a Java developer and used Maven for about 10 years.)

For those times where that's not the case, you can look at the dependency tree to see which is included and why. You can then add a <dependency> override in your pom.xml file specifying the one you want.
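For concreteness, such an override might look roughly like this in your pom.xml (coordinates here are hypothetical; Maven's "nearest wins" rule means a direct dependency always beats a transitive one):

    <dependencies>
      <!-- Force Foo 1.2.4 for the whole tree, regardless of which
           version your dependencies' dependencies ask for. -->
      <dependency>
        <groupId>com.example</groupId>
        <artifactId>foo</artifactId>
        <version>1.2.4</version>
      </dependency>
    </dependencies>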

It's not an "insane" algorithm. It gives you predictability: whatever you write in your pom.xml overrides whatever versions your dependencies request, and you can update your pom.xml whenever you need to.

And because pom.xml is hand-written, there are very few merge conflicts (only as many as you'd normally find in source code), vs. a lock file where huge chunks change each time you change a dependency, and when it comes to a merge conflict you just have to delete the lot, redo it, and hope nothing important has changed.

[+] xg15|7 months ago|reply
The irony is that it even has something like lockfiles as well: The <dependencyManagement> section:

> Dependency management - this allows project authors to directly specify the versions of artifacts to be used when they are encountered in transitive dependencies or in dependencies where no version has been specified.

It's just less convenient because you have to manage it yourself.
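As a sketch (coordinates hypothetical), pinning a transitive dependency without making it a direct one looks like:

    <dependencyManagement>
      <dependencies>
        <!-- Used only when foo shows up transitively or is declared
             without a version; this does not add foo as a dependency. -->
        <dependency>
          <groupId>com.example</groupId>
          <artifactId>foo</artifactId>
          <version>1.2.4</version>
        </dependency>
      </dependencies>
    </dependencyManagement>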

[+] arcbyte|7 months ago|reply
I have YEARS of zero problems with Maven dependencies. And yet I can't leave a Node project alone for more than a month without immediately encountering transitive dependency breakage that takes days to resolve.

Maven is dependency heaven.

[+] kemitchell|7 months ago|reply
npm got around to `@{author}/{package}` and `@{org}/{package}` beyond just global `{package}`, albeit midstream, rather than very early on. The jargon is "scoped packages". I've seen more adoption recently, also with scopes for particular projects, like https://www.npmjs.com/package/@babel/core
[+] smrtinsert|7 months ago|reply
The java ecosystem never went through the level of pain node ecosystem has. For a while it was simply insanity in node. I've worked heavily in both, and the developer experience in java was always way better.
[+] Perepiska|7 months ago|reply
The author of the article asked the same question in his personal Telegram blog a year and a half ago [1], and received exactly the same Maven example in response. Apparently, to build a personal brand and increase visibility, he needs to ask the same thing many times in different places.

1. https://t.me/nikitonsky_pub/597

[+] Bjartr|7 months ago|reply
Seems like there's room then in the Maven ecosystem for a plugin that does what maven-enforcer-plugin does, but which just looks at a lockfile to make its decisions.
[+] potetm|7 months ago|reply
The point isn't, "There are zero problems with maven. It solves all problems perfectly."

The point is, "You don't need lockfiles."

And that much is true.

(Miss you on twitter btw. Come back!)

[+] trjordan|7 months ago|reply
There is absolutely a good reason for version ranges: security updates.

When I, the owner of an application, choose a library (libuseful 2.1.1), I think it's fine that the library author uses other libraries (libinsecure 0.2.0).

But in 3 months, libinsecure is discovered (surprise!) to be insecure. So they release libinsecure 0.2.1, because they're good at semver. The libuseful library authors, meanwhile, are on vacation because it's August.

I would like to update. Turns out libinsecure's vulnerability is kind of a big deal. And with fully hardcoded dependencies, I cannot, without some horrible annoying work like forking/building/repackaging libuseful. I'd much rather libuseful depend on libinsecure 0.2.*, even if libinsecure isn't terribly good at semver.

I would love software to be deterministically built. But as long as we have security bugs, the current state is a reasonable compromise.

[+] seniorsassycat|7 months ago|reply
Yeah, this felt like a gap in the article. You'd have to wait for every package to update from the bottom up before you could update your top-level dependencies to remove a risk (or you could patch in place, or override).

But what if all the packages had automatic ci/cd, and libinsecure 0.2.1 is published, libuseful automatically tests a new version of itself that uses 0.2.1, and if it succeeds it publishes a new version. And consumers of libuseful do the same, and so on.

[+] deredede|7 months ago|reply
What if libinsecure 0.2.1 is the version that introduces the vulnerability, do you still want your application to pick up the update?

I think the better model is that your package manager lets you do exactly what you want -- override libuseful's dependency on libinsecure when building your app.

[+] PhilipRoman|7 months ago|reply
Slightly off topic but we need to normalize the ability to patch external dependencies (especially transitive ones). Coming from systems like Yocto, it was mind boggling to see a company bugging the author of an open source library to release a new version to the package manager with a fix that they desperately needed.

In binary package managers this kind of workflow seems like an afterthought.

[+] skybrian|7 months ago|reply
Go has a deterministic package manager and handles security bugs by letting library authors retract versions [1]. The 'go get' command will print a warning if you try to retrieve a retracted version. Then you can bump the version for that module at top level.

You also have the option of ignoring it if you want to build the old version for some reason, such as testing the broken version.

[1] https://go.dev/ref/mod#go-mod-file-retract
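A retraction lives in the library's own go.mod, roughly like this (module path and versions hypothetical):

    module example.com/libinsecure

    go 1.21

    // Tells 'go get' and 'go list -m -u' to warn anyone who
    // selects these versions; they remain downloadable.
    retract (
        v0.2.0 // security vulnerability, fixed in v0.2.1
    )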

[+] guhcampos|7 months ago|reply
The author hints very briefly that semantic versioning is a hint, not a guarantee, to which I agree - but then I think we should be insisting that library maintainers treat semantic versioning as something that *should* be a guarantee, and in the worst-case scenario, boycott libraries that claim to be semantically versioned but don't follow it in reality.
[+] tonsky|7 months ago|reply
It’s totally fine in Maven, no need to rebuild or repackage anything. You just override the version of libinsecure in your pom.xml and it uses the version you told it to.
[+] zaptheimpaler|7 months ago|reply
Dependency management is a deep problem with a hundred different concerns, and every time someone says "oh, here it's easy, you don't need that complexity", it turns out to apply only to the tiny subset of dependency management they actually thought about.

Maven/Java does absolutely insane things: it will just compile and run programs with incompatible dependency versions and let them crash at some point, picking whatever arbitrary first version of a dependency it sees. Then you start shading JARs and writing regex rules to change import paths in dependencies, and your program crashes with a mysterious error with one Google result, and you spend 8 hours figuring out WTF happened while doing weird surgery on your dependencies' dependencies in an XML file with terrible plugins.

This proposed solution is "let's just never use version ranges and hard-code dependency versions". Now a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?

The article doesn't even understand lockfiles fully. They don't make your build non-reproducible: they give you both reproducible builds (by not updating the lockfile) and an easy way to update dependencies if and when you want to. They were made for the express purpose of making builds reproducible.

I wish there was a mega article explaining all the concerns, tradeoffs and approaches to dependency management - there are a lot of them.

[+] adrianmsmith|7 months ago|reply
1) "it will just compile and run programs with incompatible version dependencies and then they crash at some point"

2) "Now a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?"

You can't solve both of these simultaneously.

If you want a library's dependencies to be updated to versions other than the ones the original library author wanted to use (e.g. because that library is unmaintained), then you're going to get those incompatibilities and crashes.

I think it's reasonable to be able to override dependencies (e.g. if something is unmaintained) but you have to accept there are going to be surprises and be prepared to solve them, which might be a bit painful, but necessary.

[+] yawaramin|7 months ago|reply
> compile and run programs with incompatible version dependencies and then they crash at some point

Just because Java does this doesn't mean every language has to. It's not strongly tied to the dependency management system used. You could have this even with a Java project using lockfiles.

> a package 5 layers deep is unmaintained and is on an ancient dependency version, other stuff needs a newer version. Now what? Manually dig through dependencies and update versions?

Alternatively, just specify the required version in the top-level project's dependency set, as suggested in the article.

[+] spdionis|7 months ago|reply
Funnily enough PHP solved this perfectly with composer, but unfortunately it's not an enterprise-level programming language /s
[+] andy99|7 months ago|reply
In case the author is reading, I can't read your article because of that animation at the bottom. I get it, it's cute, but it makes it too distracting to concentrate on the article, so I ended up just closing it.
[+] fellowniusmonk|7 months ago|reply
I've never seen something so egregious before, it made it impossible to read without covering it with my hand.

But I realized something by attempting to read this article several times first.

If I ever want to write an article and reduce people's ability to critically engage with its argument, I should add a focus-pulling animation that thwarts concerted focus.

It's like the blog equivalent of public speakers who ramble their audience into a coma.

[+] ddejohn|7 months ago|reply
It's downright awful and I'm having a hard time imagining the author proofreading their own page and thinking "yeah, that's great".

As an aside, I have an article in my blog that has GIFs in it, and they're important for the content, but I'm not a frontend developer by any stretch of the imagination, so I'm really at my wit's end over how to make the GIFs play only on mouse hover or something similar. If anybody reading has some tips, I'd love to hear them. I'm using the Zola static site generator, and all I've done is make minor HTML and CSS tweaks, so I really have no idea what I'm doing where it concerns frontend presentation.

[+] nerdjon|7 months ago|reply
Agreed, and the fact that there is not an easy "x" to close it is even worse.

If you want to do something cute and fun, whatever its your site. But if you actually want people to use your site make it easy to dismiss. We already have annoying ads and this is honestly worse than many ads.

Also, from the bio that I can barely see he writes about "UI Design" and... included this?

[+] somehnguy|7 months ago|reply
I read the article but that animation was incredibly distracting. I don't even understand what it's for - clicking it does nothing. My best guess is that it's a representation of how many people are active on the page.
[+] mvieira38|7 months ago|reply
Give in to the noJS movement, there's no animation and it's a beautiful minimalistic site if you disable javascript
[+] _verandaguy|7 months ago|reply
I'll also add that the "night mode" is obnoxious as hell anyway.

Inverted colours would've been _mostly fine._ Not great, but mostly fine, but instead, the author went out of their way to add this flashlight thing that's borderline unusable?

What the hell is this website?

[+] yladiz|7 months ago|reply
As someone who does like tonsky’s stuff sometimes: I immediately closed the article when I saw it. I’m less charitable than you: it’s not cute, it’s just annoying, and it should be able to be switched off. For me it goes into the same box as his “dark mode” setting but it’s worse because it can’t be disabled. Why should I, as the reader, put in effort to overcome something the author found “cute” just to read their words? It’s akin to aligning the words to the right, or vertically: I can read it but it’s so much work that I’d rather just not.
[+] hans_castorp|7 months ago|reply
On sites like that, I typically just switch to "reader view", which leaves only the interesting content.
[+] dom96|7 months ago|reply
The animation? For me it was the blinding yellow background
[+] epage|7 months ago|reply
Let's play this out in a compiled language like Cargo.

Suppose every dependency were an `=` requirement and Cargo allowed multiple versions of SemVer-compatible packages.

The first impact will be that your build will fail. Say you are using `regex` and you are interacting with two libraries that take a `regex::Regex`. All of the versions need to align to pass `Regex` between yourself and your dependencies.

The second impact will be that your builds will be slow. People are already annoyed when there are multiple SemVer incompatible versions of their dependencies in their dependency tree, now it can happen to any of your dependencies and you are working across your dependency tree to get everything aligned.

The third impact is if you, as the application developer, need a security fix in a transitive dependency. You now need to work through the entire bubble up process before it becomes available to you.

Ultimately, lockfiles are about giving the top-level application control over its dependency tree, balanced against build times and cross-package interoperability. Similarly, SemVer is a tool for any library with transitive dependencies [0]

[0] https://matklad.github.io/2024/11/23/semver-is-not-about-you...

[+] jchw|7 months ago|reply
Go MVS ought to be deterministic, but it still benefits from modules having lockfiles as it allows one to guarantee that the resolution of modules is consistent without needing to trust a central authority.

Go's system may be worth emulating in future designs. It's not perfect (it still requires some centralized elements, module identities for versions ≥2 are confusing, etc.) but it presents a way to both avoid depending strongly on specific centralized authorities and avoid making any random VCS server on the Internet a potential SPoF for compiling software. On the other hand, it only really works well for module systems that deal purely with source code and not binary artifacts, and it is least hazardous when fetching and compiling modules is defined to not allow arbitrary code execution. Those constraints together make this system pretty much uniquely suited to Go for now, which is a bit of a shame, because it has some cool knock-on effects.

(Regarding deterministic MVS resolution: imagine a@1 depending on b@1, and b@1 depending on a@2. What if a@2 no longer depends on b? You can construct trickier versions of this, possibly using loops, but the basic idea is that it can be tricky to give a stable resolution to version constraints when the set of constraints that are applied depends on the set of constraints that are applied. There are deterministic ways to resolve this, of course; it's just that a lot of these edge cases are pretty hard to reason about, and I think Go MVS had a lot of bugs early on.)

[+] maxmcd|7 months ago|reply
For what it's worth I think Go's MVS somewhat meets the desire here. It does not require lockfiles, but also doesn't allow use of multiple different minor/patch versions of a library: https://research.swtch.com/vgo-mvs

I believe Zig is also considering adopting it.

If multiple required versions of a dependency share the same major version, the algorithm simply picks the newest among them (but not the newest in the package registry), so you don't need a lockfile to track version decisions.

Go's go.sum contains checksums to validate content, but is not required for version selection decisions.

[+] lalaithion|7 months ago|reply
What if your program depends on library a1.0 and library b1.0, and library a1.0 depends on c2.1 and library b1.0 depends on c2.3? Which one do you install in your executable? Choosing one randomly might break the other library. Installing both _might_ work, unless you need to pass a struct defined in library c from a1.0 to b1.0, in which case a1.0 and b1.0 may expect different memory layouts (even if the public interface for the struct is the exact same between versions).

The reason we have dependency ranges and lockfiles is so that library a1.0 can declare "I need >2.1" and b1.0 can declare "I need >2.3" and when you depend on a1.0 and b1.0, we can do dependency resolution and lock in c2.3 as the dependency for the binary.

[+] RangerScience|7 months ago|reply
No, no, a thousand times no.

The package file (whatever your system) is communication to other humans about what you know about the versions you need.

The lockfile is the communication to other computers about the versions you are using.

What you shouldn't have needed is fully defined versions in your package files (but you do need them, in case some package or another doesn't do a good enough job following semver)

So, this:

  package1: latest

  # We're stuck on an old version b/c of X, Y, Z
  package2: ~1.2

(Related: npm/yarn should use a JSON variant (or YAML, regular or simplified) that allows for comments for precisely this reason)
[+] simonw|7 months ago|reply
I see lockfiles as something you use for applications you are deploying - if you run something like a web app it's very useful to know exactly what is being deployed to production, make sure it exactly matches staging and development environments, make sure you can audit new upgrades to your dependencies etc.

This article appears to be talking about lockfiles for libraries - and I agree, for libraries you shouldn't be locking exact versions because it will inevitably play havoc with other dependencies.

Or maybe I'm missing something about the JavaScript ecosystem here? I mainly understand Python.

[+] boscillator|7 months ago|reply
Ok, but what happens when lib-a depends on lib-x:0.1.4 and lib-b depends on lib-x:0.1.5, even though it could have worked with any lib-x:0.1.*? Are these libraries just incompatible now? Lockfiles don't guarantee that new versions are compatible, but they do guarantee that if your code works in development, it will work in production (at least in terms of dependencies).

I assume Java gets around this by bundling libraries into the deployed .jar file. That may be better than a lock file, but it doesn't make sense for scripting languages that don't have a build stage. (You won't have trouble convincing me that every language should have a proper build stage, but you might have trouble convincing the millions of lines of code already written in languages that don't.)

[+] nemothekid|7 months ago|reply
Most of the issues in this thread and the article, are, IMO, problems with Node, not with lockfiles.

>How could they know that liblupa 0.7.9, whenever it will be released, will continue to work with libpupa? Surely they can’t see the future? Semantic versioning is a hint, but it has never been a guarantee.

Yes, this is a social contract. Not everything in the universe can be locked into code, and with Semantic versioning, we hope that our fellow humans won't unnecessarily break packages in non-major releases. It happens, and people usually apologize and fix, but it's rare.

This has worked successfully if you look at RubyGems which is 6 years older than npm (although Gemfile.lock was introduced in 2010, npm didn't introduce it until 2017).

RubyGems doesn't have the same reputation for dysfunction as Node does. Neither do Rust, Go, PHP, or Haskell, nor others I don't use on a daily basis. Node is the only ecosystem where I will come back and find a Docker container that straight up won't build, or a package that requires the entire dependency tree to update because one package pushed a minor-version change that ended up requiring a minor-version bump of Node, and that new version of Node isn't compatible with some hack another package did in its C extension.

In fact, I expect some Node developer to read this article and deploy yet another tool that will break _everything_ in the build process. In other languages I don't even think I've ever really thought about dependency resolution in years.

[+] wedn3sday|7 months ago|reply
I absolutely abhor the design of this site. I cannot engage with the content as I'm filled with a deep burning hatred of the delivery. Anyone making a personal site: do not do this.
[+] ratelimitsteve|7 months ago|reply
anyone find a way to get rid of the constantly shifting icons at the bottom of the screen? I'm trying to read and the motion keeps pulling my attention away from the words toward the dancing critters.
[+] spooky_deep|7 months ago|reply
> The important point of this algorithm is that it’s fully deterministic.

The algorithm can be deterministic, but fetching the dependencies of a package is not.

It is usually an HTTP call to some endpoint that might flake out or change its mind.

Lock files were invented to make it either deterministic or fail.

Even with Maven, deterministic builds (such as with Bazel) lock the hashes down.

This article is mistaken.

[+] egh|7 months ago|reply
we've all learned about things, not understood them, and thought "wow, these people must be idiots. why would they have made this complicated thing? makes no sense whatsoever. I can't believe these people, idiots, never thought this through like I have."

Most of us, fortunately, don't post these thoughts to the internet for anybody to read.

[+] tonsky|7 months ago|reply
I worked for 20 years in an ecosystem that didn’t have lockfiles and had reproducible builds before the term was invented, and now you come and tell me that it couldn’t be?
[+] zahlman|7 months ago|reply
> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

While I share the view that TFA is misguided in some ways, this isn't a productive or insightful way to make the point.

[+] andix|7 months ago|reply
Lockfiles are essential for somewhat reproducible builds.

If a transitive dependency (not directly referenced) updates, this might introduce different behavior. If you test a piece of software and fix some bugs, the next build shouldn't contain completely different versions of dependencies. That might introduce new bugs.

[+] freetonik|7 months ago|reply
In the world of Python-based end-user libraries the pinned (non-ranged) versions result in users being unable to use your library in an environment with other libraries. I’d love to lock my library to numpy 2.3.4, but if the developers of another library pin theirs to 2.3.5 then game over.

For server-side or other completely controlled environments, the only good reason to have lock files is if they actually contain hashes and thus allow you to confirm security audits. Lock files without hashes do not guarantee security (depending on the package registry, of course, but at least in the Python world (damn it) the maintainer can re-publish a package with an existing version number but different content).

[+] _verandaguy|7 months ago|reply

> But... why would libpupa’s author write a version range that includes versions that don’t exist yet? How could they know that liblupa 0.7.9, whenever it will be released, will continue to work with libpupa? Surely they can’t see the future? Semantic versioning is a hint, but it has never been a guarantee.

> For that, kids, I have no good answer.

Because semantic versioning is good enough for me, as a package author, to say with a good degree of confidence, "if security or stability patches land within the patch (or sometimes, even minor) fields of a semver version number, I'd like to have those rolled out with all new installs, and I'm willing to shoulder the risk."

You actually kind-of answer your own question with this bit. Semver not being a guarantee of anything is true, but I'd extend this (and hopefully it's not a stretch): package authors will republish packages with the same version number, but different package contents or dependency specs. Especially newer authors, or authors new to a language or packaging system, or with packages that are very early in their lifecycle.

There are also cases where packages get yanked! While this isn't a universally-available behaviour, many packaging systems acknowledge that software will ship with unintentional vulnerabilities or serious stability/correctness issues, and give authors the ability to say, "I absolutely have to make sure that nobody can install this specific version again because it could cause problems." In those cases, having flexible subdependency version constraints helps.

It might be helpful to think by analogy here. If a structure is _completely rigid,_ it does have some desirable properties, not the least of which being that you don't have to account for the cascading effects of beams compressing and extending, elements of the structure coming under changing loads, and you can forget about accounting for thermal expansion or contraction and other external factors. Which is great, in a vacuum, but structures exist in environments, and they're subject to wear from usage, heat, cold, rain, and (especially for taller structures), high winds. Incorporating a planned amount of mechanical compliance ends up being the easier way to deal with this, and forces the engineers behind it to account for failure modes that'll arise over its lifetime.

[+] xp84|7 months ago|reply
This is weird to me. (Note: i'll use ruby terms like 'gem' and 'bundle' but the same basic deal applies everywhere)

Generally our practice is to pin everything to major versions, in ruby-speak this means like `gem 'net-sftp', '~> 4.0'` which allows 4.0.0 up to 4.9999.9999 but not 5. Exceptions for non-semver such as `pg` and `rails` which we just pin to exact versions and monitor manually. This little file contains our intentions of which gems to update automatically and for any exceptions, why not.
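A sketch of what that looks like in a Gemfile (versions illustrative, not real recommendations):

    # Semver-respecting gems: pin to the major version only.
    gem 'net-sftp', '~> 4.0'   # allows 4.0.0 up to, but not including, 5.0

    # Non-semver gems: exact pins, reviewed and bumped manually.
    gem 'pg', '1.5.6'
    gem 'rails', '7.1.3'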

Then we encourage aggressive performances of `bundle update` which pulls in tons of little security patches and minor bugfixes frequently, but intentionally.

Without the lockfile though, you would not be able to do our approach. Every bundle install would be a bundle update, so any random build might upgrade a gem without anyone even meaning to or realizing it, so, your builds are no longer reproducible.

So we'd fix reproducibility by reverting to pinning everything to X.Y.Z, specifically to make the build deterministic, and then count on someone to go in and update every gem's approved version numbers manually on a weekly or monthly basis. (yeah right, definitely will happen).