
Can a Rust binary use incompatible versions of the same library?

84 points| braxxox | 1 year ago |github.com

85 comments


richardwhiuk|1 year ago

You can do this in one crate:

    [dependencies]
    foo_v1 = { package = "foo", version = "1" }
    foo_v2 = { package = "foo", version = "2" }

WD-42|1 year ago

This isn’t a magic bullet. Using multiple versions of the same crate can still blow up your project.

For example, the compiler error in this example:

note: perhaps two different versions of crate `smithay_client_toolkit` are being used?

https://github.com/pop-os/launcher/issues/237

gary_0|1 year ago

Ah, I was wondering what would happen if you're using a type from lib-v2 and an intermediary library passes you that type from lib-v1, and the type has changed internally. Good to know the Rust compiler is set up to catch that.

(I've seen cases where that happens with C and C++ software, and things seem to compile and run... until everything explodes. Fun times.)

woodruffw|1 year ago

I thought this was about loading two incompatible versions of a shared object into the same address space at first :-)

The author correctly contrasts Rust's (and NPM's) behavior with that of Python/pip, where only one version per package name is allowed. The Python packaging ecosystem could in theory standardize a form of package name mangling wherein multiple versions could be imported simultaneously (akin to what's currently possible with multiple vendored versions), but that would likely be a significant undertaking given that a lot of applications probably - accidentally - break the indirect relationship and directly import their transitive dependencies.

(The more I work in Python, the more I think that Python's approach is actually a good one: preventing multiple versions of the same package prevents dependency graph spaghetti when every subdependency depends on a slightly different version, and provides a strong incentive to keep public API surfaces small and flexible. But I don't think that was the intention, more of an accidental perk of an otherwise informal approach to packaging.)

josephg|1 year ago

> (The more I work in Python, the more I think that Python's approach is actually a good one ...)

I've come to the opposite conclusion. I've "git cloned" several programs in both python and ruby (which has the same behaviour) only to discover that I can't actually install the project's dependencies. The larger your gemfile / requirements.txt is, the more likely this is to happen. All it takes is a couple packages in your tree to update their own dependencies out of sync with one another and you can run into this problem. A build that worked yesterday doesn't work today. Not because anyone made a mistake - but just because you got unlucky. Ugh.

It's a completely unnecessary landmine. Worse yet, new developers (or new team members) are very likely to run into this problem as it shows up when you're getting your dev environment set up.

This problem is entirely unnecessary. In (almost) every way, software should treat foo-1.x.x as a totally distinct package from foo-2.x.x. They're mutually incompatible anyway, and semantically the only thing they share is their name. There's no reason both packages can't be loaded into the package namespace at the same time. No reason but the mistakes of shortsighted package management systems.

RAM is cheap. My attention is expensive. Print a warning if you must, and I'll fix it when I feel like it.

rtpg|1 year ago

Another thing I appreciate about this in the Python world is it avoids an issue I've seen in node a lot, which is people being too clever by half and pre-emptively adding major version bounds to their library. So foo depends on "bar<9", despite bar 9, 10, 11, 12, 13, and 14 all working with foo's usage of bar.

The end result of this is that you end up with some random library in your stack (4 transitive layers deep because of course it is) holding back stuff like chokidar in a huge chunk of your dep tree for... no real good reason. So you now have several copies of a huge library.

Of course new major versions might break your usage! Minor versions might as well! Patch versions too sometimes! Upper bounds pre-emptively set help mainly in one thing, and that's reducing the number of people who would help "beta-test" new major versions because they don't care enough to pin their own dependencies.

stouset|1 year ago

> dependency graph spaghetti

The worst spaghetti comes from hard dependencies on minor versions and revisions.

I will die on the hill that you should only ever specify dependencies on “at least this major-minor (and optionally and rarely revision for a bugfix)” in whatever the syntax is for your preferred language. Excepting of course a known incompatibility with a specific version or range of versions, and/or developers who refuse to get on the semver bandwagon who should collectively be rounded up and yelled at.

In Rust, Cargo makes this super easy: “x.y.z” means “>= x.y.z, < (x+1).0.0”.
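
As a rough illustration of that rule, here is a minimal Python sketch of how a bare Cargo requirement maps to a version range. The function names `caret_range` and `matches` are made up for this example, and pre-release tags are ignored. It also covers the case the rule above leaves out: for `0.y.z` versions Cargo treats the leftmost non-zero component as the breaking one.

```python
def caret_range(requirement: str):
    """Return the (inclusive lower, exclusive upper) bounds of a bare Cargo requirement."""
    major, minor, patch = (int(part) for part in requirement.split("."))
    lower = (major, minor, patch)
    if major > 0:
        upper = (major + 1, 0, 0)   # "1.2.3" allows anything below 2.0.0
    elif minor > 0:
        upper = (0, minor + 1, 0)   # "0.3.9" allows anything below 0.4.0
    else:
        upper = (0, 0, patch + 1)   # "0.0.3" allows only 0.0.3 itself
    return lower, upper


def matches(requirement: str, candidate: str) -> bool:
    """Check whether a concrete version satisfies the requirement."""
    lower, upper = caret_range(requirement)
    version = tuple(int(part) for part in candidate.split("."))
    return lower <= version < upper


print(matches("1.2.3", "1.9.0"))  # True
print(matches("1.2.3", "2.0.0"))  # False
print(matches("0.3.9", "0.4.0"))  # False
```

This is why `log` 0.3.x and 0.4.x count as incompatible and can end up side by side in one build.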

It’s fine to ship a generated lock file that locks everything to a fixed, known-good version of all your dependencies. But you should be able to trivially run an update that will bring everything to the latest minor and revision (and alert on newer major versions).

gorgoiler|1 year ago

For fun, you could add this to Python and I think it would cover a lot of edge cases?

You would need:

A function v_tree_install(spec) which installs a versioned PyPI package like “foo==3.2” and all its dependencies in its own tree, rather than in site-packages.

Another pair of functions v_import and v_from_import to wrap importlib with a name, version, and symbols. These functions know how to find the versioned package in its special tree and push that tree to sys.path before starting the import.

To cover the case where the imported code has dynamic imports, you could also wrap any callable code (functions, classes) with a wrapper that also does the sys.path push/pop before/after each call.

You then replace third party imports in your code with calls assigning to symbols in your module:

    # import foo
    foo = v_import("foo==3.2")

    # from foo import bar, baz as q
    bar, q = v_from_import(
        "foo>=3.3",
        "bar",
        "baz",
    )

Finally, provide a function (or CLI tool) to statically scan your code looking for v_import and calling v_tree_install ahead of time. Or just let v_import do it.

Edit: …and you’d need to edit the sys.modules cache too, or purge it after each “clever” import?
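
A minimal sketch of the import half of this idea, under some loud assumptions: `v_import` here takes a module name and an already-populated directory instead of a version spec, `make_tree` stands in for `v_tree_install` by writing a fake one-file package called `fakepkg`, and nothing is resolved or downloaded. It only shows the sys.path push/pop and the sys.modules purge.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_tree(root: Path, version: str) -> Path:
    """Stand-in for v_tree_install: write a fake one-file package into its own tree."""
    tree = root / f"fakepkg-{version}"
    tree.mkdir()
    (tree / "fakepkg.py").write_text(f"__version__ = {version!r}\n")
    return tree


def v_import(name: str, tree: Path):
    """Import `name` from one versioned tree without leaving it in the shared cache."""
    saved = sys.modules.pop(name, None)   # hide any copy that is already imported
    sys.path.insert(0, str(tree))         # push the versioned tree
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(name)
    finally:
        sys.path.remove(str(tree))        # pop the tree again
        sys.modules.pop(name, None)       # purge so the next version can load
        if saved is not None:
            sys.modules[name] = saved
    return module


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old = v_import("fakepkg", make_tree(root, "1.0"))
    new = v_import("fakepkg", make_tree(root, "2.0"))
    versions = (old.__version__, new.__version__)

print(versions)  # ('1.0', '2.0')
```

The sketch does not handle packages that import their own submodules lazily after `v_import` returns, which is the dynamic-import case mentioned above.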

orf|1 year ago

I can’t see how it would ever be possible with Python to do this.

You depend on two packages, each with a function that returns a “requests.Request” object. These packages depend on different versions of “requests”.

How would you implement “isinstance(return_value, requests.Request)” on each of these calls?

Or, the indirect case of this: catching a “requests.HTTPError” from each of these calls?

Importing the right thing isn’t hard, but doing things with it is the hard bit.
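
The identity problem can be reproduced without installing anything. In this sketch `fakelib` is a made-up stand-in for two versions of a library, and `package_a_fetch` / `package_b_fetch` are hypothetical functions standing in for the two dependent packages.

```python
import types


def make_module(name: str, source: str) -> types.ModuleType:
    """Build an independent module object from source text."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


# Two "versions" of the same library, loaded side by side.
fakelib_v1 = make_module("fakelib", "class Request:\n    version = '1.0'\n")
fakelib_v2 = make_module("fakelib", "class Request:\n    version = '2.0'\n")


def package_a_fetch():
    return fakelib_v1.Request()   # package A was built against v1


def package_b_fetch():
    return fakelib_v2.Request()   # package B was built against v2


Request = fakelib_v2.Request      # the class the application itself imported

a_matches = isinstance(package_a_fetch(), Request)
b_matches = isinstance(package_b_fetch(), Request)
same_name = fakelib_v1.Request.__qualname__ == fakelib_v2.Request.__qualname__

print(a_matches, b_matches, same_name)  # False True True
```

Both classes print with the same module and name, so the failed check gives no hint about which version an object came from. An `except` clause matches on class identity in the same way.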

dathinab|1 year ago

I have thought about this a bunch (and have been annoyed by it a bunch).

But the main issue here is that Python is somewhat designed around a "scripts and folders of scripts from a package" principle, while such a loading system would fundamentally need to always work in terms of packages. E.g. you wouldn't execute `main.py` but `package:main`. (Though this is already the direction a bunch of tooling has moved in, e.g. poetry scripts, some of the WSGI and especially more modern ASGI implementations, etc.)

Another issue is that Rust can reliably detect collisions between two different versions of the same type and force you to fix them.

With a lot of strict type annotations in Python and tooling like mypy this might be possible (with many limitations), but as of today it will in practice likely not be caught. Sometimes that is what you want (duck typing happens to work). But for any of the reflection/inspection-heavy Python libraries this is a recipe for quite obscure errors somewhere in not-so-obvious inspection/auto-generation/metaclass-related magic code.

So Python can't, except it can: technically it's possible. You can put a version into __qualname__, and mess with the import system enough to allow imports to be contextual based on the manifest of the module they come from. (Though you probably would not be fully standard-conforming Python, but we are talking about dynamically patching Python's import system; there is nothing standard about it.)

btilly|1 year ago

This is great for avoiding conflicts when you try to get your project running.

It sucks when there is a vulnerability in a particular library, and you're trying to track all of the ways in which that vulnerable code is being pulled into your project.

My preference is to force the conflict up front by saying that you can't import conflicting versions. This creates a constant stream of small problems, but avoids really big ones later. However I absolutely understand why a lot of people prefer it the other way around.

MrJohz|1 year ago

    cargo tree -i log@0.3.9

will show which dependencies require this particular version of log, and how they are transitively related to the main package. In this case, you would clearly see that the out-of-date dependency comes from package "b".

There are equivalents for most other package managers that take this approach, and I've never found this a problem in practice.

Of course, you still need to know that there's a vulnerability there in the first place, but that's why tools like NPM often integrate with vulnerability scanners so that they can check your dependencies as you install them.
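
The kind of reverse lookup that `cargo tree -i` performs can be sketched in a few lines of Python. The dependency graph below is hypothetical (the names `app`, `a`, `b` and the newer `log` version are made up for the example), and the function is an illustration of the idea, not Cargo's implementation. It assumes the graph has no cycles.

```python
def chains_to(graph: dict, root: str, target: str) -> list:
    """Return every dependency chain that leads from root to target."""
    chains = []

    def walk(node: str, trail: list) -> None:
        trail = trail + [node]
        if node == target:
            chains.append(trail)
            return
        for dependency in graph.get(node, []):
            walk(dependency, trail)

    walk(root, [])
    return chains


# Hypothetical resolved graph: each key lists its direct dependencies.
graph = {
    "app": ["a", "b"],
    "a": ["log@0.4.22"],
    "b": ["log@0.3.9"],
}

paths = chains_to(graph, "app", "log@0.3.9")
print(paths)  # [['app', 'b', 'log@0.3.9']]
```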

pkolaczk|1 year ago

That’s nowhere near as terrible as not being able to resolve a conflict between incompatible versions. Like half of your project can’t use Guava X but another half can’t use Guava Y, and there is no common version that works. We ran into compatibility problems with our big Java project many times and wasted months on attempting things like jar shading or classloaders. At the end of the day we use shading but that comes with its own set of annoyances like increasing the build times and allowing people to occasionally import the wrong version of a library (e.g. shaded instead of non-shaded). The bigger the project the more likely you’re going to hit this, and the lack of support for feature-gating dependencies in the Java ecosystem doesn’t help.

oefrha|1 year ago

Go got this right: you want an incompatible version, you have to use a different import path. Then you can only pick one version (which is deterministically the lowest possible version) for a certain import path, not a hundred different versions.

Also forces people to actually take backwards compatibility seriously.
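
A much simplified sketch of that selection rule, assuming a flat list of requirements rather than a real module graph: each import path gets exactly one version, namely the highest of the minimums that anything asked for, and a `/v2` path is simply a different key. The module paths and `select_versions` are made up for the example.

```python
def select_versions(requirements: list) -> dict:
    """Pick one version per import path: the highest minimum that was requested."""
    selected = {}
    for import_path, minimum in requirements:
        current = selected.get(import_path)
        if current is None or minimum > current:
            selected[import_path] = minimum
    return selected


# Hypothetical requirements gathered from every module in the build.
requirements = [
    ("example.com/foo", (1, 2, 0)),
    ("example.com/foo", (1, 4, 1)),
    ("example.com/foo/v2", (2, 0, 3)),   # incompatible major version, different path
]

selected = select_versions(requirements)
print(selected)
```

Nothing newer than a stated minimum is ever chosen, which is what makes the result deterministic without a lock file.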

alkonaut|1 year ago

How does this work? Assume that the log crate in its internal state has a lock it uses for synchronizing writing to some log endpoint. If I have two versions of log in my process then they must have two copies of their internal state. So they both point to the same log endpoint, but they have one mutex each? That means it "works" at compile time but fails at runtime? That's the worst kind of "works!"

Or if I depend transitively on two versions of a library (e.g. a matrix math lib) through A and B and try to read a value from A and send it into B. Then presumably due to type namespacing that will fail at compile time?

So the options when using incompatible dependencies are a) it compiles, but fails at runtime, b) it doesn't compile, or c) it compiles and works at runtime?

yorwba|1 year ago

If the log endpoint is external to your process and two different copies of the logging crate in the same process writing to it cause problems, two identical copies of the logging crate in different processes will likely also cause problems. The solution here is global synchronization, not just within one process.

If the log endpoint is internal to your process, how did you end up with two independent mutexes guarding (or not guarding) access to the same resource? It should be wrapped in a shared mutex as soon as you create it, and before passing it to the different versions of the logging crate. And unless you use unsafe, Rust's ownership model forces you to do that, because it forbids having two overlapping mutable references at the same time.

dboreham|1 year ago

Every language will re-create its own version (sic) of DLL-hell.

pjmlp|1 year ago

And this is why one gets to watch some crates being compiled from scratch multiple times in a single "make world" build.

anonymoushn|1 year ago

You can do this, but you can't use two semver-compatible versions of the same library in *different binaries* in the same workspace.

hsfzxjy|1 year ago

So both versions of the log crate manage their own internal states within the same process? Would this lead to surprising results?

dwattttt|1 year ago

Their internal states in Rust are also namespaced, so two incompatible crates in the same process won't observe each other's symbols. If they access external resources that are not namespaced though, that could be a problem.
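
A rough Python analogy for "each copy has its own state" (this is not how rustc or Cargo work internally): the same made-up logger source is loaded twice under different module names, and each copy ends up with its own lock and its own buffer. The names `fakelog_v03` and `fakelog_v04` are hypothetical.

```python
import types

LOGGER_SOURCE = """
import threading

_lock = threading.Lock()   # private to whichever copy executes this source
_lines = []

def log(message):
    with _lock:
        _lines.append(message)

def lines():
    return list(_lines)
"""


def load_copy(name: str) -> types.ModuleType:
    """Execute the logger source in a fresh module so it gets fresh globals."""
    module = types.ModuleType(name)
    exec(LOGGER_SOURCE, module.__dict__)
    return module


log_v03 = load_copy("fakelog_v03")
log_v04 = load_copy("fakelog_v04")

log_v03.log("from dependency b")
log_v04.log("from dependency a")

print(log_v03.lines())  # ['from dependency b']
print(log_v04.lines())  # ['from dependency a']
```

Neither copy sees the other's buffer or lock, so any coordination over a shared external resource such as a file has to happen outside both copies.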

jcelerier|1 year ago

How does that work if you want to export a symbol for dlopen?

dureuill|1 year ago

I'm not sure I understand the use case here. Are you asking if you can depend on two versions of the same crate, for a crate that exports a `#[no_mangle]` or `#[export_name]` function?

I guess you could slap a `#[used]` attribute on your exported functions, and use their mangled name to call them with dlopen, but that would be unwieldy, and guessing the disambiguator used by the compiler would be error-prone to impossible.

Other than that, you cannot. What you can do is define the `#[no_mangle]` or `#[export_name]` function at the top-level of your shared library. It makes sense to have a single crate bear the responsibility of exporting the interface of your shared library.

I wish Rust would enforce that, but the shared library story in Rust is subpar. Fortunately it never actually comes into play, as the ecosystem relies on static linking.

dboreham|1 year ago

Nobody knows about dynamic linking now. And most languages don't support it (looking at you: golang).

gmueckl|1 year ago

I cannot shake the feeling that this is actually a misfeature that will get people into trouble in new and puzzling ways. The isolated classloaders in Java and the assembly domains in .NET didn't turn out to be very bright ideas and from a software design perspective this is virtually identical.

pornel|1 year ago

It's been working like that for a decade, and it's been fine.

Rust/Cargo have been designed for it from the start.