top | item 43097006

barosl | 1 year ago

A very well written article! I admire the analysis done by the author regarding the difficulties of Python packaging.

With the advent of uv, I'm finally feeling like Python packaging is solved. As mentioned in the article, being able to have inline dependencies in a single-file Python script and running it naturally is just beautiful.

  #!/usr/bin/env -S uv run
  # /// script
  # dependencies = ['requests', 'beautifulsoup4']
  # ///
  import requests
  from bs4 import BeautifulSoup

After getting used to this workflow, I have been thinking that a dedicated syntax for inline dependencies would be great, similar to JavaScript's `import ObjectName from 'module-name';` syntax. Python promoted type hints from comment-based to syntax-based, so a similar approach seems feasible.

> It used to be that either you avoided dependencies in small Python script, or you had some cumbersome workaround to make them work for you. Personally, I used to manage a gigantic venv just for my local scripts, which I had to kill and clean every year.

I had the same fear for adding dependencies, and did exactly the same thing.

> This is the kind of thing that changes completely how you work. I used to have one big test venv that I destroyed regularly. I used to avoid testing some stuff because it would be too cumbersome. I used to avoid some tooling or pay the price for using them because they were so big or not useful enough to justify the setup. And so on, and so on.

I 100% sympathize with this.

epistasis|1 year ago

One other key part of this is freezing a timestamp with your dependency list, because Python packages are absolutely terrible at maintaining compatibility a year or three or five later as PyPI populates with newer and newer versions. The special toml incantation is [tool.uv] exclude-newer:

  # /// script
  # dependencies = [
  #   "requests",
  # ]
  # [tool.uv]
  # exclude-newer = "2023-10-16T00:00:00Z"
  # ///
https://docs.astral.sh/uv/guides/scripts/#improving-reproduc...

This has also let me easily reconstruct some older environments in less than a minute, when I've been version hunting for 30-60 minutes in the past. The speed of uv environment building helps a ton too.

woodruffw|1 year ago

Maybe I'm missing something, but why wouldn't you just pin to an exact version of `requests` (or whatever) instead? I think that would be equivalent in practice to limiting resolutions by release date, except that it would express your intent directly ("resolve these known working things") rather than indirectly ("resolve things from when I know they worked").
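
For example, the inline metadata accepts the usual PEP 508 version specifiers, so exact pinning might look like this (the version numbers here are hypothetical known-good pins):

```python
# /// script
# dependencies = [
#   "requests==2.31.0",        # exact pins instead of an exclude-newer cutoff
#   "beautifulsoup4==4.12.3",
# ]
# ///
```

One trade-off: exact pins only constrain your direct dependencies, while an `exclude-newer` date also freezes the transitive resolutions.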

procaryote|1 year ago

Oh that's neat!

I've just gotten into the habit of using only the dependencies I really must, because python culture around compatibility is so awful

CJefferson|1 year ago

This is the feature I would most like added to Rust; if you don't save a lock file, it is horrible trying to get back to the same versions of packages.

athrun|1 year ago

Gosh, thanks for sharing! This is the remaining piece I felt I was missing.

isoprophlex|1 year ago

Wow, this is such an insanely useful tip. Thanks!

zelphirkalt|1 year ago

Why didn't you create a lock file with the versions and of course hashsums in it? No version hunting needed.

leni536|1 year ago

That's fantastic, that's exactly what I need to revive a bit-rotten python project I am working with.

sunshowers|1 year ago

Oooh! Do you end up doing a binary search by hand and/or does uv provide tools for that?

aragilar|1 year ago

My feeling, sadly, is that because uv is the new thing, it hasn't had to handle anything but the common cases. This kinda gets a mention in the article, but is very much glossed over. There are still some sharp edges, and assumptions which aren't true in general (but are for the easy cases), and this is only going to make things worse, because now there's a new set of issues people run into.

EdwardDiego|1 year ago

As an example of an edge case - you have Python dependencies that wrap C libs that come in x86-64 flavour and arm-64.

Pipenv, when you create a lockfile, will only specify the architecture specific lib that your machine runs on.

So if you're developing on an ARM Macbook, but deploying on an Ubuntu x86-64 box, the Pipenv lockfile will break.

Whereas a Poetry lockfile will work fine.

And I've not found any documentation about how uv handles this, is it the Pipenv way or the Poetry way?

0xCMP|1 year ago

I think this is an awesome feature and will probably be a great alternative to my use of nix for similar things in scripts/Python, if nothing else because it's way less overhead to get it running and start playing with something.

Nix, for all its benefits here, can be quite slow and otherwise pretty annoying to use as a shebang, in my experience, versus just writing a package/derivation to add to your shell environment (i.e. it's already fully "built" and wrapped, but also requires a lot more ceremony + "switching" either the OS or HM configs).

EdwardDiego|1 year ago

It's not a feature that's exclusive to uv. It's a PEP, and other tools will eventually support it if they don't already.

zelphirkalt|1 year ago

Will nix be slow after the first run? I guess it will have to build the deps, but a second run should be fast, no?

throwup238|1 year ago

Agreed. I did the exact same thing with that giant script venv and it was a constant source of pain because some scripts would require conflicting dependencies. Now with uv shebang and metadata, it’s trivial.

Before uv I avoided writing any scripts that depended on ML altogether, which is now unlocked.

8n4vidtmkvmk|1 year ago

You know what we need? In both Python and JS, and every other scripting language, we should be able to import packages from a URL, but with a sha384 integrity check like the one that exists in HTML. Not sure why they didn't adopt this into JS or Deno. Otherwise, installing random scripts is a security risk.

woodruffw|1 year ago

Python has fully-hashed requirements[1], which is what you'd use to assert the integrity of your dependencies. These work with both `pip` and `uv`. You can't use them to directly import the package, but that's more because "packages" aren't really part of Python's import machinery at all.

(Note that hashes themselves don't make "random scripts" not a security risk, since asserting the hash of malware doesn't make it not-malware. You still need to establish a trust relationship with the hash itself, which decomposes to the basic problem of trust and identity distribution.)

[1]: https://pip.pypa.io/en/stable/topics/secure-installs/
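
For reference, the hashed-requirements format pins each artifact's digest (digest values elided here as placeholders), and `pip install --require-hashes -r requirements.txt` refuses to install anything whose hash doesn't match:

```
requests==2.31.0 \
    --hash=sha256:... \
    --hash=sha256:...
```

Multiple `--hash` entries per requirement are allowed because each wheel (per platform) and the sdist have different digests.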

zahlman|1 year ago

The code that you obtain for a Python "package" does not have any inherent mapping to a "package" that you import in the code. The name overload is recognized as unfortunate; the documentation writing community has been promoting the terms "distribution package" and "import package" as a result.

https://packaging.python.org/en/latest/discussions/distribut...

https://zahlman.github.io/posts/2024/12/24/python-packaging-...

While you could of course put an actual Python code file at a URL, that wouldn't solve the problem for anything involving compiled extensions in C, Fortran etc. You can't feasibly support NumPy this way, for example.

That said, there are sufficient hooks in Python's `import` machinery that you can make `import foo` programmatically compute a URL (assuming that the name `foo` is enough information to determine the URL), download the code, and create and import the necessary `module` object; and you can add this with appropriate priority to the standard set of strategies Python uses for importing modules. A full description of this process is out of scope for an HN comment, but relevant documentation:

https://docs.python.org/3/library/importlib.html

https://docs.python.org/3/library/sys.html#sys.meta_path
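
A minimal sketch of such a meta path finder (the `URLFinder` name and its registry are hypothetical, and note that it performs no integrity check on what it downloads):

```python
import importlib.abc
import importlib.util
import sys
import urllib.request

class URLFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Map bare module names to URLs and import the fetched source."""

    def __init__(self, registry):
        self.registry = registry  # {module_name: url}

    def find_spec(self, name, path=None, target=None):
        if name in self.registry:
            return importlib.util.spec_from_loader(name, self)
        return None  # fall through to the normal import machinery

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        url = self.registry[module.__name__]
        source = urllib.request.urlopen(url).read()  # no hash verification!
        exec(compile(source, url, "exec"), module.__dict__)

# Demo with a data: URL so the example is self-contained (no network needed):
import base64
payload = base64.b64encode(b"ANSWER = 42").decode()
sys.meta_path.insert(0, URLFinder({"remote_mod": f"data:text/plain;base64,{payload}"}))

import remote_mod
print(remote_mod.ANSWER)  # 42
```

This only works for pure-Python source, of course; it does nothing for compiled extensions, per the point above.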

AgentME|1 year ago

Deno and npm both store the hashes of all the dependencies you use in a lock file and verify them on future reinstalls.

HumanOstrich|1 year ago

Where do you initially get the magical sha384 hash that proves the integrity of the package the first time it's imported?

shlomo_z|1 year ago

This is a nice feature, but I've not found it to be useful, because my IDE won't recognize these dependencies.

Or is it a skill issue?

zahlman|1 year ago

What exactly do you imagine that such "recognition" would entail? Are you expecting the IDE to provide its own package manager, for example?

EdwardDiego|1 year ago

No, it's the fact that it's a rather new PEP, and our IDEs don't yet support it, because, well, it's rather new.

BossingAround|1 year ago

This looks horrible for anything but personal scripts/projects. For anything close to production purposes, this seems like a nightmare.

dagw|1 year ago

Anything that makes it easier to make a script that I wrote run on a colleague's machine, without having to give them a 45-minute crash course on the current state of Python environment setup and package management, is a huge win in my book.

baq|1 year ago

Don’t use it in production, problem solved.

I find this feature amazing for one-off scripts. It’s removing a cognitive burden I was unconsciously ignoring.

stavros|1 year ago

It's not meant for production.

epistasis|1 year ago

There's about 50 different versions of "production" for Python, and if this particular tool doesn't appear useful to it, you're probably using Python in a very very different way than those of us who find it useful. One of the great things about Python is that it can be used in such diverse ways by people with very very very different needs and use cases.

What does "production" look like in your environment, and why would this be terrible for it?

zahlman|1 year ago

> As mentioned in the article, being able to have inline dependencies in a single-file Python script and running it naturally is just beautiful.

The syntax for this (https://peps.python.org/pep-0723/) isn't uv's work, nor are they first to implement it (https://iscinumpy.dev/post/pep723/). A shebang line like this requires the tool to be installed first, of course; I've repeatedly heard about how people want tooling to be able to bootstrap the Python version, but somehow it's not any more of a problem for users to bootstrap the tooling themselves.

And some pessimism: packaging is still not seen as the core team's responsibility, and uv realistically won't enjoy even the level of special support that Pip has any time soon. As such, tutorials will continue to recommend Pip (along with inferior use patterns for it) for quite some time.

> I have been thinking that a dedicated syntax for inline dependencies would be great, similar to JavaScript's `import ObjectName from 'module-name';` syntax. Python promoted type hints from comment-based to syntax-based, so a similar approach seems feasible.

First off, Python did no such thing. Type annotations are one possible use for an annotation system that was added all the way back in 3.0 (https://peps.python.org/pep-3107/); the original design explicitly contemplated other uses for annotations besides type-checking. When it worked out that people were really only using them for type-checking, standard library support was added (https://peps.python.org/pep-0484/) and expanded upon (https://peps.python.org/pep-0526/ etc.); but this had nothing to do with any specific prior comment-based syntax (which individual tools had up until then devised for themselves).

Python doesn't have existing syntax to annotate import statements; it would have to be designed specifically for the purpose. It's not possible in general (as your example shows) to infer a PyPI name from the `import` name; but not only that, dependency names don't map one-to-one to imports: anything that you install from PyPI may validly define zero or more importable top-level names, and of course the code might directly use a sub-package or an attribute of some module (which doesn't even have to be a class). So there wouldn't be a clear place to put such names except in a separate block by themselves, which the existing comment syntax already does.

Finally, promoting the syntax to an actual part of the language doesn't seem to solve a problem. Using annotations instead of comments for types allows the type information to be discovered at runtime (e.g. through the `__annotations__` attribute of functions). What problem would it solve for packaging? It's already possible for tools to use a PEP 723 comment, and it's also possible (through the standard library - https://docs.python.org/3/library/importlib.metadata.html) to introspect the metadata of installed packages at runtime.
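
A quick sketch of both runtime-introspection points mentioned above (the `fetch` function is just an illustrative stand-in):

```python
# Annotations are discoverable at runtime -- one concrete payoff of moving
# type hints from comments into the syntax:
def fetch(url: str, timeout: float = 5.0) -> bytes:
    ...

print(fetch.__annotations__)
# {'url': <class 'str'>, 'timeout': <class 'float'>, 'return': <class 'bytes'>}

# Installed-distribution metadata is likewise introspectable via the standard
# library, without any new syntax:
import importlib.metadata
for dist in importlib.metadata.distributions():
    print(dist.metadata["Name"], dist.version)
    break  # just show the first one found, if any
```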

nabla9|1 year ago

> by the author

And the author is?

zelphirkalt|1 year ago

Any flow that does not state checksums/hashsums is not ready for production and anything but beautiful. But I haven't used uv yet, so maybe it is possible to specify the dependencies with hashsums in the same file too?

Actually, the order of import statements is one of the things that Python does better than JS. It makes completions much less costly to calculate when you type the code. An IDE or other tool only has to check one module or package for its contents, rather than checking whether any module has a binding of the name so-and-so. If I understand correctly, you are talking about an additional syntax though.

When mentioning a gigantic venv ... Why did they do that? Why not have smaller venvs for separate projects? It is really not that hard to do, and it avoids dependency conflicts between projects which have nothing to do with each other. Using one giant venv is basically telling me that they either did not understand dependency conflicts, or did not care enough about their dependencies, so that one script can run with one set of dependencies one day and another set the other day, because a new project's deps have been added to the mix in the meantime.

Avoiding deps for small scripts is a good thing! If possible.

To me it just reads like a user now having a new tool allowing them to continue the lazy ways of not properly managing dependencies. I mean, all deps in one huge venv? Who does that?? No wonder they had issues. Can't even keep deps separated, let alone properly maintain a lock file with checksums. Yeah, no surprise they'll run into issues with that workflow.

And while we are relating to the JS world: one may complain in many ways about how NPM works, but it has had automatic lock files for aaages, being the default tool in the ecosystem. And its competitors had it too. At least that part they got right for a long time, compared to pip, which does nothing of the sort without extra effort.

dagw|1 year ago

> Why did they do that? Why not have smaller venvs for separate projects?

What's a 'project'? If you count every throwaway data processing script and one-off exploratory Jupyter notebook, that can easily be 100 projects. Certainly before uv, having one huge venv or conda environment with 'everything' installed made it much faster and easier to get that sort of work done.

slightwinder|1 year ago

> Why not have smaller venvs for separate projects?

Because they are annoying and unnecessary additional work. If I write something, I won't know the dependencies in the beginning. And if it's a personal tool/script, or even a throwaway one-shot, then why bother with managing unnecessary parts? I just manage my personal stack of dependencies for my own tools in a giant env, and pull imports from it or not, depending on the moment. This allows me to move fast. Of course it is a liability, but not one which usually bites me. Every few years, some dependency goes wrong, and I either fix it or remove it, but in the end the time I save far outweighs the time I would lose from micromanaging small separate envs.

Managing dependencies is for production and important things. Big messy envs are good enough for everything else. I have hundreds of scripts and tools; micromanaging them on that level has no benefit. And it seems uv now offers some options for making small envs effortless without costing much time, so it's a net benefit in that area, but it's not something world-shattering which will turn my world upside down.

zahlman|1 year ago

>Any flow that does not state checksums/hashsums is not ready for production

It's not designed nor intended for such. There are tons of Python users out there who have no concept of what you would call "production"; they wrote something that requires NumPy to be installed and they want to communicate this as cleanly and simply (and machine-readably) as possible, so that they can give a single Python file to associates and have them be able to use it in an appropriate environment. It's explicitly designed for users who are not planning to package the code properly in a wheel and put it up on PyPI (or a private index) or anything like that.

>and all but beautiful

De gustibus non est disputandum. The point is to have something simple, human-writable and machine-readable, for those to whom it applies. If you need to make a wheel, make one. If you need a proper lock file, use one. Standardization for lock files is finally on the horizon in the ecosystem (https://peps.python.org/pep-0751/).

>Actually the order of import statement is one of the things, that Python does better than JS.

Import statements are effectively unrelated to package management. Each installed package ("distribution package") may validly define zero or more top-level names (of "import packages") which don't necessarily bear any relationship to each other, and the `import` syntax can validly import one or more sub-packages and/or attributes of a package or module (a false distinction, anyway; packages are modules), and rename them.

>An IDE or other tool only has to check one module or package for its contents

The `import` syntax serves these tools by telling them about names defined in installed code, yes. The PEP 723 syntax is completely unrelated: it tells different tools (package managers, environment managers and package installers) about names used for installing code.

>Why not have smaller venvs for separate projects? It is really not that hard to do

It isn't, but it introduces book-keeping (Which venv am I supposed to use for this project? Where is it? Did I put the right things in it already? Should I perhaps remove some stuff from it that I'm no longer using? What will other people need in a venv after I share my code with them?) that some people would prefer to delegate to other tooling.

Historically, creating venvs has been really slow. People have noticed that `uv` solves this problem, and come up with a variety of explanations, most of which are missing the mark. The biggest problem, at least on Linux, is the default expectation of bootstrapping Pip into the new venv; of course uv doesn't do this by default, because it's already there to install packages for you. (This workflow is equally possible with modern versions of Pip, but you have to know some tricks; I describe some of this in https://zahlman.github.io/posts/2025/01/07/python-packaging-... . And it doesn't solve other problems with Pip, of course.) Anyway, the point is that people will make single "sandbox" venvs because it's faster and easier to think about - until the first actual conflict occurs, or the first attempt to package a project and accurately convey its dependencies.
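
The stdlib itself can skip the Pip bootstrap, which is a quick way to see where the cost goes (a small sketch, creating a throwaway venv in a temp directory):

```python
import pathlib
import tempfile
import venv

# Creating the venv skeleton is cheap; bootstrapping pip into it is the slow
# part, which uv skips by default because uv itself installs the packages.
target = pathlib.Path(tempfile.mkdtemp()) / "env"
venv.EnvBuilder(with_pip=False).create(target)
print((target / "pyvenv.cfg").exists())  # True
```

The equivalent CLI form is `python -m venv --without-pip <dir>`.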

> Avoiding deps for small scripts is a good thing! If possible.

I'd like to agree, but that just isn't going to accommodate the entire existing communities of people writing random 100-line analysis scripts with Pandas.

>One may complain in many ways about how NPM works, but it has had automatic lock file for aaages.

Cool, but the issues with Python's packaging system are really not comparable to those of other modern languages. NPM isn't really retrofitted to JavaScript; it's retrofitted to the Node.JS environment, which existed for only months before NPM was introduced. Pip has to support all Python users, and Python is about 18 years older than Pip (19 years older than NPM). NPM was able to do this because Node was a new project that was being specifically designed to enable JavaScript development in a new environment (i.e., places that aren't the user's browser sandbox). By contrast, every time any incremental improvement has been introduced for Python packaging, there have been massive backwards-compatibility concerns. PyPI didn't stop accepting "egg" uploads until August 1 2023 (https://blog.pypi.org/posts/2023-06-26-deprecate-egg-uploads...), for example.

But more importantly, npm doesn't have to worry about extensions to JavaScript code written in arbitrary other languages (for Python, C is common, but by no means exclusive; NumPy is heavily dependent on Fortran, for example) which are expected to be compiled on the user's machine (through a process automatically orchestrated by the installer) with users complaining to anyone they can get to listen (with no attempt at debugging, nor at understanding whose fault the failure was this time) when it doesn't work.

There are many things wrong with the process, and I'm happy to criticize them (and explain them at length). But "everyone else can get this right" is usually a very short-sighted line of argument, even if it's true.