top | item 43096901

something98 | 1 year ago

Can someone explain a non-project based workflow/configuration for uv? I get creating a bespoke folder, repo, and uv venv for certain long-lived projects (like creating different apps?).

But most of my work, since I adopted conda 7ish years ago, involves using the same ML environment across any number of folders or even throw-away notebooks on the desktop, for instance. I’ll create the environment and sometimes add new packages, but rarely update it, unless I feel like a spring cleaning. And I like knowing that I have the same environment across all my machines, so I don’t have to think about if I’m running the same script or notebook on a different machine today.

The idea of a new environment for each of my related “projects” just doesn’t make sense to me. But, I’m open to learning a new workflow.

Addition: I don’t run others’ code, like pretrained models built with specific package requirements.

ayjay_t|1 year ago

`uv` isn't great for that, I've been specifying and rebuilding my environments for each "project".

My one-off notebooks I'm going to set up to be similar to the scripts, which will require some mods.

It does take up a lot more space, but it is quite a bit faster.

However, you could use the workspace concept for this I believe, and have the dependencies for all the projects described in one root folder and then all sub-folders will use the environment.
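For reference, a uv workspace is declared in the root `pyproject.toml`; a minimal sketch (the member globs are made-up examples) might look like:

```toml
# pyproject.toml at the repo root
[project]
name = "ml-sandbox"
version = "0.1.0"
dependencies = ["numpy"]

[tool.uv.workspace]
members = ["notebooks/*", "experiments/*"]
```

Sub-folders matching the `members` globs then share a single resolved lockfile and environment at the root.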

But I mean, our use case is very different from yours; it's not necessary to use uv.

something98|1 year ago

Gotcha. Thank you.

FYI, for anyone else who stumbles upon this: I decided to do a quick check on PyTorch (the most problem-prone dependency I've had) and noticed that they now specifically recommend no longer using conda, and have since last November.

bityard|1 year ago

I personally have a "sandbox" directory that I put one-off and prototype projects in. My rule is that git repos never go in any dir there. I can (and do) go in almost any time and rm anything older than 12 months.

In your case, I guess one thing you could do is have one git repo containing your most commonly-used dependencies and put your sub-projects as directories beneath that? Or even keep a branch for each sub-project?

One thing about `uv` is that dependency resolution is very fast, so updating your venv to switch between "projects" is probably no big deal.

zahlman|1 year ago

> The idea of a new environment for each of my related “projects” just doesn’t make sense to me. But, I’m open to learning a new workflow.

First, let me try to make sense of it for you -

One of uv's big ideas is that it has a much better approach to caching downloaded packages, which lets it create those environments much more quickly. (I guess things like "written in Rust", parallelism etc. help, but as far as I can tell most of the work is stuff like hard-linking files, so it's still limited by system calls.) It also hard-links duplicates, so that you aren't wasting tons of space by having multiple environments with common dependencies.
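A toy illustration of the hard-link mechanism (not uv's actual code): two directory entries share one inode, so the file data is stored only once no matter how many venvs "contain" it.

```shell
# Hard-link demo (uses GNU stat). After `ln`, both names refer to the
# same inode, so no extra data is stored on disk.
tmp=$(mktemp -d)
echo "x = 1" > "$tmp/cached_pkg_file.py"
ln "$tmp/cached_pkg_file.py" "$tmp/linked_into_venv.py"   # hard link, not a copy
ino1=$(stat -c %i "$tmp/cached_pkg_file.py")
ino2=$(stat -c %i "$tmp/linked_into_venv.py")
[ "$ino1" = "$ino2" ] && echo "same inode: data stored once"
rm -r "$tmp"
```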

A big part of the point of making separate environments is that you can track what each project is dependent on separately. In combination with Python ecosystem standards (like `pyproject.toml`, the inline script metadata described by https://peps.python.org/pep-0723/, the upcoming lock file standard in https://peps.python.org/pep-0751/, etc.) you become able to reproduce a minimal environment, automate that reproduction, and create an installable sharable package for the code (a "wheel", generally) which you can publish on PyPI - allowing others to install the code into an environment which is automatically updated to have the needed dependencies. Of course, none of this is new with `uv`, nor depends on it.
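For concreteness, the PEP 723 inline script metadata looks like the header below (the filename and dependency are made-up examples); `uv run` reads it to build a matching one-off environment, while plain `python` treats it as ordinary comments.

```shell
# Write a single-file script with a PEP 723 metadata header.
dir=$(mktemp -d)
cat > "$dir/throwaway.py" <<'EOF'
# /// script
# requires-python = ">=3.9"
# dependencies = ["requests<3"]
# ///
import requests
print(requests.get("https://example.com").status_code)
EOF
# `uv run "$dir/throwaway.py"` would create a temporary env with
# requests installed before executing the script.
grep -c '^# ///' "$dir/throwaway.py"   # prints 2: opening and closing markers
```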

The installer and venv management tool I'm developing (https://github.com/zahlman/paper) is intended to address use cases like yours more directly. It isn't a workflow tool, but it's intended to make it easier to set up new venvs, install packages into venvs (and say which venv to install them into), and then you can just activate the venv you want normally.

(I'm thinking of having it maintain a mapping of symbolic names for the venvs it creates, and a command to look them up - so you could do things like "source `paper env-path foo`/bin/activate", or maybe put a thin wrapper around that. But I want to try very hard to avoid creating the impression of implementing any kind of integrated development tool - it's an integrated user tool, for setting up applications and libraries.)

cdavid|1 year ago

That's my main use case that's not yet supported by uv. It shouldn't be too difficult to add a feature or wrapper to uv so that it works like pew/virtualenvwrapper.

E.g. calling that wrapper uvv, something like

  1. uvv new <venv-name> --python=... ... # venvs stored in a central location
  2. uvv workon <venv-name> # now you are in the virtualenv
  3. deactivate # now you get out of the virtualenv

You could imagine additional features, such as keeping a log of the installed packages inside the venv so that you could revert to an arbitrary state, etc., as goodies given how much faster uv is.
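A minimal sketch of such a wrapper as a shell function (the `uvv` name and `~/.venvs` location are assumptions from the parent comment; `uv venv` and `--python` are real uv commands):

```shell
# Hypothetical uvv wrapper: keeps all venvs in one central directory.
uvv() {
    cmd=$1; name=$2
    case "$cmd" in
        new)    shift 2; uv venv "$HOME/.venvs/$name" "$@" ;;  # e.g. uvv new ml --python=3.12
        workon) . "$HOME/.venvs/$name/bin/activate" ;;         # plain `deactivate` to exit
        *)      echo "usage: uvv {new|workon} <venv-name> [args]" ;;
    esac
}
```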

Doxin|1 year ago

So this is probably just me not understanding your use case, but surely this is a nearly identical workflow?

  1. uv init <folder-name> # venv stored in folder-name/.venv
  2. cd <folder-name> # running stuff with uv run will automatically pick up the venv
  3. cd .. # now you get out of the virtualenv

dagw|1 year ago

I've worked like you described for years and it mostly works. Although I've recently started to experiment with a new uv based workflow that looks like this:

To open a notebook I run (via an alias)

    uv tool run jupyter lab
and then in the first cell of each notebook I have

    !uv pip install my-dependencies
This takes care of all the venv management stuff and makes sure that I always have the dependencies I need for each notebook. Only been doing this for a few weeks, but so far so good.

uneekname|1 year ago

Why not just copy your last env into the next dir? If you need to change any of the package versions, or add something specific, you can do that without risking any breakages in your last project(s). From what I understand uv has a global package cache so the disk usage shouldn't be crazy.

BrenBarn|1 year ago

Yeah, this is how I feel too. A lot of the movement in Python packaging seems to be more in managing projects than managing packages or even environments. I tend to not want to think about a "project" until very late in the game, after I've already written a bunch of code. I don't want "make a project" to be something I'm required or even encouraged to do at the outset.

lmm|1 year ago

I have the opposite feeling, and that's why I like uv. I don't want to deal with "environments". When I run a Python project I want its PYTHONPATH to have whatever libraries its config file says it should have, and I don't want to have to worry about how they get there.

dharmab|1 year ago

I set up a "sandbox" project as an early step of setting up a new PC.

Sadly for certain types of projects like GIS, ML, scientific computing, the dependencies tend to be mutually incompatible and I've learned the hard way to set up new projects for each separate task when using those packages. `uv init; uv add <dependencies>` is a small amount of work to avoid the headaches of Torch etc.

Zizizizz|1 year ago

Just symlink the virtualenv folder and the pyproject.toml it creates into whatever other project you want to use it from.
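Concretely, that could look like the following (temp dirs stand in for real project folders here):

```shell
# Sketch: reuse one project's venv elsewhere by symlinking it in.
base=$(mktemp -d)   # stands in for the long-lived base project
new=$(mktemp -d)    # stands in for a throwaway folder
mkdir "$base/.venv"
touch "$base/pyproject.toml"
ln -s "$base/.venv" "$new/.venv"
ln -s "$base/pyproject.toml" "$new/pyproject.toml"
ls -l "$new"        # both entries now point back at $base
```

Running `uv run` inside the new folder would then resolve against the shared environment.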