something98|1 year ago
But most of my work, since I adopted conda 7ish years ago, involves using the same ML environment across any number of folders, or even throw-away notebooks on the desktop, for instance. I’ll create the environment and sometimes add new packages, but rarely update it, unless I feel like a spring cleaning. And I like knowing that I have the same environment across all my machines, so I don’t have to think about whether I’m running the same script or notebook on a different machine today.
The idea of a new environment for each of my related “projects” just doesn’t make sense to me. But, I’m open to learning a new workflow.
Addition: I don’t run others’ code, like pretrained models built with specific package requirements.
ayjay_t|1 year ago
My one-off notebooks I'm going to set up similarly to the scripts, which will require some mods.
It does take up a lot more space, but it is quite a bit faster.
However, you could use the workspace concept for this I believe, and have the dependencies for all the projects described in one root folder and then all sub-folders will use the environment.
But I mean, our use case is very different from yours; it's not necessary to use uv.
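The workspace idea mentioned above could look roughly like this (a sketch; the project name and folder layout are made up): a root `pyproject.toml` declares the shared dependencies and lists the sub-folders as workspace members, so all of them resolve against one shared lockfile and environment at the root.

```toml
# root pyproject.toml (hypothetical layout; names are illustrative)
[project]
name = "ml-root"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = ["numpy", "pandas"]   # shared across all sub-projects

[tool.uv.workspace]
members = ["projects/*"]             # each sub-folder becomes a workspace member
```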
something98|1 year ago
FYI, for anyone else that stumbles upon this: I decided to do a quick check on PyTorch (the most problem-prone dependency I've had), and noticed that they now specifically recommend no longer using conda, and have since last November.
bityard|1 year ago
In your case, I guess one thing you could do is have one git repo containing your most commonly used dependencies and put your sub-projects as directories beneath that? Or even keep a branch for each sub-project?
One thing about `uv` is that dependency resolution is very fast, so updating your venv to switch between "projects" is probably no big deal.
zahlman|1 year ago
First, let me try to make sense of it for you -
One of uv's big ideas is that it has a much better approach to caching downloaded packages, which lets it create those environments much more quickly. (I guess things like "written in Rust", parallelism etc. help, but as far as I can tell most of the work is stuff like hard-linking files, so it's still limited by system calls.) It also hard-links duplicates, so that you aren't wasting tons of space by having multiple environments with common dependencies.
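As a minimal illustration of the hard-linking point (this is just the underlying OS mechanism, not uv's actual implementation): a hard link is a second directory entry for the same inode, so the file's bytes exist once on disk no matter how many environments reference them.

```python
import os
import tempfile

# Sketch of why hard-linked environments cost little extra space:
# both paths below name the same inode, so the data is stored once.
with tempfile.TemporaryDirectory() as d:
    cached = os.path.join(d, "cached_module.py")  # stand-in for the package cache
    in_env = os.path.join(d, "env_module.py")     # stand-in for a venv's copy
    with open(cached, "w") as f:
        f.write("x = 1\n" * 1000)
    os.link(cached, in_env)                       # hard link, not a byte copy
    same_inode = os.stat(cached).st_ino == os.stat(in_env).st_ino
    link_count = os.stat(in_env).st_nlink
    print(same_inode, link_count)                 # True 2: two names, one file
```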
A big part of the point of making separate environments is that you can track what each project is dependent on separately. In combination with Python ecosystem standards (like `pyproject.toml`, the inline script metadata described by https://peps.python.org/pep-0723/, the upcoming lock file standard in https://peps.python.org/pep-0751/, etc.) you become able to reproduce a minimal environment, automate that reproduction, and create an installable sharable package for the code (a "wheel", generally) which you can publish on PyPI - allowing others to install the code into an environment which is automatically updated to have the needed dependencies. Of course, none of this is new with `uv`, nor depends on it.
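For concreteness, the PEP 723 inline script metadata mentioned above is just a specially formatted comment block at the top of a script; a tool like `uv run` reads it and builds a matching environment on the fly. A minimal sketch (the dependency is illustrative), along with a simplified version of the regex the PEP describes for locating the block:

```python
import re
import textwrap

# A script carrying PEP 723 inline metadata (the dependency is illustrative).
# Running it with `uv run script.py` would provision an env with requests.
SCRIPT = textwrap.dedent("""\
    # /// script
    # requires-python = ">=3.9"
    # dependencies = ["requests<3"]
    # ///
    import requests
""")

# Simplified form of the regex PEP 723 gives for locating the metadata block.
BLOCK = re.compile(r"(?m)^# /// script$\n(?P<content>(?:^#(?:| .*)$\n)+)^# ///$")

match = BLOCK.search(SCRIPT)
# Strip the leading "# " from each line to recover the embedded TOML.
metadata = "\n".join(line[2:] for line in match.group("content").splitlines())
print(metadata)
```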
The installer and venv management tool I'm developing (https://github.com/zahlman/paper) is intended to address use cases like yours more directly. It isn't a workflow tool, but it's intended to make it easier to set up new venvs, install packages into venvs (and say which venv to install them into), and then you can just activate the venv you want normally.
(I'm thinking of having it maintain a mapping of symbolic names for the venvs it creates, and a command to look them up - so you could do things like "source `paper env-path foo`/bin/activate", or maybe put a thin wrapper around that. But I want to try very hard to avoid creating the impression of implementing any kind of integrated development tool - it's an integrated user tool, for setting up applications and libraries.)
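That idea might look something like the following wrapper (hypothetical: the `paper env-path` command is the speculative feature described above, not something that exists yet):

```shell
# Hypothetical wrapper around the speculative `paper env-path` command:
# resolve a venv by symbolic name, then activate it in the current shell.
pactivate() {
    local env_dir
    env_dir="$(paper env-path "$1")" || return 1   # name -> venv path
    # shellcheck disable=SC1091
    . "$env_dir/bin/activate"
}

# usage (hypothetical): pactivate foo
```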
cdavid|1 year ago
E.g. calling that wrapper uvv, something like
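A sketch of such a wrapper (hypothetical: the `uvv` name is from the comment above, while the `~/.venvs` location and `VENV_HOME` override are my own assumptions):

```shell
# Hypothetical `uvv` wrapper: keep named venvs under ~/.venvs (an assumed
# location, overridable via $VENV_HOME), create them on demand with uv,
# then activate the chosen one in the current shell.
uvv() {
    local dir="${VENV_HOME:-$HOME/.venvs}/$1"
    [ -d "$dir" ] || uv venv "$dir" || return 1   # create once, reuse after
    # shellcheck disable=SC1091
    . "$dir/bin/activate"
}

# usage (hypothetical): uvv mlenv
```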
You could imagine additional features, such as keeping a log of the installed packages inside the venv so that you could revert to an arbitrary state, etc., as goodies, given how much faster uv is.
Doxin|1 year ago
1. uv init <folder-name>  # venv stored in <folder-name>/.venv
2. cd <folder-name>  # running stuff with uv run will automatically pick up the venv
3. cd ..  # now you're out of the virtualenv
dagw|1 year ago
To open a notebook I run a short alias, and then in the first cell of each notebook I have a small setup snippet. This takes care of all the venv management stuff and makes sure that I always have the dependencies I need for each notebook. Only been doing this for a few weeks, but so far so good.
dharmab|1 year ago
Sadly, for certain types of projects, like GIS, ML, and scientific computing, the dependencies tend to be mutually incompatible, and I've learned the hard way to set up a new project for each separate task when using those packages. `uv init; uv add <dependencies>` is a small amount of work to avoid the headaches of Torch etc.
Zizizizz|1 year ago