Show HN: Wetlands – a lightweight Python library for managing Conda environments
30 points | arthursw | 9 months ago | arthursw.github.io
Wetlands not only simplifies the creation of isolated Conda environments with specific dependencies, but also allows you to run arbitrary Python code within those environments and retrieve the results. It uses the multiprocessing.connection and pickle modules for inter-process communication. Additionally, one can easily use shared memory between the environments, making data exchange more efficient.
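As a rough illustration of the mechanism (this is a minimal standard-library sketch, not the Wetlands API itself), multiprocessing.connection lets a parent process exchange pickled Python objects with another process; a thread stands in for the second process here:

```python
# Minimal sketch of the IPC mechanism (not the Wetlands API itself):
# multiprocessing.connection pickles arbitrary Python objects across
# a process boundary. A thread stands in for the second process here.
from multiprocessing.connection import Client, Listener
import threading

def worker(address, authkey):
    # The "isolated environment" side: receive a request, send a result.
    with Client(address, authkey=authkey) as conn:
        func_name, args = conn.recv()           # unpickled automatically
        if func_name == "square":
            conn.send([a * a for a in args])    # result is pickled back

authkey = b"secret"
with Listener(("localhost", 0), authkey=authkey) as listener:
    t = threading.Thread(target=worker, args=(listener.address, authkey))
    t.start()
    with listener.accept() as conn:
        conn.send(("square", [1, 2, 3]))
        result = conn.recv()
    t.join()

print(result)  # [1, 4, 9]
```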
Docs: https://arthursw.github.io/wetlands/latest/
Source: https://github.com/arthursw/wetlands/
I’d really appreciate any feedback. Thanks!
mushufasa|9 months ago
I've been using Conda for 10 years as my default package manager on my devices (not pipenv or poetry etc). I started because it was "the way" for data science but I kept with it because the syntax is really intuitive to me (conda create, conda activate).
I'm not sure what problem you are solving here -- the issue with conda, IMO, is that it is overkill for the rest of the Python community, so conda-forge has gradually declined, and I typically create a conda environment and then use pip for the latest libraries. Managing the conda environments, though, is not my issue -- that part works so well that I keep with it.
If you could explain why you created this and what problems you are solving with an example, that would be helpful. All package managers are aimed at "avoiding dependency conflicts" so that doesn't really communicate to me what this is and what real problem it solves.
N1H1L|9 months ago
superkuh|9 months ago
jpecar|9 months ago
Jokes aside, this feels very meta: a package manager for a package manager for a package manager. It reminds me of the old RFC 1925: "it is always possible to add another level of indirection". That RFC also says "perfection has been reached not when there is nothing left to add, but when there is nothing left to take away".
And as an HPC admin, I'm not offering my users any help with conda; I let them suffer on their own. Instead I'm showing them the greener pastures of spack, easybuild and eessi whenever I can. And they're slowly migrating over.
vindex10|9 months ago
could you elaborate a bit more on why the HPC world is special when it comes to configuring environments?
I always feel this is a typical problem in software development: separating the operating system environment from the application environment.
do you use spack / easybuild on your personal computer, for example when you need to install a package that is not part of the distribution?
arthursw|9 months ago
I made this library for a workflow management system, which can use any tool packaged with Conda, not just Python tools. The tools can be binaries written in C++, Java programs, or anything Conda can package. Note that Docker is not an option, because it cannot be installed automatically on all platforms (and because of its performance on non-Linux OSes).
My users do not have to worry about command lines to install tools since Wetlands is installed in the workflow management system. Each tool is installed when the user executes a workflow using it.
In the bio-image analysis and medical imaging communities, as well as many others, scientists are often unfamiliar with the Python ecosystem and the concept of virtual environments. However, they rely heavily on a wide range of tools, each with numerous dependencies, written in various languages. Applications with a built-in package management system like Wetlands greatly simplify their workflow by handling the complex task of setting up environments for these tools behind the scenes.
For example, Napari is an excellent viewer for multi-dimensional images, written in Python, which can easily be extended via plugins. There are hundreds of plugins for tasks like image denoising, registration, segmentation, particle tracking, etc. Plugins depend on tools (like Segment-Anything-Model, Cellpose, StarDist, etc.) that cannot all be installed in the same environment. Wetlands can come to the rescue and isolate each plugin in its own environment.
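The underlying pattern is roughly this (an illustrative sketch only, not the Wetlands API; sys.executable stands in for an environment-specific interpreter, and the doubling function is a stand-in for a real plugin computation):

```python
# Illustrative sketch of the isolation pattern (not the Wetlands API):
# run a function in a separate interpreter and retrieve the result as a
# pickled object. sys.executable stands in for an environment-specific
# interpreter; with Wetlands it would be the environment's own Python.
import pickle
import subprocess
import sys

# The "plugin" code executed in the other interpreter.
WORKER = """
import pickle, sys
args = pickle.load(sys.stdin.buffer)      # receive pickled arguments
result = [a * 2 for a in args]            # stand-in for e.g. segmentation
pickle.dump(result, sys.stdout.buffer)    # send the pickled result back
"""

proc = subprocess.run(
    [sys.executable, "-c", WORKER],
    input=pickle.dumps([1, 2, 3]),
    capture_output=True,
    check=True,
)
result = pickle.loads(proc.stdout)
print(result)  # [2, 4, 6]
```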
I hope the purpose of Wetlands is clearer now :)
barapa|9 months ago
phronimos|9 months ago
[0]: https://docs.conda.io/projects/conda-build/en/latest/resourc...
ElectricalUnion|9 months ago
Now it's mostly behind us, but there used to be a time when PyPI didn't have wheels (a 2012 thing), or manylinux wheels (a 2016 thing), for most libraries. pip install was a world of pain if you didn't have the "correct source packages" on your system.
And several of the projects built back then are no longer projects but deployed systems; they might as well stick to what is working.
joppy|9 months ago
One thing a conda package can do which a PyPI package cannot is have binary dependencies: a conda package is linked upon installation, and packages can declare dependencies on shared libraries. A common example is numeric libraries depending on a BLAS implementation: in a conda/pixi environment you will get exactly one BLAS shared library linked into your process, used by numpy, scipy, optimisers, etc. For some foundational libraries like BLAS, which have multiple implementations, the user even has the power to consistently switch the implementation within the environment, e.g. from OpenBLAS to Intel's MKL.
The PyPI package format does not allow binary dependencies: wheels must be self-contained when it comes to binary code (though not when it comes to Python code, which hopefully makes it clear that something here is inconsistent). Take any numerical Python environment and enumerate the copies of BLAS in it: you will probably find 3-5, all running their own threadpools.
Another very simple example is built-in modules depending on native code, like the sqlite3 module. In a conda/pixi installation you are guaranteed that the Python binary links against the same sqlite3 code as the command-line sqlite3 CLI tool in the same environment. Stuff like this removes many cross-language or cross-tool hassles.
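The sqlite3 point is easy to check directly (a small sketch using only the standard library; note the CLI may not be on PATH). In a conda/pixi environment the two versions printed below should match; with a system Python they often differ:

```python
# Compare the SQLite library linked into Python's sqlite3 module with
# the version reported by the sqlite3 CLI tool on PATH (if any). In a
# conda/pixi environment these come from the same package; elsewhere
# they may be two different builds of SQLite.
import shutil
import sqlite3
import subprocess

print("python sqlite3 module:", sqlite3.sqlite_version)

cli = shutil.which("sqlite3")
if cli:
    out = subprocess.run([cli, "--version"], capture_output=True, text=True)
    print("sqlite3 CLI tool:", out.stdout.split()[0])
else:
    print("no sqlite3 CLI on PATH")
```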
I prefer uv or poetry if I’m doing anything simple or pure python (or perhaps with a small binary dependency like an event loop). But pixi is the way to go for large environments with lots of extra tools and numerical libraries.
blactuary|9 months ago
When I read the uv docs and see other people's examples, I have a hard time understanding how it works for my workflow. It seems I could continue using conda for environment management and only use uv for package installation, and it would be much faster, but that also feels a little shaky, with potential for errors from combining the two tools; and since mamba became the default solver, conda is pretty fast, even when building a new env from scratch.
It feels like conda, with its ability to have multiple Python versions and env management built in, gives me more than uv, just without the package installation speed. But I am certainly open to someone explaining uv to me in a way that disproves that.
jessekv|9 months ago
uv replaces pip; conda and pip have been complementary for a long time. But I would be surprised if uv does not take on conda at some point, e.g. with a micromamba subcommand.
martinky24|9 months ago
chillpenguin|9 months ago
whalesalad|9 months ago
pyenv is all you need. it manages python versions and python virtual environments. you can create and destroy them just as easily as git branches.
pyenv + good ol' requirements.txt is really all you need.
if your env dictates containers, it's even easier to work with. FROM python:version and done.
mrweasel|9 months ago
Please don't. Never have a tool that automatically reaches out onto the internet to get a binary and then run it. Just let the user know that they need to install either pixi or micromamba. It's inherently unsafe and you don't know what will be put into those binaries in the future.
Maybe it's because I don't have a use case for this, but I don't really get what this is for. It's interesting, but I'm not really sure where I'd use it.
unknown|9 months ago
[deleted]