top | item 35131357

How Python virtual environments work

332 points| amardeep | 3 years ago |snarky.ca

283 comments

_coveredInBees|3 years ago

I'm surprised at the number of people here complaining about venvs in Python. There are lots of warts when it comes to package management in Python, but the built-in venv support has been rock solid in Python 3 for a long time now.

Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

We've been using vanilla python venvs across our company for many years now, and in all our CI/CD pipelines and have had zero issues on the venv side of things. And this is while using libraries like numpy, scipy, torch/torchvision, etc.

whalesalad|3 years ago

I've been using Python since like 2006, so maybe I just have that generational knowledge and battlefront experience... but whenever I come into threads like this I really feel like an imposter or a fish out of water. Like, am I using the same Python that everyone else is using? I echo your stance - the less overhead and additional tooling the better. A simple requirements.txt file and pip is all I need.

TheRealPomax|3 years ago

Except when you try to move it, or copy it to a different location. This _almost_ made sense back when it was its own script, but it hasn't made sense for years, and the obstinacy to just sit down and fix this has been bafflingly remarkable.

("why not make everyone install their own venv and run pip install?" because, and here's the part that's going to blow your mind: because they shouldn't have to. The vast majority of packages don't need compilation; you just put the files in the right libs dir, and done. Your import works. Checking this kind of thing into version control, or moving it across disks, etc. should be fine and expected. Python yelling about dependencies that do need to be (re)compiled for your os/python combination should be the exception, not the baseline)
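
A minimal sketch of that point (the package name `mypkg` is invented for illustration): a pure-Python package is just files on the import path, so dropping it into any directory and pointing PYTHONPATH there is enough for the import to work — no venv, no compilation.

```shell
# Create a trivial pure-python "package" in an arbitrary directory
mkdir -p ./libs/mypkg
printf 'VALUE = 42\n' > ./libs/mypkg/__init__.py

# Point the interpreter at it and the import just works
PYTHONPATH=./libs python3 -c "import mypkg; print(mypkg.VALUE)"   # prints 42
```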

black3r|3 years ago

> Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

That's because vanilla Python venvs feel like a genius idea that wasn't thought out thoroughly; they feel as if something's missing... So naturally there are lots of attempts at improvements, and people jump at those...

And when you think about it in more depth, venvs are themselves just another of the solutions used to fix the horrible mess that is Python's management of packages and sys.path...

The "Zen of Python" says "There should be one-- and preferably only one --obvious way to do it.", so I can't understand why it's nowhere near as easy when it comes to Python's package management...

aflag|3 years ago

It's incredibly lacking in features. PyPI doesn't even properly index packages, making pip go into this dependency resolution hell trying to find a set of versions that will work for you. It works for simple cases with few dependencies and not a lot of pinning, but if your needs are a bit more complex it certainly shows its rough edges.

I actually find it amazing that the Python community puts up with that. But I suppose fixing it is not that pressing now that the language is widely adopted. It's not going to be anyone's priority to mess with that. It's a high-risk, low-reward sort of project.

hot_gril|3 years ago

I've never used anything but vanilla Python venvs, and no they don't work reliably. What does is a Docker container. I keep hearing excuses for it, but the prevalence of Dockerfiles in GitHub Python projects says it all. This is somehow way less of an issue in NodeJS, maybe because local environments were always the default way to install things.

crabbone|3 years ago

The most important part about venv is that you shouldn't need it. The very fact that it exists is a problem. It is the wrong fix to a problem that was left unfixed because of it.

The problem is fundamental in Python in that its runtime doesn't have a concept of a program or a library or a module (not to be confused with Python's modules, which is a special built-in type) etc. The only thing that exists in Python is a "Python system", i.e. an installation of Python with some packages.

Python systems aren't built to be shared between programs (especially so because it's undefined what a program is in Python), but, by any plausible definition of a program, venv doesn't help to solve the problem. This is also amplified by a bunch of tools that simply ignore venvs' existence.

Here are some obvious problems venv doesn't even pretend to solve:

* A Python native module linking with shared objects outside of Python's lib subtree. Most comically, you can accidentally link a python module in one installation of Python with Python from a "wrong" location (and also a wrong version). And then wonder how it works on your computer in your virtual environment, but not on the server.

* venvs provide no compile-time isolation. If you are building native Python modules, you are going to use system-wide installed headers, and pray that your system headers are compatible with the version of Python that's going to load your native modules.

* venv doesn't address PYTHONPATH or any "tricks" various popular libraries (such as pytest and setuptools) like to play with the path where Python searches for loadable code. So much so that people using these tools often use them contrary to how they should be used (probably in most cases that's what happens). Ironically, often even the authors of the tools don't understand the adverse effects of how the majority is using their tools in combination with venv.

* It's become a fashion to use venv when distributing Python programs (eg. there are tools that help you build DEB or RPM packages that rely on venv) and of course, a lot of bad things happen because of that. But, really, like I said before: it's not because of venv, it's because venv is the wrong fix for the actual problem. The problem nobody in Python community is bold enough to address.

9dev|3 years ago

Oh, yeah? It’s working great? Like figuring out which packages your application actually uses? Or having separate development and production dependencies? Upgrading outdated libraries?

Having taken a deep-dive into refactoring a large python app, I can confidently say that package management in python is a pain compared to other interpreted languages.

Jackevansevo|3 years ago

Likewise, I think people have a negative first experience because it doesn't work exactly like node, throw their toys out the pram and complain on HN for the rest of time.

Guess in taking this stance we're both part of the problem... \s

winrid|3 years ago

Because even with --copy it creates all kinds of symlinks, and if you're using pyenv, hard coded paths to the python binary which can break from CI to installation.

If you're using docker then it's a lot easier I guess.

emptysongglass|3 years ago

But why bother? Just use PDM in PEP-582 mode [1], which handles packages the same way as project-scoped Node packages. Virtual environments are just a stop-gap that persisted long enough for a whole ecosystem of tooling to support them. It doesn't make them less bad, just less frustrating to deal with.

[1] https://pdm.fming.dev/latest/usage/pep582/

smeagull|3 years ago

My complaints stem from libraries/OSes requiring different tools. So conda is sometimes required, and pip is also sometimes required, and some provide documentation only for pipenv rather than venv. And then you've got Jupyter, which needs to be configured for each environment.

On top of that there are some large libraries that need to only be installed once per system because they're large, which you can do but does mess with dependency resolution, and god help you if you have multiple shadowing versions of the same library installed.

I wish it was simpler. I agree the underlying system is solid, but the fact that it doesn't solve some issues means we have multiple standards layered on top, which is itself a problem.

And great if you've been using vanilla venvs. Good for those that can. If I want hardware support for Apple's hardware I need to use fucking conda. Heaven help me if I want to combine that in a project with something that only uses pip.

nl|3 years ago

I agree with this 100%. Simple venv works reliably.

The only gotcha I've had is to make sure you deactivate and reactivate the virtual environment after installing Jupyter (or iPython). Otherwise (depending on your shell) you might have the executable path to "jupyter" cached by your shell so "jupyter notebook" will be using different dependencies to what you think.

Even comparatively experienced programmers don't seem to know this, and it causes a lot of subtle problems.

Here's some details on how bash caches paths: https://unix.stackexchange.com/questions/5609/how-do-i-clear...
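
For reference, clearing the cache is a one-liner; this is standard bash behavior, not anything venv-specific (a sketch, assuming a bash shell and that jupyter was just installed into the active venv):

```shell
# bash remembers where it last found "jupyter"; after installing it into
# the venv, the stale cached path can win over the venv's bin/ copy.
hash -r        # forget all remembered command locations
type jupyter   # check where the shell resolves it now (should be the venv's bin/)
```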

atoav|3 years ago

I agree with the statement that venvs are usable and fine. However, they do not come without their pitfalls in the greater picture of development and deployment of python software.

It's very often not as simple as going to your target system, cloning the repo, and running a single-line command that gives you the running software. This is what e.g. Rust's cargo would do.

The problem with Python venvs is that when problems occur, they require a lot of deep knowledge very fast, and that deep knowledge will not be available to the typical beginner. Even I, as a decade-long Python dev, will occasionally encounter stuff that takes longer to resolve than it should.

renewiltord|3 years ago

The annoying thing with vanilla venvs (which are principally what I use) is that when I activate a venv, I can no longer `nvim` in that directory because that venv is not going to have `python-neovim` installed. This kind of state leakage is unpleasant for me to work with.

buildbot|3 years ago

I personally hate Conda with a fiery passion - it does so much weird magic and ends up breaking things in non-obvious ways. Python works best when you keep it really simple. Just a python -m venv per project, a requirements.txt, and you will basically never have issues.

crabbone|3 years ago

I remember when Conda first appeared... I was so high on hopium... well, the word "hopium" didn't exist yet...

Anyways. Today I have to help scientists to deal with it. And... I didn't think it was possible to be worse than pip or other Python tools, but they convinced me it is. Conda is the worst Python program of note that I had ever dealt with. It's so spectacularly bad it's breathtaking. You can almost literally take a random piece of its source code and post it to user boards that make fun of bad code. When I have a bad day, I just open its code in a random file, and like that doctor who was happy running around the imaging room exclaiming "I'm so, so, so happy! I'm so unimaginably happy that I'm not this guy! (pointing at an X-ray in his hand)" I'm happy I'm not the author of this program. I would've just died of shame and depression if I was.

bobbylarrybobby|3 years ago

If it were really that simple, surely all these other solutions wouldn't exist?

insane_dreamer|3 years ago

I'm the opposite. We have to maintain a lot of different environments for different projects, and with conda things "just work" (esp now with better PyCharm integration). Venv is much more of a hassle.

jstx1|3 years ago

With plain venv it’s hard to maintain multiple different Python versions on the same machine; conda makes this much easier.

Also on M1/M2 Macs some libraries (especially for ML) are only available through conda-forge.

lmm|3 years ago

> Just a python -m venv per project, a requirements.txt, and you will basically never have issues.

As long as you always remember to run exactly the right two commands every time you work on any project and never run the wrong ones, or run project A's commands in project B's terminal. There's no undo BTW, if you ever do that you've permanently screwed up your setup for project B and there's no way to get back to what you had before (your best bet is destroying and recreating project B's venv, but that will probably leave you with different versions of dependencies installed from what you had before).

(And as others have said, that doesn't cover multiple versions of Python, or native dependencies. But even if your project is pure python with only pure python dependencies that work on all versions, venv+pip is very easy to mess up and impossible to fix when you do)

pbronez|3 years ago

Until you want to use anything with a c extension..

tomalaci|3 years ago

I would highly recommend Poetry for Python package management. It basically wraps around pip and venvs, offering a lot of convenience features (managing packages, doing dist builds, etc.). It also works pretty nicely with Tox.

I would recommend using the virtualenvs.in-project setting so Poetry generates the venv in the project folder and not in some temporary user folder.

peterhil|3 years ago

I just compared and evaluated Hatch, Flit, Poetry and Pdm and found Pdm to be most robust and slimmest. Hatch was a good second option, and Poetry and Hatch are easy to use, but have too much bloat and magic.

davidktr|3 years ago

100% this. I've always struggled with creating packages, but now simply do poetry init and I am done. Magic.

nerdponx|3 years ago

I prefer Hatch over Poetry. I don't have any strong reason for that preference, I've just used both and I feel more comfortable with Hatch. It feels a little more seamlessly integrated with other Python tools, and I appreciate the developers' conservative approach to adding features.

winrid|3 years ago

Thanks. I recently spent a whole afternoon learning how to package a new python project. Was really surprised at the difficulty even with venv, compared to node and java.

Max_Limelihood|3 years ago

Answer: they don’t

(Seriously, I’ve gotten so fed up with Python package management that I just use CondaPkg.jl, which uses Julia’s package manager to take care of Python packages. It is just so much cleaner and easier to use than anything in Python.)

dangerlibrary|3 years ago

I hate python package management - I really do. But I've never actually had a problem with virtual environments, and I think it's because I just use virtualenv directly (rather than conda or whatever else).

I have these aliases in my .bashrc, and I can't remember the last time I had a major issue.

alias venv='rm -rf ./venv && virtualenv venv && source ./venv/bin/activate'

alias vact='source ./venv/bin/activate'

alias pinstall='source ./venv/bin/activate && pip install . && pip install -r ./requirements.txt && pip install -r ./test_requirements.txt'

I don't have all the fancy features, like automatically activating the virtualenv when I cd into the directory, but I've always found those to be a bigger headache than they are worth. And if I ever run into some incompatibility or duplicate library or something, I blow away the old venv and start fresh. It's a good excuse to get up and make a cup of tea.

birdstheword5|3 years ago

It sounds mean to say it, but it's 100% true. I moved away from using python wherever I can. I've had colleagues struggle for days to install well used packages like pandas and numpy in conda.

swyx|3 years ago

also there's like 3 different flavors of virtual env now and me being 8 years out of date with my python skillz i have no idea what the current SOTA is with python venv tooling :/

i dont need them demystified, i need someone smarter than me to just tell me what to do lol

cmcconomy|3 years ago

My personal approach is:

- use miniconda ONLY to create a folder structure to store packages and to specify a version of python (3.10 for example)

- use jazzband/pip-tools' "pip-compile" to create a frozen/pinned manifest for all my dependencies

- use pip install to actually install libraries (keeping things stock standard here)

- wrap all the above in a Makefile so I am spared remembering all the esoteric commands I need to pull this all together

in practice, this means once I have a project together I am:

- activating a conda environment

- occasionally using 'make update' to invoke pip-compile (adding new libraries or upgrading), and

- otherwise using 'make install' to install a known working dependency list.
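
A rough sketch of what those two Makefile targets might wrap (the file names requirements.in/requirements.txt follow pip-tools convention; the exact flags are my assumption, not the actual Makefile):

```shell
# "make update": re-resolve requirements.in and pin every transitive
# dependency into requirements.txt
pip-compile --upgrade --output-file requirements.txt requirements.in

# "make install": install the known-good pinned set, nothing else
pip install -r requirements.txt
```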

raihansaputra|3 years ago

thanks for sharing. I've thought about the same approach. Conda installs are... annoying, to say the least, but they do provide a better UX compared to manually managing venvs. Your approach seems mature. (why not ./venv/ per project? because you can't do that when your project directory is on another disk) (also I got burned with poetry in regard to very long dependency checking. I'm not making libraries, just an environment for my own projects)

nose-wuzzy-pad|3 years ago

This seems simple and low-drag. Do you have an example you can share?

Thanks!

josteink|3 years ago

All other languages: use whatever packages you like. You’ll be fine.

Python: we’re going to force all packages from all projects and repos to be installed in a shared global environment, but since nobody actually wants that we will allow you to circumvent that by creating “virtual” environments you can maintain and have to deal with instead. Also remember to activate it before starting your editor or else lulz. And don’t use the same editor instance for multiple projects. Are you crazy???

Also: Python “just works”, unlike all those other silly languages.

Somebody IMO needs to get off their high horse. I can’t believe Python users are defending this nonsense for real. This must be a severe case of Stockholm syndrome.

rad_gruchalski|3 years ago

Yeah, how does it go? There’s at least one obvious way to do something? Python… makes me anxious. I don’t mind writing Python but setting it up is crazy.

analog31|3 years ago

I'm a long time Python user, and I don't defend virtual environments at all. I just don't use them. Granted, I'm a so called "scientific" programmer, and am not writing production code. I haven't run into the problems that are solved by virtual environments, nor have my colleagues. Sure, it means I'm probably living in a bubble, but it may be a bubble shared by a lot of people.

Python is the first language that I've used, where the user community is a major attraction, resulting in significant inertia. Replacing Python requires a new language and a new community. Also, the tools that helped build that community, such as Google and Stackoverflow, have (by some accounts) deteriorated.

If package management is that bad, then yeah, time to switch languages.

No high horse here. Like Sancho Panza, I have to be content with an ass. ;-)

pnt12|3 years ago

What an awful take.

Python is 30 years old and back then system packages were more common, that's it. It's been a bumpy ride and I'd prefer if there was a standard manager, but the community has produced some decent ones.

The python community has always tried to make devs happy. Pip has an amazing number of libraries. Virtual envs were included into the standard tools. Pipenv was integrated in the python github organization. The org doesn't hate virtual envs!

mixmastamyk|3 years ago

Basically the point is to avoid the system Python, which is not hard. One needs some sysadmin skills to understand what is going on, however; unfortunately it sounds like those are in short supply.

I don't do anything you mention, so there must be a simpler way.

frou_dh|3 years ago

Python is over 30 years old, so it's hardly surprising that it's got plenty of baggage in at least some areas.

sakex|3 years ago

It feels like it is one of the reasons experienced devs are ditching Python for production systems. Besides horrendous performance and lousy semantics. The cost of setting up, maintaining the environment and onboarding people is just not worth it.

lockhouse|3 years ago

Excuse my ignorance, but aren’t virtual environments something you set up once per project? Why would that be a dealbreaker? How is it any more difficult than putting everything in a docker container like all the cool kids are doing these days?

lucb1e|3 years ago

> The cost of setting up, maintaining the environment and onboarding people is just not worth it.

I have yet to come across a situation where I need a virtual environment at all. A lot of projects use it, but then lazy me just runs git clone && python3 clone/main.py and it works just fine, sometimes after an apt install python3-somedependency or two.

It always seemed weird to me to depend on such a specific version ("package foo can't be older than 7 months or newer than 4 months"), how even does one manage to use such obscure features that they were removed and so you need an older-than version?

And then there's the people in this thread (seemingly a majority when judging by top level comment loudness) that have trouble with virtual environments and so add another layer on top to manage that virtual environment thing. The heck. Please explain

phonebucket|3 years ago

Are experienced devs really ditching Python for production systems? I wasn't under this impression.

ggregoire|3 years ago

Experienced Python devs run their projects in docker, which solves the 3 issues you listed at the end.

Havoc|3 years ago

These days I'm just throwing each project into a fresh LXC on a server.

All these different languages have their own approach and each then also user/global/multiple versions...it's just not worth figuring out

spyremeown|3 years ago

Question: what makes you choose LXC over Docker?

bagels|3 years ago

Same, but with Docker. I don't like conda, was fine with virtualenv, but using docker there's only one python and you can just pip install and not worry about multiple environments.

cpburns2009|3 years ago

Virtual environments are easy to create and manage. Create one with the built-in venv module:

    python3.10 -m venv ./venv  # or your favorite version
    . ./venv/bin/activate
    pip install pip-tools
Manage dependencies using pip-compile from pip-tools. Store direct dependencies in "requirements.in", and "freeze" all dependencies in "requirements.txt" for deployment:

    . ./venv/bin/activate
    pip-compile -U -o ./requirements.txt ./requirements.in
    pip install -r ./requirements.txt

warner25|3 years ago

> One point I would like to make is how virtual environments are designed to be disposable and not relocatable.

Is the author saying that relocating them will actually break things, or that it's just as easy to recreate them in a different location? Because I've moved my venv directories and everything still seemed to work OK. Did I just get lucky?

jszymborski|3 years ago

It's a gamble to move venvs.

The real way to move venvs is to freeze the venv (i.e. make a requirements.txt) and then pip install -r requirements.txt to recreate the venv.

This process is really the only thing about venvs that ever causes me trouble.
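
Sketched out (paths are placeholder examples):

```shell
# Snapshot the old venv's exact package set
source ./old-venv/bin/activate
pip freeze > requirements.txt   # pins everything currently installed
deactivate

# Recreate at the new location and reinstall the same pinned versions
python3 -m venv /new/location/venv
source /new/location/venv/bin/activate
pip install -r requirements.txt
```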

korijn|3 years ago

Depends if any of your packages use absolute paths (generated at install time for example).

noisenotsignal|3 years ago

There’s also relocating across machines. For example, maybe your build environment has access to internal registries but your release environment does not. I naively thought you could build your venv and just copy to the new machine (both environments were Ubuntu) but ran into errors (due to links breaking). We also used pex for a bit, which is kind of like building a binary of a venv, and that eventually stopped working too when the C ABI was no longer the same between environments. There didn’t seem to be an easy way to pick the ABI version to target when creating the pex file, so I gave up and just downloaded the wheels for internal packages in the build.

Izkata|3 years ago

By default when you activate a virtualenv, it uses hardcoded absolute paths (determined at the time the environ was created), so moving the directory will break it.

senko|3 years ago

> relocating them will actually break things

Yes, absolute paths are hardcoded in several places.

I actually have a use case for copying/relocating them (for https://apibakery.com), and simple search/replace of paths across the entire venv works, but I wouldn't recommend that as a best practice approach :-)
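
You can see those hardcoded paths for yourself; a quick look inside any venv (the ./venv directory name is an example):

```shell
cat ./venv/pyvenv.cfg                  # records the base interpreter's location
grep VIRTUAL_ENV ./venv/bin/activate   # activate script hardcodes the venv's absolute path
head -1 ./venv/bin/pip                 # console-script shebang points at the venv's old bin/python
```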

JamesonNetworks|3 years ago

I’ve had problems with symlinked python bins not existing in the same place requiring relinking as one example of a problem

acomjean|3 years ago

I’ve tried to move things and broken everything. (Conda environments). I tried replacing the paths in the files and it didn’t work. We run a bunch of different tools with various python requirements and would like to be able to duplicate them for the next tool.

We ended up making a new environment for each. Honestly it’s a bit of a mess.

Falell|3 years ago

Relocating them will actually break things in many cases, especially when native code is involved.

Helmut10001|3 years ago

I have my whole conda env folder symlinked to my second drive. Impossible to store 120GB of environments otherwise.

its_over_|3 years ago

I use poetry or docker or nixpkgs

I've given up.

EDIT: also just finding myself reaching for go in most cases

ggm|3 years ago

How much of this is caused by a join over "odd" decisions of what is installed by Python3 developers, "odd" decisions of what a "package" is by package makers and what I think I want to call "fanaticism" by Debian apt around things?

FreeBSD ports are significantly closer to "what the repo has, localized" where it feels like linux apt/yum/flat is "what we think is the most convenient thing to bodge up from the base repo, but with our special sauce because <reasons>"

rekahrv|3 years ago

That's insightful.

It seems that a virtual environment created by Poetry looks very similar, except that it doesn't contain an `include` directory. It contains:

* `bin` directory

* `lib/<python-version>/site-packages/` directory

* `pyvenv.cfg`

killjoywashere|3 years ago

I didn’t realize venv was part of the standard library. If that’s the case, how is it that conda even exists? Anybody got a good history of this?

int_19h|3 years ago

conda can install things other than Python packages. C++ compilers, for example, or native libraries that Python packages depend on.

randoglando|3 years ago

venv is part of the standard library from Python 3. It's not in Python 2.

jcparkyn|3 years ago

I'm beginning to feel like every single comment in every thread related to python package management is just this:

"Package management in python is so easy, just use [insert tool or workflow that's different to literally every other comment in the thread]."

PhysicalNomad|3 years ago

I don't bother with venvs anymore and just use podman instead.

Already__Taken|3 years ago

Been really enjoying trying out pdm in PEP 582 mode. I've found it behaves well when used across multiple devs who aren't necessarily used to working with Python.

cozzyd|3 years ago

The "global" vs. "directory" dichotomy seems... off. Haven't PYTHONHOME and PYTHONPATH been supported since approximately forever?

89vision|3 years ago

I haven't used these since docker

ginko|3 years ago

That's just giving up.

foooobaba|3 years ago

With docker, do you use debugging in pycharm/vscode, or just for compiling/shipping?

gt565k|3 years ago

Just setup a django project with pipenv, works just fine.

aniforprez|3 years ago

Pipenv has never once worked just fine personally. The dependency resolution is a joke and the slowest of any project in this space, they have tons of bugs and the project is languishing

I prefer to use a combination of pip-tools and pyenv for my projects

Supermancho|3 years ago

This writeup needs work.

> So while you could install everything into the same directory as your own code (which you did, and thus didn't use src directory layouts for simplicity), there wasn't a way to install different wheels for each Python interpreter you had on your machine so you could have multiple environments per project (I'm glossing over the fact that back in my the day you also didn't have wheels or editable installs).

This is a single run-on sentence. Someone reading this probably doesn't know what "wheels" means. If you are going to discount it anyway, why bring it up?

> Enter virtual environments. Suddenly you had a way to install projects as a group that was tied to a specific Python interpreter

I thought we were talking about dependencies? So is it just the interpreter or both or is there a typo?

> conda environments

I have no idea what those are. Do I care? Since the author is making a subtle distinction, reading about them might get me confused, so I've encountered another thing to skip over.

> As a running example, I'm going to assume you ran the command py -m venv --without-pip .venv in some directory on a Unix-based OS (you can substitute py with whatever Python interpreter you want

Wat? I don't know what venvs are. Can you maybe expand without throwing multi-arg commands at me? Maybe add this as a reference note, rather than inlining it into the information. Another thing to skip over.

> For simplicity I'm going to focus on the Unix case and not cover Windows in depth.

Don't cover Windows at all. Make a promise to maintain a separate doc in the future and get this one right first.

> (i.e. within .venv):

This is where you start. A virtual environment is a directory with a purpose, which is baked into the ecosystem. Lay out the purpose. Map the structure to those purposes. Dive into exceptional cases. Talk about how to create it and use it in a project. Talk about integrations and how these help speed up development.

I also skipped the plug for the microvenv project at the end, with a reference to VSCode.

ianbutler|3 years ago

I expect most everyday python users know what these things are. I also expect this was targeted at python users who use these things but haven't thought deeply about them.

Charitably, I will assume you are a non python user, and that's why this is a miss for you.