top | item 45466086

PEP 810 – Explicit lazy imports

433 points | azhenley | 5 months ago | peps.python.org

240 comments

[+] simonw|5 months ago|reply
Love this. My https://llm.datasette.io/ CLI tool supports plugins, and people were complaining about really slow start times even for commands like "llm --help" - it turned out there were popular plugins that did things like import pytorch at the base level, so the entire startup was blocked on heavy imports.

I ended up adding a note to the plugin author docs suggesting lazy loading inside of functions - https://llm.datasette.io/en/stable/plugins/advanced-model-pl... - but having a core Python language feature for this would be really nice.
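The pattern those docs recommend is essentially moving the heavy import from module level into the method that needs it. A minimal sketch (the class and method names are hypothetical, with a stdlib module standing in for the heavy dependency):

```python
class MyModel:
    # Hypothetical plugin model: in a real plugin, the heavy dependency
    # (e.g. torch) would be imported where json is below.
    model_id = "my-model"

    def execute(self, prompt):
        import json  # deferred: loaded on first call, not at plugin load
        return json.dumps({"prompt": prompt})

# Plugin discovery stays cheap: nothing heavy runs at import time,
# so `llm --help` doesn't pay for dependencies it never uses.
print(MyModel().execute("hi"))
```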

[+] zahlman|5 months ago|reply
You can implement this from your tool today: https://news.ycombinator.com/item?id=45467489

Note that this is global to the entire process, so for example if you make an import of Numpy lazy this way, then so are the imports of all the sub-modules. Meaning that large parts of Numpy might not be imported at all if they aren't needed, but pauses for importing individual modules might be distributed unpredictably across the runtime.

Edit: from further experimentation, it appears that if the source does something like `import foo.bar.baz` then `foo` and `foo.bar` will still be eagerly loaded, and only `foo.bar.baz` itself is deferred. This might be part of what the PEP meant by "mostly". But it might also be possible to improve my implementation to fix that.
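For reference, the stdlib building block behind this is `importlib.util.LazyLoader`; the per-module recipe from the importlib documentation looks roughly like this:

```python
import importlib.util
import sys

def lazy_import(name):
    # Wrap the module's loader so that executing the module body is
    # postponed until the first attribute access on the module object.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

fractions = lazy_import("fractions")  # module body not executed yet
print(fractions.Fraction(1, 2) * 2)   # attribute access triggers the load
```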

[+] peterfirefly|5 months ago|reply
Parse the command line and do things like "--help" without doing the imports.

Only do imports when you know you need them -- or as an easy approximation, only if the easy command line options have been handled and there's still something to do.
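A sketch of that shape with argparse (the tool and flag names are made up): `--help` exits inside `parse_args` before any heavy import is ever reached.

```python
import argparse

def main(argv=None):
    parser = argparse.ArgumentParser(prog="mytool")  # hypothetical CLI
    parser.add_argument("--mean", nargs="*", type=float)
    args = parser.parse_args(argv)
    # Heavy dependencies are imported only once we know there is real
    # work to do; statistics stands in for something like torch here.
    if args.mean:
        import statistics
        return statistics.mean(args.mean)
    return None

print(main(["--mean", "1", "2", "3"]))  # prints 2.0
```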

[+] Neywiny|5 months ago|reply
I think the real problem is that packages like pytorch take so long to import. In my work I've tried a few packages (not AI stuff) that do a lot of work on import. It's actually quite detrimental, because I have to set up environment variables to pass in things that should be arguments to a setup function. All things considered, a Python module shouldn't take any noticeable time to import.
[+] zvr|5 months ago|reply
It's not only in the case of plugins.

If a tool has different capabilities that use different imports, why load all of them if only a subset is required?

As a simple example, a tool that can generate output in various formats (e.g., json, csv, xml, ...) should only import the modules needed to handle the output format after having determined which ones will be used in this invocation.

[+] charliermarsh|5 months ago|reply
Lazy imports have been proposed before, and were rejected most recently back in 2022: https://discuss.python.org/t/pep-690-lazy-imports-again/1966.... If I recall correctly, lazy imports are a feature supported in Cinder, Meta's version of CPython, and the PEP was driven by folks that worked on Cinder. Last time, a lot of the discussion centered around questions like: Should this be opt-in or opt-out? At what level? Should it be a build-flag for CPython itself? Etc. The linked post suggests that the Steering Council ultimately rejected it because of the complexity it would introduce to have two divergent "modes" of importing.

I hope this proposal succeeds. I would love to use this feature.

[+] BiteCode_dev|5 months ago|reply
Especially since it is opt-in, with various levels of granularity, and a global off switch. A very well constructed spec given the constraints.
[+] flare_blitz|5 months ago|reply
I also hope this proposal succeeds, but I'm not optimistic. This will break tons of code and introduce a slew of footguns. Import statements fundamentally have side effects, and when and how these side effects are applied will cause mysterious breakages that will keep people up for many nights.

This is not fearmongering. There is a reason why the only flavor of Python with lazy imports comes from Meta, which is one of the most well-resourced companies in the world.

Too many people in this thread hold the view of "importing {pandas, numpy, my weird module that is more tangled than an eight-player game of Twister} takes too long and I will gladly support anything that makes them faster". I would be willing to bet a large sum of money that most people who hold this opinion are unable to describe how Python's import system works, let alone describe how to implement lazy imports.

PEP 690 describes a number of drawbacks. For example, lazy imports break code that uses decorators to add functions to a central registry. This behavior is crucial for Dash, a popular library for building frontends that has been around for more than a decade. At import-time, Dash uses decorators to bind a JavaScript-based interface to callbacks written in Python. If these imports were made lazy, Dash would break. Frontends used by thousands, if not millions of people, would immediately become unresponsive.
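The failure mode is easy to sketch: registration happens as a side effect of the defining module's body executing, so if that body is deferred and nothing ever touches the module's attributes, the registry stays empty. (A minimal stand-in, not Dash's actual API:)

```python
REGISTRY = {}

def callback(name):
    # Decorator that registers the function as a side effect of the
    # *defining module's* body being executed.
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

@callback("click")
def on_click():
    return "clicked"

# With an eager import, the registry is populated by the time anyone
# looks. Under a lazy import of this module, @callback would not have
# run yet, and this lookup would raise KeyError instead.
print(REGISTRY["click"]())
```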

You may cry, "But lazy imports are opt-in! Developers can choose to opt-out of lazy imports if it doesn't work for them." What if these imports were transitive? What if our frontend needed to be completely initialized before starting a critical process, else it would cause a production outage? What if you were a maintainer of a library that was used by millions of people? How could you be sure that adding lazy imports wouldn't break any code downstream? Many people made this argument for type hints, which is sensible because type hints have no effect on runtime behavior*. This is not true for lazy imports; import statements exist in essentially every nontrivial Python program, and changing them to be lazy will fundamentally alter runtime behavior.

This is before we even get to the rest of the issues the PEP describes, which are even weirder and crazier than this. This is a far more difficult undertaking than many people realize.

---

* You can make a program change its behavior based on type annotations, but you'd need to explicitly call into typing APIs to do this. Discussion about this is beyond the scope of this post.

[+] dheera|5 months ago|reply
Oof. I wish they could support version imports

    import torch==2.6.0+cu124
    import numpy>=1.2.6
and support having multiple simultaneous versions of any Python library installed. End this conda/virtualenv/docker/bazel/[pick your poison] mess
[+] tomyhsieh|5 months ago|reply
Just curious. What changed?

From merely browsing through a few comments, people have mostly positive opinions regarding this proposal. Then why did it fail many times, but not this time? What drives the success behind this PEP?

[+] comex|5 months ago|reply
I don't hate it but I don't love it. It sounds like everyone will start writing `lazy` before essentially every single import, with rare exceptions where eager importing is actually needed. That makes Python code visually noisier. And with no plan to ever change the default, the noise will stay forever.

I would have preferred a system where modules opt in to being lazy-loaded, with no extra syntax on the import side. That would simplify things since only large libraries would have to care about laziness. To be fair, in such a design, the interpreter would have to eagerly look up imports on the filesystem to decide whether they should be lazy-loaded. And there are probably other downsides I'm not thinking of.

[+] wrmsr|5 months ago|reply
If anyone's interested I've implemented a fairly user friendly lazy import mechanism in the form of context managers (auto_proxy_import/init) at https://pypi.org/project/lazyimp/ that I use fairly heavily. Syntactically it's just wrapping otherwise unmodified import statements in a with block, so tools 'just work' and it can be easily disabled or told to import eagerly for debugging. It's powered primarily by swapping out the frame's f_builtins in a cext (as it needs more power than importlib hooks provide), but has a lame attempt at a threadsafe pure python version, and a super dumb global hook version.

I was skeptical and cautious with it at first but I've since moved large chunks of my codebase to it - it's caused surprisingly few problems (honestly none besides forgetting to handle some import-time registration in some modules) and the speed boost is addictive.

[+] NSPG911|5 months ago|reply
looks very interesting! i might use this for some of my projects as well
[+] Spivak|5 months ago|reply
I love the feature but I really dislike using the word lazy as a new language keyword. It just feels off somehow. I think defer might be a better word. It at least keeps the grammar right, since the grammatically correct form of lazy here would be the adverb lazily.

    lazily import package.foo
    vs
    defer import package.foo
Also the grammar is super weird for from imports.

    lazy from package import foo
    vs.
    from package defer import foo.
[+] est|5 months ago|reply
second this.

Just might as well add `defer` keyword like Golang.

[+] thayne|5 months ago|reply
One thing the PEP doesn't really talk about, and that I find very annoying, is that many Python linters will complain if you don't put all of your imports at the top of the file, so you get lint warnings for the most obvious way to implement lazy imports.

And that is actually a problem for more than just performance. In some cases, importing at the top might simply fail: for example, if you need a platform-specific library, but only when running on that platform.
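A common shape for this today is a guarded top-level import with a sentinel, so the failure is contained (fcntl as an example of a POSIX-only module):

```python
try:
    import fcntl  # POSIX-only: this import raises ImportError on Windows
except ImportError:
    fcntl = None

def can_lock_files():
    # Callers test the capability rather than importing at each call site.
    return fcntl is not None

print(can_lock_files())
```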

[+] bcoates|5 months ago|reply
I don't think there is any solution for that but "fix your broken linter".
[+] phainopepla2|5 months ago|reply
It is annoying, but most linters will accept a `# noqa: E402` comment to ignore it
[+] nilslindemann|5 months ago|reply
This is the wrong syntax, comparable to how "u" strings were the wrong syntax and "b" strings are the right syntax.

They make what should be the default a special case. Soon, all new code will use "lazy". The long-term effect of such changes is a verbose language syntax.

They should have had a period where anyone who wants lazy imports has to do "from __future__ import lazy_import". After that period, lazy imports become the default. For old-style immediate imports, introduce a syntax: "import now foo" and "from foo import now bar".

All which authors of old code would have to do is run a provided fix script in the root directory of their code.

[+] zahlman|5 months ago|reply
I think your assessment of what's "the right/wrong" syntax is fair. But the transition you describe takes a long time, even now that the community has figured out a "deprecation cycle" process that seems satisfactory (i.e. won't lead to another Python 3.0 situation).

> All which authors of old code would have to do is run a provided fix script in the root directory of their code.

As I recall, `lib2to3` didn't do a lot to ease tensions. And `six` is still absurdly popular, mainly thanks to `python-dateutil` still attempting to support 2.7.

[+] zahlman|5 months ago|reply
> The standard library provides the LazyLoader class to solve some of these inefficiency problems. It permits imports at the module level to work mostly like inline imports do.

The use of these sorts of Python import internals is highly non-obvious. The Stack Overflow Q&A I found about it (https://stackoverflow.com/questions/42703908/) doesn't result in an especially nice-looking UX.

So here's a proof of concept in existing Python for getting all imports to be lazy automatically, with no special syntax for the caller:

  import sys
  import threading # needed for python 3.13, at least at the REPL, because reasons
  from importlib.util import LazyLoader # this has to be eagerly imported!
  class LazyPathFinder(sys.meta_path[-1]): # <class '_frozen_importlib_external.PathFinder'>
      @classmethod
      def find_spec(cls, fullname, path=None, target=None):
          base = super().find_spec(fullname, path, target)
          if base is None or base.loader is None:  # not found, or a namespace package
              return base
          base.loader = LazyLoader(base.loader)
          return base
  sys.meta_path[-1] = LazyPathFinder
We've replaced the "meta path finder" (which implements the logic "when the module isn't in sys.modules, look on sys.path for source code and/or bytecode, including bytecode in __pycache__ subfolders, and create a 'spec' for it") with our own wrapper. The "loader" attached to the resulting spec is replaced with an importlib.util.LazyLoader instance, which wraps the base PathFinder's provided loader. When an import statement actually imports the module, the name will actually get bound to a <class 'importlib.util._LazyModule'> instance, rather than an ordinary module. Attempting to access any attribute of this instance will trigger the normal module loading procedure — which even replaces the global name.

Now we can do:

  import this # nothing shows up
  print(type(this)) # <class 'importlib.util._LazyModule'>
  rot13 = this.s # the module is loaded, printing the Zen
  print(type(this)) # <class 'module'>
That said, I don't know what the PEP means by "mostly" here.
[+] redleader55|5 months ago|reply
Can you explain why "threading" needs to be loaded? The rest seems decently straightforward, but what initialization from threading is required, and why only in 3.13+?
[+] TheMrZZ|5 months ago|reply
Feels like a good feature, with a simple explanation, real world use cases, and a scoped solution (global only, pretty simple keyword). I like it!
[+] BiteCode_dev|5 months ago|reply
Agree, they really did their homework, listed edge cases, made practical compromises, chose not to overdo it, reworked it again and again quite a bit and compared it to real life experience.

It's really beautiful work, especially since touching the back bone (the import system) of a language as popular as Python with such a diverse community is super dangerous surgery.

I'm impressed.

[+] 12_throw_away|5 months ago|reply
Yeah, I think this is one of the cleanest PEPs to come around in quite a while, at least from the userspace perspective. Interested to see what happens after the traditional syntax bikeshedding ritual has been completed.
[+] oofbey|5 months ago|reply
Hopefully they learned lessons from why PEP-690 was rejected. I've spent quite a while trying to build this stuff for our codebase and it's never worked well enough to use.
[+] bcoates|5 months ago|reply
I think they're understating the thread safety risks here. The import is going to wind up happening at a random nondeterministic time, in who knows what thread holding who knows what locks (aside from the importer lock).

Previously, if you had some thread hazardous code at module import time, it was highly likely to only run during the single threaded process startup phase, so it was likely harmless. Lazy loading is going to unearth these errors in the most inconvenient way (as Heisenbugs)

(Function level import can trigger this as well, but the top of a function is at least a slightly more deterministic place for imports to happen, and an explicit line of syntax triggering the import/bug)
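The hazard is easy to demonstrate with a throwaway module whose body records the thread that executed it (a contrived sketch; the module is generated into a temp directory just for the demo):

```python
import os
import sys
import tempfile
import threading

# Write a tiny module whose import-time side effect records the thread
# that executed its body. With an eager top-level import that is always
# the main thread during startup; with a deferred import it is whichever
# thread happens to touch the module first.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "sideeffect_demo.py"), "w") as f:
    f.write("import threading\n"
            "LOADED_IN = threading.current_thread().name\n")
sys.path.insert(0, moddir)

result = {}

def worker():
    import sideeffect_demo  # deferred to here: the body runs in this thread
    result["loaded_in"] = sideeffect_demo.LOADED_IN

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
print(result["loaded_in"])  # worker-1, not MainThread
```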

[+] aftbit|5 months ago|reply
We tend to prefer explicit top-level imports specifically because they reveal dependency problems as soon as the program starts, rather than potentially hours or days later when a specific code path is executed.
[+] f311a|5 months ago|reply
I don't like the idea of introducing a new keyword. We need a backward-compatible solution. I feel like Python needs some kind of universal annotation syntax, such as Go's (comments) or Rust's (macros). A new keyword means all parsers, LSPs, and editors have to be updated.

I’m pretty sure there will be new keywords in Python in the future that only solve one thing.

[+] ashu1461|5 months ago|reply
Not sure if this can be made backward compatible.

Right now, all imports are resolved eagerly at runtime. For example, in code like below:

  from file1 import function1
When you write this, the entire file1 module is executed right away, which may trigger side effects.

If lazy imports suddenly defer execution, those side effects won’t run until much later (or not at all, if the code path isn’t hit). That shift in timing could easily break existing code that depends on import-time behavior.

To avoid using the lazy keyword, there is also a proposal to add the modules you want to load lazily to a global `__lazy_modules__` variable.

[+] Alir3z4|5 months ago|reply
Me neither.

Introducing new keyword has become a recent thing in Python.

Seems Python has a deep scar from the Python 2 to Python 3 transition and is scared to do anything that causes such drama again.

For me, the worst of all is "async". If 2to3 didn't cause much division, async definitely divided Python libraries in two: sync and async.

Maybe if they want a backward-compatible solution, it can be done with a compile-time or runtime flag, like they did with the free-threaded no-GIL builds.

[+] BiteCode_dev|5 months ago|reply
They thought about backward compatibility, and offer an alternative syntax that uses no keyword, for libraries that want to activate it while staying compatible with old versions. It's already in the spec.
[+] baq|5 months ago|reply
Yeah I like the feature but hate the keyword. Dunder lazy imports sounds good enough imho.
[+] jacquesm|5 months ago|reply
Lazy imports are a great way to create runtime errors far into the operation of a long lived service. Yes, it gives the superficial benefit of 'fast startup', but that upside is negated by the downside of not being sure that once something runs it will run to completion due to a failed import much further down the line. It also allows for some interesting edge cases with the items that are going to be imported no longer being what is on the tin at the time the program is started.
[+] famouswaffles|5 months ago|reply
That's fine, because this is still a genuine problem in need of a solution. It's not just about startup time for the sake of it (not that this is even a superficial concern: Python startup time with large dependencies quickly gets awful). Large projects can have hefty dependencies that not every user will use, and bundling it all for everyone can sometimes be intractable. The workarounds people use already have the sort of problems you're talking about, on top of being diabolical and hacky. Not having to duplicate and hide imports in functions alone would be a big improvement. It's not like it isn't being proposed as an optional language feature.
[+] BHSPitMonkey|5 months ago|reply
An automated test mitigates the risk you describe, and is well worth the tradeoff for fast startup.

I don't consider startup time "superficial" at all; I work in a Django monolith where this problem resulted in each and every management command, test invocation, and container reload incurring a 10-15sec penalty because of just a handful of heavy-to-import libraries used by certain tasks/views. Deferring these made a massive difference.
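One cheap way to get that mitigation is a smoke test that eagerly imports every submodule of your package, so broken or missing dependencies surface in CI rather than hours into a run (a sketch; point it at your own top-level package):

```python
import importlib
import pkgutil

def import_all(package_name):
    # Eagerly import every submodule of a package and collect failures
    # instead of stopping at the first broken import.
    pkg = importlib.import_module(package_name)
    failures = []
    prefix = pkg.__name__ + "."
    for info in pkgutil.walk_packages(pkg.__path__, prefix=prefix):
        try:
            importlib.import_module(info.name)
        except Exception as exc:
            failures.append((info.name, repr(exc)))
    return failures

# In a test you'd assert the result is empty, e.g.:
#   assert import_all("myapp") == []   # "myapp" is a placeholder name
print(import_all("json"))
```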

[+] film42|5 months ago|reply
I'm a fan because it's something you can explicitly turn on and off. For my Docker based app, I really want to verify the completeness of imports. Preferably, at build and test time. In fact, most of the time I will likely disable lazy loading outright. But, I would really appreciate a faster loading CLI tool.

However, there is a pattern in Python of raising an error if, say, pandas doesn't have an excel library installed, which is fine. In the future, will maintainers opt to include a bunch of unused libraries, since they won't negatively impact startup time? (Think pandas including 3-4 excel parsers by default, since they will only be loaded when called.) It's a much better UX, but now if you opt out of lazy loading, your code will take longer to load than it would have before.
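The pattern referenced here usually looks like this: try the import at call time and raise an actionable error only when the optional feature is actually used ("superexcel" is a made-up package name, and `load` a hypothetical API):

```python
def read_excel(path):
    try:
        import superexcel  # optional dependency, imported only on use
    except ImportError as exc:
        raise ImportError(
            "read_excel requires the optional 'superexcel' package; "
            "install it to read Excel files"
        ) from exc
    return superexcel.load(path)  # hypothetical API of the optional package

try:
    read_excel("book.xlsx")
except ImportError as exc:
    print(exc)
```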

[+] jmward01|5 months ago|reply
This is needed, but I don't like new keywords. What I would love, for many reasons, is if we could decorate statements. Then things like:

  import expensive_module
could be:

  @lazy
  import expensive_module
or you could do:

  @retry(3)
  x = failure_prone_call(y)
lazy is needed, but maybe there is a more basic change that could give more power with more organic syntax, and not create a new keyword that is special purpose (and extending an already special purpose keyword)

[+] ajb|5 months ago|reply
Given all the problems people are mentioning, it seems like this proposal is on the wrong side. There should be an easy way for a module to declare itself to be lazy loaded. The module author, not the user, is the one who knows whether lazy loading will break stuff.
[+] zahlman|5 months ago|reply
> There should be an easy way for a module to declare itself to be lazy loaded.

It can just implement lazy loading itself today, by using module-level __getattr__ (https://docs.python.org/3/reference/datamodel.html#customizi...) to overwrite itself with a private implementation module at the appropriate time. Something like:

  # foo.py
  def __getattr__(name):
      # clean up the lazy loader before loading
      # this way it's cleaned up if the implementation doesn't replace it,
      # and not scrubbed if it does
      global __getattr__
      del __getattr__
      import sys
      self = sys.modules[__name__]
      from . import _foo
      # "star-import" by adding names that show up in __dir__
      self.__dict__.update(**{k: getattr(_foo, k) for k in _foo.__dir__()})
      # On future attribute lookups, everything will be in place.
      # But this time, we need to delegate to the normal lookup explicitly
      return getattr(self, name)
Genericizing this is left as an exercise.
[+] Grikbdl|5 months ago|reply
It's also simpler. It could even be a package-level thing, similar to the py.typed marker. I don't want to pepper literally all my modules with loads of explicit lazy keywords.
[+] pjjpo|5 months ago|reply
I wonder if this proposal suffers because of Python's extremely generous support period, and perhaps the ship has sailed.

- lazy imports are a hugely impactful feature

- lazy imports are already possible without syntax

This means any libraries that would get a large benefit from lazy imports already use import statements within functions. They can't really use the new feature, since 3.14's EoL is _2030_, forever from now. The __lazy_modules__ syntax preserves only compatibility, by falling back to eager imports, not the performance benefit; libraries that need lazy imports can't use it until 2030.

This means that the primary target for a long time is CLI authors, who can target a stricter Python range and are mentioned many times in the PEP, plus libraries that don't need broad Python version support (meaning not just works but works well), which suggests they are probably not major libraries.

Unless the feature gets backported to 2+ versions, it feels not so compelling. But given how it modifies the interpreter to a reasonable degree, I wonder if even any backport is on the table.

[+] Grikbdl|5 months ago|reply
In at least the scientific Python ecosystem, there's "SPEC 0", in which a lot of the de facto core libraries have basically agreed to support the last three versions of Python, no more.

Other libraries can of course choose as they want, but generally I don't think it's common for libraries to be as generous with support length as CPython.

[+] Boxxed|5 months ago|reply
Ugh...I like the idea, but I wish lazy imports were the default. Python allows side effects in the top level though so that would be a breaking change.

Soooo instead now we're going to be in a situation where you're going to be writing "lazy import ..." 99% of the time: unless you're a barbarian, you basically never have side effects at the top level.

[+] f33d5173|5 months ago|reply
It would be interesting if instead you added a syntax whereby a module could declare that it supported lazy importing. Maybe even after running some code with side effects that couldn't be done lazily. For one thing, this would have a much broader performance impact, since it would benefit all users of the library, not just those who explicitly tagged their imports as lazy. For another, it would minimize breakage, since a module author knows best whether, and which parts of, their module can be lazily loaded.

On the other hand, it would create confusion for users of a library when the performance hit of importing a library was delayed to the site of usage. They might not expect, for example, a lag to occur there. I don't think it would cause outright breakage, but people might not like the way it behaved.

[+] the_mitsuhiko|5 months ago|reply
I like the approach of ES6 where you pull in bindings that are generally lazily resolved. That is IMO the approach that should be the general strategy for Python.