Show HN: Modshim – A new alternative to monkey-patching in Python
109 points | joouha | 4 months ago | github.com
It's a bit like OverlayFS for Python modules: it allows you to write modifications for a target module (lower) in a new module (upper), and have these combined in a new virtual module (mount).
It works by rewriting imports using AST transformations, then running both the lower and upper module's code in the new Python module.
This prevents polluting the global namespace when monkey-patching, and means if you want to make changes to a third-party package, you don't have to take on the maintenance burden of forking, you can package and distribute just your changes.
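As a rough illustration of the AST import rewriting mentioned above, here is a minimal sketch using only the standard-library `ast` module. It is not modshim's actual implementation: the `ImportRewriter` class, the `rewrite_imports` helper, and the module names `lower` and `mount` are hypothetical, and dotted `import a.b` statements are renamed without preserving the local binding.

```python
import ast


class ImportRewriter(ast.NodeTransformer):
    """Redirect imports of one module (and its submodules) to another name."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def _rename(self, name: str) -> str:
        # Match the module itself or a dotted submodule, but not e.g. "lowercase".
        if name == self.old or name.startswith(self.old + "."):
            return self.new + name[len(self.old):]
        return name

    def visit_Import(self, node: ast.Import) -> ast.Import:
        for alias in node.names:
            renamed = self._rename(alias.name)
            if renamed != alias.name:
                # Keep the local name stable: `import lower` -> `import mount as lower`.
                if alias.asname is None and "." not in alias.name:
                    alias.asname = alias.name
                alias.name = renamed
        return node

    def visit_ImportFrom(self, node: ast.ImportFrom) -> ast.ImportFrom:
        # Only absolute imports are rewritten; relative imports are left alone.
        if node.module is not None and node.level == 0:
            node.module = self._rename(node.module)
        return node


def rewrite_imports(source: str, old: str, new: str) -> str:
    """Return `source` with imports of `old` redirected to `new` (Python 3.9+)."""
    tree = ImportRewriter(old, new).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


rewritten = rewrite_imports(
    "from lower.sessions import Session\nimport lower\nimport lowercase\n",
    old="lower",
    new="mount",
)
print(rewritten)
```

The rewritten source could then be compiled and executed in a fresh module namespace, which is the general shape of the "mount" idea described in the post.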
nbadg|4 months ago
Point being, it's a lot of really complicated fiddling with the Python import system. And a lesson I have learned is that messing around with import internals in Python is extremely tricky to get right. Furthermore, trying to coordinate correctly between modules that do and don't get modified by the hook is very finicky. Not to mention that supply-chain attacks on the import system itself could be a terrifying attack vector that would be absurdly difficult to detect.
All this to say, I'm not a big fan of monkeypatching, but I know exactly how it behaves, its edge cases, and what to expect if I do it. It is, after all, pretty standard practice to patch things during python unit tests. And even with all its warts, I would prefer patching to import fiddling any day of the week and twice on Sunday.
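For reference, the standard test-time patching pattern referred to here looks roughly like the sketch below, using `unittest.mock.patch` from the standard library. The `load_config` function and the test name are hypothetical; the point is that the patch is scoped to the `with` block and undone afterwards.

```python
import json
from unittest import mock


def load_config(text):
    # Looks up json.loads at call time, so a patch on "json.loads" takes effect.
    return json.loads(text)


def test_load_config_with_patched_loads():
    with mock.patch("json.loads", return_value={"patched": True}) as fake_loads:
        assert load_config("{}") == {"patched": True}
        fake_loads.assert_called_once_with("{}")
    # Outside the with block the original json.loads is restored.
    assert load_config('{"a": 1}') == {"a": 1}
```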
Feedback for the author: you need to explain the "why" of your project more thoroughly. I'm sure you had a good reason to strike out in this direction, and maybe this is a super elegant solution. But you've failed to explain to me under what circumstances I might also encounter the same problems with patching that you've encountered, in order to explain to me why the risk of an import hook is justified.
OJFord|4 months ago
> means if you want to make changes to a third-party package, you don't have to take on the maintenance burden of forking, you can package and distribute just your changes.
That's a big win. I've seen and done my share of `# this file from github.com/blah with minor change X to L123` etc.
joouha|4 months ago
I've written a Jupyter client for the terminal (euporie), for which I've had to employ monkey-patching of various third-party packages to achieve my goals and avoid forking those packages. For example, I've added terminal graphics support & HTML/CSS rendering to prompt-toolkit (a Python TUI library), and I've changed aiohttp to not raise errors on non-200 http responses. These are things the upstream package maintainers do not want to maintain or will not implement, and likewise I do not want to maintain forks of these packages.
So far I've got away with monkey-patching, but recently I implemented a kernel for euporie which runs on the local interpreter (the same interpreter as the application itself). This means that my patches are exposed to the end user in a REPL, resulting in potentially unexpected behaviour for users when using certain 3rd party packages in Python through euporie. Modshim will allow me to keep my patched versions isolated from the end user.
Additionally, I would like to publish some of my patches to prompt_toolkit as a new package extending prompt_toolkit, as I think they would be useful to others building TUI applications. However, the changes required need to be deeply integrated to work, which would mean forking prompt_toolkit (something I'd like to avoid). modshim will make it possible for me to publish just my modifications.
Perhaps it's a somewhat niche use-case, and modshim is not something most Python users would ever need to use. I just thought it was something novel enough to be of interest to other HN users.
> messing around with import internals in python is extremely tricky to get right
This is true! modshim has been the most complicated thing I've written by some way!
BiteCode_dev|4 months ago
This solution is interesting, as it provides the patched code as if it were a new package, independent of the existing one you have installed, like vendoring, but without the burden of it.
If you want to be the only one seeing your patch, this is great. It also makes maintenance easier, as you don't have to wonder whether you are patching at the right time or in the right way. Monkey-patching can fail in many subtle edge cases.
Inheritance, in particular, is a major monkey-patching pitfall that I expect this method to handle transparently.
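A small sketch of the inheritance pitfall, using a throwaway in-memory module as a stand-in for a third-party library (the names `lib`, `Base`, `Child`, and `PatchedBase` are hypothetical): rebinding a class on the module does not change subclasses that were already created against the original.

```python
import types

# Hypothetical stand-in for a third-party module, built in memory.
lib = types.ModuleType("lib")
exec(
    "class Base:\n"
    "    def greet(self):\n"
    "        return 'original'\n"
    "\n"
    "class Child(Base):\n"
    "    pass\n",
    lib.__dict__,
)


class PatchedBase(lib.Base):
    def greet(self):
        return "patched"


# Monkey-patch by rebinding the name on the module.
lib.Base = PatchedBase

base_result = lib.Base().greet()    # 'patched'
child_result = lib.Child().greet()  # 'original': Child still inherits the old Base
```

`Child` captured a reference to the original `Base` when its class statement ran, so the later rebinding never reaches it.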
Uptrenda|4 months ago
o11c|4 months ago
If I control all the imports I can usually subclass things myself just fine.
theptip|4 months ago
This seems to explicitly handle the case you are interested in - automatically updating library-internal references to the lower to instead use the upper?
afarviral|4 months ago
The README mentions 3 scenarios that this might be preferred over, but not the fourth which I regularly do: Create my own functions/classes that are composed from the unchanged modules. E.g. a request_with_retries function which adds retry logic to requests without the need to monkey patch. I regularly use decorators as well to add things like retries.
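A minimal sketch of that composition approach, assuming a generic retry decorator rather than any particular library's API (the names `with_retries` and `flaky_fetch` are hypothetical, and the flaky function simulates a network call):

```python
import functools
import time


def with_retries(attempts=3, delay=0.0, exceptions=(Exception,)):
    """Retry the wrapped function up to `attempts` times on the given exceptions."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: propagate the last error
                    time.sleep(delay)

        return wrapper

    return decorator


calls = {"count": 0}


@with_retries(attempts=3, exceptions=(ConnectionError,))
def flaky_fetch():
    # Simulated network call that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"
```

This only affects code that calls the wrapper directly; calls made inside the wrapped library still go through its unmodified internals.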
For more complex scenarios Modshim might win out, as mentioned in the understated section of the README "Benefits of this Approach":
> Internal Reference Rewriting: This example demonstrates modshim's most powerful feature. By replacing requests.sessions.Session, we automatically upgraded top-level functions like requests.get() because their internal references to Session are redirected to our new class.
> Preservation of the Original Module: The original requests package is not altered. Code in other parts of an application that imports requests directly will continue to use the original Session object without any retry logic, preventing unintended side-effects.
What I think this means is Modshim lets you really get into the guts of a module (monkey-patch style, giving you god-like powers), while limiting the damage.
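A crude model of the two quoted benefits, using only the standard library. This is not how modshim works internally (the post describes AST-based import rewriting); it simply runs the same hypothetical library source in two separate module namespaces, so that overriding `Session` in one copy changes that copy's `get()` while the original stays untouched. All names here (`LIB_SOURCE`, `load`, `lib`, `mount`, `RetrySession`) are assumptions for illustration.

```python
import types

# Hypothetical library source with an internal reference from get() to Session.
LIB_SOURCE = '''
class Session:
    def request(self, url):
        return f"plain {url}"

def get(url):
    # Looks up Session in this module's globals at call time.
    return Session().request(url)
'''


def load(name, source):
    """Execute `source` in a fresh module object and return it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


lib = load("lib", LIB_SOURCE)            # plays the role of the "lower" module
mount = load("lib_patched", LIB_SOURCE)  # separate copy, playing the "mount"


class RetrySession(mount.Session):
    def request(self, url):
        return "retrying " + super().request(url)


# Override Session only inside the copy.
mount.Session = RetrySession

patched = mount.get("/status")  # 'retrying plain /status'
original = lib.get("/status")   # 'plain /status'
```

Code that imports `lib` keeps the original behaviour, while code that imports the copy sees the override even through `get()`.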
boxed|4 months ago
epgui|4 months ago
https://github.com/epgui/pybond
Izkata|4 months ago
Edit: okay Readme is clear on it and the description does make sense, the short description here just confused me.
moezd|4 months ago
When you have a scalpel, you give it to operating doctors during the operation, not to 5-year-olds on the street.
ramses0|4 months ago
Your patch "with retries" might never be accepted, and maintaining any kind of fork(s) or "out-of-tree patches" is not as integrated into the programming environment. Being able to say "assert WrappedLoginLibrary().login(), '...with retries...'" keeps you testable and "in" the language proper.
BiteCode_dev|4 months ago
satya71|4 months ago
tracnar|4 months ago
It's much cleaner than monkey patching, and it will more likely detect if an update conflicts with your patching.
I've used it by packaging everything through nix, but that can be cumbersome.
pmarreck|4 months ago
throwaway894345|4 months ago
procaryote|4 months ago
> * Fix bugs in third-party libraries without forking
> * Modify the behavior of existing functions
> * Add new features or options to existing classes
> * Test alternative implementations in an isolated way
only the last sounds close to something you might actually want to do, and then only as a throwaway thing
If you want to change a library, fork it. If you want to change the behavior of existing functions, don't or at least fork first. If you want to add new features to a class, write a new class, or again, at least fork first
Uptrenda|4 months ago
joouha|4 months ago
yincong0822|4 months ago
[deleted]