I can't help but read all the interjections from the core developers as a strong indication of why all these sorts of things tend to fail on the vine. From Unladen Swallow onward there have been these groups off doing interesting and awesome experiments to try and make Python faster, and they never actually make it into something that Python end-users can use (yes, I know US had shortcomings).
This is the work of one (admittedly super smart) guy over two years. Just like Nuitka, something else that was supposed to be impossible but just keeps making steady progress. Maybe I'm just seeing design by committee?
The lone geniuses can only go so far. Maybe the problem is that Python can't quite decide what it wants to be because it's too many things for too many people already.
Is the concept of Python the language, as opposed to Python the ecosystem, valuable enough so that a Python that broke backwards compatibility with all the C extensions would be useful as its own multicore-capable runtime?
PyPy seemed to think so for a while and now has gone hard in the other direction, reimplementing (faking?) a bunch of the CPython extension API so maybe this approach would never work. I don't know, but seeing things like:
> there is a large number of “dark matter” Python (and C extension) code out there that isn’t open-source. We need to be careful not to break it since it might not be feasible for its users to make required changes, or to report problems back upstream to us. In particular, some C extensions protect their own internal state with the GIL. This is a big worry, and might be a big hindrance to adoption of a GIL-free Python.
really make me wonder if the community as a whole would conclude the same on these critical sorts of decisions which shape the future of the language if they were put forward and not just made by a couple people in a closed meeting.
Would you prefer to support some weird arbitrary nameless closed source extensions, or have a multicore Python? This obviously depends on who you are and what you're doing, which leads us back to Python being too much for too many, but even here we can get a feeling for how many people do what with the language.
> Would you prefer to support some weird arbitrary nameless closed source extensions, or have a multicore Python?
There's nothing wrong with staying on an older LTS version of Python. Let the people with the nameless closed-source stuff stick with that. The beauty of open source is that they can fork the older, GIL-ful version of Python and maintain it, if they like.
Multicore would be a tremendous boon to the language.
> I can't help but read all the interjections from the core developers as a strong indication of why all these sorts of things tend to fail on the vine.
I think this might be a misunderstanding of the nature of the event that these notes are generated from, unless I'm misunderstanding your objection. The point of this Q&A as I saw it was to explore the feasibility of the idea and fully flesh out the costs and benefits so that we can make informed decisions about how to proceed.
The "random interjections" are notes of caution about what trade-offs need to be made. For example, it is very easy to overlook "dark matter" code because we don't have access to it, but it's almost certainly the majority of Python code out there. It is also not a complete deal-breaker to say that some change could break unknown proprietary extensions — otherwise we'd never be able to change anything; the key is that the changes have to be worth it. A lot of that depends on details — if it's easy to update C extensions for nogil mode (even if they were designed without parallelism in mind), then making breaking changes to remove the GIL might not be so bad. If nogil mode requires that most C extensions totally overhaul their reference counting and C API usage and the changes require restructuring code rather than something that can be done with automated search and replace, that's a much bigger cost and will probably come with a long term fork of the ecosystem (which is a huge pain to deal with) and it might not be worth it.
Avoiding this sort of criticism will not make the underlying problems go away, and I think everyone involved understood that this meeting was intended to bring to light any objections that might guide the work towards ultimate resolution.
I don't think it's very coherent to criticise the team because "Python can't quite decide what it wants to be", but also criticise them for not adopting all these crazy cool changes that would fundamentally change Python.
Taking a highly conservative approach to breaking changes is absolutely not the same thing as being indecisive. The Python team has learned from experience how disruptive breaking changes can be.
> really make me wonder if the community as a whole would conclude the same on these critical sorts of decisions which shape the future of the language if they were put forward and not just made by a couple people in a closed meeting.
I for one couldn't care less if some proprietary binaries fail on Python 3.11 or so. That's why we keep multiple versions around (at my last company, I could only use up to 3.6 because that was the version in the Sacred CentOS AMI).
And, of course, a very critical piece of code was depending on a bug in regex that was fixed in 3.8 or so, and decided to break during a demo (where I was using 3.9 instead of 3.6).
> there is a large number of “dark matter” Python (and C extension) code out there that isn’t open-source. We need to be careful not to break it since it might not be feasible for its users to make required changes, or to report problems back upstream to us. In particular, some C extensions protect their own internal state with the GIL. This is a big worry, and might be a big hindrance to adoption of a GIL-free Python.
Probably doesn't work across minor versions anyway, most stuff isn't built against the limited API.
I think there's a pretty good chance this stuff gets incorporated:
"On a personal level, we are impressed by Sam’s work so far and invited him to join the CPython project. I’m happy to report he is interested, and to help him ramp up to become a core developer, I will be mentoring him. Guido and Neil Schemenauer will help me review code for the interpreter bits I’m unfamiliar with."
12 references to people in one statement, 5 referring to the post author, 1 reference to social fraternity membership, 1 statement of authority.
I'm not sure if there is a common name for this particular source of discomfort, but that quote definitely contains a lot of it. I'm a historical contributor to the Python source repository, but something about the social structure of the project has changed significantly in recent years that would dissuade me from submitting changes in future. The focus in the statement above no longer feels like it is on the actual productive output of the project itself, and in previous years it wasn't like that, nor needed to be like that.
Reminds me of something like the minutes of a professional schmoozer's business lunch, rather than a technical meeting, or something like that. If you have ever seen a stray engineer at an event like this (or had the misfortune of being that engineer), this feeling probably captures the problem well. Whatever it is, I'd love to see less of it.
Just came in here briefly to opine that there is a very real risk of a fork if the Python core community does not at least offer a viable alternative promptly.
The economic pressures surrounding the benefits of Gross’s changes will likely influence this more than any tears shed over subtle backwards incompatibility.
I believe it was Dropbox that famously released their own private internal Python build a while back and included some concurrency patches.
Many teams might go the route of working from Sam Gross’ work, and if we see subtle changes in underlying runtime concurrency semantics or something else backwards incompatible, that’s it: either that adoption will roll downhill to a new standard, or Python core will have to answer with a suitable GIL-less alternative.
I for one do not want to think about “ANSI Python” runtimes or give the MSFTs etc of the world an opening to divide the user base.
I mean, PyPy is over a decade old now, and micropython is a mere 7 years old. What's another fork? If anything, I strongly prefer languages that have more than one implementation.
It might not matter much if Canonical or IBM decided to port a critical mass of open source extensions/packages. Then they could ship the new CPython in place of the old one and mention the differences in the release notes. With one or both throwing their weight behind it, it would gain significant momentum above and beyond the original project.
These experimental forks don't aim to change language semantics, so they are quite safe from a fragmentation point of view even if they accidentally get some adoption. But they have so far been explicit about being research and uninterested in anything else.
These are exceptionally clear notes. They're easy to read and feel comprehensive. I also note that the author is the current CPython developer in residence (a recently created position).
I am surprised that closed source is suddenly an issue when it comes to the GIL, when half the world breaking on the Python 3 transition was not only intended but actively pushed by various members of the community. Since Linux managed to get rid of the BKL, Python should be able to get rid of the GIL.
They said they would make the non-GIL version opt-in (command line flag?) so they wouldn't be breaking the old stuff anyway. It's a solved problem. If anyone moving to a new version of Python can't take the time to understand such a small change, then that's on them.
Suppose a Python module written in C registers a method that ends up making a call to C functions larry(), moe(), and then curly() to mutate a global variable "global_mutable_temp" before finally returning a value generated from global_mutable_temp.
1. Supposing this method doesn't currently crash under GIL python, would it be true that this method will also run without crashing on the non-GIL python interpreter?
2. Would it be true that the non-GIL python interpreter will introduce a race to this method (resulting in a runtime error) that didn't exist under the GIL interpreter?
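The scenario in the question can be sketched as a pure-Python analogy. This is only an illustration under stated assumptions: the names `larry`, `moe`, `curly` and `global_mutable_temp` come from the question, while `_state_lock`, `method` and `run_threads` are hypothetical. In a real C extension, the GIL is normally held for the whole C call (unless the code releases it or calls back into Python), so the three steps are implicitly serialized; without a GIL, the extension would need some explicit protection, such as a lock of its own.

```python
import threading

global_mutable_temp = 0          # shared state, as in the question
_state_lock = threading.Lock()   # hypothetical explicit lock replacing reliance on the GIL


def larry(x):
    global global_mutable_temp
    global_mutable_temp = x


def moe():
    global global_mutable_temp
    global_mutable_temp += 1


def curly():
    global global_mutable_temp
    global_mutable_temp *= 2


def method(x):
    # Hold the lock across all three steps so no other thread can observe
    # or overwrite the intermediate value of global_mutable_temp.
    with _state_lock:
        larry(x)
        moe()
        curly()
        return global_mutable_temp


def run_threads(n=8, repeats=1000):
    """Call method() from n threads and collect (input, output) pairs."""
    results = []
    results_lock = threading.Lock()

    def worker(x):
        for _ in range(repeats):
            value = method(x)
            with results_lock:
                results.append((x, value))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock in place, every result should equal `(x + 1) * 2` for its own input. If the lock is removed, two threads can interleave between `larry()` and `curly()` and return values computed from each other's input. That is the kind of race question 2 asks about.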
Why is that relevant? Many of the people contributing significantly to CPython also have full-time jobs elsewhere. Sometimes their full-time job overlaps with their contributions and sometimes it does not. For example, Guido works for Microsoft and is working on CPython performance there; does that mean all of his work needs an asterisk saying it's actually a Microsoft corporate initiative?
PyTorch seems to obfuscate its Facebook ownership in general. At least I could find no mention of it on their "About" page or in the documentation, where the only mention of "Facebook" is a link in the footer to the PyTorch project's own social media page: https://pytorch.org/features/
I think switching the version to 4 is the most viable path. Make the last GILed Python 3 an LTS release that interested parties can hold on to, e.g. Ubuntu can keep Python 3 as a default for a long time. One can always use conda to run multiple versions simultaneously.
Besides the internal politics, I think the greatest blocker to improving Python performance is not giving whoever is running the code any control. If I don't use any naughty C plugins, or if I can assure you that I've annotated all my types and can allow a large class of optimizations, why can't I run my code with some VM flags?
It's one language, but why can't I tune my VM to my needs? I can't imagine Java not letting users tune their GC.
I wonder if a midpoint for this sort of work would be if a major distro or several declared they would move to GIL-free Python?
System Python at least is generally only recommended to be used for system libs, and that's a relatively supportable set. Developers use virtualenvs and their own specific interpreter, but it would certainly move the needle on what language people were by default scripting and thinking in.
This here: "The GIL will still be optionally available as an interpreter startup-time option" seems like a midpoint. Maybe it will even be GIL-by-default for some versions.
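As a hedged aside from after this thread's timeframe: CPython 3.13 and later expose `sys._is_gil_enabled()`, and free-threaded builds accept the `-X gil=0` / `-X gil=1` startup option (or the `PYTHON_GIL` environment variable). A minimal sketch of how a script could check which mode it is running in is below. The helper name `gil_enabled` is hypothetical, and the fallback assumes that older interpreters without the function always run with the GIL.

```python
import sys


def gil_enabled():
    """Report whether this interpreter is currently running with the GIL."""
    # sys._is_gil_enabled exists only on CPython 3.13+; on older versions
    # we assume the GIL is always on.
    check = getattr(sys, "_is_gil_enabled", None)
    if check is None:
        return True
    return bool(check())


if __name__ == "__main__":
    print("GIL enabled:", gil_enabled())
```

On a standard (non-free-threaded) build this reports that the GIL is enabled regardless of flags.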
> What’s the level of perceived risk that the nogil project will end up not being viable for inclusion in CPython?
(...)
> It all depends on how well the community adapts C extensions so they don’t cause downright crashes of the interpreter. Then, the remaining long tail is community adopting free threads in their applications in a way that is both correct and scales well. Those two are the biggest challenges but we have to be optimistic.
Even if it's 10% of the mess the path py2->py3 was, it still worries me. I hope I'm wrong and it's much less than that (for the fatal cases ATL, and similar/non improved perf for the rest)
What the Python committee considered infeasible was almost done by a lone hero. Since the previous decision to change the format of the print function (that no one asked for) broke everybody's code for no reason and took ten years to be adopted, they will not push (for the one change everyone wants) for the foreseeable future. Although it does not seem to be that impossible after all.
I am glad they will invite Sam, the lone hero we needed, and hope he will be given some ownership of the task and not get swamped with committee-isms through an embrace, not extend, and extinguish. He is on a success path; the committee is on no path at all.
Just put a timeline on it and call it a fail if it doesn't succeed, and quit avoiding it through "discussions". We got it, it's not planned for the X.XX+1 version, each version.
Python is used by a lot of "semi-technical" people (data scientists, researchers, hobbyists, etc). Removing the GIL isn't going to make their life any easier.
[+] [-] ctoth|4 years ago|reply
[+] [-] shepardrtc|4 years ago|reply
[+] [-] pganssle|4 years ago|reply
[+] [-] simonh|4 years ago|reply
[+] [-] rbanffy|4 years ago|reply
[+] [-] formerly_proven|4 years ago|reply
[+] [-] bigdict|4 years ago|reply
[+] [-] frazbin|4 years ago|reply
"On a personal level, we are impressed by Sam’s work so far and invited him to join the CPython project. I’m happy to report he is interested, and to help him ramp up to become a core developer, I will be mentoring him. Guido and Neil Schemenauer will help me review code for the interpreter bits I’m unfamiliar with."
[+] [-] blackandsqueaky|4 years ago|reply
[+] [-] caffzz|4 years ago|reply
[deleted]
[+] [-] mvanveen|4 years ago|reply
[+] [-] yjftsjthsd-h|4 years ago|reply
[+] [-] rbanffy|4 years ago|reply
Google also had their Unladen Swallow version, but it seems they lost interest at some point.
[+] [-] qwerty456127|4 years ago|reply
[+] [-] wyldfire|4 years ago|reply
[+] [-] fulafel|4 years ago|reply
[+] [-] mixedmath|4 years ago|reply
[+] [-] didip|4 years ago|reply
If users are gaining performance, they will bend over backward porting their code to this new version.
At minimum, I predict, all of FAANGMULA would jump on the bandwagon and create a pretty big ripple effect.
[+] [-] josefx|4 years ago|reply
[+] [-] stjohnswarts|4 years ago|reply
[+] [-] aserafini|4 years ago|reply
And Python is not a business with customers. It’s an open source volunteer project.
[+] [-] rolisz|4 years ago|reply
[+] [-] kzrdude|4 years ago|reply
https://archive.md/Zb8p2
(Archived through google cache, so two layers of cache.)
[+] [-] ambivalence|4 years ago|reply
[+] [-] jancsika|4 years ago|reply
[+] [-] bionhoward|4 years ago|reply
[+] [-] bigdict|4 years ago|reply
Not that it's bad but it should be mentioned that it's a corporate initiative.
[+] [-] minhazm|4 years ago|reply
[+] [-] Lammy|4 years ago|reply
At least the "Brand Guidelines" PDF makes it clear that “PyTorch, the PyTorch logo and any related marks are trademarks of Facebook, Inc.”: https://pytorch.org/assets/brand-guidelines/PyTorch-Brand-Gu...
Also a small hint in `CONTRIBUTING.md`: https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING....
[+] [-] jamesmishra|4 years ago|reply
[1]: https://github.com/colesbury/nogil
[+] [-] fulafel|4 years ago|reply
[+] [-] kombine|4 years ago|reply
[+] [-] IshKebab|4 years ago|reply
[+] [-] mihaic|4 years ago|reply
[+] [-] XorNot|4 years ago|reply
[+] [-] kzrdude|4 years ago|reply
[+] [-] gadrev|4 years ago|reply
[+] [-] orf|4 years ago|reply
[+] [-] jokoon|4 years ago|reply
But on the other hand, isn't it simple to just dedicate a script per core?
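The "one process per core" idea can be sketched with the standard library's `concurrent.futures.ProcessPoolExecutor`. This is a minimal illustration, not a benchmark: `count_primes` and `run_per_core` are hypothetical names, and the prime-counting task is just a stand-in for CPU-bound work.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_per_core(limits):
    # One worker process per core; each process has its own interpreter
    # and therefore its own GIL, so the tasks can run in parallel.
    with ProcessPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(count_primes, limits))
```

Calling `run_per_core([20_000] * 4)` from under an `if __name__ == "__main__":` guard spreads the four tasks across worker processes. The trade-off is that arguments and results have to be pickled and copied between processes, which is the overhead that shared-memory threads without a GIL would avoid.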
[+] [-] antman|4 years ago|reply
[+] [-] landmark3|4 years ago|reply
Impressive work (2 years working full time knowing that it might never be merged is incredible)
[+] [-] csmpltn|4 years ago|reply
[+] [-] chrisseaton|4 years ago|reply
[+] [-] WesolyKubeczek|4 years ago|reply