It's worth noting that Guido van Rossum, creator of Python, is working at Dropbox on this project, and that Mypy is funded primarily by Dropbox (I think).
I'm glad that they are doing it, because it's contributed hugely to the Python ecosystem -- especially making it easier for other companies to make the switch to Python 3.
I highly recommend mypy for new and current python3 users. It's one of the biggest and best reasons to be using python3 (not that you should need any more :) )
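For anyone who hasn't tried it, here is a minimal sketch of the kind of mistake a type checker like mypy is meant to catch. The `find_user` function and the sample data are made up for illustration; the annotations are plain Python 3 and are ignored at runtime.

```python
from typing import Dict, Optional


def find_user(user_id: int, users: Dict[int, str]) -> Optional[str]:
    """Hypothetical lookup that returns None when the id is unknown."""
    return users.get(user_id)


name = find_user(1, {1: "ada"})

# A type checker would be expected to reject the commented-out line below,
# because `name` may be None and None has no .upper() method:
# shout = name.upper()

# Narrowing the Optional first satisfies the checker and is safe at runtime.
if name is not None:
    shout = name.upper()
```

The script runs the same with or without mypy installed; the checker only reads the annotations.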
I've never used mypy and I used Python a lot until I recently got my first job.
It kind of looks like a band-aid. Could you elaborate on why I would use, say, mypy as opposed to Golang if I don't need the benefits of a scripting language?
edit: I can understand why mypy would be useful to refactor an existing (large) project to bring forth type safety guarantees as you go forwards, but surely for new projects there are other languages of choice if type safety is what you want?
At work, we have much smaller codebases that have fractional FTEs allocated to their ongoing maintenance. In spite of the difference in scale, my experience is similar to what was described in the article. Because we had focused on getting the unicode to work right in those old code bases, we had good test coverage for those features as a result.
The other common legacy problems to address were:
- Other implicit encode/decode behaviors in py2 that need to be explicit in py3
- Old 'print' and 'except' statements not valid in py3, easily rewritten
- Implicitly relative 'import' statements not valid in py3, rewritten with a little care
- Arithmetic needing a change to the '//' operator for integer division
- Waiting for py3 support in all third-party dependencies
- Dealing with restructuring in standard lib packages and in upgraded third-party libs w/ py3 support
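To make the first few items concrete, here is a rough sketch of what the py3 side of those rewrites tends to look like. The function names and values are invented for illustration and are not from our codebase.

```python
def parse_port(raw: bytes) -> int:
    """Turn bytes read from a socket or file into an int."""
    text = raw.decode("utf-8")       # explicit decode: bytes -> str
    try:
        return int(text)
    except ValueError as exc:        # py2 also accepted "except ValueError, exc"
        print("bad port:", exc)      # print is a function in py3
        return -1


def half(n: int) -> int:
    return n // 2                    # "/" on ints returns a float in py3


port = parse_port(b"8080")
encoded = "héllo".encode("utf-8")    # explicit encode: str -> bytes
```

Relative imports follow the same pattern: `import sibling` inside a package becomes `from . import sibling`.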
As I reviewed the techniques for straddling py2 and py3, I was displeased with how many seemed to involve a third dialect which was not really idiomatic py2 nor py3, particularly for the unicode/bytes handling. Many third-party libraries and frameworks also made different choices for how they handled this. Trying to integrate those approaches looked to produce even uglier code.
Also, some of our code had evolved since the Python 2.2 days and had accumulated cruft to import and wrap multiple generations of older standard lib and add-on packages which we have not cared about for 5+ years. The additional package restructuring in py3 would have made this even more bizarre. I wanted to see the code reset to use standard libs where possible and cull these legacy third-party dependencies. I also wanted the code to become more idiomatically py3, so whoever visited it for future maintenance would not need to work so hard to understand it.
So, we chose a clean break where we finally clean up and modernize the code to py3-only without the added burden of supporting py2 deployments from the same code. The declared 2020 deadline helped this decision. We branched our repos and worked on py3 ports and integration testing in parallel while continuing to run py2 in production. We declared a feature freeze on the py2 code, so we would not have a merging nightmare later, and so that we could use that as pressure to prevent procrastination on scheduling the flag day where we merge PRs and convert all our repos to a py3-only worldview.
It has to find and diff files and coordinate a very reliable upload of many files, as fast as possible. It has to understand enough about the content of those files to be able to do useful things with them. It has to reliably update itself again and again, from possibly very old versions, and it has to communicate with an ever-changing API. It has to have enough analytics in it to support product development: error reporting, understanding how users use it, and how the product needs to evolve over time. It needs to integrate deeply with all major OSes.
...I'm sure I've missed some things. 1 million lines of code sounds like the right ballpark to me though.
It is a good question, and I am glad you asked as many new python developers hear this advice from Zed.
I think in 2019 there is no question that you should be using python 3. Zed seems to be personally offended by both how the transition was handled and some of the design choices in python 3, but I think he is doing his readers a disservice if he is still recommending python 2 today.
> That "learn python the hard way" guy says don't use python 3
They are wrong. There is no good reason to start with Python 2 for new projects. The reasons people stay on 2.x are institutional inertia and legacy code. A hello world program has none of these problems.
FWIW, the latest version of LPTHW is focused on learning Python 3.
Beyond that, I personally don't find the arguments presented in versions of the book that recommend beginners learn Python 2 to be all that compelling. They read to me as one complaint about string handling that's wildly at odds with my own experience, nestled in a great big bowl of sour grapes about the fact that the breaking changes are happening in the first place.
The library support issue is a consideration, but depends heavily on what you want to do. My own sense has been that it's tools that are primarily used by devops and sysadmins that have the poorest Python 3 support, whereas the rest of the ecosystem is more-or-less 100% migrated at this point. Many have already started to discontinue Python 2 support. So, if you're saying that you're at the "Hello World" stage, I'm guessing that you'll hit a lot fewer speed bumps if you just start with Python 3 and don't look back.
Python 2 support is sunsetting in 2020 after over a decade of python 2 and python 3 support. I'd recommend starting with Python 3. Many external libraries you may need will only be issued for python 3 and many external libraries will choose to not even issue bug fixes for python 2.
He's 100% wrong. Use Python 3. Don't spend a second thinking about Python 2.
His complaints are bizarre. He mostly claims that Python 3 sucks because he can't figure out how to easily convert his old Python 2 code to work on Python 3, and therefore a beginner couldn't possibly learn the language from scratch. But all of his examples are edge cases of him intentionally fighting against the changes in Python 3 by doing things that don't make any sense, like trying to concatenate unicode strings with raw binary data. It's just the weirdest argument ever.
Python 3 makes almost everything slightly simpler and more logical - especially if you ever have to deal with unicode text. And since emojis are unicode and people love emojis, that means literally everyone.
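A quick sketch of what that looks like in practice (the sample string is made up): Python 3 refuses to mix text and bytes, and counts an emoji as one character.

```python
text = "café 🎉"                 # str: a sequence of Unicode code points
data = text.encode("utf-8")      # bytes: the UTF-8 encoding of that text

# Mixing the two types raises TypeError in Python 3 instead of guessing.
try:
    combined = text + data
except TypeError:
    combined = None

# Decoding first makes the intent explicit, and the concatenation works.
joined = text + data.decode("utf-8")

char_count = len(text)           # counts code points
byte_count = len(data)           # counts encoded bytes
```

The two lengths differ because "é" takes two bytes and the emoji takes four in UTF-8.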
Overall, he comes off to me like people who spend a lot of time ranting about USB-C ports when in reality USB-C is mostly great and honestly just isn't that big of a deal anyway. It's a strange hill to die on.
The only good historic argument for using Python 2 was that it used to be that some popular libraries weren't updated for Python 3 yet. But that hasn't been true for years. Nowadays you are more likely to run into libraries dropping Python 2 support than you are to find them not supporting Python 3. I can't think of a single remotely popular library that still doesn't support Python 3.
I don't write a lot of Python (use it mostly when deep learning calls for it), but my understanding is that it's time to start with Python 3. I try to write Python 3 whenever I have to write Python.
At Thread we did a similar thing. Admittedly our codebase is ~10% as big, but incrementally adding linters for incompatible code, and keeping the CI green, helped loads. We only had about 2-3 days of engineering time to ship the final version, the rest was done in 10% time.
It's already been since 2015. Every major Python dependency has been Python3 compatible for several years, and many have already gone Python3 only.
If you have a good test suite it's pretty easy to migrate; you can easily do a couple thousand lines per day. There were a few third-party testing tools that used to be fairly popular that are no longer well maintained, so if your tests are written in those then you might be looking at re-writing the app from scratch. But otherwise it's fairly straightforward to port.
Even if you did your string encoding just using guess-and-check, which most people did, it still doesn't take all that long to upgrade as long as there are tests.
I would say yes. I'm pretty sure most people are writing all new projects exclusively in python3, except if you need them to run on old distros without any dependencies (lots of distributions shipped for years with python2 but not python3 installed by default). The ecosystem of 3 is very mature - I've been running my servers' production code for 5+ years in python3 and had no major issues with that choice. The only thing left is "legacy codebases".
While adding Python3 support for numpy at a time when I worked for a large Ad-serving company, we managed to uncover a bug in python 2/3's import code that had been latent (no crash) in Python 2 for 15+ years. The problem was unique to people who had made it possible to import the same library from two filesystem locations.
In python2, it was silent, in python3 it was a segfault. There was really only one person in the company truly qualified to understand and fix the bug.
I've finally moved over to Python3 but boy, was that an unwanted transition.
I've noticed a trend that migrating from Python 2 to Python 3 includes adding mypy annotations and going async.
At that point, why not Go? You're trying to correct for a language that was designed for scripting, not application software. Go already has a type system, goroutines are a simpler model of concurrency than async (and Go can actually use multithreading), you don't have to choose between writing an async or non-async library, there's a built-in formatter (whereas Black is still experimental), and the code will run 10x faster.
The crucial lesson from the Python 2 -> 3 transition (which this article is also mainly about) is that incrementality is super valuable. In retrospect, a single codebase supporting 2 and 3 is the obvious best option, but that wasn't at all clear in the beginning. (For example, the u"" string syntax, which is crucial for compatibility, wasn't even legal until 3.3!) Moving to Python 3 incrementally is painful, but it's possible. Moving to a totally different programming language incrementally isn't possible, other than when you're starting brand new projects.
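To illustrate what "a single codebase supporting 2 and 3" means, here is a minimal sketch of the usual compatibility shim. The names `PY2`, `text_type` and `to_text` are my own for this example (libraries like six provide similar helpers), so treat it as an assumption-laden illustration rather than anyone's actual code.

```python
import sys

PY2 = sys.version_info[0] == 2

# `unicode` only exists on Python 2; the conditional expression never
# evaluates that name on Python 3, so this line is safe on both.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as text on either interpreter, decoding bytes if needed."""
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)


# The u"" prefix is accepted by Python 2 and by Python 3.3 and later.
greeting = u"hello"
decoded = to_text(b"hello")
```

Every module that touches text ends up routing through helpers like this, which is the cost of incrementality.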
I see your point, but I think when migrating over a million loc, it's a question of practicality. Incrementally migrating a project from Python 2 to 3, between which at least large-scale architectural patterns are more-or-less identical, is a completely different story from migrating to Golang, which uses drastically different paradigms to structure code and think about data flow.
mypy is a joke of a tool. I have actually used it, and 80% of its messages are useless junk or just plain wrong. Granted, the other 20% can be on point. All things considered, it's better to have a tool like it than not.
[+] [-] lincolnq|7 years ago|reply
[+] [-] xvector|7 years ago|reply
[+] [-] saltcured|7 years ago|reply
[+] [-] jonathanpoulter|7 years ago|reply
Disclosure: I work for Bank of America.
[+] [-] jsilence|7 years ago|reply
[+] [-] danpalmer|7 years ago|reply
[+] [-] bdcravens|7 years ago|reply
[+] [-] barkingcat|7 years ago|reply
[+] [-] nvr219|7 years ago|reply
[+] [-] doctoboggan|7 years ago|reply
[+] [-] outworlder|7 years ago|reply
[+] [-] bunderbunder|7 years ago|reply
[+] [-] ldiracdelta|7 years ago|reply
[+] [-] ageitgey|7 years ago|reply
[+] [-] carbocation|7 years ago|reply
https://pythonclock.org/
[+] [-] minimaxir|7 years ago|reply
Both have been resolved.
[+] [-] phowat|7 years ago|reply
[+] [-] raverbashing|7 years ago|reply
Use Python 3
[+] [-] happy-go-lucky|7 years ago|reply
[+] [-] N3cr0ph4g1st|7 years ago|reply
[+] [-] qgsrw3fd|7 years ago|reply
[deleted]
[+] [-] mserdarsanli|7 years ago|reply
[+] [-] danpalmer|7 years ago|reply
[+] [-] rhacker|7 years ago|reply
[+] [-] sametmax|7 years ago|reply
See
https://www.jetbrains.com/research/devecosystem-2018/python/
For 2018 stats
[+] [-] Alex3917|7 years ago|reply
[+] [-] lincolnq|7 years ago|reply
[+] [-] dangoor|7 years ago|reply
[+] [-] dekhn|7 years ago|reply
[+] [-] omarforgotpwd|7 years ago|reply
[+] [-] raverbashing|7 years ago|reply
Trickiest one that we tripped over: strings in Python 3 have the __iter__ method
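A sketch of why that bites (the `as_list` helper is hypothetical): code that used `hasattr(x, "__iter__")` to tell collections apart from strings now treats a string as a collection, so the string check has to come first.

```python
def as_list(value):
    """Wrap scalars in a list; turn real collections into lists."""
    # Check for text and bytes first: in Python 3 a str has __iter__,
    # so the hasattr test below would otherwise split it into characters.
    if isinstance(value, (str, bytes)):
        return [value]
    if hasattr(value, "__iter__"):
        return list(value)
    return [value]


string_is_iterable = hasattr("abc", "__iter__")
wrapped = as_list("abc")
expanded = as_list((1, 2))
```

Without the `isinstance` guard, `as_list("abc")` would come back as three one-character strings.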
[+] [-] petters|7 years ago|reply
No. :-) Now you need to remove all compatibility code, modernize your syntax and gradually start using new features.
[+] [-] ilovecaching|7 years ago|reply
[+] [-] oconnor663|7 years ago|reply
[+] [-] sammnaser|7 years ago|reply
Reminds me of some of the points made here: https://www.joelonsoftware.com/2000/04/06/things-you-should-....
[+] [-] ApolloFortyNine|7 years ago|reply
Well, that's not true. Are you basing this solely on the lack of a type system?
[+] [-] unknown|7 years ago|reply
[deleted]
[+] [-] painful|7 years ago|reply