
The homogenization of scientific computing (2013)

71 points | stickhandle | 12 years ago | talyarkoni.org

54 comments

[+] the_french|12 years ago|reply
I heard a saying at work: "Python is the second best language for anything." Scientific computing is a perfect example. By being close to the best, Python ends up taking over each sub-section of the SC world.
[+] mtct|12 years ago|reply
One of the best descriptions of Python that I've ever heard.
[+] zanny|12 years ago|reply
Man, if Python is the second best in readability, I want to become best friends with number one. (and I don't mean ugly syntax like forwarded arguments that makes Python look like ass, I mean normal scripts that read like prose)
[+] thisjustinm|12 years ago|reply
So true. And being second best with a very friendly syntax often makes it a first choice.
[+] karangoeluw|12 years ago|reply
> Scientific computing is a perfect example

Just out of curiosity, what's the best one?

[+] jdonaldson|12 years ago|reply
If python is good enough to replace R or Matlab for you, then you are using a negligible fraction of what those platforms have to offer.

R is a lot like vim or javascript. It has a lot of warts, but it's an incredibly expressive toolkit for its task. I usually buy into a language once I find a few extremely gifted developers working with it (and who seem to do so voluntarily). For instance: for vim it's Tim Pope, for R, it's Hadley Wickham, for javascript it's Mike Bostock.

Python, despite its many good decisions, is likewise full of warts. So, who are good developers to follow in the python/numpy community?

[+] takluyver|12 years ago|reply
> who are good developers to follow in the python/numpy community?

Off the top of my head: Travis Oliphant, creator of numpy and founder of two companies in the scientific Python space; Jake Vanderplas, enthusiastic developer and blogger; and Fernando Perez, creator of IPython (disclaimer: I work for Fernando). In the broader Python world, Kenneth Reitz, the author of requests.

[+] yason|12 years ago|reply
That's how idiomatic one-way-to-do-it generic languages slowly win. They raise the bar for the specialty languages so that, for example, after R the next statistical language really, really needs to shine in what it does in order to justify the trouble of learning another language over using a Python with a statistics library.
[+] hatmatrix|12 years ago|reply
There's a catch though - once you start using numpy/scipy, there's no longer one-way-to-do-it and code becomes pretty messy.
[+] ggchappell|12 years ago|reply
I'm seeing this phenomenon in my own work. I do a fair amount of computational stuff (in the old sense of computing something, as opposed to just using a computer), and I find myself gravitating more & more to Python.

Performance is, of course, an issue. But in one case I noted that the week or so it took me to write and execute some Python code was almost certainly less time than it would have taken just to write the code in C/C++. I concluded that -- for this problem at least -- the extra performance I might have gotten out of C++ simply did not matter.

However, these days I do not do much with large scientific models or anything resembling big data. Performance is not the issue for me that it would be for others. And with articles having titles like "Why Python is Slow: Looking under the Hood"[1] out there, I find it difficult to see how Python can displace Fortran (or maybe C) in the realm of traditional supercomputing.

[1] https://jakevdp.github.io/blog/2014/05/09/why-python-is-slow...

[+] jokoon|12 years ago|reply
Remember, there are ways to squeeze performance out of Python: projects like Cython, for example. You can also write a Python library in C... I don't know how practical that is, though.
[+] Malarkey73|12 years ago|reply
I think the author really needs to define scientific computing. Some people would think of that as simulations, clusters, astronomy, epidemiology and fluid dynamics. That may be typified by a scripting-language wrapper (including Python) around C, C++ or even Fortran programs.

He himself seems to be talking more about standard stats and data analysis or prediction. In this field R is growing at least as fast as Python. R is the no. 1 Kaggle language and the no. 1 academic stats language, with Python second and stable there. The Software Carpentry movement to train scientists concentrates on both R and Python.

Then the biggest scientific programming field is bioinformatics, which really is a mishmash of Python, Perl, Java, C++, whatever piped CLI tools, and a lot of R again. Here Python is growing mostly at the expense of Perl (there is a big Perl legacy), but as most software is designed for piping together in Bash scripts, the diversity is not too big a deal. My own perusal of job adverts in this area sees employers asking for "R, Perl or Python", which are viewed as interchangeable.

[+] acjohnson55|12 years ago|reply
But he also talks about document parsing and web dev. I think Python finds its edge when you have to combine things on the sciencey/mathy side of things with more user-facing software development. As the author said, it's nice not having to code switch between a dozen best-in-field languages, when Python is highly capable across the board. Besides maybe C# and maybe Java, what other languages are as versatile?
[+] scythe|12 years ago|reply
The easy objection is that none of what he's doing actually depends on Python; those libraries could just as easily have been linked to Javascript, Ruby, Lua, Clojure, etc. The critical point: NumPy does not really take advantage of any specific feature of Python other than its popularity. Python isn't really taking over when most of the actual code that runs was written in C, but it's the best glue right now.

But really, just this second part is the point. Why wasn't this post written five years ago, when Python had already become a popular and established language? Well, if you look at it a little differently, it took roughly five years from the time Python took over undergrad CS courses to the time it became the lingua franca of scientific computing -- see a connection?

[+] kenjackson|12 years ago|reply
This is a partial example of worse-is-better. No one wants to have 10 languages in their toolbelt.

I personally find I use C# for almost everything. You can find good libraries for virtually everything you need. And it has the best tooling I've found. Is it actually the best language for everything? Absolutely not, but it's close enough that I'm probably the most productive in virtually everything with it.

[+] ZanyProgrammer|12 years ago|reply
Python is such a ubiquitous language, even outside of scientific computing. It seems like mobile dev is the only area where it's not present.
[+] jnbiche|12 years ago|reply
Python faces the same problem that Clojure does on mobile: start-up times are too slow and battery use is too high.

People have made valiant efforts to overcome this issue in both Python and Clojure: Kivy, Py4A and Clojure on Android, in particular. But they still can't provide reasonable performance compared to native mobile frameworks.

It's just a product of Python's implementation: too much complexity and too heavyweight objects (and almost everything is an object!).

To a certain extent, the same applies to Clojure -- there is a price to be paid for abstractions and (in the case of Clojure) immutability, and it rears its ugly head the most when trying to fit these language implementations on resource-constrained mobile platforms.
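
As a rough illustration of the "heavyweight objects" point, here is a minimal sketch that compares a plain list of ints with a packed typed array from the standard library. It assumes CPython; the exact byte counts are implementation details and differ between versions and platforms, and the variable names are made up for this example.

```python
import sys
from array import array

# Assumption: CPython. sys.getsizeof reports implementation-specific sizes.
n = 1000
boxed = list(range(1000, 1000 + n))  # a list of separate int objects
packed = array("q", boxed)           # signed 64-bit integers stored contiguously

# Size of one int object versus one element of the typed array.
per_int_object = sys.getsizeof(boxed[0])
per_packed_item = packed.itemsize

# Rough totals: the list's pointer table plus every int object it refers to,
# versus the single buffer owned by the array.
boxed_total = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
packed_total = sys.getsizeof(packed)

print(per_int_object, per_packed_item, boxed_total, packed_total)
```

Each int carries its own object header, so the list costs several times what the raw 64-bit values need. This only shows the memory side; it says nothing by itself about start-up time or battery use.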

[+] seba_dos1|12 years ago|reply
The whole freesmartphone.org middleware was initially implemented in Python and there is a bunch of Python apps in distros like SHR.

Python is great for mobile when it comes to hacking on-the-fly - I actually tracked down and fixed a few bugs in early Openmoko while getting bored on the tram etc., thanks to the most useful parts being implemented in Python. Unfortunately, on Freerunner the difference in performance was very visible - however it shouldn't be as bad on more recent hardware.

[+] marcmarc|12 years ago|reply
The Kivy project seems to be making progress recently. However, I believe it still has a limited scope. It has poor support for native framework features/UI.
[+] analog31|12 years ago|reply
I bought an Android 3.x tablet in rapt anticipation of using the Android Python interpreter, but what killed me was a lack of packages that I could easily deploy and use, most notably Tkinter.
[+] Peaker|12 years ago|reply
If I analyzed the toolset I used years ago vs. now, I'd conclude the world is moving from Python to Haskell :)
[+] cjbprime|12 years ago|reply
I wonder what it'll take for JavaScript to start taking over Python's niches. Is there a numeric library (like Scipy/pandas) hooked up to asm.js yet? I can imagine that being faster than Python.
[+] alexandros|12 years ago|reply
Javascript will need to fix the mess that is its numeric types. As it currently is, you either need a bignum library with an ungodly syntax, or you are open to losing precision above a certain level, without warning. I tried solving Euler problems with JS, but it's so painful I switched back to Python after a few.
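
To make the precision complaint concrete, here is a minimal sketch written in Python rather than JS. It relies on the fact that a JS Number and a Python float are both IEEE 754 doubles, so float stands in for the JS behavior, while Python's int is arbitrary precision. The names are illustrative only.

```python
# JS Numbers are IEEE 754 doubles, the same format as Python's float,
# so float is used here as a stand-in for JS arithmetic.
LIMIT = 2 ** 53  # above this, not every integer has an exact double

as_double = float(LIMIT)
# Adding 1 is silently lost: the result rounds back to 2**53.
lost = (as_double + 1.0 == as_double)

# Python's int is arbitrary precision, so the same sum stays exact.
kept = (LIMIT + 1 != LIMIT)

# A Project Euler sized value: 3**50 needs about 80 bits, more than a
# double's 53-bit significand, so a round trip through float changes it.
big = 3 ** 50
drift = big - int(float(big))

print(lost, kept, drift != 0)
```

No exception or warning is raised at any point in the float path, which is the "without warning" part of the complaint.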
[+] zokier|12 years ago|reply
> Is there a numeric library (like Scipy/pandas) hooked up to asm.js yet? I can imagine that being faster than Python.

Faster than Python using LAPACK and other native libs?

[+] acjohnson55|12 years ago|reply
CoffeeScript proved to a lot of people how much more enjoyable JS could be with some of the lessons in language ergonomics learned from Python and Ruby. I think that ES6 really reflects this. With some of the long-awaited improvements, like better lambdas, generators, Set/Map, assignment sugar, better metaprogramming of objects, and standard-blessed implementations of promises and classes, I think that JS is finally close to being able to compete in the multi-paradigm space. Sweet.js (or some other macro layer) will also really help in letting people develop new syntactic abstractions for domain-specific stuff. It'll be interesting to see how it all develops.
[+] alco|12 years ago|reply
Python is good for scientific computing. But this is one person's experience. Users of other languages may sleep calm.
[+] VeejayRampay|12 years ago|reply
Isn't Golang slowly eating Python's stolen lunch though?
[+] quanticle|12 years ago|reply
In scientific computing? No way. I'd expect something like Julia to overtake Python before golang does.
[+] sitkack|12 years ago|reply
I don't think so, but it should be siphoning off people that use Cython.
[+] tmikaeld|12 years ago|reply
What about Zope/Plone?

Not good enough, or any other reason?

I'm curious.

[+] darkandbrooding|12 years ago|reply
In the early 2000s, the Zope community felt that Zope 2.x had grown long in the tooth, so they decided to create a new, modern code base that they called "Zope 3," initially released in 2004.

Zope 3 was not backward compatible with Zope 2.x, nor with the impressive ecosystem of Zope 2.x plugins, and for years there was confusion about the direction, momentum, and relative importance of those two parallel projects. Zope 3 never gained any significant traction, because in adopting a "component architecture" they decided to use XML to connect those individual components, and it felt like you spent more time writing XML than Python. IMO this was a strategic mistake.

In 2005 the Django framework was released. In 2006, Ruby on Rails was released. After a couple years of confusion about the Zope roadmap, developers now had multiple options. You couldn't sell management on new Zope 2 apps, Zope 3 wasn't ready for prime time, and Rails was a significantly(!) more productive environment than Zope 2. (Django presumably is, too, but I have no Django experience and so cannot say.)

In 2010, the Zope community renamed "Zope 3" to "bluebream" to clarify their messaging, but that was after six years of ambiguity. Developers moved to other tools and frameworks, and Zope's developer community shrank until it no longer had a critical mass of developer interest.