Fixing the Python subprocess interface

291 points| daenz | 14 years ago |amoffat.github.com | reply

61 comments

[+] tswicegood|14 years ago|reply
FWIW, re-replaying the stack trace to figure out what was imported and re-implementing it is a horrible idea. There are much better ways to do this type of import voodoo, specifically the import hooks that Python ships with. Here's an example of their use inside a small side project of mine: https://github.com/tswicegood/maxixe/blob/master/maxixe/__in...

All that said, this is horribly un-Pythonic. A much better route to go is something like envoy[1], which simply wraps `subprocess` in a sane API.

[1]: https://github.com/kennethreitz/envoy
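
For readers unfamiliar with the import hooks mentioned above, here is a minimal sketch of one on Python 3. It is not the code from the linked project or from pbs; the module name `cmds` and both class names are made up for illustration. A finder on `sys.meta_path` claims the name `cmds` and hands back a module that resolves attributes to programs on the PATH, so no stack inspection is needed.

```python
import importlib.abc
import importlib.machinery
import shutil
import subprocess
import sys
import types


class _CommandModule(types.ModuleType):
    """Hypothetical module type whose attributes are programs on PATH."""

    def __getattr__(self, name):
        # Leave dunder lookups (e.g. __path__) to fail normally.
        if name.startswith("__"):
            raise AttributeError(name)
        path = shutil.which(name)
        if path is None:
            raise AttributeError("command not found: " + name)
        # Each call runs the program and returns its captured stdout.
        return lambda *args: subprocess.run(
            [path, *args], capture_output=True, text=True, check=True
        ).stdout


class CommandFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Import hook that serves the made-up top-level module 'cmds'."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname == "cmds":
            return importlib.machinery.ModuleSpec(fullname, self)
        return None  # let the other finders handle everything else

    def create_module(self, spec):
        return _CommandModule(spec.name)

    def exec_module(self, module):
        pass  # nothing to execute; attributes are resolved lazily


# Appended last, so real modules always take priority over the hook.
sys.meta_path.append(CommandFinder())
```

With the hook installed, `from cmds import ls` succeeds only if `ls` is on the PATH and raises ImportError otherwise, which is the same behavior discussed elsewhere in this thread.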

[+] tswicegood|14 years ago|reply
Just to clarify. I applaud this type of development for people trying to learn various parts of Python like playing with the stack, but the idea of this being used in the wild scares me a bit.
[+] jamesaguilar|14 years ago|reply
envoy looks a lot more verbose than this though . . . if you're really trying to replace shell scripting, having envoy.run(foo) on every line is going to get annoying.

Also, it's not clear on what basis you make this assertion:

> FWIW, re-replaying the stack trace to figure out what was imported and re-implementing it is a horrible idea.

Does it not work?

[+] samps|14 years ago|reply
It's odd that the import mechanism is abused here to make objects "out of thin air". The fact that "from pbs import ffmpeg" works only if ffmpeg is actually on the path is somewhat surprising.

I think the more comfortable (and Pythonic?) way to do this would be to explicitly create these command objects:

   >>> import pbs
   >>> ffmpeg = pbs.Command('ffmpeg') # or '/usr/bin/ffmpeg', perhaps
   >>> result = ffmpeg(...)
[Edit: I really like the concept of using Python's positional and keyword arguments to construct a shell command, though. Great insight there.]
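
A rough sketch of what such an explicit command object could look like on Python 3, built directly on `subprocess`. This is not pbs's actual `Command` class; in particular, the way keyword arguments are turned into long options here is an assumption made for the example.

```python
import shutil
import subprocess
import sys


class Command:
    """Hypothetical explicit wrapper around one executable."""

    def __init__(self, program):
        # Resolve a bare name via PATH; an explicit path is returned as-is.
        resolved = shutil.which(program)
        if resolved is None:
            raise FileNotFoundError("command not found: " + program)
        self.path = resolved

    def __call__(self, *args, **kwargs):
        # Each positional argument becomes exactly one argv entry.
        argv = [self.path, *map(str, args)]
        # Assumed mapping: verbose=True -> --verbose, out="x" -> --out=x
        for name, value in kwargs.items():
            flag = "--" + name.replace("_", "-")
            argv.append(flag if value is True else flag + "=" + str(value))
        done = subprocess.run(argv, capture_output=True, text=True, check=True)
        return done.stdout


# The current interpreter stands in for ffmpeg so the sketch runs anywhere.
python = Command(sys.executable)
print(python("-c", "print('hello from a subprocess')"))
```

Because the object is created explicitly, a missing program fails at construction time with an ordinary exception instead of at import time.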
[+] SoftwareMaven|14 years ago|reply
Except that changes what it is trying to accomplish. The goal (as I read it) is to make shell scripting more palatable in python. Needing to continually differentiate between "this is python" and "this is system" gets old very fast.

In your example, what pbs is doing is no different than a bash shell script reporting a "command not found". It isn't what I would want if I were writing a complex piece of software interacting with ffmpeg, but for a simple script to process a bunch of files in a directory, I like it.

I've tended to shy away from using Python for shell scripting because subprocess is so ugly (even its out-of-favor, crippled relative os.system is nasty), and pbs looks like it does a really great job at addressing that.

[+] daenz|14 years ago|reply
Hi! Author here, there are a few ways to use it, including your suggestion (did you make your suggestion up or were you pulling from the docs?):

    # magical, designed only for single shell scripts
    from pbs import *
    ffmpeg()

    # less magical
    from pbs import ffmpeg
    ffmpeg()

    # or
    import pbs
    pbs.ffmpeg()

    # no magic
    import pbs
    ffmpeg = pbs.Command(pbs.which("ffmpeg")) # command takes full path
    ffmpeg()
I tried to cover the main use cases adequately. My goal was to ease a pain point that I and others have experienced, and that other popular packages unfortunately don't address well.

If it's helped anyone like it's helped me, I'm happy and glad to share!

[+] nas|14 years ago|reply
Neat but way too magical for my taste. The code to figure out what to do in the case of 'from .. import *' is particularly ugly.

Perhaps the commands should be accessible from an object you import. That's slightly more typing but more explicit and would not require ugly magic. E.g.

  from pbs import sh
  print sh.ifconfig('eth')
If it's not clear, the 'sh' object could override __getattr__ or __getattribute__ and wrap commands as necessary.
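
A minimal sketch of that idea on Python 3, assuming nothing about how pbs works internally. The class name `Shell` is made up; only `__getattr__` is overridden, since it runs only when normal attribute lookup fails.

```python
import shutil
import subprocess


class Shell:
    """Hypothetical namespace object: attribute access looks up a program on PATH."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        path = shutil.which(name)
        if path is None:
            raise AttributeError("no such command on PATH: " + name)

        def run(*args):
            # One Python argument maps to exactly one argv entry.
            done = subprocess.run([path, *map(str, args)],
                                  capture_output=True, text=True, check=True)
            return done.stdout

        return run


sh = Shell()
```

Usage would then be `print(sh.ifconfig("eth0"))`, with the command's origin visible at every call site.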
[+] dman|14 years ago|reply
Anything that encourages a from ... import * usage is evil irrespective of its implementation.
[+] d0mine|14 years ago|reply
To avoid any magic you could use:

  ifconfig = pbs.Command("/path/to/ifconfig")
[+] jmtulloss|14 years ago|reply
So many people saying this is "too much magic". Whatever, I'm into it. The idea of commands being functions that can just pass their output to other functions is intuitive, and passing arguments as, well, arguments is as well.

It might not be pythonic, it might be a crime against Guido and everything he represents, but it's pretty awesome.

[+] gbog|14 years ago|reply
I have the feeling you misunderstood the problem with magic. In fact, having a function that calls a shell command is OK. The problem is knowing, from reading the code, where it comes from. In Python there is this idea of transparency and determinism, which is why it feels enlightening, like in the docs cartoon. Anything obscuring that is said to be unpythonic.
[+] scott_s|14 years ago|reply
I use Python for shell scripting a lot. Ignoring all of the issues people have brought up here, I really like how function composition is piping:

  # sort this directory by biggest file
  print sort(du("*", "-sb"), "-rn")

  # print the number of folders and files in /etc
  print wc(ls("/etc", "-1"), "-l")
The reason that I like that method over, say, envoy's [1] method is that envoy.run('uptime | pbcopy') has what I consider code in strings. When I'm writing a script, the programs I'm calling, and how they interact, is part of the "code" to me. I would prefer that they're at the language level, and not represented as strings.
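
One way the "composition is piping" behavior could be implemented on Python 3 is sketched below. This is a guess at the general technique, not pbs's code: the `Output` marker class and the `command` factory are invented names, and the whole output is buffered in memory rather than streamed.

```python
import subprocess
import sys


class Output(str):
    """Marks a string as captured command output rather than an argument."""


def command(*base):
    """Build a callable that runs base + args and returns its stdout."""
    def run(*args):
        stdin = None
        # If the first argument is another command's output, feed it to stdin.
        if args and isinstance(args[0], Output):
            stdin, args = args[0], args[1:]
        done = subprocess.run([*base, *args], input=stdin,
                              capture_output=True, text=True, check=True)
        return Output(done.stdout)
    return run


# Stand-ins built on the current interpreter so the sketch runs anywhere.
emit = command(sys.executable, "-c", "print('b'); print('a'); print('c')")
count_lines = command(sys.executable, "-c",
                      "import sys; print(len(sys.stdin.readlines()))")

print(count_lines(emit()))
```

The inner call runs to completion first and its result is passed in like any other value, so the pipeline is expressed in the language rather than inside a string.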

[1] I only just learned about envoy in this thread. Thanks! Sadly, one of the places where I run my Python scripts is a location where I can't install my own packages, and I don't want to deal with using my own install of Python, so I tend to just implement something like this:

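  import sys
  from subprocess import Popen, PIPE

  def checked_exec(seq):
    # seq is an argv list such as ['ls', '-l']; text mode returns str output
    p = Popen(seq, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    stdout, stderr = p.communicate()
    if p.returncode != 0:
        sys.stderr.write('err: ' + stderr + '\n')
        sys.stderr.write("'" + ' '.join(seq) + "' failed.\n")
        sys.exit(1)
    return stdout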
[+] ulope|14 years ago|reply
As long as the package you need doesn't include a C extension (which most don't) you can just ship it with your code (license permitting ofc.) - just add the path to the library to sys.path. It's not a very clean solution but can be a real life saver when you have to work on "broken" systems.
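
A small sketch of the sys.path approach, assuming the pure-Python packages have been copied into a directory named `vendor` next to the script. Both the directory name and the helper name are hypothetical.

```python
import os
import sys


def add_vendor_dir(script_path, dirname="vendor"):
    """Put <script dir>/<dirname> at the front of sys.path, once."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    vendor = os.path.join(script_dir, dirname)
    if vendor not in sys.path:
        # Front of the list, so the bundled copy wins over any system copy.
        sys.path.insert(0, vendor)
    return vendor
```

A script would call `add_vendor_dir(__file__)` before importing the bundled package.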
[+] kenneth_reitz|14 years ago|reply
This module is a huge hack. It's a neat hack, though :)

I've written what I feel is a much better solution to this problem: Envoy.

https://github.com/kennethreitz/envoy

It's pythonic and makes far fewer assumptions about both your code and what you're running.

[+] mafro|14 years ago|reply
+1 for Envoy. It's a really nice replacement for subprocess (which is awful).

Clint and Requests are worth a look too.. Thanks Kenn!

[+] bryze|14 years ago|reply
Can we please stop titling postings like this? How about "An Alternative to the Python Subprocess Interface". Fixing something implies that it's broken or inadequate, and from my limited experience, Subprocess is already a major improvement over os.system. I'm not saying this idea doesn't have value, I'm saying that the way it's framed lacks the humility it ought to have.
[+] ggchappell|14 years ago|reply
This is a cute little hack, and quite possibly a very useful one. (And if it isn't useful, then it is certainly interesting and instructive.)

However:

Please, please, please don't use or recommend things like "from pbs import * ". Namespace pollution is bad enough when importing a documented collection of functions. Importing functions that are named based on whatever happens to be in my path at the moment ... that's seriously scary.

But "import pbs" looks like fun. :-) And "pbs.ls" isn't that much to type.

As The Zen of Python says:

> Namespaces are one honking great idea -- let's do more of those!

P.S. Hmmm ... but does "import pbs" work? Haven't tried it.

[+] trun|14 years ago|reply
Reminds me of something I saw not too long ago...

https://github.com/JulienPalard/Pipe

It would be really cool (though admittedly less Pythonic) to combine the infix notation provided by the Pipe library to allow more shell-like function chaining.

Instead of this...

  print wc(ls("/etc", "-1"), "-l")
You would have this...

  print ls("/etc", "-1") | wc("-l")
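
The infix form can be sketched on Python 3 by overloading `__or__`. This is neither the Pipe library's nor pbs's implementation; the `Cmd` class is invented for the example, it only supports left-to-right chains, and it buffers each stage's output instead of streaming it.

```python
import subprocess
import sys


class Cmd:
    """Hypothetical lazy command: nothing runs until .run() is called."""

    def __init__(self, *argv, upstream=None):
        self.argv = list(argv)
        self.upstream = upstream

    def __or__(self, other):
        # a | b returns a copy of b whose stdin will come from a.
        return Cmd(*other.argv, upstream=self)

    def run(self):
        # Run the upstream stage first and feed its stdout to this one.
        data = self.upstream.run() if self.upstream else None
        done = subprocess.run(self.argv, input=data,
                              capture_output=True, text=True, check=True)
        return done.stdout


# Stand-ins built on the current interpreter so the sketch runs anywhere.
produce = Cmd(sys.executable, "-c", "print('x'); print('y')")
count = Cmd(sys.executable, "-c",
            "import sys; print(len(sys.stdin.readlines()))")

print((produce | count).run())
```

Deferring execution until `.run()` is what lets the right-hand command be written before its input exists.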
[+] stephen_mcd|14 years ago|reply
The value/cost of a "hack" is offset by what it provides. If something can be implemented at the same cost in a less brittle and more future-proof way, then by all means label it horrible.

If you think "horrible" hacks are a slight against something that provides an amazing level of functionality to an end user, then you've lost sight of what we're coding for.

[+] jneen|14 years ago|reply
Magic!!!!!

I code in Ruby for a living, and this is even a bit much magic for me. I suppose you can use it in magic-less mode with `from pbs import Command`, and set up your command set manually.

[+] mLewisLogic|14 years ago|reply
If you don't like the import *, don't use it. Python supports it, so why shouldn't this library? (not that it's a great idea)

If you'd like to handle missing system executables, catch the exception. You should be writing in that style anyways.
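
As an illustration of that style with plain `subprocess` on Python 3 (which exception pbs itself raises is not shown here, so this sketch does not assume one), a missing executable surfaces as FileNotFoundError and can be handled like any other error. The helper name `try_run` is made up.

```python
import subprocess


def try_run(argv):
    """Return the command's stdout, or None if the executable is missing."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, check=True)
    except FileNotFoundError:
        # Raised by subprocess when the program cannot be found.
        return None
    return done.stdout
```

A nonzero exit status still raises `subprocess.CalledProcessError` because of `check=True`, so "not installed" and "ran but failed" stay distinguishable.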

Honestly, this cleans up a ton of system scripting code, making it way more readable/maintainable.

Maybe the code could be cleaned up, but this is the direction that Python should be heading. Abstract away the complications when possible, keep low-level stuff around for when it's absolutely needed.

Beautiful is better than ugly.

[+] cobbal|14 years ago|reply
It states that the lines

  curl("http://duckduckgo.com/", "-o page.html", "--silent")
  curl("http://duckduckgo.com/", "-o", "page.html", "--silent")
are equivalent. This worries me, I would much rather always have it be one argument corresponding to exactly one shell argument. Here it looks like in some (maybe all?) cases arguments are split on spaces, which means always having to be extremely cautious about escaping, something a good abstraction shouldn't force you to deal with.
[+] mgedmin|14 years ago|reply
Exactly. If you look at the sources, you'll see that the arguments are all joined into a single string with spaces, then split back into separate words using shlex.split().

So cat("filename with spaces in it") will fail, but cat("'filename with spaces in it'") ought to succeed.

It's a neat experiment, but using this module in production would not be a great idea.
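
The join-then-split behavior described above is easy to reproduce with the standard library alone. The helper below is a stand-in written for this illustration, not the code from pbs.

```python
import shlex


def naive_argv(*args):
    """Join the arguments with spaces, then re-split them with shlex."""
    return shlex.split(" ".join(args))


print(naive_argv("-o page.html", "--silent"))
# ['-o', 'page.html', '--silent']

print(naive_argv("filename with spaces in it"))
# ['filename', 'with', 'spaces', 'in', 'it']

print(naive_argv("'filename with spaces in it'"))
# ['filename with spaces in it']

# shlex.quote adds the quoting needed to survive the round trip.
print(naive_argv(shlex.quote("filename with spaces in it")))
# ['filename with spaces in it']
```

Passing a list straight to `subprocess` avoids the round trip entirely, because each list element is already exactly one argument.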

[+] VMG|14 years ago|reply
Very impressive, but the use of globals and the dynamic lookup mechanism are a little scary. Looking at the source there seems to be some magic involved like hacking the interpreter.

I'd feel more comfortable if it only exported one variable.

[+] ishi|14 years ago|reply
That's pretty brilliant, well done.
[+] bwarp|14 years ago|reply
This is basically a dynamic DSL. A confusing muddle of syntax and semantics. Will result in pain. Usually when least expected.
[+] simonw|14 years ago|reply
This looks great, but I really don't like the fact that I can't fire up a Python shell and try it out interactively. Having to run pbs.py itself to get a different kind of shell is uncomfortable.
[+] daenz|14 years ago|reply
This is fixed on master as of version 0.4, just fyi. It has some limitations (no star import) but otherwise works as expected.