Fixing the Python subprocess interface

[+] tswicegood|14 years ago|reply

FWIW, re-replaying the stack trace to figure out what was imported and re-implementing it is a horrible idea. There are much better ways to do this type of import voodoo, specifically the import hooks that Python ships with. Here's an example of their use inside a small side project of mine: https://github.com/tswicegood/maxixe/blob/master/maxixe/__in...

All that said, this is horribly un-Pythonic. A much better route to go is something like envoy[1], which simply wraps `subprocess` in a sane API.

[1]: https://github.com/kennethreitz/envoy

[+] tswicegood|14 years ago|reply

Just to clarify. I applaud this type of development for people trying to learn various parts of Python like playing with the stack, but the idea of this being used in the wild scares me a bit.

[+] jamesaguilar|14 years ago|reply

envoy looks a lot more verbose than this though . . . if you're really trying to replace shell scripting, having envoy.run(foo) on every line is going to get annoying.

Also, it's not clear on what basis you make this assertion:

> FWIW, re-replaying the stack trace to figure out what was imported and re-implementing it is a horrible idea.

Does it not work?

[+] samps|14 years ago|reply

It's odd that the import mechanism is abused here to make objects "out of thin air". The fact that "from pbs import ffmpeg" works only if ffmpeg is actually on the path is somewhat surprising.

I think the more comfortable (and Pythonic?) way to do this would be to explicitly create these command objects:

   >>> import pbs
   >>> ffmpeg = pbs.Command('ffmpeg') # or '/usr/bin/ffmpeg', perhaps
   >>> result = ffmpeg(...)

[Edit: I really like the concept of using Python's positional and keyword arguments to construct a shell command, though. Great insight there.]
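To make the idea of building a command from Python arguments concrete, here is a minimal sketch for Python 3. The `build_argv` helper and its mapping rules (a single-letter keyword becomes `-k value`, a longer one becomes `--key=value`, and `True` becomes a bare flag) are assumptions for illustration, not necessarily what pbs does.

```python
def build_argv(program, *args, **kwargs):
    """Turn Python call arguments into an argv list (hypothetical helper)."""
    argv = [program]
    for key, value in kwargs.items():
        flag = ("-" if len(key) == 1 else "--") + key.replace("_", "-")
        if value is True:
            argv.append(flag)                  # boolean keyword: bare flag
        elif value is False or value is None:
            continue                           # disabled keyword: flag omitted
        elif len(key) == 1:
            argv.extend([flag, str(value)])    # short option: -o page.html
        else:
            argv.append(f"{flag}={value}")     # long option: --output=page.html
    argv.extend(str(a) for a in args)          # positional arguments go last
    return argv


print(build_argv("curl", "http://duckduckgo.com/", o="page.html", silent=True))
# ['curl', '-o', 'page.html', '--silent', 'http://duckduckgo.com/']
```

Because the result is a list, it can be handed straight to `subprocess.run` without any shell quoting.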

[+] SoftwareMaven|14 years ago|reply

Except that changes what it is trying to accomplish. The goal (as I read it) is to make shell scripting more palatable in python. Needing to continually differentiate between "this is python" and "this is system" gets old very fast.

In your example, what pbs is doing is no different than a bash shell script reporting a "command not found". It isn't what I would want if I were writing a complex piece of software interacting with ffmpeg, but for a simple script to process a bunch of files in a directory, I like it.

I've tended to shy away from using Python for shell scripting because subprocess is so ugly (even its out-of-favor, crippled relative os.system is nasty), and pbs looks like it does a really great job at addressing that.

[+] daenz|14 years ago|reply

Hi! Author here, there are a few ways to use it, including your suggestion (did you make your suggestion up or were you pulling from the docs?):

    # magical, designed only for single shell scripts
    from pbs import *
    ffmpeg()

    # less magical
    from pbs import ffmpeg
    ffmpeg()

    # or
    import pbs
    pbs.ffmpeg()

    # no magic
    import pbs
    ffmpeg = pbs.Command(pbs.which("ffmpeg")) # command takes full path
    ffmpeg()

I tried to cover the main use cases adequately. My goal was to ease a pain point myself and others have experienced, that other popular packages don't address well unfortunately.

If it's helped anyone like it's helped me, I'm happy and glad to share!

[+] nas|14 years ago|reply

Neat but way too magical for my taste. The code to figure out what to do in the case of 'from .. import *' is particularly ugly.

Perhaps the commands should be accessible from an object you import. That's slightly more typing but more explicit and would not require ugly magic. E.g.

  from pbs import sh
  print sh.ifconfig('eth')

If it's not clear, the 'sh' object could override __getattr__ or __getattribute__ and wrap commands as necessary.
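A rough sketch of that `__getattr__` approach, written for Python 3 on top of the standard `subprocess` module. The `Shell` class name and its behavior are assumptions made for illustration; nothing here is taken from the pbs source.

```python
import subprocess


class Shell:
    """Namespace object: attribute access yields a callable that runs a program."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so real
        # attributes and methods of Shell are unaffected.
        if name.startswith("_"):
            raise AttributeError(name)

        def run(*args):
            # One Python argument maps to exactly one argv entry; no shell involved.
            result = subprocess.run(
                [name, *map(str, args)],
                check=True,
                capture_output=True,
                text=True,
            )
            return result.stdout

        run.__name__ = name
        return run


sh = Shell()
print(sh.echo("hello", "world"))
```

The program is looked up on the PATH only when the returned function is called, so a missing command surfaces as an exception at the call site rather than at import time.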

[+] dman|14 years ago|reply

Anything that encourages a from ... import * usage is evil irrespective of its implementation.

[+] d0mine|14 years ago|reply

To avoid any magic you could use:

  ifconfig = pbs.Command("/path/to/ifconfig")

[+] jmtulloss|14 years ago|reply

So many people saying this is "too much magic". Whatever, I'm into it. The idea of commands being functions that can just pass their output to other functions is intuitive, and passing arguments as, well, arguments is as well.

It might not be pythonic, it might be a crime against Guido and everything he represents, but it's pretty awesome.

[+] gbog|14 years ago|reply

I have the feeling you misunderstood the problem with magic. In fact, having a function that calls a shell command is OK. The problem is knowing, from reading the code, where it comes from. In Python there is this idea of transparency, of determinism, which is why it feels enlightening, like in the docs cartoon. Anything obscuring that is said to be unpythonic.

[+] scott_s|14 years ago|reply

I use Python for shell scripting a lot. Ignoring all of the issues people have brought up here, I really like how function composition is piping:

  # sort this directory by biggest file
  print sort(du("*", "-sb"), "-rn")

  # print the number of folders and files in /etc
  print wc(ls("/etc", "-1"), "-l")

The reason that I like that method over, say, envoy's [1] method is that envoy.run('uptime | pbcopy') has what I consider code in strings. When I'm writing a script, the programs I'm calling, and how they interact, is part of the "code" to me. I would prefer that they're at the language level, and not represented as strings.

[1] I only just learned about envoy in this thread. Thanks! Sadly, one of the places where I run my Python scripts is a location where I can't install my own packages, and I don't want to deal with using my own install of Python, so I tend to just implement something like this:

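  import sys
  from subprocess import Popen, PIPE

  def checked_exec(seq):
    # seq is an argument list, e.g. ['ls', '-l']
    p = Popen(seq, stdout=PIPE, stderr=PIPE)
    stdout, stderr = p.communicate()
    if p.returncode != 0:
        print 'err: ' + stderr
        print "'" + ' '.join(seq) + "' failed."
        sys.exit(1)  # exit with a non-zero status on failure
    return stdout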

[+] ulope|14 years ago|reply

As long as the package you need doesn't include a C extension (which most don't), you can just ship it with your code (license permitting, ofc.) - just add the path to the library to sys.path. It's not a very clean solution but can be a real life saver when you have to work on "broken" systems.
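A small sketch of that sys.path trick for Python 3. The `add_vendor_dir` helper and the `vendor` directory name are made up for this example; any directory that holds the bundled pure-Python package would work the same way.

```python
import os
import sys


def add_vendor_dir(base_dir, name="vendor"):
    """Put base_dir/name at the front of sys.path so bundled packages are found."""
    vendor = os.path.join(os.path.abspath(base_dir), name)
    if vendor not in sys.path:
        sys.path.insert(0, vendor)
    return vendor


# Typical use at the top of a script (directory layout is an assumption):
#   add_vendor_dir(os.path.dirname(os.path.abspath(__file__)))
#   import envoy   # now resolved from ./vendor/ next to the script
```

Inserting at the front of the list means the bundled copy shadows any system-wide install of the same package, which is usually what you want on a machine you cannot modify.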

[+] kenneth_reitz|14 years ago|reply

This module is a huge hack. It's a neat hack, though :)

I've written what I feel is a much better solution to this problem: Envoy.

https://github.com/kennethreitz/envoy

It's pythonic and makes far fewer assumptions about both your code and what you're running.

[+] mafro|14 years ago|reply

+1 for Envoy. It's a really nice replacement for subprocess (which is awful).

Clint and Requests are worth a look too.. Thanks Kenn!

[+] j2labs|14 years ago|reply

I like Kenneth Reitz's envoy too. He describes it as "Python Subprocesses for Humans™", similar to how he describes Requests, which he also wrote.

https://github.com/kennethreitz/envoy

[+] bryze|14 years ago|reply

Can we please stop titling postings like this? How about "An Alternative to the Python Subprocess Interface". Fixing something implies that it's broken or inadequate, and from my limited experience, Subprocess is already a major improvement over os.system. I'm not saying this idea doesn't have value, I'm saying that the way it's framed lacks the humility it ought to have.

[+] ggchappell|14 years ago|reply

This is a cute little hack, and quite possibly a very useful one. (And if it isn't useful, then it is certainly interesting and instructive.)

However:

Please, please, please don't use or recommend things like "from pbs import * ". Namespace pollution is bad enough when importing a documented collection of functions. Importing functions that are named based on whatever happens to be in my path at the moment ... that's seriously scary.

But "import pbs" looks like fun. :-) And "pbs.ls" isn't that much to type.

As The Zen of Python says:

] Namespaces are one honking great idea -- let's do more of those!

P.S. Hmmm ... but does "import pbs" work? Haven't tried it.

[+] ulope|14 years ago|reply

Wow this is so frighteningly magical and will break in many entertaining ways.

For a sane alternative I'd recommend Kenneth Reitz' awesome Envoy (https://github.com/kennethreitz/envoy)

[+] trun|14 years ago|reply

Reminds me of something I saw not too long ago...

https://github.com/JulienPalard/Pipe

It would be really cool (though admittedly less Pythonic) to combine the infix notation provided by the Pipe library to allow more shell-like function chaining.

Instead of this...

  print wc(ls("/etc", "-1"), "-l")

You would have this...

  print ls("/etc", "-1") | wc("-l")
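One way the infix form could be sketched in Python 3 is by overloading `|` on a lazy command object. The `Cmd` class below is hypothetical and built on the standard `subprocess` module; it is neither the Pipe library's API nor anything pbs provides.

```python
import subprocess


class Cmd:
    """A command (or pipeline) that is not run until .run() is called."""

    def __init__(self, *argv):
        self.stages = [list(argv)]

    def __or__(self, other):
        # a | b builds a new pipeline holding the stages of both sides.
        piped = Cmd()
        piped.stages = self.stages + other.stages
        return piped

    def run(self):
        data = None
        for argv in self.stages:
            # Feed the previous stage's stdout to this stage's stdin.
            data = subprocess.run(
                argv, input=data, stdout=subprocess.PIPE, check=True
            ).stdout
        return data.decode()


# Number of entries in /etc, shell-style.
print((Cmd("ls", "-1", "/etc") | Cmd("wc", "-l")).run())
```

The commands have to be lazy for this to work: if `ls(...)` ran immediately and returned a string, `|` would have nothing to hook into. This sketch also buffers each stage's output in memory instead of streaming it the way a real shell pipe does.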

[+] daenz|14 years ago|reply

There's a few ideas floating around here https://github.com/amoffat/pbs/issues/6 on how to implement it, but nothing looks really feasible. If you have any insights, I welcome them :)

[+] stephen_mcd|14 years ago|reply

The value/cost of a "hack" is offset by what it provides. If something can be implemented at the same cost in a less brittle and more future-proof way, then by all means label it horrible.

If you think "horrible" hacks are a slight against something that provides an amazing level of functionality to an end user, then you've lost sight of what we're coding for.

[+] jneen|14 years ago|reply

Magic!!!!!

I code in Ruby for a living, and this is even a bit much magic for me. I suppose you can use it in magic-less mode with `from pbs import Command`, and set up your command set manually.

[+] unknown|14 years ago|reply

[deleted]

[+] mLewisLogic|14 years ago|reply

If you don't like the import *, don't use it. Python supports it, so why shouldn't this library? (not that it's a great idea)

If you'd like to handle missing system executables, catch the exception. You should be writing in that style anyways.
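For instance, with the standard `subprocess` module in Python 3 a missing executable shows up as `FileNotFoundError`. The thread does not show which exception pbs itself raises, so this sketch sticks to the standard library; `run_or_none` is a made-up helper name.

```python
import subprocess


def run_or_none(*argv):
    """Return the command's stdout, or None if the program is not installed."""
    try:
        done = subprocess.run(
            list(argv), capture_output=True, text=True, check=True
        )
    except FileNotFoundError:
        # Raised by subprocess when the executable cannot be found.
        return None
    return done.stdout


if run_or_none("definitely-not-a-real-command") is None:
    print("command not found; falling back")
```

A command that exists but exits with a non-zero status still raises `subprocess.CalledProcessError` here, so the two failure modes stay distinguishable.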

Honestly, this cleans up a ton of system scripting code, making it way more readable/maintainable.

Maybe the code could be cleaned up, but this is the direction that Python should be heading. Abstract away the complications when possible, keep low-level stuff around for when it's absolutely needed.

Beautiful is better than ugly.

[+] cobbal|14 years ago|reply

It states that the lines

  curl("http://duckduckgo.com/", "-o page.html", "--silent")
  curl("http://duckduckgo.com/", "-o", "page.html", "--silent")

are equivalent. This worries me, I would much rather always have it be one argument corresponding to exactly one shell argument. Here it looks like in some (maybe all?) cases arguments are split on spaces, which means always having to be extremely cautious about escaping, something a good abstraction shouldn't force you to deal with.

[+] mgedmin|14 years ago|reply

Exactly. If you look at the sources, you'll see that the arguments are all joined into a single string with spaces, then split back into separate words using shlex.split().

So cat("filename with spaces in it") will fail, but cat("'filename with spaces in it'") ought to succeed.
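The effect is easy to reproduce with the standard library alone. The `join_then_split` helper below only mimics the join-then-split round trip as described in this comment; it is not copied from the pbs source.

```python
import shlex


def join_then_split(*args):
    """Join the arguments with spaces, then re-split them with shlex.split()."""
    return shlex.split(" ".join(args))


print(join_then_split("filename with spaces in it"))
# ['filename', 'with', 'spaces', 'in', 'it']

print(join_then_split("'filename with spaces in it'"))
# ['filename with spaces in it']

print(join_then_split("http://duckduckgo.com/", "-o page.html", "--silent"))
# ['http://duckduckgo.com/', '-o', 'page.html', '--silent']
```

Under such a scheme, wrapping each argument in `shlex.quote()` before passing it would keep it as a single word, but that is exactly the kind of manual escaping the parent comment objects to.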

It's a neat experiment, but using this module in production would not be a great idea.

[+] VMG|14 years ago|reply

Very impressive, but the use of globals and the dynamic lookup mechanism are a little scary. Looking at the source there seems to be some magic involved like hacking the interpreter.

I'd feel more comfortable if it only exported one variable.

[+] ishi|14 years ago|reply

That's pretty brilliant, well done.

[+] bwarp|14 years ago|reply

This is basically a dynamic DSL. A confusing muddle of syntax and semantics. Will result in pain. Usually when least expected.

[+] y4m4|14 years ago|reply

https://github.com/Harshavardhana/pbs - Refactored the code to look more like a Python library; still trying to fix the command line import problem.

[+] simonw|14 years ago|reply

This looks great, but I really don't like the fact that I can't fire up a Python shell and try it out interactively. Having to run pbs.py itself to get a different kind of shell is uncomfortable.

[+] daenz|14 years ago|reply

This is fixed on master as of version 0.4, just fyi. It has some limitations (no star import) but otherwise works as expected.

61 comments