Debugging Python Like a Boss

[+] haberman|12 years ago|reply

Debuggers are cool and often necessary, but I disagree with this often-expressed sentiment that print-debugging is a primitive hack for people who don't know any better.

Debugging is determining the point at which the program's expected behavior diverges from its actual behavior. You often don't know where or when this is happening. Print-debugging can give you a transcript of the program's execution, which you can look at to home in on the moment where things go wrong. Stepping through a program's execution line-by-line and checking your assumptions can be a lot slower in some cases. And most debuggers can't go backwards, so if you miss the critical moment you have to start all over again.

These are two tools in the toolbox; using print debugging does not mean you are not "a boss."

[+] dlau1|12 years ago|reply

Surprised there's no mention of JetBrains PyCharm. It's incredibly easy to use for debugging, and it even supports debugging gevent-based code.

I agree though, outputting print statements at different levels of severity, i.e. warning, error, info, etc. is a great way to see if things are working on a high level. For more granularity, debuggers are invaluable to figure out why a specific portion of code isn't working as expected.
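A minimal sketch of the severity-level idea, using the standard `logging` module. The logger name `debug_demo` and the messages are made up for illustration; the point is only that the same calls stay in the code while the threshold decides what gets printed.

```python
import io
import logging


def make_logger(stream, level=logging.WARNING):
    """Build a logger that writes '<LEVEL> <message>' lines to `stream`."""
    logger = logging.getLogger("debug_demo")  # hypothetical logger name
    logger.handlers.clear()                   # avoid duplicate handlers on re-run
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


buf = io.StringIO()
log = make_logger(buf)
log.info("cache warmed")             # suppressed: INFO is below WARNING
log.warning("retrying request")      # emitted
log.setLevel(logging.DEBUG)          # raise the verbosity at run time
log.debug("payload=%r", {"id": 7})   # emitted now
output = buf.getvalue()
```

In a real program the stream would normally be stderr or a file; the in-memory buffer here just makes the result easy to inspect.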

[+] epenn|12 years ago|reply

To expand on your point, I look at print statements in the same light as goto statements. There is a time and place for both, just make sure it's the best tool to accomplish your goal. In C I often use gotos for error handling, but I wouldn't use them in situations where higher-level branching constructs are more suitable. Similarly, sometimes you don't need the features of a heavy-weight debugger but just want to check some output. Print statements are great for this.

[+] ufo|12 years ago|reply

One neat trick that I really like for printf debugging is using conditional breakpoints and putting the call to the printing function inside the breakpoint condition. This lets you add print statements without editing the original code and makes it very easy to toggle them on and off.
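For pdb, the trick could look roughly like this. The helper name `trace`, the file name, and the line number are all placeholders; the only thing that matters is that the breakpoint condition has a printing side effect and evaluates to something falsy, so pdb never actually pauses.

```python
def trace(*values):
    """Print the values and return False so a breakpoint condition never pauses."""
    print("TRACE:", *values)
    return False


# Hypothetical pdb session (myscript.py and line 42 are placeholders):
#
#   (Pdb) break myscript.py:42, trace(total, i)
#   (Pdb) continue
#
# pdb evaluates the condition in the frame at that line every time it is
# reached, so `trace` has to be visible from there. In Python 3 the built-in
# works too, because print() returns None, which is falsy:
#
#   (Pdb) break myscript.py:42, print(total, i)
#
# `disable 1` and `enable 1` then switch the "print statement" off and on.
result = trace("total", 3)
```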

[+] scott_s|12 years ago|reply

I wrote an essay about this very thing: http://www.scott-a-s.com/traces-vs-snapshots/

[+] Symmetry|12 years ago|reply

I tend to find myself relying on debuggers for tracking down bugs in other people's code where I might not have a good grasp of the big picture, but relying more on print statements for debugging my own code.

[+] coolsunglasses|12 years ago|reply

There are better ways to do this style of debugging:

https://github.com/clojure/tools.trace

https://github.com/coventry/Troncle (I can use this from nRepl'ing into production servers)

[+] evanspa|12 years ago|reply

I agree. That, coupled with keeping your functions small (doing 1 thing) and tight, and being disciplined about writing unit tests, can go a LONG way when it comes to debugging with simple print statements. Normally I'm able to home in on bugs with a few strategically placed prints and a re-run of the unit tests. Poring over the stdout log is usually trivial with an incremental search and print statements prefixed with some known unique chars.
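As a rough illustration of the unique-prefix habit: the marker `@@DBG` and the helper names below are invented for this sketch, and the `grep` function only stands in for an editor's incremental search or the real `grep`.

```python
import io

PREFIX = "@@DBG"  # hypothetical marker, chosen to be unlikely in normal output


def dbg(label, value, out):
    """Write one debug line that is easy to search for later."""
    out.write("%s %s=%r\n" % (PREFIX, label, value))


def grep(text, needle):
    """Return only the lines of `text` that contain `needle`."""
    return [line for line in text.splitlines() if needle in line]


out = io.StringIO()
for i, item in enumerate(["a", "", "c"]):
    out.write("processing item %d\n" % i)   # ordinary program output
    if not item:
        dbg("empty_item_index", i, out)     # the suspicious case

matches = grep(out.getvalue(), PREFIX)
```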

[+] henrik_w|12 years ago|reply

I find logging/tracing (that can be enabled/disabled at run time) to be very valuable for debugging, both during development and in production. I blogged about it here: http://henrikwarne.com/2013/05/05/great-programmers-write-de...

[+] vram22|12 years ago|reply

>Debuggers are cool and often necessary, but I disagree with this often-expressed sentiment that print-debugging is a primitive hack for people who don't know any better.

Yes.

>Stepping through a program's execution line-by-line and checking your assumptions can be a lot slower in some cases.

Yes again, particularly when the code is in a loop. Whereas, if you use print-debugging, even though the output will be repeated as many times as the loop runs, you at least have the possibility of grepping to filter it down to relevant (e.g. erroneous/unexpected) output.

Here's a simple Python debugging function that may be useful:

http://jugad2.blogspot.in/2013/10/a-simple-python-debugging-...

[+] hcarvalhoalves|12 years ago|reply

I only use "print debugging" (using the logging facilities more often than not) if it's something I can leave in the codebase, like logging a function call and its parameters, or when a routine is being skipped; then a debugger if I want to check the interface or docstring of some object or retry a call with different parameters on the REPL.
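One way to log a function call and its parameters in a form that can stay in the codebase is a small decorator. This is only a sketch: `log_call`, the logger name `call_log`, and the example function `scale` are assumptions, not anything from the comment above.

```python
import functools
import io
import logging

logger = logging.getLogger("call_log")  # hypothetical logger name


def log_call(func):
    """Log each call to `func` with its arguments and return value at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        logger.debug("return %s -> %r", func.__name__, result)
        return result
    return wrapper


@log_call
def scale(value, factor=2):
    return value * factor


# Demo: route the log to a buffer and enable DEBUG so the calls show up.
buf = io.StringIO()
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.DEBUG)
logger.propagate = False
result = scale(3, factor=4)
```

With the level left at its default, the decorator stays silent, which is what makes it safe to leave in place.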

[+] ChikkaChiChi|12 years ago|reply

Print statements allow me to not only create my own breakpoints, but to add additional conditions, timers, resource monitors, etc. to see how my program is actually performing.
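A timer of the kind mentioned here can be sketched as a context manager that prints how long a block took. The name `timed` and the `report` parameter are illustrative choices.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, report=print):
    """Report the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        report("%s took %.6f s" % (label, elapsed))


# Demo: collect the message in a list instead of printing it.
messages = []
with timed("sum of squares", report=messages.append):
    total = sum(n * n for n in range(1000))
```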

[+] hcarvalhoalves|12 years ago|reply

A good list of libraries, but please, don't use this in the middle of your code to set a break point:

    import pdb; pdb.set_trace();

There's a chance you forget this, check it in, and it ends up in production. Use pdb facilities instead:

    $ python -m pdb <myscript>

Then set a breakpoint and continue:

    (Pdb) break <filename.py>:<line>
    (Pdb) c

This is trivial to automate from any editor or command line, so you don't even have to guess the path to the file.

EDIT: For the lazy, here's a script to set breakpoints from the command line and run your scripts:

https://gist.github.com/hcarvalhoalves/7587621

[+] lewaldman|12 years ago|reply

Here is my setup:

- I have a ~/.python/sitecustomize.py file with the following:

    # Start debug on Exception
    import bdb
    import sys

    def info(type, value, tb):
        if hasattr(sys, 'ps1') \
              or not sys.stdin.isatty() \
              or not sys.stdout.isatty() \
              or not sys.stderr.isatty() \
              or issubclass(type, bdb.BdbQuit) \
              or issubclass(type, SyntaxError):
            # we are in interactive mode or we don't have a tty-like
            # device, so we call the default hook
            sys.__excepthook__(type, value, tb)
        else:
            import traceback, ipdb
            # we are NOT in interactive mode, print the exception...
            traceback.print_exception(type, value, tb)
            print("")
            # ...then start the debugger in post-mortem mode.
            ipdb.pm()

    sys.excepthook = info

It will start ipdb automatically whenever an exception is raised in a script run from the command line.

- I use the awesome pdb emacs package for debugging interactively during bigger bug hunts (and for dev too... it's a very nice tool)

- Buutt... I still find the "print dance" to be my first-to-use quick tool.

edit: Fixed pasted code

[+] cjg_|12 years ago|reply

If you miss a line like that, which is very easy to spot in a diff, what other things end up in your production environment?

And obviously any test covering that line will fail/hang.

[+] gknoy|12 years ago|reply

We avoid that by having a build step fail if 'import pdb' exists in our codebase. You could do similar for any of these tools. This will then lead to build failures in one's automated build system, and flag pull requests as not-yet-ready to be merged with our master branch.
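A build check along these lines could be sketched as follows. This is not the commenter's actual build step; the function names and the regular expression are assumptions, and the pattern is deliberately simple (it would also flag matching text inside comments or strings).

```python
import os
import re

# Matches "import pdb", "import ipdb", "from pdb import ..." and "pdb.set_trace(".
PATTERN = re.compile(
    r"\bimport\s+i?pdb\b|\bfrom\s+i?pdb\s+import\b|\bi?pdb\.set_trace\s*\("
)


def find_debugger_calls(root):
    """Return (path, line_number, line) for every suspicious line under `root`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if PATTERN.search(line):
                        hits.append((path, number, line.strip()))
    return hits


def main(root="."):
    """Print every hit and return a non-zero status if any were found."""
    hits = find_debugger_calls(root)
    for path, number, line in hits:
        print("%s:%d: %s" % (path, number, line))
    return 1 if hits else 0
```

A build script could then pass the return value of `main("src")` to `sys.exit`, so that any leftover breakpoint fails the build.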

If one doesn't have an automated test process, then I suspect one has bigger potential errors that could sneak in than an errant pdb breakpoint. I'll just assume that your release / merge process DOES include a test suite that you can add this sort of test to.

[+] jaytaylor|12 years ago|reply

This is great advice, I didn't know there was a better way than the scary import/set_trace() method. Thanks for sharing!

[+] bqe|12 years ago|reply

I've used pdb.set_trace() before when I had a series of complex breakpoints. I kept them in my git stash.

Perhaps I should add a pre-commit hook to grep for pdb.set_trace() and reject commits with that in them.

http://stackoverflow.com/a/10840525/2151949

[+] hcarvalhoalves|12 years ago|reply

Since I claimed it was trivial to automate that from an editor, here's a plugin for setting up breakpoints on ST2:

https://github.com/hcarvalhoalves/sublime-pdburger

[+] wodow|12 years ago|reply

This is only convenient in cases where

(a) the breakpoint line doesn't move around a lot between different executions, as you edit the code;

(b) you don't want to programmatically invoke the debugger (i.e. if f(x): pdb.set_trace() )

[+] codygman|12 years ago|reply

How would I do that with something like django?

[+] brian_cooksey|12 years ago|reply

Thanks for the tip!

[+] rtpg|12 years ago|reply

The most frustrating thing (experienced in both Javascript and Python) is the "oh uncaught exception? let me just quit everything" model. Most of the time, if I were just given an interactive prompt right then, I could spend 1 minute looking at local variables, maybe get a special stack trace variable to look at that, then be over with it.

Instead I have to stick in some print statements and start everything over again.

[+] davmre|12 years ago|reply

There's a nice trick to enable this behavior for standard Python code run at the command line. Write the body of your code inside a main() function, then call it using the following toplevel block:

  if __name__ == "__main__":
      try:
          main()
      except KeyboardInterrupt: # allow ctrl-C
          raise
      except Exception as e:
          import sys, traceback, pdb
          print(e)
          type, value, tb = sys.exc_info()
          traceback.print_exc()
          pdb.post_mortem(tb)

This will catch any exceptions and throw you into PDB in the context where the exception was raised. You probably don't want to leave it in production code, but it's super useful for development.

[+] Widdershin|12 years ago|reply

Flask makes this really nice. When in debug mode, if an exception happens, you get an interactive stack trace, and you can easily jump into a console at each level.

[+] ehsanu1|12 years ago|reply

Chrome has pause on exceptions (look under the Sources tab). Firefox/Firebug might too.

[+] edwinnathaniel|12 years ago|reply

I've been using Python for a while for fun and Ruby (Rails) on and off.

I've always found it interesting how the Python/Ruby community debugs its code, both during development (coding or writing unit tests) and perhaps in production as well (for the record, I use "print" as my debugging tool).

I'm a long-time Eclipse user who recently converted to IntelliJ (almost a year ago), and the debugger that comes with these editors is something I can't live without. Hence I have not moved to Emacs or VI(m) or whatever the UNIX hackers use, because it would cripple my productivity significantly (or so I thought; feel free to criticize my point of view).

So sometimes I wonder how productive Python/Ruby + VIM/Emacs users are. Just an honest question, really.

PS: most Java IDE debuggers can do local AND remote debugging AND have some support for hot-swapping code.

[+] gknoy|12 years ago|reply

I've used "println debugging" more than I have used an IDE's debugger, and am more comfortable with the former. I think it revolves around a different way of using them, and is likely very heavily influenced by having spent a lot of time developing with a REPL handy.

When I did mostly Java coding, I would tend to use println debugging rather than dive into the IDE's debugger, as I tended to be able to zero in more easily on what was going on when I took a holistic "Let's print out each item's id and name ..." approach to start.

Now that I do most of my code in Python, I use the interactive debugger almost exclusively.

With an IDE, I can look at variables' contents. What do I do when I want to check the result of a method call, though? It is likely tool unfamiliarity, but it's never been clear how to check that as something to inspect. (If you know how to do this, then you're a much more savvy user of the IDE's debugging tools than I am.) If I don't have the right breakpoint, or the right questions, I often glean little.

Println debugging is an easy way to see that you're looping incorrectly, or that All Your Data is bad in a way you didn't expect.

With a REPL-style debugger, I can treat it as an interactive question/answer session that lets me check things like contents of the database, or the values that helper methods return:

  > print len(foo.items)
  0
  # why??? Maybe the objects don't exist?
  > print Foo.objects.filter(bar=42)
  [foo1, foo2, foo3]
  # Let me place a new breakpoint then in 
  # my JsFoo.from_db_item() method, and try again ....

I think the nicest thing about an interactive debugger is that it lets me construct arbitrary expressions -- print lists of things, look at nested data, etc -- in the language I am already developing in. I always had a hard time doing something quite as powerful in Eclipse.

[+] cgh|12 years ago|reply

Yes, the ability to connect to and debug a remote, running JVM is a killer feature. And it is well supported by IntelliJ and Eclipse. But it's a function of the JVM and not the debuggers themselves - if the Python runtime offered remote debugging then I'm sure Python debuggers would support it too.

[+] tejaswiy|12 years ago|reply

I was about to bring up the exact same thing. I see some positives in the text-editor-not-IDE approach that Linux / Unix folks take, but I don't find this approach so amazing that programmers need to trade IntelliSense / autocomplete / in-IDE debugging / refactoring features for it.

[+] dima55|12 years ago|reply

The emacs debugger infrastructure (gud) is pretty great, and is common for a multitude of languages. Add on top of this other fancy pants emacs features (for instance remote anything, including debugging via TRAMP) and emacs users do better than most.

[+] freakboy3742|12 years ago|reply

One feature that all these tools share is that they're console based. This is nice, but there's a reason we all use graphical environments for our daily computing -- rich graphical user interfaces are a powerful tool for visualising complex data. However, you don't have to adopt a full IDE to get a graphical UI. Bugjar (http://pybee.org/bugjar) is a graphical debugger -- not an IDE, just a debugger. It uses Tkinter, so it's cross platform, and can be installed using "pip install bugjar".

It's an early stage project, but hopefully demonstrates that there is a middle ground between "Everything in an 80x25 text window" and "500lb IDE gorilla".

(Full disclosure: I'm the core developer of Bugjar)

[+] smortaz|12 years ago|reply

very nice! plug: if you happen to be on windows, try PTVS which has nice features like mixed-mode Python/C++ debugging as well as cross debugging from Visual Studio <-> linux & MacOS. (it's a free plug-in).

http://www.youtube.com/watch?v=wvJaKQ94lBY

[+] corysama|12 years ago|reply

Although VStudio Express does not support plugins, you can also use PTVS in combination with the free "VS2013 Shell" https://pytools.codeplex.com/wikipage?title=PTVS%20Installat...

[+] makmanalp|12 years ago|reply

Most notably, pydbgr has out of process debugging, so you can attach to a server process and diagnose a deadlock, for example.

[+] zmmmmm|12 years ago|reply

Thanks! This is what I miss most from debugging on the JVM stack. I am rarely in control of / responsible for starting the processes I want to debug. The JVM's ability to simply attach and set a breakpoint to jump in in real time, even on a remote server, when something is going wrong is a complete lifesaver.

[+] siliconc0w|12 years ago|reply

What is also useful is the various web frameworks' support for debugging in real time. If you haven't worked on a web application that lets you just type code in when it throws a 500, I highly recommend it.

Also what I like to do with ipdb is set a debug point and just write new functionality in real time. Most good programmers probably do this in their heads but having a computer do it for you is the next best thing. You catch bugs almost immediately (hey this variable isn't supposed to be empty!). It feels pretty cool to send a request to a web app and just pound out the code to make it respond correctly before the browser gives up on the HTTP connection.

[+] dmd|12 years ago|reply

One thing I love about much-maligned Tcl is its ability to connect a repl to a running program.

Not just remote debugging of a halted program - you can actually inspect and alter variables and issue commands into the executing code.

Is there a way to do that in, say, Python?

[+] defrex|12 years ago|reply

This article is missing pdb++[1], which I would argue is much less buggy than ipdb while having essentially all the same features.

[1] https://pypi.python.org/pypi/pdbpp/

[+] memracom|12 years ago|reply

Print debugging is faster. Most people who use this technique also don't make so many mistakes because they take the time to review their code, not to mention writing unit tests.

Their debug cycle is a) notice something is not quite right. b) insert some print statements in the code that they just changed. c) run it again. d) look at the code and the print statements to see where they made a wrong assumption, fix it and move on.

No need for figuring out where to set a break or single stepping through too much tangle.

[+] benzesandbetter|12 years ago|reply

Really... so using a debugger, um that's pretty pro-level.

I thought there would actually be some innovative techniques in this article.

Perhaps a better title would be, "A review of python debuggers".

[+] softworks|12 years ago|reply

TDD + print() == "debugging like a BOSS". :)

In all honesty I rarely feel the need for anything more. Generally I don't even need the print() because I stick to small self contained functions.

The exception is when I'm using a poorly documented or new to me open source library. I guess at times like that a debugger may be useful. So next time I run into such a situation I'll try out a debugger.

But I can't see much of a reason for it in my own code.

[+] cbarion|12 years ago|reply

Here's another library to help debugging: https://pypi.python.org/pypi/globexc. It tells the Python interpreter to write a detailed trace file (including contents of variables) if there is an unhandled exception. It's less powerful than a proper debugger but the trace file is always there after a crash which can be very convenient.
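The general idea can be sketched with the standard library alone. To be clear, this is not how globexc itself is implemented; it is an assumed, simplified version built on `sys.excepthook`, and the file name `crash_trace.txt` and both function names are made up.

```python
import sys
import traceback


def format_crash_report(exc_type, exc_value, tb):
    """Return the usual traceback followed by the local variables of each frame."""
    lines = traceback.format_exception(exc_type, exc_value, tb)
    lines.append("\nLocals by frame (innermost last):\n")
    while tb is not None:
        frame = tb.tb_frame
        code = frame.f_code
        lines.append("  %s:%d in %s\n" % (code.co_filename, tb.tb_lineno, code.co_name))
        for name, value in sorted(frame.f_locals.items()):
            try:
                shown = repr(value)
            except Exception:            # a broken __repr__ must not hide the crash
                shown = "<unrepresentable>"
            lines.append("    %s = %s\n" % (name, shown))
        tb = tb.tb_next
    return "".join(lines)


def install(path="crash_trace.txt"):     # hypothetical file name
    """Write a report to `path` on any unhandled exception, then print as usual."""
    def hook(exc_type, exc_value, tb):
        with open(path, "w") as handle:
            handle.write(format_crash_report(exc_type, exc_value, tb))
        sys.__excepthook__(exc_type, exc_value, tb)
    sys.excepthook = hook
```

Calling `install()` once at program start is enough; after a crash the file holds the variable values that a plain traceback would not show.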

[+] gbog|12 years ago|reply

I would say that real boss debugging starts with finding a way to first turn the problem into a unit test.

Once the weird behavior is covered, usually print debugging is even better than pdb because it will encourage you to extend the test suite.

What I would like, though, is a special monkey patch in Python that would allow me to write a print like "p this that" and have it pretty-print this and that on stdout.
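Something close to this is possible with `pprint` and the `builtins` module, with the caveat that Python 3 syntax still requires parentheses: `p(this, that)` rather than `p this that`. The sketch below is one assumed way to do it; the `stream` parameter is added only to make the demo checkable.

```python
import builtins
import io
import pprint
import sys


def p(*values, stream=None):
    """Pretty-print each value on its own line(s), to stdout by default."""
    out = sys.stdout if stream is None else stream
    for value in values:
        pprint.pprint(value, stream=out)


# Monkey patch: after this, every module can call p(...) without importing it.
builtins.p = p

# Demo: print into a buffer instead of stdout.
buf = io.StringIO()
p({"b": 1, "a": 2}, [1, 2, 3], stream=buf)
```

Putting the `builtins.p = p` line in a `sitecustomize.py` would make the helper available in every script, which is convenient for debugging but probably not something to ship.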

[+] ruxkor|12 years ago|reply

A very nice list of debuggers, but I'm wondering why there is no mention of the (very good) debugging support you can find in IntelliJ and Eclipse and, above all, why there is no mention of Winpdb [1], a very nice and platform-independent Python debugger with a full-fledged GUI.

[1]: http://winpdb.org/

[+] roycehaynes|12 years ago|reply

First off, great post Cooksey. I'm actually a sublime and pycharm type of guy, where I use pycharm mostly for debugging purposes. I use pdb only if I'm debugging on a machine that's not mine, I need to debug something fairly quickly, or PyCharm isn't available. I definitely need to give ipdb a try.

[+] est|12 years ago|reply

remote debugging with pdb+socket

1. first listen to a socket with nc -l -U 1.sock

2. Add this to your python script.

    import socket, pdb

    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect('1.sock')
    f = s.makefile('rw')  # read/write mode: pdb reads commands from and writes output to f
    pdb.Pdb(stdin=f, stdout=f).set_trace()

    raise wtf

3. now debug in nc. enjoy.

[+] rbanffy|12 years ago|reply

Interestingly, it's a dupe:

https://news.ycombinator.com/item?id=6770412

But I'm happy. When I posted, it failed to generate a discussion. If it were identified as a dupe this time, it would probably be forgotten.

It's a great article.

[+] jamtan|12 years ago|reply

I use epdb because of its ability to open a raw socket at a breakpoint with epdb.serve()

https://bitbucket.org/rpathsync/epdb

98 comments