One other valid problem I've heard raised about git-based deploys is that you can end up with cruft in your working copy: stale .pyc files can stick around after the original .py has been deleted, and there's a chance such a file could still be imported even though its source is gone.
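For what it's worth, that cruft is easy to detect before it bites. A minimal sketch (the function name is mine, and it assumes old-style .pyc files sitting next to their .py sources rather than Python 3's __pycache__ layout):

```python
import os

def orphaned_pycs(root):
    """Return paths of .pyc files whose companion .py has been deleted."""
    orphans = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            # "spam.pyc"[:-1] == "spam.py"
            if name.endswith(".pyc") and name[:-1] not in filenames:
                orphans.append(os.path.join(dirpath, name))
    return orphans
```

Running this after a pull (and deleting whatever it returns) closes the stale-import window without touching bytecode that still has a source.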
I'm shocked. If a .py file is modified, the .pyc is rebuilt, so we know Python is poking at the original source file. Why not fail to import if the original source is missing?
Maybe this behavior is meant to support binary-only distribution of Python applications, but there really should be an option to override it.
It is possible to have a file called "spam.pyc" without a module "spam.py" for the same module. This can be used to distribute a library of Python code in a form that is moderately hard to reverse engineer.
Removing existing pyc files when deploying sounds sensible, but I'm not sure if -B is really a good idea in this case, as each new worker process will have to parse the code again instead of just using the bytecode.
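Both halves of that trade-off can be kept: -B (or PYTHONDONTWRITEBYTECODE) only stops Python from writing bytecode, existing files are still read, so a deploy step that clears the old .pyc files and then precompiles once leaves every worker with fresh bytecode and no per-process parse cost. A rough sketch of such a step (the function name is mine):

```python
import compileall
import os

def refresh_bytecode(root):
    """Delete every .pyc under root, then compile fresh bytecode once."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                os.remove(os.path.join(dirpath, name))
    # force=True recompiles even when timestamps make files look current
    compileall.compile_dir(root, force=True, quiet=1)
```

Run it once per deploy, after the pull and before restarting workers.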
I've experienced this before. Not that .pyc/.pyo files ever get into my repo, but sometimes Python will decide it doesn't need to recompile the files and I get old behavior from a piece of software I know I've just re-deployed.
My solution? Simple: I've got a Fabric command that deletes all .pyc files in the repo after pulling.
" Packages are, of course, a legitimate way to push out changes but the downside of deploying with packages is that it takes time to build them and upload them"
Hopefully your CI server is building them. Packages (whether native or otherwise) come in quite handy for loads of reasons. Native package managers are mature and have fantastic tools available.
I find it interesting that this comment has been given some points, and it's actually the draft of the comment I was trying to write on my phone, but gave up on (Hacker News doesn't work very well on my phone). Here's what I meant to say:
Packages do take time to build, but your CI server should be doing that. For most kinds of deployments, you can rely on unix's copy-on-write filesystem by installing the package and then restarting your process manager (i.e. supervisor or apache). This means your program is only down between restarts (assuming you have no migrations to apply). The outage due to a deployment is then typically a couple of seconds, assuming your program is quick to start up. This is a sufficiently short downtime period for many situations.
There are lots of additional benefits to using packages, especially native ones like dpkg/deb. You can add specific dependencies to the package (for example, python 2.7), which will be checked at install time. Multi-language programs are often handled better. The program is easily uninstallable. If your CI server is building the package, you can download it for manual testing (for example, for UAT).
> Why anyone would want to write a billion line init script now that upstart exists is beyond me.

Perhaps they don't know about upstart. It could also be that they are stuck on CentOS or RedHat. My heart goes out to you if that's the case. I know how that feels.
RHEL/CentOS 6 uses Upstart (and the next major version will use systemd).
I use Monit to start my webapps - it is cross platform (none of that systemd vs upstart business) and I get a lot of tools (admin panels, etc.) to use with it.
The con is that I don't get a lot of advanced dependency management in the case of worker threads, but unless you go to specialized tools like God or Bluepill, it is good enough.
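For reference, the kind of Monit stanza this amounts to looks roughly like the following — every name, path, and port here is illustrative:

```
# /etc/monit/conf.d/myapp  (hypothetical app and paths)
check process myapp with pidfile /var/run/myapp.pid
  start program = "/etc/init.d/myapp start"
  stop program  = "/etc/init.d/myapp stop"
  if failed port 8080 protocol http then restart
```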
> we got sick of waiting for the packages to build and upload
I don't get it. What did they do that took longer than git commit / git push? It's not like you're compiling anything in Python, and when you do have deps to compile, they would take that time on deployment anyway.
Been doing this for a while; it came with another bonus: I no longer have a single FTP server running anywhere on my servers, using codebasehq.com as my repo / deployment source ... works extremely well.
The pattern that I've seen fairly often (and use myself) is to git pull into a "working copy" on the deployment target, but do a git archive/symlink to actually build the copy that's served (before doing the service restart).
That gives you a more atomic process and several points to back out gracefully if something goes wrong, without having to worry about *.pyc files or any similar cruft.
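The final symlink flip in that pattern can be made genuinely atomic: build the new link under a temporary name and rename() it over the old one, since rename is atomic on POSIX. A sketch (the function name and path layout are invented):

```python
import os

def switch_current(current_link, new_release):
    """Repoint `current_link` at `new_release` without a gap.

    The symlink is created under a temporary name and then renamed
    over the old link; readers always see either the old release or
    the new one, never a missing link.
    """
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_release, tmp)
    os.rename(tmp, current_link)
```

Rolling back is then just another switch_current() call pointing at the previous release directory.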
maybird, following up on the .pyc behavior above: Did a bit more research and found this:

(1) http://docs.python.org/release/1.5.1p1/tut/node43.html

Makes sense.

(2) http://docs.python.org/using/cmdline.html#cmdoption-B

So you can mitigate that behavior by removing existing pyc files and using "-B".
dguaraglia's Fabric command for that cleanup (with the import it needs):

    from fabric.api import run  # Fabric 1.x

    def clean_pyc():
        run('find . -name "*.pyc" -exec rm {} \;')

And that's that :)
bmelton, on the upstart point above: 2) What do you do if you have 1,000 servers and your upstart script has to change? Fabric is nice to have in situations like that.