One other valid problem I've heard raised about git-based deploys is that you can end up with cruft in your working copy: stale .pyc files can stick around after the original .py has been deleted, and there's a chance such a file could still be imported even though its source is gone.
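For what it's worth, that cruft is easy to detect before it bites. A minimal sketch (the function name is mine, and it assumes old-style .pyc files sitting next to their .py sources rather than Python 3's __pycache__ layout):

```python
import os

def orphaned_pycs(root):
    """Return paths of .pyc files whose companion .py has been deleted."""
    orphans = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            # "spam.pyc"[:-1] == "spam.py"
            if name.endswith(".pyc") and name[:-1] not in filenames:
                orphans.append(os.path.join(dirpath, name))
    return orphans
```

Running this after a pull (and deleting whatever it returns) closes the stale-import window without touching bytecode that still has a source.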
I'm shocked. If a .py file is modified, the .pyc is rebuilt, so we know Python is poking at the original source file. Why not fail to import if the original source is missing?
Maybe this behavior is meant to support binary-only distribution of Python applications, but there really should be an option to override it.
It is possible to have a file called "spam.pyc" without a module "spam.py" for the same module. This can be used to distribute a library of Python code in a form that is moderately hard to reverse engineer.
Removing existing pyc files when deploying sounds sensible, but I'm not sure if -B is really a good idea in this case, as each new worker process will have to parse the code again instead of just using the bytecode.
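Both halves of that trade-off can be kept: -B (or PYTHONDONTWRITEBYTECODE) only stops Python from writing bytecode, existing files are still read, so a deploy step that clears the old .pyc files and then precompiles once leaves every worker with fresh bytecode and no per-process parse cost. A rough sketch of such a step (the function name is mine):

```python
import compileall
import os

def refresh_bytecode(root):
    """Delete every .pyc under root, then compile fresh bytecode once."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                os.remove(os.path.join(dirpath, name))
    # force=True recompiles even when timestamps make files look current
    compileall.compile_dir(root, force=True, quiet=1)
```

Run it once per deploy, after the pull and before restarting workers.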
I've experienced this before. Not that .pyc/.pyo files ever get into my repo, but sometimes Python will decide it doesn't need to recompile the files and I get old behavior from a piece of software I know I've just re-deployed.
My solution? Simple: I've got a Fabric command that deletes all .pyc files in the repo after pulling.
" Packages are, of course, a legitimate way to push out changes but the downside of deploying with packages is that it takes time to build them and upload them"
Hopefully your CI server is building them. Packages (whether native or otherwise) come in quite handy for loads of reasons. Native package managers are mature and have fantastic tools available.
I find it interesting that this comment has been given some points, and it's actually the draft of the comment I was trying to write on my phone, but gave up on (Hacker News doesn't work very well on my phone). Here's what I meant to say:
Packages do take time to build, but your CI server should be doing that. For most kinds of deployments, you can rely on unix's copy-on-write filesystem by installing the package and then restarting your process manager (i.e. supervisor or apache). This means your program is only down between restarts (assuming you have no migrations to apply). The outage due to a deployment is then typically a couple of seconds, assuming your program is quick to start up. This is a sufficiently short downtime period for many situations.
There are lots of additional benefits to using packages, especially native ones like dpkg/deb. You can add specific dependencies to the package (for example, python 2.7), which will be checked at install time. Multi-language programs are often handled better. The program is easily uninstallable. If your CI server is building the package, you can download it for manual testing (for example, for UAT).
> Why anyone would want to write a billion line init script now that upstart exists is beyond me.

Perhaps they don't know about upstart. It could also be that they are stuck on CentOS or RedHat. My heart goes out to you if that's the case. I know how that feels.
RHEL/CentOS 6 uses Upstart (and the next major version will use systemd).
I use Monit to start my webapps - it is cross platform (none of that systemd vs upstart business) and I get a lot of tools (admin panels, etc.) to use with it.
The con is that I don't get a lot of advanced dependency management in the case of worker threads, but unless you go to specialized tools like God or Bluepill, it is good enough.
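For reference, the kind of Monit stanza this amounts to looks roughly like the following — every name, path, and port here is illustrative:

```
# /etc/monit/conf.d/myapp  (hypothetical app and paths)
check process myapp with pidfile /var/run/myapp.pid
  start program = "/etc/init.d/myapp start"
  stop program  = "/etc/init.d/myapp stop"
  if failed port 8080 protocol http then restart
```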
> we got sick of waiting for the packages to build and upload
I don't get it. What did they do that took longer than git commit / git push? It's not like you're compiling anything in Python, and when you do have deps to compile, they would take that time on deployment anyway.
Been doing this for a while; it came with another bonus: I no longer have a single FTP server running anywhere on my servers, using codebasehq.com as my repo / deployment source ... works extremely well.
The pattern that I've seen fairly often (and use myself) is to git pull into a "working copy" on the deployment target, but do a git archive/symlink to actually build the copy that's served (before doing the service restart).
That gives you a more atomic process and several points to back out gracefully if something goes wrong, without having to worry about *.pyc files or any similar cruft.
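The final symlink flip in that pattern can be made genuinely atomic: build the new link under a temporary name and rename() it over the old one, since rename is atomic on POSIX. A sketch (the function name and path layout are invented):

```python
import os

def switch_current(current_link, new_release):
    """Repoint `current_link` at `new_release` without a gap.

    The symlink is created under a temporary name and then renamed
    over the old link; readers always see either the old release or
    the new one, never a missing link.
    """
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_release, tmp)
    os.rename(tmp, current_link)
```

Rolling back is then just another switch_current() call pointing at the previous release directory.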
maybird, following up on the .pyc behavior above: Did a bit more research and found this:

(1) http://docs.python.org/release/1.5.1p1/tut/node43.html

Makes sense.

(2) http://docs.python.org/using/cmdline.html#cmdoption-B

So you can mitigate that behavior by removing existing pyc files and using "-B".
dguaraglia's Fabric command for that cleanup (with the import it needs):

    from fabric.api import run  # Fabric 1.x

    def clean_pyc():
        run('find . -name "*.pyc" -exec rm {} \;')

And that's that :)
bmelton, on the upstart point above: 2) What do you do if you have 1,000 servers and your upstart script has to change? Fabric is nice to have in situations like that.