Python Deployment Chronicles: Deploying legacy TurboGears projects with modern tools

At CCNMTL most of our new Python projects are written in Django, but we still support a number of older projects that were written with TurboGears 1.0.4. They've continued to be stable, and we don't do a ton of new development on them, so it hasn't been worthwhile to upgrade them to newer versions of TurboGears.

But we do occasionally make changes to their code, and we've begun migrating them to newer servers.  So I recently spent some time updating their deployment processes to CCNMTL's current best practices:

  • Installation with pip instead of easy_install
  • Fully pinned local source distributions versioned alongside the code
  • No Internet access required anywhere in the deployment
  • Containment with virtualenv

I ended up with a package that you can use to create an isolated TurboGears 1.0.4 environment to run legacy projects in, or (if for some reason you want to) to create new TurboGears 1.0.4 projects.  You can get it on GitHub here: https://github.com/ccnmtl/turbogears_pip_bootstrapper

In this post I'll go into detail about what it does, and the hurdles I ran into along the way.

Like our Django bootstrap process, this is a one-step installation procedure for creating an isolated TurboGears 1.0.4 environment.  Just run bootstrap.py and you'll have a virtualenv in ./ve, with all the packages you need to run or create a TurboGears 1.0.4 project.

(If you're curious what I'm upgrading this from, Anders has a post describing our historical TurboGears bootstrap process.)


Switching from easy_install to pip

From Eggs to Tarballs

Until now, we were using easy_install to install the packages we needed.  Switching to pip is usually straightforward, but we'd been bundling all our dependencies as eggs instead of source tarballs, and pip doesn't support installation from eggs.

So the first step was collecting source tarballs for each of the eggs we were installing into a TurboGears environment.  This was mostly just a matter of downloading the .tar.gz file from PyPI that corresponded to the exact version of the package we were installing from an egg.  Some of them were hard to find or weren't on PyPI; http://files.turbogears.org/eggs/ and http://peak.telecommunity.com/snapshots/ were helpful here too, and for one or two particularly stubborn dependencies I just had to get the right version from SVN and package it in a tarball myself.

To speed up the process, I wrote a script that tries to find and download all the right tarballs, given a directory of eggs.  After a few iterations, that took me most of the way.  I was able to find the rest by hand, and eventually I had all the source distributions I needed.  (You can get them all here.)
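The first half of such a script, working out which tarball to look for from an egg's filename, might look roughly like the sketch below.  This is not the script I actually used: the function names are made up for illustration, the download step is left out entirely, and it assumes the usual NAME-VERSION-pyX.Y[-PLATFORM].egg naming that setuptools gives eggs.

```python
import os
import re

# Assumed egg naming: NAME-VERSION-pyX.Y[-PLATFORM].egg
EGG_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-py(?P<pyver>\d\.\d+)"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def parse_egg_filename(filename):
    """Return (name, version) for an egg filename, or None if it doesn't match."""
    match = EGG_RE.match(os.path.basename(filename))
    if match is None:
        return None
    return match.group("name"), match.group("version")


def candidate_tarballs(name, version):
    """Guess source tarball filenames worth searching for.

    Egg filenames replace '-' in a project name with '_', so when the
    name contains an underscore we try the hyphenated spelling as well.
    """
    names = [name]
    if "_" in name:
        names.append(name.replace("_", "-"))
    return ["{0}-{1}.tar.gz".format(n, version) for n in names]


def tarballs_for_eggs(egg_dir):
    """Map each egg found in egg_dir to its candidate tarball filenames."""
    wanted = {}
    for filename in sorted(os.listdir(egg_dir)):
        parsed = parse_egg_filename(filename)
        if parsed is not None:
            wanted[filename] = candidate_tarballs(*parsed)
    return wanted
```

Anything the guesses miss (renamed projects, development snapshots, packages that were never on PyPI) still has to be tracked down by hand.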

The Requirements File

The last step was writing a pip requirements file that listed all of the dependencies.  This was tricky for two reasons:

  1. The order in which you list the requirements matters: if package A depends on package B, you have to list B before A in your requirements file.  Otherwise pip might end up downloading a copy of B from the Internet before it even sees that you've specified a local source distribution.  That defeats the goal of deployments that don't require Internet access (or rely on PyPI uptime), and most likely you'll end up with the wrong version of package B installed.
  2. We use a couple of TurboGears plugins, and those plugins try to import turbogears in their setup.py files -- meaning that you can't install them until after TurboGears itself is fully installed.

For the former, it was, again, just a matter of time to get it right: run the bootstrap script, see if pip tries to download any packages, reorder the lines in requirements.txt, and repeat until it all worked.  To make it a little easier, I passed an --index-url='' option to pip, which made it fail early and loudly the first time it tried to download anything.  (I blogged about this trick for preventing pip network access a few weeks ago.)
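I got the ordering right by trial and error, but if you already know which packages depend on which, the ordering can be computed instead.  The sketch below is a hypothetical helper, not part of the bootstrapper: it takes a hand-written dependency map (the A/B/C names are placeholders) and returns an order in which pip has already seen the local copy of every dependency by the time it reaches a package that needs it.

```python
def order_requirements(depends_on):
    """Return package names with every dependency ahead of its dependents.

    depends_on maps a package name to the names of the packages it requires.
    """
    ordered = []
    visiting = set()  # packages on the current dependency chain
    done = set()      # packages already placed in the output

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError("dependency cycle involving {0}".format(name))
        visiting.add(name)
        # Place everything this package needs before the package itself.
        for dep in sorted(depends_on.get(name, ())):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        ordered.append(name)

    for name in sorted(depends_on):
        visit(name)
    return ordered


# Placeholder example: A needs B, and B needs C.
example = {"A": ["B"], "B": ["C"], "C": []}
requirement_order = order_requirements(example)
```

Writing the requirements file in that order, one local tarball per line, should leave pip with no reason to go looking for a dependency elsewhere.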

For the latter, reordering the requirements file wasn't sufficient, because pip runs setup.py egg_info on all packages in your requirements file before it runs setup.py install on any of them.  So when it got to the packages that tried to import turbogears in their setup.py, it failed with an ImportError.  I made a second requirements file, and ran pip install a second time to install these packages after TurboGears and its dependencies were all fully installed.
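In outline, the two passes look something like the following.  Treat it as a sketch rather than the actual contents of bootstrap.py: the helper names and the plugins.txt filename are invented for illustration, and the pip path assumes the virtualenv lives in ./ve.

```python
import subprocess


def pip_install_command(pip_path, requirements_file):
    """Build a pip command that installs a requirements file with no index.

    The empty --index-url is the trick mentioned above: it makes pip fail
    as soon as it tries to fetch anything over the network.
    """
    return [pip_path, "install", "--index-url=", "-r", requirements_file]


def install_in_two_passes(pip_path, first, second, run=subprocess.check_call):
    """Install everything in `first`, then everything in `second`.

    With the default run function, a failed first pass raises
    CalledProcessError, so the second pass never starts.
    """
    for requirements_file in (first, second):
        run(pip_install_command(pip_path, requirements_file))


# Possible usage (filenames are assumptions, see above):
# install_in_two_passes("ve/bin/pip", "requirements.txt", "plugins.txt")
```

The second pass only works because the first one has finished: by then turbogears is importable, so the plugins' setup.py files run cleanly.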


Pinning Setuptools

The only other tricky thing was getting the right version of setuptools installed.  The version of TurboGears we're using relies on setuptools 0.6c8 -- versions later than that cause errors with static file serving and other things, because of changes in pkg_resources.

Virtualenv is pretty opinionated about what version of setuptools it installs -- https://github.com/pypa/virtualenv/issues/89 has the details.  For my purposes, the simplest thing to do was modify the copy of virtualenv.py I'm providing.  I made a few changes to the file so that, instead of saying "install setuptools 0.6c11 or greater," it instead says "install setuptools 0.6c8 exactly."  

To prevent network access and ensure full version pinning, I also dropped in local copies of a setuptools 0.6c8 egg and a pip 1.0 tarball, and told virtualenv.py where to find them.

The resulting bootstrap repository can be dropped into all of our TurboGears 1.0.4 projects, letting us upgrade our builds to pip, virtualenv, source distributions, no dependency on Internet access, and full version pinning in a single step.  In case anyone else has a similar need to modernize their deployment processes for a frozen TurboGears 1.0 project, we've made the code available on GitHub.  Patches welcome!