Poaching (Patching) Eggs

The term “egg” as used in the Python community seems so whimsical.  It deserves lots of puns.  A couple of weeks ago, I made a little utility for myself that takes all the eggs from an egg farm produced by zc.buildout and makes a single directory tree full of Python packages and modules.  I called it Omelette.  Get it?  Ha!  (I can hear chickens groaning already…)  The surprising thing about Omelette is it typically finishes in less than 1 second, even with dozens of eggs and thousands of modules.  It mostly produces symlinks, but it also unpacks zip files.  I plan to share it, but I don’t know when I’ll get around to packaging it.

Anyway, I want to talk about poaching, er, patching eggs.  As systems grow in complexity, patching becomes more important.  Linux distributors, for example, solve a really complex problem, and their solution is to patch nearly every package.  If they didn’t, installed systems would be an unstable and unglued mess.  I imagine distributors’ patches usually reach upstream maintainers, but I also imagine it often takes months or years for those patches to trickle into a stable release of each package.

I really want to find a good way to integrate patching into the Python egg installation process.  I want to be able to say, in package metadata, that my package requires ZODB with a certain patch.  That patch would take the form of a diff file that might be downloaded from the Web.  I also want to be able to say that another package requires ZODB with a different patch, and assuming those patches have no conflicts, I want the Python package installation system to install ZODB with both patches.  Moreover, I want other buildouts to use ZODB without patches, even though I have a centralized cache of eggs in my ~/.buildout directory.
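One way to keep patched and unpatched builds apart in a shared egg cache would be to fold the patches into the cache key.  Here is a minimal sketch of that idea in Python.  The function name cache_key and the "+patched." naming scheme are my own inventions for illustration; nothing in zc.buildout or setuptools works this way today.  Sorting the patches assumes they do not conflict, so that the order of application does not matter.

```python
import hashlib


def cache_key(name, version, patch_texts=()):
    """Return a hypothetical cache directory name for a (possibly patched) egg.

    An unpatched build keeps the plain "name-version" key, so existing
    buildouts that want the pristine egg are unaffected.
    """
    if not patch_texts:
        return f"{name}-{version}"
    digest = hashlib.sha256()
    # Sort so that the same set of patches gives the same key
    # regardless of the order in which packages requested them.
    for text in sorted(patch_texts):
        digest.update(text.encode("utf-8"))
        digest.update(b"\0")  # separator between patches
    return f"{name}-{version}+patched.{digest.hexdigest()[:12]}"


plain = cache_key("ZODB3", "3.8.1")
patched = cache_key("ZODB3", "3.8.1", ["diff one", "diff two"])
```

With a key like this, two buildouts asking for the same set of patches would share one cached build, while a buildout asking for no patches would still get the stock egg.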

So let’s say my Python package installation system is zc.buildout, setuptools, and distutils.  Which layer should be modified to support patching?  I don’t think the need for automated patching arises until you’re combining a lot of packages, so it would seem most natural to put patching in zc.buildout. I can imagine a buildout.cfg like this:

[versions]
ZODB3=3.8.1 +poll-invalidations

[patches]
poll-invalidations=http://example.com/path/poll-invalidations-3.8.1.diff
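Reading that imagined syntax is at least the easy part.  The sketch below uses Python's standard configparser to turn the configuration above into a version number plus a list of patch names and URLs.  The "+name" convention and the [patches] section are the hypothetical syntax from this post, not something zc.buildout understands, and parse_patched_versions is a made-up helper name.  Downloading and applying the diffs is left out entirely.

```python
import configparser

# The imagined configuration from this post (hypothetical syntax).
SAMPLE = """\
[versions]
ZODB3=3.8.1 +poll-invalidations

[patches]
poll-invalidations=http://example.com/path/poll-invalidations-3.8.1.diff
"""


def parse_patched_versions(text):
    """Map each package to (version, [(patch_name, patch_url), ...])."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep package names case-sensitive
    parser.read_string(text)

    patches = dict(parser["patches"]) if parser.has_section("patches") else {}
    result = {}
    for name, spec in parser["versions"].items():
        # First token is the version; "+name" tokens request patches.
        version, *extras = spec.split()
        patch_names = [e[1:] for e in extras if e.startswith("+")]
        missing = [p for p in patch_names if p not in patches]
        if missing:
            raise ValueError(f"{name}: no URL given for patches {missing}")
        result[name] = (version, [(p, patches[p]) for p in patch_names])
    return result


requirements = parse_patched_versions(SAMPLE)
```

Here requirements["ZODB3"] holds the version string "3.8.1" together with the one requested patch and its URL.  A package listed without any "+name" tokens would simply get an empty patch list.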

I wonder how difficult it would be to achieve that.  Some modification of setuptools might be required.  Alternatively, can Paver patch eggs?  I suspect Paver is not very good at patching eggs either.