ZODB + Protobuf Looking Good

Today I drafted a module that mixes ZODB with Protocol Buffers.  It’s looking good!  I was hoping not to use metaclasses, but metaclasses solved a lot of problems, so I’ll keep them unless they cause deeper problems.

The code so far lets you create a class like this:

from persistent import Persistent

class Tower(Persistent):
    __metaclass__ = ProtobufState
    protobuf_type = Tower_pb

    def __init__(self):
        self.name = "Rapunzel's"
        self.height = 1000.0
        self.width = 10.0

    def __str__(self):
        return '%s %fx%f %d' % (
            self.name, self.height, self.width, self.build_year)

The only special lines are the __metaclass__ and protobuf_type assignments.  The first says the class should use a metaclass that causes all persistent state to be stored in a protobuf message.  The second says which protobuf class to use; that class comes from a generated Python module (not shown).

If you try to store a string in the height attribute, Google’s code raises an error immediately, which is probably a nice new benefit.  Also, this Python code does not have to duplicate the content of the .proto file.  Did you notice the build_year attribute?  Nothing sets it, but since it’s defined in the message schema, it’s possible to read the default value at any time.  Here is the message schema, by the way:

message Tower {
    required string name = 1;
    required float width = 2;
    required float height = 3;
    optional int32 build_year = 4 [default = 1900];
}
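For example (an untested sketch using the Tower class above):

tower = Tower()
print tower.build_year        # prints 1900, the default from the schema
tower.height = "very tall"    # protobuf raises a TypeError immediately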

The metaclass creates a property for every field defined in the protobuf message schema.  Each property delegates the storage of an attribute to a message field.  The metaclass also adds __getstate__, __setstate__, and __new__ methods to the class; these are used by the pickle module.  Since ZODB uses the pickle module, the serialization of these objects in the database is now primarily a protobuf message.  Yay!  I accomplished all of this with only about 100 lines of code, including comments. 🙂
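To make the delegation concrete, here is a minimal sketch of the property-creation piece.  It assumes each instance keeps its message in a protobuf attribute, and it omits the __new__, __getstate__, and __setstate__ wiring that the real module also does:

def _make_field_property(field_name):
    # Delegate attribute storage to the message field of the same name.
    def getter(self):
        return getattr(self.protobuf, field_name)
    def setter(self, value):
        setattr(self.protobuf, field_name, value)
    return property(getter, setter)

class ProtobufState(type):
    def __new__(meta, name, bases, dct):
        cls = type.__new__(meta, name, bases, dct)
        pb_type = dct.get('protobuf_type')
        if pb_type is not None:
            # One delegating property per field in the message schema.
            for field in pb_type.DESCRIPTOR.fields:
                setattr(cls, field.name, _make_field_property(field.name))
        return cls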

I have a solution for subclassing that’s not beautiful, but workable.  Next, I need to work on persistent references and _p_changed notification.  Once I have those in place, I intend to release a pre-alpha and change ZODB to remove the pickle wrapper around database objects that are protobuf messages.  At that point, when combined with RelStorage, ZODB will finally store data in a language-independent format.

As a matter of principle, important data should never be tied to any particular programming language.  Still, the fact that so many people use ZODB even without language independence is a testament to how good ZODB is.

Jonathan Ellis pointed out Thrift, an alternative to protobuf.  Thrift looks like it could be good, but my customer is already writing protobuf code, so I’m going to stick with that for now.  I suspect almost everything I’m doing will be applicable to Thrift anyway.

I know what some of you must be thinking: what about storing everything in JSON like CouchDB?  I’m sure it could be done, but I don’t yet see a benefit.  Maybe someone will clue me in.

Contemplating Integration of Protocol Buffers into ZODB

I am now looking at integrating Google’s Protocol Buffers into ZODB.  This means that wherever possible, ZODB should store a serialized protocol buffer rather than a pickle.  The main thing my customer hopes to gain is language independence, but storing protocol buffers could open other possibilities as well.  I think this is a very good idea and I’m going to pursue it.

Although ZODB has always clung tightly to Python pickles, I don’t think that moving to a different serialization format violates any basic ZODB design principle.  On the other hand, I have tried to change the serialization format before; in particular, the APE project tried to make the serialization format completely pluggable.  The APE project turned out not to be viable.  Therefore, this project must not repeat the mistake that broke APE.

Why did APE fail?  APE was not viable because it was too hard to debug.  It was too hard to debug because storage errors never occurred near their source, so tracebacks were never very helpful.  The only people who could solve such problems were those who were intimately familiar with the application, APE, and ZODB, all at the same time.  The storage errors indicated that some application code was trying to store something the serialization layer did not like.  The serialization layer had no ability to voice its objections until transaction commit, at which time the incompatible application code was no longer on the stack (and might even be on a different machine if ZEO was involved).

This turned out to be a much more serious problem than I anticipated.  It made me realize that one of the top design concerns for large applications is producing high quality stack tracebacks in error messages.  A quality traceback can pinpoint the cause of an error in seconds.  I am not aware of any substitute for quality tracebacks, so I am now willing to sacrifice a lot to get them.

So I am faced with a basic design choice.  Should protocol buffers be integrated into ZODB at the application layer, rather than behind the storage layer, as APE did?  If I take that route, Persistent subclasses will need to store a protocol buffer explicitly.  Storage errors will indeed occur mostly at their source, since the protocol buffer classes check validity immediately.

Now that I’ve written this out, the right choice seems obvious: the main integration should indeed be done at the application layer.  Until now, it was hard to distinguish this issue from other issues like persistent references and the format of database records.

Furthermore, the simplest thing to do at first is to store a protocol buffer object as an attribute of a normal persistent object, rather than my initial idea of creating classes that join Persistent with Google’s generated classes.  That means we will still store a pickle, but the pickle will contain a serialized protocol buffer.  Later on, I will figure out how to store a protocol buffer without a pickle surrounding it.  I will also provide a method of storing persistent references, though it might be different from the method ZODB users are accustomed to.
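A sketch of that simplest approach might look like the following.  The names are hypothetical; Tower_pb stands in for any generated protobuf message class:

from persistent import Persistent

class TowerRecord(Persistent):
    def __init__(self):
        self.message = Tower_pb()    # hypothetical generated class

    def __getstate__(self):
        # The surrounding pickle stores only the serialized message bytes.
        return {'message': self.message.SerializeToString()}

    def __setstate__(self, state):
        self.message = Tower_pb()
        self.message.ParseFromString(state['message'])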

Wing IDE Broke?

Seeing this in a text editor makes me nervous:

That’s invalid code, but I didn’t write it: the IDE is displaying my file completely incorrectly. There are lines missing. There is some kind of repaint bug and it has something to do with scrolling. No matter how featureful an IDE might be, I can’t use it if it can’t show me a text file without jumbling the lines. When I once saw an open source text editor do the same kind of thing, I dropped that editor so fast that I no longer remember its name. 🙂

Wing IDE?

I have been trying out Wing IDE.  It’s nice that it shows me instant documentation as I’m typing, but there’s still a lot I’d like to see.  I have some feature requests:

  • The file dialog in Wing IDE is a royal pain, just like most file dialogs.  KDE is the only system I’ve seen with a consistently good file dialog, so please let me use that instead.  Provide some configuration option that tells the IDE to use a shell command like "kdialog --getopenfilename /" whenever I want to open a file.
  • NetBeans has the right idea for renaming symbols.  It’s even better than Eclipse.  In NetBeans, Ctrl-R doesn’t open a search/replace dialog, nor does it open a refactoring dialog if the symbol is private.  NetBeans does something much more clever: it selects all instances of the symbol, then as you type, all instances of that symbol change simultaneously.  No dialog is necessary.  That feature alone tempts me to use NetBeans for Python code, even though NetBeans is as oversized as Eclipse.
  • When I’m typing code, the main documentation I’m interested in is interface documentation, not implementation documentation.  So Wing IDE really needs to support zope.interface.
  • In both Eclipse and NetBeans, I can almost completely ignore import statements.  Auto-completion adds the necessary import statements automatically.  Eclipse goes even further and generates import statements when I paste code from another file, but that’s just icing on the cake.

If only Wing IDE supported these features, buying a license would be an easy decision.  A promise from the developers that those features are coming soon would be very encouraging.

Bootstrap.py versus pkg_resources.py

I’ve been using zc.buildout quite a bit over the past month.  Although it has been working, it has been doing strange things like using the wrong version of zope.interface.  Yesterday I finally figured out why, and today I found a possible solution.

It turns out that Ubuntu (8.10) provides a package called python-pkg-resources.  At least one Ubuntu package (Snowballz, a strategy game written in Python) pulls in that package automatically.  It installs a pkg_resources module in Python’s site-packages directory, but it does not install the rest of setuptools.

I can understand why Ubuntu chose to split up setuptools, but that choice causes havoc for the bootstrap.py module people use to install zc.buildout.  Here is what bootstrap.py is supposed to do:

  1. Download ez_setup.py and run it.
  2. ez_setup tries to import the pkg_resources module, but fails.
  3. The setuptools package is not found, so ez_setup downloads setuptools in a temporary directory.
  4. ez_setup alters sys.path to include the new setuptools package.
  5. bootstrap.py imports the pkg_resources module from the version of setuptools just downloaded.
  6. bootstrap.py asks pkg_resources about the installed setuptools package.
  7. bootstrap.py uses setuptools to install zc.buildout.

Here is what bootstrap.py actually does when pkg_resources.py exists in the site-packages directory (the differences are in steps 2, 5, and 7):

  1. Download ez_setup.py and run it.
  2. ez_setup successfully imports the pkg_resources module from site-packages.
  3. The setuptools package is not found, so ez_setup downloads setuptools in a temporary directory.
  4. ez_setup alters sys.path to include the new setuptools package.
  5. bootstrap.py continues to use the previously imported pkg_resources module.
  6. bootstrap.py asks pkg_resources about the installed setuptools package.
  7. pkg_resources does not find setuptools because pkg_resources does not notice the change to sys.path.  bootstrap.py fails.

At first, following ideas I gleaned from various posts about zc.buildout, I worked around this by deleting the setuptools egg and the pkg_resources module from site-packages.  I didn’t know exactly why this helped until I studied the problem.  It turns out that bootstrap.py was just not written to cope with a system-wide installation of pkg_resources.

Now I think I recognize another bad choice that zc.buildout has been making.  zc.buildout generates a “bin” directory full of Python scripts.  Those scripts prepend egg directories and egg zip files to sys.path before doing their work.  I noticed that sometimes the list of paths to prepend includes “/usr/lib/python2.5/site-packages”, which is already on sys.path.  I now suspect that whenever zc.buildout includes paths like that, it’s wrong, and the cause is a mixup involving a system-wide installation of pkg_resources, setuptools, or some other foundational package.
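For reference, the top of a generated script looks roughly like this (the paths here are illustrative):

import sys

sys.path[0:0] = [
    '/home/me/buildout/eggs/zc.buildout-1.1.1-py2.5.egg',
    '/usr/lib/python2.5/site-packages',  # suspicious: already on sys.path
    ]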

Here is a possible way to fix bootstrap.py.  Just before the “import pkg_resources” line, add this:

del sys.modules['pkg_resources']

This solved the bootstrap.py problem for me.  Altering sys.modules is rarely a good idea, but this might be a good exception to the rule.  I don’t believe we need to catch KeyError because ez_setup should have imported pkg_resources already.
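In context, the patched section of bootstrap.py would read something like this:

import sys

# ez_setup (run above) already imported pkg_resources, possibly the
# stale copy from site-packages, so the key is sure to exist.
del sys.modules['pkg_resources']

# This import now finds the freshly downloaded setuptools egg that
# ez_setup placed on sys.path.
import pkg_resources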

Beyond this, there is probably more work to do to make zc.buildout produce correct scripts.

Whoever said computers behave logically must have been joking or delusional.  The people who provide the software never fully agree with each other, nor even with themselves!

RelStorage 1.1.1 Released

For those just tuning in: RelStorage is an adapter for ZODB (the database behind Zope) that stores data in a relational database such as MySQL, PostgreSQL, or Oracle.  It has advantages over FileStorage and ZEO, the default storage methods.  Read more in the ZODB wiki.

Download this release from PyPI.  As I mentioned in another post, this release works around MySQL performance bugs to make it possible to pack large databases in a reasonable amount of time.

If you are upgrading from previous versions, a simple schema migration is required.  See the migration instructions.

Packing With MySQL

I’ve been working on packing a 6.5GB production database based on RelStorage and MySQL 5.0.  It contains 6.4 million objects, meaning “select count(1) from current_object” returns 6426879.  I would call this a moderately sized database.  When I first approached it, it had not been packed for at least 3 months.  At first, I could not pack it, because RelStorage was using statements that MySQL mis-optimizes.  I will discuss how I fixed the problem.

The first time I tried to pack this database, the pre_pack phase took several hours before I stopped it by killing the Python process.  I studied the problem using “show processlist” and the MySQL docs online.  I found a statement in RelStorage that MySQL mis-optimizes.  I tried again and it still took too long; this time I discovered that a workaround I had applied earlier for a mis-optimized query did not actually help this instance of MySQL.  I decided to make the workaround more robust by using an explicit temporary table.
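The explicit temporary table pattern looks roughly like this (a sketch; the table and column names are illustrative, not RelStorage’s actual schema):

def materialize_pack_candidates(cursor):
    # Compute the intermediate result explicitly instead of letting
    # MySQL plan one large statement with a subquery.
    cursor.execute("""
        CREATE TEMPORARY TABLE temp_pack_visit (
            zoid BIGINT NOT NULL PRIMARY KEY
        )
    """)
    cursor.execute("""
        INSERT INTO temp_pack_visit (zoid)
        SELECT zoid FROM pack_object WHERE keep = FALSE
    """)
    # Later statements join against temp_pack_visit rather than
    # repeating the subquery.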

Then, on Saturday, I started a pack again.  The pre_pack phase completed in less than an hour this time and the final phase began.  The final phase gets from the pre_pack phase a complete list of objects and states to delete; all it really has to do is delete them.  I let it run all weekend.  Today (Monday), using “show processlist” again, I measured 17 seconds per transaction, with 26,000 transactions to go.  With a 50% duty cycle, this would have taken about 10 days (26,000 transactions × 17 s ÷ 0.5 ≈ 884,000 seconds)!  (The 50% duty cycle is a configurable parameter that causes RelStorage to periodically sleep during packing, allowing other clients to write to the database without significant interruption.)

I read up on the delete statement.  It turns out that delete with a subquery is often very expensive in MySQL.  I converted the delete statements to use a table join instead of a subquery, and the time per transaction dropped from 17 seconds to about 3 milliseconds.  This time the pack finished in minutes.
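The change amounts to something like this (illustrative names again; DELETE ... JOIN is MySQL-specific syntax):

def delete_packed_states(cursor):
    # Subquery form that MySQL 5.0 mis-optimizes:
    #   DELETE FROM object_state
    #   WHERE zoid IN (SELECT zoid FROM pack_object WHERE keep = FALSE)
    # Equivalent multi-table delete using an explicit join:
    cursor.execute("""
        DELETE object_state
        FROM object_state
        JOIN pack_object ON object_state.zoid = pack_object.zoid
        WHERE pack_object.keep = FALSE
    """)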

Does this mean I should optimize every statement this way?  Certainly not.  The SQL syntax I had to use to optimize MySQL is not portable, so I had to sacrifice portability, readability, and maintainability, in order to gain acceptable performance.  I will only make that sacrifice when necessary.

I am very happy that I did not try to optimize this code until I was in a position where I could properly measure the performance with millions of real objects.  If I had optimized prematurely, I would have focused on micro-optimizations that might have only doubled or tripled the performance.  By waiting until the time was ripe, I multiplied the pack performance by a factor of at least 5,000, with only a few days of work and only minor code changes.  Next time someone asks why premature optimization is so evil, I will try to remember this experience.  Most of the best software developers I know skip micro-optimizations and focus on readability instead, helping to ensure that much larger optimizations will be possible later.

Note that during my optimization work, I never once needed to interrupt the main site nor restart MySQL.  I only restarted my own Python program repeatedly.  I doubt anyone else knew there was a pack going on.

RelStorage 1.1.1 Coming Soon

After a lot of testing and performance profiling, on Friday I released RelStorage 1.1.  On Saturday, I was planning to blog about the good news and the bad news, but I was unhappy enough with the bad news that I decided to turn as much of the bad news as I could into good news.  So I fixed the performance regressions and will release RelStorage 1.1.1 next week.

The performance charts have changed since RelStorage 1.0.

Chart 1: Speed tests with 100 objects per transaction.

Chart 2: Speed tests with 10,000 objects per transaction.

I’m still using 32 bit Python 2.4 and the same hardware (AMD 5200+), but this time, I ran the tests using ZODB 3.8.1.

The good news:

  • Compared with RelStorage 1.0, reads from MySQL and Oracle are definitely faster.  In some tests, the Oracle adapter now reads as quickly as MySQL.  To get the Oracle optimization, you need cx_Oracle 5, which was just released this week.  Thanks again to Helge Tesdal and Anthony Tuininga for making this happen.
  • Loading 100 objects or fewer per transaction via MySQL is now so fast that my speed test can no longer measure it.  The time per transaction is 2 ms whether I’m running one ZODB client or 15 concurrently.  (The bad news is that now I probably ought to refine the speed test.)

The bad news in version 1.1 that became good news in version 1.1.1:

  • The Oracle adapter took a hit in write speed in 1.1, but a simple optimization (using setinputsizes(); see the sketch after this list) fixed that, and writes to Oracle are now slightly faster than they were in RelStorage 1.0.
  • MySQL performance bugs continue to be a problem for packing.  I attempted to pack a 5 GB customer database, with garbage collection enabled, using RelStorage 1.1, but MySQL mis-optimized some queries and wanted to spend multiple days on an operation that should only take a few seconds.  I replaced two queries involving subqueries with temporary table manipulations.  Version 1.1.1 has the workarounds.
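The cx_Oracle optimization mentioned above works roughly like this (a sketch with invented statement and bind names, not RelStorage’s actual code):

import cx_Oracle

def store_state(cursor, oid, tid, data):
    # Declaring the bind variable's type up front spares the driver
    # from re-negotiating types on every execute.
    cursor.setinputsizes(data=cx_Oracle.BLOB)
    cursor.execute(
        "INSERT INTO object_state (zoid, tid, state)"
        " VALUES (:oid, :tid, :data)",
        oid=oid, tid=tid, data=data)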

The bad news that I haven’t resolved:

  • Writing a lot of objects to MySQL is now apparently a little slower.  With 15 concurrent writers, it used to take 2.2 seconds for each of them to write 10,000 objects, but now it takes 2.9 seconds.  This puts MySQL in last place for write speed under pressure.  That could be a concern, but I don’t think it should hold up the release.

Of the three databases RelStorage supports, MySQL tends to be both the fastest and the slowest.  MySQL is fast when RelStorage executes simple statements that the optimizer can hardly get wrong, but MySQL is glacial when its query optimizer makes a bad choice.  PostgreSQL performance seems more predictable.  PostgreSQL’s optimizer seems to handle a subquery just as well as a table join, and sometimes SQL syntax rules make it hard to avoid a subquery.  So, to support MySQL, I have to convert moderately complex statements into a series of statements that force the database to make reasonable decisions.  MySQL is smart, but absent-minded.

The ZEO tests no longer segfault in ZODB 3.8.1, which is very good news.  However, ZEO still appears to be the slowest at reading a lot of objects at once.  While it would be easy to blame the global interpreter lock, the second chart doesn’t agree with that assumption, since it shows that ZEO reads are slow regardless of concurrency level.  ZEO clients write to a cache after every read and perhaps that is causing the overhead.

I will recommend that all users of RelStorage upgrade to this version.  The packing improvements alone make it worth the effort.

RelStorage 1.1: All Green

I said earlier that before I release RelStorage 1.1, I wanted to set up thorough automated testing of RelStorage with different combinations of ZODB versions, RelStorage versions, Python versions, and platform types.  I originally intended to use hand crafted scripts to run the tests, but this past week I learned Buildbot’s model through experience working on a customer’s Buildbot configuration, so now I’m comfortable with it and I decided to use Buildbot after all.

Here are the results.

I could just say all the tests are passing, but it only feels real when there is a table with many green cells and no red cells.  This system tests a lot of software combinations.  There are three platforms: 32 bit Debian Etch, 32 bit Debian Lenny, and 64 bit Debian Lenny.  On each platform, I’m testing the RelStorage trunk and 1.1 branch combined with ZODB 3.7, ZODB 3.8, and the ZODB shane-poll-invalidations branch, except that the 64 bit platform with Python 2.5 does not test the ZODB 3.7 branch, since ZODB 3.7 is not compatible with 64 bit Python 2.5.  Each 32 bit platform runs tests with MySQL, PostgreSQL, and Oracle.  The 64 bit platform runs only the MySQL and PostgreSQL tests.

That adds up to 44 tested combinations, all passing.  I can relax now!  My Buildbot is running on a private server, but I can expose a read-only web view if there’s interest.

HP dv9933cl

I recently bought a laptop for my new job.  I chose an HP dv9933cl, intending to install Ubuntu on it.  The verdict?  It’s great!  I bought it from Costco rather than Newegg because I wanted to avoid shipping in case I needed to replace it, but so far I see no need to replace it.

It has a 17″ screen, 4GB RAM, 320GB HD, Intel Core 2 Duo, a LightScribe DVD writer, SD card slot, WiFi (802.11abgn), nVidia graphics, and all the regular ports you’d expect.  It comes with Windows Vista, which is OK for watching movies, but it’s not the right environment for getting work done. 🙂

Ubuntu 8.10 installed with hardly a hitch.  I installed all of the drivers I needed from a standard Ubuntu mirror.  The wi-fi didn’t work on the first boot, but apparently something sorted itself out and it began working (and hasn’t stopped working) on the second boot.  The wi-fi has better range than any laptop I’ve used before.  I can put the laptop to sleep just by closing the lid.  Compiz (for desktop 3D effects) works well.  The SD card reader works.  Even the unusual little infrared remote control that came with the laptop works.

I suspect most laptops in the HP Pavilion dv9000 line will have similar success with Linux.  I wish HP would publish something that says Linux works well on it.