RelStorage 1.3.0b1, Now With Blob Support

I have just released two versions of RelStorage. Version 1.3.0b1 adds full support for ZODB blobs stored on the filesystem. Version 1.2.0 is currently the better choice if you’re upgrading a production system and don’t need blob support.

People have been asking for blob support for months. I am glad to finally get it done, with a little help from a customer. With blob support, now we can easily store large artifacts on the filesystem, while keeping all metadata in the database.

To celebrate the new release, I have created a sample buildout.cfg that builds Plone with RelStorage, PostgreSQL, and blob support. (Thanks goes to Hanno Schlichting, who released a compatible version of plone.recipe.zope2instance only moments after I requested it.) Here it is:

[buildout]
parts = plone zope2 instance zopepy
find-links =
    http://dist.plone.org
    http://download.zope.org/ppix/
    http://download.zope.org/distribution/
    http://effbot.org/downloads
    http://packages.willowrise.org
eggs =
    elementtree
    PILwoTk
    RelStorage
    psycopg2
versions = versions

[versions]
ZODB3 = 3.8.3-polling
RelStorage = 1.3.0b1

[plone]
recipe = plone.recipe.plone

[zope2]
recipe = plone.recipe.zope2install
url = ${plone:zope2-url}

[instance]
recipe = plone.recipe.zope2instance
zope2-location = ${zope2:location}
user = admin:admin
products = ${plone:products}
eggs =
    ${buildout:eggs}
    ${plone:eggs}
    plone.app.blob
zcml = plone.app.blob
rel-storage =
    type postgresql
    dsn dbname='plone' user='plone' host='localhost' password='plone'
    blob-dir var/blobs

[zopepy]
recipe = zc.recipe.egg
eggs = ${instance:eggs}
interpreter = zopepy
extra-paths = ${instance:zope2-location}/lib/python
scripts = zopepy zodbconvert

P.S. I have been told that a very prominent Plone developer recently configured RelStorage with master/slave replication on MySQL, and that it works smoothly. I expect him to announce his success soon!

RelStorage 1.1.1 Released

For those just tuning in: RelStorage is an adapter for ZODB (the database behind Zope) that stores data in a relational database such as MySQL, PostgreSQL, or Oracle.  It has advantages over FileStorage and ZEO, the default storage methods.  Read more in the ZODB wiki.

Download this release from PyPI.  As I mentioned in another post, this release works around MySQL performance bugs to make it possible to pack large databases in a reasonable amount of time.

If you are upgrading from previous versions, a simple schema migration is required.  See the migration instructions.

RelStorage 1.1.1 Coming Soon

After a lot of testing and performance profiling, on Friday I released RelStorage 1.1.  On Saturday, I was planning to blog about the good news and the bad news, but I was unhappy enough with the bad news that I decided to turn as much of the bad news as I could into good news.  So I fixed the performance regressions and will release RelStorage 1.1.1 next week.

The performance charts have changed since RelStorage 1.0.

Chart 1: Speed tests with 100 objects per transaction.

Chart 2: Speed tests with 10,000 objects per transaction.

I’m still using 32 bit Python 2.4 and the same hardware (AMD 5200+), but this time, I ran the tests using ZODB 3.8.1.

The good news:

  • Compared with RelStorage 1.0, reads from MySQL and Oracle are definitely faster.  In some tests, the Oracle adapter now reads as quickly as MySQL.  To get the Oracle optimization, you need cx_Oracle 5, which was just released this week.  Thanks again to Helge Tesdal and Anthony Tuininga for making this happen.
  • Loading 100 objects or less per transaction via MySQL is now so fast that my speed test can no longer measure it.  The time per transaction is 2 ms whether I’m running one ZODB client or 15 concurrently.  (The bad news is now I probably ought to refine the speed test.)

The bad news in version 1.1 that became good news in version 1.1.1:

  • The Oracle adapter took a hit in write speed in 1.1, but a simple optimization (using setinputsizes()) fixed that and writes to Oracle are now slightly faster than they were in RelStorage 1.0.
  • MySQL performance bugs continue to be a problem for packing.  I attempted to pack a 5 GB customer database, with garbage collection enabled, using RelStorage 1.1, but MySQL mis-optimized some queries and wanted to spend multiple days on an operation that should only take a few seconds.  I replaced two queries involving subqueries with temporary table manipulations.  Version 1.1.1 has the workarounds.

The bad news that I haven’t resolved:

  • Writing a lot of objects to MySQL is now apparently a little slower.  With 15 concurrent writers, it used to take 2.2 seconds for each of them to write 10,000 objects, but now it takes 2.9 seconds.  This puts MySQL in last place for write speed under pressure.  That could be a concern, but I don’t think it should hold up the release.

Of the three databases RelStorage supports, MySQL tends to be both the fastest and the slowest.  MySQL is fast when RelStorage executes simple statements that the optimizer can hardly get wrong, but MySQL is glacial when its query optimizer makes a bad choice.  PostgreSQL performance seems more predictable.  PostgreSQL’s optimizer seems to handle a subquery just as well as a table join, and sometimes SQL syntax rules make it hard to avoid a subquery.  So, to support MySQL, I have to convert moderately complex statements into a series of statements that force the database to make reasonable decisions.  MySQL is smart, but absent-minded.

The ZEO tests no longer segfault in ZODB 3.8.1, which is very good news.  However, ZEO still appears to be the slowest at reading a lot of objects at once.  While it would be easy to blame the global interpreter lock, the second chart doesn’t agree with that assumption, since it shows that ZEO reads are slow regardless of concurrency level.  ZEO clients write to a cache after every read and perhaps that is causing the overhead.

I will recommend that all users of RelStorage upgrade to this version.  The packing improvements alone make it worth the effort.

RelStorage: A New ZODB Storage

I’m writing RelStorage, a new storage implementation for ZODB / Zope / Plone. RelStorage replaces PGStorage. I’ve put up a RelStorage Wiki page and the zodb-dev mailing list has been discussing it. There is no stable release yet, but a stable release is planned for this month.

While performance is not the main goal (reliability and scalability are more important), I was pleasantly surprised to discover last week that creating a Plone 3 site in RelStorage on PostgreSQL 8.1 is a bit faster than doing the same thing in FileStorage, the default ZODB storage. Clearly, the PostgreSQL team is doing a great job!

Several years ago I put together an early prototype of PGStorage. I recall discovering that PostgreSQL was terribly slow at storing a lot of BLOBs. I read about the soon-to-come TOAST feature, but I wasn’t sure it would solve the problem, so I discarded the whole idea for years. Today, PostgreSQL seems to have no problem at all with this kind of work. It sure has come a long way.

RelStorage also connects to Oracle 10g. According to benchmarks, Oracle has a slight performance advantage, perhaps due to the “read only” isolation mode that Oracle provides. It might be useful for PostgreSQL to get that feature too.

I’m considering setting up a MySQL adapter for RelStorage as well. When the database is in MySQL and Zope is running in mod_wsgi, we could say that the “P” in LAMP stands for Plone!