At the Zope sprint held during PyCon 2009, we had a multi-day discussion about the Zope publisher. Some people were expecting a fight between Jim and me about the best way forward. So what happened? Read on!
The Publisher Discussion
Let me first give some background on our discussions about the Zope publisher.
Back in January and February, I spent time deciphering the Zope publisher (to solve a problem for a customer) and working out a better strategy. I posted some blog entries and a proposal about what I’d like to do with the publisher, joined a mailing list discussion, and mocked up some code that I called zope.pipeline. Through these discussions, Jim Fulton and I had some trouble understanding each other, so we decided to work out our differences in person.
We held that discussion at the PyCon 2009 Zope sprint. Some people thought it would be a heated discussion, but it was not at all. As Jim and I worked together for two days, I learned I had made some incorrect assumptions. In particular, I believed that the request type detection machinery in zope.app.publication was important and helpful. Jim believes it causes unnecessary complexity and that having one request type is the better way to go. I came around to his viewpoint.
This had a significant effect on my zope.pipeline idea. If there is only one request type, then we don’t need dynamically configured pipelines, since the only reason for the dynamism was to choose a pipeline based on the request type. This means the ZCML support currently in zope.pipeline is unnecessary and a static pipeline should be completely effective.
Jim and I worked on a sketch for a publisher rewrite, calling it zope.pub, but I admit that I never liked the sketch at all. I was not successful at expressing my concerns during the sprint, but perhaps I will be more successful in writing now.
Today, there is very little in the publisher framework that we want to be rigid; virtually everything in the Zope publishing process needs to be pluggable and flexible. Chris McDonough heard that discussion and started poking in the air at an imaginary gelatinous object, illustrating that it is never comfortable to work with a nebulous programming framework. The Zope publisher is both difficult to describe and not rigid in any way.
The Zope publisher used to be easy to describe. We used to say that the Zope publisher exposes Python objects to a web server. Then WSGI came along and showed everyone an efficient way to expose any Python code to a web server without all that extra stuff that the Zope publisher does. So why would anyone want to use the Zope publisher except in Zope 2 or Zope 3? Now we make a fine distinction to help people understand why the Zope publisher remains useful outside the context of Zope: WSGI exposes Python code to a web server, while the Zope publisher exposes a Python object graph to a web server.
Exposing a Python object graph, as opposed to calling Python code, means starting at some root object, traversing the graph, and then doing something with that object. We call this process publishing an object. If that object graph is located in a database, then the publication process needs to commit or abort transactions. Error handling, authentication, and other aspects also need to happen somewhere in the middle of the publishing process. With that information, we can define a complete Zope publisher, right? Well, no, because developers don’t always want the objects to be in a database, traversal might have arbitrary rules, authentication obviously must be pluggable, and there is no agreement about what to do with the object we traverse to. We need to defer all of that policy to pluggable things. The Zope publisher, therefore, defers all such policy to an object that provides an interface called IPublication.
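To make the "start at a root, traverse, then do something" idea concrete, here is a minimal sketch written as a plain WSGI application. It is only an illustration: the dict-based graph, the names traverse and make_publisher, and the "call it or stringify it" rule are all assumptions of mine, and it deliberately ignores transactions, authentication, and error policy.

```python
# Toy object graph publisher. Plain dicts stand in for container objects
# (an assumption for illustration; real traversal rules are pluggable).

def traverse(root, path):
    """Walk from root along a '/'-separated path and return the object found."""
    obj = root
    for name in path.strip("/").split("/"):
        if not name:
            continue  # an empty path publishes the root itself
        obj = obj[name]  # KeyError for a missing name, TypeError for a leaf
    return obj


def make_publisher(root):
    """Return a WSGI application that publishes objects reachable from root."""
    def app(environ, start_response):
        try:
            obj = traverse(root, environ.get("PATH_INFO", "/"))
        except (KeyError, TypeError):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        # "Doing something" with the object: call it if callable, else str() it.
        result = obj() if callable(obj) else obj
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [str(result).encode("utf-8")]
    return app
```

Every choice hard-coded here (how names are looked up, what a missing name means, what to do with the final object) is exactly the kind of policy that applications want to change.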
IPublication is a series of hooks expected to be filled by a publication object. IPublication and the associated publish() function represent a set of best practices developed over many years. They specify the order in which things should be called for a sensible and robust publication process. That’s the theory.
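The shape of that arrangement can be sketched as a function that fixes the order of calls and an object that supplies the hooks. This is a simplified toy, not the real zope.publisher API: the class RecordingPublication, its snake_case hook names, and the dict used as a request are all hypothetical and chosen only to show the pattern.

```python
class RecordingPublication:
    """Hypothetical publication object; records which hooks were called."""

    def __init__(self, root):
        self.root = root
        self.calls = []

    def before_traversal(self, request):
        self.calls.append("before_traversal")

    def get_application(self, request):
        self.calls.append("get_application")
        return self.root

    def traverse_name(self, request, obj, name):
        self.calls.append("traverse_name:" + name)
        return obj[name]

    def call_object(self, request, obj):
        self.calls.append("call_object")
        return obj(request)

    def handle_exception(self, request, exc):
        self.calls.append("handle_exception")
        return "error: " + type(exc).__name__

    def end_request(self, request):
        self.calls.append("end_request")


def publish(request, publication):
    """Fix the order of the hooks; leave all policy to the publication."""
    try:
        publication.before_traversal(request)
        obj = publication.get_application(request)
        for name in request["path"]:
            obj = publication.traverse_name(request, obj, name)
        return publication.call_object(request, obj)
    except Exception as exc:
        return publication.handle_exception(request, exc)
    finally:
        publication.end_request(request)
```

The function owns the sequence and the object owns the behavior, so the only way to learn what actually happens at each step is to read the publication object.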
But put yourself in the shoes of a web application developer who needs to customize authentication, for example. The IPublication interface says nothing about when authentication happens. The answer is somewhat complex and will require study of both code and current best practices (meaning mailing list searches and discussion with other Zope developers). Since most Zope developers use zope.app.publication, the readability problem is compounded by the poor organization of the zope.app.publication package. Furthermore, zope.app.publication pulls in more dependencies than most application authors need. Excessive dependencies force developers to learn concepts not needed for their applications.
These are all problems typical of a nebulous framework. Can we avoid creating a nebulous framework? I felt like the zope.pub sketch was still very nebulous.
The WSGI Discussion
Based on what I’ve seen from the Repoze project, I believe a WSGI pipeline can provide the publishing abstraction we need while retaining readability. Since the larger Python community has already taken on the burden of teaching people about WSGI, if we base as much as possible on WSGI, there will be fewer new concepts to teach when people want to learn about publishing objects in Zope, and the process should be much less nebulous. I want people to find it easy to innovate on object publication.
On the third sprint day, I revised my pipeline ideas and explained them to several core Zope developers. They provided excellent feedback and got excited about the new plans.
Here is what we agreed upon. I encourage feedback and corrections.
- The goal is to provide a set of WSGI applications as well as a standard configuration of those applications. Some people call a standard configuration of WSGI applications a “pancake”. I usually call it a “pipeline”.
- The goal of constructing a WSGI pipeline dynamically, based on request type, has been thrown out. See above for the rationale. No new ZCML directive types will be required.
- A WSGI application that acts like middleware, but has side effects, is to be known as a WSGI framework component, since that is the term used by other people working with WSGI. It was nice to work in the same room as other WSGI developers! One of the developers walked up and told us about the term as we were talking. Of course, it’s easier to just say WSGI component, which I find sufficient.
- We are going to sprinkle WSGI applications in various Zope packages. For example, we expect to add a small WSGI application to the new zope.authentication package. This strategy is intended to keep dependencies under control. WSGI itself adds no dependencies, since a WSGI application is simply a callable object. We typically call the module holding WSGI applications wsgi.py. People can ignore the wsgi.py module if they don’t need it, but we expect most users of any given Zope library to be interested in its WSGI integration features.
- The zope.pipeline package will pull in all of the WSGI modules that make up a standard Zope WSGI pipeline. The package will have a factory function that produces a pipeline usable by any WSGI server. People who want a different pipeline will ignore the zope.pipeline package and provide their own pipeline factory in their own package. The zope.pipeline package should not contain much code.
- We are replacing zope.app.publication and the IPublication interface, but we still have to depend on zope.publisher for its other interfaces.
- We are going to build primarily on WebOb and provide an adapter from webob.Request to Zope’s IRequest.
- The new publishing functionality will be compatible with Zope 3 and Grok. Maybe Zope 2 also.
In this design, there is neither a publish() function nor an IPublication interface. The use of functional composition makes them unnecessary.
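Here is a rough sketch of what functional composition means for a static pipeline. None of these names come from zope.pipeline or any other Zope package: hello_app, authentication_component, error_component, make_pipeline, and the "example.user" environ key are hypothetical, and the real components will certainly do more.

```python
from functools import reduce


def hello_app(environ, start_response):
    """Innermost WSGI application: greets the user found in the environ."""
    user = environ.get("example.user", "anonymous")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("hello " + user).encode("utf-8")]


def authentication_component(next_app):
    """Component with a side effect: it stores a user in the environ."""
    def component(environ, start_response):
        environ["example.user"] = environ.get("REMOTE_USER", "anonymous")
        return next_app(environ, start_response)
    return component


def error_component(next_app):
    """Component that turns uncaught exceptions into a 500 response."""
    def component(environ, start_response):
        try:
            return next_app(environ, start_response)
        except Exception:
            # Simplified: a real component would pass exc_info in case the
            # inner application had already started the response.
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [b"internal error"]
    return component


def make_pipeline(app, components):
    """Compose components around app; components[0] ends up outermost."""
    return reduce(lambda inner, wrap: wrap(inner), reversed(components), app)


# A static pipeline is just one fixed composition, built once.
pipeline = make_pipeline(hello_app, [error_component, authentication_component])
```

The order of calls that publish() used to dictate is now visible in the list handed to the factory, and someone who wants a different pipeline writes a different list.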
Gary Poster and I dived into specific WSGI applications we intend to create and where to put them. I will list those decisions in another post.
I think I’m the “some people” you refer to above. 🙂 I didn’t really expect a flaming row between you and Jim, I just teased Jim about it as he did get quite argumentative before. And I called the concept of a bunch of WSGI framework components consolidated into a single reusable one a “pancake” but I agree pipeline is a better word. I wonder what word the TurboGears folks use for such a thing.
Oh, and I should say I’m happy to see your work on it, think the thinking is going in the right direction, and I’m cheering you along!
The developer suggesting “WSGI framework component” was Mark Ramm of TurboGears fame. I trust him to have figured out the “PJE-compliant” WSGI nomenclature 😉