Django on Google App Engine

Deciding how to run Django on Google App Engine (GAE) is a bit of a challenge. There are plenty of conflicting instructions and at least three viable options. What’s a developer to do?

The underlying issue is that Django’s relational data model is incompatible with App Engine’s non-relational, “BigTable”-backed datastore. Features that depend on Django’s model APIs simply don’t work on App Engine. This includes users, sessions, and the admin interface.

In this article I’ll describe the three best approaches and then describe how I eventually made my choice.

Three forks in the road.

  1. App Engine Patch.

    When you download and unzip App Engine Patch (AEP), you get a directory that contains all the config files and directory structure necessary to “start” a Python App Engine application. Included is a heavily customized Django 1.1 source tree. To keep your GAE application’s static file count low, the Django sources are rolled up into a single zip file.

    The advantage of the AEP approach is that major Django features — users, sessions, and admin — work “out of the box” on App Engine. If you’re a Django developer, you can move forward using the tools you know, including manage.py and unit tests.

    There appear to be a few disadvantages to AEP: (1) updates to mainline Django take a while to appear in AEP, (2) if you run into a bug, it is sometimes unclear whether Django or AEP is to blame, and (3) AEP uses the inefficient zipimport mechanism to load Django.

    In my opinion, zipimport performance is the real killer. AEP must unzip the Django source tree before it can handle a request. This is a slow process: it can take several seconds on the production App Engine infrastructure. The performance impact is partially mitigated by the fact that GAE caches the import for future requests. Unfortunately, it is impossible to predict how long the cache will last. If your app doesn’t see many requests, GAE may decide to allocate zero CPUs to it until a new request comes in. As a result, “first” requests after a long pause are expensive. At the same time, if your traffic is spiking, GAE may decide to allocate several new CPUs to it — each of which, presumably, will have to serve at least one “slow” request first.

    The AEP team is well aware of this issue. Unfortunately, there isn’t a clean way forward just yet.

  2. Google App Engine Helper For Django.

    Written by the App Engine team, this open source tool makes it easier to move a large Django-only codebase onto GAE. The key feature is the BaseModel class, which mimics Django’s model API but is compatible with the GAE datastore. There are also tools to get manage.py, unit tests, and so forth working. That said, you do not get admin, sessions, or users for free. If your application contains complicated queries, and certainly if it contains raw SQL, then you’ll have to do some work to refactor your models regardless.

    As a booster while moving a Django app to GAE, the “helper” may provide some benefit. In my opinion, however, it doesn’t confer much benefit to new applications.

  3. Pre-Installed Django 1.x Libraries.

    Most people know that Django 0.96 is pre-installed with App Engine. Less well known is that Django versions 1.0 and 1.1 are also available as part of the production GAE infrastructure. By default, an App Engine application that imports django gets 0.96. However, by calling App Engine’s use_library(...) API, you can request that the django namespace be replaced with either version 1.0.2 or 1.1. (Support for 1.1 is recent and is not yet mentioned in all the relevant places in the official documentation.)

    The great news about use_library(...) is that it is extremely fast. The bad news is that you’re stuck with a build of Django where many features — again: users, sessions, and admin — simply don’t work. Tools like manage.py don’t work, either. Unfortunately, it doesn’t seem possible to take the changes AEP has made in their custom source tree and “apply” them somehow as needed. You’re better off working with what you’re given.
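The speed difference between options 1 and 3 comes down to the zipimport cost described under App Engine Patch. Here is a rough local sketch of that mechanism, not a benchmark of App Engine itself. The package name fakeframework and the module count are made up for illustration, and timings on a development machine say little about production GAE. It only shows that the first import has to read and decompress from the archive, while later imports in the same process are served from the module cache.

```python
import importlib
import os
import sys
import tempfile
import time
import zipfile

# Hypothetical stand-in for a zipped framework (AEP zips Django itself).
PACKAGE = "fakeframework"
MODULE_COUNT = 50


def build_zip(directory):
    """Write a small package into a zip archive and return the archive path."""
    archive = os.path.join(directory, "bundle.zip")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(PACKAGE + "/__init__.py", "")
        for i in range(MODULE_COUNT):
            zf.writestr("%s/mod%d.py" % (PACKAGE, i), "VALUE = %d\n" % i)
    return archive


def timed_import_all():
    """Import every module in the package; return (elapsed seconds, modules)."""
    start = time.time()
    modules = [
        importlib.import_module("%s.mod%d" % (PACKAGE, i))
        for i in range(MODULE_COUNT)
    ]
    return time.time() - start, modules


workdir = tempfile.mkdtemp()
archive_path = build_zip(workdir)
sys.path.insert(0, archive_path)  # archives on sys.path are handled by zipimport

cold_seconds, cold_modules = timed_import_all()  # read and decompressed from the zip
warm_seconds, warm_modules = timed_import_all()  # already cached in sys.modules

print("cold: %.4fs, warm: %.4fs" % (cold_seconds, warm_seconds))
```

The "cold" number is the cost a fresh process pays once. On App Engine, every newly started process serving your app would pay it again, which is why the unpredictable cache lifetime matters so much.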

What did I choose?

In order to choose, I decided to step back and ask myself why Django was so desirable on App Engine. The chief benefit of App Engine is that, with little engineering effort, developers can build highly scalable applications. The chief benefit of Django is that a lot of useful functionality works out of the box. Unfortunately, a lot of that functionality just isn’t built in a fashion that’s compatible with App Engine’s intentional constraints. Trying to get “all of Django” running on App Engine felt a bit square-peg, round-hole to me. (In fact, I’d argue that admin on App Engine isn’t terribly interesting, since admin is targeted to lower-scale CMS-style applications.)

So: if I don’t get users, sessions, or admin, then what is the point of Django on App Engine? There are still some pieces that work “out of the box,” of course, including forms, templates, URL routing, middleware, and context processors. That said, a few of those (templates, routing) work just fine under the lighter-weight webapp framework.

In the end, there were two reasons why Django won out over webapp. First, I wanted to use the latest Django template syntax; webapp is locked to Django 0.96. Second, I wanted to use Django’s newforms library (renamed to django.forms as of Django 1.0). I tried to create a “frankenframework” that mixed webapp with newforms, but this proved impossible due to internal dependencies inside webapp.

The bottom line for me? I now use Django on App Engine via the use_library(...) API. In my next posts about Django on App Engine, I’ll describe how to set up a use_library(...) app, and I’ll describe the new code I’ve written to support both sessions and users on App Engine.
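As a preview, here is a minimal sketch of what the use_library(...) call looks like near the top of a request handler script. The import path google.appengine.dist is the one provided by the Python App Engine SDK. The select_django helper, the ImportError guard, and the settings module name are my own assumptions for illustration: the guard only exists so the file can also be loaded outside App Engine.

```python
import os

# Assumption: the project's Django settings live in a top-level settings.py.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "settings")


def select_django(version="1.1"):
    """Ask App Engine for a specific pre-installed Django version.

    Returns True if the request was made, or False when the App Engine SDK
    is not importable (for example, when run outside App Engine).
    """
    try:
        from google.appengine.dist import use_library
    except ImportError:
        return False
    use_library("django", version)
    return True


# This has to run before the first `import django` anywhere in the process,
# otherwise the default 0.96 package may already be loaded.
django_selected = select_django("1.1")
```

The ordering is the important part: the version request comes first, and only afterwards should any module import django.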