Webapps on App Engine, part 6: Lazy loading

This is part of a series on writing a webapp framework for App Engine Python. For details, see the introductory post here.

A major concern for many people developing for App Engine, particularly those building low-to-medium traffic sites, is instance load time. When App Engine serves the first request to a new instance of your app, it must import the request handler module you specified, which in turn imports all the other modules required to serve the request. In large apps, this can add up to quite a lot of additional overhead for loading requests, and substantially impact the experience for end users.

There are a number of things you can do to reduce loading times, including using lighter-weight frameworks instead of all-inclusive ones, and breaking seldom-used components out into separate handlers - an approach taken by bloggart for the admin interface. One source of inefficiency stands out as a prime candidate for optimisation, though: unnecessary imports.

Many frameworks, including the built in webapp framework, require you to provide a list of handler classes that should be instantiated to serve requests, in a 'url map'. When a request comes in, the framework simply instantiates the relevant class and calls it to handle the request. However, doing this requires you to import all your handler classes so you can construct the url map, and this likely results in transitively importing your entire webapp, even though only a small proportion of it may be required to handle the request at hand.

You may have noticed that our framework, so far, doesn't appear to do any better on this front - and you'd be right. That's about to change, however. One thing we've kept constant throughout writing the framework is making as many components as possible independent, often by making them WSGI applications in their own right. The router is a WSGI application, and so are handler classes, and so is the WebOb response object. Today, we'll make use of that feature to add support for lazy loading in a manner that's completely independent of our framework - and would work on any other WSGI-based framework, too.

The interface to our lazy loader will be straightforward: Instantiate a class with the fully qualified name of a module containing a WSGI application (such as a handler for our framework), and it will act as a WSGI application that imports the handler module on first access, and calls it in turn. To do this, we need to be able to import a module whose name is defined at runtime - and for that we use the Python builtin __import__.

__import__'s operation is fairly straightforward: call it with the name of the module to be imported as the first argument, and it imports that module into the Python runtime. Somewhat confusingly, though, it returns the top-level package rather than the module we asked for - so __import__('foo.bar.baz') returns the module foo, not foo.bar.baz. We also pass __import__ our global and local variables as two further arguments, to allow it to resolve relative imports. In short, this:

from mypackage import mymodule

Is equivalent to this:

import sys

__import__('mypackage.mymodule', globals(), locals())
mymodule = sys.modules['mypackage.mymodule']
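
To see the return-value quirk concretely, here is a small sketch that uses the standard library's xml.dom.minidom as a stand-in for mypackage.mymodule. The choice of package is purely illustrative.

```python
import sys

# __import__ loads the whole dotted path but hands back the top-level package.
top = __import__('xml.dom.minidom', globals(), locals())
print(top.__name__)  # xml

# The submodule we actually asked for is registered in sys.modules.
minidom = sys.modules['xml.dom.minidom']
print(minidom.__name__)  # xml.dom.minidom
```

On Python 2.7 and later, importlib.import_module('xml.dom.minidom') returns the submodule directly and may be the tidier choice where it is available.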

We also need to know how to retrieve a name from a module dynamically. This is even simpler: every object in Python (more or less) is backed by a dict, which can be accessed through its '__dict__' attribute. Retrieving a name from a module is thus achieved like so:

myclass = mymodule.__dict__['myclass']
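
As a quick check, using the standard library's json module purely as an example, the __dict__ lookup and the more commonly seen getattr builtin find the same object for a name defined in a module:

```python
import json

# Look the same name up both ways.
by_dict = json.__dict__['dumps']
by_getattr = getattr(json, 'dumps')

print(by_dict is by_getattr)  # True
```

The two differ mainly in how they fail: a missing name raises KeyError from the dict lookup and AttributeError from getattr.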

With that in mind, writing our lazy importer is simple:

import sys

class WSGILazyLoader(object):
  def __init__(self, fullname):
    # rpartition returns (head, separator, tail); discard the separator.
    self.modulename, _, self.objname = fullname.rpartition('.')
    self.obj = None

  def __call__(self, environ, start_response):
    if self.obj is None:
      # First request: import the module and look up the WSGI application.
      __import__(self.modulename, globals(), locals())
      module = sys.modules[self.modulename]
      self.obj = module.__dict__[self.objname]
    return self.obj(environ, start_response)
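
The sketch below is one way to check that the import really is deferred until the first request. It uses a compact version of the loader that unpacks all three values rpartition returns, and writes a throwaway module to a temporary directory. The module name lazy_demo_home and its app function are made up for the demonstration and are not part of the framework.

```python
import os
import sys
import tempfile

class WSGILazyLoader(object):
    """Compact version of the lazy loader so this snippet runs on its own."""
    def __init__(self, fullname):
        self.modulename, _, self.objname = fullname.rpartition('.')
        self.obj = None

    def __call__(self, environ, start_response):
        if self.obj is None:
            __import__(self.modulename, globals(), locals())
            self.obj = sys.modules[self.modulename].__dict__[self.objname]
        return self.obj(environ, start_response)

# Write a throwaway module (hypothetical name) containing a tiny WSGI app.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, 'lazy_demo_home.py'), 'w') as f:
    f.write(
        "def app(environ, start_response):\n"
        "    start_response('200 OK', [('Content-Type', 'text/plain')])\n"
        "    return [b'Hello, world!']\n"
    )
sys.path.insert(0, moddir)

# Constructing the loader must not import anything.
loader = WSGILazyLoader('lazy_demo_home.app')
imported_before = 'lazy_demo_home' in sys.modules

statuses = []
def start_response(status, headers):
    statuses.append(status)

# The first call triggers the import and then delegates to the real app.
body = loader({'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}, start_response)
imported_after = 'lazy_demo_home' in sys.modules
```

Here imported_before stays False until the loader is actually called, which is the behaviour we want from each entry in a url map.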

To use it, we simply define our handler (and create an instance of it) in one module, here home.py:

import framework

class HomeHandler(framework.RequestHandler):
  def get(self):
    self.response.body = "Hello, world!"

home_handler = HomeHandler()

And reference it using the lazy loader in another:

import framework
from google.appengine.ext.webapp.util import run_wsgi_app

application = framework.WSGIRouter()
application.connect('/', framework.WSGILazyLoader('home.home_handler'))

def main():
  run_wsgi_app(application)

if __name__ == "__main__":
  main()

You can enhance the lazy loader, of course. One possibility is to have it take the name of a class instead of an instance, and automatically instantiate the handler class for each new request, thus duplicating the instance-per-request mechanism of webapp and other frameworks.
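
A minimal sketch of the class-based variant might look like the following. The names WSGILazyClassLoader, CountingHandler and lazy_demo_handlers are hypothetical, and the sketch assumes the handler class can be constructed with no arguments and that its instances are WSGI applications. The stand-in module is registered directly in sys.modules only so the example runs without extra files.

```python
import sys
import types

class WSGILazyClassLoader(object):
    """Hypothetical variant: imports a class lazily, instantiates it per request."""
    def __init__(self, fullname):
        self.modulename, _, self.clsname = fullname.rpartition('.')
        self.cls = None

    def __call__(self, environ, start_response):
        if self.cls is None:
            __import__(self.modulename, globals(), locals())
            self.cls = getattr(sys.modules[self.modulename], self.clsname)
        # A fresh handler instance for every request.
        return self.cls()(environ, start_response)

# Stand-in handler class that counts how many instances were created.
class CountingHandler(object):
    created = 0

    def __init__(self):
        CountingHandler.created += 1

    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'Hello, world!']

# Register a fake module so __import__ can find it without touching disk.
demo = types.ModuleType('lazy_demo_handlers')
demo.CountingHandler = CountingHandler
sys.modules['lazy_demo_handlers'] = demo

loader = WSGILazyClassLoader('lazy_demo_handlers.CountingHandler')
for _ in range(2):
    body = loader({'REQUEST_METHOD': 'GET'}, lambda status, headers: None)
```

After the two requests above, CountingHandler.created is 2, so per-request state kept on self cannot leak from one request to the next.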
