Generating PDFs on App Engine Python, and introducing Mapvelopes

This is the first of two posts covering the technologies used to implement the Mapvelopes app, an App Engine app that generates customized printable envelopes with the map to your recipient on them.

While HTML is the lingua franca of the web, it's not the be-all and end-all. Sometimes you need your webapp to generate something slightly different, and often that something is a PDF. PDFs have the major advantage that they're designed for printing: pagination is built in, and the PDF defines the page size, so nothing about the layout is left to chance. When you need to provide something for the user to print, especially when it's complex, using a PDF can make the difference between okay output and really excellent output. Hit 'Print' in a Google Docs spreadsheet, and you'll see this in action.

PDF generation on App Engine is something that's been left largely up to individual users to figure out. Depending on your runtime - Java or Python - and your specific needs, it may be quite straightforward, or rather complicated. In particular, if you want to include images in your PDF, you're going to have to jump through some ...

On the hazards of promising something you don't have

Astute readers may recall that on Monday I promised that today's post would be about providing user feedback on file uploads. It turns out, however, that I'm too clever for my own good: I thought I knew exactly how to handle this, but an unexpected roadblock has made things... problematic.

Not to worry, though! I have a way forward; it'll just take me a little longer than expected to put it together for your consumption. Your patience is appreciated.

In the meantime, I'd like to direct you to my latest post on the official App Engine blog, Easy Performance Profiling with AppStats.

Implementing a dropbox service with the Blobstore API (Part 1)

The Blobstore API is a recent addition to the App Engine platform, and makes it possible to upload and serve large files (currently up to 50MB). It's also one of the most complex APIs to use, as it has several moving parts. This short series will demonstrate how to implement a dropbox-type file hosting service on App Engine using the Blobstore API. To start, we'll cover the basics needed to upload files, keep track of them in the datastore, and serve them back to users.

First up is the upload form. This step is fairly straightforward: we create a standard HTML form, except that we generate the URL to post to by calling blobstore.create_upload_url, passing it the URL of the handler we want invoked once the upload completes. Here's the handler code:

class FileUploadFormHandler(BaseHandler):
  @login_required
  def get(self):
    self.render_template("upload.html", {
        'form_url': blobstore.create_upload_url('/upload'),
        'logout_url': users.create_logout_url('/'),
    })

Standard stuff - though it's worth pointing out that, for convenience, we're using the login_required decorator from the google.appengine.ext.webapp.util package to require users to be logged in (and redirect them to the login form if they're not). And here's ...

Task Queue task chaining done right

One common pattern when using the Task Queue API is known as 'task chaining'. You execute a task on the task queue, and at some point, determine that you're going to need another task, either to complete the work the current task is doing, or to start doing something new. Let's say you're doing the former, and your code looks something like this:

def task_func():
  # Do some stuff
  # Chain the next task
  florb # This line causes an error

I'm sure you can guess what happens here. You successfully do some work, successfully chain the next task, then you encounter an error. Your code throws an exception, and returns a non-200 status code to the task queue, which notes the failure and schedules your task for re-execution. When it re-executes, the whole thing happens all over again (if your error is persistent, instead of transient, like the above).

Meanwhile, the task you enqueued runs. Perhaps it also fails after chaining its next task. Now you have two repeatedly executing tasks. Soon you have 4 - then 8 - then 16 - and so forth. Disaster!
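To make the doubling concrete, here is a minimal simulation in plain Python. It does not use the Task Queue API at all: the list of names stands in for the queue, and simulate_broken_chain is a hypothetical helper invented for this illustration. It assumes every task chains one successor and then fails, so the queue also retries the original.

```python
def simulate_broken_chain(rounds):
    """Return the number of pending tasks after each round.

    Model: in every round, each pending task enqueues one chained task
    and then raises, so the queue schedules the failed task again too.
    """
    pending = ["task-0"]
    counts = []
    for _ in range(rounds):
        next_round = []
        for name in pending:
            next_round.append(name + "-chained")  # the successor was enqueued
            next_round.append(name)               # the failed task is retried
        pending = next_round
        counts.append(len(pending))
    return counts


print(simulate_broken_chain(4))  # prints [2, 4, 8, 16]
```

Every round doubles the backlog, which is why the chaining step and the failure must not be allowed to happen together.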

"Ah, " you may say smugly, "I don't do anything important after chaining the next task ...

Announcing a robust datastore bulk update utility for App Engine

Note: This library is deprecated in favor of appengine-mapreduce, which is now bundled with the SDK.

I'm pleased to announce the release of bulkupdate, an unoriginally-named library for the App Engine Python runtime that facilitates doing bulk operations on datastore data. With bulkupdate, simple operations like bulk re-puts and bulk deletes are trivial, while more complex operations like schema transitions or even emailing all your users become much simpler.

The basic operation of bulkupdate is very similar to the 'map' phase of the well-known 'mapreduce' pattern. To use it, you create a subclass of the 'BulkUpdater' class, and define two methods: get_query(), which returns the query to execute, and handle_entity(), which is called once for each entity returned by the query. For example, suppose you want to write a daily task that sends an XMPP message to everyone with new activity on their accounts - the updater class would look something like this:

class ActivityNotifier(bulkupdate.BulkUpdater):
  def __init__(self, date_threshold):
    self.date_threshold = date_threshold

  def get_query(self):
    return UserAccount.all().filter('last_update >', self.date_threshold)

  def handle_entity(self, user):
    if user.unread_messages > 0:
      xmpp.send_message(user.jid, "You have %s unread messages!" % user.unread_messages)

Running the job is even simpler ...
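If you'd like to see the shape of the pattern without the datastore, the following is a rough in-memory sketch. It is not the bulkupdate library: InMemoryUpdater, its run() method, and the dictionary-based accounts are all hypothetical stand-ins, and a plain list takes the place of a datastore query.

```python
class InMemoryUpdater(object):
    """Hypothetical stand-in for the get_query/handle_entity pattern."""

    def get_query(self):
        raise NotImplementedError

    def handle_entity(self, entity):
        raise NotImplementedError

    def run(self):
        # The 'map' phase: call handle_entity once per result.
        processed = 0
        for entity in self.get_query():
            self.handle_entity(entity)
            processed += 1
        return processed


class UnreadNotifier(InMemoryUpdater):
    def __init__(self, accounts):
        self.accounts = accounts
        self.notified = []

    def get_query(self):
        # A real updater would return a datastore query here.
        return iter(self.accounts)

    def handle_entity(self, account):
        # Record the address instead of sending an XMPP message.
        if account["unread_messages"] > 0:
            self.notified.append(account["jid"])


accounts = [
    {"jid": "a@example.com", "unread_messages": 2},
    {"jid": "b@example.com", "unread_messages": 0},
]
notifier = UnreadNotifier(accounts)
print(notifier.run(), notifier.notified)
```

The subclass only says what to fetch and what to do with each result; the iteration itself lives in one place.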

Taking advantage of the new Apps Marketplace

The recently unveiled Apps Marketplace has been getting a lot of attention, and a lot of people want to know how they can integrate their App Engine app with it and make use of its integrated single sign-on support. Today we'll go over what's required to get this working.

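Apps Marketplace uses OpenID for SSO. Fortunately, we can use the aeoid library, which provides a Users-API-lookalike interface, to support this in App Engine. There are two additional requirements for getting SSO to work in an Apps Marketplace app: the OpenID realm must match the one given in the app's manifest file, and discovery of OpenID URLs must support the host-meta mechanism.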

Handling the first of these is easy: The aeoid library sets the realm of an OpenID request, by default, to the domain that the request was made over, so all we need to do is use that same domain name as the realm in our app's manifest file.

The second is a little trickier. The 'janrain' python-openid library, on which aeoid and other Python-based solutions are built, does not support host-meta as a discovery mechanism for OpenID URLs. Let's analyze what this discovery ...

Please stand by

Due to unforeseen technical difficulties, today's blog post has been delayed. Look for it next week, when I'll describe what you can do to get started writing an app for the new Apps Marketplace right now.

In other news, I'm spending most of next week travelling, so I won't be able to keep up my usual thrice-weekly updates. Regular blogging will resume the following week. Sorry!

Using the ereporter module for easy error reporting in App Engine

One little-known module in the google.appengine.ext package is ereporter. It exists to make it easier to get summaries of errors generated by your Python App Engine app, and today we'll show you how.

Far too often, error reporting for live webapps is a catch-as-catch-can practice, with reports coming in from dedicated users, or turning up whenever you think to check the logs page of your app. A lot of bugs can slip through this way, with exceptions going unnoticed by everyone but the users who experience them and then walk away in disgust, never to return. With ereporter, however, we'll demonstrate how to set up a simple handler that captures all the exceptions that occur in your app and emails you a daily report summarizing what went wrong.
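To give a feel for the general idea (capture exceptions as they are logged, then summarize them), here is a small sketch using only the standard logging module. It is not how ereporter is implemented: ExceptionSummaryHandler, its report() method, and the "summary_demo" logger name are all made up for this illustration, and it keeps its counts in memory rather than in the datastore.

```python
import logging
from collections import Counter


class ExceptionSummaryHandler(logging.Handler):
    """Counts logged exceptions by type and message (illustrative only)."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.counts = Counter()

    def emit(self, record):
        # Only records logged with exception info carry a traceback tuple.
        if record.exc_info and record.exc_info[0] is not None:
            exc_type, exc_value, _ = record.exc_info
            self.counts[(exc_type.__name__, str(exc_value))] += 1

    def report(self):
        # One line per distinct exception, most frequent first.
        return "\n".join(
            "%d x %s: %s" % (count, name, message)
            for (name, message), count in self.counts.most_common()
        )


logger = logging.getLogger("summary_demo")
logger.propagate = False  # keep the demo quiet on stderr
handler = ExceptionSummaryHandler()
logger.addHandler(handler)

for _ in range(3):
    try:
        int("not a number")
    except ValueError:
        logger.exception("request failed")

print(handler.report())
```

Grouping by exception type and message is one simple choice; a production report would more likely group by where the exception was raised.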

Installing ereporter consists of three steps: modifying your handler script, modifying your app.yaml, and adding a cron job. Let's start by modifying your handler script(s). Add the following to the top of all your handler scripts (that is, scripts that are mentioned in app.yaml):

import logging
from google.appengine.ext import ereporter

ereporter.register_logger()

The ...

Announcing the SQLite datastore stub for the Python App Engine SDK

For the past couple of weeks, I've been working on one of those projects that seems to suck up every available moment (and some that technically aren't). Now, however, it's largely done, and as an extra bonus, I've been given permission to release it as an early preview for those who are interested.

The code in question is a new implementation of the local datastore for the Python App Engine SDK. While some of you are probably delighted at the news, I expect most of you are puzzled. Why do we need a new local datastore implementation? Let me explain.

The purpose of the local stubs in the App Engine SDK is to exactly replicate the behaviour of the production environment, and in general they do that very well. A specific non-goal is replicating the performance characteristics of the production environment, or being as scalable as the production environment - the stubs are designed for testing, not production use.

The Python SDK's datastore implementation operates by storing the entire contents of your development datastore in memory. It writes changes to disk so that it can reload your datastore when the dev_appserver is restarted, but the in-memory ...

Handling downtime: The capabilities API and testing

After the unfortunate outage the other day, how to handle downtime with your App Engine app is a bit of a hot topic. So what better time to address proper error handling for situations where App Engine isn't performing at 100%?

There are three major topics to cover here: handling timeouts from API calls, using the Capabilities API, and testing your app's support for handling failures. We'll go over them in order.

Handling timeouts

At the 'stub' level, timeouts and other exceptions are communicated by the stub throwing a google.appengine.runtime.apiproxy_errors.ApplicationError. ApplicationError instances have an 'application_error' field containing an ID, drawn from google.appengine.runtime.apiproxy_errors, that indicates the cause of the error. As you can see, DEADLINE_EXCEEDED is 4. Other errors of interest are OVER_QUOTA, which will occur if your app runs out of quota for a given API call or capability, and CAPABILITY_DISABLED, which is thrown if the API capability has been explicitly disabled (more on this later).
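As a rough illustration of inspecting that field, here is a self-contained sketch. The ApplicationError class below is a stand-in defined locally so the example runs without the SDK. Only the value 4 for DEADLINE_EXCEEDED comes from the text above; the other two numbers are placeholders, and the classify helper and its suggested reactions are hypothetical.

```python
DEADLINE_EXCEEDED = 4       # value given in the text above
OVER_QUOTA = 1001           # placeholder value, assumed for this sketch
CAPABILITY_DISABLED = 1002  # placeholder value, assumed for this sketch


class ApplicationError(Exception):
    """Local stand-in for the SDK's ApplicationError."""

    def __init__(self, application_error, error_detail=""):
        super().__init__(application_error, error_detail)
        self.application_error = application_error
        self.error_detail = error_detail


def classify(error):
    """Map an error code to a suggested reaction (hypothetical policy)."""
    reactions = {
        DEADLINE_EXCEEDED: "retry later",
        OVER_QUOTA: "degrade gracefully",
        CAPABILITY_DISABLED: "show maintenance notice",
    }
    return reactions.get(error.application_error, "re-raise")


try:
    raise ApplicationError(DEADLINE_EXCEEDED, "call timed out")
except ApplicationError as e:
    print(classify(e))  # prints "retry later"
```

In real code you would import the error class and constants from the SDK rather than defining them yourself.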

Each of the various APIs catches ApplicationErrors thrown by its stub, and wraps them in a higher-level exception. The datastore, for example, has a function, _ToDatastoreError, that maps different error codes to ...