Nick's Blog

App Engine Java API call hooks

Posted by Nick Johnson | Filed under tech, app-engine, coding, java

In a previous post, we discussed API call hooks for Python. It's possible to hook and modify RPC calls in Java, too. In this post, we'll demonstrate how.

All API calls in Java are handled by the class com.google.apphosting.api.ApiProxy. This class behaves similarly to the ApiProxyStubMap in the Python SDK, having makeSyncCall and makeAsyncCall functions that take care of invoking the relevant API calls. Unlike the Python runtime, however, all Java API calls are handled by a single delegate class, defined by the interface ApiProxy.Delegate. The active delegate for an App Engine app can be retrieved with ApiProxy.getDelegate, and set with ApiProxy.setDelegate.

Another difference from the Python API is that API calls are serialized into byte arrays before being passed to the ApiProxy, and responses are likewise deserialized by the caller. As a result, the makeSyncCall method takes a byte array as an argument, and returns a byte array.

Because all calls are routed to a single Delegate, the granularity of hooks supported is restricted to hooking all API calls, or none. To make things easier, we'll define a Delegate implementation that dispatches API calls to other Delegate subclasses based ...

Read more | Comments | 16 November, 2009

Shiny!

Posted by Nick Johnson | Filed under app-engine, usb-key, tech

Shiny, and headed my way (hopefully) in time for .astronomy:

Read more | Comments | 14 November, 2009

Python Gotchas

Posted by Nick Johnson | Filed under python, tech, app-engine, coding

A lot of App Engine developers are fairly new to Python as well, and so probably haven't encountered a few subtle 'gotchas' about the Python programming language. This post aims to sum up the ones you're most likely to encounter while programming for App Engine.

Mutable default arguments

First up is something every Pythonista learns sooner or later. What's wrong with this snippet?

def append(value, l=[]):
  l.append(value)
  return l

Read more | Comments | 13 November, 2009

API call hooks for fun and profit

Posted by Nick Johnson | Filed under tech, app-engine, cookbook, coding, hooks

API call hooks are a technique that's reasonably well documented for App Engine - there's an article about it - but only lightly used. Today we'll cover some practical uses of API call hooks.

To start, let's define a simple logging handler. This can be useful when you're seeing some odd behaviour from the datastore, especially when you're using a library that may be modifying how it works. You can also use it to log any other API, such as the URLFetch or Task Queue APIs. For this, we just need a post-call hook:

def log_api_call(service, call, request, response):
  logging.debug("Call to %s.%s", service, call)
  logging.debug(request)
  logging.debug(response)

To install this for the datastore, for example, we call this:

apiproxy_stub_map.apiproxy.GetPostCallHooks().Append(
        'log_api_call', log_api_call, 'datastore_v3')

The arguments to Append are, in order, a name for the hook, the hook function itself, and, optionally, the service you want to hook. If you leave out the last argument, the hook is installed for all API calls.

Once a hook is installed, it remains installed as long as the runtime is loaded - including across requests, and for requests to ...

Read more | Comments | 11 November, 2009

Damn Cool Algorithms: Spatial indexing with Quadtrees and Hilbert Curves

Posted by Nick Johnson | Filed under tech, coding, damn-cool-algorithms

Last Thursday night at Oredev, after the sessions, was "Birds of a Feather" - a sort of mini-unconference. Anyone could write up a topic on the whiteboard; interested individuals added their names, and each group got allocated a room to chat about the topic. I joined the "Spatial Indexing" group, and we spent a fascinating hour and a half talking about spatial indexing methods, reminding me of several interesting algorithms and techniques.

Spatial indexing is increasingly important as more and more data and applications are geospatially-enabled. Efficiently querying geospatial data, however, is a considerable challenge: because the data is two-dimensional (or sometimes, more), you can't use standard indexing techniques to query on position. Spatial indexes solve this through a variety of techniques. In this post, we'll cover several - quadtrees, geohashes (not to be confused with geohashing), and space-filling curves - and reveal how they're all interrelated.

Quadtrees

Quadtrees are a very straightforward spatial indexing technique. In a Quadtree, each node represents a bounding box covering some part of the space being indexed, with the root node covering the entire area. Each node is either a leaf node - in which case it contains ...

Read more | Comments | 09 November, 2009

Trip reports: Stack Overflow Amsterdam, and Oredev

Posted by Nick Johnson | Filed under tech, conferences

I spent this week attending and speaking at two different conferences - the Stack Overflow Dev Day in Amsterdam, and Øredev, in Malmö, Sweden.

Stack Overflow was a lot of fun. Talks covered topics such as FogBugz, jQuery, Fog Creek Software, creators of FogBugz, QT (pronounced 'cute'), FogBugz, Python, App Engine (my own talk), Yahoo! developer tools, and, of course, FogBugz.

Particular highlights for me were Simon Willison's talk about Python, and Christian Heilmann's talk on Yahoo! developer tools. Simon works at The Guardian, a UK newspaper that is particularly keen on exposing and consuming data and APIs, and used his hour to introduce Python by demonstrating how he uses it, particularly the interactive interpreter, on a day to day basis to extract and munge statistical data and produce content for news stories, such as infographics. Although I know Python very well, his presentation style was extremely engaging, and I had to restrain myself from going up to him and asking if he was looking for apprentices in the fine art of presenting. He's also a co-author of Django, so he showed off some of Django's features in his talk as well.

Christian's talk was completely ...

Read more | Comments | 06 November, 2009

Talk Schedule

Posted by Nick Johnson | Filed under app-engine, conferences

I'm going to be spending the next week going to conferences. As a result, your regularly scheduled posts are likely to be absent next week; it's going to be hectic enough as it is. In the meantime, here's my talk schedule for the next while.

November 2: Presenting on App Engine Python at Stack Overflow Dev Day Amsterdam.
November 4: Presenting on App Engine Java at Øredev in Malmo, Sweden.
November 18: Python Ireland talk on App Engine at Scruffy Murphy's, Dublin, Ireland.
November 30 - December 4: At .Astronomy 2009 in Leiden, Netherlands, helping people put together Astronomy apps on App Engine
January 24 - January 30: LovelySnow Sprint in the Austrian Alps, another App Engine hackathon.

Server-side JavaScript with Rhino

Posted by Nick Johnson | Filed under tech, app-engine, cookbook, coding, java

I've only made limited use of the Java runtime for App Engine so far: The two runtimes are largely equivalent in terms of features, and my personal preference is for Python. There are, however, a few really cool things that you can only do on the Java runtime. One of them is embedding an interpreter for a scripting language.

Rhino is a JavaScript interpreter implemented in Java. It supports sandboxing, meaning you can create an environment in which it's safe to execute user-provided code, and it allows you to expose arbitrary Java objects to the JavaScript runtime, meaning you can easily provide interfaces with which the JavaScript code can carry out whatever actions it's designed to carry out.

Rhino also supports several rather cool additional features. You can set callback functions that count the number of operations executed by the JavaScript code, and terminate it if it takes too long, you can serialize a context or individual objects and deserialize them later, and you can even use continuations to pause execution when waiting for data, and pick it up again later - possibly even in another request! Hopefully we'll show off some of these features in future ...

Blogging on App Engine, part 10: Recap

Posted by Nick Johnson | Filed under coding, app-engine, tech, bloggart

This is part of a series of articles on writing a blogging system on App Engine. An overview of what we're building is here.

Over the last 9 posts, we've covered building a fully functional blogging system for App Engine from scratch, and I've even migrated this blog to it. In the process, we've covered many important components of App Engine, including the new deferred library (and through it, Task Queues), important design principles, such as pre-generation of content, and interesting technologies such as PubSubHubbub, CSEs, Disqus, and sitemaps.

There's also been significant community involvement, for which I am very grateful. Amongst the contributors were:

Moraes, who ported bloggart to werkzeug, calling it bloggartzeug
Sylvain, who ported bloggart to tornado, calling it bloggartornado
andialbrecht, who contributed - and continues to contribute - enhancements and bugfixes for bloggart
Everyone who contributed a name suggestion on the first post, and everyone who has provided feedback during the series
Everyone who filed bug reports and feature requests in the issue tracker

To give some sort of objective assessment of how well Bloggart measures up, we need to compare it to our original goals, outlined in the very first post:

Simple ...

Blogging on App engine, part 9: Sitemaps and verification

Posted by Nick Johnson | Filed under tech, app-engine, coding, bloggart

This is part of a series of articles on writing a blogging system on App Engine. An overview of what we're building is here.

Today we're going to cover basic sitemap support and verifying your site with Google.

Sitemaps

Sitemaps are a recent innovation that aim to make it easier for search engines to find and index your site. The format is a very straightforward XML file. Several optional attributes can be present, such as the last-modified date and update frequency; for this first attempt we're not going to use any of them, and just provide a basic listing of URLs. Future enhancements could provide more sitemap information, and break the sitemap into multiple files for extensibility.

For the purposes of generating a complete sitemap, we have a significant advantage: Our static serving infrastructure provides us with a convenient means of getting a list of all valid URLs. Not all URLs should be indexed, however, so we should make it possible to specify what content should be indexed. Add a new property to the StaticContent model in static.py:

    indexed = db.BooleanProperty(required=True, default=True)

We'll need to enhance our set() method to take this ...

Mutable default arguments

Quadtrees

Sitemaps

Blogroll