Google I/O playlist, day 6: Batch data processing with App Engine

This is the sixth in a series of posts providing a day-by-day playlist to help break up the Google I/O session videos - specifically the App Engine ones - into manageable chunks for those that haven't seen them.

Today's video is Mike Aizatsky's Batch data processing with App Engine, where he describes the recently-released mapper framework for App Engine. I blogged previously about this framework, giving a detailed breakdown of the demo mapreduce used in this very talk!

The mapper API is initially being released for Python, so it'll be mostly of interest to Python users - but Java is coming soon, so it's well worth watching even if you only speak Java.

Exploring the new mapper API

One of the new features announced at this year's Google I/O is the new mapper library. This library makes it easy to perform bulk operations on your data, such as updating it, deleting it, or transforming/filtering/processing it in some fashion, using the 'map' (and soon, 'reduce') pattern. I'm happy to say that I'm deprecating my own bulkupdate library in favor of it.

The mapper API isn't just limited to mapping over datastore entities, either. You can map over lines in a text file in the blobstore, or over the contents of a zip file in the blobstore. It's even possible to write your own data sources - something we'll cover in a later post. Today, though, I'd like to dissect the demo that was presented at I/O. The demo uses a number of the mapper framework's more sophisticated features, so it's a good one to use to get an idea for how the framework works.

For basic usage, the Getting Started page in the mapper docs is the place to go. If you're interested in seeing something more complex in practice, read on...

The demo at I ...