Google I/O playlist, day 5: Data migration in App Engine

This is the fifth in a series of posts providing a day-by-day playlist to help break up the Google I/O session videos - specifically the App Engine ones - into manageable chunks for those that haven't seen them.

Today's video is Data migration in App Engine, Matthew Blain's talk on new improvements to the bulkloader. I've blogged about the new bulkloader previously, but Matthew's talk goes into a lot more detail.

Matt starts talking about the new configuration format at 6:38, if you want to skip the intro.

Using the new bulkloader

Recently, Matthew Blain, of the App Engine team, announced the prerelease of a new bulkloader. The new bulkloader uses yaml files for configuration, and takes a 'declarative' rather than procedural approach to configuration for downloading and uploading data. As a result, you don't have to understand Python in order to configure and use the new bulkloader, which is a significant advantage for users of the Java App Engine runtime.

There are, of course, many other significant improvements, including autogeneration of config files, a bulit in library of converters for common data types, support for input and output types other than CSV, and more. Today, we'll walk through basic usage of the new bulkloader, and demonstrate some of its features.

Configuration autogeneration

One of the most significant new features of the bulkloader is its support for autogenerating config files. It works like this: You point it at your production app, and it downloads the datastore stats, and uses them to generate a configuration file for you. You edit the configuration file to fill in a few missing fields and tidy it up, and presto, you have a working bulkloader configuration. Let's see how that works out when we ...