Breaking things down to Atoms

Over the last few years, more and more sites have been providing RSS and Atom feeds. This has been a huge boon both for keeping up to date with content through feed readers, and for programmatically consuming data. There are, inevitably, a few holdouts, though. Notable amongst those holdouts -and particularly relevant to me at the moment - are property listing sites. Few, if any property listing sites provide any sort of Atom or RSS feed for listings, let alone a API of any kind.

To address this, I decided to put together a service for turning other types of notification into Atom feeds. I wanted to make the service general, so it can support many different formats, but the initial target will be email, since most property sites offer email notifications. The requirements for an email to Atom gateway are fairly straightforward:

  • Incoming email should be converted to entries in an Atom feed in such a fashion as to be easily interpretable in a feed reader
  • As much of the original message and metadata as possible should be preserved in the Atom feed entry
  • It should be possible to access the original message if desired

In addition, I had a ...

Consuming RSS feeds with PubSubHubbub

Frequently, it's necessary or useful to consume an Atom or RSS feed provided by another application. Doing so, though, is rarely as simple as it seems: To do so robustly, you have to worry about polling frequency, downtime, badly formed feeds, multiple formats, timeouts, determining which items are new and other such issues, all of which distract from your original, seemingly simple goal of retrieving new updates from an Atom feed. You're not alone, either: Everyone ends up dealing with the same set of issues, and solving them in more or less the same manner. Wouldn't it be nice if there was a way to let someone else take care of all this hassle?

As you've no doubt guessed, I'm about to tell you that there is. I'm speaking, of course, of PubSubHubbub. I discussed publishing to PubSubHubbub as part of the Blogging on App Engine series, but I haven't previously discussed what's required to act as a subscriber. Today, we'll cover the basics of PubSubHubbub subscriptions, and how you can use them to outsource all the usual issues consuming feeds.

At this point, you may be wondering how this is ...