Friday, September 3, 2010

Putting Europopped on the Map

After spending 3 years of my life making Maps API mashups, I now have a bit of an addiction. Whenever I see geographic information, I have this uncontrollable urge to visualize that information on a map. So when I started reading Europopped.com, a blog that chronicles awesome and awful music videos from European countries, I also started imagining how I could show those blog posts on a map. Last night, between episodes of Arrested Development and True Blood (a balanced TV diet), I realized my fantasies: Europopped: On the Map.

Here's a step-by-step of how the mashup works:

  • It uses the JavaScript Maps API v3 to create a map centered on Europe.
  • It queries the Posterous API to retrieve all the tags and tag counts for the blog.
  • It creates markers for each of the tags, using latitude/longitude coordinates stored in the JS, and showing the tag count on top of each marker.
  • When you click on a particular country marker, it queries the Posterous API for all the posts for that tag.
  • It creates a sidebar of links for each post, and sets a click listener that embeds the video for each post in the infowindow.


Here are some tips for how I made it quickly:

  • I used my Spreadsheets Geocoder wizard to get the coordinates for all of the countries, and used a spreadsheet formula to generate JSON from my geocoded spreadsheet.
  • I used getlatlon.com to find the ideal map center.
  • I used the MapIconMaker Wizard to generate a template marker image URL for my markers, and passed the tag count into that image URL to change the number for each country.
  • I used my CenterBox control class from another project to create a centered info box instead of the typical infowindow.
  • I used regexpal.com to test the regular expression that extracted the YouTube URLs.
  • I used my Posterous JSON API proxy for all of the API calls.
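
For the curious, the YouTube-extraction step boils down to something like this Python sketch (the real mashup does it in JavaScript, and the exact pattern and function name here are my illustration, not the mashup's code):

```python
import re

# Hypothetical pattern for pulling YouTube video IDs out of post HTML --
# the kind of regex you'd iterate on in regexpal.com.
YOUTUBE_RE = re.compile(r'http://www\.youtube\.com/(?:watch\?v=|v/)([\w-]+)')

def extract_youtube_ids(html):
  """Return the list of YouTube video IDs found in the given HTML."""
  return YOUTUBE_RE.findall(html)
```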


That's why I love APIs and the web -- once you're aware of them, they make it possible to quickly create new ways to use and explore the data and sites that you love.

Happy EuroPopping! :)

Tuesday, August 31, 2010

How to Pretty-Print Code Snippets in Blogger

I often want to include snippets of code in my instructional blog posts, so to make my blog posts easier to read, I decided to add syntax-highlighting to my blog tonight. There are various syntax highlighting solutions out there, but we use google-code-prettify on code.google.com and it's worked well enough there, so I went with what I knew.

Here's how I added it to my Blogger blog:

  1. Click the Design tab and "Edit HTML".
  2. After the meta tag in the HTML, paste these two includes for the JS and CSS:
    <link href='http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.css' rel='stylesheet' type='text/css'/>
    <script src='http://google-code-prettify.googlecode.com/svn/trunk/src/prettify.js' type='text/javascript'/>
    
  3. Search for "script" - for me, there's a script tag near the bottom of the page. In that script tag, put this JavaScript call:
    prettyPrint();
    
    If that tag doesn't exist, then just create a script tag at the bottom yourself.
  4. Now, whenever you're posting, add the prettyprint class to your pre or code tags:
    <pre class="prettyprint">
    var i = 2 + 4;
    </pre>
    

To see examples of where I've used this, check out JSON API for Posterous for Python snippets or the Google APIs Timeline for JS snippets.

For more details on using the prettify library, see the readme.

Monday, August 30, 2010

A JSON API for Posterous

I recently became mildly obsessed with Europopped.com, a blog that highlights both really catchy & horribly tacky music videos from all over Europe, and I've started thinking up mashups to fuel my obsession. So, I looked up the API for Posterous.com, the blogging platform that powers Europopped, and discovered that its API is not quite as mashup-friendly as I hoped. They do offer an API for retrieving public feeds without authentication -- the first thing I looked for -- but the API result output is a custom XML format -- not optimal for client-side mashups. I was expecting to find an API output that was either ATOM-based, so I could pipe it through existing Feed->JS proxies like the Google AJAX Feeds API, or even better, an API output in JSON with support for callback parameters. The documentation indicates the API is still under development, however, so hopefully they will soon go down one or both of those routes.

But in the meantime, I decided to remedy their lack of a JSON output with a quick App Engine app to proxy API requests, convert the XML to JSON, and return it.

First, the end result:

If I wanted to use the Posterous API to get the last 50 posts from the Europopped blog, I'd fetch this URL and it would return XML for each post:

http://posterous.com/api/readposts?hostname=europopped&num_posts=50

To use my proxied JSON API to get those 50 posts, I'd fetch this URL:

http://posterous-js.appspot.com/api/readposts?hostname=europopped&num_posts=50
Tip: Install the JSONView extension for Chrome to see the result pretty-printed.

Notice that the only difference is the domain name -- I wanted the proxied API to mirror the actual API as much as possible, to make it easy to figure out the URLs to construct from the documentation, and to make it easy to port to an actual JSON offering from Posterous in the future, on the assumption that actually happens. :)
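
In code, that mirroring means a proxied URL is just a domain swap; here's a tiny Python sketch (the helper name is mine, not part of the proxy):

```python
# Convert an official Posterous API URL into its proxied JSON equivalent
# by swapping only the domain -- everything else stays identical.
def to_proxied_url(posterous_url):
  return posterous_url.replace(
      'http://posterous.com/', 'http://posterous-js.appspot.com/', 1)
```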

If I want to get the same JSON wrapped in a callback, to use it inside a webpage, I'd fetch this URL:

http://posterous-js.appspot.com/api/readposts?hostname=europopped&num_posts=50&callback=loadPosts

Now, the code behind it:

I've checked in the two files it took to write the proxy on App Engine for Python, and I'll step through them here.

First, I set up a URL handler to direct all /api requests to my api.py script:

application: posterous-js
version: 1
runtime: python
api_version: 1
handlers:
- url: /api/.*
  script: api.py

Then, in api.py, I directed all requests to be handled by ApiHandler, a webapp.RequestHandler class. In that class, I reconstruct the URL for the Posterous API request:

  url = 'http://posterous.com' + self.request.path + '?' + self.request.query_string

Then I check memcache to see if I've already fetched that request recently (in the last 5 minutes):

  cached_result = memcache.get(url)
  if cached_result:
    dict = simplejson.loads(cached_result)
  else:
    dict = self.convert_results(url)

If I didn't find it in cache, then I'll call a function to fetch the URL and convert specified top-level tags in the XML to JSON:

  result = urlfetch.fetch(url, deadline=10)
  if result.status_code == 200:
    dom = minidom.parseString(result.content)
    errors = dom.getElementsByTagName('err')
    if errors:
      dict = {'error': errors[0].getAttribute('msg')}
    elif url.find('readposts') > -1:
      dict = self.convert_dom(dom, 'post')
    elif url.find('gettags') > -1:
      dict = self.convert_dom(dom, 'tag')
    elif url.find('getsites') > -1:
      dict = self.convert_dom(dom, 'site')

I convert from XML to JSON using the minidom library, converting each tag to a JSON key and recording the text data or CDATA as the JSON value. This technique means that I don't actually convert any nested XML tags, but in the Posterous API, that only means that my output is missing the comments information for posts, which is the least interesting information for me.

  def convert_dom(self, dom, tag_name):
    dict = {}
    top_nodes = dom.getElementsByTagName(tag_name)
    nodes_list = []
    for top_node in top_nodes:
      child_dict = {}
      for child_node in top_node.childNodes:
        # Skip whitespace-only text nodes, and guard against empty elements
        # that have no text child to read from.
        if child_node.nodeType != child_node.TEXT_NODE and child_node.firstChild:
          child_dict[child_node.tagName] = child_node.firstChild.wholeText
      nodes_list.append(child_dict)
    dict[tag_name] = nodes_list
    return dict

Finally, after getting the JSON representing the API call, I output it to the screen with the appropriate mime-type and wrap it in a callback, if specified:

  json = simplejson.dumps(dict)
  memcache.set(url, json, 300)
  callback = self.request.get('callback')
  self.response.headers['Content-Type'] = 'application/json'
  if callback:
    self.response.out.write(callback + '(' + json + ')')
  else:
    self.response.out.write(json)

It's a quick hack and one that I hope to see replaced by the official Posterous API, but it's cool that it was so easy to do and now I can move on to actually making the Europopped mashup of my dreams. :)

Sunday, August 29, 2010

Girl Develop It: Teaching Web Programming to Women

A few months ago, Sara Chipps and I ended up on the same list of "Hacker Women on Twitter", and I followed the link from her bio to the Girl Develop It (GDI) project. The mission of GDI is to lessen the gender gap on the web by getting more women to develop software, and they're going about that by offering low-cost web programming courses to women in their local area (New York City). Some of those women might become web developers themselves, but even if most of them don't, they will hopefully inspire the women around them (like daughters and friends) to think about going down that road. Of all the various attempts to get more girls in computing, this one is my favorite. It may take time -- it may even take generations -- but it's worth a shot.

So, I'm bringing GDI to Sydney, and kickstarting it with an introductory series on HTML & CSS, which will take place in 5 classes over 3 weeks at our local Google office. The series will be run like an actual course, in that students are expected to attend every class, to do homework, and to do a small final project (a personal website).

I'm currently in the midst of creating the curriculum, using my HTML5-based slide-making application, and am hoping to make it fun, practical, and re-usable. :)

Here's where you come in:

  • If you're a local female looking to learn those topics, then you can read more and register from the GDI Sydney page.
  • If you're a local female that's keen to help others learn these topics, then we'd love to have you as a teaching assistant for the course. Just send me an email or wave (pamela.fox@).
  • If you're just curious to see how it goes, then subscribe to the main GDI blog.


Thanks so much to Sara for supporting the Sydney version of GDI. I'm excited to see how this goes and to meet the first class of students!

Wednesday, August 25, 2010

Importing data from Spreadsheets to App Engine

Google App Engine provides the remote_api mechanism for uploading and downloading data from the datastore. It's handy and lets you import different types of data, but requires a certain amount of setup, and well, sometimes I'm lazy and don't feel like going through that setup. So, another way that you can easily import data into your datastore is to store it in a Google spreadsheet, publish the sheet, and write a handler to import the spreadsheet rows as datastore entities.

For example, I created a spreadsheet to store information on Wave extensions, using one column for the URL and another column to indicate if they're featured or not.

Then, I published that spreadsheet using the Share->Publish menu, and constructed a URL for the JSON database-like output:

https://spreadsheets.google.com/feeds/list/0Ah0xU81penP1dDNwSFROSU5KVlFRbmo5cERsTElKTGc/od6/public/values?alt=json

To get the URL for your own public spreadsheet, just change the spreadsheet key (the long string there) and the worksheet ID (the first sheet is always 'od6').
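
That URL construction can be sketched in Python like so (the function name is my own, just for illustration):

```python
# Build the public JSON feed URL for a published spreadsheet from its key
# and worksheet ID ('od6' is always the first sheet).
def feed_url(key, worksheet_id='od6'):
  return ('https://spreadsheets.google.com/feeds/list/%s/%s/public/values?alt=json'
          % (key, worksheet_id))
```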

That JSON includes an array of entry objects, and each entry object contains an object for each of the columns, e.g.

[...
 {"gsx$url": {"$t": "http://api.rucksack.com/hostelwithme.xml"},
  "gsx$featured": {"$t": "yes"}},
...]

Note: Column headers are stripped of whitespace and lowercased when converted to keys in the JSON feed, so I always just start off with them that way in the spreadsheet to make it painless to find them in the JSON.
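
That normalization can be approximated in Python (my approximation of the behavior, not the API's exact algorithm):

```python
import re

# Approximate how a spreadsheet column header becomes a gsx$ key in the
# JSON feed: lowercase it and strip the whitespace.
def gsx_key(header):
  return 'gsx$' + re.sub(r'\s+', '', header.lower())
```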

Now, I write a simple handler that will pull in that JSON, parse each entry object, and convert them into datastore entities.

class ImportAppsActionHandler(BaseHandler):
  """Handler for importing existing apps."""

  def get(self):
    user = users.get_current_user()
    # Need admin access to import
    if not users.is_current_user_admin():
      self.error(403)
      return
    # Fetch JSON of published spreadsheet
    url = "http://spreadsheets.google.com/feeds/list/0Ah0xU81penP1dDNwSFROSU5KVlFRbmo5cERsTElKTGc/od6/public/values?alt=json"
    result = urlfetch.fetch(url)
    if result.status_code == 200:
      feed_obj = simplejson.loads(result.content)
      if "feed" in feed_obj:
        entries = feed_obj["feed"]["entry"]
        # Make an Application entity for each entry in feed
        for entry in entries:
          url = entry['gsx$url']['$t']
          # The featured column is available here if the model tracks it
          featured = entry['gsx$featured']['$t']
          app = models.Application()
          app.url = url
          app.moderation_status = models.Application.MOD_APPROVED
          app.AddAuthor(user)
          app.AddMetadata()
          app.put()

    # Clear the memcache
    memcache.flush_all()

When I visit that handler, it imports the data, and it works the same way on both the local development server and the deployed app.

There are various caveats to this technique, of course. First, your spreadsheet needs to be published. If you wanted to do this with a private spreadsheet, for more sensitive data, you would need to use the full Spreadsheets API and do an authentication dance. Second, your handler is limited to the typical 30-second limit for an App Engine request. If you wanted to use it to import many rows of data, you'd probably want to split the work across multiple requests by using the deferred task queue or redirecting with pagination.
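
The pagination idea can be sketched as pure paging logic, something like this (the names and page size are my illustration, not code from the app):

```python
PAGE_SIZE = 50  # rows imported per request (illustrative)

def next_page(entries, start):
  """Return (rows to import this request, start index for the next request,
  or None when done). A handler using this would redirect to itself with
  ?start=<next_start> until next_start comes back as None."""
  page = entries[start:start + PAGE_SIZE]
  next_start = start + PAGE_SIZE
  if next_start >= len(entries):
    next_start = None
  return page, next_start
```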

But, hey, it was useful for my situation, so maybe it's useful for one more situation out there in the world. :)

Tuesday, August 24, 2010

Tip for Networking at Conferences: Be a Speaker!

Most people don't realize it, but I am incredibly shy -- they don't realize it because I've spent a long time being shy and have developed various "workarounds", because I know that it's healthy for me to interact with people and that it's not healthy for me to be a hermit (though tempting).

One of the situations where I find it quite easy for my shyness to take over is at conferences, where I'm surrounded by hundreds of people that I don't know. I think perhaps that some of them would be interesting conversational partners, but I haven't the slightest idea who, or how to approach them.

So, I work around it -- by being a speaker. By speaking at a conference, I make it so that there is at least a room full of people that now have an excuse to talk to me, and I have something to talk with them about. That's a roomful more of people than when I was wandering around aimlessly through the halls before the talk!

Now, I know, it's not possible to be a speaker at every conference you go to. But many conferences (at least the cool ones) offer lightning talk sessions that can be signed up for on the day of the event -- and many people have at least one interesting or funny topic they can talk about for 5 minutes.

Whether you're a pre-slotted speaker or a lightning talk speaker, try to get your speaking slot on the first day. First, of course, that means you'll be able to relax after your talk and enjoy more of the conference; second, it means that the roomful of people will know of you sooner, and have more time to strike up a conversation with you.

And, hey, if any of you ever see me wandering around a conference (or sneaking into a bathroom to hide from all the intimidating people), stop by and say "hi". :)

Sunday, August 22, 2010

5lide: HTML5-based Slides Maker

At last week's GTUG campout, a 3-day-long HTML5 hackathon, I signed up to be a TA for the weekend. That meant I spent most of my time wandering around answering random questions and helping developers debug their hacks. But, I can't be surrounded by a bunch of people hacking on cool shit and not join in myself -- it's just way too tempting. So, on Friday night, after coming home from the pitches and discovering that drinking 2 Dr. Peppers was not in fact a good way to avoid jet lag, I stayed up into the wee hours hacking on an idea I'd been brewing for a few weeks.

As some of you know from my posts about Prezi and Ignite, I am a fan of alternative slide formats and presentation techniques. In my work as both a student and a developer advocate, I have made a massive number of Powerpoint presentations, and I do believe there is much room for improvement and room for experimentation. So, whenever I spot a new slide format in the wild, I get excited to try it out myself.

Early last year, the HTML5 advocates started using a set of slides that both showed off HTML5 features and were written in HTML5 - so they could do interactive samples and harness the power of HTML5 at the same time. (And by HTML5, I mostly mean rounded corners and CSS transitions :). They recently created a generic stripped-down version for anyone to modify and use in the HTML5 studio, but I wanted to take it a step further than that. I wanted to be able to store my slide data in a database and pull that into the slides template dynamically, so that I could work on my slide content separate from my presentation and easily create multiple slidesets without coding the base HTML each time. Thus began my hack!

Since I had limited time to work on the app, I looked around for a sample application to start off with. One of the things I love about App Engine (well, at least the Python version) is that when I find an open-sourced app similar to the one in my head, I can get it downloaded and deployed in just a few minutes. In the google-app-engine-samples project, I discovered the tasks app by the great Bret Taylor. The tasks app lets users sign in and create different task lists, where every list has a re-orderable set of tasks. The similarity to my app design was uncanny, and ridiculously convenient. With some simple search and replace, the tasks app became a slides app, letting users create different slide sets, where every slide set has a re-orderable set of slides. (See what I mean?) Then I added the more slide-oriented features: I turned the generic HTML5 slide deck into a Django template that pulled in the data, I added a "theme" option for each slide deck and used a different CSS for each theme ("party", "ballerina", and "android"), and I created a notion of a slide type for each slide (either the intro, transition, or body).

I demoed the app in this form on demo night, and as usual, I haven't had time to add anything else to it since then. I'd like to add an "import from docs" as the next feature, as I have a few slidesets I want to bring over. I also think the slide editing interface could use some love and re-thinking, as it's really just a re-skinned task list editing interface right now. I have open-sourced all of the code here, as I'd love for other people to play around with it and maybe submit some patches (hint hint).

Happy 5lide-ing! :)