Thursday, November 27, 2008

Geocoding with Google Spreadsheets (and Gadgets)

Update: Google is deprecating Google Spreadsheets gadgets, as they announced in this post, so I am no longer updating or supporting them. If you're a developer, you can try Apps Script or the Spreadsheets Data API to see if you can accomplish the same thing. If you're a user, sorry, sometimes Google shuts down little used features.

Recently, quite a few people pointed me out to Tony Hirst's post on "Data Scraping Wikipedia with Google Spreadsheets". This was a really cool post because it demonstrated alot of features about Google Spreadsheets that people don't take advantage of (but should, because they're awesome): importing data from webservices or websites, publishing spreadsheets, creating charts of data, etc. But, I was sad to see that Tony had to go through so much pain to just geocode the data. Most people I know wouldn't be able to create a Yahoo! Pipe to parse the CSV and geocode spreadsheet data.. there should be an easier way. Well, there is, and that's what this post will show.

Google Spreadsheets now lets people embed gadgets, just like they can on iGoogle, Orkut (OpenSocial), Google Maps (Mapplets), and other containers. Those gadgets can live within the spreadsheet, access un-published data from a worksheet, and manipulate that data into some useful or visual output. There are gadgets for creating timelines, motion charts, word clouds, and now, courtesy of me, a gadget for geocoding data. This is my secret weapon for geocoding small sets of data, and can hopefully help other people out there.

Here are the steps for geocoding a spreadsheet with the gadget:

  1. Create a new spreadsheet, and put the addresses into rows. If your address is composed of multiple columns, just concatenate those into one single column using the "&" operator. The image below shows a spreadsheet of Australian beaches:

  2. Click on the "Insert" menu and then select "Gadget...":

  3. This presents you with various categories of gadgets to choose from (similar to the iGoogle directory). My gadget isn't yet in the gallery, so you'll need to select "Custom" and then type in the URL to the gadget:

  4. The gadget will appear embedded in the current worksheet, and it will prompt you to select a range of data to send to the gadget. Select all the rows for the column with your address data, and you should see the Range text field update with the range. If it doesn't work, you can always manually type it in. After doing that, click the Apply button:

  5. Since I always hate having charts or gadgets cluttering my sheets (and I like taking advantage of screen real estate), I then usually select the "Move to own sheet" option from the gadget menu. If you don't opt to do that, atleast resize the gadget to give it a decent amount of space in the worksheet. (Image omitted because I accidentally wrote over it :).
  6. When the gadget loads, it will present instructions, a blank map area, and a "Find Addresses" button. Click the "Find Addresses" button. It will start sending geocoding calls using the Maps API, and filling in the sidebar and map with results. It will stop at 99 results, for both technical and legal reasons (just use the gadget multiple times if you have more data than that). Any addresses that failed to geocode will have a red marker and show up in the bottom left of the map. All the markers are draggable, so you can move them around to tweak the location (this was a really important feature to me- geocodes can often use a nudge or two). In this step, you may notice some of the data didn't geocode at all, and perhaps you'll go back to your spreadsheet, edit the address a bit, and do this step again. For example, I had to add "Australia" to some of the beaches in my spreadsheet for disambiguation with other beaches.

  7. When you're satisfied with the marker locations, then select all in the text field below the map, and copy the text. The text contains the latitude/longitudes in row-order for all the markers, and is formatted so that it's easy to paste into spreadsheet columns.

  8. Finally, create two empty columns in your worksheet, and paste that text into the cells. If you put your cursor in the first row/column, then it should paste perfectly and align with the address data.

  9. You're done! Now you can use the latitude/longitude data in your Google Maps API mashup, perhaps by using the Spreadsheets -> Map Wizard or the Spreadsheets Mapper.

For those of you intrigued by Google Spreadsheets Gadgets, you may also want to check out my Spreadsheets -> JSON/SQL/XML converter, Spreadsheets based Flashcards, and State-Based Cluster Map. And the code is open-source, so you might want to modify and improve them, too. :)

7 comments:

Ankit Guglani said...

This is also a neat way to circumvent the 50 importdata functions per sheet limit; if you have been doing it the old-fashioned way.

I will have to take a peek at the source after exams, if I can add info-windows and custom icons, this could possibly be uber-useful.

You could have dynamic (and more than 50 records of) data plotted and geo-coded, if you put it in a table somewhere, use importurl and then use this on the resultant table in the spreadsheet.

Thanks Pamela.

Tony Hirst said...

Hi Pamela,

That was a really neat hack, thanks for sharing it:-)

I was wondering about doing a gadget myself, but I was thinking about building one in the context of the maps environment, rather than the spreadsheet environment... ie I imagined the gadget sitting in Google maps and pulling data from the spreadsheet...)

Another thought I had was just to define a formula that would take a cell that contains eg a place name and then call a geocoding webservice. This then led to the thought - it would be really neat if you could define a cell as a "geotype" that the spreadsheet automatically geocoded and annotated the cell with eg lat/long data in a couple of cells stacked in the third dimension...

Just out of interest, how did you get started with writing the gadget? I'm looking for a dead simple howto that shows just how i can eg get the data from a selection of cells and then display it in a table in the gadget (to provide to myself I can get the data and I can then populate a table with it?)

The use case I have is to repurpose this youtube playlist gadget (http://www.searchfeedr.com/gadgets/youTubePlaylistGadget.xml), and hack it so that instead of pulling the youtube movie URLs in from an RSS feed, it pulls them in from a list of cells...

tony

Pamela Fox said...

Hey Tony-

You could turn this gadget into a mapplet, but I think it'd have to use a published spreadsheet and the API. I don't believe it can grab the authenticated spreadsheets data.

Re the geotype idea - my issue with that is that then there is no tweaking of the geocodes. It's my belief that geocodes are never perfect, so you should do a visual check of them before proceeding. A geotype would certainly make things simpler, but it would be better if there was a way of clicking on the cell and tweaking the result.

Re starting the gadget - I don't recall. Nowadays I just start from one of my previous gadgets, delete stuff, add stuff. It looks like you figured it out, I saw your Youtube post. Nice work! :)

JR said...

Tony,
The "getting started" page on the spreadsheet gadgets api is the best place to start:
http://code.google.com/apis/spreadsheets/spreadsheet_gadgets.html

Pamela - this is such a great gadget... the ability to move pins is simply perfect. Thanks (again) for leading the way.

Hospital Publico said...

Como se hace para poner un google map en una pagina y que las busquedas de los marcadores queden en el mapa y no se borren con el refresh

William B. Shuey said...

Thanks for this helpful gadget. I have to agree with Tony Hurst though, it would be great have a cell in google spreadsheet that could automatically generate the latitude or longitude from the address in another cell.

Anonymous said...

Hello,

Thanks for sharing the link - but unfortunately it seems to be not working? Does anybody here at blog.pamelafox.org have a mirror or another source?


Cheers,
Daniel