Data Scraping Wikipedia with Google Spreadsheets
Prompted in part by a presentation I have to give tomorrow as an OU eLearning community session (I hope some folks turn up – the 90 minute session on Mashing Up the PLE – RSS edition is the only reason I’m going in…), and in part by Scott Leslie’s compelling programme for a similar duration Mashing Up your own PLE session (scene scetting here: Hunting the Wily “PLE”), I started having a tinker with using Google spreadsheets as for data table screenscraping. So here’s a quick summary of (part of) what I found I could do. The Google spreadsheet function =importHTML(“”,”table”,N) will scrape a table from an HTML web page into a Google spreadsheet. The URL of the target web page, and the target table element both need to be in double quotes. The number N identifies the N’th table in the page (counting starts at 0) as the target table for data scraping. Grab the URL, fire up a new Google spreadsheet, and satrt to enter the formula “=importHTML” into one of the cells: =ImportHtml(“ Why CSV?
Related: Jornalismo de Dados