Web Scraping: working with APIs

APIs present researchers with a diverse set of data sources through a standardised access mechanism: send an HTTP request pasted together from a base URL and a few query parameters, and receive JSON or XML in return. Today we tap into a range of APIs to get comfortable sending queries and processing responses.
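As a rough illustration of what "pasting together" a request means, the sketch below builds a query URL in base R. The endpoint `https://api.example.com/search` and the parameter names `q` and `format` are hypothetical placeholders, not a real service, and `build_query_url` is a helper name made up for this example.

```r
# Build a query URL from a base endpoint and a named list of parameters.
# The endpoint and parameter names used below are hypothetical placeholders.
build_query_url <- function(base_url, params) {
  # Percent-encode each value so spaces and reserved characters are safe in a URL
  encoded <- vapply(
    params,
    function(value) utils::URLencode(as.character(value), reserved = TRUE),
    character(1)
  )
  # Join "name=value" pairs with "&" and attach them after "?"
  query <- paste(names(params), encoded, sep = "=", collapse = "&")
  paste0(base_url, "?", query)
}

url <- build_query_url(
  "https://api.example.com/search",
  list(q = "web scraping", format = "json")
)
print(url)
```

A URL built this way can then be passed to whatever function fetches the response, for instance `readLines(url)` for a quick look at the raw text.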

These are the slides from the final class in Web Scraping through R: Web scraping for the humanities and social sciences.


This week we explore how to use APIs in R, focusing on the Google Maps API. We then attempt to transfer this approach to the Yandex Maps API. Finally, the practice section includes examples of working with the YouTube V2 API, a few 'social' APIs such as LinkedIn and Twitter, as well as APIs further off the beaten track (cricket scores, anyone?).
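A geocoding query against Google Maps could be sketched roughly as follows. The endpoint and the `results` / `geometry` / `location` layout are taken from the public Geocoding API's JSON format; the sample response is hand-written for illustration, the helper names are made up, and the use of the `jsonlite` package is an assumption (the slides may rely on a different parser). A real request would also need an API key appended as the `key` parameter.

```r
# Sketch: build a geocoding request and pull coordinates out of the response.
geocode_url <- function(address) {
  paste0(
    "https://maps.googleapis.com/maps/api/geocode/json?address=",
    utils::URLencode(address, reserved = TRUE)
  )
}

# Extract latitude and longitude of the first match from a parsed response.
# Returns NAs if the status is not "OK" or there are no results.
extract_lat_lng <- function(parsed) {
  if (!identical(parsed$status, "OK") || length(parsed$results) == 0) {
    return(c(lat = NA_real_, lng = NA_real_))
  }
  location <- parsed$results[[1]]$geometry$location
  c(lat = location$lat, lng = location$lng)
}

# Hand-written sample response, not real API output
sample_json <- '{
  "results": [
    {"formatted_address": "Example Street 1, Example City",
     "geometry": {"location": {"lat": 51.5, "lng": -0.12}}}
  ],
  "status": "OK"
}'

print(geocode_url("Example Street 1, Example City"))

# Parse only if jsonlite is installed (assumed here, not required by base R)
if (requireNamespace("jsonlite", quietly = TRUE)) {
  parsed <- jsonlite::fromJSON(sample_json, simplifyVector = FALSE)
  print(extract_lat_lng(parsed))
}
```

Checking the `status` field before indexing into `results` avoids errors when an address cannot be matched.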

I enjoyed teaching this course and hope to repeat and improve on it next year. When designing the course I tried to cram in everything I wish I had been taught early on in my PhD (resulting in information overload, I fear). Still, I hope it has been useful to students getting started with digital data collection, on the one hand showing what is possible, and on the other giving some idea of the key steps in achieving research objectives.


Download the .Rpres file to use in RStudio here

A regular R script with code-snippets only can be accessed here


Slides from the first session here

Slides from the second session here

Slides from the third session here

UPDATE March 2015:
New 2015 version of slides here
PDFs of slides available here


5 comments:

  1. Nice blog on web scraping. All the posts on web scraping are very interesting and useful. Thanks for sharing such useful information.
  2. This comment has been removed by a blog administrator.
  3. Wow, that's great. I had been searching for material on web scraping with APIs for the last few days and finally found your blog. It's really great, thanks for sharing helpful information.
  4. Thank you very much for sharing this. Your slides really helped me learn the basics of digital data collection, which is a crucial part of my research, which applies text mining.
  5. For straightforward data extraction, json-csv.com could save someone a bit of time. You just need to paste in the JSON API URL and it will produce a neatly formatted CSV file which you can work with in Excel.