Web Scraping: Scaling up Digital Data Collection

The latest slides from web scraping through R: Web scraping for the humanities and social sciences

Slides from the first session here

Slides from the second session here

Slides from the fourth and final session here

This week we look in greater detail at scaling up digital data collection: coercing scraper output into data frames, downloading files (along with a cursory look at the state of IP law), basic text manipulation in R, and a first look at working with APIs (share counts on Facebook).
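
To give a flavour of the first and third topics, here is a minimal sketch in base R. It is not taken from the slides: the `scraped` list, its field names and the example.com URLs are all made up for illustration, and it assumes a scraper that returns one named list per page.

```r
# Hypothetical scraper output: one named list per page.
# (Field names, values and URLs are invented for this sketch.)
scraped <- list(
  list(title = "  First post ", url = "http://example.com/a", shares = "12"),
  list(title = "Second Post",   url = "http://example.com/b", shares = "7")
)

# Coerce the list of records into a data frame, one row per page
pages <- do.call(
  rbind,
  lapply(scraped, function(x) as.data.frame(x, stringsAsFactors = FALSE))
)

# Basic text manipulation with base R
pages$title  <- tolower(trimws(pages$title))          # strip whitespace, lower-case
pages$shares <- as.numeric(pages$shares)              # text to number
pages$file   <- paste0(basename(pages$url), ".html")  # a local file name per page

print(pages)
```

With real URLs, the `file` column could then be passed to `download.file(url, destfile)` inside a loop to save each page locally.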

Download the .Rpres file to use in RStudio here

A regular R script with code-snippets only can be accessed here

UPDATE March 2015:
New 2015 version of slides here
PDFs of slides available here


  1. This is great! This information becomes obsolete fast so it is quite useful.
    Is there any chance of a downloadable form for your slides? For whatever reason, I don't feel comfortable unless I have PDF files that I can annotate myself!

    1. Glad you find it useful! I've added PDF slides. Fingers crossed they work properly - a bit hard to get the formatting right

  2. This comment has been removed by a blog administrator.

  3. Web scraping, in general, means treating a webpage as a table in a database and a website as the database.

  4. I don't feel comfortable unless I have PDF files that I can annotate myself!

    scrape a website

  5. R is best when it comes to web scraping. I am using R to develop a web scraper as per my clients' requirements. Thanks for sharing this information with us.

  6. Great post! I have read through your tutorials starting from part I, and they are awesome! Thanks for sharing your knowledge!

  7. Web scraping services are a boon for growing a business and taking it to new heights. Website scraping is simply the process of extracting data from websites for your business needs.

  8. Finding the time and effort to create a superb article like this is a great thing. I'll learn many new things right here! Good luck with the next post, buddy.