Web Scraping: Scaling up Digital Data Collection

The latest slides from web scraping through R: Web scraping for the humanities and social sciences

Slides from the first session here

Slides from the second session here


Slides from the fourth and final session here


This week we look in greater detail at scaling up digital data collection: coercing scraper output into data frames, downloading files (along with a cursory look at the state of IP law), basic text manipulation in R, and a first look at working with APIs (share counts on Facebook).
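
As a rough illustration of the first three topics, here is a minimal sketch in base R. The records, URLs, and column names are made up for illustration and are not taken from the slides; the download step is commented out because the URLs are placeholders.

```r
# Hypothetical scraper output: one list per page (made-up example data).
scraped <- list(
  list(url = "http://example.com/a", title = "  First Post ", shares = "12"),
  list(url = "http://example.com/b", title = "Second post",   shares = "7")
)

# Coerce the list of records into a data frame, one row per page.
pages <- do.call(rbind, lapply(scraped, function(x) {
  data.frame(x, stringsAsFactors = FALSE)
}))

# Basic text manipulation: trim whitespace, lower-case the titles,
# and convert the share counts from text to numbers.
pages$title  <- tolower(trimws(pages$title))
pages$shares <- as.numeric(pages$shares)

# Downloading files: derive a local file name from each URL.
pages$destfile <- paste0(basename(pages$url), ".html")
# With real URLs, the files could then be fetched one by one:
# for (i in seq_len(nrow(pages))) download.file(pages$url[i], pages$destfile[i])

print(pages)
```

Binding one small data frame per record is simple to read but slow for thousands of pages; for larger scrapes it is usually better to collect vectors first and build the data frame once.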

Download the .Rpres file to use in RStudio here

A regular R script with code-snippets only can be accessed here

UPDATE March 2015:
New 2015 version of slides here
PDFs of slides available here

15 comments:

  1. This is great! This information becomes obsolete fast so it is quite useful.
    Is there any chance of a downloadable form for your slides? For whatever reason, I don't feel comfortable unless I have PDF files that I can annotate myself!
    Thanks

    Replies
    1. Glad you find it useful! I've added PDF slides. Fingers crossed they work properly - a bit hard to get the formatting right

  3. Web scraping, in general, means looking at a webpage as a table in a database and at a website as a database.

  5. R is best when it comes to web scraping. I am using R to develop a web scraper as per my clients' requirements. Thanks for sharing this information with us.

  6. Great post! I have read through your tutorials from part I and they are awesome! Thanks for sharing your knowledge!
