Plot textual differences in Shiny



Word clouds such as those made by Wordle are pretty rubbish, so I thought I'd try to make a better one, one that actually produces (statistically) meaningful results. I was so happy with the outcome that I decided to make it interactive, so go on, have a play!

Compare any two texts (it turns out file uploading in Shiny is pretty experimental/dysfunctional), and graphically map the differences between them. The application will stem the text, remove stop words, and calculate statistical significance, all in a few clicks. Using the controls below you can also change the text size, the plot title, and the positioning of the terms (to avoid overlap), add transparency, and change the number of words plotted.
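
For readers who want a feel for the "statistical significance" step, here is a minimal sketch in Python rather than R. It is not the app's code: it assumes a pooled two-proportion z-test per term, uses a tiny made-up stop list, and skips stemming altogether. The names `STOP_WORDS`, `tokenize` and `z_scores` are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (an assumption, not the list used by the app).
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stop words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def z_scores(text_a, text_b):
    """Return a z-score per term; positive means relatively more frequent in text_a.

    Assumed method: a pooled two-proportion z-test on each term's share of
    all tokens in its text. The app itself may compute this differently.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    if n_a == 0 or n_b == 0:
        raise ValueError("both texts need at least one non-stop word")

    scores = {}
    for term in set(counts_a) | set(counts_b):
        p_a = counts_a[term] / n_a
        p_b = counts_b[term] / n_b
        pooled = (counts_a[term] + counts_b[term]) / (n_a + n_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        # se is zero only when both texts consist of this one term.
        scores[term] = (p_a - p_b) / se if se > 0 else 0.0
    return scores


if __name__ == "__main__":
    scores = z_scores("katyn katyn katyn media", "pipes pipes pipes media")
    for term, z in sorted(scores.items(), key=lambda kv: kv[1]):
        print(f"{term:10s} {z:+.2f}")
```

Under this assumption, terms with a large positive or negative score would be drawn towards one side of the plot, and terms with a score near zero would sit in the middle. A cutoff such as |z| > 1.96 is a common convention, not something taken from the app.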

The sample image included to the left shows differences between my undergraduate thesis about Richard Pipes as a figure of ridicule in Russian media (on the left) and my MPhil thesis about Katyn in Polish and Russian media (on the right). I think the plot makes the differences in emphasis pretty obvious. The words in light blue in the middle are terms that feature strongly in both texts and are not significantly more present in one than the other.

I've presented the code and the logic behind the application elsewhere, so here I include only basic instructions: select two files to compare. Comparisons work best for medium-sized files - too small and there will be no significant differences, too large and processing time will become a bottleneck. If you are trying to do anything big, I strongly recommend running the R script locally.

Any language should work, but you may need to find your own stop list (and stem it!) to get meaningful results. My Russian stop list may be downloaded from here. UPDATE: the Russian stop list has been hardcoded into the app. Native support for English and, I think, German also exists, but for other languages you will need to rebuild the programme with a custom-made stop list.
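
Why stem the stop list? Because the text is stemmed before stop words are removed, an unstemmed stop word may no longer match anything. The Python sketch below illustrates the point with a deliberately crude, hypothetical suffix stripper (`toy_stem`) standing in for a real stemmer; the words and function names are made up for the example and are not taken from the app.

```python
def toy_stem(word):
    """Hypothetical suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def remove_stop_words(tokens, stop_list):
    """Stem every token first, then drop stems found in the stop list."""
    stems = [toy_stem(t) for t in tokens]
    return [s for s in stems if s not in stop_list]


raw_stop_list = {"having", "does"}
stemmed_stop_list = {toy_stem(w) for w in raw_stop_list}

tokens = ["having", "archives", "media"]

# With the raw list, the stem "hav" slips through because only "having" is listed.
kept_with_raw = remove_stop_words(tokens, raw_stop_list)

# With the stemmed list, "hav" is matched and removed.
kept_with_stemmed = remove_stop_words(tokens, stemmed_stop_list)

if __name__ == "__main__":
    print(kept_with_raw)
    print(kept_with_stemmed)
```

The same logic applies whichever stemmer is used: the stop list has to go through the same stemmer as the text.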

I've embedded the app below, but a more user-friendly version can be accessed here.

UPDATE: file upload is not working at the moment, so text needs to be pasted in. This will only work for small to medium-sized documents.

7 comments:

  1. This looks wonderful! I'm interested to see your R script, but it looks like the URL isn't quite right. Can you fix it? thanks.

    1. I don't think I ever uploaded it. Oops. Should be done now. I've identified a small bug in the way the z scores are calculated, so don't treat this as gospel! I'll try to update sometime this week. Best, R
