[deleted post on] d3 visualisations of the GDELT data

I accidentally deleted my post on visualising the GDELT data with d3. Because it was really fiddly to get Blogger to display it properly in the first place, I won't re-upload it. Instead, here is the intro along with the Rmd file (compile it in R using knitr). With any luck, running the code will generate a nice d3 visualisation where you can click shiny buttons to make data appear or disappear at will. I have not maintained this script since the early days of GDELT, so it will quite likely need some tinkering to work with the latest data.

[start of deleted post]
Below I use the GDELT data to demonstrate how Python can very quickly slice data into manageable chunks, which can then be formatted in R and visualised with rCharts to create interactive d3 visualisations. I don't think I present anything new here; this is rather a quick demo of how easy it can be to peek into relatively big data. For GDELT specifically, I've included a quick fix for importing the details of the event codes into the data, which helps clarify some specifics. Some users of GDELT have apparently found it a bit impenetrable because of the sheer quantity of data. With any luck this post will lower that barrier to getting started.

More generally, I use the rCharts package, which draws on d3 to make interactive charts. This makes getting an overview much easier, as irrelevant event categories can easily be hidden. The workflow below is much more efficient than it would be if the whole process were done in just one language. Timed on my old laptop, the 3.3 million events were filtered and made interactive in about 30 seconds.
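
To give a flavour of the Python slicing step, here is a minimal sketch. It is not the original script from the Rmd file, just an illustration under assumptions: the data is treated as tab-separated text, the column positions are passed in as parameters rather than taken from the real GDELT layout, and the function names, sample rows and code labels are all made up for the example.

```python
import csv
import io


def filter_rows(lines, column, wanted):
    """Yield tab-separated rows whose field at index `column` is in `wanted`.

    Rows are streamed one at a time, so the whole file never has to fit
    in memory. `column` is a hypothetical position, not a real GDELT index.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) > column and row[column] in wanted:
            yield row


def add_labels(rows, code_column, labels):
    """Append a readable label for the event code found at `code_column`."""
    for row in rows:
        yield row + [labels.get(row[code_column], "unknown")]


# Toy stand-in for a data file: date, country code, event code.
sample = io.StringIO(
    "20130101\tUSA\t010\n"
    "20130101\tGBR\t020\n"
    "20130102\tUSA\t999\n"
)

# Toy lookup table; real descriptions would come from the event-code list.
labels = {"010": "toy label A", "020": "toy label B"}

result = list(add_labels(filter_rows(sample, 1, {"USA"}), 2, labels))
for row in result:
    print("\t".join(row))
```

With a real file you would pass an open file handle instead of the `io.StringIO` object and write the filtered rows out for R to pick up.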
