Animated Word Cloud

What am I looking at?

This is a word cloud animated over time, showing the most frequently mentioned words (minus some stop words) in the TNR system, a continuous learning system for inclusive education specialists. It covers the period from September 2011 to June 2016. In the beginning, there is relatively little activity in the system. Two activity spikes, in mid-2013 and early 2014, correspond to specific activities involving a more concerted discussion of inclusive education issues.

How was it made?

The static word cloud layout is by Jason Davies and depends on Mike Bostock's D3. The words in the cloud are extracted by an R script that connects to TNR's Drupal database and pulls textual data from certain node types and comments. To calculate word weights in R, I used the tm framework for general text mining and SnowballC for word stemming. A word's weight at each date corresponds to its frequency in the 30 days prior to that date. The R script produces a JSON file that serves as input for the visualization. The animation is a simple loop that redraws the canvas with the weights corresponding to each date.
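
To make the 30-day weighting concrete, here is a minimal sketch of the idea. It is written in Python rather than R and is not the original script: the stop-word list, the function names, the sample documents and the shape of the JSON output are all assumptions for illustration. Stemming is left out, and the window is assumed to exclude the target date itself.

```python
import json
import re
from collections import Counter
from datetime import date, timedelta

# Hypothetical stop-word list; the list used in the original is not given.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text and return its words, minus stop words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def rolling_weights(docs, dates, window_days=30, top_n=50):
    """Map each date to the word counts of the window_days before it.

    docs  -- list of (date, text) pairs, e.g. nodes and comments
    dates -- the dates for which one frame of the animation is computed
    """
    frames = {}
    for d in dates:
        start = d - timedelta(days=window_days)
        counts = Counter()
        for doc_date, text in docs:
            # Assumed window: [start, d), i.e. the days before d, excluding d.
            if start <= doc_date < d:
                counts.update(tokenize(text))
        frames[d.isoformat()] = dict(counts.most_common(top_n))
    return frames


# Made-up sample documents, for illustration only.
docs = [
    (date(2013, 5, 20), "Inclusive education and the classroom"),
    (date(2013, 6, 1), "Education for inclusion in the classroom"),
    (date(2013, 8, 15), "Assessment methods"),
]
frames = rolling_weights(docs, [date(2013, 6, 10), date(2013, 9, 1)])

# One JSON object keyed by date; the original file format may differ.
print(json.dumps(frames, indent=2))
```

Each key in the resulting object is one frame of the animation. A drawing loop on the browser side would step through the dates in order and redraw the cloud with that frame's weights.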

But why?

I was simply curious how one might create an animated word cloud and whether it might yield some insight. The most obvious insight is the activity spikes. A more subtle one is which topics gained, regained, or lost popularity, and when. Comparing filtered word clouds side by side makes additional insights possible.

If I were to do it again...

Aside from the obligatory test with users other than me to see whether this visualization makes sense and is useful, controls for the animation (pause, rewind, etc.) and filters (e.g. by content type, time period, or user) might be interesting. And yes, real-time analysis instead of pre-processed data would be nice.