Animated scatterplot and word clouds in a grid layout

What am I looking at?

The video shows a scatterplot and four word clouds in a grid layout, visualizing movement in the TNR system, a continuous learning system for inclusive education specialists. The word clouds visualize activity in four different areas of the system: the "pérolas" quiz, the "café com Bel" chat, the "Nossos Casos" case discussion, and everything else (the upper right word cloud, labelled "Tudo menos..."). The scatterplot plots the total number of comments ("commentários", on the x-axis) against the total number of posts ("conteúdos", on the y-axis) for the period from September 2009 to December 2014. The users represented in the scatterplot are color-coded by user profile: red for researchers (pesquisador), green for multipliers (semeador), and blue for others (outro). The circle size corresponds to the number of accesses to the system during the last month.

It is fairly easy to see how different areas of the TNR system show activity during different periods, and that in each area some topics carry more weight than others. The accompanying scatterplot shows that individual users of a certain type seem to participate more in some activities.

How was it made?

The implementation of this example is rather ugly: it has no production quality whatsoever and should be seen as a quick-and-dirty visualization prototype. Still, a materialized prototype usually supports discussion, reflection and analysis far better than an idea that only exists in people's heads.


At the time I did this, I wasn't able to create an animated scatterplot with D3; I have since seen that it is possible. Furthermore, the Gapminder tools had not been open-sourced yet, and the only other tools and frameworks I found (Google Charts and the animation package for R) weren't able to produce smooth animations with this amount of data. Trying to display four animated word clouds in a browser window also quickly showed that I would run into serious performance issues on the "average laptop".


I decided to create a separate graph for each data point, then stitch together an image consisting of one scatterplot and four word clouds for each day in the data, and finally create a video using one image per frame.
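As a rough sketch of that frame-per-day plan (not the original scripts), the snippet below enumerates the days, lists the five source images expected for each day, and builds the stitching and encoding commands as argument lists without running them. The directory layout, file names, panel names, the 3x2 tile layout and the frame rate are all assumptions made for illustration; `montage` (from ImageMagick) and `ffmpeg` are the tools named in the next paragraph.

```python
from datetime import date, timedelta

# Hypothetical panel names: one scatterplot plus four word clouds.
PANELS = ["scatter", "perolas", "cafe", "casos", "rest"]


def days(start, end):
    """Yield every day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def frame_plan(start, end):
    """Map each day to its five panel images and a numbered output frame."""
    plan = []
    for index, day in enumerate(days(start, end)):
        stamp = day.isoformat()
        inputs = [f"{panel}/{stamp}.png" for panel in PANELS]
        # Zero-padded sequence numbers let ffmpeg read the frames in order.
        plan.append((inputs, f"frames/frame_{index:05d}.png"))
    return plan


def montage_command(inputs, output):
    """ImageMagick call that tiles the panels into one frame (assumed 3x2 grid)."""
    return ["montage", *inputs, "-tile", "3x2", "-geometry", "+0+0", output]


def ffmpeg_command(fps, output):
    """ffmpeg call that turns the numbered frames into a video."""
    return ["ffmpeg", "-framerate", str(fps), "-i", "frames/frame_%05d.png",
            "-c:v", "libx264", "-pix_fmt", "yuv420p", output]


# Print the commands for a three-day example range instead of executing them.
plan = frame_plan(date(2009, 9, 1), date(2009, 9, 3))
for inputs, output in plan:
    print(" ".join(montage_command(inputs, output)))
print(" ".join(ffmpeg_command(25, "tnr.mp4")))
```

Each argument list could be handed to `subprocess.run` once the tools are installed. Numbering the frames sequentially, rather than naming them by date, is what allows ffmpeg to pick them up with a single `%05d` pattern.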

The scatterplots were created with R, saving each scatterplot as a PNG. I already described the word cloud basics. To get a word cloud image for each data point, I used PhantomJS to create an initial SVG template with all word positions, and then Node.js with cheerio to generate an SVG with the correct word sizes for each day. The SVGs were rendered to PNGs with the SVG Rasterizer of the Batik project. To stitch the single images (i.e. the frames) together and add legends, I used ImageMagick; the video was then produced with ffmpeg.
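The template-then-resize step for the word clouds can be sketched roughly as follows. The original used Node.js with cheerio; this is a stand-in using Python's standard `xml.etree.ElementTree`, and the template markup, the placeholder words, the `font-size` attribute, the linear scaling and the size range are all assumptions, not the original code.

```python
import copy
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep the output free of ns0: prefixes

# Hypothetical template: word positions are fixed once, sizes are set per day.
TEMPLATE = (
    f'<svg xmlns="{SVG_NS}" width="400" height="300">'
    '<text x="50" y="80">alpha</text>'
    '<text x="200" y="150">beta</text>'
    '</svg>'
)


def font_size(count, max_count, smallest=8, largest=48):
    """Scale a word count linearly into a font size (assumed scaling)."""
    if max_count == 0:
        return smallest
    return smallest + (largest - smallest) * count / max_count


def svg_for_day(template_root, counts):
    """Return a copy of the template with each word sized by that day's count."""
    root = copy.deepcopy(template_root)  # leave the shared template untouched
    max_count = max(counts.values(), default=0)
    for text in root.iter(f"{{{SVG_NS}}}text"):
        # Words without a count for this day fall back to the smallest size.
        size = font_size(counts.get(text.text, 0), max_count)
        text.set("font-size", f"{size:.1f}")
    return ET.tostring(root, encoding="unicode")


template_root = ET.fromstring(TEMPLATE)
print(svg_for_day(template_root, {"alpha": 10, "beta": 5}))
```

Because the positions come from the template and only the sizes change, every word stays in the same place from frame to frame. Each resulting SVG would then be rasterized to a PNG in a separate step.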

But why?

Isolated word clouds or any single, isolated animation might yield limited insights. Despite potential information overload, side-by-side, synchronized visualizations of different aspects of the system seemed an interesting option to explore.

If I were to do it again...

I would try to do the animations using HTML5 Canvas or SVG. Some research and additional interaction mechanisms would also definitely be needed to reduce information overload.