Wikipedia Data Visualization
A visualization of thousands of Wikipedia edits that were made by a single software bot. Each color corresponds to a different page.
Image: Fernanda B. Viégas, Martin Wattenberg, and Kate Hollenbach Source: www.wired.com
Besides being a pretty [colorful] picture, I’m not sure this visualization is useful at all. I understand the need to visualize terabytes of data, because a human being can only comprehend so much. Can someone please tell me how this visualization is useful in its current state? There is no legend showing which page each color represents.
If you relax your eyes and stare at the center, you will actually see the hidden picture (stereogram).



June 26th, 2008 at 9:31 am
The only use I can think of for this graphic is the shock value of seeing that a bot is apparently making many, many edits to the same page, or actually to the same few pages (the top few rows in particular). Makes me want to know what is going on: two dueling bots editing and re-editing? And what is a bot doing editing Wikipedia this many times in the first place??
June 26th, 2008 at 11:20 am
“Besides being a pretty [colorful] picture, I’m not sure this visualization is useful at all.”
I feel this way about a lot of visualizations. Often they seem more a vehicle for showing creative use of form and color than for showing relationships in data. I always look at the feeds for blogs like http://visualcomplexity.com, http://infosthetics.com, and others, and usually come away thinking “huh?”. http://neoformix.com sometimes strikes me the same way, although they lately have shown some interesting things with streamgraphs.
June 26th, 2008 at 9:05 pm
Chris – I would probably agree that much of the intention is to shock folks who don’t really understand or know about data visualization.
Jon – So I’m not alone. I’m not entirely on board with the streamgraphs yet. They are like Wordle in that I can see the large figures clearly, but the small ones are impossible to view unless it’s an interactive application.
Thanks for the comments!
June 26th, 2008 at 9:14 pm
I’m not sure about wanting to shock people, just dazzle them with colors.
The streamgraphs are okay if you treat them as a mostly qualitative display. There is some thicker = more to the bands, but it’s very hard to quantify, because the bands undulate around a baseline. They’re better than all the funny word constructions, or cloud stuff, or sentence parsing, or all the things that look like a tangled ball of Christmas lights that someone plugged in anyway.
June 26th, 2008 at 9:26 pm
I’m laughing at your comparison of the *clouds to how Christmas lights always end up. I also agree that dazzle would be a better description than shock.
Comparing streamgraphs to word clouds like Wordle would be an interesting exercise. I wonder which would be more effective and/or more popular with the audience.
For those who aren’t familiar, here are two examples:
Example of word cloud using Wordle:
http://supportanalytics.com/blog/wp-content/uploads/2008/06/itunes1.jpg
Example of Streamgraph on Neoformix:
http://www.neoformix.com/2008/TwitterClientUsageStream.html
June 27th, 2008 at 7:48 am
To me, the Wordle thing looks like a blob of words. What’s important, font size? The length of a word also affects the space it fills (e.g., U2 is three times as tall as Dave Matthews Band, but DMB covers more area). The mixture of vertical and horizontal does nothing for me. The Wordle graphic is at best a qualitative display. If you wanted to show the amount of play each band receives, you should use a bar chart or dot plot.
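To make the dot plot suggestion concrete, here is a rough text-only sketch in Python. The play counts are made up for illustration (they are not taken from the Wordle image), and `dot_plot` is just a hypothetical helper name. The point is that every value sits on one common scale starting at zero, so the length of a band's name has no effect on how big it looks.

```python
def dot_plot(counts, width=40):
    """Return one text line per label, with a marker placed on a shared zero-based scale."""
    top = max(counts.values())
    label_w = max(len(name) for name in counts)
    lines = []
    # Largest value first, so rank can be read straight down the page.
    for name, value in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        pos = round(value / top * width)  # marker position is proportional to the value
        lines.append(f"{name:<{label_w}} |{'.' * pos}o {value}")
    return lines

# Hypothetical play counts, for illustration only.
plays = {"U2": 90, "Dave Matthews Band": 30, "Some Other Band": 45}
for line in dot_plot(plays):
    print(line)
```

Each row can be compared against every other row by position alone, which is exactly what the word cloud cannot offer.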
The streamgraph is more interesting in that it actually shows a time series of the data. You cannot show as many series as you can show words in the Wordle thing, and of course the streamgraph suffers from the problems of most stacked charts (particularly the lack of a constant baseline). Some of the labels are just too small to read; I think all labels should be a consistent font size, letting the thickness of the bands indicate the magnitudes.
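The baseline problem can be shown with a few numbers. The sketch below (Python, with made-up values and a hypothetical helper `band_edges`) stacks a perfectly constant series on top of a varying one, first on a zero baseline and then on a stack centered about zero. The centered version is a simplified stand-in for a streamgraph layout; I am assuming it, not claiming it is what Neoformix actually uses.

```python
def band_edges(series, centered=False):
    """Return a (lower, upper) pair of edge lists for each stacked band.

    series: list of equal-length lists of non-negative values.
    centered=False stacks on a zero baseline (ordinary stacked chart).
    centered=True shifts the whole stack to be symmetric about zero,
    a simplified streamgraph-style baseline (an assumption for illustration).
    """
    n = len(series[0])
    totals = [sum(s[t] for s in series) for t in range(n)]
    lower = [-tot / 2 if centered else 0.0 for tot in totals]
    bands = []
    for s in series:
        upper = [lo + v for lo, v in zip(lower, s)]
        bands.append((lower, upper))
        lower = upper  # the next band sits on top of this one
    return bands

varying = [1, 3, 1]   # made-up values
constant = [2, 2, 2]  # a series that never changes

for centered in (False, True):
    lower, upper = band_edges([varying, constant], centered)[1]
    label = "centered" if centered else "zero baseline"
    print(label, "constant band runs from", lower, "to", upper)
```

In both layouts the constant band keeps the same thickness, but its edges move up and down with the series beneath it, so the eye has to judge thickness between two wavy lines instead of height above a fixed axis.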