My previous software visualization experiment, code_swarm, turned out pretty well. But some people wanted a more analytic view of the data, one that was more persistent. I wondered what this could look like, and came across this XKCD comic. It represents characters as lines that converge in time as they share scenes. Could this technique be adapted for software developers who work on the same code?


Data comes from the project repository logs. Time flows horizontally from left to right. At each timestep (usually a month), developers are clustered by the files they modify. A histogram at the bottom shows the volume and types of files committed. You can mouse over individual lines to see them better.
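To make the clustering step concrete, here is a rough sketch in Python. It is not the code behind these visualizations: the commit records are made up, the function names are hypothetical, and the grouping rule (developers who touch a common file in the same month land in the same cluster, directly or through a chain of shared files) is a simple stand-in I am assuming for illustration. The last function tallies file extensions per month, which is roughly the data behind the histogram.

```python
from collections import Counter, defaultdict
from datetime import date
import os

# Hypothetical commit records: (developer, commit date, file path).
commits = [
    ("alice", date(2005, 1, 3), "src/core/Main.java"),
    ("bob", date(2005, 1, 9), "src/core/Main.java"),
    ("carol", date(2005, 1, 20), "docs/manual.html"),
    ("alice", date(2005, 2, 2), "src/core/Util.java"),
    ("carol", date(2005, 2, 11), "src/core/Util.java"),
    ("bob", date(2005, 2, 15), "build.xml"),
]

def files_by_developer_per_month(commits):
    """Map (year, month) -> developer -> set of files touched."""
    months = defaultdict(lambda: defaultdict(set))
    for dev, day, path in commits:
        months[(day.year, day.month)][dev].add(path)
    return months

def cluster_developers(dev_files):
    """Group developers linked by shared files (connected components).

    Assumed rule for this sketch only: two developers belong together
    if they modified the same file, directly or transitively.
    """
    parent = {dev: dev for dev in dev_files}

    def find(dev):
        # Follow parent links to the cluster root, compressing the path.
        while parent[dev] != dev:
            parent[dev] = parent[parent[dev]]
            dev = parent[dev]
        return dev

    first_owner = {}  # file path -> first developer seen touching it
    for dev, files in dev_files.items():
        for path in files:
            if path in first_owner:
                parent[find(dev)] = find(first_owner[path])
            else:
                first_owner[path] = dev

    groups = defaultdict(set)
    for dev in dev_files:
        groups[find(dev)].add(dev)
    return sorted(sorted(group) for group in groups.values())

def file_type_histogram(commits):
    """Map (year, month) -> Counter of file extensions committed."""
    hist = defaultdict(Counter)
    for _, day, path in commits:
        ext = os.path.splitext(path)[1] or "(none)"
        hist[(day.year, day.month)][ext] += 1
    return hist

months = files_by_developer_per_month(commits)
clusters = {month: cluster_developers(devs) for month, devs in months.items()}
histogram = file_type_histogram(commits)
```

With this sample data, alice and bob share a cluster in January 2005 because both touched `Main.java`, while carol sits alone. In a storyline drawing, developers in the same cluster would have their lines pulled together at that timestep.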


Each project visualization below links to an interactive SVG representation. Most modern browsers support SVG rendering, but IE currently does not. Unfortunately, SVG does not cope well with large numbers of elements, so the interactive histogram from the Processing version, where selecting a developer shows their commit volume, is left out.

A Java automated build tool.
This is a smaller project that shows relatively stable development until 2007 when activity drops off.

An open-source database system.
This project also has a consistent core of developers and more stable development activity.

A popular scripting language.
This visualization once again shows Guido van Rossum’s initiative and dedication to the project at the beginning, and the quickening pace of development in 2000.

A popular webserver. (2.0 branch shown)
The first two years go into planning and documentation.

The prototype code is now open source. Updates are still forthcoming. The research paper on this technique was presented at SoftVis 2010.

Thanks to everyone who completed the survey.

Created by Michael Ogawa with Processing.