- Recognize the limitations of network visualization
- Identify the reasons to use visualization tools
- Learn available tools for visualization
For whatever reason, humans like pictures of networks. They are complex, rich, and often beautifully intricate diagrams. However, more often than not, they are about as useful to researchers as trying to count the noodles on a plate of spaghetti by staring at it. For a visualization to be useful, the researcher must already know a great deal about the network she is working with. In the early stages of a project, this is often not the case.
Issues with Visualization
The reason that network visualizations are less useful than, say, scatter, bar, or other well-established plots, boils down to two factors: complexity and distortion.
As alluded to in the introduction, most networks have lots of nodes and lots of edges. The human mind is not well equipped to visually process dense patterns of connectivity, so once a network gets sufficiently large (roughly more than 50 nodes and 100 edges), free-form visual inspection becomes uninformative.
In order to display a network, it must be drawn on the screen (or on paper). Consider the problem of laying out a network on a flat surface so that every edge has the same length. If you try it, even for a simple network, you'll usually find that it's impossible: four mutually connected nodes are already too many, because no four points on a plane are all equally far apart. On top of this, nearly all interesting networks we study are non-planar, meaning that they cannot be drawn on a 2D surface without some edges crossing. What this means is that, no matter how you lay out the network, the distances between nodes will always be distorted. As a result, judging distances between nodes in a visualization is very difficult, because one must account for the distortion introduced by the network drawing program.
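The distortion described above can be measured directly. The following is a minimal sketch, assuming a toy network of four fully connected nodes and a simple circular layout; the function names `circular_layout` and `edge_lengths` are made up for this illustration and do not come from any particular drawing tool. Every edge in this network represents the same relationship, yet the drawn edges come out with different lengths.

```python
import math
from itertools import combinations


def circular_layout(nodes, radius=1.0):
    """Place nodes evenly on a circle (one simple layout choice among many)."""
    n = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }


def edge_lengths(edges, pos):
    """Euclidean length of each edge as it would appear in the drawing."""
    return {(u, v): math.dist(pos[u], pos[v]) for u, v in edges}


# Assumed toy network: four nodes, every pair connected.
nodes = ["a", "b", "c", "d"]
edges = list(combinations(nodes, 2))

pos = circular_layout(nodes)
lengths = edge_lengths(edges, pos)

# Ratio of the longest drawn edge to the shortest one.
# A ratio of 1.0 would mean all edges are drawn with equal length.
distortion = max(lengths.values()) / min(lengths.values())
print(f"longest / shortest drawn edge: {distortion:.3f}")
```

In this layout the four nodes sit on the corners of a square, so the two diagonal edges are drawn longer than the four side edges. Other layouts change which edges are stretched, but none can make all six equal.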
When to Use Visualization
Despite the limitations of current visualization technology, visualization can be a very compelling way of presenting specific features of a network. Certain aspects can be very effectively drawn out including:
- Centrality (often using color gradients)
- Community structure (using different colors for different communities)
- Regions of high clustering (using color gradients)
To bring out these and other features, it is often necessary to adjust both the coloring and the layout of the network. The extra effort can pay off, because the end result helps audiences more readily understand the overall finding being discussed.
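As a rough illustration of the first item in the list, centrality shown with a color gradient, here is a minimal sketch. It assumes degree centrality as the measure, a small made-up edge list, and an arbitrary light-yellow to dark-red gradient; the helper names `degree_centrality` and `gradient_color` are hypothetical and are not part of Cytoscape or Gephi.

```python
def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to.

    Assumes an undirected network with no self-loops.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def gradient_color(value, low=(255, 255, 204), high=(189, 0, 38)):
    """Linearly interpolate between two RGB colors for a value in [0, 1]."""
    value = max(0.0, min(1.0, value))  # clamp out-of-range scores
    rgb = (round(lo + (hi - lo) * value) for lo, hi in zip(low, high))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Assumed toy network: a hub with three neighbors, one of which has a tail.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]

centrality = degree_centrality(edges)
colors = {node: gradient_color(score) for node, score in centrality.items()}

for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 2), colors[node])
```

The resulting node-to-color mapping is the kind of table one would then supply to a drawing tool's node styling. The same pattern applies to other per-node scores, such as a local clustering value.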
There are two established software packages that permit both numerical and visual analysis of networks:
- Cytoscape - this tool is favored by the biological community, though the software is perfectly capable of being used by social scientists.
- Gephi - this tool was released more recently but has more or less the same features as Cytoscape.
Visualization of networks can be fun and, when used appropriately, very effective. However, it is important to be aware of the limitations of network visualization in order to avoid wasting time or, worse, developing a wrong intuition for the network data.