Data Visualization

Tips, tricks and tools for visualizing data

Data Types

Developing a visualization depends on the kind of data you are working with and on what you want the visualization to accomplish. This approach, known as the “task by data type taxonomy” (TTT), was introduced by Ben Shneiderman in 1996. The tasks and data types listed below are neither exhaustive nor mutually exclusive, but they do provide a nice starting point!

Shneiderman (1996) describes seven data types:

1-dimensional

Text data, as appears in program source code, literature or written correspondence.

Word clouds use 1-dimensional data to show how frequently words appear in a collection. The word cloud featured here examines the "Advice from a Caterpillar" chapter of Alice in Wonderland. Larger words, like "Alice" and "Caterpillar," appear more frequently, while smaller words, like "chin" and "foot," appear less frequently.

Alice in Wonderland word cloud of the caterpillar's advice

Image credit: Leticia Zelaya
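
As a rough illustration of the idea behind a word cloud, the sketch below counts word frequencies and scales each count to a font size. The sample sentence is made up for the example (it is not quoted from the book), and the function names, stopword list and size range are assumptions, not part of any particular word cloud tool.

```python
import re
from collections import Counter

def word_frequencies(text, stopwords=frozenset()):
    """Count how often each word appears, ignoring case and stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

def font_sizes(counts, min_size=10, max_size=48):
    """Scale each word's count linearly to a font size (hypothetical range)."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts match
    return {w: min_size + (c - lo) * (max_size - min_size) / span
            for w, c in counts.items()}

# Invented sample text, used only to exercise the functions.
sample = ("Alice looked at the Caterpillar. The Caterpillar looked at Alice. "
          "Alice sighed.")
counts = word_frequencies(sample, stopwords={"the", "at"})
sizes = font_sizes(counts)
```

The most frequent word receives the largest size and the least frequent the smallest; real tools typically add layout, rotation and color on top of this mapping.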

2-dimensional

Map data, such as geographic maps or floor plans.

Choropleth maps use degrees of shading or patterns to represent some variable across space. The choropleth map below shows the distribution of Republican and Democratic voters in the 2012 presidential election.

2012 US Elections choropleth map

Image credit: Alicia Parlapiano, New York Times
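
One common way to turn values into "degrees of shading" is to sort them into a few classes and give each class a color. The sketch below uses equal-interval classes, which is only one of several classification schemes. The region names, percentages and hex colors are hypothetical and are not taken from the map above.

```python
def equal_interval_classes(values, n_classes):
    """Assign each region a class index from 0 (lowest) to n_classes - 1."""
    lo, hi = min(values.values()), max(values.values())
    width = ((hi - lo) / n_classes) or 1  # guard against identical values
    return {region: min(int((v - lo) / width), n_classes - 1)
            for region, v in values.items()}

# Hypothetical vote share (percent) per region.
share = {"A": 35, "B": 48, "C": 52, "D": 65}
palette = ["#deebf7", "#9ecae1", "#3182bd"]  # light to dark, assumed colors

classes = equal_interval_classes(share, len(palette))
shades = {region: palette[k] for region, k in classes.items()}
```

Regions B and C fall into the same middle class here, which shows how the choice of class boundaries shapes what a reader sees.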

A cartogram uses area or distance to represent some variable across space. Cartograms often distort familiar geographic spaces. The mosaic cartogram below shows the distribution of the world population using 15,266 squares, each of which represents half a million people.

Mosaic cartogram of the world population

Image credit: Max Roser, https://ourworldindata.org/world-population-cartogram
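
The arithmetic behind a mosaic cartogram like this can be sketched in a few lines: each country receives roughly one square per half a million people. The country names and populations below are placeholders, and rounding to the nearest square is an assumption about how partial squares might be handled.

```python
PEOPLE_PER_SQUARE = 500_000
TOTAL_SQUARES = 15_266

# World population implied by the figures quoted in the text.
implied_total = TOTAL_SQUARES * PEOPLE_PER_SQUARE

def squares_for(population):
    """Number of squares for a country, rounded to the nearest square."""
    return round(population / PEOPLE_PER_SQUARE)

# Placeholder populations, for illustration only.
populations = {"Country A": 1_300_000, "Country B": 66_400_000}
squares = {name: squares_for(p) for name, p in populations.items()}
```

Multiplying the two figures from the text gives an implied world population of about 7.6 billion.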

3-dimensional

Data representing real-world objects, such as molecules or buildings. Scientific visualization often uses 3-dimensional data. While many data visualizations are digital in nature, Watson and Crick's 1953 model of a DNA molecule also constitutes a visualization.

Watson and Crick's 1953 model of a DNA molecule

Image credit: The Science Museum/SSPL

Temporal

Data that reflect change through time.

Timelines and Gantt charts are commonly used to visualize temporal data, but there are many unique approaches.

Sankey diagrams use line width to show shifting quantities. The chart below uses color, shading and line thickness to illustrate party representation in Congress since 1776.

Sankey chart showing party representation in Congress through time

Image credit: https://xkcd.com/1127/large/

Streamgraphs use area to illustrate changing variables over time. You can visit the interactive version of The Ebb and Flow of Movies on the New York Times website.

The Ebb and Flow of Movies: Box Office Receipts 1986 — 2008

Image credit: Matthew Bloch, Lee Byron, Shan Carter and Amanda Cox, http://archive.nytimes.com/www.nytimes.com/interactive/2008/02/23/movies/20080223_REVENUE_GRAPHIC.html
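
To make the "area over time" idea concrete, the sketch below stacks several series and centers the stack around zero at each time step. This symmetric baseline is one simple choice and is not necessarily the layout used in the New York Times graphic. The function name and the sample values are hypothetical.

```python
def stream_layers(series):
    """Return, for each layer, a list of (lower, upper) bounds per time step.

    series: list of layers, each a list of non-negative values, all the
    same length. The stack is centered on zero at every time step.
    """
    n_steps = len(series[0])
    layers = [[] for _ in series]
    for t in range(n_steps):
        total = sum(layer[t] for layer in series)
        lower = -total / 2  # symmetric baseline
        for i, layer in enumerate(series):
            upper = lower + layer[t]
            layers[i].append((lower, upper))
            lower = upper
    return layers

# Three made-up series over two time steps.
bounds = stream_layers([[2, 4], [2, 0], [4, 4]])
```

Each layer's thickness at a time step equals its value, so a series that drops to zero pinches out of the stream.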

Multi-dimensional

Data with many attributes.

Small multiples allow one to compare multiple charts. "Drought's Footprint" uses small multiples to map drought across time and space.

Drought's footprint, small multiples show drought through time and space in the USA

Image credit: Haeyoun Park and Kevin Quealy, https://archive.nytimes.com/www.nytimes.com/interactive/2012/07/20/us/drought-footprint.html?_r=0

Interactivity increases the power of small multiples. The interactive GitHut visualization shows frequency of repositories by programming language.

Parallel coordinates use axes to represent unique variables and their values. Each data point is drawn as a single line through the appropriate points on each axis.

iris data presented through parallel coordinates

Image credit: https://en.wikipedia.org/wiki/Parallel_coordinates
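
The description of parallel coordinates above can be sketched as follows: each variable gets its own axis, values are rescaled to a common 0 to 1 range, and each record becomes one polyline. The measurements below are invented for the example (they are not the iris data), and the function name is an assumption.

```python
def to_polylines(records, axes):
    """Turn each record into a list of (axis_index, scaled_value) points."""
    ranges = {a: (min(r[a] for r in records), max(r[a] for r in records))
              for a in axes}
    lines = []
    for r in records:
        line = []
        for x, a in enumerate(axes):
            lo, hi = ranges[a]
            # Rescale to [0, 1]; put constant axes at the midpoint.
            y = (r[a] - lo) / (hi - lo) if hi > lo else 0.5
            line.append((x, y))
        lines.append(line)
    return lines

# Invented flower measurements, for illustration only.
flowers = [
    {"sepal_length": 5.0, "petal_length": 1.0},
    {"sepal_length": 7.0, "petal_length": 5.0},
    {"sepal_length": 6.0, "petal_length": 4.0},
]
lines = to_polylines(flowers, ["sepal_length", "petal_length"])
```

Rescaling each axis separately lets variables with different units share one chart, at the cost of hiding their absolute magnitudes.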

Again, interactivity can help make sense of parallel coordinates, as in this visualization of car features from the 1970s and 1980s.

Tree

Hierarchical data, wherein each item has a link to a parent item.
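
A minimal sketch of such data, assuming each item stores only the name of its parent: the mapping below is a hypothetical hierarchy, and the helper walks the parent links to find how deep each item sits in the tree.

```python
def depths(parent_of):
    """Map each item to its depth; the root (parent None) has depth 0."""
    def depth(item):
        d = 0
        while parent_of[item] is not None:
            item = parent_of[item]  # follow the link to the parent
            d += 1
        return d
    return {item: depth(item) for item in parent_of}

# Hypothetical hierarchy: item -> parent.
parent_of = {
    "root": None,
    "branch_a": "root",
    "branch_b": "root",
    "leaf_a1": "branch_a",
}
levels = depths(parent_of)
```

Tree visualizations generally need this kind of derived structure (depth, children, subtree size) before anything can be drawn.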

Dendrograms are commonly used in biology.

dendrogram of iris data

Image credit: https://en.wikipedia.org/wiki/Dendrogram

Sunbursts use layers of arcs to show hierarchy, with the root value as the innermost arc. The sunburst below shows the variety of paths users take when navigating a certain website. You can also visit an interactive version of this visualization.

sunburst showing web navigation

Image credit: Kerry Rodden, https://bl.ocks.org/kerryrodden/7090426
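
The layered arcs of a sunburst can be computed by giving every node a slice of its parent's angle in proportion to its size. The sketch below is not the code behind the visualization above; the page names and visit counts are hypothetical, and it assumes every leaf has a positive value.

```python
def total(node):
    """Size of a node: its own value for a leaf, else the sum of its children."""
    children = node.get("children")
    return sum(total(c) for c in children) if children else node["value"]

def sunburst_arcs(node, start=0.0, end=360.0, depth=0):
    """Return (name, depth, start_angle, end_angle) for every node."""
    arcs = [(node["name"], depth, start, end)]
    size = total(node)
    angle = start
    for child in node.get("children", []):
        span = (end - start) * total(child) / size
        arcs.extend(sunburst_arcs(child, angle, angle + span, depth + 1))
        angle += span
    return arcs

# Hypothetical navigation paths with visit counts.
site = {"name": "root", "children": [
    {"name": "home", "children": [
        {"name": "search", "value": 30},
        {"name": "product", "value": 30},
    ]},
    {"name": "account", "value": 60},
]}
arcs = sunburst_arcs(site)
```

The depth gives the ring an arc sits in, and the angles give its position within that ring.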

Network

Network data also describe items with relationships, but those relationships are not hierarchical. The node-link diagram below (although unlabeled) shows the co-appearance of characters in Les Misérables.

node link diagram of les miserables characters (and when they appear together)

Image credit: Mike Bostock
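
A co-appearance network can be derived from a list of scenes by linking every pair of characters who share a scene. The sketch below uses placeholder character names and invented scenes rather than the actual Les Misérables dataset, and the function name is an assumption.

```python
from collections import Counter
from itertools import combinations

def co_appearance_edges(scenes):
    """Count, for each pair of characters, the scenes they share."""
    edges = Counter()
    for cast in scenes:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(cast)), 2):
            edges[(a, b)] += 1
    return edges

# Invented scenes with placeholder characters.
scenes = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
edges = co_appearance_edges(scenes)
nodes = sorted({name for pair in edges for name in pair})
```

In a node-link diagram, each name becomes a node and each pair becomes a link, often with the count controlling the link's thickness.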

Tasks

In addition to data type, your visualization will also be informed by the tasks it aims to achieve. Shneiderman (1996, p. 337) defines the following interactive tasks:

  • Overview: Gain an overview of the entire collection.
  • Zoom: Zoom in on items of interest.
  • Filter: Filter out uninteresting items.
  • Details-on-demand: Select an item or group and get details when needed.
  • Relate: View relationships among items.
  • History: Keep a history of actions to support undo, replay, and progressive refinement.
  • Extract: Allow extraction of sub-collections and of the query parameters.
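
Several of these tasks can be illustrated on a plain collection of records. The sketch below is an interpretation for illustration only: the class, its method names and the sample items are hypothetical and do not come from Shneiderman's paper. It covers overview, filter, details-on-demand, history and extract; zoom and relate are left out for brevity.

```python
class Explorer:
    """Toy model of an interactive view over a collection of items."""

    def __init__(self, items):
        self.view = list(items)  # items currently visible
        self.history = []        # earlier views, kept to support undo

    def overview(self):
        """Overview: summarize the visible collection (here, its size)."""
        return len(self.view)

    def filter(self, keep):
        """Filter: hide items for which keep(item) is false."""
        self.history.append(self.view)
        self.view = [item for item in self.view if keep(item)]

    def details(self, index):
        """Details-on-demand: return one visible item in full."""
        return self.view[index]

    def undo(self):
        """History: restore the view from before the last filter."""
        if self.history:
            self.view = self.history.pop()

    def extract(self):
        """Extract: return a copy of the visible sub-collection."""
        return list(self.view)

# Made-up items for demonstration.
explorer = Explorer([{"name": "a", "size": 5},
                     {"name": "b", "size": 12},
                     {"name": "c", "size": 20}])
explorer.filter(lambda item: item["size"] > 10)
subset = explorer.extract()
explorer.undo()
```

Keeping earlier views on a stack is a simple way to support undo; replay and progressive refinement would need a record of the actions themselves.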

References

Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for information visualizations. Proceedings of the 1996 IEEE Symposium on Visual Languages, 336-343.