11  Foundations of interactive, data-driven visual design

Everything we’ve discussed before this section is a prerequisite to interactivity — whatever we mean by that — because all those concepts about communication and graphics best practices still apply. Stated differently, if you don’t apply the principles of static graphics when making an interactive graphic, you’ll just end up with an interactive version of an ineffective graphic.

11.1 Common interactions

We interact with computers, iPhones, and tablets by performing actions through interfaces: commonly a keyboard, a mouse, a trackpad, or the screen itself. Typical actions include pointing, hovering, touching, scrolling, clicking, pressing, dragging, pinching, swiping, and other gestures.

It is worth noting that hovering a pointer requires only moving a finger on a trackpad (or moving the mouse itself), but when engaging directly with a screen, we have to touch it. That can make an equivalent direct-touch action for hovering more challenging.

Through these actions, we interact with various graphical elements.

But what are we talking about when we talk about interactive graphics? Remember the first data graphic in this text, Figure 1.1, showing the baseball fields and fences, which we used to explain the importance of context? What you can’t really tell by just inspecting the graphic is that it consists of two separate graphics, one above the other. The top graphic shows the boundaries of the 30 major league baseball fields, and the bottom graphic shows the relative heights of the outfield fences for each field. Before we interact with it, all 30 lines in each graphic are overlaid on top of one another. We can’t really distinguish which field each line refers to, but we can get a sense of the variation and scale of both the field dimensions and the fence heights, would you agree?

But the graphic also allows us to interact with it, and it responds in four ways. First, it bolds or emphasizes the field line my mouse pointer touches. Second, it lightens or de-emphasizes all the other field boundaries. Third, it shows a label or tooltip identifying the field — I just coded an index of 1 to 30 here, but I could have gathered the field names and shown those instead. As we’ll discuss soon, you can think of this as showing “details on demand.” And fourth, hovering over a particular line also triggers the same emphasis and de-emphasis in the lower graphic, showing which fence height relates to the field boundary I selected or hovered over.
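To make this kind of linked hovering concrete, here is a minimal sketch using the ggiraph and patchwork packages in R — our choice for illustration, not necessarily how Figure 1.1 was built. The data is simulated, and `field_id` stands in for the 30 ballparks.

```r
library(ggplot2)
library(ggiraph)
library(patchwork)

# Simulated stand-in for the 30 field boundaries and fence heights
fields <- data.frame(
  field_id = factor(rep(1:30, each = 50)),
  x        = rep(seq(0, 1, length.out = 50), times = 30),
  boundary = runif(1500),
  fence    = runif(1500)
)

top <- ggplot(fields, aes(x, boundary, group = field_id)) +
  geom_line_interactive(aes(data_id = field_id, tooltip = field_id))

bottom <- ggplot(fields, aes(x, fence, group = field_id)) +
  geom_line_interactive(aes(data_id = field_id, tooltip = field_id))

# A shared data_id links the panels: the hovered line is emphasized in
# both graphics, all other lines are de-emphasized, and a tooltip appears.
girafe(
  ggobj = top / bottom,
  options = list(
    opts_hover(css = "stroke-width: 2px; stroke: black;"),
    opts_hover_inv(css = "opacity: 0.2;")
  )
)
```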

Interactive graphics may bring to mind graphics like our first example, Figure 1.1, or publications like Roston and Migliozzi (2015), Aisch and Cox (2015), and Bostock, Carter, and Tse (2014), all exemplary for their audience and purpose. Notice the range of interactivity in each. We’ve already discussed the interactivity of Figure 1.1: simply hovering. In Roston and Migliozzi (2015), the main form of interaction is simply scrolling; the entire graphic reacts in multiple ways to the reader’s flick of the finger. We might view Aisch and Cox (2015) as requiring a bit more from its audience: clicking. And yet more is required from audiences who want to explore Bostock, Carter, and Tse (2014): along with clicking, readers drag sliders and enter text into text boxes, among other things. Beyond these examples, Hohman et al. (2020) provides many exemplary interactive graphics, organized by category of interactivity.

This text, Data in Wonderland, was created in RStudio, and we can make interactive graphics using RStudio. Is RStudio an interactive graphic?

RStudio is a web browser application with a graphical user interface. We can “interact” with it by entering text (code), clicking buttons (consider the “back”, “next”, and “zoom” buttons for a series of graphics you’ve made), etc. In response, RStudio changes the view of the data graphic we see. Is this an interactive graphic? We can make interactive graphics in Lyra2, but is Lyra2 an interactive graphic?

Lyra2 is a web browser application with a graphical user interface. We can “interact” with it by dragging and dropping and by clicking buttons — it is similar to Tableau, but free, open source, and built on the powerful Vega and D3 JavaScript libraries. In response, Lyra2 changes the view of the data graphic we see. Is this an interactive graphic?

11.2 Human in the loop

Each of these examples fits how Tominski and Schumann (2020) describe interactivity — as a “human in the loop” of an iterative process of executing something in a system pursuant to some goal and evaluating what results.

The authors list, as forms of interaction with data graphics, the following possibilities:

  • Source code editing.
  • Scripting commands.
  • Graphical interfaces.
  • Direct manipulation.
  • Direct touch.

Let’s consider each of these, using the authors’ running example of zooming in on nodes of interest in a node-link diagram.

Source code editing. An example here might be C++ and the graphics library graphics.h. At the implementation level, a visual representation is defined by source code. Only a few lines of code need to be edited to set the visualization view to where the nodes of interest are located. The altered code is compiled and run to see if the visual result is as expected. If not, the procedure is repeated until the user is satisfied. Changing lines of code, re-compiling, and test-running the visualization is the least direct form of interaction, as it exhibits large conceptual, spatial, and temporal separation.

Scripting commands. R and the R package ggplot2 would be one example. Alternatively, the visualization may offer a scripting interface allowing the user to enter commands to zoom the view. Once issued, the commands take effect immediately while the visualization is running. In this scenario, no separate compilation is necessary, which reduces the temporal separation. But still the interaction is rather indirect, and several commands may be necessary before the view fits as desired.
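To make this concrete, here is a minimal sketch in R with ggplot2; the dataset and limits are ours, purely for illustration. Each re-issued command takes effect immediately — no compiling — but several attempts may be needed before the view fits as desired.

```r
library(ggplot2)

# Full view of the data
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
p

# Re-issue the command with new limits to "zoom" the view
p + coord_cartesian(xlim = c(2, 4), ylim = c(15, 25))

# Still not right? Issue another command, and another, until it fits
p + coord_cartesian(xlim = c(2.5, 3.5), ylim = c(15, 22))
```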

Graphical interface. This brings to mind buttons, sliders, and text boxes. The field of view is displayed in a graphical interface alongside the visualization. Standard controls such as buttons and sliders allow the user to easily shift the view and control its zoom factor. Any changes are immediately reflected in both the graphical interface and the visualization. Because the graphical interface represents the view status and at the same time serves to manipulate it, the conceptual gap is narrowed. Yet the interaction (with the controls) and the visual feedback (in the graph visualization) are still spatially separated.
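Here is a minimal sketch of such an interface, assuming the shiny package in R; the control names and data are hypothetical, for illustration only. Notice the slider sits beside the plot — exactly the spatial separation the authors describe.

```r
library(shiny)
library(ggplot2)

ui <- fluidPage(
  # The control lives in the interface, apart from the visualization
  sliderInput("width", "Zoom: x-axis width", min = 1, max = 5, value = 5),
  plotOutput("view")
)

server <- function(input, output) {
  output$view <- renderPlot({
    center <- mean(range(mtcars$wt))
    half   <- input$width / 2
    ggplot(mtcars, aes(wt, mpg)) +
      geom_point() +
      coord_cartesian(xlim = c(center - half, center + half))
  })
}

shinyApp(ui, server)
```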

Direct manipulation. An example of this is a graphical element touched by a pointer directed with a mouse or trackpad. The user zooms the view directly by drawing an elastic rectangle around the nodes to be inspected in detail. This is a rather simple press-drag-release operation when using the mouse. During the interaction, visual feedback constantly indicates the frame that will make up the new view once the mouse button is released. Any necessary fine-tuning can be done using the mouse wheel. In this scenario, the manipulation of the view takes place directly in the visualization. There is no longer a spatial separation between the interaction and the visual feedback. Or is there?
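Shiny also supports this elastic-rectangle idiom through plot brushing; here is a minimal sketch, again with hypothetical names and data. The press-drag-release happens on the plot itself, so the interaction and the visual feedback share the same space.

```r
library(shiny)
library(ggplot2)

ui <- fluidPage(
  # Brushing draws the elastic rectangle directly on the visualization
  plotOutput("view", brush = "zoom_brush", dblclick = "reset")
)

server <- function(input, output) {
  limits <- reactiveValues(x = NULL, y = NULL)

  # Releasing the mouse button sets the new view to the brushed frame
  observeEvent(input$zoom_brush, {
    b <- input$zoom_brush
    limits$x <- c(b$xmin, b$xmax)
    limits$y <- c(b$ymin, b$ymax)
  })

  # Double-click to restore the full view
  observeEvent(input$reset, {
    limits$x <- NULL
    limits$y <- NULL
  })

  output$view <- renderPlot({
    ggplot(mtcars, aes(wt, mpg)) +
      geom_point() +
      coord_cartesian(xlim = limits$x, ylim = limits$y)
  })
}

shinyApp(ui, server)
```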

Direct touch. To obtain a yet higher degree of directness, the interaction can alternatively be carried out using touch input on the display. Think iPhones and iPads. The basic principle of sketching an elastic rectangle to specify a new view is maintained, but the action is performed using a finger directly on the display. Now, the interaction takes place exactly where the visual feedback is shown. A truly direct way of zooming in on a node-link diagram.

11.3 Level of interaction

So to decide what level of interaction is appropriate, what should we consider? Our audience and purpose, as always — the type of interactivity we provide should fit our audience. Before we think about audience, though, let’s briefly consider why we tend to like interactivity.

Forty years ago, Jacques Bertin explained that interactivity enables us to more quickly find relationships in the data:

A graphic is not “drawn” once and for all; it is “constructed” and reconstructed until it reveals all the relationships constituted by the interplay of the data. The best graphic operations are those carried out by the decision-maker himself (Bertin 1981).

Interesting. This sounds a lot like how we distinguished exploratory graphics from explanatory graphics, doesn’t it? He seems to be talking about what we’ve described as exploratory work, for ourselves.

And in exploring, what are some of our goals? We may want to mark something as interesting. We want to see different things, or different arrangements of things, or a different representation or transformation of a thing. We want to look at more or less detail in a group of things. We want to see a thing given that some other thing happened — in other words, to see a set of data conditionally. Very importantly, we want to see related things, so we can compare and contrast them to find meaning. Or maybe we want to go back to a previous data graphic. I’m sure you could list many more reasons, right?

Having started thinking about why we want interactive graphics, let’s think about how we interact, even if we’ve already discussed a few ways, and even if this seems obvious. It is helpful to list or categorize these ways so we have a sense of the possibilities.

First off, we have common interfaces we can use to interact. We have a keyboard. We have a mouse. We have a trackpad. And with some devices, we can touch the screen directly. Now there are other ways, like virtual reality, but for this discussion we’ll stick with the common ways I’ve just listed.

We can interact with graphical systems through actions you’ll find familiar, either with our hands directly on the screen or with a pointer we direct with a mouse or trackpad. Let me note something a bit special about pointing or hovering: it is something we can easily do with a trackpad but can’t when using our hands directly, right? So some of these actions are specific to the interface we choose. Of course, we’ve already seen several of these in action with our examples: scrolling and clicking. We can also interact with only part of the click action, which actually involves several sub-actions; we can press and hold to perform a separate action from an immediate release, for example. And we can drag things, swipe, pinch or spread, rotate, and use various other gestures. And we shouldn’t be surprised when new actions become available to us, right? The systems we use listen for these actions and link them to various graphical elements — or we do that ourselves by coding the instructions.

And we’ve already seen various ways that actions can trigger events. The rent-or-buy article, again, linked our actions for entering text and for pressing and dragging a slider to other graphical elements, right? And the yield curve article linked clicking on the back and next buttons to advancing animations and graphic views. The first example, with the baseball fields, linked our actions to the graphical elements more directly, right?

So we can either use widgets or link actions directly to data elements.

And when we start mixing and matching things and ideas, we have more options than we’d like to count. But we should match what we do to our audience’s past experiences and expectations, provide descriptions and explanations for any new ways we want our audience to interact (that is, take actions to get responses), or both. Part of thinking about our audience is thinking about what tasks they may want to perform. I’ve categorized common tasks as, one, specifying data and a view; two, manipulating that view; and three, considering the process and provenance of the data and analysis. So we can perform various tasks that relate to these: we choose visual encodings; filter, sort, and transform data; select; show more or less detail; link views; organize multiple views together; and so forth. If these are common tasks, how might we think about ordering or arranging what’s available to an audience?

11.4 Overview, zoom, filter, details

Twenty-five years ago, the computer scientist Ben Shneiderman wrote an article, Shneiderman (1996a), that generalized how we tend to want to interact with information. First we want an overview of the entire collection — whatever that means — then we zoom in on things we are interested in, we filter out things we are not interested in, and we select particular items or groups of items to get more details when needed. Totally makes sense, right?

As for how timely the advice remains, other research continues to cite Ben’s work, and we see this mantra all over the place. By the way, Ben focuses on user experience generally, not just data graphics. So what does it mean to have an overview of something, and what does he mean by an entire collection?

Let’s pull out the Oxford English Dictionary. Overview is defined in part as “a comprehensive review of facts,” “a broad or overall view of a subject,” and “a view from above” — which sounds something like zooming out to see the whole of something, right? Before thinking about this in terms of data graphics, let’s start with something more familiar: books.

Consider Darwin’s sixth edition — sixth edition — of The Origin of Species, and the first page of the table of contents:

Figure 11.1: First page of the table of contents in Darwin’s Origin of Species, sixth edition.

Is the table of contents an overview of the book? Can a book’s table of contents be an overview? Is an overview summarizing? Showing all of something? Both? Neither? Why or why not?

What about a book’s introduction? Review the introduction from Darwin’s book:

Figure 11.2: Excerpts from the introduction of Darwin’s book.

In it, he tells his readers what each chapter is about. Is this an overview? Why or why not? Or should we show every word in his book? A company did just that. The company Fathom created a data graphic of all the words in his sixth edition and color-coded each word by the edition in which Darwin first published that word or sentence (Fry, n.d.). Please review their original, but here’s a screenshot:

Figure 11.3: Screen shot from interactive visual representation of editions of Darwin’s book.

It’s also interactive, by the way, so I’ll invite you to click and check it out before continuing. The encodings show that the entirety of chapter VII did not appear until the sixth edition.¹ So is this an overview? Or was Darwin’s table of contents an overview? Neither? Or were both? Let’s listen to what another researcher in visualization says.

Tamara Munzner, a professor of computer science and visualization researcher, explained in Munzner (2014):

“A vis idiom that provides an overview is intended to give the user a broad awareness of the entire information space. A common goal in overview design is to show all items in the dataset simultaneously, without any need for navigation to pan or scroll. Overviews help the user find regions where further investigation in more detail might be productive.”

Does this help explain the mantra in Shneiderman (1996b), “Overview first, zoom and filter, then details-on-demand”? Munzner thinks the intention is to show the “entire information space”. “Show all items in the dataset simultaneously.” What do you think?

It’s pretty easy to show lots of observations of a few variables at the same time. But what about when we have many variables? How can we accomplish this? What use would it be?

Remember all the exploratory work we considered for the Citi Bike example, looking at different partial views of the data? And from that work, we created Spencer (2019). This graphic shows all station locations — each station’s id, latitude, and longitude — and where each sits in relation to geographic boundaries. It shows when a station is empty or full, which we calculated, and the time of day each such event happened. It shows the overall activity level throughout the 24-hour period, and the times of sunrise and sunset. But, if you recall (or go back and review now), even this version did not show all the information. We had many more data variables that we did not show here. We had, at any given moment, the number of bikes and the number of available docks; we had each rider’s sex and birth year; and we had weather information we could join with each ride, including time, temperature, wind speed and direction, humidity, rain or sunshine or clouds, and more. We had also found that subway entrance and exit locations seemed relevant, along with local traffic conditions.

Given all these data we did not show, is Spencer (2019) what Munzner would call an overview?

11.5 Overview depends on purpose

A couple more authors have investigated what visualization designers mean when they use the term overview. As published in Hornbæk and Hertzum (2011), the authors gathered, from thousands of journal articles and conference proceedings, descriptions that use “overview” and categorized the concepts that the term was meant to convey. From that study, the authors conclude that what constitutes an “overview of an information space” depends:

What constitutes an overview of an information space may differ depending on whether the task is a monitoring task, a navigation task, a planning task, or the user has some other focus.

It depends on what kind of task is being performed, and what focus the user has in working with the graphic. Ah, that makes more sense! The authors also modeled a working definition of overview from all the researched uses. Here it is:

Figure 11.4: Empirically studied uses of the term overview in the context of interactive graphics.

While it’s shown as a schematic, we can read it as a definition with parts. An overview is an awareness of the content, structure, and changes of an information space, acquired through pre-attentive cues, either when initiating a task or in the course of one. An overview is useful for monitoring, navigating, exploring, understanding, and planning. To be useful, it should perform well. And we create such overviews using either static or dynamic visuals that shrink, or zoom out from, the entire information space we’re interested in.

So if we’re interested in measures of a single variable, we show all its measurements together. Or if we’re interested in two attributes of our observations, we show all of those together. And if you think about it, we reason about relationships between things, but it becomes increasingly difficult to reason about many interactions at once. So maybe our overview shows all measurements of the attributes related to the particular interaction or relationship for a specific task.
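As a minimal sketch of that idea in ggplot2 — the dataset is a built-in example, chosen only for illustration — an overview of one variable can show every measurement in a single distribution, and an overview of two attributes can show every observation at once:

```r
library(ggplot2)

# One variable: all measurements shown together in one distribution
ggplot(diamonds, aes(x = price)) +
  geom_histogram(bins = 50)

# Two attributes: all observations shown together in one scatterplot
ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1)
```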

Now that we’re starting to think about an overview first, Shneiderman says we typically follow that up by zooming or filtering. Let’s go back to our example, Darwin’s book and the Fathom project.

Notice that, first, we decode all the color-coded words together. Then we can hover over any sentence to zoom in on it. The zoomed-in sentence is shown as a tooltip, similar to how we commonly (though not exclusively) show details-on-demand with interactive graphics.

11.6 An aside, vector versus raster

As an aside, it’s important to keep in mind how our graphics are rendered. There are two ways: as pixels (raster graphics) or as shapes (vector graphics). And there are trade-offs to each. If we zoom in on raster graphics, which represent shapes as pixels, they start to look blocky or low-resolution. If we specify that our graphics render shapes as vectors, then when we zoom in, we keep high resolution. That’s a major advantage of vector graphics, and there are other advantages we won’t discuss now.

But there can be a disadvantage, too. When there are millions of shapes to be shown on a graphic, our computers get bogged down and rendering slows. If we render the graphic with pixels instead, that isn’t an issue.

The example Citi Bike graphic we just looked at shows more than 100,000 shapes; the file size gets a little large, and it can be a little slow to display on some devices. If there were millions of shapes, then I’d probably need to change to pixel-based rendering. For the interactive version of Spencer (2019), which allows the reader to hover over stations for their street names and for activity levels throughout the day, we had to rasterize the layer² containing all the segments indicating empty or full stations just for the interactivity to work at all on contemporary computers. For those familiar with Adobe’s creative suite, Illustrator works well with vector graphics, and Photoshop does the same for raster graphics.
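Here is a minimal sketch of that mixed approach, assuming the ggrastr package from Petukhov, Brand, and Biederstedt (2021); the data is simulated for illustration. Only the dense layer becomes pixels, while the axes, text, and remaining layers stay vector:

```r
library(ggplot2)
library(ggrastr)

# Simulated dense layer: half a million points
many_points <- data.frame(x = rnorm(5e5), y = rnorm(5e5))

ggplot(many_points, aes(x, y)) +
  # Rasterize only this layer, keeping the file small and rendering fast
  rasterise(geom_point(alpha = 0.05, size = 0.2), dpi = 300) +
  # This layer, and the axes and labels, remain crisp vector shapes
  geom_smooth(method = "lm")
```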

11.7 Actions for zoom, filter, details

Next, what actions should we consider linking, or binding, to zooming, filtering, and showing details-on-demand? Before concluding, let’s consider what an editor at The New York Times says.

Archie Tse, Deputy Graphics Director at The New York Times, in 2016 explained,

Readers just want to scroll. . . . If you make the reader click or do anything other than scroll, something spectacular has to happen.

He explained that the Times used to publish lots of interactive graphics that asked the reader to click or do other things. By watching readers’ behavior on its website articles, the Times learned that most users do not click, use a drop-down list or a slider, or manipulate other widgets; those things tend to feel like barriers between the reader and the information. And other research suggests that, at least for some audiences, even adding narrative may not be enough to overcome this learned behavioral preference (Boy, Detienne, and Fekete 2015).

But if we link graphic responses to just scrolling, our users tend to like that. This works particularly well when we hold a graphic in place while scrolling text beside it.

Then we link changes to the graphic to locations of anchors in the scrolled text (and animate the transitions between these states). Again, he says, “Readers just want to scroll… If you make the reader click or do anything other than scroll, something spectacular has to happen.”

Reconsider the three publications cited at the beginning of this section, their specific purposes, and what each published article asked of its reader.

Very recently, Gregor Aisch — CTO of Datawrapper and formerly a graphics editor, also at The New York Times — weighed in (Aisch 2017). Gregor explained that we can still ask users to take actions beyond scrolling, and that things like hovering and clicking can be helpful for three reasons: they help interested readers dig deeper into the information and details; they help them see different views of the data; and, by showing them more, they help build trust through transparency. Just don’t hide important information behind an interaction:

“Knowing that the majority of readers doesn’t click buttons does not mean you shouldn’t use any buttons. Knowing that many, many people will ignore your tooltips doesn’t mean you shouldn’t use any tooltips. All it means is that you should not hide important content behind interactions,” like “mak[ing] the user click or hover to see it.”

In our discussion, we have begun to consider the foundations of modern user interaction in the context of data-driven visuals and narratives. Such interactions may trigger scrolling, an overview, zooming in or out, filtering, or details-on-demand; they may provide context for a relationship or its history; or they may extract information. Other helpful interactions include brushing and linking, hovering, clicking, selecting, and gestures. Allowing interaction means authors keep less control over the intended narrative, so we may think of interaction as giving the audience some level of co-creation in the narrative.

Next, we begin to consider the technologies that enable these interactions.


  1. Interestingly, Darwin’s “survival of the fittest” did not appear until the fifth edition.↩︎

  2. We used Petukhov, Brand, and Biederstedt (2021), which works well with ggplot objects in R.↩︎