Hello everyone, good evening. Welcome to Wednesday night live from New York! Tonight we’re going to start thinking about how our audiences can interact with data graphics.

All aspects of communication come into play when we bring interactivity into our communications. Before we get started, let’s remind ourselves of our course timeline.

Here’s our course timeline.

You’ve just turned in your most challenging individual practice with graphics, homework four. The concepts you’ve worked through in that homework should prepare you well for thinking through how to transform data and map it to visual variables in ways that let you explore new questions and see things in different ways.

Next week, you’ll turn in your proposals. Then, your group will have a couple of weeks to create an interactive graphic, and another week after that, present to the CEO.

Before we jump into the topics, let’s take a poll to get some idea of how you are feeling about learning interactivity for data graphics and communications.

[ACTIVATE, SCROLL DOWN]

[DISCUSS, DEACTIVATE, SCROLL UP]

Now, before we get started with interactivity specifically, I want us to take note.

Every design concept we’ve learned for static, data-driven visual narratives applies equally to interactive communications. Every one of them.

Adding interactivity to poor design only creates a poor communication.

To motivate our discussion on interactivity, let’s begin by considering some examples, some of which you’re already familiar with.

We considered this visual a few weeks ago when discussing context, right? Tonight, let’s review the same graphic to answer a couple of related, but different questions. First,

What audience actions does the graphic respond to?

Second,

How, specifically, does the graphic respond?

[DISCUSS]

Let’s now consider a second example.

This example, and the next two following, were all published in the New York Times. I’ve just pasted static versions of them here, so let’s go to the original, interactive version.

I’ll ask you the same questions. First,

What audience actions does the graphic respond to?

Second,

How, specifically, does the graphic respond?

[DISCUSS]

Let’s do another one.

Again. First,

What audience actions does the graphic respond to?

Second,

How, specifically, does the graphic respond?

[DISCUSS]

And for this one. First,

What audience actions does the graphic respond to?

Second,

How, specifically, does the graphic respond?

[DISCUSS]

What about RStudio itself? As I’ve written here:

RStudio is a web browser application with a graphical user interface. We can “interact” with it by entering text (code), clicking buttons, etc. In response, RStudio changes the view of the data graphic we see.

Is this an interactive graphic?

Here’s another software tool, Tableau. As I’ve written,

Tableau is a software application with a graphical user interface. We can “interact” with it by dragging and dropping, clicking buttons. In response, Tableau changes the view of the data graphic we see.

Is Tableau an interactive graphic?

Here’s yet another tool, Lyra. Again, as I’ve written,

Lyra2 is a web browser application with a graphical user interface. We can “interact” with it by dragging and dropping and clicking buttons, similar to Tableau, but it is free, open-source, and based on the powerful Vega and D3 JavaScript libraries. In response, Lyra2 changes the view of the data graphic we see.

Is this an interactive graphic?

Let’s consider what we mean by interactivity.

Interactivity is explained as a “human in the loop” of executing and evaluating information. In each of these examples, there is an iterative process, with a “human in the loop”, where we do something with a system to further a goal, the system responds visually, and we evaluate the new visual.

Back to this question about whether RStudio is an interactive graphic.

We might think of interactivity on, say, five levels.

You might think of these forms of interaction with data graphics as including source code editing, scripting commands, graphical interfaces, direct manipulation, and direct touch. I’ve written examples beside each of these categories.

Let’s consider each of these.

Source code editing. On the implementation level, a visual representation is defined by source code. Only a few lines of code need to be edited to set the visualization view to where the nodes of interest are located. The altered code is compiled and run to see if the visual result is as expected. If not, the procedure is re-run until the user is satisfied. Changing code lines, re-compiling, and test-running the visualization is the least direct form of interaction, as it exhibits large conceptual, spatial, and temporal separation.

Scripting commands. Alternatively, the visualization may offer a scripting interface allowing the user to enter commands to change the view. Once issued, the commands take effect immediately while the visualization is running. In this scenario, no separate compilation is necessary, which reduces the temporal separation. But still the interaction is rather indirect, and several commands may be necessary before the view fits as desired.

Graphical interface. The field of view is displayed in a graphical interface alongside the visualization. Standard controls such as buttons and sliders allow the user to easily shift the view and control its zoom factor. Any changes are immediately reflected in the graphical interface and the visualization. Given the fact that the graphical interface represents the view status and at the same time serves to manipulate it, the conceptual gap is narrowed. Yet, the interaction (with the controls) and the visual feedback (in the graph visualization) are still spatially separated.

Direct manipulation. The user zooms the view directly by drawing an elastic rectangle around the nodes to be inspected in detail. This is a rather simple press-drag-release operation when using the mouse. During the interaction, visual feedback constantly indicates the frame that will make up the new view once the mouse button is released. Any necessary fine-tuning can be done using the mouse or trackpad. In this scenario, the manipulation of the view takes place directly in the visualization. There is no longer a spatial separation between the interaction and the visual feedback. Or is there?

Direct touch. Indeed there remains some degree of separation. The interaction is carried out with the mouse, whereas the feedback is shown on the screen. To obtain a yet higher degree of directness, the interaction can alternatively be carried out using touch input on the display. Now, the interaction takes place exactly where the visual feedback is shown. A truly direct way of zooming in on a node-link diagram.

So to know what level of interaction seems appropriate, we should consider what? Our audience and purpose?

So maybe we need to consider our audience in deciding what type of interactivity we want to provide? Before we think about audience, though, let’s briefly consider why we might find interactivity helpful.

Remember Jacques Bertin, whom I’ve introduced you to several times now? Let’s read together what he says about graphics.

“A graphic is not ‘drawn’ once and for all; it is ‘constructed’ and reconstructed until it reveals all the relationships constituted by the interplay of the data. The best graphic operations are those carried out by the decision-maker himself.”

Interesting. This sounds a bit like how we distinguished exploratory graphics from explanatory graphics, doesn’t it? He seems to be describing what we’ve called exploratory. Do you agree with that?

And in exploring what are some of our goals?

Do these I’ve listed here resonate with you? We want to mark something as interesting. We want to see different things, or a different arrangement of things. Or a different representation or transformation of a thing. We want to look at more or less detail in a group of things. We want to see a thing given that some other thing happened, in other words, see a set of data conditionally. Very importantly, we want to see related things, so we can compare and contrast them to provide meaning.

Or maybe we want to go back to a previous data graphic.

I’m sure you could list yet more reasons, right?

Having started thinking about why, let’s think about how we interact, even if we’ve already discussed a few ways, and even if this seems obvious. It is helpful to list or categorize them so we have a sense of possibilities.

First off, we have common interfaces we can use to interact. We have a keyboard. We have a mouse. We have a trackpad or a touch screen. Now there are other ways, like virtual reality, but for this discussion we’ll stick with the common ways I’ve just listed.

Now I’ve illustrated common actions we can take, either with our hands directly on the screen, or with a pointer we direct with a mouse or trackpad.

Let me say something that’s a bit special about pointing or hovering. That’s something we can easily do with the trackpad but can’t so easily when using our hands directly, right? So some of these actions are specific to what interface we choose.

Of course, we’ve already seen several of these in action with our examples. Scrolling and clicking.

We can also interact using only a part of the click action.

It actually involves several sub actions. We can press and hold to perform a separate action from the immediate release, for example. And we can drag things, swipe. Pinch or spread, rotate, and use various other gestures.

And we shouldn’t be surprised when new things become available to us, right?

So the systems we use listen for these actions and link them to various graphical elements. Or we do that ourselves by coding the instructions for the computer to do so.

And we’ve already seen various ways that actions can trigger events. In the rent or buy article example, it linked our actions for entering text and our actions for pressing and dragging a slider to other graphical elements, right? And in the yield curve article, it linked clicking on the back and next buttons to advancing animations and next graphics views. The first example with the baseball fields more directly linked our actions to the graphical elements, right?

So we can either use widgets or link actions directly to data elements.

And when we start mixing and matching things and ideas, we have more options than we might count. But we should either match what we do to our audience’s past experiences and expectations, provide descriptions and explanations for new ways we want our audience to interact, that is, take actions to get responses, or both.

Part of thinking about our audience is thinking about what tasks they may want to perform. I’ve listed a few of these.

I’ve categorized common tasks as, one, specifying data and a view. Two, manipulating that view. And three, considering the process and provenance of the data and analysis. So we can perform various tasks that relate to these things. We choose visual encodings, filter, sort, transform data. Select. Show more or less detail. Link views. Organize multiple views together. And so forth.

If these are common tasks, how might we think about ordering or organizing what’s available to an audience?

Twenty-five years ago, the computer scientist Ben Shneiderman wrote an article that generalized how we tend to interact with information.

First we want an overview of the entire collection (whatever that means), then we zoom in on things we are interested in, we filter out things we are not interested in, and we select particular items or groups of items to get more details when needed.
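To make those four steps concrete, here is a minimal sketch in base R. It assumes R’s built-in mtcars data stands in for “the entire collection”; the thresholds and the selected car are arbitrary choices for illustration, not anything from Shneiderman’s article.

```r
# A sketch of "overview first, zoom and filter, then details on demand"
# using the built-in mtcars data (an assumption for illustration only).
cars <- mtcars
cars$model <- rownames(mtcars)

# Overview: summarize every observation for a few variables at once.
overview <- summary(cars[, c("mpg", "hp", "wt")])

# Zoom: narrow the range we look at, here heavier cars (wt is in 1000 lbs).
zoomed <- cars[cars$wt > 3.5, ]

# Filter: remove items we are not interested in, keeping 8-cylinder cars.
filtered <- zoomed[zoomed$cyl == 8, ]

# Details on demand: pull every attribute of one selected item.
details <- filtered[filtered$model == "Cadillac Fleetwood", ]
```

In an interactive graphic, each of these steps would be triggered by an audience action rather than by a line of code, but the underlying data operations are much the same.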

Totally makes sense, right?

If we judge how timely the advice still is by how often it is cited, other research continues to cite Ben’s work, and we see this mantra all over the place.

By the way, Ben focuses on user experience, not just data graphics.

So what does it mean to have an overview of something and what does he mean by an entire collection?

Let’s pull out the Oxford English Dictionary.

Here’s the entry for an overview. Notice the second and third definitions. It’s defined in part here as

“a comprehensive review of facts,” “a broad or overall view of a subject,” and “A view from above,”

which sounds something like zooming out to see the whole of something, right?

Before thinking about this in terms of data graphics, let’s start with something more familiar. Books.

Here, this example, is the cover of Darwin’s sixth edition — sixth edition — of The Origin of Species, and the first page of the table of contents.

Is the table of contents an overview of the book? Can a book’s table of contents be an overview?

Is an overview summarizing? Showing all of something? Both? Neither?

Why or why not?

What about a book’s introduction? Here, I’ve pulled the introduction from Darwin’s book. And in it, I’ve highlighted where he tells the reader what each chapter is about. Is this an overview? Why or why not?

Or should we show every word in his book? Sort of, lay all the pages side by side and step far back to look at them all at once?

A company did just that. Let’s see what it looks like.

Ben Fry, founder of the company Fathom, created a data graphic of all the words in Darwin’s sixth edition, and color coded each word by the edition in which Darwin first published that word or sentence. It’s also an interactive, so I’ll invite you to click and check it out.

So is Ben’s interactive graphic of Darwin’s text an overview? Or was the table of contents an overview?

Let’s consider what another researcher in visualization says.

Tamara Munzner, an information designer, explains what I’ve quoted here. Let’s read what she says together.

“A vis idiom that provides an overview is intended to give the user a broad awareness of the entire information space. A common goal in overview design is to show all items in the dataset simultaneously, without any need for navigation to pan or scroll. Overviews help the user find regions where further investigation in more detail might be productive.”

Does this help explain Shneiderman’s

“Overview first, zoom and filter, then details-on-demand”?

Munzner thinks the intention is to show the

“entire information space”. “Show all items in the dataset simultaneously.”

What do you think?

It’s pretty easy to show lots of observations of a few variables at the same time. But what about when we have many variables? How can we accomplish this? What use would it be?

Remember the exploratory work we’ve been doing on CitiBike, looking at different partial views of the data?

In our very first lecture, we talked about this graphic.

This graphic shows all station locations, each station’s id, and the latitude and longitude. It shows where that is in relation to geographic boundaries. It shows when a station is empty or full, and the time of day when that event happened. It shows overall activity level throughout the 24-hour period, and it shows the time of sunrise and sunset.

But, if you recall, even this version of the data did not show all the information. We had lots more data variables that we did not show here. At any given moment, we had the number of bikes and number of available spaces. We had data on the sex and birth year of each rider. And we had weather information we could join with each ride, which includes time, temperature, wind direction, humidity, rain or sunshine or cloudy, and more.

Is this an overview?

We can see the interactive version:

[SCROLL DOWN]

A couple more researchers have investigated what visualization designers mean when they use the term overview. They gathered, from thousands of journal articles and conference proceedings, descriptions that use “overview” and categorized the concepts that the term was meant to convey.

Here’s the interactive version.

[SCROLL UP]

From that study, they conclude,

“What constitutes an overview of an information space may differ depending on whether the task is a monitoring task, a navigation task, a planning task, or the user has some other focus.”

Does that help us reason about an overview?

The authors also modeled a working definition of an overview from all the uses of the term in previous research.

Here it is. It’s shown as a schematic, but we can read it as a definition with parts. It’s much more complex than what Oxford suggested.

An overview is an awareness of the content, the structure, and the changes, of an information space that is acquired by pre-attentive cues, either when initiating or during a given task. An overview is useful for monitoring, navigating, exploring, understanding, planning. To be useful it should have good performance. And we create such overviews using either static or dynamic visuals that shrink or zoom out to see the entire information space we’re interested in.

So if we’re interested in measures of a single variable, we show all measurements together. Or maybe measures for two attributes of observations, we show all those. And if you think about it, we reason about relationships between things.

But it becomes increasingly difficult and complex to reason about many interactions of things at once. So maybe our overview shows all measurements of attributes related to the particular interaction or relationship for a specific task.

Now that we’re starting to think about an overview first, Shneiderman says we typically follow that up by zooming or filtering. Let’s go back to our Darwin example.

Here’s an example. After first seeing all the color-coded words together, we can hover over any sentence to zoom in on that sentence. The zoomed-in sentence is shown as a tooltip, similar to how we commonly show details on demand with interactive graphics.

[SCROLL DOWN FOR INTERACTIVE]

[SCROLL UP]

We’ve seen this example before, right? It’s a portion of a report. These are screenshots. Let’s explore the interactive version a moment.

[SCROLL DOWN]

Notice that the description above the graphic describes what actions the graphic responds to, and how the graphic responds to those actions.

[SCROLL UP]

By the way, it’s important to keep in mind how our graphics are rendered. There are two ways: as raster graphics, or as shapes, which we call vector graphics. And there are tradeoffs to each.

If we zoom in on raster graphics, which are shapes in the form of pixels, they start looking ugly or low resolution. I’m showing you that on the left.

If we specify that our graphics render shapes as vectors, then when we zoom in, we keep high resolution. That’s a major advantage to vector graphics, and there are other advantages we won’t discuss now.

But there can be one disadvantage to vector graphics, too. When there are millions of shapes to be shown on a graphic, our computers get bogged down and performance of showing the shapes slows terribly. But if we render the graphic with pixels, that isn’t as much an issue.

The example CitiBike graphic we just looked at has hundreds of thousands of shapes shown, so the file size gets a little large and it can be a little slow to show them on some devices. If there were millions, then I’d probably need to change to pixel-based shapes.

I just wanted to raise this issue as you’re thinking about your graphics. Makes sense? Cool.

I want us to think about what actions we should link to zooming, filtering, and showing details. Let’s consider what an editor at the New York Times says.

A few years ago, she explained that they used to show lots of interactive graphics that asked the user to click, or do other things. And the Times has learned, by watching readers’ behaviors on their website articles, that users mostly do not click, or use a drop down list or a slider, or other widget. They learned that those things tend to feel like barriers for the user to get information.

But if you link graphics responses to just scrolling, users like that. She says,

“Readers just want to scroll. If you make the reader click or do anything other than scroll, something spectacular has to happen.”

And, here, I’m showing you conceptually what scrolling can do. And really, it’s generally a better use, in my opinion, for interactive graphics.

It is common that a graphic is complex, and needs a lot of explanation to understand it. And if you think about books you’ve read, and how they keep discussing a data graphic and have to refer back to it by a number, even multiple pages back, right? It would be great if we could keep the data graphic in view the whole time. Kind of like having two versions of the book. One to keep the graphic open. The other to read the explanations.

So that’s one place that scrollytelling can shine. We can keep a data graphic on the screen in place while scrolling the text that explains the graphic. And we can even trigger changes to that data graphic based on where the reader has scrolled to! That’s pretty awesome!

This is the approach we saw in a couple of the examples we looked at earlier tonight, right? So should we never use anything else? How do we decide?

Well, first let’s keep in mind the audience. Again, that’s a news organization whose purpose is to generally inform an audience, and many in that audience have different levels of interest in any given article, right? So we need to judge how motivated our audience is in accessing the information before deciding what we want to make them do.

Here’s more advice from Gregor, who’s the current CIO at Datawrapper and a former editor at the New York Times. He says

we can still ask users to take actions beyond scrolling, and that things like hovering and clicking can be helpful for three reasons. It helps interested readers dig deeper into the information and details. It helps them see different views of the data. And by showing them more, it helps to build trust through transparency.

But we should never hide important information behind those interactions!

Ok, tonight, I’m going to very briefly introduce you to some tools to start adding interactivity. The first two build on what you’ve already learned. The third, Tableau, I’ll explain in terms of what you’ve learned.

By using another library called plotly, we can obtain basic interactivity for the same graphics you’ve already made in one step. Just save your grammar of graphics to an R object, and give that object to a function called ggplotly.

Now the library lets you do more than this, but we won’t go into it today.

It’s that easy to get some default interactivity. It’s when you want to customize particular interactivity that it gets more challenging.
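Here is a minimal sketch of that one step, assuming the ggplot2 and plotly packages are installed. The built-in mtcars data and the object names p and p_interactive are just placeholders for illustration; substitute a graphic you’ve already made.

```r
library(ggplot2)
library(plotly)

# Save the static grammar-of-graphics specification to an R object ...
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

# ... then give that object to ggplotly() to get default interactivity,
# such as hover tooltips, zooming, and panning.
p_interactive <- ggplotly(p)

# Typing p_interactive at the console in RStudio displays the widget.
```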

Let’s look at another tool that extends the grammar of graphics.

Another R library called ggiraph aims to create a grammar of interactive graphics. To get interactivity with ggiraph, letting it make default choices for you, is almost as easy. We load the library.

For almost every geom you’ve used from ggplot2, this library provides a corresponding substitute function, named the same except it has underscore interactive at the end of its name. And you have to give it one or two extra arguments: a data_id and, if you want a tooltip, then a tooltip. The most important is the data_id.

[DISCUSS DATA_ID]

Save the graphic into an object like with the other library.

Then, pass the object to a function called girafe() through its parameter named ggobj.
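Those steps can be sketched as follows, assuming the ggplot2 and ggiraph packages are installed. The mtcars data and the model column I add to it are stand-ins for illustration; any variable that identifies each observation could serve as the data_id.

```r
library(ggplot2)
library(ggiraph)

# Illustration data: add a column that identifies each observation.
cars <- mtcars
cars$model <- rownames(mtcars)

# Use the _interactive substitute for geom_point, supplying a data_id
# and, because we want one, a tooltip.
p <- ggplot(data = cars, mapping = aes(x = wt, y = mpg)) +
  geom_point_interactive(mapping = aes(data_id = model, tooltip = model))

# Pass the saved graphic to girafe() through its ggobj parameter.
g <- girafe(ggobj = p)

# Typing g at the console in RStudio displays the interactive graphic.
```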

Again, pretty easy. Now, you can do much more with this library too, and we’ll get into more detail next week.

Next, let’s discuss Tableau.

Tableau is a business-oriented software tool that gives users the ability to create interactive graphics by dragging and dropping. No code needed. It’s easy to use, you don’t need to attend some class like this, just go onto YouTube and play with the software a bit.

Now you’ve been gaining experience with using the grammar of graphics to transform data and create graphics.

Tableau is not capable of many of the things you have been practicing in your homeworks. But Tableau has valid use cases.

Many people, perhaps including some executives we communicate with, a) do not code and do not want to learn to code and b) still need to consider different views of data. These audiences do, however, understand how to use a mouse or trackpad and touch things on the screen with it. First, Tableau makes it pretty easy to create standard, exploratory graphics. Second, it provides some forms of interactivity with almost no work. Together, this means you could give a Tableau file to, say, an executive, where you’ve already set up a graphic or dashboard, and your audience can to some limited extent explore different views of the graphic just by basic actions they are already familiar with.

That’s actually the primary use I’ve personally had with the tool. I don’t personally use it for any of my own work as it’s far less powerful, but there have been a few clients that want to get a graphic in this form specifically because anyone can use it; it doesn’t require any special skills. We’ll see that in a moment. So it can be given to executives to play around with, or interact with, graphics that someone else made with little to no real training.

So let’s identify its point and click interface with the grammar of graphics functionality, with which you are already familiar. I’m hoping that this approach will get you familiar with the tool quickly.

First, we need to import data, just as we need to do in R or Python. In Tableau, there are a few ways to do this. You can just

Once data are in Tableau, as with having data.frames in R, we can begin to graph it. In Tableau, the data take the form of a list, on the left side of the screen, of tables with variables. This list is a bit like, in ggplot2, where you provide the name of a data.frame to the data parameter in the function ggplot or one of the geom_ functions. Does that make sense?

So with these variables, we can begin mapping them to visual elements of tables. Let’s begin with axes.

So we can drag and drop a variable into the “columns” entry. Tableau, I think, names its axes “columns” and “rows” because many people transition from Microsoft Excel to Tableau for graphing, and Excel users are familiar with those concepts. But they mean the x and y axes, respectively. So dragging and dropping onto the columns entry is like, in ggplot, assigning the variable to a parameter inside mapping = aes(); here that parameter is x =.

Mapping variables onto the y axis works the same way. Drag the variable over to the rows entry and drop it in. This is, again, like assigning the variable to the parameter y = inside mapping = aes(), in the ggplot function or one of the geom_ functions.

Make sense?

Now, one note. For whatever reason, Tableau makes default choices on how to represent the variable you dragged and dropped onto one of the axes. For continuous variables, for example, it automatically decides to summarise that variable by summing the observations.

Now in my opinion that’s a terrible design choice, but they likely had reasons.

To change Tableau’s default choices, after you drag and drop variables onto the x and y axes, you can click the drop down box to change how the variable is represented. You can think of a dimension as the raw data and a measure as some type of summarization of that data. I’ve listed the R versions of the options in the drop down menu for this example variable.

Or to show the data as-is: the raw values, choose “dimension.”

Ok, along with mapping variables to axes, we can map them to other aesthetics.

Let’s start with mapping variables to color. Just drag and drop the variable onto the color box I’ve highlighted here.

That’s like in ggplot, where we enter the variable name within a geom_ function’s mapping = aes(), assigned to color = or fill =.

Just like with color, we can map a data variable to size. We do this by dragging the variable onto the size box. And this is like, in R/ggplot, setting the size parameter equal to the variable name inside mapping = aes().
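To line up the two tools side by side, here is a minimal ggplot2 sketch. It assumes the built-in mtcars data, and the comments pairing each aesthetic with a Tableau drop target follow the correspondence we’ve just discussed.

```r
library(ggplot2)

# Each aes() entry plays the role of one drag-and-drop in Tableau.
p <- ggplot(data = mtcars) +
  geom_point(
    mapping = aes(
      x = wt,               # Tableau: drop the variable on "Columns"
      y = mpg,              # Tableau: drop the variable on "Rows"
      color = factor(cyl),  # Tableau: drop the variable on the color box
      size = hp             # Tableau: drop the variable on the size box
    )
  )
```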

Make sense?

And we can add data as text on the graphic. To do that, we drag the variable and drop it onto the text box in Tableau. This is like plotting text using the R/ggplot geom_text function. Within that function, we map the variable to the label aesthetic by setting label equal to the variable name. Of course, we also need to specify where on the graphic to map that text using other parameters, like x = and y =. And we need to do the same in Tableau.
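On the R side, a minimal sketch looks like this. It assumes the built-in mtcars data, with a model column I add only so we have some text to plot.

```r
library(ggplot2)

# Illustration data: a text variable to place on the graphic.
cars <- mtcars
cars$model <- rownames(mtcars)

# x and y say where the text goes; label says which variable to print.
p <- ggplot(data = cars, mapping = aes(x = wt, y = mpg)) +
  geom_text(mapping = aes(label = model), size = 3)
```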

Again, make sense?

We can also map a variable summarized by group, which we can do in the grammar of graphics several ways. In R/ggplot, we can separate markings in the aesthetic by a variable we assign to the group parameter. Or before we map to aesthetics, we can use the tidyverse or dplyr functions to group_by and summarise, right?

Similarly, I guess, in Tableau, we can apply those summary statistics, or what Tableau calls Measures, per group as it shows up in a variable. To do that, we drag a variable into the detail box.
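As a reminder of the R side, here is a minimal dplyr sketch, assuming the built-in mtcars data. The summary column names are my own choices for illustration. Summing, as in total_hp, mirrors the default that Tableau picks for continuous variables.

```r
library(dplyr)

# Group by one variable, then compute summaries per group.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    total_hp = sum(hp),    # like Tableau's default SUM measure
    mean_mpg = mean(mpg),  # an alternative summary
    n_cars   = n()         # a count of observations per group
  )
```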

Make sense?

Along with using grouping, we can specify filtering.

In R, we can use the dplyr filter() function to map a subset of the data to aesthetics, right?

To filter the visual markings in Tableau, just drag and drop a variable into the filter box. Tableau then pops up a menu of choices, where you answer questions about what kind of filter you want.
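For comparison, here is a minimal sketch of the R version, assuming the built-in mtcars data and an arbitrary filter condition chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Keep only the subset of observations we want to show.
eight_cyl <- mtcars %>%
  filter(cyl == 8)

# Map the filtered subset, not the full data, to aesthetics.
p <- ggplot(data = eight_cyl, mapping = aes(x = wt, y = mpg)) +
  geom_point()
```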

Now all these mappings of data to aesthetics up to this point, if you think about it, relate to non-changing things. Static components of how we’ve been thinking of using the grammar of graphics.

Tableau lets you add a tooltip, which is a popup box, when you hover your pointer over a data point on the graphic. By default it adds all the variables you have dragged and dropped onto other aesthetics. You can add more by dragging variables onto the tooltip box.

In R, we have it just as easy by using a package called ggiraph. For almost every geometric mapping function, the geom underscore whatever, that package makes available an interactive version. So for geom_point, the ggiraph package gives us the function geom_point_interactive. And the interactive version of the function includes the tooltip as a parameter, where you provide what variable or information you want in the tooltip.

Make sense?

Now, a few minutes ago, we discussed a drop down box where you can choose how to represent the variable, as just the data or as some aggregated measure like sum, min, max, etc.

But beyond basic aggregations, Tableau isn’t really designed for complex data transformations like you’ve been doing in your homework practice.

It does have options similar to formulas in Excel sheets; Tableau calls those calculated fields. But generally speaking, let’s see what an expert in Tableau has said, and I’m quoting him here: “If data reshaping is required, do it before importing into Tableau.”

So generally speaking, if you can code in something like R/tidyverse like you have been doing, it’s easier, more efficient, and more flexible to transform your data in that code, then save the transformed data, then import it into Tableau for your audience to play with in your pre-made graphic.
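Here is a minimal sketch of that workflow, assuming the tidyr package is installed. The small wide table, its column names, and the file name are all hypothetical, made up only to show a reshape followed by a save. A CSV file is one format Tableau can import.

```r
library(tidyr)

# Hypothetical wide data: one column per time period.
wide <- data.frame(
  station = c("A", "B"),
  am = c(5, 7),
  pm = c(9, 3)
)

# Reshape in R first: one row per station and period.
long <- pivot_longer(
  wide,
  cols = c(am, pm),
  names_to = "period",
  values_to = "rides"
)

# Save the transformed data to a file to import into Tableau.
out_file <- file.path(tempdir(), "rides_for_tableau.csv")
write.csv(long, out_file, row.names = FALSE)
```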

Make sense?

Up to now, we’ve been discussing the ways we can map data to aesthetics in Tableau. We’ve learned in the grammar of graphics the importance of optimizing non-data ink, right? Changing the look of things that are not mapped to data?

We can do this in Tableau, too.

To change the look of things like the non-data ink that we change in ggplot or grammar of graphics themes, you can right click on whatever you want to change, and then select or click the format.

That opens a panel that lets you change formatting of various things.

Getting the basic pattern of how you work with Tableau?

Finally, for tonight, you can arrange multiple graphics onto what Tableau calls a dashboard, by clicking on the button for a dashboard, and then dragging graphics from different sheets onto the dashboard.

This is similar to using R packages that make it easy to lay out multiple graphics. Some of those I’ve already shown you. Others we will get to in the coming weeks.

Ok, that’s the new material I wanted to discuss tonight.

I’ve introduced you to three approaches tonight to begin to create interactivity. Let’s put these and other tools into a comparative context.

I’ve created this table to help you make comparisons between tools. It’s on our class website under the topic selections.

[DISCUSS]

Ok, let’s use the remainder of class to review your individual homework four.

[DISCUSS]

Ok, as always, I’m ending by giving you hand-picked resources where you can go more in depth on the topics we’ve discussed tonight.

Here they are.

In your readings and material for next week, I’m going to give you resources to actually make interactions, and we’ll look at these technologies behind the concepts we discussed this week. That’s all for tonight. Great to see you all again. I’ll stay for questions. Otherwise have a great night!