
You might think of the forms of interaction with data graphics as including source code editing, scripting commands, graphical interfaces, direct manipulation, and direct touch. And I’ve written examples beside each of these categories.

Let’s consider each of these.

Source code editing. On the implementation level, a visual representation is defined by source code. Only a few lines of code need to be edited to set the visualization view to where the nodes of interest are located. The altered code is compiled and run to see if the visual result is as expected. If not, the procedure is re-run until the user is satisfied. Changing code lines, re-compiling, and test-running the visualization is the least direct form of interaction, as it exhibits large conceptual, spatial, and temporal separation.

Scripting commands. Alternatively, the visualization may offer a scripting interface allowing the user to enter commands to change the view. Once issued, the commands take effect immediately while the visualization is running. In this scenario, no separate compilation is necessary, which reduces the temporal separation. But still the interaction is rather indirect, and several commands may be necessary before the view fits as desired.

Graphical interface. The field of view is displayed in a graphical interface alongside the visualization. Standard controls such as buttons and sliders allow the user to easily shift the view and control its zoom factor. Any changes are immediately reflected in the graphical interface and the visualization. Because the graphical interface represents the view status and at the same time serves to manipulate it, the conceptual gap is narrowed. Yet, the interaction (with the controls) and the visual feedback (in the graph visualization) are still spatially separated.

Direct manipulation. The user zooms the view directly by drawing an elastic rectangle around the nodes to be inspected in detail. This is a rather simple press-drag-release operation when using the mouse. During the interaction, visual feedback constantly indicates the frame that will make up the new view once the mouse button is released. Any necessary fine-tuning can be done using the mouse or trackpad. In this scenario, the manipulation of the view takes place directly in the visualization. There is no longer a spatial separation between the interaction and the visual feedback. Or is there?

Direct touch. Indeed there remains some degree of separation. The interaction is carried out with the mouse, whereas the feedback is shown on the screen. To obtain a yet higher degree of directness, the interaction can alternatively be carried out using touch input on the display. Now, the interaction takes place exactly where the visual feedback is shown. A truly direct way of zooming in on a node-link diagram.

So to know what level of interaction seems appropriate, we should consider what? Our audience and purpose?

First off, we have common interfaces we can use to interact. We have a keyboard. We have a mouse. We have a trackpad or a touch screen. Now there are other ways, like virtual reality, but for this discussion we’ll stick with the common ways I’ve just listed.

Now I’ve illustrated common actions we can take, either with our hands directly on the screen, or with a pointer we direct with a mouse or trackpad.

Let me say something that’s a bit special about pointing or hovering. That’s something we can easily do with the trackpad but can’t so easily when using our hands directly, right? So some of these actions are specific to what interface we choose.

Of course, we’ve already seen several of these in action with our examples. Scrolling and Clicking.

We can also interact using only a part of the click action.

It actually involves several sub-actions. We can press and hold to perform a separate action from the immediate release, for example. And we can drag things, swipe, pinch or spread, rotate, and use various other gestures.

And we shouldn’t be surprised when new things become available to us, right?

So the systems we use listen for these actions and link them to various graphical elements. Or we do that by coding the instructions for the computer to do so.

And we’ve already seen various ways that actions can trigger events. In the rent or buy article example, it linked our actions for entering text and our actions for pressing and dragging a slider to other graphical elements, right? And in the yield curve article, it linked clicking on the back and next buttons to advancing animations and next graphics views. The first example with the baseball fields more directly linked our actions to the graphical elements, right?

So we can either use widgets or link actions directly to data elements.

And when we start mixing and matching things and ideas, we have more options than we might count. But we should either match what we do to our audience’s past experiences and expectations, or provide descriptions and explanations for new ways we want our audience to interact, that is, to take actions to get responses, or do both.

Part of thinking about our audience is thinking about what tasks they may want to perform. I’ve listed a few of these.

Remember the exploratory work we’ve been doing on CitiBike, looking at different partial views of the data?

In our very first lecture, we talked about this graphic.

This graphic shows all station locations, with each station’s id and its latitude and longitude. It shows where each station is in relation to geographic boundaries, it shows when a station is empty or full, and it shows the time of day when that event happened. It also shows overall activity level throughout the 24-hour period, and the time of sunrise and sunset.

But, if you recall, even this version of the data did not show all the information. We had lots more data variables that we did not show here. At any given moment, we had the number of bikes and the number of available spaces. We had data on the sex and birth year of each rider. And we had weather information we could join with each ride, which includes time, temperature, wind direction, humidity, rain or sunshine or cloudy, and more.

Is this an overview?

We can see the interactive version:

[SCROLL DOWN]

A couple more researchers have investigated what visualization designers mean when they use the term overview. They gathered, from thousands of journal articles and conference proceedings, descriptions that use “overview” and categorized the concepts that the term was meant to convey.

Here it is. It’s shown as a schematic, but we can read it as a definition with parts. It’s much more complex than what Oxford suggested.

An overview is an awareness of the content, the structure, and the changes, of an information space that is acquired by pre-attentive cues, either when initiating or during a given task. An overview is useful for monitoring, navigating, exploring, understanding, planning. To be useful it should have good performance. And we create such overviews using either static or dynamic visuals that shrink or zoom out to see the entire information space we’re interested in.

So if we’re interested in measures of a single variable, we show all measurements together. Or maybe measures for two attributes of observations, we show all those. And if you think about it, we reason about relationships between things.

But it becomes increasingly difficult and complex to reason about many interactions of things at once. So maybe our overview shows all measurements of attributes related to the particular interaction or relationship for a specific task.

Now that we’re starting to think about an overview first, Shneiderman says we typically follow that up by zooming or filtering. Let’s go back to our Darwin example.

By the way, it’s important to keep in mind how our graphics are rendered. There are two ways, as raster graphics or as shapes, vector graphics. And there are tradeoffs to each.

If we zoom in on raster graphics, which are shapes in the form of pixels, they start looking ugly or low resolution. I’m showing you that on the left.

If we specify that our graphics render shapes as vectors, then when we zoom in, we keep high resolution. That’s a major advantage to vector graphics, and there are other advantages we won’t discuss now.

But there can be one disadvantage to vector graphics, too. When there are millions of shapes to be shown on a graphic, our computers get bogged down and performance of showing the shapes slows terribly. But if we render the graphic with pixels, that isn’t as much an issue.

The example CitiBike graphic we just looked at has hundreds of thousands of shapes shown, so the file size gets a little large and it can be a little slow to show them on some devices. If there were millions, then I’d probably need to change to pixel-based shapes.

I just wanted to raise this issue as you’re thinking about your graphics. Makes sense? Cool.

And, here, I’m showing you conceptually what scrolling can do. And really, it’s generally a better use, in my opinion, for interactive graphics.

It is common for a graphic to be complex and to need a lot of explanation. Think about books you’ve read, and how they keep discussing a data graphic and have to refer back to it by a number, even multiple pages back, right? It would be great if we could keep the data graphic in view the whole time. Kind of like having two copies of the book. One to keep the graphic open. The other to read the explanations.

So that’s one place that scrollytelling can shine. We can keep a data graphic on the screen in place while scrolling the text that explains the graphic. And we can even trigger changes to that data graphic based on where the reader has scrolled to! That’s pretty awesome!

This is the approach we saw in a couple of the examples we looked at earlier tonight, right? So should we never use anything else? How do we decide?

Well, first let’s keep in mind the audience. Again, that’s a news organization whose purpose is to generally inform an audience, and many in that audience have different levels of interest in any given article, right? So we need to judge how motivated our audience is in accessing the information before deciding what we want to make them do.

Another R library called ggiraph aims to create a grammar of interactive graphics. To get interactivity with ggiraph, letting it make default choices for you, is almost as easy. We load the library.

For almost every geom you’ve used from ggplot2, this library provides a corresponding substitute function, named the same except that it has underscore interactive (_interactive) at the end of its name. And you have to give it one or two extra arguments: a data_id and, if you want a tooltip, a tooltip. The most important is the data_id.

[DISCUSS DATA_ID]

Save the graphic into an object like with the other library.

Then, pass the object to a function called girafe(), using the parameter named ggobj.
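Those three steps can be sketched in a few lines. This is only a minimal sketch, assuming the ggplot2 and ggiraph packages are installed. The data here is R's built-in mtcars, standing in for whatever data you are working with, and the names cars, model, p, and widget are made up for illustration.

```r
library(ggplot2)
library(ggiraph)

# Illustrative data (an assumption): built-in mtcars, with the car name as a column
cars <- data.frame(model = rownames(mtcars), mtcars)

# geom_point_interactive() stands in for geom_point();
# data_id and tooltip are the extra aesthetics it understands
p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point_interactive(aes(data_id = model, tooltip = model), size = 3)

# Pass the saved ggplot object to girafe() through its ggobj parameter
widget <- girafe(ggobj = p)

# Display the interactive graphic when working interactively
if (interactive()) print(widget)
```

Hovering over a point in the displayed graphic should show the car's name, because that is the variable mapped to tooltip.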

Again, pretty easy. Now, you can do much more with this library too, and we’ll get into more detail next week.

Next, let’s discuss Tableau.

Tableau is a business-oriented software tool that gives users the ability to create interactive graphics by dragging and dropping. No code needed. It’s easy to use. You don’t need to attend some class like this; just go onto YouTube and play with the software a bit.

Now you’ve been gaining experience with using the grammar of graphics to transform data and create graphics.

Tableau is not capable of many of the things you have been practicing in your homeworks. But Tableau has valid use cases.

Many people, perhaps including some executives we communicate with, (a) do not code and do not want to learn to code, and (b) still need to consider different views of data. These audiences do, however, understand how to use a mouse or trackpad and touch things on the screen with it. First, Tableau makes it pretty easy to create standard, exploratory graphics. Second, it provides some forms of interactivity with almost no work. Together, this means you could give a Tableau file to, say, an executive, where you’ve already set up a graphic or dashboard, and your audience can to some limited extent explore different views of the graphic just by basic actions they are already familiar with.

That’s actually the primary use I’ve personally had with the tool. I don’t personally use it for any of my own work as it’s far less powerful than code, but there have been a few clients that want to get a graphic in this form specifically because anyone can use it; it doesn’t require any special skills. We’ll see that in a moment. So it can be given to executives, who need little to no real training, to play around with, or interact with, graphics that someone else made.

So let’s map its point-and-click interface to the grammar of graphics functionality with which you are already familiar. I’m hoping that this approach will get you familiar with the tool quickly.

Now all these mappings of data to aesthetics up to this point, if you think about it, relate to non-changing things. Static components of how we’ve been thinking of using the grammar of graphics.

Tableau lets you add a tooltip, which is a popup box, when you hover your pointer over a data point on the graphic. By default it adds all the variables you have dragged and dropped onto other aesthetics. You can add more by dragging variables onto the tooltip box.

In R, we have it just as easy by using a package called ggiraph. For almost every geometric mapping function, the geom underscore whatever, that package makes available an interactive version. So for geom_point, the ggiraph package gives us the function geom_point_interactive. And the interactive version of the function includes as a parameter the tooltip, where you provide what variable or information you want in the tooltip.
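To get a tooltip that lists several variables, a bit like the one Tableau builds by default, one option is to paste the values into a single text column and map that column to tooltip. This is a rough sketch, assuming ggplot2 and ggiraph are installed. It uses the built-in mtcars data as a stand-in, and the column name tip is made up for illustration.

```r
library(ggplot2)
library(ggiraph)

# Illustrative data (an assumption): built-in mtcars, with the car name as a column
cars <- data.frame(model = rownames(mtcars), mtcars)

# Build one tooltip string per row from several variables
cars$tip <- paste0(cars$model, "\nmpg: ", cars$mpg, "\nweight: ", cars$wt)

# Map the prepared text to the tooltip aesthetic
p <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point_interactive(aes(tooltip = tip, data_id = model), size = 3)

widget <- girafe(ggobj = p)

# Display the interactive graphic when working interactively
if (interactive()) print(widget)
```

Adding another variable to the tooltip then just means pasting one more piece onto the tip column.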

Make sense?

Now, a few minutes ago, we discussed a drop down box where you can choose how to represent the variable, as just the data or as some aggregated measure like sum, min, max, etc.

But beyond basic aggregations, Tableau isn’t really designed for complex data transformations like you’ve been doing in your homework practice.

It does have options similar to formulas in Excel sheets, which Tableau calls calculated fields. But generally speaking, let’s see what an expert in Tableau has said, and I’m quoting him here: “If data reshaping is required, do it before importing into Tableau.”

So generally speaking, if you can code in something like R/tidyverse like you have been doing, it’s easier, more efficient, and more flexible to transform your data in that code, then save the transformed data, then import it into Tableau for your audience to play with in your pre-made graphic.
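As a rough sketch of that workflow, here is one way it might look, assuming the dplyr package is installed. The ride data, the column names, and the file name station_summary.csv are all made up for illustration. The resulting CSV is a plain text file that Tableau can connect to.

```r
library(dplyr)

# Hypothetical ride-level data, standing in for the real thing
rides <- data.frame(
  station_id   = c(72, 72, 79, 79, 79),
  hour         = c(8, 9, 8, 8, 17),
  duration_min = c(12, 7, 25, 9, 14)
)

# Do the reshaping and aggregating in code, before Tableau sees the data
station_summary <- rides %>%
  group_by(station_id) %>%
  summarise(
    n_rides           = n(),
    mean_duration_min = mean(duration_min),
    .groups           = "drop"
  )

# Save the transformed data to a CSV file for importing into Tableau
# (written to a temporary folder here; choose your own path in practice)
out_file <- file.path(tempdir(), "station_summary.csv")
write.csv(station_summary, out_file, row.names = FALSE)
```

With the data already in the shape the graphic needs, the Tableau side can stay limited to dragging variables onto aesthetics.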

Make sense?