Data in Wonderland - 13 Tools for interactive graphics

13.1 ggiraph, an htmlwidget

ggplot2, R’s implementation of the grammar of graphics, is currently among the most flexible libraries for creating static graphics. We’ve already seen how to add animation with a complementary package: gganimate, a grammar of animated graphics. With similar complementary packages, we can specify interactivity. Let’s review a static version of our example of the 30 baseball outfields, and then make it interactive using ggiraph. Here’s the code and graphic:

It’s pretty simple. The data is just a data frame in which each observation holds the id of a ball field and connect-the-dots x and y coordinates. I use geom_path() to draw the 30 black boundary lines. Looking closely, we’ll see that I also use geom_polygon(), and that I subset the data differently for the two geometries. That’s because the data holds both the outer boundaries and the location of the brown infield dirt. I filter the dirt out of the data for geom_path(), and I pass only the dirt to geom_polygon(), which lets us fill the polygon with color, here brown.
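As a minimal sketch of the approach just described, and not the original code, the following uses a small made-up data frame. The object names (fields, gg_static), the column names (field_id, part, x, y), and the coordinates are all assumptions for illustration; only geom_path(), geom_polygon(), and the idea of subsetting the data per geometry come from the text.

```r
library(ggplot2)

# Hypothetical stand-in data: two toy "fields", each with an outfield
# boundary and an infield dirt outline given as connect-the-dots points.
fields <- data.frame(
  field_id = rep(c("A", "B"), each = 8),
  part     = rep(rep(c("boundary", "dirt"), each = 4), times = 2),
  x        = c(-300, 0, 300, 0,  -60, 0, 60, 0,
               -320, 0, 320, 0,  -60, 0, 60, 0),
  y        = c( 300, 420, 300, 0,  60, 120, 60, 0,
                320, 400, 320, 0,  60, 120, 60, 0)
)

gg_static <- ggplot(mapping = aes(x = x, y = y, group = field_id)) +
  # only the dirt rows go to the filled polygon
  geom_polygon(data = subset(fields, part == "dirt"),
               fill = "tan4", alpha = 0.3) +
  # only the boundary rows go to the black outline
  geom_path(data = subset(fields, part == "boundary"), color = "black") +
  coord_equal() +
  theme_void()

gg_static
```

Each geometry receives its own subset through its data argument, so the two layers never draw each other’s rows.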

To make it interactive, we can use almost the same code. For this example, we add a second package, ggiraph:

The actual interactive version, of course, is near the beginning of this text, Figure 1.1.

Once we understand the basics of ggplot2, it is pretty easy to add interactivity, as we have above. The website for ggiraph includes other examples of the functionality it offers. In this example, we save our interactive ggplot as gg_boundaries and pass it as a parameter into girafe(). Notice that girafe() also gives us options to change the css of an element in the graphic when one of our actions on it, such as hovering, triggers a reaction.
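A minimal sketch of this pattern follows, again with made-up data rather than the actual outfield data. The data frame boundaries, its columns, the object name boundaries_widget, and the css strings are assumptions; gg_boundaries and girafe() are the names used in the text, and geom_path_interactive(), opts_hover(), and opts_hover_inv() are functions from ggiraph.

```r
library(ggplot2)
library(ggiraph)

# Hypothetical stand-in data: boundary points for two toy fields.
boundaries <- data.frame(
  field_id = rep(c("A", "B"), each = 4),
  x        = c(-300, 0, 300, 0,  -320, 0, 320, 0),
  y        = c( 300, 420, 300, 0,  320, 400, 320, 0)
)

# Same structure as a static plot, but with the *_interactive geom.
# tooltip and data_id are the extra aesthetics ggiraph understands.
gg_boundaries <- ggplot(boundaries, aes(x, y, group = field_id)) +
  geom_path_interactive(aes(tooltip = field_id, data_id = field_id)) +
  coord_equal() +
  theme_void()

# girafe() turns the ggplot into an htmlwidget. The options set css for
# the hovered element and for every element that is not hovered.
boundaries_widget <- girafe(
  ggobj = gg_boundaries,
  options = list(
    opts_hover(css = "stroke:orange;stroke-width:2px;"),
    opts_hover_inv(css = "opacity:0.2;")
  )
)

boundaries_widget
```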

Next, let’s share interactivity across this graphic and another for fence heights. To add the second graphic, we start by making another ggplot graphic, again using the interactive version of the ggplot geom. Again, we save the graphic into an object; here I’ve named it gg_fences. To combine the two graphics, we give both plots at the same time to the girafe() function.

Of note, in print( gg_boundaries / gg_fences ), the slash arranges the graphics. It is an operator overloaded by the patchwork package, which is useful for organizing plots. The slash places one graphic on top of the other, like a fraction with a number on top and bottom. And that’s it. The rest is the same as with one graphic, and now both graphics work together, sharing, in this case, the hover action.
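The combination can be sketched as follows, with toy data standing in for the real boundaries and fence heights. The data frames, their columns, the bar-chart form of gg_fences, the object name linked_widget, and the css are assumptions; the text confirms only the names gg_boundaries and gg_fences, the use of interactive geoms, and the call print( gg_boundaries / gg_fences ) inside girafe().

```r
library(ggplot2)
library(ggiraph)
library(patchwork)

# Hypothetical stand-in data for two toy fields.
boundaries <- data.frame(
  field_id = rep(c("A", "B"), each = 4),
  x        = c(-300, 0, 300, 0,  -320, 0, 320, 0),
  y        = c( 300, 420, 300, 0,  320, 400, 320, 0)
)
fences <- data.frame(field_id = c("A", "B"), height_ft = c(8, 12))

gg_boundaries <- ggplot(boundaries, aes(x, y, group = field_id)) +
  geom_path_interactive(aes(tooltip = field_id, data_id = field_id)) +
  coord_equal() +
  theme_void()

# Using the same data_id values in both plots is what links them.
gg_fences <- ggplot(fences, aes(field_id, height_ft)) +
  geom_col_interactive(aes(tooltip = field_id, data_id = field_id))

# patchwork's "/" stacks the plots; girafe() evaluates the print() call
# while it records the graphic.
linked_widget <- girafe(
  code = print(gg_boundaries / gg_fences),
  options = list(opts_hover(css = "stroke:orange;fill:orange;"))
)

linked_widget
```

Hovering over an element in either panel highlights every element with the same data_id, which is why the two panels react together.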

Learn more about this tool from Gohel and Skintzos (2021). To learn more generally about htmlwidgets, consult Vaidyanathan et al. (2020). Let’s very briefly consider one more set of tools.

13.2 plotly

plotly is an R package for creating interactive graphics. It interfaces with the same-named javascript library, plotly.js, which in turn is based on d3.js. R’s plotly has several helpful features. The first is that, like ggiraph, it integrates easily with ggplot2. The first function to learn, perhaps, is ggplotly(), which takes a ggplot object as a parameter and makes it interactive. And combined with another package, crosstalk, a plotly graphic can link or bind with other htmlwidgets. Here’s an example, linking it with DT, an R interface to the DataTables widget:

First, we create a key that identifies observations across the graphic and table, and attach it to our data frame. That’s the highlight_key() function. Here, our data frame is mpg, an example data set that ships with ggplot2. We save the keyed version of it in a new object, which I’ve called just m. Then we create our ggplot object, but notice that we use the keyed version of the data frame for our data. Once we’ve made the ggplot and saved it to an R object, we make it interactive. plotly’s version of this uses two functions: we wrap the ggplot object in ggplotly(), and pass the result as a parameter to highlight().

Both objects are html widgets, which we can place together in a single html file for them to interact. The crosstalk package also includes a helper function that makes it easy to place these objects side by side to view directly when run from RStudio or even the console, even without designing an html page: bscols() creates a browsable html element.
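The steps described above can be sketched as follows. The functions highlight_key(), ggplotly(), highlight(), bscols(), and the object name m come from the text; the variables plotted (displ against hwy), the object names p, gg, and linked, and the "plotly_selected" event are assumptions chosen for illustration.

```r
library(ggplot2)
library(plotly)
library(crosstalk)
library(DT)

# Key the data so that the graphic and the table can identify
# the same observations.
m <- highlight_key(mpg)

# Build the ggplot from the keyed data, not from mpg directly.
p <- ggplot(m, aes(displ, hwy)) +
  geom_point()

# ggplotly() makes the plot interactive; highlight() sets how selections
# are made (here, by box or lasso selection).
gg <- highlight(ggplotly(p), on = "plotly_selected")

# Place the plot and a DataTables widget built on the same keyed data
# side by side in one browsable html element.
linked <- bscols(gg, datatable(m))

linked
```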

And with plotly, we also get other functionality pretty easily, like zooming and panning, or a filter box, or a radio selection box, and so forth.

Sievert (2020) is very helpful in becoming comfortable with this tool.

13.3 Organizing interactive graphics

Now that we have web-based, interactive graphics, we can organize them into web pages using the technologies just discussed: html, css, css grid, and javascript. As with the tools above, we can also abstract the details using tools like r markdown, knitr, and RStudio. We’ve already discussed these in the context of a reproducible workflow, Section 1.3. But we can add r markdown templates for organizing graphics into, say, a dashboard. Enter flexdashboard, another R package.

13.4 r markdown and quarto templates

The flexdashboard template is an r markdown file that creates a grid to place things. The particular grid it uses is called flex grid¹ and is, more specifically, a framework called bootstrap4² that sets defaults for a flex grid.

A nice feature of the bootstrap4 version of flex grid is that it is fully responsive: it rearranges our content to best fit the size of the device the user is viewing it on, like a desktop versus an iPhone. And we don’t have to code specifically to achieve this responsiveness.
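One way to look at the template is to create a draft file from it and read the result. This is a small sketch that assumes the flexdashboard and rmarkdown packages are installed; the object name dashboard_path and the use of a temporary directory are arbitrary choices for illustration.

```r
library(rmarkdown)

# Create a new r markdown file from the flexdashboard template.
# edit = FALSE stops draft() from opening the file in an editor.
dashboard_path <- draft(
  file     = file.path(tempdir(), "dashboard.Rmd"),
  template = "flex_dashboard",
  package  = "flexdashboard",
  edit     = FALSE
)

# Inspect the yaml header and the layout headings the template provides.
writeLines(readLines(dashboard_path))
```

Reading the drafted file shows the output format declared in the yaml header and the headings that define the dashboard’s grid.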

More recently, r markdown templates have been upgraded and generalized to more easily be used across programming languages. The upgraded tech is called quarto.

13.5 css grid

But next, let’s consider a more, ahem, flexible approach. Flex grid is, perhaps ironically, less flexible than css grid. By that, I mean flex grid lets us specify either rows or columns, but not both at once, as we did with css grid. So flex grid is also less precise. Now, let’s consider our own widget placement using css grid inside a basic r markdown file.

Here is the original R markdown file.

13.6 Combining the technologies — a minimal example

Now, let’s bring these technologies together into a fairly minimal example. For this example, we continue with studying and communicating for Citi Bike. This time, we’ll create a dashboard for Lyft’s head of marketing for the operation of Citi Bike. Recall, first, from earlier sections, and the three fundamental laws of communication in Doumont (2009), that we begin by considering our audience and purpose. To learn more about marketing executives, their responsibilities (broader than you may think), how they reason about data, and how they communicate with other marketing executives about data, consult three articles by David Carr, head of digital marketing for Digitas in London: Carr (2019), Carr (2016), and Carr (2018). Re-read them.

Now, of course, what we can learn from David is about him, and perhaps about marketing executives in general. We should strive to learn as much as we can about the particular person we plan to communicate with. Here, that would currently be Azmat Ali, Head of Rider Product Marketing at Lyft.

We’ll add to our earlier interactive graphics that showed bike stations, and the times they were empty and full. This time, we’ll use per-ride data from the same time frame (January 2019) and merge into it hourly information on weather: e.g., temperature, precipitation, wind conditions.
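The merge can be sketched in base R as follows. This is an illustration under stated assumptions, not the code behind the example: the data frames rides and weather, their column names, and all values are made up, and the actual data may be keyed differently.

```r
# Hypothetical per-ride data: one row per ride with its start time.
rides <- data.frame(
  ride_id    = 1:4,
  start_time = as.POSIXct(
    c("2019-01-01 08:15:00", "2019-01-01 08:45:00",
      "2019-01-01 09:05:00", "2019-01-01 17:30:00"),
    tz = "UTC"
  )
)

# Hypothetical hourly weather: one row per hour.
weather <- data.frame(
  hour = as.POSIXct(
    c("2019-01-01 08:00:00", "2019-01-01 09:00:00", "2019-01-01 17:00:00"),
    tz = "UTC"
  ),
  temperature_c    = c(1.5, 2.0, 4.5),
  precipitation_mm = c(0, 0.2, 0)
)

# Truncate each ride's start time to the hour so it can serve as the key.
rides$hour <- as.POSIXct(trunc(rides$start_time, units = "hours"))

# Left join: keep every ride and attach the weather for its hour.
rides_weather <- merge(rides, weather, by = "hour", all.x = TRUE)

rides_weather
```

Keeping all rides (all.x = TRUE) means a ride in an hour without a weather record would show NA rather than silently disappear.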

We can frame research questions relevant to Ali, and for learning purposes, we’ll categorize them using the graphic of values in Carr (2019). Here are some examples:

Cultural value | relevance. Are there better temperatures for us to trigger marketing messages to encourage rides?

Customer value | purchase experience. How can we segment our audience to find opportunities for increasing ridership?

Cultural value | relevance. Are there better times of day for us to trigger marketing messages to encourage rides?

Business value | insight creation. Do any anomalies suggest preferred customer behavior within conditions?

Business value | insight creation. What customer/rider attributes and use cases are more correlated with high usage? How can we use this information to expand our prospect marketing efforts and more effectively appeal to prospects who display similar behaviors?

Business value | insight creation. Similarly, do rider attributes correlate with lower usage? Are we missing key target audiences?

Customer value | purchase experience. Are there any anomalies in the data that would indicate lack of availability may be causing lower usage?

Let’s now present a few of these questions in an html document containing interactive data-graphics for purposes of exploratory communication. Visit this link to view the interactive. Here’s a screenshot:

To create this example, we used the same technology stack and tools just discussed: an r markdown document,

---
output: 
  bookdown::html_document2
---

that includes css styles,

<style>
</style>

and between those <style> tags, we defined an exact size for the document (.main-container) and css grid classes for each section of content we wanted to place:

.main-container {
  min-width: 1600px;
  max-width: 1600px;

}

a {
  color: #bbbbbb;
}

.gridlayout { 
  display: grid;
  position: relative;
  margin: 10px;
  gap: 10px;
  grid-template-columns: 
    repeat(8, 1fr);
  grid-template-rows: 
    repeat(8, auto);
}

.title {
  grid-column: 1 / 9;
  grid-row: 1 / 2;
  font-size: 30pt;
  font-weight: 900;
}

.lefttoptext {
  grid-column: 1 / 2;
  grid-row: 4 / 5;
  hyphens: auto;
  text-align: justify;
  font-size: 10pt;
  line-height: 1.4;
}

.righttoptext {
  grid-column: 3 / 4;
  grid-row: 4 / 5;
  hyphens: auto;
  text-align: justify;
  font-size: 10pt;
  line-height: 1.4;
}

.leftbottomtext {
  grid-column: 1 / 2;
  grid-row: 6 / 7;
  hyphens: auto;
  text-align: justify;
  font-size: 10pt;
  line-height: 1.4;
}

.rightbottomtext {
  grid-column: 3 / 4;
  grid-row: 6 / 7;
  hyphens: auto;
  text-align: justify;
  font-size: 10pt;
  line-height: 1.4;
}

.rightinteractive {
  grid-column: 5 / 9;
  grid-row: 3 / 8;
}

.instructions {
  grid-column: 5 / 9;
  grid-row: 2 / 3;
  font-size: 10pt;
  line-height: 1.4;
  color: #888888;
  column-count: 2;
  column-gap: 30px;
  hyphens: auto;
  text-align: justify;
}

.citesource {
  color: #bbbbbb;
  text-align: right;
  grid-column: 3 / 9;
  grid-row: 8 / 9;
  font-size: 8pt;
}

Then we used html <div></div> elements with our defined classes as attributes, and within those tags placed our content, which in one case includes an r code chunk (content is shown below as ...):

<div class="gridlayout">

  <div class="title"> ...
  </div>
  
  <div class="lefttoptext"> ...
  </div>
  
  ...
  
  <div class="rightinteractive">

  ```{r}
  # r code for interactive graphics placed here
  ```

  </div>
  
  ...
  
</div>

Identify the varying types of interactivity, how the varying elements interact, and which audience actions the data graphics react to. Do the connections between the data seem relevant to the questions presented for exploration?

Next, we’ll consider how we can use the concepts of interactivity throughout longer communications, not just interactivity within and between particular data-graphics.

Helpful references for implementing the details in this section — and for using additional, related tools (e.g., shiny, d3.js, p5.js, processing) — include: (Tominski and Schumann 2020, sec. 4); (Fay et al. 2021); (Janert 2019); (Meeks 2018); (Murray 2017); (Reas and Fry 2014); and (Wickham, n.d.).