21 Specification: Writing Prompts for Interactive Graphics

In the previous chapter, we explored the technologies underlying interactive graphics—HTML, CSS, SVG, and JavaScript—and developed a formal grammar for describing interactivity. We learned that interactive graphics build upon the foundation of Wilkinson’s Grammar of Graphics (Wilkinson 2005) and Bertin’s visual variables (Bertin 1981), adding three dynamic components: selection (what the user does), binding (what changes in response), and composition (how views coordinate). The static grammar still provides the foundation—data, transformations, scales, coordinates, elements, and guides—while the interactivity layer adds the capacity for these marks to change based on user input. This extended grammar, further grounded in the work of Shneiderman (Shneiderman 1996) and Yi et al. (Yi et al. 2007), provides a theoretically sound vocabulary for specifying interactive graphics.

Now we turn to the practical question: how do you write specifications that result in the interactive graphics you need? The answer lies in effective communication with language models—not by simply asking for “an interactive chart,” but by specifying using the complete grammar vocabulary: first establishing the static foundation (marks, encodings, scales per Wilkinson and Bertin), then describing the dynamic layer through selections (point, interval, multi), bindings (marks, scales, filters, views), and composition (shared, independent, hierarchical).

Think of it this way: you are the director, the AI is your implementation team, and the grammar is your shared language. When you say “add a point selection triggered by hovering, bind it to display a tooltip,” you are speaking in terms the AI can translate into working code—whether that code uses ggiraph, plotly, or Vega-Lite. The grammar transcends any single tool.

This chapter shows you how to write these specifications. For each example, we present:

Narrative context: What are we trying to achieve and why?
The specification: A prompt using grammar vocabulary
Why this works: How the grammatical components serve the analytical goal
Implementation details: Which tools the AI might use (collapsed for reference)
Results: What the resulting graphic enables

Let us begin with the simplest form of interactivity: showing details on demand.

21.1 Example 1: Showing Details on Hover

21.1.1 The Goal

You are analyzing Citi Bike ridership patterns from January 2019 and want to understand the relationship between daily temperature and bike ridership. Readers should be able to explore how weather affects usage and spot specific days that break the pattern. To support that, you want them to hover over any point and see that day's details: the date, temperature, ride count, average trip duration, and percentage of subscribers.

21.1.2 The Specification

LLM Prompt

Create an interactive scatterplot using the complete grammar of interactivity. Use the aggregated Citi Bike daily data from January 2019 (data/rides_weather.rds).

Data Operations: Load the RDS file, which contains the columns date, n_rides, daily_avg_temp, daily_avg_trip, and daily_frac_subscriber. First convert the date column from character to Date type. Then aggregate the data to the daily level by summing n_rides across all records for each date (the file has multiple rows per day). Extract the day of week from the date and create a weekend indicator (weekend vs weekday).

Static Encodings (Wilkinson/Bertin Foundation):
- Map daily_avg_temp to the x-axis (continuous scale, temperature in °F)
- Map n_rides to the y-axis (continuous scale, number of rides)
- Use point marks with alpha blending to emphasize overlapping points
- Map weekend/weekday indicator to color encoding
- Apply minimal theme with clear axis labels and informative title

Selection (Yi et al.): Implement a point selection triggered by hovering over data points. This captures the “select” intent—marking items of interest.

Binding (Shneiderman): Bind the point selection to two effects: (1) display a tooltip showing the date, daily_avg_temp, n_rides, daily_avg_trip (average trip duration), and daily_frac_subscriber (subscriber percentage) as details-on-demand, and (2) slightly enlarge the selected point while reducing opacity of unselected points to 30% to provide visual emphasis.

Tool constraint: Use R with ggplot2 for the static foundation and ggiraph for the interactive layer, as we want to extend an existing ggplot2 workflow.

Why this specification works

This prompt applies the complete grammar systematically:

Data Operations: Explicitly specifies loading the data, aggregating it to one row per day, and deriving a weekend/weekday indicator from the date. This follows Wilkinson’s data → transformation pipeline and keeps the focus on daily patterns rather than individual trips.
Static foundation: Establishes the Wilkinson/Bertin foundation first—temperature and ride count mapped to axes (position encodings), day type mapped to color (hue encoding), using point marks with alpha blending. These static encodings persist regardless of interaction.
Selection: Point selection on hover captures the “select” intent from Yi et al.’s taxonomy—allowing users to mark specific days as interesting without clicking.
Binding: Two binding targets serve different purposes: the tooltip provides details-on-demand (Shneiderman), while the size and opacity changes modify Bertin’s size and value variables dynamically based on user input.
Composition: Single view, so no composition needed—this is the simplest grammatical structure.

The specification follows the complete pattern: data operations (Wilkinson) → static marks (Wilkinson/Bertin) → selection (Yi) → binding (Shneiderman). By specifying ggiraph, we indicate that the interactive layer should extend ggplot2’s aesthetic system, which already implements Wilkinson’s grammar.
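The data operations in this prompt can be checked on a toy table before any plotting code is written. The sketch below uses only base R. The data frame `toy` is invented for illustration and merely stands in for the contents of data/rides_weather.rds; it also assumes that the temperature is constant within a day, which the prompt implies but does not state.

```r
# Hypothetical stand-in for data/rides_weather.rds: several rows per day
toy <- data.frame(
  date           = c("2019-01-05", "2019-01-05", "2019-01-07", "2019-01-07"),
  n_rides        = c(100, 150, 300, 250),
  daily_avg_temp = c(31.5, 31.5, 40.2, 40.2),
  stringsAsFactors = FALSE
)

# Convert the character column to Date
toy$date <- as.Date(toy$date)

# Aggregate to one row per day: sum the rides, keep the first temperature
# (assumed constant within a day)
daily <- merge(
  aggregate(n_rides ~ date, data = toy, FUN = sum),
  aggregate(daily_avg_temp ~ date, data = toy, FUN = function(x) x[1]),
  by = "date"
)

# Weekend indicator from the ISO weekday number (6 = Saturday, 7 = Sunday)
daily$is_weekend <- ifelse(format(daily$date, "%u") %in% c("6", "7"),
                           "Weekend", "Weekday")

print(daily)
```

Using `format(date, "%u")` rather than `weekdays()` is a deliberate choice in this sketch: day names returned by `weekdays()` depend on the session locale, so a comparison against "Saturday" and "Sunday" can silently fail on a non-English system.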

Example implementation

The AI would likely generate code using ggiraph’s interactive aesthetics:

#| eval: false

library(tidyverse)
library(ggiraph)

# Data operations - aggregate to daily level
rides_weather <- readRDS("data/rides_weather.rds") %>%
  mutate(date = as.Date(date)) %>%
  group_by(date) %>%
  summarise(
    n_rides = sum(n_rides),
    daily_avg_temp = first(daily_avg_temp),
    daily_avg_trip = first(daily_avg_trip),
    daily_frac_subscriber = first(daily_frac_subscriber),
    .groups = "drop"
  ) %>%
  mutate(
    day_of_week = weekdays(date),
    is_weekend = ifelse(day_of_week %in% c("Saturday", "Sunday"), "Weekend", "Weekday")
  )

# Static foundation + interactive layer
gg <- ggplot(rides_weather, aes(x = daily_avg_temp, y = n_rides, color = is_weekend)) +
  geom_point_interactive(
    aes(data_id = date,
        tooltip = paste(
          "Date:", date,
          "\nTemperature:", round(daily_avg_temp, 1), "°F",
          "\nRides:", n_rides,
          "\nAvg Trip:", round(daily_avg_trip, 1), "min",
          "\nSubscriber %:", round(daily_frac_subscriber * 100, 1), "%"
        )),
    alpha = 0.7, size = 3
  ) +
  scale_x_continuous(limits = c(15, 55), breaks = seq(15, 55, 10)) +
  scale_y_continuous(limits = c(10000, 45000)) +
  scale_color_brewer(palette = "Set2") +
  labs(x = "Daily Average Temperature (°F)", 
       y = "Number of Rides",
       color = "Day Type",
       title = "Daily Bike Ridership vs. Temperature - January 2019") +
  theme_minimal()

# Render with interactivity
girafe(ggobj = gg,
       options = list(
         opts_hover(css = "fill-opacity:1; stroke:black; stroke-width:1px; r:6px;"),
         opts_hover_inv(css = "opacity:0.3;"),
         opts_tooltip(css = "background-color: white; padding: 8px; border: 1px solid #333; font-family: sans-serif;")
       ))

The data_id aesthetic implements selection (identifying which observation is selected). The tooltip aesthetic implements binding (what details display). The CSS in opts_hover and opts_hover_inv implements the visual feedback binding.
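The tooltip text itself is ordinary string formatting, so it can be developed and tested apart from the plot. The following is a minimal sketch in base R; `format_tooltip()` is a hypothetical helper (not part of ggiraph), and its argument names simply mirror the columns named in the prompt.

```r
# Hypothetical helper: build one tooltip string per observation.
# sprintf() gives explicit control over spacing and rounding.
format_tooltip <- function(date, temp, rides, trip_min, frac_subscriber) {
  sprintf(
    "Date: %s\nTemperature: %.1f°F\nRides: %s\nAvg Trip: %.1f min\nSubscribers: %.1f%%",
    format(date, "%Y-%m-%d"),
    temp,
    formatC(rides, format = "d", big.mark = ","),  # thousands separator
    trip_min,
    frac_subscriber * 100                          # fraction -> percent
  )
}

# Example values are invented for illustration
tip <- format_tooltip(as.Date("2019-01-05"), 31.46, 25300, 11.24, 0.931)
cat(tip)
```

Because the helper is vectorized, it should be usable directly inside the aesthetic mapping, for example `aes(tooltip = format_tooltip(date, daily_avg_temp, n_rides, daily_avg_trip, daily_frac_subscriber))`.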

Here is the actual interactive graphic produced by the specification:

Figure 21.1: Interactive scatterplot of daily Citi Bike ridership versus temperature. Hover over points to see daily details.

21.1.3 Results

The resulting interactive graphic enables efficient exploration of weather-ridership relationships. Readers can:

See the overview: The scatterplot reveals a clear positive relationship between temperature and daily ridership. Warmer days consistently see more bike rides than colder days in January 2019.
Distinguish day types: Weekdays (green) generally show higher ridership than weekends (orange) at similar temperatures, reflecting commuter traffic. Weekends cluster at lower ride counts even on warmer days.
Explore specific days: Hovering over any point reveals the exact daily details—date, temperature, total rides, average trip duration, and subscriber percentage—without requiring readers to cross-reference a data table.
Identify anomalies: Days that fall outside the main pattern become apparent. For example, a cold weekday with surprisingly low ridership, or a warm weekend with exceptionally high usage.
Examine threshold effects: The visualization suggests there may be a “sweet spot” around 40-50°F where ridership increases dramatically, and a threshold below 30°F where ridership drops substantially.

This simple interactive element transforms a static scatterplot into an exploratory tool. Readers can answer questions like “What was the coldest day with over 35,000 rides?” or “Do subscriber percentages vary by temperature?” through direct manipulation rather than reading axis values or consulting separate tables.

21.2 Example 2: Linked Views for Comparison

21.2.1 The Goal

You are analyzing vehicle efficiency data. You have a scatterplot showing the relationship between engine displacement and highway miles-per-gallon, and you want to link this to a data table showing detailed specifications for each vehicle. When readers hover over a point in the scatterplot, they should see the corresponding vehicle highlighted in the table—and vice versa.

21.2.2 The Specification

LLM Prompt

Create two coordinated views using the grammar of interactivity with shared composition. The dataset contains vehicle specifications with a unique vehicle_id for each observation.

View 1 - Scatterplot: Map engine displacement (x-axis, continuous) against highway MPG (y-axis, continuous). Use point marks with color encoding for vehicle class (SUV, compact, etc.).

View 2 - Data Table: Display the same observations in a tabular format showing vehicle_id, manufacturer, model, displacement, and highway MPG.

Selection and Composition: Implement a shared point selection using vehicle_id as the selection key. The selection propagates bidirectionally across both views—hovering in either view highlights the corresponding observation in the other. This implements Yi et al.’s “connect” intent across views.

Binding: In the scatterplot, bind the selection to enlarge and color the selected point red. In the table, bind the selection to highlight the entire row background. Both bindings serve the same purpose: making the connected observation visually salient.

Tool constraint: Use R with plotly and crosstalk, as we need shared selection across different view types (plot + table).

Why this specification works

This prompt demonstrates composition—the third grammatical component:

Selection: Point selection using vehicle_id as the shared key
Binding: Different binding targets for each view (scatterplot marks vs. table rows), but coordinated through the shared selection
Composition: Shared selection across views implements Yi et al.’s “connect” intent—showing relationships between items across different representations

The bidirectional linking means the “connect” intent works both ways: exploring in the scatterplot reveals table details, and exploring in the table reveals the scatterplot position. This supports comparison and relationship-finding.

Example implementation

The AI would likely use crosstalk’s shared key mechanism:

#| eval: false

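library(plotly)     # highlight_key(), ggplotly(), highlight()
library(ggplot2)
library(crosstalk)  # bscols()
library(DT)

# Create shared key (vehicles is assumed to have a unique vehicle_id column)
shared_data <- highlight_key(vehicles, ~vehicle_id)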

# Scatterplot with plotly
scatter <- ggplot(shared_data, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  theme_minimal()

scatter_interactive <- ggplotly(scatter) %>%
  highlight(on = "plotly_hover", color = "red", opacityDim = 0.3)

# Data table
vehicle_table <- datatable(shared_data, 
  options = list(pageLength = 10))

# Combine with bscols for side-by-side display
bscols(scatter_interactive, vehicle_table)

The highlight_key() creates the shared selection identifier. Both views reference the same shared data, enabling the selection to propagate across them.
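It can help to see the shared-key idea stripped of any widget library. The sketch below is a conceptual model in base R, not the crosstalk API: a selection is a set of keys, and each view applies its own binding to that same set. The `vehicles` values and both binding functions are invented for illustration.

```r
# Toy vehicles table; values are invented for illustration
vehicles <- data.frame(
  vehicle_id = c("v1", "v2", "v3"),
  displ      = c(1.8, 3.1, 5.3),
  hwy        = c(29, 26, 20),
  stringsAsFactors = FALSE
)

# A selection is just a set of keys, independent of any view
selection <- c("v2")

# Scatterplot binding: selected points turn red and grow
scatter_binding <- function(data, selected_keys) {
  hit <- data$vehicle_id %in% selected_keys
  data.frame(vehicle_id = data$vehicle_id,
             color = ifelse(hit, "red", "grey"),
             size  = ifelse(hit, 6, 3),
             stringsAsFactors = FALSE)
}

# Table binding: selected rows get a highlighted background
table_binding <- function(data, selected_keys) {
  data.frame(vehicle_id = data$vehicle_id,
             row_highlighted = data$vehicle_id %in% selected_keys,
             stringsAsFactors = FALSE)
}

scatter_state <- scatter_binding(vehicles, selection)
table_state   <- table_binding(vehicles, selection)
print(scatter_state)
print(table_state)
```

Neither binding knows about the other view. Coordination comes only from both reading the same selection, which is why a unique, stable key matters more than the choice of widget.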

21.2.3 What This Enables

The linked views support analytical reasoning by enabling comparison across representations. Readers can identify patterns in the scatterplot (“compact cars cluster in the upper left”) and immediately see the specific vehicles that constitute those patterns in the table. Conversely, they can find a specific vehicle in the table and see where it falls in the efficiency spectrum. This cross-view coordination supports the exploratory workflow Shneiderman envisioned.

21.3 Example 3: Filtering Through Brushing

21.3.1 The Goal

You have a time series showing average Citi Bike station availability over time, and you want readers to be able to select a specific time range to examine in detail. When they brush a region on the time series, the other views should filter to show only data from that time period.

21.3.2 The Specification

LLM Prompt

Create a coordinated multi-view display using the grammar of interactivity with hierarchical composition. The dataset contains hourly Citi Bike station availability data for January 2019.

View 1 - Overview Time Series: Show average bike availability across all stations over time (x-axis: hour of day, y-axis: average available bikes). Use a line mark.

View 2 - Detail Map: Show a geographic map with station locations as points, colored by their current availability status.

View 3 - Detail Table: Show a table of currently selected stations with their IDs, names, and availability metrics.

Selection and Composition: Implement a hierarchical composition where View 1 serves as the controller. Use an interval selection (brush) on the time series to define a time range. This selection filters the data shown in Views 2 and 3—only stations and time points within the brushed interval appear. This implements Shneiderman’s “filter” intent in a hierarchical structure.

Binding:
- In View 1: The brush itself provides visual feedback (highlighted region)
- In View 2: Filter binding—only show stations that had availability events during the selected time period
- In View 3: Filter binding—only show station records from the selected time window

Tool constraint: Use R with plotly and crosstalk, implementing the interval selection through plotly’s brush capabilities.

Why this specification works

This prompt demonstrates hierarchical composition:

Selection: Interval selection (brush) captures a range rather than a single point
Binding: Filter binding reduces the dataset shown in dependent views
Composition: Hierarchical—View 1 controls Views 2 and 3, implementing an overview+detail pattern

The hierarchical structure follows Shneiderman’s “overview first, zoom and filter, then details-on-demand” mantra. The overview (time series) provides context; the brush enables filtering; the detail views show filtered results. This also supports Yi et al.’s “explore” intent: seeing different subsets of data based on conditions.
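The logic of an interval selection driving a filter binding can be modeled in a few lines of base R, independent of any plotting library. In this sketch the `availability` records, the `brush` range, and `filter_by_brush()` are all hypothetical names and values chosen for illustration.

```r
# Hypothetical hourly records for two stations; values invented for illustration
availability <- data.frame(
  station_id      = rep(c("A", "B"), each = 4),
  hour            = rep(c(7, 8, 17, 18), times = 2),
  available_bikes = c(12, 3, 9, 1, 20, 15, 2, 0)
)

# Interval selection: the brush is a closed range on the controller's x-axis
brush <- c(lo = 7, hi = 9)

# Filter binding: dependent views receive only rows inside the brushed range
filter_by_brush <- function(data, brush) {
  inside <- data$hour >= brush[["lo"]] & data$hour <= brush[["hi"]]
  data[inside, , drop = FALSE]
}

# Detail views see the filtered subset
detail <- filter_by_brush(availability, brush)

# The controller (overview) always summarizes the full data
overview <- aggregate(available_bikes ~ hour, data = availability, FUN = mean)

print(overview)
print(detail)
```

The asymmetry is the point of a hierarchical composition: the overview is computed from all rows and never shrinks, while every dependent view is a function of the data and the current brush.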

Example implementation

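Implementation using plotly’s brush together with a crosstalk shared key:

#| eval: false

library(plotly)
library(crosstalk)
library(DT)

# Wrap the data in a shared key so a brush in one view can reach the others.
# availability_data is assumed to contain the columns used below.
shared_avail <- highlight_key(availability_data)

# View 1 (controller): markers are included because box select acts on points
fig1 <- plot_ly(shared_avail, x = ~hour, y = ~avg_available,
                type = 'scatter', mode = 'lines+markers') %>%
  layout(dragmode = "select",  # Enable brushing
         title = "Overview: Average Availability by Hour") %>%
  highlight(on = "plotly_selected", off = "plotly_deselect", persistent = FALSE)

# View 2 (detail): station locations built from the same shared data
fig2 <- plot_ly(shared_avail, x = ~longitude, y = ~latitude,
                type = 'scatter', mode = 'markers',
                color = ~status, marker = list(size = 8)) %>%
  layout(title = "Station Map (Filtered)")

# View 3 (detail): a crosstalk-aware table built from the same shared data
fig3 <- datatable(shared_avail, options = list(pageLength = 10))

# Controller on the first row, the two detail views side by side below it
bscols(widths = c(12, 6, 6), fig1, fig2, fig3)

The brush on fig1 creates the interval selection. Because all three views are built from the same shared object, the brushed rows propagate to the map and the table. Note that highlight() dims unselected marks rather than removing them; if excluded records must disappear from the detail views entirely, a crosstalk filter widget such as filter_slider() or a Shiny reactive is the usual alternative.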

21.3.3 What This Enables

The hierarchical composition supports temporal exploration. Readers can brush morning hours to see which stations had availability issues during rush hour, then brush evening hours to compare. The overview time series provides context for the entire day; the brush enables focus on specific periods; the detail views show the consequences of that focus. This supports both exploratory analysis (finding patterns) and explanatory communication (showing specific time windows).

21.4 Example 4: Complete Dashboard Specification

21.4.1 The Goal

You need to create a comprehensive dashboard for a marketing executive analyzing Citi Bike usage patterns. The dashboard should enable exploration of relationships between weather conditions and ridership, support segmentation by rider type, and provide detailed information on demand.

21.4.2 The Specification

LLM Prompt

Create an interactive dashboard using the complete grammar of interactivity. The dataset contains Citi Bike trip data merged with hourly weather data for January 2019.

Spatial Composition: Use CSS Grid to organize the dashboard:
- Title section spanning the full width at top
- Left column (40% width): Narrative text explaining analytical questions and observations
- Right column (60% width): Interactive graphics area divided into two sub-regions: a scatterplot (top) and a linked data table (bottom)

Static Encodings - Scatterplot:
- X-axis: Temperature (continuous, -10 to 50°F)
- Y-axis: Trip count (continuous, 0 to 4000)
- Point marks colored by precipitation type (none, rain, snow)
- Size encoding: Average trip duration (larger points = longer average trips)

Selection Mechanisms:
1. Point selection triggered by hovering over scatterplot points (Yi et al.’s “select”)
2. Multi-selection triggered by clicking legend categories to toggle precipitation types on/off (Yi et al.’s “filter”)

Binding - First Selection (Hover):
- Display tooltip with: date, temperature, trip count, precipitation type, average trip duration
- Enlarge selected point by 50%
- Dim all other points to 30% opacity to focus attention

Binding - Second Selection (Legend Click):
- Filter the scatterplot to show only selected precipitation categories
- Update the data table to show only trips matching the selected weather conditions
- Dim excluded points to 10% opacity rather than removing them entirely (maintaining context)

Composition: Shared point selection between scatterplot and table using a composite key (date + hour). Hovering a point highlights the corresponding rows in the table; clicking table rows highlights the point.

Responsive Design: Ensure CSS Grid adapts to minimum width of 1200px (desktop-optimized for executive presentation).

Tool constraints: Use R with plotly for the scatterplot (supports both hover selection and legend filtering), DT for the table, and crosstalk for shared selection. Implement CSS Grid manually within an R Markdown document for precise control over layout.

Why this specification works

This comprehensive prompt uses all three grammatical components:

Selection:
- Point selection (hover) for details-on-demand
- Multi-selection (legend click) for filtering

Binding:
- Tooltip binding (Shneiderman’s details-on-demand)
- Visual encoding binding (size, opacity changes per Bertin)
- Filter binding (showing subsets)

Composition:
- Spatial composition (CSS Grid layout)
- Shared selection (scatterplot-table coordination)

The specification also includes design rationale:
- Opacity rather than removal maintains overview while enabling focus
- Desktop-optimized for the executive audience context
- Two selection types serve different analytical purposes: exploration (hover) vs. segmentation (filter)
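When two selections both bind to opacity, the prompt should make clear which one wins. The base R sketch below models one possible resolution. `resolve_opacity()` is a hypothetical function, and the rule that legend exclusion overrides hover dimming is an assumption of this sketch, since the prompt itself does not state a precedence.

```r
# Hypothetical resolver for the two opacity bindings in the prompt.
# Assumption (not stated in the prompt): legend exclusion takes precedence
# over hover dimming.
resolve_opacity <- function(precip, key, active_precip, hovered_key = NULL) {
  opacity <- rep(1, length(key))
  if (!is.null(hovered_key)) {
    opacity[key != hovered_key] <- 0.3          # hover: dim unselected points
  }
  opacity[!(precip %in% active_precip)] <- 0.1  # legend: dim excluded categories
  opacity
}

# Four invented observations, keyed by an arbitrary id
precip <- c("none", "rain", "snow", "none")
key    <- c("k1", "k2", "k3", "k4")

# Snow toggled off in the legend, first point hovered
with_hover <- resolve_opacity(precip, key,
                              active_precip = c("none", "rain"),
                              hovered_key = "k1")

# Same legend state, nothing hovered
no_hover <- resolve_opacity(precip, key, active_precip = c("none", "rain"))

print(with_hover)
print(no_hover)
```

Writing the rule down this way tends to expose gaps in a specification: if the hovered point belongs to an excluded category, should it still be emphasized? Stating the intended answer in the prompt removes a decision the AI would otherwise make silently.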

Example implementation

The AI would generate an R Markdown document with CSS Grid:

````markdown
---
output: html_document
---

```{css}
.grid-container {
  display: grid;
  grid-template-columns: 2fr 3fr;
  grid-template-rows: auto 1fr 1fr;
  gap: 20px;
  min-width: 1200px;
}

.title { grid-column: 1 / 3; }
.text { grid-column: 1 / 2; grid-row: 2 / 4; }
.plot { grid-column: 2 / 3; grid-row: 2 / 3; }
.table { grid-column: 2 / 3; grid-row: 3 / 4; }
```

```{r}
library(crosstalk)
library(plotly)
library(DT)

# Shared data with composite key
shared_bike_data <- highlight_key(bike_weather, ~interaction(date, hour))

# Scatterplot with both selection types
p <- plot_ly(shared_bike_data, x = ~temperature, y = ~trip_count,
             color = ~precipitation_type, size = ~avg_duration,
             type = 'scatter', mode = 'markers') %>%
  layout(dragmode = FALSE) %>%  # Disable brush for point selection only
  highlight(on = "plotly_hover",
            color = "red",
            opacityDim = 0.3,
            selected = attrs_selected(size = 15))

# Table: a shared data object cannot be subset with `[`, so pass it whole
bike_table <- datatable(shared_bike_data, options = list(pageLength = 5))
```

<div class="grid-container">
<div class="title">

# Citi Bike Weather Analysis

</div>
<div class="text">

[Analytical narrative here]

</div>
<div class="plot">

```{r}
p
```

</div>
<div class="table">

```{r}
bike_table
```

</div>
</div>
````

The div wrappers carry the class names defined in the CSS chunk, so each piece of content lands in its intended grid cell.

21.4.3 What This Enables

The dashboard serves multiple analytical workflows simultaneously:
- Overview: The scatterplot shows the full temperature-ridership relationship
- Filtering: Legend clicks enable segmentation by weather type
- Details: Hovering reveals specific values for any observation
- Connection: Linked table shows exact records for selected points

The spatial composition (CSS Grid) ensures the executive sees analytical context (left text) alongside data exploration (right graphics), supporting both data-driven decision making and communication to other stakeholders.
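The shared selection in this dashboard only links one point to one table row if the composite key (date + hour) is unique. A quick base R check along the following lines can be run before building any views. The `bike_weather` rows here are invented for illustration.

```r
# Hypothetical hourly records; values invented for illustration
bike_weather <- data.frame(
  date       = as.Date(c("2019-01-05", "2019-01-05", "2019-01-06")),
  hour       = c(8, 9, 8),
  trip_count = c(120, 340, 95)
)

# Composite key: one value per (date, hour) combination
bike_weather$key <- as.character(
  interaction(bike_weather$date, bike_weather$hour)
)

# anyDuplicated() returns 0 when every key is unique
keys_unique <- anyDuplicated(bike_weather$key) == 0

print(bike_weather$key)
print(keys_unique)
```

If the check fails, hovering one point would highlight several rows (or the reverse), so it is worth asking the AI to include an equivalent assertion in the generated code.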

21.6 Summary: The Specification Framework

When writing specifications for interactive graphics, systematically address each grammatical component:

Data Operations: What transformations are needed?
- Loading and cleaning
- Feature engineering (extracting hour, calculating age)
- Filtering and aggregation
- Following Wilkinson’s data → transformation pipeline

Static Foundation: What are the base marks and encodings?
- Marks (points, lines, bars per Wilkinson)
- Encodings (position, color, size per Bertin)
- Scales and coordinates
- These persist regardless of interaction

Selection: What user action triggers the interaction?
- Point selection (hover, click)
- Interval selection (brush, drag)
- Multi-selection (shift-click, legend toggle)

Binding: What changes in response?
- Marks (size, color, opacity)
- Scales (zoom, pan)
- Filters (subset display)
- Other views (linked highlighting)

Composition: How do views coordinate?
- Independent (no coordination)
- Shared (bidirectional selection propagation)
- Hierarchical (overview controls detail views)
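The checklist above can be turned into a small template so that no component is forgotten when drafting a prompt. This is a minimal sketch in base R. `build_spec()` is a hypothetical helper, and the example wording passed to it is abbreviated from Example 1.

```r
# Hypothetical helper: assemble a prompt from the five grammatical components.
build_spec <- function(data_operations, static_foundation, selection,
                       binding, composition, tool_constraint = NULL) {
  parts <- c(
    "Data Operations"   = data_operations,
    "Static Foundation" = static_foundation,
    "Selection"         = selection,
    "Binding"           = binding,
    "Composition"       = composition
  )
  # Refuse to build a prompt with an empty component
  empty <- names(parts)[!nzchar(trimws(parts))]
  if (length(empty) > 0) {
    stop("Missing component(s): ", paste(empty, collapse = ", "))
  }
  # The tool constraint is optional
  if (!is.null(tool_constraint)) {
    parts <- c(parts, "Tool constraint" = tool_constraint)
  }
  paste(sprintf("%s: %s", names(parts), parts), collapse = "\n\n")
}

spec <- build_spec(
  data_operations   = "Load data/rides_weather.rds and aggregate n_rides by date.",
  static_foundation = "Points; daily_avg_temp on x, n_rides on y, day type as color.",
  selection         = "Point selection triggered by hover.",
  binding           = "Tooltip with daily details; dim unselected points to 30%.",
  composition       = "Single view, no coordination.",
  tool_constraint   = "Use R with ggplot2 and ggiraph."
)
cat(spec)
```

Even a single-view graphic gets an explicit Composition entry here. Stating "no coordination" is itself a specification, and it keeps the AI from adding linked views that were never requested.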

By specifying in these terms, which are grounded in the theoretical work of Wilkinson, Bertin, Shneiderman, and Yi et al. and can be implemented through tools like ggiraph, plotly, and Vega-Lite, you enable the AI to translate your analytical vision into working interactive graphics. You focus on what the user should experience; the AI handles how to build it.

The next chapter explores how to weave these interactions into longer narrative documents—scrollytelling, parameter manipulation, and guided exploration—extending the grammar into temporal composition.
