# Interactive stadium graphics: fence heights stacked above outfield outlines,
# built with ggiraph and composed with patchwork
library(tidyverse)   # read_rds(), dplyr verbs, ggplot2
library(ggiraph)     # geom_segment_interactive(), girafe(), hover options
library(patchwork)   # stack plots with /

team_idx <- read_rds("data/team_idx.rds")
outfields <- read_rds("data/outfields.rds")
outfields <- outfields %>% left_join(team_idx, by = 'id')

# Convert paths to segments to avoid closed path hover issues
outfields_seg <- outfields %>%
  group_by(id) %>%
  mutate(xend = lead(xsh), yend = lead(ysh)) %>%
  filter(!is.na(xend)) %>%
  ungroup()

ggoutfields <- ggplot(filter(outfields_seg, !(id %in% c(31, 32)))) +
  theme_void() +
  coord_equal() +
  # ids 31 and 32 are drawn as filled background shapes, not hoverable outlines
  geom_polygon(data = filter(outfields, (id %in% c(31, 32))),
               aes(x = xsh,
                   y = -ysh,
                   group = id),
               fill = "#FAD9B4",
               color = '#FAD9B4') +
  geom_segment_interactive(aes(x = xsh, y = -ysh,
                               xend = xend, yend = -yend,
                               tooltip = team, data_id = id),
                           color = 'black',
                           alpha = 0.5)

fences <- read_rds("data/fences.rds") %>% left_join(team_idx, by = 'id')

# Convert fence paths to segments to avoid closed path hover issues
fences_seg <- fences %>%
  group_by(id) %>%
  mutate(xend = lead(xsh), yend = lead(ysh)) %>%
  filter(!is.na(xend)) %>%
  ungroup()

ggfences <- ggplot(fences_seg) +
  theme_void() +
  theme(axis.text.x = element_text()) +
  coord_equal() +
  scale_x_continuous(breaks = c(100, 300, 500),
                     labels = c("Left Field", "Center Field", "Right Field")) +
  geom_segment_interactive(aes(x = xsh, y = -ysh,
                               xend = xend, yend = -yend,
                               tooltip = team, data_id = id),
                           color = 'black',
                           alpha = 0.5)

# make the two graphs interactive
girafe(code = print(ggfences / ggoutfields),
       options = list(
         opts_hover_inv(css = "opacity:0.15;"),
         opts_hover(css = "stroke-width:3; fill:none;")
       ))

2 Specification and communication with AI
Working with AI systems is like entering Wonderland: we must know where we want to end up before we can chart a path. Just as the Cheshire Cat told Alice that her destination determines her direction, our specifications determine what AI systems produce. Without clear intent, we wander aimlessly through possibilities.
The lesson extends beyond Wonderland. We might ask, with Holmes, what evidence the AI needs to reason correctly—the data’s structure, variables, relationships. Or we might wonder, with Tukey, what patterns should become visible through careful specification. Or we might consider, with Kahneman, what story will move our audience to decision. Effective specifications channel all three impulses: the demand for precise evidence, the search for unexpected patterns, and the understanding that data becomes meaningful only through narrative.
2.1 The specification mindset
Working with generative artificial intelligence resembles collaborating with a highly capable colleague who has read extensively but possesses no memory of this particular project, no sense of organizational context, and no awareness of what is left unsaid. The AI may demonstrate profound knowledge in one domain and surprising gaps in the next. Its knowledge is jagged, not uniform. Our task is to bridge these gaps—to provide context, specify constraints, and make explicit what we might otherwise assume to be shared. If we do, the AI can usually complete what we have requested. This approach marks an evolution in how we develop communications, data visualizations, and analyses: instead of immediately opening a programming environment, we first specify what we want, the tools to be used, and the standards by which the work should be judged.
This is not a revolution in communication so much as an evolution—a making visible of a skill we have always needed. Whether writing a research proposal, explaining an analysis to an executive, or documenting code for our future selves, the quality of our work depends on our ability to specify clearly what we mean. Generative AI simply makes this requirement more explicit. When we work with AI, we cannot rely on shared context, implicit assumptions, or the goodwill of a colleague who will fill in our gaps. The AI takes us literally. This literalness is not a limitation but a mirror: it reflects back to us the precision, or lack thereof, in our thinking.
Consider the difference between two requests, this:
“Make a chart of the sales data.”
versus this:
“Create a scatter plot showing the relationship between monthly sales revenue (y-axis, in thousands of dollars) and advertising spend (x-axis, in thousands of dollars) for Q1-Q4 2023. Include a trend line with confidence intervals. Label both axes clearly and add a title that describes the correlation shown.”
The first request might yield anything from a bar chart to a line graph, with arbitrary scales and unclear labeling. The second request, through more specificity, helps to constrain the possible outcomes to something useful and interpretable. The skill demonstrated in the second request—specification thinking—is the same skill that makes us effective communicators with any audience. The difference is that with AI, the feedback loop is immediate and unforgiving. We learn quickly whether we have been specific enough.
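To make the contrast concrete, here is a minimal sketch of the kind of R code the second request might elicit. The data frame sales and its columns ad_spend and revenue are hypothetical stand-ins for the unnamed sales data:

library(ggplot2)

# Hypothetical stand-in for the sales data named in the request
sales <- data.frame(
  ad_spend = c(12, 18, 25, 31, 22, 28),      # advertising spend, thousands of dollars
  revenue  = c(140, 155, 190, 210, 175, 200) # monthly revenue, thousands of dollars
)

ggplot(sales, aes(x = ad_spend, y = revenue)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +    # trend line with confidence interval
  labs(x = "Advertising spend (thousands of dollars)",
       y = "Monthly sales revenue (thousands of dollars)",
       title = "Monthly sales revenue rises with advertising spend, 2023")

Every element of the sketch traces back to a phrase in the specification; little was left for the AI to guess.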
Exercise 2.1 (Diagnosing vague specifications) Consider these three requests to an AI:
- “Analyze the customer data.”
- “Create a chart showing monthly revenue trends.”
- “Build a model to predict churn.”
For each request, identify at least three elements that remain unspecified. What data, exactly? Which customers? What time period? What kind of chart? Which revenue metric? What features should the model use? How will we know if it’s any good?
Now rewrite one of these requests with the level of specificity shown in the sales-advertising example above. Include context about the domain, constraints on the approach, and criteria for success.
Exchange your revised specification with a partner. Can they identify any remaining gaps or ambiguities? What questions would they need to ask before beginning work?
2.2 Elements of effective specification
Just as a well-scoped research proposal contains specific sections that structure our thinking, an effective specification to AI contains components that ensure the output meets our needs. These elements parallel the communication frameworks we explored in Chapter 1.
2.2.1 Context and background
Before requesting any output, we must establish what the AI needs to know. This includes:
- The domain: What field or subject area are we working in? What conventions or standards apply?
- The data: What dataset are we using? What do the variables represent? What are their types and ranges?
- The purpose: Why are we creating this output? What decision will it inform?
- The audience: Who will consume this output? What is their level of expertise?
Without this context, the AI operates in a vacuum, making assumptions that may not align with our intent. The context grounds the specification in reality, just as understanding the context of data—what Loukissas (2019) calls data settings rather than data sets—grounds our interpretation of measurements.
2.2.2 Constraints and requirements
Effective specifications include explicit constraints:
- Technical constraints: Which programming language or tool should be used? What libraries or packages are preferred or prohibited?
- Design constraints: What colors, fonts, or visual styles should be applied? What dimensions or aspect ratio?
- Content constraints: What must be included? What should be excluded? Are there specific labels, annotations, or reference lines needed?
- Quality constraints: What standards of accuracy, completeness, or clarity must be met?
These constraints function like the specifications in an engineering project or the requirements in a software contract. They bound the solution space and prevent the AI from making choices that violate our needs.
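Constraints like these translate directly into code decisions. A minimal sketch, using R’s built-in mtcars data and illustrative constraint values rather than any real project’s requirements:

library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "#1f77b4") +  # design constraint: an approved color (illustrative)
  labs(x = "Weight (1,000 lbs)",
       y = "Miles per gallon") +   # content constraint: clearly labeled axes
  theme_minimal()                  # design constraint: a minimal visual style

# design constraint: fixed dimensions and resolution for the deliverable
ggsave("weight_mpg.png", p, width = 8, height = 5, dpi = 300)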
2.2.3 Success criteria
We must tell the AI how we will evaluate its output. This includes:
- Output format: Should the result be a visualization, a table, a narrative summary, or code?
- Verification method: How can we check that the output is correct? What tests or validations should be possible?
- Quality indicators: What makes this output good? What characteristics distinguish excellent work from adequate work?
By specifying success criteria, we give the AI—and ourselves—a target to aim for and a standard against which to measure results.
Exercise 2.2 (Building a complete specification) Imagine you need to create a visualization showing the relationship between temperature and ice cream sales for a business presentation. Build a complete specification that includes all three elements discussed above:
Context and background (at least 3-4 sentences):
- What business domain are you working in?
- What data do you have available?
- Who is the audience and what decision will this inform?
Constraints and requirements (list 4-6 specific items):
- Technical: Programming language, tools, libraries
- Design: Colors, dimensions, visual style
- Content: What must be shown, what should be excluded
- Quality: Standards for accuracy or completeness
Success criteria (2-3 clear statements):
- What output format do you need?
- How will you verify the result is correct?
- What distinguishes excellent work from merely adequate?
Exchange specifications with a partner. Can they visualize what you want based solely on your specification? Where are the gaps?
2.3 The specification workflow
Working with AI follows an iterative workflow that resembles the scientific method: specify, generate, evaluate, refine. This workflow parallels the statistical workflow we discussed in Section 1.2, with an additional emphasis on the communication layer.
2.3.1 Specification as drafting
Our first specification is rarely our best. Just as Zinsser advises writers that “rewriting is the essence of writing well,” we should expect to revise our specifications. The first draft serves to discover what we have failed to specify. The AI’s output reveals the gaps in our thinking: the variable we forgot to mention, the scale we assumed but did not state, the edge case we did not consider.
This iterative process is not a failure of specification but its essence. Each cycle teaches us more about what matters in our request. The final specification, achieved through several rounds of refinement, represents a clarity of thought that might have taken much longer to achieve through other means.
2.3.2 Verification and validation
Generative AI can produce plausible-looking nonsense. Code may run but produce incorrect results. Visualizations may be attractive but misleading. Therefore, every AI-generated output requires verification:
- Logical verification: Does the output make sense given the data and the question?
- Technical verification: Does the code run without errors? Do the results match expectations for known cases?
- Comparative verification: How does this output compare to what we would have produced manually? What differences exist, and why?
This verification step connects to our broader theme of uncertainty communication (Chapter 17). When we use AI, we introduce a new source of uncertainty—the potential for specification gaps or AI errors—that we must acknowledge and manage.
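Technical verification, in particular, benefits from cases whose answers we already know. A minimal sketch: suppose an AI produced a summary function (the name monthly_totals() and the columns are hypothetical); we exercise it on a toy input we can check by hand before trusting it on real data:

library(dplyr)

# Hypothetical AI-generated function to be verified
monthly_totals <- function(df) {
  df %>%
    group_by(month) %>%
    summarise(revenue = sum(revenue), .groups = "drop")
}

# A known case, small enough to compute by hand
toy <- tibble(month   = c("Jan", "Jan", "Feb"),
              revenue = c(10, 20, 5))

result <- monthly_totals(toy)
stopifnot(
  nrow(result) == 2,
  result$revenue[result$month == "Jan"] == 30,
  result$revenue[result$month == "Feb"] == 5
)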
2.4 Specification as transferable skill
The skills we develop in specifying to AI transfer directly to other forms of communication. A prompt that successfully generates a visualization contains the same elements we would need to communicate that visualization to a colleague or client:
- What we are showing
- Why it matters
- How to interpret it
- What to watch out for or verify
Similarly, the iterative refinement we practice with AI—starting vague and adding specificity—mirrors how we should approach any complex communication. The first draft exposes what we have not yet clarified. Subsequent drafts add the precision that makes the communication effective.
2.4.1 Communication about AI
As we integrate AI into our workflow, we must also communicate about AI to others. This includes:
- Transparency: Acknowledging when and how AI was used in our work
- Limitations: Explaining what AI can and cannot do, what we verified and what we assumed
- Reproducibility: Documenting the prompts and AI tools used so others can assess or replicate our work
- Ethics: Considering the implications of AI-generated content, including potential biases, errors, or misrepresentations
These communication challenges are not unique to AI—they are the same challenges we face in communicating any analysis. But AI adds new dimensions: the opacity of the generation process, the potential for confident-sounding errors, and the rapid evolution of capabilities and limitations.
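One lightweight way to support transparency and reproducibility is to log each AI interaction alongside the work it produced. A sketch, where the file name and fields are illustrative rather than any standard:

# Append one record per AI interaction to a running log (illustrative schema)
log_entry <- data.frame(
  timestamp   = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
  tool        = "cloud chat interface",  # or local model, agentic tool, ...
  model       = "example-model-name",
  prompt      = "Create a scatter plot showing ...",
  output_file = "figures/sales_scatter.png"
)
write.table(log_entry, "ai_usage_log.csv", sep = ",", row.names = FALSE,
            col.names = !file.exists("ai_usage_log.csv"),
            append = file.exists("ai_usage_log.csv"))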
2.5 Working with language models
Specifications are only useful if we can communicate them to language models, yet the landscape of tools evolves so rapidly that any catalog would be outdated before publication. Rather than recommending specific products, we can identify three implementation approaches that represent different trade-offs between convenience, control, and privacy. Understanding these patterns helps you choose the right tool for your context, even as the products themselves change.
The most straightforward approach uses cloud-hosted models through web interfaces or APIs. Services like OpenAI’s ChatGPT, Anthropic’s Claude, or Google’s Gemini offer immediate access to powerful frontier models without hardware investment or setup complexity. You type or paste your specification into a web interface, receive a response, and copy any generated code into your local environment. This convenience comes with trade-offs: your data travels to external servers, you face ongoing subscription costs, and you depend on service availability. Your prompts may also contribute to the service’s training data unless you explicitly opt out, where services offer that choice. This pattern resembles consulting a reference library—comprehensive and accessible, but fundamentally public.
For those working with sensitive data or who prefer not to rely on external services, local hosting offers an alternative. Open-weight models like Llama, Mistral, or Qwen can run entirely on your own hardware. Tools such as LM Studio provide graphical interfaces for downloading, configuring, and interacting with these models, lowering the barrier for those uncomfortable with command lines. No data leaves your machine; prompts and responses remain local. The main trade-off is hardware: full-size frontier models can occupy more than a terabyte of system memory, and only recently has home or small-office hardware become capable of handling models of this size. Most notably, multiple Apple M3 Ultra computers, each with up to 512 GB of unified memory, can be “clustered” over Thunderbolt 5 with RDMA (remote direct memory access).
We can also shrink these models to fit on even modest laptops, with trade-offs. Running a quantized model on 16 GB of RAM, or on Apple Silicon with unified memory, is possible, but that reduction in size brings some loss of capability compared with frontier cloud models; quantized models nonetheless remain useful for many tasks. You also assume responsibility for updates and configuration. This pattern resembles maintaining a personal laboratory: complete control, but complete responsibility.
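Whether cloud or local, many services expose an HTTP endpoint following OpenAI’s chat-completions convention, so specifications can be sent programmatically. A minimal sketch in R, assuming LM Studio’s local server is running on its default port with a model loaded; the URL, model name, and prompt are placeholders to adapt:

library(httr2)

resp <- request("http://localhost:1234/v1/chat/completions") |>
  req_body_json(list(
    model = "local-model",  # LM Studio serves whichever model is loaded
    messages = list(
      list(role = "system", content = "You are a careful R programmer."),
      list(role = "user",   content = "Create a scatter plot of ...")
    )
  )) |>
  req_perform()

# Pull the model's reply out of the JSON response
reply <- resp_body_json(resp)$choices[[1]]$message$content
cat(reply)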
The most powerful—and currently most experimental—approach integrates AI directly into your workflow as an active collaborator rather than a passive responder. Agentic tools like opencode enable AI systems to read files, execute commands, search codebases, and maintain context across extended sessions. The AI becomes a research assistant who sits at your desk, works alongside you, and can take action rather than merely generate text. This integration requires comfort with command-line interfaces and tolerance for rapidly evolving capabilities that sometimes feel “duct-taped” together. Context management remains complex, and the learning curve is steep. But for complex, multi-file projects, this pattern offers capabilities impossible in simpler approaches.
Regardless of which approach you choose, you face a fundamental constraint that shapes how you must work: context windows. Language models possess limited working memory—typically tens of thousands to a few hundred thousand tokens (roughly word-pieces). This constraint creates three categories of context we must manage. Short-term context comprises the immediate conversation history the model can directly reference. Long-term memory includes information the model retrieves through RAG (Retrieval-Augmented Generation), external databases, or file system access. Context compression involves techniques for summarizing or selecting the most relevant information when working with large codebases or datasets.
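Because the window is finite, it helps to estimate a specification’s size before sending it. A minimal sketch using the common rule of thumb of roughly four characters per token for English text; a real tokenizer would be more accurate, and the file name and budget are hypothetical:

# Crude token estimate: ~4 characters per token for English text
approx_tokens <- function(text) ceiling(nchar(text) / 4)

spec <- paste(readLines("specification.md"), collapse = "\n")  # hypothetical file
if (approx_tokens(spec) > 8000) {  # illustrative budget
  message("Specification may exceed the model's context window; summarize or split it.")
}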
Current solutions for context management feel largely provisional—clever workarounds that address immediate needs but may not represent enduring architectures. Vector databases for RAG, hierarchical summarization schemes, and agent frameworks spawning sub-agents all await more elegant solutions. Yet certain principles will likely outlast current implementations: the need to separate immediate working memory from long-term storage, the value of structured and retrievable documentation, the importance of verification when AI generates summaries that might lose critical nuance, and the necessity of human oversight for consequential decisions. The prudent approach builds workflows around these enduring principles while remaining flexible about specific tools.
Consider how Holmes may have maintained his index of criminal cases, or how Tukey may have kept meticulous experimental logs. The discipline of clear specification and careful documentation serves us regardless of whether we work with cloud APIs, local models, or agentic assistants. The technology changes; the need for precision and verification remains.
2.6 From language model conversation to code
To make these principles concrete, let us consider how we might specify the creation of a visualization. In the next chapter, we explore audiences and their needs. Here, we preview how specification thinking applies to a specific communication task.
Consider the baseball stadium data we examined in Chapter 1. We have data on the outfield dimensions (Spencer, Scott 2021b) and fence heights (Spencer, Scott 2021a) of Major League Baseball stadiums, along with a team index (Spencer, Scott 2021c) for reference. We want to create an interactive visualization that helps readers understand the context in which home runs occur.
A naive request might be:
“Make an interactive plot of baseball field dimensions.”
But this leaves too much unspecified. Instead, we might write a specification that names the data files and the team index used to join them, describes converting each outline path into line segments so that hover highlighting behaves correctly on closed paths, assigns tooltips and hover identifiers by team, and requests the fence profile stacked above the outfield outlines. The code that opens this chapter implements such a specification. The result is an interactive graphic in which hovering over one team’s outline highlights it and dims the rest.
This example illustrates the full arc of specification thinking: understanding what we want to communicate (the context of home runs), identifying the data that supports that communication (outfield dimensions and fence heights), specifying how that data should be transformed and encoded visually, and verifying that the result serves our communicative purpose.
As we proceed through this text, we will apply this specification framework to increasingly complex communication challenges. Each chapter will present prompts that demonstrate how to translate communicative intent into specific instructions, followed by implementations that show how those instructions manifest in code. The goal is not to teach you to write code—though you will see plenty of it—but to teach you to think with the precision that effective communication, with AI or any audience, requires.
2.7 Exercises
Exercise 2.3 Identify three tasks you regularly perform in your data work (e.g., creating a summary table, plotting a time series, cleaning a dataset). For each task, write a specification to an AI that includes: (1) the context and purpose, (2) specific constraints and requirements, (3) success criteria. Exchange specifications with a colleague and attempt to implement each other’s requests. Where did the specifications succeed? Where did they fail? What did you learn about your own assumptions?
Exercise 2.4 Take a visualization or analysis you created in the past. Write a specification that would produce the same result. Now, modify the specification to produce a variation that serves a different audience or purpose (e.g., making it more detailed for technical reviewers, or more simplified for executives). What elements of the specification changed? What remained constant?
Exercise 2.5 Find a published data visualization or analysis that you admire. Working backward from the output, reconstruct the specification that might have produced it. What data operations were required? What visual encodings were chosen? What design decisions were made? This reverse-engineering exercise develops your ability to see specifications in finished work.