Summary
Synthetic data has legitimate uses in research—AI agents excel at automated cognitive walkthroughs, spotting logical inconsistencies, broken flows, and accessibility violations. However, using AI to generate "synthetic users" or fake survey responses is methodologically dangerous. AI lacks lived experience; it simulates plausibility, not truth. The rule: use synthetic data to test the system (logic, accessibility), use real data to understand the human (behavior, emotion, motivation).
The promise is seductive: why recruit 12 participants when you can simulate 1,000? Why wait for scheduling when an AI can "walk through" your prototype in seconds?
The answer depends entirely on what question you are trying to answer.
Large Language Models and AI agents have opened new possibilities for research automation. Some are genuinely valuable. Others are methodological landmines. This guide helps you tell the difference.
The Legitimate Use Case: Automated Cognitive Walkthrough
An AI agent can systematically navigate a prototype or live product, evaluating it against logical criteria. This is not fake research—it is super-powered heuristic evaluation.
What AI Agents Can Do
| Capability | Example | Value |
|---|---|---|
| Logical flow analysis | "Step 3 references data not collected until Step 5" | Catches sequencing errors |
| Label consistency | "The button says 'Submit' here but 'Send' elsewhere" | Identifies confusing terminology |
| Navigation auditing | "This page has no way to return to the dashboard" | Finds dead ends |
| Accessibility scanning | "This image has no alt text; this form field has no label" | Flags WCAG violations |
| Content evaluation | "This error message doesn't explain how to fix the problem" | Improves microcopy |
Why This Works
Logic is programmable. An AI can be given explicit rules:
- "Every action should have a clear undo path"
- "Every form field should have a visible label"
- "Every error should explain the problem and suggest a fix"
- "Navigation should be consistent across all pages"
The AI then systematically checks every screen against these rules, more quickly and more consistently than a human evaluator could.
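To make "logic is programmable" concrete, here is a minimal sketch of such a rule checker in Python, using only the standard-library HTML parser. It implements two of the rules above: every image needs alt text, and every form field needs an associated label. The class name, rule messages, and sample markup are illustrative, not a real auditing tool:

```python
from html.parser import HTMLParser

class HeuristicChecker(HTMLParser):
    """Toy rule checker: flags images without alt text and
    <input> elements with no matching <label for=...>."""

    def __init__(self):
        super().__init__()
        self.violations = []   # issues found while parsing
        self.labeled_ids = set()  # ids referenced by a <label for="...">
        self.input_ids = []    # ids of all <input> elements seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not attrs.get("alt"):
            self.violations.append("img missing alt text")
        if tag == "label" and "for" in attrs:
            self.labeled_ids.add(attrs["for"])
        if tag == "input":
            self.input_ids.append(attrs.get("id"))

    def report(self):
        # Label checks run after parsing, since a <label> may
        # appear before or after the input it describes.
        issues = list(self.violations)
        for input_id in self.input_ids:
            if input_id not in self.labeled_ids:
                issues.append(f"input '{input_id}' has no visible label")
        return issues

checker = HeuristicChecker()
checker.feed('<img src="chart.png"><input id="email">'
             '<label for="name">Name</label><input id="name">')
print(checker.report())
# → ['img missing alt text', "input 'email' has no visible label"]
```

A production tool would of course cover far more rules (WCAG alone defines dozens), but the shape is the same: explicit criteria, applied exhaustively to every element.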
The Cognitive Walkthrough Protocol
- Define evaluation criteria: What heuristics or standards should the AI check against?
- Provide the interface: Screenshots, prototype links, or live URLs
- Run the walkthrough: AI navigates and flags violations
- Review findings: Human researcher validates and prioritizes
- Fix and re-run: Iterate until baseline issues are resolved
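The protocol's core loop (steps 1 through 4) can be sketched as code, under the assumption that screens are represented as simple dictionaries and evaluation criteria as callables. Every name below is illustrative:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    screen: str
    rule: str
    detail: str

def run_walkthrough(screens, rules):
    """Step 3: systematically check every screen against every rule."""
    findings = []
    for screen_name, screen in screens.items():
        for rule_name, check in rules.items():
            problem = check(screen)
            if problem:
                findings.append(Finding(screen_name, rule_name, problem))
    return findings

# Step 1: define evaluation criteria as explicit, checkable rules
rules = {
    "undo_path": lambda s: None if s.get("has_undo") else "no clear undo path",
    "dashboard_link": lambda s: None if s.get("links_dashboard") else "dead end",
}

# Step 2: provide the interface (here, a stub description of each screen)
screens = {
    "checkout": {"has_undo": False, "links_dashboard": True},
    "settings": {"has_undo": True, "links_dashboard": True},
}

# Step 4: a human reviews and prioritizes the flagged findings
for f in run_walkthrough(screens, rules):
    print(f"{f.screen}: {f.rule} -> {f.detail}")
# → checkout: undo_path -> no clear undo path
```

Step 5 is simply running this loop again after fixes, until the findings list is empty for the baseline criteria.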
What Automated Walkthroughs Cannot Do
Even in legitimate use cases, AI has limits:
| Limitation | Example |
|---|---|
| Cannot assess emotional response | "Does this error message feel patronizing?" |
| Cannot evaluate trust | "Would you enter your credit card here?" |
| Cannot predict workarounds | "Users might screenshot this instead of using the share button" |
| Cannot surface unstated needs | "I wish this also showed me X" |
These require real humans with real context.
The Dangerous Use Case: Mimicking People
The temptation is to go further: if AI can evaluate a flow, can it also respond like a user? Can it generate survey responses, simulate interview answers, or create "synthetic personas" based on demographic profiles?
This is where synthetic data becomes dangerous.
The Fundamental Problem
Language models predict the most probable next token based on patterns in their training data. They do not model how a real human will react to your specific product.
| What AI Does | What Research Needs |
|---|---|
| Predicts statistically likely response | Captures actual human reaction |
| Defaults to "average internet opinion" | Surfaces edge cases and outliers |
| Simulates plausibility | Reveals truth |
| Generates coherent text | Reflects lived experience |
Why AI Cannot Simulate Humans
AI lacks lived experience. It has never:
- Lost a job and felt the anxiety of checking a bank balance
- Struggled to complete a form while a baby cried in the background
- Felt the specific frustration of a promise broken by a brand
- Experienced the trust that comes from years of positive interactions
- Made an irrational choice because of a memory from childhood
These experiences shape how real users interact with products. AI can generate text that sounds like these experiences, but it is simulation, not observation.
The "Average User" Trap
When you ask an AI to respond as "a 35-year-old working mother," it generates a statistically average representation based on how such people are described online. This has two problems:
- Stereotyping: The AI reproduces cultural assumptions and biases
- Flattening: Real humans are contradictory, surprising, and individual
The insights that matter most—the unexpected behaviors, the edge cases, the genuine confusion—are exactly what synthetic data cannot produce.
Specific Failures of Synthetic User Data
| Method | What Goes Wrong |
|---|---|
| Synthetic survey responses | AI generates plausible-sounding but meaningless data; statistical analysis produces confident but false conclusions |
| Synthetic interviews | AI produces coherent narratives that confirm your assumptions; you learn nothing new |
| AI-generated personas | Stereotypes are reinforced; edge cases become invisible; teams design for an "average" user who represents no one |
| Synthetic usability feedback | AI predicts what users might struggle with, missing what they actually struggle with |
The Verdict: A Clear Line
The distinction is simple:
| Test the System | Understand the Human |
|---|---|
| Use synthetic data | Use real data |
| Is it logical? | Is it desirable? |
| Is it consistent? | Does it solve a problem? |
| Is it accessible? | How does it feel? |
| Are there obvious errors? | What surprises us? |
The Decision Framework
When Synthetic Methods Are Appropriate
| Method | Appropriate Use |
|---|---|
| AI cognitive walkthrough | Pre-testing before human participants |
| Automated accessibility audit | Baseline compliance check |
| AI-assisted content review | Catching inconsistencies at scale |
| Synthetic load testing | Stress-testing system performance |
When Synthetic Methods Are Dangerous
| Method | Why It Fails |
|---|---|
| Synthetic survey responses | Produces false confidence in meaningless data |
| AI-generated interview transcripts | Confirms assumptions, surfaces no surprises |
| Synthetic personas replacing real segmentation | Designs for stereotypes, not real people |
| AI "predicting" user behavior | Misses the irrational, emotional, contextual reality |
The Ethical Dimension
Beyond methodology, there is an ethical question: synthetic user data can be used to fake research entirely.
A team under pressure could generate "1,000 survey responses" to justify a decision already made. A vendor could claim "user research" that was actually AI fabrication. A report could present synthetic quotes as real participant voices.
Transparency Requirements
If you use AI in any part of your research process, disclose it:
- "Accessibility issues were identified using automated scanning tools"
- "Initial heuristic evaluation was AI-assisted; findings were validated by human reviewers"
- "Prototype was pre-tested with automated walkthrough before participant sessions"
Never present AI-generated content as human participant data.
What This Means for Practice
Synthetic data is a tool—powerful when used correctly, dangerous when misused.
- Use AI for system testing: Automated walkthroughs, accessibility audits, and logical consistency checks are legitimate and valuable
- Never use AI to replace human participants: Survey responses, interview data, and behavioral observations require real people
- Remember the limitation: AI simulates plausibility, not truth; it lacks lived experience
- Disclose AI use: Transparency about methodology protects your credibility
- Apply the test: "Am I testing the system or understanding the human?"
The most sophisticated AI cannot tell you what it feels like to be your user. Only your users can do that.