
# Replay Testing

> Test agents against real historical conversations

Replay testing lets you pull real conversations from connected support sources, choose a turn, and generate a fresh Duckie response for that point in the conversation.

Use replay testing when you want to compare Duckie's current behavior against what happened in an actual customer conversation.

## When To Use Replay Testing

Replay testing is best for:

* validating an agent against real customer phrasing
* checking multi-turn context handling
* testing fixes against a specific historical ticket
* reviewing how a changed agent would respond to an old conversation
* turning real examples into candidates for batch tests

Replay testing is not a scored regression suite. Use [batch testing](/testing/batch-testing) when you need repeatable runs, rubrics, and aggregate scores.

## Supported Sources

Replay testing can fetch conversations from connected sources that provide conversation history:

| Source    | Fetch recent conversations | Fetch by ticket ID |
| --------- | -------------------------- | ------------------ |
| Slack     | Yes                        | Yes                |
| Zendesk   | Yes                        | Yes                |
| HubSpot   | Yes                        | Yes                |
| Intercom  | Yes                        | Yes                |
| Freshdesk | Yes                        | No                 |
| Plain     | Yes                        | No                 |
| Pylon     | Yes                        | No                 |
| Discord   | Yes                        | No                 |

Duckie only shows sources that are connected for your organization.
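If you script around replay testing, for example to decide which tickets you can look up directly, the table above can be captured as a small lookup. This is a minimal sketch: `SourceCapabilities`, `SOURCE_CAPABILITIES`, and `sources_supporting_ticket_id` are hypothetical names for illustration, not part of a Duckie API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SourceCapabilities:
    fetch_recent: bool
    fetch_by_ticket_id: bool


# Mirrors the Supported Sources table. The dictionary is illustrative only.
SOURCE_CAPABILITIES: Dict[str, SourceCapabilities] = {
    "Slack": SourceCapabilities(True, True),
    "Zendesk": SourceCapabilities(True, True),
    "HubSpot": SourceCapabilities(True, True),
    "Intercom": SourceCapabilities(True, True),
    "Freshdesk": SourceCapabilities(True, False),
    "Plain": SourceCapabilities(True, False),
    "Pylon": SourceCapabilities(True, False),
    "Discord": SourceCapabilities(True, False),
}


def sources_supporting_ticket_id(connected: Iterable[str]) -> List[str]:
    """Return the connected sources that can fetch one conversation by ID."""
    return [
        name
        for name in connected
        if name in SOURCE_CAPABILITIES
        and SOURCE_CAPABILITIES[name].fetch_by_ticket_id
    ]
```

Unknown source names are ignored rather than raising, which matches the idea that only connected, supported sources are offered.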

## Run A Replay Test

<Steps>
  <Step title="Open Replay Chats">
    Go to **Test → Replay Chats**.
  </Step>

  <Step title="Select an agent">
    Choose the active agent you want to test.
  </Step>

  <Step title="Select a connection">
    Choose the connected source to fetch conversations from.
  </Step>

  <Step title="Fetch conversations">
    Click **Fetch Conversations**. Duckie loads replayable conversations and splits them into turns.
  </Step>

  <Step title="Pick a conversation and turn">
    Select a conversation, then move through its turns with **Prev Message** and **Next Message**.
  </Step>

  <Step title="Generate Duckie's response">
    Click **Generate Duckie Response** to run the selected agent from that point in the conversation.
  </Step>

  <Step title="Compare the result">
    Compare the historical **Expected** response with the generated **Duckie** response.
  </Step>
</Steps>

## Fetch A Specific Ticket

For Slack, Zendesk, HubSpot, and Intercom, you can fetch one conversation directly by ID.

<Steps>
  <Step title="Select a supported connection">
    Choose Slack, Zendesk, HubSpot, or Intercom.
  </Step>

  <Step title="Enter the ticket ID">
    Paste the ticket, thread, or conversation ID into the **Ticket ID** field.
  </Step>

  <Step title="Fetch the ticket">
    Click **Fetch Ticket**. If the source returns a replayable conversation, Duckie adds it to the list of conversations you can select.
  </Step>
</Steps>

## How Turns Work

Duckie splits each conversation into turns:

| Part                 | What it contains                                                    |
| -------------------- | ------------------------------------------------------------------- |
| Conversation history | Messages before the selected customer turn                          |
| Current messages     | One or more customer messages in the selected turn                  |
| Expected response    | The historical agent response that followed those customer messages |
| Duckie response      | The new response generated by the selected Duckie agent             |

Conversations without any historical agent response are filtered out because there is no expected response to compare against.
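The turn structure in the table can be sketched as a simple grouping pass. This is an illustration under stated assumptions, not Duckie's implementation: the `Message` and `Turn` types, the `"customer"` and `"agent"` role labels, and the rule that a turn is a run of consecutive customer messages followed by a run of agent messages are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str  # assumed labels: "customer" or "agent"
    text: str


@dataclass
class Turn:
    history: List[Message]            # messages before the selected customer turn
    current_messages: List[Message]   # one or more customer messages
    expected_response: List[Message]  # historical agent reply that followed


def split_into_turns(messages: List[Message]) -> List[Turn]:
    """Group a conversation into replayable turns.

    Customer messages with no agent reply after them produce no turn,
    so a conversation without any agent response yields an empty list.
    """
    turns: List[Turn] = []
    i, n = 0, len(messages)
    while i < n:
        if messages[i].role != "customer":
            i += 1  # skip leading agent messages or unknown roles
            continue
        start = i
        while i < n and messages[i].role == "customer":
            i += 1
        reply_start = i
        while i < n and messages[i].role == "agent":
            i += 1
        if i > reply_start:  # keep the turn only if an agent reply exists
            turns.append(
                Turn(
                    history=messages[:start],
                    current_messages=messages[start:reply_start],
                    expected_response=messages[reply_start:i],
                )
            )
    return turns
```

In this sketch, an empty result corresponds to a conversation that would be filtered out because it has nothing to compare against.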

When you replay later turns, Duckie uses the prior customer messages and prior agent responses as context. If you already generated Duckie responses for earlier turns, those generated responses become the prior agent context for later turns.
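One way to picture how context is assembled for a later turn is the sketch below. It assumes that a generated response, when present, takes the place of the historical agent response for that earlier turn. The function name `build_context`, the tuple shapes, and the role labels are hypothetical.

```python
from typing import Dict, List, Tuple

# Each historical turn: (customer messages, historical agent response).
HistoricalTurn = Tuple[List[str], str]


def build_context(
    turns: List[HistoricalTurn],
    selected_index: int,
    generated: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Build (role, text) context for the turn at selected_index.

    For each earlier turn, use the Duckie response generated in this
    session if there is one; otherwise fall back to the historical reply.
    """
    context: List[Tuple[str, str]] = []
    for index in range(selected_index):
        customer_messages, historical_response = turns[index]
        for text in customer_messages:
            context.append(("customer", text))
        context.append(("agent", generated.get(index, historical_response)))
    return context
```

For example, replaying the third turn after generating a response for the first turn only would combine the generated reply for turn one with the historical reply for turn two.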

## Use Context Settings

Replay tests use the context panel to simulate source metadata for the run.

Use the context panel to:

* confirm the source Duckie should treat the conversation as coming from
* add source-specific fields when needed
* review internal-note and write-action settings for the run

## Test Mode Safety

Replay tests run in testing mode, the same safety layer that protects playground and batch test runs from unintended external side effects.

In testing mode:

* write app tools, custom tools, and MCP tools are skipped
* responder output is converted to internal notes where the source supports internal notes
* responder delivery is skipped for Slack and Discord because those sources have no internal note target

You can turn on **Allow Write Actions** in the context panel when you intentionally want to exercise write tools during a replay.
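The testing-mode rules can be summarized as a small decision sketch. Everything here is hypothetical and for illustration only: the tool kind labels, the `RunSettings` type, and both function names are assumptions. The sketch also assumes that **Allow Write Actions** re-enables all three skipped tool kinds and leaves responder delivery unchanged, and that every source other than Slack and Discord supports internal notes.

```python
from dataclasses import dataclass

# Assumed labels for the tool kinds that testing mode skips.
SKIPPED_TOOL_KINDS = {"write_app", "custom", "mcp"}

# Sources named in the text as having no internal note target.
SOURCES_WITHOUT_INTERNAL_NOTES = {"slack", "discord"}


@dataclass(frozen=True)
class RunSettings:
    source: str
    allow_write_actions: bool = False


def should_run_tool(tool_kind: str, settings: RunSettings) -> bool:
    """Return True if a tool of this kind runs during the replay."""
    if tool_kind not in SKIPPED_TOOL_KINDS:
        return True  # e.g. read-only tools are unaffected in this sketch
    return settings.allow_write_actions


def responder_delivery(settings: RunSettings) -> str:
    """Return how responder output is handled in testing mode."""
    if settings.source.lower() in SOURCES_WITHOUT_INTERNAL_NOTES:
        return "skipped"
    return "internal_note"
```

Treat the output as a mental model for reviewing a run's settings in the context panel, not as a description of Duckie's internals.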

## Review Replay History

The **Replay Chats History** panel shows replay runs created from the Replay Chats page.

From history, you can:

* load a previous replay run
* open the run details drawer
* inspect steps, tool calls, and generated messages
* delete replay runs you no longer need

## Turn Replays Into Regression Tests

Replay testing is useful for finding examples worth preserving. When a replay exposes behavior you want to keep checking, add the scenario to a batch test.

<Steps>
  <Step title="Find a useful replay">
    Replay a real conversation and identify the turn you want to preserve.
  </Step>

  <Step title="Open Batch Test">
    Go to **Test → Batch Test** and open or create a batch.
  </Step>

  <Step title="Add the scenario">
    Use **Manage tickets** to add a custom ticket with the relevant customer messages and expected response.
  </Step>

  <Step title="Run the batch">
    Run the batch whenever you need repeatable validation.
  </Step>
</Steps>

## Next Steps

<CardGroup cols={2}>
  <Card title="Playground" icon="flask" href="/testing/playground">
    Test new scenarios interactively
  </Card>

  <Card title="Batch Testing" icon="list-check" href="/testing/batch-testing">
    Create repeatable regression suites
  </Card>
</CardGroup>
