Replay testing lets you pull real conversations from connected support sources, choose a turn, and generate a fresh Duckie response for that point in the conversation. Use replay testing when you want to compare Duckie’s current behavior against what happened in an actual customer conversation.

When To Use Replay Testing

Replay testing is best for:
  • validating an agent against real customer phrasing
  • checking multi-turn context handling
  • testing fixes against a specific historical ticket
  • reviewing how a changed agent would respond to an old conversation
  • turning real examples into candidates for batch tests
Replay testing is not a scored regression suite. Use batch testing when you need repeatable runs, rubrics, and aggregate scores.

Supported Sources

Replay testing can fetch conversations from connected sources that provide conversation history:
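| Source    | Fetch recent conversations | Fetch by ticket ID |
|-----------|----------------------------|--------------------|
| Slack     | Yes                        | Yes                |
| Zendesk   | Yes                        | Yes                |
| HubSpot   | Yes                        | Yes                |
| Intercom  | Yes                        | Yes                |
| Freshdesk | Yes                        | No                 |
| Plain     | Yes                        | No                 |
| Pylon     | Yes                        | No                 |
| Discord   | Yes                        | No                 |
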
Duckie only shows sources that are connected for your organization.
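
For quick reference, the support matrix above can be written as a lookup. This is only an illustrative sketch: `SOURCE_CAPABILITIES` and `can_fetch_by_ticket_id` are hypothetical names made up for this example, not part of any Duckie API.

```python
# Illustrative only: the capability table from this page as a lookup.
# These names are hypothetical and do not correspond to a Duckie API.
SOURCE_CAPABILITIES = {
    "Slack":     {"recent": True, "by_ticket_id": True},
    "Zendesk":   {"recent": True, "by_ticket_id": True},
    "HubSpot":   {"recent": True, "by_ticket_id": True},
    "Intercom":  {"recent": True, "by_ticket_id": True},
    "Freshdesk": {"recent": True, "by_ticket_id": False},
    "Plain":     {"recent": True, "by_ticket_id": False},
    "Pylon":     {"recent": True, "by_ticket_id": False},
    "Discord":   {"recent": True, "by_ticket_id": False},
}


def can_fetch_by_ticket_id(source: str) -> bool:
    """Return True if the table lists the source as fetchable by ticket ID."""
    caps = SOURCE_CAPABILITIES.get(source)
    if caps is None:
        raise ValueError(f"Source not listed on this page: {source}")
    return caps["by_ticket_id"]
```

For example, `can_fetch_by_ticket_id("Zendesk")` is true, while `can_fetch_by_ticket_id("Freshdesk")` is false, so Freshdesk conversations can only be reached through the recent-conversations fetch.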

Run A Replay Test

1. Open Replay Chats
   Go to Test → Replay Chats.

2. Select an agent
   Choose the active agent you want to test.

3. Select a connection
   Choose the connected source to fetch conversations from.

4. Fetch conversations
   Click Fetch Conversations. Duckie loads replayable conversations and splits them into turns.

5. Pick a conversation and turn
   Select a conversation, then move through its turns with Prev Message and Next Message.

6. Generate Duckie's response
   Click Generate Duckie Response to run the selected agent from that point in the conversation.

7. Compare the result
   Compare the historical Expected response with the generated Duckie response.

Fetch A Specific Ticket

For Slack, Zendesk, HubSpot, and Intercom, you can fetch one conversation directly by ID.

1. Select a supported connection
   Choose Slack, Zendesk, HubSpot, or Intercom.

2. Enter the ticket ID
   Paste the ticket, thread, or conversation ID into the Ticket ID field.

3. Fetch the ticket
   Click Fetch Ticket. Duckie adds the replayable conversation if the source returns one.

How Turns Work

Duckie splits each conversation into turns:
| Part                 | What it contains                                                            |
|----------------------|-----------------------------------------------------------------------------|
| Conversation history | Messages before the selected customer turn                                  |
| Current messages     | One or more customer messages in the selected turn                          |
| Expected response    | The historical agent response that followed those customer messages         |
| Duckie response      | The new response generated by the selected Duckie agent                     |

Conversations without any historical agent response are filtered out because there is no expected response to compare against.

When you replay later turns, Duckie uses the prior customer messages and prior agent responses as context. If you already generated Duckie responses for earlier turns, those generated responses become the prior assistant context for later turns.
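
The turn model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Duckie's implementation: the `Message` and `Turn` types, the `"customer"`/`"agent"` role labels, and the functions `split_into_turns` and `build_context` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # assumed labels: "customer" or "agent"
    text: str


@dataclass
class Turn:
    history: list[Message]            # messages before the selected customer turn
    current_messages: list[Message]   # one or more consecutive customer messages
    expected_response: list[Message]  # historical agent reply that followed


def split_into_turns(messages: list[Message]) -> list[Turn]:
    """Group a conversation into (customer run, agent run) turns."""
    turns: list[Turn] = []
    i, n = 0, len(messages)
    while i < n:
        start = i
        while i < n and messages[i].role == "customer":
            i += 1
        customer = messages[start:i]
        reply_start = i
        while i < n and messages[i].role != "customer":
            i += 1
        reply = messages[reply_start:i]
        # A customer run with no agent reply has nothing to compare against.
        if customer and reply:
            turns.append(Turn(messages[:start], customer, reply))
    return turns


def build_context(turns: list[Turn], index: int, generated: dict[int, str]) -> list[Message]:
    """Context for replaying turns[index]; generated replies replace historical ones."""
    context = list(turns[0].history) if turns else []
    for i, turn in enumerate(turns[:index]):
        context.extend(turn.current_messages)
        if i in generated:
            context.append(Message("agent", generated[i]))
        else:
            context.extend(turn.expected_response)
    return context
```

In this sketch, a conversation that yields an empty list from `split_into_turns` corresponds to one that would be filtered out, and `build_context` shows how an earlier generated response takes the place of the historical agent reply when a later turn is replayed.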

Use Context Settings

Replay tests use the context panel to simulate source metadata for the run. In the panel, you can:
  • confirm the source Duckie should treat the conversation as coming from
  • add source-specific fields when needed
  • review internal-note and write-action settings for the run

Test Mode Safety

Replay tests run in testing mode, the same safety layer that protects playground and batch test runs from unintended external side effects. In testing mode:
  • write app tools, custom tools, and MCP tools are skipped
  • responder output is converted to internal notes where the source supports internal notes
  • Slack and Discord responder delivery is skipped because they do not have an internal note target
You can turn on Allow Write Actions in the context panel when you intentionally want to exercise write tools during a replay.
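
The testing-mode rules above can be modeled as two small decision functions. This is a hedged sketch, not Duckie's code: `RunSettings`, `should_run_tool`, `responder_delivery`, and the tool-kind labels are hypothetical. It also assumes that Allow Write Actions re-enables all three skipped tool kinds, that it does not change responder delivery, and that any source other than Slack and Discord supports internal notes; the page does not state these details.

```python
from dataclasses import dataclass

# Tool kinds this page lists as skipped in testing mode (labels are hypothetical).
SKIPPED_TOOL_KINDS = {"write_app", "custom", "mcp"}
# Sources this page says have no internal note target.
NO_INTERNAL_NOTE_SOURCES = {"Slack", "Discord"}


@dataclass
class RunSettings:
    testing_mode: bool = True          # replay tests run in testing mode
    allow_write_actions: bool = False  # the Allow Write Actions toggle


def should_run_tool(kind: str, settings: RunSettings) -> bool:
    """Decide whether a tool call executes or is skipped."""
    if not settings.testing_mode or kind not in SKIPPED_TOOL_KINDS:
        return True
    # Assumption: the toggle re-enables every skipped kind.
    return settings.allow_write_actions


def responder_delivery(source: str, settings: RunSettings) -> str:
    """Return "reply", "internal_note", or "skipped" for responder output."""
    if not settings.testing_mode:
        return "reply"
    if source in NO_INTERNAL_NOTE_SOURCES:
        return "skipped"
    # Assumption: every other source supports internal notes.
    return "internal_note"
```

With the default settings, a `"write_app"` tool is skipped and a Zendesk response becomes an internal note, while a Slack response is not delivered at all.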

Review Replay History

The Replay Chats History panel shows replay runs created from the Replay Chats page. From history, you can:
  • load a previous replay run
  • open the run details drawer
  • inspect steps, tool calls, and generated messages
  • delete replay runs you no longer need

Turn Replays Into Regression Tests

Replay testing is useful for finding examples worth preserving. When a replay exposes behavior you want to keep checking, add the scenario to a batch test.

1. Find a useful replay
   Replay a real conversation and identify the turn you want to preserve.

2. Open Batch Test
   Go to Test → Batch Test and open or create a batch.

3. Add the scenario
   Use Manage tickets to add a custom ticket with the relevant customer messages and expected response.

4. Run the batch
   Run the batch whenever you need repeatable validation.

Next Steps

  • Playground: Test new scenarios interactively
  • Batch Testing: Create repeatable regression suites