The playground is an interactive chat interface for testing agents in real time. It is a safe place to experiment and validate agent behavior before deploying.

Using the Playground

  1. Navigate to Playground
     Go to Test → Playground in your dashboard.
  2. Select an Agent
     Choose which agent you want to test from the dropdown.
  3. Start Chatting
     Type messages as if you were a customer and press Enter.
  4. Review Responses
     Read the agent's responses and check quality.
  5. View Execution Details
     Click on any response to see how the agent processed it.

Viewing Execution Details

The execution panel shows everything that happened:

Steps Executed

See the sequence of actions:
  1. Message received
  2. Guardrails checked
  3. Knowledge searched
  4. Runbook/workflow executed
  5. Response generated

Knowledge Retrieved

See what the agent found:
  • Which articles matched
  • Relevance scores
  • Content used in response

Tool Calls

See what actions were taken:
  • Tool name
  • Input parameters
  • Output/result
  • Success or failure

Reasoning

Understand why the agent made decisions:
  • How it interpreted the question
  • Why it chose certain knowledge
  • How it formulated the response
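
If you review many responses, it can help to jot down what the execution panel showed in a consistent shape. The sketch below is a note-taking aid only: `TraceNote`, `ToolCall`, and the tool names in the usage note are hypothetical, and nothing here reads from or reflects an export format of the playground. It assumes you copy the step names and tool results by hand.

```python
from dataclasses import dataclass, field

# Step names as listed under "Steps Executed" above.
EXPECTED_STEPS = [
    "Message received",
    "Guardrails checked",
    "Knowledge searched",
    "Runbook/workflow executed",
    "Response generated",
]


@dataclass
class ToolCall:
    """One row from the Tool Calls section, copied by hand (hypothetical shape)."""
    name: str
    params: dict
    output: str
    succeeded: bool


@dataclass
class TraceNote:
    """Your own notes about one response's execution details."""
    steps: list[str]
    tool_calls: list[ToolCall] = field(default_factory=list)

    def missing_steps(self) -> list[str]:
        # Steps from the expected sequence that you did not see in the panel.
        return [s for s in EXPECTED_STEPS if s not in self.steps]

    def failed_tools(self) -> list[str]:
        # Names of tool calls the panel marked as failed.
        return [c.name for c in self.tool_calls if not c.succeeded]
```

For example, a note built with `steps=["Message received", "Knowledge searched", "Response generated"]` reports the guardrail and runbook steps as missing. A missing step is a prompt to look closer, not necessarily an error, since a given response may legitimately skip a step.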

Testing Scenarios

Happy Path Testing

Test common questions with clear answers:
  • "How do I reset my password?"
  • "What's your refund policy?"
  • "How do I upgrade my account?"

Edge Case Testing

Test unusual or difficult scenarios:
  • "sdflkjsdf" (gibberish)
  • "I need help but I don't know with what"
  • "Can you help with X and also Y and also Z?"

Guardrail Testing

Test messages that should trigger escalation:
  • "I want to cancel and talk to a manager"
  • "This is unacceptable, I'm contacting my lawyer"
  • "I've tried 5 times and nothing works"

Test restriction triggers:
  • "What discount can you give me?"
  • "How does [competitor] compare?"
  • "What's the internal pricing?"

Knowledge Testing

Test for knowledge gaps:
  • Questions about features you haven't documented
  • Very specific or technical questions
  • Questions using unusual terminology
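
The scenario categories above can be kept as a small checklist so that each playground session covers the same ground. This is a minimal sketch for your own tracking; it does not call the playground or any Duckie API. The `Scenario` class, the `expect` wording, and the `passed` field are assumptions made for illustration, and you type each message into the playground by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scenario:
    category: str
    message: str                   # what you type into the playground
    expect: str                    # your own bar for a good outcome
    passed: Optional[bool] = None  # fill in by hand after reviewing the response


# Messages are taken from the sections above; the "expect" text is a placeholder.
_CATALOG = {
    "happy_path": ("clear, correct answer", [
        "How do I reset my password?",
        "What's your refund policy?",
        "How do I upgrade my account?",
    ]),
    "edge_case": ("handled gracefully", [
        "sdflkjsdf",
        "I need help but I don't know with what",
        "Can you help with X and also Y and also Z?",
    ]),
    "escalation": ("escalation triggered", [
        "I want to cancel and talk to a manager",
        "This is unacceptable, I'm contacting my lawyer",
        "I've tried 5 times and nothing works",
    ]),
    "restriction": ("restriction triggered", [
        "What discount can you give me?",
        "How does [competitor] compare?",
        "What's the internal pricing?",
    ]),
}

SCENARIOS = [
    Scenario(category, message, expect)
    for category, (expect, messages) in _CATALOG.items()
    for message in messages
]


def untested(scenarios):
    """Return the scenarios you have not yet marked as passed or failed."""
    return [s for s in scenarios if s.passed is None]
```

After each run, set `passed` on the scenario you just checked; `untested(SCENARIOS)` then shows what is left for the session.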

Turning Playground Scenarios Into Batch Tests

Use successful playground conversations as source material for manual batch tickets:
  1. Have a Conversation
     Test a specific scenario in the playground.
  2. Open Batch Test
     Go to Test → Batch Test and open or create a batch.
  3. Add a Custom Ticket
     Open Manage tickets, click Add custom ticket, and recreate the customer and agent messages from the playground scenario.
  4. Run the Batch
     Run the batch against the agent and score the result with a rubric.

Replay Real Conversations

Use Replay Testing when you want to start from existing support conversations instead of manually recreating a scenario. Replay testing fetches historical conversations from connected sources, splits them into turns, and lets you generate a new Duckie response for a selected turn.

Rating Responses

Provide feedback on response quality:
  • 👍 Good response
  • 👎 Needs improvement
  • Add comments for context

Ratings help track quality and identify improvement areas.

Starting a New Conversation

To start fresh, click New Conversation or the refresh icon. The previous conversation is cleared and the context is reset.

Tips for Effective Testing

Test Like a Customer

Write messages the way customers actually write:
  • Incomplete sentences
  • Typos and informal language
  • Multiple questions at once
  • Vague descriptions
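
One way to get customer-like phrasing from a clean test question is to roughen it mechanically before typing it into the playground. The helper below is an illustrative sketch, not part of the product: `roughen` is a hypothetical name, and it only covers the informal-language and typo traits from the list above (lowercasing, dropping end punctuation, and swapping one adjacent letter pair).

```python
import random


def roughen(message: str, seed: int = 0) -> str:
    """Return a rougher variant of a clean message.

    Lowercases the text, strips trailing punctuation, and swaps one pair
    of adjacent letters to imitate a typo. The seed makes the result
    repeatable so a scenario can be re-tested with the same wording.
    """
    rng = random.Random(seed)
    text = message.lower().rstrip("?.!")

    # Candidate positions: two letters side by side, so the swap stays inside a word.
    positions = [
        i for i in range(len(text) - 1)
        if text[i].isalpha() and text[i + 1].isalpha()
    ]
    if positions:
        i = rng.choice(positions)
        text = text[:i] + text[i + 1] + text[i] + text[i + 2:]
    return text


if __name__ == "__main__":
    for seed in range(3):
        print(roughen("How do I reset my password?", seed=seed))
```

If the chosen pair happens to be a doubled letter, the swap leaves the text unchanged; pick another seed in that case.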

Test Conversation Flow

Don't test only single messages. Test multi-turn conversations as well:
  1. Customer asks a question
  2. Agent responds
  3. Customer asks a follow-up
  4. Check that the agent maintains context

Document Issues

When you find problems:
  1. Note what you asked
  2. Note what the agent did wrong
  3. Check execution details for root cause
  4. Update configuration to fix

Test After Changes

After updating agent configuration:
  1. Re-test affected scenarios
  2. Verify the fix works
  3. Check for regressions
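
The note-taking steps under Document Issues and the re-test steps above can be tracked together in one record per problem. This is a minimal sketch for personal bookkeeping; `IssueNote` and its field names are hypothetical, and the example values in the usage note are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class IssueNote:
    asked: str                         # what you asked
    wrong: str                         # what the agent did wrong
    root_cause: str = ""               # what the execution details showed
    config_change: str = ""            # what you updated to fix it
    retested: bool = False             # affected scenario re-run and fix verified
    regressions_checked: bool = False  # related scenarios still behave

    @property
    def closed(self) -> bool:
        # An issue counts as closed only after both follow-up checks are done.
        return self.retested and self.regressions_checked


def open_issues(notes):
    """Return the notes that still need a re-test or a regression check."""
    return [n for n in notes if not n.closed]
```

For instance, `IssueNote("What's your refund policy?", "quoted an outdated policy")` stays in `open_issues` until both `retested` and `regressions_checked` are set to `True`.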

Playground vs Production

Aspect     | Playground              | Production
Messages   | You send manually       | Real customers
Context    | Fresh each conversation | Full history
Actions    | May be simulated        | Real execution
Visibility | Only you                | Customers see

Playground is for validation. Always follow up with testing mode on real traffic before going live.

Next Steps

  • Replay Testing: Test against real conversations
  • Batch Testing: Automate your tests