What is Batch Testing?
Batch testing runs multiple test cases against an agent at once:
- Define inputs and expected outcomes
- Execute all tests automatically
- Review pass/fail results
- Track changes over time
Why Batch Testing?
| Benefit | How |
|---|---|
| Catch regressions | Run tests after every change |
| Pre-deployment validation | Verify before going live |
| Consistent quality | Same tests run the same way |
| Confidence | Know your agent works before customers do |
Creating Test Cases
1. Navigate to Batch Test
   Go to Test → Batch Test in your dashboard.
2. Create a Test Case
   Click Create Test Case.
3. Enter Test Input
   Write the customer message to test.
4. Define Expected Outcome
   Specify what should happen:
   - Response should contain certain content
   - Specific tool should be called
   - Should/shouldn’t escalate
   - Should assign certain category
5. Save
   Save the test case to your suite.
Test Case Components
Input
The message sent to the agent.
Expected Outcomes
What should happen:
| Expectation Type | Example |
|---|---|
| Response contains | “password reset” |
| Response doesn’t contain | competitor names |
| Tool called | Password Reset Tool |
| Escalation | Should not escalate |
| Category | Account Issues |
| Attribute | Priority = Medium |
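To make the expectation types above concrete, here is a minimal Python sketch of how a test case with several expectations could be modeled and checked. This guide does not show how the product stores test cases, so every name here (`AgentResult`, `BatchTestCase`, `check_expectations`, and all field names) is a hypothetical stand-in, and case-insensitive text matching is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentResult:
    """Hypothetical record of what the agent did for one input."""
    response: str
    tools_called: list[str] = field(default_factory=list)
    escalated: bool = False
    category: Optional[str] = None
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass
class BatchTestCase:
    """Hypothetical test case: one input plus any mix of expectations."""
    input: str
    response_contains: list[str] = field(default_factory=list)
    response_excludes: list[str] = field(default_factory=list)
    tool_called: Optional[str] = None
    should_escalate: Optional[bool] = None  # None means "not checked"
    category: Optional[str] = None
    attributes: dict[str, str] = field(default_factory=dict)


def check_expectations(case: BatchTestCase, result: AgentResult) -> list[str]:
    """Return the unmet expectations; an empty list means the test passes."""
    unmet = []
    text = result.response.lower()  # assumption: matching ignores case
    for phrase in case.response_contains:
        if phrase.lower() not in text:
            unmet.append(f"response should contain {phrase!r}")
    for phrase in case.response_excludes:
        if phrase.lower() in text:
            unmet.append(f"response should not contain {phrase!r}")
    if case.tool_called is not None and case.tool_called not in result.tools_called:
        unmet.append(f"tool {case.tool_called!r} should be called")
    if case.should_escalate is not None and case.should_escalate != result.escalated:
        unmet.append(f"escalation should be {case.should_escalate}")
    if case.category is not None and case.category != result.category:
        unmet.append(f"category should be {case.category!r}")
    for name, value in case.attributes.items():
        if result.attributes.get(name) != value:
            unmet.append(f"attribute {name} should be {value!r}")
    return unmet
```

Returning the list of unmet expectations, rather than a single boolean, keeps enough detail to explain a failure later.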
Multiple Expectations
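Combine expectations for thorough testing. For example, one test case can require that the response contains “password reset”, that the Password Reset Tool is called, and that the conversation does not escalate.
Running Tests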
1. Select Agent
   Choose which agent to test against.
2. Select Test Cases
   Choose which tests to run:
   - All tests
   - Specific subset
   - Tests by tag/category
3. Run Tests
   Click Run Tests to start execution.
4. Wait for Results
   Tests run sequentially, and progress is shown while they execute.
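The selection and execution steps above can be pictured with a short Python sketch. It is illustrative only: the dashboard performs these steps for you, and the functions `select_tests` and `run_batch`, the dictionary keys, and the idea of an agent as a plain callable are all assumptions made for this example.

```python
from typing import Callable, Iterable, Optional


def select_tests(
    tests: list[dict],
    tag: Optional[str] = None,
    names: Optional[Iterable[str]] = None,
) -> list[dict]:
    """Pick all tests, a named subset, or the tests carrying a given tag."""
    selected = tests
    if names is not None:
        wanted = set(names)
        selected = [t for t in selected if t["name"] in wanted]
    if tag is not None:
        selected = [t for t in selected if tag in t.get("tags", [])]
    return selected


def run_batch(agent: Callable[[str], str], tests: list[dict]) -> dict[str, bool]:
    """Run tests one after another, printing progress as each one finishes."""
    results = {}
    for i, test in enumerate(tests, start=1):
        reply = agent(test["input"])
        # Hypothetical single expectation: the reply must contain a phrase.
        results[test["name"]] = test["expect_contains"].lower() in reply.lower()
        print(f"[{i}/{len(tests)}] {test['name']}")
    return results
```

Calling `select_tests(tests)` with no filters returns every test, which corresponds to the "All tests" option.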
Reviewing Results
Results Overview
See aggregate results:
- Tests passed: 18/20
- Tests failed: 2/20
- Pass rate: 90%
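The arithmetic behind these figures is simple; the following Python sketch reproduces it. The function name `summarize` and the shape of its return value are hypothetical, chosen only for this example.

```python
def summarize(outcomes: list[bool]) -> dict:
    """Aggregate per-test outcomes (True = passed) into overview figures."""
    total = len(outcomes)
    passed = sum(outcomes)
    # Guard against an empty run so the division is always defined.
    pass_rate = round(100 * passed / total, 1) if total else 0.0
    return {
        "passed": passed,
        "failed": total - passed,
        "total": total,
        "pass_rate": pass_rate,
    }
```

With 18 passing and 2 failing outcomes, `summarize` reports 18 passed, 2 failed, and a pass rate of 90.0.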
Individual Test Results
For each test:
| Status | Meaning |
|---|---|
| ✅ Pass | All expectations met |
| ❌ Fail | One or more expectations not met |
| ⚠️ Error | Test couldn’t complete |
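The difference between a fail and an error is easy to blur, so here is a small Python sketch of one way the three statuses could be assigned. It assumes an error means the agent call itself raised an exception; `run_one` and its parameters are hypothetical names, not part of the product.

```python
from typing import Callable


def run_one(
    agent: Callable[[str], str],
    user_input: str,
    checks: list[Callable[[str], bool]],
) -> str:
    """Return "pass", "fail", or "error" for a single test."""
    try:
        reply = agent(user_input)
    except Exception:
        # The test could not complete, so no expectation was evaluated.
        return "error"
    # Pass only when every expectation is met; one miss is enough to fail.
    return "pass" if all(check(reply) for check in checks) else "fail"
```

Keeping errors separate from failures matters when reviewing results: a failure points at the agent's behavior, while an error usually points at the test run itself.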
Failure Details
Click a failed test to see what went wrong:
- Expected: Response contains “reset email”
- Actual: Response mentioned “contact support”
- Full response: [View complete agent response]
- Execution: [View step-by-step execution]
Test Suites
Organize tests into suites:
- By Feature
- By Type
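As a rough illustration of the two groupings, the Python sketch below sorts test cases into suites by a chosen field. The function `group_into_suites`, the `feature` and `type` keys, and the sample values are all hypothetical; the product's own suite structure may differ.

```python
from collections import defaultdict


def group_into_suites(tests: list[dict], key: str) -> dict[str, list[str]]:
    """Group test names into suites keyed by the given field."""
    suites = defaultdict(list)
    for test in tests:
        suites[test[key]].append(test["name"])
    return dict(suites)


# Illustrative data only; the feature and type labels are made up.
tests = [
    {"name": "reset password", "feature": "Account Issues", "type": "happy path"},
    {"name": "reset with unknown email", "feature": "Account Issues", "type": "edge case"},
    {"name": "refund request", "feature": "Billing", "type": "happy path"},
]

by_feature = group_into_suites(tests, "feature")
by_type = group_into_suites(tests, "type")
```

The same test can belong to a feature suite and a type suite at once, which lets you run, say, only the edge cases across every feature.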
Creating Tests from Playground
The easiest way to create tests:
- Test a scenario in playground
- Verify the response is correct
- Click Save as Test
- The input and expectations are pre-filled
Pre-Deployment Workflow
Before deploying any changes:
1. Make Changes
   Update agent configuration, knowledge, or guidelines.
2. Run Batch Tests
   Execute your full test suite.
3. Review Failures
   Investigate any failed tests.
4. Fix Issues
   Address problems found in testing.
5. Re-run Tests
   Verify fixes and check for regressions.
6. Deploy
   Once all tests pass, deploy with confidence.
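If you automate this workflow, the final step amounts to a gate that blocks deployment unless every test passed. The Python sketch below shows one possible shape for such a gate. It assumes results are available as a mapping from test name to status string, and that an empty run should block deployment; `deployment_gate` is a hypothetical helper, not a product feature.

```python
import sys


def deployment_gate(results: dict[str, str]) -> int:
    """Return an exit code: 0 if every test passed, 1 otherwise."""
    if not results:
        # Assumption: an empty run verifies nothing, so it should not deploy.
        print("No tests were run.", file=sys.stderr)
        return 1
    blocking = {name: status for name, status in results.items() if status != "pass"}
    for name, status in sorted(blocking.items()):
        print(f"{status.upper()}: {name}", file=sys.stderr)
    return 1 if blocking else 0
```

In a deployment script, passing the return value to `sys.exit` stops the pipeline on any failed or errored test.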
Best Practices
Build Tests as You Develop
Don’t wait until the end:
- Create tests as you build features
- Save good playground conversations as tests
- Add tests when bugs are found and fixed
Cover Critical Scenarios
Prioritize tests for:
- Most common customer questions
- Highest-impact scenarios
- Previous issues/bugs
- Guardrail triggers
Run Regularly
- Before every deployment
- After configuration changes
- On a schedule (daily/weekly)
Keep Tests Updated
As your product changes:
- Update expected outcomes
- Add new test cases
- Remove obsolete tests