Roll out the system in phases. Start with visibility and human review, then expand automation only after replay tests and live testing show reliable behavior.

Phase 1: Model the Process

Define:
  • Case states
  • Required fields
  • Reason categories
  • Evidence checklists
  • Deadline rules
  • Approval thresholds
  • Escalation reasons
  • Final outcomes
Then build the first versions of the dispute handling workflow and the case validation workflow.
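
As a rough illustration, case states and their allowed transitions can be written down as data before any workflow is built. The sketch below is plain Python, not a Duckie API. The state names and the transition table are hypothetical placeholders for whatever your process defines.

```python
from enum import Enum


class CaseState(Enum):
    # Hypothetical states; replace with the states your process defines.
    INTAKE = "intake"
    ENRICHMENT = "enrichment"
    EVIDENCE_DRAFTING = "evidence_drafting"
    HUMAN_REVIEW = "human_review"
    ESCALATED = "escalated"
    CLOSED = "closed"


# Assumed transition table: each state maps to the states it may move to.
ALLOWED_TRANSITIONS = {
    CaseState.INTAKE: {CaseState.ENRICHMENT, CaseState.EVIDENCE_DRAFTING, CaseState.ESCALATED},
    CaseState.ENRICHMENT: {CaseState.EVIDENCE_DRAFTING, CaseState.ESCALATED},
    CaseState.EVIDENCE_DRAFTING: {CaseState.HUMAN_REVIEW, CaseState.ESCALATED},
    CaseState.HUMAN_REVIEW: {CaseState.EVIDENCE_DRAFTING, CaseState.ESCALATED, CaseState.CLOSED},
    CaseState.ESCALATED: {CaseState.HUMAN_REVIEW, CaseState.CLOSED},
    CaseState.CLOSED: set(),  # final state, no further transitions
}


def can_transition(current: CaseState, target: CaseState) -> bool:
    """Return True if the process model allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Writing the model as a table makes gaps visible early, for example a state with no path to a final outcome.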

Phase 2: Connect Intake in Testing Mode

Create the intake deployment with the ticketing system, a custom webhook, or a processor event as its source. In Testing mode, verify that:
  • Case fields are extracted correctly
  • Missing fields route to enrichment or escalation
  • Deadlines are calculated correctly
  • Classification output is structured
  • Runs are easy to inspect
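
The first three checks above can be expressed as a small function that is easy to assert against in Testing mode. This is a minimal sketch in plain Python, not Duckie's intake logic. The field names, the route names, and the 14-day response window are assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical required fields and deadline rule; use your own definitions.
REQUIRED_FIELDS = ("case_id", "amount", "reason_category", "opened_on")
RESPONSE_WINDOW_DAYS = 14


def route_intake(case: dict) -> dict:
    """Route a case to enrichment if fields are missing, else compute its deadline."""
    missing = [name for name in REQUIRED_FIELDS if case.get(name) in (None, "")]
    if missing:
        return {"route": "enrichment", "missing": missing, "deadline": None}
    deadline = case["opened_on"] + timedelta(days=RESPONSE_WINDOW_DAYS)
    return {"route": "classification", "missing": [], "deadline": deadline}


complete = {
    "case_id": "C-1",
    "amount": 120.0,
    "reason_category": "fraud",
    "opened_on": date(2024, 1, 1),
}
result = route_intake(complete)
incomplete_result = route_intake({"case_id": "C-2"})
```

Here `result["deadline"]` is `date(2024, 1, 15)`, and `incomplete_result` routes to enrichment with the three absent fields listed.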

Phase 3: Add Specialist Agents

Add agents one at a time:
  1. Reason classification agent
  2. Account communication agent
  3. Evidence drafting agent
  4. Slack escalation agent
  5. Duckie Assistant feedback agent
Keep each agent narrow. Give each one a clear input and output contract, and validate its output in the workflow.
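
One way to picture an output contract is a validator that rejects anything outside an agreed shape. The sketch below assumes the reason classification agent returns JSON with a `reason` and a `confidence`. Both keys and the category list are hypothetical, not a documented Duckie schema.

```python
import json

# Hypothetical reason categories; use the ones defined in Phase 1.
ALLOWED_REASONS = {"fraud", "product_not_received", "duplicate_charge", "other"}


def validate_classification(raw: str) -> dict:
    """Parse agent output and raise ValueError if it breaks the assumed contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("output must be a JSON object")

    reason = data.get("reason")
    if not isinstance(reason, str) or reason not in ALLOWED_REASONS:
        raise ValueError(f"unknown reason category: {reason!r}")

    confidence = data.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        raise ValueError(f"confidence must be a number in [0, 1]: {confidence!r}")

    return {"reason": reason, "confidence": float(confidence)}
```

A failed validation is a natural point to route the case to escalation instead of passing malformed output to the next agent.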

Phase 4: Add Human Approval

Before any external submission or final case action:
  • Route the evidence packet to a reviewer
  • Capture approve, request changes, reject, and escalate decisions
  • Record reviewer identity and comments
  • Send changes back to the evidence workflow
Do not move final dispute submission to live automation until reviewers trust the evidence packet quality and the workflow handles edge cases reliably.
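
The review step above amounts to an audit record plus a routing decision. The following is a minimal sketch in plain Python. The record fields and the next-step names are assumptions, and only the "request changes goes back to the evidence workflow" route comes from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REQUEST_CHANGES = "request_changes"
    REJECT = "reject"
    ESCALATE = "escalate"


# Hypothetical next-step names for each decision.
NEXT_STEP = {
    Decision.APPROVE: "submission",
    Decision.REQUEST_CHANGES: "evidence_workflow",
    Decision.REJECT: "case_closure",
    Decision.ESCALATE: "escalation",
}


@dataclass(frozen=True)
class ReviewRecord:
    case_id: str
    reviewer_id: str
    decision: Decision
    comments: str
    decided_at: datetime


def record_review(case_id: str, reviewer_id: str, decision: Decision, comments: str = ""):
    """Build an immutable review record and return it with the next workflow step."""
    if not reviewer_id:
        raise ValueError("reviewer identity is required")
    if decision is Decision.REQUEST_CHANGES and not comments.strip():
        raise ValueError("a change request needs comments describing the changes")
    record = ReviewRecord(case_id, reviewer_id, decision, comments, datetime.now(timezone.utc))
    return record, NEXT_STEP[decision]
```

Requiring comments on a change request is a design choice in this sketch, so the evidence workflow always has something concrete to act on.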

Phase 5: Add Scheduled Reporting

Create a Scheduler deployment for the Duckie Assistant reporting agent. Start with:
  • Weekdays at 9 AM in the stakeholder timezone
  • Testing mode
  • Read-only tools
  • A single stakeholder channel
Switch to Live after the team confirms that the digest is accurate and useful.
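
In standard five-field cron syntax, "weekdays at 9 AM" is `0 9 * * 1-5`. Whether the Scheduler accepts cron expressions is not stated here, so treat that as an assumption. To sanity-check the schedule against the stakeholder timezone, the sketch below computes the next run from a timezone-aware timestamp in plain Python. The function name is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

RUN_AT = time(9, 0)  # 9 AM local time


def next_weekday_run(now: datetime) -> datetime:
    """Return the next weekday 9 AM strictly after `now`, in the timezone of `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    candidate = now.replace(hour=RUN_AT.hour, minute=RUN_AT.minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    # weekday() returns 5 for Saturday and 6 for Sunday; skip both.
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate


# Example with a fixed UTC-5 offset standing in for the stakeholder timezone.
stakeholder_tz = timezone(timedelta(hours=-5))
friday_after_run = datetime(2024, 1, 5, 10, 0, tzinfo=stakeholder_tz)
next_run = next_weekday_run(friday_after_run)  # Monday 2024-01-08 09:00
```

For a real region, pass a `now` built with `zoneinfo.ZoneInfo` (for example `ZoneInfo("America/New_York")`) so daylight saving changes are handled by the timezone database.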

Phase 6: Replay and Batch Test

Test historical cases before expanding automation:
  • Straightforward accepted cases
  • Straightforward counter cases
  • Missing evidence cases
  • Deadline-risk cases
  • High-value cases
  • Tool failure cases
  • Cases requiring human review
  • Cases with poor or incomplete source data
Use failures to update workflows, instructions, tools, and guardrails.
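
To make the failure review concrete, a replay run can be summarized per case category. The sketch below is a generic harness in plain Python, not the product's replay or batch testing feature. The case shape (`category`, `input`, `expected`) and the stub workflow are hypothetical.

```python
from collections import Counter
from typing import Callable


def replay(cases: list[dict], workflow: Callable[[dict], str]) -> dict:
    """Run each historical case through `workflow` and collect mismatches by category."""
    totals: Counter = Counter()
    failures: list[dict] = []
    for case in cases:
        totals[case["category"]] += 1
        try:
            actual = workflow(case["input"])
        except Exception as exc:  # a crash counts as a failure, not a harness error
            actual = f"error: {exc}"
        if actual != case["expected"]:
            failures.append({
                "category": case["category"],
                "expected": case["expected"],
                "actual": actual,
            })
    return {"totals": dict(totals), "failures": failures}


def stub_workflow(case_input: dict) -> str:
    """Stand-in for the real workflow: escalate when evidence is missing."""
    return "escalate" if not case_input["evidence"] else "counter"


history = [
    {"category": "straightforward_counter", "input": {"evidence": ["receipt"]}, "expected": "counter"},
    {"category": "missing_evidence", "input": {"evidence": []}, "expected": "escalate"},
    {"category": "tool_failure", "input": {}, "expected": "escalate"},
]
report = replay(history, stub_workflow)
```

In this example the tool failure case is the only entry in `report["failures"]`, because the stub raises on the empty input instead of escalating.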

Phase 7: Expand Live Scope

Move low-risk paths to Live first:
  • Intake and classification
  • Internal notes and status updates
  • Account information requests
  • Daily reporting
  • Reviewer-ready evidence drafts
Keep high-risk actions gated:
  • Final external submissions
  • High-value cases
  • Legal or regulatory cases
  • Cases with weak evidence
  • Tool failures that require manual interpretation
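
The split between Live and gated paths can be sketched as a single check that returns the reasons a case must stay gated. This is an illustration in plain Python, not a Duckie guardrail API. The action names, the case keys, and the high-value threshold are hypothetical.

```python
# Hypothetical threshold and action names; align them with your Phase 1 model.
HIGH_VALUE_THRESHOLD = 1000.00
LOW_RISK_ACTIONS = {
    "intake_classification",
    "internal_note",
    "status_update",
    "account_info_request",
    "daily_report",
    "evidence_draft",
}


def gate_reasons(action: str, case: dict) -> list[str]:
    """Return every reason this action on this case must stay behind human review."""
    reasons = []
    if action == "external_submission":
        reasons.append("final external submission")
    elif action not in LOW_RISK_ACTIONS:
        reasons.append("action is not on the low-risk list")
    if case.get("amount", 0) >= HIGH_VALUE_THRESHOLD:
        reasons.append("high-value case")
    if case.get("legal_or_regulatory"):
        reasons.append("legal or regulatory case")
    if case.get("evidence_strength") == "weak":
        reasons.append("weak evidence")
    if case.get("tool_failure"):
        reasons.append("tool failure needs manual interpretation")
    return reasons


def run_mode(action: str, case: dict) -> str:
    """Return "live" only when no gating reason applies."""
    return "gated" if gate_reasons(action, case) else "live"
```

Returning the reasons, not only a yes or no, gives reviewers the context for why a case landed in their queue.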

Launch Checklist

  • Intake deployment is in the correct mode
  • Main workflow and subflows are published
  • Agent tool access is scoped to the job
  • Slack escalation channel has reviewer owners
  • Daily reporting channel has stakeholder owners
  • Categories and attributes are configured
  • Runs are reviewed during the first live week
  • Rollback plan is documented

Deployment Modes

Control testing and live behavior.

Replay Testing

Test historical cases before launch.

Batch Testing

Evaluate many cases at once.

Security

Scope sensitive actions safely.