Guardrails are safety constraints that protect your customers and brand by defining what your agent cannot do and when it should escalate to humans.

What are Guardrails?

Guardrails are the safety net for your AI agents. They define boundaries and trigger actions, such as blocking a topic or escalating to a human, when a conversation approaches those boundaries.

Two Types

| Type | Purpose | When Triggered |
| --- | --- | --- |
| Escalation Rules | Hand off to humans | Customer is upset, legal question, VIP, complexity |
| Restrictions | Block specific actions/topics | Forbidden topics, unsafe promises, confidential info |

Escalation Rules

Define when the agent should stop and hand off to a human. Examples:

| Rule | Description |
| --- | --- |
| Angry Customer | Customer expresses frustration or mentions canceling |
| Legal Question | Questions about contracts, liability, or legal terms |
| VIP Customer | Conversations from enterprise or high-value accounts |
| Complex Issue | Agent cannot confidently resolve the issue after repeated attempts |
| Human Request | Customer explicitly asks to speak with a person |

When an escalation rule triggers:
  1. Agent stops processing
  2. Sends an appropriate message to the customer
  3. Creates an internal note with context
  4. Routes to human queue
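
The four steps above can be sketched as a single handler. This is a minimal illustration, not the platform's implementation: the `Conversation` class, its fields, the `escalate` function, the customer-facing wording, and the `human_support` queue name are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Conversation:
    """Hypothetical conversation state; not a real platform object."""
    customer_id: str
    status: str = "agent_active"
    outbound_messages: List[str] = field(default_factory=list)
    internal_notes: List[str] = field(default_factory=list)
    queue: Optional[str] = None


def escalate(conv: Conversation, rule_name: str, last_message: str) -> Conversation:
    """Apply the four escalation steps in order (illustrative only)."""
    # 1. Agent stops processing.
    conv.status = "escalated"
    # 2. Send an appropriate message to the customer (assumed wording).
    conv.outbound_messages.append(
        "I'm connecting you with a member of our team who can help further."
    )
    # 3. Create an internal note with context for the human agent.
    conv.internal_notes.append(
        f"Escalated by rule '{rule_name}'. Last customer message: {last_message!r}"
    )
    # 4. Route to the human queue (queue name is an assumption).
    conv.queue = "human_support"
    return conv


conv = escalate(Conversation(customer_id="c1"), "Human Request", "I want a person")
```

Recording the rule name in the internal note gives the human who picks up the conversation the reason for the handoff without rereading the whole thread.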

Restrictions

Define absolute limits on what the agent cannot do. Examples:

| Restriction | Response When Triggered |
| --- | --- |
| No refund promises | “I’d love to help! Let me connect you with our billing team who can review refund requests.” |
| No competitor discussion | “I’m focused on how we can help you. Let me tell you about what makes us great.” |
| No internal pricing | “For custom pricing, please contact our sales team at [email protected]” |
| No medical/legal advice | “I can’t provide medical/legal advice, but I can help you find appropriate resources.” |

Restrictions have the highest priority: they are checked before anything else.

Priority Order

Guardrails are evaluated in this order:
  1. Restrictions (highest priority): if triggered → Block/Redirect
  2. Escalation Rules: if triggered → Escalate
  3. Normal Processing: generate response
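
The evaluation order can be sketched as a function that stops at the first guardrail that triggers. This is a rough illustration under assumptions: the `evaluate` function, the `Check` type, the action labels, and the two keyword-based checks are hypothetical and stand in for whatever detection the product actually runs.

```python
from typing import Callable, List, Optional, Tuple

# A check takes the customer message and returns True when it triggers.
Check = Callable[[str], bool]


def evaluate(
    message: str,
    restrictions: List[Tuple[str, Check]],
    escalation_rules: List[Tuple[str, Check]],
) -> Tuple[str, Optional[str]]:
    """Return (action, name of the guardrail that triggered, or None)."""
    # 1. Restrictions are checked first: block or redirect.
    for name, triggered in restrictions:
        if triggered(message):
            return ("block", name)
    # 2. Escalation rules are checked next: hand off to a human.
    for name, triggered in escalation_rules:
        if triggered(message):
            return ("escalate", name)
    # 3. Nothing triggered: generate a normal response.
    return ("respond", None)


# Hypothetical checks, for illustration only.
restrictions = [("No refund promises", lambda m: "refund" in m.lower())]
escalation_rules = [("Human Request", lambda m: "speak to a person" in m.lower())]

result = evaluate(
    "Can I get a refund? Let me speak to a person.",
    restrictions,
    escalation_rules,
)
print(result)  # ('block', 'No refund promises')
```

The sample message matches both a restriction and an escalation rule; because restrictions are checked first, the restriction wins.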

Detection Methods

Guardrails can detect triggers using:
| Method | Best For | Example |
| --- | --- | --- |
| AI-based | Complex context, sentiment, intent | “Customer is frustrated about repeated issues” |
| Keyword | Specific words or phrases | “cancel”, “lawsuit”, “speak to manager” |
| Regex | Structured patterns | Email addresses, order numbers |
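
To make the keyword and regex methods concrete, here is a small sketch using Python's standard `re` module. The function names, the simplified email pattern, and the `ORD-` followed by six digits order-number format are assumptions for illustration; real order numbers and the product's own matching rules may differ. AI-based detection is omitted because it depends on a model rather than a fixed pattern.

```python
import re
from typing import Dict, List

# Keywords taken from the table above.
KEYWORDS = ["cancel", "lawsuit", "speak to manager"]

# Illustrative patterns only: a simplified email matcher and a
# hypothetical order-number format ("ORD-" plus six digits).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ORDER_RE = re.compile(r"\bORD-\d{6}\b")


def keyword_match(message: str) -> List[str]:
    """Return the keywords found in the message (case-insensitive substring match)."""
    lowered = message.lower()
    return [kw for kw in KEYWORDS if kw in lowered]


def regex_match(message: str) -> Dict[str, List[str]]:
    """Return structured values found in the message."""
    return {
        "emails": EMAIL_RE.findall(message),
        "order_numbers": ORDER_RE.findall(message),
    }


hits = keyword_match("I want to CANCEL and file a lawsuit")
found = regex_match("Order ORD-123456, reach me at jane@example.com")
```

Substring matching means "cancel" also matches "canceling" and "cancellation", which is usually desirable for this rule but can over-trigger on unrelated words, so keyword lists benefit from testing against real messages.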

Testing Guardrails

Each guardrail includes a built-in playground:
  1. Open the guardrail configuration
  2. Enter test messages
  3. See if the guardrail triggers
  4. Refine detection criteria
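
The same enter, observe, refine loop can be approximated outside the playground with a table of test messages and expected outcomes. This sketch does not use or represent the built-in playground; `run_cases`, `angry_customer`, and the keyword list are hypothetical names chosen for the example.

```python
from typing import Callable, List, Tuple


def run_cases(
    guardrail: Callable[[str], bool],
    cases: List[Tuple[str, bool]],
) -> List[str]:
    """Return the test messages whose result differs from the expectation."""
    return [msg for msg, expected in cases if guardrail(msg) != expected]


def angry_customer(message: str) -> bool:
    """Hypothetical keyword-based detector for the Angry Customer rule."""
    lowered = message.lower()
    return any(kw in lowered for kw in ("cancel", "frustrated"))


cases = [
    ("I'm so frustrated, this is the third time!", True),
    ("I want to cancel my subscription", True),
    ("How do I update my billing address?", False),
    ("This is unacceptable", True),  # not covered by the keyword list
]

failures = run_cases(angry_customer, cases)
print(failures)  # ['This is unacceptable']
```

A non-empty result points at the detection criteria to refine, for example by adding keywords or switching that rule to AI-based detection.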

Guardrails vs Guidelines

| Guardrails | Guidelines |
| --- | --- |
| Safety constraints | Communication style |
| Hard limits | Soft guidance |
| Trigger actions (escalate, block) | Shape responses |
| Evaluated per-message | Always applied |

Guardrail: “Never discuss competitor pricing” → Blocks the topic
Guideline: “Be professional and helpful” → Shapes all responses

Common Guardrail Patterns

Safety Net

Catch situations that need human judgment:
  • Angry or threatening language
  • Legal or compliance topics
  • Account security concerns
  • Complex edge cases

Brand Protection

Prevent brand damage:
  • Competitor discussions
  • Unauthorized promises
  • Confidential information
  • Off-brand responses

Compliance

Meet regulatory requirements:
  • Privacy requests (GDPR, CCPA)
  • Financial advice disclaimers
  • Medical/legal limitations
  • Age-restricted content
