Customer support agent
An agent that resolves support requests with real tools - order lookups, refunds, account changes - behind approval gates and a human handoff.
Once an LLM can take actions, a wrong answer becomes a wrong action. The architecture earns its keep through scoped tools, output validation before any side effect, and a clean escalation path to a human.
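To make "validation before any side effect" concrete, here is a minimal sketch of a refund tool call that is checked before anything runs. The `RefundRequest` type, the `ord_` prefix rule, the `MAX_REFUND_USD` limit and the `execute` callback are all illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass

# Hypothetical business limit, chosen only for illustration.
MAX_REFUND_USD = 500.0


@dataclass(frozen=True)
class RefundRequest:
    order_id: str
    amount_usd: float


def parse_refund_args(raw: dict) -> RefundRequest:
    """Validate model-proposed arguments; raise ValueError on anything unexpected."""
    order_id = raw.get("order_id")
    amount = raw.get("amount_usd")
    # Assumed ID convention: order IDs are strings starting with "ord_".
    if not isinstance(order_id, str) or not order_id.startswith("ord_"):
        raise ValueError("order_id must be a string like 'ord_123'")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ValueError("amount_usd must be a number")
    if not 0 < amount <= MAX_REFUND_USD:
        raise ValueError("amount_usd is outside the allowed range")
    return RefundRequest(order_id=order_id, amount_usd=float(amount))


def handle_tool_call(raw: dict, execute) -> dict:
    """Run the side effect only after validation passes; otherwise escalate."""
    try:
        request = parse_refund_args(raw)
    except ValueError as err:
        return {"status": "escalated", "reason": str(err)}
    execute(request)  # the only line with a side effect
    return {"status": "executed", "order_id": request.order_id}
```

The point of the shape is that the model's output never reaches `execute` directly: a malformed or out-of-range proposal becomes an escalation rather than a wrong action.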
Use this when
- Requests map to a few well-defined actions
- You can scope tools tightly and validate their I/O
- A human can take over when confidence is low
Reach for something else when
- Actions are irreversible and high-value with no review path
- You cannot validate tool inputs and outputs
- A strict, tiny latency budget rules out multi-step planning
What's in the box.
Channel adapter
Normalises chat, email or widget into a single request format.
Input guardrails
Catch prompt injection before the agent plans any action.
Agent / LLM
Plans steps and selects tools; fully traced per step.
Scoped tool layer
Least-privilege tools (lookup, refund) with strict input/output schemas.
Approval gate
Requires human or policy approval for high-risk actions (refunds, account changes).
Human escalation
Hands off to a human support agent when confidence is low or policy requires it.
Feedback + evals
Resolved tickets and CSAT feed the eval set.
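One way the scoped tool layer, approval gate and human escalation could fit together is a single routing decision made before any tool runs. This is a sketch under assumptions: the risk labels, the `change_account` tool name and the `CONFIDENCE_FLOOR` threshold are hypothetical, and how a confidence score is obtained is left open.

```python
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    ESCALATE = "escalate"


# Hypothetical registry: only tools listed here can ever be called.
# The risk labels are assumptions for illustration.
TOOL_RISK = {
    "lookup_order": "low",
    "issue_refund": "high",
    "change_account": "high",
}

# Assumed threshold below which a human takes over.
CONFIDENCE_FLOOR = 0.7


def route(tool: str, confidence: float) -> Decision:
    """Decide what happens to a tool call the agent has proposed."""
    if tool not in TOOL_RISK:
        # Least privilege: an unregistered tool is never executed.
        return Decision.ESCALATE
    if confidence < CONFIDENCE_FLOOR:
        return Decision.ESCALATE
    if TOOL_RISK[tool] == "high":
        return Decision.NEEDS_APPROVAL
    return Decision.EXECUTE
```

For example, `route("lookup_order", 0.9)` returns `Decision.EXECUTE`, while `route("issue_refund", 0.9)` returns `Decision.NEEDS_APPROVAL` and waits for a human or policy check.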
Where it breaks - and the fix.
What good looks like, measured.
- Action precision: Right tool with the right arguments.
- Approval / escalation rate: How often a human has to step in.
- Injection block rate: Hostile inputs caught before action.
- Resolution rate & CSAT: Did it actually solve the issue?
- Steps & cost per conversation: Loop and spend control.
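Several of these metrics can be computed straight from per-conversation trace records. The sketch below assumes each record carries `approval_required`, `escalated_to_human`, `steps` and `cost_usd` fields, as in the sample trace record in this document; the `summarize` function name is made up here. Action precision, injection block rate and CSAT need labeled data or survey results, so they are left out.

```python
def summarize(traces: list[dict]) -> dict:
    """Aggregate approval, escalation, step and cost metrics over trace records."""
    n = len(traces)
    if n == 0:
        return {}
    return {
        # Booleans sum as 0/1, so each rate is a simple mean.
        "approval_rate": sum(t["approval_required"] for t in traces) / n,
        "escalation_rate": sum(t["escalated_to_human"] for t in traces) / n,
        "avg_steps": sum(t["steps"] for t in traces) / n,
        "avg_cost_usd": sum(t["cost_usd"] for t in traces) / n,
    }
```

Tracking these per release makes it easier to see whether a prompt or tool change pushed more conversations to humans or made conversations loop longer.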
Don't build everything on day one.
Ship the MVP column to get to users; the production column is what makes it durable. Choose deliberately which gaps you're leaving.
| Aspect | MVP | Production-grade |
|---|---|---|
| Tools | Read-only lookups | Scoped actions with schemas + validation |
| High-risk actions | Blocked entirely | Explicit approval gate |
| Safety | Basic content filter | Injection tests + output validation |
| Handoff | None | Human escalation path |
| Limits | None | Step + cost caps, circuit breakers |
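
The step and cost caps in the last row can start as a small per-conversation budget object. This is a minimal sketch: the `ConversationBudget` and `BudgetExceeded` names and the default limits are assumptions, and a fuller circuit breaker would also track failures across conversations.

```python
class BudgetExceeded(Exception):
    """Raised when a conversation goes past its step or cost cap."""


class ConversationBudget:
    def __init__(self, max_steps: int = 8, max_cost_usd: float = 0.05):
        # Default limits are placeholders; tune them to your own traffic.
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record one agent step and its cost; raise once either cap is exceeded."""
        self.steps += 1
        self.cost_usd += cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded("step cap exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost cap exceeded")
```

The agent loop would call `charge` once per step and treat `BudgetExceeded` as a signal to stop planning and hand the conversation to a human.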
Instrument it in minutes.
A starting point you can paste into your tracing and eval setup - then adapt to your stack.
{
  "request_id": "req_77",
  "architecture": "customer-support-agent",
  "conversation_id": "c_55",
  "tools_called": [
    "lookup_order",
    "issue_refund"
  ],
  "approval_required": true,
  "approval_granted": false,
  "escalated_to_human": true,
  "steps": 4,
  "output_tokens": 180,
  "latency_ms": 2600,
  "cost_usd": 0.0091
}

{
  "input": "Can I get a refund after 45 days?",
  "expected_behavior": "Answer using refund policy only",
  "must_include": [
    "policy window",
    "support escalation"
  ],
  "must_not_include": [
    "invented exceptions"
  ],
  "risk_category": "customer_support_policy"
}

Ship-ready when…
- Outputs are validated before they trigger any action
- High-risk actions require an explicit approval step
- Prompt injection and jailbreaks are tested
- There is a clear human escalation path
- Per-conversation step and cost caps are enforced
- Real tickets feed the eval set
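To turn real tickets into checks, an eval case with `must_include` and `must_not_include` lists (the shape used in the sample eval case in this document) can be scored with a small function. The sketch below uses naive case-insensitive substring matching, which is an assumption and a weak stand-in: entries such as "invented exceptions" describe a concept, not a literal phrase, so a real setup would likely need a human or model grader for them. The `check_case` name is hypothetical.

```python
def check_case(case: dict, answer: str) -> dict:
    """Score one answer against an eval case using literal phrase matching."""
    text = answer.lower()
    # Required phrases that do not appear in the answer.
    missing = [p for p in case.get("must_include", []) if p.lower() not in text]
    # Forbidden phrases that do appear in the answer.
    forbidden = [p for p in case.get("must_not_include", []) if p.lower() in text]
    return {
        "passed": not missing and not forbidden,
        "missing": missing,
        "forbidden": forbidden,
    }
```

Returning the `missing` and `forbidden` lists alongside the pass flag keeps failures debuggable when the eval set grows.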