Reference architecture

Regulated LLM workflow

An LLM workflow for finance, health or the public sector, where audit trails, explainability, human sign-off and data retention are not optional.

01 Architecture

In regulated settings the question is not only “is it correct?” but “can you prove what happened and why?”. This blueprint makes governance a designed-in path: policy gates, full audit logging, human review and version pinning.

02 When to use it

Use this when

  • Decisions are high-stakes and must be auditable
  • A regulator or auditor may review them later
  • Human sign-off is required by policy

Reach for something else when

  • Tasks are low-risk and high-volume, so the governance overhead is not justified
  • You have no human-review capacity
  • Reproducibility and audit are genuinely not required

03 Components

What's in the box.

Policy / compliance gate

Blocks out-of-policy requests and enforces jurisdiction rules.

Version-pinned model

Locks model and prompt versions so decisions are reproducible.

Full audit logging

Captures inputs, retrieved context, outputs and decisions immutably.

Human review queue

Required sign-off for high-risk decisions before they take effect.

Explainability record

Stores the rationale and sources behind each decision.

Retention store

Enforces data-retention and deletion policy on all records.
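
One way the components above could fit together is sketched below. It is a minimal illustration, not a definitive implementation: the `Workflow` and `Decision` classes, the jurisdiction allow-list and the in-memory lists standing in for the audit store and review queue are all assumptions made for the example, and the retention store is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical pinned versions and policy; the values are illustrative only.
PINNED_MODEL = "sonnet-4.6"
PINNED_PROMPT = "v12-locked"
ALLOWED_JURISDICTIONS = {"EU", "UK"}  # assumption: policy is a simple allow-list


@dataclass
class Decision:
    request_id: str
    output: str
    sources: list
    rationale: str
    status: str = "pending_review"  # nothing takes effect until approved
    reviewer_id: Optional[str] = None


@dataclass
class Workflow:
    # Assumed model interface: text -> (output, sources, rationale)
    model: Callable[[str], tuple]
    audit_log: list = field(default_factory=list)      # stand-in for an immutable store
    review_queue: list = field(default_factory=list)   # stand-in for a real queue

    def _audit(self, request_id, step, **details):
        self.audit_log.append({"request_id": request_id, "step": step, **details})

    def handle(self, request_id, text, jurisdiction):
        # Policy / compliance gate: block out-of-policy requests before any model call.
        if jurisdiction not in ALLOWED_JURISDICTIONS:
            self._audit(request_id, "policy_check", result="blocked")
            return None
        self._audit(request_id, "policy_check", result="passed")

        # Version-pinned model call, logged with the exact versions used.
        output, sources, rationale = self.model(text)
        self._audit(request_id, "model_call", model=PINNED_MODEL,
                    prompt_version=PINNED_PROMPT, input=text, output=output)

        # Explainability record, then hand off to the human review queue.
        self._audit(request_id, "explainability", sources=sources, rationale=rationale)
        decision = Decision(request_id, output, sources, rationale)
        self.review_queue.append(decision)
        return decision

    def approve(self, decision, reviewer_id):
        # Human sign-off is the only path to an approved status.
        decision.status = "approved"
        decision.reviewer_id = reviewer_id
        self._audit(decision.request_id, "human_review",
                    result="approved", reviewer_id=reviewer_id)
```

The design choice worth noting is that `handle` never returns an approved decision; approval is only reachable through `approve`, which records who signed off.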

04 Failure modes

Where it breaks - and the fix.

  • Missing or incomplete audit trail
    Fix: Log immutably at every step; treat the audit path as a hard dependency.
  • Unexplainable decision
    Fix: Record sources and rationale; pin versions so results are reproducible.
  • Unreviewed high-risk output
    Fix: Hard approval gate; nothing high-risk takes effect without sign-off.
  • Data-retention breach
    Fix: Policy-driven retention and deletion; access-controlled stores.
  • Silent model drift between versions
    Fix: Pin versions; re-validate on any change via the eval set and change management.
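
"Log immutably" can be approximated in several ways. The sketch below shows one of them, a hash-chained log in which each entry commits to the one before it, so any later edit is detectable. This is an assumption about how the fix might be implemented, not the blueprint's prescribed mechanism; the function names are hypothetical, and a hash chain makes a log tamper-evident rather than tamper-proof, so it would still sit on an access-controlled store.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry's predecessor


def _digest(prev_hash, record):
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"record": record, "prev_hash": prev_hash,
             "hash": _digest(prev_hash, record)}
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Return True only if every entry still matches its recorded hash and link."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Running `verify_chain` on a schedule, and failing the workflow when an append cannot be written, is one way to treat the audit path as a hard dependency.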

05 Metrics to monitor

What good looks like, measured.

  • Audit completeness
    Share of workflow steps with an immutable audit record; the target is all of them.
  • Human review SLA
    Time from request to sign-off.
  • Explainability coverage
    Share of decisions with a recorded rationale.
  • Version drift
    Count of unintended model or prompt changes.
  • Retention compliance
    Share of records kept and deleted per policy.
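
Several of the metrics above can be computed directly from trace records. The sketch below assumes each trace is a dict; the fields `rationale`, `requested_at` and `signed_off_at` (ISO 8601 timestamps) are hypothetical additions made for the example, and retention compliance is omitted because it depends on the retention store rather than on traces.

```python
from datetime import datetime
from statistics import median


def compute_metrics(traces, pinned_model, pinned_prompt):
    """Summarise governance metrics over a list of trace dicts."""
    total = len(traces)
    if total == 0:
        return {}

    # Audit completeness: traces that have an audit id and a recorded decision.
    audited = sum(1 for t in traces
                  if t.get("audit_id") and t.get("decision_recorded"))

    # Explainability coverage: traces with a non-empty rationale.
    explained = sum(1 for t in traces if t.get("rationale"))

    # Version drift: traces whose model or prompt differs from the pinned pair.
    drifted = sum(1 for t in traces
                  if t.get("model") != pinned_model
                  or t.get("prompt_version") != pinned_prompt)

    # Human review SLA: seconds from request to sign-off, for reviewed traces only.
    waits = [
        (datetime.fromisoformat(t["signed_off_at"])
         - datetime.fromisoformat(t["requested_at"])).total_seconds()
        for t in traces
        if t.get("signed_off_at") and t.get("requested_at")
    ]

    return {
        "audit_completeness": audited / total,
        "explainability_coverage": explained / total,
        "version_drift_count": drifted,
        "review_sla_median_seconds": median(waits) if waits else None,
    }
```

A median is used for the review SLA here only as an example; a team with a contractual SLA would more likely track a high percentile or the count of breaches.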

06 MVP vs production-grade

Don't build everything on day one.

Ship the MVP column to get to users; the production column is what makes it durable. Choose deliberately which gaps you're leaving.

Aspect          MVP           Production-grade
Audit           Logs          Immutable, access-controlled audit store
Review          Ad-hoc        Mandatory sign-off queue
Versioning      Latest model  Pinned model + prompt versions
Explainability  None          Rationale + sources per decision
Retention       Default       Policy-driven retention & deletion

07 Copy-paste schemas

Instrument it in minutes.

A starting point you can paste into your tracing and eval setup, then adapt to your stack.

Example trace schema
{
  "request_id": "req_9931",
  "architecture": "regulated-llm-workflow",
  "policy_check": "passed",
  "model": "sonnet-4.6",
  "prompt_version": "v12-locked",
  "human_review": "approved",
  "reviewer_id": "r_07",
  "decision_recorded": true,
  "retention_class": "7y",
  "audit_id": "aud_9931"
}
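
A trace in the shape above can be checked before it is written to the audit store. The following is a minimal sketch: the required-field list mirrors the example trace, while the rule that an approved trace must name a reviewer is an assumption added for illustration.

```python
# Field names and types taken from the example trace; reviewer_id is
# handled by a separate rule because it only applies to reviewed traces.
REQUIRED_FIELDS = {
    "request_id": str,
    "architecture": str,
    "policy_check": str,
    "model": str,
    "prompt_version": str,
    "human_review": str,
    "decision_recorded": bool,
    "retention_class": str,
    "audit_id": str,
}


def validate_trace(trace):
    """Return a list of problems; an empty list means the trace looks complete."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in trace:
            problems.append(f"missing field: {name}")
        elif not isinstance(trace[name], expected_type):
            problems.append(f"wrong type for {name}: expected {expected_type.__name__}")

    # Assumed rule: an approval without a named reviewer is not a valid sign-off.
    if trace.get("human_review") == "approved" and not trace.get("reviewer_id"):
        problems.append("approved trace has no reviewer_id")
    return problems
```

Rejecting a write when `validate_trace` returns any problem keeps incomplete records out of the audit trail instead of discovering the gap during a review.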

Example eval dataset row
{
  "input": "Summarize this loan application for a decision",
  "expected_behavior": "Summarize and cite sources; flag for human decision; never decide",
  "must_include": [
    "source citation",
    "human review required"
  ],
  "must_not_include": [
    "final approval decision",
    "unsupported claims"
  ],
  "risk_category": "regulated_decision"
}
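
An eval row like this one can drive an automated check, with one caveat: entries such as "source citation" describe behaviour rather than literal strings. The sketch below therefore maps each label to a hypothetical detector function. The `[source: ...]` citation format and the decision keywords are assumptions made for the example, and labels without a detector (such as "unsupported claims") are reported as unchecked so they can go to manual or model-graded review.

```python
import re

# Hypothetical detectors, keyed by the labels used in the eval row.
DETECTORS = {
    "source citation":
        lambda out: re.search(r"\[source:[^\]]+\]", out) is not None,
    "human review required":
        lambda out: "human review required" in out.lower(),
    "final approval decision":
        lambda out: re.search(r"\b(approved|denied|rejected)\b", out.lower()) is not None,
}


def score_row(row, output):
    """Check a model output against one eval row.

    Returns failures (rules that were checked and violated) and
    unchecked (labels with no detector, left for other review).
    """
    failures, unchecked = [], []
    for label in row.get("must_include", []):
        detector = DETECTORS.get(label)
        if detector is None:
            unchecked.append(label)
        elif not detector(output):
            failures.append(f"missing: {label}")
    for label in row.get("must_not_include", []):
        detector = DETECTORS.get(label)
        if detector is None:
            unchecked.append(label)
        elif detector(output):
            failures.append(f"forbidden: {label}")
    return {"failures": failures, "unchecked": unchecked}
```

Keyword detectors like these are crude and will miss paraphrases; they are useful as a cheap first pass in change management, not as the only evidence that an output is compliant.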

08 Checklist

Ship-ready when…

  • Inputs, outputs and decisions are captured in an immutable audit trail
  • High-risk decisions require human sign-off
  • Every decision is explainable and reproducible
  • Model and prompt versions are pinned
  • Data retention and deletion policy is enforced
  • Changes follow a documented change-management process