§24

Observability — what to log, what to dashboard

Every agent — Low, Medium, or High risk — logs the following per execution. No exceptions.

Per-execution log fields:

  • Timestamp (start / end)
  • Initiating user ID (or system trigger)
  • Agent ID + version
  • Prompt / input (with PII redaction where applicable)
  • Output / response
  • Tool calls made, with parameters
  • Model + version used
  • Token counts (input + output) and computed cost
  • Policy checks (which fired, which passed, which blocked)
  • HITL events (approved / rejected / overridden, by whom)
  • Latency at each step
  • Outcome (success / failure / human-overridden)
  • Error state, if any
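
One way to keep these fields consistent across agents is a single record type that every agent fills in and emits as one JSON line per execution. The sketch below is illustrative only: the class name ExecutionLog, the field names, the USD currency, and the per-1,000-token prices passed to compute_cost are all assumptions, not part of any existing schema or price list.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional


def compute_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Cost from token counts; the prices are caller-supplied (hypothetical)."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)


@dataclass
class ExecutionLog:
    # Hypothetical field names mirroring the per-execution list above.
    started_at: str                # ISO 8601 start timestamp
    ended_at: str                  # ISO 8601 end timestamp
    initiator: str                 # user ID or system trigger
    agent_id: str
    agent_version: str
    prompt_redacted: str           # input after PII redaction
    output: str
    model: str
    model_version: str
    input_tokens: int
    output_tokens: int
    cost_usd: float                # assumed currency
    outcome: str                   # "success" | "failure" | "human_overridden"
    tool_calls: list = field(default_factory=list)       # [{"name", "params"}]
    policy_checks: list = field(default_factory=list)    # [{"check", "result"}]
    hitl_events: list = field(default_factory=list)      # [{"action", "by"}]
    step_latency_ms: dict = field(default_factory=dict)  # step name -> ms
    error: Optional[str] = None

    def to_json_line(self) -> str:
        """Serialize the record as one JSON line for a log pipeline."""
        return json.dumps(asdict(self), sort_keys=True)
```

Making most fields required (no defaults) means an agent that forgets one fails at construction time instead of silently writing an incomplete record.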

Dashboards we keep live:

  • Token usage by agent and by department
  • Cost by agent and by department
  • Failure rate per agent
  • Latency p50/p95 per agent
  • Adoption (unique users per agent per week)
  • HITL escalation rate per agent
  • Distribution-shift / drift indicators per agent
  • Incident count and severity per agent
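
Several of these dashboards reduce to simple aggregations over the per-execution logs. As a rough sketch, the function below derives failure rate, HITL escalation rate, and latency p50/p95 per agent from a list of log records. The record keys (agent_id, outcome, latency_ms, hitl_events) are assumed names, latency_ms is taken to be the total latency of one execution, and the nearest-rank percentile is one reasonable definition among several; a real dashboard would normally compute these in the metrics backend.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def per_agent_metrics(records):
    """Aggregate log records (dicts with assumed keys) into per-agent metrics."""
    by_agent = defaultdict(list)
    for record in records:
        by_agent[record["agent_id"]].append(record)

    metrics = {}
    for agent_id, rows in by_agent.items():
        latencies = [r["latency_ms"] for r in rows]
        failures = sum(1 for r in rows if r["outcome"] == "failure")
        # An execution counts as escalated if it has any HITL event.
        escalations = sum(1 for r in rows if r["hitl_events"])
        metrics[agent_id] = {
            "executions": len(rows),
            "failure_rate": failures / len(rows),
            "hitl_escalation_rate": escalations / len(rows),
            "latency_p50_ms": percentile(latencies, 50),
            "latency_p95_ms": percentile(latencies, 95),
        }
    return metrics
```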

Audit log retention: at least 6 months for High-risk agents (the baseline EU AI Act Article 19 sets for automatically generated logs); longer where sector regulation requires.
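
The retention rule can be enforced with a small guard in whatever job purges old logs. The sketch below is an assumption-laden illustration: the tier key "high", the sector_months parameter, and the function names are hypothetical. Only the 6-month High-risk floor comes from the text above, so other tiers are rejected here instead of guessed.

```python
import calendar
from datetime import date

# Only the High-risk floor is stated in this section; other tiers are not defined here.
MIN_RETENTION_MONTHS = {"high": 6}


def add_months(d: date, months: int) -> date:
    """Add calendar months, clipping the day to the target month's length."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_deletion_date(logged_on: date, risk_tier: str,
                           sector_months: int = 0) -> date:
    """First date a log may be purged: the longer of the tier floor and any
    sector-specific requirement (sector_months is a hypothetical input)."""
    floor = MIN_RETENTION_MONTHS.get(risk_tier)
    if floor is None:
        raise ValueError(f"no retention floor defined for tier {risk_tier!r}")
    return add_months(logged_on, max(floor, sector_months))
```

For example, under these assumptions a High-risk log written on 2024-01-15 could not be purged before 2024-07-15, and a 24-month sector requirement would push that to 2026-01-15.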