
§24

Observability — what to log, what to dashboard

Every agent — Low, Medium, or High risk — logs the following per execution. No exceptions.

Per-execution log fields:

Timestamp (start / end)
Initiating user ID (or system trigger)
Agent ID + version
Prompt / input (with PII redaction where applicable)
Output / response
Tool calls made, with parameters
Model + version used
Token counts (input + output) and computed cost
Policy checks (which fired, which passed, which blocked)
HITL events (approved / rejected / overridden, by whom)
Latency at each step
Outcome (success / failure / human-overridden)
Error state, if any
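
To make the field list concrete, here is a minimal sketch of one execution record as a Python dataclass. The class name, field names, and the JSON serialisation are illustrative assumptions, not a prescribed schema; the cost unit and the shape of the nested entries (tool calls, policy checks, HITL events) are likewise placeholders.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ExecutionLog:
    """One record per agent execution. All names here are hypothetical."""
    started_at: datetime
    ended_at: datetime
    initiator: str            # user ID, or the name of the system trigger
    agent_id: str
    agent_version: str
    prompt_redacted: str      # input after PII redaction, where applicable
    output: str
    model: str
    model_version: str
    input_tokens: int
    output_tokens: int
    cost: float               # computed cost; currency unit is deployment-specific
    outcome: str              # "success" | "failure" | "human-overridden"
    tool_calls: list = field(default_factory=list)       # [{"name": ..., "params": ...}]
    policy_checks: list = field(default_factory=list)    # [{"check": ..., "result": ...}]
    hitl_events: list = field(default_factory=list)      # [{"action": ..., "by": ...}]
    step_latency_ms: dict = field(default_factory=dict)  # step name -> milliseconds
    error: Optional[str] = None

    def to_json(self) -> str:
        """Serialise to one JSON line, with timestamps in ISO 8601."""
        record = asdict(self)
        record["started_at"] = self.started_at.isoformat()
        record["ended_at"] = self.ended_at.isoformat()
        return json.dumps(record, sort_keys=True)


# Example record with made-up values.
example = ExecutionLog(
    started_at=datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    ended_at=datetime(2025, 1, 1, 12, 0, 2, tzinfo=timezone.utc),
    initiator="user-123",
    agent_id="invoice-triage",
    agent_version="1.4.0",
    prompt_redacted="Classify invoice from [REDACTED]",
    output="category=utilities",
    model="example-model",
    model_version="2025-01",
    input_tokens=120,
    output_tokens=8,
    cost=0.0013,
    outcome="success",
    policy_checks=[{"check": "pii-redaction", "result": "passed"}],
    step_latency_ms={"retrieve": 400, "generate": 1600},
)
line = example.to_json()
```

Keeping the required fields as non-default constructor arguments means a record cannot be built without them, which is one way to enforce the "no exceptions" rule at the point of logging.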

Dashboards we keep live:

Token usage by agent and by department
Cost by agent and by department
Failure rate per agent
Latency p50/p95 per agent
Adoption (unique users per agent per week)
HITL escalation rate per agent
Distribution-shift / drift indicators per agent
Incident count and severity per agent
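
Several of these dashboards can be derived directly from the per-execution records. The sketch below computes failure rate, HITL escalation rate, and latency p50/p95 per agent. It assumes each record is a dict with hypothetical keys `agent_id`, `outcome`, `latency_ms`, and `hitl_events`, and it uses the nearest-rank percentile definition, which is one of several common choices.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the data is less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def per_agent_metrics(records):
    """Aggregate execution records (dicts) into per-agent dashboard metrics."""
    by_agent = defaultdict(list)
    for r in records:
        by_agent[r["agent_id"]].append(r)

    metrics = {}
    for agent_id, rows in by_agent.items():
        latencies = [r["latency_ms"] for r in rows]
        failures = sum(1 for r in rows if r["outcome"] == "failure")
        escalated = sum(1 for r in rows if r.get("hitl_events"))
        metrics[agent_id] = {
            "executions": len(rows),
            "failure_rate": failures / len(rows),
            "hitl_escalation_rate": escalated / len(rows),
            "latency_p50_ms": percentile(latencies, 50),
            "latency_p95_ms": percentile(latencies, 95),
        }
    return metrics


# Made-up sample data for one agent.
sample = [
    {"agent_id": "a", "outcome": "success", "latency_ms": 100, "hitl_events": []},
    {"agent_id": "a", "outcome": "failure", "latency_ms": 200, "hitl_events": []},
    {"agent_id": "a", "outcome": "success", "latency_ms": 300,
     "hitl_events": [{"action": "approved", "by": "user-9"}]},
    {"agent_id": "a", "outcome": "success", "latency_ms": 400, "hitl_events": []},
]
result = per_agent_metrics(sample)
```

In practice a metrics backend would usually compute these over a time window; the point here is only that every dashboard value traces back to fields in the per-execution log.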

Audit log retention: at least 6 months for High-risk agents (EU AI Act Article 19 baseline); longer where sector regulation requires.
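
The retention rule can be expressed as a floor that sector requirements may only raise. The following is a sketch under stated assumptions: the function names and the `sector_minimum_months` parameter are hypothetical, it covers High-risk agents only (the rule above does not state a period for other tiers), and the end-of-month clamping (for example 31 August plus six months giving 28 February) is a calendar convention, not a legal interpretation.

```python
import calendar
from datetime import date
from typing import Optional

HIGH_RISK_MINIMUM_MONTHS = 6  # baseline stated in the retention rule


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_purge_date(logged_on: date,
                        sector_minimum_months: Optional[int] = None) -> date:
    """Earliest date a High-risk agent's log entry may be deleted.

    A sector requirement can lengthen retention but never shorten it
    below the six-month baseline.
    """
    months = max(HIGH_RISK_MINIMUM_MONTHS, sector_minimum_months or 0)
    return add_months(logged_on, months)


baseline = earliest_purge_date(date(2025, 1, 15))
with_sector_rule = earliest_purge_date(date(2025, 1, 15), sector_minimum_months=24)
```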