Template 04 — Responsible-AI Checklist

ID: 04-responsible-ai-checklist
Version: 1
Last revised: 2026-05-14
Owner: AI CoE Lead (drives) · Legal (signs) · Department Champion (informs)

Purpose

The 10-item checklist that every Medium- and High-risk agent must pass before Pilot. This is where ethical principles become operational checks. Low-tier agents may skip it at the CoE Lead's discretion.

This is paired with the Security review (template 05) — the two together produce the M10 sign-offs in the In Action roadmap.

  • When you use it: At M10 in the roadmap, after the Agent Card is written and before Step 11 (Pilot). Re-run at every material scope change.
  • Who fills it: CoE Lead (asks the questions, records answers). Builder provides evidence per item. Legal signs.
  • Time to complete: 60–120 minutes total — including evidence-gathering.
  • Output: Signed checklist attached to the registry entry.

Worked example (AP Accountant invoice reconciliation)

"Reviewed 2026-05-02 by Morteza Moradi (CoE Lead) + John Smith (General Counsel). Agent is Medium tier."

Agent: finance-invoice-recon v1.0

| # | Item | Status | Evidence | Notes |
|---|---|---|---|---|
| 1 | Fairness / bias | ✅ N/A | n/a | Not a decision about people: invoice matching, no demographic dimension. No bias review required. Will reassess if scope ever changes to vendor approval decisions. |
| 2 | Privacy / PII | ✅ Pass | Agent Card §5 + DLP rules in §6 | Mild PII (vendor contact data). Classification: Internal. Retention: 6 months active in LangSmith, then S3 Glacier 7 years (SOX adjacency). Masked in logs: vendor account numbers stripped before log emit. Deletion path: vendor request → NetSuite data-deletion workflow. |
| 3 | Data residency | ✅ Pass | Agent Card §6 + LLM provider config | Anthropic Claude API: EU endpoint for EU-vendor invoices, US endpoint for others. Routing logic embedded in orchestrator. Verified in test environment 2026-04-30. |
| 4 | Reliability / safety | ✅ Pass | Agent Card §10 + threat model (template 05) | Worst case: wrong match proposed → human catches in HITL review → cost is one re-review. Acceptable. Failure modes documented. Kill switch tested (60-sec drill 2026-04-25). |
| 5 | Transparency / disclosure | ✅ Pass | Email draft includes footer | All AI-drafted reconciliation emails include footer: "Auto-drafted by AI, reviewed by [accountant name]." No external disclosure needed; internal use only. EU AI Act Article 50 not triggered (no customer-facing synthetic content). |
| 6 | Explainability | ✅ Pass | Reasoning logged per execution | Every match proposal includes confidence score + brief rationale ("matched vendor X, line items 3/4 match within tolerance, PO #12345"). Accountant can request fuller reasoning trace from LangSmith. Not affecting individuals' rights, so no formal explanation mechanism required. |
| 7 | Inclusiveness | ✅ N/A | n/a | Internal-only agent serving AP function. No demographic or accessibility dimension at this stage. Re-review if user base expands beyond AP team. |
| 8 | Accountability (named owner, override path) | ✅ Pass | Agent Card §1 + §9 | Owner: Mike Chen (AP Manager). Override: every NetSuite action requires accountant click. Kill switch: LaunchDarkly flag + Entra ID disable, tested. |
| 9 | Audit log retention (≥ 6 months for High; longer per regulation) | ✅ Pass | Agent Card §11 | 6 months active + 7 years archived (SOX precaution for AP records). LangSmith retention configured. S3 Glacier archive policy in place. |
| 10 | Synthetic-content labeling (EU AI Act Art. 50 if applicable) | ✅ Pass | Email draft footer | AI-generated content (draft email) is labeled. No customer-facing synthetic content. |

Sign-off

| Role | Name | Date | Signature |
|---|---|---|---|
| AI CoE Lead | Morteza Moradi | 2026-05-02 | (signed) |
| General Counsel | John Smith | 2026-05-02 | (signed) |
| Department Champion (informed) | Mike Chen | 2026-05-02 | (acknowledged) |

Conditional approvals / open items

None. All 10 items pass or are documented as N/A with reasons. Cleared to proceed to template 05 (Threat Model) → Step 11 (Pilot).


Blank template (copy below for your agent)

# Responsible-AI Checklist — [Agent Name]

**Agent ID:** [agent-dept-slug]
**Agent version:** [X.X]
**Tier:** [Medium / High]
**Review date:** [YYYY-MM-DD]

| # | Item | Status | Evidence | Notes |
|---|---|---|---|---|
| 1 | Fairness / bias — if the agent makes decisions about people, has it been tested across demographic groups? Known limitations documented? | [✅ Pass / ⚠️ Conditional / ❌ Fail / N/A] | [Link to evidence] | [Reasoning] |
| 2 | Privacy / PII — data classification done; retention rules defined; masking applied where required; user deletion path available | [✅ / ⚠️ / ❌] | | |
| 3 | Data residency — LLM calls hit providers in approved jurisdictions? | [✅ / ⚠️ / ❌ / N/A] | | |
| 4 | Reliability / safety — failure modes documented; worst-case action assessed and acceptable? | [✅ / ⚠️ / ❌] | | |
| 5 | Transparency / disclosure — when the agent talks to external parties, is its AI nature disclosed? When internal, is it labeled as AI-generated? | [✅ / ⚠️ / ❌ / N/A] | | |
| 6 | Explainability — for agents affecting individuals, can we explain why a given output was produced? | [✅ / ⚠️ / ❌ / N/A] | | |
| 7 | Inclusiveness — does the agent serve all user groups, or does it exclude / disadvantage some by design or data? | [✅ / ⚠️ / ❌ / N/A] | | |
| 8 | Accountability — single named human owner; clear path to override or stop the agent | [✅ / ⚠️ / ❌] | | |
| 9 | Audit log retention — period defined per the company's data policy and regulation (EU AI Act: high-risk logs ≥ 6 months) | [✅ / ⚠️ / ❌] | | |
| 10 | Synthetic-content labeling — if synthetic content generated (image, audio, video, text that imitates), labeled per EU AI Act Article 50 | [✅ / ⚠️ / ❌ / N/A] | | |

## Sign-off

| Role | Name | Date | Signature |
|---|---|---|---|
| AI CoE Lead | | | |
| General Counsel (or delegate) | | | |
| Department Champion (informed) | | | |

## Conditional approvals / open items

[List any ⚠️ items + the conditions / open items + due date to resolve]

Per-item guidance

1. Fairness / bias

When it applies: Agent makes decisions about people, such as hiring, promotion, lending, pricing, eligibility for services, or performance review. See the EU AI Act Annex III categories and NIST AI RMF MAP 5.1.

What to check:

  • Has the agent been tested across demographic groups (race, gender, age, etc. as relevant)?
  • Are approval / rejection / outcome rates documented across those groups?
  • Have known limitations been written down?
  • Is there a process for affected individuals to challenge an output?

Tools: IBM AI Fairness 360, Microsoft Fairlearn, Aequitas, Google What-If Tool.

N/A criteria: Agent makes no decisions about people. Document the reasoning, because an auditor will check that this item isn't being skipped.
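The outcome-rate check above can be made concrete in a few lines of plain Python. This is a minimal sketch only: the record layout (group label plus an approved flag), the function names, and the sample data are assumptions made for illustration, and a real review would normally rely on one of the tools listed above.

```python
from collections import defaultdict

def approval_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def rate_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means identical rates."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0

# Hypothetical sample data, for illustration only.
sample = [("A", True), ("A", True), ("A", False), ("A", True),
          ("B", True), ("B", False), ("B", False), ("B", True)]
rates = approval_rates(sample)
print(rates)              # per-group approval rates
print(rate_ratio(rates))  # ratio of lowest to highest rate
```

What ratio counts as acceptable is a policy decision for the reviewers; the sketch only produces the numbers to document.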

2. Privacy / PII

When it applies: Any agent that processes personal data — including mild PII like internal employee names.

What to check:

  • Data classification per source documented (Agent Card §5)
  • Retention period defined (typically ≥ 6 months for High tier; longer if sector regulation requires)
  • Masking / redaction applied to logs where appropriate
  • User deletion path exists (GDPR Article 17 right to erasure)

Tools: Microsoft Purview, BigID, OneTrust, TrustArc.

Pass criteria: Each bullet above has a documented answer.
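As one way to evidence the masking bullet, the sketch below strips account numbers from log messages before they are emitted, using the standard library's `logging.Filter`. The account-number pattern `ACCT-` followed by digits, the logger name, and the helper names are hypothetical; substitute the formats your data actually uses.

```python
import logging
import re

# Hypothetical account-number format; adjust to the real one.
ACCOUNT_RE = re.compile(r"\bACCT-\d{6,}\b")

def mask_accounts(text: str) -> str:
    """Replace anything that looks like an account number with a placeholder."""
    return ACCOUNT_RE.sub("ACCT-[REDACTED]", text)

class MaskingFilter(logging.Filter):
    """Logging filter that masks account numbers before a record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so values passed as args are masked too.
        record.msg = mask_accounts(record.getMessage())
        record.args = None
        return True  # keep the record, now masked

logger = logging.getLogger("agent")
logger.addFilter(MaskingFilter())
logger.warning("Matched invoice for vendor account %s", "ACCT-00123456")
```

A filter attached to a logger only covers records logged through that logger, so attaching it to the handlers instead may be the safer choice when several loggers share one sink.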

3. Data residency

When it applies: Any agent that processes data subject to residency requirements — EU GDPR, Schrems II, sector regulations.

What to check:

  • LLM provider's processing region matches the data's residency requirement
  • Vector store + observability platform regions also confirmed
  • Cross-border transfer mechanisms in place if data crosses borders (SCCs, adequacy decisions)

Tools: LLM providers with regional endpoints (Anthropic, OpenAI, Bedrock, Vertex, Azure AI Foundry).

Pass criteria: Each data flow documented with region; cross-border transfers have legal basis.
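The pass criteria above amount to a simple rule: every data flow either stays in its required region or has a documented transfer basis. A minimal sketch of that rule follows; the `DataFlow` fields, the region labels, and the sample flows are assumptions for illustration, not part of the template.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFlow:
    name: str                 # e.g. "LLM call", "vector store", "observability"
    required_region: str      # region the data must stay in
    actual_region: str        # region the component is configured for
    transfer_basis: str = ""  # e.g. "SCCs", "adequacy decision"; empty if none

def residency_gaps(flows):
    """Names of flows that leave their required region with no documented legal basis."""
    return [
        f.name for f in flows
        if f.actual_region != f.required_region and not f.transfer_basis
    ]

# Hypothetical flows, for illustration only.
flows = [
    DataFlow("LLM call (EU invoices)", "EU", "EU"),
    DataFlow("Observability traces", "EU", "US", transfer_basis="SCCs"),
    DataFlow("Vector store", "EU", "US"),
]
print(residency_gaps(flows))  # flows that still need attention
```

An empty result supports a Pass; whether a recorded transfer basis is legally sufficient remains a question for Legal.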

4. Reliability / safety

What to check:

  • Failure modes from Agent Card §10 reviewed
  • Worst-case action assessed: is it acceptable?
  • If worst case is irreversible (e.g., autonomous deletion), additional controls in place
  • Kill switch tested

Pass criteria: Worst-case is acceptable AND mitigations are wired AND kill switch is tested.

5. Transparency / disclosure

When it applies:

  • External parties (customers, partners, regulators) — disclosure required
  • Internal stakeholders consuming AI-generated content — labeling recommended
  • EU AI Act Article 50 specifically requires:
    • AI systems that interact with humans must disclose AI nature
    • Synthetic content (deepfakes, generated audio/video/text) must be marked in a machine-readable format

Pass criteria: Required disclosures are wired into the agent's output.
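"Wired into the agent's output" can be as simple as a helper that every outgoing draft passes through. The sketch below reuses the footer wording from the worked example; the function name and the idempotency check are assumptions for illustration.

```python
FOOTER_TEMPLATE = "Auto-drafted by AI, reviewed by {reviewer}."

def with_disclosure(body: str, reviewer: str) -> str:
    """Append the AI-disclosure footer unless the text already ends with it."""
    footer = FOOTER_TEMPLATE.format(reviewer=reviewer)
    trimmed = body.rstrip()
    if trimmed.endswith(footer):
        return body
    return f"{trimmed}\n\n{footer}"

draft = with_disclosure("Invoice 1001 matches the purchase order.", "Jane Doe")
print(draft)
```

Applying the footer at a single choke point, rather than in each prompt or email template, makes it easier to show a reviewer that no output path skips the disclosure.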

6. Explainability

When it applies:

  • For agents affecting individuals' rights, jobs, finances, or access to services
  • EU AI Act Article 13 obligations for high-risk systems

What to check:

  • Can the company explain why a given output was produced?
  • Is the explanation accessible to affected individuals (not just engineers)?
  • Are reasoning logs retained?

Tools: SHAP, LIME, Captum (for ML models). Reasoning logs / chain-of-thought capture for LLM agents.

Pass criteria: Explanation mechanism exists and is documented.

7. Inclusiveness

What to check:

  • Does the agent serve all user groups equally?
  • Language support — does it work for non-English speakers if relevant?
  • Accessibility — does its output work with screen readers, alternative inputs?
  • Demographic coverage in training/eval data

N/A criteria: Limited internal use case with homogeneous user group. Document the reasoning.

8. Accountability (named owner, override path)

What to check:

  • One named human Owner (Agent Card §1)
  • Override path: how can a human stop or correct the agent in real time?
  • Kill switch tested

Pass criteria: Named owner + working override path + tested kill switch.

9. Audit log retention

Baseline: 6 months for Medium and High tier (mirrors the EU AI Act Article 19 minimum for high-risk system logs).

Longer if:

  • Sector regulation requires it (SOX = 7 years for financial controls; HIPAA = 6 years; etc.)
  • Internal data retention policy requires longer
  • Contractual obligation with customers requires longer

Pass criteria: Retention period documented per the agent's tier + regulatory exposure.
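The retention rule above reduces to taking the longest applicable period. A minimal sketch, assuming all periods are expressed in months; the function name and the example values are illustrative.

```python
BASELINE_MONTHS = 6  # baseline stated above for Medium and High tier

def required_retention_months(*other_requirements_months: int) -> int:
    """Longest of the baseline and any regulatory, policy, or contractual periods."""
    return max([BASELINE_MONTHS, *other_requirements_months])

print(required_retention_months())        # baseline only
print(required_retention_months(7 * 12))  # with a 7-year requirement
```

Comparing the result against the retention actually configured in the logging platform is the evidence to attach to this item.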

10. Synthetic-content labeling

EU AI Act Article 50 obligations:

  • AI systems that interact with humans → disclose AI nature
  • AI-generated audio, image, video, or text content imitating a person or another entity → mark as AI-generated, machine-readable where feasible
  • Emotion recognition or biometric categorization → disclose to subjects

Tools: C2PA content credentials, watermarking libraries.

N/A criteria: Agent doesn't generate synthetic content (e.g., it only parses or ranks). Document the reasoning.


Usage notes

  • N/A is a real answer. Several items legitimately don't apply for many agents. Document the reasoning — "Not decisions about people" is enough, but it must be written.
  • The 4 always-applicable items: §2 Privacy, §4 Reliability, §8 Accountability, §9 Audit retention. These cannot be N/A.
  • EU AI Act Article 50 is broader than people think. It applies to any AI system that interacts with humans, not just chatbots. Read it carefully.
  • The checklist is not just for the first review. Re-run on any material change to scope, data sources, or autonomy stage.
  • Conditional pass (⚠️) requires a documented resolution path — open item + owner + due date. Don't carry ⚠️ items into pilot.
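The rules in these notes are mechanical enough to check automatically before sign-off. The sketch below encodes three of them: items 2, 4, 8, and 9 cannot be N/A, every N/A needs written reasoning, and a conditional item needs an open item, an owner, and a due date. The `Item` fields and status strings are assumptions for illustration, and the sketch does not verify that all 10 items are present.

```python
from dataclasses import dataclass
from typing import Optional

ALWAYS_APPLICABLE = {2, 4, 8, 9}  # items that cannot be N/A

@dataclass
class Item:
    number: int
    status: str                     # "pass", "conditional", "fail", or "na"
    notes: str = ""                 # reasoning, or the open item if conditional
    owner: Optional[str] = None     # required for conditional items
    due_date: Optional[str] = None  # required for conditional items, YYYY-MM-DD

def problems(items):
    """Return human-readable rule violations for a filled-in checklist."""
    found = []
    for it in items:
        if it.status == "na" and it.number in ALWAYS_APPLICABLE:
            found.append(f"Item {it.number}: cannot be N/A")
        elif it.status == "na" and not it.notes.strip():
            found.append(f"Item {it.number}: N/A needs written reasoning")
        elif it.status == "conditional" and not (it.notes.strip() and it.owner and it.due_date):
            found.append(f"Item {it.number}: conditional needs open item, owner, due date")
        elif it.status == "fail":
            found.append(f"Item {it.number}: failed")
    return found

def cleared_for_pilot(items):
    """True only if there are no violations and nothing is still conditional."""
    return not problems(items) and all(it.status in ("pass", "na") for it in items)
```

A well-documented conditional item produces no violation but still blocks `cleared_for_pilot`, which matches the note that ⚠️ items should not be carried into pilot.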

Common pitfalls

| Pitfall | What it looks like | Fix |
|---|---|---|
| Fairness skipped because "not really about people" | Loan-scoring agent classified as not about people | Read NIST AI 100-1 §3: fairness applies whenever outputs influence opportunities for people |
| Privacy is vague | "We respect privacy" | Tie to Agent Card §5 + documented retention + masking |
| Data residency assumed | "We use OpenAI, they have data centers globally" | Specify which region for which data flow |
| Worst-case not actually assessed | §4 marked Pass with no failure modes referenced | Cross-reference Agent Card §10; if §10 is missing, fix that first |
| Transparency conflated with explainability | Both items marked Pass with same evidence | Transparency = "disclose AI was used." Explainability = "explain why this output." Different. |
| Retention based on convenience | "30 days because that's what LangSmith defaults to" | Tie to regulation + tier. Adjust LangSmith settings. |
| Article 50 missed | Agent generates customer-facing content unlabeled | Add disclosure footer or watermark; re-review |

Framework cross-references

  • framework.md §18 (this checklist's specification)
  • framework.md §10 (risk tier — Medium+ triggers this review)
  • framework.md §17 (privileged identities — feeds §8)
  • framework.md §22.1 EU AI Act (Articles 13, 14, 19, 50)
  • framework.md §22.2 NIST AI RMF (MEASURE function, trustworthy AI characteristics)
  • framework.md §22.3 ISO/IEC 42001 (Annex A fairness + transparency controls)
  • workflows.md Step 7 (Responsible-AI review)
  • workflows.html → In Action view → node M10 (Reviews — Medium+)