IR-006 recommended post-mortems
Blameless post-mortems after incidents
Post-mortems turn incidents into learning opportunities. Blameless means focusing on systems and processes, not individuals.
Question to ask
"After your last incident, did you fix the system or blame a person?"
Pass criteria
- ✓ Post-mortems written for significant incidents
- ✓ Blameless culture (focus on systems, not people)
- ✓ Documents what happened, why, and how to prevent recurrence
- ✓ Template or consistent format used
Fail criteria
- ✗ No post-mortems
- ✗ Blame-focused culture
- ✗ Post-mortems written but superficial
- ✗ Post-mortems written only for major outages
Related items
- IR-007 Action items tracked to completion
- Section 34 (rollback/recovery)
Verification guide
Severity: Recommended
Post-mortems turn incidents into learning opportunities. "Blameless" means focusing on systems and processes, not individuals: people make mistakes, and systems should catch them.
Check automatically:
- Look for post-mortem documentation:
# Check for post-mortem directories
ls -la postmortems/ post-mortems/ incidents/ docs/postmortems/ docs/incidents/ 2>/dev/null
# Search for post-mortem content
grep -riE "post-?mortem|incident.*review|\bRCA\b|root.*cause|blameless" docs/ README.md CLAUDE.md --include="*.md" 2>/dev/null
# Look for post-mortem templates
find . -maxdepth 3 \( -iname "*postmortem*" -o -iname "*post-mortem*" -o -iname "*incident*template*" \) 2>/dev/null | grep -v node_modules
Ask user:
- "Do you write post-mortems after incidents?"
- "Is there a template or standard format?"
- "Are post-mortems blameless? (focus on systems, not 'Bob broke it')"
What a good post-mortem covers:
- Timeline - What happened and when
- Impact - Who/what was affected, for how long
- Root cause - Why it happened (for example, via the 5 Whys)
- Contributing factors - What made it worse or delayed recovery
- What went well - What worked during response
- Action items - Concrete steps to prevent recurrence
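The list above can be turned into a rough completeness check. The following is a minimal sketch, not a definitive tool: the function name `check_postmortem_sections` is hypothetical, and it assumes each section is mentioned by a recognizable keyword somewhere in the file. Real post-mortems may title their sections differently, so treat a "missing" result as a prompt to read the document, not as a verdict.

```shell
# Sketch: report which recommended sections a post-mortem file lacks.
# Keyword matching is an assumption; adjust the patterns to the team's template.
check_postmortem_sections() {
  local file="$1"
  local missing=0
  local label pattern
  # Each heredoc line is "label|extended-regex" for one recommended section.
  while IFS='|' read -r label pattern; do
    if ! grep -qiE "$pattern" "$file"; then
      echo "missing: $label"
      missing=$((missing + 1))
    fi
  done <<'EOF'
Timeline|timeline
Impact|impact
Root cause|root[ -]?cause
Contributing factors|contributing factors?
What went well|what went well
Action items|action items?
EOF
  # Succeed only when every section was found.
  [ "$missing" -eq 0 ]
}
```

Run it against a single document, for example `check_postmortem_sections docs/postmortems/example.md` (a hypothetical path). It prints one `missing:` line per absent section and returns a non-zero status if any are absent.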
Cross-reference with:
- IR-007 (action items tracked) - post-mortems produce the action items that IR-007 tracks
- Section 34 (rollback/recovery) - post-mortems often reveal rollback gaps
- All other sections - post-mortems may surface gaps anywhere
Pass criteria:
- Post-mortems written for significant incidents
- Blameless culture (focus on systems, not people)
- Documents what happened, why, and how to prevent recurrence
- Template or consistent format used
Fail criteria:
- No post-mortems ("we just fix and move on")
- Blame-focused ("this is Bob's fault")
- Post-mortems written but superficial (no root cause analysis)
- Post-mortems written only for major outages (learning from smaller incidents is lost)
Evidence to capture:
- Location of post-mortems (if any exist)
- Template in use (if any)
- Number of post-mortems written (a rough indicator of how established the practice is)
- Whether they include root cause analysis
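Most of this evidence can be collected in one pass. The sketch below is illustrative only: the function name `postmortem_evidence` is hypothetical, and it assumes post-mortems are Markdown files in a single directory tree, that any template has "template" in its file name, and that root cause analysis is signalled by phrases such as "root cause" or "5 whys". A keyword hit does not prove the analysis has depth, so sample a few documents by hand as well.

```shell
# Sketch: summarize post-mortem evidence under a directory.
# Assumes one Markdown file per incident; template files are counted separately.
postmortem_evidence() {
  local dir="$1"
  local total with_rca template
  # Count post-mortem documents, excluding anything that looks like a template.
  total=$(( $(find "$dir" -type f -name '*.md' ! -iname '*template*' 2>/dev/null | wc -l) ))
  # Count documents that mention root cause analysis at all.
  with_rca=$(( $(find "$dir" -type f -name '*.md' ! -iname '*template*' \
      -exec grep -liE 'root[ -]?cause|5 whys|five whys' {} + 2>/dev/null | wc -l) ))
  # Record the first template-like file, if any.
  template=$(find "$dir" -type f -iname '*template*' 2>/dev/null | head -n 1)
  echo "location: $dir"
  echo "template: ${template:-none found}"
  echo "post-mortems: $total"
  echo "with root cause analysis: $with_rca"
}
```

Calling `postmortem_evidence docs/postmortems` (a hypothetical location) prints four lines that map onto the evidence items above and can be pasted into the audit record.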