RR-008 recommended emergency-recovery

Know RPO (Recovery Point Objective)

Maximum acceptable data loss is defined, and backup frequency is sufficient to support it

Question to ask

"How much data could you lose right now before it becomes a crisis?"

Verification guide

Severity: Recommended

RPO is the maximum acceptable data loss measured in time. It drives backup frequency and replication strategy.

Check automatically:

Look for RPO documentation:

# Search for RPO mentions (\b stops the case-insensitive "RPO" from matching inside words such as "purpose")
grep -riE "\bRPO\b|recovery.*point.*objective|data.*loss|backup.*frequency|point.*in.*time" docs/ runbooks/ README.md CLAUDE.md SLA* --include="*.md" 2>/dev/null

Check backup frequency:

# Check for backup schedules
grep -riE "backup.*schedule|cron.*backup|daily.*backup|hourly.*backup" .github/ scripts/ terraform/ --include="*.yml" --include="*.yaml" --include="*.tf" --include="*.sh" 2>/dev/null

# Check for point-in-time recovery (PITR)
grep -riE "point_in_time|pitr|continuous.*backup" terraform/ infrastructure/ --include="*.tf" 2>/dev/null

RPO tiers and required strategies:

RPO	Strategy Required
0 (no data loss)	Synchronous replication, multi-region writes
< 1 min	Async replication, streaming WAL
< 1 hour	Point-in-time recovery (PITR)
< 24 hours	Daily backups
< 1 week	Weekly backups

Ask user:

"How much data loss is acceptable? (1 hour? 1 day?)"
"What's your backup frequency?"
"Do you have point-in-time recovery enabled for your database?"
"Is RPO agreed with stakeholders/business?"

Cross-reference with:

RR-007 (RTO) - often defined together as recovery objectives
RR-005/RR-006 (recovery docs/testing) - RPO should be mentioned and validated
Section 26 (backups) - backup frequency determines achievable RPO

Pass criteria:

RPO is defined (even informally: "losing a day of data would be bad")
Backup frequency supports the RPO (daily backups can lose up to 24h of data, so they support an RPO of 24h at best)
Team understands the tradeoff (tighter RPO = higher cost)
For critical data: PITR enabled or frequent backups
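
The frequency check can be sketched as a simple comparison. This is a minimal sketch, assuming periodic backups whose worst-case data loss is roughly one full backup interval (backup duration and replication lag are ignored). The function name backup_supports_rpo is hypothetical, not part of any tool.

```shell
# Hypothetical helper: succeed (exit 0) when a periodic backup interval can meet a target RPO.
# Both arguments are in seconds.
# Assumption: worst-case data loss is roughly one full backup interval.
backup_supports_rpo() {
  local interval_seconds=$1 rpo_seconds=$2
  [ "$interval_seconds" -le "$rpo_seconds" ]
}

# Example: daily backups (86400s) against a 24h target and a 1h target.
backup_supports_rpo 86400 86400 && echo "daily backups meet a 24h RPO"
backup_supports_rpo 86400 3600  || echo "daily backups cannot meet a 1h RPO"
```

If the second case applies, either loosen the RPO with stakeholders or move to a tighter strategy such as PITR.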

Fail criteria:

No idea what acceptable data loss is
Backup frequency doesn't match expectations (weekly backups but expect no data loss)
RPO defined but infrastructure doesn't support it
Backup restore point has never been verified (the newest restorable backup might be older than expected)
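
One way to verify the actual restore point is to measure the age of the newest backup artifact. This is a rough sketch, assuming backups land as files in a single directory, that file modification time reflects when the backup completed, and that GNU find is available (for -printf). The function name newest_backup_age_seconds and the directory path are hypothetical.

```shell
# Hypothetical helper: print the age in seconds of the newest file in a backup directory.
# Assumptions: GNU find (-printf), and file mtime reflects when the backup completed.
newest_backup_age_seconds() {
  local dir=$1 newest_mtime now
  # List mtimes (epoch seconds with a fractional part) and keep the most recent one.
  newest_mtime=$(find "$dir" -maxdepth 1 -type f -printf '%T@\n' 2>/dev/null | sort -nr | head -n 1)
  [ -n "$newest_mtime" ] || return 1   # no backup files found
  now=$(date +%s)
  # Drop the fractional part before doing integer arithmetic.
  echo $(( now - ${newest_mtime%.*} ))
}

# Usage (hypothetical path): compare the result against the target RPO in seconds.
# age=$(newest_backup_age_seconds /var/backups/db) && [ "$age" -le 86400 ] || echo "restore point older than 24h, or no backup found"
```

A file timestamp only shows that a backup was written. It does not prove the backup restores cleanly, so this complements a restore test and does not replace one.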

Evidence to capture:

Defined RPO (or lack thereof)
Actual backup frequency
Whether PITR is enabled
Gap between target RPO and actual capability

Section

34. Rollback & Recovery