HA-006 recommended Backups
Backup window appropriate for RPO
Backup window intentional (low-traffic period); frequency aligns with business RPO; no performance impact during backups
Question to ask
"How much data are you willing to lose in a disaster?"
Verification guide
Severity: Recommended
Backup timing should be intentional: during low-traffic periods to minimize performance impact, and frequent enough to meet business RPO requirements.
Check automatically:
- Check backup window timing:
# AWS RDS backup window (UTC)
aws rds describe-db-instances --query "DBInstances[].{ID:DBInstanceIdentifier,BackupWindow:PreferredBackupWindow}" --output table
# GCP Cloud SQL backup start time
gcloud sql instances describe INSTANCE_NAME --format="get(settings.backupConfiguration.startTime)"
- Check Terraform for backup window:
# AWS RDS
grep -rE "preferred_backup_window|backup_window" --include="*.tf" 2>/dev/null
# GCP Cloud SQL (start_time is set inside the backup_configuration block)
grep -rE -A10 "backup_configuration" --include="*.tf" 2>/dev/null | grep "start_time"
- Check cron schedules for scripted backups:
# Look for backup cron patterns
grep -rE "cron|schedule" --include="*.yml" --include="*.yaml" --include="*.sh" 2>/dev/null | grep -iE "backup|dump"
# Check Kubernetes CronJobs
kubectl get cronjobs -A 2>/dev/null | grep -iE "backup|dump"
- Check backup frequency:
# AWS RDS - automated backups run once a day; PITR adds continuous recovery between them
# List the ten most recent manual snapshot times to see how often they are taken
aws rds describe-db-snapshots --snapshot-type manual --query "DBSnapshots[].[SnapshotCreateTime]" --output text | sort | tail -10
# GCP Cloud SQL - automated backups run once a day; this shows how many days of transaction logs are retained for PITR
gcloud sql instances describe INSTANCE_NAME --format="get(settings.backupConfiguration.transactionLogRetentionDays)"
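A list of snapshot times is easier to judge against an RPO once it is reduced to a single number. The following is a minimal bash sketch that reads one ISO 8601 timestamp per line and prints the largest gap between consecutive backups, rounded up to whole hours. The function name `max_backup_gap_hours` is hypothetical, and the sketch assumes GNU `date` (for `-d`), so it will not run unmodified with BSD/macOS `date`.

```shell
# Print the largest gap between consecutive timestamps, rounded up to whole hours.
# Input: one ISO 8601 timestamp per line on stdin, in any order.
# Assumption: GNU date is available (date -d). Helper name is illustrative only.
max_backup_gap_hours() {
  local ts
  while read -r ts; do
    # Skip blank lines; convert each timestamp to epoch seconds.
    [ -n "$ts" ] && date -u -d "$ts" +%s
  done | sort -n | awk '
    NR > 1 { gap = $1 - prev; if (gap > max) max = gap }
    { prev = $1 }
    END { printf "%d\n", (max + 3599) / 3600 }'
}

# Possible usage (requires AWS credentials; one timestamp per line):
#   aws rds describe-db-snapshots --snapshot-type manual \
#     --query "DBSnapshots[].[SnapshotCreateTime]" --output text | max_backup_gap_hours
```

If the printed value is larger than the RPO in hours and PITR is not enabled, the backup frequency does not meet the requirement. With fewer than two timestamps the function prints 0, which should be read as "not enough data" rather than as a pass.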
Ask user:
- "When do your backups run? Is this during low-traffic periods?"
- "What's your RPO (Recovery Point Objective) - how much data loss is acceptable?"
- "Does your backup frequency match your RPO?"
- "Have you noticed performance impact during backup windows?"
RPO considerations:
- If RPO is 1 hour, daily backups aren't enough: worst-case loss is the full interval between backups, so PITR or hourly snapshots are needed
- If RPO is 24 hours, daily backups are sufficient
- PITR with continuous WAL archiving effectively gives an RPO of seconds to minutes
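The rules above can be sketched as a small bash check. It treats the interval between backups as the worst-case data loss and treats PITR as satisfying any RPO expressed in whole hours. The function name `rpo_check` and its argument order are hypothetical, and the sketch assumes integer hours and that PITR log retention is actually configured.

```shell
# rpo_check RPO_HOURS BACKUP_INTERVAL_HOURS PITR_ENABLED
#   RPO_HOURS              business RPO in whole hours
#   BACKUP_INTERVAL_HOURS  hours between backups (24 for daily)
#   PITR_ENABLED           "yes" or "no"
# Illustrative helper, not part of any cloud CLI.
rpo_check() {
  local rpo_hours="$1" interval_hours="$2" pitr="$3"
  if [ "$pitr" = "yes" ]; then
    echo "PASS: PITR enabled, recovery point is continuous"
  elif [ "$interval_hours" -le "$rpo_hours" ]; then
    echo "PASS: backups every ${interval_hours}h meet RPO of ${rpo_hours}h"
  else
    echo "FAIL: backups every ${interval_hours}h exceed RPO of ${rpo_hours}h"
  fi
}
```

For example, `rpo_check 1 24 no` reports a failure (daily backups against a 1-hour RPO), while `rpo_check 24 24 no` and `rpo_check 1 24 yes` both pass.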
Cross-reference with:
- HA-003 (backups exist - this item is about timing)
- HA-005 (PITR - if enabled, provides continuous protection regardless of window)
- Section 34 (Rollback & Recovery - RPO/RTO definitions)
- MON-002 (database performance - backup impact on queries)
Pass criteria:
- Backup window defined and intentional (not just default)
- Window is during low-traffic period for the application
- Backup frequency aligns with business RPO requirements
- No significant performance degradation during backups
Fail criteria:
- Default backup window never reviewed
- Backups run during peak traffic causing performance issues
- RPO requirement is 1 hour but backups are daily (and no PITR)
- Backup window conflicts with other maintenance
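Checking whether a backup window collides with peak traffic or with another maintenance window comes down to an interval-overlap test. The bash sketch below compares two daily UTC windows in `HH:MM-HH:MM` form, including windows that wrap past midnight. The helper names `to_minutes` and `windows_overlap` are hypothetical. It assumes time-only windows, so the day prefix in an RDS maintenance window (for example `sun:05:00-sun:06:00`) would need to be stripped first, and the peak-traffic window is something you supply from your own metrics.

```shell
# Convert HH:MM to minutes since midnight.
# 10# forces base 10 so values such as "08" are not parsed as octal.
to_minutes() {
  local hh="${1%%:*}" mm="${1##*:}"
  echo $(( 10#$hh * 60 + 10#$mm ))
}

# windows_overlap "HH:MM-HH:MM" "HH:MM-HH:MM"
# Returns 0 if the two daily windows share at least one minute, 1 otherwise.
# A window whose end is not after its start is treated as wrapping past midnight.
windows_overlap() {
  local a_start a_end b_start b_end k
  a_start=$(to_minutes "${1%-*}"); a_end=$(to_minutes "${1#*-}")
  b_start=$(to_minutes "${2%-*}"); b_end=$(to_minutes "${2#*-}")
  if [ "$a_end" -le "$a_start" ]; then a_end=$(( a_end + 1440 )); fi
  if [ "$b_end" -le "$b_start" ]; then b_end=$(( b_end + 1440 )); fi
  # Compare against the second window on the previous, same, and next day.
  for k in -1440 0 1440; do
    if [ "$a_start" -lt $(( b_end + k )) ] && [ $(( b_start + k )) -lt "$a_end" ]; then
      return 0
    fi
  done
  return 1
}
```

A call such as `windows_overlap "03:00-03:30" "03:15-04:00" && echo "conflict"` prints `conflict`, whereas back-to-back windows like `03:00-03:30` and `03:30-04:00` are treated as non-overlapping.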
Evidence to capture:
- Backup window (time in UTC and local timezone)
- Backup frequency (daily, hourly, continuous)
- Business RPO requirement
- Whether PITR fills the gap between snapshots
- Any known performance impact
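Since cloud providers report backup windows in UTC, recording the local equivalent alongside makes it easier to compare against traffic patterns. The following bash sketch converts an `HH:MM-HH:MM` UTC window to a given timezone. The function name `window_to_local` is hypothetical, it assumes GNU `date`, and it uses today's UTC offset, so the result shifts by an hour across daylight-saving changes.

```shell
# window_to_local "HH:MM-HH:MM" TZ_NAME
# Prints the UTC window converted to TZ_NAME, followed by the zone name.
# Assumption: GNU date (date -d). Helper name is illustrative only.
window_to_local() {
  local window="$1" tz="$2" start end
  start=$(TZ="$tz" date -d "${window%-*} UTC" +%H:%M)
  end=$(TZ="$tz" date -d "${window#*-} UTC" +%H:%M)
  echo "${start}-${end} ${tz}"
}

# Possible usage with an IANA zone name (requires tzdata to be installed):
#   window_to_local "03:00-03:30" "America/New_York"
```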