FF-002 recommended general

Kill switches for quick disable

Critical features have kill switches that can be toggled in < 5 minutes without deploy, with documented procedures and controlled access

Question to ask

"How fast can you disable that new feature if it's melting prod?"

Verification guide

Severity: Recommended

Kill switches are feature flags that can instantly disable functionality in production without a deploy. Critical for incident response when a feature causes problems.

Check automatically:

  1. Check for kill switch naming patterns:
# Common kill switch naming
grep -rE "KILL_|DISABLE_|ENABLE_|EMERGENCY_" src/ app/ lib/ --include="*.ts" --include="*.js" 2>/dev/null

# Feature flags that control critical features
grep -rE "isOn\(['\"].*payment|isOn\(['\"].*checkout|isOn\(['\"].*auth" src/ app/ lib/ --include="*.ts" --include="*.js" 2>/dev/null
  1. Check toggle speed (can flags be changed without deploy?):

For env var flags:

  • Require restart/redeploy to toggle = slow (minutes to hours)
  • NOT suitable for kill switches

For GrowthBook/feature flag services:

  • Toggle via dashboard = instant (seconds)
  • Suitable for kill switches
# Check if using env vars (slow toggle) vs SDK (fast toggle)
grep -rE "process\.env\.(FEATURE_|ENABLE_|DISABLE_)" src/ app/ lib/ --include="*.ts" --include="*.js" 2>/dev/null

# Check for GrowthBook SDK (fast toggle)
grep -rE "gb\.isOn|gb\.evalFeature|useFeatureIsOn" src/ app/ lib/ --include="*.ts" --include="*.js" 2>/dev/null
  1. Check for documented kill switch procedures:
# Look for runbooks or incident docs
find . -name "*.md" -exec grep -l -iE "kill switch|disable feature|emergency" {} \; 2>/dev/null

# Check CLAUDE.md or operational docs
grep -iE "kill switch|disable|emergency" CLAUDE.md README.md docs/*.md 2>/dev/null
  1. Check for critical features that should have kill switches:

Identify features that could cause incidents if they break:

  • Payment processing
  • External API integrations
  • New features in active development
  • Third-party service dependencies
# Find critical integrations that might need kill switches
grep -rE "stripe|paypal|twilio|sendgrid|openai|anthropic" src/ app/ lib/ --include="*.ts" --include="*.js" 2>/dev/null | head -20

Ask user:

  • "How quickly can you disable a feature in production? (< 5 min = good, requires deploy = bad)"
  • "Which features have kill switches? (payments, external APIs, new features)"
  • "Is there documentation on how to disable features in an emergency?"
  • "Who has access to toggle kill switches?"

Cross-reference with:

  • FF-001 (kill switches are a type of feature flag)
  • Section 35 (incident response - kill switches in runbooks)
  • DEPLOY-001 (deployments should be fast, but kill switches should be faster)

Pass criteria:

  • Critical features have kill switches
  • Kill switches can be toggled in < 5 minutes (ideally instant via dashboard)
  • Team knows how to use them (documented or well-known)
  • Kill switch access is controlled but available to on-call

Fail criteria:

  • No kill switches for risky features (payments, external APIs)
  • Toggling requires a deploy (defeats the purpose)
  • No documentation on how to disable features in emergency
  • Only one person knows how to toggle (bus factor = 1)

Notes on kill switch implementation:

Env var kill switches (acceptable for simple cases):

  • Set DISABLE_PAYMENTS=true in environment
  • Requires restart/redeploy to take effect
  • OK for non-urgent features, not for emergencies

Feature flag service kill switches (recommended):

  • Toggle in GrowthBook dashboard
  • Takes effect immediately (SDK polls or uses SSE)
  • Proper audit trail of who toggled what when

Hybrid approach:

  • Use feature flag service for instant toggles
  • Have env var override as backup if flag service is down
  • process.env.DISABLE_PAYMENTS || !gb.isOn('payments')

Evidence to capture:

  • Kill switches identified and their toggle speed
  • Critical features covered (or not)
  • Documentation/runbook status
  • Who has access to toggle kill switches

Section

33. Feature Flags & Rollouts

API & Security