LST-005 recommended capacity-planning
Capacity limits documented
"How much traffic can we handle?" is a question every team should be able to answer. Documented capacity limits inform scaling decisions, incident response, and business planning.
Question to ask
"How much traffic can you handle before things break?"
Pass criteria
- ✓ Capacity limits documented per service/endpoint
- ✓ Limits based on actual testing (not guesses)
- ✓ Team knows the bottleneck (database, CPU, memory, external API)
Fail criteria
- ✗ No idea what limits are ("never tested")
- ✗ Limits documented but never validated
- ✗ Only discovered limits during outages
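For a sense of what passes, documented limits might look like the snippet below (entirely hypothetical; service names and figures are placeholders). The key properties are per-service figures, a named bottleneck, and a validation date:

```markdown
<!-- Hypothetical example; service names and figures are placeholders -->
| Service      | Max sustained load | First bottleneck         | Last validated |
|--------------|--------------------|--------------------------|----------------|
| checkout-api | 450 RPS            | Postgres connection pool | 2024-06-12     |
| search       | 1,200 RPS          | CPU on search nodes      | 2024-05-30     |
```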
Related items
Verification guide
Severity: Recommended
"How much traffic can we handle?" is a question every team should be able to answer. Documented capacity limits inform scaling decisions, incident response, and business planning.
Check automatically:
# Look for capacity documentation
grep -riE "capacity|limit|max.*request|rps|requests.*per.*second|concurrent.*users|throughput" docs/ README.md CLAUDE.md --include="*.md" 2>/dev/null
# Check for architecture/scaling docs
find . -maxdepth 3 -name "*capacity*" -o -name "*scaling*" -o -name "*architecture*" 2>/dev/null | grep -v node_modules
# Look for load test results that document limits
find . -maxdepth 3 -type d \( -name "*loadtest*" -o -name "*results*" \) 2>/dev/null | grep -v node_modules
# Check for runbooks mentioning capacity
grep -riE "capacity|scaling|traffic.*spike" runbooks/ docs/runbooks/ --include="*.md" 2>/dev/null
Ask user:
- "What's the max RPS your API can handle?"
- "At what point does your database become the bottleneck?"
- "Where are capacity limits documented?"
- "What component fails first under load?"
Cross-reference with:
- LST-002 (baselines include capacity info)
- LST-006 (breaking points are the extreme end of capacity)
- Section 21 (caching) - caching affects capacity
- Section 30 (rate limiting) - rate limits should be below capacity limits
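The rate-limiting cross-check can be made concrete: whatever figure the limiter enforces should sit below the tested capacity, with headroom. A minimal sketch with hypothetical numbers; substitute figures from your own gateway config and load-test results:

```shell
# Compare the configured rate limit against tested capacity.
# Both values are hypothetical placeholders.
tested_capacity_rps=500   # from the most recent load-test results
rate_limit_rps=400        # from the API gateway / rate-limiter config
if [ "$rate_limit_rps" -lt "$tested_capacity_rps" ]; then
  echo "OK: rate limit ${rate_limit_rps} rps is below tested capacity ${tested_capacity_rps} rps"
else
  echo "WARNING: rate limit is at or above tested capacity; the service may fail before limiting kicks in"
fi
```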
Pass criteria:
- Capacity limits documented per service/endpoint
- Limits based on actual testing (not guesses)
- Team knows the bottleneck (database, CPU, memory, external API)
Fail criteria:
- No idea what limits are ("never tested")
- Limits documented but never validated
- Only discovered limits during outages
Evidence to capture:
- Documented capacity limits (RPS, concurrent users, etc.)
- Known bottleneck(s)
- When limits were last validated