LST-002 recommended load-testing-setup
Baseline performance metrics established
You can't tell whether performance has degraded if you don't know what "normal" looks like. Baselines are the reference point for all performance work.
Question to ask
"What's your p95 response time right now?"
Pass criteria
- ✓ Key metrics documented (response times, throughput, error rates)
- ✓ Baselines established from actual measurements (not guesses)
- ✓ Team knows where to find baseline data (APM tool, docs, etc.)
Fail criteria
- ✗ No idea what "normal" performance looks like
- ✗ Baselines exist but are outdated (measured before major changes shipped)
- ✗ Only anecdotal ("it feels fast")
Related items
Verification guide
Severity: Recommended
You can't tell whether performance has degraded if you don't know what "normal" looks like. Baselines are the reference point for all performance work.
Check automatically:
- Look for performance/baseline documentation:
# Look for performance documentation
grep -riE "baseline|benchmark|performance.*metrics|p50|p95|p99|latency|throughput" docs/ README.md CLAUDE.md --include="*.md" 2>/dev/null
# Check for performance test results stored
find . -maxdepth 3 -type d \( -name "*performance*" -o -name "*benchmark*" -o -name "*loadtest*" \) 2>/dev/null | grep -v node_modules
# Look for APM dashboards referenced
grep -riE "datadog|newrelic|dynatrace|grafana.*dashboard" docs/ README.md --include="*.md" 2>/dev/null
# Check for benchmark scripts
find . -name "*benchmark*" -type f 2>/dev/null | grep -v node_modules
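To also catch the "outdated baselines" fail case automatically, a staleness check can flag baseline docs that haven't been touched recently. This is a sketch: the `docs/` path and the 90-day threshold are assumptions, not part of the checklist.

```shell
# Sketch: list baseline/benchmark docs not modified in the last 90 days,
# a rough proxy for "baselines exist but are outdated".
# The docs/ path and 90-day cutoff are placeholder assumptions.
find docs/ -type f \( -name "*baseline*" -o -name "*benchmark*" \) -mtime +90 2>/dev/null
```

Anything this prints is worth re-measuring before trusting it as a reference point.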
Ask user:
- "What's your API's p95 response time under normal load?"
- "How many requests per second can your main endpoints handle?"
- "Where are these baselines documented?"
Key metrics to have documented:
| Metric | What It Measures |
|---|---|
| p50/p95/p99 response time | Latency distribution |
| Throughput (RPS) | Requests per second at steady state |
| Error rate | Baseline error percentage |
| Resource utilization | CPU/memory at normal load |
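The latency-distribution row above can be derived from raw timings with standard tools. A minimal sketch, assuming a file of per-request latencies in seconds, one per line (the file name and the fake sample data are placeholders; any load-test tool that logs raw timings can produce the real input):

```shell
# Placeholder input: one latency per line, in seconds. Here we fake 100
# evenly spaced samples standing in for real load-test output.
seq 1 100 | awk '{ printf "%.2f\n", $1 / 100 }' > latencies.txt

# Nearest-rank percentiles from the sorted samples.
sort -n latencies.txt | awk '
  { a[NR] = $1 }
  END {
    printf "p50=%s p95=%s p99=%s\n",
      a[int(NR * 0.50)], a[int(NR * 0.95)], a[int(NR * 0.99)]
  }'
```

With the fake evenly spaced samples this prints `p50=0.50 p95=0.95 p99=0.99`; the point is that percentiles come from actual measurements, not averages or guesses.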
Cross-reference with:
- LST-001 (tool needed to measure baselines)
- LST-005 (capacity limits are the upper bound of baselines)
- Section 17 (performance monitoring) - APM tools provide ongoing baselines
Pass criteria:
- Key metrics documented (response times, throughput, error rates)
- Baselines established from actual measurements (not guesses)
- Team knows where to find baseline data (APM tool, docs, etc.)
Fail criteria:
- No idea what "normal" performance looks like
- Baselines exist but are outdated (measured before major changes shipped)
- Only anecdotal ("it feels fast")
Evidence to capture:
- Location of baseline documentation
- Key metrics tracked (p50, p95, p99, throughput, error rate)
- When baselines were last updated
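One way to make all three evidence items easy to capture is to keep baselines in a dated, version-controlled file, so "when were they last updated" is answerable from git history. A sketch, with an assumed file path and placeholder metric rows to fill from real measurements:

```shell
# Sketch of an evidence-capture step: append a dated baseline entry to a
# version-controlled doc. The path docs/performance-baselines.md is an
# assumption; the <fill in> cells are placeholders for measured values.
mkdir -p docs
cat >> docs/performance-baselines.md <<EOF

## Baseline $(date +%Y-%m-%d)

| Metric | Value | Source |
|---|---|---|
| p50 / p95 / p99 response time | <fill in> | <load-test run> |
| Throughput (RPS) | <fill in> | <load-test run> |
| Error rate | <fill in> | <load-test run> |
EOF
```

Appending rather than overwriting keeps the history of baselines alongside the current one.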