LST-002 recommended load-testing-setup
Baseline performance metrics established
You can't tell whether performance has degraded if you don't know what "normal" looks like. Baselines are the reference point for all performance work.
Question to ask
"What's your p95 response time right now?"
Pass criteria
- ✓ Key metrics documented (response times, throughput, error rates)
- ✓ Baselines established from actual measurements (not guesses)
- ✓ Team knows where to find baseline data (APM tool, docs, etc.)
Fail criteria
- ✗ No idea what "normal" performance looks like
- ✗ Baselines exist but are outdated (measured before major changes shipped)
- ✗ Only anecdotal ("it feels fast")
Related items
Verification guide
Severity: Recommended
You can't tell whether performance has degraded if you don't know what "normal" looks like. Baselines are the reference point for all performance work.
Check automatically:
- Look for performance/baseline documentation:
# Look for performance documentation
grep -riE "baseline|benchmark|performance.*metrics|p50|p95|p99|latency|throughput" docs/ README.md CLAUDE.md --include="*.md" 2>/dev/null
# Check for performance test results stored
find . -maxdepth 3 -type d \( -name "*performance*" -o -name "*benchmark*" -o -name "*loadtest*" \) 2>/dev/null | grep -v node_modules
# Look for APM dashboards referenced
grep -riE "datadog|newrelic|dynatrace|grafana.*dashboard" docs/ README.md --include="*.md" 2>/dev/null
# Check for benchmark scripts
find . -name "*benchmark*" -type f 2>/dev/null | grep -v node_modules
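To also catch the "outdated baselines" fail case automatically, a staleness check can flag baseline docs that haven't been touched recently. This is a sketch: the `docs/` path and the 90-day threshold are assumptions, not part of the checklist.

```shell
# Sketch: list baseline/benchmark docs not modified in the last 90 days,
# a rough proxy for "baselines exist but are outdated".
# The docs/ path and 90-day cutoff are placeholder assumptions.
find docs/ -type f \( -name "*baseline*" -o -name "*benchmark*" \) -mtime +90 2>/dev/null
```

Anything this prints is worth re-measuring before trusting it as a reference point.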
Ask user:
- "What's your API's p95 response time under normal load?"
- "How many requests per second can your main endpoints handle?"
- "Where are these baselines documented?"
Key metrics to have documented:
| Metric | What It Measures |
|---|---|
| p50/p95/p99 response time | Latency distribution |
| Throughput (RPS) | Requests per second at steady state |
| Error rate | Baseline error percentage |
| Resource utilization | CPU/memory at normal load |
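The latency-distribution row above can be derived from raw timings with standard tools. A minimal sketch, assuming a file of per-request latencies in seconds, one per line (the file name and the fake sample data are placeholders; any load-test tool that logs raw timings can produce the real input):

```shell
# Placeholder input: one latency per line, in seconds. Here we fake 100
# evenly spaced samples standing in for real load-test output.
seq 1 100 | awk '{ printf "%.2f\n", $1 / 100 }' > latencies.txt

# Nearest-rank percentiles from the sorted samples.
sort -n latencies.txt | awk '
  { a[NR] = $1 }
  END {
    printf "p50=%s p95=%s p99=%s\n",
      a[int(NR * 0.50)], a[int(NR * 0.95)], a[int(NR * 0.99)]
  }'
```

With the fake evenly spaced samples this prints `p50=0.50 p95=0.95 p99=0.99`; the point is that percentiles come from actual measurements, not averages or guesses.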
Cross-reference with:
- LST-001 (tool needed to measure baselines)
- LST-005 (capacity limits are the upper bound of baselines)
- Section 17 (performance monitoring) - APM tools provide ongoing baselines
Pass criteria:
- Key metrics documented (response times, throughput, error rates)
- Baselines established from actual measurements (not guesses)
- Team knows where to find baseline data (APM tool, docs, etc.)
Fail criteria:
- No idea what "normal" performance looks like
- Baselines exist but are outdated (measured before major changes shipped)
- Only anecdotal ("it feels fast")
Evidence to capture:
- Location of baseline documentation
- Key metrics tracked (p50, p95, p99, throughput, error rate)
- When baselines were last updated
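One way to make all three evidence items easy to capture is to keep baselines in a dated, version-controlled file, so "when were they last updated" is answerable from git history. A sketch, with an assumed file path and placeholder metric rows to fill from real measurements:

```shell
# Sketch of an evidence-capture step: append a dated baseline entry to a
# version-controlled doc. The path docs/performance-baselines.md is an
# assumption; the <fill in> cells are placeholders for measured values.
mkdir -p docs
cat >> docs/performance-baselines.md <<EOF

## Baseline $(date +%Y-%m-%d)

| Metric | Value | Source |
|---|---|---|
| p50 / p95 / p99 response time | <fill in> | <load-test run> |
| Throughput (RPS) | <fill in> | <load-test run> |
| Error rate | <fill in> | <load-test run> |
EOF
```

Appending rather than overwriting keeps the history of baselines alongside the current one.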