rcourtman
|
14d06a1654
|
test: add soak test with runtime instrumentation (Phase 2 Task 9d)
Add comprehensive soak testing capabilities:
**Runtime Instrumentation:**
- Periodic sampling of heap, stack, goroutines, GC count
- Sample every 10s during harness runs
- HarnessReport includes full RuntimeSamples history
- Detect memory leaks (>10% sustained growth)
- Detect goroutine leaks (>20 leaked goroutines)
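
The sampling and leak checks above could look roughly like the sketch below. It is an illustration only: `RuntimeSample`, `takeSample`, `heapGrowthExceeds`, and `goroutineLeak` are hypothetical names, and since the commit does not define "sustained" growth, this version simply compares the first and last samples, which is cruder than what the harness may do.

```go
package main

import (
	"fmt"
	"runtime"
)

// RuntimeSample is a hypothetical snapshot type; the real harness
// fields may differ.
type RuntimeSample struct {
	HeapAlloc  uint64 // bytes of allocated heap objects
	StackInuse uint64 // bytes in stack spans
	Goroutines int
	NumGC      uint32 // completed GC cycles
}

// takeSample reads the current runtime statistics once.
func takeSample() RuntimeSample {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return RuntimeSample{
		HeapAlloc:  m.HeapAlloc,
		StackInuse: m.StackInuse,
		Goroutines: runtime.NumGoroutine(),
		NumGC:      m.NumGC,
	}
}

// heapGrowthExceeds reports whether the heap grew by more than pct percent
// between the first and last sample (an assumed reading of the >10% rule).
func heapGrowthExceeds(samples []RuntimeSample, pct float64) bool {
	if len(samples) < 2 || samples[0].HeapAlloc == 0 {
		return false
	}
	first := float64(samples[0].HeapAlloc)
	last := float64(samples[len(samples)-1].HeapAlloc)
	return (last-first)/first*100 > pct
}

// goroutineLeak reports whether the goroutine count rose by more than
// limit between the first and last sample.
func goroutineLeak(samples []RuntimeSample, limit int) bool {
	if len(samples) < 2 {
		return false
	}
	return samples[len(samples)-1].Goroutines-samples[0].Goroutines > limit
}

func main() {
	s := takeSample()
	fmt.Printf("heap=%d stack=%d goroutines=%d gc=%d\n",
		s.HeapAlloc, s.StackInuse, s.Goroutines, s.NumGC)
}
```

In a harness, `takeSample` would presumably be called from a `time.Ticker` loop every 10s and the results appended to the report.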
**Soak Test:**
- TestAdaptiveSchedulerSoak with 15min+ duration
- Skip unless -soak flag or HARNESS_SOAK_MINUTES set
- 80 synthetic instances (60 healthy, 15 transient, 5 permanent)
- Configurable duration via env var
- Validates: heap growth <10%, goroutines stable, queue depth bounded
- Staleness threshold: 45s for long-running tests
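
The skip-unless-requested gating might be sketched as below. `soakDuration` is a hypothetical helper, and the precedence (env var overrides the flag's default) and the 15-minute default are assumptions drawn from the bullets above, not confirmed behavior of the test.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// soakDuration decides whether the soak test should run and for how long.
// soakFlag stands in for the -soak test flag; envMinutes is the raw value
// of HARNESS_SOAK_MINUTES. Both the precedence and the default are assumed.
func soakDuration(soakFlag bool, envMinutes string) (time.Duration, bool) {
	if envMinutes != "" {
		// A valid positive integer enables the test with that duration.
		if n, err := strconv.Atoi(envMinutes); err == nil && n > 0 {
			return time.Duration(n) * time.Minute, true
		}
	}
	if soakFlag {
		return 15 * time.Minute, true // assumed default
	}
	return 0, false // a real test would call t.Skip here
}

func main() {
	d, ok := soakDuration(false, os.Getenv("HARNESS_SOAK_MINUTES"))
	fmt.Println("run:", ok, "duration:", d)
}
```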
**Wrapper Script:**
- testing-tools/run_adaptive_soak.sh for easy execution
- Accepts duration in minutes: ./run_adaptive_soak.sh 30
- Logs to tmp/adaptive_soak_<timestamp>.log
- Sets proper timeout (duration + 5min buffer)
**Test Results (2-minute validation):**
- 80 instances, 17 samples
- Heap: 2.3MB → 3.1MB (healthy)
- Goroutines: 16 → 6 (no leak, actually decreased)
- Circuit breakers: correctly blocking transient failures
Run with: go test -tags=integration ./internal/monitoring -run TestAdaptiveSchedulerSoak -soak -timeout 20m
Part of Phase 2 Task 9 (Integration/Soak Testing)
|
2025-10-20 15:13:38 +00:00 |
|
rcourtman
|
2636ba9137
|
test: add comprehensive integration test harness for adaptive polling (Phase 2 Task 9c)
Add PollExecutor seam and integration test infrastructure:
**PollExecutor Interface:**
- Add pluggable executor interface for testability
- Implement realExecutor wrapping existing poll functions
- Add SetExecutor() for test injection
- Zero impact on production behavior
**Integration Test Harness:**
- Build-tagged integration tests (//go:build integration)
- Synthetic workload generator with configurable scenarios
- Fake executor simulating latencies, failures, recovery
- Runtime metrics collection (queue depth, staleness, goroutines)
**Comprehensive Assertions:**
- Queue depth bounds: stays within 1.5× instance count
- Staleness: healthy instances <20s, multiple poll cycles
- Circuit breakers: transient failures recover, permanent stay blocked
- Dead-letter queue: only permanent failures routed
- Scheduler health: snapshot consistency validation
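
Two of these checks, the queue depth bound and the staleness limit, could be expressed along the lines below. The helper names and the `Ages` type are hypothetical, and how the harness actually collects depths and staleness is not shown in this commit message.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Ages maps an instance name to the time since its last successful poll.
// Hypothetical type used only for this sketch.
type Ages map[string]time.Duration

// queueDepthWithinBound reports whether every sampled queue depth stays
// at or below factor × instances (1.5× in the commit message).
func queueDepthWithinBound(depths []int, instances int, factor float64) bool {
	limit := float64(instances) * factor
	for _, d := range depths {
		if float64(d) > limit {
			return false
		}
	}
	return true
}

// staleInstances returns, in sorted order, the instances whose staleness
// exceeds limit.
func staleInstances(ages Ages, limit time.Duration) []string {
	var stale []string
	for name, age := range ages {
		if age > limit {
			stale = append(stale, name)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	depths := []int{3, 12, 18}
	fmt.Println("queue bounded:", queueDepthWithinBound(depths, 12, 1.5))

	ages := Ages{"pve-1": 5 * time.Second, "pve-2": 30 * time.Second}
	fmt.Println("stale:", staleInstances(ages, 20*time.Second))
}
```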
**Test Scenarios:**
- 10 healthy PVE instances (rapid polling)
- 1 transient failure instance (fail → recover)
- 1 permanent failure instance (DLQ routing)
- 55s test duration with 3s base intervals
- Validates full adaptive scheduler lifecycle
Run with: go test -tags=integration ./internal/monitoring -run TestAdaptiveSchedulerIntegration
Part of Phase 2 Task 9 (Integration/Soak Testing)
|
2025-10-20 15:13:38 +00:00 |
|