Backend:
- Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with
status indicators (normal/elevated/anomaly), not only when a metric is anomalous
- This makes Pulse Pro's 'moat' visible - users can see the AI understands
their infrastructure's normal behavior patterns
- Added baseline import to service.go
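The always-on status indicator could look roughly like the sketch below. It is illustrative only: the Baseline type, the classify and formatBaseline helpers, and the 2 and 3 standard deviation cutoffs are assumptions, not the actual buildEnrichedResourceContext logic.

```go
package main

import "fmt"

// Baseline is a hypothetical summary of the learned behavior of one metric.
type Baseline struct {
	Mean   float64
	StdDev float64
}

// Status labels taken from the indicators named in the commit message.
const (
	StatusNormal   = "normal"
	StatusElevated = "elevated"
	StatusAnomaly  = "anomaly"
)

// classify compares the current value to the baseline.
// The 2 and 3 sigma cutoffs are assumptions chosen for illustration.
func classify(value float64, b Baseline) string {
	if b.StdDev <= 0 {
		return StatusNormal // no spread learned yet; avoid dividing by zero
	}
	z := (value - b.Mean) / b.StdDev
	if z < 0 {
		z = -z
	}
	switch {
	case z >= 3:
		return StatusAnomaly
	case z >= 2:
		return StatusElevated
	default:
		return StatusNormal
	}
}

// formatBaseline renders the baseline line for every metric,
// whether or not the value is anomalous.
func formatBaseline(metric string, value float64, b Baseline) string {
	return fmt.Sprintf("%s: current %.1f, baseline %.1f +/- %.1f [%s]",
		metric, value, b.Mean, b.StdDev, classify(value, b))
}

func main() {
	b := Baseline{Mean: 40, StdDev: 5}
	for _, v := range []float64{42, 52, 60} {
		fmt.Println(formatBaseline("cpu", v, b))
	}
}
```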
Frontend (user changes):
- Added incident event type filtering with toggle buttons
- Added resource incident panel to view all incidents for a resource
- Added timeline expand/collapse functionality in alert history
- Added incident note saving with proper incidentId tracking
- Added startedAt parameter for proper incident timeline loading
Reverts overly strict alert ID validation that was rejecting valid
alert IDs containing special characters. Docker host IDs can contain
user-supplied data like hostnames which may include parentheses,
brackets, or other printable ASCII characters.
The previous validation only allowed alphanumeric + limited punctuation,
which caused 400 errors when acknowledging alerts from Docker hosts
with special characters in their identifiers.
Related to #852
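A relaxed check along these lines accepts any printable ASCII while still rejecting control characters. This is a sketch, not the project's validator: validAlertID and the maxAlertIDLen limit are hypothetical names and values.

```go
package main

import "fmt"

// maxAlertIDLen is an assumed upper bound; the real limit may differ.
const maxAlertIDLen = 500

// validAlertID accepts non-empty IDs made only of printable ASCII
// (0x20 through 0x7E). Control characters and non-ASCII bytes are rejected.
func validAlertID(id string) bool {
	if id == "" || len(id) > maxAlertIDLen {
		return false
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		if c < 0x20 || c > 0x7e {
			return false
		}
	}
	return true
}

func main() {
	for _, id := range []string{
		"docker-host(prod)[eu-1]/cpu", // parentheses and brackets are allowed
		"bad\nid",                     // newline is a control character
		"",                            // empty IDs are rejected
	} {
		fmt.Printf("%q valid=%v\n", id, validAlertID(id))
	}
}
```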
- Rename checkFlapping to checkFlappingLocked to clarify lock contract
- Replace goto statements with structured control flow
- Wire up unused recordAlertFired/recordAlertResolved metric hooks
- Add trackingMapCleanup goroutine to prevent memory leaks from stale entries
- Tighten alert ID validation to alphanumeric + safe punctuation
- Fix history save error handling to properly manage backup lifecycle
- Add auto-migration for deprecated GroupingWindow field
- Refactor 300+ line UpdateConfig into focused helper functions
- Unify duplicate evaluateVMCondition/evaluateContainerCondition
- Add constants for magic numbers (thresholds, timing, flapping)
- Update tests to match new backup behavior
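The Locked suffix is a common Go convention for "caller must already hold the mutex". A minimal sketch of that contract follows; the Manager fields, RecordChange, and the window and threshold constants are assumptions, not the real implementation.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Named constants instead of magic numbers; the values are assumed.
const (
	flappingWindow    = 5 * time.Minute
	flappingThreshold = 3
)

// Manager is a hypothetical stand-in for the alert manager.
type Manager struct {
	mu      sync.Mutex
	history map[string][]time.Time
}

func NewManager() *Manager {
	return &Manager{history: make(map[string][]time.Time)}
}

// checkFlappingLocked records a state change and reports whether the alert
// is flapping. The caller MUST hold m.mu; the suffix documents that contract.
func (m *Manager) checkFlappingLocked(id string, now time.Time) bool {
	cutoff := now.Add(-flappingWindow)
	kept := m.history[id][:0] // filter in place, dropping entries outside the window
	for _, t := range m.history[id] {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	kept = append(kept, now)
	m.history[id] = kept
	return len(kept) >= flappingThreshold
}

// RecordChange is the public entry point: it takes the lock, then delegates.
func (m *Manager) RecordChange(id string, now time.Time) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.checkFlappingLocked(id, now)
}

// demo feeds three changes one minute apart and returns each verdict.
func demo() []bool {
	m := NewManager()
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	var out []bool
	for i := 0; i < 3; i++ {
		at := start.Add(time.Duration(i) * time.Minute)
		out = append(out, m.RecordChange("vm-100-cpu", at))
	}
	return out
}

func main() {
	fmt.Println(demo())
}
```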
Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured.
Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents.
Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.
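The scope model above, including the legacy full-access default, can be sketched as follows. The scope strings come from this message; the Token type and HasScope method are hypothetical, and treating an empty scope list as "legacy" is an assumption about how unconfigured tokens are represented.

```go
package main

import "fmt"

// Scope names as listed in the commit message.
const (
	ScopeDockerReport    = "docker:report"
	ScopeDockerManage    = "docker:manage"
	ScopeHostAgentReport = "host-agent:report"
	ScopeMonitoringRead  = "monitoring:read"
	ScopeMonitoringWrite = "monitoring:write"
	ScopeSettingsRead    = "settings:read"
	ScopeSettingsWrite   = "settings:write"
)

// Token is a hypothetical API token record.
type Token struct {
	Name   string
	Scopes []string // assumed: empty means a legacy token with no scopes configured
}

// HasScope reports whether the token may perform an action requiring the
// given scope. Legacy tokens keep full access until scopes are configured.
func (t Token) HasScope(required string) bool {
	if len(t.Scopes) == 0 {
		return true
	}
	for _, s := range t.Scopes {
		if s == required {
			return true
		}
	}
	return false
}

func main() {
	agent := Token{Name: "docker-agent", Scopes: []string{ScopeDockerReport}}
	legacy := Token{Name: "old-token"}

	fmt.Println(agent.HasScope(ScopeDockerReport))  // agent may report
	fmt.Println(agent.HasScope(ScopeSettingsWrite)) // agent may not change settings
	fmt.Println(legacy.HasScope(ScopeSettingsWrite))
}
```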
Implement five medium/low-priority improvements identified in a systematic review:
UX IMPROVEMENTS:
- Notify existing critical alerts when activating from pending_review state
  Previously: critical alerts raised during the observation window never triggered a notification
  Now: users are notified of active critical alerts after activation
  Implementation: added NotifyExistingAlert() method and logic in ActivateAlerts()
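The selection step on activation amounts to a filter over the active alerts. The sketch below shows only that idea: the Alert type and criticalToNotify are hypothetical, and the real NotifyExistingAlert() and ActivateAlerts() signatures are not shown in this message.

```go
package main

import "fmt"

// Alert is a hypothetical, trimmed-down alert record.
type Alert struct {
	ID    string
	Level string // assumed values: "warning" or "critical"
}

// criticalToNotify returns the IDs of alerts that are still active and
// critical at activation time, so each can be passed to the notifier.
func criticalToNotify(active []Alert) []string {
	var ids []string
	for _, a := range active {
		if a.Level == "critical" {
			ids = append(ids, a.ID)
		}
	}
	return ids
}

func main() {
	active := []Alert{
		{ID: "node1-cpu", Level: "warning"},
		{ID: "vm-100-disk", Level: "critical"},
	}
	// On leaving pending_review, notify for each ID returned here.
	fmt.Println(criticalToNotify(active))
}
```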
PERFORMANCE OPTIMIZATIONS:
- Replace per-alert cleanup goroutines with periodic batch cleanup
  Avoids spawning thousands of goroutines during alert flapping
  recentlyResolved entries are now cleaned up in one pass per minute instead of by one goroutine per alert
- Simplify GetActiveAlerts() implementation
  Removed the intermediate map copy; the lock is held slightly longer, but the operation is fast
  Cleaner code with fewer memory allocations
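The batch cleanup pattern can be sketched as a single ticker-driven loop that sweeps the map. The names below (Manager, cleanupRecentlyResolved, cleanupLoop) and the 5 minute retention used in the demo are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Manager is a hypothetical stand-in holding the recentlyResolved map.
type Manager struct {
	mu               sync.Mutex
	recentlyResolved map[string]time.Time // alert ID -> time it was resolved
}

// cleanupRecentlyResolved removes every entry older than maxAge in one pass
// and returns how many were removed.
func (m *Manager) cleanupRecentlyResolved(now time.Time, maxAge time.Duration) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	removed := 0
	for id, at := range m.recentlyResolved {
		if now.Sub(at) > maxAge {
			delete(m.recentlyResolved, id) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

// cleanupLoop runs the batch pass on a ticker until stop is closed.
// One goroutine serves all alerts, however many resolve.
func (m *Manager) cleanupLoop(interval, maxAge time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			m.cleanupRecentlyResolved(time.Now(), maxAge)
		case <-stop:
			return
		}
	}
}

// demoCleanup runs one pass over fixed timestamps so the result is deterministic.
func demoCleanup() (removed, remaining int) {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	m := &Manager{recentlyResolved: map[string]time.Time{
		"old-1": now.Add(-10 * time.Minute),
		"old-2": now.Add(-6 * time.Minute),
		"fresh": now.Add(-30 * time.Second),
	}}
	removed = m.cleanupRecentlyResolved(now, 5*time.Minute)
	return removed, len(m.recentlyResolved)
}

func main() {
	removed, remaining := demoCleanup()
	fmt.Println("removed:", removed, "remaining:", remaining)
}
```

In a running service the loop would be started once, for example with a one minute interval, and stopped by closing the stop channel.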
CONFIGURATION VALIDATION:
- Validate timezone in quiet hours configuration
  Invalid timezones now disable quiet hours and log an error instead of silently falling back
  Prevents unexpected behavior when the timezone is mistyped or otherwise invalid
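A validation step of this kind can be built on time.LoadLocation from the standard library. The QuietHours type and validateQuietHours function below are hypothetical; only the disable-and-log behavior is taken from this message.

```go
package main

import (
	"fmt"
	"log"
	"time"
)

// QuietHours is a hypothetical, trimmed-down quiet hours configuration.
type QuietHours struct {
	Enabled  bool
	Timezone string // IANA name such as "Europe/London"
}

// validateQuietHours disables quiet hours and logs an error when the
// timezone cannot be loaded, rather than silently falling back.
// Note: time.LoadLocation treats "" as UTC, so an empty name is not an error here.
func validateQuietHours(q QuietHours) QuietHours {
	if !q.Enabled {
		return q
	}
	if _, err := time.LoadLocation(q.Timezone); err != nil {
		log.Printf("quiet hours disabled: invalid timezone %q: %v", q.Timezone, err)
		q.Enabled = false
	}
	return q
}

func main() {
	ok := validateQuietHours(QuietHours{Enabled: true, Timezone: "UTC"})
	bad := validateQuietHours(QuietHours{Enabled: true, Timezone: "Europe/Lodon"}) // typo
	fmt.Println(ok.Enabled, bad.Enabled)
}
```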
GRACEFUL SHUTDOWN:
- Add 100ms delay in Stop() for background goroutine cleanup
  Reduces the risk of state corruption during shutdown
  Allows the escalation checker and periodic save to exit cleanly
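The shutdown delay can be sketched as "signal, then wait briefly". Everything below is an assumption for illustration (Manager, periodicSave, the exited flag); only the 100ms grace period comes from this message. The sketch uses atomic.Bool, which requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// shutdownGrace is the delay described in the commit message.
const shutdownGrace = 100 * time.Millisecond

// Manager is a hypothetical stand-in with one background worker.
type Manager struct {
	stop     chan struct{}
	stopOnce sync.Once
	exited   atomic.Bool // set by the worker when it returns
}

func NewManager() *Manager {
	m := &Manager{stop: make(chan struct{})}
	go m.periodicSave()
	return m
}

// periodicSave stands in for a background loop such as a periodic save.
func (m *Manager) periodicSave() {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			// state would be persisted here
		case <-m.stop:
			m.exited.Store(true)
			return
		}
	}
}

// Stop signals the worker and then waits a short, fixed grace period so it
// can leave its loop before the caller tears down shared state.
func (m *Manager) Stop() {
	m.stopOnce.Do(func() {
		close(m.stop)
		time.Sleep(shutdownGrace)
	})
}

func main() {
	m := NewManager()
	m.Stop()
	fmt.Println("worker exited:", m.exited.Load())
}
```

A fixed delay only reduces the risk; a sync.WaitGroup that Stop() waits on would make the exit deterministic, at the cost of threading it through every background goroutine.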
Technical details:
- internal/alerts/alerts.go: Added NotifyExistingAlert(), optimized cleanup patterns
- internal/api/alerts.go: Enhanced ActivateAlerts() to notify existing critical alerts
- Removed ~20 lines of goroutine spawning code
- Added periodic cleanup for recentlyResolved map
- All changes preserve backward compatibility
Testing: Verified compilation with 'go build -o /dev/null ./...'