Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-28 11:30:15 +00:00

Author	SHA1	Message	Date
rcourtman	9b531c547d	Fix recovery notifications silently disabled by config PUT (#1332). Two fixes for missing recovery/resolved notifications: 1. API config PUT handler now preserves notifyOnResolve when the client omits it from the request body. Go decodes a missing bool as false, which silently disabled recovery notifications on older clients. 2. CancelAlert now always cleans up the cooldown record even when the alert has already left the pending buffer, preventing stale cooldown entries from suppressing future alert cycles.	2026-03-09 11:28:28 +00:00
rcourtman	0dd3fc779b	Fix alert disable notification suppression	2026-03-07 18:40:08 +00:00
rcourtman	12a5a98117	fix: SSE race conditions, alert user spoofing, and security status oracle. SSE Broadcaster: add per-client mutex to prevent concurrent writes to ResponseWriter; fix data race in cleanupLoop reading LastActive without synchronization; update LastActive in SendHeartbeat so clients aren't incorrectly pruned after 5 minutes of idle heartbeat traffic. Alert Acknowledgements: extract authenticated user from X-Authenticated-User header instead of hardcoding 'admin' or trusting request body's User field; prevents audit log spoofing and ensures accurate user attribution. Security Status Endpoint: remove ?token= query param validation from public /api/security/status; prevents endpoint from acting as a token validity oracle for attackers; authentication still works via session cookies and X-API-Token header.	2026-02-03 17:40:58 +00:00
rcourtman	289d95374f	feat: add multi-tenancy foundation (directory-per-tenant) Implements Phase 1-2 of multi-tenancy support using a directory-per-tenant strategy that preserves existing file-based persistence. Key changes: - Add MultiTenantPersistence manager for org-scoped config routing - Add TenantMiddleware for X-Pulse-Org-ID header extraction and context propagation - Add MultiTenantMonitor for per-tenant monitor lifecycle management - Refactor handlers (ConfigHandlers, AlertHandlers, AIHandlers, etc.) to be context-aware with getConfig(ctx)/getMonitor(ctx) helpers - Add Organization model for future tenant metadata - Update server and router to wire multi-tenant components All handlers maintain backward compatibility via legacy field fallbacks for single-tenant deployments using the "default" org.	2026-01-22 13:39:06 +00:00
rcourtman	f478046696	refactor(api): Add interfaces to handlers for testability. Extract interfaces from concrete monitor type dependencies. alerts.go: add AlertManager, ConfigPersistence, AlertMonitor interfaces; change AlertHandlers to accept AlertMonitor interface. notifications.go: add NotificationManager, NotificationConfigPersistence interfaces; add NotificationMonitor interface; change NotificationHandlers to accept NotificationMonitor interface. updates.go: add UpdatesMonitor interface; change UpdatesHandlers to accept interface. audit_handlers.go: update to use interface-based injection. profile_suggestions.go: minor interface alignment. Benefits: handlers can now be tested with mock implementations; decouples handlers from concrete monitoring.Monitor type; works with monitor_wrappers.go added in previous commit.	2026-01-19 19:21:46 +00:00
rcourtman	3096ec53b5	fix: preserve alert activation state when saving config. Related to #1096	2026-01-16 14:24:02 +00:00
rcourtman	fa43628cde	fix: Alert acknowledge/unacknowledge fails with reverse proxies Reverse proxies (Traefik, Caddy, nginx) often normalize or reject URLs containing %2F (encoded slash). Alert IDs contain forward slashes (e.g., "docker-container-state-docker:abc/def"), causing acknowledge requests to fail with 400 errors when going through a reverse proxy. Added new body-based endpoints that accept alert ID in JSON body: - POST /api/alerts/acknowledge {"id": "..."} - POST /api/alerts/unacknowledge {"id": "..."} - POST /api/alerts/clear {"id": "..."} Updated frontend to use the new endpoints. Legacy path-based endpoints are preserved for backwards compatibility. Related to #1026	2026-01-03 20:51:25 +00:00
rcourtman	96573f4aca	feat: enhance AI baseline context visibility and incident timeline improvements Backend: - Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with status indicators (normal/elevated/anomaly) instead of only when anomalous - This makes Pulse Pro's 'moat' visible - users can see the AI understands their infrastructure's normal behavior patterns - Added baseline import to service.go Frontend (user changes): - Added incident event type filtering with toggle buttons - Added resource incident panel to view all incidents for a resource - Added timeline expand/collapse functionality in alert history - Added incident note saving with proper incidentId tracking - Added startedAt parameter for proper incident timeline loading	2025-12-21 00:14:20 +00:00
rcourtman	e157aa04ad	fix: Allow printable ASCII in alert IDs for acknowledge requests Reverts overly strict alert ID validation that was rejecting valid alert IDs containing special characters. Docker host IDs can contain user-supplied data like hostnames which may include parentheses, brackets, or other printable ASCII characters. The previous validation only allowed alphanumeric + limited punctuation, which caused 400 errors when acknowledging alerts from Docker hosts with special characters in their identifiers. Related to #852	2025-12-16 20:10:31 +00:00
rcourtman	d0d989289a	Refactor alert system: fix race conditions, memory leaks, and improve code quality. Rename checkFlapping to checkFlappingLocked to clarify lock contract; replace goto statements with structured control flow; wire up unused recordAlertFired/recordAlertResolved metric hooks; add trackingMapCleanup goroutine to prevent memory leaks from stale entries; tighten alert ID validation to alphanumeric + safe punctuation; fix history save error handling to properly manage backup lifecycle; add auto-migration for deprecated GroupingWindow field; refactor 300+ line UpdateConfig into focused helper functions; unify duplicate evaluateVMCondition/evaluateContainerCondition; add constants for magic numbers (thresholds, timing, flapping); update tests to match new backup behavior.	2025-12-02 23:31:36 +00:00
rcourtman	b4d497ce3b	security: Add request body size limits to API handlers Add http.MaxBytesReader to 16 additional handlers to prevent memory exhaustion attacks via oversized request bodies: - docker_agents.go: HandleReport (512KB), HandleCommandAck (8KB), HandleSetCustomDisplayName (8KB) - alerts.go: UpdateAlertConfig (64KB), BulkAcknowledgeAlerts (32KB), BulkClearAlerts (32KB) - config_handlers.go: HandleAddNode, HandleTestConnection, HandleUpdateNode, HandleTestNodeConfig (32KB each), HandleVerifyTemperatureSSH, HandleExportConfig, HandleDiscoverServers, HandleSetupScriptURL (8KB each), HandleImportConfig (1MB), HandleUpdateMockMode (16KB)	2025-12-02 16:43:13 +00:00
rcourtman	255357d2fe	Add recovery notifications and grouping controls	2025-11-21 22:07:00 +00:00
rcourtman	77282bd3a6	Implement Pulse tag overrides and alert clear persistence	2025-10-25 14:28:32 +00:00
rcourtman	5c54685f04	Add API token scopes and standalone host agent Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured. Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents. Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.	2025-10-23 11:40:31 +00:00
rcourtman	ff4dc49ae4	Update Pulse install flow and related components	2025-10-21 19:58:53 +00:00
rcourtman	ad371bf412	feat: improve alert system performance, UX, and edge case handling. Implement 5 medium/low priority improvements identified in systematic review. UX IMPROVEMENTS: Notify existing critical alerts when activating from pending_review state. Previously: critical alerts during observation window would never notify. Now: users receive notifications for active critical alerts after activation. Implementation: added NotifyExistingAlert() method and logic in ActivateAlerts(). PERFORMANCE OPTIMIZATIONS: (a) Replace per-alert cleanup goroutines with periodic batch cleanup. Prevents spawning 1000s of goroutines during alert flapping; recentlyResolved entries now cleaned up once per minute instead of 1 goroutine per alert. (b) Simplify GetActiveAlerts() implementation. Removed intermediate map copy; holds lock slightly longer but operation is fast; cleaner code with reduced memory allocation. CONFIGURATION VALIDATION: Validate timezone in quiet hours configuration. Invalid timezones now disable quiet hours with error log instead of silent fallback; prevents unexpected behavior when timezone is typo'd or invalid. GRACEFUL SHUTDOWN: Add 100ms delay in Stop() for background goroutine cleanup. Reduces risk of state corruption during shutdown; allows escalation checker and periodic save to exit cleanly. Technical details: internal/alerts/alerts.go: added NotifyExistingAlert(), optimized cleanup patterns; internal/api/alerts.go: enhanced ActivateAlerts() to notify existing critical alerts; removed ~20 lines of goroutine spawning code; added periodic cleanup for recentlyResolved map; all changes preserve backward compatibility. Testing: verified compilation with 'go build -o /dev/null ./...'.	2025-10-21 11:05:45 +00:00
rcourtman	85ffe10aed	docs: add Mermaid diagrams to improve visual documentation Enhance documentation with six Mermaid diagrams to better explain complex system implementations: - Adaptive polling lifecycle flowchart showing enqueue→execute→feedback cycle with scheduler, priority queue, and worker interactions - Circuit breaker state machine diagram illustrating Closed↔Open↔Half-open transitions with triggers and recovery paths - Temperature proxy architecture diagram highlighting trust boundaries, security controls, and data flow between host/container/cluster - Sensor proxy request flow sequence diagram showing auth, rate limiting, validation, and SSH execution pipeline - Alert webhook pipeline flowchart detailing template resolution, URL rendering, HTTP dispatch, and retry logic - Script library workflow diagram illustrating dev→test→bundle→distribute lifecycle emphasizing modular design These visualizations make it easier for operators and contributors to understand Pulse's sophisticated architectural patterns.	2025-10-21 10:40:33 +00:00
rcourtman	5927535110	Ref #556: adjust alert history range handling	2025-10-15 18:41:06 +00:00
rcourtman	0a5a4c1a0d	Allow printable alert IDs for acknowledgements (#550)	2025-10-14 16:48:22 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00

20 commits
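
Commit 9b531c547d notes that Go decodes a missing bool as false, which is how an omitted notifyOnResolve ended up disabling recovery notifications. One common way to tell "omitted" from "explicitly false" is to decode into a pointer field and merge only when it is non-nil. The sketch below illustrates that idea; AlertConfig, alertConfigUpdate and applyUpdate are hypothetical names, and the actual fix in Pulse may be implemented differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AlertConfig is a hypothetical stand-in for the stored alert configuration.
type AlertConfig struct {
	NotifyOnResolve bool `json:"notifyOnResolve"`
}

// alertConfigUpdate mirrors the request body. The pointer stays nil when the
// client omits the field, and points at false when the client sends false.
type alertConfigUpdate struct {
	NotifyOnResolve *bool `json:"notifyOnResolve"`
}

// applyUpdate decodes body and merges it into current, keeping the stored
// value of any field the client left out.
func applyUpdate(current AlertConfig, body []byte) (AlertConfig, error) {
	var upd alertConfigUpdate
	if err := json.Unmarshal(body, &upd); err != nil {
		return current, err
	}
	if upd.NotifyOnResolve != nil {
		current.NotifyOnResolve = *upd.NotifyOnResolve
	}
	return current, nil
}

func main() {
	stored := AlertConfig{NotifyOnResolve: true}

	// An older client that does not know about the field.
	older, _ := applyUpdate(stored, []byte(`{}`))
	// A client that deliberately turns the setting off.
	explicit, _ := applyUpdate(stored, []byte(`{"notifyOnResolve": false}`))

	fmt.Println(older.NotifyOnResolve, explicit.NotifyOnResolve) // true false
}
```

Decoding straight into a plain bool field would return false in both cases, which is the failure mode the commit describes.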