Pulse

vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-05-07 08:57:12 +00:00

Author	SHA1	Message	Date
rcourtman	b8a551ce22	Forward-port webhook JSON template escaping	2026-04-01 17:04:40 +01:00
rcourtman	a253016327	Harden Apprise server URL base handling	2026-04-01 15:50:30 +01:00
rcourtman	53f41fdb45	Harden webhook request URL validation	2026-03-29 13:18:40 +01:00
rcourtman	d6536932fc	Harden outbound URLs and file-backed storage	2026-03-29 12:47:55 +01:00
rcourtman	778a2577b6	feat: Pulse v6 release	2026-03-18 16:06:30 +00:00
rcourtman	9b531c547d	Fix recovery notifications silently disabled by config PUT (#1332) Two fixes for missing recovery/resolved notifications: 1. API config PUT handler now preserves notifyOnResolve when the client omits it from the request body. Go decodes a missing bool as false, which silently disabled recovery notifications on older clients. 2. CancelAlert now always cleans up the cooldown record even when the alert has already left the pending buffer, preventing stale cooldown entries from suppressing future alert cycles.	2026-03-09 11:28:28 +00:00
rcourtman	0dd3fc779b	Fix alert disable notification suppression	2026-03-07 18:40:08 +00:00
rcourtman	464d3f8486	Fix stale queued notification delivery	2026-03-05 23:46:35 +00:00
rcourtman	eb2397d99a	fix(notifications): route escalation notifications to selected channels only (#1259) Escalation was calling SendAlert() which always sends to all enabled channels, ignoring the per-level channel selection (email/webhook/all). Add SendAlertToChannels() that snapshots only the requested channel configs and uses a distinct "_escalation" queue type so the dequeue handler skips cooldown writes, preventing interference with the alert manager's own re-notify cadence.	2026-02-26 20:49:10 +00:00
rcourtman	77bd2e70d9	fix(notifications): add service-specific resolved webhook templates (#1259) Backport from v6 (88d5865a8). Recovery webhook notifications were using the firing PayloadTemplate which services like Telegram, Teams, Discord etc. silently rejected as malformed. Now uses a three-tier template pipeline matching the firing path: - Tier 1: Custom user template (if configured) - Tier 2: Service-specific ResolvedPayloadTemplate (Discord green embed, Telegram chat_id+text, Slack header blocks, Teams MessageCard/Adaptive, PagerDuty event_action:"resolve", Pushover, Gotify, Mattermost) - Tier 3: Generic JSON fallback (backward compatible) Also adds Event, ResolvedAt, ResolvedAtISO fields to WebhookPayloadData.	2026-02-24 23:28:33 +00:00
rcourtman	82ccb662f9	fix(notifications): use service-specific templates for resolved webhooks (#1068) Recovery notifications for Discord, Slack, Teams, PagerDuty, and other service webhooks were sending a generic JSON payload that lacked the required format (e.g. Discord needs `embeds`, Slack needs `blocks`), causing resolved notifications to silently fail. - Add `prepareResolvedWebhookData` to build template data with Level="resolved" - Route resolved webhooks through service-specific templates with full URL rendering, Telegram ChatID extraction, and PagerDuty routing_key - Custom user templates take precedence over built-in service templates - Return errors on service template failures instead of falling back to generic payloads that endpoints would reject - Fix PagerDuty template to send event_action="resolve" for resolved alerts	2026-02-24 10:49:52 +00:00
rcourtman	8a48acef1d	fix: hotfix 5.1.5 — node duplication, alert scrambling, ntfy resolved formatting - fix(models): filter nodes by instance in UpdateNodesForInstance to prevent PVE node duplication across poll cycles (#1214, #1192, #1217) - fix(alerts): sort GetActiveAlerts output for stable ordering, preventing hostname scrambling in frontend (#1218) - fix(notifications): add ntfy-specific resolved webhook formatting with plain-text body and proper headers (#1213) - fix(frontend): respect "hide Docker update actions" setting in DockerFilter Update All button (#1219) - fix(frontend): add missing v prefix to GitHub release tag URLs (#1195) - fix(monitoring): reduce disk detection warning from Warn to Debug to eliminate log spam for pass-through disks (#1216) - chore: bump VERSION to 5.1.5	2026-02-08 11:48:22 +00:00
rcourtman	b3fa409b74	Allow SMTP auth over unencrypted connections, fix rate limit persistence, sanitize diagnostics export - Replace Go stdlib smtp.PlainAuth (which refuses credentials without TLS) with a custom plainAuth that respects the user's explicit transport choice - Remove TLS guard from LoginAuth for the same reason - Add RateLimit field to EmailConfig so the user's configured value is persisted instead of being silently overwritten with 60 - Implement actual sanitization in the "Export for GitHub" diagnostics button (was previously ignored — both exports produced identical data) Related to #1189	2026-02-04 15:42:47 +00:00
rcourtman	05266d9062	Show node display name in alerts instead of raw Proxmox node name Alerts previously showed the raw Proxmox node name (e.g., "on pve") even when users configured a display name (e.g., "SPACEX") via Settings or the host agent --hostname flag. This affected the alert UI, email notifications, and webhook payloads. Add NodeDisplayName field to the alert chain: cache display names in the alert Manager (populated by CheckNode/CheckHost on every poll), resolve them at alert creation via preserveAlertState, refresh on metric updates, and enrich at read time in GetActiveAlerts. Update models.Alert, the syncAlertsToState conversion, email templates, Apprise body text, webhook payloads, and all frontend rendering paths. Related to #1188	2026-02-04 14:26:44 +00:00
rcourtman	7c1ebbecd5	fix(security): enhance webhook validation, enforce API scopes, and improve test coverage	2026-02-03 22:41:44 +00:00
rcourtman	81f146dcf0	Security fixes: Prevent Apprise RCE and Webhook DNS Rebinding	2026-02-03 22:00:02 +00:00
rcourtman	b2639ed5a5	Fix security vulnerabilities and critical bugs - Fix WebSocket CORS bypass by strictly verifying origin - Fix OIDC refresh token persistence by encrypting at rest - Fix grouped webhook data mutation by cloning alerts - Fix host agent uninstall authorization and config fetch logic - Fix notification queue recovery for stuck sending items - Fix ignored update history limit parameter - Fix ineffective break statement in WebSocket write pump	2026-02-03 17:16:27 +00:00
rcourtman	4f40c3d751	fix: resolve critical stability and auth issues - Fix data race in webhook notifications by removing shared state - Fix duplicate monitors on config reload by stopping old instances - Prevent metrics ID deletion on transient startup errors - Support Bearer auth header for config export/import endpoints	2026-02-03 16:46:27 +00:00
rcourtman	aeca5e39fa	Fix multi-tenant persistence and backend stability - Initialize Alert and Notification managers with tenant-specific data directories - Add panic recovery to WebSocket safeSend for stability - Record host metrics to history for sparkline support	2026-02-03 16:24:42 +00:00
rcourtman	d06ed2edb3	refactor: Add testability improvements to core packages hostagent/commands.go: - Extract execCommandContext as mockable variable hostagent/proxmox_setup.go: - Convert stateFilePath constants to variables (testable) - Extract runCommand and lookPath as mockable functions - Add duplicate comment (minor cleanup needed) notifications/notifications.go: - Add GetQueueStats() method for interface compliance - Used by NotificationMonitor interface updates/manager.go: - Add AddSSEClient, RemoveSSEClient, GetSSECachedStatus methods - Enables interface-based SSE client management pkg/audit/export.go: - Minor testability improvements go.mod/go.sum: - Add stretchr/objx v0.5.2 (test mocking dependency)	2026-01-19 19:25:38 +00:00
rcourtman	17dec929a0	feat: Add mention support for webhook alerts. Related to #1118 Adds a Mention field to webhook configurations that allows users to tag individuals or groups when alerts are sent. This works with: - Discord: @everyone, <@USER_ID>, <@&ROLE_ID> - Microsoft Teams: @General, user email - Mattermost: @channel, @all, @username The mention is included in the webhook payload via the {{.Mention}} template variable. Built-in templates for Discord, Slack, and Teams now conditionally include mentions when configured. Backend changes: - Add Mention field to WebhookConfig struct - Add Mention field to WebhookPayloadData for template access - Pass mention through sendGroupedWebhook Frontend changes: - Add mention field to Webhook interface - Add Mention input to webhook configuration form - Show service-specific help text for mention formats	2026-01-18 15:16:37 +00:00
rcourtman	e6d07c3294	style: remove emojis from log messages Replaced emoji icons with plain text for cleaner logs and cross-platform compatibility.	2025-12-13 21:29:11 +00:00
rcourtman	611740087c	style: fix additional staticcheck warnings - Lowercase error messages (ST1005) - Use context.Background() instead of nil (SA1012) - Fix rand.Intn(1) which always returns 0 (SA4030) - Remove unnecessary nil check before len() (S1009)	2025-11-27 09:21:11 +00:00
rcourtman	ad998a1e2f	style: fix staticcheck style warnings - Merge variable declaration with assignment (S1021) - Use unconditional strings.TrimPrefix (S1017) - Remove unnecessary nil checks around range (S1031) - Remove unnecessary fmt.Sprintf (S1039) - Use copy() instead of manual loop (S1001) - Use time.Until instead of t.Sub(time.Now()) (S1024) - Use buf.String() instead of string(buf.Bytes()) (S1030)	2025-11-27 09:19:33 +00:00
rcourtman	b370799988	chore: remove more dead code Remove 330 lines of unreachable code: - internal/monitoring/temperature_service.go: unused temperature service abstraction - internal/monitoring/temperature.go: unused NewTemperatureCollector wrapper - internal/mock/generator.go: unused GenerateAlerts function - internal/mock/integration.go: unused ToggleMockMode wrapper - internal/notifications/notifications.go: unused sendEmailWithContent, generatePayloadFromTemplate, isPrivateRange172, groupAlerts - internal/notifications/email_providers.go: unused GetProviderDefaults	2025-11-27 00:10:55 +00:00
rcourtman	255357d2fe	Add recovery notifications and grouping controls	2025-11-21 22:07:00 +00:00
rcourtman	11d7f4fd4e	Add Apprise test support for notifications Related to #584	2025-11-20 17:54:20 +00:00
rcourtman	8d320ef56b	Fix notification manager deadlock in Stop() Critical deadlock fix: - Stop() was holding n.mu lock while calling queue.Stop() - queue.Stop() waits for worker goroutines to finish - Worker goroutines call ProcessQueuedNotification() which needs n.mu lock - This created a classic lock-order deadlock Fix: - Unlock n.mu before calling queue.Stop() - Relock after queue shutdown completes - Workers can now finish and acquire lock as needed This resolves 30-second test timeouts in notifications package. Tests now complete in <1s instead of timing out at 30s.	2025-11-11 23:58:18 +00:00
rcourtman	d7766af799	Fix backend test failures blocking release workflow Three categories of fixes: 1. Goroutine leak causing 10-minute timeout: - Add defer mon.notificationMgr.Stop() in monitor_memory_test.go - Background goroutines from notification manager weren't being stopped 2. Database NULL column scanning errors: - Change LastError from string to *string in queue.go - Change PayloadBytes from int to *int in queue.go - SQL NULL values require pointer types in Go 3. SSRF protection blocking test servers: - Check allowlist for localhost before rejecting in notifications.go - Set PULSE_DATA_DIR to temp directory in tests - Add defer nm.Stop() calls to prevent goroutine leaks Fixes for preflight test failures in workflow run 19280879903.	2025-11-11 23:27:03 +00:00
rcourtman	1b221cca71	feat: Add configurable allowlist for webhook private IP targets (addresses #673) Allow homelab users to send webhooks to internal services while maintaining security defaults. Changes: - Add webhookAllowedPrivateCIDRs field to SystemSettings (persistent config) - Implement CIDR parsing and validation in NotificationManager - Convert ValidateWebhookURL to instance method to access allowlist - Add UI controls in System Settings for configuring trusted CIDR ranges - Maintain strict security by default (block all private IPs) - Keep localhost, link-local, and cloud metadata services blocked regardless of allowlist - Re-validate on both config save and webhook delivery (DNS rebinding protection) - Add comprehensive tests for CIDR parsing and IP matching Backend: - UpdateAllowedPrivateCIDRs() parses comma-separated CIDRs with validation - Support for bare IPs (auto-converts to /32 or /128) - Thread-safe allowlist updates with RWMutex - Logging when allowlist is updated or used - Validation errors prevent invalid CIDRs from being saved Frontend: - New "Webhook Security" section in System Settings - Input field with examples and helpful placeholder text - Real-time unsaved changes tracking - Loads and saves allowlist via system settings API Security: - Default behavior unchanged (all private IPs blocked) - Explicit opt-in required via configuration - Localhost (127/8) always blocked - Link-local (169.254/16) always blocked - Cloud metadata services always blocked - DNS resolution checked at both save and send time Testing: - Tests for CIDR parsing (valid/invalid inputs) - Tests for IP allowlist matching - Tests for bare IP address handling - Tests for security boundaries (localhost, link-local remain blocked) Related to #673 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-09 08:31:12 +00:00
rcourtman	7ee11105f5	Implement queue cancellation and atomic DB operations (P1 fixes) Queue cancellation mechanism: - Add CancelByAlertIDs method to mark queued notifications as cancelled when alerts resolve - Update CancelAlert to cancel queued notifications containing resolved alert IDs - Skip cancelled notifications in queue processor - Prevents resolved alerts from triggering notifications after they clear Atomic DB operations: - Add IncrementAttemptAndSetStatus to atomically update attempt counter and status - Replace separate IncrementAttempt + UpdateStatus calls with single atomic operation - Prevents orphaned queue entries when crashes occur between operations - Eliminates race condition where rows get stuck in "pending" or "sending" status These fixes ensure queued notifications are properly cancelled when alerts resolve and prevent database inconsistencies during crash scenarios.	2025-11-07 08:33:09 +00:00
rcourtman	c6a69e525c	Fix critical notification system bugs and security issues Critical fixes (P0): - Fix cooldown timing: Mark cooldown only after successful delivery, not before enqueue - Add os.MkdirAll to queue initialization to prevent silent failures on fresh installs - Add DNS re-validation at webhook send time to prevent DNS rebinding SSRF attacks - Add SSRF validation for Apprise HTTP URLs - Remove secret logging (bot tokens, routing keys) from debug logs - Implement lastNotified cleanup to prevent unbounded memory growth - Use shared HTTP client for webhooks to enable TLS connection reuse - Add fallback to direct sending when queue enqueue fails - Make queue worker concurrent (5 workers with semaphore) to prevent head-of-line blocking - Fix webhook rate limiter race condition with separate mutex - Fix email manager thread safety with mutex on rate limiter - Fix grouping timer leak by adding stopCleanup signal - Fix webhook 429 double sleep (use Retry-After OR backoff, not both) Frontend improvements: - Add queue/DLQ management API methods (getQueueStats, getDLQ, retryDLQItem, deleteDLQItem) - Add getNotificationHealth and getWebhookHistory endpoints - Add Apprise test support to NotificationTestRequest type Related to notification system audit	2025-11-07 08:29:13 +00:00
rcourtman	febce91145	Remove internal development documentation files Remove 4 LLM-generated internal development docs that don't belong in the repository: - MIGRATION_SCAFFOLDING.md - NOTIFICATION_AUDIT.md - NOTIFICATION_QUICK_REFERENCE.md - NOTIFICATION_SYSTEM_MAP.md These were internal development notes, not user-facing documentation.	2025-11-07 08:23:19 +00:00
rcourtman	6a48c759e8	Fix critical notification system bugs and security issues This commit addresses multiple critical issues identified in the notification system audit conducted with Codex: Critical Fixes: 1. Queue Retry Logic (Critical #1) - Fixed broken retry/DLQ system where send functions never returned errors - Made sendGroupedEmail(), sendGroupedWebhook(), sendGroupedApprise() return errors - Made sendWebhookRequest() return errors - ProcessQueuedNotification() now properly propagates errors to queue - Retry logic and DLQ now function correctly 2. Attempt Counter Bug (Critical #2) - Fixed double-increment bug in queue processing - Separated UpdateStatus() from attempt tracking - Added IncrementAttempt() method - Notifications now get correct number of retry attempts 3. Secret Exposure (Critical #3 & #4) - Masked webhook headers and customFields in GET /api/notifications/webhooks - Added redactSecretsFromURL() to sanitize webhook URLs in history - Truncated/redacted response bodies in webhook history - Protected against credential harvesting via API 4. Email Rate Limiting (Critical #5) - Added emailManager field to NotificationManager - Shared EnhancedEmailManager instance across sends - Rate limiter now accumulates across multiple emails - SMTP rate limits are now enforced correctly 5. SSRF Protection (High #6) - Added DNS resolution of webhook URLs - Added isPrivateIP() check using CIDR ranges - Blocks all private IP ranges (10/8, 172.16/12, 192.168/16, 127/8, 169.254/16) - Blocks IPv6 private ranges (::1, fe80::/10, fc00::/7) - Prevents DNS rebinding attacks - Returns error instead of warning for private IPs New Features: 6. Health Endpoint (High #8) - Added GET /api/notifications/health - Returns queue stats (pending, sending, sent, failed, dlq) - Shows email/webhook configuration status - Provides overall health indicator Related to notification system audit Files changed: - internal/notifications/notifications.go: Error returns, rate limiting, SSRF hardening - internal/notifications/queue.go: Attempt tracking fix - internal/api/notifications.go: Secret masking, health endpoint	2025-11-06 23:26:03 +00:00
rcourtman	20099549c6	Add comprehensive release validation to prevent missing artifacts Adds automated validation script to prevent the pattern of patch releases caused by missing files/artifacts. scripts/validate-release.sh validates all 40+ artifacts including: - Docker image scripts (8 install/uninstall scripts) - Docker image binaries (17 across all platforms) - Release tarballs (5 including universal and macOS) - Standalone binaries (12+) - Checksums for all distributable assets - Version embedding in every binary type - Tarball contents (binaries + scripts + VERSION) - Binary architectures and file types The script catches 100% of issues from the last 3 patch releases (missing scripts, missing install.sh, missing binaries, broken version embedding). Updated RELEASE_CHECKLIST.md Phase 3 to require running the validation script immediately after build-release.sh and before proceeding to Docker build/publish phases. Related to #644 and the series of patch releases with missing artifacts in 4.26.x.	2025-11-06 16:33:49 +00:00
rcourtman	ddc787418b	Round float values in webhook payloads to 1 decimal place Webhook alert payloads now round Value and Threshold fields to 1 decimal place before template rendering. This eliminates excessive precision in webhook messages (e.g., 62.27451680630036 becomes 62.3). The fix is applied in prepareWebhookData() so all webhook templates benefit automatically, including Google Space webhooks, generic JSON webhooks, and custom templates. Related to #619	2025-11-05 19:19:10 +00:00
rcourtman	02864f54dd	Add test notification functionality for Apprise - Add support for testing Apprise notifications via /api/notifications/test endpoint - Users can now test their Apprise configuration (both CLI and HTTP modes) using method="apprise" - Added comprehensive unit tests for both CLI and HTTP modes - Tests verify correct behavior when Apprise is enabled/disabled - Tests validate that notifications are properly sent through Apprise channels Related to #584	2025-11-05 18:54:18 +00:00
rcourtman	77282bd3a6	Implement Pulse tag overrides and alert clear persistence	2025-10-25 14:28:32 +00:00
rcourtman	be26f957c0	Add snapshot size alert thresholds (#585)	2025-10-22 13:30:40 +00:00
rcourtman	524f42cc28	security: complete Phase 1 sensor proxy hardening Implements comprehensive security hardening for pulse-sensor-proxy: - Privilege drop from root to unprivileged user (UID 995) - Hash-chained tamper-evident audit logging with remote forwarding - Per-UID rate limiting (0.2 QPS, burst 2) with concurrency caps - Enhanced command validation with 10+ attack pattern tests - Fuzz testing (7M+ executions, 0 crashes) - SSH hardening, AppArmor/seccomp profiles, operational runbooks All 27 Phase 1 tasks complete. Ready for production deployment.	2025-10-20 15:13:37 +00:00
Pulse Automation Bot	80b9d0602a	Add Apprise notification integration (#570)	2025-10-18 16:39:39 +00:00
rcourtman	91fecacfef	feat: add docker agent command handling	2025-10-15 19:27:19 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00

43 commits
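
Note on commit 9b531c547d: the message says Go decodes a missing bool as false, which is how an omitted notifyOnResolve ended up disabling recovery notifications. One common way to preserve a setting in that situation is to decode into a pointer so that "omitted" and "explicitly false" can be told apart. The sketch below is only an illustration of that general technique; the configUpdate type and applyUpdate helper are hypothetical names and are not taken from the Pulse source, and the actual fix may work differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configUpdate is a hypothetical request body. The pointer field stays nil
// when the client omits notifyOnResolve, so absence is distinguishable from
// an explicit false.
type configUpdate struct {
	NotifyOnResolve *bool `json:"notifyOnResolve"`
}

// applyUpdate returns the value to store: the client's value when the field
// is present, otherwise the previously stored value.
func applyUpdate(current bool, body []byte) (bool, error) {
	var u configUpdate
	if err := json.Unmarshal(body, &u); err != nil {
		return current, err
	}
	if u.NotifyOnResolve == nil {
		return current, nil // field omitted: keep the existing setting
	}
	return *u.NotifyOnResolve, nil
}

func main() {
	kept, _ := applyUpdate(true, []byte(`{}`))
	disabled, _ := applyUpdate(true, []byte(`{"notifyOnResolve":false}`))
	fmt.Println(kept, disabled) // prints "true false"
}
```

With a plain bool field instead of *bool, both request bodies above would decode to false and the first call would silently turn the setting off.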