vrr/Pulse

mirror of https://github.com/rcourtman/Pulse.git synced 2026-04-30 12:30:17 +00:00

Author	SHA1	Message	Date
rcourtman	dd1d222ad0	Improve bootstrap token UX for easier discovery The bootstrap token security requirement was added proactively but lacked discoverability, causing user friction during first-run setup. These improvements make the token easier to find while maintaining the security benefit. Improvements: - Display bootstrap token prominently in startup logs with ASCII box (previously: single line log message) - Add `pulse bootstrap-token` CLI command to display token on demand (Docker: docker exec <container> /app/pulse bootstrap-token) - Improve error messages in quick-setup API to show exact commands for retrieving token when missing or invalid - Error messages now include both Docker and bare metal examples User experience improvements: - Token visible in `docker logs` output immediately - Clear instructions printed with token - Helpful error messages if token is wrong/missing - CLI helper for operators who need to retrieve token later Security unchanged: - Bootstrap token still required for first-run setup - Token still auto-deleted after successful setup - No bypass mechanism added Related to discussion about bootstrap token UX friction.	2025-11-06 17:29:49 +00:00
rcourtman	f9ca2c0e68	Add hashpw utility for generating password hashes Simple CLI utility to generate bcrypt password hashes for admin users. Usage: hashpw <password> This utility helps administrators generate properly hashed passwords for use in configuration files or manual user setup.	2025-11-06 16:46:56 +00:00
rcourtman	20099549c6	Add comprehensive release validation to prevent missing artifacts Adds automated validation script to prevent the pattern of patch releases caused by missing files/artifacts. scripts/validate-release.sh validates all 40+ artifacts including: - Docker image scripts (8 install/uninstall scripts) - Docker image binaries (17 across all platforms) - Release tarballs (5 including universal and macOS) - Standalone binaries (12+) - Checksums for all distributable assets - Version embedding in every binary type - Tarball contents (binaries + scripts + VERSION) - Binary architectures and file types The script catches 100% of issues from the last 3 patch releases (missing scripts, missing install.sh, missing binaries, broken version embedding). Updated RELEASE_CHECKLIST.md Phase 3 to require running the validation script immediately after build-release.sh and before proceeding to Docker build/publish phases. Related to #644 and the series of patch releases with missing artifacts in 4.26.x.	2025-11-06 16:33:49 +00:00
rcourtman	5b89b2371a	Make pulse-sensor-proxy resilient to read-only filesystems Related to #637 The sensor-proxy was failing to start on systems with read-only filesystems because audit logging required a writable /var/log/pulse/sensor-proxy directory. Changes: - Modified newAuditLogger() to automatically fall back to stderr (systemd journal) if the audit log file cannot be opened - Removed error return from newAuditLogger() since it now always succeeds - Added warning logs when fallback mode is used to alert operators - Updated tests to handle the new signature - Added better debugging to audit log tests This allows the sensor-proxy to run on: - Immutable/read-only root filesystems - Hardened systems with restricted /var mounts - Containerized environments with limited write access Audit events are still captured via systemd journal when file logging is unavailable, maintaining the security audit trail.	2025-11-06 00:18:51 +00:00
rcourtman	930ad20921	Add configurable log level for pulse-sensor-proxy Users can now control logging verbosity through: - YAML config file: log_level: "debug|info|warn|error" - Environment variable: PULSE_SENSOR_PROXY_LOG_LEVEL Default log level is set to "info" instead of debug, reducing verbose output. Supported levels: trace, debug, info, warn, error, fatal, panic, disabled Related to #629	2025-11-05 19:48:00 +00:00
rcourtman	3194b10398	Improve Alpine Linux support and agent startup validation Related to #612 This commit addresses the Alpine Linux installation issues reported where: 1. The OpenRC init system was not properly detected 2. Manual startup instructions were unclear and used placeholder values 3. The agent didn't validate configuration properly at startup Changes: Install Script (install-docker-agent.sh): - Improved OpenRC detection to check for rc-service and rc-update commands instead of looking for openrc-run binary in specific paths - Added specific Alpine Linux detection via /etc/alpine-release and /etc/os-release - Enhanced manual startup instructions to show actual values instead of placeholders - Added clearer warnings and guidance when no init system is detected - Included comprehensive startup command with all required parameters Agent Startup Validation (pulse-docker-agent): - Added validation to detect unexpected command-line arguments - Added helpful note about double-dash flag requirements (--token vs -token) - Improved error messages to include example usage patterns - Added warning when defaulting to localhost without explicit URL configuration - Provide both command-line and environment variable examples in error messages These improvements ensure that: - Alpine Linux installations will properly detect and configure OpenRC services - Users who must start the agent manually get clear, copy-pasteable commands - Configuration errors are caught early with actionable error messages - Common mistakes (like missing --url) are clearly explained	2025-11-05 19:01:09 +00:00
rcourtman	fdf0977be2	Add host agent multi-platform binary distribution and improve host details UI - Build host agent binaries for all platforms (linux/darwin/windows, amd64/arm64/armv7) in Docker - Add Makefile target for building agent binaries locally - Add startup validation to check for missing agent binaries - Improve download endpoint error messages with troubleshooting guidance - Enhance host details drawer layout with better organization and visual hierarchy - Update base images to rolling versions (node:20-alpine, golang:1.24-alpine, alpine:3.20)	2025-11-05 17:38:17 +00:00
rcourtman	6eb1a10d9b	Refactor: Code cleanup and localStorage consolidation This commit includes comprehensive codebase cleanup and refactoring: ## Code Cleanup - Remove dead TypeScript code (types/monitoring.ts - 194 lines duplicate) - Remove unused Go functions (GetClusterNodes, MigratePassword, GetClusterHealthInfo) - Clean up commented-out code blocks across multiple files - Remove unused TypeScript exports (helpTextClass, private tag color helpers) - Delete obsolete test files and components ## localStorage Consolidation - Centralize all storage keys into STORAGE_KEYS constant - Update 5 files to use centralized keys: * utils/apiClient.ts (AUTH, LEGACY_TOKEN) * components/Dashboard/Dashboard.tsx (GUEST_METADATA) * components/Docker/DockerHosts.tsx (DOCKER_METADATA) * App.tsx (PLATFORMS_SEEN) * stores/updates.ts (UPDATES) - Benefits: Single source of truth, prevents typos, better maintainability ## Previous Work Committed - Docker monitoring improvements and disk metrics - Security enhancements and setup fixes - API refactoring and cleanup - Documentation updates - Build system improvements ## Testing - All frontend tests pass (29 tests) - All Go tests pass (15 packages) - Production build successful - Zero breaking changes Total: 186 files changed, 5825 insertions(+), 11602 deletions(-)	2025-11-04 21:50:46 +00:00
rcourtman	32392d1212	Add disk metrics, block I/O, and mount details to Docker monitoring Extends Docker container monitoring with comprehensive disk and storage information: - Writable layer size and root filesystem usage displayed in new Disk column - Block I/O statistics (read/write bytes totals) shown in container drawer - Mount metadata including type, source, destination, mode, and driver details - Configurable via --collect-disk flag (enabled by default, can be disabled for large fleets) Also fixes config watcher to consistently use production auth config path instead of following PULSE_DATA_DIR when in mock mode.	2025-10-29 12:05:36 +00:00
rcourtman	68ce8e7520	feat: finalize swarm service monitoring (#598)	2025-10-26 09:35:49 +00:00
rcourtman	8e83eaf823	Add container state filtering to Docker agent	2025-10-25 21:40:59 +00:00
rcourtman	6333a445e9	feat: add native Windows service support and expandable host details Windows Host Agent Enhancements: - Implement native Windows service support using golang.org/x/sys/windows/svc - Add Windows Event Log integration for troubleshooting - Create professional PowerShell installation/uninstallation scripts - Add process termination and retry logic to handle Windows file locking - Register uninstall endpoint at /uninstall-host-agent.ps1 Host Agent UI Improvements: - Add expandable drawer to Hosts page (click row to view details) - Display system info, network interfaces, disks, and temperatures in cards - Replace status badges with subtle colored indicators - Remove redundant master-detail sidebar layout - Add search filtering for hosts Technical Details: - service_windows.go: Windows service lifecycle management with graceful shutdown - service_stub.go: Cross-platform compatibility for non-Windows builds - install-host-agent.ps1: Full Windows installation with validation - uninstall-host-agent.ps1: Clean removal with process termination and retries - HostsOverview.tsx: Expandable row pattern matching Docker/Proxmox pages Files Added: - cmd/pulse-host-agent/service_windows.go - cmd/pulse-host-agent/service_stub.go - scripts/install-host-agent.ps1 - scripts/uninstall-host-agent.ps1 - frontend-modern/src/components/Hosts/HostsOverview.tsx - frontend-modern/src/components/Hosts/HostsFilter.tsx The Windows service now starts reliably with automatic restart on failure, and the uninstall script handles file locking gracefully without requiring reboots.	2025-10-23 22:11:56 +00:00
rcourtman	5c54685f04	Add API token scopes and standalone host agent Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured. Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents. Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.	2025-10-23 11:40:31 +00:00
rcourtman	77108abc65	Propagate config updates to settings nodes (#588)	2025-10-22 13:45:13 +00:00
rcourtman	35adcf104f	docs: add guidance for large deployments (30+ nodes) in rate limit config Update config.example.yaml with: - Recommendations for very large deployments (30+ nodes) - Formula for calculating optimal rate limits based on node count - Example calculation: 30 nodes with 10s polling = 300ms interval - Security note about minimum safe intervals This helps admins properly configure the proxy for enterprise deployments with dozens of nodes.	2025-10-21 11:27:13 +00:00
rcourtman	44d5f91e92	feat: make pulse-sensor-proxy rate limits configurable Add support for configuring rate limits via config.yaml to allow administrators to tune the proxy for different deployment sizes. Changes: - Add RateLimitConfig struct to config.go with per_peer_interval_ms and per_peer_burst - Update newRateLimiter() to accept optional RateLimitConfig parameter - Load rate limit config from YAML and apply overrides to defaults - Update tests to pass nil for default behavior - Add comprehensive config.example.yaml with documentation Configuration examples: - Small (1-3 nodes): 1000ms interval, burst 5 (default) - Medium (4-10 nodes): 500ms interval, burst 10 - Large (10+ nodes): 250ms interval, burst 20 Defaults remain conservative (1 req/sec, burst 5) to support most deployments while allowing customization for larger environments. Related: #`46b8b8d08` (rate limit fix for multi-node support)	2025-10-21 11:25:21 +00:00
rcourtman	d856e75018	fix: increase pulse-sensor-proxy rate limits for multi-node support - Increase rate limit from 1 req/5sec to 1 req/sec (60/min) - Increase burst from 2 to 5 requests - Fixes temperature collection failures when monitoring 3+ nodes - All requests from containerized Pulse use same UID, causing rate limiting - New limits support 5-10 node deployments comfortably Resolves issue where adding standalone nodes broke temperature monitoring for all nodes due to aggressive rate limiting.	2025-10-21 11:21:12 +00:00
rcourtman	73fb9d986f	feat: add PBS/PMG stubs to test harness and implement HTTP config fetch Resolves two remaining TODOs from codebase audit. ## 1. PBS/PMG Test Harness Stubs Location: internal/monitoring/harness_integration.go:149-151 Changes: - Added PBS client stub registration: `monitor.pbsClients[inst.Name] = &pbs.Client{}` - Added PMG client stub registration: `monitor.pmgClients[inst.Name] = &pmg.Client{}` - Added imports for pkg/pbs and pkg/pmg Purpose: Enables integration test scenarios to include PBS and PMG instance types alongside existing PVE support. Stubs allow scheduler to register and execute tasks for these instance types during integration testing. Testing: ✅ TestAdaptiveSchedulerIntegration passes (55.5s) ✅ Integration test harness now supports all three instance types ## 2. HTTP Config URL Fetch Location: cmd/pulse/config.go:226-261 Problem: `PULSE_INIT_CONFIG_URL` was recognized but not implemented, returning "URL import not yet implemented" error. Implementation: - URL validation (http/https schemes only) - HTTP client with 15 second timeout - Status code validation (2xx required) - Empty response detection - Base64 decoding with fallback to raw data - Matches existing env-var behavior for `PULSE_INIT_CONFIG_DATA` Security: - Both HTTP and HTTPS supported (HTTPS recommended for production) - URL scheme validation prevents file:// or other protocols - Timeout prevents hanging on unresponsive servers Usage: ```bash export PULSE_INIT_CONFIG_URL="https://config-server/encrypted-config" export PULSE_INIT_CONFIG_PASSPHRASE="secret" pulse config auto-import ``` Testing: ✅ Code compiles cleanly ✅ Follows same pattern as existing PULSE_INIT_CONFIG_DATA handling ## Impact - Completes integration test infrastructure for all instance types - Enables automated config distribution via HTTP(S) for container deployments - Removes last TODOs from codebase (no TODO/FIXME remaining in Go files)	2025-10-20 16:05:45 +00:00
rcourtman	7d422d2909	feat: add professional logging with runtime configuration and performance optimization Implements structured logging package with LOG_LEVEL/LOG_FORMAT env support, debug level guards for hot paths, enriched error messages with actionable context, and stack trace capture for production debugging. Improves observability and reduces log overhead in high-frequency polling loops.	2025-10-20 15:13:38 +00:00
rcourtman	57429900a6	feat: add adaptive polling scheduler infrastructure (Phase 2 Tasks 1-3) Implements adaptive scheduling foundation for Phase 2: - Poll cycle metrics: duration, staleness, queue depth, in-flight counters - Adaptive scheduler with pluggable staleness/interval/enqueue interfaces - Config support: ADAPTIVE_POLLING_ENABLED flag + min/max/base intervals - Feature flag defaults to disabled for safe rollout - Scheduler wiring into Monitor with conditional instantiation Tasks 1-3 of 10 complete. Ready for staleness tracker implementation.	2025-10-20 15:13:37 +00:00
rcourtman	524f42cc28	security: complete Phase 1 sensor proxy hardening Implements comprehensive security hardening for pulse-sensor-proxy: - Privilege drop from root to unprivileged user (UID 995) - Hash-chained tamper-evident audit logging with remote forwarding - Per-UID rate limiting (0.2 QPS, burst 2) with concurrency caps - Enhanced command validation with 10+ attack pattern tests - Fuzz testing (7M+ executions, 0 crashes) - SSH hardening, AppArmor/seccomp profiles, operational runbooks All 27 Phase 1 tasks complete. Ready for production deployment.	2025-10-20 15:13:37 +00:00
rcourtman	29f4879cd4	test: add comprehensive security tests and documentation Implements all remaining Codex recommendations before launch: 1. Privileged Methods Tests: - TestPrivilegedMethodsCompleteness ensures all host-side RPCs are protected - Will fail if new privileged RPC is added without authorization - Verifies read-only methods are NOT in privilegedMethods 2. ID-Mapped Root Detection Tests: - TestIDMappedRootDetection covers all boundary conditions - Tests UID/GID range detection (both must be in range) - Tests multiple ID ranges, edge cases, disabled mode - 100% coverage of container identification logic 3. Authorization Tests: - TestPrivilegedMethodsBlocked verifies containers can't call privileged RPCs - TestIDMappedRootDisabled ensures feature can be disabled - Tests both container and host credentials 4. Comprehensive Security Documentation (23 KB): - Architecture overview with diagrams - Complete authentication & authorization flow - Rate limiting details (already implemented: 20/min per peer) - SSH security model and forced commands - Container isolation mechanisms - Monitoring & alerting recommendations - Development mode documentation (PULSE_DEV_ALLOW_CONTAINER_SSH) - Troubleshooting guide with common issues - Incident response procedures Rate Limiting Status: - Already implemented in throttle.go (20 req/min, burst 10, max 10 concurrent) - Per-peer rate limiting at line 328 in main.go - Per-node concurrency control at line 825 in main.go - Exceeds Codex's requirements All tests pass. Documentation covers all security aspects. Addresses final Codex recommendations for production readiness.	2025-10-19 16:47:13 +00:00
rcourtman	1519390f08	security: enhance logging for denied privileged method calls Improved security audit trail for attempted container privilege escalation: - Added detailed logging when containers attempt privileged methods - Logs UID, GID, PID, correlation ID, and method name - Marked with "SECURITY:" prefix for easy filtering/alerting - Helps operators detect and investigate compromise attempts Example log output: SECURITY: Container attempted to call privileged method - access denied method=ensure_cluster_keys uid=101000 gid=101000 pid=12345 Addresses Codex recommendation for comprehensive logging of denied privileged RPCs to enable monitoring and alerting on attempted abuse.	2025-10-19 16:40:42 +00:00
rcourtman	026b9c5b77	security: add method-level authorization for privileged RPC methods RELEASE BLOCKER FIX - Prevents containers from triggering host-level operations. Added host-only method restrictions: - RPCEnsureClusterKeys (SSH key distribution) - RPCRegisterNodes (node registration) - RPCRequestCleanup (cleanup operations) Implementation: - New privilegedMethods map defines host-only methods - Request handler checks if method is privileged - If privileged AND caller is from ID-mapped UID range (container), reject - Host processes (real root, configured UIDs) can still call privileged methods - Containers can still call get_temperature and get_status Security impact: - Prevents compromised containers from: • Triggering unwanted SSH key distribution to cluster nodes • Learning about cluster topology via forced registration • DOS attacks by repeatedly calling key distribution • Other host-level privileged operations Without this fix, any container with root could call these methods after authentication, undermining the security isolation between container and host. Addresses high-severity finding #2 from security audit.	2025-10-19 16:31:50 +00:00
rcourtman	3a6a4fd362	security: fix SSH command injection vulnerabilities in pulse-sensor-proxy CRITICAL security fixes for pulse-sensor-proxy: 1. Strengthened hostname validation regex: - Now requires hostnames to start with alphanumeric character - Prevents SSH option injection via hostnames starting with '-' - Pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$ (1-64 chars total) - Added IPv4 and IPv6 validation regexes for future use 2. Added validation to vulnerable V1 RPC handlers: - handleGetTemperature: Now validates node parameter before SSH - handleRegisterNodes: Now validates discovered cluster nodes - Previously these handlers passed unsanitized input directly to SSH 3. Defense in depth: - V2 handlers already had validation (now using improved regex) - Multiple layers of protection against malicious node identifiers - Validation prevents container from passing SSH options as hostnames Without these fixes, a compromised container could potentially inject SSH options by providing malicious node names, though the 'root@' prefix provided some mitigation. Addresses high-severity finding from security audit.	2025-10-19 16:28:38 +00:00
rcourtman	123e0f04ca	feat: add comprehensive node cleanup system Implements automated cleanup workflow when nodes are deleted from Pulse, removing all monitoring footprint from the host. Changes include a new RPC handler in the sensor proxy for cleanup requests, enhanced node deletion modal with detailed cleanup explanations, and improved SSH key management with proper tagging for atomic updates.	2025-10-17 18:53:45 +00:00
rcourtman	f141f7db33	feat: enhance sensor proxy with improved cluster discovery and SSH management Improvements to pulse-sensor-proxy: - Fix cluster discovery to use pvecm status for IP addresses instead of node names - Add standalone node support for non-clustered Proxmox hosts - Enhanced SSH key push with detailed logging, success/failure tracking, and error reporting - Add --pulse-server flag to installer for custom Pulse URLs - Configure www-data group membership for Proxmox IPC access UI and API cleanup: - Remove unused "Ensure cluster keys" button from Settings - Remove /api/diagnostics/temperature-proxy/ensure-cluster-keys endpoint - Remove EnsureClusterKeys method from tempproxy client The setup script already handles SSH key distribution during initial configuration, making the manual refresh button redundant.	2025-10-17 11:43:26 +00:00
rcourtman	91fecacfef	feat: add docker agent command handling	2025-10-15 19:27:19 +00:00
rcourtman	e4c3b06f14	Automate sensor proxy container mount and auth	2025-10-14 12:41:48 +00:00
rcourtman	b952444837	refactor: Rename pulse-temp-proxy to pulse-sensor-proxy The name "temp-proxy" implied a temporary or incomplete implementation. The new name better reflects its purpose as a secure sensor data bridge for containerized Pulse deployments. Changes: - Renamed cmd/pulse-temp-proxy/ to cmd/pulse-sensor-proxy/ - Updated all path constants and binary references - Renamed environment variables: PULSE_TEMP_PROXY_* to PULSE_SENSOR_PROXY_* - Updated systemd service and service account name - Updated installation, rotation, and build scripts - Renamed hardening documentation - Maintained backward compatibility for key removal during upgrades	2025-10-13 13:17:05 +00:00
rcourtman	c7bb76c12e	fix: Switch proxy socket to directory-level bind mount for stability Fixes LXC bind mount issue where socket-level mounts break when the socket is recreated by systemd. Following Codex's recommendation to bind mount the directory instead of the file. Changes: - Socket path: /run/pulse-temp-proxy/pulse-temp-proxy.sock - Systemd: RuntimeDirectory=pulse-temp-proxy (auto-creates /run/pulse-temp-proxy) - Systemd: RuntimeDirectoryMode=0770 for group access - LXC mount: Bind entire /run/pulse-temp-proxy directory - Install script: Upgrades old socket-level mounts to directory-level - Install script: Detects and handles bind mount changes This survives socket recreations and container restarts. The directory mount persists even when systemd unlinks/recreates the socket file. Related to #528	2025-10-12 22:33:53 +00:00
rcourtman	6d4694f019	security: Add SO_PEERCRED authentication to temperature proxy Addresses security concern raised in code review: - Socket permissions changed from 0666 to 0660 - Added SO_PEERCRED verification to authenticate connecting processes - Only allows root (UID 0) or proxy's own user - Prevents unauthorized processes from triggering SSH key rollout - Documented passwordless root SSH requirement for clusters This prevents any process on the host or in other containers from accessing the proxy RPC endpoints.	2025-10-12 21:42:22 +00:00
rcourtman	e7bc338891	feat: Implement secure temperature proxy for containerized deployments Addresses #528 Introduces pulse-temp-proxy architecture to eliminate SSH key exposure in containers: Architecture: - pulse-temp-proxy runs on Proxmox host (outside LXC/Docker) - SSH keys stored on host filesystem (/var/lib/pulse-temp-proxy/ssh/) - Pulse communicates via unix socket (bind-mounted into container) - Proxy handles cluster discovery, key rollout, and temperature fetching Components: - cmd/pulse-temp-proxy: Standalone Go binary with unix socket RPC server - internal/tempproxy: Client library for Pulse backend - scripts/install-temp-proxy.sh: Idempotent installer for existing deployments - scripts/pulse-temp-proxy.service: Systemd service for proxy Integration: - Pulse automatically detects and uses proxy when socket exists - Falls back to direct SSH for native installations - Installer automatically configures proxy for new LXC deployments - Existing LXC users can upgrade by running install-temp-proxy.sh Security improvements: - Container compromise no longer exposes SSH keys - SSH keys never enter container filesystem - Maintains forced command restrictions - Transparent to users - no workflow changes Documentation: - Updated TEMPERATURE_MONITORING.md with new architecture - Added verification steps and upgrade instructions - Preserved legacy documentation for native installs	2025-10-12 21:35:35 +00:00
rcourtman	f46ff1792b	Fix settings security tab navigation	2025-10-11 23:29:47 +00:00
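
Note on commit 3a6a4fd362: the hostname pattern quoted in that message can be exercised in isolation. The sketch below is illustrative only. The pattern is copied from the commit message, while the helper name `validHostname` and the sample inputs are assumptions and are not taken from the Pulse source.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostnamePattern is the pattern quoted in commit 3a6a4fd362: the first
// character must be alphanumeric, followed by up to 63 further characters
// drawn from letters, digits, '.', '_' and '-'.
var hostnamePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`)

// validHostname is a hypothetical helper name used only for this sketch.
func validHostname(name string) bool {
	return hostnamePattern.MatchString(name)
}

func main() {
	// Sample inputs are made up for illustration.
	samples := []string{"pve-node1", "10.0.0.5", "-oProxyCommand=id", "node;reboot", ""}
	for _, name := range samples {
		fmt.Printf("%q valid=%v\n", name, validHostname(name))
	}
}
```

Because the first character class excludes `-`, a value such as `-oProxyCommand=id` is rejected before it could be interpreted as an SSH option, which is the injection path the commit message describes.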

184 commits