Commit graph

33 commits

Author SHA1 Message Date
rcourtman
0223a52f82 Fix cluster proxy token collision - store per-node control tokens
When multiple cluster nodes register sensor-proxy, each registration
was overwriting the previous node's control token on the shared
PVEInstance. This caused "Proxy token not recognized" errors on all
but the last-registered node.

Changes:
- Add TemperatureProxyControlToken field to ClusterEndpoint struct
- Store control tokens per-endpoint for cluster registrations
- Check both instance-level and endpoint-level tokens when validating

Related to #738
2025-11-30 20:37:58 +00:00
rcourtman
0dc0235f77 chore: remove dead code and unused files
Remove 604 lines of unreachable code identified by deadcode analysis:
- internal/config/credentials.go: unused credential resolver
- internal/config/registration.go: unused registration config
- internal/monitoring/poller.go: unused channel-based polling (keep types)
- internal/api/middleware.go: unused TimeoutHandler, JSONHandler, NewAPIError, ValidationError
- internal/api/security.go: unused IsLockedOut, SecurityHeaders
- internal/api/auth.go: unused min helper
- internal/config/config.go: unused SaveConfig
- internal/config/client_helpers.go: unused CreatePBSConfigFromFields
- internal/logging/logging.go: unused NewRequestID
2025-11-27 00:05:04 +00:00
courtmanr@gmail.com
930c086556 WIP: Save all pending changes including frontend updates and unified agent scaffolding 2025-11-25 11:27:07 +00:00
courtmanr@gmail.com
f4c2bd7c35 Implement UI toggle for Hide Local Login (related to issue #750) 2025-11-25 08:14:19 +00:00
courtmanr@gmail.com
f347dedcdd Add PULSE_AUTH_HIDE_LOCAL_LOGIN option to hide password form
Implements #750 - allows hiding the username/password login form when
using OIDC SSO to avoid user confusion, while maintaining security.

- Added HideLocalLogin config option (env: PULSE_AUTH_HIDE_LOCAL_LOGIN)
- Exposed hideLocalLogin in /api/security/status endpoint
- Updated Login.tsx to conditionally hide local login form
- Added escape hatch via ?show_local=true URL parameter

This approach avoids the security and upgrade issues that led to
DISABLE_AUTH being removed (see #707, #678), while solving the UX
problem of users being confused by multiple login options.
2025-11-24 17:40:43 +00:00
courtmanr@gmail.com
4168eb41f8 Fix host agent registration verification issues (#746)
- Change default server listen addresses to empty string (listen on all interfaces including IPv6)
- Add short hostname matching fallback in host lookup API to handle FQDN vs short name mismatches
- Implement retry loop (30s) in both Windows and Linux/macOS installers for registration verification
- Fix lint errors: remove unnecessary fmt.Sprintf and nil checks before len()

This resolves the 'Installer could not yet confirm host registration with Pulse' warning
by addressing timing issues, hostname matching, and network connectivity.
2025-11-24 14:28:09 +00:00
courtmanr@gmail.com
0d4406b91f Add mutex protection for config watcher reloads (re #748)
Introduced sync.RWMutex to protect concurrent access to configuration
fields (AuthUser, AuthPass, APITokens) that are modified by the
ConfigWatcher at runtime.

- Added global config.Mu RWMutex in internal/config/config.go
- Protected config updates in ConfigWatcher.reloadConfig() and reloadAPITokens()
- Protected config reads in CheckAuth and all API token handlers
- Protected Router.SetConfig() during full config reloads

This prevents race conditions when .env file changes trigger config
reloads while authentication handlers are reading the same fields.
2025-11-24 07:45:21 +00:00
rcourtman
9c6c8cc0a0 Add OIDC CA bundle support 2025-11-22 09:44:03 +00:00
rcourtman
51b368ddc1 feat: make PVE polling interval configurable (related to #467) 2025-11-18 21:30:04 +00:00
rcourtman
47d5c14aef Improve temperature proxy control-plane flow 2025-11-15 21:49:51 +00:00
rcourtman
2ee693cc63 Add HTTP mode to pulse-sensor-proxy for multi-instance temperature monitoring
This implements HTTP/HTTPS support for pulse-sensor-proxy to enable
temperature monitoring across multiple separate Proxmox instances.

Architecture changes:
- Dual-mode operation: Unix socket (local) + HTTPS (remote)
- Unix socket remains default for security/performance (no breaking change)
- HTTP mode enables temps from external PVE hosts

Backend implementation:
- Add HTTPS server with TLS + Bearer token authentication to sensor-proxy
- Add TemperatureProxyURL and TemperatureProxyToken fields to PVEInstance
- Add HTTP client (internal/tempproxy/http_client.go) for remote proxy calls
- Update temperature collector to prefer HTTP proxy when configured
- Fallback logic: HTTP proxy → Unix socket → direct SSH (if not containerized)

Configuration:
- pulse-sensor-proxy config: http_enabled, http_listen_addr, http_tls_cert/key, http_auth_token
- PVEInstance config: temperature_proxy_url, temperature_proxy_token
- Environment variables: PULSE_SENSOR_PROXY_HTTP_* for all HTTP settings

Security:
- TLS 1.2+ with modern cipher suites
- Constant-time token comparison (timing attack prevention)
- Rate limiting applied to HTTP requests (shared with socket mode)
- Audit logging for all HTTP requests

Next steps:
- Update installer script to support HTTP mode + auto-registration
- Add Pulse API endpoint for proxy registration
- Generate TLS certificates during installation
- Test multi-instance temperature collection

Related to #571 (multi-instance architecture)
2025-11-13 16:13:53 +00:00
rcourtman
accecdb50b Make api_tokens.json authoritative source for API tokens (fixes #685)
This is the proper architectural fix for #685. The previous commit was a
bandaid that prevented unnecessary .env writes. This commit addresses the
root cause: dual-source-of-truth for API tokens (.env vs api_tokens.json).

Changes:

1. Startup Migration (config.go:896-951):
   - When loading config, if API_TOKEN/API_TOKENS exist in .env but not in
     api_tokens.json, automatically migrate them
   - Migrated tokens are named "Migrated from .env (prefix)" for clarity
   - Logs a deprecation warning: API_TOKEN/API_TOKENS in .env are deprecated
   - Leaves .env untouched (safe for existing deployments)

2. Config Watcher Changes (watcher.go:338-424):
   - Only load tokens from .env if api_tokens.json is EMPTY
   - Once api_tokens.json has records, it becomes the authoritative source
   - .env changes no longer trigger token overwrites when api_tokens.json exists
   - Logs debug message when ignoring env tokens

Result:
- Existing deployments: env tokens automatically migrated to api_tokens.json
- UI-created tokens: never overwritten by .env changes
- Dark mode toggle: no longer triggers token reload from .env
- Backward compatible: fresh installs with API_TOKEN in .env still work
- Migration path: users can safely keep API_TOKEN in .env, it will be ignored

Future improvement: Add UI warning when API_TOKEN/API_TOKENS still present
in .env, prompting users to rotate tokens via the UI.
2025-11-11 00:17:40 +00:00
rcourtman
48fabdd827 Improve Docker temperature monitoring documentation for clarity (related to #600)
Updated the Quick Start for Docker section in TEMPERATURE_MONITORING.md to be
more user-friendly and address common setup issues:

- Added clear explanation of why the proxy is needed (containers can't access hardware)
- Provided concrete IP example instead of placeholder
- Showed full docker-compose.yml context with proper YAML structure
- Added sudo to commands where needed
- Updated docker-compose commands to v2 syntax with note about v1
- Expanded verification steps with clearer success indicators
- Added reminder to check container name in verification commands

These improvements should help users who encounter blank temperature displays
due to missing proxy installation or bind mount configuration.
2025-11-07 15:09:42 +00:00
rcourtman
e21a72578f Add configurable SSH port for temperature monitoring
Related to #595

This change adds support for custom SSH ports when collecting temperature
data from Proxmox nodes, resolving issues for users who run SSH on non-standard
ports.

**Why SSH is still needed:**
Temperature monitoring requires reading /sys/class/hwmon sensors on Proxmox
nodes, which is not exposed via the Proxmox API. Even when using API tokens
for authentication, Pulse needs SSH access to collect temperature data.

**Changes:**
- Add `sshPort` configuration to SystemSettings (system.json)
- Add `SSHPort` field to Config with environment variable support (SSH_PORT)
- Add per-node SSH port override capability for PVE, PBS, and PMG instances
- Update TemperatureCollector to accept and use custom SSH port
- Update SSH known_hosts manager to support non-standard ports
- Add NewTemperatureCollectorWithPort() constructor with port parameter
- Maintain backward compatibility with NewTemperatureCollector() (uses port 22)
- Update frontend TypeScript types for SSH port configuration

**Configuration methods:**
1. Environment variable: SSH_PORT=2222
2. system.json: {"sshPort": 2222}
3. Per-node override in nodes.enc (future UI support)

**Default behavior:**
- Defaults to port 22 if not configured
- Maintains full backward compatibility
- No changes required for existing deployments

The implementation includes proper ssh-keyscan port handling and known_hosts
management for non-standard ports using [host]:port notation per SSH standards.
2025-11-05 20:03:29 +00:00
rcourtman
b1831d7b3e Add guest URL support for PVE hosts
Related to discussion #615

Add optional GuestURL field to PVE instances and cluster endpoints,
allowing users to specify a separate guest-accessible URL for web UI
navigation that differs from the internal management URL.

Backend changes:
- Add GuestURL field to PVEInstance and ClusterEndpoint structs
- Add GuestURL field to Node model
- Update cluster auto-discovery to preserve existing GuestURL values
- Update node creation logic to populate GuestURL from config
- Update API handlers to accept and persist GuestURL field

Frontend changes:
- Add GuestURL input field to NodeModal for configuration
- Update NodeGroupHeader and NodeSummaryTable to use GuestURL for navigation
- Add GuestURL to Node and PVENodeConfig TypeScript interfaces

When GuestURL is configured, it will be used for navigation links
instead of the Host URL, allowing users to access PVE hosts through
a reverse proxy or different domain while maintaining internal API
connections.
2025-11-05 19:06:08 +00:00
rcourtman
c93581e1aa Add DNS caching to reduce excessive DNS queries
Related to #608

Implements DNS caching using rs/dnscache to dramatically reduce DNS query
volume for frequently accessed Proxmox hosts. Users were reporting 260,000+
DNS queries in 37 hours for the same hostnames.

Changes:
- Added rs/dnscache dependency for DNS resolution caching
- Created pkg/tlsutil/dnscache.go with DNS cache wrapper
- Updated HTTP client creation to use cached DNS resolver
- Added DNSCacheTimeout configuration option (default: 5 minutes)
- Made DNS cache timeout configurable via:
  - system.json: dnsCacheTimeout field (seconds)
  - Environment variable: DNS_CACHE_TIMEOUT (duration string)
- DNS cache periodically refreshes to prevent stale entries

Benefits:
- Reduces DNS query load on local DNS servers by ~99%
- Reduces network traffic and DNS query log volume
- Maintains fresh DNS entries through periodic refresh
- Configurable timeout for different network environments

Default behavior: 5-minute cache timeout with automatic refresh
2025-11-05 18:25:38 +00:00
rcourtman
27f2038dab Add per-node temperature monitoring and fix critical config update bug
This commit implements per-node temperature monitoring control and fixes a critical
bug where partial node updates were destroying existing configuration.

Backend changes:
- Add TemperatureMonitoringEnabled field (*bool) to PVEInstance, PBSInstance, and PMGInstance
- Update monitor.go to check per-node temperature setting with global fallback
- Convert all NodeConfigRequest boolean fields to *bool pointers
- Add nil checks in HandleUpdateNode to prevent overwriting unmodified fields
- Fix critical bug where partial updates zeroed out MonitorVMs, MonitorContainers, etc.
- Update NodeResponse, NodeFrontend, and StateSnapshot to include temperature setting
- Fix HandleAddNode and test connection handlers to use pointer-based boolean fields

Frontend changes:
- Add temperatureMonitoringEnabled to Node interface and config types
- Create per-node temperature monitoring toggle handler with optimistic updates
- Update NodeModal to wire up per-node temperature toggle
- Add isTemperatureMonitoringEnabled helper to check effective monitoring state
- Update ConfiguredNodeTables to show/hide temperature badge based on monitoring state
- Update NodeSummaryTable to conditionally show temperature column
- Pass globalTemperatureMonitoringEnabled prop through component tree

The critical bug fix ensures that when updating a single field (like temperature
monitoring), the backend only modifies that specific field instead of zeroing out
all other boolean configuration fields.
2025-11-05 14:11:53 +00:00
rcourtman
d52ac6d8b5 Fix CSRF token validation and improve token management
- Add Access-Control-Expose-Headers to allow frontend to read X-CSRF-Token response header
- Implement proactive CSRF token issuance on GET requests when session exists but CSRF cookie is missing
- Ensures frontend always has valid CSRF token before making POST requests
- Fixes 403 Forbidden errors when toggling system settings

This resolves CSRF validation failures that occurred when CSRF tokens expired or were missing while valid sessions existed.
2025-11-05 09:23:44 +00:00
rcourtman
6eb1a10d9b Refactor: Code cleanup and localStorage consolidation
This commit includes comprehensive codebase cleanup and refactoring:

## Code Cleanup
- Remove dead TypeScript code (types/monitoring.ts - 194 lines duplicate)
- Remove unused Go functions (GetClusterNodes, MigratePassword, GetClusterHealthInfo)
- Clean up commented-out code blocks across multiple files
- Remove unused TypeScript exports (helpTextClass, private tag color helpers)
- Delete obsolete test files and components

## localStorage Consolidation
- Centralize all storage keys into STORAGE_KEYS constant
- Update 5 files to use centralized keys:
  * utils/apiClient.ts (AUTH, LEGACY_TOKEN)
  * components/Dashboard/Dashboard.tsx (GUEST_METADATA)
  * components/Docker/DockerHosts.tsx (DOCKER_METADATA)
  * App.tsx (PLATFORMS_SEEN)
  * stores/updates.ts (UPDATES)
- Benefits: Single source of truth, prevents typos, better maintainability

## Previous Work Committed
- Docker monitoring improvements and disk metrics
- Security enhancements and setup fixes
- API refactoring and cleanup
- Documentation updates
- Build system improvements

## Testing
- All frontend tests pass (29 tests)
- All Go tests pass (15 packages)
- Production build successful
- Zero breaking changes

Total: 186 files changed, 5825 insertions(+), 11602 deletions(-)
2025-11-04 21:50:46 +00:00
rcourtman
e07336dd9f refactor: remove legacy DISABLE_AUTH flag and enhance authentication UX
Major authentication system improvements:

- Remove deprecated DISABLE_AUTH environment variable support
- Update all documentation to remove DISABLE_AUTH references
- Add auth recovery instructions to docs (create .auth_recovery file)
- Improve first-run setup and Quick Security wizard flows
- Enhance login page with better error messaging and validation
- Refactor Docker hosts view with new unified table and tree components
- Add useDebouncedValue hook for better search performance
- Improve Settings page with better security configuration UX
- Update mock mode and development scripts for consistency
- Add ScrollableTable persistence and improved responsive design

Backend changes:
- Remove DISABLE_AUTH flag detection and handling
- Improve auth configuration validation and error messages
- Enhance security status endpoint responses
- Update router integration tests

Frontend changes:
- New Docker components: DockerUnifiedTable, DockerTree, DockerSummaryStats
- Better connection status indicator positioning
- Improved authentication state management
- Enhanced CSRF and session handling
- Better loading states and error recovery

This completes the migration away from the insecure DISABLE_AUTH pattern
toward proper authentication with recovery mechanisms.
2025-10-27 19:46:51 +00:00
rcourtman
d643dcf0bc perf: reduce polling allocations and guest metadata load 2025-10-25 13:12:47 +00:00
rcourtman
5c54685f04 Add API token scopes and standalone host agent
Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured.

Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents.

Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.
2025-10-23 11:40:31 +00:00
rcourtman
ff4dc49ae4 Update Pulse install flow and related components 2025-10-21 19:58:53 +00:00
rcourtman
56c6c0cc0c feat: improve discovery with progress tracking, validation, and structured errors
Significantly enhanced network discovery feature to eliminate false positives,
provide real-time progress updates, and better error reporting.

Key improvements:
- Require positive Proxmox identification (version data, auth headers, or certificates)
  instead of reporting any service on ports 8006/8007
- Add real-time progress tracking with phase/target counts and completion percentage
- Implement structured error reporting with IP, phase, type, and timestamp details
- Fix TLS timeout handling to prevent hangs on unresponsive hosts
- Expose progress and structured errors via WebSocket for UI consumption
- Reduce log verbosity by moving discovery logs to debug level
- Fix duplicate IP counting to ensure progress reaches 100%

Breaking changes: None (backward compatible with legacy API methods)
2025-10-20 22:29:30 +00:00
rcourtman
5ebb32ce10 feat: enhance runtime configuration and system settings management
Improves configuration handling and system settings APIs to support
v4.24.0 features including runtime logging controls, adaptive polling
configuration, and enhanced config export/persistence.

Changes:
- Add config override system for discovery service
- Enhance system settings API with runtime logging controls
- Improve config persistence and export functionality
- Update security setup handling
- Refine monitoring and discovery service integration

These changes provide the backend support for the configuration
features documented in the v4.24.0 release.
2025-10-20 17:41:19 +00:00
rcourtman
7d422d2909 feat: add professional logging with runtime configuration and performance optimization
Implements structured logging package with LOG_LEVEL/LOG_FORMAT env support, debug level guards for hot paths, enriched error messages with actionable context, and stack trace capture for production debugging. Improves observability and reduces log overhead in high-frequency polling loops.
2025-10-20 15:13:38 +00:00
rcourtman
57429900a6 feat: add adaptive polling scheduler infrastructure (Phase 2 Tasks 1-3)
Implements adaptive scheduling foundation for Phase 2:
- Poll cycle metrics: duration, staleness, queue depth, in-flight counters
- Adaptive scheduler with pluggable staleness/interval/enqueue interfaces
- Config support: ADAPTIVE_POLLING_ENABLED flag + min/max/base intervals
- Feature flag defaults to disabled for safe rollout
- Scheduler wiring into Monitor with conditional instantiation

Tasks 1-3 of 10 complete. Ready for staleness tracker implementation.
2025-10-20 15:13:37 +00:00
rcourtman
524f42cc28 security: complete Phase 1 sensor proxy hardening
Implements comprehensive security hardening for pulse-sensor-proxy:
- Privilege drop from root to unprivileged user (UID 995)
- Hash-chained tamper-evident audit logging with remote forwarding
- Per-UID rate limiting (0.2 QPS, burst 2) with concurrency caps
- Enhanced command validation with 10+ attack pattern tests
- Fuzz testing (7M+ executions, 0 crashes)
- SSH hardening, AppArmor/seccomp profiles, operational runbooks

All 27 Phase 1 tasks complete. Ready for production deployment.
2025-10-20 15:13:37 +00:00
Pulse Automation Bot
0b4e4f9c59 Add configurable backup polling interval 2025-10-18 13:06:41 +00:00
rcourtman
261bd7ac74 Adopt multi-token auth across docs, UI, and tooling 2025-10-14 15:47:49 +00:00
rcourtman
5c79d2516d feat: streamline docker agent onboarding 2025-10-14 09:45:32 +00:00
rcourtman
c18cf3d4b8 Fix node config API to preserve fields on partial updates
The PUT /api/config/nodes/{id} endpoint was corrupting node configurations
when making partial updates (e.g., updating just monitorPhysicalDisks):

- Authentication fields (tokenName, tokenValue, password) were being cleared
  when updating unrelated settings
- Name field was being blanked when not included in request
- Monitor* boolean fields were defaulting to false

Changes:
- Only update name field if explicitly provided in request
- Only switch authentication method when auth fields are explicitly provided
- Preserve existing auth credentials on non-auth updates
- Applied fix to all node types (PVE, PBS, PMG)

Also enables physical disk monitoring by default (opt-out instead of opt-in)
and preserves disk data between polling intervals.
2025-10-12 17:50:55 +00:00
rcourtman
f46ff1792b Fix settings security tab navigation 2025-10-11 23:29:47 +00:00