Commit graph

173 commits

Author SHA1 Message Date
Pulse Automation Bot
0adfad3270 Add configurable backup polling interval 2025-10-18 13:06:41 +00:00
Pulse Automation Bot
c3becc5272 Add Helm chart tooling, CI, and release packaging 2025-10-18 11:50:57 +00:00
Richard Courtman
74663d8cf7 fix: gracefully handle standalone node cleanup limitation
- Cleanup script now detects forced command restriction on standalone nodes
- Logs helpful message explaining limitation (security by design)
- Does not fail when standalone nodes cannot be cleaned up
- Documents that standalone node cleanup is limited by forced command security
- Automatic cleanup works fully for cluster nodes
- Manual cleanup command provided for standalone nodes if needed
2025-10-18 07:34:18 +00:00
Richard Courtman
e0844f1894 docs: add automatic cleanup documentation for node removal 2025-10-18 07:03:42 +00:00
Richard Courtman
8f2fdb70a6 fix: improve turnkey temperature monitoring for standalone nodes
- Fix script input handling to work with standard curl | bash pattern by prioritizing /dev/tty
- Add Raspberry Pi temperature sensor support (cpu_thermal chip and generic temp sensors)
- Add comprehensive documentation for turnkey standalone node setup
- Fix printf formatting error in setup script
2025-10-18 06:51:56 +00:00
rcourtman
a0cc661027 docs: implement Codex recommendations for temperature monitoring
Add comprehensive documentation improvements based on architectural review:

1. Enhanced Known Limitations section:
   - Document single proxy failure mode
   - Explain sensors output parsing brittleness with mitigation steps
   - Clarify cluster discovery dependencies and fallback options
   - Describe SSH fan-out scaling considerations for large clusters

2. Documented SSH key rotation workflow:
   - Promote automated rotation script as recommended approach
   - Include dry-run, execution, and rollback examples
   - Provide manual fallback process
   - Reference existing pulse-proxy-rotate-keys.sh script

3. Added Future Improvements roadmap:
   - Proxmox API integration (when available)
   - Agent-based architecture option
   - SNMP/IPMI support
   - Schema validation
   - Caching and throttling
   - Automated rotation timer
   - Health check endpoint

Instrumentation verified: proxy already has comprehensive Prometheus metrics
(RPC/SSH requests, latency, queue depth, rate limiting) and structured logging.
2025-10-17 12:03:31 +00:00
rcourtman
b782132db7 docs: update temperature monitoring guide to reflect removed UI button
- Replace references to 'Ensure cluster keys' button with instructions to re-run setup script
- Update troubleshooting section for new cluster nodes
- The setup script already handles SSH key distribution automatically
2025-10-17 11:46:31 +00:00
rcourtman
f8ed75a8a0 Add guest agent caching and update doc hints (refs #560) 2025-10-16 08:15:49 +00:00
rcourtman
c60614dcf3 feat: enhance alerts system with tests and improved thresholds
- Add comprehensive test coverage for alerts package with 285+ new tests
- Implement ThresholdsTable component with metric thresholds display
- Enhance Alerts page UI with improved layout and metric filtering
- Add frontend component tests for Alerts page and ThresholdsTable
- Set up Vitest testing infrastructure for SolidJS components
- Improve config persistence with better validation
- Expand discovery tests with 333+ test cases
- Update API, configuration, and Docker monitoring documentation
2025-10-15 22:25:04 +00:00
rcourtman
1b362efbd5 feat: add docker agent command handling 2025-10-15 19:27:19 +00:00
rcourtman
cadc21f5ff Ignore read-only guest filesystems in disk aggregation 2025-10-14 16:13:53 +00:00
rcourtman
bcfdf5d121 Adopt multi-token auth across docs, UI, and tooling 2025-10-14 15:47:49 +00:00
rcourtman
98144a595c Document optional host-script upgrade path 2025-10-14 13:19:38 +00:00
rcourtman
8ebc52cc1a Align proxy upgrade messaging with node re-add workflow 2025-10-14 13:17:34 +00:00
rcourtman
bdb25b8b12 Document proxy installer upgrade path 2025-10-14 12:43:50 +00:00
rcourtman
3860d565f2 Automate sensor proxy container mount and auth 2025-10-14 12:41:48 +00:00
rcourtman
db1cfc75af Update Proxmox guest agent permissions docs and tooling (refs #548) 2025-10-14 10:21:52 +00:00
rcourtman
039a4d89fa feat: streamline docker agent onboarding 2025-10-14 09:45:32 +00:00
rcourtman
509536b0e4 docs: add manual pulse-sensor-proxy install steps 2025-10-13 19:36:50 +00:00
rcourtman
eb7272582e refactor: Rename install-temp-proxy.sh to install-sensor-proxy.sh
Complete the pulse-sensor-proxy rename by updating the installer script name and all references to it.

Updated:
- Renamed scripts/install-temp-proxy.sh → scripts/install-sensor-proxy.sh
- Updated all documentation references
- Updated install.sh references
- Updated build-release.sh comments
2025-10-13 13:23:53 +00:00
rcourtman
ce499245a0 refactor: Rename pulse-temp-proxy to pulse-sensor-proxy
The name "temp-proxy" implied a temporary or incomplete implementation. The new name better reflects its purpose as a secure sensor data bridge for containerized Pulse deployments.

Changes:
- Renamed cmd/pulse-temp-proxy/ to cmd/pulse-sensor-proxy/
- Updated all path constants and binary references
- Renamed environment variables: PULSE_TEMP_PROXY_* to PULSE_SENSOR_PROXY_*
- Updated systemd service and service account name
- Updated installation, rotation, and build scripts
- Renamed hardening documentation
- Maintained backward compatibility for key removal during upgrades
2025-10-13 13:17:05 +00:00
rcourtman
4a928c8e29 docs: Update socket paths and add monitoring section to TEMPERATURE_MONITORING.md
Updated documentation to reflect new directory-level bind mount architecture:
- Changed socket path from /var/run/pulse-temp-proxy.sock to /run/pulse-temp-proxy/pulse-temp-proxy.sock
- Updated LXC bind mount syntax to directory-level (create=dir instead of create=file)
- Added "Monitoring the Proxy" section with manual monitoring commands
- Documented reliance on systemd restart-on-failure for v1
- Noted that pulse-watchdog integration is planned for the future

Related to #528
2025-10-12 22:42:38 +00:00
rcourtman
9e1acb2ac1 docs: Add comprehensive Operations & Troubleshooting section
Addresses operational documentation gaps for pulse-temp-proxy:

- Service management (restart, stop, start, enable/disable)
- Log locations and viewing commands
- SSH key rotation procedures (recommended every 90 days)
- Key revocation when nodes leave cluster
- Failure modes (proxy down, socket issues, pvecm absent, off-cluster)
- Known limitations (one per host, cluster membership, cross-cluster)
- Common issues with troubleshooting steps
- Diagnostic info collection for bug reports

This provides operators with everything they need to manage the proxy service
in production environments.
2025-10-12 21:50:55 +00:00
rcourtman
f19dab70fb security: Add SO_PEERCRED authentication to temperature proxy
Addresses security concern raised in code review:
- Socket permissions changed from 0666 to 0660
- Added SO_PEERCRED verification to authenticate connecting processes
- Only allows root (UID 0) or proxy's own user
- Prevents unauthorized processes from triggering SSH key rollout
- Documented passwordless root SSH requirement for clusters

This prevents any process on the host or in other containers from
accessing the proxy RPC endpoints.
2025-10-12 21:42:22 +00:00
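
The peer check described in commit f19dab70fb can be illustrated roughly as follows. This is a minimal, Linux-only sketch using the standard `syscall` package, not the proxy's actual code: the helper names `peerUID` and `allowedPeer` and the temporary socket path are hypothetical, and only the 0660 mode and the "root or the proxy's own user" policy come from the commit message.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"syscall"
)

// peerUID asks the kernel (SO_PEERCRED, Linux only) for the UID of the
// process on the other end of a unix socket connection.
func peerUID(conn *net.UnixConn) (uint32, error) {
	raw, err := conn.SyscallConn()
	if err != nil {
		return 0, err
	}
	var cred *syscall.Ucred
	var credErr error
	if err := raw.Control(func(fd uintptr) {
		cred, credErr = syscall.GetsockoptUcred(int(fd), syscall.SOL_SOCKET, syscall.SO_PEERCRED)
	}); err != nil {
		return 0, err
	}
	if credErr != nil {
		return 0, credErr
	}
	return cred.Uid, nil
}

// allowedPeer applies the policy from the commit message: only root (UID 0)
// or the proxy's own user may reach the RPC endpoints.
func allowedPeer(peer, proxyUID uint32) bool {
	return peer == 0 || peer == proxyUID
}

func check(err error) {
	if err != nil {
		panic(err)
	}
}

func main() {
	// Hypothetical throwaway socket so the sketch can run anywhere on Linux.
	dir, err := os.MkdirTemp("", "peercred-demo")
	check(err)
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock")

	ln, err := net.ListenUnix("unix", &net.UnixAddr{Name: sock, Net: "unix"})
	check(err)
	defer ln.Close()
	check(os.Chmod(sock, 0o660)) // 0660 instead of 0666, as in the commit

	// Connect to ourselves, then inspect the credentials of that client.
	client, err := net.Dial("unix", sock)
	check(err)
	defer client.Close()

	conn, err := ln.AcceptUnix()
	check(err)
	defer conn.Close()

	uid, err := peerUID(conn)
	check(err)
	fmt.Printf("peer uid=%d allowed=%v\n", uid, allowedPeer(uid, uint32(os.Getuid())))
}
```

Because the kernel supplies the credentials, a client cannot claim a UID it does not have; the file mode alone would only restrict which users can open the socket.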
rcourtman
cdaeeb1782 feat: Implement secure temperature proxy for containerized deployments
Addresses #528

Introduces pulse-temp-proxy architecture to eliminate SSH key exposure in containers:

**Architecture:**
- pulse-temp-proxy runs on Proxmox host (outside LXC/Docker)
- SSH keys stored on host filesystem (/var/lib/pulse-temp-proxy/ssh/)
- Pulse communicates via unix socket (bind-mounted into container)
- Proxy handles cluster discovery, key rollout, and temperature fetching

**Components:**
- cmd/pulse-temp-proxy: Standalone Go binary with unix socket RPC server
- internal/tempproxy: Client library for Pulse backend
- scripts/install-temp-proxy.sh: Idempotent installer for existing deployments
- scripts/pulse-temp-proxy.service: Systemd service for proxy

**Integration:**
- Pulse automatically detects and uses proxy when socket exists
- Falls back to direct SSH for native installations
- Installer automatically configures proxy for new LXC deployments
- Existing LXC users can upgrade by running install-temp-proxy.sh

**Security improvements:**
- Container compromise no longer exposes SSH keys
- SSH keys never enter container filesystem
- Maintains forced command restrictions
- Transparent to users - no workflow changes

**Documentation:**
- Updated TEMPERATURE_MONITORING.md with new architecture
- Added verification steps and upgrade instructions
- Preserved legacy documentation for native installs
2025-10-12 21:35:35 +00:00
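
The "use the proxy when the socket exists, otherwise fall back to direct SSH" behaviour from commit cdaeeb1782 might look something like the sketch below. The function name `chooseTransport` and the returned labels are hypothetical; the socket path is the original one mentioned in the later documentation commit, and the real client in `internal/tempproxy` may decide differently.

```go
package main

import (
	"fmt"
	"os"
)

// Illustrative labels only; these are not identifiers from the Pulse codebase.
const (
	viaProxy = "proxy"
	viaSSH   = "ssh"
)

// chooseTransport returns viaProxy when socketPath exists and is a unix
// socket, and viaSSH in every other case (missing path, regular file, ...).
func chooseTransport(socketPath string) string {
	info, err := os.Stat(socketPath)
	if err != nil || info.Mode()&os.ModeSocket == 0 {
		return viaSSH
	}
	return viaProxy
}

func main() {
	// Path as it was before the move to a directory-level bind mount.
	fmt.Println(chooseTransport("/var/run/pulse-temp-proxy.sock"))
}
```

Checking the file mode as well as existence avoids treating a stale regular file at the same path as a working proxy.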
rcourtman
3c4193c43a fix: Add security gates for containerized temperature monitoring
Addresses #528

- Added opt-in confirmation prompt to setup script with security notice
- Added runtime warning when containerized Pulse uses SSH temperature monitoring
- Documented security considerations and hardening recommendations
- Users must explicitly confirm understanding before enabling in containers
2025-10-12 21:01:25 +00:00
rcourtman
a9377e20d6 Improve NVMe temperature handling 2025-10-12 16:06:55 +00:00
rcourtman
8701d8bb30 feat: capture Proxmox memory snapshots in diagnostics 2025-10-12 10:25:43 +00:00
rcourtman
7a1a1d4233 docs: refresh monitoring guides and troubleshooting 2025-10-11 22:32:26 +00:00
rcourtman
255924aef9 fix: Remove unused PMG_COLUMN_GROUPS variable 2025-10-11 22:28:41 +00:00
rcourtman
28e9cf46fd fix: Add missing PMG alert types and fix TypeScript errors 2025-10-11 22:27:38 +00:00
rcourtman
df70775066 docs: Update API docs and feature descriptions for Ceph, Docker, and updates 2025-10-11 22:21:51 +00:00
rcourtman
b8d70147f9 Clarify alert timeline context 2025-10-11 22:00:14 +00:00
rcourtman
5ca1ba234e Add threshold discoverability and reset to defaults
- Add tooltips to threshold inputs explaining -1 disables alerts
- Add help banner at top of thresholds page with usage tips
- Add FAQ entry documenting how to disable specific metrics
- Add reset to defaults button for each threshold table
- Define factory default constants for all resource types
- Reset button restores defaults and marks form as unsaved
2025-10-11 17:53:16 +00:00
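
The "-1 disables alerts" convention from commit 5ca1ba234e can be sketched as a small check. This is an illustration under assumptions: the function name `exceeds` is hypothetical, and whether the real comparison is inclusive (`>=`) or strict is not stated in the commit.

```go
package main

import "fmt"

// disabled is the sentinel from the commit message: a threshold of -1
// switches the alert for that metric off.
const disabled = -1.0

// exceeds reports whether value should raise an alert for threshold.
// Assumption: the comparison is inclusive; the commit does not say.
func exceeds(value, threshold float64) bool {
	if threshold == disabled {
		return false // metric explicitly disabled, never alert
	}
	return value >= threshold
}

func main() {
	fmt.Println(exceeds(95, 80)) // above the threshold
	fmt.Println(exceeds(95, -1)) // threshold disabled
}
```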
rcourtman
9c21d86212 Add multi-target Docker agent support and update installer 2025-10-11 16:34:33 +00:00
rcourtman
30fa3fd810 feat: add complete Proxmox Mail Gateway (PMG) monitoring support
Add comprehensive PMG monitoring with mail statistics, queue depth tracking,
spam distribution analysis, and quarantine monitoring. Includes full discovery
support and UI consistency improvements across all Proxmox products.

Backend:
- Add pkg/pmg package with complete API client for PMG operations
- Implement mail statistics collection (inbound/outbound, spam, virus, bounces)
- Add queue depth monitoring (active, deferred, hold, incoming queues)
- Support spam score distribution and quarantine totals
- Add PMG-specific discovery logic to differentiate from PVE on port 8006
- Extend mock data generator with realistic PMG instances and metrics
- Add PMG node configuration support in config system

Frontend:
- Create MailGateway.tsx component with detailed PMG dashboard
- Display mail flow statistics with time-series charts
- Show queue depth with color-coded warnings (>50 messages or >30min age)
- Add spam distribution histogram and quarantine status
- Support cluster node status with individual queue monitoring
- Add PMG to network discovery with purple branding and mail icon
- Implement conditional navigation (hide PMG tab when no instances configured)
- Standardize discovery UI controls across PVE/PBS/PMG settings pages

API:
- Add /api/config/pmg endpoints for node configuration
- Support PMG-specific monitoring toggles (mail stats, queues, quarantine)
- Extend system settings with PMG configuration options

Discovery:
- Detect PMG vs PVE on shared port 8006 using /api2/json/statistics/mail endpoint
- Return 'pmg' type for mail gateway servers in discovery results
- Update DiscoveryModal to display PMG servers with appropriate styling

This completes ecosystem monitoring support for all three Proxmox products:
Proxmox VE, Proxmox Backup Server, and Proxmox Mail Gateway.
2025-10-10 14:30:51 +00:00
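
The queue warning rule quoted in commit 30fa3fd810 (more than 50 messages, or a message older than 30 minutes) reduces to a single predicate. The sketch below expresses it in Go for consistency with the backend; the actual rule lives in the frontend component, and the name `queueNeedsWarning` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// queueNeedsWarning mirrors the thresholds named in the commit message:
// warn when the queue holds more than 50 messages or when the oldest
// message has waited longer than 30 minutes.
func queueNeedsWarning(messages int, oldest time.Duration) bool {
	return messages > 50 || oldest > 30*time.Minute
}

func main() {
	fmt.Println(queueNeedsWarning(12, 5*time.Minute))
	fmt.Println(queueNeedsWarning(12, 45*time.Minute))
	fmt.Println(queueNeedsWarning(75, time.Minute))
}
```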
rcourtman
7ab425b07a Publish docker agent image via GHCR workflow 2025-10-10 07:44:34 +00:00
rcourtman
bf1e1ecaff chore: prepare v4.22.0-rc.1 2025-10-08 16:52:06 +00:00
rcourtman
4f69608b72 docs: align guides with current backend 2025-10-08 15:53:35 +00:00
rcourtman
e22685b23d docs: document installing previous releases 2025-10-08 13:33:05 +00:00
rcourtman
dcff5d7a67 docs: update docker compose guidance 2025-10-08 13:13:34 +00:00
rcourtman
bb312ed022 Add Docker monitoring integration with agent-based architecture
Implements comprehensive Docker monitoring with a dedicated agent that collects
container metrics and reports them to the main Pulse server. Adds Docker-specific
alert rules and threshold management with a redesigned UI.

Backend changes:
- Add Docker agent binary with container metrics collection
- Implement Docker host and container models with CPU/memory tracking
- Add Docker-specific alert types (offline, state, health)
- Extend threshold system to support Docker resources
- Add WebSocket message types for Docker agent communication
- Implement Docker agent API endpoints for registration and metrics

Frontend changes:
- Add Docker monitoring page with host/container views
- Add Docker agent settings panel for configuration
- Reorganize thresholds page with Proxmox/Docker tabs
- Add Docker-specific alert threshold management
- Improve layout consistency with vertical stacking
- Fix defensive null checks and TypeScript errors

This change enables monitoring of Docker containers across multiple hosts
with the same alerting and threshold capabilities as Proxmox resources.
2025-10-05 17:51:16 +00:00
rcourtman
062268d4e7 Restrict temperature monitoring SSH key to sensors command
Refs #101
2025-10-04 15:38:34 +00:00
rcourtman
b0f68933dd docs: clarify SSH temperature usage 2025-10-01 15:26:00 +00:00
rcourtman
fa2656c8f0 docs: clarify SSH temperature usage 2025-10-01 15:23:41 +00:00
rcourtman
1c2431fcf6 refactor: add mock.env to repository with local override support
Make mock mode configuration part of the repository instead of a local-only
file. This ensures consistent mock mode behavior across all environments
(development, CI/CD, demo server) and makes it work out of the box for
new contributors.

Changes:
- Add mock.env to repository with sensible defaults (mock mode OFF by default)
- Support mock.env.local for personal overrides (gitignored)
- Update .gitignore to allow mock.env but exclude .local variants
- Backend loads mock.env then merges mock.env.local overrides
- hot-dev.sh loads both files in correct order

Benefits:
- New developers can clone and use mock mode immediately
- Demo server gets consistent mock configuration
- Personal preferences stay private in .local file
- No surprises - mock mode disabled by default in fresh clones
- CI/CD can use mock mode without custom configuration

Documentation:
- Updated README.md to explain mock.env is in repo
- Enhanced MOCK_MODE.md with local override instructions
- Updated claude.md with new configuration strategy
- Added mock.env.local.example for quick setup

Example workflow:
  git clone <repo>
  npm run mock:on        # Works immediately with repo defaults
  # Or create personal config:
  cp docs/development/mock.env.local.example mock.env.local
  # Edit mock.env.local with your preferences
2025-10-01 13:38:39 +00:00
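
The load order from commit 1c2431fcf6 (read `mock.env`, then merge `mock.env.local` on top) could be sketched as below. This is a simplified illustration, not the backend's loader: `parseEnv` and `merge` are hypothetical helpers, the variable names in `main` are made up, and quoting or `export` prefixes are not handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv reads KEY=VALUE lines, skipping blank lines and # comments.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

// merge copies the repository defaults and then applies local overrides,
// so a key present in both takes its value from the local file.
func merge(defaults, local map[string]string) map[string]string {
	out := make(map[string]string, len(defaults)+len(local))
	for k, v := range defaults {
		out[k] = v
	}
	for k, v := range local {
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical file contents; real variable names may differ.
	repoDefaults := parseEnv("# committed defaults\nMOCK_ENABLED=false\nMOCK_NODES=3\n")
	localOverrides := parseEnv("MOCK_ENABLED=true\n")

	cfg := merge(repoDefaults, localOverrides)
	fmt.Println(cfg["MOCK_ENABLED"], cfg["MOCK_NODES"])
}
```

Keeping the defaults in one committed file and the overrides in a gitignored one means a fresh clone behaves identically everywhere, while personal settings never reach the repository.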
rcourtman
67fc5977d1 feat: add hot-reloadable mock mode with auto-detection
Implement a hot-reloadable mock mode system that works seamlessly in both
development and production environments without requiring manual restarts
or port changes.

Key Features:
- Backend watches mock.env and auto-reloads when changed (via fsnotify + polling)
- npm commands for easy toggling: mock:on, mock:off, mock:status, mock:edit
- Works in both hot-dev mode and systemd deployments
- Reload completes in 2-5 seconds with no manual intervention
- No port changes or process restarts required

Implementation:
- Extended ConfigWatcher to monitor both .env and mock.env
- Added callback system to trigger ReloadableMonitor.Reload()
- Enhanced toggle-mock.sh to support both hot-dev and systemd modes
- Updated hot-dev.sh banner to show mock status and commands
- Created comprehensive documentation in docs/development/MOCK_MODE.md

Testing:
- Backend builds successfully
- Watcher initializes and monitors both files
- npm run mock:on/off toggles successfully
- mock.env updates correctly
- Scripts work in both hot-dev and systemd modes

Documentation:
- Added Mock Mode section to README.md
- Created detailed guide in docs/development/MOCK_MODE.md
- Updated claude.md with mock mode architecture and usage

Mock mode continues to return cached data instantly from memory
(no API calls, no locks, no timeouts), ensuring fast /api/state responses.
2025-10-01 13:35:17 +00:00
rcourtman
c664204b59 feat: add OIDC logout URL support and improve UX
Enhancements for OIDC authentication based on user feedback from issue #327:

1. Add OIDC logout URL support
   - New OIDC_LOGOUT_URL environment variable
   - UI field in OIDC settings panel for logout URL configuration
   - Properly redirects to IdP logout endpoint (e.g., Authentik end-session)
   - Stored in config and returned via security status API

2. Fix redirect URL help text in UI
   - Handle empty defaultRedirect string properly
   - Improved help text when PUBLIC_URL is not set
   - Clarify when auto-detection vs manual config is needed

3. Documentation improvements
   - Add note about using https:// in PUBLIC_URL/OIDC_REDIRECT_URL when behind TLS proxy
   - Document OIDC_LOGOUT_URL environment variable
   - Clarify X-Forwarded-Proto header behavior in OIDC docs
   - Add better guidance for Authentik users on HTTPS setup

4. Frontend improvements
   - Add HS256 signature algorithm error message in Login component
   - Display OIDC logout URL when available

These changes address the remaining OIDC UX issues reported by users,
particularly around logout functionality and reverse proxy configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-01 10:59:22 +00:00
rcourtman
2b4b6a08e1 fix: resolve OIDC authentication issues with DISABLE_AUTH and improve UX
Fixes multiple OIDC authentication issues reported in GitHub issue #327:

1. Fix DISABLE_AUTH=true disabling OIDC sessions
   - Reorder authentication checks to validate proxy auth and OIDC sessions
     before checking DISABLE_AUTH flag
   - Allows OIDC to function even when basic auth is disabled

2. Fix missing username display for OIDC users
   - Add GetSessionUsername() function to look up username from session ID
   - Set X-Authenticated-User header for OIDC authenticated requests
   - Update security status endpoint to return oidcUsername field
   - Display OIDC username in UI header alongside logout button

3. Fix missing logout button for OIDC users
   - Set hasAuth(true) when OIDC session is detected in frontend
   - Update security status endpoint to return OIDC info even when
     DISABLE_AUTH=true
   - Properly initialize WebSocket and load user preferences for OIDC sessions

4. Add documentation for Authentik HS256/RS256 issue
   - Document requirement for RSA signing key in Authentik
   - Add troubleshooting entry for signature algorithm mismatch
   - Provide clear resolution steps in CONFIGURATION.md and OIDC.md

All changes maintain backward compatibility and follow defensive security
practices. X-Forwarded-Proto header handling was verified to be correct.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-01 10:53:19 +00:00
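
The reordering described in item 1 of commit 2b4b6a08e1 (validate proxy auth and OIDC sessions before looking at `DISABLE_AUTH`) can be sketched as follows. The `request` struct, the `authenticate` function, and the returned labels are hypothetical stand-ins; the real middleware works on HTTP requests and sessions.

```go
package main

import "fmt"

// request is a hypothetical summary of what the auth checks found.
type request struct {
	proxyAuthValid   bool
	oidcSessionValid bool
	basicAuthValid   bool
}

// authenticate returns the method that authenticated the request.
// Proxy auth and OIDC sessions are evaluated first, so an OIDC user is
// still recognised (username, logout button) when DISABLE_AUTH=true.
func authenticate(r request, disableAuth bool) (string, bool) {
	if r.proxyAuthValid {
		return "proxy", true
	}
	if r.oidcSessionValid {
		return "oidc", true
	}
	if disableAuth {
		return "anonymous", true // auth disabled: allow, but no identity
	}
	if r.basicAuthValid {
		return "basic", true
	}
	return "", false
}

func main() {
	method, ok := authenticate(request{oidcSessionValid: true}, true)
	fmt.Println(method, ok)
}
```

With the earlier order, the `DISABLE_AUTH` branch would return first and the OIDC session would never be inspected, which matches the symptoms listed in the commit (no username, no logout button).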
rcourtman
fd52a7add1 Improve OIDC error logging and documentation
Addresses #327

- Added detailed logging when ID token verification fails
- Added better error messages for common OIDC issues
- Updated docs with Authentik-specific configuration
- Added troubleshooting section for redirect loops and invalid_id_token errors

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 19:52:55 +00:00