Commit graph

139 commits

Author SHA1 Message Date
rcourtman
8f92273e33 security: enforce scope checks for AI approvals and config management 2026-02-03 18:40:31 +00:00
rcourtman
beae4c860c fix: address 6 security and reliability issues
Security fixes:
- Auto-register now requires settings:write scope for API tokens
- X-Forwarded-For in auto-register only trusted from verified proxies
- Public URL capture requires authentication (no loopback bypass)
- Lockout reset now uses RequireAdmin for session users

Reliability fixes:
- Docker stop command expiration clears PendingUninstall flag
- Cancelled notifications get completed_at set and are cleaned up
2026-02-03 17:32:44 +00:00
rcourtman
4af5fc4246 refactor(config): rename BackendHost/BackendPort to BindAddress
Simplify server config by consolidating BackendHost and BackendPort into
a single BindAddress field. The port is now solely controlled by FrontendPort.

Changes:
- Replace BackendHost/BackendPort with BindAddress in Config struct
- Add deprecation warning for BACKEND_HOST env var (use BIND_ADDRESS)
- Update connection timeout default from 45s to 60s
- Remove backendPort from SystemSettings and frontend types
- Update server.go to use cfg.BindAddress
- Update all tests to use new config field names
2026-02-01 23:26:32 +00:00
rcourtman
508c9f88f6 fix: Support partial updates for PBS nodes. Related to #1105
Allow updating PBS node settings (like excludeDatastores) without
requiring host to be resent. Match the behavior of PVE/PMG handlers
which only validate and update fields when provided.

Previously, PUT /api/config/nodes/{pbs-id} with just {excludeDatastores: [...]}
would fail with 'host is required' because the handler always called
normalizeNodeHost regardless of whether a new host was provided.
2026-01-23 00:13:28 +00:00
rcourtman
289d95374f feat: add multi-tenancy foundation (directory-per-tenant)
Implements Phase 1-2 of multi-tenancy support using a directory-per-tenant
strategy that preserves existing file-based persistence.

Key changes:
- Add MultiTenantPersistence manager for org-scoped config routing
- Add TenantMiddleware for X-Pulse-Org-ID header extraction and context propagation
- Add MultiTenantMonitor for per-tenant monitor lifecycle management
- Refactor handlers (ConfigHandlers, AlertHandlers, AIHandlers, etc.) to be
  context-aware with getConfig(ctx)/getMonitor(ctx) helpers
- Add Organization model for future tenant metadata
- Update server and router to wire multi-tenant components

All handlers maintain backward compatibility via legacy field fallbacks
for single-tenant deployments using the "default" org.
2026-01-22 13:39:06 +00:00
rcourtman
a55bdb7a3a feat(api): security and metrics history improvements
- Require admin + settings:write scope for setup-script-url endpoint
- Add license enforcement for long-term metrics (30d/90d require Pro)
- Add downsampling step calculation for metrics history queries
- Add isContainerSSHRestricted helper for SSH restriction checks
- Clean up temperature proxy references from config handlers
- Minor OIDC and rate limit improvements
2026-01-22 00:44:12 +00:00
rcourtman
7599915b8f refactor(api): remove sensor proxy config from API handlers
- config_handlers.go: remove proxy configuration endpoints
- system_settings.go: remove proxy-related settings
- rate_limit_config.go: update rate limit configuration
- Update related test files
2026-01-21 12:02:46 +00:00
rcourtman
035436ad6e fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.

Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()

Related to #1063
2026-01-15 21:10:55 +00:00
rcourtman
9b49d3171d feat(pbs): add datastore exclusion to reduce PBS log noise
Users with removable/unmounted datastores (e.g., external HDDs for
offline backup) experienced excessive PBS log entries because Pulse
was querying all datastores including unavailable ones.

Added `excludeDatastores` field to PBS node configuration that accepts
patterns to exclude specific datastores from monitoring:
- Exact names: "exthdd1500gb"
- Prefix patterns: "ext*"
- Suffix patterns: "*hdd"
- Contains patterns: "*removable*"

Pattern matching is case-insensitive.

Fixes #1105
2026-01-14 12:26:18 +00:00
rcourtman
b7f5cfde1c fix: apply subnet preference for cluster nodes in fallback path
When cluster node validation fails (because cluster-reported IPs are on
an internal network unreachable from Pulse), the fallback path was not
applying subnet preference logic. This caused Pulse to continue trying
to connect to internal cluster IPs instead of management network IPs.

Now the fallback path queries node network interfaces via the initial
connection and sets IPOverride to an IP on the same network as the
original connection, just like the validated node path does.

Fixes #929
2026-01-10 15:40:48 +00:00
rcourtman
3e2824a7ff feat: remove Enterprise badges, simplify Pro upgrade prompts
- Replace barrel import in AuditLogPanel.tsx to fix ad-blocker crash
- Remove all Enterprise/Pro badges from nav and feature headers
- Simplify upgrade CTAs to clean 'Upgrade to Pro' links
- Update docs: PULSE_PRO.md, API.md, README.md, SECURITY.md
- Align terminology: single Pro tier, no separate Enterprise tier

Also includes prior refactoring:
- Move auth package to pkg/auth for enterprise reuse
- Export server functions for testability
- Stabilize CLI tests
2026-01-09 16:51:08 +00:00
rcourtman
020553a12d fix: use flexible subnet matching instead of fixed /24
The previous implementation assumed /24 subnets, which failed for
larger networks (e.g., /16 or /20). Now uses progressive subnet
matching that tries /24, /20, and /16 to handle various network sizes.

Example: If connection IP is 10.1.1.5 and a node has 10.1.2.6,
it now correctly identifies them as being on the same network.
2026-01-08 23:24:50 +00:00
rcourtman
bd1df9f942 feat: automatic subnet preference for cluster node discovery
When discovering cluster nodes, Pulse now automatically prefers IPs
on the same subnet as the initial connection. This fixes the common
issue where Pulse used internal cluster network IPs (e.g., 172.x.x.x)
instead of management network IPs (e.g., 10.x.x.x).

How it works:
1. Extract subnet from initial connection URL (assumes /24 for IPv4)
2. For each discovered node, query /nodes/{node}/network for all IPs
3. If cluster-reported IP is on a different subnet, find an IP on
   the preferred subnet and set it as IPOverride
4. Manual IPOverride settings are preserved and take precedence

This eliminates the need for manual IPOverride configuration in most
multi-network Proxmox setups.

Refs #929, #1066
2026-01-08 23:12:30 +00:00
rcourtman
9e339957c6 fix: Update runtime config when toggling Docker update actions setting
The DisableDockerUpdateActions setting was being saved to disk but not
updated in h.config, causing the UI toggle to appear to revert on page
refresh since the API returned the stale runtime value.

Related to #1023
2026-01-03 11:14:17 +00:00
rcourtman
ee45323312 feat: Allow configuring physical disk polling interval in UI (Related to #1007) 2026-01-01 16:00:28 +00:00
rcourtman
76990a65a7 fix: Preserve user's configured hostname when agent registers with IP
When a node was manually added with a hostname (e.g., pve.example.com)
and then the agent registered using its IP address, the code would
correctly deduplicate but incorrectly overwrite the user's configured
hostname with the agent's IP.

Now when matching by IP resolution (hostname resolves to agent's IP),
we preserve the user's original hostname configuration instead of
replacing it with the IP.

Related to #940
2025-12-28 15:44:40 +00:00
rcourtman
1dff90817f fix: detect duplicate nodes by IP resolution during agent auto-register. Related to #924
When an agent registers using an IP address, check if any existing node's
hostname resolves to that same IP. This prevents duplicates when a node
was manually configured via hostname and later the agent is installed
which registers using the host's IP.

Changes:
- Add extractHostIP() to extract IP from URL if present
- Add resolveHostnameToIP() with 2s timeout for DNS resolution
- During agent auto-registration, check if existing hostname-based
  configs resolve to the new IP and update instead of creating duplicates
- Add test for extractHostIP helper function
2025-12-27 11:02:00 +00:00
rcourtman
4277aa753c feat(pbs): turnkey PBS setup with password auth
When adding a PBS node with username/password credentials, Pulse now
automatically:
1. Connects to PBS using the provided credentials
2. Creates a 'pulse-monitor@pbs' user with Audit permissions
3. Generates an API token
4. Stores the token instead of the password

This enables one-click PBS setup for Docker/containerized deployments
where you can't easily run the agent installer. Simply enter root@pam
credentials in the UI and Pulse handles the rest.

Falls back to password auth if token creation fails (e.g., old PBS
version or permission issues).
2025-12-26 10:12:04 +00:00
rcourtman
3d671c1824 feat(pbs): add API-based token creation for turnkey PBS setup
- Added PBS client methods: CreateUser, SetUserACL, CreateUserToken
- Added SetupMonitoringAccess() turnkey method that creates user + token
- Updated handleSecureAutoRegister to use PBS API for token creation
- Enables one-click PBS setup for Docker/containerized deployments

When users provide PBS root credentials, Pulse can now create the
monitoring user and API token remotely via the PBS API, eliminating
the need to SSH/exec into the container manually.
2025-12-26 10:08:41 +00:00
rcourtman
a9078e96d1 fix: apply duplicate hostname fix to HandleAddNode (manual UI)
Extended Issue #891 fix to cover manual node addition via the UI:

1. HandleAddNode now checks for duplicates by Host URL (not name)
2. Disambiguator applied to PVE, PBS, and PMG node creation
3. Error message updated: 'host URL already exists' instead of 'name already exists'

This ensures the fix works whether nodes are added via:
- Agent auto-registration ✓
- Manual UI setup ✓

All node creation paths now consistently:
- Match by Host URL only
- Disambiguate duplicate hostnames with IP: 'px1' → 'px1 (10.0.2.224)'
2025-12-24 16:17:37 +00:00
rcourtman
44b25b43ac fix: handle DHCP IP changes without creating duplicates
Follow-up to #891 fix - also match by name+tokenID to handle the case
where the same physical host gets a new IP (DHCP). This ensures:

1. Same hostname + DIFFERENT token = different physical hosts → create separate nodes
2. Same hostname + SAME token = same host with new IP → update existing node

Also updates the host URL when an existing node is matched, so IP changes
are properly reflected in the saved configuration.
2025-12-24 16:09:22 +00:00
rcourtman
92988ae0e6 fix: allow duplicate hostnames for different Proxmox hosts. Related to #891
PROBLEM:
When two Proxmox hosts have the same hostname (e.g., 'px1' on different networks),
the auto-registration was matching by name and overwriting the first with the second.
This has been a recurring issue (#104) with at least 3 prior fix attempts.

ROOT CAUSE:
The auto-register handler matched existing nodes by BOTH Host URL and Name.
Matching by name is incorrect - different physical hosts can share hostnames.

FIXES:
1. Remove name-based matching in auto-registration - match by Host URL only
2. Add disambiguateNodeName() to append IP when duplicate hostnames exist
3. Add regression tests to prevent this from breaking again

Now when registering two hosts named 'px1':
- First becomes: px1
- Second becomes: px1 (10.0.2.224)
Both are stored as separate nodes with their own credentials.
2025-12-24 16:05:07 +00:00
rcourtman
e0dc6695fc fix: Per-node TLS fingerprints for cluster peers (TOFU)
When a PVE cluster has unique self-signed certificates on each node, Pulse
would mark secondary nodes as unhealthy because only the primary node's
fingerprint was used for all connections.

Now, during cluster discovery, Pulse captures each node's TLS fingerprint
and uses it when connecting to that specific node. This enables
"Trust On First Use" (TOFU) for clusters with unique per-node certs.

Changes:
- Add Fingerprint field to ClusterEndpoint config
- Add FetchFingerprint() to tlsutil for capturing node certs
- validateNodeAPI() now captures and returns fingerprints during discovery
- NewClusterClient() accepts endpointFingerprints map for per-node certs
- All client creation paths use per-endpoint fingerprints when available

Related to #879
2025-12-24 10:05:03 +00:00
rcourtman
e4732af0f5 fix: use configured Guest URLs for PVE/PBS/PMG navigation (#870)
- Fix PVE nodes: buildNodeUrl in ProxmoxNodesSection.tsx now prioritizes
  guestURL over host (was ignoring guestURL entirely)
- Add PBS support: GuestURL field added to PBSInstance config, model,
  and API handlers
- Add PMG support: GuestURL field added to PMGInstance config, model,
  and API handlers
- Update NodeSummaryTable to use guestURL for PBS nodes
- Frontend types updated for PBS/PMG guestURL support

The Guest URL setting in node configuration now works correctly across
all node types. When set, it takes priority over the Host URL when
clicking on node names to navigate to the Proxmox/PBS/PMG web UI.

Closes #870
2025-12-22 22:05:25 +00:00
rcourtman
397871629c fix: cluster-aware guest deduplication and multi-agent token binding
- Add cluster-aware guest ID generation (clusterName-VMID instead of instanceName-VMID)
  to prevent duplicate VMs/containers when multiple cluster nodes are monitored

- Add cluster deduplication at registration time - when a node is added that belongs
  to an already-configured cluster, merge as endpoint instead of creating duplicate

- Add startup consolidation to automatically merge duplicate cluster instances

- Change host agent token binding from agent GUID to hostname, allowing:
  - Multiple host agents to share a token (each bound by hostname)
  - Agent reinstalls on same host without token conflicts

- Remove 12-character password minimum requirement

- Remove emoji from auto-registration success message

- Fix grouped view node lookup to support both cluster-aware node IDs
  (clusterName-nodeName) and legacy guest grouping keys (instance-nodeName)

Fixes duplicate guests appearing when agents are installed on multiple
cluster nodes. Also improves multi-agent UX by allowing shared tokens.
2025-12-14 10:16:17 +00:00
rcourtman
60c980a921 Show AI cost refresh errors and harden log redaction 2025-12-12 11:05:24 +00:00
rcourtman
8948e84fe5 feat: AI features, agent improvements, and host monitoring enhancements
AI Chat Integration:
- Multi-provider support (Anthropic, OpenAI, Ollama)
- Streaming responses with markdown rendering
- Agent command execution for remote troubleshooting
- Context-aware conversations with host/container metadata

Agent Updates:
- Add --enable-proxmox flag for automatic PVE/PBS token setup
- Improve auto-update with semver comparison (prevents downgrades)
- Add updatedFrom tracking to report previous version after update
- Reduce initial update check delay from 30s to 5s
- Add agent version column to Hosts page table

Host Metrics:
- Add DiskIO stats collection (read/write bytes, ops, time)
- Improve disk filtering to exclude Docker overlay mounts
- Add RAID array monitoring via mdadm
- Enhanced temperature sensor parsing

Frontend:
- New Agent Version column on Hosts overview table
- Improved node modal with agent-first installation flow
- Add DiskIO display in host drawer
- Better responsive handling for metric bars
2025-12-05 10:37:02 +00:00
rcourtman
bda8056e48 Add refresh-cluster button to detect new Proxmox cluster members
When new nodes are added to a Proxmox cluster after Pulse was
initially configured, they weren't showing up in Settings. The
existing "Refresh" button only triggered network discovery, not
cluster membership re-detection.

Changes:
- Add POST /api/config/nodes/{id}/refresh-cluster endpoint
- Add "Refresh" button in cluster node panel in Settings
- Re-detect cluster membership and update stored endpoints

Related to #799
2025-12-02 22:01:00 +00:00
rcourtman
b4d497ce3b security: Add request body size limits to API handlers
Add http.MaxBytesReader to 16 additional handlers to prevent memory
exhaustion attacks via oversized request bodies:

- docker_agents.go: HandleReport (512KB), HandleCommandAck (8KB),
  HandleSetCustomDisplayName (8KB)
- alerts.go: UpdateAlertConfig (64KB), BulkAcknowledgeAlerts (32KB),
  BulkClearAlerts (32KB)
- config_handlers.go: HandleAddNode, HandleTestConnection,
  HandleUpdateNode, HandleTestNodeConfig (32KB each),
  HandleVerifyTemperatureSSH, HandleExportConfig, HandleDiscoverServers,
  HandleSetupScriptURL (8KB each), HandleImportConfig (1MB),
  HandleUpdateMockMode (16KB)
2025-12-02 16:43:13 +00:00
rcourtman
98d170e087 perf: Cache err.Error() in cluster node validation
Cache err.Error() result in two locations in config_handlers.go:
- TLS mismatch detection (3x calls to 1)
- Standalone node detection (2x calls to 1)
2025-12-02 15:41:18 +00:00
rcourtman
b21fad833e refactor: Remove dead HandleTestExistingNode function
This function was superseded by HandleTestNode which is wired to the
router at /api/config/nodes/{id}/test. The old function was not
registered in the router and served no purpose.

Removes 121 lines of unused code.
2025-12-01 10:21:50 +00:00
rcourtman
bd2635cabe chore: remove deprecated build tags and use strings.ReplaceAll
- Remove redundant // +build directives (go:build is sufficient in Go 1.17+)
- Replace strings.Replace(..., -1) with strings.ReplaceAll
2025-11-27 10:16:08 +00:00
rcourtman
8152197207 fix: mark unused parameters to satisfy unparam linter
Mark intentionally unused parameters with underscore to:
- Silence unparam warnings for legitimate unused parameters
- Keep function signatures intact for API compatibility
- Remove unused req from serveChecksum helper
2025-11-27 10:12:48 +00:00
rcourtman
e1b9c133c3 fix: remove ineffectual assignments
- Fix loop variable reassignment in config_handlers.go
- Remove redundant boolean assignments in swarm.go
2025-11-27 09:48:29 +00:00
rcourtman
ad998a1e2f style: fix staticcheck style warnings
- Merge variable declaration with assignment (S1021)
- Use unconditional strings.TrimPrefix (S1017)
- Remove unnecessary nil checks around range (S1031)
- Remove unnecessary fmt.Sprintf (S1039)
- Use copy() instead of manual loop (S1001)
- Use time.Until instead of t.Sub(time.Now()) (S1024)
- Use buf.String() instead of string(buf.Bytes()) (S1030)
2025-11-27 09:19:33 +00:00
courtmanr@gmail.com
71fea10aa5 Further reduce setup script verbosity: silence token checks and consolidate permission logs 2025-11-25 10:20:17 +00:00
courtmanr@gmail.com
32c1c3fac5 Suppress 'User already exists' message in setup script 2025-11-25 10:16:08 +00:00
courtmanr@gmail.com
bddb90229b Improve setup script clarity: reduce verbosity and fix confusing messages 2025-11-25 10:13:20 +00:00
courtmanr@gmail.com
0c6fd01ff2 Improve setup script output by hiding irrelevant Docker/proxy info 2025-11-25 10:01:41 +00:00
courtmanr@gmail.com
f4c2bd7c35 Implement UI toggle for Hide Local Login (related to issue #750) 2025-11-25 08:14:19 +00:00
rcourtman
3b85436c0f Related to #738: make pulse proxy mount migration-safe 2025-11-21 21:29:14 +00:00
rcourtman
27ec1daf85 Harden public URL detection and setup token handling 2025-11-20 19:27:14 +00:00
rcourtman
17102706ae Related to #727: restore default Proxmox ports 2025-11-20 16:35:08 +00:00
courtmanr@gmail.com
c8b4d4a0d8 Implement sensor proxy installation and configuration updates 2025-11-20 13:23:21 +00:00
courtmanr@gmail.com
8635675cb4 refactor: simplify sensor proxy installer argument detection by validating CTID and defaulting to standalone mode. 2025-11-20 12:37:08 +00:00
rcourtman
d68da802ac Respect user-provided node host URLs (Related to #724) 2025-11-20 09:40:38 +00:00
rcourtman
7d0bbaf961 WIP: Fix temperature proxy registration persistence (incomplete)
This commit contains multiple fixes for temperature proxy registration,
but the core issue remains unresolved.

## What's Fixed:
1. Added config pointer and reloadFunc to TemperatureProxyHandlers
2. Added SetConfig method to keep handler in sync with router config changes
3. Added config reload after registration to prevent monitor from overwriting
4. Fixed installer port conflict detection and duplicate YAML key issues
5. Added comprehensive debug logging throughout registration flow

## What's Still Broken:
The TemperatureProxyURL, TemperatureProxyToken, and TemperatureProxyControlToken
fields are NOT persisting to nodes.enc after SaveNodesConfig is called.

Debug logs confirm:
- HandleRegister correctly updates nodesConfig.PVEInstances[matchedIndex]
- The correct data is passed to SaveNodesConfig (verified in logs)
- SaveNodesConfig completes without errors
- Config reload executes successfully
- BUT after Pulse restart, the fields are empty when loaded from disk

The bug is in SaveNodesConfig serialization or file writing logic itself.

Related files:
- internal/api/temperature_proxy.go: Registration handler
- internal/config/persistence.go: SaveNodesConfig implementation
- internal/config/config.go: PVEInstance struct definition
2025-11-19 20:12:19 +00:00
rcourtman
452a19eafa fix(setup): use runtime AUTH_TOKEN variable for Authorization headers
Changed Authorization headers in ssh-config and verify-temperature-ssh API
calls to use the runtime $AUTH_TOKEN variable instead of compile-time
hardcoded authToken.

This fixes a bug where users who override the auth token via:
- PULSE_SETUP_TOKEN environment variable
- Interactive prompt (when auth_token URL param omitted)

...would still send an empty Bearer token in the Authorization headers,
causing API calls to fail with 401 Unauthorized.

Changes:
- Line 4748: -H "Authorization: Bearer %s" → -H "Authorization: Bearer $AUTH_TOKEN"
- Line 4937: -H "Authorization: Bearer %s" → -H "Authorization: Bearer $AUTH_TOKEN"
- Removed 2 authToken arguments from fmt.Sprintf (lines 5059)

Now the script respects runtime token overrides in all code paths.

Identified by Codex during fmt.Sprintf argument alignment review.
2025-11-19 14:53:44 +00:00
rcourtman
a982456189 fix(setup): correct fmt.Sprintf argument alignment for PVE quick setup
Fixed critical argument mismatch bug where fmt.Sprintf arguments didn't align
with template placeholders. This caused:
- authToken being passed where pulseURL expected (curl errors)
- pulseURL being passed where authToken expected (empty Authorization headers)
- tokenName misalignment (Token ID placeholder broken)

Root cause: Template has 51 %s placeholders (54 total - 3 escaped %%s), but
argument list had wrong count and ordering.

Solution: Rebuilt argument list (lines 5049-5059) with correct mapping:
- 27 pulseURL (all installer URLs, --pulse-server flags, API endpoints)
- 11 tokenName (token creation, checks, final Token ID)
- 3 authToken (AUTH_TOKEN variable + 2 Authorization headers)
- 3 serverHost (error message rerun hints)
- 1 each: serverName, time, pulseIP, storagePerms, SSH keys, minProxyReadyVersion

Verified with go vet (passes). Mapping confirmed by walking each placeholder
in template and matching to correct argument type.

Related to #TBD (user will test)
2025-11-19 14:53:44 +00:00
rcourtman
d38c00474e fix(setup): make manual repair instructions actionable
Issue: When deployment type cannot be determined, error message referenced
$PROXY_INSTALLER but deleted it immediately, making instructions unusable.

Fix: Provide complete curl commands that users can copy-paste directly:
  curl -fsSL $PULSE_URL/api/install/install-sensor-proxy.sh | bash -s -- ...

This ensures users have a working repair path even when auto-detection fails.

Identified by Codex final review.
2025-11-19 13:33:28 +00:00