This resolves issues where snapshots/backups persist after deletion if the
Instance field didn't match the ID prefix (due to case changes, name changes, etc.).
Now consistent with how VMs, Containers, Storage, etc. are filtered.
Also adds Instance field to BackupTask model for completeness.
Addresses #1009 (refs #991)
- Add persistent volume mounts for Go/npm caches (faster rebuilds)
- Add shell config with helpful aliases and custom prompt
- Add comprehensive devcontainer documentation
- Add pre-commit hooks for Go formatting and linting
- Use go-version-file in CI workflows instead of hardcoded versions
- Simplify docker compose commands with --wait flag
- Add gitignore entries for devcontainer auth files
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
pollGuestSnapshots was reading m.state.VMs and m.state.Containers while
only holding the Monitor's mutex (m.mu), not the State's internal mutex.
This caused a data race where VMs/containers could be modified by another
goroutine while being read, leading to stale or missing snapshot data.
Symptoms: Deleted snapshots persisting in UI, new snapshots not appearing,
only fixable by service restart.
Fix: Use GetSnapshot() which properly acquires State's mutex and returns
a consistent copy of the data.
Related to #991
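The locking pattern behind this fix can be sketched as follows. This is a minimal illustration, not the actual Pulse code: the State layout, the VM type, and SetVMs are simplified stand-ins, and the real GetSnapshot() likely returns a richer snapshot type.

```go
package main

import (
	"fmt"
	"sync"
)

// VM is a trimmed-down stand-in for the real model (assumption).
type VM struct {
	ID   string
	Name string
}

// State guards its data with its own mutex, separate from any
// mutex held by the caller (hypothetical layout).
type State struct {
	mu  sync.RWMutex
	VMs []VM
}

// GetSnapshot copies the slice while holding the read lock, so the
// caller can iterate the result without racing against writers.
func (s *State) GetSnapshot() []VM {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]VM, len(s.VMs))
	copy(out, s.VMs)
	return out
}

// SetVMs replaces the slice under the write lock.
func (s *State) SetVMs(vms []VM) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.VMs = vms
}

func main() {
	s := &State{}
	s.SetVMs([]VM{{ID: "pve1-100", Name: "web"}})

	snap := s.GetSnapshot()
	s.SetVMs(nil) // a later write does not affect the copy

	fmt.Println(len(snap), snap[0].Name)
}
```

Reading s.VMs directly while holding only an unrelated mutex would skip s.mu entirely, which is the race described above.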
When a host agent is running on a Proxmox node (linked host agent),
merge the agent's SMART disk temperature data into the Physical Disks
view for that node. This allows disk temps collected by pulse-agent
to populate the Physical Disks page without requiring Proxmox SMART
monitoring to be enabled.
Matching is done by WWN (most reliable), serial number, or device path.
Closes part of issue #909 (follow-up from MichiFr)
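A rough sketch of the matching order (WWN, then serial, then device path). The Disk type and findAgentDisk are hypothetical names for illustration, and the exact comparison rules (plain string equality, falling through to the next key when one is empty or unmatched) are assumptions rather than the real implementation.

```go
package main

import "fmt"

// Disk holds only the fields needed for matching (assumed shape).
type Disk struct {
	WWN     string
	Serial  string
	DevPath string
	TempC   int
}

// findAgentDisk looks for the agent-reported disk that corresponds to
// a node disk. Keys are tried in order of reliability: WWN first,
// then serial number, then device path. Empty keys are skipped.
func findAgentDisk(node Disk, agentDisks []Disk) (Disk, bool) {
	keys := []func(Disk) string{
		func(d Disk) string { return d.WWN },
		func(d Disk) string { return d.Serial },
		func(d Disk) string { return d.DevPath },
	}
	for _, key := range keys {
		want := key(node)
		if want == "" {
			continue
		}
		for _, d := range agentDisks {
			if key(d) == want {
				return d, true
			}
		}
	}
	return Disk{}, false
}

func main() {
	node := Disk{Serial: "S1", DevPath: "/dev/sda"}
	agent := []Disk{
		{Serial: "S2", DevPath: "/dev/sda", TempC: 40},
		{Serial: "S1", DevPath: "/dev/sdb", TempC: 35},
	}
	if d, ok := findAgentDisk(node, agent); ok {
		fmt.Println("temperature:", d.TempC)
	}
}
```

In the example the serial match wins over the device-path match, since device names can change between boots.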
Previously, toggling AI Commands in the Agents view would show a pending state
and wait for the agent to confirm the change (up to 2 minutes). If the agent
was slow to report or the WebSocket update was missed, the toggle would appear
stuck.
Now, UpdateHostAgentConfig also updates the Host model in state immediately,
providing instant UI feedback. The agent will still receive the config on its
next report, but users see the change right away.
Added SetHostCommandsEnabled function to models.State for this purpose.
Allows specifying which IP address the agent should report, useful for:
- Multi-homed systems with separate management networks
- Systems with private monitoring interfaces
- VPN/overlay network scenarios
Usage:
  pulse-agent --report-ip 192.168.1.100
  PULSE_REPORT_IP=192.168.1.100 pulse-agent
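One way the flag and environment variable could be resolved is sketched below. resolveReportIP is a hypothetical helper; the precedence (flag over environment) and the validation step are assumptions, not confirmed agent behavior.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"os"
)

// resolveReportIP picks the flag value if set, otherwise the
// environment value. An empty result means "auto-detect".
// Precedence and validation are assumptions for this sketch.
func resolveReportIP(flagValue, envValue string) (string, error) {
	v := flagValue
	if v == "" {
		v = envValue
	}
	if v == "" {
		return "", nil
	}
	if net.ParseIP(v) == nil {
		return "", fmt.Errorf("invalid report IP %q", v)
	}
	return v, nil
}

func main() {
	fs := flag.NewFlagSet("pulse-agent", flag.ContinueOnError)
	reportIP := fs.String("report-ip", "", "IP address the agent should report")
	if err := fs.Parse(os.Args[1:]); err != nil {
		os.Exit(2)
	}

	ip, err := resolveReportIP(*reportIP, os.Getenv("PULSE_REPORT_IP"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if ip == "" {
		fmt.Println("report IP: auto-detect")
		return
	}
	fmt.Println("report IP:", ip)
}
```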
The GuestURL field was missing from NodeFrontend and its converter,
causing configured Guest URLs to be ignored when clicking on cluster
node names. The frontend would fall back to the auto-detected IP
instead of using the user-configured Guest URL.
Related to #940
- Add DELETE /api/agents/unregister endpoint for agent self-unregistration
- Agent now unregisters itself from Pulse server when uninstalled
- Add clarifying note in UnifiedAgents explaining linked agents behavior
- Linked agents are still managed via their PVE node; the UI now explains this
- Add LastSeen field to HostAgent model for better agent status tracking
When a host agent is deleted via the UI, the LinkedHostAgentID on any
PVE nodes that were linked to it was not being cleared. This caused
the "Agent" tag to persist in the UI after uninstalling the agent.
Related to #920
- Add smartctl package to collect disk temperature and health data
- Add SMART field to agent Sensors struct
- Host agent now runs smartctl to collect disk temps when available
- Backend processes agent SMART data for temperature display
- Graceful fallback when smartctl not installed
When two Proxmox nodes have the same hostname (e.g., 'px1' on different IPs),
the getHostAgentTemperature function was matching by hostname alone, causing
both nodes to show temperature from whichever host agent appeared first.
The fix:
- Added getHostAgentTemperatureByID that first tries matching by LinkedNodeID
  (the unique node ID) before falling back to hostname matching
- Updated the caller to pass modelNode.ID for precise matching
- Maintains backwards compatibility for setups where linking hasn't occurred
Related to #891
This implements full remote configuration for the AI command execution setting:
Backend:
- Add CommandsEnabled field to HostMetadata for persistent storage
- Add GetHostAgentConfig/UpdateHostAgentConfig methods to Monitor
- Add /api/agents/host/{id}/config endpoint (GET for agents, PATCH for UI)
- Server includes config in report response for immediate agent application
- Agent parses response and dynamically enables/disables command client
Frontend:
- Add 'AI Commands' toggle column in Managed Agents table
- Toggle immediately updates server config; agent applies on next heartbeat
- Add 'Enable AI command execution' checkbox in agent installer wizard
- Checkbox adds --enable-commands flag to generated install commands
This allows users to:
1. Enable at install time via checkbox in the wizard
2. Toggle remotely via the Managed Agents UI for existing agents
In both cases, agents apply changes automatically on their next report cycle.
- Add CommandsEnabled field to AgentInfo in pkg/agents/host/report.go
- Agent now reports whether AI command execution is enabled
- Server stores and exposes this via Host model
- Frontend can now show which agents have commands enabled
- This provides visibility before implementing remote configuration
Backup and snapshot polling runs asynchronously and could take 20-45 seconds to
complete, but WebSocket broadcasts happened on a separate fixed-interval timer.
As a result, the frontend showed stale data until a broadcast happened to
coincide with completed polling, which could take hours.
Pulse now broadcasts state immediately after backup/snapshot polling completes,
so users see changes within seconds.
Related to #895
When a PVE cluster has unique self-signed certificates on each node, Pulse
would mark secondary nodes as unhealthy because only the primary node's
fingerprint was used for all connections.
Now, during cluster discovery, Pulse captures each node's TLS fingerprint
and uses it when connecting to that specific node. This enables
"Trust On First Use" (TOFU) for clusters with unique per-node certs.
Changes:
- Add Fingerprint field to ClusterEndpoint config
- Add FetchFingerprint() to tlsutil for capturing node certs
- validateNodeAPI() now captures and returns fingerprints during discovery
- NewClusterClient() accepts endpointFingerprints map for per-node certs
- All client creation paths use per-endpoint fingerprints when available
Related to #879
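A minimal sketch of per-endpoint fingerprint selection. certFingerprint and fingerprintFor are hypothetical helpers; the SHA-256 lowercase-hex format and the fallback to the primary fingerprint are assumptions about how the real tlsutil and NewClusterClient() behave.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// certFingerprint returns the lowercase hex SHA-256 digest of a
// certificate's DER bytes (format is an assumption).
func certFingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// fingerprintFor returns the fingerprint captured for a specific
// endpoint during discovery, falling back to the primary node's
// fingerprint when none was recorded.
func fingerprintFor(endpoint string, endpointFingerprints map[string]string, primary string) string {
	if fp, ok := endpointFingerprints[endpoint]; ok && fp != "" {
		return fp
	}
	return primary
}

func main() {
	// Stand-in bytes; real code would use the peer certificate's raw DER.
	primary := certFingerprint([]byte("node1-cert"))
	perNode := map[string]string{
		"https://node2:8006": certFingerprint([]byte("node2-cert")),
	}

	fmt.Println(fingerprintFor("https://node2:8006", perNode, primary) != primary)
	fmt.Println(fingerprintFor("https://node3:8006", perNode, primary) == primary)
}
```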
When a user deletes an API token that was migrated from .env, track
the hash in a suppression list to prevent it from being re-migrated
on the next restart.
Changes:
- Add SuppressedEnvMigrations field to Config
- Add env_token_suppressions.json persistence
- Check suppression list during env token migration
- Record suppressed hash when deleting "Migrated from .env" tokens
- Update RemoveAPIToken to return the removed record
Related to #871
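The suppression flow could look roughly like the sketch below. The TokenRecord type, the method names other than RemoveAPIToken, and the in-memory slices are simplifications; persistence to env_token_suppressions.json is omitted.

```go
package main

import "fmt"

const migratedName = "Migrated from .env"

// TokenRecord and Config are simplified stand-ins (assumptions).
type TokenRecord struct {
	Hash string
	Name string
}

type Config struct {
	APITokens               []TokenRecord
	SuppressedEnvMigrations []string
}

func (c *Config) isSuppressed(hash string) bool {
	for _, h := range c.SuppressedEnvMigrations {
		if h == hash {
			return true
		}
	}
	return false
}

// MigrateEnvToken adds the .env token unless its hash is suppressed.
func (c *Config) MigrateEnvToken(hash string) bool {
	if c.isSuppressed(hash) {
		return false
	}
	c.APITokens = append(c.APITokens, TokenRecord{Hash: hash, Name: migratedName})
	return true
}

// RemoveAPIToken deletes a token and returns the removed record.
func (c *Config) RemoveAPIToken(hash string) (TokenRecord, bool) {
	for i, t := range c.APITokens {
		if t.Hash == hash {
			c.APITokens = append(c.APITokens[:i], c.APITokens[i+1:]...)
			return t, true
		}
	}
	return TokenRecord{}, false
}

// DeleteToken removes a token and, if it came from .env, records its
// hash so the next startup does not migrate it again.
func (c *Config) DeleteToken(hash string) {
	if rec, ok := c.RemoveAPIToken(hash); ok && rec.Name == migratedName {
		c.SuppressedEnvMigrations = append(c.SuppressedEnvMigrations, rec.Hash)
	}
}

func main() {
	c := &Config{}
	fmt.Println(c.MigrateEnvToken("abc123")) // first startup
	c.DeleteToken("abc123")                  // user deletes the token
	fmt.Println(c.MigrateEnvToken("abc123")) // next restart
}
```

Returning the removed record from RemoveAPIToken is what lets the caller inspect the name and decide whether to suppress.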
PBS datastores are now displayed in the Storage overview alongside PVE
storage. Each PBS datastore is converted to a Storage entry with:
- type: 'pbs'
- content: 'backup'
- shared: true
- active: based on PBS instance status
This provides a complete picture of all storage resources in one view
while keeping detailed PBS info in the dedicated PBS section.
Closes #869
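A sketch of the conversion, using the field values listed above. The struct shapes, the ID format, and the "online" status string are assumptions for illustration only.

```go
package main

import "fmt"

// Simplified stand-ins for the real models (assumptions).
type PBSDatastore struct {
	Name  string
	Total int64
	Used  int64
}

type PBSInstance struct {
	Name       string
	Status     string
	Datastores []PBSDatastore
}

type Storage struct {
	ID      string
	Name    string
	Type    string
	Content string
	Shared  bool
	Active  bool
	Total   int64
	Used    int64
}

// pbsToStorage converts each datastore of a PBS instance into a
// Storage entry for the Storage overview.
func pbsToStorage(inst PBSInstance) []Storage {
	out := make([]Storage, 0, len(inst.Datastores))
	for _, ds := range inst.Datastores {
		out = append(out, Storage{
			ID:      "pbs-" + inst.Name + "-" + ds.Name, // hypothetical ID scheme
			Name:    ds.Name,
			Type:    "pbs",
			Content: "backup",
			Shared:  true,
			Active:  inst.Status == "online", // assumed status value
			Total:   ds.Total,
			Used:    ds.Used,
		})
	}
	return out
}

func main() {
	inst := PBSInstance{
		Name:       "backup01",
		Status:     "online",
		Datastores: []PBSDatastore{{Name: "main", Total: 1000, Used: 250}},
	}
	for _, s := range pbsToStorage(inst) {
		fmt.Printf("%s type=%s content=%s shared=%t active=%t\n",
			s.Name, s.Type, s.Content, s.Shared, s.Active)
	}
}
```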
- Fix PVE nodes: buildNodeUrl in ProxmoxNodesSection.tsx now prioritizes
  guestURL over host (was ignoring guestURL entirely)
- Add PBS support: GuestURL field added to PBSInstance config, model,
  and API handlers
- Add PMG support: GuestURL field added to PMGInstance config, model,
  and API handlers
- Update NodeSummaryTable to use guestURL for PBS nodes
- Frontend types updated for PBS/PMG guestURL support
The Guest URL setting in node configuration now works correctly across
all node types. When set, it takes priority over the Host URL when
clicking on node names to navigate to the Proxmox/PBS/PMG web UI.
Closes #870
Backend:
- Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with
  status indicators (normal/elevated/anomaly) instead of only when anomalous
- This makes Pulse Pro's 'moat' visible - users can see the AI understands
  their infrastructure's normal behavior patterns
- Added baseline import to service.go
Frontend (user changes):
- Added incident event type filtering with toggle buttons
- Added resource incident panel to view all incidents for a resource
- Added timeline expand/collapse functionality in alert history
- Added incident note saving with proper incidentId tracking
- Added startedAt parameter for proper incident timeline loading
Addresses issue #861 - syslog flooded on docker host
Many routine operational messages were being logged at INFO level,
causing excessive log volume when monitoring multiple VMs/containers.
These messages are now logged at DEBUG level:
- Guest threshold checking (every guest, every poll cycle)
- Storage threshold checking (every storage, every poll cycle)
- Host agent linking messages
- Filesystem inclusion in disk calculation
- Guest agent disk usage replacement
- Polling start/completion messages
- Alert cleanup and save messages
Users can set LOG_LEVEL=debug to see these messages if needed for
troubleshooting. The default INFO level now produces significantly
less log output.
Also updated documentation in CONFIGURATION.md and DOCKER.md to:
- Clarify what each log level includes
- Add tip about using LOG_LEVEL=warn for minimal logging
The issue was a SolidJS reactivity problem in the Dashboard component.
When guestMetadata signal was accessed inside a For loop callback and
assigned to a plain variable, SolidJS lost reactive tracking.
Changed from:
  const metadata = guestMetadata()[guestId] || ...
  customUrl={metadata?.customUrl}
To:
  const getMetadata = () => guestMetadata()[guestId] || ...
  customUrl={getMetadata()?.customUrl}
This ensures SolidJS properly tracks the signal dependency when the
getter function is called directly in JSX props.
- Install script now auto-detects Docker, Kubernetes, and Proxmox
- Platform monitoring is enabled automatically when detected
- Users can override with --disable-* or --enable-* flags
- Allow same token to register multiple hosts (one per hostname)
- Update tests to reflect new multi-host token behavior
- Improve CompleteStep and UnifiedAgents UI components
- Update UNIFIED_AGENT.md documentation
- Add cluster-aware guest ID generation (clusterName-VMID instead of instanceName-VMID)
  to prevent duplicate VMs/containers when multiple cluster nodes are monitored
- Add cluster deduplication at registration time - when a node is added that belongs
  to an already-configured cluster, merge as endpoint instead of creating duplicate
- Add startup consolidation to automatically merge duplicate cluster instances
- Change host agent token binding from agent GUID to hostname, allowing:
  - Multiple host agents to share a token (each bound by hostname)
  - Agent reinstalls on same host without token conflicts
- Remove 12-character password minimum requirement
- Remove emoji from auto-registration success message
- Fix grouped view node lookup to support both cluster-aware node IDs
  (clusterName-nodeName) and legacy guest grouping keys (instance-nodeName)
Fixes duplicate guests appearing when agents are installed on multiple
cluster nodes. Also improves multi-agent UX by allowing shared tokens.
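The ID scheme can be illustrated with a small sketch. guestID is a hypothetical helper; only the clusterName-VMID versus instanceName-VMID formats come from the description above.

```go
package main

import "fmt"

// guestID builds a guest ID from the cluster name when the instance
// belongs to a cluster, and from the instance name otherwise.
// Two instances pointing at the same cluster then agree on the ID,
// so the same VM is not counted twice.
func guestID(clusterName, instanceName string, vmid int) string {
	prefix := instanceName
	if clusterName != "" {
		prefix = clusterName
	}
	return fmt.Sprintf("%s-%d", prefix, vmid)
}

func main() {
	// Same VM seen through two configured nodes of cluster "prod".
	fmt.Println(guestID("prod", "node-a", 100))
	fmt.Println(guestID("prod", "node-b", 100))
	// Standalone node keeps the legacy instance-based ID.
	fmt.Println(guestID("", "standalone", 100))
}
```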
When a host agent registers, it now searches for a PVE node with a
matching hostname and links them together. Similarly, when PVE nodes
are discovered, they check for existing host agents with matching hostnames.
This prevents the confusion of seeing duplicate entries when users install
agents on PVE cluster nodes that were already discovered via the cluster API.
- Added LinkedHostAgentID field to Node struct
- Added LinkedNodeID/LinkedVMID/LinkedContainerID fields to Host struct
- Added findLinkedProxmoxEntity() to match by hostname (with domain stripping)
- Updated UpdateNodesForInstance() to preserve and auto-set links
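Hostname matching with domain stripping might look like the following. shortHostname and findLinkedNode are illustrative names (the real function is findLinkedProxmoxEntity()), and the case-insensitive comparison and IP guard are assumptions.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Node is a minimal stand-in for the real Node struct (assumption).
type Node struct {
	ID   string
	Name string
}

// shortHostname lowercases a hostname and strips the domain part,
// so "PVE1.lan.example.com" and "pve1" compare equal.
// IP addresses are returned unchanged.
func shortHostname(h string) string {
	h = strings.ToLower(strings.TrimSpace(h))
	if net.ParseIP(h) != nil {
		return h
	}
	if i := strings.IndexByte(h, '.'); i > 0 {
		return h[:i]
	}
	return h
}

// findLinkedNode returns the ID of the node whose short hostname
// matches the agent's short hostname.
func findLinkedNode(agentHostname string, nodes []Node) (string, bool) {
	want := shortHostname(agentHostname)
	if want == "" {
		return "", false
	}
	for _, n := range nodes {
		if shortHostname(n.Name) == want {
			return n.ID, true
		}
	}
	return "", false
}

func main() {
	nodes := []Node{{ID: "cluster-pve1", Name: "pve1"}}
	id, ok := findLinkedNode("PVE1.lan.example.com", nodes)
	fmt.Println(id, ok)
}
```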
When no auth is configured (fresh install), CheckAuth allows all requests.
This creates a race condition where existing agents from a previous setup
can report data before the wizard completes security configuration.
This fix clears all host agents and docker hosts when /api/security/quick-setup
is called, ensuring the wizard shows a clean state after security is configured.
Added:
- State.ClearAllHosts() - removes all host agents
- State.ClearAllDockerHosts() - removes all docker hosts
- Monitor.ClearUnauthenticatedAgents() - clears both and resets token bindings
- Call to ClearUnauthenticatedAgents() in handleQuickSecuritySetupFixed()
- Add MetricsRetentionRawHours, MetricsRetentionMinuteHours, MetricsRetentionHourlyDays, MetricsRetentionDailyDays to SystemSettings
- Wire settings from system.json through Config to metrics store initialization
- Set sensible defaults: Raw=2h, Minute=24h, Hourly=7d, Daily=90d
- Log active retention values on startup for transparency
Users can now customize how long metrics are stored at each aggregation tier.
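A sketch of how the defaults could be applied. The SystemSettings field names and default values are taken from the list above; the Retention type, retentionFrom, and the "zero or negative means unset" rule are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// SystemSettings mirrors the four retention fields named above.
type SystemSettings struct {
	MetricsRetentionRawHours    int
	MetricsRetentionMinuteHours int
	MetricsRetentionHourlyDays  int
	MetricsRetentionDailyDays   int
}

// Retention is a hypothetical resolved form used by the metrics store.
type Retention struct {
	Raw, Minute, Hourly, Daily time.Duration
}

// retentionFrom fills in defaults (Raw=2h, Minute=24h, Hourly=7d,
// Daily=90d) for any value that is unset.
func retentionFrom(s SystemSettings) Retention {
	orDefault := func(v, def int) int {
		if v <= 0 {
			return def
		}
		return v
	}
	const day = 24 * time.Hour
	return Retention{
		Raw:    time.Duration(orDefault(s.MetricsRetentionRawHours, 2)) * time.Hour,
		Minute: time.Duration(orDefault(s.MetricsRetentionMinuteHours, 24)) * time.Hour,
		Hourly: time.Duration(orDefault(s.MetricsRetentionHourlyDays, 7)) * day,
		Daily:  time.Duration(orDefault(s.MetricsRetentionDailyDays, 90)) * day,
	}
}

func main() {
	r := retentionFrom(SystemSettings{MetricsRetentionHourlyDays: 14})
	fmt.Printf("raw=%s minute=%s hourly=%s daily=%s\n", r.Raw, r.Minute, r.Hourly, r.Daily)
}
```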
- Add sortable table headers for Pod and Deployment views
  - Click column headers to toggle sort direction
  - Sort state persists across sessions
- Add namespace dropdown filter for Pods/Deployments views
  - Auto-populates from available namespaces
  - Include namespace filter in reset and active filters check
Backend:
- Seed OCI classification from previous state so containers never
  'downgrade' to LXC if config fetching intermittently fails
- Prevent type regression in recordGuestSnapshot when OCI was previously detected
- Move metrics zeroing before snapshot recording for cleaner flow
Frontend:
- Add isOCIContainer() memo that checks both type and isOci flag
- Use isOCI helper in Dashboard.tsx for AI context building
- Include oci-container type in useResources container conversion
- Preserve isOci and osTemplate fields through legacy conversion
This ensures OCI containers retain their classification even when
Proxmox API permissions or transient errors prevent config reads.
- Refactored enrichContainerMetadata to not return early when container is stopped
- Status API calls are still skipped for stopped containers (as expected)
- Config fetch now runs regardless of status, enabling OCI detection
- Added test for OCI detection on stopped containers
Discovered: Proxmox 9.1 requires VM.Config.Options permission to read
OCI container configs (not just VM.Audit). Document this in setup guides.
- Added isOCIContainerByConfig() to detect OCI containers by:
  - Presence of 'entrypoint' field (only OCI containers have this)
  - Combination of ostype=unmanaged, cmode=console, and lxc.signal.halt
- This is needed because Proxmox doesn't persist ostemplate after creation
- Now supports detection of already-created OCI containers (like the test alpine container)
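The two heuristics can be sketched as below. The flat map[string]string config and the exact key used for lxc.signal.halt are assumptions; the real function may read the Proxmox config in a different shape.

```go
package main

import "fmt"

// isOCIContainerByConfig applies the two heuristics described above
// to a flattened container config (hypothetical representation).
func isOCIContainerByConfig(cfg map[string]string) bool {
	// Heuristic 1: an entrypoint field is present.
	if _, ok := cfg["entrypoint"]; ok {
		return true
	}
	// Heuristic 2: ostype=unmanaged, cmode=console, and a
	// lxc.signal.halt entry all appear together.
	_, hasHalt := cfg["lxc.signal.halt"]
	return cfg["ostype"] == "unmanaged" && cfg["cmode"] == "console" && hasHalt
}

func main() {
	alpine := map[string]string{
		"ostype":          "unmanaged",
		"cmode":           "console",
		"lxc.signal.halt": "SIGTERM",
	}
	debian := map[string]string{"ostype": "debian"}

	fmt.Println(isOCIContainerByConfig(alpine))
	fmt.Println(isOCIContainerByConfig(debian))
}
```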
- Backend: Add IsOCI and OSTemplate fields to Container model
- Backend: Add extractContainerOSTemplate() and isOCITemplate() detection functions
- Backend: Detect OCI containers via ostemplate config and set type to 'oci'
- Frontend: Add isOci and osTemplate to Container interface
- Frontend: Add 'oci-container' to ResourceType with distinct purple badge
- Frontend: Update Dashboard filters to include OCI containers with LXC
- Tests: Add comprehensive unit tests for OCI detection logic
OCI containers are detected by checking the ostemplate for patterns like:
- oci: prefix (e.g., oci:docker.io/library/alpine:latest)
- docker: prefix (e.g., docker:nginx:latest)
- Known registry URLs (docker.io, ghcr.io, gcr.io, quay.io, etc.)
- Local templates with oci- or oci_ filename patterns
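A rough sketch of the template checks listed above. The registry list is limited to the four named examples, and treating oci- / oci_ as filename prefixes (rather than substrings) is an assumption; the actual isOCITemplate() may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// knownRegistries holds only the registries named in the description.
var knownRegistries = []string{"docker.io", "ghcr.io", "gcr.io", "quay.io"}

// isOCITemplate reports whether an ostemplate value looks like an OCI
// image reference.
func isOCITemplate(tmpl string) bool {
	t := strings.ToLower(strings.TrimSpace(tmpl))
	if t == "" {
		return false
	}
	if strings.HasPrefix(t, "oci:") || strings.HasPrefix(t, "docker:") {
		return true
	}
	for _, r := range knownRegistries {
		if strings.Contains(t, r+"/") {
			return true
		}
	}
	// Local template: look at the filename after the last '/' or ':'.
	name := t
	if i := strings.LastIndexAny(t, "/:"); i >= 0 {
		name = t[i+1:]
	}
	return strings.HasPrefix(name, "oci-") || strings.HasPrefix(name, "oci_")
}

func main() {
	for _, t := range []string{
		"oci:docker.io/library/alpine:latest",
		"docker:nginx:latest",
		"local:vztmpl/oci-alpine.tar",
		"local:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst",
	} {
		fmt.Printf("%-55s %t\n", t, isOCITemplate(t))
	}
}
```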
- Add DOMPurify sanitization for AI chat markdown rendering (XSS fix)
- Configure DOMPurify to add target=_blank and rel=noopener to links
- Update system prompt to align with command approval policy
- Clarify safe vs destructive commands in prompt
- Improve patrol auto-fix mode guidance with safe operation list
- Add verification requirements for auto-fix actions
- Update observe-only mode to be clearer about read-only restrictions
Phase 1 of Pulse AI differentiation:
- Create internal/ai/context package with types, trends, builder, formatter
- Implement linear regression for trend computation (growing/declining/stable/volatile)
- Add storage capacity predictions (predicts days until 90% and 100%)
- Wire MetricsHistory from monitor to patrol service
- Update patrol to use buildEnrichedContext instead of basic summary
- Update patrol prompt to reference trend indicators and predictions
This gives the AI awareness of historical patterns, enabling it to:
- Identify resources with concerning growth rates
- Predict capacity exhaustion before it happens
- Distinguish between stable high usage vs growing problems
- Provide more actionable, time-aware insights
All tests passing. Falls back to basic summary if metrics history unavailable.
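The trend and capacity prediction idea can be sketched with ordinary least squares. Point, fitLine, and daysUntil are hypothetical names, and sampling usage as (day, percent) pairs is an assumption; the trend classification thresholds are not shown because they are not stated here.

```go
package main

import "fmt"

// Point is one usage sample: time in days and usage in percent.
type Point struct {
	Day float64
	Pct float64
}

// fitLine computes the least-squares line pct = m*day + b.
// ok is false when there are too few points or no spread in time.
func fitLine(pts []Point) (m, b float64, ok bool) {
	if len(pts) < 2 {
		return 0, 0, false
	}
	n := float64(len(pts))
	var sx, sy, sxx, sxy float64
	for _, p := range pts {
		sx += p.Day
		sy += p.Pct
		sxx += p.Day * p.Day
		sxy += p.Day * p.Pct
	}
	den := n*sxx - sx*sx
	if den == 0 {
		return 0, 0, false
	}
	m = (n*sxy - sx*sy) / den
	b = (sy - m*sx) / n
	return m, b, true
}

// daysUntil estimates how many days after the last sample the fitted
// line reaches the threshold. It reports false when usage is not growing.
func daysUntil(pts []Point, threshold float64) (float64, bool) {
	m, b, ok := fitLine(pts)
	if !ok || m <= 0 {
		return 0, false
	}
	d := (threshold-b)/m - pts[len(pts)-1].Day
	if d < 0 {
		d = 0 // already past the threshold
	}
	return d, true
}

func main() {
	// Storage growing 2 percentage points per day from 50%.
	pts := []Point{{0, 50}, {1, 52}, {2, 54}, {3, 56}}
	d90, _ := daysUntil(pts, 90)
	d100, _ := daysUntil(pts, 100)
	fmt.Printf("days until 90%%: %.0f, until 100%%: %.0f\n", d90, d100)
}
```

For the sample data the fitted slope is 2 points per day, which puts the 90% mark about 17 days after the last sample and 100% about 22 days after it.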
- Add 'content' type to StreamDisplayEvent for tracking text chunks
- Track content events in streamEvents array for chronological display
- Update render to use Switch/Match for cleaner conditional rendering
- Interleave thinking, tool calls, and content as they stream in
- Add fallback for old messages without streamEvents for backwards compat
Previously, tool/command outputs stayed at top while AI text responses
accumulated at the bottom. Now all events appear in order like a
normal chatbot.
- Add alert-triggered AI analysis for real-time incident response
- Implement patrol history persistence across restarts
- Add patrol schedule configuration UI in AI Settings
- Enhance AIChat with patrol status and manual trigger controls
- Add resource store improvements for AI context building
- Expand Alerts page with AI-powered analysis integration
- Add Vite proxy config for AI API endpoints
- Support both Anthropic and OpenAI providers with streaming
- Add Claude OAuth authentication support with hybrid API key/OAuth flow
- Implement Docker container historical metrics in backend and charts API
- Add CEPH cluster data collection and new Ceph page
- Enhance RAID status display with detailed tooltips and visual indicators
- Fix host deduplication logic with Docker bridge IP filtering
- Fix NVMe temperature collection in host agent
- Add comprehensive test coverage for new features
- Improve frontend sparklines and metrics history handling
- Fix navigation issues and frontend reload loops
Backend:
- Call SetMonitor after router creation to inject resource store
- Add debug logging for resource population and broadcast
Frontend:
- Add resources array to WebSocket store initial state
- Handle resources in WebSocket message processing
- Use reconcile for efficient state updates
The unified resources are now properly:
1. Populated from StateSnapshot on each broadcast cycle
2. Converted to frontend format (ResourceFrontend)
3. Included in WebSocket state messages
4. Received and stored in frontend state
5. Consumed by migrated route components
Console now shows '[DashboardView] Using unified resources: VMs: X'
confirming the migration is working end-to-end.
- Added PopulateFromSnapshot method to resources.Store
- Extended ResourceStoreInterface to include PopulateFromSnapshot
- Monitor now calls updateResourceStore before broadcasts
- This ensures resources are fresh on every WebSocket broadcast
Without this, the store would only be populated when /api/resources or
/api/state endpoints are hit, leaving WebSocket broadcasts empty.