- Add DELETE /api/agents/unregister endpoint for agent self-unregistration
- Agent now unregisters itself from Pulse server when uninstalled
- Add clarifying note in UnifiedAgents explaining linked agents behavior
- Linked agents are still managed via their PVE node; the UI now explains this behavior
- Add LastSeen field to HostAgent model for better agent status tracking
When a host agent is deleted via the UI, the LinkedHostAgentID on any
PVE nodes that were linked to it was not being cleared. This caused
the "Agent" tag to persist in the UI after uninstalling the agent.
Related to #920
- Add smartctl package to collect disk temperature and health data
- Add SMART field to agent Sensors struct
- Host agent now runs smartctl to collect disk temps when available
- Backend processes agent SMART data for temperature display
- Graceful fallback when smartctl not installed
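A minimal sketch of how the agent might do this, assuming smartctl's JSON output (the `temperature.current` field); the real agent's device discovery and structs may differ:

    // Hypothetical sketch: read one disk's temperature via `smartctl -A -j`,
    // falling back gracefully when the tool is missing or fails.
    package main

    import (
        "encoding/json"
        "fmt"
        "os/exec"
    )

    type smartOutput struct {
        Temperature struct {
            Current int `json:"current"`
        } `json:"temperature"`
    }

    func diskTemperature(device string) (int, bool) {
        if _, err := exec.LookPath("smartctl"); err != nil {
            return 0, false // smartctl not installed: skip SMART collection
        }
        out, err := exec.Command("smartctl", "-A", "-j", device).Output()
        if err != nil {
            return 0, false // device unsupported or command failed
        }
        var parsed smartOutput
        if err := json.Unmarshal(out, &parsed); err != nil {
            return 0, false
        }
        return parsed.Temperature.Current, true
    }

    func main() {
        if temp, ok := diskTemperature("/dev/sda"); ok {
            fmt.Printf("/dev/sda: %d C\n", temp)
        } else {
            fmt.Println("no SMART temperature available")
        }
    }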
This implements full remote configuration for the AI command execution setting:
Backend:
- Add CommandsEnabled field to HostMetadata for persistent storage
- Add GetHostAgentConfig/UpdateHostAgentConfig methods to Monitor
- Add /api/agents/host/{id}/config endpoint (GET for agents, PATCH for UI)
- Server includes config in report response for immediate agent application
- Agent parses response and dynamically enables/disables command client
Frontend:
- Add 'AI Commands' toggle column in Managed Agents table
- Toggle immediately updates server config; agent applies on next heartbeat
- Add 'Enable AI command execution' checkbox in agent installer wizard
- Checkbox adds --enable-commands flag to generated install commands
This allows users to:
1. Enable AI command execution at install time via the checkbox in the wizard
2. Toggle it remotely via the Managed Agents UI for existing agents
In both cases, agents apply the change automatically on their next report cycle.
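A rough sketch of the agent-side half of this flow; the type and method names below are illustrative, not the actual Pulse implementation:

    // Hypothetical sketch: after each report, the agent reads the config the
    // server included in the response and starts/stops its command client.
    package main

    import "fmt"

    type AgentConfig struct {
        CommandsEnabled bool `json:"commandsEnabled"`
    }

    type ReportResponse struct {
        Config *AgentConfig `json:"config,omitempty"`
    }

    type Agent struct {
        commandsRunning bool
    }

    func (a *Agent) startCommandClient() { fmt.Println("command client started") }
    func (a *Agent) stopCommandClient()  { fmt.Println("command client stopped") }

    func (a *Agent) applyConfig(resp *ReportResponse) {
        if resp == nil || resp.Config == nil {
            return // server sent no config: keep current behavior
        }
        if resp.Config.CommandsEnabled && !a.commandsRunning {
            a.startCommandClient()
        } else if !resp.Config.CommandsEnabled && a.commandsRunning {
            a.stopCommandClient()
        }
        a.commandsRunning = resp.Config.CommandsEnabled
    }

    func main() {
        agent := &Agent{}
        agent.applyConfig(&ReportResponse{Config: &AgentConfig{CommandsEnabled: true}})
        agent.applyConfig(&ReportResponse{Config: &AgentConfig{CommandsEnabled: false}})
    }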
- Add CommandsEnabled field to AgentInfo in pkg/agents/host/report.go
- Agent now reports whether AI command execution is enabled
- Server stores and exposes this via Host model
- Frontend can now show which agents have commands enabled
- This provides visibility before implementing remote configuration
Backup and snapshot polling runs asynchronously and could take 20-45 seconds to complete, but WebSocket broadcasts happened on a separate fixed-interval timer. This caused the frontend to show stale data until a broadcast happened to coincide with completed polling, which could take hours.
Pulse now broadcasts state immediately after backup/snapshot polling completes, so users see changes within seconds.
Related to #895
When a PVE cluster has unique self-signed certificates on each node, Pulse
would mark secondary nodes as unhealthy because only the primary node's
fingerprint was used for all connections.
Now, during cluster discovery, Pulse captures each node's TLS fingerprint
and uses it when connecting to that specific node. This enables
"Trust On First Use" (TOFU) for clusters with unique per-node certs.
Changes:
- Add Fingerprint field to ClusterEndpoint config
- Add FetchFingerprint() to tlsutil for capturing node certs
- validateNodeAPI() now captures and returns fingerprints during discovery
- NewClusterClient() accepts endpointFingerprints map for per-node certs
- All client creation paths use per-endpoint fingerprints when available
Related to #879
When a user deletes an API token that was migrated from .env, track
the hash in a suppression list to prevent it from being re-migrated
on the next restart.
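A small sketch of the guard, with illustrative names (the persisted list corresponds to env_token_suppressions.json):

    // Hypothetical sketch: skip migrating any .env token whose hash the user
    // has previously deleted, so it is not resurrected on restart.
    package main

    import "fmt"

    func shouldMigrateEnvToken(tokenHash string, suppressed []string) bool {
        for _, h := range suppressed {
            if h == tokenHash {
                return false // deleted once already; do not re-migrate
            }
        }
        return true
    }

    func main() {
        suppressed := []string{"sha256:abc123"}
        fmt.Println(shouldMigrateEnvToken("sha256:abc123", suppressed)) // false
        fmt.Println(shouldMigrateEnvToken("sha256:def456", suppressed)) // true
    }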
Changes:
- Add SuppressedEnvMigrations field to Config
- Add env_token_suppressions.json persistence
- Check suppression list during env token migration
- Record suppressed hash when deleting "Migrated from .env" tokens
- Update RemoveAPIToken to return the removed record
Related to #871
PBS datastores are now displayed in the Storage overview alongside PVE
storage. Each PBS datastore is converted to a Storage entry with:
- type: 'pbs'
- content: 'backup'
- shared: true
- active: based on PBS instance status
This provides a complete picture of all storage resources in one view
while keeping detailed PBS info in the dedicated PBS section.
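A sketch of the conversion described above; the struct fields are illustrative and the real Storage model carries more data:

    // Hypothetical sketch: map a PBS datastore onto a Storage entry so it
    // appears in the Storage overview alongside PVE storage.
    package main

    import "fmt"

    type PBSDatastore struct {
        Name  string
        Total uint64
        Used  uint64
    }

    type Storage struct {
        Name    string
        Type    string
        Content string
        Shared  bool
        Active  bool
        Total   uint64
        Used    uint64
    }

    func storageFromPBSDatastore(ds PBSDatastore, instanceHealthy bool) Storage {
        return Storage{
            Name:    ds.Name,
            Type:    "pbs",    // distinguishes PBS entries from PVE storage
            Content: "backup", // PBS datastores only hold backups
            Shared:  true,
            Active:  instanceHealthy, // based on PBS instance status
            Total:   ds.Total,
            Used:    ds.Used,
        }
    }

    func main() {
        s := storageFromPBSDatastore(PBSDatastore{Name: "tank", Total: 1 << 40, Used: 1 << 39}, true)
        fmt.Printf("%+v\n", s)
    }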
Closes #869
- Fix PVE nodes: buildNodeUrl in ProxmoxNodesSection.tsx now prioritizes
guestURL over host (was ignoring guestURL entirely)
- Add PBS support: GuestURL field added to PBSInstance config, model,
and API handlers
- Add PMG support: GuestURL field added to PMGInstance config, model,
and API handlers
- Update NodeSummaryTable to use guestURL for PBS nodes
- Frontend types updated for PBS/PMG guestURL support
The Guest URL setting in node configuration now works correctly across
all node types. When set, it takes priority over the Host URL when
clicking on node names to navigate to the Proxmox/PBS/PMG web UI.
Closes #870
Backend:
- Enhanced buildEnrichedResourceContext to ALWAYS show learned baselines with
status indicators (normal/elevated/anomaly) instead of only when anomalous
- This makes Pulse Pro's 'moat' visible - users can see that the AI understands
their infrastructure's normal behavior patterns
- Added baseline import to service.go
Frontend (user changes):
- Added incident event type filtering with toggle buttons
- Added resource incident panel to view all incidents for a resource
- Added timeline expand/collapse functionality in alert history
- Added incident note saving with proper incidentId tracking
- Added startedAt parameter for proper incident timeline loading
Addresses issue #861 - syslog flooded on docker host
Many routine operational messages were being logged at INFO level,
causing excessive log volume when monitoring multiple VMs/containers.
These messages are now logged at DEBUG level:
- Guest threshold checking (every guest, every poll cycle)
- Storage threshold checking (every storage, every poll cycle)
- Host agent linking messages
- Filesystem inclusion in disk calculation
- Guest agent disk usage replacement
- Polling start/completion messages
- Alert cleanup and save messages
Users can set LOG_LEVEL=debug to see these messages if needed for
troubleshooting. The default INFO level now produces significantly
less log output.
Also updated documentation in CONFIGURATION.md and DOCKER.md to:
- Clarify what each log level includes
- Add tip about using LOG_LEVEL=warn for minimal logging
The issue was a SolidJS reactivity problem in the Dashboard component.
When the guestMetadata signal was accessed inside a For loop callback and
assigned to a plain variable, SolidJS lost reactive tracking.
Changed from:
const metadata = guestMetadata()[guestId] || ...
customUrl={metadata?.customUrl}
To:
const getMetadata = () => guestMetadata()[guestId] || ...
customUrl={getMetadata()?.customUrl}
This ensures SolidJS properly tracks the signal dependency when the
getter function is called directly in JSX props.
- Install script now auto-detects Docker, Kubernetes, and Proxmox
- Platform monitoring is enabled automatically when detected
- Users can override with --disable-* or --enable-* flags
- Allow same token to register multiple hosts (one per hostname)
- Update tests to reflect new multi-host token behavior
- Improve CompleteStep and UnifiedAgents UI components
- Update UNIFIED_AGENT.md documentation
- Add cluster-aware guest ID generation (clusterName-VMID instead of instanceName-VMID)
to prevent duplicate VMs/containers when multiple cluster nodes are monitored
- Add cluster deduplication at registration time - when a node is added that belongs
to an already-configured cluster, it is merged as an endpoint instead of creating a duplicate
- Add startup consolidation to automatically merge duplicate cluster instances
- Change host agent token binding from agent GUID to hostname, allowing:
- Multiple host agents to share a token (each bound by hostname)
- Agent reinstalls on same host without token conflicts
- Remove 12-character password minimum requirement
- Remove emoji from auto-registration success message
- Fix grouped view node lookup to support both cluster-aware node IDs
(clusterName-nodeName) and legacy guest grouping keys (instance-nodeName)
Fixes duplicate guests appearing when agents are installed on multiple
cluster nodes. Also improves multi-agent UX by allowing shared tokens.
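A sketch of the cluster-aware ID scheme; function and parameter names are illustrative:

    // Hypothetical sketch: key guests by cluster name when one exists, so the
    // same VM reported through two cluster members collapses to one entry.
    package main

    import "fmt"

    func guestID(clusterName, instanceName string, vmid int) string {
        if clusterName != "" {
            return fmt.Sprintf("%s-%d", clusterName, vmid)
        }
        return fmt.Sprintf("%s-%d", instanceName, vmid)
    }

    func main() {
        fmt.Println(guestID("homelab", "pve1", 101)) // homelab-101
        fmt.Println(guestID("homelab", "pve2", 101)) // homelab-101 (no duplicate)
        fmt.Println(guestID("", "standalone", 200))  // standalone-200
    }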
When a host agent registers, it now searches for a PVE node with a
matching hostname and links them together. Similarly, when PVE nodes
are discovered, they check for existing host agents with matching hostnames.
This prevents the confusion of seeing duplicate entries when users install
agents on PVE cluster nodes that were already discovered via the cluster API.
- Added LinkedHostAgentID field to Node struct
- Added LinkedNodeID/LinkedVMID/LinkedContainerID fields to Host struct
- Added findLinkedProxmoxEntity() to match by hostname (with domain stripping)
- Updated UpdateNodesForInstance() to preserve and auto-set links
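A minimal sketch of the hostname match with domain stripping; the real findLinkedProxmoxEntity() also handles linking to VMs and containers:

    // Hypothetical sketch: compare the agent hostname and PVE node name after
    // stripping any domain suffix, so "pve1.home.lan" links to node "pve1".
    package main

    import (
        "fmt"
        "strings"
    )

    func shortHostname(name string) string {
        if i := strings.Index(name, "."); i > 0 {
            return name[:i]
        }
        return name
    }

    func hostnamesMatch(agentHostname, nodeName string) bool {
        return strings.EqualFold(shortHostname(agentHostname), shortHostname(nodeName))
    }

    func main() {
        fmt.Println(hostnamesMatch("pve1.home.lan", "PVE1")) // true
        fmt.Println(hostnamesMatch("pve2", "pve1"))          // false
    }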
When no auth is configured (fresh install), CheckAuth allows all requests.
This creates a race condition where existing agents from a previous setup
can report data before the wizard completes security configuration.
This fix clears all host agents and docker hosts when /api/security/quick-setup
is called, ensuring the wizard shows a clean state after security is configured.
Added:
- State.ClearAllHosts() - removes all host agents
- State.ClearAllDockerHosts() - removes all docker hosts
- Monitor.ClearUnauthenticatedAgents() - clears both and resets token bindings
- Call to ClearUnauthenticatedAgents() in handleQuickSecuritySetupFixed()
- Add MetricsRetentionRawHours, MetricsRetentionMinuteHours, MetricsRetentionHourlyDays, MetricsRetentionDailyDays to SystemSettings
- Wire settings from system.json through Config to metrics store initialization
- Set sensible defaults: Raw=2h, Minute=24h, Hourly=7d, Daily=90d
- Log active retention values on startup for transparency
Users can now customize how long metrics are stored at each aggregation tier.
- Add sortable table headers for Pod and Deployment views
- Click column headers to toggle sort direction
- Sort state persists across sessions
- Add namespace dropdown filter for Pods/Deployments views
- Auto-populates from available namespaces
- Include namespace filter in reset and active filters check
Backend:
- Seed OCI classification from previous state so containers never
'downgrade' to LXC if config fetching intermittently fails
- Prevent type regression in recordGuestSnapshot when OCI was previously detected
- Move metrics zeroing before snapshot recording for cleaner flow
Frontend:
- Add isOCIContainer() memo that checks both type and isOci flag
- Use isOCI helper in Dashboard.tsx for AI context building
- Include oci-container type in useResources container conversion
- Preserve isOci and osTemplate fields through legacy conversion
This ensures OCI containers retain their classification even when
Proxmox API permissions or transient errors prevent config reads.
- Refactored enrichContainerMetadata to not return early when container is stopped
- Status API calls are still skipped for stopped containers (as expected)
- Config fetch now runs regardless of status, enabling OCI detection
- Added test for OCI detection on stopped containers
Discovered: Proxmox 9.1 requires VM.Config.Options permission to read
OCI container configs (not just VM.Audit). Document this in setup guides.
- Added isOCIContainerByConfig() to detect OCI containers by:
- Presence of 'entrypoint' field (only OCI containers have this)
- Combination of ostype=unmanaged, cmode=console, and lxc.signal.halt
- This is needed because Proxmox doesn't persist ostemplate after creation
- Now supports detection of already-created OCI containers (like the test alpine container)
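A sketch of the config-based heuristic; the map keys mirror the fields named above, and the real implementation may weigh them differently:

    // Hypothetical sketch: an 'entrypoint' key only appears for OCI containers,
    // and ostype=unmanaged + cmode=console + lxc.signal.halt is a strong signal.
    package main

    import "fmt"

    func isOCIContainerByConfig(cfg map[string]string) bool {
        if _, ok := cfg["entrypoint"]; ok {
            return true
        }
        _, hasHalt := cfg["lxc.signal.halt"]
        return cfg["ostype"] == "unmanaged" && cfg["cmode"] == "console" && hasHalt
    }

    func main() {
        oci := map[string]string{"ostype": "unmanaged", "cmode": "console", "lxc.signal.halt": "SIGTERM"}
        lxc := map[string]string{"ostype": "debian"}
        fmt.Println(isOCIContainerByConfig(oci)) // true
        fmt.Println(isOCIContainerByConfig(lxc)) // false
    }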
- Backend: Add IsOCI and OSTemplate fields to Container model
- Backend: Add extractContainerOSTemplate() and isOCITemplate() detection functions
- Backend: Detect OCI containers via ostemplate config and set type to 'oci'
- Frontend: Add isOci and osTemplate to Container interface
- Frontend: Add 'oci-container' to ResourceType with distinct purple badge
- Frontend: Update Dashboard filters to include OCI containers with LXC
- Tests: Add comprehensive unit tests for OCI detection logic
OCI containers are detected by checking the ostemplate for patterns like:
- oci: prefix (e.g., oci:docker.io/library/alpine:latest)
- docker: prefix (e.g., docker:nginx:latest)
- Known registry URLs (docker.io, ghcr.io, gcr.io, quay.io, etc.)
- Local templates with oci- or oci_ filename patterns
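A sketch of the template-based check mirroring those patterns; the real isOCITemplate() may recognize more registries:

    // Hypothetical sketch of ostemplate pattern matching for OCI detection.
    package main

    import (
        "fmt"
        "strings"
    )

    func isOCITemplate(tmpl string) bool {
        t := strings.ToLower(tmpl)
        switch {
        case strings.HasPrefix(t, "oci:"), strings.HasPrefix(t, "docker:"):
            return true
        case strings.Contains(t, "docker.io/"), strings.Contains(t, "ghcr.io/"),
            strings.Contains(t, "gcr.io/"), strings.Contains(t, "quay.io/"):
            return true
        case strings.Contains(t, "oci-"), strings.Contains(t, "oci_"):
            return true
        }
        return false
    }

    func main() {
        fmt.Println(isOCITemplate("oci:docker.io/library/alpine:latest"))     // true
        fmt.Println(isOCITemplate("local:vztmpl/debian-12-standard.tar.zst")) // false
    }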
Phase 1 of Pulse AI differentiation:
- Create internal/ai/context package with types, trends, builder, formatter
- Implement linear regression for trend computation (growing/declining/stable/volatile)
- Add storage capacity predictions (predicts days until 90% and 100%)
- Wire MetricsHistory from monitor to patrol service
- Update patrol to use buildEnrichedContext instead of basic summary
- Update patrol prompt to reference trend indicators and predictions
This gives the AI awareness of historical patterns, enabling it to:
- Identify resources with concerning growth rates
- Predict capacity exhaustion before it happens
- Distinguish stable high usage from growing problems
- Provide more actionable, time-aware insights
All tests passing. Falls back to basic summary if metrics history unavailable.
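A worked sketch of the underlying math, a least-squares slope plus a days-until-threshold projection; the actual trend classification in internal/ai/context may differ:

    // Hypothetical sketch: fit usage = a + b*days by ordinary least squares and
    // project when usage crosses a capacity threshold at the current slope.
    package main

    import "fmt"

    func slopePerDay(days, usage []float64) float64 {
        n := float64(len(days))
        var sumX, sumY, sumXY, sumXX float64
        for i := range days {
            sumX += days[i]
            sumY += usage[i]
            sumXY += days[i] * usage[i]
            sumXX += days[i] * days[i]
        }
        denom := n*sumXX - sumX*sumX
        if denom == 0 {
            return 0
        }
        return (n*sumXY - sumX*sumY) / denom
    }

    // daysUntil returns -1 when usage is not growing or is already past the threshold.
    func daysUntil(current, threshold, slope float64) float64 {
        if slope <= 0 || current >= threshold {
            return -1
        }
        return (threshold - current) / slope
    }

    func main() {
        days := []float64{0, 1, 2, 3, 4, 5, 6}
        usage := []float64{70, 71, 72.5, 73, 74.5, 75, 76.5} // percent used per day
        b := slopePerDay(days, usage)
        fmt.Printf("growing at %.2f%%/day, ~%.0f days until 90%%\n", b, daysUntil(usage[len(usage)-1], 90, b))
    }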
- Add 'content' type to StreamDisplayEvent for tracking text chunks
- Track content events in streamEvents array for chronological display
- Update render to use Switch/Match for cleaner conditional rendering
- Interleave thinking, tool calls, and content as they stream in
- Add a fallback for old messages without streamEvents to keep backwards compatibility
Previously, tool/command outputs stayed at the top while AI text responses
accumulated at the bottom. Now all events appear in order like a
normal chatbot.
- Add alert-triggered AI analysis for real-time incident response
- Implement patrol history persistence across restarts
- Add patrol schedule configuration UI in AI Settings
- Enhance AIChat with patrol status and manual trigger controls
- Add resource store improvements for AI context building
- Expand Alerts page with AI-powered analysis integration
- Add Vite proxy config for AI API endpoints
- Support both Anthropic and OpenAI providers with streaming
- Add Claude OAuth authentication support with hybrid API key/OAuth flow
- Implement Docker container historical metrics in backend and charts API
- Add Ceph cluster data collection and a new Ceph page
- Enhance RAID status display with detailed tooltips and visual indicators
- Fix host deduplication logic with Docker bridge IP filtering
- Fix NVMe temperature collection in host agent
- Add comprehensive test coverage for new features
- Improve frontend sparklines and metrics history handling
- Fix navigation issues and frontend reload loops
Backend:
- Call SetMonitor after router creation to inject resource store
- Add debug logging for resource population and broadcast
Frontend:
- Add resources array to WebSocket store initial state
- Handle resources in WebSocket message processing
- Use reconcile for efficient state updates
The unified resources are now properly:
1. Populated from StateSnapshot on each broadcast cycle
2. Converted to frontend format (ResourceFrontend)
3. Included in WebSocket state messages
4. Received and stored in frontend state
5. Consumed by migrated route components
Console now shows '[DashboardView] Using unified resources: VMs: X'
confirming the migration is working end-to-end.
- Added PopulateFromSnapshot method to resources.Store
- Extended ResourceStoreInterface to include PopulateFromSnapshot
- Monitor now calls updateResourceStore before broadcasts
- This ensures resources are fresh on every WebSocket broadcast
Without this, the store would only be populated when /api/resources or
/api/state endpoints are hit, leaving WebSocket broadcasts empty.
- Extended StateFrontend with Resources field containing unified resource data
- Added ResourceFrontend and related types for frontend-compatible resource data
- Extended ResourceStoreInterface to include GetAll() method
- Monitor now injects resources into WebSocket broadcasts
- Added helper method getResourcesForBroadcast() to convert resources to frontend format
- All existing tests pass
This enables the frontend to access unified resources via WebSocket state.
This commit implements the Unified Resource Architecture for AI-first
infrastructure management. Key features:
Phase 1 - Backend Unification:
- New unified Resource type with 9 resource types, 7 platforms, 7 statuses
- Resource store with identity-based deduplication (hostname, machineID, IP)
- 8 converter functions (FromNode, FromVM, FromContainer, etc.)
- REST API endpoints: /api/resources, /api/resources/stats, /api/resources/{id}
- 28 comprehensive unit tests
Phase 2 - AI Context Enhancement:
- Unified context builder for AI system prompts
- Cross-platform query methods: GetTopByCPU, GetTopByMemory, GetTopByDisk
- Resource correlation: GetRelated (parent, children, siblings, cluster)
- Infrastructure summary: GetResourceSummary with health status counts
- AI context now includes top consumers and infrastructure overview
Phase 3 - Agent Preference & Hybrid Mode:
- Polling optimization methods in resource store
- ResourceStoreInterface added to Monitor
- SetResourceStore() and shouldSkipNodeMetrics() helper methods
- Store automatically wired into Monitor via Router.SetMonitor()
- Foundation ready for reduced API polling when agents are active
Files added:
- internal/resources/resource.go - Core Resource type
- internal/resources/store.go - Store with deduplication
- internal/resources/converters.go - Type converters
- internal/resources/platform_data.go - Platform-specific data
- internal/resources/store_test.go - 28 tests
- internal/resources/converters_test.go - Converter tests
- internal/api/resource_handlers.go - REST API handlers
- internal/ai/resource_context.go - AI context builder
- .gemini/docs/unified-resource-architecture.md - Architecture docs
All tests pass.
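A sketch of the identity-based deduplication idea; field names and key precedence are illustrative:

    // Hypothetical sketch: prefer a stable machine ID, then hostname, then IP,
    // so the same box seen by two platforms resolves to a single resource.
    package main

    import (
        "fmt"
        "strings"
    )

    type Identity struct {
        MachineID string
        Hostname  string
        IP        string
    }

    func dedupKey(id Identity) string {
        switch {
        case id.MachineID != "":
            return "machine:" + id.MachineID
        case id.Hostname != "":
            return "host:" + strings.ToLower(id.Hostname)
        default:
            return "ip:" + id.IP
        }
    }

    func main() {
        fromPVE := Identity{Hostname: "PVE1", IP: "192.168.1.10"}
        fromAgent := Identity{Hostname: "pve1"}
        fmt.Println(dedupKey(fromPVE) == dedupKey(fromAgent)) // true: one unified resource
    }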
- Implement 'Show Problems Only' toggle combining degraded status, high CPU/memory alerts, and needs backup filters
- Add 'Investigate with AI' button to filter bar for problematic guests
- Fix dashboard column sizing inconsistencies between bars and sparklines view modes
- Fix PBS backups display and polling
- Refine AI prompt for general-purpose usage
- Fix frontend flickering and reload loops during initial load
- Integrate persistent SQLite metrics store with Monitor
- Fortify AI command routing with improved validation and logging
- Fix CSRF token handling for note deletion
- Debug and fix AI command execution issues
- Various AI reliability improvements and command safety enhancements
- Extract ostype from LXC container config (debian, ubuntu, alpine, etc.)
- Map ostype values to human-readable names (e.g., "debian" -> "Debian")
- Add OSName field to Container model and ContainerFrontend
- Add icons for NixOS, openSUSE, and Gentoo in frontend
- LXC containers now show OS icons alongside VMs in the dashboard
Supported LXC OS types: alpine, archlinux, centos, debian, devuan,
fedora, gentoo, nixos, opensuse, ubuntu, unmanaged
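A sketch of the mapping; the actual table in Pulse may carry different entries or casing:

    // Hypothetical sketch: map LXC ostype values to human-readable OS names,
    // falling back to the raw value for unknown types.
    package main

    import "fmt"

    var lxcOSNames = map[string]string{
        "alpine": "Alpine", "archlinux": "Arch Linux", "centos": "CentOS",
        "debian": "Debian", "devuan": "Devuan", "fedora": "Fedora",
        "gentoo": "Gentoo", "nixos": "NixOS", "opensuse": "openSUSE",
        "ubuntu": "Ubuntu", "unmanaged": "Unmanaged",
    }

    func osName(ostype string) string {
        if name, ok := lxcOSNames[ostype]; ok {
            return name
        }
        return ostype
    }

    func main() {
        fmt.Println(osName("debian")) // Debian
        fmt.Println(osName("nixos"))  // NixOS
        fmt.Println(osName("haiku"))  // haiku (unknown, passed through)
    }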
Cache the err.Error() result in two locations:
- monitor.go: storage query retry logic (two calls reduced to one)
- monitor_polling.go: storage timeout handling (two calls reduced to one)
strconv.Itoa is faster than fmt.Sprintf("%d", ...) because it doesn't
need to parse a format string. Changed 4 occurrences in monitoring
package where integers are converted to strings.
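The change in a nutshell (illustrative variable names):

    // strconv.Itoa yields the same string as fmt.Sprintf("%d", ...) without
    // parsing a format string.
    package main

    import (
        "fmt"
        "strconv"
    )

    func main() {
        vmid := 101
        before := fmt.Sprintf("%d", vmid)
        after := strconv.Itoa(vmid)
        fmt.Println(before == after) // true
    }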
- firstForwardedValue: strings.Split always returns at least one element
- shouldRunBackupPoll: remaining is always >= 1 by math
- convertContainerDiskInfo: lowerLabel is never empty for non-rootfs
All three functions now at 100% coverage.
Host disk bars were showing virtual filesystems like tmpfs, /dev, /run,
/sys, and Docker overlay mounts. These clutter the UI and don't represent
meaningful disk usage.
Changed from `shouldIgnoreReadOnlyFilesystem` (which only skipped read-only
filesystems) to the full `fsfilters.ShouldSkipFilesystem`, which also excludes:
- Virtual FS types: tmpfs, devtmpfs, sysfs, proc, cgroup, etc.
- Special mountpoints: /dev, /proc, /sys, /run, /var/lib/docker, /snap
- Network filesystems: fuse, nfs, cifs, etc.
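A condensed sketch of that filter; the real fsfilters.ShouldSkipFilesystem covers more types and mountpoints:

    // Hypothetical sketch: skip virtual filesystem types and special mountpoints
    // so host disk bars only show meaningful disks.
    package main

    import (
        "fmt"
        "strings"
    )

    var skipFSTypes = map[string]bool{
        "tmpfs": true, "devtmpfs": true, "sysfs": true, "proc": true,
        "cgroup": true, "cgroup2": true, "overlay": true, "squashfs": true,
        "nfs": true, "cifs": true, "fuse": true,
    }

    var skipMountPrefixes = []string{"/dev", "/proc", "/sys", "/run", "/var/lib/docker", "/snap"}

    func shouldSkipFilesystem(fsType, mountpoint string) bool {
        if skipFSTypes[strings.ToLower(fsType)] {
            return true
        }
        for _, prefix := range skipMountPrefixes {
            if mountpoint == prefix || strings.HasPrefix(mountpoint, prefix+"/") {
                return true
            }
        }
        return false
    }

    func main() {
        fmt.Println(shouldSkipFilesystem("tmpfs", "/run")) // true
        fmt.Println(shouldSkipFilesystem("ext4", "/"))     // false
    }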
Related to #790
The error message referenced "Settings -> Docker -> Removed hosts" but
that UI path no longer exists. The correct path is now
"Settings -> Agents -> Removed Docker Hosts".
Related to #778
The backup status indicator feature was incomplete - it added the UI
component but never populated VM/Container LastBackup from actual
backup data. Now SyncGuestBackupTimes() is called after storage
backups and PBS backups are polled, matching each guest's VMID to
its most recent backup timestamp.
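A sketch of the matching step; types are illustrative and the real SyncGuestBackupTimes() spans both storage and PBS backups:

    // Hypothetical sketch: compute the newest backup per VMID and stamp it onto
    // the matching guest's LastBackup field.
    package main

    import (
        "fmt"
        "time"
    )

    type Backup struct {
        VMID int
        Time time.Time
    }

    type Guest struct {
        VMID       int
        LastBackup time.Time
    }

    func syncGuestBackupTimes(guests []Guest, backups []Backup) {
        latest := make(map[int]time.Time)
        for _, b := range backups {
            if b.Time.After(latest[b.VMID]) {
                latest[b.VMID] = b.Time
            }
        }
        for i := range guests {
            if t, ok := latest[guests[i].VMID]; ok {
                guests[i].LastBackup = t
            }
        }
    }

    func main() {
        guests := []Guest{{VMID: 101}, {VMID: 102}}
        backups := []Backup{
            {VMID: 101, Time: time.Date(2024, 5, 1, 2, 0, 0, 0, time.UTC)},
            {VMID: 101, Time: time.Date(2024, 6, 1, 2, 0, 0, 0, time.UTC)},
        }
        syncGuestBackupTimes(guests, backups)
        fmt.Println(guests[0].LastBackup.Format(time.RFC3339), guests[1].LastBackup.IsZero())
    }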
Fixes #786
Move the inline filesystem skip logic from pollVMsAndContainersEfficient
into a reusable ShouldSkipFilesystem function. This consolidates filtering
for virtual filesystems (tmpfs, cgroup, etc.), network mounts (nfs, cifs,
fuse), and special mountpoints (/dev, /proc, /snap, etc.) into one tested
location.
Reduces cyclomatic complexity of pollVMsAndContainersEfficient and adds
28 test cases covering virtual fs types, network mounts, special mounts,
Windows paths, and edge cases.