The acquire() function blocked indefinitely without respecting context
cancellation. When a client disconnected while waiting for the per-node
lock, its goroutine stayed blocked forever, connections accumulated
in CLOSE_WAIT state, and rate limiter semaphores were never released.
Added acquireContext() that respects context cancellation and updated
both HTTP and RPC handlers to use it. This prevents:
- Goroutine leaks from cancelled requests
- CLOSE_WAIT connection accumulation
- Cascading failures from filled semaphores
Related to #832
Keep only the simple AI-powered approach:
- set_resource_url tool lets AI save discovered URLs
- Users ask AI directly: 'Find URLs for my containers'
- AI uses its intelligence to discover and set URLs
Removed:
- URLDiscoveryService (rigid port scanning)
- Bulk discovery API endpoints
- Frontend discovery button
The AI itself is smart enough to iterate through resources
and discover URLs when asked.
- Add URLDiscoveryService for scanning all resources at once
- Scans common web ports (80, 443, 8080, 8096, 3000, etc.)
- Automatically saves discovered URLs to resource metadata
- Add API endpoints for start/status/cancel discovery
- Progress tracking with results reporting
Endpoints:
- POST /api/ai/discover-urls/start - Start bulk discovery
- GET /api/ai/discover-urls/status - Check progress
- POST /api/ai/discover-urls/cancel - Cancel discovery
- Add MetadataProvider interface for AI to update resource URLs
- Add set_resource_url tool to AI service
- Wire up metadata stores to AI service via router
- Add URL discovery guidance to AI system prompt
- AI can now inspect guests/containers/hosts for web services
and automatically save discovered URLs to Pulse metadata
Usage: Ask the AI 'Find the web URL for this container' and it will:
1. Check for listening ports and web servers
2. Get the IP address
3. Verify the URL works
4. Save it to Pulse for quick dashboard access
- Add host metadata API for custom URL editing on hosts page
- Enhance AI routing with unified resource provider lookup
- Add encryption key watcher script for debugging key issues
- Improve AI service with better command timeout handling
- Update dev environment workflow with key monitoring docs
- Fix resource store deduplication logic
- Add Claude OAuth authentication support with hybrid API key/OAuth flow
- Implement Docker container historical metrics in backend and charts API
- Add Ceph cluster data collection and new Ceph page
- Enhance RAID status display with detailed tooltips and visual indicators
- Fix host deduplication logic with Docker bridge IP filtering
- Fix NVMe temperature collection in host agent
- Add comprehensive test coverage for new features
- Improve frontend sparklines and metrics history handling
- Fix navigation issues and frontend reload loops
Replaced 19 console.log statements in AI-related files with
logger.debug/warn/error calls. This ensures debug output only
appears in development mode, keeping production logs clean.
Files updated:
- frontend-modern/src/api/ai.ts (15 statements)
- frontend-modern/src/components/AI/AIChat.tsx (4 statements)
These were internal planning/architecture docs not meant for end users:
- .gemini/docs/unified-resource-architecture.md (design doc)
- .gemini/tasks/persistent-metrics-storage.md (implementation plan)
- frontend-modern/PLAN-column-visibility.md (implementation plan)
The AI service now uses only buildUnifiedResourceContext() for
infrastructure context, since the resourceProvider is always set
during router initialization.
Removed:
- buildInfrastructureContext() function (~288 lines of dead code)
- Legacy fallback path in buildSystemPrompt()
The unified resource context provides a cleaner, deduplicated view
of infrastructure that includes:
- All resources grouped by platform and type
- Top CPU/Memory/Disk consumers
- Active alerts on resources
- Infrastructure summary statistics
This completes the AI service migration to unified resources.
This cleanup addresses transition debt from the unified resources migration:
Frontend cleanup:
- Move all Resource→Legacy type conversions to useResourcesAsLegacy() hook
- Add asNodes() and asDockerHosts() adapter functions to the hook
- Simplify DockerRoute, HostsRoute, DashboardView to use the centralized hook
- Remove ~300 lines of duplicated adapter code from App.tsx
- Remove debug console.log statements from Dashboard.tsx
- Fix CPU value conversion (divide by 100) for Dashboard compatibility
Backend fixes (from previous session):
- Fix parentID format in converters (VM, Container, Storage) to match Node.ID
- Format changed from 'instance/node/nodename' to 'instance-nodename'
- Update tests to match new parentID format
This consolidates all legacy type conversion logic in one place,
making future cleanup easier when components are migrated to use
unified resources directly.
The Dashboard grouping was broken because:
- node was set to r.parentId (full resource ID)
- instance was set to r.platformId
Fixed to read from platformData which contains the correct values:
- node = platformData.node (e.g., 'minipc')
- instance = platformData.instance (e.g., 'https://pve:8006')
This matches the legacy data format and fixes the grouped/list toggle.
BREAKING: Route components no longer fall back to legacy state arrays.
All data now flows through the unified resource model:
- DockerRoute: uses state.resources filtered for docker-host/docker-container
- HostsRoute: uses state.resources filtered for host
- DashboardView: uses state.resources filtered for node/vm/container
The legacy arrays (state.nodes, state.vms, etc.) are still broadcast
by the backend for API compatibility, but the main UI routes no longer
use them.
If resources array is empty, pages will show no data rather than
falling back to legacy data. This ensures a clean data model with
no hidden fallback behavior.
Backend:
- Call SetMonitor after router creation to inject resource store
- Add debug logging for resource population and broadcast
Frontend:
- Add resources array to WebSocket store initial state
- Handle resources in WebSocket message processing
- Use reconcile for efficient state updates
The unified resources are now properly:
1. Populated from StateSnapshot on each broadcast cycle
2. Converted to frontend format (ResourceFrontend)
3. Included in WebSocket state messages
4. Received and stored in frontend state
5. Consumed by migrated route components
Console now shows '[DashboardView] Using unified resources: VMs: X'
confirming the migration is working end-to-end.
Temporary logging to verify which code path is being used:
- DockerRoute: logs docker-host and docker-container counts
- HostsRoute: logs host count
- DashboardView: logs VM count
Check browser console to confirm unified resources are being received.
These logs can be removed once migration is verified.
- Documented WebSocket state migration as completed
- Listed all pages migrated to unified resources
- Outlined future Phase 6 cleanup tasks
- Clarified the strategic shift from dedicated /resources view to
migrating existing pages
Docker page:
- DockerRoute now uses unified resources with fallback to legacy data
- Reconstructs container hierarchy from flat resource list
- Maps docker-host and docker-container resources to DockerHost type
Dashboard page:
- DashboardView now uses unified resources with fallback
- Converts vm, container, and node resources to legacy types
- Maintains full backward compatibility with existing components
Both pages use resource type filtering and platform data extraction
to adapt the unified model to existing component interfaces.
- Updated HostsRoute to consume unified resources with fallback to legacy data
- Added asHosts adapter to useResourcesAsLegacy hook
- Adapter converts Resource type to Host type for existing component
The Hosts page now uses resources from state.resources when available,
falling back to state.hosts for backward compatibility. This approach
allows gradual migration without breaking the existing HostsOverview
component.
- Added PopulateFromSnapshot method to resources.Store
- Extended ResourceStoreInterface to include PopulateFromSnapshot
- Monitor now calls updateResourceStore before broadcasts
- This ensures resources are fresh on every WebSocket broadcast
Without this, the store would only be populated when /api/resources or
/api/state endpoints are hit, leaving WebSocket broadcasts empty.
- Created resource.ts with TypeScript types for unified Resource model
- ResourceType, PlatformType, SourceType, ResourceStatus enums
- Resource interface matching backend ResourceFrontend
- Helper functions: isInfrastructure, isWorkload, getDisplayName, etc.
- ResourceFilter interface for complex filtering
- Updated api.ts State interface to include optional resources array
- Created useResources hook for accessing unified resources
- Reactive access via getGlobalWebSocketStore
- Pre-computed memos for infra, workloads, statusCounts
- Filtering methods: byType, byPlatform, filtered
- Query helpers: get, children, topByCpu, topByMemory
- Created useResourcesAsLegacy helper for migration
- Converts resources to legacy VM/Container formats
- Enables gradual component migration
This provides the foundation for migrating frontend pages to use
the unified resource model.
- Extended StateFrontend with Resources field containing unified resource data
- Added ResourceFrontend and related types for frontend-compatible resource data
- Extended ResourceStoreInterface to include GetAll() method
- Monitor now injects resources into WebSocket broadcasts
- Added helper method getResourcesForBroadcast() to convert resources to frontend format
- All existing tests pass
This enables the frontend to access unified resources via WebSocket state.
- Removed /resources page and associated frontend components
- Removed ResourcesOverview.tsx, UnifiedResourceRow.tsx, columns.ts
- Removed frontend types/resource.ts
- Updated unified-resource-architecture.md to mark Phase 4 as ABANDONED
- Removed unified-view-migration-plan.md
- Backend unified resource model remains for AI context
This is a checkpoint before attempting full frontend migration to unified model.
The Resources page was showing 0 resources because the store was only
populated when /api/state was called (from the dashboard). Now the
resources are populated on-demand when /api/resources is accessed.
Changes:
- Added StateProvider interface to ResourceHandlers
- SetStateProvider() method for injecting the monitor
- HandleGetResources now calls PopulateFromSnapshot before querying
- Router injects monitor as state provider during SetMonitor()
This ensures the /resources page works even when accessed directly
without visiting the main dashboard first.
This implements Phase 4 of the Unified Resource Architecture - the frontend
unified resources view.
New Features:
- Unified resources page at /resources route
- Fetches from /api/resources REST endpoint
- Auto-refreshes every 10 seconds
- Filtering by search, type, platform, status
- Grouping by type, platform, or parent
- Status indicators with alert badges
- CPU/Memory/Disk progress bars
Files Added:
- frontend-modern/src/types/resource.ts - TypeScript types matching Go backend
- frontend-modern/src/components/Resources/ResourcesOverview.tsx - Main component
Files Modified:
- frontend-modern/src/App.tsx - Added lazy import and route for ResourcesOverview
- .gemini/docs/unified-resource-architecture.md - Updated Phase 4 status
Access the unified view by navigating to /resources directly.
The route is not yet in the main navigation (power user feature).
This commit implements the Unified Resource Architecture for AI-first
infrastructure management. Key features:
Phase 1 - Backend Unification:
- New unified Resource type with 9 resource types, 7 platforms, 7 statuses
- Resource store with identity-based deduplication (hostname, machineID, IP)
- 8 converter functions (FromNode, FromVM, FromContainer, etc.)
- REST API endpoints: /api/resources, /api/resources/stats, /api/resources/{id}
- 28 comprehensive unit tests
Phase 2 - AI Context Enhancement:
- Unified context builder for AI system prompts
- Cross-platform query methods: GetTopByCPU, GetTopByMemory, GetTopByDisk
- Resource correlation: GetRelated (parent, children, siblings, cluster)
- Infrastructure summary: GetResourceSummary with health status counts
- AI context now includes top consumers and infrastructure overview
Phase 3 - Agent Preference & Hybrid Mode:
- Polling optimization methods in resource store
- ResourceStoreInterface added to Monitor
- SetResourceStore() and shouldSkipNodeMetrics() helper methods
- Store automatically wired into Monitor via Router.SetMonitor()
- Foundation ready for reduced API polling when agents are active
Files added:
- internal/resources/resource.go - Core Resource type
- internal/resources/store.go - Store with deduplication
- internal/resources/converters.go - Type converters
- internal/resources/platform_data.go - Platform-specific data
- internal/resources/store_test.go - 28 tests
- internal/resources/converters_test.go - Converter tests
- internal/api/resource_handlers.go - REST API handlers
- internal/ai/resource_context.go - AI context builder
- .gemini/docs/unified-resource-architecture.md - Architecture docs
All tests pass.
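The identity-based deduplication listed under Phase 1 could look roughly like the sketch below. The pared-down Resource struct, the identityKey helper, and the priority order (machineID, then hostname, then IP) are assumptions for illustration. The real store may merge records differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Resource is a hypothetical, pared-down stand-in for the unified Resource type.
type Resource struct {
	ID        string
	Hostname  string
	MachineID string
	IP        string
}

// identityKey picks the strongest identity available on a resource.
// The priority order here is an assumption, not taken from the source.
func identityKey(r Resource) string {
	switch {
	case r.MachineID != "":
		return "machine:" + r.MachineID
	case r.Hostname != "":
		return "host:" + strings.ToLower(r.Hostname)
	case r.IP != "":
		return "ip:" + r.IP
	}
	return "id:" + r.ID
}

// dedupe keeps the first resource seen for each identity key, preserving order.
func dedupe(in []Resource) []Resource {
	seen := make(map[string]bool, len(in))
	out := make([]Resource, 0, len(in))
	for _, r := range in {
		k := identityKey(r)
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, r)
	}
	return out
}

func main() {
	merged := dedupe([]Resource{
		{ID: "node/minipc", Hostname: "minipc", MachineID: "abc123"},
		{ID: "host/minipc", Hostname: "MiniPC", MachineID: "abc123"},
		{ID: "vm/101", Hostname: "web", IP: "10.0.0.5"},
	})
	for _, r := range merged {
		fmt.Println(r.ID)
	}
}
```

This sketch only matches on a single key per resource. Matching across several identities at once would need a lookup per identity field.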
- Extended AI context selection to host rows in HostsOverview
- Added resourceId prop to StackedMemoryBar for sparkline support
- Relocated guest URL editing from GuestRow name click
- Added GuestNotes component with URL field in AI sidebar
- Refined host routing in AI service backend
- Minor animation and styling improvements
- Implement 'Show Problems Only' toggle combining degraded status, high CPU/memory alerts, and needs backup filters
- Add 'Investigate with AI' button to filter bar for problematic guests
- Fix dashboard column sizing inconsistencies between bars and sparklines view modes
- Fix PBS backups display and polling
- Refine AI prompt for general-purpose usage
- Fix frontend flickering and reload loops during initial load
- Integrate persistent SQLite metrics store with Monitor
- Fortify AI command routing with improved validation and logging
- Fix CSRF token handling for note deletion
- Debug and fix AI command execution issues
- Various AI reliability improvements and command safety enhancements
Simplified OS display to plain "Windows" and "Linux" text labels.
Previous icon attempts were rejected as too complex or unclear.
Text labels are cleaner and more universally recognizable.
- Extract ostype from LXC container config (debian, ubuntu, alpine, etc.)
- Map ostype values to human-readable names (e.g., "debian" -> "Debian")
- Add OSName field to Container model and ContainerFrontend
- Add icons for NixOS, openSUSE, and Gentoo in frontend
- LXC containers now show OS icons alongside VMs in the dashboard
Supported LXC OS types: alpine, archlinux, centos, debian, devuan,
fedora, gentoo, nixos, opensuse, ubuntu, unmanaged
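A short sketch of the ostype mapping, assuming a simple lookup table. Only the "debian" -> "Debian" pair comes from the notes above. The other display names, the lxcOSNames table, and the osName helper are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// lxcOSNames maps LXC ostype values to display names.
// Only the debian entry is confirmed by the notes; the rest are assumed.
var lxcOSNames = map[string]string{
	"alpine":    "Alpine",
	"archlinux": "Arch Linux",
	"debian":    "Debian",
	"nixos":     "NixOS",
	"opensuse":  "openSUSE",
	"ubuntu":    "Ubuntu",
}

// osName returns the display name for an ostype, falling back to the
// raw value when the type is not in the table.
func osName(ostype string) string {
	key := strings.ToLower(strings.TrimSpace(ostype))
	if name, ok := lxcOSNames[key]; ok {
		return name
	}
	return ostype
}

func main() {
	for _, t := range []string{"debian", "nixos", "unmanaged"} {
		fmt.Printf("%s -> %s\n", t, osName(t))
	}
}
```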
- Add OSInfoCell component with OS-specific icons (Windows, Ubuntu,
Debian, Alpine, CentOS/RHEL, Fedora, Arch, FreeBSD, generic Linux)
- Each OS type has a distinct color for quick visual identification
- Portal tooltip shows full OS name, version, and guest agent version
- Much more compact than text strings like "Microsoft Windows Server 2022"
OS info requires guest agent to be installed and configured, so most
guests won't have this data. Move to detailed tier so it only shows
on extra-wide screens or when explicitly enabled by user.
- Add checkmark icon for fresh backups
- Add warning triangle for stale backups
- Add X icon for critical/never backups
- Use consistent Portal-based tooltip matching other columns
- Show formatted date, time, and relative age in tooltip
Replace drawer-based info display with inline columns that can be toggled:
- Add IP, Uptime, Node, Backup, OS, Tags columns (user-toggleable)
- Add ColumnPicker dropdown to show/hide columns with localStorage persistence
- Columns auto-show based on screen width using priority system
- Remove GuestDrawer - all info now visible inline or via tooltips
Rich hover tooltips:
- Disk bar: Shows all mount points with usage %, color-coded by severity
- Memory bar: Shows used/free/balloon/swap breakdown
- IP column: Shows network icon + count, hover for interfaces, MACs, IPs, traffic
Also:
- Create useColumnVisibility hook for responsive column management
- Create ColumnPicker component for column toggle UI
- Update drawer layouts in Hosts/Docker tabs for consistency
The StatusDot component was computing variant, size, and className once
at mount time, not reactively. When a VM transitioned from stopped to
running, the tooltip updated (it accessed props.title directly) but the
dot color stayed red because className was stale.
Fix: Convert plain variable assignments to getter functions that access
props reactively, and call them in the JSX template.
ClearActiveAlerts triggers an async save to disk, which can race with
LoadActiveAlerts reading the file. The test now clears the in-memory
map directly without triggering the async save.
- Add AI service with Anthropic, OpenAI, and Ollama providers
- Add AI chat UI component with streaming responses
- Add AI settings page for configuration
- Add agent exec framework for command execution
- Add API endpoints for AI chat and configuration
EnhanceCP uses /var/container_tmp/{uuid}/merged for container overlays.
These are ephemeral container layers, not user storage, and should be
filtered from disk usage display. Related to #790
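One way to express this filter, as a sketch only: match mount points of the form /var/container_tmp/{uuid}/merged and skip them. The isEphemeralOverlay helper is a hypothetical name, and the check does not validate that the middle segment is a well-formed UUID.

```go
package main

import (
	"fmt"
	"strings"
)

// isEphemeralOverlay reports whether a mount point looks like
// /var/container_tmp/{uuid}/merged. The {uuid} segment is only
// required to be non-empty here.
func isEphemeralOverlay(mount string) bool {
	const prefix = "/var/container_tmp/"
	if !strings.HasPrefix(mount, prefix) {
		return false
	}
	parts := strings.Split(strings.TrimPrefix(mount, prefix), "/")
	return len(parts) == 2 && parts[0] != "" && parts[1] == "merged"
}

func main() {
	mounts := []string{
		"/",
		"/var/container_tmp/1b4e28ba-2fa1-11d2-883f-0016d3cca427/merged",
		"/home",
	}
	for _, m := range mounts {
		if isEphemeralOverlay(m) {
			continue // ephemeral container layer, not user storage
		}
		fmt.Println(m)
	}
}
```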
On dual-stack systems with net.ipv6.bindv6only=1 (as on some Proxmox 8
configurations), Go's net.Listen("tcp", "0.0.0.0:8443") may still bind
an IPv6-only socket. IPv4 localhost connections then hung while IPv6
connections worked.
Fix by detecting IPv4 addresses and explicitly using "tcp4" network
type when creating the listener. Related to #805
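A minimal sketch of the network selection, assuming the listen address is a host:port string. The listenNetwork helper is a hypothetical name. The result would be passed as the first argument to net.Listen.

```go
package main

import (
	"fmt"
	"net"
)

// listenNetwork returns "tcp4" when the host part of addr is an IPv4
// literal, and "tcp" otherwise (empty host, IPv6 literal, or hostname).
func listenNetwork(addr string) string {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return "tcp"
	}
	if ip := net.ParseIP(host); ip != nil && ip.To4() != nil {
		return "tcp4"
	}
	return "tcp"
}

func main() {
	for _, addr := range []string{"0.0.0.0:8443", "[::]:8443", ":8443"} {
		fmt.Printf("%-14s -> %s\n", addr, listenNetwork(addr))
	}
	// Usage: ln, err := net.Listen(listenNetwork(addr), addr)
}
```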