Commit graph

1228 commits

Author SHA1 Message Date
rcourtman
6e2cae2363 feat(ui): add history chart components for guest drawer
- HistoryChart: single metric visualization (CPU, memory, disk)
- UnifiedHistoryChart: combined multi-metric view
- Support for time range selection (1h to 90d)
- Responsive charts with proper dark mode support
- Fix corrupted tools_query_test.go from stash merge
2026-01-22 00:46:52 +00:00
rcourtman
2e0da42a81 chore: reliability and maintenance improvements
Host agent:
- Add SHA256 checksum verification for downloaded binaries
- Verify checksum file matches expected bundle filename

WebSocket:
- Add write failure tracking with graceful disconnection
- Increase write deadline to 30s for large state payloads
- Better handling for slow clients (Raspberry Pi, slow networks)

Monitoring:
- Remove unused temperature proxy imports
- Add monitor polling improvements
- Expand test coverage

Other:
- Update package.json dependencies
- Fix generate-release-notes.sh path handling
- Minor reporting engine cleanup
2026-01-22 00:45:04 +00:00
rcourtman
a55bdb7a3a feat(api): security and metrics history improvements
- Require admin + settings:write scope for setup-script-url endpoint
- Add license enforcement for long-term metrics (30d/90d require Pro)
- Add downsampling step calculation for metrics history queries
- Add isContainerSSHRestricted helper for SSH restriction checks
- Clean up temperature proxy references from config handlers
- Minor OIDC and rate limit improvements
2026-01-22 00:44:12 +00:00
rcourtman
f293f41499 refactor: consolidate AI tools tests
- Remove executor_test.go (tests moved to specific tool test files)
- Refactor infrastructure, patrol, profiles, and query tests
- Add query tool enhancements for better resource filtering
2026-01-22 00:43:41 +00:00
rcourtman
633eea83db refactor: remove deprecated config fields
- Remove unused envconfig tags (BackendHost, FrontendHost, etc.)
- Remove APITokenEnabled (infer from token count)
- Remove IframeEmbeddingAllow, Port, Debug, ConcurrentPolling
- Clean up temperature proxy comments from ClusterEndpoint
- Simplify API token diagnostic to use config field directly
2026-01-22 00:43:27 +00:00
rcourtman
c8b6cbfc6d feat(pro): long-term metrics history (30d/90d)
- Add FeatureLongTermMetrics license feature for Pro tier
- Implement tiered storage in metrics store (raw, minute, hourly, daily)
- Add covering index for unified history query performance
- Seed mock data for 90 days with appropriate aggregation tiers
- Update PULSE_PRO.md to document the feature
- 7-day history remains free, 30d/90d requires Pro license
2026-01-22 00:42:41 +00:00
rcourtman
bb47e1831c security: SSRF protection for webhook URLs
- Add DNS resolution validation to block webhooks to internal IPs
- Validate hostname resolves before accepting webhook URL
- Block metadata endpoints (AWS, GCP, Azure)
- Block localhost, private IPs, and reserved ranges
- Add context timeout for DNS lookups (2s)
2026-01-22 00:42:23 +00:00
rcourtman
222c88f33c chore: Mac-compatible dev scripts
- hot-dev.sh: Fix hostname -I for macOS, use ifconfig instead
- hot-dev.sh: Fix PULSE_AUDIT_DIR for mock mode
- hot-dev.sh: Use PULSE_REPOS_DIR for Pro module detection
- dev-check.sh: Fix pgrep -c (not supported on macOS)
- dev-check.sh: Use /tmp/pulse-debug.log on macOS instead of journalctl
- Update internal/api docs to use env var paths
2026-01-22 00:30:15 +00:00
rcourtman
61bb582d82 fix: disk-exclude now works with device paths and disk I/O
- Add MatchesDiskExclude() to check both device path and mountpoint
- Add MatchesDeviceExclude() for device-only matching
- Update collectDisks to check device in addition to mountpoint
- Update collectDiskIO to respect disk exclusions
- Patterns like /dev/sda, sda, or /mnt/backup all work now

Related to #1142
2026-01-21 19:03:05 +00:00
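Note on 61bb582d82: a rough sketch of the matching behaviour described above, where one pattern list is checked against both the device and the mountpoint. The function name and signature are assumptions, and exact string matching is assumed for simplicity. The real MatchesDiskExclude may support more pattern forms.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDiskExclude reports whether a disk should be skipped.
// Assumption: a pattern matches if it equals the mountpoint, the full
// device path ("/dev/sda"), or the device name without "/dev/" ("sda").
func matchesDiskExclude(device, mountpoint string, patterns []string) bool {
	shortDev := strings.TrimPrefix(device, "/dev/")
	for _, p := range patterns {
		if p == "" {
			continue // ignore empty patterns so they never match an empty field
		}
		if p == mountpoint || p == device || p == shortDev {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"sda", "/mnt/backup"}
	fmt.Println(matchesDiskExclude("/dev/sda", "/", patterns))
	fmt.Println(matchesDiskExclude("/dev/sdb", "/mnt/backup", patterns))
	fmt.Println(matchesDiskExclude("/dev/sdc", "/data", patterns))
}
```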
rcourtman
c44cb5af5b fix: use pure Go SQLite driver for arm64 compatibility
Switch from mattn/go-sqlite3 (CGO) to modernc.org/sqlite (pure Go)
for auth, audit, and notification queue storage. This enables SQLite
functionality on arm64 Docker images which are built with CGO_ENABLED=0.

Related to #1140
2026-01-21 18:58:23 +00:00
rcourtman
098f645b34 chore: update Docker configs and installer
- Dockerfile: remove sensor proxy build target
- docker-compose.yml: remove proxy service configuration
- install.sh: simplify installer without proxy
- updates/manager.go: minor updates
2026-01-21 12:03:52 +00:00
rcourtman
925815c3e7 test: update config and monitoring tests after proxy removal
Remove references to sensor proxy config fields in test cases.
2026-01-21 12:03:30 +00:00
rcourtman
7599915b8f refactor(api): remove sensor proxy config from API handlers
- config_handlers.go: remove proxy configuration endpoints
- system_settings.go: remove proxy-related settings
- rate_limit_config.go: update rate limit configuration
- Update related test files
2026-01-21 12:02:46 +00:00
rcourtman
7049f5b43c refactor: simplify temperature monitoring after sensor proxy removal
Remove proxy-related temperature code paths:
- temperature.go: remove proxy client integration and fallback logic
- config.go: remove SensorProxyEnabled and related config fields
- monitor.go: remove proxy client initialization and state

Temperature monitoring now relies solely on the unified agent approach.
2026-01-21 12:00:28 +00:00
rcourtman
d306e02151 fix: remove unused imports and obsolete tests in API handlers
- diagnostics.go: remove unused path/filepath and syscall imports
- router.go: remove unused errors import
- diagnostics_test.go: remove tests for deleted functions
  (normalizeHostForComparison, matchInstanceNameByHost)

These changes fix build errors after sensor proxy removal.
2026-01-21 11:59:41 +00:00
rcourtman
d4a6c0d2e8 refactor: remove legacy pulse-sensor-proxy temperature monitoring
The sensor proxy approach for temperature monitoring has been superseded
by the unified agent architecture where host agents report temperature
data directly. This removes:

- cmd/pulse-sensor-proxy/ - standalone proxy daemon
- internal/tempproxy/ - client library
- internal/api/*temperature_proxy* - API handlers and tests
- internal/api/sensor_proxy_gate* - feature gate
- internal/monitoring/*proxy_test* - proxy-specific tests
- scripts/*sensor-proxy* - installation and management scripts
- security/apparmor/, security/seccomp/ - proxy security profiles

Temperature monitoring remains available via the unified agent approach.
2026-01-21 11:59:04 +00:00
rcourtman
ebc29b4fdb feat: show pending apt updates for Proxmox nodes (#1083)
- Add PendingUpdates and PendingUpdatesCheckedAt fields to Node model
- Add GetNodePendingUpdates method to Proxmox client (calls /nodes/{node}/apt/update)
- Add 30-minute polling cache to avoid excessive API calls
- Add pendingUpdates to frontend Node type
- Add color-coded badge in NodeSummaryTable (yellow: 1-9, orange: 10+)
- Update test stubs for interface compliance

Requires Sys.Audit permission on Proxmox API token to read apt updates.
2026-01-21 10:53:36 +00:00
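Note on ebc29b4fdb: the badge thresholds above (yellow for 1-9, orange for 10+) amount to a small mapping. The sketch below expresses it in Go for consistency with the backend. The actual badge lives in the frontend NodeSummaryTable, and the function name and the empty-string "no badge" convention are assumptions.

```go
package main

import "fmt"

// pendingUpdatesBadge maps a pending-update count to a badge colour.
// An empty string means no badge is shown (assumed convention).
func pendingUpdatesBadge(n int) string {
	switch {
	case n >= 10:
		return "orange"
	case n >= 1:
		return "yellow"
	default:
		return ""
	}
}

func main() {
	for _, n := range []int{0, 1, 9, 10, 42} {
		fmt.Printf("%d pending -> %q\n", n, pendingUpdatesBadge(n))
	}
}
```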
rcourtman
cdcd50c8c1 fix: persist full-width layout preference on server. Related to #1130
Full-width mode now syncs to the server like dark mode, ensuring the setting
persists across Proxmox helper script updates. Previously it was stored only
in localStorage, which some update methods clear.
2026-01-20 23:01:33 +00:00
rcourtman
7ce1355bba fix(test): disable email in TestSendResolvedAlert to avoid retry delays
2026-01-20 18:29:29 +00:00
rcourtman
eec4bcf33e fix(test): update API test expectations for status codes and response format
2026-01-20 18:12:58 +00:00
rcourtman
a383f06848 fix(test): add stateFileDir to TestRun_Legacy test setup
2026-01-20 17:43:58 +00:00
rcourtman
36622d2c17 Hide unavailable AI tools
2026-01-20 17:19:47 +00:00
rcourtman
ecc31730f6 Remove OpenCode references
2026-01-20 16:56:41 +00:00
rcourtman
b57b4a7c3c Tighten AI chat routing and context display
2026-01-20 16:30:55 +00:00
rcourtman
0248f0de5a fix(alerts): Prevent RAID check/scrub from triggering rebuild alerts. Related to #1125
DSM data scrubbing causes RAID arrays to enter a 'check' state with
RebuildPercent > 0, which was incorrectly triggering rebuild warnings.

Now distinguishes between:
- 'check' state: scheduled data scrubbing (no alert)
- 'recover'/'resync' state: actual rebuild (warning alert)
- 'clean' state with RebuildPercent: scrub in progress (no alert)
2026-01-20 16:13:58 +00:00
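Note on 0248f0de5a: the three cases above reduce to deciding on the array state rather than on RebuildPercent alone. A minimal sketch, assuming the state strings are exactly those quoted in the commit message. The function name is hypothetical.

```go
package main

import "fmt"

// shouldAlertRebuild reports whether a RAID array warrants a rebuild warning.
// rebuildPercent is accepted but deliberately ignored: per the commit
// message, a non-zero percentage also appears during scheduled scrubs
// ("check", or "clean" with progress), so only the state decides.
func shouldAlertRebuild(state string, rebuildPercent float64) bool {
	switch state {
	case "recover", "resync":
		return true // actual rebuild
	default:
		return false // "check", "clean", or anything else
	}
}

func main() {
	fmt.Println(shouldAlertRebuild("check", 37.5))
	fmt.Println(shouldAlertRebuild("clean", 12.0))
	fmt.Println(shouldAlertRebuild("recover", 5.0))
	fmt.Println(shouldAlertRebuild("resync", 80.0))
}
```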
rcourtman
96b7370f7b test: improve coverage for API, AI, Alerts, and Frontend Utils
- Add comprehensive tests for internal/api/config_handlers.go (Phases 1-3)
- Improve test coverage for AI tools, chat service, and session management
- Enhance alert and notification tests (ResolvedAlert, Webhook)
- Add frontend unit tests for utils (searchHistory, tagColors, temperature, url)
- Add proximity client API tests
2026-01-20 15:52:39 +00:00
rcourtman
ee63d438cc docs: standardize markdown syntax and remove deprecated sensor-proxy docs
2026-01-20 09:43:49 +00:00
rcourtman
1c22688d9b fix: standardize API version format and guest key separators
Closes #1115 (discussion feedback)

Two API consistency issues reported by @FabienD74:

1. Version format mismatch in /api/version:
   - currentVersion: "5.0.16" (no prefix)
   - latestVersion: "v5.0.16" (with prefix)

   Fixed: LatestVersion now strips the "v" prefix to match CurrentVersion format.

2. Guest ID separator inconsistency:
   - Some code used colons: "instance:node:vmid"
   - BuildGuestKey used dashes: "instance-node-vmid"

   Fixed: BuildGuestKey now uses colon separator matching the canonical
   format used by makeGuestID in the monitoring package. The existing
   legacy migration in GetWithLegacyMigration handles old dash-format
   entries in guest_metadata.json.
2026-01-19 22:20:18 +00:00
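Note on 1c22688d9b: both fixes are small string normalizations. The sketch below shows the idea only. The function names and the (instance, node, vmid) signature are assumptions and may not match the real BuildGuestKey.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeVersion strips a leading "v" so that "v5.0.16" and "5.0.16"
// compare equal.
func normalizeVersion(v string) string {
	return strings.TrimPrefix(v, "v")
}

// buildGuestKey joins the parts with colons, the canonical
// "instance:node:vmid" format named in the commit message.
func buildGuestKey(instance, node string, vmid int) string {
	return fmt.Sprintf("%s:%s:%d", instance, node, vmid)
}

func main() {
	fmt.Println(normalizeVersion("v5.0.16") == normalizeVersion("5.0.16"))
	fmt.Println(buildGuestKey("pve1", "node1", 100))
}
```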
rcourtman
a6a8efaa65 test: Add comprehensive test coverage across packages
New test files with expanded coverage:

API tests:
- ai_handler_test.go: AI handler unit tests with mocking
- agent_profiles_tools_test.go: Profile management tests
- alerts_endpoints_test.go: Alert API endpoint tests
- alerts_test.go: Updated for interface changes
- audit_handlers_test.go: Audit handler tests
- frontend_embed_test.go: Frontend embedding tests
- metadata_handlers_test.go, metadata_provider_test.go: Metadata tests
- notifications_test.go: Updated for interface changes
- profile_suggestions_test.go: Profile suggestion tests
- saml_service_test.go: SAML authentication tests
- sensor_proxy_gate_test.go: Sensor proxy tests
- updates_test.go: Updated for interface changes

Agent tests:
- dockeragent/signature_test.go: Docker agent signature tests
- hostagent/agent_metrics_test.go: Host agent metrics tests
- hostagent/commands_test.go: Command execution tests
- hostagent/network_helpers_test.go: Network helper tests
- hostagent/proxmox_setup_test.go: Updated setup tests
- kubernetesagent/*_test.go: Kubernetes agent tests

Core package tests:
- monitoring/kubernetes_agents_test.go, reload_test.go
- remoteconfig/client_test.go, signature_test.go
- sensors/collector_test.go
- updates/adapter_installsh_*_test.go: Install adapter tests
- updates/manager_*_test.go: Update manager tests
- websocket/hub_*_test.go: WebSocket hub tests

Library tests:
- pkg/audit/export_test.go: Audit export tests
- pkg/metrics/store_test.go: Metrics store tests
- pkg/proxmox/*_test.go: Proxmox client tests
- pkg/reporting/reporting_test.go: Reporting tests
- pkg/server/*_test.go: Server tests
- pkg/tlsutil/extra_test.go: TLS utility tests

Total: ~8000 lines of new test code
2026-01-19 19:26:18 +00:00
rcourtman
d06ed2edb3 refactor: Add testability improvements to core packages
hostagent/commands.go:
- Extract execCommandContext as mockable variable

hostagent/proxmox_setup.go:
- Convert stateFilePath constants to variables (testable)
- Extract runCommand and lookPath as mockable functions
- Add duplicate comment (minor cleanup needed)

notifications/notifications.go:
- Add GetQueueStats() method for interface compliance
- Used by NotificationMonitor interface

updates/manager.go:
- Add AddSSEClient, RemoveSSEClient, GetSSECachedStatus methods
- Enables interface-based SSE client management

pkg/audit/export.go:
- Minor testability improvements

go.mod/go.sum:
- Add stretchr/objx v0.5.2 (test mocking dependency)
2026-01-19 19:25:38 +00:00
rcourtman
dc16c94766 fix: Add robustness improvements to approval, auth, and server
approval/store.go:
- Make Approve() idempotent - return success if already approved
- Handles double-clicks and race conditions gracefully

auth.go:
- Add dev mode admin bypass (disabled by default)
- When ALLOW_ADMIN_BYPASS=1, sets X-Authenticated-User header

server.go:
- Call router.StopOpenCodeAI() during shutdown
- Ensures AI service stops cleanly on server termination
2026-01-19 19:24:45 +00:00
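Note on dc16c94766: an idempotent Approve() treats "already approved" as success instead of an error, so a double-click or a racing second request is harmless. A minimal sketch with a hypothetical in-memory store. The status strings and type names are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errNotFound = errors.New("approval not found")

// approvalStore is a hypothetical in-memory store keyed by request ID.
type approvalStore struct {
	mu     sync.Mutex
	status map[string]string // "pending", "approved", "denied"
}

// Approve marks a request approved. Approving twice returns nil both times.
func (s *approvalStore) Approve(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	st, ok := s.status[id]
	switch {
	case !ok:
		return errNotFound
	case st == "approved":
		return nil // idempotent: nothing left to do
	case st != "pending":
		return fmt.Errorf("approval %s is %s, cannot approve", id, st)
	}
	s.status[id] = "approved"
	return nil
}

func main() {
	s := &approvalStore{status: map[string]string{"req-1": "pending"}}
	fmt.Println(s.Approve("req-1")) // first click
	fmt.Println(s.Approve("req-1")) // double-click
}
```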
rcourtman
f478046696 refactor(api): Add interfaces to handlers for testability
Extract interfaces from concrete monitor type dependencies:

alerts.go:
- Add AlertManager, ConfigPersistence, AlertMonitor interfaces
- Change AlertHandlers to accept AlertMonitor interface

notifications.go:
- Add NotificationManager, NotificationConfigPersistence interfaces
- Add NotificationMonitor interface
- Change NotificationHandlers to accept NotificationMonitor interface

updates.go:
- Add UpdatesMonitor interface
- Change UpdatesHandlers to accept interface

audit_handlers.go:
- Update to use interface-based injection

profile_suggestions.go:
- Minor interface alignment

Benefits:
- Handlers can now be tested with mock implementations
- Decouples handlers from concrete monitoring.Monitor type
- Works with monitor_wrappers.go added in previous commit
2026-01-19 19:21:46 +00:00
rcourtman
5dc0177ec2 refactor(ai): Rename findings adapter and add chat patrol alias
- Rename findings_mcp_adapter.go -> findings_tools_adapter.go
- Update imports from mcp to tools package
- Add findings_tools_adapter_test.go with basic tests
- Add SetChatPatrol method as alias for SetOpenCodePatrol
  (maintains API compatibility during transition)
2026-01-19 19:20:49 +00:00
rcourtman
ffb8928dbf refactor(api): Update handlers for native AI chat service
Adapts API handlers to use the new native chat service:

ai_handler.go:
- Replace opencode.Service with chat.Service
- Add AIService interface for testability
- Add factory function for service creation (mockable)
- Update provider wiring to use tools package types

ai_handlers.go:
- Add Notable field to model list response
- Simplify command approval - execution handled by agentic loop
- Remove inline command execution from approval endpoint

router.go:
- Update imports: mcp -> tools, opencode -> chat
- Add monitor wrapper types for cleaner dependency injection
- Update patrol wiring for new chat service

agent_profiles:
- Rename agent_profiles_mcp.go -> agent_profiles_tools.go
- Update imports for tools package

monitor_wrappers.go:
- New file with wrapper types for alert/notification monitors
- Enables interface-based dependency injection
2026-01-19 19:20:00 +00:00
rcourtman
0f5807d0f9 refactor(ai): Remove deprecated opencode sidecar package
The opencode package implemented the old architecture where:
- Pulse ran an OpenCode subprocess as a sidecar
- AI messages were proxied through OpenCode
- OpenCode connected back to Pulse's MCP server for tool access

This is now replaced by the native chat service (internal/ai/chat)
which calls AI providers directly and executes tools inline.

Removed files:
- sidecar.go: OpenCode process management
- service.go: Message proxying and session management
- client.go: HTTP client for OpenCode API
- bridge.go: MCP tool bridging
- patrol.go: Patrol context extraction
- *_test.go: Associated tests

Benefits of new architecture:
- No external subprocess dependency
- Direct streaming from AI providers
- Lower latency (no proxy hop)
- Simpler deployment
2026-01-19 19:18:14 +00:00
rcourtman
17ca31a557 refactor(ai): Replace mcp package with tools package for direct tool execution
This refactoring removes the MCP (Model Context Protocol) server layer and
converts AI tools to be called directly by the chat service.

Key changes:
- Rename package from internal/ai/mcp to internal/ai/tools
- Remove server.go - tools no longer exposed via MCP server
- Tools are now called directly by the chat service via ExecuteTool()

New tools added:
- Kubernetes: clusters, nodes, pods, deployments (4 tools)
- PMG: mail gateway status, mail stats, queues, spam stats (4 tools)
- Infrastructure: snapshots, PBS jobs, backup tasks, network stats,
  disk I/O, cluster status, swarm, services, tasks, recent tasks,
  physical disks, RAID status, host Ceph, resource disks (14 tools)
- Patrol: connection health, resolved alerts (2 tools)

Test coverage:
- Added comprehensive test files for adapters, infrastructure,
  patrol, profiles, and query tools

Total tools: 50 (was ~25)
2026-01-19 19:17:24 +00:00
rcourtman
5ff4f97a0d feat(ai): Add native chat service with streaming and tool execution
Replace the OpenCode sidecar with a native chat service that handles:
- Real-time streaming responses from AI providers
- Multi-turn conversation sessions with history
- Tool execution with automatic function calling
- Agentic workflows for autonomous task completion
- Patrol integration for automated health analysis

The chat service directly communicates with AI providers using the
new StreamingProvider interface, eliminating the need for an external
sidecar process. Sessions are managed in-memory with configurable
history limits.

Key components:
- service.go: Main chat service with provider integration
- session.go: Session management and message history
- agentic.go: Agentic loop for autonomous tool execution
- patrol.go: Patrol-specific chat context and analysis
- tools.go: Tool execution bridge to tools package
- types.go: Chat message and event type definitions
2026-01-19 19:12:04 +00:00
rcourtman
4fe3d7df77 feat(ai): Add streaming support and notable models to AI providers
- Add ChatStream method to all providers (Anthropic, OpenAI, Gemini, Ollama)
  for real-time streaming of AI responses with tool call support
- Add StreamingProvider interface with StreamEvent types for content,
  thinking, tool_start, tool_end, done, and error events
- Add notable models feature that fetches model metadata from models.dev
  to identify recent/recommended models (within last 3 months)
- Add Notable field to ModelInfo struct to flag "latest and greatest" models
- Add SupportsThinking method to check for extended reasoning capability

The streaming support enables real-time AI chat responses instead of
waiting for complete responses. The notable models feature helps users
identify which models are current and recommended.
2026-01-19 19:10:58 +00:00
rcourtman
17dec929a0 feat: Add mention support for webhook alerts. Related to #1118
Adds a Mention field to webhook configurations that allows users to tag
individuals or groups when alerts are sent. This works with:

- Discord: @everyone, <@USER_ID>, <@&ROLE_ID>
- Microsoft Teams: @General, user email
- Mattermost: @channel, @all, @username

The mention is included in the webhook payload via the {{.Mention}} template
variable. Built-in templates for Discord, Slack, and Teams now conditionally
include mentions when configured.

Backend changes:
- Add Mention field to WebhookConfig struct
- Add Mention field to WebhookPayloadData for template access
- Pass mention through sendGroupedWebhook

Frontend changes:
- Add mention field to Webhook interface
- Add Mention input to webhook configuration form
- Show service-specific help text for mention formats
2026-01-18 15:16:37 +00:00
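Note on 17dec929a0: a conditional {{.Mention}} in a Go text/template might look like the sketch below. The payload struct and template text are illustrative assumptions and are not the project's built-in templates. Note that text/template performs no JSON escaping, so a real payload template would need to handle that separately.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// payload is a hypothetical stand-in for the webhook template data.
type payload struct {
	Mention string
	Message string
}

// The mention and its trailing space are emitted only when Mention is set.
var webhookTmpl = template.Must(template.New("webhook").Parse(
	`{{if .Mention}}{{.Mention}} {{end}}{{.Message}}`))

func render(p payload) (string, error) {
	var b strings.Builder
	if err := webhookTmpl.Execute(&b, p); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	with, _ := render(payload{Mention: "@everyone", Message: "CPU usage high on node1"})
	without, _ := render(payload{Message: "CPU usage high on node1"})
	fmt.Println(with)
	fmt.Println(without)
}
```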
rcourtman
71bcc570ad fix: Add nil checks in findDuplicate() to prevent crash. Related to #1119
When a resource exists in the hostname or IP index but has been removed from
the main resources map, looking up and accessing .Type would cause a nil
pointer dereference panic.

The MachineID lookup already had this check, but hostname and IP lookups
were missing it. This adds consistent nil checking across all three lookup
paths.
2026-01-18 13:41:00 +00:00
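Note on 71bcc570ad: the crash pattern described above is a secondary index pointing at an ID that no longer exists in the main map. A minimal sketch of the guarded lookup, with hypothetical type and field names.

```go
package main

import "fmt"

type resource struct {
	ID   string
	Type string
}

// resourceStore keeps a main map plus a hostname index that can go stale.
type resourceStore struct {
	resources  map[string]*resource
	byHostname map[string]string // hostname -> resource ID
}

// findByHostname returns the resource of the given type indexed under
// host, or nil. The nil check guards against a stale index entry.
func (s *resourceStore) findByHostname(host, typ string) *resource {
	id, ok := s.byHostname[host]
	if !ok {
		return nil
	}
	r := s.resources[id]
	if r == nil {
		return nil // index entry outlived the resource
	}
	if r.Type != typ {
		return nil
	}
	return r
}

func main() {
	s := &resourceStore{
		resources:  map[string]*resource{},
		byHostname: map[string]string{"pve1": "res-1"}, // stale entry
	}
	fmt.Println(s.findByHostname("pve1", "host") == nil)
}
```

Without the nil check, the r.Type access would panic for the stale entry shown in main.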
rcourtman
204a9fe084 perf: Cache agent profiles to prevent disk I/O on every report. Related to #1094
GetHostAgentConfig was loading profiles and assignments from disk on
every agent report (every 10-30 seconds per host). With multiple hosts,
this caused disk I/O contention that eventually led to request timeouts.

Added in-memory caching with 60-second TTL:
- Fast path reads from cache without locks when valid
- Double-checked locking pattern for cache refresh
- Cache auto-invalidates after TTL, no manual invalidation needed
2026-01-17 22:31:02 +00:00
rcourtman
432f13b6f5 feat(ai): add Docker update management MCP tools
Add three new MCP tools for Docker container update management:
- pulse_list_docker_updates: list containers with pending updates
- pulse_check_docker_updates: trigger update check on a host
- pulse_update_docker_container: apply update with approval workflow

Changes:
- Add UpdatesProvider interface to executor.go
- Add response types to data_types.go
- Add UpdatesMCPAdapter to adapters.go
- Register tools and handlers in tools_infrastructure.go
- Add SetUpdatesProvider() to service.go
- Wire provider in router.go wireOpenCodeProviders()
2026-01-17 15:47:36 +00:00
rcourtman
4cea85ec97 feat(mcp): expand MCP tools and add session management APIs
New API endpoints:
- POST /api/ai/sessions/{id}/summarize - Compress context
- GET /api/ai/sessions/{id}/diff - Get file changes
- POST /api/ai/sessions/{id}/fork - Branch conversation
- POST /api/ai/sessions/{id}/revert - Undo changes
- POST /api/ai/sessions/{id}/unrevert - Restore reverted changes

MCP provider wiring:
- Storage, backup, disk health providers
- Metrics history, baseline, pattern detection
- Findings manager and metadata updater

Tool improvements:
- pulse_get_topology: Unified infrastructure view
- Improved tool descriptions with usage examples
- Better license checking with logging
2026-01-17 14:43:58 +00:00
rcourtman
c26f0e6e6c feat(ai): improve OpenCode integration and control level handling
OpenCode client improvements:
- Fix session listing with proper timestamp parsing
- Model selection with provider inference (anthropic, google, etc)
- Add session management APIs (summarize, diff, fork, revert)
- Generate session titles from first user message

Control level refactoring:
- IsAutonomous() helper for cleaner checks
- Legacy autonomous_mode maps to control_level for backwards compat
- Simplified system instructions (rely on tool descriptions instead)

Includes tests for model provider inference.
2026-01-17 14:43:28 +00:00
rcourtman
103eb9c3e0 feat(monitoring): auto-detect Docker inside LXC containers
Adds automatic Docker detection for Proxmox LXC containers:
- New HasDocker and DockerCheckedAt fields on Container model
- Docker socket check via connected agents on first run, restart, or start
- Parallel checking with timeouts for efficiency
- Caches results and only re-checks after state transitions

This enables the AI to know which LXC containers are Docker hosts
for better infrastructure guidance.
2026-01-17 14:42:52 +00:00
rcourtman
3512069965 feat(license): grant Pro features in dev mode
Extends the demo mode behavior to also apply when PULSE_DEV=true,
allowing developers to test Pro features during development without
requiring a license key.
2026-01-17 14:42:27 +00:00
rcourtman
3096ec53b5 fix: preserve alert activation state when saving config. Related to #1096
2026-01-16 14:24:02 +00:00
rcourtman
035436ad6e fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.

Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()

Related to #1063
2026-01-15 21:10:55 +00:00
rcourtman
a7de907c35 chore: remove internal planning doc, add gitignore patterns
- Remove docs/AGENTS_AI_SCOPE_PLAN.md (internal dev doc)
- Add gitignore patterns for *_PLAN.md, *_ROADMAP.md, *IMPLEMENTATION*.md in docs/
2026-01-15 13:53:42 +00:00
rcourtman
8c7581d32c feat(profiles): add AI-assisted profile suggestions
Add ability for users to describe what kind of agent profile they need
in natural language, and have AI generate a suggestion with name,
description, config values, and rationale.

- Add ProfileSuggestionHandler with schema-aware prompting
- Add SuggestProfileModal component with example prompts
- Update AgentProfilesPanel with suggest button and description field
- Streamline ValidConfigKeys to only agent-supported settings
- Update profile validation tests for simplified schema
2026-01-15 13:24:18 +00:00