Commit graph

559 commits

Author SHA1 Message Date
rcourtman
d716bbfdeb fix(security): add proper authorization to sensitive endpoints
- /api/agent-install-command: require admin + settings:write scope
  Previously only RequireAuth, allowing any authenticated user to mint
  high-privilege API tokens (host-agent:manage)

- /api/system/ssh-config: require settings:write scope
  Previously any authenticated token could modify ~/.ssh/config

- /api/system/verify-temperature-ssh: require settings:write scope
  Previously any authenticated token could trigger SSH connection
  attempts to arbitrary nodes (network scanning risk)

- /api/diagnostics: require admin privileges
  Previously exposed API token metadata (IDs, hints, usage mapping)
  to any authenticated token, enabling enumeration attacks
2026-02-03 17:47:40 +00:00
rcourtman
12a5a98117 fix: SSE race conditions, alert user spoofing, and security status oracle
SSE Broadcaster:
- Add per-client mutex to prevent concurrent writes to ResponseWriter
- Fix data race in cleanupLoop reading LastActive without synchronization
- Update LastActive in SendHeartbeat so clients aren't incorrectly pruned
  after 5 minutes of idle heartbeat traffic

Alert Acknowledgements:
- Extract authenticated user from X-Authenticated-User header instead of
  hardcoding 'admin' or trusting request body's User field
- Prevents audit log spoofing and ensures accurate user attribution

Security Status Endpoint:
- Remove ?token= query param validation from public /api/security/status
- Prevents endpoint from acting as a token validity oracle for attackers
- Authentication still works via session cookies and X-API-Token header
2026-02-03 17:40:58 +00:00
rcourtman
beae4c860c fix: address 6 security and reliability issues
Security fixes:
- Auto-register now requires settings:write scope for API tokens
- X-Forwarded-For in auto-register only trusted from verified proxies
- Public URL capture requires authentication (no loopback bypass)
- Lockout reset now uses RequireAdmin for session users

Reliability fixes:
- Docker stop command expiration clears PendingUninstall flag
- Cancelled notifications get completed_at set and are cleaned up
2026-02-03 17:32:44 +00:00
rcourtman
b2639ed5a5 Fix security vulnerabilities and critical bugs
- Fix WebSocket CORS bypass by strictly verifying origin
- Fix OIDC refresh token persistence by encrypting at rest
- Fix grouped webhook data mutation by cloning alerts
- Fix host agent uninstall authorization and config fetch logic
- Fix notification queue recovery for stuck sending items
- Fix ignored update history limit parameter
- Fix ineffective break statement in WebSocket write pump
2026-02-03 17:16:27 +00:00
rcourtman
bd030c7c87 security: fix webhook SSRF, rate limit spoofing, metrics retention, and url poisoning
- Fix SSRF and rate limit bypass in SendEnhancedWebhook by validating the rendered URL.
- Fix rate limit spoofing in updates API by using secure IP extraction (trusted proxies).
- Fix memory leak in metrics history by correctly clearing fully stale data series.
- Fix public URL poisoning by preventing overwrites when explicitly configured.
2026-02-03 16:58:13 +00:00
rcourtman
4f40c3d751 fix: resolve critical stability and auth issues
- Fix data race in webhook notifications by removing shared state
- Fix duplicate monitors on config reload by stopping old instances
- Prevent metrics ID deletion on transient startup errors
- Support Bearer auth header for config export/import endpoints
2026-02-03 16:46:27 +00:00
rcourtman
935326ebb7 fix(api/ai): resolve critical auth, agent download, and lifecycle issues
- Fix API-only mode to accept Bearer tokens and query params
- Fix data race in API token validation using fine-grained locking
- Fix unified agent download serving wrong binary for invalid arch
- Fix AI infra discovery running when AI disabled and missing stop mechanism
2026-02-03 16:35:12 +00:00
rcourtman
3d8374e527 Fix AI investigation context and UI settings
- Ensure correct org context is used for AI chat service resolution

- Fix AI adapter tests

- Update AI Intelligence page UI for advanced settings
2026-02-03 16:24:56 +00:00
rcourtman
bea3bbe5f6 Fix API token authentication and multi-tenancy logic
- Fix AuthContextMiddleware to use tenant-specific config for token validation

- Resolve data race in token LastUsedAt update

- Fix invalid org IDs returning 501/402 instead of 400

- Prevent unauthenticated organization directory creation (DoS protection)
2026-02-03 16:24:28 +00:00
rcourtman
88d95f40be feat: add Discovery Transparency & Trust features
- Add AI provider indicator showing local (Ollama) vs cloud (Anthropic/OpenAI) analysis
- Add "What Discovery Does" explanation section before first scan
- Show commands preview before scan so users know what will run
- Add scan details section showing raw command outputs for admins
- Filter sensitive Docker labels (passwords, secrets, tokens) before AI analysis
- Add comprehensive tests for label filtering

This improves sysadmin confidence by making discovery transparent about
what it does, what data it collects, and where that data goes.
2026-02-03 14:59:27 +00:00
rcourtman
c2ed6067f1 Fix: discovery routing, host identification, and UX feedback
- Fix routing for POST/PUT/DELETE on /api/discovery/host/ endpoints
  (Go's http.ServeMux was matching the longer prefix before method-specific routes)
- Add HOST-specific AI prompt that focuses on identifying the host OS
  rather than services/containers running on it
- Add success message UI after discovery completes
- Fix timing so success appears after data is visible (not during refetch)
- Add error handling and display for failed discoveries
2026-02-03 14:10:54 +00:00
rcourtman
a55ae78715 Revert "Add config option to disable tools for OpenAI-compatible endpoints"
This reverts commit 81229f206f.
2026-02-03 13:26:26 +00:00
rcourtman
81229f206f Add config option to disable tools for OpenAI-compatible endpoints
Some local LLM servers (LM Studio, llama.cpp) expose OpenAI-compatible
APIs but don't support function calling. When tools are sent to these
models, they output raw control tokens instead of proper responses.

This change adds:
- openai_tools_disabled config field in AIConfig
- AreToolsDisabledForProvider() method to check at runtime
- API support to get/set the new setting
- Tests for the new functionality

When enabled and using a custom OpenAI base URL, the chat service will
skip sending tools to the model, allowing basic chat functionality to
work even with models that don't support function calling.

Fixes #1154
2026-02-03 13:21:44 +00:00
rcourtman
744eeb0270 Chore: clean up staged changes for release
- Remove standalone pulse-assistant architecture doc (content lives in CLAUDE.md)
- Add CountdownTimer component for patrol schedule display
- Rewrite patrol handler test to focus on interval persistence
- Extract MockStateProvider to shared test file
2026-02-02 23:17:40 +00:00
rcourtman
02946d45ec test: expand api handler coverage 2026-02-02 23:01:29 +00:00
rcourtman
bfa648ddd5 Test: expand api feature test coverage
Add tests for AI intelligence, Docker/K8s agents, log redaction, and general router helper functions.
2026-02-02 22:02:22 +00:00
rcourtman
43d7fffeef Test: add coverage for auth and security handlers
Add additional tests for OIDC, SAML, and tenant middleware to improve coverage of security-critical paths.
2026-02-02 22:02:11 +00:00
rcourtman
97a985efb8 Test: improve frontend embedding coverage
Enhance tests for frontend embedding to cover filesystem overrides, dev proxy configuration, and SPA header handling.
2026-02-02 22:01:46 +00:00
rcourtman
eb2d07e48f Chore: enhance core api and metrics testability
Refactor Router to allow HTTP client injection for install script proxying. Add tests for unified agent install mechanism and additional metrics store coverage.
2026-02-02 22:01:36 +00:00
rcourtman
e60a11116b test(api): comprehensively improve test coverage to Security, Infra, and Features 2026-02-02 18:59:44 +00:00
rcourtman
d1f76982ec fix: finding drawer actions (notes persist, acknowledge visual, discuss context)
- Sync UserNote, AcknowledgedAt, SnoozedUntil, DismissedReason, Suppressed,
  and TimesRaised from ai.Finding to unified store in both callback and
  startup sync paths. Mirror note writes to unified store immediately.
- Dim acknowledged findings (opacity-60), add "Acknowledged" badge, hide
  acknowledge button once acknowledged, sort below unacknowledged in
  severity mode.
- Pass finding_id through frontend chat API → backend ChatRequest →
  ExecuteRequest. Look up full finding from unified store (mutex-guarded)
  and prepend structured context to the prompt.
2026-02-02 15:18:51 +00:00
rcourtman
e86c25c771 test(api): increase coverage for discovery and chat adapter
- Add comprehensive tests for discovery_handlers.go (~75% coverage)
- Add tests for chat_service_adapter.go (previously 0% coverage)
- Fix missing API key issues in chat adapter tests by using ollama model configuration
2026-02-02 14:53:52 +00:00
rcourtman
4af5fc4246 refactor(config): rename BackendHost/BackendPort to BindAddress
Simplify server config by consolidating BackendHost and BackendPort into
a single BindAddress field. The port is now solely controlled by FrontendPort.

Changes:
- Replace BackendHost/BackendPort with BindAddress in Config struct
- Add deprecation warning for BACKEND_HOST env var (use BIND_ADDRESS)
- Update connection timeout default from 45s to 60s
- Remove backendPort from SystemSettings and frontend types
- Update server.go to use cfg.BindAddress
- Update all tests to use new config field names
2026-02-01 23:26:32 +00:00
rcourtman
3c3b041368 Improve patrol autonomy error messages for clarity
- Update license_required error to mention 'Auto-fix' instead of
  'Assisted and Full autonomy' for clearer user messaging
- Update full_mode_locked error to reference the UI toggle label
  'Auto-fix critical issues' instead of internal field name
2026-02-01 22:25:32 +00:00
rcourtman
e780a78725 test(ai): update tests for license gate removals and DeepSeek cleanup 2026-02-01 18:08:02 +00:00
rcourtman
9279358cec imp(api): remove remaining license gates for intelligence features 2026-02-01 16:28:49 +00:00
rcourtman
0c802e7083 fix(patrol): improve service lifecycle, graceful shutdown, and concurrency 2026-02-01 16:27:25 +00:00
rcourtman
44f9a36d5c feat(license): implement free Patrol / pro Auto-Fix tiering strategy 2026-02-01 16:27:10 +00:00
rcourtman
95a0d7a6bd feat(backend): implement AI Patrol, Investigation, and system-wide refactors 2026-01-30 19:02:14 +00:00
rcourtman
bb8fbfb411 feat(backend): implement real-time log broadcasting and handlers 2026-01-30 19:01:58 +00:00
rcourtman
c457358c63 fix(api): flush SSE headers immediately on patrol stream connect
Send an SSE comment immediately when a client connects to the patrol
stream endpoint. This flushes HTTP headers so clients receive the
200 response right away, rather than blocking until the first event.

This fixes eval tests where the stream connection would time out
waiting for headers while patrol was still initializing.
2026-01-29 08:20:25 +00:00
rcourtman
9c2f8a3284 refactor(ai): remove obsolete tool and chat files
Remove files that were consolidated into other modules:
- chat/patrol.go, patrol_test.go → moved to chat/service.go
- tools_infrastructure.go → merged into tools_storage.go
- tools_intelligence.go → merged into tools_metrics.go
- tools_patrol.go → merged into tools_alerts.go
- tools_profiles.go, tools_profiles_test.go → removed (unused)

Update related test file references.
2026-01-28 21:30:24 +00:00
rcourtman
03b5586ac8 refactor(ai): update patrol and service to use chat service adapter
- Update patrol.go to use chat service for AI execution
- Update service.go with chat service provider integration
- Add patrol streaming endpoint to router
2026-01-28 21:24:34 +00:00
rcourtman
badbad4464 refactor(ai): integrate patrol execution into chat service
- Add ExecutePatrolStream method to chat.Service for patrol-specific execution
- Create chat_service_adapter.go to bridge chat.Service to ai.ChatServiceProvider
- Remove standalone patrol.go and patrol_test.go from chat package
- Add PatrolRequest/PatrolResponse types to chat service
- Add context injection for recent message context

This allows patrol to use an isolated agentic loop with its own system prompt
while leveraging the common chat infrastructure.
2026-01-28 21:21:41 +00:00
rcourtman
9dcd859056 Update API handlers for AI and discovery endpoints
API layer updates:

AI Handlers:
- Better streaming response handling
- Improved error responses
- Session management improvements

Discovery Handlers:
- New discovery endpoint handlers
- Storage config handler
- Better router organization

Removed deprecated aidiscovery handlers in favor of unified approach.
2026-01-28 16:51:35 +00:00
rcourtman
7f7edfceb4 test: expand backend coverage 2026-01-25 21:08:44 +00:00
rcourtman
f2648891a9 chore: remove unused utility functions
Remove createColumn helper from responsive types and getPulsePort
from URL utilities. Both were exported but never used anywhere.
2026-01-24 23:23:27 +00:00
rcourtman
ff2841a5c6 Fix patrol scoping and config propagation 2026-01-24 23:07:55 +00:00
rcourtman
de2cb7a29b chore: remove deprecated GetAvailableModels and ModelInfo
- Remove deprecated config.ModelInfo type (use providers.ModelInfo)
- Remove deprecated GetAvailableModels function (always returned nil)
- Remove associated test
- Update AISettingsResponse to use providers.ModelInfo
2026-01-24 23:00:16 +00:00
rcourtman
9072b8eaa8 feat: enhance API router with multi-tenant authorization
Router & Middleware:
- Add auth context middleware for user/token extraction
- Add tenant middleware with authorization checking
- Refactor middleware chain ordering for proper isolation
- Add router helpers for common patterns

Authentication & SSO:
- Enhance auth with tenant-aware context
- Update OIDC, SAML, and SSO handlers for multi-tenant
- Add RBAC handler improvements
- Add security enhancements

New Test Coverage:
- API foundation tests
- Auth and authorization tests
- Router state and general tests
- SSO handler CRUD tests
- WebSocket isolation tests
- Resource handler tests
2026-01-24 22:42:23 +00:00
rcourtman
27f1a11acb feat: add AI Intelligence system with investigation and forecasting
Major new AI capabilities for infrastructure monitoring:

Investigation System:
- Autonomous finding investigation with configurable autonomy levels
- Investigation orchestrator with rate limiting and guardrails
- Safety checks for read-only mode enforcement
- Chat-based investigation with approval workflows

Forecasting & Remediation:
- Trend forecasting for resource capacity planning
- Remediation engine for generating fix proposals
- Circuit breaker for AI operation protection

Unified Findings:
- Unified store bridging alerts and AI findings
- Correlation and root cause analysis
- Incident coordinator with metrics recording

New Frontend:
- AI Intelligence page with patrol controls
- Investigation drawer for finding details
- Unified findings panel with actions

Supporting Infrastructure:
- Learning store for user preference tracking
- Proxmox event ingestion and correlation
- Enhanced patrol with investigation triggers
2026-01-24 22:41:43 +00:00
rcourtman
c4ca169e2b feat: add multi-tenant isolation foundation (disabled by default)
Implements multi-tenant infrastructure for organization-based data isolation.
Feature is gated behind PULSE_MULTI_TENANT_ENABLED env var and requires
Enterprise license - no impact on existing users.

Core components:
- TenantMiddleware: extracts org ID, validates access, 501/402 responses
- AuthorizationChecker: token/user access validation for organizations
- MultiTenantChecker: WebSocket upgrade gating with license check
- Per-tenant audit logging via LogAuditEventForTenant
- Organization model with membership support

Gating behavior:
- Feature flag disabled: 501 Not Implemented for non-default orgs
- Flag enabled, no license: 402 Payment Required
- Default org always works regardless of flag/license

Documentation added: docs/MULTI_TENANT.md
2026-01-23 21:42:27 +00:00
rcourtman
4c19fa3c1b fix: resolve btrfs disk summing (#1158), podman disable flag (#1151), and diagnostics path (#1155) 2026-01-23 19:24:38 +00:00
rcourtman
508c9f88f6 fix: Support partial updates for PBS nodes. Related to #1105
Allow updating PBS node settings (like excludeDatastores) without
requiring host to be resent. Match the behavior of PVE/PMG handlers
which only validate and update fields when provided.

Previously, PUT /api/config/nodes/{pbs-id} with just {excludeDatastores: [...]}
would fail with 'host is required' because the handler always called
normalizeNodeHost regardless of whether a new host was provided.
2026-01-23 00:13:28 +00:00
rcourtman
1ac53fa9f1 test: add stream, restart, and fallback tests for AI handlers and providers 2026-01-22 22:33:33 +00:00
rcourtman
8bf31214f5 feat: enhance API handlers and router with new endpoints
- Add new AI handler endpoints
- Enhance diagnostics API
- Improve router configuration
2026-01-22 22:31:24 +00:00
rcourtman
f2541b0d6c Refactor: Multi-tenancy support for API and License handlers
- Updated LicenseHandlers and LicenseService to be context/tenant aware
- Refactored API router and middleware to support tenant-scoped license checks
- Updated associated tests for context-aware handlers
2026-01-22 16:42:39 +00:00
rcourtman
289d95374f feat: add multi-tenancy foundation (directory-per-tenant)
Implements Phase 1-2 of multi-tenancy support using a directory-per-tenant
strategy that preserves existing file-based persistence.

Key changes:
- Add MultiTenantPersistence manager for org-scoped config routing
- Add TenantMiddleware for X-Pulse-Org-ID header extraction and context propagation
- Add MultiTenantMonitor for per-tenant monitor lifecycle management
- Refactor handlers (ConfigHandlers, AlertHandlers, AIHandlers, etc.) to be
  context-aware with getConfig(ctx)/getMonitor(ctx) helpers
- Add Organization model for future tenant metadata
- Update server and router to wire multi-tenant components

All handlers maintain backward compatibility via legacy field fallbacks
for single-tenant deployments using the "default" org.
2026-01-22 13:39:06 +00:00
rcourtman
c75972d57c Fix mock metrics history and guest drawer controls 2026-01-22 09:39:53 +00:00
rcourtman
a55bdb7a3a feat(api): security and metrics history improvements
- Require admin + settings:write scope for setup-script-url endpoint
- Add license enforcement for long-term metrics (30d/90d require Pro)
- Add downsampling step calculation for metrics history queries
- Add isContainerSSHRestricted helper for SSH restriction checks
- Clean up temperature proxy references from config handlers
- Minor OIDC and rate limit improvements
2026-01-22 00:44:12 +00:00