Commit graph

240 commits

Author SHA1 Message Date
rcourtman
fe5f660524 Clarify dedicated agent token guidance (#1353) 2026-03-27 10:59:32 +00:00
rcourtman
7482cc9600 Document unified agent placement guidance (#1371) 2026-03-27 08:29:32 +00:00
rcourtman
0d16ee331a Add legacy sensor-proxy uninstall helper (#1363) 2026-03-26 23:09:16 +00:00
rcourtman
48f4438d23 Scale v5 Proxmox guest disk polling 2026-03-25 18:24:47 +00:00
rcourtman
aae6035e66 fix(docs): audit and fix agent docs vs install script discrepancies (#1299)
- Split configuration table into "Installer flags" and "Agent-only flags"
  so users know which flags work with `curl | bash` vs the binary directly
- Add missing --cacert and --env flags to installer docs
- Fix --disable-auto-update example (install script doesn't accept it;
  use --env PULSE_DISABLE_AUTO_UPDATE=true instead)
- Add --disable-docker/kubernetes/proxmox and --proxmox-type to
  install.sh show_help()
- Fix --enable-docker=false in CENTRALIZED_MANAGEMENT.md
2026-02-27 21:20:54 +00:00
rcourtman
29a6335905 fix(docs): correct remaining --enable-*=false flags in agent docs (#1299)
All --enable-docker=false, --enable-kubernetes=false, --enable-proxmox=false
references replaced with --disable-docker, --disable-kubernetes, --disable-proxmox.
2026-02-27 21:14:05 +00:00
rcourtman
0bc9445eb8 fix(docs): correct --enable-host=false to --disable-host in agent docs (#1299)
The installer uses --disable-host as a separate flag, not --enable-host=false.
2026-02-27 20:41:32 +00:00
rcourtman
54a1ace2c5 fix(installer): remove stale sensor-proxy mount entries that prevent LXC start after reboot (#1280)
The v4 installer added mount entries for /run/pulse-sensor-proxy to LXC
container configs. After upgrading to v5 and rebooting, /run (tmpfs) is
wiped and the container fails to start. The installer now detects and
removes these stale mp<N> and lxc.mount.entry references automatically
when run on a PVE host, and the upgrade docs include manual fix steps.
2026-02-22 10:52:12 +00:00
Surendra Raika
f663aade53 feat(docker): add macOS Docker Desktop socket auto-detection
Probe ~/.docker/run/docker.sock for RuntimeDocker and RuntimeAuto
before falling back to /var/run/docker.sock. This lets the agent
connect on macOS without requiring DOCKER_HOST to be set manually.

Ref #1200
2026-02-18 19:23:14 +05:30
rcourtman
cf047bd899 feat(install): add TrueNAS CORE (FreeBSD) support to install script (#1201)
Extends the TrueNAS SCALE installer to also support TrueNAS CORE
(FreeBSD-based). The installer auto-detects the platform and configures
the appropriate service manager: systemd for SCALE, rc.d for CORE.

- Rename is_truenas_scale() to is_truenas() with FreeBSD detection
- Add FreeBSD rc.d service script generation with placeholder substitution
- Add FreeBSD bootstrap script for Init/Shutdown task persistence
- Split install/uninstall paths by OS throughout the TrueNAS block
- Add --cacert <path> flag for custom CA bundles (wired to curl only,
  not passed to the agent binary)
- Fix --cacert incorrectly mapping to --insecure in exec args
- Fix missing closing quote on RCSCRIPT_LINK in FreeBSD bootstrap
- Fix unreachable echo after exit 0 in FreeBSD bootstrap

Co-authored-by: wilddev65 <wilddev65@users.noreply.github.com>
(cherry picked from commit affdbaeebaf2b1135431b232593122f464c6bb53)
2026-02-18 12:59:55 +00:00
T. Gossen
4f023a32e1 docs: add LXC console access instructions (#1241)
Community contribution: FAQ entry for LXC console access

(cherry picked from commit 580206c14f)
2026-02-18 12:59:51 +00:00
T. Gossen
0c9a8f7383 Added LXC row to the bootstrap token table (first row) (#1242)
Added explicit command and clarification for getting first-time bootstrap token on install

(cherry picked from commit 4730da1898)
2026-02-18 12:59:46 +00:00
rcourtman
839ed5cc1e docs(release): finalize hotfix 5.1.3 checklist and version bump 2026-02-07 14:18:53 +00:00
rcourtman
ee0e89871d fix: reduce metrics memory 86x by reverting buffer and adding LTTB downsampling
The in-memory metrics buffer was changed from 1000 to 86400 points per
metric to support 30-day sparklines, but this pre-allocated ~18 MB per
guest (7 slices × 86400 × 32 bytes). With 50 guests that's 920 MB —
explaining why users needed to double their LXC memory after upgrading
to 5.1.0.

- Revert in-memory buffer to 1000 points / 24h retention
- Remove eager slice pre-allocation (use append growth instead)
- Add LTTB (Largest Triangle Three Buckets) downsampling algorithm
- Chart endpoints now use a two-tier strategy: in-memory for ranges
  ≤ 2h, SQLite persistent store + LTTB for longer ranges
- Reduce frontend ring buffer from 86400 to 2000 points

Related to #1190
2026-02-04 19:49:52 +00:00
rcourtman
4bebd2f576 docs: fix incomplete sensor-proxy cleanup commands and add upgrade warning
The legacy cleanup section in TEMPERATURE_MONITORING.md only covered 1 of the
5 systemd units and referenced an outdated binary path. Users following these
docs still had the selfheal timer running, generating recurring TASK ERROR
entries in the Proxmox task log.

Updated with the complete set of units, correct file paths, and a note that
upgrading the Pulse container does not remove the sensor proxy from the host.
Added a sensor proxy removal section to UPGRADE_v5.md so users see the warning
during upgrade.

Related to #817
2026-02-04 10:27:03 +00:00
rcourtman
3237a4d7dd docs: clarify PVE backup permission requirements
- Update UPGRADE_v5.md to clarify the backup permission issue affects
  agent-based setups (not just v4→v5 upgrades), and note the fix version
- Add troubleshooting section to UNIFIED_AGENT.md for PVE backups

Related to #1139
2026-02-03 19:14:44 +00:00
rcourtman
744eeb0270 Chore: clean up staged changes for release
- Remove standalone pulse-assistant architecture doc (content lives in CLAUDE.md)
- Add CountdownTimer component for patrol schedule display
- Rewrite patrol handler test to focus on interval persistence
- Extract MockStateProvider to shared test file
2026-02-02 23:17:40 +00:00
rcourtman
fa1b74792e docs: add comprehensive deep-dive documentation for AI subsystems
Adds detailed architecture documentation for Pulse Patrol and Pulse Assistant. Updates AI.md and PULSE_PRO.md. Also includes additional tests.
2026-02-02 10:29:07 +00:00
rcourtman
6753727a04 docs: update API documentation and config file references
Comprehensive documentation updates:

API.md:
- Add /api/security/change-password endpoint
- Add AI provider test endpoints
- Add assistant chat & session management endpoints
- Add legacy chat sessions endpoints
- Add alert investigation and patrol autonomy endpoints
- Add findings & investigations endpoints
- Add approvals & command execution endpoints
- Add remediation plans endpoints
- Add intelligence & forecasting endpoints
- Add knowledge base endpoints
- Add debug endpoint
- Add Socket.IO compatibility endpoint

Config files:
- Document sso.enc, ai_chat_sessions.json
- Document profile-versions.json, profile-changelog.json, profile-deployments.json
2026-02-01 23:26:42 +00:00
rcourtman
017073a065 Document WebSocket endpoints and mock-mode PUT method
- Add /ws and /api/agent/ws WebSocket endpoint documentation
- Add PUT method to mock-mode endpoint
2026-02-01 22:26:17 +00:00
rcourtman
80cdfab536 Update metrics docs with canonical resourceType values
- Use canonical types (vm, container, dockerContainer) instead of
  aliases (guest, docker) in examples
- Document that guest/docker aliases are accepted by the API
- Clarify persistent store type mapping in data flow doc
2026-02-01 22:26:04 +00:00
rcourtman
487fcf76d4 Expand API documentation with additional endpoints
Document previously undocumented endpoints:

- Resource metadata endpoints (hosts, guests, docker containers)
- Public config endpoint
- Node test/update/delete/refresh endpoints
- Service discovery endpoints
- Alert management endpoints (config, history, bulk actions)
- Security apply-restart endpoint
- System settings and SSH config endpoints
- Logs streaming and download endpoints
- Server info endpoint

Also clarify resourceType aliases for metrics history.
2026-02-01 22:25:48 +00:00
rcourtman
ec802c4864 Update documentation with configuration and deployment details
- CONFIGURATION.md: Add comprehensive system.json keys table with
  descriptions for all polling, discovery, and UI settings
- DEPLOYMENT_MODELS.md: Document audit signing key, agent profile files,
  org metadata, and multi-tenant storage layout
- METRICS_HISTORY.md: Update resourceType values, add maxPoints param,
  document Pro license requirement for ranges beyond 7d
- MULTI_TENANT.md: Add storage layout and migration section, remove
  completed TODO items from backlog
- CENTRALIZED_MANAGEMENT.md: Update links and clarify architecture
- API.md: Update endpoint documentation
- UNIFIED_AGENT.md: Document --version and --self-test flags
2026-02-01 22:24:48 +00:00
rcourtman
724dee0b36 Update docs for BYOK Patrol and Pro auto-fix 2026-02-01 14:47:02 +00:00
rcourtman
17208cbf9d docs: update AI evaluation matrix and approval workflow documentation 2026-01-30 19:00:40 +00:00
rcourtman
0e880f3c89 feat(eval): improve patrol eval with polling-based completion
Refactor patrol eval runner to use a dual approach:
1. Poll GET /api/ai/patrol/status until Running=false (primary signal)
2. Best-effort SSE stream connection for tool event visibility

Changes:
- Add status polling loop with configurable timeout
- Make SSE stream optional (may not connect in time)
- Add Completed flag to PatrolRunResult
- Improve assertion error messages
- Add new scenarios and assertions

This is more reliable than relying solely on SSE stream which
may timeout waiting for headers during slow patrol initialization.
2026-01-29 08:20:39 +00:00
rcourtman
e227314d76 docs: update pulse-assistant architecture with current structure
- Remove hardcoded line numbers from enforcement references
- Update tool classification table with all current tools
- Reflect consolidated tool structure
2026-01-28 21:24:45 +00:00
rcourtman
44fecc37c0 feat(eval): enhance AI eval harness with retries and reporting
- Add retry logic for transient failures (phantom, stream, empty response)
- Add environment variable overrides for infrastructure naming
- Add JSON report output per scenario
- Expand assertions with new validation types
- Add more comprehensive test scenarios
- Add docs/EVAL.md with usage documentation

The eval harness now better handles flaky AI responses and provides
detailed reports for debugging.
2026-01-28 21:24:12 +00:00
rcourtman
94863a6750 Add comprehensive architecture documentation for Pulse Assistant
Document the complete safety architecture:

1. High-Level Architecture
   - LLM as untrusted proposer pattern
   - FSM gating and tool execution flow
   - ResolvedContext for session truth

2. Safety Invariants (9 total)
   - Session-scoped tool registration
   - FSM state enforcement
   - Strict resolution requirements
   - ExecutionIntent classification
   - NonInteractiveOnly constraint
   - Read/Write tool separation
   - Phantom execution detection
   - Recovery loop protection
   - Telemetry for all safety blocks

3. Implementation Details
   - FSM states and transitions
   - Tool classification rules
   - Intent detection patterns
   - Error handling and recovery

4. Extension Guide
   - Adding new tools safely
   - Required validations
   - Testing requirements

This serves as authoritative reference for contributors
and security auditors.
2026-01-28 16:49:51 +00:00
rcourtman
6873913e64 fix: install script and docs improvements
- Fixed --disable-docker not being passed to systemd service file. Related to #1151
- Added init: true requirement to HTTPS/TLS docs for Docker. Related to #1166
2026-01-26 20:48:57 +00:00
rcourtman
4a8f9827fe feat: add config migration system and multi-tenant support
Migration System:
- Add migration framework for config schema updates
- Add migration tests

Config Enhancements:
- Add multi-tenant configuration support
- Add DeepCopy for tenant isolation
- Enhance AI config options
- Improve API token handling
- Update persistence layer

Documentation:
- Update multi-tenant documentation
2026-01-24 22:43:10 +00:00
rcourtman
c4ca169e2b feat: add multi-tenant isolation foundation (disabled by default)
Implements multi-tenant infrastructure for organization-based data isolation.
Feature is gated behind PULSE_MULTI_TENANT_ENABLED env var and requires
Enterprise license - no impact on existing users.

Core components:
- TenantMiddleware: extracts org ID, validates access, 501/402 responses
- AuthorizationChecker: token/user access validation for organizations
- MultiTenantChecker: WebSocket upgrade gating with license check
- Per-tenant audit logging via LogAuditEventForTenant
- Organization model with membership support

Gating behavior:
- Feature flag disabled: 501 Not Implemented for non-default orgs
- Flag enabled, no license: 402 Payment Required
- Default org always works regardless of flag/license

Documentation added: docs/MULTI_TENANT.md
2026-01-23 21:42:27 +00:00
rcourtman
5efd1591ca docs: update AI documentation 2026-01-22 22:32:42 +00:00
rcourtman
ad4acf1222 chore: add frontend utilities and metrics documentation
- Add useResizeObserver and useTooltip React hooks
- Add utility functions for anomaly colors, error extraction, text width, and threshold colors
- Add METRICS_DATA_FLOW.md documentation
- Ignore SQLite temp files (*.db-shm, *.db-wal)
2026-01-22 13:48:41 +00:00
rcourtman
f1c2d7c12c docs: add logging overrides to configuration reference
Document LOG_FILE, LOG_MAX_SIZE, LOG_MAX_AGE, and LOG_COMPRESS
environment variables for log file configuration.
2026-01-22 00:44:33 +00:00
rcourtman
c8b6cbfc6d feat(pro): long-term metrics history (30d/90d)
- Add FeatureLongTermMetrics license feature for Pro tier
- Implement tiered storage in metrics store (raw, minute, hourly, daily)
- Add covering index for unified history query performance
- Seed mock data for 90 days with appropriate aggregation tiers
- Update PULSE_PRO.md to document the feature
- 7-day history remains free, 30d/90d requires Pro license
2026-01-22 00:42:41 +00:00
rcourtman
0ca6001bad docs: update documentation after sensor proxy deprecation
Update docs to reflect the simplified temperature monitoring architecture:
- Remove references to pulse-sensor-proxy throughout
- Update TEMPERATURE_MONITORING.md to focus on unified agent approach
- Update CONFIGURATION.md, DEPLOYMENT_MODELS.md, FAQ.md
- Remove SECURITY_CHANGELOG.md (proxy-specific security notes)
- Clarify current recommended setup in various guides
2026-01-21 12:00:59 +00:00
rcourtman
ee63d438cc docs: standardize markdown syntax and remove deprecated sensor-proxy docs 2026-01-20 09:43:49 +00:00
rcourtman
035436ad6e fix: add mutex to prevent concurrent map writes in Docker agent CPU tracking
The agent was crashing with 'fatal error: concurrent map writes' when
handleCheckUpdatesCommand spawned a goroutine that called collectOnce
concurrently with the main collection loop. Both code paths access
a.prevContainerCPU without synchronization.

Added a.cpuMu mutex to protect all accesses to prevContainerCPU in:
- pruneStaleCPUSamples()
- collectContainer() delete operation
- calculateContainerCPUPercent()

Related to #1063
2026-01-15 21:10:55 +00:00
rcourtman
a7de907c35 chore: remove internal planning doc, add gitignore patterns
- Remove docs/AGENTS_AI_SCOPE_PLAN.md (internal dev doc)
- Add gitignore patterns for *_PLAN.md, *_ROADMAP.md, *IMPLEMENTATION*.md in docs/
2026-01-15 13:53:42 +00:00
rcourtman
8c7581d32c feat(profiles): add AI-assisted profile suggestions
Add ability for users to describe what kind of agent profile they need
in natural language, and have AI generate a suggestion with name,
description, config values, and rationale.

- Add ProfileSuggestionHandler with schema-aware prompting
- Add SuggestProfileModal component with example prompts
- Update AgentProfilesPanel with suggest button and description field
- Streamline ValidConfigKeys to only agent-supported settings
- Update profile validation tests for simplified schema
2026-01-15 13:24:18 +00:00
rcourtman
95b849e213 chore: remove internal dev roadmap doc 2026-01-12 15:28:05 +00:00
rcourtman
b5233466a3 docs: add Pro features roadmap and implementation status
Documents implementation status for:
- Advanced SSO (SAML & Multi-Provider) - 100%
- Advanced Reporting - 100%
- Audit Logging - 100%
- Agent Profiles - 100%
- RBAC - 100%
- AI Auto-Fix - 100%
- Kubernetes AI - 55%
- AI Alert Analysis - 70%
- AI Patrol - 85%
2026-01-12 15:21:46 +00:00
rcourtman
f527e6ebd0 docs: fix Kubernetes DaemonSet deployment guide
Fixes #1091 - addresses all three documentation issues reported:

1. Binary path: Changed from /usr/local/bin/pulse-agent (which doesn't
   exist in the main image) to /opt/pulse/bin/pulse-agent-linux-amd64

2. PULSE_AGENT_ID: Added to example and documented why it's required
   for DaemonSets (prevents token conflicts when all pods share one
   API token)

3. Resource visibility flags: Added PULSE_KUBE_INCLUDE_ALL_PODS and
   PULSE_KUBE_INCLUDE_ALL_DEPLOYMENTS to example, with explanation
   of the default behavior (show only problematic resources)

Also added tolerations, resource requests/limits, and ARM64 note.
2026-01-11 21:43:23 +00:00
rcourtman
80729408c1 docs: add RBAC endpoints, OIDC group mapping, and update Pro terminology
- Add RBAC/role management endpoints to API.md
- Document OIDC group-to-role mapping feature in OIDC.md
- Add missing config files to CONFIGURATION.md (audit.db, AI files)
- Add OIDC_GROUP_ROLE_MAPPINGS env var documentation
- Fix "enterprise" -> "Pro" terminology in TROUBLESHOOTING.md
- Refocus TEMPERATURE_MONITORING.md on agent method, collapse legacy proxy docs
2026-01-10 13:59:50 +00:00
rcourtman
2a8f55d719 feat(enterprise): add Advanced Reporting and Audit Webhooks integration
This commit adds enterprise-grade reporting and audit capabilities:

Reporting:
- Refactored metrics store from internal/ to pkg/ for enterprise access
- Added pkg/reporting with shared interfaces for report generation
- Created API endpoint: GET /api/admin/reports/generate
- New ReportingPanel.tsx for PDF/CSV report configuration

Audit Webhooks:
- Extended pkg/audit with webhook URL management interface
- Added API endpoint: GET/POST /api/admin/webhooks/audit
- New AuditWebhookPanel.tsx for webhook configuration
- Updated Settings.tsx with Reporting and Webhooks tabs

Server Hardening:
- Enterprise hooks now execute outside mutex with panic recovery
- Removed dbPath from metrics Stats API to prevent path disclosure
- Added storage metrics persistence to polling loop

Documentation:
- Updated README.md feature table
- Updated docs/API.md with new endpoints
- Updated docs/PULSE_PRO.md with feature descriptions
- Updated docs/WEBHOOKS.md with audit webhooks section
2026-01-09 21:31:49 +00:00
rcourtman
3e2824a7ff feat: remove Enterprise badges, simplify Pro upgrade prompts
- Replace barrel import in AuditLogPanel.tsx to fix ad-blocker crash
- Remove all Enterprise/Pro badges from nav and feature headers
- Simplify upgrade CTAs to clean 'Upgrade to Pro' links
- Update docs: PULSE_PRO.md, API.md, README.md, SECURITY.md
- Align terminology: single Pro tier, no separate Enterprise tier

Also includes prior refactoring:
- Move auth package to pkg/auth for enterprise reuse
- Export server functions for testability
- Stabilize CLI tests
2026-01-09 16:51:08 +00:00
rcourtman
33bb0a95bb docs: Fix formatting in API reference 2026-01-08 20:15:25 +00:00
rcourtman
73c5128a87 feat(audit): Add audit log API endpoints and UI with signature verification
- Add GET /api/audit endpoint for listing events with filters
- Add GET /api/audit/:id/verify endpoint for signature verification
- Add AuditLogPanel UI component with filtering and verification
- Update docs with audit API documentation
- Add localStorage utils for persisting UI state
- Update gitignore patterns
2026-01-08 19:19:57 +00:00
rcourtman
7342191075 docs: fix Helm chart install commands to use GitHub Pages repo
The GHCR OCI registry (ghcr.io/rcourtman/pulse-chart) is returning 403/404
errors for unauthenticated users. Updated all Helm references to use the
working GitHub Pages Helm repository at https://rcourtman.github.io/Pulse

Fixes install issues reported by customers trying to deploy via Helm.

Files updated:
- docs/KUBERNETES.md
- docs/INSTALL.md
- docs/DEPLOYMENT_MODELS.md
- docs/UPGRADE_v5.md
2026-01-08 14:27:45 +00:00