Commit graph

29 commits

Author SHA1 Message Date
rcourtman
5c54685f04 Add API token scopes and standalone host agent
Introduces granular permission scopes for API tokens (docker:report, docker:manage, host-agent:report, monitoring:read/write, settings:read/write) allowing tokens to be restricted to minimum required access. Legacy tokens default to full access until scopes are explicitly configured.

Adds standalone host agent for monitoring Linux, macOS, and Windows servers outside Proxmox/Docker estates. New Servers workspace in UI displays uptime, OS metadata, and capacity metrics from enrolled agents.

Includes comprehensive token management UI overhaul with scope presets, inline editing, and visual scope indicators.
2025-10-23 11:40:31 +00:00
rcourtman
a885fb5472 Surface LXC interface IPs via PVE interfaces API (#596) 2025-10-23 08:07:32 +00:00
rcourtman
b95c01066e Capture dynamic LXC IP metrics (#596) 2025-10-23 07:50:45 +00:00
rcourtman
be85459db2 Add LXC config metadata for guest drawers (#596) 2025-10-23 07:30:32 +00:00
rcourtman
aac3dacd63 Improve LXC guest metrics visibility (#596) 2025-10-22 22:24:33 +00:00
rcourtman
fe1533ea13 Improve PMG metric ingestion refs #551 2025-10-22 18:15:27 +00:00
rcourtman
3a3e0e080c Add replication monitoring plumbing and UI
Refs #395
2025-10-22 16:10:15 +00:00
rcourtman
c9543e8a7e Add qemu guest agent version metadata 2025-10-22 15:24:07 +00:00
rcourtman
f8b6aa6c97 Treat 501 responses as non-fatal in cluster failover (#449) 2025-10-22 14:23:13 +00:00
rcourtman
13e2577c57 Handle FreeBSD guest agent disk counters
Refs #580
2025-10-22 14:06:45 +00:00
rcourtman
ff4dc49ae4 Update Pulse install flow and related components 2025-10-21 19:58:53 +00:00
rcourtman
7c00055047 feat: unify and improve Proxmox discovery/scanning architecture
Replaced inconsistent per-product detection logic with a unified probe
architecture using confidence scoring and product-specific matchers.

Key improvements:
- PBS detection now inspects TLS certs, auth headers (401/403), and
  probes PBS-specific endpoints (/api2/json/status, /config/datastore)
  fixing false negatives for self-signed and auth-protected servers
- PMG detection uses header analysis first, then conditional endpoint
  probing, working consistently regardless of port
- Single unified probeProxmoxService() replaces separate checkPort8006()
  and checkServer() code paths, eliminating duplication
- Confidence scoring (0.0-1.0+) with evidence tracking for debugging
- Consolidated hostname resolution and version handling

Technical changes:
- Added ProxmoxProbeResult with structured evidence and scoring
- Added product matchers: applyPVEHeuristics, applyPMGHeuristics,
  applyPBSHeuristics
- Removed legacy methods: checkPort8006, checkServer, isPMGServer,
  detectProductFromEndpoint, and duplicate hostname helpers
- Updated all tests to use new unified probe architecture
- Added probe_test_helpers.go for test access to internal methods

All tests passing. Fixes PBS detection issues and improves consistency
across PVE/PMG/PBS discovery.
2025-10-21 13:09:41 +00:00
rcourtman
56c6c0cc0c feat: improve discovery with progress tracking, validation, and structured errors
Significantly enhanced network discovery feature to eliminate false positives,
provide real-time progress updates, and better error reporting.

Key improvements:
- Require positive Proxmox identification (version data, auth headers, or certificates)
  instead of reporting any service on ports 8006/8007
- Add real-time progress tracking with phase/target counts and completion percentage
- Implement structured error reporting with IP, phase, type, and timestamp details
- Fix TLS timeout handling to prevent hangs on unresponsive hosts
- Expose progress and structured errors via WebSocket for UI consumption
- Reduce log verbosity by moving discovery logs to debug level
- Fix duplicate IP counting to ensure progress reaches 100%

Breaking changes: None (backward compatible with legacy API methods)
2025-10-20 22:29:30 +00:00
rcourtman
5ebb32ce10 feat: enhance runtime configuration and system settings management
Improves configuration handling and system settings APIs to support
v4.24.0 features including runtime logging controls, adaptive polling
configuration, and enhanced config export/persistence.

Changes:
- Add config override system for discovery service
- Enhance system settings API with runtime logging controls
- Improve config persistence and export functionality
- Update security setup handling
- Refine monitoring and discovery service integration

These changes provide the backend support for the configuration
features documented in the v4.24.0 release.
2025-10-20 17:41:19 +00:00
rcourtman
c91b7874ac docs: comprehensive v4.24.0 documentation audit and updates
Complete documentation overhaul for Pulse v4.24.0 release covering all new
features and operational procedures.

Documentation Updates (19 files):

P0 Release-Critical:
- Operations: Rewrote ADAPTIVE_POLLING_ROLLOUT.md as GA operations runbook
- Operations: Updated ADAPTIVE_POLLING_MANAGEMENT_ENDPOINTS.md with DEFERRED status
- Operations: Enhanced audit-log-rotation.md with scheduler health checks
- Security: Updated proxy hardening docs with rate limit defaults
- Docker: Added runtime logging and rollback procedures

P1 Deployment & Integration:
- KUBERNETES.md: Runtime logging config, adaptive polling, post-upgrade verification
- PORT_CONFIGURATION.md: Service naming, change tracking via update history
- REVERSE_PROXY.md: Rate limit headers, error pass-through, v4.24.0 verification
- PROXY_AUTH.md, OIDC.md, WEBHOOKS.md: Runtime logging integration
- TROUBLESHOOTING.md, VM_DISK_MONITORING.md, zfs-monitoring.md: Updated workflows

Features Documented:
- X-RateLimit-* headers for all API responses
- Updates rollback workflow (UI & CLI)
- Scheduler health API with rich metadata
- Runtime logging configuration (no restart required)
- Adaptive polling (GA, enabled by default)
- Enhanced audit logging
- Circuit breakers and dead-letter queue

Supporting Changes:
- Discovery service enhancements
- Config handlers updates
- Sensor proxy installer improvements

Total Changes: 1,626 insertions(+), 622 deletions(-)
Files Modified: 24 (19 docs, 5 code)

All documentation is production-ready for v4.24.0 release.
2025-10-20 17:20:13 +00:00
rcourtman
7d422d2909 feat: add professional logging with runtime configuration and performance optimization
Implements structured logging package with LOG_LEVEL/LOG_FORMAT env support, debug level guards for hot paths, enriched error messages with actionable context, and stack trace capture for production debugging. Improves observability and reduces log overhead in high-frequency polling loops.
2025-10-20 15:13:38 +00:00
rcourtman
524f42cc28 security: complete Phase 1 sensor proxy hardening
Implements comprehensive security hardening for pulse-sensor-proxy:
- Privilege drop from root to unprivileged user (UID 995)
- Hash-chained tamper-evident audit logging with remote forwarding
- Per-UID rate limiting (0.2 QPS, burst 2) with concurrency caps
- Enhanced command validation with 10+ attack pattern tests
- Fuzz testing (7M+ executions, 0 crashes)
- SSH hardening, AppArmor/seccomp profiles, operational runbooks

All 27 Phase 1 tasks complete. Ready for production deployment.
2025-10-20 15:13:37 +00:00
rcourtman
b640347a78 fix: improve discovery performance and reliability
Discovery Fixes:
- Always update cache even when scan finds no servers (prevents stale data)
- Remove automatic re-add of deleted nodes to discovery (was causing confusion)
- Optimize Docker subnet scanning from 762 IPs to 254 IPs (3x faster)
- Add getHostSubnetFromGateway() to detect host network from container

Frontend Type Fixes:
- Fix ThresholdsTable editScope type errors
- Fix SnapshotAlertConfig index signature
- Remove unused variable in Settings.tsx

These changes make discovery faster, more reliable, and fix the issue where
deleted nodes would persist in the discovery cache or immediately reappear.
2025-10-18 22:59:40 +00:00
Pulse Automation Bot
cfdfe896be Adjust backup and snapshot alert handling 2025-10-18 20:11:01 +00:00
rcourtman
6fdef61710 Expand monitoring and discovery test coverage 2025-10-16 08:17:08 +00:00
rcourtman
4838793677 feat: enhance alerts system with tests and improved thresholds
- Add comprehensive test coverage for alerts package with 285+ new tests
- Implement ThresholdsTable component with metric thresholds display
- Enhance Alerts page UI with improved layout and metric filtering
- Add frontend component tests for Alerts page and ThresholdsTable
- Set up Vitest testing infrastructure for SolidJS components
- Improve config persistence with better validation
- Expand discovery tests with 333+ test cases
- Update API, configuration, and Docker monitoring documentation
2025-10-15 22:25:04 +00:00
rcourtman
91fecacfef feat: add docker agent command handling 2025-10-15 19:27:19 +00:00
rcourtman
aaae27dc11 Log memory source transitions for diagnostics (#553) 2025-10-15 19:19:11 +00:00
rcourtman
32421b36b8 Refs #533: add total-minus-used memory fallback 2025-10-15 18:19:54 +00:00
rcourtman
5ce47a72ec Improve discovery classification heuristics
Refs #551
2025-10-15 14:08:05 +00:00
rcourtman
881b7f9a54 Fix false ZFS log/cache warnings 2025-10-14 20:57:43 +00:00
rcourtman
7e5fa9a147 fix: restore cache-aware node memory on PVE 8.4 2025-10-14 16:40:45 +00:00
rcourtman
2163d6f5a8 Use guest meminfo available for VM memory usage 2025-10-12 11:03:56 +00:00
rcourtman
f46ff1792b Fix settings security tab navigation 2025-10-11 23:29:47 +00:00