Docker container URL preserved on update (#1054): container updates
recreate the container with a new runtime ID. The agent now includes
{oldContainerId, newContainerId} in the completion ACK payload; the
server uses this to copy persisted metadata (custom URLs, descriptions,
tags) to the new ID so nothing is lost. Migration is a copy, not a move,
so rollback scenarios still find metadata under the original ID.
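
A minimal sketch of the copy-not-move behaviour described above, assuming a plain dictionary keyed by container ID stands in for the server's persisted metadata. The function names and the store shape are hypothetical; only the oldContainerId and newContainerId fields come from the commit text.

```python
import copy


def migrate_container_metadata(store, old_id, new_id):
    """Copy metadata from old_id to new_id, leaving the old entry in place."""
    meta = store.get(old_id)
    if meta is None:
        return False
    # Deep copy so later edits to the new entry never leak into the old one.
    store[new_id] = copy.deepcopy(meta)
    return True


def handle_update_ack(store, ack):
    """Apply the ID pair carried in a completion ACK payload (assumed dict shape)."""
    old_id = ack.get("oldContainerId")
    new_id = ack.get("newContainerId")
    if not old_id or not new_id or old_id == new_id:
        return False
    return migrate_container_metadata(store, old_id, new_id)
```

Because the old entry is never deleted, a rollback that brings back the original container ID still resolves its custom URL, description, and tags.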
Reduce metrics.db write amplification (#1124): add a UNIQUE index on
(resource_type, resource_id, metric_type, timestamp, tier) so rollup
reprocessing after a failed checkpoint uses INSERT OR IGNORE instead of
creating duplicate rows. If existing duplicates would make index creation
fail, they are removed once on startup. Also sets wal_autocheckpoint(500)
so the WAL is checkpointed every 500 pages instead of SQLite's default
of 1000, limiting WAL growth.
Fixes #1054
Fixes #1124
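
The unique-index change for #1124 can be sketched as follows, using Python's standard sqlite3 module. The table name, the value column, the index name, and both function names are assumptions for illustration; only the five indexed columns come from the commit text.

```python
import sqlite3

KEY_COLUMNS = "resource_type, resource_id, metric_type, timestamp, tier"


def ensure_unique_index(conn):
    """Create the unique index, deduplicating once if existing rows block it."""
    create = (
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_metrics_unique "
        f"ON metrics ({KEY_COLUMNS})"
    )
    try:
        conn.execute(create)
    except sqlite3.IntegrityError:
        # Duplicates exist: keep the lowest rowid per key, drop the rest.
        conn.execute(
            "DELETE FROM metrics WHERE rowid NOT IN ("
            f"SELECT MIN(rowid) FROM metrics GROUP BY {KEY_COLUMNS})"
        )
        conn.execute(create)
    conn.commit()


def insert_rollup(conn, row):
    """Insert a rollup row; a replayed row with the same key is silently skipped."""
    conn.execute(
        f"INSERT OR IGNORE INTO metrics ({KEY_COLUMNS}, value) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        row,
    )
```

With the index in place, replaying a rollup after a failed checkpoint is idempotent: the second insert of the same key is a no-op rather than a new row.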
VACUUM creates a full copy of the database. Running retention first
deletes stale data (5GB → ~60MB live), so the VACUUM copies far less
data, which means faster startup and much less temporary disk space.
The auto_vacuum(INCREMENTAL) pragma from the previous commit only takes
effect on new databases. SQLite requires a full VACUUM to restructure
existing files when switching from NONE to INCREMENTAL. Without this,
users upgrading from bloated 5GB+ databases would never reclaim space.
Adds a one-time migration on startup that detects the current auto_vacuum
mode and runs VACUUM to convert if needed. Subsequent startups skip the
migration since the mode is already INCREMENTAL.
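
A rough sketch of the one-time migration described above, written against Python's standard sqlite3 module. The function name is hypothetical. The numeric mode values (0 = NONE, 1 = FULL, 2 = INCREMENTAL) are what SQLite's auto_vacuum pragma reports.

```python
import sqlite3

AUTO_VACUUM_INCREMENTAL = 2  # value reported by PRAGMA auto_vacuum


def migrate_auto_vacuum(conn):
    """Convert an existing database to auto_vacuum=INCREMENTAL if needed.

    Returns True when a VACUUM was run, False when the mode was already set.
    """
    mode = conn.execute("PRAGMA auto_vacuum").fetchone()[0]
    if mode == AUTO_VACUUM_INCREMENTAL:
        return False
    # On an existing file the pragma alone only records the request;
    # the following VACUUM rebuilds the file in the new mode.
    conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
    conn.execute("VACUUM")
    return True
```

VACUUM cannot run inside an open transaction, so a real caller would invoke this before any other writes on the connection, and ideally after retention has already trimmed the data.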
The metrics database could grow to 5GB+ for modest setups because:
1. Retention deletes rows hourly but SQLite never reclaims the space
2. WAL file grows unbounded without explicit checkpointing
3. No cleanup runs on startup, so restarts accumulate stale data
Fixes:
- Enable auto_vacuum=INCREMENTAL so deleted pages can be reclaimed
- Run incremental_vacuum after each retention cleanup
- Force WAL checkpoint(TRUNCATE) after deletes to prevent WAL bloat
- Run retention on startup to clean stale data immediately
Expected DB size for a 50-resource setup drops from 5GB+ to ~60-70MB.
Ref: GitHub Discussion #1231
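
The retention fixes listed above can be sketched as one cleanup routine, again with Python's standard sqlite3 module. The table layout and the function name are assumptions; the pragmas are standard SQLite. The database is assumed to be in auto_vacuum=INCREMENTAL mode already, otherwise incremental_vacuum has nothing to reclaim.

```python
import sqlite3


def run_retention(conn, cutoff):
    """Delete rows older than cutoff, then return the freed space to the OS.

    Returns the wal_checkpoint row: (busy, wal_frames, checkpointed_frames).
    """
    conn.execute("DELETE FROM metrics WHERE timestamp < ?", (cutoff,))
    conn.commit()
    # The pragma does its work as the statement is stepped, so read it
    # to completion rather than executing it and discarding the cursor.
    conn.execute("PRAGMA incremental_vacuum").fetchall()
    # TRUNCATE checkpoints the WAL and resets the WAL file to zero length.
    return conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
```

Calling the same routine once at startup, before the hourly schedule begins, covers the case where restarts would otherwise leave stale rows in place.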
When querying short time ranges (1h, 6h), the metrics store only looked
in TierRaw and TierMinute, which were empty in mock mode. The seeded data
was stored in TierHourly and TierDaily.
Updated tierFallbacks to include coarser tiers as fallbacks:
- TierRaw now falls back to TierMinute, then TierHourly
- TierMinute now falls back to TierRaw, then TierHourly
This ensures sparkline data is available in mock/demo mode where
historical data is seeded into coarser tiers.
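
The fallback order above can be illustrated with a small lookup. This is a sketch only: the string tier names, the dictionary-backed store, and the function name are assumptions, and fallbacks for the hourly and daily tiers are left out because the commit does not describe them.

```python
# Fallback order per requested tier, mirroring the two rules in the commit.
TIER_FALLBACKS = {
    "raw": ["minute", "hourly"],
    "minute": ["raw", "hourly"],
}


def query_with_fallback(store, tier):
    """Return (tier_used, points) from the first tier that has any data."""
    for candidate in [tier, *TIER_FALLBACKS.get(tier, [])]:
        points = store.get(candidate, [])
        if points:
            return candidate, points
    return tier, []
```

In mock mode, where only the coarser tiers are seeded, a request for the raw tier ends up served from the hourly tier instead of returning an empty sparkline.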
- Move all SQLite pragmas from db.Exec() to DSN parameters so every
connection the pool creates gets busy_timeout and other settings.
Previously only the first connection had these applied.
- Set MaxOpenConns(1) on audit, RBAC, and notification databases
(metrics already had this). This prevents the pool from opening
additional connections that lack busy_timeout.
- Increase busy_timeout from 5s to 30s across all databases to
tolerate disk I/O pressure during backup windows.
- Fix nested query deadlocks in GetRoles(), GetUserAssignments(), and
CancelByAlertIDs() that would deadlock with MaxOpenConns(1).
- Fix circuit breaker retryInterval not resetting on recovery, which
caused the next trip to start at 5-minute backoff instead of 5s.
Related to #1156
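
The circuit breaker fix in the last bullet can be sketched like this. The 5s starting interval and 5-minute ceiling come from the commit text; the doubling schedule, the class, and its method names are assumptions made for illustration.

```python
class CircuitBreaker:
    """Toy breaker that tracks only the retry backoff."""

    INITIAL_RETRY = 5.0   # seconds, per the commit text
    MAX_RETRY = 300.0     # 5 minutes, per the commit text

    def __init__(self):
        self.is_open = False
        self.retry_interval = self.INITIAL_RETRY

    def record_failure(self):
        """Trip the breaker and return the delay before the next attempt."""
        delay = self.retry_interval
        self.is_open = True
        # Assumed schedule: double on every failure, capped at the maximum.
        self.retry_interval = min(self.retry_interval * 2, self.MAX_RETRY)
        return delay

    def record_success(self):
        """Close the breaker and reset the backoff (the step that was missing)."""
        self.is_open = False
        self.retry_interval = self.INITIAL_RETRY
```

Without the reset in record_success, a breaker that had once backed off to the ceiling would start its next outage at the 5-minute delay.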
- Add FeatureLongTermMetrics license feature for Pro tier
- Implement tiered storage in metrics store (raw, minute, hourly, daily)
- Add covering index for unified history query performance
- Seed mock data for 90 days with appropriate aggregation tiers
- Update PULSE_PRO.md to document the feature
- 7-day history remains free, 30d/90d requires Pro license
This commit adds enterprise-grade reporting and audit capabilities:
Reporting:
- Refactored metrics store from internal/ to pkg/ for enterprise access
- Added pkg/reporting with shared interfaces for report generation
- Created API endpoint: GET /api/admin/reports/generate
- New ReportingPanel.tsx for PDF/CSV report configuration
Audit Webhooks:
- Extended pkg/audit with webhook URL management interface
- Added API endpoint: GET/POST /api/admin/webhooks/audit
- New AuditWebhookPanel.tsx for webhook configuration
- Updated Settings.tsx with Reporting and Webhooks tabs
Server Hardening:
- Enterprise hooks now execute outside mutex with panic recovery
- Removed dbPath from metrics Stats API to prevent path disclosure
- Added storage metrics persistence to polling loop
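
The first hardening item, running hooks outside the mutex with recovery, follows a common pattern: snapshot the hook list under the lock, release it, then call each hook inside its own guard. The sketch below uses Python exceptions to stand in for panics; the class and method names are hypothetical.

```python
import threading


class HookRegistry:
    """Holds callbacks and fires them without holding the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hooks = []

    def register(self, hook):
        with self._lock:
            self._hooks.append(hook)

    def fire(self, event):
        """Call every hook; return the exceptions raised by failing hooks."""
        with self._lock:
            hooks = list(self._hooks)  # snapshot, then release the lock
        errors = []
        for hook in hooks:
            try:
                hook(event)
            except Exception as exc:
                # One misbehaving hook must not stop the others.
                errors.append(exc)
        return errors
```

Releasing the lock first means a slow hook cannot stall other callers, and a hook that calls back into the registry cannot deadlock on the non-reentrant lock.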
Documentation:
- Updated README.md feature table
- Updated docs/API.md with new endpoints
- Updated docs/PULSE_PRO.md with feature descriptions
- Updated docs/WEBHOOKS.md with audit webhooks section