Add missing godoc comments to:
- NewRateLimiter and Allow in ratelimit.go
- SnapshotSyncStatus in temperature_proxy.go
- NewClient and GetVersion in pkg/pmg/client.go
Cover RPM field handling (numeric, string, SSD, N/A, null, invalid),
invalid JSON error path, and unexpected type fallbacks for both
wearout and RPM fields.
Coverage: 50% → 95.5%
Test error handling for password authentication user format validation:
- Missing realm separator (no @)
- Empty user string
- Multiple @ symbols
Improves NewClient coverage from 74.2% to 83.9%.
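For illustration only, the three rejected inputs listed above could be
caught by a check along these lines. splitUserRealm is a hypothetical
helper, not the function in pkg/pmg/client.go, and the error wording is
assumed; the sketch also assumes all three cases are treated as errors.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitUserRealm is a hypothetical helper that splits "user@realm".
// It rejects an empty string, a missing "@", and more than one "@".
func splitUserRealm(user string) (name, realm string, err error) {
	if user == "" {
		return "", "", errors.New("user must not be empty")
	}
	parts := strings.Split(user, "@")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("user %q must be in user@realm format", user)
	}
	return parts[0], parts[1], nil
}

func main() {
	for _, u := range []string{"root@pam", "root", "", "a@b@c"} {
		name, realm, err := splitUserRealm(u)
		fmt.Printf("%q -> name=%q realm=%q err=%v\n", u, name, realm, err)
	}
}
```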
Test error handling for JSON parsing edge cases:
- Invalid JSON syntax
- Unsupported field types (bool, array)
- Unparseable string values for total-bytes and used-bytes
Improves coverage from 83.3% to 94.4%.
- Test jobid fallback when id field is missing
- Test jobnum field takes precedence over ID parsing
- Test last_sync_duration and duration fields
- Test last-sync-duration fallback format
- Test next_sync and next-sync fallback formats
Coverage: 79.7% → 100%
Add 4 new test cases covering previously untested branches:
- Float zero exactly (0.0)
- Float negative zero (-0.0)
- Only escaped quotes becoming empty after trimming
- Quoted whitespace becoming empty after trimming
Coverage improved from 95.8% to 100%.
The error pattern `/storage/` only matched storage content endpoints
(`/storage/{name}/content`) but not the main storage list endpoint
(`/nodes/{node}/storage`).
This caused storage timeout errors like:
    Get ".../nodes/pve-100-224/storage": context deadline exceeded
to incorrectly mark cluster nodes as unhealthy, even though the timeout
was due to a slow cross-node storage query, not actual node connectivity
issues.
Fixes #754
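A rough sketch of the matching problem described above. The actual
pattern used by the fix is not shown in this message, so
mentionsStorageEndpoint and its three substrings are assumptions chosen
to cover both the content endpoint and the list endpoint as they appear
in a quoted URL error.

```go
package main

import (
	"fmt"
	"strings"
)

// mentionsStorageEndpoint is a hypothetical check. "/storage/" alone
// misses the list endpoint, whose path ends in "/storage" and is
// followed by a closing quote or a query string in the error text.
func mentionsStorageEndpoint(msg string) bool {
	return strings.Contains(msg, "/storage/") ||
		strings.Contains(msg, `/storage"`) ||
		strings.Contains(msg, "/storage?")
}

func main() {
	msgs := []string{
		`Get ".../nodes/pve-100-224/storage": context deadline exceeded`,
		`Get ".../storage/local/content": context deadline exceeded`,
		`Get ".../nodes/pve-100-224/status": context deadline exceeded`,
	}
	for _, m := range msgs {
		fmt.Println(mentionsStorageEndpoint(m), m)
	}
}
```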
Add 13 new test cases covering previously untested branches:
- float32 timestamp with valid value (using a smaller value to stay
  within float32 precision)
- float32/float64 zero and negative values
- json.Number zero and negative values
- int32 and uint32 timestamp handling
- Invalid date format strings (no matching layout)
- Partial date strings
- Unsupported types (bool, slice)
Coverage improved from 93.8% to 100%.
Add 6 new test cases covering previously untested branches:
- float64 at MaxUint64 boundary (clamping behavior)
- float64 exceeding MaxUint64 (overflow protection)
- String with quoted "null" value
- String with quoted empty value ("")
- String with single quoted empty value ('')
- Invalid float parsing in scientific notation
Coverage improved from 92.3% to 97.4%.
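The boundary cases above are subtle because float64 cannot represent
MaxUint64 exactly. A minimal sketch of one way to clamp, assuming
negative and NaN inputs map to 0; clampToUint64 is a hypothetical name,
not the parser under test.

```go
package main

import (
	"fmt"
	"math"
)

// clampToUint64 is a hypothetical helper that converts a float64 to
// uint64 without overflow. float64(math.MaxUint64) rounds up to 2^64,
// so ">=" covers both the boundary value and anything above it.
func clampToUint64(f float64) uint64 {
	if math.IsNaN(f) || f <= 0 {
		return 0
	}
	if f >= float64(math.MaxUint64) {
		return math.MaxUint64
	}
	return uint64(f)
}

func main() {
	for _, f := range []float64{-1, 42.9, float64(math.MaxUint64), 1e30} {
		fmt.Println(f, "->", clampToUint64(f))
	}
}
```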
Previously, recovery of unhealthy nodes only triggered when ALL nodes
were unhealthy. This caused individual degraded nodes to stay degraded
forever since operations would succeed on healthy nodes and never
trigger the recovery path.
Now recovery is attempted whenever any unhealthy nodes exist, allowing
clusters to recover individual nodes over time.
Also added:
- Panic-safe unlock/lock pattern using anonymous function
- Refresh of both healthy and cooling endpoints after recovery
- Updated timestamp for accurate cooldown checks
Related to #754
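The panic-safe unlock/lock pattern mentioned above can be sketched as
follows. The cluster type, its fields, and the probe callback are
hypothetical stand-ins, not the real ClusterClient; the point is only
that the deferred Lock re-acquires the mutex even if the probe panics,
so the outer deferred Unlock stays balanced.

```go
package main

import (
	"fmt"
	"sync"
)

// cluster is a hypothetical stand-in for a client tracking node health.
type cluster struct {
	mu        sync.Mutex
	unhealthy map[string]bool
}

// recoverUnhealthy probes unhealthy nodes whenever any exist, not only
// when every node is unhealthy.
func (c *cluster) recoverUnhealthy(probe func(node string) bool) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if len(c.unhealthy) == 0 {
		return
	}
	nodes := make([]string, 0, len(c.unhealthy))
	for n := range c.unhealthy {
		nodes = append(nodes, n)
	}

	var recovered []string
	func() {
		// Release the lock for the slow probes; the deferred Lock
		// runs even if probe panics.
		c.mu.Unlock()
		defer c.mu.Lock()
		for _, n := range nodes {
			if probe(n) {
				recovered = append(recovered, n)
			}
		}
	}()

	for _, n := range recovered {
		delete(c.unhealthy, n)
	}
}

func main() {
	c := &cluster{unhealthy: map[string]bool{"pve1": true, "pve2": true}}
	c.recoverUnhealthy(func(n string) bool { return n == "pve1" })
	fmt.Println("still unhealthy:", len(c.unhealthy))
}
```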
Covers both Proxmox API formats:
- Integer format (older versions): direct int value
- Object format (Proxmox 8.3+): {enabled, available} fields
- Preference order: available > enabled > 0
- Invalid input handling defaults to 0
- Integration with VMStatus struct
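As a sketch of how both formats can decode into one value, assuming the
preference order means "use available when positive, otherwise enabled
when positive". intOrObject, statusSketch, and the "value" key are
hypothetical names, not the real VMStatus fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// intOrObject is a hypothetical type accepting either a bare integer
// or an {enabled, available} object.
type intOrObject int

func (v *intOrObject) UnmarshalJSON(data []byte) error {
	var n int
	if err := json.Unmarshal(data, &n); err == nil {
		*v = intOrObject(n)
		return nil
	}
	var obj struct {
		Enabled   int `json:"enabled"`
		Available int `json:"available"`
	}
	if err := json.Unmarshal(data, &obj); err != nil {
		*v = 0 // invalid input defaults to 0 instead of failing
		return nil
	}
	switch {
	case obj.Available > 0:
		*v = intOrObject(obj.Available)
	case obj.Enabled > 0:
		*v = intOrObject(obj.Enabled)
	default:
		*v = 0
	}
	return nil
}

// statusSketch is a hypothetical wrapper; the JSON key is assumed.
type statusSketch struct {
	Value intOrObject `json:"value"`
}

// parseValue decodes raw JSON and returns the resolved integer.
func parseValue(raw string) (int, error) {
	var s statusSketch
	err := json.Unmarshal([]byte(raw), &s)
	return int(s.Value), err
}

func main() {
	for _, raw := range []string{
		`{"value":1}`,
		`{"value":{"enabled":1,"available":0}}`,
		`{"value":"bogus"}`,
	} {
		n, err := parseValue(raw)
		fmt.Println(raw, "->", n, err)
	}
}
```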
Tests for Proxmox client authentication error handling:
- authHTTPError.Error: message formatting based on status code
(401/403 include status in message, others don't)
- shouldFallbackToForm: determines when to retry with form encoding
(triggers on 400/415, not on auth errors or server errors)
16 test cases covering all code paths.
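A minimal sketch of the behavior these tests describe. Only the names
authHTTPError and shouldFallbackToForm and the status-code rules come
from the notes above; the struct fields and message text here are
assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// authHTTPError mirrors the name above; its fields are assumed.
type authHTTPError struct {
	status int
	body   string
}

// Error includes the status code only for 401 and 403.
func (e *authHTTPError) Error() string {
	if e.status == http.StatusUnauthorized || e.status == http.StatusForbidden {
		return fmt.Sprintf("authentication failed (status %d): %s", e.status, e.body)
	}
	return fmt.Sprintf("authentication failed: %s", e.body)
}

// shouldFallbackToForm reports whether a retry with form encoding is
// worthwhile: only for 400 and 415, never for auth or server errors.
func shouldFallbackToForm(err error) bool {
	var he *authHTTPError
	if !errors.As(err, &he) {
		return false
	}
	return he.status == http.StatusBadRequest ||
		he.status == http.StatusUnsupportedMediaType
}

func main() {
	for _, code := range []int{400, 401, 415, 500} {
		err := &authHTTPError{status: code, body: "rejected"}
		fmt.Println(code, shouldFallbackToForm(err), err)
	}
}
```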
These tests were added by ADA run #97, but the commit was missed.
Covers: RaidZ types, log/cache/spare devices, nested mirrors,
ConvertToModelZFSPool, and struct field tests.
Add comprehensive tests for the cloneProfile and clonePhase utility
functions in pkg/discovery/discovery.go. Tests verify deep copying
behavior for all fields including subnets, metadata, warnings, extra
targets, and phases to ensure mutations don't affect original objects.
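To illustrate what "deep copying" has to cover, here is a small sketch
with hypothetical profile and phase types. The real structs in
pkg/discovery/discovery.go have more fields; the function names carry a
Sketch suffix to make clear they are not the originals.

```go
package main

import "fmt"

// phase and profile are hypothetical, trimmed-down types.
type phase struct {
	Name    string
	Subnets []string
}

type profile struct {
	Subnets  []string
	Metadata map[string]string
	Warnings []string
	Phases   []phase
}

// clonePhaseSketch copies the slice so the clone owns its own backing array.
func clonePhaseSketch(p phase) phase {
	p.Subnets = append([]string(nil), p.Subnets...)
	return p
}

// cloneProfileSketch copies every slice and map so that mutating the
// clone cannot affect the original.
func cloneProfileSketch(p *profile) *profile {
	if p == nil {
		return nil
	}
	out := *p
	out.Subnets = append([]string(nil), p.Subnets...)
	out.Warnings = append([]string(nil), p.Warnings...)
	if p.Metadata != nil {
		out.Metadata = make(map[string]string, len(p.Metadata))
		for k, v := range p.Metadata {
			out.Metadata[k] = v
		}
	}
	if p.Phases != nil {
		out.Phases = make([]phase, len(p.Phases))
		for i, ph := range p.Phases {
			out.Phases[i] = clonePhaseSketch(ph)
		}
	}
	return &out
}

func main() {
	orig := &profile{
		Subnets:  []string{"10.0.0.0/24"},
		Metadata: map[string]string{"env": "lab"},
		Phases:   []phase{{Name: "scan", Subnets: []string{"10.0.1.0/24"}}},
	}
	c := cloneProfileSketch(orig)
	c.Subnets[0] = "changed"
	c.Metadata["env"] = "changed"
	c.Phases[0].Subnets[0] = "changed"
	fmt.Println(orig.Subnets[0], orig.Metadata["env"], orig.Phases[0].Subnets[0])
}
```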
Test coverage for error detection and retry logic:
- extractStatusCode: 13 test cases for HTTP status code extraction
- isTransientRateLimitError: 17 test cases for rate limit detection
- isNotImplementedError: 14 test cases for 501 error detection
- isVMSpecificError: 16 test cases for VM-scoped errors
- calculateRateLimitBackoff: backoff timing verification
- isAuthError: 12 test cases for authentication errors
Coverage: 35.5% → 37.3%
Comprehensive test coverage for JSON parsing helpers used in
replication job status parsing: stringFromAny, intFromAny,
boolFromAny, floatFromAny, parseReplicationTime, parseDurationSeconds,
parseHHMMSSToSeconds, and parseReplicationJob.
Coverage increased from 22.6% to 35.5%.
The previous fix (6db4ee7a) cleared stale error messages but didn't mark
endpoints as healthy again after successful operations. This caused
clusters to remain in "degraded" state permanently once any endpoint had
a temporary issue, even if all endpoints were actually working.
The fix now marks endpoints healthy in clearEndpointError() after
successful operations, ensuring degraded clusters recover automatically.
Related to #659
Move the inline filesystem skip logic from pollVMsAndContainersEfficient
into a reusable ShouldSkipFilesystem function. This consolidates filtering
for virtual filesystems (tmpfs, cgroup, etc.), network mounts (nfs, cifs,
fuse), and special mountpoints (/dev, /proc, /snap, etc.) into one tested
location.
Reduces cyclomatic complexity of pollVMsAndContainersEfficient and adds
28 test cases covering virtual fs types, network mounts, special mounts,
Windows paths, and edge cases.
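A rough sketch of the kind of filter described above. shouldSkipFS is a
hypothetical stand-in, not the real ShouldSkipFilesystem signature, and
the type and mountpoint lists contain only the examples named in this
message; the "fuse." and "nfs" prefix handling is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Only the examples named in the commit message; the real lists are longer.
var skippedTypes = map[string]bool{
	"tmpfs": true, "cgroup": true, // virtual filesystems
	"nfs": true, "cifs": true, "fuse": true, // network mounts
}

var skippedMountPrefixes = []string{"/dev", "/proc", "/snap"}

// shouldSkipFS reports whether a filesystem should be excluded from
// disk usage totals, based on its type or mountpoint.
func shouldSkipFS(fsType, mountpoint string) bool {
	t := strings.ToLower(fsType)
	if skippedTypes[t] || strings.HasPrefix(t, "fuse.") || strings.HasPrefix(t, "nfs") {
		return true
	}
	for _, p := range skippedMountPrefixes {
		// Match the directory itself or anything below it, but not
		// lookalikes such as "/devices".
		if mountpoint == p || strings.HasPrefix(mountpoint, p+"/") {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldSkipFS("ext4", "/"))
	fmt.Println(shouldSkipFS("tmpfs", "/run"))
	fmt.Println(shouldSkipFS("ext4", "/proc/sys"))
	fmt.Println(shouldSkipFS("NTFS", `C:\`))
}
```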
Previously, errors stored in ClusterClient.lastError were only cleared
during initial health checks or when recovering unhealthy nodes. This
caused stale error messages to persist in the UI even after the
underlying issues were resolved.
The fix clears cached errors in two places:
1. After passing connectivity test in getHealthyClient()
2. After successful operation in executeWithFailover()
This ensures that once an endpoint starts working again, any previous
error messages are cleared from the UI without requiring a restart.
Related to #659, #754
- Merge variable declaration with assignment (S1021)
- Use unconditional strings.TrimPrefix (S1017)
- Remove unnecessary nil checks around range (S1031)
- Remove unnecessary fmt.Sprintf (S1039)
- Use copy() instead of manual loop (S1001)
- Use time.Until instead of t.Sub(time.Now()) (S1024)
- Use buf.String() instead of string(buf.Bytes()) (S1030)
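For reference, a few of the simplified forms in one place. The helper
names and values are made up for illustration; the comments name the
staticcheck rule each line satisfies.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"time"
)

// trimNodePrefix needs no strings.HasPrefix guard: TrimPrefix returns
// the input unchanged when the prefix is absent (S1017).
func trimNodePrefix(s string) string {
	return strings.TrimPrefix(s, "node/")
}

// cloneInts uses copy instead of an element-by-element loop (S1001).
func cloneInts(src []int) []int {
	dst := make([]int, len(src))
	copy(dst, src)
	return dst
}

// bufferText uses buf.String() instead of string(buf.Bytes()) (S1030).
func bufferText(parts ...string) string {
	var buf bytes.Buffer
	// Ranging over a nil slice is a no-op, so no nil check is needed (S1031).
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

func main() {
	// Declaration merged with assignment (S1021).
	deadline := time.Now().Add(time.Minute)
	// time.Until(t) instead of t.Sub(time.Now()) (S1024).
	remaining := time.Until(deadline)

	fmt.Println(trimNodePrefix("node/pve1"), cloneInts([]int{1, 2, 3}))
	fmt.Println(bufferText("a", "b"), remaining > 0)
}
```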
Add seamless migration path from legacy agents to unified agent:
- Add AgentType field to report payloads (unified vs legacy detection)
- Update server to detect legacy agents by type instead of version
- Add UI banner showing upgrade command when legacy agents are detected
- Add deprecation notice to install-host-agent.ps1
- Create install-docker-agent.sh stub that redirects to unified installer
Legacy agents (pulse-host-agent, pulse-docker-agent) now show a "Legacy"
badge in the UI with a one-click copy command to upgrade to the unified
agent.
Increases the confidence score for PBS when receiving 401/403 responses
to avoid unnecessary probing of other endpoints that trigger auth
failure logs.
Fixes #741
Proxmox VE 9.x removed support for the "full" parameter in the
/nodes/{node}/qemu/{vmid}/status/current endpoint. When Pulse sent
GetVMStatus() requests with ?full=1, Proxmox responded with:
    API error 400: {"errors":{"full":"property is not defined in schema..."}}
This caused the cluster client to mark ALL endpoints as unhealthy, which
cascaded into multiple failures:
- VM status checks failed
- Guest agent queries were blocked
- Filesystem data collection stopped working
- All Windows VMs showed disk:-1 (unknown) instead of actual disk usage
The fix removes the ?full=1 parameter since Proxmox 9.x returns all data
by default without needing this parameter. This maintains backward
compatibility with older Proxmox versions while fixing the issue in 9.x.
After this fix:
- Cluster endpoints are correctly marked as healthy
- Guest agent queries work properly
- Windows VMs report actual disk usage (e.g., 26% on C:\ drive)
- VM monitoring functions normally on Proxmox 9.x