Users were abandoning Pulse due to catastrophic temperature monitoring setup failures. This commit addresses the root causes:
**Problem 1: Silent Failures**
- Installations reported "SUCCESS" even when proxy never started
- UI showed green checkmarks with no temperature data
- Zero feedback when things went wrong
**Problem 2: Missing Diagnostics**
- Service failures logged only in journald
- Users saw "Something going on with the proxy" with no actionable guidance
- No way to troubleshoot from error messages
**Problem 3: Standalone Node Issues**
- Proxy daemon logged continuous pvecm errors as warnings
- "ipcc_send_rec" and "Unknown error -1" messages confused users
- These are expected for non-clustered/LXC setups
**Solutions Implemented:**
1. **Health Gate in install.sh (lines 1588-1629)**
- Verify service is running after installation
- Check socket exists on host
- Confirm socket visible inside container via bind mount
- Fail loudly with specific diagnostics if any check fails
2. **Actionable Error Messages in install-sensor-proxy.sh (lines 822-877)**
- When service fails to start: dump full systemctl status + 40 lines of logs
- When socket missing: show permissions, service status, and remediation command
- Include common issues checklist (missing user, permission errors, lm-sensors, etc.)
- Direct link to troubleshooting docs
3. **Better Standalone Node Detection in ssh.go (lines 585-595)**
- Recognize "Unknown error -1" and "Unable to load access control list" as LXC indicators
- Log at INFO level (not WARN) since this is expected behavior
- Clarify message: "using localhost for temperature collection"
**Impact:**
- Eliminates "green checkmark but no temps" scenario
- Users get immediate actionable feedback on failures
- Standalone/LXC installations work silently without error spam
- Reduces support burden from #571 (15+ comments of user frustration)
Related to #571
Snap-installed Docker does not automatically create a docker group,
causing permission denied errors when the pulse-docker service user
tries to access /var/run/docker.sock.
Changes:
- Auto-detect Snap Docker installations
- Create docker group if missing when Snap Docker is detected
- Restart Snap Docker after group creation to refresh socket ACLs
- Add socket access validation before starting the service
- Handle symlinked Docker sockets in systemd unit ReadWritePaths
- Document troubleshooting steps in DOCKER_MONITORING.md
Bare metal installations couldn't serve Windows host agent downloads because
the Windows and macOS binaries weren't included in the universal tarball. The
download endpoint would return 404 when Windows users tried to install the
host agent from a bare metal Pulse deployment (Proxmox LXC, Debian VM, etc.).
Changes:
- build-release.sh: Copy Windows/macOS host agent binaries into universal tarball
- build-release.sh: Create symlinks for Windows binaries without .exe extension
- validate-release.sh: Add Windows 386 binary and symlink to Docker validation
- validate-release.sh: Add explicit validation that universal tarball contains all Windows/macOS binaries
The universal tarball now matches the Docker image, ensuring both deployment
methods can serve the complete set of downloadable binaries for the /download/
endpoint.
Replace mv with install command to ensure correct SELinux context.
The mv command preserves the user_tmp_t label from /tmp, which
prevents systemd from executing the binary on SELinux systems.
The install command creates a new file with the correct label for
/usr/local/bin. Added automatic restorecon call for SELinux systems
to ensure policy compliance.
Related to #688
Following best practices for release format transitions:
- build-release.sh now generates both formats from same sha256sum run
- Workflow uploads both checksums.txt and individual .sha256 files
- Validation ensures both formats exist and match
This provides a safe transition period for users with older install scripts
while maintaining the cleaner checksums.txt format going forward. After 2-3
releases when most users have updated scripts, we can remove .sha256 generation.
Related: Install script already supports both formats (falls back gracefully).
Linux host-agent binaries don't have separate archives - they're included in
the main pulse-v*.tar.gz files. Only macOS and Windows have separate archives.
Removed validation checks for standalone binaries that are no longer
uploaded to GitHub releases. These binaries are only needed in Docker
images for the /download/ endpoint.
Updated required assets list to include all versioned tarballs/zips
instead of standalone binaries.
Removed:
- Individual .sha256 files (checksums.txt already contains all checksums)
- Standalone binaries without version numbers (users should download versioned tarballs/zips)
Standalone binaries are only needed in Docker images for the /download/ endpoint.
GitHub releases should only contain versioned archives for user downloads.
This reduces release assets from ~54 files to ~19 files per release.
Users don't care about CI/CD improvements, release workflows, build
processes, or testing infrastructure. Only include user-visible changes.
Related to #671
Commit hashes clutter the release notes and aren't useful for end users.
Only include issue references when explicitly mentioned in commits.
Related to #671
Remove # symbol from commit hash references so GitHub auto-links them.
Format: (abc123) instead of (#abc123)
Issue references still use #: (#123)
Related to #671
- Use exact template format from v4.28.0 and prior releases
- Include all standard sections: New Features, Bug Fixes, Improvements, Breaking Changes
- Add complete installation instructions (systemd, Docker, Manual Binary, Helm)
- Include Downloads section with all artifact types
- Add Notes section for important highlights and upgrade considerations
- Ensure LLM outputs format exactly matching previous releases
Related to #671 (automated release workflow)
- Create scripts/generate-release-notes.sh to auto-generate release notes from git commits
- Supports both Anthropic Claude and OpenAI APIs
- Uses Claude Haiku 4.5 (claude-haiku-4-5-20251001) for cost efficiency ($1/$5 per million tokens)
- Falls back to OpenAI gpt-4o-mini if Anthropic key not available
- Integrates into release workflow between validation and release creation
- Compares current version with previous git tag to generate changelog
- Outputs categorized, user-friendly release notes with installation instructions
Workflow now automatically:
1. Finds previous release tag
2. Analyzes all commits since last release
3. Generates structured release notes via LLM
4. Uses generated notes for draft release body
Requires ANTHROPIC_API_KEY or OPENAI_API_KEY in GitHub secrets.
Related to #671 (automated release workflow)
The script does pushd into RELEASE_DIR, so tarball paths should not include
the RELEASE_DIR prefix. Also fixed checksum validation glob patterns to
exclude .sha256 files from matching.
Tarballs are created with ./bin/pulse paths (relative from inside staging dir)
but validation was looking for bin/pulse paths. Updated all tar -tzf checks
to use correct ./ prefix.
The validation script was looking for tarballs in the current directory
instead of the release/ directory, causing all validations to fail.
Now properly prepends $RELEASE_DIR to all file paths.
This commit introduces a comprehensive GitHub Actions workflow for
creating releases, ensuring all artifacts are validated before upload.
Changes:
- Add .github/workflows/release.yml: Manual workflow_dispatch trigger
that builds, validates, and creates draft releases
- Update scripts/validate-release.sh: Add --skip-docker flag to allow
validation without Docker image checks
Key features:
- Validation runs BEFORE any assets are uploaded
- If validation fails, no release is created
- checksums.txt and artifacts come from the same build
- No manual steps between validation and upload
- Checksums uploaded first, then all other assets
- Creates draft release for manual review before publishing
The workflow ensures that checksums.txt cannot drift from binaries
by running the entire build-validate-upload pipeline atomically.
This fixes two bugs that prevented temperature monitoring from working
after running install-sensor-proxy.sh on LXC deployments:
1. CRITICAL: Pulse service not restarted after systemd override
- The installer wrote PULSE_SENSOR_PROXY_SOCKET env var to systemd
drop-in and ran daemon-reload, but never restarted Pulse service
- Running Pulse instances continued using old environment variables
- Temperatures wouldn't work until manual Pulse restart
- Now: Automatically restart Pulse if running after writing override
2. Added guard to check if Pulse service exists before configuring
- Installer would write systemd override even if Pulse not installed
- Left orphaned drop-in files that confused users
- Now: Check if pulse.service exists, warn and skip if not found
3. MINOR: Fix inconsistent Docker mount instructions
- docker-compose.yml showed :ro (read-only) mount
- Installer output showed :rw (read-write) mount
- Changed installer to match compose file (:ro is correct and secure)
Impact: Users in #600 reported "socketFound=false" even after running
installer successfully. This was because Pulse never picked up the new
socket path without a restart.
Adds build support for 32-bit Windows (windows-386) for pulse-host-agent.
Changes:
- Add windows-386 build to Dockerfile host-agent build section
- Add windows-386 binary copy and symlink to Dockerfile
- Add windows-386 build to build-release.sh
- Add windows-386 zip package to release artifacts
- Include windows-386 binary in standalone binary copies
This enables pulse-host-agent to run on 32-bit Windows systems, which are still relevant in legacy/industrial monitoring environments through late 2025.
Adds build support for 32-bit x86 (i386/i686) and ARMv6 (older Raspberry Pi models) architectures across all agents and install scripts.
Changes:
- Add linux-386 and linux-armv6 to build-release.sh builds array
- Update Dockerfile to build docker-agent, host-agent, and sensor-proxy for new architectures
- Update all install scripts to detect and handle i386/i686 and armv6l architectures
- Add architecture normalization in router download endpoints
- Update update manager architecture mapping
- Update validate-release.sh to expect 24 binaries (was 18)
This enables Pulse agents to run on older/legacy hardware including 32-bit x86 systems and Raspberry Pi Zero/Zero W devices.
Fixes two critical bugs in refresh_smart_cache() that prevented SMART
temperature collection from working:
1. Invalid smartctl parameter: Changed -n standby,after to -n standby
The 'after' parameter is not valid in smartctl 7.4 and causes:
"INVALID ARGUMENT TO -n: standby,after"
Valid syntax is standby[,STATUS[,STATUS2]] where STATUS must be numeric.
2. Broken process detection: Replaced exec -a with lock file approach
The original exec -a pulse-sensor-wrapper-refresh bash line replaced
the subshell with a new bash process that had no script to run, causing
the function to exit immediately without collecting any SMART data.
New approach uses a lock file ($CACHE_DIR/smart-refresh.lock) with
trap-based cleanup to prevent concurrent refresh operations.
Credits to @ZaDarkSide for identifying these issues in PR #672.
The download endpoint had a dangerous fallback that silently served the
wrong binary when the requested platform/arch combination was missing.
If a Docker image shipped without Windows binaries, the installer would
receive a Linux ELF instead of a Windows PE, causing ERROR_BAD_EXE_FORMAT.
Changes:
- Download handler now operates in strict mode when platform+arch are
specified, returning 404 instead of serving mismatched binaries
- PowerShell installer validates PE header (MZ signature)
- PowerShell installer verifies PE machine type matches requested arch
- PowerShell installer fetches and verifies SHA256 checksums
- PowerShell installer shows diagnostic info: OS arch, download URL,
file size for better troubleshooting
This prevents silent failures and provides clear error messages when
binaries are missing or corrupted.
The self-heal timer runs 'systemctl list-unit-files | grep -q' every hour.
When grep matches and exits early, systemctl logs "Failed to print table:
Broken pipe" to syslog. This is cosmetic but floods Proxmox logs and
confuses operators.
Changes:
- Redirect stderr from systemctl to /dev/null
- Prevents the broken pipe message from reaching syslog
- Self-heal functionality unchanged
This addresses the concern raised in discussion #628.
Windows 11 25H2 ships exclusively on ARM64 hardware. When users on ARM64
attempt to install the host agent, the Service Control Manager fails to
load the amd64 binary with ERROR_BAD_EXE_FORMAT, surfaced as "The Pulse
Host Agent is not compatible with this Windows version".
Changes:
- Dockerfile: Build pulse-host-agent-windows-arm64.exe alongside amd64
- Dockerfile: Copy windows-arm64 binary and create symlink for download endpoint
- install-host-agent.ps1: Use RuntimeInformation.OSArchitecture to detect ARM64
- build-release.sh: Build darwin-amd64, darwin-arm64, windows-amd64, windows-arm64
- build-release.sh: Package Windows binaries as .zip archives
- validate-release.sh: Check for windows-arm64 binary and symlink
- validate-release.sh: Add architecture validation for all darwin/windows variants
The installer now correctly detects ARM64 and downloads the appropriate binary.
Extends temperature monitoring to collect SMART temps for SATA/SAS disks,
addressing issue #652 where physical disk temperatures showed as empty.
Architecture:
- Deploys pulse-sensor-wrapper.sh as SSH forced command on Proxmox nodes
- Wrapper collects both CPU/GPU temps (sensors -j) and disk temps (smartctl)
- Implements 30-min cache with background refresh to avoid performance impact
- Uses smartctl -n standby,after to skip sleeping drives without waking them
- Returns unified JSON: {sensors: {...}, smart: [...]}
Backend changes:
- Add DiskTemp model with device, serial, WWN, temperature, lastUpdated
- Extend Temperature model with SMART []DiskTemp field and HasSMART flag
- Add WWN field to PhysicalDisk for reliable disk matching
- Update parseSensorsJSON to handle both legacy and new wrapper formats
- Rewrite mergeNVMeTempsIntoDisks to match SMART temps by WWN → serial → devpath
- Preserve legacy NVMe temperature support for backward compatibility
Performance considerations:
- SMART data cached for 30 minutes per node to avoid excessive smartctl calls
- Background refresh prevents blocking temperature requests
- Respects drive standby state to avoid spinning up idle arrays
- Staggered disk scanning with 0.1s delay to avoid saturating SATA controllers
Install script:
- Deploys wrapper to /usr/local/bin/pulse-sensor-wrapper.sh
- Updates SSH forced command from "sensors -j" to wrapper script
- Backward compatible - falls back to direct sensors output if wrapper missing
Testing note:
- Requires real hardware with smartmontools installed for full functionality
- Empty smart array returned gracefully when smartctl unavailable
- Legacy sensor-only nodes continue working without changes
The checksum generation was including pulse-host-agent-v*-darwin-arm64.tar.gz
twice: once from the *.tar.gz pattern and once from the pulse-host-agent-*
pattern. Fixed by using extglob to exclude .tar.gz and .sha256 files from
the agent binary patterns since tarballs are already matched separately.
Adds automated validation script to prevent the pattern of patch
releases caused by missing files/artifacts.
scripts/validate-release.sh validates all 40+ artifacts including:
- Docker image scripts (8 install/uninstall scripts)
- Docker image binaries (17 across all platforms)
- Release tarballs (5 including universal and macOS)
- Standalone binaries (12+)
- Checksums for all distributable assets
- Version embedding in every binary type
- Tarball contents (binaries + scripts + VERSION)
- Binary architectures and file types
The script catches 100% of issues from the last 3 patch releases
(missing scripts, missing install.sh, missing binaries, broken
version embedding).
Updated RELEASE_CHECKLIST.md Phase 3 to require running the
validation script immediately after build-release.sh and before
proceeding to Docker build/publish phases.
Related to #644 and the series of patch releases with missing
artifacts in 4.26.x.
The Dockerfile and build-release.sh were missing several installer and uninstaller
scripts that the router expects to serve via HTTP endpoints:
- install-container-agent.sh
- install-host-agent.ps1
- uninstall-host-agent.sh
- uninstall-host-agent.ps1
This caused 404 errors when users attempted to add Docker/Podman hosts or use the
PowerShell installer, as reported in #644.
Changes:
- Dockerfile: Added missing scripts to /opt/pulse/scripts/ with proper permissions
- build-release.sh: Added missing scripts to both per-platform and universal tarballs
to ensure bare-metal deployments serve the same endpoints as Docker deployments
The .sha256 files generated during release builds contained only the hash,
but sha256sum -c expects the format "hash filename". This caused all
install.sh updates to fail with "Checksum verification failed" even when
the checksum was correct.
Root cause: build-release.sh line 289 was using awk to extract only field 1
(the hash), discarding the filename that sha256sum -c needs.
Fix: Remove the awk filter to preserve the full sha256sum output format.
This affected the demo server update workflow and user installations.
Issue: HOST_AGENT.md documented downloading pulse-host-agent binaries
from GitHub releases, but those assets didn't exist. Only tarballs were
available, making manual installation unnecessarily complex.
Changes:
- Copy standalone host-agent binaries (all architectures) to release/
directory alongside sensor-proxy binaries
- Include host-agent binaries in checksum generation
- Update HOST_AGENT.md to clarify available architectures
- Retroactively uploaded missing binaries to v4.26.1
This enables air-gapped and manual installations without requiring an
already-running Pulse server to download from.
Root cause: install.sh was not being copied to the release directory
during build-release.sh execution, so it was never uploaded as a
release asset. This caused the download URL to return "Not Found",
which bash attempted to execute as a command.
Changes:
- Copy install.sh to release/ directory in build-release.sh
- Include install.sh in checksums generation
Note: RELEASE_CHECKLIST.md also updated locally to verify install.sh
presence in Phase 3 and Phase 5, but that file is gitignored.
Addresses the root cause of issue #631 (infinite Docker agent restart loop)
and prevents similar issues with host-agent and sensor-proxy.
Changes:
- Set dockeragent.Version default to "dev" instead of hardcoded version
- Add version embedding to server build in Dockerfile
- Add version embedding to host-agent builds (all platforms)
- Add version embedding to sensor-proxy builds (all platforms)
This ensures:
1. Server's /api/agent/version endpoint returns correct v4.26.0
2. Downloaded agent binaries have matching embedded versions
3. Dev builds skip auto-update (Version="dev")
4. No version mismatch triggers infinite restart loops
Related to #631
Related to #639
Users reported "Failed to download checksum for Pulse release" errors
during installation. The root cause was a mismatch between what the
build system generates and what the installer expects:
- install.sh downloads individual .sha256 files (e.g., pulse-v4.25.0-linux-amd64.tar.gz.sha256)
- build-release.sh only created a single checksums.txt file
This commit updates build-release.sh to generate both:
1. Individual .sha256 files for each asset (required by install.sh)
2. Combined checksums.txt for manual verification and signing
This maintains backwards compatibility with the installer while keeping
the aggregated checksums.txt for power users and GPG signing.
Related to #571
This addresses multiple temperature monitoring issues:
1. Fix single-node Proxmox installation failure: Add '|| true' to pvecm status
calls to prevent script exit on standalone (non-clustered) nodes with
'set -euo pipefail'. The script now properly falls through to standalone
node configuration when cluster detection fails.
2. Build pulse-sensor-proxy for all Linux architectures (amd64, arm64, armv7)
in Dockerfile to ensure binaries are available for download on all supported
platforms. This resolves the missing binary issue from v4.23.0.
Note: AMD Tctl sensor support was already implemented in a previous commit.
Related to #612
This commit addresses the Alpine Linux installation issues reported where:
1. The OpenRC init system was not properly detected
2. Manual startup instructions were unclear and used placeholder values
3. The agent didn't validate configuration properly at startup
Changes:
Install Script (install-docker-agent.sh):
- Improved OpenRC detection to check for rc-service and rc-update commands
instead of looking for openrc-run binary in specific paths
- Added specific Alpine Linux detection via /etc/alpine-release and /etc/os-release
- Enhanced manual startup instructions to show actual values instead of placeholders
- Added clearer warnings and guidance when no init system is detected
- Included comprehensive startup command with all required parameters
Agent Startup Validation (pulse-docker-agent):
- Added validation to detect unexpected command-line arguments
- Added helpful note about double-dash flag requirements (--token vs -token)
- Improved error messages to include example usage patterns
- Added warning when defaulting to localhost without explicit URL configuration
- Provide both command-line and environment variable examples in error messages
These improvements ensure that:
- Alpine Linux installations will properly detect and configure OpenRC services
- Users who must start the agent manually get clear, copy-pasteable commands
- Configuration errors are caught early with actionable error messages
- Common mistakes (like missing --url) are clearly explained
This addresses GitHub Discussion #605 where users were unclear about
configuring the temperature proxy when running Pulse in Docker.
Changes:
**install-sensor-proxy.sh:**
- Add Docker-specific post-install instructions when --standalone flag is used
- Show required docker-compose.yml bind mount configuration
- Provide verification commands for Docker deployments
- Link to full documentation for troubleshooting
**TEMPERATURE_MONITORING.md:**
- Add prominent "Quick Start for Docker Deployments" section at the top
- Move Docker instructions earlier in the document for better visibility
- Provide complete 4-step setup process with verification commands
These changes ensure Docker users immediately see:
1. How to install the proxy on the Proxmox host
2. What bind mount to add to docker-compose.yml
3. How to restart and verify the setup
4. Where to find detailed troubleshooting
The installer now provides actionable next steps instead of just
confirming installation, reducing confusion for containerized deployments.
The file watcher was only triggering on .go file modifications but missing new file creation. This happened because inotifywait sometimes reports the directory path first when a file is created.
Changes:
- Include event type in inotifywait output format
- Trigger rebuild on CREATE/DELETE/MOVED events in addition to .go modifications
- Add exclusions for temp files (.swp, .tmp, ~)
Now creating new .go files will trigger an auto-rebuild.