Fix the local hot-dev backend monitor so a missing Pulse process is counted safely under pipefail, keep backend launch stderr in the debug log, and govern the new runtime helper with focused smoke coverage.
Show workload-capable source failures on Workloads and keep matching Proxmox host agents attached to their API source when inventory collection is blocked.
Small refactors aggregated from a simplify-review pass over this
session's commits:
1. internal/config/persistence_relay.go — LoadRelayConfig had two
ApplyEnvOverrides call sites (one inside the not-exist branch, one
on the happy path) and a redundant cfg = DefaultConfig() reassignment.
Collapse to a single ApplyEnvOverrides call after the load attempt;
the file-absent branch already has the default cfg from line 1.
2. internal/relay/config_env.go — swap two strings.TrimSpace(os.Getenv(...))
calls for utils.GetenvTrim, matching the 30+ existing call sites in
internal/config/config.go. Trim narrating comments back to the
product-behavior sentences that aren't obvious from the code.
3. internal/relay/config_env_test.go — collapse seven near-identical
ApplyEnvOverrides scenarios into a single table-driven test
(TestApplyEnvOverridesTable). Reduces ~85 lines to ~60 and gives each
subcase a named t.Run for clearer failure output. Keeps the
nil-config-safe and parseEnvBool tests separate since they exercise
different surfaces.
4. .github/workflows/install-sh-smoke.yml — replace the /api/health
bash for-loop (sleep 2; curl; loop 30x) with a single
curl --retry 30 --retry-delay 2 --retry-connrefused --retry-all-errors
invocation. Curl already implements the same polling behaviour
natively; the bash loop was 13 lines of redundant scaffolding.
5. scripts/installtests/build_release_assets_test.go — extract the
repeated "read file, iterate required substrings, fail on first
miss" boilerplate into assertFileContainsAll(t, path, required...).
Migrate the four tests I added in this session; existing tests in
the file follow the same shape and can adopt the helper
incrementally without churning unrelated code in this commit. Also
updated the pinned curl string for the /api/health retry change.
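
A minimal sketch of the "read file, iterate required substrings, fail on
first miss" shape described in item 5. The names firstMissing and
checkFileContainsAll are hypothetical; the real assertFileContainsAll
takes a *testing.T and is not reproduced here.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// firstMissing returns the first required substring that content lacks.
// The boolean is false when every required substring is present.
func firstMissing(content string, required ...string) (string, bool) {
	for _, want := range required {
		if !strings.Contains(content, want) {
			return want, true
		}
	}
	return "", false
}

// checkFileContainsAll reads path and reports the first missing substring
// as an error. A test helper would call t.Fatalf instead of returning.
func checkFileContainsAll(path string, required ...string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read %s: %w", path, err)
	}
	if miss, ok := firstMissing(string(data), required...); ok {
		return fmt.Errorf("%s is missing required substring %q", path, miss)
	}
	return nil
}

func main() {
	// Illustrative content only; not the actual workflow file.
	content := "curl --retry 30 --retry-delay 2 http://localhost/api/health"
	if miss, ok := firstMissing(content, "--retry 30", "--retry-all-errors"); ok {
		fmt.Printf("missing: %s\n", miss)
	}
}
```

Returning the first miss rather than collecting all of them mirrors the
"fail on first miss" behaviour the commit describes.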
Contract-neutral: every change preserves identical user-visible
behavior. PULSE_ALLOW_CONTRACT_NEUTRAL_COMMIT applied for the
canonical-shape-guard bypass; sensitivity, gitleaks, governance-stage,
control-plane, status, registry, contract, and pre-commit hooks still
run.
Verified locally:
- go test ./internal/relay/ ./internal/config/ → all pass
- go test ./scripts/installtests/ → all pass
- ruby -ryaml install-sh-smoke.yml → parses clean
promote-floating-tags.yml's `workflow_run` chain off publish-docker.yml
silently stopped firing for rc.3 → rc.5 because publish-docker failed at
the now-removed pulse-agent push step. Customers pulling
rcourtman/pulse:latest, :6, or :6.0 stayed on whatever the previous
successful release had tagged — there was no warning anywhere that the
floating tags were stale.
Same fix pattern as install-sh-smoke (commit 7c0f65425) and
publish-helm-chart (commit 14c79a28e): add a workflow_call trigger to
promote-floating-tags.yml and call it explicitly from create-release.yml
after validate_release_assets succeeds.
Gating on validate_release_assets is intentional: that workflow waits
for the docker image to be pullable from the registry (with retry
backoff), so by the time it succeeds the image manifest exists and
re-tagging it to latest/major/minor cannot point at a missing image.
The legacy workflow_run trigger stays as the primary path; this just
guarantees promotion even when the chain doesn't fire.
Tag-resolver step now accepts inputs from workflow_call / workflow_dispatch
and only falls back to the workflow_run derivation when inputs are absent,
so all three entry paths converge on the same identity.
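
The input-priority rule can be sketched as below. This is an
illustration in Go, not the workflow step itself; resolveTag and its
parameter names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// resolveTag prefers an explicit input (workflow_call or workflow_dispatch)
// and only falls back to the tag derived from the workflow_run event when
// the input is empty.
func resolveTag(inputTag, workflowRunTag string) (string, error) {
	if t := strings.TrimSpace(inputTag); t != "" {
		return t, nil
	}
	if t := strings.TrimSpace(workflowRunTag); t != "" {
		return t, nil
	}
	return "", errors.New("no tag available from inputs or workflow_run")
}

func main() {
	tag, err := resolveTag("v6.0.0-rc.5", "v6.0.0-rc.4")
	fmt.Println(tag, err)
}
```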
Pinned in build_release_assets_test.go:
- new TestPromoteFloatingTagsReachableViaWorkflowCall pins the trigger
declaration and the input-priority resolver
- existing TestCreateReleaseUploadsPowerShellInstaller extended to pin
the promote_floating_tags job wiring (uses, tag, prerelease)
Contract delta in deployment-installability.md Extension Point 7
documents the same explicit-workflow_call requirement that applies to
publish-helm-chart, extended to promote-floating-tags.
v6 rc.1 → rc.5 published successfully but the Helm chart never landed on
rcourtman.github.io/Pulse/index.yaml — the index still ends at v5.1.30.
`helm install pulse pulse/pulse --version 6.0.0-rc.5` returns
chart-not-found; without `--version` helm pulls the latest published
chart (v5.1.30) into a customer's v6 cluster.
Root cause: GitHub does not fire `release: published` for releases that
were created as drafts and later PATCHed to draft=false. create-release.yml
deliberately uses that path so it can upload assets and run
validate-release-assets against the draft before promoting. Inspection of
the workflow run history confirms: every gh-API `release: published` event
since 2026-03-02 has been from manually-dispatched v5 stable cuts; zero
fired for v6 RCs published through the create-release pipeline.
Fix the same way install-sh-smoke was wired in commit 7c0f65425: add a
`workflow_call` trigger to publish-helm-chart.yml and call it explicitly
from create-release.yml as a downstream of validate_release_assets. The
chart-version resolver in publish-helm-chart now accepts inputs from
either workflow_call or workflow_dispatch and only falls back to the
release-event tag when no inputs are present, keeping the legacy
release-event path working for forks / manual gh-CLI publishes that
create with draft=false from the start.
Pinned in build_release_assets_test.go:
- create-release.yml wiring (publish_helm_chart job, version inputs)
- publish-helm-chart.yml workflow_call trigger declaration
- chart-version resolver's input-priority logic
Contract delta in deployment-installability.md Extension Point 7
documents the workflow_call requirement and forbids relying on the
release-published webhook for the create-release.yml draft-promotion
path.
The fix takes effect on the next release through the pipeline. Backfill
of the v6.0.0-rc.5 chart needs a one-time manual dispatch of
publish-helm-chart.yml against chart_version=6.0.0-rc.5.
The chart's agent.image.repository defaulted to ghcr.io/rcourtman/pulse-agent,
an image that has never been published. publish-docker.yml only pushes
rcourtman/pulse; the Dockerfile defines an agent_runtime stage that
*could* be published but it isn't, and commit da7969fb4 from earlier in
this session removed the corresponding pulse-agent attestation
expectations — a clear signal the separate agent image was intentionally
dropped without updating the chart. Customers running
`helm install pulse pulse/pulse --set agent.enabled=true` were silently
hitting ImagePullBackOff on the agent DaemonSet.
Route the chart through the main rcourtman/pulse image instead. To make
that work without per-arch chart overrides, the runtime stage in the
Dockerfile now creates an arch-resolved /usr/local/bin/pulse-agent
symlink to the right /opt/pulse/bin/pulse-agent-linux-{amd64,arm64,armv7}
binary. The chart's agent.command default is /usr/local/bin/pulse-agent,
which overrides the server ENTRYPOINT and runs the pod as a unified
agent on whichever arch the node provides. agent.yaml renders the
command via toYaml so list values pass through cleanly.
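
As an illustration of what "arch-resolved" means here, the sketch below
maps an architecture name to one of the three binary paths named above.
It is an assumption-laden Go analogue, not the Dockerfile logic: the
function name agentBinaryFor is hypothetical, and mapping Go's "arm" to
the armv7 binary is an assumption.

```go
package main

import (
	"fmt"
	"runtime"
)

// agentBinaryFor returns the per-arch agent binary that the
// /usr/local/bin/pulse-agent symlink would point at for a given
// Go-style architecture name.
func agentBinaryFor(goarch string) (string, error) {
	suffixes := map[string]string{
		"amd64": "amd64",
		"arm64": "arm64",
		"arm":   "armv7", // assumption: 32-bit ARM maps to the armv7 build
	}
	suffix, ok := suffixes[goarch]
	if !ok {
		return "", fmt.Errorf("unsupported architecture %q", goarch)
	}
	return "/opt/pulse/bin/pulse-agent-linux-" + suffix, nil
}

func main() {
	path, err := agentBinaryFor(runtime.GOARCH)
	fmt.Println(path, err)
}
```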
KUBERNETES.md's DaemonSet example switches from the arch-hardcoded
/opt/pulse/bin/pulse-agent-linux-amd64 to the new arch-resolved path,
restoring multi-arch portability of the docs example.
validate-release.sh asserts the symlink exists, points at one of the
three supported Linux arch binaries, and is executable in the published
image. A new TestHelmAgentRuntimePointsAtRealImage pins the chart
defaults, the template wiring, the Dockerfile symlink, and the
validate-release.sh guard so the regression class can't quietly
resurface.
Governance: extend the helm-chart-release-runtime verification policy's
exact_files to include scripts/installtests/build_release_assets_test.go
(matching its existing pin set for related deployment-installability
policies); update the subsystem_lookup_test.py fixture that pins the
exact_files list; document the agent-image and pulse-agent symlink
contract in deployment-installability.md Extension Point 7.
Verified locally: `helm lint` passes; `helm template --set agent.enabled=true`
renders a DaemonSet with image rcourtman/pulse:6.0.0,
command ["/usr/local/bin/pulse-agent"], args ["--enable-docker", "--enable-host=false"].
End-to-end image build + agent DaemonSet smoke will run via helm_smoke
on the next release once rcourtman/pulse:6.0.0 is published.
The smoke gate workflow exists from commit 065ebdb27 but until it is
called from create-release.yml it does not actually protect any release.
That is exactly the regression class that let rc.1 → rc.5 ship with a
broken install.sh: nothing in the release pipeline exercised the
documented secure-install flow against the published GitHub Release URL.
Wire install-sh-smoke.yml as a downstream workflow_call after
validate_release_assets succeeds. Gated on
historical_asset_backfill_only != 'true' since asset-backfill flows
re-upload to an already-published release and the smoke would just
re-confirm what hasn't changed.
Pre-install structural checks were verified locally against rc.5 — the
gate correctly fires the banner / agent-banner / --version handler
assertions against the broken release. The end-to-end container portion
(privileged systemd boot, install.sh execution, /api/health, /api/version
match) will run for the first time on the next release that publishes
through this workflow; existing retry loops on systemd readiness,
service activation, and health endpoint absorb transient runner flakes.
Add install-sh-smoke.yml to the deployment-installability canonical files
and to the release-promotion proof policy's match_files, and add
scripts/installtests/build_release_assets_test.go to that policy's
exact_files (matching the existing pin set for related policies in the
deployment-installability subsystem). Update subsystem_lookup_test.py
fixtures that pinned the exact_files list literally.
Pinned the create-release.yml wiring in build_release_assets_test.go
alongside the validate-release-assets wiring so the smoke step cannot
silently be unwired.
Document the gate's contract responsibilities in
deployment-installability Extension Point 2.
Across v6 rc.1 → rc.5 the published install.sh asset was the agent
installer rather than the server installer, and the README's pinned
ed25519 key did not verify what the pipeline actually signed. The first
broke `bash install.sh --version` and the in-product Update button; the
second silently failed the README's secure-install ssh-keygen step.
Neither was caught by CI because every existing gate operated on the
local release/ build, the Docker image, or the helm chart — nothing
exercised the documented LXC/systemd install commands against the
published release URL.
scripts/validate-release.sh now catches asset-identity drift at build
time. This workflow catches the rest of the regression class — anything
that breaks the actual install at runtime — by running the documented
flow end-to-end against the published release.
What it does:
- Downloads install.sh, install.sh.sshsig, and the linux-amd64 tarball
from releases/download/<tag>/.
- Extracts the README's pinned pulse-installer ed25519 key and runs the
exact ssh-keygen -Y verify command from the README's secure-install
snippet against the downloaded asset.
- Re-checks the server-installer banner, the --version) arg handler, and
the absence of the agent banner — same pins as validate-release.sh, but
now against what GitHub is actually serving (not just what was built
locally).
- Boots jrei/systemd-debian:12 privileged, runs
`bash install.sh --archive <tarball> --disable-auto-updates` from
inside, waits for systemd pulse.service to become active, hits
/api/health, and asserts /api/version reports the expected version.
--archive mode is used rather than --version so the workflow doesn't
depend on install.sh's self-refetch loop (the re-fetched bytes are the
ones we already validated). Auto-updates are disabled to avoid the timer
unit doing anything during the smoke run.
Triggers are workflow_dispatch + workflow_call only. Wire it into
create-release.yml after the next RC validates it green.
Pinned in build_release_assets_test.go so silent deletion or weakening
of any critical assertion (signature verify, banner check, /api/health
hit, version match) trips the test.
PULSE_RELAY_ENABLED and PULSE_RELAY_SERVER have been documented as relay
overrides in the v6 docs since March 18 (CONFIGURATION.md, RELAY.md, and
the frontend-served doc copy), but no code ever read them. Operators
trying to bootstrap relay headlessly saw no effect.
Implement them rather than remove the documentation. Headless and
container deployments now have a real path to enable relay and point it
at a private endpoint without going through Settings → Relay.
internal/relay/config_env.go:
- ApplyEnvOverrides(*Config) mutates relay.Config in place.
- PULSE_RELAY_ENABLED accepts true/false/yes/no/1/0/on/off (case-
insensitive). Unrecognized values log a warning and leave the file
value untouched — important so "unset" reads differently from
"explicit false."
- PULSE_RELAY_SERVER goes through the existing validateRelayServerURL
check; invalid URLs log a warning and fall through.
internal/config/persistence_relay.go:
LoadRelayConfig calls ApplyEnvOverrides after the file load and after
the default-fallback when relay.enc is absent, so the env override
applies on every load.
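
A minimal sketch of the boolean-override semantics described above,
assuming a two-value return to distinguish "unrecognized or unset" from
"explicit false". The relayConfig type and applyEnabledOverride are
hypothetical stand-ins; the actual parseEnvBool signature may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// relayConfig is a stand-in for relay.Config with only the field used here.
type relayConfig struct {
	Enabled bool
}

// parseEnvBool accepts true/false/yes/no/1/0/on/off, case-insensitively.
// ok is false for empty or unrecognized input.
func parseEnvBool(raw string) (value bool, ok bool) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "true", "yes", "1", "on":
		return true, true
	case "false", "no", "0", "off":
		return false, true
	default:
		return false, false
	}
}

// applyEnabledOverride mutates cfg in place and leaves the file value
// untouched when raw is unset or unrecognized. A nil cfg is a no-op.
func applyEnabledOverride(cfg *relayConfig, raw string) {
	if cfg == nil {
		return
	}
	if v, ok := parseEnvBool(raw); ok {
		cfg.Enabled = v
	}
}

func main() {
	cfg := &relayConfig{Enabled: true}
	applyEnabledOverride(cfg, "garbage") // file value kept
	fmt.Println(cfg.Enabled)
	applyEnabledOverride(cfg, "OFF") // explicit false wins
	fmt.Println(cfg.Enabled)
}
```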
Tests cover unset / true / false / garbage-bool / valid-URL / invalid-URL
/ both-together / nil-config paths in the relay package, plus two
end-to-end tests in internal/config that prove the override flows through
LoadRelayConfig against a real persisted file and against the
missing-file default branch.
Restore the env-var docs with the correct default URL (the full
wss://relay.pulserelay.pro/ws/instance, not the bare hostname the
original aspirational table claimed) and add an explicit precedence note:
saving from the UI after an env override persists the env-effective state
to disk, so clearing the env alone does not revert.
Add internal/relay/config_env_test.go to the relay-runtime registry's
desktop-relay-runtime exact_files so the new code surface is proof-tracked.
Update the matching pin in subsystem_lookup_test.py. Extend the
relay-runtime contract Extension Point 3 to document the override
semantics LoadRelayConfig must satisfy.
The README's secure-install snippet has pinned the wrong ed25519 key
since commit a60fa03d7 (April 22, 2026), so v6 rc.2 through rc.5 all
shipped with a documented verification step that does not work.
I downloaded the published rc.5 install.sh + install.sh.sshsig and
ran ssh-keygen -Y verify with both candidate keys:
  Ds21c5...  (README's pinned key)                               -> Could not verify signature
  MZd/DaH... (key embedded in install.sh and pulse-auto-update.sh) -> OK
Customers who actually followed the README's secure-install path saw
"Could not verify signature" and aborted. Most users curl-pipe the
script unverified so the drift went unreported.
Replace the stale key in README.md and docs/INSTALL.md with the actual
pipeline signing key (MZd/...).
Add a validate-release.sh smoke that extracts the README's pinned key
and runs the exact ssh-keygen -Y verify command against the signed
install.sh.sshsig. Any future drift between documented key and actual
signing key fails the release before publish.
Lock both the correct-key presence and the stale-key absence in
build_release_assets_test.go for README and docs/INSTALL.md so a manual
edit cannot regress the docs back to the broken state.
Every v6 RC (rc.1 through rc.5, ~30 days) shipped the wrong install.sh.
build-release.sh was copying the rendered AGENT installer into
release/install.sh, but adapter_installsh, scripts/pulse-auto-update.sh,
the root install.sh's own --rc/--stable/--version flows, and the README
quickstart all fetch that asset and run `bash install.sh --version vX.Y.Z`.
Since the agent installer rejects --version with "Unknown argument", the
LXC quickstart, the in-product "Update Pulse" button on systemd/proxmoxve
deployments, and the pulse-auto-update.sh systemd timer were all broken
for every RC.
Fix the build-release.sh copy to publish the root server installer.
The agent installer continues to ship inside tarballs at ./scripts/install.sh
and inside Docker images at /opt/pulse/scripts/install.sh, and is served
at the running Pulse server's /install.sh endpoint — none of those paths
change. Only the top-level GitHub Releases asset moves from agent to server.
Update build_release_assets_test.go to lock in the new publishing rule
and ban the reverse drift, replacing the March 18 "legacy root install.sh"
guard that was the original mistake.
Add a validate-release.sh smoke that catches the regression mode this hid:
the published install.sh must have the Pulse server banner, the --version)
arg handler, must not contain the agent banner, and `bash install.sh --help`
must print the server installer's version-pinning help line. These checks
run as part of validate-release-assets.yml against the post-publish asset
bundle so a future swap back cannot slip through.
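
The static part of that guard reduces to three substring checks. The
sketch below expresses them in Go for illustration; the real checks live
in validate-release.sh, installerIdentityProblems is a hypothetical
name, and the banner strings are passed in because their exact text is
not given here.

```go
package main

import (
	"fmt"
	"strings"
)

// installerIdentityProblems returns one message per failed check:
// server banner present, --version) case arm present, agent banner absent.
func installerIdentityProblems(script, serverBanner, agentBanner string) []string {
	var problems []string
	if !strings.Contains(script, serverBanner) {
		problems = append(problems, "server banner missing")
	}
	if !strings.Contains(script, "--version)") {
		problems = append(problems, "--version) arg handler missing")
	}
	if strings.Contains(script, agentBanner) {
		problems = append(problems, "agent banner present")
	}
	return problems
}

func main() {
	// Placeholder banners and script body, for illustration only.
	script := "echo 'SERVER BANNER'\ncase \"$1\" in\n  --version) shift ;;\nesac\n"
	fmt.Println(installerIdentityProblems(script, "SERVER BANNER", "AGENT BANNER"))
}
```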
Document the asset identity rule and the validate-release.sh guard in the
deployment-installability contract so any future change to the publishing
pipeline has to update the contract or trip the shape guard.
The MCP adapter shipped in slice 51 with one install option:
clone the repo and go build. This slice integrates pulse-mcp
into Pulse's existing governed release pipeline so a Pulse
release publishes a pulse-mcp binary alongside the unified agent
and the install scripts that bring it home in one command.
What ships:
- scripts/build-release.sh extended to build pulse-mcp for
the same multi-OS matrix as the unified agent, package
per-platform tarballs and zips, and copy bare binaries to
RELEASE_DIR for /releases/latest/download/ redirect
compatibility.
- .github/workflows/create-release.yml extended to upload
the bare pulse-mcp binaries plus install-mcp.sh and
install-mcp.ps1 as release assets.
- scripts/install-mcp.sh: bash one-line installer that
detects platform/arch, downloads the matching binary from
the configured release (latest by default), verifies SHA256
against the published checksums.txt, places at
~/.local/bin/pulse-mcp (or /usr/local/bin if not writable).
Honors PULSE_MCP_VERSION, PULSE_MCP_BIN_DIR, PULSE_MCP_REPO,
PULSE_MCP_NO_VERIFY env vars; declines Windows shells with
a pointer at the .ps1 sibling.
- scripts/install-mcp.ps1: PowerShell installer for Windows,
placing pulse-mcp.exe at $LOCALAPPDATA\pulse-mcp.
Documentation aligned:
- cmd/pulse-mcp/README.md gains an Install section above
Quick start with three options: one-line installer,
GitHub Release download, go install. Documents the macOS
Gatekeeper bypass since v1 is unnotarized by design.
- The Settings -> API Access agent-integrations panel now
surfaces the curl|bash command above the config snippet so
operators see "install pulse-mcp" before "configure your
MCP client."
- docs/releases/AGENT_PARADIGM.md drops the "no published
distribution path" item from "what it does not do yet" and
documents the Gatekeeper / Homebrew gaps as next-tier
follow-ups.
Trade-offs surfaced and chosen:
- Same cadence as Pulse: pulse-mcp ships per Pulse release,
not on its own track. The MCP server reads the manifest
from the Pulse it talks to, so version alignment is the
natural model.
- No Homebrew tap or core formula in v1. Maintaining a tap
is real ongoing work; the foundation supports adding Homebrew
later as a layer.
- No Docker image. Stdio JSON-RPC fights Docker's
stdin/stdout pattern.
- No notarization in v1. SHA256 verification through the
installer preserves the audit trail; README documents the
Gatekeeper bypass.
Subsystem contract: deployment-installability.md gains
scripts/install-mcp.sh, scripts/install-mcp.ps1, and
cmd/pulse-mcp/ in canonical files (mid-list entries
renumbered) plus a paragraph documenting the new MCP entry
point alongside the existing installer family.
Verification artifacts:
- scripts/installtests/build_release_assets_test.go gains
TestBuildReleasePackagesPulseMcpForAllPlatforms which pins
the build/package/copy wiring and the load-bearing
install-mcp.sh helpers (platform detection, SHA256
verification, install-dir resolution).
- scripts/release_control/render_release_body_test.py gains
test_agent_paradigm_release_notes_blurb_documents_distribution_path
which pins the AGENT_PARADIGM.md draft's install-mcp.sh reference
and the four-axis frame so a future edit cannot regress the
install story silently.
Smoke-tested install-mcp.sh locally on darwin-arm64: platform
detection, install-dir resolution, URL building, and 404 error
handling all correct. The full end-to-end install path becomes
live the moment a Pulse release ships pulse-mcp binaries; the
next RC cut will exercise it.
Add missing high-risk matrix sections for the paid-runtime and mobile product-purpose gates, guard status.json release gates against missing matrix runbooks, and refresh the GA-promotion blocked record for the current rc.4 line.
Separate first-class platform support from Pulse Agent host profiles and classify Unraid as an agent-backed host profile while preserving it as presentation-only platform vocabulary.