Commit graph

71 commits

Author SHA1 Message Date
rcourtman
348582df66 Fix Assistant chat model-owned routing 2026-05-15 10:50:23 +01:00
rcourtman
93c62e691a Aggregate simplify-review cleanups (no behavior change)
Some checks failed
Build and Test / Secret Scan (push) Has been cancelled
Build and Test / Frontend & Backend (push) Has been cancelled
Six small refactors aggregated from a simplify-review pass over this
session's commits:

1. internal/config/persistence_relay.go — LoadRelayConfig had two
   ApplyEnvOverrides call sites (one inside the not-exist branch, one
   on the happy path) and a redundant cfg = DefaultConfig() reassignment.
   Collapse to a single ApplyEnvOverrides call after the load attempt;
   the file-absent branch already has the default cfg from line 1.

2. internal/relay/config_env.go — swap two strings.TrimSpace(os.Getenv(...))
   calls for utils.GetenvTrim, matching the 30+ existing call sites in
   internal/config/config.go. Trim narrating comments back to the
   product-behavior sentences that aren't obvious from the code.

3. internal/relay/config_env_test.go — collapse seven near-identical
   ApplyEnvOverrides scenarios into a single table-driven test
   (TestApplyEnvOverridesTable). Reduces ~85 lines to ~60 and gives each
   subcase a named t.Run for clearer failure output. Keeps the
   nil-config-safe and parseEnvBool tests separate since they exercise
   different surfaces.

4. .github/workflows/install-sh-smoke.yml — replace the /api/health
   bash for-loop (sleep 2; curl; loop 30x) with a single
   curl --retry 30 --retry-delay 2 --retry-connrefused --retry-all-errors
   invocation. Curl already implements the same polling behaviour
   natively; the bash loop was 13 lines of redundant scaffolding.

5. scripts/installtests/build_release_assets_test.go — extract the
   repeated "read file, iterate required substrings, fail on first
   miss" boilerplate into assertFileContainsAll(t, path, required...).
   Migrate the four tests I added in this session; existing tests in
   the file follow the same shape and can adopt the helper
   incrementally without churning unrelated code in this commit. Also
   updated the pinned curl string for the /api/health retry change.

Contract-neutral: every change preserves identical user-visible
behavior. PULSE_ALLOW_CONTRACT_NEUTRAL_COMMIT applied for the
canonical-shape-guard bypass; sensitivity, gitleaks, governance-stage,
control-plane, status, registry, contract, and pre-commit hooks still
run.

Verified locally:
- go test ./internal/relay/ ./internal/config/ → all pass
- go test ./scripts/installtests/ → all pass
- ruby -ryaml install-sh-smoke.yml → parses clean
2026-05-12 17:32:11 +01:00
rcourtman
22a94f47d9 Skip release publish downstreams for drafts 2026-05-12 17:32:11 +01:00
rcourtman
29a815ef2a Fail closed on stale API action plans 2026-05-12 16:55:51 +01:00
rcourtman
3566a4d61d Drive promote-floating-tags via workflow_call from create-release
promote-floating-tags.yml's `workflow_run` chain off publish-docker.yml
silently stopped firing for rc.3 → rc.5 because publish-docker failed at
the now-removed pulse-agent push step. Customers pulling
rcourtman/pulse:latest, :6, or :6.0 stayed on whatever the previous
successful release had tagged — there was no warning anywhere that the
floating tags were stale.

Same fix pattern as install-sh-smoke (commit 7c0f65425) and
publish-helm-chart (commit 14c79a28e): add a workflow_call trigger to
promote-floating-tags.yml and call it explicitly from create-release.yml
after validate_release_assets succeeds.

Gating on validate_release_assets is intentional: that workflow waits
for the docker image to be pullable from the registry (with retry
backoff), so by the time it succeeds the image manifest exists and
re-tagging it to latest/major/minor cannot point at vapor.

The legacy workflow_run trigger stays as the primary path; this just
guarantees promotion even when the chain doesn't fire.

Tag-resolver step now accepts inputs from workflow_call / workflow_dispatch
and only falls back to the workflow_run derivation when inputs are absent,
so all three entry paths converge on the same identity.

Pinned in build_release_assets_test.go:
- new TestPromoteFloatingTagsReachableViaWorkflowCall pins the trigger
  declaration and the input-priority resolver
- existing TestCreateReleaseUploadsPowerShellInstaller extended to pin
  the promote_floating_tags job wiring (uses, tag, prerelease)

Contract delta in deployment-installability.md Extension Point 7
documents the same explicit-workflow_call requirement that applies to
publish-helm-chart, extended to promote-floating-tags.
2026-05-12 16:47:51 +01:00
rcourtman
14c79a28e7 Trigger publish-helm-chart via workflow_call from create-release
v6 rc.1 → rc.5 published successfully but the Helm chart never landed on
rcourtman.github.io/Pulse/index.yaml — the index still ends at v5.1.30.
`helm install pulse pulse/pulse --version 6.0.0-rc.5` returns
chart-not-found; without `--version` helm pulls the latest published
chart (v5.1.30) into a customer's v6 cluster.

Root cause: GitHub does not fire `release: published` for releases that
were created as drafts and later PATCHed to draft=false. create-release.yml
deliberately uses that path so it can upload assets and run
validate-release-assets against the draft before promoting. Inspection of
the workflow run history confirms: every gh-API `release: published` event
since 2026-03-02 has been from manually-dispatched v5 stable cuts; zero
fired for v6 RCs published through the create-release pipeline.

Fix the same way install-sh-smoke was wired in commit 7c0f65425: add a
`workflow_call` trigger to publish-helm-chart.yml and call it explicitly
from create-release.yml as a downstream of validate_release_assets. The
chart-version resolver in publish-helm-chart now accepts inputs from
either workflow_call or workflow_dispatch and only falls back to the
release-event tag when no inputs are present, keeping the legacy
release-event path working for forks / manual gh-CLI publishes that
create with draft=false from the start.

Pinned in build_release_assets_test.go:
- create-release.yml wiring (publish_helm_chart job, version inputs)
- publish-helm-chart.yml workflow_call trigger declaration
- chart-version resolver's input-priority logic

Contract delta in deployment-installability.md Extension Point 7
documents the workflow_call requirement and forbids relying on the
release-published webhook for the create-release.yml draft-promotion
path.

The fix takes effect on the next release through the pipeline. Backfill
of the v6.0.0-rc.5 chart needs a one-time manual dispatch of
publish-helm-chart.yml against chart_version=6.0.0-rc.5.
2026-05-12 16:30:53 +01:00
rcourtman
ab62b46c1f Fix helm chart agent.enabled by routing through main pulse image
The chart's agent.image.repository defaulted to ghcr.io/rcourtman/pulse-agent,
an image that has never been published. publish-docker.yml only pushes
rcourtman/pulse; the Dockerfile defines an agent_runtime stage that
*could* be published but it isn't, and commit da7969fb4 from earlier in
this session removed the corresponding pulse-agent attestation
expectations — a clear signal the separate agent image was intentionally
dropped without updating the chart. Customers running
`helm install pulse pulse/pulse --set agent.enabled=true` were silently
hitting ImagePullBackOff on the agent DaemonSet.

Route the chart through the main rcourtman/pulse image instead. To make
that work without per-arch chart overrides, the runtime stage in the
Dockerfile now creates an arch-resolved /usr/local/bin/pulse-agent
symlink to the right /opt/pulse/bin/pulse-agent-linux-{amd64,arm64,armv7}
binary. The chart's agent.command default is /usr/local/bin/pulse-agent,
which overrides the server ENTRYPOINT and runs the pod as a unified
agent on whichever arch the node provides. agent.yaml renders the
command via toYaml so list values pass through cleanly.

KUBERNETES.md's DaemonSet example switches from the arch-hardcoded
/opt/pulse/bin/pulse-agent-linux-amd64 to the new arch-resolved path,
restoring multi-arch portability of the docs example.
validate-release.sh asserts the symlink exists, points at one of the
three supported Linux arch binaries, and is executable in the published
image. A new TestHelmAgentRuntimePointsAtRealImage pins the chart
defaults, the template wiring, the Dockerfile symlink, and the
validate-release.sh guard so the regression class can't quietly
resurface.

Governance: extend the helm-chart-release-runtime verification policy's
exact_files to include scripts/installtests/build_release_assets_test.go
(matching its existing pin set for related deployment-installability
policies); update the subsystem_lookup_test.py fixture that pins the
exact_files list; document the agent-image and pulse-agent symlink
contract in deployment-installability.md Extension Point 7.

Verified locally: `helm lint` passes; `helm template --set agent.enabled=true`
renders a DaemonSet with image rcourtman/pulse:6.0.0,
command ["/usr/local/bin/pulse-agent"], args ["--enable-docker", "--enable-host=false"].
End-to-end image build + agent DaemonSet smoke will run via helm_smoke
on the next release once rcourtman/pulse:6.0.0 is published.
2026-05-12 16:11:56 +01:00
rcourtman
7c0f654253 Wire install.sh smoke gate into create-release.yml release pipeline
The smoke gate workflow exists from commit 065ebdb27 but until it is
called from create-release.yml it does not actually protect any release.
That is exactly the regression class that let rc.1 → rc.5 ship with a
broken install.sh: nothing in the release pipeline exercised the
documented secure-install flow against the published GitHub Release URL.

Wire install-sh-smoke.yml as a downstream workflow_call after
validate_release_assets succeeds. Gated on
historical_asset_backfill_only != 'true' since asset-backfill flows
re-upload to an already-published release and the smoke would just
re-confirm what hasn't changed.

Pre-install structural checks were verified locally against rc.5 — the
gate correctly fires the banner / agent-banner / --version handler
assertions against the broken release. The end-to-end container portion
(privileged systemd boot, install.sh execution, /api/health, /api/version
match) will run for the first time on the next release that publishes
through this workflow; existing retry loops on systemd readiness,
service activation, and health endpoint absorb transient runner flakes.

Add install-sh-smoke.yml to the deployment-installability canonical files
and to the release-promotion proof policy's match_files, and add
scripts/installtests/build_release_assets_test.go to that policy's
exact_files (matching the existing pin set for related policies in the
deployment-installability subsystem). Update subsystem_lookup_test.py
fixtures that pinned the exact_files list literally.

Pinned the create-release.yml wiring in build_release_assets_test.go
alongside the validate-release-assets wiring so the smoke step cannot
silently be unwired.

Document the gate's contract responsibilities in
deployment-installability Extension Point 2.
2026-05-12 11:44:04 +01:00
rcourtman
065ebdb276 Add install.sh end-to-end smoke gate against published release
Across v6 rc.1 → rc.5 the published install.sh asset was the agent
installer rather than the server installer, and the README's pinned
ed25519 key did not verify what the pipeline actually signed. The first
broke `bash install.sh --version` and the in-product Update button; the
second silently failed the README's secure-install ssh-keygen step.
Neither was caught by CI because every existing gate operated on the
local release/ build, the Docker image, or the helm chart — nothing
exercised the documented LXC/systemd install commands against the
published release URL.

scripts/validate-release.sh now catches asset-identity drift at build
time. This workflow catches the rest of the regression class — anything
that breaks the actual install at runtime — by running the documented
flow end-to-end against the published release.

What it does:
- Downloads install.sh, install.sh.sshsig, and the linux-amd64 tarball
  from releases/download/<tag>/.
- Extracts the README's pinned pulse-installer ed25519 key and runs the
  exact ssh-keygen -Y verify command from the README's secure-install
  snippet against the downloaded asset.
- Re-checks the server-installer banner, the --version) arg handler, and
  the absence of the agent banner — same pins as validate-release.sh, but
  now against what GitHub is actually serving (not just what was built
  locally).
- Boots jrei/systemd-debian:12 privileged, runs
  `bash install.sh --archive <tarball> --disable-auto-updates` from
  inside, waits for systemd pulse.service to become active, hits
  /api/health, and asserts /api/version reports the expected version.

--archive mode is used rather than --version so the workflow doesn't
depend on install.sh's self-refetch loop (the re-fetched bytes are the
ones we already validated). Auto-updates are disabled to avoid the timer
unit doing anything during the smoke run.

Triggers are workflow_dispatch + workflow_call only. Wire it into
create-release.yml after the next RC validates it green.

Pinned in build_release_assets_test.go so silent deletion or weakening
of any critical assertion (signature verify, banner check, /api/health
hit, version match) trips the test.
2026-05-12 11:25:46 +01:00
rcourtman
ce7d7c1956 Fix stale README signature key and guard against future drift
The README's secure-install snippet has pinned the wrong ed25519 key
since commit a60fa03d7 (April 22, 2026), so v6 rc.2 through rc.5 all
shipped with a documented verification step that does not work.

I downloaded the published rc.5 install.sh + install.sh.sshsig and
ran ssh-keygen -Y verify with both candidate keys:
  Ds21c5... (README's pinned key) -> Could not verify signature
  MZd/DaH... (key embedded in install.sh and pulse-auto-update.sh) -> OK

Customers who actually followed the README's secure-install path saw
"Could not verify signature" and aborted. Most users curl-pipe the
script unverified so the drift went unreported.

Replace the stale key in README.md and docs/INSTALL.md with the actual
pipeline signing key (MZd/...).

Add a validate-release.sh smoke that extracts the README's pinned key
and runs the exact ssh-keygen -Y verify command against the signed
install.sh.sshsig. Any future drift between documented key and actual
signing key fails the release before publish.

Lock both the correct-key presence and the stale-key absence in
build_release_assets_test.go for README and docs/INSTALL.md so a manual
edit cannot regress the docs back to the broken state.
2026-05-12 10:30:42 +01:00
rcourtman
49412357af Ship the Pulse server install.sh as the GitHub Release asset
Every v6 RC (rc.1 through rc.5, ~30 days) shipped the wrong install.sh.
build-release.sh was copying the rendered AGENT installer into
release/install.sh, but adapter_installsh, scripts/pulse-auto-update.sh,
the root install.sh's own --rc/--stable/--version flows, and the README
quickstart all fetch that asset and run `bash install.sh --version vX.Y.Z`.
Since the agent installer rejects --version with "Unknown argument", the
LXC quickstart, the in-product "Update Pulse" button on systemd/proxmoxve
deployments, and the pulse-auto-update.sh systemd timer were all broken
for every RC.

Fix the build-release.sh copy to publish the root server installer.
The agent installer continues to ship inside tarballs at ./scripts/install.sh
and inside Docker images at /opt/pulse/scripts/install.sh, and is served
at the running Pulse server's /install.sh endpoint — none of those paths
change. Only the top-level GitHub Releases asset moves from agent to server.

Update build_release_assets_test.go to lock in the new publishing rule
and ban the reverse drift, replacing the March 18 "legacy root install.sh"
guard that was the original mistake.

Add a validate-release.sh smoke that catches the regression mode this hid:
the published install.sh must have the Pulse server banner, the --version)
arg handler, must not contain the agent banner, and `bash install.sh --help`
must print the server installer's version-pinning help line. These checks
run as part of validate-release-assets.yml against the post-publish asset
bundle so a future swap back cannot slip through.

Document the asset identity rule and the validate-release.sh guard in the
deployment-installability contract so any future change to the publishing
pipeline has to update the contract or trip the shape guard.
2026-05-12 10:24:28 +01:00
rcourtman
da7969fb48 Drop pulse-agent attestation expectations from TestReleaseWorkflowsUseSecretSafeAttestedImageBuilds
Some checks are pending
Build and Test / Secret Scan (push) Waiting to run
Build and Test / Frontend & Backend (push) Waiting to run
2026-05-12 02:01:45 +01:00
rcourtman
3da835c5bc Publish a distribution path for pulse-mcp
The MCP adapter shipped in slice 51 with one install option:
clone the repo and go build. This slice integrates pulse-mcp
into Pulse's existing governed release pipeline so a Pulse
release publishes a pulse-mcp binary alongside the unified agent
and the install scripts that bring it home in one command.

What ships:

  - scripts/build-release.sh extended to build pulse-mcp for
    the same multi-OS matrix as the unified agent, package
    per-platform tarballs and zips, and copy bare binaries to
    RELEASE_DIR for /releases/latest/download/ redirect
    compatibility.
  - .github/workflows/create-release.yml extended to upload
    the bare pulse-mcp binaries plus install-mcp.sh and
    install-mcp.ps1 as release assets.
  - scripts/install-mcp.sh: bash one-line installer that
    detects platform/arch, downloads the matching binary from
    the configured release (latest by default), verifies SHA256
    against the published checksums.txt, places at
    ~/.local/bin/pulse-mcp (or /usr/local/bin if not writable).
    Honors PULSE_MCP_VERSION, PULSE_MCP_BIN_DIR, PULSE_MCP_REPO,
    PULSE_MCP_NO_VERIFY env vars; declines Windows shells with
    a pointer at the .ps1 sibling.
  - scripts/install-mcp.ps1: PowerShell installer for Windows,
    placing pulse-mcp.exe at $LOCALAPPDATA\pulse-mcp.

Documentation aligned:

  - cmd/pulse-mcp/README.md gains an Install section above
    Quick start with three options: one-line installer,
    GitHub Release download, go install. Documents the macOS
    Gatekeeper bypass since v1 is unnotarized by design.
  - The Settings -> API Access agent-integrations panel now
    surfaces the curl|bash command above the config snippet so
    operators see "install pulse-mcp" before "configure your
    MCP client."
  - docs/releases/AGENT_PARADIGM.md drops the "no published
    distribution path" item from "what it does not do yet" and
    documents the Gatekeeper / Homebrew gaps as next-tier
    follow-ups.

Trade-offs surfaced and chosen:

  - Same cadence as Pulse: pulse-mcp ships per Pulse release,
    not on its own track. The MCP server reads the manifest
    from the Pulse it talks to, so version alignment is the
    natural model.
  - No Homebrew tap or core formula in v1. Maintaining a tap
    is real ongoing work; foundation supports adding Homebrew
    later as a layer.
  - No Docker image. Stdio JSON-RPC fights Docker's stdin
    /stdout pattern.
  - No notarization in v1. SHA256 verification through the
    installer preserves the audit trail; README documents the
    Gatekeeper bypass.

Subsystem contract: deployment-installability.md gains
scripts/install-mcp.sh, scripts/install-mcp.ps1, and
cmd/pulse-mcp/ in canonical files (mid-list entries
renumbered) plus a paragraph documenting the new MCP entry
point alongside the existing installer family.

Verification artifacts:

  - scripts/installtests/build_release_assets_test.go gains
    TestBuildReleasePackagesPulseMcpForAllPlatforms which pins
    the build/package/copy wiring and the load-bearing
    install-mcp.sh helpers (platform detection, SHA256
    verification, install-dir resolution).
  - scripts/release_control/render_release_body_test.py gains
    test_agent_paradigm_release_notes_blurb_documents_-
    distribution_path which pins the AGENT_PARADIGM.md draft's
    install-mcp.sh reference and the four-axis frame so a
    future edit cannot regress the install story silently.

Smoke-tested install-mcp.sh locally on darwin-arm64: platform
detection, install-dir resolution, URL building, and 404 error
handling all correct. The full end-to-end install path becomes
live the moment a Pulse release ships pulse-mcp binaries; the
next RC cut will exercise it.
2026-05-10 17:04:49 +01:00
rcourtman
0f747781fb Support private Pro archive installs 2026-05-07 09:28:38 +01:00
rcourtman
d6e96ebeca Fix v6 demo release signing key deployment 2026-05-05 21:40:14 +01:00
rcourtman
96c2e160c9 Fix RC4 release validation blockers 2026-05-05 15:59:23 +01:00
rcourtman
ce7b459aa7 Harden runtime Proxmox token ACLs 2026-05-05 14:42:05 +01:00
rcourtman
cf103ca9fe Harden root agent service defaults 2026-05-05 13:03:13 +01:00
rcourtman
fe30ecc81e Fix TrueNAS CORE agent supervisor restart
Refs #1457
2026-05-05 09:13:03 +01:00
rcourtman
1a9fa936ee Fix release key helper module path 2026-05-04 09:44:41 +01:00
rcourtman
c27814d190 Fix stable installer prerelease selection
Refs #1435
2026-05-03 15:20:18 +01:00
rcourtman
9ba0c3fa96 Retry release asset uploads 2026-05-03 10:26:51 +01:00
rcourtman
54378a14e5 Fix release validation draft metadata preservation 2026-05-02 02:01:57 +01:00
rcourtman
011d288cb4 Fix release asset validation workflow gates 2026-05-02 00:36:54 +01:00
rcourtman
c8e24f06d7 Fix clean VCS metadata for release builds 2026-05-01 23:12:41 +01:00
rcourtman
87aba32540 Port installer disk preflight from v5 2026-05-01 20:28:11 +01:00
rcourtman
411e8daa4d Port installer bundle fallback fix from v5 2026-05-01 20:28:11 +01:00
rcourtman
af8a5f0740 Port RC3 maintenance fixes from v5
Refs #1440, #1444, #1451
2026-05-01 20:28:11 +01:00
rcourtman
a2c101379a Guard stable updater from prerelease tags
Refuse prerelease-shaped tags and explicit GitHub prerelease responses in the unattended stable updater before installer invocation.

Add installability tests and proof routing for the auto-update prerelease refusal guard.
2026-04-25 23:49:27 +01:00
rcourtman
fb6b53268a Harden release Docker key embedding cache 2026-04-24 17:21:04 +01:00
rcourtman
3ffdf785f1 Split hosted runtime image build contract 2026-04-24 11:33:20 +01:00
rcourtman
c4f1e8d7cb Avoid tenant runtime image copy-up 2026-04-24 09:21:42 +01:00
rcourtman
c51708000f Tighten unified agent hardening proof 2026-04-23 23:37:25 +01:00
rcourtman
9bada35337 Harden unified agent runtime and installer 2026-04-23 23:04:18 +01:00
rcourtman
f58840e8a8 Guard forward release signing against trust-root drift 2026-04-22 19:59:18 +01:00
rcourtman
c0f48b27ba Grant release validation workflow required permissions 2026-04-22 17:47:13 +01:00
rcourtman
9c2e3d5ffb Add historical backfill mode to create-release workflow 2026-04-22 17:43:37 +01:00
rcourtman
16ad67a9b5 Add historical release asset backfill workflow 2026-04-22 17:25:58 +01:00
rcourtman
f96abc5ee0 Publish signed release-packet SBOM assets 2026-04-22 16:49:29 +01:00
rcourtman
21dde76c6f Validate signed release sidecar assets 2026-04-22 16:30:01 +01:00
rcourtman
a60fa03d7f Route operator updates through the local signed helper 2026-04-22 16:18:16 +01:00
rcourtman
ce95ef1fc6 Require signed server installer updates 2026-04-22 15:41:54 +01:00
rcourtman
ca26ed2f44 Pin Dockerfile base images by digest 2026-04-22 11:22:46 +01:00
rcourtman
21950c6e4c Restore QNAP agent boot and update continuity
Refs #1420

Refs #1422
2026-04-22 10:48:43 +01:00
rcourtman
74df03c78c Pin workflow actions and CI image versions 2026-04-22 10:12:15 +01:00
rcourtman
1841c032f6 Pin deployment defaults and verify Helm docs downloads 2026-04-22 06:05:06 +01:00
rcourtman
4720807ae5 Require signed installer downloads and local release sidecars 2026-04-22 03:51:46 +01:00
rcourtman
96034f5e10 Attest release artifacts and harden image provenance 2026-04-22 03:22:29 +01:00
rcourtman
7be844f23a Require signed unified agent release assets 2026-04-22 02:00:29 +01:00
rcourtman
4711d11163 Fix fresh Proxmox LXC installs defaulting to RC 2026-04-20 23:11:46 +01:00