Commit graph

75 commits

Author SHA1 Message Date
Peter Steinberger
84aed919a9
fix: restore CI restart and provider compat 2026-04-18 19:24:08 +01:00
Peter Steinberger
438799e929
fix: log detached service restart attempts 2026-04-18 19:08:36 +01:00
Peter Steinberger
28be124cc1
refactor: centralize restart log conventions 2026-04-18 19:08:35 +01:00
Peter Steinberger
089e038dfe fix: harden macOS update restart helper (#68492) (thanks @hclsys) 2026-04-18 18:39:03 +01:00
HCL
4a870300dd fix(update-cli): capture macOS launchctl stderr to a log file instead of /dev/null
The macOS restart helper emitted by `openclaw update` (darwin branch of
`prepareRestartScript`) wrote the gateway restart script with every
`launchctl` stderr redirected to `/dev/null` and the final fallback
`kickstart` chained with `|| true`. When bootstrap/kickstart failed
(plist-on-disk race, schema rejection, stale job, bootout recovery
edge cases), the script exited 0, the updater declared success, and
the gateway silently stayed offline.

The reporter saw a ~25 minute production outage before noticing the
messages going unanswered across Telegram/Discord/Feishu.

Route stderr to `~/.openclaw/logs/update-restart.log` via `exec 2>>`,
drop `2>/dev/null` on every launchctl call, and remove the `|| true`
swallow on the fallback kickstart so a genuine failure exits non-zero
and leaves a durable audit trail. Log directory creation is best-effort
via `mkdir -p ... 2>/dev/null || true` since it normally already exists
from the gateway's own logging path. Self-cleanup of the script file
via `rm -f "$0"` is retained because the log, not the script, is the
useful artifact after the fact.

Adds a targeted regression test `captures macOS launchctl stderr to
~/.openclaw/logs/update-restart.log` alongside the existing darwin
restart-script test. The existing test's assertions about the
kickstart/enable/bootstrap fallback chain + self-cleanup all still pass.

Fixes #68486
2026-04-18 18:39:03 +01:00
Onur
900e291f31
CI: expand native release validation coverage (#67144)
* Actions: grant reusable release checks actions read

* Actions: use read-all for reusable release checks

* CI: add native cross-OS release checks

* CI: wire Discord smoke secrets for cross-OS checks

* CI: fix native cross-OS installer compatibility

* CI: skip empty pnpm cache saves in matrix jobs

* CI: honor workflow runner override envs

* CI: finish native cross-OS update checks

* CI: fix native cross-OS workflow regressions

* Installer: capture Windows npm stderr safely

* CI: harden cross-OS release checks

* CI: resolve reusable workflow harness ref

* CI: stabilize cross-OS dev update lanes

* CI: tighten release-check workflow semantics

* CI: repoint repaired git CLI on POSIX

* CI: repair native dev-update shell handoff

* CI: preserve real updater semantics

* CI: harden supported release-check refs

* CI: harden release-check refs and fresh mode

* CI: skip dev-update for immutable tag refs

* CI: repair fresh installer release checks

* CI: fix native release check installer lanes

* CI: install release checks from candidate artifacts

* CI: use Windows cmd shims in release checks

* Installer: run Windows npm shim via PowerShell

* CI: pin dev update verification to candidate sha

* CI: pin reusable harness and published installers

* CI: isolate Windows dev-update PATH validation

* CI: align Windows dev-update bootstrap validation

* CI: avoid Windows installer gateway flake

* CI: run cross-OS release checks via TypeScript

* CI: bootstrap tsx for release-check workflow

* CI: fix native release-check follow-ups

* CI: tighten dev-update release checks

* CI: peel annotated workflow refs

* CI: harden native release checks

* CI: fix release-check verifier drift

* CI: fix release-check workflow drift

* CI: fix release-check ref resolution

* CI: harden Windows release-check gateway startup

* CI: fix release-check fallback validation

* CI: harden cross-os release checks

* CI: pin dev-update release checks to candidate SHA

* CI: resolve remote dev target refs

* CI: detect cloned dev-update checkouts

* CI: harden Windows release-check launcher

* Windows: harden task fallback and runner overrides

* Release checks: preserve Windows PATH and baseline version reads

* CI: add release validation live lanes

* CI: expand live and e2e release coverage

* CI: add branch dispatch for live and e2e checks
2026-04-16 19:58:19 +02:00
Ayaan Zaidi
64f258fc49 fix(update): keep downgrade follow-ups in-process 2026-04-15 13:22:04 +05:30
Mariano
8dbe1b4f5a
fix(gateway): harden service entrypoint resolution (#65984)
Merged via squash.

Prepared head SHA: 31cbc3349c
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-04-13 17:14:29 +02:00
Vincent Koc
d4fb7d893d fix(ci): repair main tsgo regressions 2026-04-12 19:14:00 +01:00
Vincent Koc
a24af49100
fix(update-cli): respawn plugin refresh after self-update (#65471)
* fix(update-cli): respawn plugin refresh after self-update

* Update src/cli/update-cli/update-command.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Update CHANGELOG.md

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-04-12 18:26:43 +01:00
Peter Steinberger
1fb2e18f47
refactor: simplify cli conversions 2026-04-11 01:27:48 +01:00
Vincent Koc
0e54440ecc fix(cycles): remove browser cli and tlon runtime seams 2026-04-10 11:45:28 +01:00
Peter Steinberger
63e00b811e
refactor: dedupe config cli command trimmed readers 2026-04-08 01:36:38 +01:00
Peter Steinberger
1a3f141215
refactor: dedupe cli lowercase helpers 2026-04-07 17:50:38 +01:00
Peter Steinberger
38cb5aefc8
fix(cli): narrow post-update root 2026-04-06 02:50:38 +01:00
Peter Steinberger
bdf1f02154
fix: exit after package-to-git handoff 2026-04-06 02:39:53 +01:00
Peter Steinberger
eba8fed94b
fix: stop old cli after package-to-git switch 2026-04-06 02:17:20 +01:00
Peter Steinberger
26c9885832
fix: skip stale post-switch update follow-ups 2026-04-06 02:03:04 +01:00
Peter Steinberger
d37b97c2ff
refactor(update): extract package manager bootstrap logic 2026-04-06 01:41:59 +01:00
Peter Steinberger
ca462fb928
fix(update): bootstrap pnpm for dev preflight 2026-04-06 01:31:27 +01:00
Peter Steinberger
e0354e71eb
fix: skip old-process config writes after git switch 2026-04-06 01:29:33 +01:00
Peter Steinberger
c4cc557604
fix: clarify dirty dev update error 2026-04-06 00:58:19 +01:00
Peter Steinberger
be16cf2f0d
fix: defer plugin sync after git switch 2026-04-06 00:46:56 +01:00
Peter Steinberger
a301e2ef87
test: trim cli and infra importActual mocks 2026-04-03 19:54:37 +01:00
Peter Steinberger
3edfc494df
test: expand builtin mock helper usage 2026-04-03 18:53:34 +01:00
Peter Steinberger
e0580e6863
test: harden shared-worker runtime setup 2026-04-03 18:18:56 +01:00
Peter Steinberger
b40d4b63f6
refactor: centralize update targets and extension guardrails 2026-04-03 23:26:31 +09:00
jayeshp19
eb4b6f7024 Use owning npm prefix for global updates 2026-04-03 23:01:43 +09:00
Peter Steinberger
3f1d6fe147
test: speed up cli and command suites 2026-03-31 02:25:02 +01:00
wangchunyue
16df3de098
fix: stabilize config default-leak landing tests (#56834) (thanks @openperf)
* fix(config): prevent AJV schema defaults from leaking into persisted config

Fixes #56772. Ensures that channel and plugin AJV validations respect the applyDefaults option, preventing runtime defaults from being written to openclaw.json during doctor/update flows.

* test: address review feedback on #56772 fix

- Split validation.channel-metadata.test.ts into applyDefaults true/false cases (fixes CI)

- Update io.write-config.test.ts regression test to use a mock plugin registry, ensuring it actually exercises the AJV default injection path

* fix(config): revert applyDefaults passthrough to prevent required-field regression

Codex-connector correctly identified that BlueBubbles channel schema marks

enrichGroupParticipantsFromContacts as both default:true and required.

Passing applyDefaults:false during write validation would cause required

checks to fail, breaking writeConfigFile entirely.

Reverted validation.ts to always use applyDefaults:true for channel/plugin

AJV validation. The protection against default leakage into persisted config

is fully handled by the persistCandidate change in io.ts (cfgToWrite uses

the pre-validation merge-patched value, not validated.config).

Updated validation.channel-metadata.test.ts to reflect this architecture.

* fix(config): apply legacy web-search normalization to persistCandidate

* fix: stabilize config default-leak landing tests (#56834) (thanks @openperf)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-03-30 09:19:17 +05:30
Peter Steinberger
47216702f4
refactor(config): use source snapshots for config mutations 2026-03-30 01:02:25 +01:00
Peter Steinberger
89a4f2a34e
refactor(config): centralize runtime config state 2026-03-30 00:17:23 +01:00
Peter Steinberger
ea08f2eb8c
fix(runtime): support Node 22.14 installs 2026-03-25 06:22:18 -07:00
Peter Steinberger
66c88b4c77
fix(update): preflight npm target node engine 2026-03-25 06:01:20 -07:00
Peter Steinberger
ce49d8bca9
fix: verify global npm correction installs 2026-03-23 21:04:08 -07:00
Vincent Koc
8fa91d283b fix(cli): preserve posix default git dir 2026-03-23 11:49:55 -07:00
Peter Steinberger
ffb287e1de
fix: harden update dev switch and refresh changelog 2026-03-23 10:56:35 -07:00
Peter Steinberger
6b9915a106
refactor!: drop legacy CLAWDBOT env compatibility 2026-03-22 22:13:39 -07:00
Peter Steinberger
4ee41cc6f3 refactor(cli): separate json payload output from logging 2026-03-22 23:19:17 +00:00
Vincent Koc
5a7aba94a2
CLI: support package-manager installs from GitHub main (#47630)
* CLI: resolve package-manager main install specs

* CLI: skip registry resolution for raw package specs

* CLI: support main package target updates

* CLI: document package update specs in help

* Tests: cover package install spec resolution

* Tests: cover npm main-package updates

* Tests: cover update --tag main

* Installer: support main package targets

* Installer: support main package targets on Windows

* Docs: document package-manager main updates

* Docs: document installer main targets

* Docs: document npm and pnpm main installs

* Docs: document update --tag main

* Changelog: note package-manager main installs

* Update src/infra/update-global.test.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-03-15 14:18:12 -07:00
Vincent Koc
65f92fd839
Guard updater service refresh against missing invocation cwd (#45486)
* Update: capture a stable cwd for service refresh env

* Test: cover service refresh when cwd disappears
2026-03-13 18:09:01 -04:00
Vincent Koc
e56e0cc913
Fix updater refresh cwd for service reinstall (#45452)
* Fix updater refresh cwd for service reinstall

* Update: preserve relative env overrides during service refresh

* Test: cover updater service refresh env rebasing
2026-03-13 17:27:12 -04:00
MoerAI
9da06d918f fix(windows): add windowsHide to detached spawn calls to suppress console windows (#44693)
The restart helper and taskkill spawn calls were missing windowsHide: true,
causing visible command prompt windows to flash on screen during gateway
restart and process cleanup on Windows.
2026-03-13 21:06:33 +00:00
Peter Steinberger
d96069f0df
feat: add windows update package spec override 2026-03-12 23:56:48 +00:00
Peter Steinberger
91b701e183
fix: harden windows native updates 2026-03-12 23:42:14 +00:00
Vincent Koc
04e103d10e
fix(terminal): stabilize skills table width across Terminal.app and iTerm (#42849)
* Terminal: measure grapheme display width

* Tests: cover grapheme terminal width

* Terminal: wrap table cells by grapheme width

* Tests: cover emoji table alignment

* Terminal: refine table wrapping and width handling

* Terminal: stop shrinking CLI tables by one column

* Skills: use Terminal-safe emoji in list output

* Changelog: note terminal skills table fixes

* Skills: normalize emoji presentation across outputs

* Terminal: consume unsupported escape bytes in tables
2026-03-11 09:13:10 -04:00
Peter Steinberger
1d3dde8d21 fix(update): re-enable launchd service before updater bootstrap 2026-03-09 07:27:11 +00:00
Ayaan Zaidi
05c240fad6
fix: restart Windows gateway via Scheduled Task (#38825) (#38825) 2026-03-07 18:00:38 +05:30
Tak Hoffman
1be39d4250
fix(gateway): synthesize lifecycle robustness for restart and startup probes (#33831)
* fix(gateway): correct launchctl command sequence for gateway restart (closes #20030)

* fix(restart): expand HOME and escape label in launchctl plist path

* fix(restart): poll port free after SIGKILL to prevent EADDRINUSE restart loop

When cleanStaleGatewayProcessesSync() kills a stale gateway process,
the kernel may not immediately release the TCP port. Previously the
function returned after a fixed 500ms sleep (300ms SIGTERM + 200ms
SIGKILL), allowing triggerOpenClawRestart() to hand off to systemd
before the port was actually free. The new systemd process then raced
the dying socket for port 18789, hit EADDRINUSE, and exited with
status 1, causing systemd to retry indefinitely — the zombie restart
loop reported in #33103.

Fix: add waitForPortFreeSync() that polls lsof at 50ms intervals for
up to 2 seconds after SIGKILL. cleanStaleGatewayProcessesSync() now
blocks until the port is confirmed free (or the budget expires with a
warning) before returning. The increased SIGTERM/SIGKILL wait budgets
(600ms / 400ms) also give slow processes more time to exit cleanly.

Fixes #33103
Related: #28134

* fix: add EADDRINUSE retry and TIME_WAIT port-bind checks for gateway startup

* fix(ports): treat EADDRNOTAVAIL as non-retryable and fix flaky test

* fix(gateway): hot-reload agents.defaults.models allowlist changes

The reload plan had a rule for `agents.defaults.model` (singular) but
not `agents.defaults.models` (plural — the allowlist array).  Because
`agents.defaults.models` does not prefix-match `agents.defaults.model.`,
it fell through to the catch-all `agents` tail rule (kind=none), so
allowlist edits in openclaw.json were silently ignored at runtime.

Add a dedicated reload rule so changes to the models allowlist trigger
a heartbeat restart, which re-reads the config and serves the updated
list to clients.

Fixes #33600

Co-authored-by: HCL <chenglunhu@gmail.com>
Signed-off-by: HCL <chenglunhu@gmail.com>

* test(restart): 100% branch coverage — audit round 2

Audit findings fixed:
- remove dead guard: terminateStaleProcessesSync pids.length===0 check was
  unreachable (only caller cleanStaleGatewayProcessesSync already guards)
- expose __testing.callSleepSyncRaw so sleepSync's real Atomics.wait path
  can be unit-tested directly without going through the override
- fix broken sleepSync Atomics.wait test: previous test set override=null
  but cleanStaleGatewayProcessesSync returned before calling sleepSync —
  replaced with direct callSleepSyncRaw calls that actually exercise L36/L42-47
- fix pid collision: two tests used process.pid+304 (EPERM + dead-at-SIGTERM);
  EPERM test changed to process.pid+305
- fix misindented tests: 'deduplicates pids' and 'lsof status 1 container
  edge case' were outside their intended describe blocks; moved to correct
  scopes (findGatewayPidsOnPortSync and pollPortOnce respectively)
- add missing branch tests:
  - status 1 + non-empty stdout with zero openclaw pids → free:true (L145)
  - mid-loop non-openclaw cmd in &&-chain (L67)
  - consecutive p-lines without c-line between them (L67)
  - invalid PID in p-line (p0 / pNaN) — ternary false branch (L67)
  - unknown lsof output line (else-if false branch L69)

Coverage: 100% stmts / 100% branch / 100% funcs / 100% lines (36 tests)

* test(restart): fix stale-pid test typing for tsgo

* fix(gateway): address lifecycle review findings

* test(update): make restart-helper path assertions windows-safe

---------

Signed-off-by: HCL <chenglunhu@gmail.com>
Co-authored-by: Glucksberg <markuscontasul@gmail.com>
Co-authored-by: Efe Büken <efe@arven.digital>
Co-authored-by: Riccardo Marino <rmarino@apple.com>
Co-authored-by: HCL <chenglunhu@gmail.com>
2026-03-03 21:31:12 -06:00
Shadow
b0bcea03db
fix: drop discord opus dependency 2026-03-03 12:23:19 -06:00