mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-19 16:28:28 +00:00
feat(core): PR-2.5 — post-promote stream redirect + natural-exit registry settle (#3831 follow-up) (#4102)
* feat(core): PR-2.5 — post-promote stream redirect + natural-exit registry settle Closes the two limitations PR-2 (#3894) deferred for the Phase D part (b) Ctrl+B promote flow (#3831): 1. **Post-promote stream redirect**: today the `bg_xxx.output` file is frozen at promote time because `ShellExecutionService` detaches its data listener as part of PR-1's ownership-transfer contract. PR-2.5 wires a caller-side `onPostPromoteData` callback so bytes from the still-running child append to the file via an `fs.createWriteStream` opened in `handlePromotedForeground`. 2. **Natural-exit registry settle**: today the registry entry stays `'running'` until `task_stop` / session-end `abortAll` fires its abort listener. PR-2.5 wires `onPostPromoteSettle` so natural child exit transitions the entry to `'completed'` / `'failed'` with the right exitCode / signal / error message. - New exported types: `ShellExecuteOptions`, `ShellPostPromoteHandlers`, `ShellPostPromoteSettleInfo`. - `execute()` options bag now accepts `postPromote?: { onData, onSettle }`. Threaded through to both `executeWithPty` and `childProcessFallback`. - PTY's `performBackgroundPromote` (line ~1159): after disposing the foreground data + exit + error listeners, RE-ATTACH minimal forwarders that call `postPromote.onData` / `postPromote.onSettle` when the caller opted in. Backwards compat: when `postPromote` is unset the PR-2 detach-everything contract is preserved (the re-attach is gated on each callback being defined). - `childProcessFallback`'s `performBackgroundPromote` (line ~706): same pattern — re-attach `stdout.on('data', ...)`, `stderr.on('data', ...)`, `child.once('exit', ...)`, `child.once('error', ...)` when the caller opted in. `error` listener routes through `onSettle` with `error` populated, so spawn-side errors after the foreground errorHandler detached don't crash the daemon via the default unhandled `'error'` event. - Both paths wrap caller callbacks in try/catch so a thrown handler doesn't crash the child's data loop / unhandled-rejection the service. - New `PromoteArtifacts` type — slots shared between the foreground `execute()` postPromote handlers (which fire on the service side as soon as promote happens) and the post-resolve `handlePromotedForeground` finalizer (which runs after `await resultPromise` returns). The two race; the buffer + settle-queue absorb that race so neither chunks nor the eventual exit info are lost. - `executeForeground` wires `postPromote` handlers that route data to either `promoteArtifacts.stream` (if open) or `promoteArtifacts.buffer` (drained when the stream opens), and queue settle info if the wired handler isn't yet installed. - `handlePromotedForeground` opens `fs.createWriteStream(outputPath, { flags: 'w' })`, writes the initial snapshot first, drains the buffer, then registers the entry and wires `onSettleWired` with the full registry decision table: - `error` set → `registry.fail(shellId, error.message, endTime)` - `exitCode === 0` → `registry.complete(shellId, 0, endTime)` - non-zero exitCode → `registry.fail(shellId, "Exited with code N", endTime)` - signal !== null → `registry.fail(shellId, "Terminated by signal N", endTime)` - all-null fallback → `registry.fail(shellId, "Exited with unknown status", endTime)` - Fires queued settle synchronously after wiring so a fast command that exits between promote and finalizer doesn't get lost. - Self-audit catch: closes the output stream on the `registry.register` throw path so the FD doesn't leak past the orphan-child kill. - 3 new in `shellExecutionService.test.ts`: - `post-promote bytes route to postPromote.onData when callback provided` - `postPromote.onSettle fires on natural child exit after promote` - `backwards compat: without postPromote, listeners stay fully detached` - 3 new in `shell.test.ts` under a `foreground → background promote PR-2.5` describe block: - `post-promote bytes APPEND to bg_xxx.output via write stream` - `natural child exit transitions registry entry to "completed"` - `non-zero exit / signal / error → "failed" with descriptive message` - Bulk-replaced 50 prior `{},` (empty 6th-arg shellExecutionConfig) with `expect.objectContaining({}),` + added `expect.objectContaining({ postPromote: expect.any(Object) }),` as the 7th-arg expectation for the foreground execute call. - Updated the existing `registers a bg_xxx entry on result.promoted` test to assert on `fs.createWriteStream` + `stream.write` instead of the now-removed `fs.writeFileSync` snapshot path. 182/182 shell.test.ts pass + 73/73 shellExecutionService.test.ts pass + 111/111 coreToolScheduler.test.ts pass + 60/60 AppContainer.test.tsx pass; tsc + ESLint clean. Self-audit: 3 rounds (positive / reverse / cross-file) found one issue — output stream FD leak on `registry.register` throw — and fixed it before flagging complete. All flagged edge cases (stream errors, child-exits-before-wire-up race, task_stop during natural- exit window, promote-never-happens cleanup, backwards compat without callbacks) have explicit handling and / or test pinning. * fix(core): #4102 review wave — 3 Critical + UTF-8 + tests 3 Critical race/correctness issues + 1 multibyte-corruption suggestion + 3 test coverage gaps addressed: **Critical 1 — child_process late-chunk drop (service)** Settle was fired on 'exit', but stdout/stderr can emit buffered data between 'exit' and 'close'. Late chunks landed in `promoteArtifacts.buffer` after shell.ts had already closed the stream + transitioned the registry → silently dropped → truncated `bg_xxx.output`. Switched to listening on 'close' which guarantees all stdio is fully drained. (code, signal) payload is identical to 'exit', just with proper ordering. **Critical 2 — stream-flush wait before registry transition (shell)** `stream.end()` is asynchronous; pending writes can still be in the libuv queue when it returns. The old code transitioned the registry immediately after `.end()`, so a /tasks consumer could observe a `completed` entry and read the output file BEFORE the trailing bytes were on disk. Fixed: wired settle now `stream.once('finish', ...)` BEFORE calling `registry.complete/fail`. `error` event also short-circuits to the transition so a late ENOSPC doesn't hang the settle path forever. **Critical 3 — stream-open-fail buffer leak (shell)** If `fs.createWriteStream` threw, the catch path set `stream = null` but the foreground `onData` handler would still take the `stream === null` branch and push chunks into `promoteArtifacts.buffer` — unbounded growth under a sustained child whose output file couldn't be opened. Added a `streamFailed: boolean` latch on `PromoteArtifacts`. When set, `onData` drops chunks (with a debug log) instead of buffering. The catch branch sets the latch. **Suggestion — shared TextDecoder corrupts multibyte UTF-8 (service)** child_process post-promote used ONE TextDecoder for both stdout AND stderr. The decoder's continuation-byte state machine assumes one byte source; interleaved multibyte chunks corrupted. Now uses separate decoders + flushes both with `decode()` (no `stream: true`) on settle so trailing bytes surface as their final characters. **Suggestion — llmContent reflects already-settled status (shell)** When the queued-settle drain transitions the registry synchronously (fast-exit race), the model-facing copy was still saying "Status: running. … task_stop({...})". Updated to branch on `postPromoteAlreadySettled` / `postPromoteFinalStatus` — when the process is already gone, the copy says "Status: completed/failed" and replaces the `task_stop` suggestion with "Process has already exited; no `task_stop` needed". **Suggestion — test coverage gaps** Added: (a) `queued-settle race: onSettle BEFORE handlePromotedForeground completes` — custom service impl fires onSettle synchronously before resolving the promote promise, pins the drain path. (b) child_process post-promote tests for stdout/stderr forwarding + 'close'-not-'exit' settle + spawn-error settle. **Self-audit**: Round 1 + reverse audit. Stream.once mock added to fire 'finish' synchronously so existing tests don't hang on the new flush wait. 76/76 shellExecutionService.test.ts (+3) + 183/183 shell.test.ts (+1) pass; tsc + ESLint clean. * fix(core): #4102 review wave-2 — 3 more C1 (shell.ts:2227): the WriteStream `'error'` event handler only logged. `fs.createWriteStream` reports common open failures (ENOENT / EACCES / ENOSPC) asynchronously via that event rather than throwing. Result: `promoteArtifacts.stream` kept pointing at the failed stream; `onSettleWired` attached a `.once('finish')` listener that would never fire → registry stuck on `running` forever. Latch the failure (null the shared `stream` slot, set `streamFailed`); `onSettleWired`'s existing `if (!stream)` branch then transitions the registry immediately. C2 (shellExecutionService.ts:1468): the promote handoff removes the foreground `ptyErrorHandler` and only re-attaches data + exit listeners. A subsequent PTY `error` event had no listener — Node treats an unhandled `error` from an EventEmitter as a fatal exception that takes the whole CLI down. Attach a post-promote forwarder that ignores expected PTY read-exit codes (EIO / EAGAIN, same filter the foreground handler uses) and routes unexpected errors through `postPromote.onSettle` with `error` populated. Single-fire latch shared with `onExit` so settle never fires twice. C3 (shell.ts:2503): `onSettleWired` waits for the stream's asynchronous `'finish'` event before flipping `postPromoteAlreadySettled`, but the model-facing `statusLine` was built immediately after invoking `onSettleWired` on the queued settle. A fast-exited promoted command could therefore land "Status: running" + a `task_stop` instruction in production even though settle was already observed. Split into two flags: `postPromoteSettleObserved` (set synchronously when settle is classified) drives the model copy; the registry transition stays behind the stream flush. Tests: +1 PR-2.5 wave-2 PTY error-routing test; +2 shell.ts tests (stream open async error → registry still transitions; async `'finish'` after queued-settle drain → llmContent says 'completed' before registry transition fires). * fix(core): #4102 review wave-3 — 4 actionable T2 (shell.ts:2456) — Critical buffer-leak race `onSettleWired` previously set `promoteArtifacts.stream = null` BEFORE calling `stream.end()`. Any `postPromote.onData` chunk that landed between that null assignment and the actual flush completing saw `stream === null && streamFailed === false` and pushed into `promoteArtifacts.buffer` — a buffer that has no further drain path (the foreground finalizer has already returned). Result: chunks stranded indefinitely; PTY mode in particular hits this because `onExit` can fire while kernel buffers still hold data. Fix drains the pre-settle buffer to the stream BEFORE nulling AND latches `streamFailed = true` so any subsequent chunk drops via the existing `else if (streamFailed)` arm in `onData` instead of leaking. Updates the `streamFailed` doc to cover both setters (open-fail and settle-done) so the dual semantic is explicit. T3 (shell.ts:2262) — silent chunk-drop in catch path When `fs.createWriteStream` throws synchronously (rare: ENOENT on a vanished tmpdir), chunks already in `promoteArtifacts.buffer` were silently lost with no observability — oncall reading a truncated `bg_xxx.output` had no way to distinguish "stream open failed" from "child produced nothing." Logs the dropped chunk count and empties the buffer. T5 (shell.ts:2443) — opaque all-null fallback The "Exited with unknown status" fallback fired the registry to 'failed' without any context about which fields were null. This branch is meant to be unreachable; hitting it indicates the service emitted a defective settle info object. Includes the field values in both the fail message and a warn log so the oncall engineer can tell this path apart from the other "failed" branches. T6 (shellExecutionService.ts:1452) — leaked PTY post-promote listeners `ptyProcess.onData(...)` returns an `IDisposable` that was being discarded; same for `onExit`. The `'error'` listener function was also not captured (no way to `removeListener` it). EventEmitter holds refs to listener closures, which transitively hold refs to `onPostData` / `onPostSettle` / the caller's `promoteArtifacts`. While bounded by the PTY's lifetime, the closures keep the caller's state pinned for the post-settle delay window. Captures all three handles into `postPromoteDataDisposable` / `postPromoteExitDisposable` / `postPromoteErrorListener`, then releases them via a shared `disposePostPromoteListeners()` call from `firePostSettle` (idempotent — each slot null-checked and nulled after disposal). Tests: +1 service test for IDisposable + error-listener cleanup; +2 shell.ts tests for buffer drain race and catch-path snapshot fallback. Existing tests stay green (262 → 265 in the touched suites; 7819 → 7822 across the core package). * fix(core/test): drop unused 'registry' in wave-3 T2 test (TS6133) CI build failed across all platforms with src/tools/shell.test.ts(4395,15): error TS6133. The variable was a leftover from copying the queued-settle test pattern; the wave-3 T2 test inspects writeStreamMock.write call history directly and never reads the registry, so the assignment is dead code. Drop it. * fix(core): #4102 review wave-4 — 6 actionable T1 (Critical, shellExecutionService.ts:860 child_process onSettle exactly-once) The PTY path used a `firePostSettle` latch but child_process wired `close` and `error` independently to `onPostSettle`. A spawn-side error followed by Node's auto-emitted `'close'` would call the caller's settle TWICE, racing the registry transition. Added the same single-fire latch on the child_process path. T2 (Critical, shell.ts:2264 handoff race reorder) Original order was `write(snapshot) -> drain buffer -> assign stream`. Synchronous today (no race in current code), but assign-after-drain leaves a hazard for any future refactor that adds an `await` inside the drain loop — a chunk arriving in that window would land in `promoteArtifacts.buffer`, then post-assign chunks would write to the stream first, producing out-of-order bytes until the settle drain. Reordered to `write(snapshot) -> assign stream -> drain buffer`, which closes the hazard regardless of future async additions. T3 (Suggestion, shellExecutionService.ts:816 decoder flush gated on onSettle) The trailing-multibyte flush ran inside the `child.once('close', ...)` handler, which was only installed when `onSettle` was set. An `onData`-only caller (no onSettle) lost trailing continuation bytes silently. Hoisted flush into `flushPostPromoteDecoders` called from `firePostSettle`, and made `firePostSettle` available on the `'close'` path independent of onSettle (T6 install). T4 (Suggestion, shell.ts:1700 promoted ANSI passthrough) The regular `executeBackground` path strips ANSI before writing to `bg_xxx.output`; the promoted-foreground onData path appended raw chunks. Reading `bg_xxx.output` after Ctrl+B showed plain text up to the snapshot then raw `\x1b[31m` / cursor-move / clear-screen sequences for the post-promote tail — unreadable. Apply `stripAnsi(rawChunk)` before write/buffer, matching the executeBackground contract. T5 (Suggestion, shellExecutionService.ts:786 UTF-8 hardcoded) The post-promote child_process decoders were hard-coded to `new TextDecoder('utf-8')`, but the foreground decoder runs encoding detection via `getCachedEncodingForBuffer`. On a non-UTF-8 child (e.g. GBK on a Chinese Windows shell), the snapshot decoded correctly but the post-promote tail was mojibake. Capture the foreground decoder's `.encoding` property and reuse it for post-promote (with utf-8 fallback if foreground hadn't seen any bytes yet, and a try/catch around `new TextDecoder` for the rare unsupported-encoding case). T6 (Suggestion, shellExecutionService.ts:1540 `error` listener gated on onSettle) The post-promote `error` listener was attached only when `onSettle` was set. An `onData`-only caller still had the foreground errorHandler detached; a post-promote spawn error would then crash the CLI via Node's unhandled-error default. Hoisted the close + error listeners into `if (postPromote)` so any caller opting into post-promote gets crash protection; if `onSettle` is absent the listeners log + drop instead of routing. T7 (Suggestion, shellExecutionService.ts:791 onSettle-only pipe-block deadlock) Same root cause as T6: when only `onSettle` is set, the foreground `stdout`/`stderr` 'data' listeners are detached and no post-promote listener replaces them. The Readables stay paused, the OS pipe buffer fills (~64KB on Linux), the child blocks on `stdout.write`, 'close' never fires, onSettle never fires. Added `child.stdout?.resume()` and `child.stderr?.resume()` in the no-onData branch so the child can drain its pipes and reach exit. T8 (Suggestion, shell.ts:2614 dead inspectLine ternary) `inspectLine`'s ternary returned the same string on both sides — copy-paste leftover from when the other two adjacent ternaries (statusLine / stopLine) were correctly varied. Collapsed to a single string assignment. Tests: +5 regression tests (4 child_process: T1 double-fire latch, T3 onData-only flush, T6 onData-only error survives, T7 onSettle- only resume; +1 shell.ts: T4 ANSI strip). 265 -> 270 in the touched suites; 7822 -> 7827 across the core package; full suite green. * fix(core/test): use ShellOutputEvent type in wave-4 onData callbacks (TS2345) CI lint failed on the wave-4 (T3 / T6) tests with TS2345: pushing ShellOutputEvent into Array<{type:string;chunk:unknown}> narrows incompatibly. Switch to ShellOutputEvent[] (matches earlier helpers at lines 758/966) and discriminate the union via .type === 'data' when reading .chunk so the narrowed multibyte assertion still type-checks. * fix(core): address PR #4102 review — PTY error guard, flush timeout, diagnostic marker, failed-settle test - Move PTY post-promote error listener from `if (postPromote?.onSettle)` to `if (postPromote)` to match child_process path and prevent unhandled error crashes for onData-only callers - Add 10s flush timeout in onSettleWired so stalled streams don't leave registry entries stuck on 'running' forever - Append diagnostic marker to output file on stream error so truncation is visible without debug logging - Add queued-settle test with exitCode:1 asserting 'Status: failed.' in llmContent * fix(core): address PR #4102 review — align PTY/child_process guards, add flush timeout, diagnostic marker, and tests - Widen PTY post-promote onExit + error listener guard from `if (postPromote?.onSettle)` to `if (postPromote)` to match child_process path — prevents unhandled error crash and listener leak for onData-only callers - Add 10s flush timeout in onSettleWired so stalled streams don't leave registry entries stuck on 'running' indefinitely - Append diagnostic marker to output file on stream error so truncation is visible without debug logging - Remove model name references from code comments - Add tests: PTY onData-only error/exit, flush timeout fallback, appendFileSync diagnostic marker, queued-settle with failed exit code * fix(core): address PR #4102 review round 2 — listener cleanup, rename, constant hoist - Fix expect.objectContaining({}) misused as runtime arg in 2 execute() call sites - Add child_process post-promote stdout/stderr listener cleanup in firePostSettle - Rename streamFailed → streamClosed to reflect its overloaded semantics - Hoist FLUSH_TIMEOUT_MS to module-level PROMOTE_FLUSH_TIMEOUT_MS constant - Fix dangling FLUSH_TIMEOUT_MS reference (was undefined at runtime) - Add Windows note to streams pause/resume comment - Document PTY onData dispose-before-settle as known limitation
This commit is contained in:
parent
cc32ef2ff9
commit
0240c310fd
4 changed files with 2490 additions and 81 deletions
|
|
@ -19,7 +19,9 @@ import { type ChildProcess } from 'node:child_process';
|
|||
import pkg from '@xterm/headless';
|
||||
import type {
|
||||
ShellAbortReason,
|
||||
ShellExecuteOptions,
|
||||
ShellOutputEvent,
|
||||
ShellPostPromoteSettleInfo,
|
||||
} from './shellExecutionService.js';
|
||||
import {
|
||||
getShellAbortReasonKind,
|
||||
|
|
@ -266,6 +268,7 @@ describe('ShellExecutionService', () => {
|
|||
ac: AbortController,
|
||||
) => void,
|
||||
config = shellExecutionConfig,
|
||||
options: ShellExecuteOptions = {},
|
||||
) => {
|
||||
const abortController = new AbortController();
|
||||
const handle = await ShellExecutionService.execute(
|
||||
|
|
@ -275,6 +278,7 @@ describe('ShellExecutionService', () => {
|
|||
abortController.signal,
|
||||
true,
|
||||
config,
|
||||
options,
|
||||
);
|
||||
|
||||
await new Promise((resolve) => process.nextTick(resolve));
|
||||
|
|
@ -745,6 +749,283 @@ describe('ShellExecutionService', () => {
|
|||
expect(exitDisposableStub.dispose).toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('PR-2.5: post-promote bytes route to postPromote.onData when callback provided', async () => {
|
||||
// Pin the new opt-in contract: when `postPromote.onData` is set,
|
||||
// bytes the still-running PTY emits after promote go to the
|
||||
// caller's handler instead of being lost. PR-2 fully detached
|
||||
// listeners; PR-2.5 re-attaches a minimal forwarder when the
|
||||
// caller opts in.
|
||||
const onDataCalls: ShellOutputEvent[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'tail -f /tmp/never.log',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pr25_data',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
shellExecutionConfig,
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => onDataCalls.push(event),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// After promote, drive a fresh post-promote chunk through the
|
||||
// PTY's onData. The service should have attached a NEW listener
|
||||
// (the foreground one is disposed); look at the latest
|
||||
// mock.calls entry — index 1 since PR-2.5 adds a second.
|
||||
const onDataRegistrations = mockPtyProcess.onData.mock.calls;
|
||||
expect(onDataRegistrations.length).toBeGreaterThanOrEqual(2);
|
||||
const postPromoteHandler =
|
||||
onDataRegistrations[onDataRegistrations.length - 1][0];
|
||||
postPromoteHandler('post-promote-byte-stream');
|
||||
expect(onDataCalls).toEqual([
|
||||
{ type: 'data', chunk: 'post-promote-byte-stream' },
|
||||
]);
|
||||
});
|
||||
|
||||
it('PR-2.5: postPromote.onSettle fires on natural child exit after promote', async () => {
|
||||
// Pin the natural-exit settle: when the child terminates AFTER
|
||||
// promote, the caller's onSettle handler is invoked exactly
|
||||
// once with the exit code (or signal / error). PR-2 detached
|
||||
// the exit listener entirely; PR-2.5 re-attaches a forwarder
|
||||
// when the caller opts in.
|
||||
const settleCalls: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'long-running-command',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pr25_settle',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
shellExecutionConfig,
|
||||
{
|
||||
postPromote: {
|
||||
onSettle: (info) => settleCalls.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// After promote, drive the PTY's onExit to simulate natural
|
||||
// completion. The service attaches a new exit listener for
|
||||
// post-promote settle — find the most-recently-registered.
|
||||
const onExitRegistrations = mockPtyProcess.onExit.mock.calls;
|
||||
expect(onExitRegistrations.length).toBeGreaterThanOrEqual(2);
|
||||
const postPromoteExitHandler =
|
||||
onExitRegistrations[onExitRegistrations.length - 1][0];
|
||||
postPromoteExitHandler({ exitCode: 0, signal: undefined });
|
||||
expect(settleCalls).toHaveLength(1);
|
||||
expect(settleCalls[0].exitCode).toBe(0);
|
||||
expect(settleCalls[0].signal).toBeNull();
|
||||
expect(settleCalls[0].error).toBeUndefined();
|
||||
expect(typeof settleCalls[0].endTime).toBe('number');
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-2 (C2): unexpected post-promote PTY error routes to onSettle as failure (does NOT crash the CLI)', async () => {
|
||||
// Foreground PTY error handler removed at promote handoff. Before
|
||||
// the wave-2 fix the post-promote path attached NO error listener,
|
||||
// so an unhandled `error` event would take Node down. Now we
|
||||
// attach a forwarder: unexpected errors flow through onSettle
|
||||
// with `error` populated; expected PTY read-exit errors
|
||||
// (EIO / EAGAIN) are filtered.
|
||||
const settleCalls: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'long-running-with-error',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pr25_pty_err',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
shellExecutionConfig,
|
||||
{
|
||||
postPromote: {
|
||||
onSettle: (info) => settleCalls.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
|
||||
// 1. An expected PTY read-exit error (EIO) is FILTERED — onSettle
|
||||
// is NOT invoked yet (the upcoming onExit will carry status).
|
||||
mockPtyProcess.emit(
|
||||
'error',
|
||||
Object.assign(new Error('read EIO'), { code: 'EIO' }),
|
||||
);
|
||||
expect(settleCalls).toHaveLength(0);
|
||||
|
||||
// 2. An UNEXPECTED error (EPIPE) routes to onSettle as a failure.
|
||||
// Critically: emitting must NOT throw (no unhandled `error`).
|
||||
const unexpectedErr = Object.assign(new Error('disk gone'), {
|
||||
code: 'EPIPE',
|
||||
});
|
||||
expect(() => mockPtyProcess.emit('error', unexpectedErr)).not.toThrow();
|
||||
expect(settleCalls).toHaveLength(1);
|
||||
expect(settleCalls[0].error).toBe(unexpectedErr);
|
||||
expect(settleCalls[0].exitCode).toBeNull();
|
||||
expect(settleCalls[0].signal).toBeNull();
|
||||
expect(typeof settleCalls[0].endTime).toBe('number');
|
||||
|
||||
// 3. A subsequent onExit MUST NOT fire onSettle again (single-fire
|
||||
// latch): callers like the registry's `complete`/`fail`
|
||||
// transitions are not idempotent across status types.
|
||||
const onExitRegistrations = mockPtyProcess.onExit.mock.calls;
|
||||
const postPromoteExitHandler =
|
||||
onExitRegistrations[onExitRegistrations.length - 1][0];
|
||||
postPromoteExitHandler({ exitCode: 0, signal: undefined });
|
||||
expect(settleCalls).toHaveLength(1);
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-3 (T6): post-promote IDisposables and error listener are released on settle (no GC roots dangling)', async () => {
|
||||
// Each promoted PTY child can sit dead for milliseconds while
|
||||
// the caller's `cancelChild` finalizes. Node's EventEmitter
|
||||
// holds refs to listener closures, which in turn hold refs to
|
||||
// `onPostData` / `onPostSettle` / the caller's
|
||||
// `promoteArtifacts`. Without disposal on settle, those refs
|
||||
// dangle until the PTY itself is collected. The fix captures
|
||||
// the IDisposables returned by `onData` / `onExit` AND the
|
||||
// `'error'` listener function we registered on the EE, then
|
||||
// releases them when `firePostSettle` fires (no matter which
|
||||
// path triggers settle).
|
||||
const removeListenerSpy = vi.spyOn(mockPtyProcess, 'removeListener');
|
||||
|
||||
const settleCalls: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'long-running-disposable',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pr25_dispose',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
shellExecutionConfig,
|
||||
{
|
||||
postPromote: {
|
||||
onData: () => {},
|
||||
onSettle: (info) => settleCalls.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
|
||||
// The mocked `mockReturnValue({ dispose: vi.fn() })` reuses the
|
||||
// SAME disposable object across calls, so foreground +
|
||||
// post-promote share the same dispose Mock. The foreground
|
||||
// disposable was already disposed at promote handoff; clear
|
||||
// the call history so we can assert ONLY on post-settle
|
||||
// disposal.
|
||||
const sharedDataDisposable = mockPtyProcess.onData.mock.results[0]
|
||||
.value as { dispose: Mock };
|
||||
const sharedExitDisposable = mockPtyProcess.onExit.mock.results[0]
|
||||
.value as { dispose: Mock };
|
||||
sharedDataDisposable.dispose.mockClear();
|
||||
sharedExitDisposable.dispose.mockClear();
|
||||
removeListenerSpy.mockClear();
|
||||
|
||||
// Drive onExit → firePostSettle runs disposePostPromoteListeners.
|
||||
const onExitRegistrations = mockPtyProcess.onExit.mock.calls;
|
||||
const postPromoteExitHandler =
|
||||
onExitRegistrations[onExitRegistrations.length - 1][0];
|
||||
postPromoteExitHandler({ exitCode: 0, signal: undefined });
|
||||
|
||||
expect(settleCalls).toHaveLength(1);
|
||||
// Post-settle: BOTH disposables released, error listener removed.
|
||||
expect(sharedDataDisposable.dispose).toHaveBeenCalledTimes(1);
|
||||
expect(sharedExitDisposable.dispose).toHaveBeenCalledTimes(1);
|
||||
// The post-promote error listener was attached via
|
||||
// `ptyProcess.on('error', listener)` and is released via
|
||||
// `removeListener('error', listener)`. Verify removeListener
|
||||
// was called on the 'error' channel.
|
||||
const errorRemoves = removeListenerSpy.mock.calls.filter(
|
||||
(args: unknown[]) => args[0] === 'error',
|
||||
);
|
||||
expect(errorRemoves.length).toBeGreaterThanOrEqual(1);
|
||||
|
||||
// Re-driving onExit must NOT re-fire settle (latched) AND
|
||||
// dispose calls must NOT double-count (idempotent disposal —
|
||||
// disposePostPromoteListeners nulls the slots after first
|
||||
// disposal).
|
||||
postPromoteExitHandler({ exitCode: 0, signal: undefined });
|
||||
expect(settleCalls).toHaveLength(1);
|
||||
expect(sharedDataDisposable.dispose).toHaveBeenCalledTimes(1);
|
||||
expect(sharedExitDisposable.dispose).toHaveBeenCalledTimes(1);
|
||||
|
||||
removeListenerSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('PR-2.5: onData-only PTY caller has post-promote error + exit listeners (no crash, listeners disposed on exit)', async () => {
|
||||
const dataChunks: ShellOutputEvent[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'tail -f /dev/null',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pty_ondata_only',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
shellExecutionConfig,
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => dataChunks.push(event),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
|
||||
// Error listener must be installed even without onSettle —
|
||||
// emitting 'error' on an EventEmitter with no listener throws.
|
||||
expect(() =>
|
||||
mockPtyProcess.emit('error', new Error('post-promote pty err')),
|
||||
).not.toThrow();
|
||||
|
||||
// onExit must also be installed so disposePostPromoteListeners
|
||||
// runs on natural exit (cleaning up data + error listeners).
|
||||
const onExitRegistrations = mockPtyProcess.onExit.mock.calls;
|
||||
expect(onExitRegistrations.length).toBeGreaterThanOrEqual(2);
|
||||
const postPromoteExitHandler =
|
||||
onExitRegistrations[onExitRegistrations.length - 1][0];
|
||||
|
||||
// Simulate natural exit — should dispose listeners without crash.
|
||||
postPromoteExitHandler({ exitCode: 0 });
|
||||
});
|
||||
|
||||
it('PR-2.5 backwards compat: without postPromote, listeners stay fully detached (no regression on PR-2 contract)', async () => {
|
||||
// Pin that omitting `postPromote` preserves the PR-2 detach-
|
||||
// everything contract. The pre-existing post-promote test at
|
||||
// line ~680 already covers this for the data path; this one
|
||||
// adds the symmetric guarantee for the exit path — natural
|
||||
// post-promote exit must NOT invoke any callback the caller
|
||||
// didn't provide.
|
||||
const onDataCalls: ShellOutputEvent[] = [];
|
||||
const onSettleCalls: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'no-post-promote-handlers',
|
||||
(pty, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_pr25_compat',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
// No options arg → postPromote unset → PR-2 contract.
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// Drive both PTY events post-promote.
|
||||
const onDataRegistrations = mockPtyProcess.onData.mock.calls;
|
||||
// PR-2 contract: only ONE onData registration (the foreground
|
||||
// one, now disposed). PR-2.5's re-attach is gated on
|
||||
// `postPromote.onData` being set, so without it the
|
||||
// registration count stays at 1.
|
||||
expect(onDataRegistrations.length).toBe(1);
|
||||
const onExitRegistrations = mockPtyProcess.onExit.mock.calls;
|
||||
expect(onExitRegistrations.length).toBe(1);
|
||||
// Caller-provided handlers were never invoked.
|
||||
expect(onDataCalls).toHaveLength(0);
|
||||
expect(onSettleCalls).toHaveLength(0);
|
||||
});
|
||||
|
||||
it('post-exit race: PTY background-promote refuses if process.kill(pid, 0) reports the pid is gone', async () => {
|
||||
// Mirror of the child_process post-exit race test. The PTY may
|
||||
// have already exited but our `exitDisposable` (onExit) handler
|
||||
|
|
@ -1210,6 +1491,7 @@ describe('ShellExecutionService child_process fallback', () => {
|
|||
const simulateExecution = async (
|
||||
command: string,
|
||||
simulation: (cp: typeof mockChildProcess, ac: AbortController) => void,
|
||||
options: ShellExecuteOptions = {},
|
||||
) => {
|
||||
const abortController = new AbortController();
|
||||
const handle = await ShellExecutionService.execute(
|
||||
|
|
@ -1219,6 +1501,7 @@ describe('ShellExecutionService child_process fallback', () => {
|
|||
abortController.signal,
|
||||
true,
|
||||
shellExecutionConfig,
|
||||
options,
|
||||
);
|
||||
|
||||
await new Promise((resolve) => process.nextTick(resolve));
|
||||
|
|
@ -1556,6 +1839,257 @@ describe('ShellExecutionService child_process fallback', () => {
|
|||
expect(result.signal).toBeNull();
|
||||
});
|
||||
|
||||
it('PR-2.5 child_process: post-promote stdout/stderr forward to postPromote.onData with SEPARATE decoders', async () => {
|
||||
// Pin: post-promote bytes from the still-running child route to
|
||||
// the caller's onData handler. Separate decoders for stdout vs
|
||||
// stderr — a single shared decoder would corrupt interleaved
|
||||
// multibyte UTF-8 (the continuation-byte state machine assumes
|
||||
// one byte source).
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const events: Array<{ type: string; chunk?: string | unknown }> = [];
|
||||
const { result } = await simulateExecution(
|
||||
'tail -f',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_data',
|
||||
} satisfies ShellAbortReason);
|
||||
// Drive post-promote chunks — should now flow to onData.
|
||||
cp.stdout?.emit('data', Buffer.from('post-promote-stdout\n'));
|
||||
cp.stderr?.emit('data', Buffer.from('post-promote-stderr\n'));
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => events.push(event),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// Both streams forwarded.
|
||||
const dataChunks = events
|
||||
.filter((e) => e.type === 'data')
|
||||
.map((e) => e.chunk);
|
||||
expect(dataChunks).toContain('post-promote-stdout\n');
|
||||
expect(dataChunks).toContain('post-promote-stderr\n');
|
||||
});
|
||||
|
||||
it('PR-2.5 child_process: onSettle fires on `close` (NOT `exit`) so late chunks land before the registry transitions', async () => {
|
||||
// Pin the `close`-not-`exit` contract: child can emit buffered
|
||||
// data AFTER 'exit' but BEFORE 'close'. If onSettle fired on
|
||||
// 'exit' the caller would close the output stream + transition
|
||||
// the registry while late chunks were still in flight — they'd
|
||||
// hit a closed stream and be dropped, producing truncated logs.
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const events: Array<{ type: string; chunk?: string | unknown }> = [];
|
||||
const settles: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_close',
|
||||
} satisfies ShellAbortReason);
|
||||
// Order matters: emit 'exit' first (this would have settled
|
||||
// PR-1 of PR-2.5 too early), then a final stdout chunk, then
|
||||
// 'close'. With the new contract, onSettle only fires on
|
||||
// 'close' so the late chunk is captured.
|
||||
cp.emit('exit', 0, null);
|
||||
cp.stdout?.emit('data', Buffer.from('late-chunk\n'));
|
||||
cp.emit('close', 0, null);
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => events.push(event),
|
||||
onSettle: (info) => settles.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// Late chunk made it through.
|
||||
const dataChunks = events
|
||||
.filter((e) => e.type === 'data')
|
||||
.map((e) => e.chunk);
|
||||
expect(dataChunks).toContain('late-chunk\n');
|
||||
// onSettle fired exactly once with exitCode 0.
|
||||
expect(settles).toHaveLength(1);
|
||||
expect(settles[0].exitCode).toBe(0);
|
||||
expect(settles[0].signal).toBeNull();
|
||||
});
|
||||
|
||||
it('PR-2.5 child_process: post-promote spawn error routes to onSettle with error populated', async () => {
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const settles: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_err',
|
||||
} satisfies ShellAbortReason);
|
||||
cp.emit('error', new Error('post-promote spawn boom'));
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onSettle: (info) => settles.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
expect(settles).toHaveLength(1);
|
||||
expect(settles[0].error?.message).toBe('post-promote spawn boom');
|
||||
expect(settles[0].exitCode).toBeNull();
|
||||
expect(settles[0].signal).toBeNull();
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-4 (T1): post-promote `error` followed by `close` fires onSettle EXACTLY ONCE', async () => {
|
||||
// Regression for the double-fire bug: pre-fix, `child.once('close', ...)`
|
||||
// and `child.once('error', ...)` were independent and each invoked
|
||||
// `onPostSettle` directly. A spawn-side error followed by the
|
||||
// child-process automatic 'close' event would call the caller's
|
||||
// settle twice, violating the exactly-once contract and racing
|
||||
// the caller's `transitionRegistry`. Fix wraps both branches in
|
||||
// a `firePostSettle` latch (mirroring the PTY path).
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const settles: ShellPostPromoteSettleInfo[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_double',
|
||||
} satisfies ShellAbortReason);
|
||||
// First: error fires.
|
||||
cp.emit('error', new Error('error first'));
|
||||
// Then: close (Node child_process always emits 'close' even
|
||||
// after an error). Pre-fix this would call onSettle a second
|
||||
// time.
|
||||
cp.emit('close', 1, null);
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onSettle: (info) => settles.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
expect(settles).toHaveLength(1);
|
||||
expect(settles[0].error?.message).toBe('error first');
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-4 (T3): onData-only caller still gets decoder flush on close (no trailing multibyte loss)', async () => {
|
||||
// T3 regression: the close handler used to be installed only
|
||||
// when `onSettle` was set, so an `onData`-only caller never got
|
||||
// the trailing-multibyte flush — a UTF-8 character split across
|
||||
// chunks could vanish. Fix installs close whenever ANY
|
||||
// postPromote handler is set, and the flush helper runs whenever
|
||||
// onData is set independent of onSettle.
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const dataChunks: ShellOutputEvent[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_t3',
|
||||
} satisfies ShellAbortReason);
|
||||
// Push the FIRST byte of a 3-byte UTF-8 char (€ = 0xE2 0x82 0xAC).
|
||||
// Without flush, the trailing two bytes would be stuck in the
|
||||
// decoder's continuation state and lost.
|
||||
cp.stdout?.emit('data', Buffer.from([0xe2]));
|
||||
cp.stdout?.emit('data', Buffer.from([0x82, 0xac]));
|
||||
// Trigger close so the flush runs; no onSettle to gate on.
|
||||
cp.emit('close', 0, null);
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => dataChunks.push(event),
|
||||
// NO onSettle — close handler must still fire flush.
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
// The € character should appear once the second chunk completes
|
||||
// the multibyte sequence; flush at close ensures any remainder
|
||||
// is surfaced.
|
||||
const joined = dataChunks
|
||||
.map((d) =>
|
||||
d.type === 'data' && typeof d.chunk === 'string' ? d.chunk : '',
|
||||
)
|
||||
.join('');
|
||||
expect(joined).toContain('€');
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-4 (T6): onData-only caller has post-promote `error` listener (does not crash CLI)', async () => {
|
||||
// T6 regression: `child.once('error', ...)` install was gated
|
||||
// on `onSettle`, so an `onData`-only caller had the foreground
|
||||
// errorHandler detached at promote with no replacement — a
|
||||
// post-promote spawn error would surface as Node's default
|
||||
// unhandled-error crash. Fix attaches an error listener
|
||||
// whenever ANY postPromote handler is set.
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const dataChunks: ShellOutputEvent[] = [];
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_t6',
|
||||
} satisfies ShellAbortReason);
|
||||
// Emitting 'error' on an EventEmitter with no listener throws
|
||||
// synchronously. With the fix, our listener is attached so
|
||||
// the emit does not throw.
|
||||
expect(() =>
|
||||
cp.emit('error', new Error('post-promote err')),
|
||||
).not.toThrow();
|
||||
// child_process auto-emits 'close' after 'error'.
|
||||
cp.emit('close', null, null);
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
onData: (event) => dataChunks.push(event),
|
||||
// NO onSettle — but error must still be handled (no crash).
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
});
|
||||
|
||||
it('PR-2.5 wave-4 (T7): onSettle-only caller has stdout/stderr resumed (child does not block on full pipes)', async () => {
|
||||
// T7 regression: when `onSettle` is set but `onData` is NOT, the
|
||||
// post-promote path used to leave stdout/stderr without any data
|
||||
// listener. The Readables stay paused; the OS pipe buffer fills
|
||||
// (~64KB on Linux); the child blocks on stdout.write; 'close'
|
||||
// never fires; onSettle never fires. Fix calls .resume() on
|
||||
// both streams in the no-onData branch so the child can drain.
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const settles: ShellPostPromoteSettleInfo[] = [];
|
||||
const stdoutResumeSpy = vi.fn();
|
||||
const stderrResumeSpy = vi.fn();
|
||||
const { result } = await simulateExecution(
|
||||
'cmd',
|
||||
(cp, ac) => {
|
||||
// Patch resume() so we can verify the wire was driven.
|
||||
if (cp.stdout) cp.stdout.resume = stdoutResumeSpy;
|
||||
if (cp.stderr) cp.stderr.resume = stderrResumeSpy;
|
||||
ac.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_cp_t7',
|
||||
} satisfies ShellAbortReason);
|
||||
cp.emit('close', 0, null);
|
||||
},
|
||||
{
|
||||
postPromote: {
|
||||
// NO onData — but stdout/stderr must still be resumed.
|
||||
onSettle: (info) => settles.push(info),
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(result.promoted).toBe(true);
|
||||
expect(stdoutResumeSpy).toHaveBeenCalled();
|
||||
expect(stderrResumeSpy).toHaveBeenCalled();
|
||||
expect(settles).toHaveLength(1);
|
||||
});
|
||||
|
||||
it('should gracefully attempt SIGKILL on linux if SIGTERM fails', async () => {
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
vi.useFakeTimers();
|
||||
|
|
|
|||
|
|
@ -177,6 +177,63 @@ export interface ShellExecutionConfig {
|
|||
disableDynamicLineTrimming?: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Optional caller-side handlers for the *post-promote* lifetime of a
|
||||
* background-promoted child process. PR-2 (#3894) detached every
|
||||
* service-side listener at promote time and froze `result.output` at
|
||||
* the snapshot; without these hooks the still-running child's bytes
|
||||
* are lost and the registry entry stays `'running'` until `task_stop`
|
||||
* / session-end cleanup. PR-2.5 (#3831 follow-up) wires shell.ts to
|
||||
* pass these so promoted shells behave like regular background shells:
|
||||
* bytes append to `bg_xxx.output` and the entry transitions to
|
||||
* `'completed'` / `'failed'` on natural child exit.
|
||||
*
|
||||
* Backwards compat: if `postPromote` is unset on the options bag the
|
||||
* service falls back to the PR-2 detach-everything contract — no
|
||||
* regressions for callers that don't opt in.
|
||||
*/
|
||||
export interface ShellPostPromoteHandlers {
|
||||
/**
|
||||
* Fired for each output chunk the still-running child produces
|
||||
* AFTER `result.promoted` resolves. Same `ShellOutputEvent` shape
|
||||
* the foreground stream uses so callers can reuse rendering logic;
|
||||
* `binary_detected` / `binary_progress` are NOT re-emitted (those
|
||||
* decisions were made pre-promote against the same byte stream).
|
||||
*/
|
||||
onData?: (event: ShellOutputEvent) => void;
|
||||
/**
|
||||
* Fired exactly once when the post-promote child settles — natural
|
||||
* exit (`exitCode` set, `signal: null`), signal kill (`exitCode:
|
||||
* null`, `signal` set), or spawn-side error (`error` set). NOT
|
||||
* fired for the promote-time resolve itself (that's the
|
||||
* `result.promoted` Promise resolution). Callers wire this to the
|
||||
* registry's `complete` / `fail` transitions.
|
||||
*/
|
||||
onSettle?: (info: ShellPostPromoteSettleInfo) => void;
|
||||
}
|
||||
|
||||
export interface ShellPostPromoteSettleInfo {
|
||||
exitCode: number | null;
|
||||
signal: number | NodeJS.Signals | null;
|
||||
error?: Error;
|
||||
/** `Date.now()` at the moment the service observed the exit/error. */
|
||||
endTime: number;
|
||||
}
|
||||
|
||||
/**
|
||||
* Options bag for `ShellExecutionService.execute()`. Kept as an
|
||||
* interface (rather than the prior inline shape) so future additions
|
||||
* land without breaking signatures.
|
||||
*/
|
||||
export interface ShellExecuteOptions {
|
||||
streamStdout?: boolean;
|
||||
/**
|
||||
* Post-promote callback hooks. See {@link ShellPostPromoteHandlers}.
|
||||
* Optional; omit to preserve the PR-2 detach-everything contract.
|
||||
*/
|
||||
postPromote?: ShellPostPromoteHandlers;
|
||||
}
|
||||
|
||||
/**
|
||||
* Describes a structured event emitted during shell command execution.
|
||||
*/
|
||||
|
|
@ -430,7 +487,7 @@ export class ShellExecutionService {
|
|||
abortSignal: AbortSignal,
|
||||
shouldUseNodePty: boolean,
|
||||
shellExecutionConfig: ShellExecutionConfig,
|
||||
options: { streamStdout?: boolean } = {},
|
||||
options: ShellExecuteOptions = {},
|
||||
): Promise<ShellExecutionHandle> {
|
||||
if (shouldUseNodePty) {
|
||||
const ptyInfo = await getPty();
|
||||
|
|
@ -443,6 +500,7 @@ export class ShellExecutionService {
|
|||
abortSignal,
|
||||
shellExecutionConfig,
|
||||
ptyInfo,
|
||||
options.postPromote,
|
||||
);
|
||||
} catch (_e) {
|
||||
// Fallback to child_process
|
||||
|
|
@ -456,6 +514,7 @@ export class ShellExecutionService {
|
|||
onOutputEvent,
|
||||
abortSignal,
|
||||
options.streamStdout ?? false,
|
||||
options.postPromote,
|
||||
);
|
||||
}
|
||||
|
||||
|
|
@ -465,6 +524,7 @@ export class ShellExecutionService {
|
|||
onOutputEvent: (event: ShellOutputEvent) => void,
|
||||
abortSignal: AbortSignal,
|
||||
streamStdout: boolean,
|
||||
postPromote?: ShellPostPromoteHandlers,
|
||||
): ShellExecutionHandle {
|
||||
try {
|
||||
const isWindows = os.platform() === 'win32';
|
||||
|
|
@ -702,6 +762,229 @@ export class ShellExecutionService {
|
|||
const combined =
|
||||
snapStdout +
|
||||
(snapStderr ? (snapStdout ? separator : '') + snapStderr : '');
|
||||
// PR-2.5: re-attach post-promote listeners that forward to the
|
||||
// caller's handlers. Attach AFTER `detachServiceListeners()`
|
||||
// so we don't double-up on stdout/stderr 'data' events with
|
||||
// the foreground listeners that just got removed; attach
|
||||
// BEFORE `resolve()` so a sub-millisecond data burst right
|
||||
// after promote still lands on the caller. The new listeners
|
||||
// are direct stdout/stderr listeners (not service-managed) —
|
||||
// ownership is the caller's from this point. We also attach
|
||||
// a fresh exit listener (the foreground exitHandler is also
|
||||
// detached by detachServiceListeners) so the caller can
|
||||
// settle the registry entry on natural child exit. When
|
||||
// postPromote is undefined we fall back to the PR-2 detach-
|
||||
// everything contract: no listeners re-attach.
|
||||
// PR-2.5 wave-4: preserve the detected encoding
|
||||
// from the foreground decoders so a non-UTF-8 child (e.g.
|
||||
// GBK on a Chinese Windows shell) doesn't snapshot correctly
|
||||
// and then mojibake the post-promote tail. The foreground
|
||||
// `stdoutDecoder` / `stderrDecoder` are initialized in
|
||||
// `handleOutput` from `getCachedEncodingForBuffer(data)` on
|
||||
// the first chunk; if they're still null at promote time
|
||||
// (no bytes yet), fall back to `'utf-8'`. Capture the
|
||||
// detected encoding rather than the decoder instance — the
|
||||
// foreground decoder has already seen pre-promote bytes
|
||||
// (its multibyte state machine is at an arbitrary midpoint)
|
||||
// and may have accumulated continuation-byte state that the
|
||||
// post-promote stream shouldn't inherit; new instances with
|
||||
// the same `encoding` start fresh.
|
||||
const detectedEncoding = stdoutDecoder?.encoding ?? 'utf-8';
|
||||
// SEPARATE decoders for stdout and stderr. A single shared
|
||||
// decoder corrupts interleaved multibyte UTF-8 (the streaming
|
||||
// state machine assumes one byte source); independent
|
||||
// decoders preserve each stream's continuation-byte context.
|
||||
// Both decoders are flushed (with `stream: false`) once the
|
||||
// child has fully closed so any trailing multibyte bytes
|
||||
// surface instead of being silently dropped.
|
||||
//
|
||||
// PR-2.5 wave-4: allocate decoders whenever
|
||||
// `onData` is set (not gated on close-handler installation),
|
||||
// because the close handler now ALWAYS installs when any
|
||||
// postPromote handler is present (T6 + T7) and needs to
|
||||
// flush these decoders if onData is set, regardless of
|
||||
// whether onSettle is set.
|
||||
const safeDecoder = (encoding: string): TextDecoder => {
|
||||
try {
|
||||
return new TextDecoder(encoding, { fatal: false });
|
||||
} catch {
|
||||
// Defensive: if the detected encoding string is somehow
|
||||
// not supported by Node's ICU (extremely rare on modern
|
||||
// Node), fall back to utf-8 rather than throwing inside
|
||||
// the promote handoff path.
|
||||
return new TextDecoder('utf-8', { fatal: false });
|
||||
}
|
||||
};
|
||||
const postPromoteStdoutDecoder = postPromote?.onData
|
||||
? safeDecoder(detectedEncoding)
|
||||
: null;
|
||||
const postPromoteStderrDecoder = postPromote?.onData
|
||||
? safeDecoder(detectedEncoding)
|
||||
: null;
|
||||
let postPromoteStdoutHandler: ((chunk: Buffer) => void) | null = null;
|
||||
let postPromoteStderrHandler: ((chunk: Buffer) => void) | null = null;
|
||||
if (postPromote?.onData) {
|
||||
const onPostData = postPromote.onData;
|
||||
const safeData = (decoder: TextDecoder) => (chunk: Buffer) => {
|
||||
try {
|
||||
onPostData({
|
||||
type: 'data',
|
||||
chunk: decoder.decode(chunk, { stream: true }),
|
||||
});
|
||||
} catch (cbErr) {
|
||||
debugLogger.warn(
|
||||
`postPromote.onData threw: ${cbErr instanceof Error ? cbErr.message : String(cbErr)}`,
|
||||
);
|
||||
}
|
||||
};
|
||||
try {
|
||||
if (postPromoteStdoutDecoder) {
|
||||
postPromoteStdoutHandler = safeData(postPromoteStdoutDecoder);
|
||||
child.stdout?.on('data', postPromoteStdoutHandler);
|
||||
}
|
||||
if (postPromoteStderrDecoder) {
|
||||
postPromoteStderrHandler = safeData(postPromoteStderrDecoder);
|
||||
child.stderr?.on('data', postPromoteStderrHandler);
|
||||
}
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`re-attaching post-promote data listeners threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
} else if (postPromote) {
|
||||
// PR-2.5 wave-4: caller asked for `onSettle`
|
||||
// (or any other future postPromote handler) without
|
||||
// `onData`. The foreground stdout/stderr listeners were
|
||||
// detached above; without ANY data listener the Readable
|
||||
// streams stay paused (on Windows they may already be
|
||||
// flowing — `resume()` is a no-op in that case), the OS
|
||||
// pipe buffer fills (~64KB on Linux), and
|
||||
// `child.stdout.write` in the child blocks —
|
||||
// potentially forever. `'close'` then never fires and
|
||||
// `onSettle` is never called. `.resume()` puts the stream
|
||||
// back in flowing mode (data arrives + is dropped) so the
|
||||
// child can drain its pipes and exit normally.
|
||||
try {
|
||||
child.stdout?.resume();
|
||||
child.stderr?.resume();
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`post-promote stdout/stderr resume() threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
// PR-2.5 wave-4: single-fire latch shared by
|
||||
// 'close' and 'error' (both branches funnel through here).
|
||||
// Without it the child_process path could fire onSettle
|
||||
// twice — once from `error`, then again from the `close`
|
||||
// that immediately follows — violating the exactly-once
|
||||
// settle contract and racing the caller's `transitionRegistry`.
|
||||
//
|
||||
// PR-2.5 wave-4: the helper also performs the
|
||||
// decoder flush so any caller with `onData` set gets the
|
||||
// trailing multibyte bytes surfaced — independent of
|
||||
// whether `onSettle` is also set.
|
||||
let postPromoteSettleFired = false;
|
||||
const flushPostPromoteDecoders = (): void => {
|
||||
if (!postPromote?.onData) return;
|
||||
try {
|
||||
if (postPromoteStdoutDecoder) {
|
||||
const trailing = postPromoteStdoutDecoder.decode();
|
||||
if (trailing) {
|
||||
postPromote.onData({
|
||||
type: 'data',
|
||||
chunk: trailing,
|
||||
});
|
||||
}
|
||||
}
|
||||
if (postPromoteStderrDecoder) {
|
||||
const trailing = postPromoteStderrDecoder.decode();
|
||||
if (trailing) {
|
||||
postPromote.onData({
|
||||
type: 'data',
|
||||
chunk: trailing,
|
||||
});
|
||||
}
|
||||
}
|
||||
} catch (flushErr) {
|
||||
debugLogger.warn(
|
||||
`post-promote decoder flush threw: ${flushErr instanceof Error ? flushErr.message : String(flushErr)}`,
|
||||
);
|
||||
}
|
||||
};
|
||||
const firePostSettle = (info: ShellPostPromoteSettleInfo): void => {
|
||||
if (postPromoteSettleFired) return;
|
||||
postPromoteSettleFired = true;
|
||||
flushPostPromoteDecoders();
|
||||
if (postPromoteStdoutHandler) {
|
||||
child.stdout?.off('data', postPromoteStdoutHandler);
|
||||
postPromoteStdoutHandler = null;
|
||||
}
|
||||
if (postPromoteStderrHandler) {
|
||||
child.stderr?.off('data', postPromoteStderrHandler);
|
||||
postPromoteStderrHandler = null;
|
||||
}
|
||||
if (!postPromote?.onSettle) return;
|
||||
try {
|
||||
postPromote.onSettle(info);
|
||||
} catch (cbErr) {
|
||||
debugLogger.warn(
|
||||
`postPromote.onSettle threw: ${cbErr instanceof Error ? cbErr.message : String(cbErr)}`,
|
||||
);
|
||||
}
|
||||
};
|
||||
// PR-2.5 wave-4: install 'close' and
|
||||
// 'error' listeners whenever ANY postPromote handler is
|
||||
// present, not just when `onSettle` is set. Two reasons:
|
||||
//
|
||||
// 1. T6: `onData`-only callers still had the foreground
|
||||
// `errorHandler` detached; without a replacement
|
||||
// listener a post-promote `'error'` would crash Node
|
||||
// via the unhandled-error default. Even with no
|
||||
// onSettle to route into, the listener prevents the
|
||||
// crash (and triggers decoder flush on close).
|
||||
//
|
||||
// 2. T3 / T7: `onData`-only callers need the close handler
|
||||
// to flush trailing decoder bytes; an `onSettle`-only
|
||||
// caller needs `'close'` to fire onSettle — both share
|
||||
// the same close hook now.
|
||||
if (postPromote) {
|
||||
try {
|
||||
child.once(
|
||||
'close',
|
||||
(
|
||||
exitCode: number | null,
|
||||
signalCode: NodeJS.Signals | null,
|
||||
) => {
|
||||
// Listen on 'close' (all stdio fully drained) NOT
|
||||
// 'exit' (which can fire while stdout/stderr still
|
||||
// have buffered bytes pending). Without this, late
|
||||
// chunks emitted between 'exit' and 'close' land in
|
||||
// the caller's onData AFTER onSettle already closed
|
||||
// the output stream and transitioned the registry —
|
||||
// they'd be dropped silently and `/tasks` would
|
||||
// show a truncated log.
|
||||
firePostSettle({
|
||||
exitCode,
|
||||
signal: signalCode,
|
||||
endTime: Date.now(),
|
||||
});
|
||||
},
|
||||
);
|
||||
child.once('error', (err: Error) => {
|
||||
firePostSettle({
|
||||
exitCode: null,
|
||||
signal: null,
|
||||
error: err,
|
||||
endTime: Date.now(),
|
||||
});
|
||||
});
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`re-attaching post-promote exit/error listeners threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
resolve({
|
||||
rawOutput: finalBuffer,
|
||||
output: stripAnsi(combined).trim(),
|
||||
|
|
@ -831,6 +1114,7 @@ export class ShellExecutionService {
|
|||
abortSignal: AbortSignal,
|
||||
shellExecutionConfig: ShellExecutionConfig,
|
||||
ptyInfo: PtyImplementation,
|
||||
postPromote?: ShellPostPromoteHandlers,
|
||||
): ShellExecutionHandle {
|
||||
if (!ptyInfo) {
|
||||
// This should not happen, but as a safeguard...
|
||||
|
|
@ -1176,11 +1460,11 @@ export class ShellExecutionService {
|
|||
);
|
||||
return;
|
||||
}
|
||||
// Skip kill, dispose all our listeners on the live PTY (so
|
||||
// post-promote data/exit/error don't leak into our foreground
|
||||
// onOutputEvent or crash via the error handler's `throw err`),
|
||||
// set the listenersDetached guard so any already-enqueued
|
||||
// processingChain callback's onOutputEvent emits are
|
||||
// Skip kill, dispose all our foreground listeners on the live
|
||||
// PTY (so post-promote data/exit/error don't leak into our
|
||||
// foreground onOutputEvent or crash via the error handler's
|
||||
// `throw err`), set the listenersDetached guard so any already-
|
||||
// enqueued processingChain callback's onOutputEvent emits are
|
||||
// suppressed (in-flight writes still LAND in headlessTerminal
|
||||
// so the snapshot below reflects them), drain pending chain
|
||||
// work, drop the PTY from the active set (so cleanup() won't
|
||||
|
|
@ -1188,6 +1472,15 @@ export class ShellExecutionService {
|
|||
// resolve immediately with `promoted: true` so the awaiting
|
||||
// caller unblocks. The caller has attached its own listeners
|
||||
// by this point and owns the PTY's lifecycle.
|
||||
//
|
||||
// PR-2.5: if `postPromote.onData` / `postPromote.onSettle` were
|
||||
// provided, ATTACH NEW listeners after disposing the
|
||||
// foreground ones — bytes from the still-running child route
|
||||
// to the caller (typically shell.ts's append-to-bg_xxx.output
|
||||
// path), and the eventual natural-exit transitions the
|
||||
// registry entry to `'completed'` / `'failed'` instead of
|
||||
// leaving it stuck on `'running'`. When postPromote is
|
||||
// undefined the PR-2 detach-everything contract is preserved.
|
||||
exited = true;
|
||||
listenersDetached = true;
|
||||
abortSignal.removeEventListener('abort', abortHandler);
|
||||
|
|
@ -1228,6 +1521,154 @@ export class ShellExecutionService {
|
|||
}
|
||||
this.activePtys.delete(ptyProcess.pid);
|
||||
|
||||
// PR-2.5: re-attach minimal listeners that forward to the
|
||||
// caller's post-promote handlers. Attach BEFORE the drain so
|
||||
// late bytes the PTY emits during the drain window flow to
|
||||
// the caller instead of falling on the floor — strictly an
|
||||
// improvement; without this they'd be dropped on the way to
|
||||
// the snapshot anyway.
|
||||
//
|
||||
// PR-2.5 wave-3: capture the IDisposable
|
||||
// returned by `onData` / `onExit` and the listener function
|
||||
// we register on `'error'`, then dispose them all when
|
||||
// settle fires. node-pty's `ptyProcess` outlives the
|
||||
// post-promote handlers (the child can sit dead for
|
||||
// milliseconds before the caller's `cancelChild` finalizes
|
||||
// it), and node's EventEmitter holds refs to listener
|
||||
// closures (which in turn hold refs to `onPostData` /
|
||||
// `onPostSettle` / `promoteArtifacts`). Disposing the
|
||||
// listeners on settle releases those refs so they can be
|
||||
// GC'd without waiting for the underlying PTY to be
|
||||
// collected.
|
||||
//
|
||||
// Guard so `onSettle` fires AT MOST ONCE. Both `onExit` and
|
||||
// the post-promote `error` listener below funnel through
|
||||
// this latch — a PTY error during the read-exit race could
|
||||
// otherwise fire onSettle twice (once for the error, once
|
||||
// for the immediately-following exit) and the caller's
|
||||
// `transitionRegistry` would race itself.
|
||||
let postPromoteSettleFired = false;
|
||||
let postPromoteDataDisposable: { dispose: () => void } | null = null;
|
||||
let postPromoteExitDisposable: { dispose: () => void } | null = null;
|
||||
let postPromoteErrorListener:
|
||||
| ((err: NodeJS.ErrnoException) => void)
|
||||
| null = null;
|
||||
const disposePostPromoteListeners = () => {
|
||||
if (postPromoteDataDisposable) {
|
||||
try {
|
||||
postPromoteDataDisposable.dispose();
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`disposing post-promote data listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
postPromoteDataDisposable = null;
|
||||
}
|
||||
if (postPromoteExitDisposable) {
|
||||
try {
|
||||
postPromoteExitDisposable.dispose();
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`disposing post-promote exit listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
postPromoteExitDisposable = null;
|
||||
}
|
||||
if (postPromoteErrorListener) {
|
||||
try {
|
||||
ptyProcess.removeListener('error', postPromoteErrorListener);
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`removing post-promote error listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
postPromoteErrorListener = null;
|
||||
}
|
||||
};
|
||||
const firePostSettle = (info: ShellPostPromoteSettleInfo) => {
|
||||
if (postPromoteSettleFired) return;
|
||||
postPromoteSettleFired = true;
|
||||
// Dispose BEFORE invoking the caller — even if the caller
|
||||
// throws, the listeners are gone (and idempotent if we
|
||||
// come back through the error path).
|
||||
// Known limitation: node-pty may have queued onData
|
||||
// callbacks not yet delivered when onExit fires; disposing
|
||||
// the data listener here means those trailing bytes (<4KB)
|
||||
// are lost. Bounded and low severity — a setImmediate
|
||||
// delay could recover them but would complicate the
|
||||
// single-fire latch.
|
||||
disposePostPromoteListeners();
|
||||
if (!postPromote?.onSettle) return;
|
||||
try {
|
||||
postPromote.onSettle(info);
|
||||
} catch (cbErr) {
|
||||
debugLogger.warn(
|
||||
`postPromote.onSettle threw: ${cbErr instanceof Error ? cbErr.message : String(cbErr)}`,
|
||||
);
|
||||
}
|
||||
};
|
||||
if (postPromote?.onData) {
|
||||
const onPostData = postPromote.onData;
|
||||
try {
|
||||
postPromoteDataDisposable = ptyProcess.onData((data: string) => {
|
||||
try {
|
||||
onPostData({ type: 'data', chunk: data });
|
||||
} catch (cbErr) {
|
||||
// Caller's handler threw — don't let it crash the
|
||||
// child's data loop. Log + drop.
|
||||
debugLogger.warn(
|
||||
`postPromote.onData threw: ${cbErr instanceof Error ? cbErr.message : String(cbErr)}`,
|
||||
);
|
||||
}
|
||||
});
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`re-attaching post-promote data listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
if (postPromote) {
|
||||
try {
|
||||
postPromoteExitDisposable = ptyProcess.onExit(
|
||||
({
|
||||
exitCode,
|
||||
signal,
|
||||
}: {
|
||||
exitCode: number;
|
||||
signal?: number;
|
||||
}) => {
|
||||
firePostSettle({
|
||||
exitCode,
|
||||
signal: signal ?? null,
|
||||
endTime: Date.now(),
|
||||
});
|
||||
},
|
||||
);
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`re-attaching post-promote exit listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
try {
|
||||
postPromoteErrorListener = (err: NodeJS.ErrnoException) => {
|
||||
if (isExpectedPtyReadExitError(err)) {
|
||||
return;
|
||||
}
|
||||
firePostSettle({
|
||||
error: err,
|
||||
exitCode: null,
|
||||
signal: null,
|
||||
endTime: Date.now(),
|
||||
});
|
||||
};
|
||||
ptyProcess.on('error', postPromoteErrorListener);
|
||||
} catch (e) {
|
||||
debugLogger.warn(
|
||||
`re-attaching post-promote error listener threw: ${e instanceof Error ? e.message : String(e)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// Drain in-flight chain work (already-enqueued
|
||||
// headlessTerminal.write callbacks) so the snapshot reflects
|
||||
// the last batch of bytes the PTY emitted before promote.
|
||||
|
|
|
|||
File diff suppressed because it is too large
Load diff
|
|
@ -34,6 +34,8 @@ import type {
|
|||
ShellExecutionConfig,
|
||||
ShellExecutionResult,
|
||||
ShellOutputEvent,
|
||||
ShellPostPromoteHandlers,
|
||||
ShellPostPromoteSettleInfo,
|
||||
} from '../services/shellExecutionService.js';
|
||||
import { ShellExecutionService } from '../services/shellExecutionService.js';
|
||||
import type { ShellTaskRegistration } from '../services/backgroundShellRegistry.js';
|
||||
|
|
@ -905,6 +907,64 @@ const DEFAULT_FOREGROUND_TIMEOUT_MS = 120000;
|
|||
*/
|
||||
const PROMOTE_CANCEL_SIGKILL_TIMEOUT_MS = 200;
|
||||
|
||||
/** Maximum wait for the output stream flush before transitioning the registry. */
|
||||
const PROMOTE_FLUSH_TIMEOUT_MS = 10_000;
|
||||
|
||||
/**
|
||||
* PR-2.5 slots shared between the foreground `execute()` postPromote
|
||||
* handlers and the post-resolve `handlePromotedForeground` finalizer.
|
||||
* The handlers fire on the service side as soon as promote happens;
|
||||
* the finalizer runs after `await resultPromise` returns. They race —
|
||||
* the buffer + settle-queue absorb the race so neither chunks nor the
|
||||
* eventual exit info are lost. See `executeForeground` for the wiring
|
||||
* and `handlePromotedForeground` for the drain logic.
|
||||
*/
|
||||
interface PromoteArtifacts {
|
||||
/**
|
||||
* Chunks observed by `postPromote.onData` BEFORE the stream is
|
||||
* open. Drained into the stream once `handlePromotedForeground`
|
||||
* opens it. After drain this stays empty for the rest of the run.
|
||||
*/
|
||||
buffer: string[];
|
||||
/**
|
||||
* Append-mode write stream to `bg_xxx.output`. Null until
|
||||
* `handlePromotedForeground` opens it. Closed by `onSettleWired`.
|
||||
*/
|
||||
stream: fs.WriteStream | null;
|
||||
/**
|
||||
* Latched true when the output stream is no longer accepting writes.
|
||||
* Two paths set it:
|
||||
*
|
||||
* 1. Stream open failed (`fs.createWriteStream` threw OR fired an
|
||||
* async `'error'` event before bytes could land). The stream
|
||||
* will never reopen; future `onData` chunks must drop.
|
||||
* 2. Settle has fired and `onSettleWired` has drained the buffer
|
||||
* and called `stream.end()`. The stream is closing; any chunk
|
||||
* that arrives during the `.end()` flush window (rare but
|
||||
* possible on PTY when kernel buffers deliver late) MUST drop
|
||||
* rather than be pushed into the buffer — at this point the
|
||||
* buffer has no remaining drain path (the foreground finalizer
|
||||
* has returned).
|
||||
*
|
||||
* Without this flag the buffer would grow without bound under a
|
||||
* sustained child whose output file we can't open, OR strand
|
||||
* late-arriving post-settle bytes in an undrainable buffer.
|
||||
*/
|
||||
streamClosed: boolean;
|
||||
/**
|
||||
* Settle handler installed by `handlePromotedForeground` once the
|
||||
* registry entry exists. Null until then; `onSettle` calls below
|
||||
* queue into `settleQueued` if this isn't yet set.
|
||||
*/
|
||||
onSettleWired: ((info: ShellPostPromoteSettleInfo) => void) | null;
|
||||
/**
|
||||
* Settle info captured by `postPromote.onSettle` before the wired
|
||||
* handler was installed. `handlePromotedForeground` checks this and
|
||||
* fires the wired handler synchronously after registering.
|
||||
*/
|
||||
settleQueued: ShellPostPromoteSettleInfo | null;
|
||||
}
|
||||
|
||||
// Long-run advisory threshold: half the EFFECTIVE foreground timeout
|
||||
// (not the default), computed per-invocation by `longRunThresholdFor`.
|
||||
// Couples to whichever timeout actually governs THIS command — so a
|
||||
|
|
@ -1619,6 +1679,77 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
}
|
||||
};
|
||||
|
||||
// Pre-allocate the promote artifacts (PR-2.5). Lazily created — no
|
||||
// disk I/O unless the user actually fires Ctrl+B / promote signal.
|
||||
// The handlers below close over these slots; once promote happens,
|
||||
// `handlePromotedForeground` populates them (opens the stream, sets
|
||||
// the shellId / onSettle wiring), and any onData chunks that the
|
||||
// service forwarded BEFORE handlePromotedForeground caught up land
|
||||
// in `postPromoteBuffer` and drain to the stream once it opens.
|
||||
const promoteArtifacts: PromoteArtifacts = {
|
||||
buffer: [],
|
||||
stream: null,
|
||||
streamClosed: false,
|
||||
onSettleWired: null,
|
||||
settleQueued: null,
|
||||
};
|
||||
const postPromote: ShellPostPromoteHandlers = {
|
||||
onData: (event) => {
|
||||
if (event.type !== 'data') return;
|
||||
// ANSI structured chunks have no append semantics — coerce to
|
||||
// string. The output file is plain text; live ANSI updates are
|
||||
// owned by the foreground stream, which by promote-time has
|
||||
// already terminated.
|
||||
//
|
||||
// PR-2.5 wave-4: strip ANSI before writing so
|
||||
// the post-promote tail of `bg_xxx.output` matches the format
|
||||
// of the snapshot above (which is rendered terminal text, not
|
||||
// raw escape sequences) AND matches the regular
|
||||
// `executeBackground` path's `outputStream.write(stripAnsi(chunk))`
|
||||
// contract. Without this, an agent reading the file after a
|
||||
// promote would see plain text up to the promote moment, then
|
||||
// raw `\x1b[...m` color codes / cursor moves / clear-screen
|
||||
// sequences for any post-promote output — which is unreadable
|
||||
// and inconsistent.
|
||||
const rawChunk =
|
||||
typeof event.chunk === 'string'
|
||||
? event.chunk
|
||||
: event.chunk
|
||||
.map((line) => line.map((tok) => tok.text).join(''))
|
||||
.join('\n');
|
||||
const chunk = stripAnsi(rawChunk);
|
||||
if (promoteArtifacts.stream) {
|
||||
try {
|
||||
promoteArtifacts.stream.write(chunk);
|
||||
} catch (err) {
|
||||
debugLogger.warn(
|
||||
`promote: postPromote stream.write failed: ${getErrorMessage(err)}`,
|
||||
);
|
||||
}
|
||||
} else if (promoteArtifacts.streamClosed) {
|
||||
// Stream-open already failed permanently — drop chunks
|
||||
// rather than buffer them. Without this guard the buffer
|
||||
// would grow without bound under a sustained child whose
|
||||
// output file we couldn't open.
|
||||
debugLogger.debug(
|
||||
'promote: dropping post-promote chunk because output stream open failed',
|
||||
);
|
||||
} else {
|
||||
promoteArtifacts.buffer.push(chunk);
|
||||
}
|
||||
},
|
||||
onSettle: (info) => {
|
||||
if (promoteArtifacts.onSettleWired) {
|
||||
promoteArtifacts.onSettleWired(info);
|
||||
} else {
|
||||
// Service observed the child exit before handlePromotedForeground
|
||||
// finished registering. Queue the settle info — handlePromotedForeground
|
||||
// applies it as soon as the registry entry exists.
|
||||
promoteArtifacts.settleQueued = info;
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
let executionHandle;
|
||||
try {
|
||||
executionHandle = await ShellExecutionService.execute(
|
||||
|
|
@ -1628,6 +1759,7 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
combinedSignal,
|
||||
this.config.getShouldUseNodePtyShell(),
|
||||
shellExecutionConfig ?? {},
|
||||
{ postPromote },
|
||||
);
|
||||
} catch (err) {
|
||||
// ShellExecutionService.execute() can throw before resolving (e.g.
|
||||
|
|
@ -1725,6 +1857,7 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
cwd,
|
||||
commandToExecute,
|
||||
promoteAbortController,
|
||||
promoteArtifacts,
|
||||
);
|
||||
return promotedToolResult;
|
||||
}
|
||||
|
|
@ -2033,27 +2166,26 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
/**
|
||||
* Foreground → background promote handler. Called when the foreground
|
||||
* execute path observes `result.promoted: true` (the user pressed
|
||||
* Ctrl+B mid-flight). Snapshots captured output to a `bg_xxx.output`
|
||||
* file, registers a `BackgroundShellEntry` in the same registry the
|
||||
* `is_background: true` path uses, and returns a model-facing
|
||||
* `ToolResult` pointing at `/tasks` / the dialog / `task_stop` for
|
||||
* follow-up.
|
||||
* Ctrl+B mid-flight). Writes the initial snapshot + open the
|
||||
* post-promote append stream so subsequent child bytes land in
|
||||
* `bg_xxx.output`, registers a `BackgroundShellEntry` in the same
|
||||
* registry the `is_background: true` path uses, wires settle so
|
||||
* natural child exit transitions the entry to `'completed'` /
|
||||
* `'failed'`, and returns a model-facing `ToolResult` pointing at
|
||||
* `/tasks` / the dialog / `task_stop` for follow-up.
|
||||
*
|
||||
* Limitations (PR-2.5 follow-up):
|
||||
* - The registry entry stays `'running'` until `task_stop bg_xxx`
|
||||
* or session-end `abortAll` clears it; natural child exit does
|
||||
* NOT auto-settle the entry today (no settle hook from the
|
||||
* service after promote — the listener was detached as part of
|
||||
* PR-1's ownership-transfer contract).
|
||||
* - The `outputPath` content is FROZEN at the promote moment; the
|
||||
* service no longer streams post-promote bytes to the file.
|
||||
* Caller-side stream redirect lands in PR-2.5.
|
||||
* PR-2.5: post-promote stream redirect + natural-exit registry
|
||||
* settle are now live via the `postPromote` callbacks wired in
|
||||
* `executeForeground`. The `promoteArtifacts` parameter carries the
|
||||
* pre-allocated buffer/stream slots that absorb the race between
|
||||
* service-side promote-time data flush and this finalizer running.
|
||||
*/
|
||||
private async handlePromotedForeground(
|
||||
result: ShellExecutionResult,
|
||||
cwd: string,
|
||||
commandToExecute: string,
|
||||
abortController: AbortController,
|
||||
promoteArtifacts: PromoteArtifacts,
|
||||
): Promise<ToolResult> {
|
||||
// Mirror executeBackground's outputPath layout so /tasks-on-disk and
|
||||
// ReadFileTool's auto-allow rules treat foreground-promoted shells
|
||||
|
|
@ -2110,15 +2242,108 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
|
||||
const shellId = `bg_${crypto.randomBytes(4).toString('hex')}`;
|
||||
const outputPath = path.join(outputDir, `shell-${shellId}.output`);
|
||||
// Best-effort initial snapshot write — if disk is full or
|
||||
// permission flips, log + continue (the registry entry is still
|
||||
// valuable on its own; the file is only the inspection surface).
|
||||
// PR-2.5: open an append-mode write stream so the initial snapshot
|
||||
// AND post-promote bytes from the still-running child both land in
|
||||
// the same file. Synchronous open via `createWriteStream` with
|
||||
// `flags: 'w'` (overwrite) — if a stale file is somehow there from
|
||||
// a prior session with the same shellId (vanishingly unlikely
|
||||
// given the randomBytes), start fresh. Stream errors (ENOSPC mid-
|
||||
// stream, permission flip) are logged via 'error' listener; we
|
||||
// never let them crash the daemon.
|
||||
let outputStream: fs.WriteStream | null = null;
|
||||
try {
|
||||
fs.writeFileSync(outputPath, result.output);
|
||||
outputStream = fs.createWriteStream(outputPath, { flags: 'w' });
|
||||
// PR-2.5 wave-2: `createWriteStream` reports common
|
||||
// failures (ENOENT / EACCES / ENOSPC during the async libuv
|
||||
// `open`) via an `'error'` event AFTER this synchronous call
|
||||
// returns — they do NOT throw. Without latching the failure
|
||||
// here, `promoteArtifacts.stream` would still point at an
|
||||
// already-broken stream, `postPromote.onData` would `write` into
|
||||
// it (catching the throw via its own try/catch but never
|
||||
// releasing the buffer), and `onSettleWired` would attach a
|
||||
// `'finish'` listener that never fires → registry stuck on
|
||||
// `running` forever. Latch the failure: null the stream,
|
||||
// mark `streamClosed` so `onData` drops chunks, and let
|
||||
// `onSettleWired` transition the registry immediately (its
|
||||
// existing `if (!stream)` branch handles that case).
|
||||
outputStream.on('error', (err) => {
|
||||
debugLogger.warn(
|
||||
`promote: output write stream error for ${outputPath}: ${getErrorMessage(err)}`,
|
||||
);
|
||||
const droppedChunks = promoteArtifacts.buffer.length;
|
||||
promoteArtifacts.stream = null;
|
||||
promoteArtifacts.streamClosed = true;
|
||||
try {
|
||||
fs.appendFileSync(
|
||||
outputPath,
|
||||
`\n[WARNING: post-promote output lost — stream error (${getErrorMessage(err)}). ${droppedChunks} buffered chunks dropped.]\n`,
|
||||
);
|
||||
} catch {
|
||||
// Best-effort diagnostic — if the append itself fails
|
||||
// (e.g. disk full), the debugLogger.warn above is the
|
||||
// only trace left.
|
||||
}
|
||||
});
|
||||
// Initial snapshot first, so it always precedes post-promote
|
||||
// bytes in the file (write ordering is FIFO on a single stream).
|
||||
outputStream.write(result.output);
|
||||
// PR-2.5 wave-4: assign the stream BEFORE draining
|
||||
// the buffer, not after. The drain + assign block is synchronous
|
||||
// today (single-tick JS, so a service-side `onData` callback
|
||||
// cannot fire between drain-end and assign), but the assign-
|
||||
// after-drain order leaves a hazard for any future refactor
|
||||
// that introduces an `await` inside the drain — a chunk arriving
|
||||
// in that window would be pushed into `promoteArtifacts.buffer`
|
||||
// (because `stream` is still null), then later chunks would write
|
||||
// directly to the stream after assign, producing out-of-order
|
||||
// bytes in `bg_xxx.output` until the settle drain caught the
|
||||
// straggler. Assign-first eliminates the hazard entirely:
|
||||
// concurrent `onData` writes go straight through after the
|
||||
// queued snapshot + the queued drained chunks, in the correct
|
||||
// FIFO order on the stream.
|
||||
promoteArtifacts.stream = outputStream;
|
||||
while (promoteArtifacts.buffer.length > 0) {
|
||||
const chunk = promoteArtifacts.buffer.shift()!;
|
||||
outputStream.write(chunk);
|
||||
}
|
||||
} catch (err) {
|
||||
debugLogger.warn(
|
||||
`promote: failed to write initial output snapshot to ${outputPath}: ${getErrorMessage(err)}`,
|
||||
`promote: failed to open output stream for ${outputPath}: ${getErrorMessage(err)}`,
|
||||
);
|
||||
// Stream failure is recoverable — the registry entry is still
|
||||
// valuable on its own; the file is the inspection surface only.
|
||||
// Continue without a stream; future onData chunks are dropped
|
||||
// (their warns will accumulate in the log, which is enough
|
||||
// observability for a rare disk failure case).
|
||||
promoteArtifacts.stream = null;
|
||||
// Latch streamClosed so the foreground postPromote.onData
|
||||
// handler stops buffering chunks that would never be drained
|
||||
// (the drain path only runs when `stream` becomes non-null,
|
||||
// which never happens after this branch).
|
||||
promoteArtifacts.streamClosed = true;
|
||||
// PR-2.5 wave-3: record how many pre-
|
||||
// finalizer post-promote chunks are being dropped. Without
|
||||
// this an oncall engineer reading a truncated `bg_xxx.output`
|
||||
// has no signal that the truncation is due to stream-open
|
||||
// failure rather than the child not producing more output.
|
||||
// The chunks themselves are gone (no salvage path exists once
|
||||
// the stream open has failed and the buffer drain depends on
|
||||
// a non-null stream slot).
|
||||
if (promoteArtifacts.buffer.length > 0) {
|
||||
debugLogger.warn(
|
||||
`promote: dropping ${promoteArtifacts.buffer.length} buffered post-promote chunks for ${outputPath} (stream open failed before drain)`,
|
||||
);
|
||||
promoteArtifacts.buffer.length = 0;
|
||||
}
|
||||
// Last-ditch: try a sync snapshot write so /tasks still has
|
||||
// SOMETHING readable; the buffer chunks are lost in this branch.
|
||||
try {
|
||||
fs.writeFileSync(outputPath, result.output);
|
||||
} catch (err2) {
|
||||
debugLogger.warn(
|
||||
`promote: snapshot fallback writeFileSync also failed for ${outputPath}: ${getErrorMessage(err2)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
const startTime = Date.now();
|
||||
|
|
@ -2237,15 +2462,216 @@ export class ShellToolInvocation extends BaseToolInvocation<
|
|||
} catch {
|
||||
/* swallow — we're already in an error path */
|
||||
}
|
||||
// PR-2.5: close the output stream so the FD doesn't leak past
|
||||
// the throw. Best-effort — if .end() itself throws we're
|
||||
// already in an error path with the orphan-child kill already
|
||||
// in flight.
|
||||
try {
|
||||
promoteArtifacts.stream?.end();
|
||||
} catch {
|
||||
/* swallow */
|
||||
}
|
||||
promoteArtifacts.stream = null;
|
||||
throw e;
|
||||
}
|
||||
|
||||
// PR-2.5: wire the post-promote settle so a natural child exit
|
||||
// (or spawn-side error) transitions the registry entry from
|
||||
// `'running'` to `'completed'` / `'failed'`. Without this the
|
||||
// entry stays `'running'` until `task_stop` / session-end. The
|
||||
// service's `postPromote.onSettle` fires AT MOST ONCE per
|
||||
// promote, and `registry.complete` / `registry.fail` are
|
||||
// idempotent (no-op when status !== 'running'), so a race with
|
||||
// `entryAc.abort() → registry.cancel` (task_stop fired during the
|
||||
// exit window) is safe: whichever lands first wins, the other
|
||||
// becomes a no-op.
|
||||
// Status flags consumed by the model-facing copy below.
|
||||
//
|
||||
// - `postPromoteSettleObserved`: SET SYNCHRONOUSLY inside
|
||||
// `onSettleWired` the moment we know the child has exited (the
|
||||
// service has called us with settle info). Independent of
|
||||
// whether the registry transition has actually completed yet,
|
||||
// because the transition may be deferred awaiting the output
|
||||
// stream's `'finish'` event (libuv flush). This is the flag
|
||||
// the model-facing copy branches on: once we know the child has
|
||||
// exited, saying "Status: running" + suggesting `task_stop`
|
||||
// would mislead the agent.
|
||||
// - `postPromoteFinalStatus`: classified from the settle info at
|
||||
// the same synchronous moment, so the status line can report
|
||||
// the right terminal status even if the registry transition is
|
||||
// still in flight.
|
||||
//
|
||||
// PR-2.5 wave-2: originally the model-facing copy
|
||||
// checked a `postPromoteAlreadySettled` flag that was only flipped
|
||||
// AFTER the registry transition fired (post-flush). A fast-exited
|
||||
// promoted command could therefore land "Status: running" +
|
||||
// `task_stop` instructions in the model copy even when settle was
|
||||
// already queued, because the queued-settle drain returned before
|
||||
// the stream's 'finish' event fired. The two flags decouple
|
||||
// "child has exited" (what the agent cares about) from "registry
|
||||
// transition has run" (which can lag behind libuv flush).
|
||||
let postPromoteSettleObserved = false;
|
||||
let postPromoteFinalStatus: 'completed' | 'failed' | null = null;
|
||||
const classifySettle = (
|
||||
info: ShellPostPromoteSettleInfo,
|
||||
): { status: 'completed' | 'failed'; failMsg: string | null } => {
|
||||
// Decision table: `error` → fail (spawn-side failure); `exitCode
|
||||
// === 0` → complete; non-zero exitCode → fail; signal-killed
|
||||
// (no exitCode, signal set) → fail with descriptive message;
|
||||
// everything-null → fail with generic message.
|
||||
if (info.error) return { status: 'failed', failMsg: info.error.message };
|
||||
if (info.exitCode === 0) return { status: 'completed', failMsg: null };
|
||||
if (info.exitCode !== null)
|
||||
return {
|
||||
status: 'failed',
|
||||
failMsg: `Exited with code ${info.exitCode}`,
|
||||
};
|
||||
if (info.signal !== null)
|
||||
return {
|
||||
status: 'failed',
|
||||
failMsg: `Terminated by signal ${info.signal}`,
|
||||
};
|
||||
// PR-2.5 wave-3: this branch is meant to
|
||||
// be unreachable — the service always populates one of
|
||||
// `error` / `exitCode` / `signal`. Hitting it means the
|
||||
// service emitted a defective settle info object, which is a
|
||||
// logic bug. Capture the actual field values in the failure
|
||||
// message AND warn-log so the oncall engineer reading
|
||||
// `/tasks` or the debug log can tell THIS path apart from the
|
||||
// other "failed" branches. (`info.error` has been narrowed to
|
||||
// `never` by the preceding `if (info.error) return`, so we
|
||||
// can't read `.message` here — by construction it would be
|
||||
// `undefined` at runtime anyway.)
|
||||
debugLogger.warn(
|
||||
`promote: classifySettle all-null fallback hit for ${shellId} — ` +
|
||||
`exitCode=${info.exitCode}, signal=${info.signal}, error=undefined`,
|
||||
);
|
||||
return {
|
||||
status: 'failed',
|
||||
failMsg: `Exited with unknown status (exitCode=${info.exitCode}, signal=${info.signal}, error=undefined)`,
|
||||
};
|
||||
};
|
||||
const transitionRegistry = (info: ShellPostPromoteSettleInfo) => {
|
||||
const cls = classifySettle(info);
|
||||
if (cls.status === 'completed') {
|
||||
registry.complete(shellId, info.exitCode as number, info.endTime);
|
||||
} else {
|
||||
registry.fail(shellId, cls.failMsg as string, info.endTime);
|
||||
}
|
||||
};
|
||||
promoteArtifacts.onSettleWired = (info) => {
|
||||
// Synchronous observation — the child has exited; classify now
|
||||
// so the model-facing copy can branch correctly even when the
|
||||
// registry transition is deferred behind the stream's flush.
|
||||
const cls = classifySettle(info);
|
||||
postPromoteFinalStatus = cls.status;
|
||||
postPromoteSettleObserved = true;
|
||||
// Wait for the output stream to fully FLUSH before transitioning
|
||||
// the registry. `stream.end()` is asynchronous — pending writes
|
||||
// can still be in the libuv queue when it returns. Without the
|
||||
// 'finish' wait, `/tasks` consumers can observe the entry as
|
||||
// `completed`/`failed` and read the output file BEFORE the
|
||||
// trailing bytes are on disk, producing truncated logs.
|
||||
const stream = promoteArtifacts.stream;
|
||||
// PR-2.5 wave-3: drain the pre-settle
|
||||
// buffer to the stream BEFORE nulling the shared slot. Service-
|
||||
// side `onData` callbacks that race the foreground finalizer
|
||||
// can land chunks in the buffer between when the wire fires
|
||||
// and when the buffer drain (during stream-open) sees them.
|
||||
// Without this drain those chunks are stranded. AND latch
|
||||
// `streamClosed` together with the null so that any
|
||||
// chunk arriving AFTER `.end()` (during the flush window —
|
||||
// unlikely once the service has emitted settle, but kernel
|
||||
// buffers can deliver late on PTY) is DROPPED via the
|
||||
// `else if (promoteArtifacts.streamClosed)` arm in `onData`
|
||||
// instead of being pushed into the now-undrainable buffer.
|
||||
if (stream) {
|
||||
while (promoteArtifacts.buffer.length > 0) {
|
||||
try {
|
||||
stream.write(promoteArtifacts.buffer.shift()!);
|
||||
} catch (writeErr) {
|
||||
// Stream write failure during pre-end drain — log + drop,
|
||||
// same recovery posture as the foreground `onData` write
|
||||
// path. The error event will fire async if the stream is
|
||||
// dead, latching `streamClosed` via the 'error' handler.
|
||||
debugLogger.warn(
|
||||
`promote: pre-end buffer drain write failed: ${getErrorMessage(writeErr)}`,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
promoteArtifacts.stream = null;
|
||||
promoteArtifacts.streamClosed = true;
|
||||
if (!stream) {
|
||||
// No stream (open failed or already ended) — transition right
|
||||
// away, no flush to wait on.
|
||||
transitionRegistry(info);
|
||||
return;
|
||||
}
|
||||
try {
|
||||
// `finish` fires after all queued writes have been flushed to
|
||||
// the underlying fd. `error` covers a late EIO / ENOSPC that
|
||||
// doesn't reach the existing `'error'` listener — race with
|
||||
// `.end()` itself. Either way, run the transition once.
|
||||
let transitioned = false;
|
||||
const finalize = () => {
|
||||
if (transitioned) return;
|
||||
transitioned = true;
|
||||
transitionRegistry(info);
|
||||
};
|
||||
const flushTimer = setTimeout(() => {
|
||||
debugLogger.warn(
|
||||
`promote: output stream flush timed out for ${shellId} after ${PROMOTE_FLUSH_TIMEOUT_MS}ms — transitioning registry without flush confirmation`,
|
||||
);
|
||||
finalize();
|
||||
}, PROMOTE_FLUSH_TIMEOUT_MS);
|
||||
flushTimer.unref();
|
||||
stream.once('finish', () => {
|
||||
clearTimeout(flushTimer);
|
||||
finalize();
|
||||
});
|
||||
stream.once('error', () => {
|
||||
clearTimeout(flushTimer);
|
||||
finalize();
|
||||
});
|
||||
stream.end();
|
||||
} catch (closeErr) {
|
||||
debugLogger.warn(
|
||||
`promote: closing output stream on settle threw: ${getErrorMessage(closeErr)}`,
|
||||
);
|
||||
transitionRegistry(info);
|
||||
}
|
||||
};
|
||||
// Drain a settle that landed BEFORE the wire installed (fast
|
||||
// commands can exit between `result.promoted` and this line).
|
||||
// After this call returns, `postPromoteSettleObserved` is true
|
||||
// if a settle was queued — that's the case the model-facing copy
|
||||
// below branches on so the message doesn't say "Status: running"
|
||||
// for a process that already finished during the registration
|
||||
// window.
|
||||
if (promoteArtifacts.settleQueued) {
|
||||
const queued = promoteArtifacts.settleQueued;
|
||||
promoteArtifacts.settleQueued = null;
|
||||
promoteArtifacts.onSettleWired(queued);
|
||||
}
|
||||
|
||||
// Build the model-facing status line based on whether the settle
|
||||
// was observed synchronously (i.e. the child has exited). Branch
|
||||
// on `postPromoteSettleObserved` rather than the post-flush latch
|
||||
// — see the flag block above for the rationale.
|
||||
const statusLine = postPromoteSettleObserved
|
||||
? `Status: ${postPromoteFinalStatus ?? 'settled'}. PID: ${result.pid ?? '(unknown)'}.`
|
||||
: `Status: running. PID: ${result.pid ?? '(unknown)'}.`;
|
||||
const inspectLine = `To inspect: \`/tasks\` (text), the Background tasks dialog (↓ + Enter on the footer pill), or \`Read\` the output file directly.`;
|
||||
const stopLine = postPromoteSettleObserved
|
||||
? `Process has already exited; no \`task_stop\` needed (the entry is observable in \`/tasks\` for inspection).`
|
||||
: `To stop the now-background process: \`task_stop({ task_id: '${shellId}' })\`.`;
|
||||
const llmContent = [
|
||||
`Foreground command "${commandToExecute}" promoted to background as ${shellId}.`,
|
||||
`Status: running. PID: ${result.pid ?? '(unknown)'}.`,
|
||||
statusLine,
|
||||
`Output snapshot at promote time saved to: ${outputPath}`,
|
||||
`To inspect: \`/tasks\` (text), the Background tasks dialog (↓ + Enter on the footer pill), or \`Read\` the output file directly.`,
|
||||
`To stop the now-background process: \`task_stop({ task_id: '${shellId}' })\`.`,
|
||||
inspectLine,
|
||||
stopLine,
|
||||
].join('\n');
|
||||
|
||||
debugLogger.debug(
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue