mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-20 01:01:53 +00:00
fix(core): liveness check + throw-safe abort-reason read + encoding-aware PTY snapshot (resolve 4th review pass)
Resolves 6 threads on PR-1 of #3831 — 1 Critical + 1 real bug + 2 quality + 2 test-coverage: - **[Critical] `getShellAbortReasonKind` throw-safe property read.** Previous form read `reason.kind` after only checking that `kind` is an own property. An own accessor that throws (or a Proxy with a trapping getter) would throw before the helper reached either the cancel kill path or the background promote path. Abort handlers are dispatched async and not awaited by AbortSignal, so a leaked throw here would have left the shell process alive instead of being killed on cancel — quietly. Wrapped the property read in try/catch with a fall-back to the safe 'cancel' kill behavior. - **Real bug: child_process post-exit race in background-promote** (`performBackgroundPromote`). The child may have already exited but the 'exit' event hasn't reached our handler yet (Node delivers events on the next microtask). Promoting in that window would detach our exit listener and report `promoted: true` for a process that's already dead — the caller would hold an inert pid expecting to take over. Now we read `child.exitCode` / `child.signalCode` before detaching: if either is non-null, fall through and let the pending exit handler resolve normally with the real exit info. Mirrored mock setup so `exitCode` / `signalCode` default to `null` (matching real ChildProcess) instead of `undefined`. - **PTY snapshot: re-decode + replay (mirror exit-path encoding).** The promoted snapshot was serializing `headlessTerminal` directly, which was fed by a streaming decoder initialized from the first-chunk encoding heuristic. When early output is ASCII-only but later output is in a different encoding (GBK / Shift-JIS / etc.), this produces mojibake — and the normal exit path doesn't, because it re-decodes `finalBuffer` with `getCachedEncodingForBuffer` and replays through a fresh terminal. Now mirrors that logic so `result.output` shape matches across the two paths. Direct-serialize remains as a last-ditch fallback if replay throws. - **Switch `default` no longer emits a runtime warn.** Reviewer noted the helper's whitelist made the `default: { _exhaustive: never }` branch unreachable at runtime — the `debugLogger.warn` in it could never fire. Kept the `_: never = kind` type assertion (so a future ShellAbortReason variant forces a TS error here, directing the developer to extend BOTH the helper's whitelist AND add a `case`), removed the unreachable warn. Added a comment that the assertion is the static-only safety net the union expansion would trigger. - **Direct unit tests for `getShellAbortReasonKind`** (8 cases). The helper's prototype-pollution defense is the main reason it exists; if `hasOwnProperty` is accidentally removed the regression would silently send `abortController.abort({})` (any plain reason) into the promote path. Exported the helper and added direct tests for: null / undefined, non-object, empty object (no own kind), prototype- only kind (pollution), unknown kind value, throwing accessor, Proxy trap, and the two happy paths. - **`removeListener` regression guard.** The fix to call `ptyProcess.removeListener('error', ...)` instead of `.off(...)` matters because `@lydell/node-pty`'s IPty interface only exposes `removeListener` — `.off()` throws TypeError on a real PTY but the EventEmitter mock tolerates both. Added a test that spies on both methods and asserts the production code uses `removeListener` for the 'error' event, so a future swap back to `.off()` regresses loudly under the mock instead of silently. 68 / 68 tests pass (58 baseline + 9 helper boundary + 1 removeListener guard + 1 post-exit race); tsc clean; ESLint clean.
This commit is contained in:
parent
094fff09ca
commit
4cc558b3d5
2 changed files with 256 additions and 18 deletions
|
|
@ -21,7 +21,10 @@ import type {
|
|||
ShellAbortReason,
|
||||
ShellOutputEvent,
|
||||
} from './shellExecutionService.js';
|
||||
import { ShellExecutionService } from './shellExecutionService.js';
|
||||
import {
|
||||
getShellAbortReasonKind,
|
||||
ShellExecutionService,
|
||||
} from './shellExecutionService.js';
|
||||
import type { AnsiOutput } from '../utils/terminalSerializer.js';
|
||||
|
||||
const { Terminal } = pkg;
|
||||
|
|
@ -700,6 +703,40 @@ describe('ShellExecutionService', () => {
|
|||
expect(exitDisposableStub.dispose).toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it("post-promotion: ptyProcess error listener is removed via 'removeListener', NOT 'off' (regression guard for @lydell/node-pty)", async () => {
|
||||
// node EventEmitter exposes both `off` (Node 10+) and the legacy
|
||||
// `removeListener`, but @lydell/node-pty's IPty interface only
|
||||
// surfaces `removeListener` — calling `.off(...)` on a real PTY
|
||||
// throws TypeError. Pin that the production code path uses
|
||||
// `removeListener` so a future refactor swapping to `.off()`
|
||||
// doesn't silently regress under the EventEmitter mock (which
|
||||
// tolerates both).
|
||||
const removeListenerSpy = vi.spyOn(mockPtyProcess, 'removeListener');
|
||||
const offSpy = vi.spyOn(mockPtyProcess, 'off');
|
||||
|
||||
const { result } = await simulateExecution(
|
||||
'tail -f /tmp/never.log',
|
||||
(_pty, abortController) => {
|
||||
abortController.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_test123',
|
||||
} satisfies ShellAbortReason);
|
||||
},
|
||||
);
|
||||
|
||||
expect(result.promoted).toBe(true);
|
||||
// The 'error' handler is removed via legacy API; `.off` must not
|
||||
// appear in the production teardown path.
|
||||
expect(removeListenerSpy).toHaveBeenCalledWith(
|
||||
'error',
|
||||
expect.any(Function),
|
||||
);
|
||||
const offErrorCalls = offSpy.mock.calls.filter(
|
||||
([event]) => event === 'error',
|
||||
);
|
||||
expect(offErrorCalls).toEqual([]);
|
||||
});
|
||||
|
||||
it('post-promotion: PTY exit does NOT re-resolve the result (already resolved with promoted)', async () => {
|
||||
// Pin: even if the still-running child later exits naturally and the
|
||||
// caller's own exit listener fires, our foreground result Promise
|
||||
|
|
@ -1057,6 +1094,22 @@ describe('ShellExecutionService child_process fallback', () => {
|
|||
value: 12345,
|
||||
configurable: true,
|
||||
});
|
||||
// Mirror real Node ChildProcess: `exitCode` / `signalCode` are `null`
|
||||
// while the child is alive and become a number / signal name on
|
||||
// exit. The background-promote liveness guard reads these to detect
|
||||
// an exit that fired between abort dispatch and the abort handler
|
||||
// run, and a default of `undefined` would mistakenly look terminal
|
||||
// and skip the promote.
|
||||
Object.defineProperty(mockChildProcess, 'exitCode', {
|
||||
value: null,
|
||||
writable: true,
|
||||
configurable: true,
|
||||
});
|
||||
Object.defineProperty(mockChildProcess, 'signalCode', {
|
||||
value: null,
|
||||
writable: true,
|
||||
configurable: true,
|
||||
});
|
||||
|
||||
mockCpSpawn.mockReturnValue(mockChildProcess);
|
||||
});
|
||||
|
|
@ -1350,6 +1403,45 @@ describe('ShellExecutionService child_process fallback', () => {
|
|||
expect(result.output).not.toContain('post-promote-stderr');
|
||||
});
|
||||
|
||||
it('post-exit race: background-promote refuses if child is already terminal (exitCode/signalCode non-null)', async () => {
|
||||
// Race window: the child may have exited (exitCode set) but the
|
||||
// 'exit' event hasn't reached our handler yet because Node delivers
|
||||
// child_process events on the next microtask. Promoting in that
|
||||
// window would detach our exit listener and report `promoted: true`
|
||||
// for a process that's already dead — the caller would hold an
|
||||
// inert pid expecting to take over. Production code reads
|
||||
// exitCode / signalCode before detaching; if either is non-null,
|
||||
// it falls through and lets the pending exit handler resolve
|
||||
// normally with the real exit info.
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
const { result } = await simulateExecution(
|
||||
'fast-and-cancelled',
|
||||
(cp, abortController) => {
|
||||
// Simulate the race: pretend the child has already exited
|
||||
// (exitCode set on the ChildProcess) but the 'exit' event
|
||||
// emit is queued behind the abort dispatch.
|
||||
Object.defineProperty(cp, 'exitCode', {
|
||||
value: 0,
|
||||
writable: true,
|
||||
configurable: true,
|
||||
});
|
||||
abortController.abort({
|
||||
kind: 'background',
|
||||
shellId: 'bg_test123',
|
||||
} satisfies ShellAbortReason);
|
||||
// Now drain the pending exit + close events; the normal
|
||||
// exit path should resolve the result.
|
||||
cp.emit('exit', 0, null);
|
||||
cp.emit('close', 0, null);
|
||||
},
|
||||
);
|
||||
|
||||
// Result is the normal exit shape, not the promoted shape.
|
||||
expect(result.promoted).toBeUndefined();
|
||||
expect(result.aborted).toBe(true); // abortSignal.aborted is still true
|
||||
expect(result.exitCode).toBe(0);
|
||||
});
|
||||
|
||||
it('post-promotion: child exit does NOT re-resolve the result with a non-promoted shape', async () => {
|
||||
mockPlatform.mockReturnValue('linux');
|
||||
// Pin: even if the still-running child later exits naturally and the
|
||||
|
|
@ -1665,3 +1757,77 @@ describe('ShellExecutionService execution method selection', () => {
|
|||
expect(result.executionMethod).toBe('child_process');
|
||||
});
|
||||
});
|
||||
|
||||
describe('getShellAbortReasonKind (defensive abort-reason read)', () => {
|
||||
it("returns 'cancel' for null reason (e.g. plain abortController.abort())", () => {
|
||||
expect(getShellAbortReasonKind(null)).toBe('cancel');
|
||||
expect(getShellAbortReasonKind(undefined)).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'cancel' for non-object reasons (string / number / DOMException)", () => {
|
||||
expect(getShellAbortReasonKind('background')).toBe('cancel');
|
||||
expect(getShellAbortReasonKind(42)).toBe('cancel');
|
||||
expect(getShellAbortReasonKind(true)).toBe('cancel');
|
||||
// DOMException-like object — not the real DOMException constructor in
|
||||
// the test runtime, but the principle is the same: a non-discriminated
|
||||
// object reason without an own `kind` falls back to cancel.
|
||||
expect(getShellAbortReasonKind(new Error('aborted'))).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'cancel' for an empty object (no own kind)", () => {
|
||||
expect(getShellAbortReasonKind({})).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'cancel' when 'kind' lives only on the prototype (pollution defense)", () => {
|
||||
const polluted: Record<string, unknown> = Object.create({
|
||||
kind: 'background',
|
||||
});
|
||||
// hasOwnProperty('kind') is false → helper rejects the prototype-only kind
|
||||
expect(getShellAbortReasonKind(polluted)).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'cancel' for an unknown kind value (typo / future-untyped variant)", () => {
|
||||
expect(getShellAbortReasonKind({ kind: 'suspend' })).toBe('cancel');
|
||||
expect(getShellAbortReasonKind({ kind: 'BACKGROUND' })).toBe('cancel');
|
||||
expect(getShellAbortReasonKind({ kind: 42 })).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'cancel' when reading 'kind' throws (accessor / Proxy trap)", () => {
|
||||
const throwingReason = Object.defineProperty({}, 'kind', {
|
||||
enumerable: true,
|
||||
configurable: true,
|
||||
get() {
|
||||
throw new Error('accessor blew up');
|
||||
},
|
||||
});
|
||||
expect(getShellAbortReasonKind(throwingReason)).toBe('cancel');
|
||||
|
||||
const proxyReason = new Proxy(
|
||||
{},
|
||||
{
|
||||
get(_target, prop) {
|
||||
if (prop === 'kind') throw new Error('proxy trap blew up');
|
||||
return undefined;
|
||||
},
|
||||
getOwnPropertyDescriptor(_target, prop) {
|
||||
if (prop === 'kind') {
|
||||
return { configurable: true, enumerable: true, value: 'unused' };
|
||||
}
|
||||
return undefined;
|
||||
},
|
||||
},
|
||||
);
|
||||
expect(getShellAbortReasonKind(proxyReason)).toBe('cancel');
|
||||
});
|
||||
|
||||
it("returns 'background' for the canonical happy-path reason", () => {
|
||||
expect(getShellAbortReasonKind({ kind: 'background' })).toBe('background');
|
||||
expect(
|
||||
getShellAbortReasonKind({ kind: 'background', shellId: 'bg_x' }),
|
||||
).toBe('background');
|
||||
});
|
||||
|
||||
it("returns 'cancel' for the canonical cancel reason", () => {
|
||||
expect(getShellAbortReasonKind({ kind: 'cancel' })).toBe('cancel');
|
||||
});
|
||||
});
|
||||
|
|
|
|||
|
|
@ -44,18 +44,34 @@ const PROMOTE_DRAIN_TIMEOUT_MS = 200;
|
|||
* would force the kill path through the promote branch on any plain
|
||||
* `abortController.abort({})`. Lifecycle/safety branches deserve the
|
||||
* extra check.
|
||||
* - Wrap the property read in try/catch — an own getter or a `Proxy`
|
||||
* trap may throw during inspection. A throw here would propagate up
|
||||
* past the abort handler (which is dispatched async and not awaited
|
||||
* by AbortSignal), leaving the shell process alive instead of being
|
||||
* killed on cancel. We swallow the throw and fall back to 'cancel'.
|
||||
* - Whitelist the value against the known union — anything else (typos,
|
||||
* future-untyped variants) defaults to `'cancel'` so the historical
|
||||
* kill behavior is preserved as the safe fallback.
|
||||
*
|
||||
* Exported for direct unit testing of all six edge cases (null /
|
||||
* non-object / `{}` no own kind / prototype-only kind / unknown kind /
|
||||
* throwing accessor) — the integration tests only exercise the three
|
||||
* happy-path inputs.
|
||||
*/
|
||||
function getShellAbortReasonKind(reason: unknown): ShellAbortReason['kind'] {
|
||||
export function getShellAbortReasonKind(
|
||||
reason: unknown,
|
||||
): ShellAbortReason['kind'] {
|
||||
if (
|
||||
reason !== null &&
|
||||
typeof reason === 'object' &&
|
||||
Object.prototype.hasOwnProperty.call(reason, 'kind')
|
||||
) {
|
||||
const kind = (reason as { kind?: unknown }).kind;
|
||||
if (kind === 'background' || kind === 'cancel') return kind;
|
||||
try {
|
||||
const kind = (reason as { kind?: unknown }).kind;
|
||||
if (kind === 'background' || kind === 'cancel') return kind;
|
||||
} catch {
|
||||
// Throwing accessor / Proxy trap — fall back to safe kill below.
|
||||
}
|
||||
}
|
||||
return 'cancel';
|
||||
}
|
||||
|
|
@ -604,6 +620,25 @@ export class ShellExecutionService {
|
|||
|
||||
const performBackgroundPromote = (): void => {
|
||||
if (!child.pid || exited) return;
|
||||
// Race guard: the child may have already exited but the 'exit'
|
||||
// event hasn't reached our handler yet (Node delivers
|
||||
// child_process events on the next microtask). Promoting in
|
||||
// that window would detach our exit listener, leak the
|
||||
// already-terminal exit code, and report `promoted: true` to
|
||||
// the caller for a process that's already dead — they'd hold
|
||||
// an inert pid expecting to take over. Check exitCode /
|
||||
// signalCode before detaching: if either is non-null the
|
||||
// child is gone, so leave the listeners alone and let the
|
||||
// pending exit handler fire normally with the real exit info.
|
||||
if (child.exitCode !== null || child.signalCode !== null) {
|
||||
debugLogger.debug(
|
||||
`Background-promote requested for pid ${child.pid} but child ` +
|
||||
`is already terminal (exitCode=${child.exitCode}, ` +
|
||||
`signalCode=${child.signalCode}); falling through to the ` +
|
||||
`normal exit-handled resolution.`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
// Detach our listeners (so post-promote output doesn't leak
|
||||
// into the foreground onOutputEvent or the now-finalized text
|
||||
// decoder), drop the child from our active set (so cleanup()
|
||||
|
|
@ -688,11 +723,17 @@ export class ShellExecutionService {
|
|||
await performCancelKill();
|
||||
return;
|
||||
default: {
|
||||
// Unreachable at runtime: getShellAbortReasonKind whitelists
|
||||
// the return to the union members, so this branch only
|
||||
// exists to force a TS error if the `ShellAbortReason` union
|
||||
// ever gains a new variant — that error directs the
|
||||
// developer to (1) extend the helper's whitelist and
|
||||
// (2) add a `case` here. Without this exhaustiveness check
|
||||
// the helper's whitelist and the switch could drift apart
|
||||
// silently when the union grows.
|
||||
const _exhaustive: never = kind;
|
||||
debugLogger.warn(
|
||||
`Unknown ShellAbortReason kind, falling back to cancel/kill: ${String(_exhaustive)}`,
|
||||
);
|
||||
await performCancelKill();
|
||||
return _exhaustive;
|
||||
}
|
||||
}
|
||||
};
|
||||
|
|
@ -1171,17 +1212,42 @@ export class ShellExecutionService {
|
|||
const finalBuffer = Buffer.concat(outputChunks);
|
||||
let snapshot = '';
|
||||
try {
|
||||
snapshot = serializeTerminalToText(headlessTerminal) ?? '';
|
||||
// Mirror the normal exit path's snapshot logic: re-decode
|
||||
// the full buffer with the final encoding (the streaming
|
||||
// decoder fed `headlessTerminal` from a first-chunk
|
||||
// heuristic, which can mis-detect when early output is
|
||||
// ASCII-only but later output is in a different encoding,
|
||||
// e.g. GBK). Then replay through a fresh terminal so ANSI
|
||||
// sequences land at the right cursor position. Falling back
|
||||
// to `serializeTerminalToText(headlessTerminal)` would risk
|
||||
// mojibake on the promoted snapshot that the normal exit
|
||||
// path doesn't produce.
|
||||
if (isStreamingRawContent) {
|
||||
const finalEncoding = getCachedEncodingForBuffer(finalBuffer);
|
||||
const decodedOutput = new TextDecoder(finalEncoding).decode(
|
||||
finalBuffer,
|
||||
);
|
||||
snapshot = await replayTerminalOutput(decodedOutput, cols, rows);
|
||||
} else {
|
||||
snapshot = serializeTerminalToText(headlessTerminal) ?? '';
|
||||
}
|
||||
} catch (serErr) {
|
||||
// Best-effort snapshot — terminal serialization may fail if
|
||||
// the PTY is mid-render. Empty snapshot is acceptable since
|
||||
// the caller has rawOutput, but log so the failure leaves a
|
||||
// diagnostic trail (otherwise an empty `output` is
|
||||
// indistinguishable from "command produced no output").
|
||||
// Best-effort snapshot — re-decode + replay may fail (encoding
|
||||
// detection error, terminal write throw, etc.). Empty snapshot
|
||||
// is acceptable since the caller has rawOutput, but log so
|
||||
// the failure leaves a diagnostic trail (otherwise an empty
|
||||
// `output` is indistinguishable from "command produced no
|
||||
// output"). Try the simpler direct-serialize path as a
|
||||
// last-ditch fallback before giving up.
|
||||
debugLogger.warn(
|
||||
`Background-promote snapshot serialization failed: ${serErr instanceof Error ? serErr.message : String(serErr)}. ` +
|
||||
`Result.output will be empty; caller should fall back to rawOutput.`,
|
||||
`Background-promote snapshot replay failed: ${serErr instanceof Error ? serErr.message : String(serErr)}. ` +
|
||||
`Falling back to direct headlessTerminal serialize; if that also fails, output stays empty.`,
|
||||
);
|
||||
try {
|
||||
snapshot = serializeTerminalToText(headlessTerminal) ?? '';
|
||||
} catch {
|
||||
// Both paths failed — leave snapshot empty.
|
||||
}
|
||||
}
|
||||
resolve({
|
||||
rawOutput: finalBuffer,
|
||||
|
|
@ -1234,11 +1300,17 @@ export class ShellExecutionService {
|
|||
await performCancelKill();
|
||||
return;
|
||||
default: {
|
||||
// Unreachable at runtime: getShellAbortReasonKind whitelists
|
||||
// the return to the union members, so this branch only
|
||||
// exists to force a TS error if the `ShellAbortReason` union
|
||||
// ever gains a new variant — that error directs the
|
||||
// developer to (1) extend the helper's whitelist and
|
||||
// (2) add a `case` here. Without this exhaustiveness check
|
||||
// the helper's whitelist and the switch could drift apart
|
||||
// silently when the union grows.
|
||||
const _exhaustive: never = kind;
|
||||
debugLogger.warn(
|
||||
`Unknown ShellAbortReason kind, falling back to cancel/kill: ${String(_exhaustive)}`,
|
||||
);
|
||||
await performCancelKill();
|
||||
return _exhaustive;
|
||||
}
|
||||
}
|
||||
};
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue