From aac2e96ec3ff98d3d32444436611133855f9e5f2 Mon Sep 17 00:00:00 2001 From: Shaojin Wen Date: Tue, 28 Apr 2026 11:06:50 +0800 Subject: [PATCH] feat(core): managed background shell pool with /tasks command (#3642) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(core): managed background shell pool with /bashes command Replace shell.ts's `&` fork-and-detach background path with a managed process registry. Background shells now have observable lifecycle, captured output, and explicit cancellation — matching the pattern used by background subagents (#3076). Phase B from #3634 (background task management roadmap). What changes - New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts): per-process entry with status (running / completed / failed / cancelled), AbortController, output file path. State transitions are one-shot (terminal status sticks; late callbacks no-op). Mirrors the lifecycle shape of #3471's BackgroundTaskRegistry so the two can be unified later. - `shell.ts` is_background path rewritten as `executeBackground`: - Spawns the unwrapped command (no '&', no pgrep envelope) - Streams stdout to `/tasks//shell-.output` (path layout aligns with the direction sketched in #3471 review) - Bridges the external abort signal into the entry's AbortController so a single source of truth governs cancellation - Returns immediately with id + output path; agent's turn isn't blocked - Settles the registry entry asynchronously when ShellExecutionService resolves: complete (clean exit) / fail (error) / cancel (aborted) - Removes ~120 lines of dead bg-specific code from shell.ts: pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows early-return path, bg PID parsing, tempFile cleanup - New `/bashes` slash command: lists registered shells with id, status, runtime, command, output path. Empty state prints a friendly message. What this PR doesn't do - Footer pill / dialog integration — gated on #3488 landing - task_stop / send_message integration — gated on #3471 landing - Auto-backgrounding heuristics for long foreground bash — Phase D Test plan - 11 registry unit tests (state machine + idempotent terminal transitions) - 4 background-path tests in shell.test.ts (spawn no-wrap + complete / fail / cancel settle paths) - 2 /bashes command tests (empty + populated) - Full core suite: 247 files / 6075 passed (existing tests unaffected) * fix(core): address PR #3642 review feedback Three [Critical] from the auto review + naming alignment with Claude Code: - shell.ts settle: non-zero exit code or termination signal now bucket into `failed` instead of `completed`. The previous `if (result.error) fail else complete()` would misreport `false` / failed `npm test` as success because ShellExecutionService surfaces ordinary command failures as a non-zero exitCode with `error: null`. Failure reason carries the exit code or signal so `/tasks` shows the real cause. - ShellExecutionService.childProcessFallback: add `streamStdout` mode that emits each decoded chunk through the existing onOutputEvent path. The default (foreground) path continues to buffer + emit the cleaned final blob, so existing in-line shell calls are unaffected. executeBackground opts in via `{ streamStdout: true }`, which is what makes the captured output file actually useful for long-running processes (dev servers, watchers) — without it the file stayed empty until the process exited. - shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'` but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the build; switched to `signal: null`. Added a test that explicitly covers the new "non-zero exit → failed" path so the bucketing change has regression coverage. - shell.ts comment: explicitly document why background shells force `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead weight for fire-and-forget commands). - /bashes → /tasks (alias bashes), description "List and manage background tasks" — matches Claude Code's command name. Currently lists shells only; will surface other task kinds (subagents, monitor) as those registries land via #3471 / #3488. * fix(core): address PR #3642 second-round review feedback - shellExecutionService streaming: drop stdout/stderr buffer + outputChunks accumulation in streaming mode. Each decoded chunk goes straight to onOutputEvent and is GC-eligible immediately. Long-running background commands (dev servers, watchers) no longer accumulate unbounded memory proportional to total output. Buffered (foreground) mode is unchanged. - shell.ts executeBackground: stripAnsi each chunk before writing to the output file. Dev servers / build tools spam color codes and cursor-move sequences that would render as garbage in the file the agent reads. - bashesCommand: command description "List and manage" → "List background tasks" — current implementation only supports listing, cancellation follows when the unified task_stop tool from #3471 is wired in. Replace the hand-rolled formatRuntime helper with the shared formatDuration utility (uses hideTrailingZeros for parity with the previous output). - backgroundShellRegistry: add a comment documenting the lack of an eviction policy as a known limitation. LRU / age-based / capped-size eviction (and on-disk output rotation) is left as a follow-up alongside the broader output-file lifecycle story. * fix(core): address PR #3642 third-round review feedback - shell.ts executeBackground: add 'error' listener on the output write stream. fs.createWriteStream surfaces write failures (disk full, permission, fs going away) as 'error' events; without a listener Node treats it as an uncaught exception and kills the entire CLI session. Log + drop is the sane default — the registry still settles via resultPromise so /tasks shows the right terminal status. - shell.ts executeBackground: store the abort handler reference and removeEventListener in the settle callback. Background shells outlive the turn signal; the dangling listener was keeping `entryAc` (and transitively `outputStream`) reachable until the turn signal itself was GC'd, which for long sessions would never happen. - shell.test.ts: extend the createWriteStream mock with an `on` stub so the new error-listener wiring doesn't crash the test suite. * refactor(cli): drop /bashes alias and rename file to tasksCommand Per follow-up review: the slash command should be exclusively /tasks. Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` → `tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand` → `tasksCommand`, and cleans up the remaining `/bashes` references in backgroundShellRegistry.ts comments. No behavior change beyond the alias removal. * refactor(cli): finish tasksCommand rename — apply content changes The previous commit (03c8503c8) only captured the file rename via `git mv`; the export name change (`bashesCommand` → `tasksCommand`), the removal of `altNames: ['bashes']`, the import update in BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in backgroundShellRegistry.ts were unstaged when that commit landed. Squash candidate before merge. * fix(core): address PR #3642 fourth-round review feedback Four reviewer concerns from @wenshao + @doudouOUC: - [Critical] Config.shutdown() now also calls `backgroundShellRegistry.abortAll()`. Previously only the subagent registry was aborted, so a managed background shell could outlive the CLI process and orphan its child. Symmetric with how `BackgroundTaskRegistry.abortAll()` is wired in. - [P1] shell.ts executeBackground strips a trailing `&` from the command before spawn. The managed path is itself the backgrounding mechanism; forwarding `node server.js &` verbatim made bash exit immediately while the real child outlived the wrapper, causing the registry to settle as `completed` while the shell was still running and chunked output to land on a closed stream. Strip + warn. - [P2] Output file moves under `storage.getProjectTempDir()` (specifically `/background-shells//shell-.output`). `ReadFileTool` already auto-allows the project temp dir, so the LLM can `Read` the captured output without bouncing off a permission prompt — important because background-agent contexts can't surface interactive prompts. - [P2] Background shells are no longer killed when the current turn's AbortSignal fires. Forwarding the turn signal into the entry's AbortController meant a Ctrl+C on the turn would also terminate intentionally backgrounded dev servers / watchers, contradicting the independent-lifecycle promise. Cancellation now flows only through `entryAc` (driven by future `task_stop` integration via #3471). Tests: - New `abortAll` registry tests cover running / mixed / empty cases. - `runs background commands as managed pool entries` test stops asserting the wrapper-vs-entry signal identity since they're now structurally separate (no turn-to-entry forwarding). - New `does not forward the turn signal into the background shell` test pins the new behavior. - New `strips trailing & from the spawned command` test pins the strip. - Removed the cancel-via-outer-signal settle test — that path no longer exists; cancellation is exercised end-to-end via the registry's own `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`. * fix(core): tighten trailing & strip — narrow regex + ReDoS-safe Two reviewer concerns on the same line of #3642 round 4: - [Critical CodeQL] `\s*&+\s*$` is a polynomial-time regex on uncontrolled input (long all-`&` strings backtrack quadratically). - [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&` into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into `echo foo \` (eats the escaped literal). Only the bare bash background operator should be stripped. Replace the regex with a small linear-time helper `stripTrailingBackgroundAmp` that explicitly checks for the three "don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` / `slice` — no regex backtracking, and the intent reads off the page. Tests: - Existing strip-trailing-`&` test still passes. - New `does not strip a trailing &&` test pins the logical-AND case. - New `does not strip an escaped trailing \\&` test pins the escape case. * fix(core): keep binary-detection sniff in streaming mode @doudouOUC noted that `streamStdout` shortcut returned before the binary-sniff path, so a background command emitting binary bytes (`cat /bin/ls`, image dump, etc.) would be text-decoded and appended to the task output file unbounded. Restructure handleOutput so the sniff-and-cutover logic runs in both modes: - Both modes accumulate up to MAX_SNIFF_SIZE for the binary check. The accumulator is bounded; once the threshold is reached, it stops growing in streaming mode (dropped on binary detection / left inert on text confirmation) and continues to accumulate in buffered mode (existing foreground behavior). - Streaming mode emits 'binary_detected' as soon as `isBinary` trips so the consumer can stop writing the output file. Up to ~4KB of bytes may have been emitted as text chunks before detection — this is bounded and acceptable; the unbounded write is the pathology reviewers flagged. - Streaming text mode still emits each decoded chunk immediately and does not accumulate stdout/stderr strings, so long-running text streams remain GC-friendly. - Buffered (foreground) behavior is unchanged — the sniff accumulator is the same path the existing tests cover. Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57 shell.test.ts all pass; no regressions. * fix(core): tighten streaming sniff bound + Windows rmSync flake Two unrelated reds on the latest CI run: 1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks. The previous fix recomputed `sniffedBytes` from `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk — pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE (line-sized stdout, e.g. dev-server logs) the byte count never grew, the sniff branch stayed open forever, and `outputChunks` accumulated every later chunk — exactly the leak `streamStdout` was meant to prevent. Track sniffed bytes by running sum (`sniffedBytes += data.length`) so the bound is genuine. When sniff confirms text in streaming mode, drop the accumulator immediately so subsequent chunks fall through the streaming emit path without ever touching it. 2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows (ENOTEMPTY: directory not empty). The exporter's underlying write stream hasn't always released its handle by the time `rmSync` runs. Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through the brief Windows handle-release window instead of failing the test on a CI quirk. --------- Co-authored-by: wenshao --- .../cli/src/services/BuiltinCommandLoader.ts | 2 + .../cli/src/ui/commands/tasksCommand.test.ts | 94 +++ packages/cli/src/ui/commands/tasksCommand.ts | 78 +++ packages/core/src/config/config.ts | 7 + packages/core/src/index.ts | 1 + .../services/backgroundShellRegistry.test.ts | 159 +++++ .../src/services/backgroundShellRegistry.ts | 109 ++++ .../src/services/shellExecutionService.ts | 68 +- .../core/src/telemetry/file-exporters.test.ts | 9 +- packages/core/src/tools/shell.test.ts | 299 +++++---- packages/core/src/tools/shell.ts | 608 ++++++++++-------- 11 files changed, 1024 insertions(+), 410 deletions(-) create mode 100644 packages/cli/src/ui/commands/tasksCommand.test.ts create mode 100644 packages/cli/src/ui/commands/tasksCommand.ts create mode 100644 packages/core/src/services/backgroundShellRegistry.test.ts create mode 100644 packages/core/src/services/backgroundShellRegistry.ts diff --git a/packages/cli/src/services/BuiltinCommandLoader.ts b/packages/cli/src/services/BuiltinCommandLoader.ts index 086542695..68d533e2a 100644 --- a/packages/cli/src/services/BuiltinCommandLoader.ts +++ b/packages/cli/src/services/BuiltinCommandLoader.ts @@ -8,6 +8,7 @@ import type { ICommandLoader } from './types.js'; import type { SlashCommand } from '../ui/commands/types.js'; import type { Config } from '@qwen-code/qwen-code-core'; import { aboutCommand } from '../ui/commands/aboutCommand.js'; +import { tasksCommand } from '../ui/commands/tasksCommand.js'; import { agentsCommand } from '../ui/commands/agentsCommand.js'; import { arenaCommand } from '../ui/commands/arenaCommand.js'; import { approvalModeCommand } from '../ui/commands/approvalModeCommand.js'; @@ -92,6 +93,7 @@ export class BuiltinCommandLoader implements ICommandLoader { const allDefinitions: Array = [ aboutCommand, agentsCommand, + tasksCommand, arenaCommand, approvalModeCommand, authCommand, diff --git a/packages/cli/src/ui/commands/tasksCommand.test.ts b/packages/cli/src/ui/commands/tasksCommand.test.ts new file mode 100644 index 000000000..9d4355b04 --- /dev/null +++ b/packages/cli/src/ui/commands/tasksCommand.test.ts @@ -0,0 +1,94 @@ +/** + * @license + * Copyright 2026 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { vi, describe, it, expect, beforeEach } from 'vitest'; +import { tasksCommand } from './tasksCommand.js'; +import { type CommandContext } from './types.js'; +import { createMockCommandContext } from '../../test-utils/mockCommandContext.js'; +import type { BackgroundShellEntry } from '@qwen-code/qwen-code-core'; + +function entry( + overrides: Partial = {}, +): BackgroundShellEntry { + return { + shellId: 'bg_aaaaaaaa', + command: 'sleep 60', + cwd: '/tmp', + status: 'running', + startTime: Date.now() - 5_000, + outputPath: '/tmp/tasks/sess/shell-bg_aaaaaaaa.output', + abortController: new AbortController(), + ...overrides, + }; +} + +describe('tasksCommand', () => { + let context: CommandContext; + let getAll: ReturnType; + + beforeEach(() => { + getAll = vi.fn().mockReturnValue([]); + context = createMockCommandContext({ + services: { + config: { + getBackgroundShellRegistry: () => ({ getAll }), + }, + }, + } as unknown as Parameters[0]); + }); + + it('reports an empty registry', async () => { + const result = await tasksCommand.action!(context, ''); + expect(result).toEqual({ + type: 'message', + messageType: 'info', + content: 'No background shells.', + }); + }); + + it('lists running and terminal entries with status / runtime / output path', async () => { + getAll.mockReturnValue([ + entry({ + shellId: 'bg_run', + command: 'npm run dev', + status: 'running', + startTime: Date.now() - 12_000, + pid: 1111, + }), + entry({ + shellId: 'bg_done', + command: 'npm test', + status: 'completed', + exitCode: 0, + startTime: Date.now() - 70_000, + endTime: Date.now() - 5_000, + outputPath: '/tmp/tasks/sess/shell-bg_done.output', + }), + entry({ + shellId: 'bg_fail', + command: 'flaky.sh', + status: 'failed', + error: 'spawn ENOENT', + startTime: Date.now() - 3_000, + endTime: Date.now() - 2_000, + }), + ]); + + const result = await tasksCommand.action!(context, ''); + if (!result || result.type !== 'message') { + throw new Error('expected message result'); + } + expect(result.content).toContain('Background shells (3 total)'); + expect(result.content).toContain('[bg_run] running'); + expect(result.content).toContain('pid=1111'); + expect(result.content).toContain('npm run dev'); + expect(result.content).toContain('[bg_done] completed (exit 0)'); + expect(result.content).toContain('[bg_fail] failed: spawn ENOENT'); + expect(result.content).toContain( + 'output: /tmp/tasks/sess/shell-bg_done.output', + ); + }); +}); diff --git a/packages/cli/src/ui/commands/tasksCommand.ts b/packages/cli/src/ui/commands/tasksCommand.ts new file mode 100644 index 000000000..b58772e41 --- /dev/null +++ b/packages/cli/src/ui/commands/tasksCommand.ts @@ -0,0 +1,78 @@ +/** + * @license + * Copyright 2026 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type { BackgroundShellEntry } from '@qwen-code/qwen-code-core'; +import type { SlashCommand } from './types.js'; +import { CommandKind } from './types.js'; +import { t } from '../../i18n/index.js'; +import { formatDuration } from '../utils/formatters.js'; + +function statusLabel(entry: BackgroundShellEntry): string { + switch (entry.status) { + case 'completed': + return `completed (exit ${entry.exitCode ?? '?'})`; + case 'failed': + return `failed: ${entry.error ?? 'unknown error'}`; + case 'cancelled': + return 'cancelled'; + case 'running': + return 'running'; + default: + return entry.status; + } +} + +export const tasksCommand: SlashCommand = { + name: 'tasks', + get description() { + return t('List background tasks'); + }, + kind: CommandKind.BUILT_IN, + supportedModes: ['interactive', 'non_interactive', 'acp'] as const, + action: async (context) => { + const { config } = context.services; + if (!config) { + return { + type: 'message' as const, + messageType: 'error' as const, + content: 'Config not available.', + }; + } + + const entries = config.getBackgroundShellRegistry().getAll(); + + if (entries.length === 0) { + return { + type: 'message' as const, + messageType: 'info' as const, + content: 'No background shells.', + }; + } + + const now = Date.now(); + const lines: string[] = [ + `Background shells (${entries.length} total)`, + '', + ]; + for (const entry of entries) { + const endTime = entry.endTime ?? now; + const runtime = formatDuration(endTime - entry.startTime, { + hideTrailingZeros: true, + }); + const pidPart = entry.pid !== undefined ? ` pid=${entry.pid}` : ''; + lines.push( + `[${entry.shellId}] ${statusLabel(entry)} ${runtime}${pidPart} ${entry.command}`, + ); + lines.push(` output: ${entry.outputPath}`); + } + + return { + type: 'message' as const, + messageType: 'info' as const, + content: lines.join('\n'), + }; + }, +}; diff --git a/packages/core/src/config/config.ts b/packages/core/src/config/config.ts index a3b498b99..52aa49394 100644 --- a/packages/core/src/config/config.ts +++ b/packages/core/src/config/config.ts @@ -61,6 +61,7 @@ import { PermissionManager } from '../permissions/permission-manager.js'; import { SubagentManager } from '../subagents/subagent-manager.js'; import type { SubagentConfig } from '../subagents/types.js'; import { BackgroundTaskRegistry } from '../agents/background-tasks.js'; +import { BackgroundShellRegistry } from '../services/backgroundShellRegistry.js'; import { DEFAULT_OTLP_ENDPOINT, DEFAULT_TELEMETRY_TARGET, @@ -545,6 +546,7 @@ export class Config { private promptRegistry!: PromptRegistry; private subagentManager!: SubagentManager; private readonly backgroundTaskRegistry = new BackgroundTaskRegistry(); + private readonly backgroundShellRegistry = new BackgroundShellRegistry(); private extensionManager!: ExtensionManager; private skillManager: SkillManager | null = null; private permissionManager: PermissionManager | null = null; @@ -1610,6 +1612,7 @@ export class Config { } this.backgroundTaskRegistry.abortAll(); + this.backgroundShellRegistry.abortAll(); await this.cleanupArenaRuntime(); } catch (error) { @@ -2489,6 +2492,10 @@ export class Config { return this.backgroundTaskRegistry; } + getBackgroundShellRegistry(): BackgroundShellRegistry { + return this.backgroundShellRegistry; + } + /** * Whether interactive permission prompts should be auto-denied. * True for background agents that have no UI to show prompts. diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts index 98e43709c..88e356265 100644 --- a/packages/core/src/index.ts +++ b/packages/core/src/index.ts @@ -147,6 +147,7 @@ export * from './services/sessionService.js'; export * from './services/sessionTitle.js'; export { stripTerminalControlSequences } from './utils/terminalSafe.js'; export * from './services/shellExecutionService.js'; +export * from './services/backgroundShellRegistry.js'; export * from './services/toolUseSummary.js'; export * from './utils/bareMode.js'; diff --git a/packages/core/src/services/backgroundShellRegistry.test.ts b/packages/core/src/services/backgroundShellRegistry.test.ts new file mode 100644 index 000000000..193492d27 --- /dev/null +++ b/packages/core/src/services/backgroundShellRegistry.test.ts @@ -0,0 +1,159 @@ +/** + * @license + * Copyright 2026 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, expect, it } from 'vitest'; +import { + BackgroundShellRegistry, + type BackgroundShellEntry, +} from './backgroundShellRegistry.js'; + +function makeEntry( + overrides: Partial = {}, +): BackgroundShellEntry { + return { + shellId: 's1', + command: 'sleep 60', + cwd: '/tmp', + status: 'running', + startTime: 1000, + outputPath: '/tmp/s1.output', + abortController: new AbortController(), + ...overrides, + }; +} + +describe('BackgroundShellRegistry', () => { + describe('register / get / getAll', () => { + it('round-trips a registered entry by id', () => { + const reg = new BackgroundShellRegistry(); + const e = makeEntry({ shellId: 'a' }); + reg.register(e); + expect(reg.get('a')).toBe(e); + }); + + it('returns undefined for unknown id', () => { + const reg = new BackgroundShellRegistry(); + expect(reg.get('missing')).toBeUndefined(); + }); + + it('lists all entries via getAll', () => { + const reg = new BackgroundShellRegistry(); + const a = makeEntry({ shellId: 'a' }); + const b = makeEntry({ shellId: 'b' }); + reg.register(a); + reg.register(b); + const all = reg.getAll(); + expect(all).toHaveLength(2); + expect(all).toContain(a); + expect(all).toContain(b); + }); + }); + + describe('complete', () => { + it('transitions running → completed with exitCode and endTime', () => { + const reg = new BackgroundShellRegistry(); + reg.register(makeEntry({ shellId: 'a' })); + reg.complete('a', 0, 2000); + const e = reg.get('a')!; + expect(e.status).toBe('completed'); + expect(e.exitCode).toBe(0); + expect(e.endTime).toBe(2000); + }); + + it('is a no-op when entry is not running', () => { + const reg = new BackgroundShellRegistry(); + reg.register(makeEntry({ shellId: 'a' })); + reg.cancel('a', 1500); + reg.complete('a', 0, 2000); + const e = reg.get('a')!; + expect(e.status).toBe('cancelled'); + expect(e.exitCode).toBeUndefined(); + }); + + it('is a no-op for unknown id', () => { + const reg = new BackgroundShellRegistry(); + expect(() => reg.complete('missing', 0, 0)).not.toThrow(); + }); + }); + + describe('fail', () => { + it('transitions running → failed with error and endTime', () => { + const reg = new BackgroundShellRegistry(); + reg.register(makeEntry({ shellId: 'a' })); + reg.fail('a', 'spawn error', 2000); + const e = reg.get('a')!; + expect(e.status).toBe('failed'); + expect(e.error).toBe('spawn error'); + expect(e.endTime).toBe(2000); + }); + + it('is a no-op when entry is not running', () => { + const reg = new BackgroundShellRegistry(); + reg.register(makeEntry({ shellId: 'a' })); + reg.complete('a', 0, 1500); + reg.fail('a', 'late error', 2000); + const e = reg.get('a')!; + expect(e.status).toBe('completed'); + expect(e.error).toBeUndefined(); + }); + }); + + describe('abortAll', () => { + it('cancels every running entry and leaves terminal entries alone', () => { + const reg = new BackgroundShellRegistry(); + const acRunning1 = new AbortController(); + const acRunning2 = new AbortController(); + const acDone = new AbortController(); + reg.register(makeEntry({ shellId: 'a', abortController: acRunning1 })); + reg.register(makeEntry({ shellId: 'b', abortController: acRunning2 })); + reg.register(makeEntry({ shellId: 'c', abortController: acDone })); + reg.complete('c', 0, 1500); + + reg.abortAll(); + + expect(reg.get('a')!.status).toBe('cancelled'); + expect(reg.get('b')!.status).toBe('cancelled'); + expect(reg.get('c')!.status).toBe('completed'); + expect(acRunning1.signal.aborted).toBe(true); + expect(acRunning2.signal.aborted).toBe(true); + expect(acDone.signal.aborted).toBe(false); + }); + + it('is a no-op when registry is empty', () => { + const reg = new BackgroundShellRegistry(); + expect(() => reg.abortAll()).not.toThrow(); + }); + }); + + describe('cancel', () => { + it('transitions running → cancelled and aborts the signal', () => { + const reg = new BackgroundShellRegistry(); + const ac = new AbortController(); + reg.register(makeEntry({ shellId: 'a', abortController: ac })); + reg.cancel('a', 2000); + const e = reg.get('a')!; + expect(e.status).toBe('cancelled'); + expect(e.endTime).toBe(2000); + expect(ac.signal.aborted).toBe(true); + }); + + it('is a no-op when entry is already terminal', () => { + const reg = new BackgroundShellRegistry(); + const ac = new AbortController(); + reg.register(makeEntry({ shellId: 'a', abortController: ac })); + reg.complete('a', 0, 1500); + reg.cancel('a', 2000); + const e = reg.get('a')!; + expect(e.status).toBe('completed'); + expect(ac.signal.aborted).toBe(false); + }); + + it('is a no-op for unknown id', () => { + const reg = new BackgroundShellRegistry(); + expect(() => reg.cancel('missing', 0)).not.toThrow(); + }); + }); +}); diff --git a/packages/core/src/services/backgroundShellRegistry.ts b/packages/core/src/services/backgroundShellRegistry.ts new file mode 100644 index 000000000..b8b2e5951 --- /dev/null +++ b/packages/core/src/services/backgroundShellRegistry.ts @@ -0,0 +1,109 @@ +/** + * @license + * Copyright 2026 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * Tracks background shell processes spawned via the `shell` tool with + * `is_background: true`. Each entry holds the metadata the agent and the + * `/tasks` slash command need to query, observe, or terminate a running + * background shell. + * + * State machine: register → running → { completed | failed | cancelled }. + * Transitions out of running are one-shot: complete/fail/cancel become + * no-ops once the entry has settled. This prevents late callbacks (e.g. a + * process that exits during cancellation) from clobbering the terminal + * status. + */ + +export type BackgroundShellStatus = + | 'running' + | 'completed' + | 'failed' + | 'cancelled'; + +export interface BackgroundShellEntry { + /** Stable id used by the model and the `/tasks` UI. */ + shellId: string; + /** The user-supplied command, after any pre-processing the tool applies. */ + command: string; + /** Working directory the process was spawned in. */ + cwd: string; + /** OS pid once spawned; absent if registration happens before spawn. */ + pid?: number; + status: BackgroundShellStatus; + /** Exit code on `completed`. */ + exitCode?: number; + /** Error message on `failed`. */ + error?: string; + /** ms epoch when the entry was registered. */ + startTime: number; + /** ms epoch when the entry transitioned out of running. */ + endTime?: number; + /** Absolute path of the captured stdout/stderr file. */ + outputPath: string; + /** Aborted by `cancel()`; callers should wire it into the spawn. */ + abortController: AbortController; +} + +export class BackgroundShellRegistry { + // Entries persist for the session lifetime — no automatic eviction of + // terminal entries. For typical interactive sessions (tens of background + // shells over an hour) this is fine, but long-running sessions that spawn + // many short-lived background commands will see the map and the on-disk + // output files grow without bound. Eviction policy (LRU? age-based? cap?) + // is left as a follow-up alongside output-file rotation. + private readonly entries = new Map(); + + register(entry: BackgroundShellEntry): void { + this.entries.set(entry.shellId, entry); + } + + get(shellId: string): BackgroundShellEntry | undefined { + return this.entries.get(shellId); + } + + getAll(): readonly BackgroundShellEntry[] { + return [...this.entries.values()]; + } + + complete(shellId: string, exitCode: number, endTime: number): void { + const entry = this.entries.get(shellId); + if (!entry || entry.status !== 'running') return; + entry.status = 'completed'; + entry.exitCode = exitCode; + entry.endTime = endTime; + } + + fail(shellId: string, error: string, endTime: number): void { + const entry = this.entries.get(shellId); + if (!entry || entry.status !== 'running') return; + entry.status = 'failed'; + entry.error = error; + entry.endTime = endTime; + } + + cancel(shellId: string, endTime: number): void { + const entry = this.entries.get(shellId); + if (!entry || entry.status !== 'running') return; + entry.status = 'cancelled'; + entry.endTime = endTime; + entry.abortController.abort(); + } + + /** + * Cancel every still-running entry. Called on session/Config shutdown so + * background shells don't outlive the CLI process and leak orphaned + * children. Symmetric with `BackgroundTaskRegistry.abortAll()` for the + * subagent path. + */ + abortAll(): void { + const endTime = Date.now(); + for (const entry of Array.from(this.entries.values())) { + if (entry.status === 'running') { + this.cancel(entry.shellId, endTime); + } + } + } +} diff --git a/packages/core/src/services/shellExecutionService.ts b/packages/core/src/services/shellExecutionService.ts index 816f41fa5..8d7e0bb3b 100644 --- a/packages/core/src/services/shellExecutionService.ts +++ b/packages/core/src/services/shellExecutionService.ts @@ -329,6 +329,7 @@ export class ShellExecutionService { abortSignal: AbortSignal, shouldUseNodePty: boolean, shellExecutionConfig: ShellExecutionConfig, + options: { streamStdout?: boolean } = {}, ): Promise { if (shouldUseNodePty) { const ptyInfo = await getPty(); @@ -353,6 +354,7 @@ export class ShellExecutionService { cwd, onOutputEvent, abortSignal, + options.streamStdout ?? false, ); } @@ -361,6 +363,7 @@ export class ShellExecutionService { cwd: string, onOutputEvent: (event: ShellOutputEvent) => void, abortSignal: AbortSignal, + streamStdout: boolean, ): ShellExecutionHandle { try { const isWindows = os.platform() === 'win32'; @@ -415,26 +418,61 @@ export class ShellExecutionService { } } - outputChunks.push(data); - + // Binary sniff applies in both modes — even streaming consumers + // (e.g. background shell output file) shouldn't pile up text-decoded + // garbage when the command actually emits binary (`cat /bin/ls`, + // image dumps, etc.). Track sniffed bytes by running sum so the + // accumulator is truly byte-bounded — the previous version recomputed + // sniffedBytes from `slice(0, 20)` on every call, which never grew + // past the first 20 chunks' total and let the chunk array leak on + // line-sized streams. if (isStreamingRawContent && sniffedBytes < MAX_SNIFF_SIZE) { - const sniffBuffer = Buffer.concat(outputChunks.slice(0, 20)); - sniffedBytes = sniffBuffer.length; - + outputChunks.push(data); + sniffedBytes += data.length; + const sniffBuffer = Buffer.concat(outputChunks); if (isBinary(sniffBuffer)) { isStreamingRawContent = false; + if (streamStdout) { + // Tell the streaming consumer to stop writing text chunks; + // drop the sniff accumulator now so it can be GC'd. + onOutputEvent({ type: 'binary_detected' }); + outputChunks.length = 0; + } + } else if (streamStdout && sniffedBytes >= MAX_SNIFF_SIZE) { + // Sniff passed in streaming mode — text confirmed, drop the + // accumulator. Subsequent chunks fall through to the streaming + // emit path below without ever touching outputChunks. + outputChunks.length = 0; } + } else if (!streamStdout) { + // Buffered (foreground) mode past sniff: keep accumulating for + // the final emit at exit. Streaming mode does not accumulate. + outputChunks.push(data); } - if (isStreamingRawContent) { - const decoder = stream === 'stdout' ? stdoutDecoder : stderrDecoder; - const decodedChunk = decoder.decode(data, { stream: true }); + if (!isStreamingRawContent) { + // Binary mode: drop further data. Foreground emits the + // binary_detected event from handleExit (existing behavior); + // background already emitted it above. + return; + } - if (stream === 'stdout') { - stdout += decodedChunk; - } else { - stderr += decodedChunk; - } + const decoder = stream === 'stdout' ? stdoutDecoder : stderrDecoder; + const decodedChunk = decoder.decode(data, { stream: true }); + + if (streamStdout) { + // Streaming text mode: push through immediately, no string + // accumulation. (Up to ~4KB may already have been emitted + // before binary detection trips — bounded, acceptable.) + onOutputEvent({ type: 'data', chunk: decodedChunk }); + return; + } + + // Buffered text mode: accumulate for the final cleaned-blob emit. + if (stream === 'stdout') { + stdout += decodedChunk; + } else { + stderr += decodedChunk; } }; @@ -451,7 +489,9 @@ export class ShellExecutionService { const finalStrippedOutput = stripAnsi(combinedOutput).trim(); if (isStreamingRawContent) { - if (finalStrippedOutput) { + // In streaming mode chunks were already emitted as they arrived; + // re-emitting the final blob would duplicate everything. + if (!streamStdout && finalStrippedOutput) { onOutputEvent({ type: 'data', chunk: finalStrippedOutput }); } } else { diff --git a/packages/core/src/telemetry/file-exporters.test.ts b/packages/core/src/telemetry/file-exporters.test.ts index 40a871de1..3dede26fc 100644 --- a/packages/core/src/telemetry/file-exporters.test.ts +++ b/packages/core/src/telemetry/file-exporters.test.ts @@ -27,7 +27,14 @@ describe('FileExporter.serialize', () => { afterEach(async () => { await exporter.shutdown(); - fs.rmSync(tmpDir, { recursive: true, force: true }); + // Windows occasionally returns ENOTEMPTY when the underlying file + // handle isn't fully released yet; retry a few times before failing. + fs.rmSync(tmpDir, { + recursive: true, + force: true, + maxRetries: 5, + retryDelay: 50, + }); }); // Regression for upstream PR #4689: a raw JSON.stringify on a ReadableSpan diff --git a/packages/core/src/tools/shell.test.ts b/packages/core/src/tools/shell.test.ts index a8748e375..206f67d99 100644 --- a/packages/core/src/tools/shell.test.ts +++ b/packages/core/src/tools/shell.test.ts @@ -31,8 +31,6 @@ import { } from '../services/shellExecutionService.js'; import * as fs from 'node:fs'; import * as os from 'node:os'; -import { EOL } from 'node:os'; -import * as path from 'node:path'; import * as crypto from 'node:crypto'; import { ToolErrorType } from './tool-error.js'; import { OUTPUT_UPDATE_INTERVAL_MS } from './shell.js'; @@ -55,12 +53,14 @@ describe('ShellTool', () => { getPermissionsDeny: vi.fn().mockReturnValue([]), getDebugMode: vi.fn().mockReturnValue(false), getTargetDir: vi.fn().mockReturnValue('/test/dir'), + getSessionId: vi.fn().mockReturnValue('test-session'), getWorkspaceContext: vi .fn() .mockReturnValue(createMockWorkspaceContext('/test/dir')), storage: { getUserSkillsDirs: vi.fn().mockReturnValue(['/test/dir/.qwen/skills']), getProjectTempDir: vi.fn().mockReturnValue('/tmp/qwen-temp'), + getProjectDir: vi.fn().mockReturnValue('/test/proj'), }, getTruncateToolOutputThreshold: vi.fn().mockReturnValue(0), getTruncateToolOutputLines: vi.fn().mockReturnValue(0), @@ -72,8 +72,24 @@ describe('ShellTool', () => { email: 'qwen-coder@alibabacloud.com', }), getShouldUseNodePtyShell: vi.fn().mockReturnValue(false), + getBackgroundShellRegistry: vi.fn().mockReturnValue({ + register: vi.fn(), + get: vi.fn(), + getAll: vi.fn().mockReturnValue([]), + cancel: vi.fn(), + complete: vi.fn(), + fail: vi.fn(), + }), } as unknown as Config; + // executeBackground writes to disk; stub mkdirSync + createWriteStream. + vi.mocked(fs.mkdirSync).mockReturnValue(undefined); + vi.mocked(fs.createWriteStream).mockReturnValue({ + write: vi.fn(), + end: vi.fn(), + on: vi.fn(), + } as unknown as fs.WriteStream); + shellTool = new ShellTool(mockConfig); vi.mocked(os.platform).mockReturnValue('linux'); @@ -271,83 +287,202 @@ describe('ShellTool', () => { resolveExecutionPromise(fullResult); }; - it('should wrap background command on linux and parse pgrep output', async () => { + it('runs background commands as managed pool entries (no & / pgrep wrap)', async () => { + const registry = mockConfig.getBackgroundShellRegistry(); const invocation = shellTool.build({ - command: 'my-command', + command: 'npm start', is_background: true, }); - const promise = invocation.execute(mockAbortSignal); - resolveShellExecution({ pid: 54321 }); - vi.mocked(fs.existsSync).mockReturnValue(true); - vi.mocked(fs.readFileSync).mockReturnValue(`54321${EOL}54322${EOL}`); // Service PID and background PID + const result = await invocation.execute(mockAbortSignal); - const result = await promise; - - const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp'); - const wrappedCommand = `{ my-command & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`; + // Spawn happens with the unwrapped command — no '&', no pgrep envelope. + // Streaming mode is on so dev-server / watcher output flushes to the + // output file as it arrives instead of buffering until exit. expect(mockShellExecutionService).toHaveBeenCalledWith( - wrappedCommand, + 'npm start', '/test/dir', expect.any(Function), expect.any(AbortSignal), false, {}, + { streamStdout: true }, ); - expect(result.llmContent).toContain('PIDs: 54322'); - expect(vi.mocked(fs.unlinkSync)).toHaveBeenCalledWith(tmpFile); + // Entry registered with the spawn pid. + expect(registry.register).toHaveBeenCalledTimes(1); + const entry = (registry.register as Mock).mock.calls[0][0]; + expect(entry.command).toBe('npm start'); + expect(entry.cwd).toBe('/test/dir'); + expect(entry.status).toBe('running'); + expect(entry.pid).toBe(12345); + expect(typeof entry.shellId).toBe('string'); + expect(entry.outputPath).toContain('shell-'); + // Returns immediately with id + output path; agent's turn isn't blocked. + expect(result.llmContent).toContain(entry.shellId); + expect(result.llmContent).toContain(entry.outputPath); }); - it('should add ampersand to command when is_background is true and command does not end with &', async () => { + it('settles a background entry as completed when the process exits cleanly', async () => { + const registry = mockConfig.getBackgroundShellRegistry(); const invocation = shellTool.build({ - command: 'npm start', + command: 'true', is_background: true, }); - const promise = invocation.execute(mockAbortSignal); - resolveShellExecution({ pid: 54321 }); + await invocation.execute(mockAbortSignal); + const entry = (registry.register as Mock).mock.calls[0][0]; - vi.mocked(fs.existsSync).mockReturnValue(true); - vi.mocked(fs.readFileSync).mockReturnValue('54321\n54322\n'); + resolveExecutionPromise({ + rawOutput: Buffer.from(''), + output: '', + exitCode: 0, + signal: null, + error: null, + aborted: false, + pid: 12345, + executionMethod: 'child_process', + }); + // Flush the .then() microtask attached to resultPromise. + await new Promise((r) => setImmediate(r)); - await promise; + expect(registry.complete).toHaveBeenCalledWith( + entry.shellId, + 0, + expect.any(Number), + ); + expect(registry.fail).not.toHaveBeenCalled(); + expect(registry.cancel).not.toHaveBeenCalled(); + }); - const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp'); - const wrappedCommand = `{ npm start & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`; + it('settles a background entry as failed when ShellExecutionService reports error', async () => { + const registry = mockConfig.getBackgroundShellRegistry(); + const invocation = shellTool.build({ + command: 'no-such-command', + is_background: true, + }); + await invocation.execute(mockAbortSignal); + const entry = (registry.register as Mock).mock.calls[0][0]; + + resolveExecutionPromise({ + rawOutput: Buffer.from(''), + output: '', + exitCode: null, + signal: null, + error: new Error('spawn ENOENT'), + aborted: false, + pid: 12345, + executionMethod: 'child_process', + }); + await new Promise((r) => setImmediate(r)); + + expect(registry.fail).toHaveBeenCalledWith( + entry.shellId, + 'spawn ENOENT', + expect.any(Number), + ); + expect(registry.complete).not.toHaveBeenCalled(); + }); + + it('settles a background entry as failed on non-zero exit code (no error object)', async () => { + const registry = mockConfig.getBackgroundShellRegistry(); + const invocation = shellTool.build({ + command: 'false', + is_background: true, + }); + await invocation.execute(mockAbortSignal); + const entry = (registry.register as Mock).mock.calls[0][0]; + + // ShellExecutionService reports a clean non-zero exit (no error object, + // no signal) — historically this got bucketed as `completed`, which + // misreported a failed `npm test` / `false` as a success. + resolveExecutionPromise({ + rawOutput: Buffer.from(''), + output: '', + exitCode: 1, + signal: null, + error: null, + aborted: false, + pid: 12345, + executionMethod: 'child_process', + }); + await new Promise((r) => setImmediate(r)); + + expect(registry.fail).toHaveBeenCalledWith( + entry.shellId, + expect.stringContaining('exited with code 1'), + expect.any(Number), + ); + expect(registry.complete).not.toHaveBeenCalled(); + }); + + it('strips trailing & from the spawned command (managed path handles backgrounding)', async () => { + const invocation = shellTool.build({ + command: 'node server.js &', + is_background: true, + }); + await invocation.execute(mockAbortSignal); expect(mockShellExecutionService).toHaveBeenCalledWith( - wrappedCommand, + 'node server.js', + '/test/dir', + expect.any(Function), + expect.any(AbortSignal), + false, + {}, + { streamStdout: true }, + ); + }); + + it('does not strip a trailing && (logical AND would be syntactically broken)', async () => { + const invocation = shellTool.build({ + command: 'npm run dev &&', + is_background: true, + }); + await invocation.execute(mockAbortSignal); + expect(mockShellExecutionService).toHaveBeenCalledWith( + 'npm run dev &&', expect.any(String), expect.any(Function), expect.any(AbortSignal), false, {}, + { streamStdout: true }, ); }); - it('should not add extra ampersand when is_background is true and command already ends with &', async () => { + it('does not strip an escaped trailing \\& (literal &)', async () => { const invocation = shellTool.build({ - command: 'npm start &', + command: 'echo foo \\&', is_background: true, }); - const promise = invocation.execute(mockAbortSignal); - resolveShellExecution({ pid: 54321 }); - - vi.mocked(fs.existsSync).mockReturnValue(true); - vi.mocked(fs.readFileSync).mockReturnValue('54321\n54322\n'); - - await promise; - - const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp'); - const wrappedCommand = `{ npm start & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`; + await invocation.execute(mockAbortSignal); expect(mockShellExecutionService).toHaveBeenCalledWith( - wrappedCommand, + 'echo foo \\&', expect.any(String), expect.any(Function), expect.any(AbortSignal), false, {}, + { streamStdout: true }, ); }); + it('does not forward the turn signal into the background shell', async () => { + // Verifies: the AbortSignal handed to ShellExecutionService is the + // entry's own controller, not the outer turn signal. Cancelling the + // turn must not kill an intentionally backgrounded dev server / watcher. + const turnAc = new AbortController(); + const invocation = shellTool.build({ + command: 'npm run dev', + is_background: true, + }); + await invocation.execute(turnAc.signal); + const passedSignal = mockShellExecutionService.mock.calls[0][3]; + expect(passedSignal).not.toBe(turnAc.signal); + turnAc.abort(); + // The signal handed to ShellExecutionService stays un-aborted — + // the turn's abort doesn't propagate into the background shell. + expect(passedSignal.aborted).toBe(false); + }); + it('should not add ampersand when is_background is false', async () => { const invocation = shellTool.build({ command: 'npm test', @@ -479,23 +614,6 @@ describe('ShellTool', () => { ).toThrow('Directory must be an absolute path.'); }); - it('should clean up the temp file on synchronous execution error', async () => { - const error = new Error('sync spawn error'); - mockShellExecutionService.mockImplementation(() => { - throw error; - }); - vi.mocked(fs.existsSync).mockReturnValue(true); // Pretend the file exists - - const invocation = shellTool.build({ - command: 'a-command', - is_background: false, - }); - await expect(invocation.execute(mockAbortSignal)).rejects.toThrow(error); - - const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp'); - expect(vi.mocked(fs.unlinkSync)).toHaveBeenCalledWith(tmpFile); - }); - describe('Streaming to `updateOutput`', () => { let updateOutputMock: Mock; beforeEach(() => { @@ -1016,43 +1134,6 @@ describe('ShellTool', () => { }); }); - describe('Windows background execution', () => { - it('should clean up trailing ampersand on Windows for background tasks', async () => { - vi.mocked(os.platform).mockReturnValue('win32'); - const mockAbortSignal = new AbortController().signal; - - const invocation = shellTool.build({ - command: 'npm start &', - is_background: true, - }); - - const promise = invocation.execute(mockAbortSignal); - - // Simulate immediate success (process started) - resolveExecutionPromise({ - rawOutput: Buffer.from(''), - output: '', - exitCode: 0, - signal: null, - error: null, - aborted: false, - pid: 12345, - executionMethod: 'child_process', - }); - - await promise; - - expect(mockShellExecutionService).toHaveBeenCalledWith( - 'npm start', - expect.any(String), - expect.any(Function), - expect.any(AbortSignal), - false, - {}, - ); - }); - }); - describe('timeout parameter', () => { it('should validate timeout parameter correctly', async () => { // Valid timeout @@ -1177,40 +1258,6 @@ describe('ShellTool', () => { expect(calledSignal).not.toBe(mockAbortSignal); }); - it('should not create timeout signal for background execution', async () => { - const mockAbortSignal = new AbortController().signal; - const invocation = shellTool.build({ - command: 'npm start', - is_background: true, - timeout: 5000, - }); - - const promise = invocation.execute(mockAbortSignal); - - resolveExecutionPromise({ - rawOutput: Buffer.from(''), - output: 'Background command started. PID: 12345', - exitCode: 0, - signal: null, - error: null, - aborted: false, - pid: 12345, - executionMethod: 'child_process', - }); - - await promise; - - // For background execution, the original signal should be used - expect(mockShellExecutionService).toHaveBeenCalledWith( - expect.any(String), - expect.any(String), - expect.any(Function), - mockAbortSignal, - false, - {}, - ); - }); - it('should handle timeout vs user cancellation correctly', async () => { const userAbortController = new AbortController(); const invocation = shellTool.build({ diff --git a/packages/core/src/tools/shell.ts b/packages/core/src/tools/shell.ts index 02c486181..625dfe4c6 100644 --- a/packages/core/src/tools/shell.ts +++ b/packages/core/src/tools/shell.ts @@ -6,7 +6,7 @@ import fs from 'node:fs'; import path from 'node:path'; -import os, { EOL } from 'node:os'; +import os from 'node:os'; import crypto from 'node:crypto'; import type { Config } from '../config/config.js'; import { ToolNames, ToolDisplayNames } from './tool-names.js'; @@ -29,6 +29,8 @@ import type { ShellOutputEvent, } from '../services/shellExecutionService.js'; import { ShellExecutionService } from '../services/shellExecutionService.js'; +import type { BackgroundShellEntry } from '../services/backgroundShellRegistry.js'; +import stripAnsi from 'strip-ansi'; import { formatMemoryUsage } from '../utils/formatters.js'; import type { AnsiOutput } from '../utils/terminalSerializer.js'; import { isSubpaths } from '../utils/paths.js'; @@ -46,6 +48,20 @@ import { const debugLogger = createDebugLogger('SHELL'); +/** + * Strip a single bare trailing `&` (bash background operator) from a + * command string. Returns the input unchanged if the trailing form is + * `&&` (logical AND), `\&` (escaped literal `&`), or there is no `&` + * at the end at all. Linear time, no regex backtracking risk. + */ +function stripTrailingBackgroundAmp(command: string): string { + const trimmed = command.trimEnd(); + if (!trimmed.endsWith('&')) return command; + if (trimmed.endsWith('&&')) return command; + if (trimmed.endsWith('\\&')) return command; + return trimmed.slice(0, -1).trimEnd(); +} + export const OUTPUT_UPDATE_INTERVAL_MS = 1000; const DEFAULT_FOREGROUND_TIMEOUT_MS = 120000; @@ -208,9 +224,12 @@ export class ShellToolInvocation extends BaseToolInvocation< }; } - const effectiveTimeout = this.params.is_background - ? undefined - : (this.params.timeout ?? DEFAULT_FOREGROUND_TIMEOUT_MS); + if (this.params.is_background) { + return this.executeBackground(signal, shellExecutionConfig); + } + + const effectiveTimeout = + this.params.timeout ?? DEFAULT_FOREGROUND_TIMEOUT_MS; // Create combined signal with timeout for foreground execution let combinedSignal = signal; @@ -219,294 +238,345 @@ export class ShellToolInvocation extends BaseToolInvocation< combinedSignal = AbortSignal.any([signal, timeoutSignal]); } - const isWindows = os.platform() === 'win32'; - const tempFileName = `shell_pgrep_${crypto - .randomBytes(6) - .toString('hex')}.tmp`; - const tempFilePath = path.join(os.tmpdir(), tempFileName); + // Add co-author to git commit commands + const processedCommand = this.addCoAuthorToGitCommit(strippedCommand); + const commandToExecute = processedCommand; + const cwd = this.params.directory || this.config.getTargetDir(); - try { - // Add co-author to git commit commands - const processedCommand = this.addCoAuthorToGitCommit(strippedCommand); + let cumulativeOutput: string | AnsiOutput = ''; + let lastUpdateTime = Date.now(); + let isBinaryStream = false; + let totalLines = 0; + let totalBytes = 0; - const shouldRunInBackground = this.params.is_background; - let finalCommand = processedCommand; + const { result: resultPromise, pid } = await ShellExecutionService.execute( + commandToExecute, + cwd, + (event: ShellOutputEvent) => { + let shouldUpdate = false; - // On non-Windows, use & to run in background. - // On Windows, we don't use start /B because it creates a detached process that - // doesn't die when the parent dies. Instead, we rely on the race logic below - // to return early while keeping the process attached (detached: false). - if ( - !isWindows && - shouldRunInBackground && - !finalCommand.trim().endsWith('&') - ) { - finalCommand = finalCommand.trim() + ' &'; - } - - // On Windows, we rely on the race logic below to handle background tasks. - // We just ensure the command string is clean. - if (isWindows && shouldRunInBackground) { - finalCommand = finalCommand.trim().replace(/&+$/, '').trim(); - } - - // On non-Windows background commands, wrap with pgrep to capture - // subprocess PIDs so we can report them to the user. - const commandToExecute = - !isWindows && shouldRunInBackground - ? (() => { - let command = finalCommand.trim(); - if (!command.endsWith('&')) command += ';'; - return `{ ${command} }; __code=$?; pgrep -g 0 >${tempFilePath} 2>&1; exit $__code;`; - })() - : finalCommand; - - const cwd = this.params.directory || this.config.getTargetDir(); - - let cumulativeOutput: string | AnsiOutput = ''; - let lastUpdateTime = Date.now(); - let isBinaryStream = false; - let totalLines = 0; - let totalBytes = 0; - - const { result: resultPromise, pid } = - await ShellExecutionService.execute( - commandToExecute, - cwd, - (event: ShellOutputEvent) => { - let shouldUpdate = false; - - switch (event.type) { - case 'data': - if (isBinaryStream) break; - cumulativeOutput = event.chunk; - // Stats are only consumed by the ANSI-output branch below, - // so skip the per-chunk accounting for plain string chunks. - if (Array.isArray(event.chunk)) { - totalLines = event.chunk.length; - totalBytes = event.chunk.reduce( - (sum, line) => - sum + - line.reduce( - (ls, token) => - ls + Buffer.byteLength(token.text, 'utf-8'), - 0, - ), + switch (event.type) { + case 'data': + if (isBinaryStream) break; + cumulativeOutput = event.chunk; + // Stats are only consumed by the ANSI-output branch below, + // so skip the per-chunk accounting for plain string chunks. + if (Array.isArray(event.chunk)) { + totalLines = event.chunk.length; + totalBytes = event.chunk.reduce( + (sum, line) => + sum + + line.reduce( + (ls, token) => ls + Buffer.byteLength(token.text, 'utf-8'), 0, - ); - } - shouldUpdate = true; - break; - case 'binary_detected': - isBinaryStream = true; - cumulativeOutput = - '[Binary output detected. Halting stream...]'; - shouldUpdate = true; - break; - case 'binary_progress': - isBinaryStream = true; - cumulativeOutput = `[Receiving binary output... ${formatMemoryUsage( - event.bytesReceived, - )} received]`; - if (Date.now() - lastUpdateTime > OUTPUT_UPDATE_INTERVAL_MS) { - shouldUpdate = true; - } - break; - default: { - throw new Error('An unhandled ShellOutputEvent was found.'); - } + ), + 0, + ); } - - if (shouldUpdate && updateOutput) { - if (typeof cumulativeOutput === 'string') { - updateOutput(cumulativeOutput); - } else { - updateOutput({ - ansiOutput: cumulativeOutput, - totalLines, - totalBytes, - // Only include timeout when user explicitly set it - ...(this.params.timeout != null && { - timeoutMs: this.params.timeout, - }), - }); - } - lastUpdateTime = Date.now(); + shouldUpdate = true; + break; + case 'binary_detected': + isBinaryStream = true; + cumulativeOutput = '[Binary output detected. Halting stream...]'; + shouldUpdate = true; + break; + case 'binary_progress': + isBinaryStream = true; + cumulativeOutput = `[Receiving binary output... ${formatMemoryUsage( + event.bytesReceived, + )} received]`; + if (Date.now() - lastUpdateTime > OUTPUT_UPDATE_INTERVAL_MS) { + shouldUpdate = true; } - }, - combinedSignal, - shouldRunInBackground - ? false - : this.config.getShouldUseNodePtyShell(), - shellExecutionConfig ?? {}, - ); - - if (pid && setPidCallback) { - setPidCallback(pid); - } - - // On Windows, background commands rely on early return since there's - // no & backgrounding or pgrep. Awaiting would block until completion. - if (shouldRunInBackground && isWindows) { - const pidMsg = pid ? ` PID: ${pid}` : ''; - const killHint = ' (Use taskkill /F /T /PID to stop)'; - - return { - llmContent: `Background command started.${pidMsg}${killHint}`, - returnDisplay: `Background command started.${pidMsg}${killHint}`, - }; - } - - const result = await resultPromise; - - if (shouldRunInBackground) { - // Read subprocess PIDs captured by the pgrep wrapper (non-Windows only) - const backgroundPIDs: number[] = []; - if (!isWindows) { - if (fs.existsSync(tempFilePath)) { - const pgrepLines = fs - .readFileSync(tempFilePath, 'utf8') - .split(EOL) - .filter(Boolean); - for (const line of pgrepLines) { - if (!/^\d+$/.test(line)) { - debugLogger.warn(`pgrep: ${line}`); - continue; - } - const bgPid = Number(line); - if (bgPid !== result.pid) { - backgroundPIDs.push(bgPid); - } - } - } else if (!signal.aborted) { - debugLogger.warn('missing pgrep output'); + break; + default: { + throw new Error('An unhandled ShellOutputEvent was found.'); } } - const bgPidMsg = - backgroundPIDs.length > 0 - ? ` PIDs: ${backgroundPIDs.join(', ')}` - : pid - ? ` PID: ${pid}` - : ''; - const killHint = ' (Use kill to stop)'; - - return { - llmContent: `Background command started.${bgPidMsg}${killHint}`, - returnDisplay: `Background command started.${bgPidMsg}${killHint}`, - }; - } - - let llmContent = ''; - if (result.aborted) { - // Check if it was a timeout or user cancellation - const wasTimeout = - !this.params.is_background && - effectiveTimeout && - combinedSignal.aborted && - !signal.aborted; - - if (wasTimeout) { - llmContent = `Command timed out after ${effectiveTimeout}ms before it could complete.`; - if (result.output.trim()) { - llmContent += ` Below is the output before it timed out:\n${result.output}`; + if (shouldUpdate && updateOutput) { + if (typeof cumulativeOutput === 'string') { + updateOutput(cumulativeOutput); } else { - llmContent += ' There was no output before it timed out.'; - } - } else { - llmContent = - 'Command was cancelled by user before it could complete.'; - if (result.output.trim()) { - llmContent += ` Below is the output before it was cancelled:\n${result.output}`; - } else { - llmContent += ' There was no output before it was cancelled.'; + updateOutput({ + ansiOutput: cumulativeOutput, + totalLines, + totalBytes, + // Only include timeout when user explicitly set it + ...(this.params.timeout != null && { + timeoutMs: this.params.timeout, + }), + }); } + lastUpdateTime = Date.now(); } - } else { - // Create a formatted error string for display, replacing the wrapper command - // with the user-facing command. - const finalError = result.error - ? result.error.message.replace(commandToExecute, this.params.command) - : '(none)'; + }, + combinedSignal, + this.config.getShouldUseNodePtyShell(), + shellExecutionConfig ?? {}, + ); - llmContent = [ - `Command: ${this.params.command}`, - `Directory: ${this.params.directory || '(root)'}`, - `Output: ${result.output || '(empty)'}`, - `Error: ${finalError}`, // Use the cleaned error string. - `Exit Code: ${result.exitCode ?? '(none)'}`, - `Signal: ${result.signal ?? '(none)'}`, - `Process Group PGID: ${result.pid ?? '(none)'}`, - ].join('\n'); - } + if (pid && setPidCallback) { + setPidCallback(pid); + } - let returnDisplayMessage = ''; - if (this.config.getDebugMode()) { - returnDisplayMessage = llmContent; - } else { + const result = await resultPromise; + + let llmContent = ''; + if (result.aborted) { + // Check if it was a timeout or user cancellation + const wasTimeout = + effectiveTimeout && combinedSignal.aborted && !signal.aborted; + + if (wasTimeout) { + llmContent = `Command timed out after ${effectiveTimeout}ms before it could complete.`; if (result.output.trim()) { - returnDisplayMessage = result.output; + llmContent += ` Below is the output before it timed out:\n${result.output}`; } else { - if (result.aborted) { - // Check if it was a timeout or user cancellation - const wasTimeout = - !this.params.is_background && - effectiveTimeout && - combinedSignal.aborted && - !signal.aborted; - - returnDisplayMessage = wasTimeout - ? `Command timed out after ${effectiveTimeout}ms.` - : 'Command cancelled by user.'; - } else if (result.signal) { - returnDisplayMessage = `Command terminated by signal: ${result.signal}`; - } else if (result.error) { - returnDisplayMessage = `Command failed: ${getErrorMessage( - result.error, - )}`; - } else if (result.exitCode !== null && result.exitCode !== 0) { - returnDisplayMessage = `Command exited with code: ${result.exitCode}`; - } - // If output is empty and command succeeded (code 0, no error/signal/abort), - // returnDisplayMessage will remain empty, which is fine. + llmContent += ' There was no output before it timed out.'; + } + } else { + llmContent = 'Command was cancelled by user before it could complete.'; + if (result.output.trim()) { + llmContent += ` Below is the output before it was cancelled:\n${result.output}`; + } else { + llmContent += ' There was no output before it was cancelled.'; } } + } else { + // Create a formatted error string for display, replacing the wrapper command + // with the user-facing command. + const finalError = result.error + ? result.error.message.replace(commandToExecute, this.params.command) + : '(none)'; - // Truncate large output and save full content to a temp file. - if (typeof llmContent === 'string') { - const truncatedResult = await truncateToolOutput( - this.config, - ShellTool.Name, - llmContent, - ); + llmContent = [ + `Command: ${this.params.command}`, + `Directory: ${this.params.directory || '(root)'}`, + `Output: ${result.output || '(empty)'}`, + `Error: ${finalError}`, // Use the cleaned error string. + `Exit Code: ${result.exitCode ?? '(none)'}`, + `Signal: ${result.signal ?? '(none)'}`, + `Process Group PGID: ${result.pid ?? '(none)'}`, + ].join('\n'); + } - if (truncatedResult.outputFile) { - llmContent = truncatedResult.content; - returnDisplayMessage += - (returnDisplayMessage ? '\n' : '') + - `Output too long and was saved to: ${truncatedResult.outputFile}`; + let returnDisplayMessage = ''; + if (this.config.getDebugMode()) { + returnDisplayMessage = llmContent; + } else { + if (result.output.trim()) { + returnDisplayMessage = result.output; + } else { + if (result.aborted) { + // Check if it was a timeout or user cancellation + const wasTimeout = + effectiveTimeout && combinedSignal.aborted && !signal.aborted; + + returnDisplayMessage = wasTimeout + ? `Command timed out after ${effectiveTimeout}ms.` + : 'Command cancelled by user.'; + } else if (result.signal) { + returnDisplayMessage = `Command terminated by signal: ${result.signal}`; + } else if (result.error) { + returnDisplayMessage = `Command failed: ${getErrorMessage( + result.error, + )}`; + } else if (result.exitCode !== null && result.exitCode !== 0) { + returnDisplayMessage = `Command exited with code: ${result.exitCode}`; } - } - - const executionError = result.error - ? { - error: { - message: result.error.message, - type: ToolErrorType.SHELL_EXECUTE_ERROR, - }, - } - : {}; - - return { - llmContent, - returnDisplay: returnDisplayMessage, - ...executionError, - }; - } finally { - if (fs.existsSync(tempFilePath)) { - fs.unlinkSync(tempFilePath); + // If output is empty and command succeeded (code 0, no error/signal/abort), + // returnDisplayMessage will remain empty, which is fine. } } + + // Truncate large output and save full content to a temp file. + if (typeof llmContent === 'string') { + const truncatedResult = await truncateToolOutput( + this.config, + ShellTool.Name, + llmContent, + ); + + if (truncatedResult.outputFile) { + llmContent = truncatedResult.content; + returnDisplayMessage += + (returnDisplayMessage ? '\n' : '') + + `Output too long and was saved to: ${truncatedResult.outputFile}`; + } + } + + const executionError = result.error + ? { + error: { + message: result.error.message, + type: ToolErrorType.SHELL_EXECUTE_ERROR, + }, + } + : {}; + + return { + llmContent, + returnDisplay: returnDisplayMessage, + ...executionError, + }; + } + + /** + * Background-execution path: spawn the command into a managed registry + * entry instead of detaching with `&`. Output streams to a per-shell file + * the agent can `Read`; cancellation flows through the entry's + * AbortController; the registry's terminal status is set when the process + * exits. Returns immediately so the agent's turn isn't blocked. + */ + private async executeBackground( + signal: AbortSignal, + shellExecutionConfig?: ShellExecutionConfig, + ): Promise { + const strippedCommand = stripShellWrapper(this.params.command); + // Strip a single bare trailing `&` (the bash background operator) before + // spawn: bash treats it as background-detach, exits the wrapper + // immediately, and the real child outlives the wrapper — the registry + // would settle as `completed` while the shell is still running, and + // chunked output would land on a closed stream. The managed path is + // itself the backgrounding mechanism, so the trailing `&` is redundant. + // + // Deliberately precise: do not touch `&&` (logical AND), `\&` (escaped + // literal `&`), or commands without a trailing `&`. Earlier `\s*&+\s*$` + // was both too greedy (it ate `&&` and `\&`) and a ReDoS hazard on + // long all-`&` inputs. Plain string checks here are linear and clearer + // than a lookbehind regex. + const noTrailingAmp = stripTrailingBackgroundAmp(strippedCommand); + if (noTrailingAmp !== strippedCommand) { + debugLogger.warn( + 'Stripped trailing & from background shell command — managed path handles backgrounding', + ); + } + const processedCommand = this.addCoAuthorToGitCommit(noTrailingAmp); + const cwd = this.params.directory || this.config.getTargetDir(); + + // Output goes under the project temp dir (which `ReadFileTool` + // auto-allows by default), so the LLM can `Read` the captured output + // without bouncing off a permission prompt — important because + // background-agent contexts can't surface interactive prompts. + const outputDir = path.join( + this.config.storage.getProjectTempDir(), + 'background-shells', + this.config.getSessionId(), + ); + fs.mkdirSync(outputDir, { recursive: true }); + + const shellId = `bg_${crypto.randomBytes(4).toString('hex')}`; + const outputPath = path.join(outputDir, `shell-${shellId}.output`); + + // Background shells are explicitly independent of the current turn: + // the user pressing Ctrl+C on a turn (which aborts `signal`) should + // NOT kill a long-running dev server / watcher they intentionally + // backgrounded. Cancellation flows only through the entry's own + // AbortController, driven by future `task_stop` integration (#3471). + // The `signal` parameter is still honored for the synchronous early + // return below (don't even spawn if the agent already aborted), but + // we deliberately do not forward it. + const entryAc = new AbortController(); + + const outputStream = fs.createWriteStream(outputPath, { flags: 'w' }); + // Without an 'error' listener, a write failure (disk full, permission + // change, fs going away) would surface as an uncaught exception and + // kill the entire CLI session. Log + drop is the sane default — the + // process keeps running, the registry still settles via resultPromise. + outputStream.on('error', (err) => { + debugLogger.warn( + `background shell ${shellId} output write error: ${err.message}`, + ); + }); + + const startTime = Date.now(); + const entry: BackgroundShellEntry = { + shellId, + command: processedCommand, + cwd, + status: 'running', + startTime, + outputPath, + abortController: entryAc, + }; + + const { result: resultPromise, pid } = await ShellExecutionService.execute( + processedCommand, + cwd, + (event: ShellOutputEvent) => { + if (event.type === 'data' && typeof event.chunk === 'string') { + // Strip ANSI escape codes (color, cursor-move, clear-screen) before + // writing — agents read the file as plain text, and dev servers / + // build tools spam plenty of escape sequences that would render as + // garbage. Costs ~one regex per chunk; cheap relative to disk I/O. + outputStream.write(stripAnsi(event.chunk)); + } + // ANSI array chunks and binary streams are not written to the output + // file: agents read the file as plain text and binary spam would be + // unhelpful. + }, + entryAc.signal, + // Background shells are non-interactive by design — no terminal to + // attach a PTY to, no human to type at it. Force the child_process + // path so we don't pull in node-pty for fire-and-forget commands. + false, + shellExecutionConfig ?? {}, + // Stream stdout/stderr through to the output file as chunks arrive. + // Default child_process mode buffers until exit, which would leave + // dev-server / watcher output files empty until the process dies. + { streamStdout: true }, + ); + + if (pid !== undefined) entry.pid = pid; + const registry = this.config.getBackgroundShellRegistry(); + registry.register(entry); + + // Settle in the background — do NOT await here, the agent should be + // unblocked immediately. + void resultPromise.then( + (result) => { + outputStream.end(); + const endTime = Date.now(); + if (entryAc.signal.aborted) { + if (registry.get(shellId)?.status === 'running') { + registry.cancel(shellId, endTime); + } + } else if ( + result.error || + (result.exitCode !== null && result.exitCode !== 0) || + result.signal !== null + ) { + // Non-zero exit / killed by signal / spawn error all count as failed. + // Treating them as `completed` would let `/tasks` (and any future + // model-facing notification) misreport a failed `npm test` or + // `false` command as a success. + const reason = result.error + ? result.error.message + : result.signal !== null + ? `terminated by signal ${result.signal}` + : `exited with code ${result.exitCode}`; + registry.fail(shellId, reason, endTime); + } else { + registry.complete(shellId, result.exitCode ?? 0, endTime); + } + }, + (err) => { + outputStream.end(); + registry.fail(shellId, getErrorMessage(err), Date.now()); + }, + ); + + const pidLine = pid !== undefined ? `pid: ${pid}\n` : ''; + return { + llmContent: + `Background shell started.\n` + + `id: ${shellId}\n` + + pidLine + + `output file: ${outputPath}\n` + + `Use the /tasks command to list and inspect background shells, or Read the output file directly.`, + returnDisplay: `Background shell ${shellId} started${pid !== undefined ? ` (pid ${pid})` : ''}.`, + }; } private addCoAuthorToGitCommit(command: string): string {