feat(core): managed background shell pool with /tasks command (#3642)
Some checks are pending
Qwen Code CI / Test (push) Blocked by required conditions
Qwen Code CI / Test-1 (push) Blocked by required conditions
Qwen Code CI / Test-2 (push) Blocked by required conditions
Qwen Code CI / Test-3 (push) Blocked by required conditions
Qwen Code CI / Test-4 (push) Blocked by required conditions
Qwen Code CI / Test-5 (push) Blocked by required conditions
Qwen Code CI / Test-6 (push) Blocked by required conditions
Qwen Code CI / Test-7 (push) Blocked by required conditions
Qwen Code CI / Test-8 (push) Blocked by required conditions
Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions
Qwen Code CI / CodeQL (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run
E2E Tests / E2E Test - macOS (push) Waiting to run
Qwen Code CI / Lint (push) Waiting to run

* feat(core): managed background shell pool with /bashes command

Replace shell.ts's `&` fork-and-detach background path with a managed
process registry. Background shells now have observable lifecycle, captured
output, and explicit cancellation — matching the pattern used by background
subagents (#3076).

Phase B from #3634 (background task management roadmap).

What changes
- New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts):
  per-process entry with status (running / completed / failed / cancelled),
  AbortController, output file path. State transitions are one-shot
  (terminal status sticks; late callbacks no-op). Mirrors the lifecycle
  shape of #3471's BackgroundTaskRegistry so the two can be unified later.
- `shell.ts` is_background path rewritten as `executeBackground`:
  - Spawns the unwrapped command (no '&', no pgrep envelope)
  - Streams stdout to `<projectDir>/tasks/<sessionId>/shell-<id>.output`
    (path layout aligns with the direction sketched in #3471 review)
  - Bridges the external abort signal into the entry's AbortController so
    a single source of truth governs cancellation
  - Returns immediately with id + output path; agent's turn isn't blocked
  - Settles the registry entry asynchronously when ShellExecutionService
    resolves: complete (clean exit) / fail (error) / cancel (aborted)
- Removes ~120 lines of dead bg-specific code from shell.ts:
  pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows
  early-return path, bg PID parsing, tempFile cleanup
- New `/bashes` slash command: lists registered shells with id, status,
  runtime, command, output path. Empty state prints a friendly message.

What this PR doesn't do
- Footer pill / dialog integration — gated on #3488 landing
- task_stop / send_message integration — gated on #3471 landing
- Auto-backgrounding heuristics for long foreground bash — Phase D

Test plan
- 11 registry unit tests (state machine + idempotent terminal transitions)
- 4 background-path tests in shell.test.ts (spawn no-wrap + complete /
  fail / cancel settle paths)
- 2 /bashes command tests (empty + populated)
- Full core suite: 247 files / 6075 passed (existing tests unaffected)

* fix(core): address PR #3642 review feedback

Three [Critical] from the auto review + naming alignment with Claude Code:

- shell.ts settle: non-zero exit code or termination signal now bucket into
  `failed` instead of `completed`. The previous `if (result.error) fail else
  complete()` would misreport `false` / failed `npm test` as success because
  ShellExecutionService surfaces ordinary command failures as a non-zero
  exitCode with `error: null`. Failure reason carries the exit code or signal
  so `/tasks` shows the real cause.

- ShellExecutionService.childProcessFallback: add `streamStdout` mode that
  emits each decoded chunk through the existing onOutputEvent path. The
  default (foreground) path continues to buffer + emit the cleaned final
  blob, so existing in-line shell calls are unaffected. executeBackground
  opts in via `{ streamStdout: true }`, which is what makes the captured
  output file actually useful for long-running processes (dev servers,
  watchers) — without it the file stayed empty until the process exited.

- shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'`
  but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the
  build; switched to `signal: null`. Added a test that explicitly covers
  the new "non-zero exit → failed" path so the bucketing change has
  regression coverage.

- shell.ts comment: explicitly document why background shells force
  `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead
  weight for fire-and-forget commands).

- /bashes → /tasks (alias bashes), description "List and manage background
  tasks" — matches Claude Code's command name. Currently lists shells only;
  will surface other task kinds (subagents, monitor) as those registries
  land via #3471 / #3488.

* fix(core): address PR #3642 second-round review feedback

- shellExecutionService streaming: drop stdout/stderr buffer + outputChunks
  accumulation in streaming mode. Each decoded chunk goes straight to
  onOutputEvent and is GC-eligible immediately. Long-running background
  commands (dev servers, watchers) no longer accumulate unbounded memory
  proportional to total output. Buffered (foreground) mode is unchanged.

- shell.ts executeBackground: stripAnsi each chunk before writing to the
  output file. Dev servers / build tools spam color codes and cursor-move
  sequences that would render as garbage in the file the agent reads.

- bashesCommand: command description "List and manage" → "List background
  tasks" — current implementation only supports listing, cancellation
  follows when the unified task_stop tool from #3471 is wired in. Replace
  the hand-rolled formatRuntime helper with the shared formatDuration
  utility (uses hideTrailingZeros for parity with the previous output).

- backgroundShellRegistry: add a comment documenting the lack of an
  eviction policy as a known limitation. LRU / age-based / capped-size
  eviction (and on-disk output rotation) is left as a follow-up alongside
  the broader output-file lifecycle story.

* fix(core): address PR #3642 third-round review feedback

- shell.ts executeBackground: add 'error' listener on the output write
  stream. fs.createWriteStream surfaces write failures (disk full,
  permission, fs going away) as 'error' events; without a listener Node
  treats it as an uncaught exception and kills the entire CLI session.
  Log + drop is the sane default — the registry still settles via
  resultPromise so /tasks shows the right terminal status.

- shell.ts executeBackground: store the abort handler reference and
  removeEventListener in the settle callback. Background shells outlive
  the turn signal; the dangling listener was keeping `entryAc` (and
  transitively `outputStream`) reachable until the turn signal itself was
  GC'd, which for long sessions would never happen.

- shell.test.ts: extend the createWriteStream mock with an `on` stub so
  the new error-listener wiring doesn't crash the test suite.

* refactor(cli): drop /bashes alias and rename file to tasksCommand

Per follow-up review: the slash command should be exclusively /tasks.
Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` →
`tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand`
→ `tasksCommand`, and cleans up the remaining `/bashes` references in
backgroundShellRegistry.ts comments. No behavior change beyond the
alias removal.

* refactor(cli): finish tasksCommand rename — apply content changes

The previous commit (03c8503c8) only captured the file rename via
`git mv`; the export name change (`bashesCommand` → `tasksCommand`),
the removal of `altNames: ['bashes']`, the import update in
BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in
backgroundShellRegistry.ts were unstaged when that commit landed.
Squash candidate before merge.

* fix(core): address PR #3642 fourth-round review feedback

Four reviewer concerns from @wenshao + @doudouOUC:

- [Critical] Config.shutdown() now also calls
  `backgroundShellRegistry.abortAll()`. Previously only the subagent
  registry was aborted, so a managed background shell could outlive the
  CLI process and orphan its child. Symmetric with how
  `BackgroundTaskRegistry.abortAll()` is wired in.

- [P1] shell.ts executeBackground strips a trailing `&` from the command
  before spawn. The managed path is itself the backgrounding mechanism;
  forwarding `node server.js &` verbatim made bash exit immediately while
  the real child outlived the wrapper, causing the registry to settle as
  `completed` while the shell was still running and chunked output to
  land on a closed stream. Strip + warn.

- [P2] Output file moves under `storage.getProjectTempDir()` (specifically
  `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`).
  `ReadFileTool` already auto-allows the project temp dir, so the LLM
  can `Read` the captured output without bouncing off a permission
  prompt — important because background-agent contexts can't surface
  interactive prompts.

- [P2] Background shells are no longer killed when the current turn's
  AbortSignal fires. Forwarding the turn signal into the entry's
  AbortController meant a Ctrl+C on the turn would also terminate
  intentionally backgrounded dev servers / watchers, contradicting the
  independent-lifecycle promise. Cancellation now flows only through
  `entryAc` (driven by future `task_stop` integration via #3471).

Tests:
- New `abortAll` registry tests cover running / mixed / empty cases.
- `runs background commands as managed pool entries` test stops asserting
  the wrapper-vs-entry signal identity since they're now structurally
  separate (no turn-to-entry forwarding).
- New `does not forward the turn signal into the background shell` test
  pins the new behavior.
- New `strips trailing & from the spawned command` test pins the strip.
- Removed the cancel-via-outer-signal settle test — that path no longer
  exists; cancellation is exercised end-to-end via the registry's own
  `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`.

* fix(core): tighten trailing & strip — narrow regex + ReDoS-safe

Two reviewer concerns on the same line of #3642 round 4:

- [Critical CodeQL] `\s*&+\s*$` is a polynomial-time regex on
  uncontrolled input (long all-`&` strings backtrack quadratically).
- [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&`
  into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into
  `echo foo \` (eats the escaped literal). Only the bare bash background
  operator should be stripped.

Replace the regex with a small linear-time helper
`stripTrailingBackgroundAmp` that explicitly checks for the three
"don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` /
`slice` — no regex backtracking, and the intent reads off the page.

Tests:
- Existing strip-trailing-`&` test still passes.
- New `does not strip a trailing &&` test pins the logical-AND case.
- New `does not strip an escaped trailing \\&` test pins the escape case.

* fix(core): keep binary-detection sniff in streaming mode

@doudouOUC noted that `streamStdout` shortcut returned before the
binary-sniff path, so a background command emitting binary bytes
(`cat /bin/ls`, image dump, etc.) would be text-decoded and appended
to the task output file unbounded.

Restructure handleOutput so the sniff-and-cutover logic runs in both
modes:

- Both modes accumulate up to MAX_SNIFF_SIZE for the binary check.
  The accumulator is bounded; once the threshold is reached, it stops
  growing in streaming mode (dropped on binary detection / left
  inert on text confirmation) and continues to accumulate in buffered
  mode (existing foreground behavior).
- Streaming mode emits 'binary_detected' as soon as `isBinary` trips
  so the consumer can stop writing the output file. Up to ~4KB of
  bytes may have been emitted as text chunks before detection — this
  is bounded and acceptable; the unbounded write is the pathology
  reviewers flagged.
- Streaming text mode still emits each decoded chunk immediately and
  does not accumulate stdout/stderr strings, so long-running text
  streams remain GC-friendly.
- Buffered (foreground) behavior is unchanged — the sniff accumulator
  is the same path the existing tests cover.

Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57
shell.test.ts all pass; no regressions.

* fix(core): tighten streaming sniff bound + Windows rmSync flake

Two unrelated reds on the latest CI run:

1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks.
   The previous fix recomputed `sniffedBytes` from
   `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk —
   pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE
   (line-sized stdout, e.g. dev-server logs) the byte count never grew,
   the sniff branch stayed open forever, and `outputChunks` accumulated
   every later chunk — exactly the leak `streamStdout` was meant to
   prevent.

   Track sniffed bytes by running sum (`sniffedBytes += data.length`)
   so the bound is genuine. When sniff confirms text in streaming mode,
   drop the accumulator immediately so subsequent chunks fall through
   the streaming emit path without ever touching it.

2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows
   (ENOTEMPTY: directory not empty). The exporter's underlying write
   stream hasn't always released its handle by the time `rmSync` runs.
   Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through
   the brief Windows handle-release window instead of failing the test
   on a CI quirk.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
This commit is contained in:
Shaojin Wen 2026-04-28 11:06:50 +08:00 committed by GitHub
parent 03c88b7308
commit aac2e96ec3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
11 changed files with 1024 additions and 410 deletions

View file

@ -8,6 +8,7 @@ import type { ICommandLoader } from './types.js';
import type { SlashCommand } from '../ui/commands/types.js';
import type { Config } from '@qwen-code/qwen-code-core';
import { aboutCommand } from '../ui/commands/aboutCommand.js';
import { tasksCommand } from '../ui/commands/tasksCommand.js';
import { agentsCommand } from '../ui/commands/agentsCommand.js';
import { arenaCommand } from '../ui/commands/arenaCommand.js';
import { approvalModeCommand } from '../ui/commands/approvalModeCommand.js';
@ -92,6 +93,7 @@ export class BuiltinCommandLoader implements ICommandLoader {
const allDefinitions: Array<SlashCommand | null> = [
aboutCommand,
agentsCommand,
tasksCommand,
arenaCommand,
approvalModeCommand,
authCommand,

View file

@ -0,0 +1,94 @@
/**
* @license
* Copyright 2026 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { vi, describe, it, expect, beforeEach } from 'vitest';
import { tasksCommand } from './tasksCommand.js';
import { type CommandContext } from './types.js';
import { createMockCommandContext } from '../../test-utils/mockCommandContext.js';
import type { BackgroundShellEntry } from '@qwen-code/qwen-code-core';
function entry(
overrides: Partial<BackgroundShellEntry> = {},
): BackgroundShellEntry {
return {
shellId: 'bg_aaaaaaaa',
command: 'sleep 60',
cwd: '/tmp',
status: 'running',
startTime: Date.now() - 5_000,
outputPath: '/tmp/tasks/sess/shell-bg_aaaaaaaa.output',
abortController: new AbortController(),
...overrides,
};
}
describe('tasksCommand', () => {
let context: CommandContext;
let getAll: ReturnType<typeof vi.fn>;
beforeEach(() => {
getAll = vi.fn().mockReturnValue([]);
context = createMockCommandContext({
services: {
config: {
getBackgroundShellRegistry: () => ({ getAll }),
},
},
} as unknown as Parameters<typeof createMockCommandContext>[0]);
});
it('reports an empty registry', async () => {
const result = await tasksCommand.action!(context, '');
expect(result).toEqual({
type: 'message',
messageType: 'info',
content: 'No background shells.',
});
});
it('lists running and terminal entries with status / runtime / output path', async () => {
getAll.mockReturnValue([
entry({
shellId: 'bg_run',
command: 'npm run dev',
status: 'running',
startTime: Date.now() - 12_000,
pid: 1111,
}),
entry({
shellId: 'bg_done',
command: 'npm test',
status: 'completed',
exitCode: 0,
startTime: Date.now() - 70_000,
endTime: Date.now() - 5_000,
outputPath: '/tmp/tasks/sess/shell-bg_done.output',
}),
entry({
shellId: 'bg_fail',
command: 'flaky.sh',
status: 'failed',
error: 'spawn ENOENT',
startTime: Date.now() - 3_000,
endTime: Date.now() - 2_000,
}),
]);
const result = await tasksCommand.action!(context, '');
if (!result || result.type !== 'message') {
throw new Error('expected message result');
}
expect(result.content).toContain('Background shells (3 total)');
expect(result.content).toContain('[bg_run] running');
expect(result.content).toContain('pid=1111');
expect(result.content).toContain('npm run dev');
expect(result.content).toContain('[bg_done] completed (exit 0)');
expect(result.content).toContain('[bg_fail] failed: spawn ENOENT');
expect(result.content).toContain(
'output: /tmp/tasks/sess/shell-bg_done.output',
);
});
});

View file

@ -0,0 +1,78 @@
/**
* @license
* Copyright 2026 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import type { BackgroundShellEntry } from '@qwen-code/qwen-code-core';
import type { SlashCommand } from './types.js';
import { CommandKind } from './types.js';
import { t } from '../../i18n/index.js';
import { formatDuration } from '../utils/formatters.js';
function statusLabel(entry: BackgroundShellEntry): string {
switch (entry.status) {
case 'completed':
return `completed (exit ${entry.exitCode ?? '?'})`;
case 'failed':
return `failed: ${entry.error ?? 'unknown error'}`;
case 'cancelled':
return 'cancelled';
case 'running':
return 'running';
default:
return entry.status;
}
}
export const tasksCommand: SlashCommand = {
name: 'tasks',
get description() {
return t('List background tasks');
},
kind: CommandKind.BUILT_IN,
supportedModes: ['interactive', 'non_interactive', 'acp'] as const,
action: async (context) => {
const { config } = context.services;
if (!config) {
return {
type: 'message' as const,
messageType: 'error' as const,
content: 'Config not available.',
};
}
const entries = config.getBackgroundShellRegistry().getAll();
if (entries.length === 0) {
return {
type: 'message' as const,
messageType: 'info' as const,
content: 'No background shells.',
};
}
const now = Date.now();
const lines: string[] = [
`Background shells (${entries.length} total)`,
'',
];
for (const entry of entries) {
const endTime = entry.endTime ?? now;
const runtime = formatDuration(endTime - entry.startTime, {
hideTrailingZeros: true,
});
const pidPart = entry.pid !== undefined ? ` pid=${entry.pid}` : '';
lines.push(
`[${entry.shellId}] ${statusLabel(entry)} ${runtime}${pidPart} ${entry.command}`,
);
lines.push(` output: ${entry.outputPath}`);
}
return {
type: 'message' as const,
messageType: 'info' as const,
content: lines.join('\n'),
};
},
};

View file

@ -61,6 +61,7 @@ import { PermissionManager } from '../permissions/permission-manager.js';
import { SubagentManager } from '../subagents/subagent-manager.js';
import type { SubagentConfig } from '../subagents/types.js';
import { BackgroundTaskRegistry } from '../agents/background-tasks.js';
import { BackgroundShellRegistry } from '../services/backgroundShellRegistry.js';
import {
DEFAULT_OTLP_ENDPOINT,
DEFAULT_TELEMETRY_TARGET,
@ -545,6 +546,7 @@ export class Config {
private promptRegistry!: PromptRegistry;
private subagentManager!: SubagentManager;
private readonly backgroundTaskRegistry = new BackgroundTaskRegistry();
private readonly backgroundShellRegistry = new BackgroundShellRegistry();
private extensionManager!: ExtensionManager;
private skillManager: SkillManager | null = null;
private permissionManager: PermissionManager | null = null;
@ -1610,6 +1612,7 @@ export class Config {
}
this.backgroundTaskRegistry.abortAll();
this.backgroundShellRegistry.abortAll();
await this.cleanupArenaRuntime();
} catch (error) {
@ -2489,6 +2492,10 @@ export class Config {
return this.backgroundTaskRegistry;
}
getBackgroundShellRegistry(): BackgroundShellRegistry {
return this.backgroundShellRegistry;
}
/**
* Whether interactive permission prompts should be auto-denied.
* True for background agents that have no UI to show prompts.

View file

@ -147,6 +147,7 @@ export * from './services/sessionService.js';
export * from './services/sessionTitle.js';
export { stripTerminalControlSequences } from './utils/terminalSafe.js';
export * from './services/shellExecutionService.js';
export * from './services/backgroundShellRegistry.js';
export * from './services/toolUseSummary.js';
export * from './utils/bareMode.js';

View file

@ -0,0 +1,159 @@
/**
* @license
* Copyright 2026 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { describe, expect, it } from 'vitest';
import {
BackgroundShellRegistry,
type BackgroundShellEntry,
} from './backgroundShellRegistry.js';
function makeEntry(
overrides: Partial<BackgroundShellEntry> = {},
): BackgroundShellEntry {
return {
shellId: 's1',
command: 'sleep 60',
cwd: '/tmp',
status: 'running',
startTime: 1000,
outputPath: '/tmp/s1.output',
abortController: new AbortController(),
...overrides,
};
}
describe('BackgroundShellRegistry', () => {
describe('register / get / getAll', () => {
it('round-trips a registered entry by id', () => {
const reg = new BackgroundShellRegistry();
const e = makeEntry({ shellId: 'a' });
reg.register(e);
expect(reg.get('a')).toBe(e);
});
it('returns undefined for unknown id', () => {
const reg = new BackgroundShellRegistry();
expect(reg.get('missing')).toBeUndefined();
});
it('lists all entries via getAll', () => {
const reg = new BackgroundShellRegistry();
const a = makeEntry({ shellId: 'a' });
const b = makeEntry({ shellId: 'b' });
reg.register(a);
reg.register(b);
const all = reg.getAll();
expect(all).toHaveLength(2);
expect(all).toContain(a);
expect(all).toContain(b);
});
});
describe('complete', () => {
it('transitions running → completed with exitCode and endTime', () => {
const reg = new BackgroundShellRegistry();
reg.register(makeEntry({ shellId: 'a' }));
reg.complete('a', 0, 2000);
const e = reg.get('a')!;
expect(e.status).toBe('completed');
expect(e.exitCode).toBe(0);
expect(e.endTime).toBe(2000);
});
it('is a no-op when entry is not running', () => {
const reg = new BackgroundShellRegistry();
reg.register(makeEntry({ shellId: 'a' }));
reg.cancel('a', 1500);
reg.complete('a', 0, 2000);
const e = reg.get('a')!;
expect(e.status).toBe('cancelled');
expect(e.exitCode).toBeUndefined();
});
it('is a no-op for unknown id', () => {
const reg = new BackgroundShellRegistry();
expect(() => reg.complete('missing', 0, 0)).not.toThrow();
});
});
describe('fail', () => {
it('transitions running → failed with error and endTime', () => {
const reg = new BackgroundShellRegistry();
reg.register(makeEntry({ shellId: 'a' }));
reg.fail('a', 'spawn error', 2000);
const e = reg.get('a')!;
expect(e.status).toBe('failed');
expect(e.error).toBe('spawn error');
expect(e.endTime).toBe(2000);
});
it('is a no-op when entry is not running', () => {
const reg = new BackgroundShellRegistry();
reg.register(makeEntry({ shellId: 'a' }));
reg.complete('a', 0, 1500);
reg.fail('a', 'late error', 2000);
const e = reg.get('a')!;
expect(e.status).toBe('completed');
expect(e.error).toBeUndefined();
});
});
describe('abortAll', () => {
it('cancels every running entry and leaves terminal entries alone', () => {
const reg = new BackgroundShellRegistry();
const acRunning1 = new AbortController();
const acRunning2 = new AbortController();
const acDone = new AbortController();
reg.register(makeEntry({ shellId: 'a', abortController: acRunning1 }));
reg.register(makeEntry({ shellId: 'b', abortController: acRunning2 }));
reg.register(makeEntry({ shellId: 'c', abortController: acDone }));
reg.complete('c', 0, 1500);
reg.abortAll();
expect(reg.get('a')!.status).toBe('cancelled');
expect(reg.get('b')!.status).toBe('cancelled');
expect(reg.get('c')!.status).toBe('completed');
expect(acRunning1.signal.aborted).toBe(true);
expect(acRunning2.signal.aborted).toBe(true);
expect(acDone.signal.aborted).toBe(false);
});
it('is a no-op when registry is empty', () => {
const reg = new BackgroundShellRegistry();
expect(() => reg.abortAll()).not.toThrow();
});
});
describe('cancel', () => {
it('transitions running → cancelled and aborts the signal', () => {
const reg = new BackgroundShellRegistry();
const ac = new AbortController();
reg.register(makeEntry({ shellId: 'a', abortController: ac }));
reg.cancel('a', 2000);
const e = reg.get('a')!;
expect(e.status).toBe('cancelled');
expect(e.endTime).toBe(2000);
expect(ac.signal.aborted).toBe(true);
});
it('is a no-op when entry is already terminal', () => {
const reg = new BackgroundShellRegistry();
const ac = new AbortController();
reg.register(makeEntry({ shellId: 'a', abortController: ac }));
reg.complete('a', 0, 1500);
reg.cancel('a', 2000);
const e = reg.get('a')!;
expect(e.status).toBe('completed');
expect(ac.signal.aborted).toBe(false);
});
it('is a no-op for unknown id', () => {
const reg = new BackgroundShellRegistry();
expect(() => reg.cancel('missing', 0)).not.toThrow();
});
});
});

View file

@ -0,0 +1,109 @@
/**
* @license
* Copyright 2026 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
/**
* Tracks background shell processes spawned via the `shell` tool with
* `is_background: true`. Each entry holds the metadata the agent and the
* `/tasks` slash command need to query, observe, or terminate a running
* background shell.
*
* State machine: register running { completed | failed | cancelled }.
* Transitions out of running are one-shot: complete/fail/cancel become
* no-ops once the entry has settled. This prevents late callbacks (e.g. a
* process that exits during cancellation) from clobbering the terminal
* status.
*/
export type BackgroundShellStatus =
| 'running'
| 'completed'
| 'failed'
| 'cancelled';
export interface BackgroundShellEntry {
/** Stable id used by the model and the `/tasks` UI. */
shellId: string;
/** The user-supplied command, after any pre-processing the tool applies. */
command: string;
/** Working directory the process was spawned in. */
cwd: string;
/** OS pid once spawned; absent if registration happens before spawn. */
pid?: number;
status: BackgroundShellStatus;
/** Exit code on `completed`. */
exitCode?: number;
/** Error message on `failed`. */
error?: string;
/** ms epoch when the entry was registered. */
startTime: number;
/** ms epoch when the entry transitioned out of running. */
endTime?: number;
/** Absolute path of the captured stdout/stderr file. */
outputPath: string;
/** Aborted by `cancel()`; callers should wire it into the spawn. */
abortController: AbortController;
}
export class BackgroundShellRegistry {
// Entries persist for the session lifetime — no automatic eviction of
// terminal entries. For typical interactive sessions (tens of background
// shells over an hour) this is fine, but long-running sessions that spawn
// many short-lived background commands will see the map and the on-disk
// output files grow without bound. Eviction policy (LRU? age-based? cap?)
// is left as a follow-up alongside output-file rotation.
private readonly entries = new Map<string, BackgroundShellEntry>();
register(entry: BackgroundShellEntry): void {
this.entries.set(entry.shellId, entry);
}
get(shellId: string): BackgroundShellEntry | undefined {
return this.entries.get(shellId);
}
getAll(): readonly BackgroundShellEntry[] {
return [...this.entries.values()];
}
complete(shellId: string, exitCode: number, endTime: number): void {
const entry = this.entries.get(shellId);
if (!entry || entry.status !== 'running') return;
entry.status = 'completed';
entry.exitCode = exitCode;
entry.endTime = endTime;
}
fail(shellId: string, error: string, endTime: number): void {
const entry = this.entries.get(shellId);
if (!entry || entry.status !== 'running') return;
entry.status = 'failed';
entry.error = error;
entry.endTime = endTime;
}
cancel(shellId: string, endTime: number): void {
const entry = this.entries.get(shellId);
if (!entry || entry.status !== 'running') return;
entry.status = 'cancelled';
entry.endTime = endTime;
entry.abortController.abort();
}
/**
* Cancel every still-running entry. Called on session/Config shutdown so
* background shells don't outlive the CLI process and leak orphaned
* children. Symmetric with `BackgroundTaskRegistry.abortAll()` for the
* subagent path.
*/
abortAll(): void {
const endTime = Date.now();
for (const entry of Array.from(this.entries.values())) {
if (entry.status === 'running') {
this.cancel(entry.shellId, endTime);
}
}
}
}

View file

@ -329,6 +329,7 @@ export class ShellExecutionService {
abortSignal: AbortSignal,
shouldUseNodePty: boolean,
shellExecutionConfig: ShellExecutionConfig,
options: { streamStdout?: boolean } = {},
): Promise<ShellExecutionHandle> {
if (shouldUseNodePty) {
const ptyInfo = await getPty();
@ -353,6 +354,7 @@ export class ShellExecutionService {
cwd,
onOutputEvent,
abortSignal,
options.streamStdout ?? false,
);
}
@ -361,6 +363,7 @@ export class ShellExecutionService {
cwd: string,
onOutputEvent: (event: ShellOutputEvent) => void,
abortSignal: AbortSignal,
streamStdout: boolean,
): ShellExecutionHandle {
try {
const isWindows = os.platform() === 'win32';
@ -415,26 +418,61 @@ export class ShellExecutionService {
}
}
outputChunks.push(data);
// Binary sniff applies in both modes — even streaming consumers
// (e.g. background shell output file) shouldn't pile up text-decoded
// garbage when the command actually emits binary (`cat /bin/ls`,
// image dumps, etc.). Track sniffed bytes by running sum so the
// accumulator is truly byte-bounded — the previous version recomputed
// sniffedBytes from `slice(0, 20)` on every call, which never grew
// past the first 20 chunks' total and let the chunk array leak on
// line-sized streams.
if (isStreamingRawContent && sniffedBytes < MAX_SNIFF_SIZE) {
const sniffBuffer = Buffer.concat(outputChunks.slice(0, 20));
sniffedBytes = sniffBuffer.length;
outputChunks.push(data);
sniffedBytes += data.length;
const sniffBuffer = Buffer.concat(outputChunks);
if (isBinary(sniffBuffer)) {
isStreamingRawContent = false;
if (streamStdout) {
// Tell the streaming consumer to stop writing text chunks;
// drop the sniff accumulator now so it can be GC'd.
onOutputEvent({ type: 'binary_detected' });
outputChunks.length = 0;
}
} else if (streamStdout && sniffedBytes >= MAX_SNIFF_SIZE) {
// Sniff passed in streaming mode — text confirmed, drop the
// accumulator. Subsequent chunks fall through to the streaming
// emit path below without ever touching outputChunks.
outputChunks.length = 0;
}
} else if (!streamStdout) {
// Buffered (foreground) mode past sniff: keep accumulating for
// the final emit at exit. Streaming mode does not accumulate.
outputChunks.push(data);
}
if (isStreamingRawContent) {
const decoder = stream === 'stdout' ? stdoutDecoder : stderrDecoder;
const decodedChunk = decoder.decode(data, { stream: true });
if (!isStreamingRawContent) {
// Binary mode: drop further data. Foreground emits the
// binary_detected event from handleExit (existing behavior);
// background already emitted it above.
return;
}
if (stream === 'stdout') {
stdout += decodedChunk;
} else {
stderr += decodedChunk;
}
const decoder = stream === 'stdout' ? stdoutDecoder : stderrDecoder;
const decodedChunk = decoder.decode(data, { stream: true });
if (streamStdout) {
// Streaming text mode: push through immediately, no string
// accumulation. (Up to ~4KB may already have been emitted
// before binary detection trips — bounded, acceptable.)
onOutputEvent({ type: 'data', chunk: decodedChunk });
return;
}
// Buffered text mode: accumulate for the final cleaned-blob emit.
if (stream === 'stdout') {
stdout += decodedChunk;
} else {
stderr += decodedChunk;
}
};
@ -451,7 +489,9 @@ export class ShellExecutionService {
const finalStrippedOutput = stripAnsi(combinedOutput).trim();
if (isStreamingRawContent) {
if (finalStrippedOutput) {
// In streaming mode chunks were already emitted as they arrived;
// re-emitting the final blob would duplicate everything.
if (!streamStdout && finalStrippedOutput) {
onOutputEvent({ type: 'data', chunk: finalStrippedOutput });
}
} else {

View file

@ -27,7 +27,14 @@ describe('FileExporter.serialize', () => {
afterEach(async () => {
await exporter.shutdown();
fs.rmSync(tmpDir, { recursive: true, force: true });
// Windows occasionally returns ENOTEMPTY when the underlying file
// handle isn't fully released yet; retry a few times before failing.
fs.rmSync(tmpDir, {
recursive: true,
force: true,
maxRetries: 5,
retryDelay: 50,
});
});
// Regression for upstream PR #4689: a raw JSON.stringify on a ReadableSpan

View file

@ -31,8 +31,6 @@ import {
} from '../services/shellExecutionService.js';
import * as fs from 'node:fs';
import * as os from 'node:os';
import { EOL } from 'node:os';
import * as path from 'node:path';
import * as crypto from 'node:crypto';
import { ToolErrorType } from './tool-error.js';
import { OUTPUT_UPDATE_INTERVAL_MS } from './shell.js';
@ -55,12 +53,14 @@ describe('ShellTool', () => {
getPermissionsDeny: vi.fn().mockReturnValue([]),
getDebugMode: vi.fn().mockReturnValue(false),
getTargetDir: vi.fn().mockReturnValue('/test/dir'),
getSessionId: vi.fn().mockReturnValue('test-session'),
getWorkspaceContext: vi
.fn()
.mockReturnValue(createMockWorkspaceContext('/test/dir')),
storage: {
getUserSkillsDirs: vi.fn().mockReturnValue(['/test/dir/.qwen/skills']),
getProjectTempDir: vi.fn().mockReturnValue('/tmp/qwen-temp'),
getProjectDir: vi.fn().mockReturnValue('/test/proj'),
},
getTruncateToolOutputThreshold: vi.fn().mockReturnValue(0),
getTruncateToolOutputLines: vi.fn().mockReturnValue(0),
@ -72,8 +72,24 @@ describe('ShellTool', () => {
email: 'qwen-coder@alibabacloud.com',
}),
getShouldUseNodePtyShell: vi.fn().mockReturnValue(false),
getBackgroundShellRegistry: vi.fn().mockReturnValue({
register: vi.fn(),
get: vi.fn(),
getAll: vi.fn().mockReturnValue([]),
cancel: vi.fn(),
complete: vi.fn(),
fail: vi.fn(),
}),
} as unknown as Config;
// executeBackground writes to disk; stub mkdirSync + createWriteStream.
vi.mocked(fs.mkdirSync).mockReturnValue(undefined);
vi.mocked(fs.createWriteStream).mockReturnValue({
write: vi.fn(),
end: vi.fn(),
on: vi.fn(),
} as unknown as fs.WriteStream);
shellTool = new ShellTool(mockConfig);
vi.mocked(os.platform).mockReturnValue('linux');
@ -271,83 +287,202 @@ describe('ShellTool', () => {
resolveExecutionPromise(fullResult);
};
it('should wrap background command on linux and parse pgrep output', async () => {
it('runs background commands as managed pool entries (no & / pgrep wrap)', async () => {
const registry = mockConfig.getBackgroundShellRegistry();
const invocation = shellTool.build({
command: 'my-command',
command: 'npm start',
is_background: true,
});
const promise = invocation.execute(mockAbortSignal);
resolveShellExecution({ pid: 54321 });
vi.mocked(fs.existsSync).mockReturnValue(true);
vi.mocked(fs.readFileSync).mockReturnValue(`54321${EOL}54322${EOL}`); // Service PID and background PID
const result = await invocation.execute(mockAbortSignal);
const result = await promise;
const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp');
const wrappedCommand = `{ my-command & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`;
// Spawn happens with the unwrapped command — no '&', no pgrep envelope.
// Streaming mode is on so dev-server / watcher output flushes to the
// output file as it arrives instead of buffering until exit.
expect(mockShellExecutionService).toHaveBeenCalledWith(
wrappedCommand,
'npm start',
'/test/dir',
expect.any(Function),
expect.any(AbortSignal),
false,
{},
{ streamStdout: true },
);
expect(result.llmContent).toContain('PIDs: 54322');
expect(vi.mocked(fs.unlinkSync)).toHaveBeenCalledWith(tmpFile);
// Entry registered with the spawn pid.
expect(registry.register).toHaveBeenCalledTimes(1);
const entry = (registry.register as Mock).mock.calls[0][0];
expect(entry.command).toBe('npm start');
expect(entry.cwd).toBe('/test/dir');
expect(entry.status).toBe('running');
expect(entry.pid).toBe(12345);
expect(typeof entry.shellId).toBe('string');
expect(entry.outputPath).toContain('shell-');
// Returns immediately with id + output path; agent's turn isn't blocked.
expect(result.llmContent).toContain(entry.shellId);
expect(result.llmContent).toContain(entry.outputPath);
});
it('should add ampersand to command when is_background is true and command does not end with &', async () => {
it('settles a background entry as completed when the process exits cleanly', async () => {
const registry = mockConfig.getBackgroundShellRegistry();
const invocation = shellTool.build({
command: 'npm start',
command: 'true',
is_background: true,
});
const promise = invocation.execute(mockAbortSignal);
resolveShellExecution({ pid: 54321 });
await invocation.execute(mockAbortSignal);
const entry = (registry.register as Mock).mock.calls[0][0];
vi.mocked(fs.existsSync).mockReturnValue(true);
vi.mocked(fs.readFileSync).mockReturnValue('54321\n54322\n');
resolveExecutionPromise({
rawOutput: Buffer.from(''),
output: '',
exitCode: 0,
signal: null,
error: null,
aborted: false,
pid: 12345,
executionMethod: 'child_process',
});
// Flush the .then() microtask attached to resultPromise.
await new Promise((r) => setImmediate(r));
await promise;
expect(registry.complete).toHaveBeenCalledWith(
entry.shellId,
0,
expect.any(Number),
);
expect(registry.fail).not.toHaveBeenCalled();
expect(registry.cancel).not.toHaveBeenCalled();
});
const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp');
const wrappedCommand = `{ npm start & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`;
it('settles a background entry as failed when ShellExecutionService reports error', async () => {
const registry = mockConfig.getBackgroundShellRegistry();
const invocation = shellTool.build({
command: 'no-such-command',
is_background: true,
});
await invocation.execute(mockAbortSignal);
const entry = (registry.register as Mock).mock.calls[0][0];
resolveExecutionPromise({
rawOutput: Buffer.from(''),
output: '',
exitCode: null,
signal: null,
error: new Error('spawn ENOENT'),
aborted: false,
pid: 12345,
executionMethod: 'child_process',
});
await new Promise((r) => setImmediate(r));
expect(registry.fail).toHaveBeenCalledWith(
entry.shellId,
'spawn ENOENT',
expect.any(Number),
);
expect(registry.complete).not.toHaveBeenCalled();
});
it('settles a background entry as failed on non-zero exit code (no error object)', async () => {
const registry = mockConfig.getBackgroundShellRegistry();
const invocation = shellTool.build({
command: 'false',
is_background: true,
});
await invocation.execute(mockAbortSignal);
const entry = (registry.register as Mock).mock.calls[0][0];
// ShellExecutionService reports a clean non-zero exit (no error object,
// no signal) — historically this got bucketed as `completed`, which
// misreported a failed `npm test` / `false` as a success.
resolveExecutionPromise({
rawOutput: Buffer.from(''),
output: '',
exitCode: 1,
signal: null,
error: null,
aborted: false,
pid: 12345,
executionMethod: 'child_process',
});
await new Promise((r) => setImmediate(r));
expect(registry.fail).toHaveBeenCalledWith(
entry.shellId,
expect.stringContaining('exited with code 1'),
expect.any(Number),
);
expect(registry.complete).not.toHaveBeenCalled();
});
it('strips trailing & from the spawned command (managed path handles backgrounding)', async () => {
const invocation = shellTool.build({
command: 'node server.js &',
is_background: true,
});
await invocation.execute(mockAbortSignal);
expect(mockShellExecutionService).toHaveBeenCalledWith(
wrappedCommand,
'node server.js',
'/test/dir',
expect.any(Function),
expect.any(AbortSignal),
false,
{},
{ streamStdout: true },
);
});
it('does not strip a trailing && (logical AND would be syntactically broken)', async () => {
const invocation = shellTool.build({
command: 'npm run dev &&',
is_background: true,
});
await invocation.execute(mockAbortSignal);
expect(mockShellExecutionService).toHaveBeenCalledWith(
'npm run dev &&',
expect.any(String),
expect.any(Function),
expect.any(AbortSignal),
false,
{},
{ streamStdout: true },
);
});
it('should not add extra ampersand when is_background is true and command already ends with &', async () => {
it('does not strip an escaped trailing \\& (literal &)', async () => {
const invocation = shellTool.build({
command: 'npm start &',
command: 'echo foo \\&',
is_background: true,
});
const promise = invocation.execute(mockAbortSignal);
resolveShellExecution({ pid: 54321 });
vi.mocked(fs.existsSync).mockReturnValue(true);
vi.mocked(fs.readFileSync).mockReturnValue('54321\n54322\n');
await promise;
const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp');
const wrappedCommand = `{ npm start & }; __code=$?; pgrep -g 0 >${tmpFile} 2>&1; exit $__code;`;
await invocation.execute(mockAbortSignal);
expect(mockShellExecutionService).toHaveBeenCalledWith(
wrappedCommand,
'echo foo \\&',
expect.any(String),
expect.any(Function),
expect.any(AbortSignal),
false,
{},
{ streamStdout: true },
);
});
it('does not forward the turn signal into the background shell', async () => {
// Verifies: the AbortSignal handed to ShellExecutionService is the
// entry's own controller, not the outer turn signal. Cancelling the
// turn must not kill an intentionally backgrounded dev server / watcher.
const turnAc = new AbortController();
const invocation = shellTool.build({
command: 'npm run dev',
is_background: true,
});
await invocation.execute(turnAc.signal);
const passedSignal = mockShellExecutionService.mock.calls[0][3];
expect(passedSignal).not.toBe(turnAc.signal);
turnAc.abort();
// The signal handed to ShellExecutionService stays un-aborted —
// the turn's abort doesn't propagate into the background shell.
expect(passedSignal.aborted).toBe(false);
});
it('should not add ampersand when is_background is false', async () => {
const invocation = shellTool.build({
command: 'npm test',
@ -479,23 +614,6 @@ describe('ShellTool', () => {
).toThrow('Directory must be an absolute path.');
});
it('should clean up the temp file on synchronous execution error', async () => {
const error = new Error('sync spawn error');
mockShellExecutionService.mockImplementation(() => {
throw error;
});
vi.mocked(fs.existsSync).mockReturnValue(true); // Pretend the file exists
const invocation = shellTool.build({
command: 'a-command',
is_background: false,
});
await expect(invocation.execute(mockAbortSignal)).rejects.toThrow(error);
const tmpFile = path.join(os.tmpdir(), 'shell_pgrep_abcdef.tmp');
expect(vi.mocked(fs.unlinkSync)).toHaveBeenCalledWith(tmpFile);
});
describe('Streaming to `updateOutput`', () => {
let updateOutputMock: Mock;
beforeEach(() => {
@ -1016,43 +1134,6 @@ describe('ShellTool', () => {
});
});
describe('Windows background execution', () => {
it('should clean up trailing ampersand on Windows for background tasks', async () => {
vi.mocked(os.platform).mockReturnValue('win32');
const mockAbortSignal = new AbortController().signal;
const invocation = shellTool.build({
command: 'npm start &',
is_background: true,
});
const promise = invocation.execute(mockAbortSignal);
// Simulate immediate success (process started)
resolveExecutionPromise({
rawOutput: Buffer.from(''),
output: '',
exitCode: 0,
signal: null,
error: null,
aborted: false,
pid: 12345,
executionMethod: 'child_process',
});
await promise;
expect(mockShellExecutionService).toHaveBeenCalledWith(
'npm start',
expect.any(String),
expect.any(Function),
expect.any(AbortSignal),
false,
{},
);
});
});
describe('timeout parameter', () => {
it('should validate timeout parameter correctly', async () => {
// Valid timeout
@ -1177,40 +1258,6 @@ describe('ShellTool', () => {
expect(calledSignal).not.toBe(mockAbortSignal);
});
it('should not create timeout signal for background execution', async () => {
const mockAbortSignal = new AbortController().signal;
const invocation = shellTool.build({
command: 'npm start',
is_background: true,
timeout: 5000,
});
const promise = invocation.execute(mockAbortSignal);
resolveExecutionPromise({
rawOutput: Buffer.from(''),
output: 'Background command started. PID: 12345',
exitCode: 0,
signal: null,
error: null,
aborted: false,
pid: 12345,
executionMethod: 'child_process',
});
await promise;
// For background execution, the original signal should be used
expect(mockShellExecutionService).toHaveBeenCalledWith(
expect.any(String),
expect.any(String),
expect.any(Function),
mockAbortSignal,
false,
{},
);
});
it('should handle timeout vs user cancellation correctly', async () => {
const userAbortController = new AbortController();
const invocation = shellTool.build({

View file

@ -6,7 +6,7 @@
import fs from 'node:fs';
import path from 'node:path';
import os, { EOL } from 'node:os';
import os from 'node:os';
import crypto from 'node:crypto';
import type { Config } from '../config/config.js';
import { ToolNames, ToolDisplayNames } from './tool-names.js';
@ -29,6 +29,8 @@ import type {
ShellOutputEvent,
} from '../services/shellExecutionService.js';
import { ShellExecutionService } from '../services/shellExecutionService.js';
import type { BackgroundShellEntry } from '../services/backgroundShellRegistry.js';
import stripAnsi from 'strip-ansi';
import { formatMemoryUsage } from '../utils/formatters.js';
import type { AnsiOutput } from '../utils/terminalSerializer.js';
import { isSubpaths } from '../utils/paths.js';
@ -46,6 +48,20 @@ import {
const debugLogger = createDebugLogger('SHELL');
/**
* Strip a single bare trailing `&` (bash background operator) from a
* command string. Returns the input unchanged if the trailing form is
* `&&` (logical AND), `\&` (escaped literal `&`), or there is no `&`
* at the end at all. Linear time, no regex backtracking risk.
*/
function stripTrailingBackgroundAmp(command: string): string {
const trimmed = command.trimEnd();
if (!trimmed.endsWith('&')) return command;
if (trimmed.endsWith('&&')) return command;
if (trimmed.endsWith('\\&')) return command;
return trimmed.slice(0, -1).trimEnd();
}
export const OUTPUT_UPDATE_INTERVAL_MS = 1000;
const DEFAULT_FOREGROUND_TIMEOUT_MS = 120000;
@ -208,9 +224,12 @@ export class ShellToolInvocation extends BaseToolInvocation<
};
}
const effectiveTimeout = this.params.is_background
? undefined
: (this.params.timeout ?? DEFAULT_FOREGROUND_TIMEOUT_MS);
if (this.params.is_background) {
return this.executeBackground(signal, shellExecutionConfig);
}
const effectiveTimeout =
this.params.timeout ?? DEFAULT_FOREGROUND_TIMEOUT_MS;
// Create combined signal with timeout for foreground execution
let combinedSignal = signal;
@ -219,294 +238,345 @@ export class ShellToolInvocation extends BaseToolInvocation<
combinedSignal = AbortSignal.any([signal, timeoutSignal]);
}
const isWindows = os.platform() === 'win32';
const tempFileName = `shell_pgrep_${crypto
.randomBytes(6)
.toString('hex')}.tmp`;
const tempFilePath = path.join(os.tmpdir(), tempFileName);
// Add co-author to git commit commands
const processedCommand = this.addCoAuthorToGitCommit(strippedCommand);
const commandToExecute = processedCommand;
const cwd = this.params.directory || this.config.getTargetDir();
try {
// Add co-author to git commit commands
const processedCommand = this.addCoAuthorToGitCommit(strippedCommand);
let cumulativeOutput: string | AnsiOutput = '';
let lastUpdateTime = Date.now();
let isBinaryStream = false;
let totalLines = 0;
let totalBytes = 0;
const shouldRunInBackground = this.params.is_background;
let finalCommand = processedCommand;
const { result: resultPromise, pid } = await ShellExecutionService.execute(
commandToExecute,
cwd,
(event: ShellOutputEvent) => {
let shouldUpdate = false;
// On non-Windows, use & to run in background.
// On Windows, we don't use start /B because it creates a detached process that
// doesn't die when the parent dies. Instead, we rely on the race logic below
// to return early while keeping the process attached (detached: false).
if (
!isWindows &&
shouldRunInBackground &&
!finalCommand.trim().endsWith('&')
) {
finalCommand = finalCommand.trim() + ' &';
}
// On Windows, we rely on the race logic below to handle background tasks.
// We just ensure the command string is clean.
if (isWindows && shouldRunInBackground) {
finalCommand = finalCommand.trim().replace(/&+$/, '').trim();
}
// On non-Windows background commands, wrap with pgrep to capture
// subprocess PIDs so we can report them to the user.
const commandToExecute =
!isWindows && shouldRunInBackground
? (() => {
let command = finalCommand.trim();
if (!command.endsWith('&')) command += ';';
return `{ ${command} }; __code=$?; pgrep -g 0 >${tempFilePath} 2>&1; exit $__code;`;
})()
: finalCommand;
const cwd = this.params.directory || this.config.getTargetDir();
let cumulativeOutput: string | AnsiOutput = '';
let lastUpdateTime = Date.now();
let isBinaryStream = false;
let totalLines = 0;
let totalBytes = 0;
const { result: resultPromise, pid } =
await ShellExecutionService.execute(
commandToExecute,
cwd,
(event: ShellOutputEvent) => {
let shouldUpdate = false;
switch (event.type) {
case 'data':
if (isBinaryStream) break;
cumulativeOutput = event.chunk;
// Stats are only consumed by the ANSI-output branch below,
// so skip the per-chunk accounting for plain string chunks.
if (Array.isArray(event.chunk)) {
totalLines = event.chunk.length;
totalBytes = event.chunk.reduce(
(sum, line) =>
sum +
line.reduce(
(ls, token) =>
ls + Buffer.byteLength(token.text, 'utf-8'),
0,
),
switch (event.type) {
case 'data':
if (isBinaryStream) break;
cumulativeOutput = event.chunk;
// Stats are only consumed by the ANSI-output branch below,
// so skip the per-chunk accounting for plain string chunks.
if (Array.isArray(event.chunk)) {
totalLines = event.chunk.length;
totalBytes = event.chunk.reduce(
(sum, line) =>
sum +
line.reduce(
(ls, token) => ls + Buffer.byteLength(token.text, 'utf-8'),
0,
);
}
shouldUpdate = true;
break;
case 'binary_detected':
isBinaryStream = true;
cumulativeOutput =
'[Binary output detected. Halting stream...]';
shouldUpdate = true;
break;
case 'binary_progress':
isBinaryStream = true;
cumulativeOutput = `[Receiving binary output... ${formatMemoryUsage(
event.bytesReceived,
)} received]`;
if (Date.now() - lastUpdateTime > OUTPUT_UPDATE_INTERVAL_MS) {
shouldUpdate = true;
}
break;
default: {
throw new Error('An unhandled ShellOutputEvent was found.');
}
),
0,
);
}
if (shouldUpdate && updateOutput) {
if (typeof cumulativeOutput === 'string') {
updateOutput(cumulativeOutput);
} else {
updateOutput({
ansiOutput: cumulativeOutput,
totalLines,
totalBytes,
// Only include timeout when user explicitly set it
...(this.params.timeout != null && {
timeoutMs: this.params.timeout,
}),
});
}
lastUpdateTime = Date.now();
shouldUpdate = true;
break;
case 'binary_detected':
isBinaryStream = true;
cumulativeOutput = '[Binary output detected. Halting stream...]';
shouldUpdate = true;
break;
case 'binary_progress':
isBinaryStream = true;
cumulativeOutput = `[Receiving binary output... ${formatMemoryUsage(
event.bytesReceived,
)} received]`;
if (Date.now() - lastUpdateTime > OUTPUT_UPDATE_INTERVAL_MS) {
shouldUpdate = true;
}
},
combinedSignal,
shouldRunInBackground
? false
: this.config.getShouldUseNodePtyShell(),
shellExecutionConfig ?? {},
);
if (pid && setPidCallback) {
setPidCallback(pid);
}
// On Windows, background commands rely on early return since there's
// no & backgrounding or pgrep. Awaiting would block until completion.
if (shouldRunInBackground && isWindows) {
const pidMsg = pid ? ` PID: ${pid}` : '';
const killHint = ' (Use taskkill /F /T /PID <pid> to stop)';
return {
llmContent: `Background command started.${pidMsg}${killHint}`,
returnDisplay: `Background command started.${pidMsg}${killHint}`,
};
}
const result = await resultPromise;
if (shouldRunInBackground) {
// Read subprocess PIDs captured by the pgrep wrapper (non-Windows only)
const backgroundPIDs: number[] = [];
if (!isWindows) {
if (fs.existsSync(tempFilePath)) {
const pgrepLines = fs
.readFileSync(tempFilePath, 'utf8')
.split(EOL)
.filter(Boolean);
for (const line of pgrepLines) {
if (!/^\d+$/.test(line)) {
debugLogger.warn(`pgrep: ${line}`);
continue;
}
const bgPid = Number(line);
if (bgPid !== result.pid) {
backgroundPIDs.push(bgPid);
}
}
} else if (!signal.aborted) {
debugLogger.warn('missing pgrep output');
break;
default: {
throw new Error('An unhandled ShellOutputEvent was found.');
}
}
const bgPidMsg =
backgroundPIDs.length > 0
? ` PIDs: ${backgroundPIDs.join(', ')}`
: pid
? ` PID: ${pid}`
: '';
const killHint = ' (Use kill <pid> to stop)';
return {
llmContent: `Background command started.${bgPidMsg}${killHint}`,
returnDisplay: `Background command started.${bgPidMsg}${killHint}`,
};
}
let llmContent = '';
if (result.aborted) {
// Check if it was a timeout or user cancellation
const wasTimeout =
!this.params.is_background &&
effectiveTimeout &&
combinedSignal.aborted &&
!signal.aborted;
if (wasTimeout) {
llmContent = `Command timed out after ${effectiveTimeout}ms before it could complete.`;
if (result.output.trim()) {
llmContent += ` Below is the output before it timed out:\n${result.output}`;
if (shouldUpdate && updateOutput) {
if (typeof cumulativeOutput === 'string') {
updateOutput(cumulativeOutput);
} else {
llmContent += ' There was no output before it timed out.';
}
} else {
llmContent =
'Command was cancelled by user before it could complete.';
if (result.output.trim()) {
llmContent += ` Below is the output before it was cancelled:\n${result.output}`;
} else {
llmContent += ' There was no output before it was cancelled.';
updateOutput({
ansiOutput: cumulativeOutput,
totalLines,
totalBytes,
// Only include timeout when user explicitly set it
...(this.params.timeout != null && {
timeoutMs: this.params.timeout,
}),
});
}
lastUpdateTime = Date.now();
}
} else {
// Create a formatted error string for display, replacing the wrapper command
// with the user-facing command.
const finalError = result.error
? result.error.message.replace(commandToExecute, this.params.command)
: '(none)';
},
combinedSignal,
this.config.getShouldUseNodePtyShell(),
shellExecutionConfig ?? {},
);
llmContent = [
`Command: ${this.params.command}`,
`Directory: ${this.params.directory || '(root)'}`,
`Output: ${result.output || '(empty)'}`,
`Error: ${finalError}`, // Use the cleaned error string.
`Exit Code: ${result.exitCode ?? '(none)'}`,
`Signal: ${result.signal ?? '(none)'}`,
`Process Group PGID: ${result.pid ?? '(none)'}`,
].join('\n');
}
if (pid && setPidCallback) {
setPidCallback(pid);
}
let returnDisplayMessage = '';
if (this.config.getDebugMode()) {
returnDisplayMessage = llmContent;
} else {
const result = await resultPromise;
let llmContent = '';
if (result.aborted) {
// Check if it was a timeout or user cancellation
const wasTimeout =
effectiveTimeout && combinedSignal.aborted && !signal.aborted;
if (wasTimeout) {
llmContent = `Command timed out after ${effectiveTimeout}ms before it could complete.`;
if (result.output.trim()) {
returnDisplayMessage = result.output;
llmContent += ` Below is the output before it timed out:\n${result.output}`;
} else {
if (result.aborted) {
// Check if it was a timeout or user cancellation
const wasTimeout =
!this.params.is_background &&
effectiveTimeout &&
combinedSignal.aborted &&
!signal.aborted;
returnDisplayMessage = wasTimeout
? `Command timed out after ${effectiveTimeout}ms.`
: 'Command cancelled by user.';
} else if (result.signal) {
returnDisplayMessage = `Command terminated by signal: ${result.signal}`;
} else if (result.error) {
returnDisplayMessage = `Command failed: ${getErrorMessage(
result.error,
)}`;
} else if (result.exitCode !== null && result.exitCode !== 0) {
returnDisplayMessage = `Command exited with code: ${result.exitCode}`;
}
// If output is empty and command succeeded (code 0, no error/signal/abort),
// returnDisplayMessage will remain empty, which is fine.
llmContent += ' There was no output before it timed out.';
}
} else {
llmContent = 'Command was cancelled by user before it could complete.';
if (result.output.trim()) {
llmContent += ` Below is the output before it was cancelled:\n${result.output}`;
} else {
llmContent += ' There was no output before it was cancelled.';
}
}
} else {
// Create a formatted error string for display, replacing the wrapper command
// with the user-facing command.
const finalError = result.error
? result.error.message.replace(commandToExecute, this.params.command)
: '(none)';
// Truncate large output and save full content to a temp file.
if (typeof llmContent === 'string') {
const truncatedResult = await truncateToolOutput(
this.config,
ShellTool.Name,
llmContent,
);
llmContent = [
`Command: ${this.params.command}`,
`Directory: ${this.params.directory || '(root)'}`,
`Output: ${result.output || '(empty)'}`,
`Error: ${finalError}`, // Use the cleaned error string.
`Exit Code: ${result.exitCode ?? '(none)'}`,
`Signal: ${result.signal ?? '(none)'}`,
`Process Group PGID: ${result.pid ?? '(none)'}`,
].join('\n');
}
if (truncatedResult.outputFile) {
llmContent = truncatedResult.content;
returnDisplayMessage +=
(returnDisplayMessage ? '\n' : '') +
`Output too long and was saved to: ${truncatedResult.outputFile}`;
let returnDisplayMessage = '';
if (this.config.getDebugMode()) {
returnDisplayMessage = llmContent;
} else {
if (result.output.trim()) {
returnDisplayMessage = result.output;
} else {
if (result.aborted) {
// Check if it was a timeout or user cancellation
const wasTimeout =
effectiveTimeout && combinedSignal.aborted && !signal.aborted;
returnDisplayMessage = wasTimeout
? `Command timed out after ${effectiveTimeout}ms.`
: 'Command cancelled by user.';
} else if (result.signal) {
returnDisplayMessage = `Command terminated by signal: ${result.signal}`;
} else if (result.error) {
returnDisplayMessage = `Command failed: ${getErrorMessage(
result.error,
)}`;
} else if (result.exitCode !== null && result.exitCode !== 0) {
returnDisplayMessage = `Command exited with code: ${result.exitCode}`;
}
}
const executionError = result.error
? {
error: {
message: result.error.message,
type: ToolErrorType.SHELL_EXECUTE_ERROR,
},
}
: {};
return {
llmContent,
returnDisplay: returnDisplayMessage,
...executionError,
};
} finally {
if (fs.existsSync(tempFilePath)) {
fs.unlinkSync(tempFilePath);
// If output is empty and command succeeded (code 0, no error/signal/abort),
// returnDisplayMessage will remain empty, which is fine.
}
}
// Truncate large output and save full content to a temp file.
if (typeof llmContent === 'string') {
const truncatedResult = await truncateToolOutput(
this.config,
ShellTool.Name,
llmContent,
);
if (truncatedResult.outputFile) {
llmContent = truncatedResult.content;
returnDisplayMessage +=
(returnDisplayMessage ? '\n' : '') +
`Output too long and was saved to: ${truncatedResult.outputFile}`;
}
}
const executionError = result.error
? {
error: {
message: result.error.message,
type: ToolErrorType.SHELL_EXECUTE_ERROR,
},
}
: {};
return {
llmContent,
returnDisplay: returnDisplayMessage,
...executionError,
};
}
/**
* Background-execution path: spawn the command into a managed registry
* entry instead of detaching with `&`. Output streams to a per-shell file
* the agent can `Read`; cancellation flows through the entry's
* AbortController; the registry's terminal status is set when the process
* exits. Returns immediately so the agent's turn isn't blocked.
*/
private async executeBackground(
signal: AbortSignal,
shellExecutionConfig?: ShellExecutionConfig,
): Promise<ToolResult> {
const strippedCommand = stripShellWrapper(this.params.command);
// Strip a single bare trailing `&` (the bash background operator) before
// spawn: bash treats it as background-detach, exits the wrapper
// immediately, and the real child outlives the wrapper — the registry
// would settle as `completed` while the shell is still running, and
// chunked output would land on a closed stream. The managed path is
// itself the backgrounding mechanism, so the trailing `&` is redundant.
//
// Deliberately precise: do not touch `&&` (logical AND), `\&` (escaped
// literal `&`), or commands without a trailing `&`. Earlier `\s*&+\s*$`
// was both too greedy (it ate `&&` and `\&`) and a ReDoS hazard on
// long all-`&` inputs. Plain string checks here are linear and clearer
// than a lookbehind regex.
const noTrailingAmp = stripTrailingBackgroundAmp(strippedCommand);
if (noTrailingAmp !== strippedCommand) {
debugLogger.warn(
'Stripped trailing & from background shell command — managed path handles backgrounding',
);
}
const processedCommand = this.addCoAuthorToGitCommit(noTrailingAmp);
const cwd = this.params.directory || this.config.getTargetDir();
// Output goes under the project temp dir (which `ReadFileTool`
// auto-allows by default), so the LLM can `Read` the captured output
// without bouncing off a permission prompt — important because
// background-agent contexts can't surface interactive prompts.
const outputDir = path.join(
this.config.storage.getProjectTempDir(),
'background-shells',
this.config.getSessionId(),
);
fs.mkdirSync(outputDir, { recursive: true });
const shellId = `bg_${crypto.randomBytes(4).toString('hex')}`;
const outputPath = path.join(outputDir, `shell-${shellId}.output`);
// Background shells are explicitly independent of the current turn:
// the user pressing Ctrl+C on a turn (which aborts `signal`) should
// NOT kill a long-running dev server / watcher they intentionally
// backgrounded. Cancellation flows only through the entry's own
// AbortController, driven by future `task_stop` integration (#3471).
// The `signal` parameter is still honored for the synchronous early
// return below (don't even spawn if the agent already aborted), but
// we deliberately do not forward it.
const entryAc = new AbortController();
const outputStream = fs.createWriteStream(outputPath, { flags: 'w' });
// Without an 'error' listener, a write failure (disk full, permission
// change, fs going away) would surface as an uncaught exception and
// kill the entire CLI session. Log + drop is the sane default — the
// process keeps running, the registry still settles via resultPromise.
outputStream.on('error', (err) => {
debugLogger.warn(
`background shell ${shellId} output write error: ${err.message}`,
);
});
const startTime = Date.now();
const entry: BackgroundShellEntry = {
shellId,
command: processedCommand,
cwd,
status: 'running',
startTime,
outputPath,
abortController: entryAc,
};
const { result: resultPromise, pid } = await ShellExecutionService.execute(
processedCommand,
cwd,
(event: ShellOutputEvent) => {
if (event.type === 'data' && typeof event.chunk === 'string') {
// Strip ANSI escape codes (color, cursor-move, clear-screen) before
// writing — agents read the file as plain text, and dev servers /
// build tools spam plenty of escape sequences that would render as
// garbage. Costs ~one regex per chunk; cheap relative to disk I/O.
outputStream.write(stripAnsi(event.chunk));
}
// ANSI array chunks and binary streams are not written to the output
// file: agents read the file as plain text and binary spam would be
// unhelpful.
},
entryAc.signal,
// Background shells are non-interactive by design — no terminal to
// attach a PTY to, no human to type at it. Force the child_process
// path so we don't pull in node-pty for fire-and-forget commands.
false,
shellExecutionConfig ?? {},
// Stream stdout/stderr through to the output file as chunks arrive.
// Default child_process mode buffers until exit, which would leave
// dev-server / watcher output files empty until the process dies.
{ streamStdout: true },
);
if (pid !== undefined) entry.pid = pid;
const registry = this.config.getBackgroundShellRegistry();
registry.register(entry);
// Settle in the background — do NOT await here, the agent should be
// unblocked immediately.
void resultPromise.then(
(result) => {
outputStream.end();
const endTime = Date.now();
if (entryAc.signal.aborted) {
if (registry.get(shellId)?.status === 'running') {
registry.cancel(shellId, endTime);
}
} else if (
result.error ||
(result.exitCode !== null && result.exitCode !== 0) ||
result.signal !== null
) {
// Non-zero exit / killed by signal / spawn error all count as failed.
// Treating them as `completed` would let `/tasks` (and any future
// model-facing notification) misreport a failed `npm test` or
// `false` command as a success.
const reason = result.error
? result.error.message
: result.signal !== null
? `terminated by signal ${result.signal}`
: `exited with code ${result.exitCode}`;
registry.fail(shellId, reason, endTime);
} else {
registry.complete(shellId, result.exitCode ?? 0, endTime);
}
},
(err) => {
outputStream.end();
registry.fail(shellId, getErrorMessage(err), Date.now());
},
);
const pidLine = pid !== undefined ? `pid: ${pid}\n` : '';
return {
llmContent:
`Background shell started.\n` +
`id: ${shellId}\n` +
pidLine +
`output file: ${outputPath}\n` +
`Use the /tasks command to list and inspect background shells, or Read the output file directly.`,
returnDisplay: `Background shell ${shellId} started${pid !== undefined ? ` (pid ${pid})` : ''}.`,
};
}
private addCoAuthorToGitCommit(command: string): string {