feat(cli): headless support and SDK task events for background agents (#3379)

* feat(cli): unify notification queue for cron and background agents

Migrate cron from its own queue (cronQueueRef / cronQueue) to the shared
notification queue used by background agents. Both producers now push the
same item shape { displayText, modelText, sendMessageType } and a single
drain effect / helper processes them in FIFO order.

Cron fires render as HistoryItemNotification (● prefix) instead of
HistoryItemUser (> prefix), with a "Cron: <prompt>" display label.
Records use subtype 'cron' for clean resume and analytics separation.

Lift the non-interactive rejection for background agents. Register a
notification callback in nonInteractiveCli.ts with a terminal hold-back
phase (100ms poll) that keeps the process alive until all background
agents complete and their notifications are processed.

* feat(cli): emit SDK task events for background subagents

Emit `task_started` when a background agent registers and
`task_notification` when it completes, fails, or is cancelled, so
headless/SDK consumers can track lifecycle without parsing display
text. Model-facing text is now structured XML with status, summary,
truncated result, and usage stats. Completion stats (tokens, tool
uses, duration) are captured from the subagent and included in both
the SDK payload and the model XML.

* fix: address codex review issues for background subagents

- Background subagents now inherit the resolved approval mode from
  agentConfig instead of the raw session config, so a subagent with
  `approvalMode: auto-edit` (or execution in a trusted folder) keeps
  that override when it runs asynchronously.
- Non-interactive cron drains are single-flight: concurrent cron fires
  now await the same in-flight drain, and the cron-done check gates
  on it, preventing the final result from being emitted while a cron
  turn is still streaming.
- Background forks go through createForkSubagent so they retain the
  parent's rendered system prompt and inherited history instead of
  degrading to a plain FORK_AGENT.

* fix(cli): restore cancellation, approval, and error paths in queued drain

- Hold-back loop now reacts to SIGINT/SIGTERM: when the main abort
  signal fires it calls registry.abortAll() so background agents with
  their own AbortControllers stop promptly instead of pinning the
  process open.
- Queued-turn tool execution forwards the stream-json approval update
  callback (onToolCallsUpdate) so permission-gated tools inside a
  background-notification follow-up emit can_use_tool requests.
- Queued-turn stream loop mirrors the main loop's text-mode handling
  of GeminiEventType.Error, writing to stderr and throwing so provider
  errors produce a non-zero exit code instead of silently succeeding.
- Interactive cron prompts go through the normal slash/@-command/shell
  preprocessing again; only Notification messages skip that path.

* fix(cli): skip duplicate user-message item for cron prompts

Cron prompts already render as a `● Cron: …` notification via the queue
drain, so adding them again as a `USER` history item produced a
duplicate `> …` line.

* fix(cli): honor SIGINT/SIGTERM during cron scheduler wait

The non-interactive cron phase awaits a Promise that resolves only when
scheduler.size reaches 0 and no drain is in flight. Recurring cron jobs
never drop the scheduler size to 0 on their own, so the previous abort
handling (added to the hold-back loop) was unreachable — the process
hung indefinitely after SIGINT/SIGTERM. Attach an abort listener inside
the promise so abort stops the scheduler and resolves immediately,
allowing the hold-back loop to run and the process to exit cleanly.

* feat(core): propagate tool-use id through background agent notifications

Plumb the scheduler's callId into AgentToolInvocation via an optional
setCallId hook on the invocation, detected structurally in
buildInvocation. The agent tool forwards it as toolUseId on the
BackgroundTaskRegistry entry so completion notifications can carry a
<tool-use-id> tag and SDK task_started / task_notification events can
emit tool_use_id — letting consumers correlate background completions
back to the original Agent tool-use that spawned them.

* fix(cli): drain single-flight race kept task_notification from emitting

drainLocalQueue wrapped its body in an async IIFE and cleared the
promise reference via finally. When the queue is empty the IIFE has
no awaits, so its finally runs synchronously as part of the RHS of
the assignment `drainPromise = (async () => {...})()` — clearing
drainPromise BEFORE the outer assignment overwrites it with the
resolved promise. The reference then stayed stuck on that fulfilled
promise forever, so later calls short-circuited through
`if (drainPromise) return drainPromise` and never processed
queued notifications.

Symptom: in headless `--output-format json` (and `stream-json`),
task_started emitted but task_notification never did, even after
the background agent completed. The process sat in the hold-back
loop until SIGTERM.

Fix: move the null-clearing out of the async body into an outer
`.finally()` on the returned promise. `.finally()` runs as a
microtask after the current synchronous block, so it clears the
latest drainPromise reference instead of the pre-assignment null.

* fix(cli): append newline to text-mode emitResult so zsh PROMPT_SP doesn't erase the line

Headless text mode wrote `resultMessage.result` without a trailing newline.
In a TTY, zsh themes that use PROMPT_SP (powerlevel10k, agnoster, …) detect
the missing `\n` and emit `\r\033[K` before drawing the next prompt, which
wipes the final line off the screen. Pipe-captured output was unaffected,
so the bug only surfaced for interactive shell users — most visibly in the
background-agent flow where the drain-loop's final assistant message is
the *only* stdout write in text mode.

Append `\n` to both the success (stdout) and error (stderr) writes.

* docs(skill): tighten worked-example blurb in structured-debugging

Mirror the simplified blurb from .claude/skills/structured-debugging/SKILL.md
(knowledge repo). Drops the round-by-round narrative; keeps the contradiction
+ two lessons.

* docs(skill): mirror SKILL.md improvements (reframing failure mode, generalized path, value-logging guidance)

Mirror of knowledge repo commit 38eb28d into the qwen-code .qwen/skills
copy.

* docs(skill): mirror worked example into .qwen/skills/structured-debugging/

Mirrors knowledge/.claude/skills/structured-debugging/examples/
headless-bg-agent-empty-stdout.md so the .qwen copy of the skill links
resolve.

* docs(skill): mirror generalized side-note path guidance

* fix(cli): harden headless cron and background-agent failure paths

Three regressions surfaced by Codex review of feat/background-subagent:

- Cron drain rejections were dropped by a bare `void`, so a failing
  queued turn left the outer Promise unresolved and hung the run. Route
  drain failures through the Promise's reject so they propagate to the
  outer catch.
- The background-agent registry entry was inserted before
  `createForkSubagent()` / `createAgentHeadless()` was awaited. Failed
  init returned an error from the tool call but left a phantom `running`
  entry, and the headless hold-back loop (`registry.getRunning()`) waited
  forever. Register only after init succeeds.
- SIGINT/SIGTERM during the hold-back phase aborted background tasks,
  then fell through to `emitResult({ isError: false })`, so a cancelled
  `qwen -p ...` exited 0 with the prior assistant text. Route through
  `handleCancellationError()` so cancellation exits non-zero, matching
  the main turn loop.

* test(cli): update stdout/stderr assertions for trailing newline

`feadf052f` appended `\n` to text-mode `emitResult` output, but the
nonInteractiveCli tests still asserted the pre-change strings. Update
the 11 affected assertions to expect the trailing newline.

* fix: address review comments on background-agent notifications

Four additional issues from the PR review that the prior regression-fix
commit didn't cover:

- Escape XML metacharacters when interpolating `description`, `result`,
  `error`, `agentId`, `toolUseId`, and `status` into the task-notification
  envelope. Subagent output (which itself may carry untrusted tool output,
  fetched HTML, or another agent's notification) could contain
  `</result>` or `</task-notification>` and forge sibling tags the parent
  model would treat as trusted metadata. Truncate result text *before*
  escaping so the truncation never slices through an entity like `&amp;`.
- Emit the terminal notification from `cancel()` and `abortAll()`. The
  fire-and-forget `complete()`/`fail()` from the subagent task is guarded
  by `status !== 'running'` and was no-op'd after cancellation, so SDK
  consumers saw `task_started` with no matching `task_notification`,
  breaking the contract this PR establishes. Updated two race-guard
  tests that asserted the old behavior.
- Call `adapter.finalizeAssistantMessage()` before the abort-triggered
  early return inside `drainOneItem`'s stream loop. Without it,
  `startAssistantMessage()` had already been called, so stream-json mode
  left `message_start` unpaired.
- Enforce `config.getMaxSessionTurns()` in `drainOneItem` for symmetry
  with the main turn loop. Cron fires and notification replies otherwise
  bypass the budget cap in headless runs.
This commit is contained in:
tanzhenxin 2026-04-17 14:42:44 +08:00 committed by GitHub
parent 37d1924a3b
commit 0da1182b74
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
14 changed files with 780 additions and 261 deletions

View file

@ -1,6 +1,6 @@
---
name: structured-debugging
description:
description: >
Hypothesis-driven debugging methodology for hard bugs. Use this skill whenever
you're investigating non-trivial bugs, unexpected behavior, flaky tests, or
tracing issues through complex systems. Activate proactively when debugging
@ -32,11 +32,9 @@ Good: "The leader hangs because `hasActiveTeammates()` returns true after all ag
have reported completed, likely because terminal status isn't being set on the agent
object after the backend process exits."
Create a side note file for the investigation:
```
~/.qwen/investigations/<project>-<issue>.md
```
For bugs you expect to take more than one round, create a side note file
for the investigation in whichever location the project uses for such
notes.
Write your hypothesis there. This file persists across conversation turns and even
across sessions — it's your investigation journal.
@ -49,6 +47,12 @@ confirm or reject your hypothesis. Think about what data you need to see.
Don't scatter `console.log` everywhere. Identify the 2-3 places where your
hypothesis makes a testable prediction, and instrument those.
Prefer logging _values_ (return codes, payload contents, stream types,
message bodies, env state) over _presence checks_ ("was this function
called?", "was this branch taken?"). Code-path traces tell you what ran;
data traces tell you what it ran on. Most non-trivial bugs are correct
code processing wrong data.
Ask yourself: "If my hypothesis is correct, what will I see at point X?
If it's wrong, what will I see instead?"
@ -129,6 +133,23 @@ is still broken if the inbox contains stale messages from a previous run.
Always inspect the _content_ flowing through the code, not just whether the code
runs. Check payloads, message contents, file data, and database state.
### Reframing the user's report instead of investigating it
When the user reports a symptom your own run doesn't reproduce, the
contradiction _is_ the evidence — the two environments differ in some way
you haven't identified yet. The wrong move is to reframe their report
("they must be on a stale SHA", "they must be confused about what they
saw", "must be a flake") so that your run becomes the ground truth. Once
you do that, every later piece of evidence gets bent to defend the
reframing, and the actual bug stays hidden.
The right move: catalogue what differs between their environment and
yours (TTY vs pipe, terminal emulator, shell, locale, env vars, prior
state, build artifacts) before forming any hypothesis. For ambiguous
symptoms ("no output", "it's slow", "it's wrong") ask one disambiguating
question first — e.g., "does it hang or exit cleanly?" — that prunes the
hypothesis space cheaply before any test run.
### Losing context across attempts
After several debugging rounds, you start forgetting what you already tried and
@ -164,3 +185,10 @@ Fix: [what you're changing and why it addresses the root cause]
```
Then apply the fix, remove instrumentation, and verify with a clean run.
## Worked examples
- [`examples/headless-bg-agent-empty-stdout.md`](examples/headless-bg-agent-empty-stdout.md)
— pipe-captured runs all passed; the user's TTY printed nothing. The
contradiction _was_ the bug. Illustrates _reproduction contradiction is
data_ and _instrument data, not code paths_.

View file

@ -0,0 +1,59 @@
# Worked example: headless run prints empty stdout in zsh TTY
A short qwen-code case to illustrate two failure modes from `SKILL.md`:
_reproduction contradiction is data_, and _instrument the data flow, not
just the code path_.
## The bug
User: `npm run dev -- -p "..."` in zsh prints nothing. Process exits clean,
`~/.qwen/logs` shows the model returned proper text. Stdout was empty.
Cause: `JsonOutputAdapter.emitResult` wrote `resultMessage.result` without
a trailing `\n`. zsh's `PROMPT_SP` (powerlevel10k, agnoster, …) detects
the missing newline and emits `\r\033[K` before drawing the next prompt,
erasing the line. Pipe-captured stdout has no `PROMPT_SP`, so the bug is
invisible there.
Fix: append `\n` to the write.
## What made the case instructive
Every reproduction attempt from a debugging environment that captures
stdout (Cursor's Shell tool, `out=$(...)`, `tee`, file redirect) **passed**.
14/14 success against the user's 0/N. Same SHA, same machine, same
command. The only variable was: pipe stdout vs TTY stdout.
That contradiction was the entire investigation. Once it was named, the
fix was one line.
## Lessons mapped to SKILL.md
- **Reproduction contradiction is data, not user error.** When your run
succeeds and the user's fails on identical state, the _difference
between the two environments_ is where the bug lives. Catalogue what
differs (TTY vs pipe, terminal emulator, shell, locale, env vars,
prior state) before forming any hypothesis. Reframing the user's
report ("they must be on stale code") burns rounds and credibility.
- **Ask the one disambiguating question first.** "Does it hang or exit
cleanly?" would have falsified the most tempting wrong hypothesis here
(the recently-fixed drain-loop hang) on turn one. For any "no output"
report, that question is free and prunes half the hypothesis space.
- **Instrument the data flow, not just the code path.** Tracing whether
`write` was called showed the happy path firing every time and resolved
nothing. The breakthrough was logging the _return value_ of
`process.stdout.write` together with `process.stdout.isTTY`. Code-path
traces tell you what ran; data traces tell you what it ran on.
- **Pipe ≠ TTY.** A passing pipe-captured run does not prove a TTY user
sees the same output. Shell prompts can post-process trailing-newline-
less writes; terminals can swallow control sequences; pipes do
neither. When debugging interactive-shell symptoms, get evidence from
the user's actual terminal at least once.
## Reference
Fix commit: qwen-code `feadf052f`
`fix(cli): append newline to text-mode emitResult so zsh PROMPT_SP doesn't erase the line`

View file

@ -67,9 +67,9 @@ export class JsonOutputAdapter
if (this.config.getOutputFormat() === 'text') {
if (resultMessage.is_error) {
process.stderr.write(`${resultMessage.error?.message || ''}`);
process.stderr.write(`${resultMessage.error?.message || ''}\n`);
} else {
process.stdout.write(`${resultMessage.result}`);
process.stdout.write(`${resultMessage.result}\n`);
}
} else {
// Emit the entire messages array as JSON (includes all main agent + subagent messages)

View file

@ -146,6 +146,11 @@ describe('runNonInteractive', () => {
isInteractive: vi.fn().mockReturnValue(false),
isCronEnabled: vi.fn().mockReturnValue(false),
getCronScheduler: vi.fn().mockReturnValue(null),
getBackgroundTaskRegistry: vi.fn().mockReturnValue({
setNotificationCallback: vi.fn(),
setRegisterCallback: vi.fn(),
getRunning: vi.fn().mockReturnValue([]),
}),
} as unknown as Config;
mockSettings = {
@ -255,7 +260,7 @@ describe('runNonInteractive', () => {
'prompt-id-1',
{ type: SendMessageType.UserQuery },
);
expect(processStdoutSpy).toHaveBeenCalledWith('Hello World');
expect(processStdoutSpy).toHaveBeenCalledWith('Hello World\n');
expect(mockShutdownTelemetry).toHaveBeenCalled();
});
@ -319,7 +324,7 @@ describe('runNonInteractive', () => {
'prompt-id-2',
{ type: SendMessageType.ToolResult },
);
expect(processStdoutSpy).toHaveBeenCalledWith('Final answer');
expect(processStdoutSpy).toHaveBeenCalledWith('Final answer\n');
});
it('should handle error during tool execution and should send error back to the model', async () => {
@ -388,7 +393,7 @@ describe('runNonInteractive', () => {
'prompt-id-3',
{ type: SendMessageType.ToolResult },
);
expect(processStdoutSpy).toHaveBeenCalledWith('Sorry, let me try again.');
expect(processStdoutSpy).toHaveBeenCalledWith('Sorry, let me try again.\n');
});
it('should exit with error if sendMessageStream throws initially', async () => {
@ -450,7 +455,7 @@ describe('runNonInteractive', () => {
expect(mockCoreExecuteToolCall).toHaveBeenCalled();
expect(mockGeminiClient.sendMessageStream).toHaveBeenCalledTimes(2);
expect(processStdoutSpy).toHaveBeenCalledWith(
"Sorry, I can't find that tool.",
"Sorry, I can't find that tool.\n",
);
});
@ -514,7 +519,7 @@ describe('runNonInteractive', () => {
);
// 6. Assert the final output is correct
expect(processStdoutSpy).toHaveBeenCalledWith('Summary complete.');
expect(processStdoutSpy).toHaveBeenCalledWith('Summary complete.\n');
});
it('should process input and write JSON output with stats', async () => {
@ -887,7 +892,7 @@ describe('runNonInteractive', () => {
{ type: SendMessageType.UserQuery },
);
expect(processStdoutSpy).toHaveBeenCalledWith('Response from command');
expect(processStdoutSpy).toHaveBeenCalledWith('Response from command\n');
});
it('should handle command that requires confirmation by returning early', async () => {
@ -912,7 +917,7 @@ describe('runNonInteractive', () => {
// Should write error message through adapter to stdout (TEXT mode goes through JsonOutputAdapter)
expect(processStderrSpy).toHaveBeenCalledWith(
'Shell command confirmation is not supported in non-interactive mode. Use YOLO mode or pre-approve commands.',
'Shell command confirmation is not supported in non-interactive mode. Use YOLO mode or pre-approve commands.\n',
);
});
@ -947,7 +952,7 @@ describe('runNonInteractive', () => {
{ type: SendMessageType.UserQuery },
);
expect(processStdoutSpy).toHaveBeenCalledWith('Response to unknown');
expect(processStdoutSpy).toHaveBeenCalledWith('Response to unknown\n');
});
it('should handle known but unsupported slash commands like /help by returning early', async () => {
@ -970,7 +975,7 @@ describe('runNonInteractive', () => {
// Should write error message through adapter to stdout (TEXT mode goes through JsonOutputAdapter)
expect(processStderrSpy).toHaveBeenCalledWith(
'The command "/help" is not supported in non-interactive mode.',
'The command "/help" is not supported in non-interactive mode.\n',
);
});
@ -995,7 +1000,7 @@ describe('runNonInteractive', () => {
// Should write error message to stderr
expect(processStderrSpy).toHaveBeenCalledWith(
'Unknown command result type: unhandled',
'Unknown command result type: unhandled\n',
);
});
@ -1033,7 +1038,7 @@ describe('runNonInteractive', () => {
expect(mockAction).toHaveBeenCalledWith(expect.any(Object), 'arg1 arg2');
expect(processStdoutSpy).toHaveBeenCalledWith('Acknowledged');
expect(processStdoutSpy).toHaveBeenCalledWith('Acknowledged\n');
});
it('should emit stream-json envelopes when output format is stream-json', async () => {

View file

@ -4,7 +4,11 @@
* SPDX-License-Identifier: Apache-2.0
*/
import type { Config, ToolCallRequestInfo } from '@qwen-code/qwen-code-core';
import type {
BackgroundAgentStatus,
Config,
ToolCallRequestInfo,
} from '@qwen-code/qwen-code-core';
import { isSlashCommand } from './ui/utils/commandUtils.js';
import type { LoadedSettings } from './config/settings.js';
import {
@ -250,6 +254,55 @@ export async function runNonInteractive(
const initialParts = normalizePartList(initialPartList);
let currentMessages: Content[] = [{ role: 'user', parts: initialParts }];
// ─── Shared notification queue (cron + background agents) ──────
// Register the callback early so background agents launched during
// the main tool-call chain can push completions onto the queue.
interface LocalQueueItem {
displayText: string;
modelText: string;
sendMessageType: SendMessageType;
sdkNotification?: {
task_id: string;
tool_use_id?: string;
status: BackgroundAgentStatus;
usage?: {
total_tokens: number;
tool_uses: number;
duration_ms: number;
};
};
}
const localQueue: LocalQueueItem[] = [];
const registry = config.getBackgroundTaskRegistry();
registry.setNotificationCallback((displayText, modelText, meta) => {
localQueue.push({
displayText,
modelText,
sendMessageType: SendMessageType.Notification,
sdkNotification: {
task_id: meta.agentId,
tool_use_id: meta.toolUseId,
status: meta.status,
usage: meta.stats
? {
total_tokens: meta.stats.totalTokens,
tool_uses: meta.stats.toolUses,
duration_ms: meta.stats.durationMs,
}
: undefined,
},
});
});
registry.setRegisterCallback((entry) => {
adapter.emitSystemMessage('task_started', {
task_id: entry.agentId,
tool_use_id: entry.toolUseId,
description: entry.description,
subagent_type: entry.subagentType,
});
});
let isFirstTurn = true;
let modelOverride: string | undefined;
while (true) {
@ -380,144 +433,257 @@ export async function runNonInteractive(
}
currentMessages = [{ role: 'user', parts: toolResponseParts }];
} else {
// No more tool calls — check if cron jobs are keeping us alive
// No more tool calls — drain notifications and cron, then exit.
// Process one queue item through the full turn loop.
const drainOneItem = async () => {
if (localQueue.length === 0) return;
const item = localQueue.shift()!;
if (item.sendMessageType === SendMessageType.Notification) {
adapter.emitUserMessage([{ text: item.displayText }]);
if (item.sdkNotification) {
adapter.emitSystemMessage(
'task_notification',
item.sdkNotification,
);
}
}
turnCount++;
// Symmetry with the main turn loop: drain-turns (cron fires and
// background-agent notification replies) count toward the
// configured budget too, otherwise a looping cron or a model
// that keeps replying to notifications can exceed the cap
// silently in headless runs.
if (
config.getMaxSessionTurns() >= 0 &&
turnCount > config.getMaxSessionTurns()
) {
handleMaxTurnsExceededError(config);
}
let itemMessages: Content[] = [
{ role: 'user', parts: [{ text: item.modelText }] },
];
let itemIsFirstTurn = true;
let itemModelOverride: string | undefined;
while (true) {
const itemToolCallRequests: ToolCallRequestInfo[] = [];
const itemApiStartTime = Date.now();
const itemStream = geminiClient.sendMessageStream(
itemMessages[0]?.parts || [],
abortController.signal,
prompt_id,
{
type: itemIsFirstTurn
? item.sendMessageType
: SendMessageType.ToolResult,
modelOverride: itemModelOverride,
...(itemIsFirstTurn && {
notificationDisplayText: item.displayText,
}),
},
);
itemIsFirstTurn = false;
adapter.startAssistantMessage();
for await (const event of itemStream) {
if (abortController.signal.aborted) {
// Pair the startAssistantMessage() above so stream-json
// mode doesn't leave an unterminated message_start.
adapter.finalizeAssistantMessage();
return;
}
adapter.processEvent(event);
if (event.type === GeminiEventType.ToolCallRequest) {
itemToolCallRequests.push(event.value);
}
if (
outputFormat === OutputFormat.TEXT &&
event.type === GeminiEventType.Error
) {
const errorText = parseAndFormatApiError(
event.value.error,
config.getContentGeneratorConfig()?.authType,
);
process.stderr.write(`${errorText}\n`);
throw new Error(errorText);
}
}
adapter.finalizeAssistantMessage();
totalApiDurationMs += Date.now() - itemApiStartTime;
if (itemToolCallRequests.length > 0) {
const itemToolResponseParts: Part[] = [];
for (const requestInfo of itemToolCallRequests) {
const itemInputFormat =
typeof config.getInputFormat === 'function'
? config.getInputFormat()
: InputFormat.TEXT;
const itemToolCallUpdateCallback =
itemInputFormat === InputFormat.STREAM_JSON &&
options.controlService
? options.controlService.permission.getToolCallUpdateCallback()
: undefined;
const isAgentTool = requestInfo.name === 'agent';
const { handler: outputUpdateHandler } = isAgentTool
? createAgentToolProgressHandler(
config,
requestInfo.callId,
adapter,
)
: createToolProgressHandler(requestInfo, adapter);
const toolResponse = await executeToolCall(
config,
requestInfo,
abortController.signal,
{
outputUpdateHandler,
...(itemToolCallUpdateCallback && {
onToolCallsUpdate: itemToolCallUpdateCallback,
}),
},
);
if (toolResponse.error) {
handleToolError(
requestInfo.name,
toolResponse.error,
config,
toolResponse.errorType || 'TOOL_EXECUTION_ERROR',
typeof toolResponse.resultDisplay === 'string'
? toolResponse.resultDisplay
: undefined,
);
}
adapter.emitToolResult(requestInfo, toolResponse);
if (toolResponse.responseParts) {
itemToolResponseParts.push(...toolResponse.responseParts);
}
if ('modelOverride' in toolResponse) {
itemModelOverride = toolResponse.modelOverride;
}
}
itemMessages = [{ role: 'user', parts: itemToolResponseParts }];
} else {
break;
}
}
};
// Single-flight drain: concurrent callers wait for the running
// drain. A new drain starts only after the previous one finishes
// its queue, which prevents overlapping turns when cron jobs fire
// while an earlier queued item is still streaming.
//
// The null-clearing is attached via `.finally()` on the outer
// promise rather than inside the async body. When the queue is
// empty the async body runs to completion synchronously (no
// awaits) — an inner `finally { drainPromise = null }` would
// therefore fire BEFORE the outer `drainPromise = p` assignment,
// leaving drainPromise stuck holding a resolved Promise forever
// and making every future call return immediately without ever
// draining. Clearing via `p.finally()` schedules the null as a
// microtask that runs after the outer assignment.
let drainPromise: Promise<void> | null = null;
const drainLocalQueue = (): Promise<void> => {
if (drainPromise) return drainPromise;
const p = (async () => {
while (localQueue.length > 0) {
await drainOneItem();
}
})();
drainPromise = p;
void p.finally(() => {
if (drainPromise === p) drainPromise = null;
});
return p;
};
// Start cron scheduler — fires enqueue onto the shared queue.
const scheduler = !config.isCronEnabled()
? null
: config.getCronScheduler();
if (scheduler && scheduler.size > 0) {
// Start the scheduler and wait for all jobs to complete or be deleted.
// Each fired prompt is processed as a new turn through the same loop.
await new Promise<void>((resolve) => {
const cronQueue: string[] = [];
let processing = false;
const checkDone = () => {
if (scheduler.size === 0 && !processing) {
if (scheduler && scheduler.size > 0) {
await new Promise<void>((resolve, reject) => {
// Resolve on SIGINT/SIGTERM too — recurring cron jobs never
// drop scheduler.size to 0 on their own, so without this the
// hold-back loop below is unreachable after an abort.
const onAbort = () => {
scheduler.stop();
resolve();
};
if (abortController.signal.aborted) {
onAbort();
return;
}
abortController.signal.addEventListener('abort', onAbort, {
once: true,
});
const checkCronDone = () => {
if (scheduler.size === 0 && !drainPromise) {
abortController.signal.removeEventListener('abort', onAbort);
scheduler.stop();
resolve();
}
};
const drainQueue = async () => {
if (processing) return;
processing = true;
try {
while (cronQueue.length > 0) {
const cronPrompt = cronQueue.shift()!;
turnCount++;
let cronMessages: Content[] = [
{ role: 'user', parts: [{ text: cronPrompt }] },
];
let cronIsFirstTurn = true;
let cronModelOverride: string | undefined;
while (true) {
const cronToolCallRequests: ToolCallRequestInfo[] = [];
const cronApiStartTime = Date.now();
const cronStream = geminiClient.sendMessageStream(
cronMessages[0]?.parts || [],
abortController.signal,
prompt_id,
{
type: cronIsFirstTurn
? SendMessageType.Cron
: SendMessageType.ToolResult,
modelOverride: cronModelOverride,
},
);
cronIsFirstTurn = false;
adapter.startAssistantMessage();
for await (const event of cronStream) {
if (abortController.signal.aborted) {
const summary = scheduler.getExitSummary();
scheduler.stop();
if (summary) {
process.stderr.write(summary + '\n');
}
resolve();
return;
}
adapter.processEvent(event);
if (event.type === GeminiEventType.ToolCallRequest) {
cronToolCallRequests.push(event.value);
}
}
adapter.finalizeAssistantMessage();
totalApiDurationMs += Date.now() - cronApiStartTime;
if (cronToolCallRequests.length > 0) {
const cronToolResponseParts: Part[] = [];
for (const requestInfo of cronToolCallRequests) {
const isAgentTool = requestInfo.name === 'agent';
const { handler: outputUpdateHandler } = isAgentTool
? createAgentToolProgressHandler(
config,
requestInfo.callId,
adapter,
)
: createToolProgressHandler(requestInfo, adapter);
const toolResponse = await executeToolCall(
config,
requestInfo,
abortController.signal,
{ outputUpdateHandler },
);
if (toolResponse.error) {
handleToolError(
requestInfo.name,
toolResponse.error,
config,
toolResponse.errorType || 'TOOL_EXECUTION_ERROR',
typeof toolResponse.resultDisplay === 'string'
? toolResponse.resultDisplay
: undefined,
);
}
adapter.emitToolResult(requestInfo, toolResponse);
if (toolResponse.responseParts) {
cronToolResponseParts.push(
...toolResponse.responseParts,
);
}
if ('modelOverride' in toolResponse) {
cronModelOverride = toolResponse.modelOverride;
}
}
cronMessages = [
{ role: 'user', parts: cronToolResponseParts },
];
} else {
break;
}
}
}
} catch (error) {
debugLogger.error('Error processing cron prompt:', error);
} finally {
processing = false;
checkDone();
}
// Propagate drain failures. Without this, a rejected
// drainLocalQueue() (e.g. a text-mode API error surfacing
// out of drainOneItem) would be swallowed by `void` and
// checkCronDone would never fire — hanging the run.
const onDrainError = (err: unknown) => {
abortController.signal.removeEventListener('abort', onAbort);
scheduler.stop();
reject(err);
};
scheduler.start((job: { prompt: string }) => {
cronQueue.push(job.prompt);
void drainQueue();
const label = job.prompt.slice(0, 40);
localQueue.push({
displayText: `Cron: ${label}`,
modelText: job.prompt,
sendMessageType: SendMessageType.Cron,
});
drainLocalQueue().then(checkCronDone, onDrainError);
});
// Also check immediately in case jobs were already deleted
checkDone();
// Check immediately in case jobs were already deleted
checkCronDone();
});
}
// ─── Terminal hold-back phase ──────────────────────────
// Wait for running background agents to complete and drain
// their notifications before emitting the final result. If
// SIGINT/SIGTERM fires here, abort running background agents
// (they use their own AbortControllers) and route through
// handleCancellationError so the run exits non-zero — falling
// through to the success emitResult below would silently
// convert cancellation into a successful completion.
while (true) {
if (abortController.signal.aborted) {
registry.abortAll();
handleCancellationError(config);
}
await drainLocalQueue();
const running = registry.getRunning();
if (running.length === 0 && localQueue.length === 0) break;
await new Promise((r) => setTimeout(r, 100));
}
const metrics = uiTelemetryService.getMetrics();
const usage = computeUsageFromMetrics(metrics);
// Get stats for JSON format output
@ -567,6 +733,14 @@ export async function runNonInteractive(
});
handleError(error, config);
} finally {
try {
const reg = config.getBackgroundTaskRegistry();
reg.setNotificationCallback(undefined);
reg.setRegisterCallback(undefined);
} catch {
// Ignore — registry may not be initialized if we failed early.
}
process.stdout.removeListener('error', stdoutErrorHandler);
// Cleanup signal handlers
process.removeListener('SIGINT', shutdownHandler);

View file

@ -547,7 +547,8 @@ export const useGeminiStream = (
// Notification messages (e.g. background agent completions) are
// pre-processed by the notification drain loop which already
// added the display item to history. Just pass the model text
// through to the API.
// through to the API. Cron prompts still go through the normal
// slash/@-command/shell preprocessing path below.
if (submitType === SendMessageType.Notification) {
onDebugMessage(
`Received notification (${trimmedQuery.length} chars)`,
@ -605,10 +606,15 @@ export const useGeminiStream = (
localQueryToSendToGemini = trimmedQuery;
addItem(
{ type: MessageType.USER, text: trimmedQuery },
userMessageTimestamp,
);
// Cron prompts are already rendered as a `● Cron: …` notification by
// the queue drain, so skip the user-message history item to avoid
// a duplicate `> …` line. Preprocessing (@/slash/shell) still runs.
if (submitType !== SendMessageType.Cron) {
addItem(
{ type: MessageType.USER, text: trimmedQuery },
userMessageTimestamp,
);
}
// Handle @-commands (which might involve tool calls)
if (isAtCommand(trimmedQuery)) {
@ -1862,17 +1868,29 @@ export const useGeminiStream = (
storage,
]);
// ─── Cron scheduler integration ─────────────────────────
const cronQueueRef = useRef<string[]>([]);
const [cronTrigger, setCronTrigger] = useState(0);
// ─── Unified notification queue (cron + background agents) ──────
const notificationQueueRef = useRef<
Array<{
displayText: string;
modelText: string;
sendMessageType: SendMessageType;
}>
>([]);
const [notificationTrigger, setNotificationTrigger] = useState(0);
// Start the scheduler on mount, stop on unmount
// Start the cron scheduler on mount, stop on unmount.
// Cron fires enqueue onto the shared notification queue.
useEffect(() => {
if (!config.isCronEnabled()) return;
const scheduler = config.getCronScheduler();
scheduler.start((job: { prompt: string }) => {
cronQueueRef.current.push(job.prompt);
setCronTrigger((n) => n + 1);
const label = job.prompt.slice(0, 40);
notificationQueueRef.current.push({
displayText: `Cron: ${label}`,
modelText: job.prompt,
sendMessageType: SendMessageType.Cron,
});
setNotificationTrigger((n) => n + 1);
});
return () => {
const summary = scheduler.getExitSummary();
@ -1883,37 +1901,23 @@ export const useGeminiStream = (
};
}, [config]);
// When idle, drain the cron queue one prompt at a time
useEffect(() => {
if (
streamingState === StreamingState.Idle &&
cronQueueRef.current.length > 0
) {
const prompt = cronQueueRef.current.shift()!;
submitQuery(prompt, SendMessageType.Cron);
}
}, [streamingState, submitQuery, cronTrigger]);
// ─── Background agent notification queue ───────────────────
const notificationQueueRef = useRef<
Array<{ displayText: string; modelText: string }>
>([]);
const [notificationTrigger, setNotificationTrigger] = useState(0);
// Register background agent notification callback onto the shared queue.
useEffect(() => {
const registry = config.getBackgroundTaskRegistry();
registry.setNotificationCallback(
(displayText: string, modelText: string) => {
notificationQueueRef.current.push({ displayText, modelText });
setNotificationTrigger((n) => n + 1);
},
);
registry.setNotificationCallback((displayText, modelText) => {
notificationQueueRef.current.push({
displayText,
modelText,
sendMessageType: SendMessageType.Notification,
});
setNotificationTrigger((n) => n + 1);
});
return () => {
registry.setNotificationCallback(() => {});
registry.setNotificationCallback(undefined);
};
}, [config]);
// When idle, drain the notification queue one item at a time
// When idle, drain the unified queue one item at a time.
useEffect(() => {
if (
streamingState === StreamingState.Idle &&
@ -1924,7 +1928,7 @@ export const useGeminiStream = (
{ type: 'notification' as const, text: item.displayText },
Date.now(),
);
submitQuery(item.modelText, SendMessageType.Notification, undefined, {
submitQuery(item.modelText, item.sendMessageType, undefined, {
notificationDisplayText: item.displayText,
});
}

View file

@ -256,15 +256,19 @@ function convertToHistoryItems(
}
switch (record.type) {
case 'user': {
// Restore notification items (background agent completions)
if (record.subtype === 'notification') {
// Restore notification items (background agent completions and cron fires)
if (record.subtype === 'notification' || record.subtype === 'cron') {
const payload = record.systemPayload as
| { displayText?: string }
| undefined;
const fallback =
record.subtype === 'cron'
? 'Cron job fired'
: 'Background agent completed';
const text =
payload?.displayText ||
extractTextFromParts(record.message?.parts as Part[]) ||
'Background agent completed';
fallback;
items.push({ type: 'notification', text });
break;
}

View file

@ -193,8 +193,11 @@ describe('BackgroundTaskRegistry', () => {
// Status should remain 'cancelled', not flip to 'completed'
expect(registry.get('test-1')!.status).toBe('cancelled');
// No notification should have been sent
expect(callback).not.toHaveBeenCalled();
// Exactly one notification, emitted by cancel() itself — the late
// complete() must be no-op'd by the running-status guard.
expect(callback).toHaveBeenCalledTimes(1);
const [, modelText] = callback.mock.calls[0];
expect(modelText).toContain('<status>cancelled</status>');
});
it('fail is a no-op after cancellation (state race guard)', () => {
@ -213,7 +216,9 @@ describe('BackgroundTaskRegistry', () => {
registry.fail('test-1', 'late error');
expect(registry.get('test-1')!.status).toBe('cancelled');
expect(callback).not.toHaveBeenCalled();
expect(callback).toHaveBeenCalledTimes(1);
const [, modelText] = callback.mock.calls[0];
expect(modelText).toContain('<status>cancelled</status>');
});
it('does not send notification without callback', () => {
@ -229,4 +234,68 @@ describe('BackgroundTaskRegistry', () => {
registry.complete('test-1', 'done');
expect(registry.get('test-1')!.status).toBe('completed');
});
it('propagates toolUseId through XML and notification meta', () => {
const callback = vi.fn();
registry.setNotificationCallback(callback);
registry.register({
agentId: 'test-1',
description: 'test agent',
status: 'running',
startTime: Date.now(),
abortController: new AbortController(),
toolUseId: 'call-abc-123',
});
registry.complete('test-1', 'done');
expect(callback).toHaveBeenCalledOnce();
const [, modelText, meta] = callback.mock.calls[0];
expect(modelText).toContain('<tool-use-id>call-abc-123</tool-use-id>');
expect(meta.toolUseId).toBe('call-abc-123');
});
it('omits tool-use-id XML tag when toolUseId is absent', () => {
const callback = vi.fn();
registry.setNotificationCallback(callback);
registry.register({
agentId: 'test-1',
description: 'test agent',
status: 'running',
startTime: Date.now(),
abortController: new AbortController(),
});
registry.complete('test-1', 'done');
const [, modelText, meta] = callback.mock.calls[0];
expect(modelText).not.toContain('<tool-use-id>');
expect(meta.toolUseId).toBeUndefined();
});
it('escapes XML metacharacters in interpolated fields', () => {
const callback = vi.fn();
registry.setNotificationCallback(callback);
registry.register({
agentId: 'test-1',
description: 'summarize </result> & </task-notification>',
status: 'running',
startTime: Date.now(),
abortController: new AbortController(),
});
registry.complete('test-1', 'here is <b>bold</b> & </task-notification>');
const [, modelText] = callback.mock.calls[0];
// No injected closing tags — subagent text is escaped so the
// parent envelope stays a single task-notification element.
expect(modelText.match(/<\/task-notification>/g)!.length).toBe(1);
expect(modelText).toContain('&lt;/result&gt;');
expect(modelText).toContain('&lt;/task-notification&gt;');
expect(modelText).toContain('&lt;b&gt;bold&lt;/b&gt;');
expect(modelText).toContain('&amp;');
});
});

View file

@ -17,6 +17,20 @@ import { createDebugLogger } from '../utils/debugLogger.js';
const debugLogger = createDebugLogger('BACKGROUND_TASKS');
const MAX_DESCRIPTION_LENGTH = 40;
const MAX_RESULT_LENGTH = 2000;
// Escape text so it is safe to interpolate into an XML element body.
// Subagent-produced strings (description, result, error) can contain `<`,
// `>`, or literal `</task-notification>` — without escaping, a subagent
// summarizing HTML or another agent's notification could close the
// envelope early and forge sibling tags (e.g. a faked <status>) that the
// parent model would treat as trusted metadata.
function escapeXml(text: string): string {
return text
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
}
export type BackgroundAgentStatus =
| 'running'
@ -24,6 +38,12 @@ export type BackgroundAgentStatus =
| 'failed'
| 'cancelled';
export interface AgentCompletionStats {
totalTokens: number;
toolUses: number;
durationMs: number;
}
export interface BackgroundAgentEntry {
agentId: string;
description: string;
@ -35,16 +55,29 @@ export interface BackgroundAgentEntry {
error?: string;
abortController: AbortController;
name?: string;
stats?: AgentCompletionStats;
toolUseId?: string;
}
export interface NotificationMeta {
agentId: string;
status: BackgroundAgentStatus;
stats?: AgentCompletionStats;
toolUseId?: string;
}
export type BackgroundNotificationCallback = (
displayText: string,
modelText: string,
meta: NotificationMeta,
) => void;
export type BackgroundRegisterCallback = (entry: BackgroundAgentEntry) => void;
export class BackgroundTaskRegistry {
private readonly agents = new Map<string, BackgroundAgentEntry>();
private notificationCallback?: BackgroundNotificationCallback;
private registerCallback?: BackgroundRegisterCallback;
/**
* Register a new background agent.
@ -52,6 +85,14 @@ export class BackgroundTaskRegistry {
register(entry: BackgroundAgentEntry): void {
this.agents.set(entry.agentId, entry);
debugLogger.info(`Registered background agent: ${entry.agentId}`);
if (this.registerCallback) {
try {
this.registerCallback(entry);
} catch (error) {
debugLogger.error('Failed to emit register callback:', error);
}
}
}
/**
@ -59,13 +100,18 @@ export class BackgroundTaskRegistry {
* No-op if the agent is not in 'running' state (guards against race
* with concurrent cancellation).
*/
complete(agentId: string, result: string): void {
complete(
agentId: string,
result: string,
stats?: AgentCompletionStats,
): void {
const entry = this.agents.get(agentId);
if (!entry || entry.status !== 'running') return;
entry.status = 'completed';
entry.endTime = Date.now();
entry.result = result;
entry.stats = stats;
debugLogger.info(`Background agent completed: ${agentId}`);
this.emitNotification(entry);
@ -76,13 +122,14 @@ export class BackgroundTaskRegistry {
* No-op if the agent is not in 'running' state (guards against race
* with concurrent cancellation).
*/
fail(agentId: string, error: string): void {
fail(agentId: string, error: string, stats?: AgentCompletionStats): void {
const entry = this.agents.get(agentId);
if (!entry || entry.status !== 'running') return;
entry.status = 'failed';
entry.endTime = Date.now();
entry.error = error;
entry.stats = stats;
debugLogger.info(`Background agent failed: ${agentId}`);
this.emitNotification(entry);
@ -99,6 +146,12 @@ export class BackgroundTaskRegistry {
entry.status = 'cancelled';
entry.endTime = Date.now();
debugLogger.info(`Background agent cancelled: ${agentId}`);
// Emit the terminal notification here — the fire-and-forget
// complete()/fail() path is guarded by `status !== 'running'` and
// will no-op, so without this the SDK contract breaks: consumers
// saw task_started but never receive a matching task_notification.
this.emitNotification(entry);
}
/**
@ -133,10 +186,20 @@ export class BackgroundTaskRegistry {
* Set the callback that delivers completion notifications to the CLI.
* Called by AppContainer during initialization.
*/
setNotificationCallback(cb: BackgroundNotificationCallback): void {
setNotificationCallback(
cb: BackgroundNotificationCallback | undefined,
): void {
this.notificationCallback = cb;
}
/**
* Set the callback fired when a new background agent is registered.
* Used by the CLI to emit task_started SDK events.
*/
setRegisterCallback(cb: BackgroundRegisterCallback | undefined): void {
this.registerCallback = cb;
}
/**
* Abort all running background agents. Called during session cleanup.
*/
@ -146,22 +209,16 @@ export class BackgroundTaskRegistry {
entry.abortController.abort();
entry.status = 'cancelled';
entry.endTime = Date.now();
// Same reasoning as cancel(): emit the terminal notification
// here, because the fire-and-forget complete()/fail() will
// be no-op'd by the running-status guard.
this.emitNotification(entry);
}
}
debugLogger.info('Aborted all background agents');
}
private emitNotification(entry: BackgroundAgentEntry): void {
if (!this.notificationCallback) return;
const statusText =
entry.status === 'completed'
? 'completed'
: entry.status === 'failed'
? `failed`
: 'was cancelled';
// Build the label: "Explore: list ts files..." (truncated)
private buildDisplayLabel(entry: BackgroundAgentEntry): string {
// Strip the subagent type prefix if the description already starts with it
// to avoid duplication like "Explore: Explore: list ts files".
let rawDesc = entry.description;
@ -175,24 +232,67 @@ export class BackgroundTaskRegistry {
rawDesc.length > MAX_DESCRIPTION_LENGTH
? rawDesc.slice(0, MAX_DESCRIPTION_LENGTH) + '...'
: rawDesc;
const label = entry.subagentType ? `${entry.subagentType}: ${desc}` : desc;
return entry.subagentType ? `${entry.subagentType}: ${desc}` : desc;
}
// Short display line shown in the UI
private emitNotification(entry: BackgroundAgentEntry): void {
if (!this.notificationCallback) return;
const statusText =
entry.status === 'completed'
? 'completed'
: entry.status === 'failed'
? 'failed'
: 'was cancelled';
const label = this.buildDisplayLabel(entry);
const displayLine = `Background agent "${label}" ${statusText}.`;
// Full model-facing text (includes result/error for the LLM to act on)
const modelLines: string[] = [
`Background agent "${entry.description}" (${entry.agentId}) ${statusText}.`,
// Truncate before escaping so we don't slice through an escape
// sequence (e.g. mid-`&amp;`) and emit malformed XML.
const rawResult = entry.result
? entry.result.length > MAX_RESULT_LENGTH
? entry.result.slice(0, MAX_RESULT_LENGTH) + '\n[truncated]'
: entry.result
: undefined;
const xmlParts: string[] = [
'<task-notification>',
`<task-id>${escapeXml(entry.agentId)}</task-id>`,
];
if (entry.result) {
modelLines.push('', entry.result);
if (entry.toolUseId) {
xmlParts.push(`<tool-use-id>${escapeXml(entry.toolUseId)}</tool-use-id>`);
}
xmlParts.push(
`<status>${escapeXml(entry.status)}</status>`,
`<summary>Agent "${escapeXml(entry.description)}" ${statusText}.</summary>`,
);
if (rawResult) {
xmlParts.push(`<result>${escapeXml(rawResult)}</result>`);
}
if (entry.error) {
modelLines.push('', `Error: ${entry.error}`);
xmlParts.push(`<result>Error: ${escapeXml(entry.error)}</result>`);
}
if (entry.stats) {
xmlParts.push(
'<usage>',
`<total_tokens>${entry.stats.totalTokens}</total_tokens>`,
`<tool_uses>${entry.stats.toolUses}</tool_uses>`,
`<duration_ms>${entry.stats.durationMs}</duration_ms>`,
'</usage>',
);
}
xmlParts.push('</task-notification>');
const meta: NotificationMeta = {
agentId: entry.agentId,
status: entry.status,
stats: entry.stats,
toolUseId: entry.toolUseId,
};
try {
this.notificationCallback(displayLine, modelLines.join('\n'));
this.notificationCallback(displayLine, xmlParts.join('\n'), meta);
} catch (error) {
debugLogger.error('Failed to emit background notification:', error);
}

View file

@ -676,8 +676,14 @@ export class GeminiClient {
});
}
// record user message for session management
this.config.getChatRecordingService()?.recordUserMessage(request);
// record user/cron message for session management
if (messageType === SendMessageType.Cron) {
this.config
.getChatRecordingService()
?.recordCronPrompt(request, options?.notificationDisplayText);
} else {
this.config.getChatRecordingService()?.recordUserMessage(request);
}
// Idle cleanup: clear stale thinking blocks after idle period.
// Latch: once triggered, never revert — prevents oscillation.

View file

@ -610,6 +610,7 @@ export class CoreToolScheduler {
const invocationOrError = this.buildInvocation(
call.tool,
args as Record<string, unknown>,
targetCallId,
);
if (invocationOrError instanceof Error) {
const response = createErrorResponse(
@ -646,9 +647,17 @@ export class CoreToolScheduler {
private buildInvocation(
tool: AnyDeclarativeTool,
args: object,
callId?: string,
): AnyToolInvocation | Error {
try {
return tool.build(structuredClone(args));
const invocation = tool.build(structuredClone(args));
if (callId) {
const maybeAware = invocation as { setCallId?: (id: string) => void };
if (typeof maybeAware.setCallId === 'function') {
maybeAware.setCallId(callId);
}
}
return invocation;
} catch (e) {
if (e instanceof Error) {
return e;
@ -827,6 +836,7 @@ export class CoreToolScheduler {
const invocationOrError = this.buildInvocation(
toolInstance,
reqInfo.args,
reqInfo.callId,
);
if (invocationOrError instanceof Error) {
const error = reqInfo.wasOutputTruncated

View file

@ -58,7 +58,8 @@ export interface ChatRecord {
| 'slash_command'
| 'ui_telemetry'
| 'at_command'
| 'notification';
| 'notification'
| 'cron';
/** Working directory at time of message */
cwd: string;
/** CLI version for compatibility tracking */
@ -300,16 +301,33 @@ export class ChatRecordingService {
}
}
/**
* Records a cron-fired prompt.
* Stored as a user-role message with subtype 'cron' so the UI
* restores it as a notification item instead of a user turn.
*/
recordCronPrompt(message: PartListUnion, displayText?: string): void {
this.recordNotificationLike(message, 'cron', displayText);
}
/**
* Records a background agent notification.
* Stored as a user-role message (so the API history includes it on resume)
* with subtype 'notification' (so the UI can restore it as an info item).
* Stored as a user-role message with subtype 'notification' so the
* UI restores it as an info item, not a user turn.
*/
recordNotification(message: PartListUnion, displayText?: string): void {
this.recordNotificationLike(message, 'notification', displayText);
}
private recordNotificationLike(
message: PartListUnion,
subtype: 'notification' | 'cron',
displayText?: string,
): void {
try {
const record: ChatRecord = {
...this.createBaseRecord('user'),
subtype: 'notification',
subtype,
message: createUserContent(message),
systemPayload: displayText
? ({ displayText } as NotificationRecordPayload)
@ -317,7 +335,7 @@ export class ChatRecordingService {
};
this.appendRecord(record);
} catch (error) {
debugLogger.error('Error saving notification:', error);
debugLogger.error(`Error saving ${subtype} record:`, error);
}
}

View file

@ -1439,12 +1439,12 @@ describe('AgentTool', () => {
};
vi.mocked(config.getApprovalMode).mockReturnValue(ApprovalMode.DEFAULT);
(config as Record<string, unknown>).isInteractive = vi
(config as unknown as Record<string, unknown>)['isInteractive'] = vi
.fn()
.mockReturnValue(true);
(config as Record<string, unknown>).getBackgroundTaskRegistry = vi
.fn()
.mockReturnValue(mockRegistry);
(config as unknown as Record<string, unknown>)[
'getBackgroundTaskRegistry'
] = vi.fn().mockReturnValue(mockRegistry);
vi.mocked(mockSubagentManager.loadSubagent).mockResolvedValue(bgSubagent);
vi.mocked(mockSubagentManager.createAgentHeadless).mockResolvedValue(
@ -1526,7 +1526,7 @@ describe('AgentTool', () => {
expect(mockRegistry.register).not.toHaveBeenCalled();
});
it('should reject background when non-interactive, even with background: true config', async () => {
it('should allow background in non-interactive mode (headless support)', async () => {
vi.mocked(
config.isInteractive as ReturnType<typeof vi.fn>,
).mockReturnValue(false);
@ -1543,9 +1543,28 @@ describe('AgentTool', () => {
const result = await invocation.execute();
const llmText = partToString(result.llmContent);
expect(llmText).toContain('not supported in non-interactive mode');
const display = result.returnDisplay as AgentResultDisplay;
expect(display.status).toBe('failed');
expect(llmText).toContain('Background agent launched');
expect(mockRegistry.register).toHaveBeenCalled();
});
it('forwards the scheduler-provided callId as toolUseId on the registry entry', async () => {
const params: AgentParams = {
description: 'Start monitor',
prompt: 'Watch for changes',
subagent_type: 'monitor',
};
const invocation = (
agentTool as AgentToolWithProtectedMethods
).createInvocation(params);
(invocation as unknown as { setCallId: (id: string) => void }).setCallId(
'call-xyz-789',
);
await invocation.execute();
expect(mockRegistry.register).toHaveBeenCalledWith(
expect.objectContaining({ toolUseId: 'call-xyz-789' }),
);
});
});
});

View file

@ -391,6 +391,7 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
readonly eventEmitter: AgentEventEmitter = new AgentEventEmitter();
private currentDisplay: AgentResultDisplay | null = null;
private currentToolCalls: AgentResultDisplay['toolCalls'] = [];
private callId?: string;
constructor(
private readonly config: Config,
@ -400,6 +401,15 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
super(params);
}
/**
* Invoked by the tool scheduler after `build` to link this invocation
* back to the model's original tool-use request. Used so background
* agents carry the tool-use id through to completion notifications.
*/
setCallId(callId: string): void {
this.callId = callId;
}
/**
* Updates the current display state and calls updateOutput if provided
*/
@ -660,10 +670,12 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
const generationConfig = geminiClient?.getChat().getGenerationConfig();
if (generationConfig?.systemInstruction) {
// Inline FunctionDeclaration[] from the parent — passed verbatim
// including `agent` itself so the fork's tool-name set matches the
// parent's. prepareTools bypasses the exclusion filter for inline
// decls; `isInForkExecution()` (ALS-based) is the sole
// recursive-fork block at runtime.
// (including `agent` and cron tools) so the fork's system prompt,
// tools, and history exactly match the parent's and share its
// DashScope cache prefix. A fork is a context-sharing extension of
// the parent, not an isolated subagent, so the general subagent
// exclusion list does not apply. Recursive forks are blocked by the
// ALS-based `isInForkExecution()` guard.
const parentToolDecls: FunctionDeclaration[] =
(
generationConfig.tools as Array<{
@ -963,23 +975,6 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
subagentConfig.background === true;
if (shouldRunInBackground) {
// Background agents are not supported in non-interactive mode
if (!this.config.isInteractive()) {
return {
llmContent:
'Background agents are not supported in non-interactive mode. Retry without run_in_background.',
returnDisplay: {
type: 'task_execution' as const,
subagentName: this.params.subagent_type || 'unknown',
taskDescription: this.params.description,
taskPrompt: this.params.prompt,
status: 'failed' as const,
terminateReason:
'Background agents are not supported in non-interactive mode',
},
};
}
// Fire SubagentStart hook before background launch
const hookSystem = this.config.getHookSystem();
if (hookSystem) {
@ -1005,6 +1000,38 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
// survive ESC cancellation of the parent's current turn.
const bgAbortController = new AbortController();
// Background agents can't show interactive permission prompts
// (no UI). Instead of YOLO (which would auto-approve everything),
// we set shouldAvoidPermissionPrompts so the tool scheduler
// auto-denies 'ask' decisions — matching claw-code's approach.
// PermissionRequest hooks still run and can override the denial.
// Base on agentConfig so the resolved approval mode override (e.g.
// subagent-level `approvalMode: auto-edit`) is preserved.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const bgConfig = Object.create(agentConfig) as any;
bgConfig.getShouldAvoidPermissionPrompts = () => true;
// Rebuild the subagent against bgConfig. For forks, go through
// createForkSubagent so the parent's rendered system prompt and
// inherited history are carried over; otherwise the background fork
// degrades to a plain FORK_AGENT without context.
//
// Register in the background task registry only AFTER init
// succeeds. If construction throws (e.g. invalid agent config or
// model setup), registering first would leave a phantom 'running'
// entry that the non-interactive hold-back loop would wait on
// forever.
let bgSubagent: AgentHeadless;
if (isFork) {
const fork = await this.createForkSubagent(bgConfig as Config);
bgSubagent = fork.subagent;
} else {
bgSubagent = await this.subagentManager.createAgentHeadless(
subagentConfig,
bgConfig as Config,
);
}
const registry = this.config.getBackgroundTaskRegistry();
registry.register({
agentId: hookOpts.agentId,
@ -1013,23 +1040,17 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
status: 'running',
startTime: Date.now(),
abortController: bgAbortController,
toolUseId: this.callId,
});
// Background agents can't show interactive permission prompts
// (no UI). Instead of YOLO (which would auto-approve everything),
// we set shouldAvoidPermissionPrompts so the tool scheduler
// auto-denies 'ask' decisions — matching claw-code's approach.
// PermissionRequest hooks still run and can override the denial.
// Inherit from agentConfig so the resolved approval mode is preserved.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const bgConfig = Object.create(agentConfig) as any;
bgConfig.getShouldAvoidPermissionPrompts = () => true;
// Create a dedicated subagent that uses the bg-specific config.
const bgSubagent = await this.subagentManager.createAgentHeadless(
subagentConfig,
bgConfig as Config,
);
const getCompletionStats = () => {
const summary = bgSubagent.getExecutionSummary();
return {
totalTokens: summary.totalTokens,
toolUses: summary.totalToolCalls,
durationMs: summary.totalDurationMs,
};
};
// Fire-and-forget: start the subagent without blocking the parent.
void (async () => {
@ -1104,12 +1125,14 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
// model (and the UI) don't treat incomplete runs as completed.
const terminateMode = bgSubagent.getTerminateMode();
const finalText = bgSubagent.getFinalText();
const completionStats = getCompletionStats();
if (terminateMode === AgentTerminateMode.GOAL) {
registry.complete(hookOpts.agentId, finalText);
registry.complete(hookOpts.agentId, finalText, completionStats);
} else {
registry.fail(
hookOpts.agentId,
finalText || `Agent terminated with mode: ${terminateMode}`,
completionStats,
);
}
} catch (error) {
@ -1117,7 +1140,7 @@ class AgentToolInvocation extends BaseToolInvocation<AgentParams, ToolResult> {
error instanceof Error ? error.message : String(error);
debugLogger.error(`[Agent] Background agent failed: ${errorMsg}`);
registry.fail(hookOpts.agentId, errorMsg);
registry.fail(hookOpts.agentId, errorMsg, getCompletionStats());
}
})();