mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-10 03:59:33 +00:00
* feat(core): add run_in_background support for Agent tool

Enable sub-agents to run asynchronously via `run_in_background: true` parameter. Background agents execute independently from the parent, which receives an immediate launch confirmation and continues working. A notification is injected into the parent conversation when the background agent completes.

Key changes:
- BackgroundTaskRegistry tracks lifecycle of background agents
- Agent tool gains async execution path with fire-and-forget semantics
- Background agents use YOLO approval mode to prevent deadlock
- Independent AbortControllers survive parent ESC cancellation
- CLI bridges notifications via useMessageQueue for between-turn delivery
- State race guards prevent complete/fail after cancellation
- Session cleanup aborts all running background agents

* feat(background): improve notification formatting and UI handling

- Add prefix/separator protocol to distinguish background notifications from user input
- Show concise summary in UI while sending full details to LLM
- Add 'notification' history item type with specialized display
- Add 'background' agent status for background-running agents
- Prevent notifications from polluting prompt history (up-arrow)
- Truncate long descriptions in display text

This improves the UX for background agents by showing cleaner, more concise notifications while preserving full context for the LLM.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(background): reject run_in_background in non-interactive mode

Headless mode skips AppContainer, so the notification callback is never registered and background agent results would be silently dropped. Return an error prompting the model to retry without run_in_background.

* refactor(background): replace prefix/separator protocol with typed notification queue

Replace the stringly-typed \x00__BG_NOTIFY__\x00 prefix/separator encoding with a typed notification path using SendMessageType.Notification.

- Add SendMessageType.Notification to the enum
- Change BackgroundNotificationCallback to emit (displayText, modelText)
- Move notification queue from AppContainer into useGeminiStream (mirrors the cron queue pattern): register on registry, queue structured items, drain on idle via submitQuery
- prepareQueryForGemini short-circuits for Notification type (skips slash commands, shell mode, @-commands, prompt history logging)
- Remove BACKGROUND_NOTIFICATION_PREFIX/SEPARATOR constants

* refactor(background): move abortAll to Config.shutdown

Background agent cleanup belongs in Config.shutdown() alongside other resource teardown (skillManager, toolRegistry, arenaRuntime), not in AppContainer's registerCleanup. This also ensures headless mode gets cleanup for free.

* fix(background): persist notification items for session resume

Background agent notifications were missing after session resume because they were never recorded in the chat history. The model text was absent from the API history and the display item was lost.

- Add recordNotification() to ChatRecordingService (stores as user-role message with subtype 'notification' and displayText payload)
- Thread notificationDisplayText through submitQuery → sendMessageStream
- Restore as HistoryItemNotification in resumeHistoryUtils

* fix(background): replace YOLO with deny-by-default for background agents

Background agents were using YOLO approval mode, which auto-approves all tool calls and is too permissive. Replace with shouldAvoidPermissionPrompts, which auto-denies tool calls that need interactive approval, matching claw-code's approach.

The permission flow for background agents is now:
1. L3/L4 permission rules (allow/deny): same as foreground
2. Approval mode overrides (AUTO_EDIT for edits): same as foreground
3. PermissionRequest hooks: can override the denial
4. Auto-deny: if no hook decided, deny because prompts are unavailable

* fix(background): add missing getBackgroundTaskRegistry mock in useGeminiStream tests

* refactor(core): move fork subagent params from execute() to construction time

Identity-shaping fork inputs (parent history, generationConfig, tool decls, env-skip flag) were threaded through `AgentHeadless.execute()`'s options bag and re-passed by the SubagentStop hook retry loop. They belong on the agent's construction-time configs, not its per-invocation options.

- PromptConfig gains `renderedSystemPrompt` (verbatim, bypasses templating and userMemory injection) and drops the `systemPrompt`/`initialMessages` XOR so fork can carry both. createChat skips env bootstrap when `initialMessages` is non-empty.
- AgentHeadless.execute() shrinks to (context, signal?). Fork dispatch in agent.ts builds synthetic PromptConfig/ModelConfig/ToolConfig from the parent's cache-safe params and calls AgentHeadless.create directly (bypassing SubagentManager). Parent's tool decls flow through verbatim including the `agent` tool itself for cache parity.
- Recursive-fork prevention switches from fork-side tool stripping to a runtime guard. The previous `isInForkChild(history)` helper was dead code (it scanned the main GeminiClient's history, not the fork child's chat). Replaced with `isInForkExecution()` backed by AsyncLocalStorage: the fork's background execution runs inside `runInForkContext`, and the ALS frame propagates through the standard async chain into nested AgentTool.execute() calls where the guard fires.

* refactor(core): move agent tool files into dedicated tools/agent/ directory

Move agent.ts, agent.test.ts, and fork-subagent.ts under tools/agent/ and update all import paths accordingly.

* refactor(core): remove dead temp and top_p fields from ModelConfig

These fields were never populated from subagent frontmatter and served no purpose in the fork path either. The ModelConfig interface retains only the actively-used model field.

* refactor(core): read parent generation config directly instead of getCacheSafeParams

Fork subagent now reads system instruction and tool declarations from the live GeminiChat via getGenerationConfig() instead of the global getCacheSafeParams() snapshot. This removes the cross-module coupling between the agent tool and the followup infrastructure.

* fix(core): prevent duplicate tool declarations when toolConfig has only inline decls

prepareTools() treated asStrings.length === 0 as "add all registry tools", which is correct when no tools are specified at all, but wrong when the caller provides only inline FunctionDeclaration[] (no string names). The fork path passes parent tool declarations as inline decls for cache parity, so prepareTools was adding the full registry set on top, duplicating every non-excluded tool.

Add onlyInlineDecls.length === 0 to the condition so that pure-inline toolConfigs bypass the registry entirely.

* feat(core): support agent-level `background: true` in frontmatter

Subagent definitions can now declare `background: true` in their YAML frontmatter to always run as background tasks. This is OR'd with the `run_in_background` tool parameter, which is useful for monitors, watchers, and proactive agents so the LLM doesn't need to remember to set the flag.

* fix(core): address background subagent lifecycle gaps

- Inherit bgConfig from agentConfig so the resolved approval mode is preserved for background agents (foreground would run AUTO_EDIT but background fell back to DEFAULT, which combined with shouldAvoidPermissionPrompts would auto-deny every permission request).
- Honor SubagentStop blocking decisions in background runs by looping on hook output up to 5 iterations, matching runSubagentWithHooks.
- Check terminate mode before reporting completion; non-GOAL modes (ERROR, MAX_TURNS, TIMEOUT) are now reported as failures instead of emitting a success notification for an incomplete run.
- Exclude SendMessageType.Notification from the UserPromptSubmit hook guard so background completion messages are not rewritten or blocked as if they were user input.

* feat(cli): headless support and SDK task events for background agents (#3379)

* feat(cli): unify notification queue for cron and background agents

Migrate cron from its own queue (cronQueueRef / cronQueue) to the shared notification queue used by background agents. Both producers now push the same item shape { displayText, modelText, sendMessageType } and a single drain effect / helper processes them in FIFO order.

Cron fires render as HistoryItemNotification (● prefix) instead of HistoryItemUser (> prefix), with a "Cron: <prompt>" display label. Records use subtype 'cron' for clean resume and analytics separation.

Lift the non-interactive rejection for background agents. Register a notification callback in nonInteractiveCli.ts with a terminal hold-back phase (100ms poll) that keeps the process alive until all background agents complete and their notifications are processed.

* feat(cli): emit SDK task events for background subagents

Emit `task_started` when a background agent registers and `task_notification` when it completes, fails, or is cancelled, so headless/SDK consumers can track lifecycle without parsing display text.

Model-facing text is now structured XML with status, summary, truncated result, and usage stats. Completion stats (tokens, tool uses, duration) are captured from the subagent and included in both the SDK payload and the model XML.

* fix: address codex review issues for background subagents

- Background subagents now inherit the resolved approval mode from agentConfig instead of the raw session config, so a subagent with `approvalMode: auto-edit` (or execution in a trusted folder) keeps that override when it runs asynchronously.
- Non-interactive cron drains are single-flight: concurrent cron fires now await the same in-flight drain, and the cron-done check gates on it, preventing the final result from being emitted while a cron turn is still streaming.
- Background forks go through createForkSubagent so they retain the parent's rendered system prompt and inherited history instead of degrading to a plain FORK_AGENT.

* fix(cli): restore cancellation, approval, and error paths in queued drain

- Hold-back loop now reacts to SIGINT/SIGTERM: when the main abort signal fires it calls registry.abortAll() so background agents with their own AbortControllers stop promptly instead of pinning the process open.
- Queued-turn tool execution forwards the stream-json approval update callback (onToolCallsUpdate) so permission-gated tools inside a background-notification follow-up emit can_use_tool requests.
- Queued-turn stream loop mirrors the main loop's text-mode handling of GeminiEventType.Error, writing to stderr and throwing so provider errors produce a non-zero exit code instead of silently succeeding.
- Interactive cron prompts go through the normal slash/@-command/shell preprocessing again; only Notification messages skip that path.

* fix(cli): skip duplicate user-message item for cron prompts

Cron prompts already render as a `● Cron: …` notification via the queue drain, so adding them again as a `USER` history item produced a duplicate `> …` line.

* fix(cli): honor SIGINT/SIGTERM during cron scheduler wait

The non-interactive cron phase awaits a Promise that resolves only when scheduler.size reaches 0 and no drain is in flight. Recurring cron jobs never drop the scheduler size to 0 on their own, so the previous abort handling (added to the hold-back loop) was unreachable: the process hung indefinitely after SIGINT/SIGTERM.

Attach an abort listener inside the promise so abort stops the scheduler and resolves immediately, allowing the hold-back loop to run and the process to exit cleanly.

* feat(core): propagate tool-use id through background agent notifications

Plumb the scheduler's callId into AgentToolInvocation via an optional setCallId hook on the invocation, detected structurally in buildInvocation. The agent tool forwards it as toolUseId on the BackgroundTaskRegistry entry so completion notifications can carry a <tool-use-id> tag and SDK task_started / task_notification events can emit tool_use_id, letting consumers correlate background completions back to the original Agent tool-use that spawned them.

* fix(cli): drain single-flight race kept task_notification from emitting

drainLocalQueue wrapped its body in an async IIFE and cleared the promise reference via finally. When the queue is empty the IIFE has no awaits, so its finally runs synchronously as part of the RHS of the assignment `drainPromise = (async () => {...})()`, clearing drainPromise BEFORE the outer assignment overwrites it with the resolved promise. The reference then stayed stuck on that fulfilled promise forever, so later calls short-circuited through `if (drainPromise) return drainPromise` and never processed queued notifications.

Symptom: in headless `--output-format json` (and `stream-json`), task_started emitted but task_notification never did, even after the background agent completed. The process sat in the hold-back loop until SIGTERM.

Fix: move the null-clearing out of the async body into an outer `.finally()` on the returned promise. `.finally()` runs as a microtask after the current synchronous block, so it clears the latest drainPromise reference instead of the pre-assignment null.

* fix(cli): append newline to text-mode emitResult so zsh PROMPT_SP doesn't erase the line

Headless text mode wrote `resultMessage.result` without a trailing newline. In a TTY, zsh themes that use PROMPT_SP (powerlevel10k, agnoster, …) detect the missing `\n` and emit `\r\033[K` before drawing the next prompt, which wipes the final line off the screen. Pipe-captured output was unaffected, so the bug only surfaced for interactive shell users, most visibly in the background-agent flow where the drain-loop's final assistant message is the *only* stdout write in text mode.

Append `\n` to both the success (stdout) and error (stderr) writes.

* docs(skill): tighten worked-example blurb in structured-debugging

Mirror the simplified blurb from .claude/skills/structured-debugging/SKILL.md (knowledge repo). Drops the round-by-round narrative; keeps the contradiction + two lessons.

* docs(skill): mirror SKILL.md improvements (reframing failure mode, generalized path, value-logging guidance)

Mirror of knowledge repo commit 38eb28d into the qwen-code .qwen/skills copy.

* docs(skill): mirror worked example into .qwen/skills/structured-debugging/

Mirrors knowledge/.claude/skills/structured-debugging/examples/headless-bg-agent-empty-stdout.md so the .qwen copy of the skill links resolve.

* docs(skill): mirror generalized side-note path guidance

* fix(cli): harden headless cron and background-agent failure paths

Three regressions surfaced by Codex review of feat/background-subagent:

- Cron drain rejections were dropped by a bare `void`, so a failing queued turn left the outer Promise unresolved and hung the run. Route drain failures through the Promise's reject so they propagate to the outer catch.
- The background-agent registry entry was inserted before `createForkSubagent()` / `createAgentHeadless()` was awaited. Failed init returned an error from the tool call but left a phantom `running` entry, and the headless hold-back loop (`registry.getRunning()`) waited forever. Register only after init succeeds.
- SIGINT/SIGTERM during the hold-back phase aborted background tasks, then fell through to `emitResult({ isError: false })`, so a cancelled `qwen -p ...` exited 0 with the prior assistant text. Route through `handleCancellationError()` so cancellation exits non-zero, matching the main turn loop.

* test(cli): update stdout/stderr assertions for trailing newline

`feadf052f` appended `\n` to text-mode `emitResult` output, but the nonInteractiveCli tests still asserted the pre-change strings. Update the 11 affected assertions to expect the trailing newline.

* fix: address review comments on background-agent notifications

Four additional issues from the PR review that the prior regression-fix commit didn't cover:

- Escape XML metacharacters when interpolating `description`, `result`, `error`, `agentId`, `toolUseId`, and `status` into the task-notification envelope. Subagent output (which itself may carry untrusted tool output, fetched HTML, or another agent's notification) could contain `</result>` or `</task-notification>` and forge sibling tags the parent model would treat as trusted metadata. Truncate result text *before* escaping so the truncation never slices through an entity like `&amp;`.
- Emit the terminal notification from `cancel()` and `abortAll()`. The fire-and-forget `complete()`/`fail()` from the subagent task is guarded by `status !== 'running'` and was no-op'd after cancellation, so SDK consumers saw `task_started` with no matching `task_notification`, breaking the contract this PR establishes. Updated two race-guard tests that asserted the old behavior.
- Call `adapter.finalizeAssistantMessage()` before the abort-triggered early return inside `drainOneItem`'s stream loop. Without it, `startAssistantMessage()` had already been called, so stream-json mode left `message_start` unpaired.
- Enforce `config.getMaxSessionTurns()` in `drainOneItem` for symmetry with the main turn loop. Cron fires and notification replies otherwise bypass the budget cap in headless runs.

* fix: address codex review comments for background subagents

- Wrap background fork execute() in runInForkContext so the recursive-fork guard (AsyncLocalStorage-based) fires when a background fork's child model calls `agent` again. Previously only the foreground fork path was wrapped, so background forks could spawn nested implicit forks.
- Emit queued terminal task_notifications on SIGINT/SIGTERM before handleCancellationError exits. abortAll() enqueues cancellation notifications via the registry callback, but the process was exiting before the drain loop had a chance to flush them, leaving stream-json consumers that already saw task_started without a matching terminal task_notification. Extracted the SDK-emit block into a shared emitNotificationToSdk helper reused by the normal drain and the cancellation flush.
- Skip notification/cron subtypes in ACP HistoryReplayer. These records are persisted as type: 'user' so the model's chat history keeps them for continuity, but they were never user input; replaying them leaked raw <task-notification> XML (and cron prompts) back into the ACP session as if the user typed them.

* test(cli): sync JsonOutputAdapter text-mode assertions with trailing newline

Commit 0da1182b7 appended a newline to text-mode emitResult output (zsh PROMPT_SP fix) and updated the nonInteractiveCli tests, but four assertions in JsonOutputAdapter.test.ts were missed. Update them to expect the trailing newline so CI passes.

* refactor: simplify background subagent plumbing

- Extract the SubagentStop hook blocking-decision loop into a runSubagentStopHookLoop helper so the foreground and background paths no longer duplicate the iteration/abort/log scaffolding.
- Unify BackgroundTaskRegistry.abortAll to delegate to cancel, removing copy-pasted abort/notification bookkeeping.
- Drop the unused findByName and BackgroundAgentEntry.name field.
- In nonInteractiveCli drain, hoist inputFormat and toolCallUpdateCallback out of the inner tool loop, and drop the unreachable try/catch around the readonly registry.
- Trim boilerplate doc/narration comments while keeping load-bearing WHY comments.

* fix: address codex review comments for background subagents

- Use tool callId (or short random suffix) instead of Date.now() for background agentIds; avoids registry collisions when parallel same-type agents launch in the same millisecond.
- Reset loopDetector and lastPromptId for Notification turns so a prior turn's loop count doesn't trip LoopDetected on the notification response.
- Replay notification/cron displayText in ACP HistoryReplayer so the assistant reply has an antecedent in resumed transcripts.

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
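The "truncate before escaping" rule from the notification-envelope fix can be illustrated with a small sketch. This is not the code from the commit: `escapeXml`, `truncate`, `buildResultTag`, `buildResultTagUnsafe`, and the character limits are hypothetical names and values chosen only for the example.

```typescript
// Hypothetical helpers illustrating "truncate first, then escape".
const XML_ENTITIES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
};

/** Replaces XML metacharacters with their entity references. */
function escapeXml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch] ?? ch);
}

/** Cuts text to at most `max` characters (no ellipsis, for simplicity). */
function truncate(text: string, max: number): string {
  return text.length > max ? text.slice(0, max) : text;
}

/** Truncates the raw text first, then escapes, so no entity is cut in half. */
function buildResultTag(raw: string, max: number): string {
  return `<result>${escapeXml(truncate(raw, max))}</result>`;
}

/** Wrong order, shown for contrast: the cut can land inside an entity. */
function buildResultTagUnsafe(raw: string, max: number): string {
  return `<result>${truncate(escapeXml(raw), max)}</result>`;
}
```

With these definitions, `buildResultTag('a</result>b', 100)` keeps the forged closing tag inert as `&lt;/result&gt;`, while `buildResultTagUnsafe('ab&cd', 5)` ends in the broken fragment `&am`.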
764 lines
21 KiB
TypeScript
/**
 * @license
 * Copyright 2025 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */

import type { FunctionDeclaration, Part, PartListUnion } from '@google/genai';
import { ToolErrorType } from './tool-error.js';
import type { ShellExecutionConfig } from '../services/shellExecutionService.js';
import { SchemaValidator } from '../utils/schemaValidator.js';
import { type AgentStatsSummary } from '../agents/runtime/agent-statistics.js';
import type { AnsiOutput } from '../utils/terminalSerializer.js';
import type { PermissionDecision } from '../permissions/types.js';

/**
 * Represents a validated and ready-to-execute tool call.
 * An instance of this is created by a `ToolBuilder`.
 */
export interface ToolInvocation<
  TParams extends object,
  TResult extends ToolResult,
> {
  /**
   * The validated parameters for this specific invocation.
   */
  params: TParams;

  /**
   * Gets a pre-execution description of the tool operation.
   *
   * @returns A markdown string describing what the tool will do.
   */
  getDescription(): string;

  /**
   * Determines what file system paths the tool will affect.
   * @returns A list of such paths.
   */
  toolLocations(): ToolLocation[];

  /**
   * Returns the tool's intrinsic permission for this invocation, based solely
   * on its own parameters (without consulting PermissionManager).
   *
   * - `'allow'`: inherently safe (e.g., read-only commands, `cat`, `ls`).
   * - `'ask'`: may have side effects, needs user or PM confirmation.
   * - `'deny'`: security violation (e.g., command substitution in shell).
   *
   * The coreToolScheduler uses this as the *default* permission which may be
   * overridden by PermissionManager rules at L4.
   */
  getDefaultPermission(): Promise<PermissionDecision>;

  /**
   * Constructs the confirmation dialog details for this invocation.
   * Only called when the final permission decision is `'ask'` and the user
   * needs to be prompted interactively.
   *
   * @param abortSignal Signal to cancel the operation.
   * @returns The confirmation details for the UI to display.
   */
  getConfirmationDetails(
    abortSignal: AbortSignal,
  ): Promise<ToolCallConfirmationDetails>;

  /**
   * Executes the tool with the validated parameters.
   * @param signal AbortSignal for tool cancellation.
   * @param updateOutput Optional callback to stream output.
   * @returns Result of the tool execution.
   */
  execute(
    signal: AbortSignal,
    updateOutput?: (output: ToolResultDisplay) => void,
    shellExecutionConfig?: ShellExecutionConfig,
  ): Promise<TResult>;
}

/**
 * A convenience base class for ToolInvocation.
 */
export abstract class BaseToolInvocation<
  TParams extends object,
  TResult extends ToolResult,
> implements ToolInvocation<TParams, TResult>
{
  constructor(readonly params: TParams) {}

  abstract getDescription(): string;

  toolLocations(): ToolLocation[] {
    return [];
  }

  /**
   * Default: read-only tools return 'allow'. Override in subclasses for
   * tools with side effects.
   */
  getDefaultPermission(): Promise<PermissionDecision> {
    return Promise.resolve('allow');
  }

  /**
   * Default fallback: returns a generic 'info' confirmation dialog using the
   * tool's getDescription(). This ensures that even tools whose
   * getDefaultPermission() returns 'allow' can still be prompted when PM
   * rules override the decision to 'ask' at L4.
   *
   * Tools with richer confirmation UIs (Shell, Edit, MCP, etc.) override this.
   */
  getConfirmationDetails(
    _abortSignal: AbortSignal,
  ): Promise<ToolCallConfirmationDetails> {
    const details: ToolInfoConfirmationDetails = {
      type: 'info',
      title: `Confirm ${this.constructor.name.replace(/Invocation$/, '')}`,
      prompt: this.getDescription(),
      onConfirm: async (
        _outcome: ToolConfirmationOutcome,
        _payload?: ToolConfirmationPayload,
      ) => {
        // No-op: persistence is handled by coreToolScheduler via PM rules
      },
    };
    return Promise.resolve(details);
  }

  abstract execute(
    signal: AbortSignal,
    updateOutput?: (output: ToolResultDisplay) => void,
    shellExecutionConfig?: ShellExecutionConfig,
  ): Promise<TResult>;
}

/**
 * A type alias for a tool invocation where the specific parameter and result types are not known.
 */
export type AnyToolInvocation = ToolInvocation<object, ToolResult>;

/**
 * Interface for a tool builder that validates parameters and creates invocations.
 */
export interface ToolBuilder<
  TParams extends object,
  TResult extends ToolResult,
> {
  /**
   * The internal name of the tool (used for API calls).
   */
  name: string;

  /**
   * The user-friendly display name of the tool.
   */
  displayName: string;

  /**
   * Description of what the tool does.
   */
  description: string;

  /**
   * The kind of tool for categorization and permissions.
   */
  kind: Kind;

  /**
   * Function declaration schema from @google/genai.
   */
  schema: FunctionDeclaration;

  /**
   * Whether the tool's output should be rendered as markdown.
   */
  isOutputMarkdown: boolean;

  /**
   * Whether the tool supports live (streaming) output.
   */
  canUpdateOutput: boolean;

  /**
   * Validates raw parameters and builds a ready-to-execute invocation.
   * @param params The raw, untrusted parameters from the model.
   * @returns A valid `ToolInvocation` if successful. Throws an error if validation fails.
   */
  build(params: TParams): ToolInvocation<TParams, TResult>;
}

/**
 * New base class for tools that separates validation from execution.
 * New tools should extend this class.
 */
export abstract class DeclarativeTool<
  TParams extends object,
  TResult extends ToolResult,
> implements ToolBuilder<TParams, TResult>
{
  constructor(
    readonly name: string,
    readonly displayName: string,
    readonly description: string,
    readonly kind: Kind,
    readonly parameterSchema: unknown,
    readonly isOutputMarkdown: boolean = true,
    readonly canUpdateOutput: boolean = false,
  ) {}

  get schema(): FunctionDeclaration {
    return {
      name: this.name,
      description: this.description,
      parametersJsonSchema: this.parameterSchema,
    };
  }

  /**
   * Validates the raw tool parameters.
   * Subclasses should override this to add custom validation logic
   * beyond the JSON schema check.
   * @param params The raw parameters from the model.
   * @returns An error message string if invalid, null otherwise.
   */
  validateToolParams(_params: TParams): string | null {
    // Base implementation can be extended by subclasses.
    return null;
  }

  /**
   * The core of the new pattern. It validates parameters and, if successful,
   * returns a `ToolInvocation` object that encapsulates the logic for the
   * specific, validated call.
   * @param params The raw, untrusted parameters from the model.
   * @returns A `ToolInvocation` instance.
   */
  abstract build(params: TParams): ToolInvocation<TParams, TResult>;

  /**
   * A convenience method that builds and executes the tool in one step.
   * Throws an error if validation fails.
   * @param params The raw, untrusted parameters from the model.
   * @param signal AbortSignal for tool cancellation.
   * @param updateOutput Optional callback to stream output.
   * @returns The result of the tool execution.
   */
  async buildAndExecute(
    params: TParams,
    signal: AbortSignal,
    updateOutput?: (output: ToolResultDisplay) => void,
    shellExecutionConfig?: ShellExecutionConfig,
  ): Promise<TResult> {
    const invocation = this.build(params);
    return invocation.execute(signal, updateOutput, shellExecutionConfig);
  }

  /**
   * Similar to `build` but never throws.
   * @param params The raw, untrusted parameters from the model.
   * @returns A `ToolInvocation` instance, or the `Error` raised by `build`.
   */
  private silentBuild(
    params: TParams,
  ): ToolInvocation<TParams, TResult> | Error {
    try {
      return this.build(params);
    } catch (e) {
      if (e instanceof Error) {
        return e;
      }
      return new Error(String(e));
    }
  }

  /**
   * A convenience method that builds and executes the tool in one step.
   * Never throws.
   * @param params The raw, untrusted parameters from the model.
   * @param abortSignal A signal to abort.
   * @returns The result of the tool execution.
   */
  async validateBuildAndExecute(
    params: TParams,
    abortSignal: AbortSignal,
  ): Promise<ToolResult> {
    const invocationOrError = this.silentBuild(params);
    if (invocationOrError instanceof Error) {
      const errorMessage = invocationOrError.message;
      return {
        llmContent: `Error: Invalid parameters provided. Reason: ${errorMessage}`,
        returnDisplay: errorMessage,
        error: {
          message: errorMessage,
          type: ToolErrorType.INVALID_TOOL_PARAMS,
        },
      };
    }

    try {
      return await invocationOrError.execute(abortSignal);
    } catch (error) {
      const errorMessage =
        error instanceof Error ? error.message : String(error);
      return {
        llmContent: `Error: Tool call execution failed. Reason: ${errorMessage}`,
        returnDisplay: errorMessage,
        error: {
          message: errorMessage,
          type: ToolErrorType.EXECUTION_FAILED,
        },
      };
    }
  }
}
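
The never-throw path in `validateBuildAndExecute` (build errors and execution errors both become error results) can be sketched in isolation. This is a simplified, synchronous illustration under stated assumptions: `MiniResult`, `MiniInvocation`, `safeBuildAndExecute`, and the two error-type strings are hypothetical stand-ins, not the real `ToolResult` or `ToolErrorType` values.

```typescript
// Simplified, synchronous stand-ins for the real types (assumptions).
interface MiniResult {
  llmContent: string;
  error?: { message: string; type: 'invalid_tool_params' | 'execution_failed' };
}

interface MiniInvocation {
  execute(): MiniResult;
}

/** Runs `build` and returns a thrown error as a value instead of propagating it. */
function silentBuild(build: () => MiniInvocation): MiniInvocation | Error {
  try {
    return build();
  } catch (e) {
    return e instanceof Error ? e : new Error(String(e));
  }
}

/** Builds and executes without ever throwing; failures become error results. */
function safeBuildAndExecute(build: () => MiniInvocation): MiniResult {
  const invocationOrError = silentBuild(build);
  if (invocationOrError instanceof Error) {
    const message = invocationOrError.message;
    return {
      llmContent: `Error: Invalid parameters provided. Reason: ${message}`,
      error: { message, type: 'invalid_tool_params' },
    };
  }
  try {
    return invocationOrError.execute();
  } catch (e) {
    const message = e instanceof Error ? e.message : String(e);
    return {
      llmContent: `Error: Tool call execution failed. Reason: ${message}`,
      error: { message, type: 'execution_failed' },
    };
  }
}
```

Keeping the two failure stages separate lets the caller tell a bad request from the model (build failed) apart from a tool that broke while running (execute failed).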

/**
 * New base class for declarative tools that separates validation from execution.
 * New tools should extend this class, which provides a `build` method that
 * validates parameters before deferring to a `createInvocation` method for
 * the final `ToolInvocation` object instantiation.
 */
export abstract class BaseDeclarativeTool<
  TParams extends object,
  TResult extends ToolResult,
> extends DeclarativeTool<TParams, TResult> {
  build(params: TParams): ToolInvocation<TParams, TResult> {
    const validationError = this.validateToolParams(params);
    if (validationError) {
      throw new Error(validationError);
    }
    return this.createInvocation(params);
  }

  override validateToolParams(params: TParams): string | null {
    const errors = SchemaValidator.validate(
      this.schema.parametersJsonSchema,
      params,
    );

    if (errors) {
      return errors;
    }
    return this.validateToolParamValues(params);
  }

  protected validateToolParamValues(_params: TParams): string | null {
    // Base implementation can be extended by subclasses.
    return null;
  }

  protected abstract createInvocation(
    params: TParams,
  ): ToolInvocation<TParams, TResult>;
}
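
The two-stage validation in `BaseDeclarativeTool` (schema check, then value check, then `createInvocation`) can be sketched with a self-contained stand-in. Everything here is an assumption for illustration: `MiniDeclarativeTool`, `EchoTool`, and `EchoParams` are hypothetical, and a hand-written required-key check takes the place of `SchemaValidator`, which is not available outside the repository.

```typescript
// Stand-in for the base class. The JSON-schema check is replaced by a
// hand-written required-key check (assumption, not the real validator).
abstract class MiniDeclarativeTool<TParams extends object, TOut> {
  constructor(private readonly requiredKeys: string[]) {}

  build(params: TParams): () => TOut {
    const validationError = this.validateToolParams(params);
    if (validationError) {
      throw new Error(validationError);
    }
    return this.createInvocation(params);
  }

  validateToolParams(params: TParams): string | null {
    // Stage 1: structural check.
    for (const key of this.requiredKeys) {
      if (!(key in params)) {
        return `params must have required property '${key}'`;
      }
    }
    // Stage 2: tool-specific value checks.
    return this.validateToolParamValues(params);
  }

  protected validateToolParamValues(_params: TParams): string | null {
    return null;
  }

  protected abstract createInvocation(params: TParams): () => TOut;
}

// Hypothetical tool: repeats a message a bounded number of times.
interface EchoParams {
  message: string;
  times: number;
}

class EchoTool extends MiniDeclarativeTool<EchoParams, string> {
  constructor() {
    super(['message', 'times']);
  }

  protected override validateToolParamValues(
    params: EchoParams,
  ): string | null {
    if (!Number.isInteger(params.times) || params.times < 1 || params.times > 5) {
      return 'times must be an integer between 1 and 5';
    }
    return null;
  }

  protected createInvocation(params: EchoParams): () => string {
    return () => Array(params.times).fill(params.message).join(' ');
  }
}
```

A subclass in this sketch only supplies value checks and the invocation factory; the ordering of the checks stays in the base class, which is the design the original class appears to follow.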

/**
 * A type alias for a declarative tool where the specific parameter and result types are not known.
 */
export type AnyDeclarativeTool = DeclarativeTool<object, ToolResult>;

/**
 * Type guard to check if an object is a Tool.
 * @param obj The object to check.
 * @returns True if the object is a Tool, false otherwise.
 */
export function isTool(obj: unknown): obj is AnyDeclarativeTool {
  return (
    typeof obj === 'object' &&
    obj !== null &&
    'name' in obj &&
    'build' in obj &&
    typeof (obj as AnyDeclarativeTool).build === 'function'
  );
}
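
A structural guard like `isTool` is typically used to filter values of unknown shape. The sketch below applies the same checks to a hypothetical stand-in type; `MiniTool`, `isMiniTool`, and the `candidates` array are assumptions made up for the example.

```typescript
// Minimal stand-in for AnyDeclarativeTool (assumption for illustration).
interface MiniTool {
  name: string;
  build(params: object): unknown;
}

/** Same structural checks as isTool, applied to the stand-in type. */
function isMiniTool(obj: unknown): obj is MiniTool {
  return (
    typeof obj === 'object' &&
    obj !== null &&
    'name' in obj &&
    'build' in obj &&
    typeof (obj as MiniTool).build === 'function'
  );
}

// Mixed input, e.g. values discovered at runtime.
const candidates: unknown[] = [
  { name: 'echo', build: (p: object) => p },
  { name: 'not-a-tool' },
  { name: 'broken', build: 42 },
  null,
  'echo',
];

// The type predicate narrows the filtered array to MiniTool[].
const toolNames: string[] = candidates.filter(isMiniTool).map((t) => t.name);
```

Only the first candidate passes. Note that the guard checks that `name` exists but not that it is a string, so callers that need that guarantee would have to add their own check.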

export interface ToolResult {
  /**
   * Content meant to be included in LLM history.
   * This should represent the factual outcome of the tool execution.
   */
  llmContent: PartListUnion;

  /**
   * Markdown string for user display.
   * This provides a user-friendly summary or visualization of the result.
   * NOTE: This might also be considered UI-specific and could potentially be
   * removed or modified in a further refactor if the server becomes purely API-driven.
   * For now, we keep it as the core logic in ReadFileTool currently produces it.
   */
  returnDisplay: ToolResultDisplay;

  /**
   * If this property is present, the tool call is considered a failure.
   */
  error?: {
    message: string; // raw error message
    type?: ToolErrorType; // An optional machine-readable error type (e.g., 'FILE_NOT_FOUND').
  };

  /**
   * Optional model override propagated from skill execution.
   * When present, the client should use this model for subsequent
   * turns within the same agentic loop.
   */
  modelOverride?: string;
}

/**
 * Detects cycles in a JSON schema caused by `$ref`s.
 * @param schema The root of the JSON schema.
 * @returns `true` if a cycle is detected, `false` otherwise.
 */
export function hasCycleInSchema(schema: object): boolean {
  function resolveRef(ref: string): object | null {
    if (!ref.startsWith('#/')) {
      return null;
    }
    const path = ref.substring(2).split('/');
    let current: unknown = schema;
    for (const segment of path) {
      if (
        typeof current !== 'object' ||
        current === null ||
        !Object.prototype.hasOwnProperty.call(current, segment)
      ) {
        return null;
      }
      current = (current as Record<string, unknown>)[segment];
    }
    return current as object;
  }

  function traverse(
    node: unknown,
    visitedRefs: Set<string>,
    pathRefs: Set<string>,
  ): boolean {
    if (typeof node !== 'object' || node === null) {
      return false;
    }

    if (Array.isArray(node)) {
      for (const item of node) {
        if (traverse(item, visitedRefs, pathRefs)) {
          return true;
        }
      }
      return false;
    }

    if ('$ref' in node && typeof node.$ref === 'string') {
      const ref = node.$ref;
      if (ref === '#/' || pathRefs.has(ref)) {
        // A ref to just '#/' is always a cycle.
        return true; // Cycle detected!
      }
      if (visitedRefs.has(ref)) {
        return false; // Bail early, we have checked this ref before.
      }

      const resolvedNode = resolveRef(ref);
      if (resolvedNode) {
        // Add it to both visited and the current path
        visitedRefs.add(ref);
        pathRefs.add(ref);
        const hasCycle = traverse(resolvedNode, visitedRefs, pathRefs);
        pathRefs.delete(ref); // Backtrack, leaving it in visited
        return hasCycle;
      }
    }

    // Crawl all the properties of node
    for (const key in node) {
      if (Object.prototype.hasOwnProperty.call(node, key)) {
        if (
          traverse(
            (node as Record<string, unknown>)[key],
            visitedRefs,
            pathRefs,
          )
        ) {
          return true;
        }
      }
    }

    return false;
  }

  return traverse(schema, new Set<string>(), new Set<string>());
}

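/*
Worked example for the `$ref` cycle check above (illustrative only). To stay runnable on
its own, the sketch uses a condensed restatement of the same traversal under the
hypothetical name `detectRefCycle`; the two sample schemas are also made up. It shows the
difference between a ref that is reached again on the current path (a cycle) and a ref
that is merely used twice (no cycle).

```typescript
// Condensed, hypothetical restatement of the traversal so the sketch runs by itself.
function detectRefCycle(schema: object): boolean {
  const resolve = (ref: string): unknown => {
    if (!ref.startsWith('#/')) return null; // only local pointers are followed
    let current: unknown = schema;
    for (const segment of ref.substring(2).split('/')) {
      if (
        typeof current !== 'object' ||
        current === null ||
        !Object.prototype.hasOwnProperty.call(current, segment)
      ) {
        return null;
      }
      current = (current as Record<string, unknown>)[segment];
    }
    return current;
  };

  const walk = (node: unknown, visited: Set<string>, path: Set<string>): boolean => {
    if (typeof node !== 'object' || node === null) return false;
    if (Array.isArray(node)) return node.some((item) => walk(item, visited, path));
    const ref = (node as Record<string, unknown>)['$ref'];
    if (typeof ref === 'string') {
      if (ref === '#/' || path.has(ref)) return true; // ref is already on the current path
      if (visited.has(ref)) return false; // explored earlier without finding a cycle
      const target = resolve(ref);
      if (target) {
        visited.add(ref);
        path.add(ref);
        const found = walk(target, visited, path);
        path.delete(ref); // backtrack, but keep the ref in visited
        return found;
      }
    }
    return Object.values(node).some((value) => walk(value, visited, path));
  };

  return walk(schema, new Set<string>(), new Set<string>());
}

// A linked-list style schema: `node` refers back to itself through `next`.
const recursiveSchema = {
  type: 'object',
  properties: { root: { $ref: '#/definitions/node' } },
  definitions: {
    node: { type: 'object', properties: { next: { $ref: '#/definitions/node' } } },
  },
};

// The same definition referenced twice, but never from inside itself.
const sharedSchema = {
  type: 'object',
  properties: {
    start: { $ref: '#/definitions/point' },
    end: { $ref: '#/definitions/point' },
  },
  definitions: {
    point: { type: 'object', properties: { x: { type: 'number' } } },
  },
};

const recursiveHasCycle = detectRefCycle(recursiveSchema); // true
const sharedHasCycle = detectRefCycle(sharedSchema); // false
console.log(recursiveHasCycle, sharedHasCycle);
```
*/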
export interface AgentResultDisplay {
  type: 'task_execution';
  subagentName: string;
  subagentColor?: string;
  taskDescription: string;
  taskPrompt: string;
  status: 'running' | 'completed' | 'failed' | 'cancelled' | 'background';
  terminateReason?: string;
  result?: string;
  executionSummary?: AgentStatsSummary;

  // If the subagent is awaiting approval for a tool call,
  // this contains the confirmation details for inline UI rendering.
  pendingConfirmation?: ToolCallConfirmationDetails;

  toolCalls?: Array<{
    callId: string;
    name: string;
    status: 'executing' | 'awaiting_approval' | 'success' | 'failed';
    error?: string;
    args?: Record<string, unknown>;
    result?: string;
    resultDisplay?: string;
    responseParts?: Part[];
    description?: string;
  }>;
}

export interface AnsiOutputDisplay {
  ansiOutput: AnsiOutput;
}

/**
 * Structured progress data following the MCP notifications/progress spec.
 * @see https://modelcontextprotocol.io/specification/2025-06-18/basic/utilities/progress
 */
export interface McpToolProgressData {
  type: 'mcp_tool_progress';
  /** Current progress value (must increase with each notification) */
  progress: number;
  /** Optional total value indicating the operation's target */
  total?: number;
  /** Optional human-readable progress message */
  message?: string;
}

export type ToolResultDisplay =
  | string
  | FileDiff
  | TodoResultDisplay
  | PlanResultDisplay
  | AgentResultDisplay
  | AnsiOutputDisplay
  | McpToolProgressData;

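/*
Narrowing sketch for a `ToolResultDisplay`-style union (illustrative only). Some members
of the union carry a `type` discriminant and others, such as plain strings and file diffs,
do not, so a consumer typically combines `typeof`, an `in` check, and a `switch`. The
trimmed types below (`FileDiffLike`, `TodoListLike`, `ProgressLike`, `DisplayLike`) and
the helper `summarizeDisplay` are hypothetical stand-ins, not part of this module.

```typescript
// Trimmed, hypothetical copies of three union members; the real interfaces carry more fields.
interface FileDiffLike {
  fileDiff: string;
  fileName: string;
}
interface TodoListLike {
  type: 'todo_list';
  todos: Array<{ id: string; content: string; status: 'pending' | 'in_progress' | 'completed' }>;
}
interface ProgressLike {
  type: 'mcp_tool_progress';
  progress: number;
  total?: number;
  message?: string;
}
type DisplayLike = string | FileDiffLike | TodoListLike | ProgressLike;

// Hypothetical helper: collapse any display value to a one-line summary.
function summarizeDisplay(display: DisplayLike): string {
  if (typeof display === 'string') {
    return display;
  }
  if ('type' in display) {
    switch (display.type) {
      case 'todo_list': {
        const done = display.todos.filter((todo) => todo.status === 'completed').length;
        return `${done}/${display.todos.length} todos completed`;
      }
      case 'mcp_tool_progress':
        return display.total === undefined
          ? `progress ${display.progress}`
          : `progress ${display.progress}/${display.total}`;
    }
  }
  // No `type` field: in this trimmed union only the diff shape remains.
  return `diff for ${display.fileName}`;
}

const summaries = [
  summarizeDisplay('done'),
  summarizeDisplay({ type: 'mcp_tool_progress', progress: 3, total: 10 }),
  summarizeDisplay({
    type: 'todo_list',
    todos: [
      { id: '1', content: 'write tests', status: 'completed' },
      { id: '2', content: 'ship', status: 'pending' },
    ],
  }),
  summarizeDisplay({ fileDiff: '@@ -1 +1 @@', fileName: 'tools.ts' }),
];
console.log(summaries);
```
*/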
export interface FileDiff {
  fileDiff: string;
  fileName: string;
  originalContent: string | null;
  newContent: string;
  diffStat?: DiffStat;
}

export interface DiffStat {
  model_added_lines: number;
  model_removed_lines: number;
  model_added_chars: number;
  model_removed_chars: number;
  user_added_lines: number;
  user_removed_lines: number;
  user_added_chars: number;
  user_removed_chars: number;
}

export interface TodoResultDisplay {
  type: 'todo_list';
  todos: Array<{
    id: string;
    content: string;
    status: 'pending' | 'in_progress' | 'completed';
  }>;
}

export interface PlanResultDisplay {
  type: 'plan_summary';
  message: string;
  plan: string;
  rejected?: boolean;
}

export interface ToolEditConfirmationDetails {
  type: 'edit';
  title: string;
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
  /**
   * When true, the UI should not show "Always allow" options (ProceedAlwaysProject/User).
   * Set by coreToolScheduler when PM has an explicit 'ask' rule that would override
   * any 'allow' rule the user might add.
   */
  hideAlwaysAllow?: boolean;
  fileName: string;
  filePath: string;
  fileDiff: string;
  originalContent: string | null;
  newContent: string;
  isModifying?: boolean;
}

export interface ToolConfirmationPayload {
  // used to override `modifiedProposedContent` for modifiable tools in the
  // inline modify flow
  newContent?: string;
  // used to provide custom cancellation message when outcome is Cancel
  cancelMessage?: string;
  // Permission rules to persist when user selects ProceedAlwaysProject/User.
  // Populated by the tool's getConfirmationDetails() and read by
  // coreToolScheduler.handleConfirmationResponse() for persistence.
  permissionRules?: string[];
  // used to pass user answers from ask_user_question tool
  answers?: Record<string, string>;
}

export interface ToolExecuteConfirmationDetails {
  type: 'exec';
  title: string;
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
  /** @see ToolEditConfirmationDetails.hideAlwaysAllow */
  hideAlwaysAllow?: boolean;
  command: string;
  rootCommand: string;
  /** Permission rules extracted by extractCommandRules(), used for display and persistence. */
  permissionRules?: string[];
}

export interface ToolMcpConfirmationDetails {
  type: 'mcp';
  title: string;
  /** @see ToolEditConfirmationDetails.hideAlwaysAllow */
  hideAlwaysAllow?: boolean;
  serverName: string;
  toolName: string;
  toolDisplayName: string;
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
  /** Permission rule for this MCP tool, e.g. 'mcp__server__tool'. */
  permissionRules?: string[];
}

export interface ToolInfoConfirmationDetails {
  type: 'info';
  title: string;
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
  /** @see ToolEditConfirmationDetails.hideAlwaysAllow */
  hideAlwaysAllow?: boolean;
  prompt: string;
  urls?: string[];
  /** Permission rules for persistence, e.g. 'WebFetch(example.com)'. */
  permissionRules?: string[];
}

export type ToolCallConfirmationDetails =
  | ToolEditConfirmationDetails
  | ToolExecuteConfirmationDetails
  | ToolMcpConfirmationDetails
  | ToolInfoConfirmationDetails
  | ToolPlanConfirmationDetails
  | ToolAskUserQuestionConfirmationDetails;

export interface ToolPlanConfirmationDetails {
  type: 'plan';
  title: string;
  /** @see ToolEditConfirmationDetails.hideAlwaysAllow */
  hideAlwaysAllow?: boolean;
  plan: string;
  /** The approval mode that was active before entering plan mode (for display in the UI). */
  prePlanMode?: string;
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
}

export interface ToolAskUserQuestionConfirmationDetails {
  type: 'ask_user_question';
  title: string;
  questions: Array<{
    question: string;
    header: string;
    options: Array<{
      label: string;
      description: string;
    }>;
    multiSelect: boolean;
  }>;
  metadata?: {
    source?: string;
  };
  onConfirm: (
    outcome: ToolConfirmationOutcome,
    payload?: ToolConfirmationPayload,
  ) => Promise<void>;
}

/**
 * TODO:
 * 1. support explicit denied outcome
 * 2. support proceed with modified input
 */
export enum ToolConfirmationOutcome {
  ProceedOnce = 'proceed_once',
  ProceedAlways = 'proceed_always',
  /** @deprecated Use ProceedAlwaysProject or ProceedAlwaysUser instead. */
  ProceedAlwaysServer = 'proceed_always_server',
  /** @deprecated Use ProceedAlwaysProject or ProceedAlwaysUser instead. */
  ProceedAlwaysTool = 'proceed_always_tool',
  /** Persist the permission rule to the project settings (workspace scope). */
  ProceedAlwaysProject = 'proceed_always_project',
  /** Persist the permission rule to the user settings (user scope). */
  ProceedAlwaysUser = 'proceed_always_user',
  ModifyWithEditor = 'modify_with_editor',
  /** Restore the approval mode that was active before entering plan mode. */
  RestorePrevious = 'restore_previous',
  Cancel = 'cancel',
}

export enum Kind {
  Read = 'read',
  Edit = 'edit',
  Delete = 'delete',
  Move = 'move',
  Search = 'search',
  Execute = 'execute',
  Think = 'think',
  Fetch = 'fetch',
  Other = 'other',
}

// Function kinds that have side effects
export const MUTATOR_KINDS: Kind[] = [
  Kind.Edit,
  Kind.Delete,
  Kind.Move,
  Kind.Execute,
] as const;

/**
 * Tool kinds that are safe to execute concurrently (pure reads, no writes).
 * Kind.Think is excluded because some Think tools write to disk
 * (e.g., save_memory, todo_write).
 */
export const CONCURRENCY_SAFE_KINDS: ReadonlySet<Kind> = new Set([
  Kind.Read,
  Kind.Search,
  Kind.Fetch,
]);

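/*
One way a caller might use a concurrency-safe set like the one above (a sketch under
stated assumptions, not the scheduler's actual policy, which is not shown in this file).
Consecutive read-only calls are grouped into one batch that could run in parallel, and
every other call gets a batch of its own. `KindLike`, `SAFE_KINDS`, `PendingCall`, and
`planBatches` are hypothetical names; the string union stands in for the `Kind` enum.

```typescript
// Stand-in for the Kind enum values used in this sketch.
type KindLike = 'read' | 'edit' | 'search' | 'execute' | 'think' | 'fetch';

// Mirrors the membership of CONCURRENCY_SAFE_KINDS: pure reads only.
const SAFE_KINDS: ReadonlySet<KindLike> = new Set<KindLike>(['read', 'search', 'fetch']);

// Hypothetical shape for a pending tool call.
interface PendingCall {
  name: string;
  kind: KindLike;
}

// Group consecutive concurrency-safe calls into one batch; every other call runs alone.
function planBatches(calls: PendingCall[]): PendingCall[][] {
  const batches: PendingCall[][] = [];
  let safeBatch: PendingCall[] = [];
  for (const call of calls) {
    if (SAFE_KINDS.has(call.kind)) {
      safeBatch.push(call);
      continue;
    }
    // A call with possible side effects closes the current safe batch.
    if (safeBatch.length > 0) {
      batches.push(safeBatch);
      safeBatch = [];
    }
    batches.push([call]);
  }
  if (safeBatch.length > 0) {
    batches.push(safeBatch);
  }
  return batches;
}

const plannedNames = planBatches([
  { name: 'read_file', kind: 'read' },
  { name: 'grep', kind: 'search' },
  { name: 'edit', kind: 'edit' },
  { name: 'save_memory', kind: 'think' }, // Think is not in the safe set
  { name: 'web_fetch', kind: 'fetch' },
]).map((batch) => batch.map((call) => call.name));
console.log(plannedNames);
```

Preserving call order between batches keeps a write from being reordered ahead of a read
that was issued before it.
*/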
export interface ToolLocation {
  // Absolute path to the file
  path: string;
  // Which line (if known)
  line?: number;
}