mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-04-28 03:30:40 +00:00
feat(cli/sdk): expose /context usage data in non-interactive mode and SDK API (#2916)
Some checks are pending
Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions
Qwen Code CI / Lint (push) Waiting to run
Qwen Code CI / Test (push) Blocked by required conditions
Qwen Code CI / Test-1 (push) Blocked by required conditions
Qwen Code CI / Test-2 (push) Blocked by required conditions
Qwen Code CI / Test-3 (push) Blocked by required conditions
Qwen Code CI / Test-4 (push) Blocked by required conditions
Qwen Code CI / Test-5 (push) Blocked by required conditions
Qwen Code CI / Test-6 (push) Blocked by required conditions
Qwen Code CI / Test-7 (push) Blocked by required conditions
Qwen Code CI / Test-8 (push) Blocked by required conditions
Qwen Code CI / CodeQL (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run
E2E Tests / E2E Test - macOS (push) Waiting to run
* feat(cli): implement non-interactive /context output and diagnostic

  - Extract collectContextData() from contextCommand.ts for shared usage.
  - Register /context in ALLOWED_BUILTIN_COMMANDS_NON_INTERACTIVE.
  - Extend SDK control protocol with GET_CONTEXT_USAGE request.
  - Implement handleGetContextUsage in SystemController for programmatic token queries.
  - Expose getContextUsage() method in the TypeScript SDK Query interface.

* fix: address review feedback and fix critical bugs in context usage feature

  - Add missing `get_context_usage` route in ControlDispatcher (SDK calls would throw)
  - Fix `executionMode` defaulting: use `?? 'interactive'` to match other commands
  - Validate dynamic import of `collectContextData` before invoking
  - Preserve original error message in handleGetContextUsage catch block
  - Add ControlDispatcher test for get_context_usage routing
  - Add JSDoc comment for context command in non-interactive allowlist

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: re-check abort signal after async operations in handleGetContextUsage

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add getContextUsage() to SDK TypeScript documentation

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify getContextUsage showDetails is a display hint, not a data filter

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: make showDetails affect response shape, add getContextUsage test

  - When showDetails is false, return empty detail arrays instead of full data,
    so /context and /context detail produce different payloads
  - Add unit test for Query.getContextUsage() covering request payload and
    response handling

* fix: strip UI type from SDK response, sync Java SDK protocol

  - Remove leaked `type: 'context_usage'` from control response payload
  - Add GET_CONTEXT_USAGE to Java SDK protocol mirror (enum, interface, union type)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
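The wire shapes involved can be sketched directly from the types and the dispatcher test in the diff: the SDK sends a `get_context_usage` control request, and the dispatcher wraps the controller's payload in a `success` control response. The helper function names below are illustrative, not from the repository:

```typescript
// Request payload shape, per CLIControlGetContextUsageRequest in the diff.
interface GetContextUsageRequest {
  subtype: 'get_context_usage';
  show_details?: boolean;
}

// Envelope shape, per CLIControlRequest in the diff.
interface ControlRequest {
  type: 'control_request';
  request_id: string;
  request: GetContextUsageRequest;
}

function buildGetContextUsageRequest(
  requestId: string,
  showDetails = false,
): ControlRequest {
  return {
    type: 'control_request',
    request_id: requestId,
    request: { subtype: 'get_context_usage', show_details: showDetails },
  };
}

// Success envelope the dispatcher emits, per the ControlDispatcher test.
function buildSuccessResponse(
  requestId: string,
  payload: Record<string, unknown>,
) {
  return {
    type: 'control_response' as const,
    response: {
      subtype: 'success' as const,
      request_id: requestId,
      response: payload,
    },
  };
}
```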
This commit is contained in:
parent
e90abf4c35
commit
1486e85385
11 changed files with 445 additions and 275 deletions
@@ -79,12 +79,12 @@ Creates a new query session with the Qwen Code.

 The SDK enforces the following default timeouts:

-| Timeout          | Default  | Description |
-| ---------------- | -------- | ----------- |
-| `canUseTool`     | 1 minute | Maximum time for `canUseTool` callback to respond. If exceeded, the tool request is auto-denied. |
-| `mcpRequest`     | 1 minute | Maximum time for SDK MCP tool calls to complete. |
-| `controlRequest` | 1 minute | Maximum time for control operations like `initialize()`, `setModel()`, `setPermissionMode()`, and `interrupt()` to complete. |
-| `streamClose`    | 1 minute | Maximum time to wait for initialization to complete before closing CLI stdin in multi-turn mode with SDK MCP servers. |
+| Timeout          | Default  | Description |
+| ---------------- | -------- | ----------- |
+| `canUseTool`     | 1 minute | Maximum time for `canUseTool` callback to respond. If exceeded, the tool request is auto-denied. |
+| `mcpRequest`     | 1 minute | Maximum time for SDK MCP tool calls to complete. |
+| `controlRequest` | 1 minute | Maximum time for control operations like `initialize()`, `setModel()`, `setPermissionMode()`, `getContextUsage()`, and `interrupt()` to complete. |
+| `streamClose`    | 1 minute | Maximum time to wait for initialization to complete before closing CLI stdin in multi-turn mode with SDK MCP servers. |

 You can customize these timeouts via the `timeout` option:
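The fenced example that followed that sentence falls outside this hunk and is not shown in the commit view. A plausible sketch, assuming the `timeout` option is an object of per-operation millisecond values keyed by the names in the table above (the key names are taken from the table; the exact option shape is an assumption):

```typescript
// Hypothetical sketch of the `timeout` option; keys assumed from the table,
// values are milliseconds overriding the 1-minute defaults.
const timeout = {
  canUseTool: 120_000, // allow 2 minutes for interactive approval flows
  mcpRequest: 60_000,
  controlRequest: 30_000,
  streamClose: 60_000,
};
```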
@@ -143,6 +143,11 @@ await q.setPermissionMode('yolo');

 // Change model mid-session
 await q.setModel('qwen-max');

+// Get context window usage breakdown (token counts per category)
+const usage = await q.getContextUsage();
+// Pass true to hint that per-item details should be displayed
+const detail = await q.getContextUsage(true);
+
 // Close the session
 await q.close();
 ```
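The object `getContextUsage()` resolves with mirrors the `collectContextData()` payload introduced by this commit, with the internal UI `type` field stripped before it reaches the SDK. A small helper over those fields; the interface below is trimmed to the fields it actually uses and is illustrative, not the repository's full type:

```typescript
// Trimmed view of the context-usage payload (field names from the diff).
interface ContextUsageLike {
  totalTokens: number;
  contextWindowSize: number;
  isEstimated: boolean;
  breakdown: { messages: number; freeSpace: number; autocompactBuffer: number };
}

// Percentage of the context window consumed, rounded to the nearest integer.
function utilizationPercent(usage: ContextUsageLike): number {
  if (usage.contextWindowSize <= 0) return 0;
  return Math.round((usage.totalTokens / usage.contextWindowSize) * 100);
}
```

Note that `totalTokens` is 0 when `isEstimated` is true (no API data yet), so callers should check that flag before reporting utilization.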
@@ -18,6 +18,7 @@ import type {
   CLIControlInterruptRequest,
   CLIControlSetModelRequest,
   CLIControlSupportedCommandsRequest,
+  CLIControlGetContextUsageRequest,
 } from '../types.js';

 /**
@@ -242,6 +243,41 @@ describe('ControlDispatcher', () => {
     });
   });

+  it('should route get_context_usage request to system controller', async () => {
+    const request: CLIControlRequest = {
+      type: 'control_request',
+      request_id: 'req-ctx',
+      request: {
+        subtype: 'get_context_usage',
+        show_details: false,
+      } as CLIControlGetContextUsageRequest,
+    };
+
+    const mockResponse = {
+      subtype: 'get_context_usage',
+      totalTokens: 1000,
+    };
+
+    vi.mocked(mockSystemController.handleRequest).mockResolvedValue(
+      mockResponse,
+    );
+
+    await dispatcher.dispatch(request);
+
+    expect(mockSystemController.handleRequest).toHaveBeenCalledWith(
+      request.request,
+      'req-ctx',
+    );
+    expect(mockContext.streamJson.send).toHaveBeenCalledWith({
+      type: 'control_response',
+      response: {
+        subtype: 'success',
+        request_id: 'req-ctx',
+        response: mockResponse,
+      },
+    });
+  });
+
   it('should send error response when controller throws error', async () => {
     const request: CLIControlRequest = {
       type: 'control_request',
@@ -14,7 +14,7 @@
  * which wraps these controllers with a stable programmatic API.
  *
  * Controllers:
- * - SystemController: initialize, interrupt, set_model, supported_commands
+ * - SystemController: initialize, interrupt, set_model, supported_commands, get_context_usage
  * - PermissionController: can_use_tool, set_permission_mode
  * - SdkMcpController: mcp_server_status (mcp_message handled via callback)
  * - HookController: hook_callback
@@ -380,6 +380,7 @@ export class ControlDispatcher implements IPendingRequestRegistry {
       case 'interrupt':
       case 'set_model':
       case 'supported_commands':
+      case 'get_context_usage':
         return this.systemController;

       case 'can_use_tool':
@@ -19,6 +19,7 @@ import type {
   CLIControlInitializeRequest,
   CLIControlSetModelRequest,
   CLIMcpServerConfig,
+  CLIControlGetContextUsageRequest,
 } from '../../types.js';
 import { getAvailableCommands } from '../../../nonInteractiveCliCommands.js';
 import {
@@ -61,11 +62,58 @@ export class SystemController extends BaseController {
       case 'supported_commands':
         return this.handleSupportedCommands(signal);

+      case 'get_context_usage':
+        return this.handleGetContextUsage(
+          payload as CLIControlGetContextUsageRequest,
+          signal,
+        );
+
       default:
         throw new Error(`Unsupported request subtype in SystemController`);
     }
   }

+  private async handleGetContextUsage(
+    payload: CLIControlGetContextUsageRequest,
+    signal: AbortSignal,
+  ): Promise<Record<string, unknown>> {
+    if (signal.aborted) {
+      throw new Error('Request aborted');
+    }
+
+    try {
+      const mod = await import('../../../ui/commands/contextCommand.js');
+      if (signal.aborted) {
+        throw new Error('Request aborted');
+      }
+      if (typeof mod.collectContextData !== 'function') {
+        throw new Error('collectContextData is not available');
+      }
+      const showDetails = payload.show_details ?? false;
+      const contextUsageItem = await mod.collectContextData(
+        this.context.config,
+        showDetails,
+      );
+      if (signal.aborted) {
+        throw new Error('Request aborted');
+      }
+
+      const { type: _type, ...contextData } = contextUsageItem;
+      return {
+        subtype: 'get_context_usage',
+        ...contextData,
+      };
+    } catch (error) {
+      const errorMessage =
+        error instanceof Error ? error.message : 'Failed to get context usage';
+      debugLogger.error(
+        '[SystemController] Failed to get context usage:',
+        error,
+      );
+      throw new Error(errorMessage);
+    }
+  }
+
   /**
    * Handle initialize request
    *
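The handler above re-tests `signal.aborted` after every `await` because an `AbortSignal` is only observed at explicit checkpoints; the signal may fire while an async step is in flight. A generic sketch of that pattern (the names here are illustrative, not from the repository):

```typescript
// Run a sequence of async steps, re-checking the abort signal before each
// step and once more after the last await, mirroring handleGetContextUsage.
async function withAbortChecks<T>(
  signal: AbortSignal,
  steps: Array<() => Promise<T>>,
): Promise<T[]> {
  const results: T[] = [];
  for (const step of steps) {
    // Check before starting work; the signal may have fired while the
    // previous step was in flight.
    if (signal.aborted) throw new Error('Request aborted');
    results.push(await step());
  }
  // Final re-check so an abort during the last step is not silently ignored.
  if (signal.aborted) throw new Error('Request aborted');
  return results;
}
```

Without the re-checks, a caller that aborted mid-operation would still receive (and perhaps act on) a stale result.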
@@ -212,6 +260,7 @@ export class SystemController extends BaseController {
       can_set_permission_mode:
         typeof this.context.config.setApprovalMode === 'function',
       can_set_model: typeof this.context.config.setModel === 'function',
+      can_get_context_usage: true,
       // SDK MCP servers are supported - messages routed through control plane
       can_handle_mcp_message: true,
     };
@@ -407,6 +407,11 @@ export interface CLIControlSupportedCommandsRequest {
   subtype: 'supported_commands';
 }

+export interface CLIControlGetContextUsageRequest {
+  subtype: 'get_context_usage';
+  show_details?: boolean;
+}
+
 export type ControlRequestPayload =
   | CLIControlInterruptRequest
   | CLIControlPermissionRequest
@@ -416,7 +421,8 @@ export type ControlRequestPayload =
   | CLIControlMcpMessageRequest
   | CLIControlSetModelRequest
   | CLIControlMcpStatusRequest
-  | CLIControlSupportedCommandsRequest;
+  | CLIControlSupportedCommandsRequest
+  | CLIControlGetContextUsageRequest;

 export interface CLIControlRequest {
   type: 'control_request';
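Adding `CLIControlGetContextUsageRequest` to the payload union lets the dispatcher's routing switch stay exhaustive: forgetting a route becomes a compile error rather than the runtime throw the review caught. A minimal sketch with a reduced two-member union (the full union is in the diff; the function name here is illustrative):

```typescript
// Reduced payload union for illustration.
type PayloadSketch =
  | { subtype: 'supported_commands' }
  | { subtype: 'get_context_usage'; show_details?: boolean };

function routeToController(p: PayloadSketch): 'system' {
  switch (p.subtype) {
    case 'supported_commands':
    case 'get_context_usage':
      return 'system';
    default: {
      // Exhaustiveness check: adding a union member without a route makes
      // this assignment fail to compile.
      const unreachable: never = p;
      throw new Error(`Unsupported subtype: ${JSON.stringify(unreachable)}`);
    }
  }
}
```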
@@ -37,6 +37,7 @@ const debugLogger = createDebugLogger('NON_INTERACTIVE_COMMANDS');
  * - init: Initialize project configuration
  * - summary: Generate session summary
  * - compress: Compress conversation history
+ * - context: Show context window usage (read-only diagnostic)
  */
 export const ALLOWED_BUILTIN_COMMANDS_NON_INTERACTIVE = [
   'init',
@@ -44,6 +45,7 @@ export const ALLOWED_BUILTIN_COMMANDS_NON_INTERACTIVE = [
   'compress',
   'btw',
   'bug',
+  'context',
 ] as const;

 /**
@@ -87,6 +87,226 @@ function parseMemoryFiles(memoryContent: string): ContextMemoryDetail[] {
   return results;
 }

+export async function collectContextData(
+  config: import('@qwen-code/qwen-code-core').Config,
+  showDetails: boolean,
+): Promise<HistoryItemContextUsage> {
+  const modelName = config.getModel() || 'unknown';
+  const contentGeneratorConfig = config.getContentGeneratorConfig();
+  const contextWindowSize =
+    contentGeneratorConfig.contextWindowSize ?? DEFAULT_TOKEN_LIMIT;
+
+  const apiTotalTokens = uiTelemetryService.getLastPromptTokenCount();
+  const apiCachedTokens = uiTelemetryService.getLastCachedContentTokenCount();
+
+  const systemPromptText = getCoreSystemPrompt(undefined, modelName);
+  const systemPromptTokens = estimateTokens(systemPromptText);
+
+  const toolRegistry = config.getToolRegistry();
+  const allTools = toolRegistry ? toolRegistry.getAllTools() : [];
+  const toolDeclarations = toolRegistry
+    ? toolRegistry.getFunctionDeclarations()
+    : [];
+  const toolsJsonStr = JSON.stringify(toolDeclarations);
+  const allToolsTokens = estimateTokens(toolsJsonStr);
+
+  const builtinTools: ContextToolDetail[] = [];
+  const mcpTools: ContextToolDetail[] = [];
+  for (const tool of allTools) {
+    const toolJsonStr = JSON.stringify(tool.schema);
+    const tokens = estimateTokens(toolJsonStr);
+    if (tool instanceof DiscoveredMCPTool) {
+      mcpTools.push({
+        name: `${tool.serverName}__${tool.serverToolName || tool.name}`,
+        tokens,
+      });
+    } else if (tool.name !== ToolNames.SKILL) {
+      builtinTools.push({
+        name: tool.name,
+        tokens,
+      });
+    }
+  }
+
+  const memoryContent = config.getUserMemory();
+  const memoryFiles = parseMemoryFiles(memoryContent);
+  const memoryFilesTokens = memoryFiles.reduce((sum, f) => sum + f.tokens, 0);
+
+  const skillTool = allTools.find((tool) => tool.name === ToolNames.SKILL);
+  const skillToolDefinitionTokens = skillTool
+    ? estimateTokens(JSON.stringify(skillTool.schema))
+    : 0;
+
+  const loadedSkillNames: ReadonlySet<string> =
+    skillTool instanceof SkillTool
+      ? skillTool.getLoadedSkillNames()
+      : new Set();
+
+  const skillManager = config.getSkillManager();
+  const skillConfigs = skillManager ? await skillManager.listSkills() : [];
+  let loadedBodiesTokens = 0;
+  const skills: ContextSkillDetail[] = skillConfigs.map((skill) => {
+    const listingTokens = estimateTokens(
+      `<skill>\n<name>\n${skill.name}\n</name>\n<description>\n${skill.description} (${skill.level})\n</description>\n<location>\n${skill.level}\n</location>\n</skill>`,
+    );
+    const isLoaded = loadedSkillNames.has(skill.name);
+    let bodyTokens: number | undefined;
+    if (isLoaded && skill.body) {
+      const baseDir = skill.filePath
+        ? skill.filePath.replace(/\/[^/]+$/, '')
+        : '';
+      bodyTokens = estimateTokens(buildSkillLlmContent(baseDir, skill.body));
+      loadedBodiesTokens += bodyTokens;
+    }
+    return {
+      name: skill.name,
+      tokens: listingTokens,
+      loaded: isLoaded,
+      bodyTokens,
+    };
+  });
+
+  const skillsTokens = skillToolDefinitionTokens + loadedBodiesTokens;
+
+  const compressionThreshold =
+    config.getChatCompression()?.contextPercentageThreshold ??
+    DEFAULT_COMPRESSION_THRESHOLD;
+  const autocompactBuffer =
+    compressionThreshold > 0
+      ? Math.round((1 - compressionThreshold) * contextWindowSize)
+      : 0;
+
+  const rawOverhead =
+    systemPromptTokens +
+    allToolsTokens +
+    memoryFilesTokens +
+    loadedBodiesTokens;
+
+  const isEstimated = apiTotalTokens === 0;
+
+  const mcpToolsTotalTokens = mcpTools.reduce(
+    (sum, tool) => sum + tool.tokens,
+    0,
+  );
+
+  let totalTokens: number;
+  let displaySystemPrompt: number;
+  let displayBuiltinTools: number;
+  let displayMcpTools: number;
+  let displayMemoryFiles: number;
+  let displaySkills: number;
+  let messagesTokens: number;
+  let freeSpace: number;
+  let detailBuiltinTools: ContextToolDetail[];
+  let detailMcpTools: ContextToolDetail[];
+  let detailMemoryFiles: ContextMemoryDetail[];
+  let detailSkills: ContextSkillDetail[];
+
+  if (isEstimated) {
+    totalTokens = 0;
+    displaySystemPrompt = systemPromptTokens;
+    displaySkills = skillsTokens;
+    displayBuiltinTools = Math.max(
+      0,
+      allToolsTokens - skillToolDefinitionTokens - mcpToolsTotalTokens,
+    );
+    displayMcpTools = mcpToolsTotalTokens;
+    displayMemoryFiles = memoryFilesTokens;
+    messagesTokens = 0;
+    freeSpace = Math.max(
+      0,
+      contextWindowSize - rawOverhead - autocompactBuffer,
+    );
+    detailBuiltinTools = builtinTools;
+    detailMcpTools = mcpTools;
+    detailMemoryFiles = memoryFiles;
+    detailSkills = skills;
+  } else {
+    totalTokens = apiTotalTokens;
+
+    const overheadScale =
+      rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
+
+    displaySystemPrompt = Math.round(systemPromptTokens * overheadScale);
+    const scaledAllTools = Math.round(allToolsTokens * overheadScale);
+    displayMemoryFiles = Math.round(memoryFilesTokens * overheadScale);
+    displaySkills = Math.round(skillsTokens * overheadScale);
+    const scaledMcpTotal = Math.round(mcpToolsTotalTokens * overheadScale);
+    displayMcpTools = scaledMcpTotal;
+    const scaledSkillDefinition = Math.round(
+      skillToolDefinitionTokens * overheadScale,
+    );
+    displayBuiltinTools = Math.max(
+      0,
+      scaledAllTools - scaledSkillDefinition - scaledMcpTotal,
+    );
+
+    const scaledOverhead =
+      displaySystemPrompt +
+      scaledAllTools +
+      displayMemoryFiles +
+      Math.round(loadedBodiesTokens * overheadScale);
+
+    if (apiCachedTokens > 0) {
+      messagesTokens = Math.max(0, totalTokens - apiCachedTokens);
+    } else {
+      messagesTokens = Math.max(0, totalTokens - scaledOverhead);
+    }
+
+    freeSpace = Math.max(
+      0,
+      contextWindowSize - totalTokens - autocompactBuffer,
+    );
+
+    const scaleDetail = <T extends { tokens: number }>(items: T[]): T[] =>
+      overheadScale < 1
+        ? items.map((item) => ({
+            ...item,
+            tokens: Math.round(item.tokens * overheadScale),
+          }))
+        : items;
+
+    detailBuiltinTools = scaleDetail(builtinTools);
+    detailMcpTools = scaleDetail(mcpTools);
+    detailMemoryFiles = scaleDetail(memoryFiles);
+    detailSkills =
+      overheadScale < 1
+        ? skills.map((item) => ({
+            ...item,
+            tokens: Math.round(item.tokens * overheadScale),
+            bodyTokens: item.bodyTokens
+              ? Math.round(item.bodyTokens * overheadScale)
+              : undefined,
+          }))
+        : skills;
+  }
+
+  const breakdown: ContextCategoryBreakdown = {
+    systemPrompt: displaySystemPrompt,
+    builtinTools: displayBuiltinTools,
+    mcpTools: displayMcpTools,
+    memoryFiles: displayMemoryFiles,
+    skills: displaySkills,
+    messages: messagesTokens,
+    freeSpace,
+    autocompactBuffer,
+  };
+
+  return {
+    type: MessageType.CONTEXT_USAGE,
+    modelName,
+    totalTokens,
+    contextWindowSize,
+    breakdown,
+    builtinTools: showDetails ? detailBuiltinTools : [],
+    mcpTools: showDetails ? detailMcpTools : [],
+    memoryFiles: showDetails ? detailMemoryFiles : [],
+    skills: showDetails ? detailSkills : [],
+    isEstimated,
+    showDetails,
+  };
+}
+
 export const contextCommand: SlashCommand = {
   name: 'context',
   get description() {
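The core arithmetic in `collectContextData` is the proportional scaling: when the locally estimated overhead exceeds the API-reported prompt total, every overhead category is multiplied by `totalTokens / rawOverhead` so the breakdown still sums sensibly; when estimates are under the total, no scaling is applied. Extracted as standalone functions for illustration (these helper names are not in the repository):

```typescript
// Scale factor applied to overhead categories, as in collectContextData:
// only shrink estimates, never inflate them.
function overheadScale(rawOverhead: number, totalTokens: number): number {
  return rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
}

// Each category (system prompt, tools, memory files, skills) is rounded
// after scaling, matching the Math.round calls in the diff.
function scaleCategory(tokens: number, scale: number): number {
  return Math.round(tokens * scale);
}
```

For example, a raw overhead estimate of 2000 tokens against an API total of 1000 yields a scale of 0.5, so a 300-token category is reported as 150.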
@ -99,279 +319,38 @@ export const contextCommand: SlashCommand = {
|
|||
const showDetails =
|
||||
args?.trim().toLowerCase() === 'detail' ||
|
||||
args?.trim().toLowerCase() === '-d';
|
||||
const executionMode = context.executionMode ?? 'interactive';
|
||||
const { config } = context.services;
|
||||
if (!config) {
|
||||
context.ui.addItem(
|
||||
{
|
||||
type: MessageType.ERROR,
|
||||
text: t('Config not loaded.'),
|
||||
},
|
||||
Date.now(),
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
// --- Gather data ---
|
||||
|
||||
const modelName = config.getModel() || 'unknown';
|
||||
const contentGeneratorConfig = config.getContentGeneratorConfig();
|
||||
const contextWindowSize =
|
||||
contentGeneratorConfig.contextWindowSize ?? DEFAULT_TOKEN_LIMIT;
|
||||
|
||||
// Total prompt token count from API (most accurate)
|
||||
const apiTotalTokens = uiTelemetryService.getLastPromptTokenCount();
|
||||
// Cached content token count — when available (e.g. DashScope prefix caching),
|
||||
// represents the cached overhead (system prompt + tools). Using this gives a much
|
||||
// more accurate "Messages" count: promptTokens - cachedTokens = actual history tokens.
|
||||
const apiCachedTokens = uiTelemetryService.getLastCachedContentTokenCount();
|
||||
|
||||
// 1. System prompt tokens (without memory, as memory is counted separately)
|
||||
const systemPromptText = getCoreSystemPrompt(undefined, modelName);
|
||||
const systemPromptTokens = estimateTokens(systemPromptText);
|
||||
|
||||
// 2. Tool declarations tokens (includes ALL tools: built-in, MCP, skill tool)
|
||||
const toolRegistry = config.getToolRegistry();
|
||||
const allTools = toolRegistry ? toolRegistry.getAllTools() : [];
|
||||
const toolDeclarations = toolRegistry
|
||||
? toolRegistry.getFunctionDeclarations()
|
||||
: [];
|
||||
const toolsJsonStr = JSON.stringify(toolDeclarations);
|
||||
const allToolsTokens = estimateTokens(toolsJsonStr);
|
||||
|
||||
// 3. Per-tool details (for breakdown display)
|
||||
const builtinTools: ContextToolDetail[] = [];
|
||||
const mcpTools: ContextToolDetail[] = [];
|
||||
for (const tool of allTools) {
|
||||
const toolJsonStr = JSON.stringify(tool.schema);
|
||||
const tokens = estimateTokens(toolJsonStr);
|
||||
if (tool instanceof DiscoveredMCPTool) {
|
||||
mcpTools.push({
|
||||
name: `${tool.serverName}__${tool.serverToolName || tool.name}`,
|
||||
tokens,
|
||||
});
|
||||
} else if (tool.name !== ToolNames.SKILL) {
|
||||
// Built-in tool (exclude SkillTool, which is shown under Skills)
|
||||
builtinTools.push({
|
||||
name: tool.name,
|
||||
tokens,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// 4. Memory files
|
||||
const memoryContent = config.getUserMemory();
|
||||
const memoryFiles = parseMemoryFiles(memoryContent);
|
||||
const memoryFilesTokens = memoryFiles.reduce((sum, f) => sum + f.tokens, 0);
|
||||
|
||||
// 5. Skills (progressive disclosure)
|
||||
// Two cost components:
|
||||
// a) Tool definition: SkillTool's description embeds all skill
|
||||
// name+description listings plus instruction text — always in context.
|
||||
// b) Loaded bodies: When the model invokes a skill, the full SKILL.md
|
||||
// body is injected into the conversation as a tool result. We track
|
||||
// which skills have been loaded and attribute their body tokens here
|
||||
// so the "Skills" category accurately reflects the total cost.
|
||||
const skillTool = allTools.find((tool) => tool.name === ToolNames.SKILL);
|
||||
const skillToolDefinitionTokens = skillTool
|
||||
? estimateTokens(JSON.stringify(skillTool.schema))
|
||||
: 0;
|
||||
|
||||
// Determine which skills have been loaded in this session
|
||||
const loadedSkillNames: ReadonlySet<string> =
|
||||
skillTool instanceof SkillTool
|
||||
? skillTool.getLoadedSkillNames()
|
||||
: new Set();
|
||||
|
||||
// Per-skill breakdown: listing cost + body cost for loaded skills
|
||||
const skillManager = config.getSkillManager();
|
||||
const skillConfigs = skillManager ? await skillManager.listSkills() : [];
|
||||
let loadedBodiesTokens = 0;
|
||||
const skills: ContextSkillDetail[] = skillConfigs.map((skill) => {
|
||||
const listingTokens = estimateTokens(
|
||||
`<skill>\n<name>\n${skill.name}\n</name>\n<description>\n${skill.description} (${skill.level})\n</description>\n<location>\n${skill.level}\n</location>\n</skill>`,
|
||||
);
|
||||
const isLoaded = loadedSkillNames.has(skill.name);
|
||||
let bodyTokens: number | undefined;
|
||||
if (isLoaded && skill.body) {
|
||||
const baseDir = skill.filePath
|
||||
? skill.filePath.replace(/\/[^/]+$/, '')
|
||||
: '';
|
||||
bodyTokens = estimateTokens(buildSkillLlmContent(baseDir, skill.body));
|
||||
loadedBodiesTokens += bodyTokens;
|
||||
if (executionMode === 'interactive') {
|
||||
context.ui.addItem(
|
||||
{
|
||||
type: MessageType.ERROR,
|
||||
text: t('Config not loaded.'),
|
||||
},
|
||||
Date.now(),
|
||||
);
|
||||
return;
|
||||
}
|
||||
return {
|
||||
name: skill.name,
|
||||
tokens: listingTokens,
|
||||
loaded: isLoaded,
|
||||
bodyTokens,
|
||||
type: 'message',
|
||||
messageType: 'error',
|
||||
content: t('Config not loaded.'),
|
||||
};
|
||||
});
|
||||
|
||||
// Total skills cost = tool definition + loaded bodies
|
||||
const skillsTokens = skillToolDefinitionTokens + loadedBodiesTokens;
|
||||
|
||||
// 6. Autocompact buffer
|
||||
const compressionThreshold =
|
||||
config.getChatCompression()?.contextPercentageThreshold ??
|
||||
DEFAULT_COMPRESSION_THRESHOLD;
|
||||
const autocompactBuffer =
|
||||
compressionThreshold > 0
|
||||
? Math.round((1 - compressionThreshold) * contextWindowSize)
|
||||
: 0;
|
||||
|
||||
// 7. Calculate raw overhead
|
||||
// allToolsTokens includes the skill tool definition; loadedBodiesTokens
|
||||
// covers the on-demand skill bodies now attributed to Skills.
|
||||
const rawOverhead =
|
||||
systemPromptTokens +
|
||||
allToolsTokens +
|
||||
memoryFilesTokens +
|
||||
loadedBodiesTokens;
|
||||
|
||||
// 8. Determine total tokens and build breakdown
|
||||
const isEstimated = apiTotalTokens === 0;
|
||||
|
||||
// Sum of MCP tool tokens for category-level display
|
||||
const mcpToolsTotalTokens = mcpTools.reduce(
|
||||
(sum, tool) => sum + tool.tokens,
|
||||
0,
|
||||
);
|
||||
|
||||
let totalTokens: number;
|
||||
let displaySystemPrompt: number;
|
||||
let displayBuiltinTools: number;
|
||||
let displayMcpTools: number;
|
||||
let displayMemoryFiles: number;
|
||||
let displaySkills: number;
|
||||
let messagesTokens: number;
|
||||
let freeSpace: number;
|
||||
let detailBuiltinTools: ContextToolDetail[];
|
||||
let detailMcpTools: ContextToolDetail[];
|
||||
let detailMemoryFiles: ContextMemoryDetail[];
|
||||
let detailSkills: ContextSkillDetail[];
|
||||
|
||||
if (isEstimated) {
|
||||
// No API data yet: show raw overhead estimates only.
|
||||
// Use 0 as totalTokens so the progress bar stays empty —
|
||||
// avoids showing an inflated estimate that would "decrease"
|
||||
// once real API data arrives.
|
||||
totalTokens = 0;
|
||||
displaySystemPrompt = systemPromptTokens;
|
||||
// Skills = tool definition + loaded bodies
|
||||
displaySkills = skillsTokens;
|
||||
// builtinTools = allTools minus skills-definition minus mcpTools
|
||||
displayBuiltinTools = Math.max(
|
||||
0,
|
||||
allToolsTokens - skillToolDefinitionTokens - mcpToolsTotalTokens,
|
||||
);
|
||||
displayMcpTools = mcpToolsTotalTokens;
|
||||
displayMemoryFiles = memoryFilesTokens;
|
||||
messagesTokens = 0;
|
||||
// Free space accounts for the estimated overhead
|
||||
freeSpace = Math.max(
|
||||
0,
|
||||
contextWindowSize - rawOverhead - autocompactBuffer,
|
||||
);
|
||||
detailBuiltinTools = builtinTools;
|
||||
detailMcpTools = mcpTools;
|
||||
detailMemoryFiles = memoryFiles;
|
||||
detailSkills = skills;
|
||||
} else {
|
||||
// API data available: use actual total with proportional scaling
|
||||
totalTokens = apiTotalTokens;
|
||||
|
||||
// When estimates overshoot API total, scale down proportionally
|
||||
// so the breakdown categories add up to totalTokens.
|
||||
const overheadScale =
|
||||
rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
|
||||
|
||||
displaySystemPrompt = Math.round(systemPromptTokens * overheadScale);
|
||||
const scaledAllTools = Math.round(allToolsTokens * overheadScale);
|
||||
displayMemoryFiles = Math.round(memoryFilesTokens * overheadScale);
|
||||
// Skills = tool definition + loaded bodies (scaled together)
|
||||
displaySkills = Math.round(skillsTokens * overheadScale);
|
||||
const scaledMcpTotal = Math.round(mcpToolsTotalTokens * overheadScale);
|
||||
displayMcpTools = scaledMcpTotal;
|
||||
// builtinTools = allTools minus skill-definition minus mcpTools
|
||||
const scaledSkillDefinition = Math.round(
|
||||
skillToolDefinitionTokens * overheadScale,
|
||||
);
|
||||
displayBuiltinTools = Math.max(
|
||||
0,
|
||||
scaledAllTools - scaledSkillDefinition - scaledMcpTotal,
|
||||
);
|
||||
|
||||
const scaledOverhead =
|
||||
displaySystemPrompt +
|
||||
scaledAllTools +
|
||||
displayMemoryFiles +
|
||||
Math.round(loadedBodiesTokens * overheadScale);
|
||||
|
||||
// When the API reports cached content tokens (e.g. DashScope prefix caching),
|
||||
// use them as the actual overhead indicator for a more accurate messages count.
|
||||
       // cachedTokens ≈ system prompt + tools tokens actually served from cache.
       // This avoids the "messages = 0" problem caused by estimation overshoot.
       if (apiCachedTokens > 0) {
         messagesTokens = Math.max(0, totalTokens - apiCachedTokens);
       } else {
         messagesTokens = Math.max(0, totalTokens - scaledOverhead);
       }

       freeSpace = Math.max(
         0,
         contextWindowSize - totalTokens - autocompactBuffer,
       );

       // Scale detail items to match their parent categories
       const scaleDetail = <T extends { tokens: number }>(items: T[]): T[] =>
         overheadScale < 1
           ? items.map((item) => ({
               ...item,
               tokens: Math.round(item.tokens * overheadScale),
             }))
           : items;

       detailBuiltinTools = scaleDetail(builtinTools);
       detailMcpTools = scaleDetail(mcpTools);
       detailMemoryFiles = scaleDetail(memoryFiles);
       detailSkills =
         overheadScale < 1
           ? skills.map((item) => ({
               ...item,
               tokens: Math.round(item.tokens * overheadScale),
               bodyTokens: item.bodyTokens
                 ? Math.round(item.bodyTokens * overheadScale)
                 : undefined,
             }))
           : skills;
     }

     const breakdown: ContextCategoryBreakdown = {
       systemPrompt: displaySystemPrompt,
       builtinTools: displayBuiltinTools,
       mcpTools: displayMcpTools,
       memoryFiles: displayMemoryFiles,
       skills: displaySkills,
       messages: messagesTokens,
       freeSpace,
       autocompactBuffer,
     };

-    const contextUsageItem: HistoryItemContextUsage = {
-      type: MessageType.CONTEXT_USAGE,
-      modelName,
-      totalTokens,
-      contextWindowSize,
-      breakdown,
-      builtinTools: detailBuiltinTools,
-      mcpTools: detailMcpTools,
-      memoryFiles: detailMemoryFiles,
-      skills: detailSkills,
-      isEstimated,
-      showDetails,
-    };
-
-    context.ui.addItem(contextUsageItem, Date.now());
+    const contextUsageItem = await collectContextData(config, showDetails);
+
+    if (executionMode === 'interactive') {
+      context.ui.addItem(contextUsageItem, Date.now());
+      return;
+    } else {
+      return {
+        type: 'message',
+        messageType: 'info',
+        content: JSON.stringify(contextUsageItem, null, 2),
+      };
+    }
   },
   subCommands: [
     {
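The non-interactive branch above returns the usage item serialized with `JSON.stringify(contextUsageItem, null, 2)`, so scripted callers can recover the structured data with `JSON.parse`. A minimal round-trip sketch; the sample field values below are illustrative, not real CLI output:

```typescript
// Simulate what the non-interactive /context path emits: the usage item
// serialized as pretty-printed JSON. Sample values are made up for illustration.
const stdout = JSON.stringify(
  {
    type: 'context_usage',
    modelName: 'test-model',
    totalTokens: 50000,
    contextWindowSize: 200000,
    showDetails: false,
  },
  null,
  2,
);

// A consumer parses the output back into a structured object.
const usage = JSON.parse(stdout) as Record<string, unknown>;
console.log(usage['modelName'], usage['totalTokens']); // test-model 50000
```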
@@ -372,6 +372,11 @@ export interface CLIControlSupportedCommandsRequest {
   subtype: 'supported_commands';
 }

+export interface CLIControlGetContextUsageRequest {
+  subtype: 'get_context_usage';
+  show_details?: boolean;
+}
+
 export type ControlRequestPayload =
   | CLIControlInterruptRequest
   | CLIControlPermissionRequest
@@ -381,7 +386,8 @@ export type ControlRequestPayload =
   | CLIControlMcpMessageRequest
   | CLIControlSetModelRequest
   | CLIControlMcpStatusRequest
-  | CLIControlSupportedCommandsRequest;
+  | CLIControlSupportedCommandsRequest
+  | CLIControlGetContextUsageRequest;

 export interface CLIControlRequest {
   type: 'control_request';
@@ -574,6 +580,7 @@ export enum ControlRequestType {
   INTERRUPT = 'interrupt',
   SET_MODEL = 'set_model',
   SUPPORTED_COMMANDS = 'supported_commands',
   GET_CONTEXT_USAGE = 'get_context_usage',

   // PermissionController requests
   CAN_USE_TOOL = 'can_use_tool',
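On the wire, the new request travels inside the `CLIControlRequest` envelope these types describe. A sketch of how such a payload might be assembled; the `request_id` value and the local interface names are assumptions for illustration, only the `type`/`request_id`/`request` field names and the `subtype`/`show_details` payload come from this diff:

```typescript
// Local mirrors of the protocol shapes in this diff (names are illustrative).
interface GetContextUsagePayload {
  subtype: 'get_context_usage';
  show_details?: boolean;
}

interface ControlRequestEnvelope {
  type: 'control_request';
  request_id: string;
  request: GetContextUsagePayload;
}

// Build a get_context_usage control request with a hypothetical id scheme.
const req: ControlRequestEnvelope = {
  type: 'control_request',
  request_id: 'req-1', // hypothetical; the real CLI generates its own ids
  request: { subtype: 'get_context_usage', show_details: true },
};

// Serialize for transport; the discriminant lets the receiver route it.
const wire = JSON.stringify(req);
console.log(wire.includes('"subtype":"get_context_usage"')); // true
```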
@@ -907,6 +907,21 @@ export class Query implements AsyncIterable<SDKMessage> {
     await this.sendControlRequest(ControlRequestType.SET_MODEL, { model });
   }

+  /**
+   * Get context usage breakdown from the CLI
+   *
+   * @param showDetails Display hint for per-item breakdowns (data is always complete)
+   * @returns Promise resolving to context usage data
+   * @throws Error if query is closed
+   */
+  async getContextUsage(
+    showDetails: boolean = false,
+  ): Promise<Record<string, unknown> | null> {
+    return this.sendControlRequest(ControlRequestType.GET_CONTEXT_USAGE, {
+      show_details: showDetails,
+    });
+  }
+
   /**
    * Get list of control commands supported by the CLI
    *
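A consumer of the SDK's `getContextUsage()` could turn the returned breakdown into headline figures. A hypothetical helper, assuming the payload shape exercised by this PR's tests (`totalTokens`, `contextWindowSize`, `breakdown.freeSpace`); the helper itself is not part of the SDK:

```typescript
// Minimal mirror of the fields this sketch reads from a context-usage result.
interface ContextUsage {
  totalTokens: number;
  contextWindowSize: number;
  breakdown: { freeSpace: number; autocompactBuffer: number };
}

// Derive percentage figures from the raw token counts.
function summarize(usage: ContextUsage): { usedPct: number; freePct: number } {
  return {
    usedPct: Math.round((usage.totalTokens / usage.contextWindowSize) * 100),
    freePct: Math.round((usage.breakdown.freeSpace / usage.contextWindowSize) * 100),
  };
}

// Sample values taken from the test fixture in this PR.
const sample: ContextUsage = {
  totalTokens: 50000,
  contextWindowSize: 200000,
  breakdown: { freeSpace: 145000, autocompactBuffer: 10000 },
};

console.log(summarize(sample)); // { usedPct: 25, freePct: 73 }
```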
@@ -383,6 +383,11 @@ export interface CLIControlSupportedCommandsRequest {
   subtype: 'supported_commands';
 }

+export interface CLIControlGetContextUsageRequest {
+  subtype: 'get_context_usage';
+  show_details?: boolean;
+}
+
 export type ControlRequestPayload =
   | CLIControlInterruptRequest
   | CLIControlPermissionRequest
@@ -392,7 +397,8 @@ export type ControlRequestPayload =
   | CLIControlMcpMessageRequest
   | CLIControlSetModelRequest
   | CLIControlMcpStatusRequest
-  | CLIControlSupportedCommandsRequest;
+  | CLIControlSupportedCommandsRequest
+  | CLIControlGetContextUsageRequest;

 export interface CLIControlRequest {
   type: 'control_request';
@@ -585,6 +591,7 @@ export enum ControlRequestType {
   INTERRUPT = 'interrupt',
   SET_MODEL = 'set_model',
   SUPPORTED_COMMANDS = 'supported_commands',
   GET_CONTEXT_USAGE = 'get_context_usage',

   // PermissionController requests
   CAN_USE_TOOL = 'can_use_tool',
@@ -1184,6 +1184,68 @@ describe('Query', () => {
     await query.close();
   });

+  it('should provide getContextUsage() method', async () => {
+    const query = new Query(transport, { cwd: '/test' });
+
+    await respondToInitialize(transport, query);
+
+    const usagePromise = query.getContextUsage(true);
+
+    await vi.waitFor(() => {
+      const messages = transport.getAllWrittenMessages();
+      const usageMsg = findControlRequest(
+        messages,
+        ControlRequestType.GET_CONTEXT_USAGE,
+      );
+      expect(usageMsg).toBeDefined();
+    });
+
+    // Respond with context usage data
+    const messages = transport.getAllWrittenMessages();
+    const usageMsg = findControlRequest(
+      messages,
+      ControlRequestType.GET_CONTEXT_USAGE,
+    )!;
+
+    expect((usageMsg.request as Record<string, unknown>).show_details).toBe(
+      true,
+    );
+
+    transport.simulateMessage(
+      createControlResponse(usageMsg.request_id, true, {
+        subtype: 'get_context_usage',
+        modelName: 'test-model',
+        totalTokens: 50000,
+        contextWindowSize: 200000,
+        breakdown: {
+          systemPrompt: 5000,
+          builtinTools: 10000,
+          mcpTools: 0,
+          memoryFiles: 2000,
+          skills: 3000,
+          messages: 25000,
+          freeSpace: 145000,
+          autocompactBuffer: 10000,
+        },
+        builtinTools: [{ name: 'Read', tokens: 500 }],
+        mcpTools: [],
+        memoryFiles: [],
+        skills: [],
+        showDetails: true,
+      }),
+    );
+
+    const result = await usagePromise;
+    expect(result).toMatchObject({
+      modelName: 'test-model',
+      totalTokens: 50000,
+      contextWindowSize: 200000,
+      showDetails: true,
+    });
+
+    await query.close();
+  });
+
   it('should throw if methods called on closed query', async () => {
     const query = new Query(transport, { cwd: '/test' });
     await respondToInitialize(transport, query);
@@ -1198,6 +1260,7 @@ describe('Query', () => {
       'Query is closed',
     );
     await expect(query.mcpServerStatus()).rejects.toThrow('Query is closed');
+    await expect(query.getContextUsage()).rejects.toThrow('Query is closed');
   });
 });