mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-05 23:42:03 +00:00
feat: add /context command to display context window token usage breakdown
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
This commit is contained in:
parent
aefea076b0
commit
77fd945474
7 changed files with 1026 additions and 1 deletions
293
packages/cli/src/ui/commands/CONTEXT_COMMAND.md
Normal file
293
packages/cli/src/ui/commands/CONTEXT_COMMAND.md
Normal file
|
|
@ -0,0 +1,293 @@
|
|||
# `/context` 命令 — 上下文窗口用量分解
|
||||
|
||||
## 概述
|
||||
|
||||
`/context` 命令展示当前模型上下文窗口的 token 使用情况。它将整个上下文窗口拆分为多个分类,帮助用户理解 token 花在了哪里,以及还剩多少空间。
|
||||
|
||||
## 上下文窗口的组成
|
||||
|
||||
一次 API 请求发送给模型的完整 prompt 包含以下部分:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────┐
|
||||
│ Context Window (总容量) │
|
||||
│ │
|
||||
│ ┌─────────────────────────────────────┐ │
|
||||
│ │ System Prompt (系统提示词) │ │
|
||||
│ │ └─ 核心指令 + 行为规则 │ │
|
||||
│ ├─────────────────────────────────────┤ │
|
||||
│ │ Tool Declarations (工具声明) │ │
|
||||
│ │ ├─ Built-in tools (内置工具) │ │
|
||||
│ │ ├─ MCP tools (MCP 工具) │ │
|
||||
│ │ └─ SkillTool (技能工具) ◄──────────┼─── 包含所有 skill 的名称+描述
|
||||
│ ├─────────────────────────────────────┤ │
|
||||
│ │ Memory (用户记忆) │ │
|
||||
│ │ └─ QWEN.md + extension configs │ │
|
||||
│ ├─────────────────────────────────────┤ │
|
||||
│ │ Messages (对话消息) │ │
|
||||
│ │ ├─ 用户消息 │ │
|
||||
│ │ ├─ 模型回复 │ │
|
||||
│ │ └─ 工具调用 & 工具结果 ◄───────────┼─── skill body 在此加载
|
||||
│ ├─────────────────────────────────────┤ │
|
||||
│ │ Free Space (可用空间) │ │
|
||||
│ ├─────────────────────────────────────┤ │
|
||||
│ │ Autocompact Buffer (自动压缩缓冲) │ │
|
||||
│ └─────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**不变量**:所有分类之和 = Context Window 总容量。
|
||||
|
||||
## 各分类详解
|
||||
|
||||
### 1. System Prompt(系统提示词)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ------------------------------------------------------------------ |
|
||||
| **数据来源** | `getCoreSystemPrompt(undefined, modelName)` |
|
||||
| **包含内容** | 模型的核心行为指令、输出格式要求、安全规则等 |
|
||||
| **不包含** | Memory 内容(单独计算) |
|
||||
| **计算方式** | 对系统提示词文本调用 `estimateTokens()` |
|
||||
| **变化频率** | 基本固定,除非修改了 `QWEN_SYSTEM_MD` 环境变量或 `.qwen/system.md` |
|
||||
|
||||
> **注意**:`getCoreSystemPrompt` 接受 `userMemory` 参数,这里传入 `undefined` 以排除 memory,因为 memory 作为独立分类统计。
|
||||
|
||||
### 2. Built-in Tools(内置工具)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ----------------------------------------------------------------------------------------------------- |
|
||||
| **数据来源** | `toolRegistry.getAllTools()` 中非 MCP、非 SkillTool 的工具 |
|
||||
| **包含内容** | `read_file`、`edit`、`run_shell_command`、`grep_search`、`glob`、`list_directory` 等核心工具的 schema |
|
||||
| **计算方式** | `allToolsTokens - skillsTokens - mcpToolsTotalTokens` |
|
||||
| **详情列表** | 逐项展示每个内置工具的名称和 token 占用,按 token 数降序排列 |
|
||||
|
||||
> **SkillTool** 虽然也是内置工具,但因其内容动态性(嵌入所有 skill 列表),独立作为 **Skills** 分类展示,不在 Built-in tools 中出现。
|
||||
|
||||
### 2b. MCP Tools(MCP 工具)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ----------------------------------------------------------------------- |
|
||||
| **数据来源** | `toolRegistry.getAllTools()` 中 `DiscoveredMCPTool` 实例 |
|
||||
| **包含内容** | 通过 MCP 协议连接的外部工具服务器提供的工具 schema |
|
||||
| **计算方式** | 各 MCP 工具 `estimateTokens(JSON.stringify(tool.schema))` 之和 |
|
||||
| **详情列表** | 逐项展示每个 MCP 工具的名称(`serverName__toolName` 格式)和 token 占用 |
|
||||
| **条件显示** | 仅当存在 MCP 工具时才显示此分类行和详情 |
|
||||
|
||||
### 3. Skills(技能)⭐ 渐进式披露
|
||||
|
||||
Skills 采用**两阶段加载**设计:
|
||||
|
||||
| 阶段 | 加载内容 | Token 归属 | 何时加载 |
|
||||
| ------------ | ---------------------------------------------- | ----------------- | ------------------------------- |
|
||||
| **第一阶段** | 每个 skill 的 name + 短 description + 使用说明 | **Skills 分类** | 每次 API 请求都发送 |
|
||||
| **第二阶段** | 完整的 SKILL.md body 内容(详细指令、模板等) | **Messages 分类** | 模型调用 `skill` 工具后按需注入 |
|
||||
|
||||
**`/context` 中 Skills 分类展示的是第一阶段的常驻开销。**
|
||||
|
||||
#### 第一阶段的实现细节
|
||||
|
||||
SkillTool 在初始化时将所有 skill 信息嵌入其 `description` 字段:
|
||||
|
||||
```
|
||||
Execute a skill within the main conversation
|
||||
|
||||
<skills_instructions>
|
||||
... 使用说明(~600 字符)...
|
||||
</skills_instructions>
|
||||
|
||||
<available_skills>
|
||||
<skill>
|
||||
<name>pdf</name>
|
||||
<description>Convert PDF files to text (project)</description>
|
||||
<location>project</location>
|
||||
</skill>
|
||||
<skill>
|
||||
<name>xlsx</name>
|
||||
<description>Process Excel spreadsheets (user)</description>
|
||||
<location>user</location>
|
||||
</skill>
|
||||
...更多 skills...
|
||||
</available_skills>
|
||||
```
|
||||
|
||||
这整块文本是 SkillTool 的 tool declaration 的一部分,每次 API 请求都会发送。
|
||||
|
||||
#### Token 计算方式
|
||||
|
||||
```
|
||||
skillsTokens = estimateTokens(JSON.stringify(skillTool.schema))
|
||||
```
|
||||
|
||||
直接从 ToolRegistry 中获取 SkillTool 的完整 schema 进行估算,确保包含:
|
||||
|
||||
- 使用说明文本(`<skills_instructions>`)
|
||||
- 所有 skill 的 XML 列表(`<available_skills>`)
|
||||
- schema 参数定义
|
||||
|
||||
#### 第二阶段(按需加载)
|
||||
|
||||
当模型调用 `skill` 工具时,`SkillToolInvocation.execute()` 会加载完整的 SKILL.md:
|
||||
|
||||
```typescript
|
||||
const skill = await this.skillManager.loadSkillForRuntime(this.params.skill);
|
||||
const llmContent = `Base directory: ${baseDir}\n\n${skill.body}\n`;
|
||||
```
|
||||
|
||||
这个 body 内容作为工具调用结果注入到对话中,token 开销归入 **Messages** 分类。
|
||||
|
||||
#### Skills 详情列表
|
||||
|
||||
每个 skill 的详情行展示该 skill 在第一阶段中的大致占用,按 token 数降序排列。注意:
|
||||
|
||||
- 各 skill 详情的 token 之和 **< Skills 分类总数**,差值是 skills_instructions 指令文本的开销
|
||||
- 详情仅展示名称和描述的 token,不包含 schema 参数定义部分
|
||||
|
||||
### 4. Memory Files(用户记忆)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | -------------------------------------------------------------------------- |
|
||||
| **数据来源** | `config.getUserMemory()` |
|
||||
| **包含内容** | `QWEN.md`、extension 配置、`output-language` 等用户级配置文件 |
|
||||
| **加载位置** | 拼接到 System Prompt 末尾(通过 `getCoreSystemPrompt(userMemory, model)`) |
|
||||
| **计算方式** | 解析 memory 文本中的 `--- Context from: <path> ---` 标记,分文件估算 token |
|
||||
|
||||
**Memory 内容格式**:
|
||||
|
||||
```
|
||||
--- Context from: ~/.qwen/QWEN.md ---
|
||||
用户自定义规则和偏好...
|
||||
--- End of Context from: ~/.qwen/QWEN.md ---
|
||||
--- Context from: ~/.qwen/extensions/config.md ---
|
||||
扩展配置内容...
|
||||
--- End of Context from: ~/.qwen/extensions/config.md ---
|
||||
```
|
||||
|
||||
> **为什么 System Prompt 不包含 Memory?** 计算 System Prompt token 时传入 `userMemory = undefined`,Memory 作为独立分类展示,避免两个分类重叠。实际 API 请求中 memory 是拼接在 system prompt 末尾的。
|
||||
|
||||
### 5. Messages(对话消息)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ---------------------------------------------------------------- |
|
||||
| **数据来源** | 反推:`totalTokens - systemPrompt - allTools - memory` |
|
||||
| **包含内容** | 所有用户消息、模型回复、工具调用参数、工具返回结果 |
|
||||
| **特别包含** | skill body(第二阶段按需加载的内容)、文件读取结果、shell 输出等 |
|
||||
| **计算方式** | `max(0, apiTotalTokens - estimatedOverhead)` |
|
||||
|
||||
> **注意**:Messages 是通过 API 返回的 `totalTokens` 减去其他分类的估算值得出的,因此它吸收了估算误差。如果 overhead 被高估,Messages 会被相应低估。
|
||||
|
||||
### 6. Free Space(可用空间)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ----------------------------------------------------- |
|
||||
| **计算方式** | `contextWindowSize - totalTokens - autocompactBuffer` |
|
||||
| **含义** | 在触发自动压缩之前,还能容纳多少 token 的对话内容 |
|
||||
|
||||
### 7. Autocompact Buffer(自动压缩缓冲区)
|
||||
|
||||
| 属性 | 说明 |
|
||||
| ------------ | ----------------------------------------------------------------- |
|
||||
| **计算方式** | `(1 - compressionThreshold) × contextWindowSize` |
|
||||
| **默认值** | `(1 - 0.7) × 131072 = 39322`(约 30% 的上下文窗口) |
|
||||
| **含义** | 当 token 用量达到 70% 时触发自动压缩,这 30% 的空间作为缓冲区预留 |
|
||||
|
||||
## 两种展示模式
|
||||
|
||||
### 模式 A:无 API 数据(首次使用,尚未发送消息)
|
||||
|
||||
```
|
||||
Context Usage
|
||||
|
||||
No API response yet. Send a message to see actual usage.
|
||||
|
||||
Estimated pre-conversation overhead
|
||||
Model: glm-5 Context window: 131.1k tokens
|
||||
|
||||
█ System prompt 4.8k tokens (3.7%)
|
||||
█ System tools 5.2k tokens (4.0%)
|
||||
█ Memory files 845 tokens (0.6%)
|
||||
█ Skills 5.1k tokens (3.9%)
|
||||
░ Free space 75.8k tokens (57.8%)
|
||||
░ Autocompact buffer 39.3k tokens (30.0%)
|
||||
```
|
||||
|
||||
- **不显示进度条和 total 数字**:避免估算值与后续 API 实际值产生不合理的对比
|
||||
- **不显示 Messages 行**:尚无对话
|
||||
- 各分类基于本地启发式估算(`estimateTokens`),可能与实际 API tokenizer 有 ~10% 偏差
|
||||
|
||||
### 模式 B:有 API 数据(已进行对话)
|
||||
|
||||
```
|
||||
Context Usage
|
||||
|
||||
██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ glm-5
|
||||
25.3k/131.1k tokens (19.3%)
|
||||
|
||||
Usage by category
|
||||
█ System prompt 4.5k tokens (3.4%)
|
||||
█ System tools 4.9k tokens (3.7%)
|
||||
█ Memory files 790 tokens (0.6%)
|
||||
█ Skills 4.8k tokens (3.7%)
|
||||
█ Messages 10.3k tokens (7.9%)
|
||||
░ Free space 66.5k tokens (50.7%)
|
||||
░ Autocompact buffer 39.3k tokens (30.0%)
|
||||
```
|
||||
|
||||
- **`totalTokens` 来自 API 响应**(`usageMetadata.promptTokenCount`),是最准确的值
|
||||
- **当本地估算 > API total 时**:按比例缩放各 overhead 分类,确保分类之和 = totalTokens
|
||||
- **Messages** = `totalTokens - scaledOverhead`,包含所有对话内容 + 按需加载的 skill body
|
||||
|
||||
## Token 估算方法
|
||||
|
||||
由于无法直接访问模型的 tokenizer,使用基于字符的启发式估算:
|
||||
|
||||
```
|
||||
tokens ≈ ⌈asciiChars / 4 + nonAsciiChars × 1.5⌉
|
||||
```
|
||||
|
||||
| 字符类型 | 比例 | 依据 |
|
||||
| --------------------------------- | --------------- | -------------------------------- |
|
||||
| ASCII(英文、JSON 结构字符等) | ~4 字符/token | BPE tokenizer 对英文的平均压缩率 |
|
||||
| 非 ASCII(中文、日文等 CJK 字符) | ~1.5 token/字符 | CJK 字符通常映射为 1-2 个 token |
|
||||
|
||||
**已知局限**:
|
||||
|
||||
- 不同模型的 tokenizer 有差异,估算可能偏差 ±10-20%
|
||||
- JSON 结构字符(`{`, `"`, `:` 等)的实际 token 化比率与自然语言不同
|
||||
- 当估算偏高时,通过 `overheadScale` 按比例缩放校正
|
||||
|
||||
## 数据流图
|
||||
|
||||
```
|
||||
┌──────────────────┐
|
||||
│ API Response │
|
||||
│ promptTokenCount │ ─── totalTokens (ground truth)
|
||||
└──────────────────┘
|
||||
│
|
||||
┌──────────────────────────┼──────────────────────────┐
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
estimateTokens() estimateTokens() estimateTokens()
|
||||
│ │ │
|
||||
▼ ▼ ▼
|
||||
systemPromptTokens allToolsTokens memoryFilesTokens
|
||||
│
|
||||
┌─────┴──────┐
|
||||
│ │
|
||||
▼ ▼
|
||||
systemToolsTokens skillsTokens
|
||||
(allTools - skills) (from SkillTool schema)
|
||||
│ │
|
||||
└─────┬──────┘
|
||||
│
|
||||
▼
|
||||
rawOverhead = systemPrompt + allTools + memory
|
||||
│
|
||||
┌───────────┼───────────┐
|
||||
│ overheadScale │ (= min(1, totalTokens/rawOverhead))
|
||||
▼ ▼
|
||||
scaled categories messages = totalTokens - scaledOverhead
|
||||
│ │
|
||||
└───────────┬───────────┘
|
||||
▼
|
||||
breakdown output
|
||||
```
|
||||
310
packages/cli/src/ui/commands/contextCommand.ts
Normal file
310
packages/cli/src/ui/commands/contextCommand.ts
Normal file
|
|
@ -0,0 +1,310 @@
|
|||
/**
|
||||
* @license
|
||||
* Copyright 2025 Qwen
|
||||
* SPDX-License-Identifier: Apache-2.0
|
||||
*/
|
||||
|
||||
import {
|
||||
type CommandContext,
|
||||
type SlashCommand,
|
||||
CommandKind,
|
||||
} from './types.js';
|
||||
import {
|
||||
MessageType,
|
||||
type HistoryItemContextUsage,
|
||||
type ContextCategoryBreakdown,
|
||||
type ContextToolDetail,
|
||||
type ContextMemoryDetail,
|
||||
type ContextSkillDetail,
|
||||
} from '../types.js';
|
||||
import {
|
||||
DiscoveredMCPTool,
|
||||
uiTelemetryService,
|
||||
getCoreSystemPrompt,
|
||||
DEFAULT_TOKEN_LIMIT,
|
||||
ToolNames,
|
||||
} from '@qwen-code/qwen-code-core';
|
||||
import { t } from '../../i18n/index.js';
|
||||
|
||||
/**
|
||||
* Default compression token threshold (triggers compression at 70% usage).
|
||||
* The autocompact buffer is (1 - threshold) * contextWindowSize.
|
||||
*/
|
||||
const DEFAULT_COMPRESSION_THRESHOLD = 0.7;
|
||||
|
||||
/**
|
||||
* Estimate token count for a string using a character-based heuristic.
|
||||
* ASCII chars ≈ 4 chars/token, CJK/non-ASCII chars ≈ 1.5 tokens/char.
|
||||
*/
|
||||
function estimateTokens(text: string): number {
|
||||
if (!text || text.length === 0) return 0;
|
||||
let asciiChars = 0;
|
||||
let nonAsciiChars = 0;
|
||||
for (let i = 0; i < text.length; i++) {
|
||||
const charCode = text.charCodeAt(i);
|
||||
if (charCode < 128) {
|
||||
asciiChars++;
|
||||
} else {
|
||||
nonAsciiChars++;
|
||||
}
|
||||
}
|
||||
// CJK and other non-ASCII characters typically produce 1.5-2 tokens each
|
||||
return Math.ceil(asciiChars / 4 + nonAsciiChars * 1.5);
|
||||
}
|
||||
|
||||
/**
|
||||
* Parse concatenated memory content into individual file entries.
|
||||
* Memory content format: "--- Context from: <path> ---\n<content>\n--- End of Context from: <path> ---"
|
||||
*/
|
||||
function parseMemoryFiles(memoryContent: string): ContextMemoryDetail[] {
|
||||
if (!memoryContent || memoryContent.trim().length === 0) return [];
|
||||
|
||||
const results: ContextMemoryDetail[] = [];
|
||||
// Use backreference (\1) to ensure start/end path markers match
|
||||
const regex =
|
||||
/--- Context from: (.+?) ---\n([\s\S]*?)--- End of Context from: \1 ---/g;
|
||||
let match: RegExpExecArray | null;
|
||||
|
||||
while ((match = regex.exec(memoryContent)) !== null) {
|
||||
const filePath = match[1]!;
|
||||
const content = match[2]!;
|
||||
results.push({
|
||||
path: filePath,
|
||||
tokens: estimateTokens(content),
|
||||
});
|
||||
}
|
||||
|
||||
// If no structured markers found, treat as a single memory block
|
||||
if (results.length === 0 && memoryContent.trim().length > 0) {
|
||||
results.push({
|
||||
path: t('memory'),
|
||||
tokens: estimateTokens(memoryContent),
|
||||
});
|
||||
}
|
||||
|
||||
return results;
|
||||
}
|
||||
|
||||
export const contextCommand: SlashCommand = {
|
||||
name: 'context',
|
||||
get description() {
|
||||
return t('Show context window usage breakdown.');
|
||||
},
|
||||
kind: CommandKind.BUILT_IN,
|
||||
action: async (context: CommandContext) => {
|
||||
const { config } = context.services;
|
||||
if (!config) {
|
||||
context.ui.addItem(
|
||||
{
|
||||
type: MessageType.ERROR,
|
||||
text: t('Config not loaded.'),
|
||||
},
|
||||
Date.now(),
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
// --- Gather data ---
|
||||
|
||||
const modelName = config.getModel() || 'unknown';
|
||||
const contentGeneratorConfig = config.getContentGeneratorConfig();
|
||||
const contextWindowSize =
|
||||
contentGeneratorConfig.contextWindowSize ?? DEFAULT_TOKEN_LIMIT;
|
||||
|
||||
// Total prompt token count from API (most accurate)
|
||||
const apiTotalTokens = uiTelemetryService.getLastPromptTokenCount();
|
||||
|
||||
// 1. System prompt tokens (without memory, as memory is counted separately)
|
||||
const systemPromptText = getCoreSystemPrompt(undefined, modelName);
|
||||
const systemPromptTokens = estimateTokens(systemPromptText);
|
||||
|
||||
// 2. Tool declarations tokens (includes ALL tools: built-in, MCP, skill tool)
|
||||
const toolRegistry = config.getToolRegistry();
|
||||
const allTools = toolRegistry ? toolRegistry.getAllTools() : [];
|
||||
const toolDeclarations = toolRegistry
|
||||
? toolRegistry.getFunctionDeclarations()
|
||||
: [];
|
||||
const toolsJsonStr = JSON.stringify(toolDeclarations);
|
||||
const allToolsTokens = estimateTokens(toolsJsonStr);
|
||||
|
||||
// 3. Per-tool details (for breakdown display)
|
||||
const builtinTools: ContextToolDetail[] = [];
|
||||
const mcpTools: ContextToolDetail[] = [];
|
||||
for (const tool of allTools) {
|
||||
const toolJsonStr = JSON.stringify(tool.schema);
|
||||
const tokens = estimateTokens(toolJsonStr);
|
||||
if (tool instanceof DiscoveredMCPTool) {
|
||||
mcpTools.push({
|
||||
name: `${tool.serverName}__${tool.serverToolName || tool.name}`,
|
||||
tokens,
|
||||
});
|
||||
} else if (tool.name !== ToolNames.SKILL) {
|
||||
// Built-in tool (exclude SkillTool, which is shown under Skills)
|
||||
builtinTools.push({
|
||||
name: tool.name,
|
||||
tokens,
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// 4. Memory files
|
||||
const memoryContent = config.getUserMemory();
|
||||
const memoryFiles = parseMemoryFiles(memoryContent);
|
||||
const memoryFilesTokens = memoryFiles.reduce((sum, f) => sum + f.tokens, 0);
|
||||
|
||||
// 5. Skills (progressive disclosure)
|
||||
// The SkillTool's description embeds all skill name+description listings
|
||||
// plus ~600 chars of instruction text. This is the "always in context"
|
||||
// cost. The full SKILL.md body is only loaded on-demand when the model
|
||||
// invokes the skill tool (and that cost appears in Messages).
|
||||
//
|
||||
// To get an accurate total, we read the SkillTool's actual schema from
|
||||
// the registry rather than reconstructing from a template.
|
||||
const skillTool = allTools.find((tool) => tool.name === ToolNames.SKILL);
|
||||
const skillToolTotalTokens = skillTool
|
||||
? estimateTokens(JSON.stringify(skillTool.schema))
|
||||
: 0;
|
||||
|
||||
// Per-skill breakdown for detail display (proportional to description length)
|
||||
const skillManager = config.getSkillManager();
|
||||
const skillConfigs = skillManager ? await skillManager.listSkills() : [];
|
||||
const skills: ContextSkillDetail[] = skillConfigs.map((skill) => ({
|
||||
name: skill.name,
|
||||
tokens: estimateTokens(
|
||||
`<skill>\n<name>\n${skill.name}\n</name>\n<description>\n${skill.description} (${skill.level})\n</description>\n<location>\n${skill.level}\n</location>\n</skill>`,
|
||||
),
|
||||
}));
|
||||
// Use the SkillTool's actual schema tokens as the total, not the sum of
|
||||
// individual estimates (which would miss the instruction wrapper text).
|
||||
const skillsTokens = skillToolTotalTokens;
|
||||
|
||||
// 6. Autocompact buffer
|
||||
const compressionThreshold =
|
||||
config.getChatCompression()?.contextPercentageThreshold ??
|
||||
DEFAULT_COMPRESSION_THRESHOLD;
|
||||
const autocompactBuffer =
|
||||
compressionThreshold > 0
|
||||
? Math.round((1 - compressionThreshold) * contextWindowSize)
|
||||
: 0;
|
||||
|
||||
// 7. Calculate raw overhead (allToolsTokens already includes skills)
|
||||
const rawOverhead = systemPromptTokens + allToolsTokens + memoryFilesTokens;
|
||||
|
||||
// 8. Determine total tokens and build breakdown
|
||||
const isEstimated = apiTotalTokens === 0;
|
||||
|
||||
// Sum of MCP tool tokens for category-level display
|
||||
const mcpToolsTotalTokens = mcpTools.reduce(
|
||||
(sum, tool) => sum + tool.tokens,
|
||||
0,
|
||||
);
|
||||
|
||||
let totalTokens: number;
|
||||
let displaySystemPrompt: number;
|
||||
let displayBuiltinTools: number;
|
||||
let displayMcpTools: number;
|
||||
let displayMemoryFiles: number;
|
||||
let displaySkills: number;
|
||||
let messagesTokens: number;
|
||||
let freeSpace: number;
|
||||
let detailBuiltinTools: ContextToolDetail[];
|
||||
let detailMcpTools: ContextToolDetail[];
|
||||
let detailMemoryFiles: ContextMemoryDetail[];
|
||||
let detailSkills: ContextSkillDetail[];
|
||||
|
||||
if (isEstimated) {
|
||||
// No API data yet: show raw overhead estimates only.
|
||||
// Use 0 as totalTokens so the progress bar stays empty —
|
||||
// avoids showing an inflated estimate that would "decrease"
|
||||
// once real API data arrives.
|
||||
totalTokens = 0;
|
||||
displaySystemPrompt = systemPromptTokens;
|
||||
// builtinTools category = allTools - skills - mcpTools
|
||||
displayBuiltinTools = Math.max(
|
||||
0,
|
||||
allToolsTokens - skillsTokens - mcpToolsTotalTokens,
|
||||
);
|
||||
displayMcpTools = mcpToolsTotalTokens;
|
||||
displayMemoryFiles = memoryFilesTokens;
|
||||
displaySkills = skillsTokens;
|
||||
messagesTokens = 0;
|
||||
// Free space accounts for the estimated overhead
|
||||
freeSpace = Math.max(
|
||||
0,
|
||||
contextWindowSize - rawOverhead - autocompactBuffer,
|
||||
);
|
||||
detailBuiltinTools = builtinTools;
|
||||
detailMcpTools = mcpTools;
|
||||
detailMemoryFiles = memoryFiles;
|
||||
detailSkills = skills;
|
||||
} else {
|
||||
// API data available: use actual total with proportional scaling
|
||||
totalTokens = apiTotalTokens;
|
||||
|
||||
// When estimates overshoot API total, scale down proportionally
|
||||
// so the breakdown categories add up to totalTokens.
|
||||
const overheadScale =
|
||||
rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
|
||||
|
||||
displaySystemPrompt = Math.round(systemPromptTokens * overheadScale);
|
||||
const scaledAllTools = Math.round(allToolsTokens * overheadScale);
|
||||
displayMemoryFiles = Math.round(memoryFilesTokens * overheadScale);
|
||||
displaySkills = Math.round(skillsTokens * overheadScale);
|
||||
const scaledMcpTotal = Math.round(mcpToolsTotalTokens * overheadScale);
|
||||
displayMcpTools = scaledMcpTotal;
|
||||
displayBuiltinTools = Math.max(
|
||||
0,
|
||||
scaledAllTools - displaySkills - scaledMcpTotal,
|
||||
);
|
||||
|
||||
const scaledOverhead =
|
||||
displaySystemPrompt + scaledAllTools + displayMemoryFiles;
|
||||
messagesTokens = Math.max(0, totalTokens - scaledOverhead);
|
||||
|
||||
freeSpace = Math.max(
|
||||
0,
|
||||
contextWindowSize - totalTokens - autocompactBuffer,
|
||||
);
|
||||
|
||||
// Scale detail items to match their parent categories
|
||||
const scaleDetail = <T extends { tokens: number }>(items: T[]): T[] =>
|
||||
overheadScale < 1
|
||||
? items.map((item) => ({
|
||||
...item,
|
||||
tokens: Math.round(item.tokens * overheadScale),
|
||||
}))
|
||||
: items;
|
||||
|
||||
detailBuiltinTools = scaleDetail(builtinTools);
|
||||
detailMcpTools = scaleDetail(mcpTools);
|
||||
detailMemoryFiles = scaleDetail(memoryFiles);
|
||||
detailSkills = scaleDetail(skills);
|
||||
}
|
||||
|
||||
const breakdown: ContextCategoryBreakdown = {
|
||||
systemPrompt: displaySystemPrompt,
|
||||
builtinTools: displayBuiltinTools,
|
||||
mcpTools: displayMcpTools,
|
||||
memoryFiles: displayMemoryFiles,
|
||||
skills: displaySkills,
|
||||
messages: messagesTokens,
|
||||
freeSpace,
|
||||
autocompactBuffer,
|
||||
};
|
||||
|
||||
const contextUsageItem: HistoryItemContextUsage = {
|
||||
type: MessageType.CONTEXT_USAGE,
|
||||
modelName,
|
||||
totalTokens,
|
||||
contextWindowSize,
|
||||
breakdown,
|
||||
builtinTools: detailBuiltinTools,
|
||||
mcpTools: detailMcpTools,
|
||||
memoryFiles: detailMemoryFiles,
|
||||
skills: detailSkills,
|
||||
isEstimated,
|
||||
};
|
||||
|
||||
context.ui.addItem(contextUsageItem, Date.now());
|
||||
},
|
||||
};
|
||||
|
|
@ -33,6 +33,7 @@ import { getMCPServerStatus } from '@qwen-code/qwen-code-core';
|
|||
import { SkillsList } from './views/SkillsList.js';
|
||||
import { ToolsList } from './views/ToolsList.js';
|
||||
import { McpStatus } from './views/McpStatus.js';
|
||||
import { ContextUsage } from './views/ContextUsage.js';
|
||||
|
||||
interface HistoryItemDisplayProps {
|
||||
item: HistoryItem;
|
||||
|
|
@ -176,6 +177,19 @@ const HistoryItemDisplayComponent: React.FC<HistoryItemDisplayProps> = ({
|
|||
{itemForDisplay.type === 'mcp_status' && (
|
||||
<McpStatus {...itemForDisplay} serverStatus={getMCPServerStatus} />
|
||||
)}
|
||||
{itemForDisplay.type === 'context_usage' && (
|
||||
<ContextUsage
|
||||
modelName={itemForDisplay.modelName}
|
||||
totalTokens={itemForDisplay.totalTokens}
|
||||
contextWindowSize={itemForDisplay.contextWindowSize}
|
||||
breakdown={itemForDisplay.breakdown}
|
||||
builtinTools={itemForDisplay.builtinTools}
|
||||
mcpTools={itemForDisplay.mcpTools}
|
||||
memoryFiles={itemForDisplay.memoryFiles}
|
||||
skills={itemForDisplay.skills}
|
||||
isEstimated={itemForDisplay.isEstimated}
|
||||
/>
|
||||
)}
|
||||
</Box>
|
||||
);
|
||||
};
|
||||
|
|
|
|||
361
packages/cli/src/ui/components/views/ContextUsage.tsx
Normal file
361
packages/cli/src/ui/components/views/ContextUsage.tsx
Normal file
|
|
@ -0,0 +1,361 @@
|
|||
/**
|
||||
* @license
|
||||
* Copyright 2025 Qwen
|
||||
* SPDX-License-Identifier: Apache-2.0
|
||||
*/
|
||||
|
||||
import type React from 'react';
|
||||
import { Box, Text } from 'ink';
|
||||
import { theme } from '../../semantic-colors.js';
|
||||
import type {
|
||||
ContextCategoryBreakdown,
|
||||
ContextToolDetail,
|
||||
ContextMemoryDetail,
|
||||
ContextSkillDetail,
|
||||
} from '../../types.js';
|
||||
import { t } from '../../../i18n/index.js';
|
||||
|
||||
// Progress bar characters
|
||||
const FILLED = '\u2588'; // █ - filled block
|
||||
const BUFFER = '\u2592'; // ▒ - medium shade (autocompact buffer)
|
||||
const EMPTY = '\u2591'; // ░ - light shade (free space)
|
||||
|
||||
const CONTENT_WIDTH = 56;
|
||||
|
||||
interface ContextUsageProps {
|
||||
modelName: string;
|
||||
totalTokens: number;
|
||||
contextWindowSize: number;
|
||||
breakdown: ContextCategoryBreakdown;
|
||||
builtinTools: ContextToolDetail[];
|
||||
mcpTools: ContextToolDetail[];
|
||||
memoryFiles: ContextMemoryDetail[];
|
||||
skills: ContextSkillDetail[];
|
||||
/** True when totalTokens is estimated (no API call yet) */
|
||||
isEstimated?: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Truncate a string to maxLen, appending '…' if truncated.
|
||||
*/
|
||||
function truncateName(name: string, maxLen: number): string {
|
||||
if (name.length <= maxLen) return name;
|
||||
return name.slice(0, maxLen - 1) + '\u2026';
|
||||
}
|
||||
|
||||
/**
|
||||
* Format token count for display (e.g. 1234 -> "1.2k", 123456 -> "123.5k")
|
||||
*/
|
||||
function formatTokens(tokens: number): string {
|
||||
if (tokens >= 1000) {
|
||||
return `${(tokens / 1000).toFixed(1)}k`;
|
||||
}
|
||||
return `${tokens}`;
|
||||
}
|
||||
|
||||
/**
|
||||
* Render a three-segment progress bar: used | autocompact buffer | free space.
|
||||
*/
|
||||
const ProgressBar: React.FC<{
|
||||
usedPercentage: number;
|
||||
bufferPercentage: number;
|
||||
width: number;
|
||||
}> = ({ usedPercentage, bufferPercentage, width }) => {
|
||||
const usedCount = Math.round((Math.min(usedPercentage, 100) / 100) * width);
|
||||
const bufferCount = Math.round(
|
||||
(Math.min(bufferPercentage, 100 - usedPercentage) / 100) * width,
|
||||
);
|
||||
const freeCount = Math.max(0, width - usedCount - bufferCount);
|
||||
|
||||
const usedStr = FILLED.repeat(Math.max(0, usedCount));
|
||||
const freeStr = EMPTY.repeat(Math.max(0, freeCount));
|
||||
const bufferStr = BUFFER.repeat(Math.max(0, bufferCount));
|
||||
|
||||
// Used color: accent by default, warning/error at high usage.
|
||||
let usedColor = theme.text.accent;
|
||||
if (usedPercentage > 80) {
|
||||
usedColor = theme.status.error;
|
||||
} else if (usedPercentage > 60) {
|
||||
usedColor = theme.status.warning;
|
||||
}
|
||||
|
||||
return (
|
||||
<Text>
|
||||
<Text color={usedColor}>{usedStr}</Text>
|
||||
<Text color={theme.text.secondary}>{freeStr}</Text>
|
||||
<Text color={theme.status.warning}>{bufferStr}</Text>
|
||||
</Text>
|
||||
);
|
||||
};
|
||||
|
||||
/**
|
||||
* A row showing a category with its token count and percentage.
|
||||
*/
|
||||
const CategoryRow: React.FC<{
|
||||
symbol: string;
|
||||
label: string;
|
||||
tokens: number;
|
||||
contextWindowSize: number;
|
||||
symbolColor?: string;
|
||||
}> = ({ symbol, label, tokens, contextWindowSize, symbolColor }) => {
|
||||
const percentage = ((tokens / contextWindowSize) * 100).toFixed(1);
|
||||
const tokenStr = `${formatTokens(tokens)} ${t('tokens')} (${percentage}%)`;
|
||||
|
||||
return (
|
||||
<Box width={CONTENT_WIDTH}>
|
||||
<Box width={2}>
|
||||
<Text color={symbolColor || theme.text.secondary}>{symbol}</Text>
|
||||
</Box>
|
||||
<Box width={24}>
|
||||
<Text color={theme.text.primary}>{label}</Text>
|
||||
</Box>
|
||||
<Box flexGrow={1} justifyContent="flex-end">
|
||||
<Text color={theme.text.secondary}>{tokenStr}</Text>
|
||||
</Box>
|
||||
</Box>
|
||||
);
|
||||
};
|
||||
|
||||
/**
|
||||
* A detail row for individual items (MCP tools, memory files, skills).
|
||||
*/
|
||||
const DETAIL_NAME_MAX_LEN = 30;
|
||||
|
||||
const DetailRow: React.FC<{
|
||||
name: string;
|
||||
tokens: number;
|
||||
}> = ({ name, tokens }) => {
|
||||
const tokenStr =
|
||||
tokens > 0 ? `${formatTokens(tokens)} ${t('tokens')}` : `0 ${t('tokens')}`;
|
||||
return (
|
||||
<Box width={CONTENT_WIDTH} paddingLeft={2}>
|
||||
<Text color={theme.text.secondary}>{'\u2514'} </Text>
|
||||
<Box width={32}>
|
||||
<Text color={theme.text.link}>
|
||||
{truncateName(name, DETAIL_NAME_MAX_LEN)}
|
||||
</Text>
|
||||
</Box>
|
||||
<Box flexGrow={1} justifyContent="flex-end">
|
||||
<Text color={theme.text.secondary}>{tokenStr}</Text>
|
||||
</Box>
|
||||
</Box>
|
||||
);
|
||||
};
|
||||
|
||||
export const ContextUsage: React.FC<ContextUsageProps> = ({
|
||||
modelName,
|
||||
totalTokens,
|
||||
contextWindowSize,
|
||||
breakdown,
|
||||
builtinTools,
|
||||
mcpTools,
|
||||
memoryFiles,
|
||||
skills,
|
||||
isEstimated,
|
||||
}) => {
|
||||
const percentage =
|
||||
contextWindowSize > 0 ? (totalTokens / contextWindowSize) * 100 : 0;
|
||||
|
||||
// Sort detail items by token count (descending) for better readability
|
||||
const sortedBuiltinTools = [...builtinTools].sort(
|
||||
(a, b) => b.tokens - a.tokens,
|
||||
);
|
||||
const sortedMcpTools = [...mcpTools].sort((a, b) => b.tokens - a.tokens);
|
||||
const sortedMemoryFiles = [...memoryFiles].sort(
|
||||
(a, b) => b.tokens - a.tokens,
|
||||
);
|
||||
const sortedSkills = [...skills].sort((a, b) => b.tokens - a.tokens);
|
||||
|
||||
return (
|
||||
<Box
|
||||
borderStyle="round"
|
||||
borderColor={theme.border.default}
|
||||
flexDirection="column"
|
||||
paddingY={1}
|
||||
paddingX={2}
|
||||
>
|
||||
{/* Title */}
|
||||
<Text bold color={theme.text.accent}>
|
||||
{t('Context Usage')}
|
||||
</Text>
|
||||
<Box height={1} />
|
||||
|
||||
{isEstimated ? (
|
||||
<>
|
||||
{/* No API data yet — show hint instead of progress bar */}
|
||||
<Box marginBottom={1}>
|
||||
<Text color={theme.status.warning} italic>
|
||||
{t('No API response yet. Send a message to see actual usage.')}
|
||||
</Text>
|
||||
</Box>
|
||||
|
||||
{/* Estimated overhead categories */}
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('Estimated pre-conversation overhead')}
|
||||
</Text>
|
||||
<Text color={theme.text.secondary}>
|
||||
{t('Model')}: {modelName}
|
||||
{' '}
|
||||
{t('Context window')}: {formatTokens(contextWindowSize)}{' '}
|
||||
{t('tokens')}
|
||||
</Text>
|
||||
<Box height={1} />
|
||||
</>
|
||||
) : (
|
||||
<>
|
||||
{/* Model name + context window info */}
|
||||
<Box width={CONTENT_WIDTH} marginBottom={1}>
|
||||
<Text color={theme.text.secondary}>{modelName}</Text>
|
||||
<Box flexGrow={1} justifyContent="flex-end">
|
||||
<Text color={theme.text.secondary}>
|
||||
{t('Context window')}: {formatTokens(contextWindowSize)}{' '}
|
||||
{t('tokens')}
|
||||
</Text>
|
||||
</Box>
|
||||
</Box>
|
||||
{/* Progress bar — three segments: used | free | buffer */}
|
||||
<Box width={CONTENT_WIDTH}>
|
||||
<ProgressBar
|
||||
usedPercentage={Math.min(percentage, 100)}
|
||||
bufferPercentage={
|
||||
contextWindowSize > 0
|
||||
? (breakdown.autocompactBuffer / contextWindowSize) * 100
|
||||
: 0
|
||||
}
|
||||
width={CONTENT_WIDTH}
|
||||
/>
|
||||
</Box>
|
||||
<Box height={1} />
|
||||
{/* Legend — same layout as CategoryRow for alignment */}
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('Used')}
|
||||
tokens={totalTokens}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
<CategoryRow
|
||||
symbol={EMPTY}
|
||||
label={t('Free')}
|
||||
tokens={breakdown.freeSpace}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.secondary}
|
||||
/>
|
||||
<CategoryRow
|
||||
symbol={BUFFER}
|
||||
label={t('Autocompact')}
|
||||
tokens={breakdown.autocompactBuffer}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.status.warning}
|
||||
/>
|
||||
<Box height={1} />
|
||||
|
||||
{/* Breakdown header */}
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('Usage by category')}
|
||||
</Text>
|
||||
</>
|
||||
)}
|
||||
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('System prompt')}
|
||||
tokens={breakdown.systemPrompt}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('Built-in tools')}
|
||||
tokens={breakdown.builtinTools}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
{breakdown.mcpTools > 0 && (
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('MCP tools')}
|
||||
tokens={breakdown.mcpTools}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
)}
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('Memory files')}
|
||||
tokens={breakdown.memoryFiles}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('Skills')}
|
||||
tokens={breakdown.skills}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
{/* Only show Messages when we have real API data */}
|
||||
{!isEstimated && (
|
||||
<CategoryRow
|
||||
symbol={FILLED}
|
||||
label={t('Messages')}
|
||||
tokens={breakdown.messages}
|
||||
contextWindowSize={contextWindowSize}
|
||||
symbolColor={theme.text.accent}
|
||||
/>
|
||||
)}
|
||||
|
||||
{/* Built-in tools detail */}
|
||||
{sortedBuiltinTools.length > 0 && (
|
||||
<Box flexDirection="column" marginTop={1}>
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('Built-in tools')}
|
||||
</Text>
|
||||
{sortedBuiltinTools.map((tool) => (
|
||||
<DetailRow key={tool.name} name={tool.name} tokens={tool.tokens} />
|
||||
))}
|
||||
</Box>
|
||||
)}
|
||||
|
||||
{/* MCP Tools detail */}
|
||||
{sortedMcpTools.length > 0 && (
|
||||
<Box flexDirection="column" marginTop={1}>
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('MCP tools')}
|
||||
</Text>
|
||||
{sortedMcpTools.map((tool) => (
|
||||
<DetailRow key={tool.name} name={tool.name} tokens={tool.tokens} />
|
||||
))}
|
||||
</Box>
|
||||
)}
|
||||
|
||||
{/* Memory files detail */}
|
||||
{sortedMemoryFiles.length > 0 && (
|
||||
<Box flexDirection="column" marginTop={1}>
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('Memory files')}
|
||||
</Text>
|
||||
{sortedMemoryFiles.map((file) => (
|
||||
<DetailRow key={file.path} name={file.path} tokens={file.tokens} />
|
||||
))}
|
||||
</Box>
|
||||
)}
|
||||
|
||||
{/* Skills detail */}
|
||||
{sortedSkills.length > 0 && (
|
||||
<Box flexDirection="column" marginTop={1}>
|
||||
<Text bold color={theme.text.primary}>
|
||||
{t('Skills')}
|
||||
</Text>
|
||||
{sortedSkills.map((skill) => (
|
||||
<DetailRow
|
||||
key={skill.name}
|
||||
name={skill.name}
|
||||
tokens={skill.tokens}
|
||||
/>
|
||||
))}
|
||||
</Box>
|
||||
)}
|
||||
</Box>
|
||||
);
|
||||
};
|
||||
|
|
@ -251,6 +251,48 @@ export type HistoryItemMcpStatus = HistoryItemBase & {
|
|||
showTips: boolean;
|
||||
};
|
||||
|
||||
// --- Context Usage types ---
|
||||
|
||||
export interface ContextCategoryBreakdown {
|
||||
systemPrompt: number;
|
||||
builtinTools: number;
|
||||
mcpTools: number;
|
||||
memoryFiles: number;
|
||||
skills: number;
|
||||
messages: number;
|
||||
freeSpace: number;
|
||||
autocompactBuffer: number;
|
||||
}
|
||||
|
||||
export interface ContextToolDetail {
|
||||
name: string;
|
||||
tokens: number;
|
||||
}
|
||||
|
||||
export interface ContextMemoryDetail {
|
||||
path: string;
|
||||
tokens: number;
|
||||
}
|
||||
|
||||
export interface ContextSkillDetail {
|
||||
name: string;
|
||||
tokens: number;
|
||||
}
|
||||
|
||||
export type HistoryItemContextUsage = HistoryItemBase & {
|
||||
type: 'context_usage';
|
||||
modelName: string;
|
||||
totalTokens: number;
|
||||
contextWindowSize: number;
|
||||
breakdown: ContextCategoryBreakdown;
|
||||
builtinTools: ContextToolDetail[];
|
||||
mcpTools: ContextToolDetail[];
|
||||
memoryFiles: ContextMemoryDetail[];
|
||||
skills: ContextSkillDetail[];
|
||||
/** True when totalTokens is estimated (no API call yet) rather than from API response */
|
||||
isEstimated?: boolean;
|
||||
};
|
||||
|
||||
// Using Omit<HistoryItem, 'id'> seems to have some issues with typescript's
|
||||
// type inference e.g. historyItem.type === 'tool_group' isn't auto-inferring that
|
||||
// 'tools' in historyItem.
|
||||
|
|
@ -278,7 +320,8 @@ export type HistoryItemWithoutId =
|
|||
| HistoryItemExtensionsList
|
||||
| HistoryItemToolsList
|
||||
| HistoryItemSkillsList
|
||||
| HistoryItemMcpStatus;
|
||||
| HistoryItemMcpStatus
|
||||
| HistoryItemContextUsage;
|
||||
|
||||
export type HistoryItem = HistoryItemWithoutId & { id: number };
|
||||
|
||||
|
|
@ -301,6 +344,7 @@ export enum MessageType {
|
|||
TOOLS_LIST = 'tools_list',
|
||||
SKILLS_LIST = 'skills_list',
|
||||
MCP_STATUS = 'mcp_status',
|
||||
CONTEXT_USAGE = 'context_usage',
|
||||
}
|
||||
|
||||
// Simplified message structure for internal feedback
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue