feat: add /context command to display context window token usage breakdown

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
This commit is contained in:
pomelo-nwu 2026-02-13 16:34:44 +08:00
parent aefea076b0
commit 77fd945474
7 changed files with 1026 additions and 1 deletions


@ -0,0 +1,293 @@
# The `/context` Command — Context Window Usage Breakdown
## Overview
The `/context` command shows how the current model's context window tokens are being used. It splits the window into categories, helping users see where tokens go and how much room is left.
## Anatomy of the Context Window
The complete prompt sent to the model in a single API request consists of the following parts:
```
┌─────────────────────────────────────────────┐
│ Context Window (total capacity)             │
│                                             │
│ ┌─────────────────────────────────────┐     │
│ │ System Prompt                       │     │
│ │  └─ core instructions + rules       │     │
│ ├─────────────────────────────────────┤     │
│ │ Tool Declarations                   │     │
│ │  ├─ Built-in tools                  │     │
│ │  ├─ MCP tools                       │     │
│ │  └─ SkillTool ◄─────────────────────┼──── embeds every skill's name + description
│ ├─────────────────────────────────────┤     │
│ │ Memory                              │     │
│ │  └─ QWEN.md + extension configs     │     │
│ ├─────────────────────────────────────┤     │
│ │ Messages                            │     │
│ │  ├─ user messages                   │     │
│ │  ├─ model replies                   │     │
│ │  └─ tool calls & results ◄──────────┼──── skill bodies are loaded here
│ ├─────────────────────────────────────┤     │
│ │ Free Space                          │     │
│ ├─────────────────────────────────────┤     │
│ │ Autocompact Buffer                  │     │
│ └─────────────────────────────────────┘     │
└─────────────────────────────────────────────┘
```
**Invariant**: the categories sum exactly to the total context window capacity.
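As a quick sanity check, the invariant can be verified directly. The numbers below are hypothetical, loosely based on the sample output later in this document:

```typescript
// Hypothetical breakdown for a 131,072-token window; the eight
// categories must partition the window exactly.
const contextWindowSize = 131_072;
const breakdown = {
  systemPrompt: 4_500,
  builtinTools: 4_900,
  mcpTools: 0,
  memoryFiles: 790,
  skills: 4_800,
  messages: 10_300,
  freeSpace: 66_460,
  autocompactBuffer: 39_322,
};
const sum = Object.values(breakdown).reduce((a, b) => a + b, 0);
// sum === contextWindowSize
```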
## Category Details
### 1. System Prompt
| Property | Description |
| --- | --- |
| **Source** | `getCoreSystemPrompt(undefined, modelName)` |
| **Includes** | the model's core behavioral instructions, output-format requirements, safety rules, etc. |
| **Excludes** | Memory content (counted separately) |
| **Computed as** | `estimateTokens()` over the system prompt text |
| **Changes** | essentially fixed, unless the `QWEN_SYSTEM_MD` environment variable or `.qwen/system.md` is modified |
> **Note**: `getCoreSystemPrompt` takes a `userMemory` parameter; `undefined` is passed here to exclude memory, since memory is tracked as its own category.
### 2. Built-in Tools
| Property | Description |
| --- | --- |
| **Source** | tools in `toolRegistry.getAllTools()` that are neither MCP tools nor the SkillTool |
| **Includes** | schemas of core tools such as `read_file`, `edit`, `run_shell_command`, `grep_search`, `glob`, `list_directory` |
| **Computed as** | `allToolsTokens - skillsTokens - mcpToolsTotalTokens` |
| **Detail list** | each built-in tool's name and token cost, sorted by token count descending |
> Although **SkillTool** is also a built-in tool, its content is dynamic (it embeds the full skill list), so it is displayed as the separate **Skills** category and does not appear under Built-in tools.
### 2b. MCP Tools
| Property | Description |
| --- | --- |
| **Source** | `DiscoveredMCPTool` instances in `toolRegistry.getAllTools()` |
| **Includes** | tool schemas provided by external tool servers connected over the MCP protocol |
| **Computed as** | sum of `estimateTokens(JSON.stringify(tool.schema))` over each MCP tool |
| **Detail list** | each MCP tool's name (`serverName__toolName` format) and token cost |
| **Shown when** | at least one MCP tool exists; otherwise the category row and details are hidden |
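The category total is just the per-tool sum. A minimal sketch with made-up tool entries:

```typescript
// Hypothetical MCP tool entries; each `tokens` value would come from
// estimateTokens(JSON.stringify(tool.schema)) in the real command.
const mcpTools = [
  { name: 'search__web_search', tokens: 820 },
  { name: 'search__fetch_page', tokens: 610 },
];
const mcpToolsTotalTokens = mcpTools.reduce((sum, t) => sum + t.tokens, 0);
// the MCP category row is only rendered when this total is > 0
```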
### 3. Skills ⭐ Progressive Disclosure
Skills use a **two-stage loading** design:
| Stage | What is loaded | Token category | When |
| --- | --- | --- | --- |
| **Stage 1** | each skill's name + short description + usage instructions | **Skills** category | sent with every API request |
| **Stage 2** | the full SKILL.md body (detailed instructions, templates, etc.) | **Messages** category | injected on demand once the model calls the `skill` tool |
**The Skills category in `/context` shows the always-resident Stage 1 overhead.**
#### Stage 1 implementation details
At initialization, SkillTool embeds every skill's information in its `description` field:
```
Execute a skill within the main conversation
<skills_instructions>
... usage instructions (~600 characters) ...
</skills_instructions>
<available_skills>
<skill>
<name>pdf</name>
<description>Convert PDF files to text (project)</description>
<location>project</location>
</skill>
<skill>
<name>xlsx</name>
<description>Process Excel spreadsheets (user)</description>
<location>user</location>
</skill>
...more skills...
</available_skills>
```
This entire block of text is part of SkillTool's tool declaration and is sent with every API request.
#### Token computation
```
skillsTokens = estimateTokens(JSON.stringify(skillTool.schema))
```
The SkillTool's complete schema is fetched from the ToolRegistry and estimated directly, ensuring the count includes:
- the usage-instruction text (`<skills_instructions>`)
- the XML listing of every skill (`<available_skills>`)
- the schema's parameter definitions
#### Stage 2 (on-demand loading)
When the model calls the `skill` tool, `SkillToolInvocation.execute()` loads the full SKILL.md:
```typescript
const skill = await this.skillManager.loadSkillForRuntime(this.params.skill);
const llmContent = `Base directory: ${baseDir}\n\n${skill.body}\n`;
```
This body is injected into the conversation as the tool-call result; its token cost falls under the **Messages** category.
#### Skills detail list
Each skill's detail row shows its approximate Stage 1 footprint, sorted by token count descending. Note:
- the per-skill detail tokens sum to **less than the Skills category total**; the difference is the overhead of the skills_instructions text
- details cover only name and description tokens, excluding the schema's parameter definitions
### 4. Memory Files
| Property | Description |
| --- | --- |
| **Source** | `config.getUserMemory()` |
| **Includes** | user-level configuration files such as `QWEN.md`, extension configs, and `output-language` |
| **Loaded at** | appended to the end of the System Prompt (via `getCoreSystemPrompt(userMemory, model)`) |
| **Computed as** | the memory text is split on its `--- Context from: <path> ---` markers and estimated per file |
**Memory content format**:
```
--- Context from: ~/.qwen/QWEN.md ---
user-defined rules and preferences...
--- End of Context from: ~/.qwen/QWEN.md ---
--- Context from: ~/.qwen/extensions/config.md ---
extension configuration content...
--- End of Context from: ~/.qwen/extensions/config.md ---
```
> **Why doesn't System Prompt include Memory?** When computing System Prompt tokens, `userMemory = undefined` is passed; Memory is shown as its own category so the two never overlap. In the actual API request, memory is appended to the end of the system prompt.
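These markers can be split into per-file entries with a backreference regex, mirroring the command's `parseMemoryFiles` helper:

```typescript
// The \1 backreference ensures the start and end markers name the same path.
const MEMORY_RE =
  /--- Context from: (.+?) ---\n([\s\S]*?)--- End of Context from: \1 ---/g;

function splitMemory(memory: string): Array<{ path: string; content: string }> {
  const out: Array<{ path: string; content: string }> = [];
  let m: RegExpExecArray | null;
  while ((m = MEMORY_RE.exec(memory)) !== null) {
    out.push({ path: m[1], content: m[2] });
  }
  return out;
}
```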
### 5. Messages
| Property | Description |
| --- | --- |
| **Source** | derived: `totalTokens - systemPrompt - allTools - memory` |
| **Includes** | all user messages, model replies, tool-call arguments, and tool results |
| **Notably includes** | skill bodies (Stage 2 on-demand content), file-read results, shell output, etc. |
| **Computed as** | `max(0, apiTotalTokens - estimatedOverhead)` |
> **Note**: Messages is derived by subtracting the other categories' estimates from the API-reported `totalTokens`, so it absorbs the estimation error: if the overhead is overestimated, Messages is correspondingly underestimated.
### 6. Free Space
| Property | Description |
| --- | --- |
| **Computed as** | `contextWindowSize - totalTokens - autocompactBuffer` |
| **Meaning** | how many more tokens of conversation fit before auto-compression triggers |
### 7. Autocompact Buffer
| Property | Description |
| --- | --- |
| **Computed as** | `(1 - compressionThreshold) × contextWindowSize` |
| **Default** | `(1 - 0.7) × 131072 = 39322` (about 30% of the context window) |
| **Meaning** | auto-compression triggers once usage reaches 70%; the remaining 30% is held in reserve as a buffer |
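Plugging in the defaults, plus a hypothetical 25.3k tokens of usage, the two formulas above give:

```typescript
const contextWindowSize = 131_072;
const compressionThreshold = 0.7; // compress once usage hits 70%

// 30% of the window is held back as the autocompact buffer.
const autocompactBuffer = Math.round(
  (1 - compressionThreshold) * contextWindowSize,
); // 39322

// Free space with a hypothetical 25,300 tokens already used.
const totalTokens = 25_300;
const freeSpace = Math.max(
  0,
  contextWindowSize - totalTokens - autocompactBuffer,
); // 66450
```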
## Two Display Modes
### Mode A: no API data yet (first use, before any message is sent)
```
Context Usage
No API response yet. Send a message to see actual usage.
Estimated pre-conversation overhead
Model: glm-5 Context window: 131.1k tokens
█ System prompt 4.8k tokens (3.7%)
█ System tools 5.2k tokens (4.0%)
█ Memory files 845 tokens (0.6%)
█ Skills 5.1k tokens (3.9%)
░ Free space 75.8k tokens (57.8%)
░ Autocompact buffer 39.3k tokens (30.0%)
```
- **No progress bar or total figure**: avoids a misleading comparison between the estimate and the later API-reported value
- **No Messages row**: there is no conversation yet
- Each category is a local heuristic estimate (`estimateTokens`) and may deviate ~10% from the actual API tokenizer
### Mode B: API data available (after a conversation has started)
```
Context Usage
██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ glm-5
25.3k/131.1k tokens (19.3%)
Usage by category
█ System prompt 4.5k tokens (3.4%)
█ System tools 4.9k tokens (3.7%)
█ Memory files 790 tokens (0.6%)
█ Skills 4.8k tokens (3.7%)
█ Messages 10.3k tokens (7.9%)
░ Free space 66.5k tokens (50.7%)
░ Autocompact buffer 39.3k tokens (30.0%)
```
- **`totalTokens` comes from the API response** (`usageMetadata.promptTokenCount`) and is the most accurate figure
- **When the local estimate exceeds the API total**: each overhead category is scaled down proportionally so the categories sum to totalTokens
- **Messages** = `totalTokens - scaledOverhead`, covering all conversation content plus on-demand skill bodies
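The scaling rule can be sketched with hypothetical numbers (15k of locally estimated overhead against the 25.3k API total from the sample above):

```typescript
const totalTokens = 25_300; // usageMetadata.promptTokenCount
const rawOverhead = 15_000; // local estimate: systemPrompt + tools + memory

// Scale down only when the local estimate exceeds the API total, so the
// overhead categories never sum to more than totalTokens.
const overheadScale =
  rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
const scaledOverhead = Math.round(rawOverhead * overheadScale);
const messages = Math.max(0, totalTokens - scaledOverhead); // 10300
```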
## Token Estimation Method
Since the model's tokenizer is not directly accessible, a character-based heuristic is used:
```
tokens ≈ ⌈asciiChars / 4 + nonAsciiChars × 1.5⌉
```
| Character type | Ratio | Rationale |
| --- | --- | --- |
| ASCII (English, JSON structural chars, etc.) | ~4 chars/token | average BPE compression for English text |
| Non-ASCII (CJK characters such as Chinese and Japanese) | ~1.5 tokens/char | CJK characters typically map to 1-2 tokens |
**Known limitations**:
- Tokenizers differ between models; estimates may be off by ±10-20%
- JSON structural characters (`{`, `"`, `:`, etc.) tokenize at a different rate than natural language
- When the estimate overshoots, it is corrected by proportional scaling via `overheadScale`
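The heuristic above, matching the `estimateTokens` implementation added in this commit, can be sketched as:

```typescript
// ~4 ASCII chars per token; ~1.5 tokens per non-ASCII (CJK) character.
function estimateTokens(text: string): number {
  if (!text) return 0;
  let asciiChars = 0;
  let nonAsciiChars = 0;
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) < 128) asciiChars++;
    else nonAsciiChars++;
  }
  return Math.ceil(asciiChars / 4 + nonAsciiChars * 1.5);
}

// estimateTokens('hello world') === 3  (11 ASCII chars -> ceil(2.75))
// estimateTokens('你好') === 3         (2 CJK chars × 1.5)
```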
## Data Flow
```
┌──────────────────┐
│ API Response │
│ promptTokenCount │ ─── totalTokens (ground truth)
└──────────────────┘
┌──────────────────────────┼──────────────────────────┐
│ │ │
▼ ▼ ▼
estimateTokens() estimateTokens() estimateTokens()
│ │ │
▼ ▼ ▼
systemPromptTokens allToolsTokens memoryFilesTokens
┌─────┴──────┐
│ │
▼ ▼
systemToolsTokens skillsTokens
(allTools - skills) (from SkillTool schema)
│ │
└─────┬──────┘
rawOverhead = systemPrompt + allTools + memory
┌───────────┼───────────┐
│ overheadScale │ (= min(1, totalTokens/rawOverhead))
▼ ▼
scaled categories messages = totalTokens - scaledOverhead
│ │
└───────────┬───────────┘
breakdown output
```


@ -0,0 +1,310 @@
/**
* @license
* Copyright 2025 Qwen
* SPDX-License-Identifier: Apache-2.0
*/
import {
type CommandContext,
type SlashCommand,
CommandKind,
} from './types.js';
import {
MessageType,
type HistoryItemContextUsage,
type ContextCategoryBreakdown,
type ContextToolDetail,
type ContextMemoryDetail,
type ContextSkillDetail,
} from '../types.js';
import {
DiscoveredMCPTool,
uiTelemetryService,
getCoreSystemPrompt,
DEFAULT_TOKEN_LIMIT,
ToolNames,
} from '@qwen-code/qwen-code-core';
import { t } from '../../i18n/index.js';
/**
* Default compression token threshold (triggers compression at 70% usage).
* The autocompact buffer is (1 - threshold) * contextWindowSize.
*/
const DEFAULT_COMPRESSION_THRESHOLD = 0.7;
/**
* Estimate token count for a string using a character-based heuristic.
 * ASCII chars ≈ 4 chars/token, CJK/non-ASCII chars ≈ 1.5 tokens/char.
*/
function estimateTokens(text: string): number {
if (!text || text.length === 0) return 0;
let asciiChars = 0;
let nonAsciiChars = 0;
for (let i = 0; i < text.length; i++) {
const charCode = text.charCodeAt(i);
if (charCode < 128) {
asciiChars++;
} else {
nonAsciiChars++;
}
}
// CJK and other non-ASCII characters typically produce 1.5-2 tokens each
return Math.ceil(asciiChars / 4 + nonAsciiChars * 1.5);
}
/**
* Parse concatenated memory content into individual file entries.
* Memory content format: "--- Context from: <path> ---\n<content>\n--- End of Context from: <path> ---"
*/
function parseMemoryFiles(memoryContent: string): ContextMemoryDetail[] {
if (!memoryContent || memoryContent.trim().length === 0) return [];
const results: ContextMemoryDetail[] = [];
// Use backreference (\1) to ensure start/end path markers match
const regex =
/--- Context from: (.+?) ---\n([\s\S]*?)--- End of Context from: \1 ---/g;
let match: RegExpExecArray | null;
while ((match = regex.exec(memoryContent)) !== null) {
const filePath = match[1]!;
const content = match[2]!;
results.push({
path: filePath,
tokens: estimateTokens(content),
});
}
// If no structured markers found, treat as a single memory block
if (results.length === 0 && memoryContent.trim().length > 0) {
results.push({
path: t('memory'),
tokens: estimateTokens(memoryContent),
});
}
return results;
}
export const contextCommand: SlashCommand = {
name: 'context',
get description() {
return t('Show context window usage breakdown.');
},
kind: CommandKind.BUILT_IN,
action: async (context: CommandContext) => {
const { config } = context.services;
if (!config) {
context.ui.addItem(
{
type: MessageType.ERROR,
text: t('Config not loaded.'),
},
Date.now(),
);
return;
}
// --- Gather data ---
const modelName = config.getModel() || 'unknown';
const contentGeneratorConfig = config.getContentGeneratorConfig();
const contextWindowSize =
contentGeneratorConfig.contextWindowSize ?? DEFAULT_TOKEN_LIMIT;
// Total prompt token count from API (most accurate)
const apiTotalTokens = uiTelemetryService.getLastPromptTokenCount();
// 1. System prompt tokens (without memory, as memory is counted separately)
const systemPromptText = getCoreSystemPrompt(undefined, modelName);
const systemPromptTokens = estimateTokens(systemPromptText);
// 2. Tool declarations tokens (includes ALL tools: built-in, MCP, skill tool)
const toolRegistry = config.getToolRegistry();
const allTools = toolRegistry ? toolRegistry.getAllTools() : [];
const toolDeclarations = toolRegistry
? toolRegistry.getFunctionDeclarations()
: [];
const toolsJsonStr = JSON.stringify(toolDeclarations);
const allToolsTokens = estimateTokens(toolsJsonStr);
// 3. Per-tool details (for breakdown display)
const builtinTools: ContextToolDetail[] = [];
const mcpTools: ContextToolDetail[] = [];
for (const tool of allTools) {
const toolJsonStr = JSON.stringify(tool.schema);
const tokens = estimateTokens(toolJsonStr);
if (tool instanceof DiscoveredMCPTool) {
mcpTools.push({
name: `${tool.serverName}__${tool.serverToolName || tool.name}`,
tokens,
});
} else if (tool.name !== ToolNames.SKILL) {
// Built-in tool (exclude SkillTool, which is shown under Skills)
builtinTools.push({
name: tool.name,
tokens,
});
}
}
// 4. Memory files
const memoryContent = config.getUserMemory();
const memoryFiles = parseMemoryFiles(memoryContent);
const memoryFilesTokens = memoryFiles.reduce((sum, f) => sum + f.tokens, 0);
// 5. Skills (progressive disclosure)
// The SkillTool's description embeds all skill name+description listings
// plus ~600 chars of instruction text. This is the "always in context"
// cost. The full SKILL.md body is only loaded on-demand when the model
// invokes the skill tool (and that cost appears in Messages).
//
// To get an accurate total, we read the SkillTool's actual schema from
// the registry rather than reconstructing from a template.
const skillTool = allTools.find((tool) => tool.name === ToolNames.SKILL);
const skillToolTotalTokens = skillTool
? estimateTokens(JSON.stringify(skillTool.schema))
: 0;
// Per-skill breakdown for detail display (proportional to description length)
const skillManager = config.getSkillManager();
const skillConfigs = skillManager ? await skillManager.listSkills() : [];
const skills: ContextSkillDetail[] = skillConfigs.map((skill) => ({
name: skill.name,
tokens: estimateTokens(
`<skill>\n<name>\n${skill.name}\n</name>\n<description>\n${skill.description} (${skill.level})\n</description>\n<location>\n${skill.level}\n</location>\n</skill>`,
),
}));
// Use the SkillTool's actual schema tokens as the total, not the sum of
// individual estimates (which would miss the instruction wrapper text).
const skillsTokens = skillToolTotalTokens;
// 6. Autocompact buffer
const compressionThreshold =
config.getChatCompression()?.contextPercentageThreshold ??
DEFAULT_COMPRESSION_THRESHOLD;
const autocompactBuffer =
compressionThreshold > 0
? Math.round((1 - compressionThreshold) * contextWindowSize)
: 0;
// 7. Calculate raw overhead (allToolsTokens already includes skills)
const rawOverhead = systemPromptTokens + allToolsTokens + memoryFilesTokens;
// 8. Determine total tokens and build breakdown
const isEstimated = apiTotalTokens === 0;
// Sum of MCP tool tokens for category-level display
const mcpToolsTotalTokens = mcpTools.reduce(
(sum, tool) => sum + tool.tokens,
0,
);
let totalTokens: number;
let displaySystemPrompt: number;
let displayBuiltinTools: number;
let displayMcpTools: number;
let displayMemoryFiles: number;
let displaySkills: number;
let messagesTokens: number;
let freeSpace: number;
let detailBuiltinTools: ContextToolDetail[];
let detailMcpTools: ContextToolDetail[];
let detailMemoryFiles: ContextMemoryDetail[];
let detailSkills: ContextSkillDetail[];
if (isEstimated) {
// No API data yet: show raw overhead estimates only.
// Use 0 as totalTokens so the progress bar stays empty —
// avoids showing an inflated estimate that would "decrease"
// once real API data arrives.
totalTokens = 0;
displaySystemPrompt = systemPromptTokens;
// builtinTools category = allTools - skills - mcpTools
displayBuiltinTools = Math.max(
0,
allToolsTokens - skillsTokens - mcpToolsTotalTokens,
);
displayMcpTools = mcpToolsTotalTokens;
displayMemoryFiles = memoryFilesTokens;
displaySkills = skillsTokens;
messagesTokens = 0;
// Free space accounts for the estimated overhead
freeSpace = Math.max(
0,
contextWindowSize - rawOverhead - autocompactBuffer,
);
detailBuiltinTools = builtinTools;
detailMcpTools = mcpTools;
detailMemoryFiles = memoryFiles;
detailSkills = skills;
} else {
// API data available: use actual total with proportional scaling
totalTokens = apiTotalTokens;
// When estimates overshoot API total, scale down proportionally
// so the breakdown categories add up to totalTokens.
const overheadScale =
rawOverhead > totalTokens ? totalTokens / rawOverhead : 1;
displaySystemPrompt = Math.round(systemPromptTokens * overheadScale);
const scaledAllTools = Math.round(allToolsTokens * overheadScale);
displayMemoryFiles = Math.round(memoryFilesTokens * overheadScale);
displaySkills = Math.round(skillsTokens * overheadScale);
const scaledMcpTotal = Math.round(mcpToolsTotalTokens * overheadScale);
displayMcpTools = scaledMcpTotal;
displayBuiltinTools = Math.max(
0,
scaledAllTools - displaySkills - scaledMcpTotal,
);
const scaledOverhead =
displaySystemPrompt + scaledAllTools + displayMemoryFiles;
messagesTokens = Math.max(0, totalTokens - scaledOverhead);
freeSpace = Math.max(
0,
contextWindowSize - totalTokens - autocompactBuffer,
);
// Scale detail items to match their parent categories
const scaleDetail = <T extends { tokens: number }>(items: T[]): T[] =>
overheadScale < 1
? items.map((item) => ({
...item,
tokens: Math.round(item.tokens * overheadScale),
}))
: items;
detailBuiltinTools = scaleDetail(builtinTools);
detailMcpTools = scaleDetail(mcpTools);
detailMemoryFiles = scaleDetail(memoryFiles);
detailSkills = scaleDetail(skills);
}
const breakdown: ContextCategoryBreakdown = {
systemPrompt: displaySystemPrompt,
builtinTools: displayBuiltinTools,
mcpTools: displayMcpTools,
memoryFiles: displayMemoryFiles,
skills: displaySkills,
messages: messagesTokens,
freeSpace,
autocompactBuffer,
};
const contextUsageItem: HistoryItemContextUsage = {
type: MessageType.CONTEXT_USAGE,
modelName,
totalTokens,
contextWindowSize,
breakdown,
builtinTools: detailBuiltinTools,
mcpTools: detailMcpTools,
memoryFiles: detailMemoryFiles,
skills: detailSkills,
isEstimated,
};
context.ui.addItem(contextUsageItem, Date.now());
},
};


@ -33,6 +33,7 @@ import { getMCPServerStatus } from '@qwen-code/qwen-code-core';
import { SkillsList } from './views/SkillsList.js';
import { ToolsList } from './views/ToolsList.js';
import { McpStatus } from './views/McpStatus.js';
import { ContextUsage } from './views/ContextUsage.js';
interface HistoryItemDisplayProps {
item: HistoryItem;
@ -176,6 +177,19 @@ const HistoryItemDisplayComponent: React.FC<HistoryItemDisplayProps> = ({
{itemForDisplay.type === 'mcp_status' && (
<McpStatus {...itemForDisplay} serverStatus={getMCPServerStatus} />
)}
{itemForDisplay.type === 'context_usage' && (
<ContextUsage
modelName={itemForDisplay.modelName}
totalTokens={itemForDisplay.totalTokens}
contextWindowSize={itemForDisplay.contextWindowSize}
breakdown={itemForDisplay.breakdown}
builtinTools={itemForDisplay.builtinTools}
mcpTools={itemForDisplay.mcpTools}
memoryFiles={itemForDisplay.memoryFiles}
skills={itemForDisplay.skills}
isEstimated={itemForDisplay.isEstimated}
/>
)}
</Box>
);
};


@ -0,0 +1,361 @@
/**
* @license
* Copyright 2025 Qwen
* SPDX-License-Identifier: Apache-2.0
*/
import type React from 'react';
import { Box, Text } from 'ink';
import { theme } from '../../semantic-colors.js';
import type {
ContextCategoryBreakdown,
ContextToolDetail,
ContextMemoryDetail,
ContextSkillDetail,
} from '../../types.js';
import { t } from '../../../i18n/index.js';
// Progress bar characters
const FILLED = '\u2588'; // █ - filled block
const BUFFER = '\u2592'; // ▒ - medium shade (autocompact buffer)
const EMPTY = '\u2591'; // ░ - light shade (free space)
const CONTENT_WIDTH = 56;
interface ContextUsageProps {
modelName: string;
totalTokens: number;
contextWindowSize: number;
breakdown: ContextCategoryBreakdown;
builtinTools: ContextToolDetail[];
mcpTools: ContextToolDetail[];
memoryFiles: ContextMemoryDetail[];
skills: ContextSkillDetail[];
/** True when totalTokens is estimated (no API call yet) */
isEstimated?: boolean;
}
/**
* Truncate a string to maxLen, appending '…' if truncated.
*/
function truncateName(name: string, maxLen: number): string {
if (name.length <= maxLen) return name;
return name.slice(0, maxLen - 1) + '\u2026';
}
/**
* Format token count for display (e.g. 1234 -> "1.2k", 123456 -> "123.5k")
*/
function formatTokens(tokens: number): string {
if (tokens >= 1000) {
return `${(tokens / 1000).toFixed(1)}k`;
}
return `${tokens}`;
}
/**
 * Render a three-segment progress bar: used | free space | autocompact buffer.
*/
const ProgressBar: React.FC<{
usedPercentage: number;
bufferPercentage: number;
width: number;
}> = ({ usedPercentage, bufferPercentage, width }) => {
const usedCount = Math.round((Math.min(usedPercentage, 100) / 100) * width);
const bufferCount = Math.round(
(Math.min(bufferPercentage, 100 - usedPercentage) / 100) * width,
);
const freeCount = Math.max(0, width - usedCount - bufferCount);
const usedStr = FILLED.repeat(Math.max(0, usedCount));
const freeStr = EMPTY.repeat(Math.max(0, freeCount));
const bufferStr = BUFFER.repeat(Math.max(0, bufferCount));
// Used color: accent by default, warning/error at high usage.
let usedColor = theme.text.accent;
if (usedPercentage > 80) {
usedColor = theme.status.error;
} else if (usedPercentage > 60) {
usedColor = theme.status.warning;
}
return (
<Text>
<Text color={usedColor}>{usedStr}</Text>
<Text color={theme.text.secondary}>{freeStr}</Text>
<Text color={theme.status.warning}>{bufferStr}</Text>
</Text>
);
};
/**
* A row showing a category with its token count and percentage.
*/
const CategoryRow: React.FC<{
symbol: string;
label: string;
tokens: number;
contextWindowSize: number;
symbolColor?: string;
}> = ({ symbol, label, tokens, contextWindowSize, symbolColor }) => {
const percentage = ((tokens / contextWindowSize) * 100).toFixed(1);
const tokenStr = `${formatTokens(tokens)} ${t('tokens')} (${percentage}%)`;
return (
<Box width={CONTENT_WIDTH}>
<Box width={2}>
<Text color={symbolColor || theme.text.secondary}>{symbol}</Text>
</Box>
<Box width={24}>
<Text color={theme.text.primary}>{label}</Text>
</Box>
<Box flexGrow={1} justifyContent="flex-end">
<Text color={theme.text.secondary}>{tokenStr}</Text>
</Box>
</Box>
);
};
/**
* A detail row for individual items (MCP tools, memory files, skills).
*/
const DETAIL_NAME_MAX_LEN = 30;
const DetailRow: React.FC<{
name: string;
tokens: number;
}> = ({ name, tokens }) => {
const tokenStr =
tokens > 0 ? `${formatTokens(tokens)} ${t('tokens')}` : `0 ${t('tokens')}`;
return (
<Box width={CONTENT_WIDTH} paddingLeft={2}>
<Text color={theme.text.secondary}>{'\u2514'} </Text>
<Box width={32}>
<Text color={theme.text.link}>
{truncateName(name, DETAIL_NAME_MAX_LEN)}
</Text>
</Box>
<Box flexGrow={1} justifyContent="flex-end">
<Text color={theme.text.secondary}>{tokenStr}</Text>
</Box>
</Box>
);
};
export const ContextUsage: React.FC<ContextUsageProps> = ({
modelName,
totalTokens,
contextWindowSize,
breakdown,
builtinTools,
mcpTools,
memoryFiles,
skills,
isEstimated,
}) => {
const percentage =
contextWindowSize > 0 ? (totalTokens / contextWindowSize) * 100 : 0;
// Sort detail items by token count (descending) for better readability
const sortedBuiltinTools = [...builtinTools].sort(
(a, b) => b.tokens - a.tokens,
);
const sortedMcpTools = [...mcpTools].sort((a, b) => b.tokens - a.tokens);
const sortedMemoryFiles = [...memoryFiles].sort(
(a, b) => b.tokens - a.tokens,
);
const sortedSkills = [...skills].sort((a, b) => b.tokens - a.tokens);
return (
<Box
borderStyle="round"
borderColor={theme.border.default}
flexDirection="column"
paddingY={1}
paddingX={2}
>
{/* Title */}
<Text bold color={theme.text.accent}>
{t('Context Usage')}
</Text>
<Box height={1} />
{isEstimated ? (
<>
{/* No API data yet — show hint instead of progress bar */}
<Box marginBottom={1}>
<Text color={theme.status.warning} italic>
{t('No API response yet. Send a message to see actual usage.')}
</Text>
</Box>
{/* Estimated overhead categories */}
<Text bold color={theme.text.primary}>
{t('Estimated pre-conversation overhead')}
</Text>
<Text color={theme.text.secondary}>
{t('Model')}: {modelName}
{' '}
{t('Context window')}: {formatTokens(contextWindowSize)}{' '}
{t('tokens')}
</Text>
<Box height={1} />
</>
) : (
<>
{/* Model name + context window info */}
<Box width={CONTENT_WIDTH} marginBottom={1}>
<Text color={theme.text.secondary}>{modelName}</Text>
<Box flexGrow={1} justifyContent="flex-end">
<Text color={theme.text.secondary}>
{t('Context window')}: {formatTokens(contextWindowSize)}{' '}
{t('tokens')}
</Text>
</Box>
</Box>
{/* Progress bar — three segments: used | free | buffer */}
<Box width={CONTENT_WIDTH}>
<ProgressBar
usedPercentage={Math.min(percentage, 100)}
bufferPercentage={
contextWindowSize > 0
? (breakdown.autocompactBuffer / contextWindowSize) * 100
: 0
}
width={CONTENT_WIDTH}
/>
</Box>
<Box height={1} />
{/* Legend — same layout as CategoryRow for alignment */}
<CategoryRow
symbol={FILLED}
label={t('Used')}
tokens={totalTokens}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
<CategoryRow
symbol={EMPTY}
label={t('Free')}
tokens={breakdown.freeSpace}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.secondary}
/>
<CategoryRow
symbol={BUFFER}
label={t('Autocompact')}
tokens={breakdown.autocompactBuffer}
contextWindowSize={contextWindowSize}
symbolColor={theme.status.warning}
/>
<Box height={1} />
{/* Breakdown header */}
<Text bold color={theme.text.primary}>
{t('Usage by category')}
</Text>
</>
)}
<CategoryRow
symbol={FILLED}
label={t('System prompt')}
tokens={breakdown.systemPrompt}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
<CategoryRow
symbol={FILLED}
label={t('Built-in tools')}
tokens={breakdown.builtinTools}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
{breakdown.mcpTools > 0 && (
<CategoryRow
symbol={FILLED}
label={t('MCP tools')}
tokens={breakdown.mcpTools}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
)}
<CategoryRow
symbol={FILLED}
label={t('Memory files')}
tokens={breakdown.memoryFiles}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
<CategoryRow
symbol={FILLED}
label={t('Skills')}
tokens={breakdown.skills}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
{/* Only show Messages when we have real API data */}
{!isEstimated && (
<CategoryRow
symbol={FILLED}
label={t('Messages')}
tokens={breakdown.messages}
contextWindowSize={contextWindowSize}
symbolColor={theme.text.accent}
/>
)}
{/* Built-in tools detail */}
{sortedBuiltinTools.length > 0 && (
<Box flexDirection="column" marginTop={1}>
<Text bold color={theme.text.primary}>
{t('Built-in tools')}
</Text>
{sortedBuiltinTools.map((tool) => (
<DetailRow key={tool.name} name={tool.name} tokens={tool.tokens} />
))}
</Box>
)}
{/* MCP Tools detail */}
{sortedMcpTools.length > 0 && (
<Box flexDirection="column" marginTop={1}>
<Text bold color={theme.text.primary}>
{t('MCP tools')}
</Text>
{sortedMcpTools.map((tool) => (
<DetailRow key={tool.name} name={tool.name} tokens={tool.tokens} />
))}
</Box>
)}
{/* Memory files detail */}
{sortedMemoryFiles.length > 0 && (
<Box flexDirection="column" marginTop={1}>
<Text bold color={theme.text.primary}>
{t('Memory files')}
</Text>
{sortedMemoryFiles.map((file) => (
<DetailRow key={file.path} name={file.path} tokens={file.tokens} />
))}
</Box>
)}
{/* Skills detail */}
{sortedSkills.length > 0 && (
<Box flexDirection="column" marginTop={1}>
<Text bold color={theme.text.primary}>
{t('Skills')}
</Text>
{sortedSkills.map((skill) => (
<DetailRow
key={skill.name}
name={skill.name}
tokens={skill.tokens}
/>
))}
</Box>
)}
</Box>
);
};


@ -251,6 +251,48 @@ export type HistoryItemMcpStatus = HistoryItemBase & {
showTips: boolean;
};
// --- Context Usage types ---
export interface ContextCategoryBreakdown {
systemPrompt: number;
builtinTools: number;
mcpTools: number;
memoryFiles: number;
skills: number;
messages: number;
freeSpace: number;
autocompactBuffer: number;
}
export interface ContextToolDetail {
name: string;
tokens: number;
}
export interface ContextMemoryDetail {
path: string;
tokens: number;
}
export interface ContextSkillDetail {
name: string;
tokens: number;
}
export type HistoryItemContextUsage = HistoryItemBase & {
type: 'context_usage';
modelName: string;
totalTokens: number;
contextWindowSize: number;
breakdown: ContextCategoryBreakdown;
builtinTools: ContextToolDetail[];
mcpTools: ContextToolDetail[];
memoryFiles: ContextMemoryDetail[];
skills: ContextSkillDetail[];
/** True when totalTokens is estimated (no API call yet) rather than from API response */
isEstimated?: boolean;
};
// Using Omit<HistoryItem, 'id'> seems to have some issues with typescript's
// type inference e.g. historyItem.type === 'tool_group' isn't auto-inferring that
// 'tools' in historyItem.
@ -278,7 +320,8 @@ export type HistoryItemWithoutId =
| HistoryItemExtensionsList
| HistoryItemToolsList
| HistoryItemSkillsList
| HistoryItemMcpStatus;
| HistoryItemMcpStatus
| HistoryItemContextUsage;
export type HistoryItem = HistoryItemWithoutId & { id: number };
@ -301,6 +344,7 @@ export enum MessageType {
TOOLS_LIST = 'tools_list',
SKILLS_LIST = 'skills_list',
MCP_STATUS = 'mcp_status',
CONTEXT_USAGE = 'context_usage',
}
// Simplified message structure for internal feedback