qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-17 12:21:10 +00:00

History

顾盼 c512427f93 feat(core): strip inline media before chat compaction summary (#4101)	2026-05-14 10:20:11 +08:00

* feat(core): strip inline media before chat compaction summary

  Compaction's side-query previously shipped historyToCompress verbatim. Two related issues degraded summary quality and accuracy:

  - Inline image / document bytes (from MCP tool results) leaked into the summary model's prompt where they could not be interpreted and merely inflated payload.
  - findCompressSplitPoint apportioned chars via JSON.stringify(content), so a single 1 MB base64 image looked like ~350K tokens and biased the split point. Real Qwen-VL token cost is at most a few thousand.

  This change adds a new compactionInputSlimming module that replaces inlineData / fileData parts with short [image: <mime>] / [document: <mime>] placeholders before the side-query, leaving live history unchanged. The same constant feeds estimateContentChars so the split-point algorithm sees the budget the summary model actually consumes downstream. Microcompact is also extended to clear stale inline images alongside old tool results.

  A previous draft of the design also externalized large pastes to a content-addressable on-disk cache, but it was withdrawn after surveying claude-code's 2026-03 to 2026-05 releases - upstream consensus is to keep user input visible to the model and amortize cost via prompt caching rather than externalize. See the Out-of-scope section of the design doc for the full rationale.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): recurse into functionResponse.parts when stripping media

  E2E exposed that `read_file` (and any tool that surfaces an image) wraps the result in `functionResponse.parts` via `coreToolScheduler.createFunctionResponsePart`. The slimming module only walked top-level `part.inlineData` / `part.fileData`, so the nested base64 bytes leaked into the compaction side-query payload. The previous design doc incorrectly claimed that no recursive walk was needed.

  Three changes:

  - `slimCompactionInput.transformPart` recurses into the nested `functionResponse.parts` array and replaces each entry via the same image/document placeholder logic.
  - `estimatePartChars` walks the nested array too, so the split-point algorithm doesn't fall back to `JSON.stringify` and over-count the base64 bytes.
  - `microcompactHistory` drops `functionResponse.parts` when clearing an old tool result; the previous spread of `...part.functionResponse` silently carried the original media through.

  New unit tests cover (a) nested image / document stripping, (b) the estimator no longer being skewed by nested base64. The previously failing E2E now PASSES: side-query payload contains zero `data:image/` occurrences, zero long base64 runs, and exactly one `[image: image/png]` placeholder.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): address review findings on compaction image stripping

  Addresses 8 valid findings from PR review:

  - [Critical] estimatePartTokens now handles `fileData` parts (both top-level and nested under functionResponse.parts). Without this, microcompact's `tokensSaved === 0` short-circuit silently discarded every fileData clear.
  - estimatePartTokens for binary parts now uses a fixed MEDIA_PART_TOKEN_ESTIMATE constant (1,600) instead of base64-length divided by 4. The old formula billed a 1 MB image as ~250K tokens rather than its actual ~1,280 visual tokens on Qwen-VL, inflating the saved-token metric by orders of magnitude.
  - mimeType values from MCP tool servers are now run through sanitizeMimeForPlaceholder before being embedded in `[image: …]` / `[document: …]` placeholders. An adversarial server could otherwise craft `image/png]\n\n[SYSTEM: …` and inject instructions into the summary side-query.
  - collectCompactablePartRefs now recognizes a third 'nested-media' kind: functionResponse parts from non-compactable tools (e.g. MCP screenshots whose names aren't in COMPACTABLE_TOOLS) that carry images on functionResponse.parts. The nested media is dropped while the tool's text output is preserved. Previously such media accumulated forever in live history.
  - keepRecent budgets are now per-kind (tool / media / nested-media). Setting `toolResultsNumToKeep: 1` keeps 1 of each kind rather than 1 entry total across the merged list, which matches the natural reading of the setting name.
  - findCompressSplitPoint's `precomputedCharCounts` fallback path is now documented as test-only; production callers MUST pass the precomputed array.
  - The text-based branch of isAlreadyCleared is gone: with the new nested-media handling (drops `parts`) and existing media handling (replaces with `{ text: … }` that is no longer collected) it was unreachable.
  - OpenAI converter (createToolMessage) now passes text parts inside functionResponse.parts through as text content. The slimmer writes `{ text: '[image: image/png]' }` placeholders into the nested array; without this fix the converter dropped them when serializing to the OpenAI wire format, leaving the summary model with empty tool responses instead of the placeholder.

  Two findings deferred with rationale (see design doc Open Questions): MIN_COMPRESSION_FRACTION still uses pre-slim counts (acceptable: "user shared an image" is itself worth summarizing); SlimResult is not re-exported (round-3 simplify decided to keep core's public surface minimal).

  E2E re-verified end-to-end: side-query payload contains 0 data:image/ occurrences, 0 long base64 runs, and 1 `[image: image/png]` placeholder in the expected position. 185/185 collocated unit tests pass.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(core): tidy compaction slimming after self-review

  Three small polishes from a follow-up code review pass:

  - `stripNestedMedia` no longer re-casts its return value: after destructuring `parts` out of the widened input type, TypeScript infers the original `FunctionResponse` shape without help.
  - `isAlreadyCleared` shed a 10-line comment block; the body is now one line, so one descriptive line above it is enough.
  - OpenAI converter's nested-part text check switched from `(part as { text?: unknown }).text` to `'text' in part && typeof part.text === 'string'`, dropping the cast and letting `in` narrow the type.

  No behavior change. 185/185 unit tests still pass.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): wire slim stats to debug log; split MicrocompactMeta tools vs media

  Addresses two follow-up review suggestions:

  - `slimCompactionInput` returned `stats.imagesStripped` and `stats.documentsStripped` but the orchestrator never consumed them. Now logged at debug level whenever non-zero so operators can confirm the slimming pipeline actually fires on image-heavy compactions.
  - `MicrocompactMeta.toolsCleared` lost meaning after the recent refactor: it had grown to count both tool-result clears AND inline-media / nested-media clears. Renamed:
    - `toolsCleared` → only `tool`-kind clears (compactable tool output)
    - `mediaCleared` → `media` + `nested-media` clears (new)
    - `toolsKept` / `mediaKept` mirror the split, replacing the prior `toolsKept` that was actually a combined count.

  The single non-test consumer (`client.ts` debug log) updated to use both fields. 185/185 unit tests pass.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
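
The placeholder substitution described in the commit message above can be illustrated with a minimal sketch. This is not the code from the commit: the `Part` shape, the helper names (`sanitizeMime`, `slimPart`, `slimParts`), the 64-character cap, and the rule that anything other than `image/*` counts as a document are all assumptions made for illustration. It only shows the general idea of replacing `inlineData` / `fileData` parts with sanitized `[image: <mime>]` / `[document: <mime>]` text parts, recursing into `functionResponse.parts`, and leaving the input untouched.

```typescript
// Hypothetical, simplified part shapes; the real types in qwen-code may differ.
interface MediaRef {
  mimeType?: string;
}
interface Part {
  text?: string;
  inlineData?: MediaRef & { data: string };
  fileData?: MediaRef & { fileUri: string };
  functionResponse?: { name: string; response?: unknown; parts?: Part[] };
}
interface SlimStats {
  imagesStripped: number;
  documentsStripped: number;
}

// Keep only characters that can appear in a well-formed MIME type and cap the
// length, so a hostile value cannot close the bracket or start a new line.
function sanitizeMime(mime: string | undefined): string {
  const cleaned = (mime ?? "").replace(/[^A-Za-z0-9.+\/-]/g, "").slice(0, 64);
  return cleaned.length > 0 ? cleaned : "unknown";
}

// Build the text placeholder and update the counters.
function placeholderFor(mime: string | undefined, stats: SlimStats): Part {
  const safe = sanitizeMime(mime);
  // Assumption: anything that is not image/* is reported as a document.
  if (safe.startsWith("image/")) {
    stats.imagesStripped += 1;
    return { text: `[image: ${safe}]` };
  }
  stats.documentsStripped += 1;
  return { text: `[document: ${safe}]` };
}

// Replace media on one part; returns a new object instead of mutating.
function slimPart(part: Part, stats: SlimStats): Part {
  if (part.inlineData) return placeholderFor(part.inlineData.mimeType, stats);
  if (part.fileData) return placeholderFor(part.fileData.mimeType, stats);
  if (part.functionResponse?.parts) {
    // Recurse so media nested inside a tool result is replaced too.
    return {
      ...part,
      functionResponse: {
        ...part.functionResponse,
        parts: part.functionResponse.parts.map((p) => slimPart(p, stats)),
      },
    };
  }
  return part;
}

// Slim a list of parts for the summary side-query; the input array is left as-is.
function slimParts(parts: Part[]): { parts: Part[]; stats: SlimStats } {
  const stats: SlimStats = { imagesStripped: 0, documentsStripped: 0 };
  return { parts: parts.map((p) => slimPart(p, stats)), stats };
}
```

Because `slimParts` builds new objects, the caller can pass the slimmed copy to the summary request and keep the original history for the live conversation. The returned `stats` correspond loosely to the `imagesStripped` / `documentsStripped` counters that the commit message says are logged at debug level.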
design	feat(core): strip inline media before chat compaction summary (#4101)	2026-05-14 10:20:11 +08:00
developers	feat(cli,sdk): qwen serve daemon (Stage 1) (#3889)	2026-05-13 14:47:47 +08:00
plans	feat(vscode-ide-companion): add agent execution tool display (#2590)	2026-04-18 23:39:26 +08:00
users	feat(perf): progressive MCP availability — MCP no longer blocks first input (#3994 )	2026-05-13 22:17:16 +08:00
_meta.ts	feat: refactor docs	2025-12-05 10:51:57 +08:00
index.md	fix: lint issues	2025-12-19 15:52:11 +08:00