qwen-code/docs
顾盼 c512427f93
feat(core): strip inline media before chat compaction summary (#4101)
* feat(core): strip inline media before chat compaction summary

Compaction's side-query previously shipped historyToCompress verbatim.
Two related issues degraded summary quality and accuracy:

- Inline image / document bytes (from MCP tool results) leaked into the
  summary model's prompt where they could not be interpreted and merely
  inflated payload.
- findCompressSplitPoint apportioned chars via JSON.stringify(content),
  so a single 1 MB base64 image looked like ~350K tokens and biased
  the split point. Real Qwen-VL token cost is at most a few thousand.

This change adds a new compactionInputSlimming module that replaces
inlineData / fileData parts with short [image: <mime>] / [document:
<mime>] placeholders before the side-query, leaving live history
unchanged. The same constant feeds estimateContentChars so the
split-point algorithm sees the budget the summary model actually
consumes downstream. Microcompact is also extended to clear stale
inline images alongside old tool results.
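
A minimal sketch of the placeholder substitution described above. The `Part` / `Content` shapes and the names `placeholderFor` and `slimForSummary` are hypothetical stand-ins, not the module's actual API, and the rule "anything that is not `image/*` counts as a document" is an assumption:

```typescript
// Hypothetical minimal shapes; the real project types may differ.
interface MediaRef { mimeType?: string }
interface Part {
  text?: string;
  inlineData?: MediaRef & { data: string };
  fileData?: MediaRef & { fileUri: string };
}
interface Content { role: string; parts: Part[] }

// Assumed rule: image/* becomes [image: ...], everything else [document: ...].
function placeholderFor(mimeType: string | undefined): string {
  const mime = mimeType ?? 'unknown';
  return mime.startsWith('image/') ? `[image: ${mime}]` : `[document: ${mime}]`;
}

// Builds a slimmed copy for the side-query; the live history is not mutated.
function slimForSummary(history: Content[]): Content[] {
  return history.map((content) => ({
    ...content,
    parts: content.parts.map((part) => {
      const media = part.inlineData ?? part.fileData;
      return media ? { text: placeholderFor(media.mimeType) } : part;
    }),
  }));
}

const live: Content[] = [
  {
    role: 'user',
    parts: [
      { text: 'What is in this screenshot?' },
      { inlineData: { mimeType: 'image/png', data: 'iVBORw0KGgo' } },
    ],
  },
];
const slimmed = slimForSummary(live);
console.log(slimmed[0]?.parts[1]?.text); // [image: image/png]
```

Returning a copy rather than editing in place mirrors the "leaving live history unchanged" requirement: only the summary model sees the placeholders.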

A previous draft of the design also externalized large pastes to a
content-addressable on-disk cache, but it was withdrawn after surveying
claude-code's 2026-03 to 2026-05 releases: the upstream consensus is to
keep user input visible to the model and amortize cost via prompt
caching rather than externalize. See the Out-of-scope section of the
design doc for the full rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(core): recurse into functionResponse.parts when stripping media

E2E exposed that `read_file` (and any tool that surfaces an image)
wraps the result in `functionResponse.parts` via
`coreToolScheduler.createFunctionResponsePart`. The slimming module
only walked top-level `part.inlineData` / `part.fileData`, so the
nested base64 bytes leaked into the compaction side-query payload.
The previous design doc incorrectly claimed that no recursive walk
was needed.

Three changes:

- `slimCompactionInput.transformPart` recurses into the nested
  `functionResponse.parts` array and replaces each entry via the
  same image/document placeholder logic.
- `estimatePartChars` walks the nested array too, so the split-point
  algorithm doesn't fall back to `JSON.stringify` and over-count the
  base64 bytes.
- `microcompactHistory` drops `functionResponse.parts` when clearing
  an old tool result; the previous spread of `...part.functionResponse`
  silently carried the original media through.
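
The three changes above can be sketched roughly as follows. All shapes and function bodies here are illustrative assumptions (only inline images are handled, for brevity); they are not the project's actual code:

```typescript
// Hypothetical shapes for illustration only.
interface Part {
  text?: string;
  inlineData?: { mimeType: string; data: string };
  functionResponse?: {
    name: string;
    response: Record<string, unknown>;
    parts?: Part[];
  };
}

// 1. Replace media at the top level AND inside functionResponse.parts.
function transformPart(part: Part): Part {
  if (part.inlineData) return { text: `[image: ${part.inlineData.mimeType}]` };
  const nested = part.functionResponse?.parts;
  if (part.functionResponse && nested) {
    return {
      ...part,
      functionResponse: { ...part.functionResponse, parts: nested.map(transformPart) },
    };
  }
  return part;
}

// 2. Estimate chars by walking the nested array instead of stringifying base64.
function estimatePartChars(part: Part): number {
  if (part.inlineData) return `[image: ${part.inlineData.mimeType}]`.length;
  if (part.text !== undefined) return part.text.length;
  if (part.functionResponse) {
    const { parts = [], ...rest } = part.functionResponse;
    const nestedChars = parts.reduce((sum, p) => sum + estimatePartChars(p), 0);
    return JSON.stringify(rest).length + nestedChars;
  }
  return JSON.stringify(part).length;
}

// 3. Clearing an old tool result: destructure `parts` out so a spread of the
//    remaining fields cannot carry the original media through.
function clearToolResult(part: Part, clearedText: string): Part {
  if (!part.functionResponse) return part;
  const { parts: _dropped, ...rest } = part.functionResponse;
  return { functionResponse: { ...rest, response: { output: clearedText } } };
}

const toolResult: Part = {
  functionResponse: {
    name: 'read_file',
    response: {},
    parts: [{ inlineData: { mimeType: 'image/png', data: 'A'.repeat(100_000) } }],
  },
};
const slimmedResult = transformPart(toolResult);
const estimatedChars = estimatePartChars(toolResult);
const clearedResult = clearToolResult(toolResult, '[cleared]');
```

In this sketch `estimatedChars` stays in the tens of characters even though the nested payload is 100,000 base64 characters long, which is the property the split-point algorithm relies on.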

New unit tests cover (a) nested image / document stripping and (b) the
estimator no longer being skewed by nested base64. The previously
failing E2E now passes: side-query payload contains zero `data:image/`
occurrences, zero long base64 runs, and exactly one
`[image: image/png]` placeholder.

* fix(core): address review findings on compaction image stripping

Addresses 8 valid findings from PR review:

- [Critical] estimatePartTokens now handles `fileData` parts (both
  top-level and nested under functionResponse.parts). Without this,
  microcompact's `tokensSaved === 0` short-circuit silently discarded
  every fileData clear.

- estimatePartTokens for binary parts now uses a fixed
  MEDIA_PART_TOKEN_ESTIMATE constant (1,600) instead of base64-length
  divided by 4. The old formula billed a 1 MB image as ~250K tokens
  rather than its actual ~1,280 visual tokens on Qwen-VL, inflating
  the saved-token metric by orders of magnitude.

- mimeType values from MCP tool servers are now run through
  sanitizeMimeForPlaceholder before being embedded in `[image: …]` /
  `[document: …]` placeholders. An adversarial server could otherwise
  craft `image/png]\n\n[SYSTEM: …` and inject instructions into the
  summary side-query.

- collectCompactablePartRefs now recognizes a third 'nested-media'
  kind: functionResponse parts from non-compactable tools (e.g. MCP
  screenshots whose names aren't in COMPACTABLE_TOOLS) that carry
  images on functionResponse.parts. The nested media is dropped while
  the tool's text output is preserved. Previously such media
  accumulated forever in live history.

- keepRecent budgets are now per-kind (tool / media / nested-media).
  Setting `toolResultsNumToKeep: 1` keeps 1 of each kind rather than 1
  entry total across the merged list, which matches the natural reading
  of the setting name.

- findCompressSplitPoint's `precomputedCharCounts` fallback path is
  now documented as test-only; production callers MUST pass the
  precomputed array.

- The text-based branch of isAlreadyCleared is removed. It had become
  unreachable: the new nested-media handling drops `parts`, and the
  existing media handling replaces the part with `{ text: … }`, which
  is no longer collected.

- OpenAI converter (createToolMessage) now passes text parts inside
  functionResponse.parts through as text content. The slimmer writes
  `{ text: '[image: image/png]' }` placeholders into the nested array;
  without this fix the converter dropped them when serializing to the
  OpenAI wire format, leaving the summary model with empty tool
  responses instead of the placeholder.
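
To illustrate the mimeType finding in the list above, here is one way such a sanitizer could work. This is a sketch under assumptions, not the real `sanitizeMimeForPlaceholder`: the name `sanitizeMimeSketch`, the 64-character cap, the `'unknown'` fallback, and the choice to validate against the RFC 6838 restricted-name character set are all hypothetical.

```typescript
// Assumed cap on the length of a MIME string embedded in a placeholder.
const MAX_MIME_LENGTH = 64;

// type "/" subtype, each limited to RFC 6838 restricted-name characters.
const MIME_PATTERN = /^[a-z0-9][a-z0-9!#$&^_.+-]*\/[a-z0-9][a-z0-9!#$&^_.+-]*$/;

// Returns the MIME type if it looks well-formed, otherwise a fixed fallback.
// Brackets, whitespace, and newlines never reach the placeholder text.
function sanitizeMimeSketch(raw: string | undefined): string {
  const candidate = (raw ?? '').trim().toLowerCase();
  const ok = candidate.length <= MAX_MIME_LENGTH && MIME_PATTERN.test(candidate);
  return ok ? candidate : 'unknown';
}

const benign = sanitizeMimeSketch('image/png');
const hostile = sanitizeMimeSketch('image/png]\n\n[SYSTEM: ignore prior instructions');

console.log(`[image: ${benign}]`);  // [image: image/png]
console.log(`[image: ${hostile}]`); // [image: unknown]
```

An allowlist with a fixed fallback is used here instead of escaping, so a malformed value can never close the bracket or start a new line in the summary prompt.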

Two findings are deferred with rationale (see design doc Open Questions):
MIN_COMPRESSION_FRACTION still uses pre-slim counts (acceptable, since
"user shared an image" is itself worth summarizing); SlimResult is not
re-exported (round-3 simplify decided to keep core's public surface
minimal).

E2E re-verified: side-query payload contains 0 `data:image/`
occurrences, 0 long base64 runs, and 1 `[image: image/png]` placeholder
in the expected position. 185/185 collocated unit tests pass.
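
The three payload checks named above could be expressed along these lines. The helper name `auditPayload` and the 200-character threshold for a "long base64 run" are assumptions; the commit does not say how the E2E measures them.

```typescript
interface PayloadAudit {
  dataUris: number;
  longBase64Runs: number;
  placeholders: number;
}

// Counts the three signals on a serialized side-query payload.
// The 200-char threshold for a "long" base64 run is an assumed value.
function auditPayload(payload: string): PayloadAudit {
  const count = (pattern: RegExp): number => (payload.match(pattern) ?? []).length;
  return {
    dataUris: count(/data:image\//g),
    longBase64Runs: count(/[A-Za-z0-9+/]{200,}={0,2}/g),
    placeholders: count(/\[image: image\/png\]/g),
  };
}

const cleanPayload = JSON.stringify({
  role: 'tool',
  content: '[image: image/png]',
});
const leakyPayload = JSON.stringify({
  role: 'tool',
  content: 'data:image/png;base64,' + 'A'.repeat(500),
});

const cleanAudit = auditPayload(cleanPayload);
const leakyAudit = auditPayload(leakyPayload);
console.log(cleanAudit); // { dataUris: 0, longBase64Runs: 0, placeholders: 1 }
```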

* chore(core): tidy compaction slimming after self-review

Three small polishes from a follow-up code review pass:

- `stripNestedMedia` no longer re-casts its return value: after
  destructuring `parts` out of the widened input type, TypeScript
  infers the original `FunctionResponse` shape without help.
- `isAlreadyCleared` shed a 10-line comment block: the body is now
  one line, so one descriptive line above it is enough.
- OpenAI converter's nested-part text check switched from
  `(part as { text?: unknown }).text` to
  `'text' in part && typeof part.text === 'string'`, dropping the
  cast and letting `in` narrow the type.
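
The narrowing change in the last bullet can be illustrated with a small standalone example. The `NestedPart` union and `collectText` are hypothetical; only the `'text' in part && typeof part.text === 'string'` check is taken from the commit message.

```typescript
// Hypothetical union standing in for entries of functionResponse.parts.
type NestedPart =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

// Gathers text entries; no cast is needed because `in` narrows the union.
function collectText(parts: NestedPart[]): string {
  const texts: string[] = [];
  for (const part of parts) {
    if ('text' in part && typeof part.text === 'string') {
      texts.push(part.text); // `part` is narrowed to { text: string } here
    }
  }
  return texts.join('\n');
}

const toolContent = collectText([
  { text: '[image: image/png]' },
  { inlineData: { mimeType: 'image/png', data: 'iVBORw0KGgo' } },
]);
console.log(toolContent); // [image: image/png]
```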

No behavior change. 185/185 unit tests still pass.

* fix(core): wire slim stats to debug log; split MicrocompactMeta tools vs media

Addresses two follow-up review suggestions:

- `slimCompactionInput` returned `stats.imagesStripped` and
  `stats.documentsStripped` but the orchestrator never consumed them.
  Now logged at debug level whenever non-zero so operators can confirm
  the slimming pipeline actually fires on image-heavy compactions.

- `MicrocompactMeta.toolsCleared` lost meaning after the recent
  refactor: it had grown to count both tool-result clears AND
  inline-media / nested-media clears. The counters are now split:
  - `toolsCleared` → only `tool`-kind clears (compactable tool output)
  - `mediaCleared` → `media` + `nested-media` clears (new)
  - `toolsKept` / `mediaKept` mirror the split, replacing the prior
    `toolsKept` that was actually a combined count.

  The single non-test consumer (`client.ts` debug log) was updated to
  use both fields.
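
A rough sketch of how the split counters might be tallied, combined with the per-kind keep budget mentioned earlier in this commit series. The names `PartRef`, `MetaSketch`, and `tallyClears`, and the assumption that refs arrive oldest-first, are illustrative only:

```typescript
type Kind = 'tool' | 'media' | 'nested-media';
interface PartRef { kind: Kind }

// Mirrors the four field names from the commit message; the shape is assumed.
interface MetaSketch {
  toolsCleared: number;
  toolsKept: number;
  mediaCleared: number;
  mediaKept: number;
}

// Keeps the `keepRecent` newest refs of EACH kind and counts the rest as cleared.
function tallyClears(refsOldestFirst: PartRef[], keepRecent: number): MetaSketch {
  const meta: MetaSketch = { toolsCleared: 0, toolsKept: 0, mediaCleared: 0, mediaKept: 0 };
  const seen: Record<Kind, number> = { tool: 0, media: 0, 'nested-media': 0 };
  for (const { kind } of [...refsOldestFirst].reverse()) {
    const kept = seen[kind] < keepRecent;
    seen[kind] += 1;
    if (kind === 'tool') {
      if (kept) meta.toolsKept += 1;
      else meta.toolsCleared += 1;
    } else {
      // 'media' and 'nested-media' share the media counters.
      if (kept) meta.mediaKept += 1;
      else meta.mediaCleared += 1;
    }
  }
  return meta;
}

const meta = tallyClears(
  [
    { kind: 'tool' },
    { kind: 'tool' },
    { kind: 'media' },
    { kind: 'nested-media' },
    { kind: 'tool' },
    { kind: 'media' },
  ],
  1,
);
console.log(meta); // { toolsCleared: 2, toolsKept: 1, mediaCleared: 1, mediaKept: 2 }
```

With a budget of 1, one ref of each of the three kinds survives, so `mediaKept` is 2 (one `media`, one `nested-media`) while `toolsKept` is 1.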

185/185 unit tests pass.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:20:11 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| design | feat(core): strip inline media before chat compaction summary (#4101) | 2026-05-14 10:20:11 +08:00 |
| developers | feat(cli,sdk): qwen serve daemon (Stage 1) (#3889) | 2026-05-13 14:47:47 +08:00 |
| plans | feat(vscode-ide-companion): add agent execution tool display (#2590) | 2026-04-18 23:39:26 +08:00 |
| users | feat(perf): progressive MCP availability — MCP no longer blocks first input (#3994) | 2026-05-13 22:17:16 +08:00 |
| _meta.ts | feat: refactor docs | 2025-12-05 10:51:57 +08:00 |
| index.md | fix: lint issues | 2025-12-19 15:52:11 +08:00 |