mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-17 12:21:10 +00:00
* feat(core): strip inline media before chat compaction summary
Compaction's side-query previously shipped historyToCompress verbatim.
Two related issues degraded summary quality and accuracy:
- Inline image / document bytes (from MCP tool results) leaked into the
summary model's prompt where they could not be interpreted and merely
inflated payload.
- findCompressSplitPoint apportioned chars via JSON.stringify(content),
so a single 1 MB base64 image looked like ~350K tokens and biased
the split point. Real Qwen-VL token cost is at most a few thousand.
This change adds a new compactionInputSlimming module that replaces
inlineData / fileData parts with short [image: <mime>] / [document:
<mime>] placeholders before the side-query, leaving live history
unchanged. The same constant feeds estimateContentChars so the
split-point algorithm sees the budget the summary model actually
consumes downstream. Microcompact is also extended to clear stale
inline images alongside old tool results.
A previous draft of the design also externalized large pastes to a
content-addressable on-disk cache, but it was withdrawn after surveying
claude-code's 2026-03 to 2026-05 releases - upstream consensus is to
keep user input visible to the model and amortize cost via prompt
caching rather than externalize. See the Out-of-scope section of the
design doc for the full rationale.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(core): recurse into functionResponse.parts when stripping media
E2E exposed that `read_file` (and any tool that surfaces an image)
wraps the result in `functionResponse.parts` via
`coreToolScheduler.createFunctionResponsePart`. The slimming module
only walked top-level `part.inlineData` / `part.fileData`, so the
nested base64 bytes leaked into the compaction side-query payload.
The previous design doc incorrectly claimed that no recursive walk
was needed.
Three changes:
- `slimCompactionInput.transformPart` recurses into the nested
`functionResponse.parts` array and replaces each entry via the
same image/document placeholder logic.
- `estimatePartChars` walks the nested array too, so the split-point
algorithm doesn't fall back to `JSON.stringify` and over-count the
base64 bytes.
- `microcompactHistory` drops `functionResponse.parts` when clearing
an old tool result; the previous spread of `...part.functionResponse`
silently carried the original media through.
New unit tests cover (a) nested image / document stripping, (b) the
estimator no longer being skewed by nested base64. The previously
failing E2E now PASSES: side-query payload contains zero `data:image/`
occurrences, zero long base64 runs, and exactly one
`[image: image/png]` placeholder.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(core): address review findings on compaction image stripping
Addresses 8 valid findings from PR review:
- [Critical] estimatePartTokens now handles `fileData` parts (both
top-level and nested under functionResponse.parts). Without this,
microcompact's `tokensSaved === 0` short-circuit silently discarded
every fileData clear.
- estimatePartTokens for binary parts now uses a fixed
MEDIA_PART_TOKEN_ESTIMATE constant (1,600) instead of base64-length
divided by 4. The old formula billed a 1 MB image as ~250K tokens
rather than its actual ~1,280 visual tokens on Qwen-VL, inflating
the saved-token metric by orders of magnitude.
- mimeType values from MCP tool servers are now run through
sanitizeMimeForPlaceholder before being embedded in `[image: …]` /
`[document: …]` placeholders. An adversarial server could otherwise
craft `image/png]\n\n[SYSTEM: …` and inject instructions into the
summary side-query.
- collectCompactablePartRefs now recognizes a third 'nested-media'
kind: functionResponse parts from non-compactable tools (e.g. MCP
screenshots whose names aren't in COMPACTABLE_TOOLS) that carry
images on functionResponse.parts. The nested media is dropped while
the tool's text output is preserved. Previously such media
accumulated forever in live history.
- keepRecent budgets are now per-kind (tool / media / nested-media).
Setting `toolResultsNumToKeep: 1` keeps 1 of each kind rather than 1
entry total across the merged list — matches the natural reading of
the setting name.
- findCompressSplitPoint's `precomputedCharCounts` fallback path is
now documented as test-only; production callers MUST pass the
precomputed array.
- The text-based branch of isAlreadyCleared is gone: with the new
nested-media handling (drops `parts`) and existing media handling
(replaces with `{ text: … }` that is no longer collected) it was
unreachable.
- OpenAI converter (createToolMessage) now passes text parts inside
functionResponse.parts through as text content. The slimmer writes
`{ text: '[image: image/png]' }` placeholders into the nested array;
without this fix the converter dropped them when serializing to the
OpenAI wire format, leaving the summary model with empty tool
responses instead of the placeholder.
Two findings deferred with rationale (see design doc Open Questions):
MIN_COMPRESSION_FRACTION still uses pre-slim counts (acceptable —
"user shared an image" is itself worth summarizing); SlimResult is not
re-exported (round-3 simplify decided to keep core's public surface
minimal).
E2E re-verified end-to-end: side-query payload contains 0 data:image/
occurrences, 0 long base64 runs, and 1 `[image: image/png]` placeholder
in the expected position. 185/185 collocated unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(core): tidy compaction slimming after self-review
Three small polishes from a follow-up code review pass:
- `stripNestedMedia` no longer re-casts its return value: after
destructuring `parts` out of the widened input type, TypeScript
infers the original `FunctionResponse` shape without help.
- `isAlreadyCleared` shed a 10-line comment block — the body is now
one line, so one descriptive line above it is enough.
- OpenAI converter's nested-part text check switched from
`(part as { text?: unknown }).text` to
`'text' in part && typeof part.text === 'string'`, dropping the
cast and letting `in` narrow the type.
No behavior change. 185/185 unit tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(core): wire slim stats to debug log; split MicrocompactMeta tools vs media
Addresses two follow-up review suggestions:
- `slimCompactionInput` returned `stats.imagesStripped` and
`stats.documentsStripped` but the orchestrator never consumed them.
Now logged at debug level whenever non-zero so operators can confirm
the slimming pipeline actually fires on image-heavy compactions.
- `MicrocompactMeta.toolsCleared` lost meaning after the recent
refactor: it had grown to count both tool-result clears AND
inline-media / nested-media clears. Renamed:
- `toolsCleared` → only `tool`-kind clears (compactable tool output)
- `mediaCleared` → `media` + `nested-media` clears (new)
- `toolsKept` / `mediaKept` mirror the split, replacing the prior
`toolsKept` that was actually a combined count.
The single non-test consumer (`client.ts` debug log) updated to use
both fields.
185/185 unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| design | ||
| developers | ||
| plans | ||
| users | ||
| _meta.ts | ||
| index.md | ||