vrr/qwen-code


mirror of https://github.com/QwenLM/qwen-code.git synced 2026-04-28 03:30:40 +00:00

Shaojin Wen 60a6dfc14c


feat(cli): add session recap with /recap and auto-show on return (#3434)

* feat(cli): add session recap with /recap and auto-show on return

Users often open an old session days later and need to scroll through
pages to remember where they left off. This change adds a short
"where did I leave off" recap — a 1-3 sentence summary generated by
the fast model — so they can resume without re-reading the history.

Two triggers:
- /recap: manual slash command.
- Auto: when the terminal has been blurred for 5+ minutes and gets
focused again (uses the existing DECSET 1004 focus protocol via
useFocus). Gated on streamingState === Idle so it never interrupts
an active turn. Only fires once per blur cycle.

The recap is rendered in dim color with a chevron prefix, visually
distinct from assistant replies. A new `general.showSessionRecap`
setting controls the auto-trigger (default on). /recap works
independent of the setting.

Implementation notes:
- generateSessionRecap uses fastModel (falls back to main model),
tools: [], maxOutputTokens: 300, and a tight system prompt. It
strips tool calls / responses from history before sending — tool
responses can hold 10K+ tokens of file content that drown the recap
in irrelevant detail. The 30-message window respects turn boundaries
(slice never starts on a dangling model/tool response).
- Output is wrapped in <recap>...</recap> tags; the extractor returns
empty (skips render) if the tag is missing, preventing model
reasoning from leaking into the UI.
- All failures are silent (return null) and logged via a scoped
debugLogger; recap is best-effort and must never break main flow.
- /recap refuses to run while a turn is pending.

* fix(cli): abort in-flight recap when showSessionRecap is disabled

If the user disables showSessionRecap while an auto-recap LLM call is
already in flight, the previous code returned early without aborting.
The pending .then would still pass its idle/abort guards and append the
recap, producing an unwanted message after the user has opted out.

Abort the controller and clear it eagerly so the resolved promise no
longer adds to history.

* fix(cli): gate /recap and auto-recap on streaming idle state

Two related issues from review:

1. /recap was only refusing when ui.pendingItem was set, but a normal
model reply runs with streamingState === Responding and a null
pendingItem. Invoking /recap mid-stream would generate a recap from
a partial conversation and insert it between the user prompt and
the assistant reply.

2. useAwaySummary cleared blurredAtRef before checking isIdle, so if
focus returned during a still-streaming turn (after a >5min blur)
the recap was permanently dropped — there was no later retry when
the turn became idle, because isIdle was not in the effect deps.

Fixes:
- Expose isIdleRef on CommandContext.ui (mirrors btwAbortControllerRef
pattern). Plumb it from AppContainer through useSlashCommandProcessor.
- recapCommand now refuses when isIdleRef.current is false OR
pendingItem is non-null.
- useAwaySummary preserves blurredAtRef on the !isIdle bail and adds
isIdle to the effect deps, so the trigger re-evaluates when the
current turn finishes.
- Brief blurs (< AWAY_THRESHOLD_MS) still reset blurredAtRef.

Also seeds isIdleRef in nonInteractiveUi and mockCommandContext so the
new field has a sensible default outside the interactive UI.

* docs: document /recap command, showSessionRecap setting, and design

- User docs: add /recap to the Session and Project Management table in
features/commands.md and a dedicated subsection covering manual use,
the auto-trigger, the dim-color rendering, and the fast-model tip.
- User docs: add general.showSessionRecap row to the configuration
settings reference.
- Design doc: docs/design/session-recap/session-recap-design.md covers
motivation, the two trigger paths, the per-file architecture, prompt
design with the <recap> tag and three-tier extractor, history
filtering rationale (functionResponse can be 10K+ tokens), the
useAwaySummary state machine, the isIdleRef gating for /recap, model
selection, observability, and out-of-scope items.

* fix(core): exclude thought parts from session recap context

filterToDialog kept any non-empty text part, but @google/genai's Part
type also marks model reasoning with part.thought / part.thoughtSignature.
That hidden chain-of-thought was being fed to the recap LLM and could
get summarized as if it were user-visible dialogue.

Drop parts where either flag is set. Update the design doc's
History Filtering section to call this out alongside the existing
tool-call/response rationale.

* docs(session-recap): correct debug-logging guidance, fill in state machine, sharpen UX wording

Audit of the session recap docs against the implementation found three
issues worth fixing:

- Design doc claimed debug logs were enabled via a QWEN_CODE_DEBUG_LOGGING
env var. That var does not exist; debug logs are written to
~/.qwen/debug/<sessionId>.txt by default, gated by QWEN_DEBUG_LOG_FILE.
Replace with the accurate path + opt-out behavior, and tell the reader
to grep for the [SESSION_RECAP] tag.
- Design doc's useAwaySummary state machine table was missing the
isFocused && blurredAtRef === null path (taken on first render and
right after a brief-blur reset). Add the row.
- User doc's "Refuses to run ... failures are silent" line conflated the
inline-error refusal with silent generation failures, and "(when the
conversation is idle)" used internal jargon. Split the two cases and
spell out what "idle" means, including the wait-then-fire behavior
when focus returns mid-turn.

* docs(session-recap): correctly describe /recap vs auto-trigger failure modes

The previous wording said "Generation/network failures are silent — the
recap simply does not appear", but recapCommand returns a user-facing
info message ("Not enough conversation context for a recap yet.") in
exactly that path, and also returns inline messages for the
config-not-loaded and busy-turn guards.

Only the auto-trigger path is truly silent (it just skips addItem when
generateSessionRecap returns null). Split the two paths in the doc so
the manual command's "always responds with something" behavior is
distinguished from the auto-trigger's no-op-on-failure behavior.

* docs(session-recap): align prompt-rules section with the actual prompt

Two doc-vs-code mismatches in the design doc's "System Prompt" section,
caught with the same lens as yiliang114's failure-mode review:

- The bullet list claimed RECAP_SYSTEM_PROMPT forbids "guessing the
user's intent" and "addressing the user as 'you'". Those rules existed
in an early draft but were dropped when the <recap> tag rules were
added; the current prompt has no such restrictions. Replace with the
actual rules and add a "corresponds 1:1 with RECAP_SYSTEM_PROMPT"
marker so future edits stay in sync.
- The doc said systemInstruction "overrides" the main agent prompt. True
for the agent prompt portion, but GeminiClient.generateContent
internally calls getCustomSystemPrompt which appends user memory
(QWEN.md / auto memory) as a suffix. Spell that out: the final
system prompt is recap prompt + user memory, which is actually
useful project context for the recap.

* docs(session-recap): translate design doc to English

The repo convention for docs/design is English (7 of 8 existing files;
auto-memory/memory-system.md is the only Chinese one). The first version
of this design doc followed the auto-memory example, which turned out
to be the wrong sample.

Translate to English while preserving the existing structure, the
state-machine table, the prompt-vs-doc 1:1 alignment, the
QWEN_DEBUG_LOG_FILE description, and the failure-mode notes added in
prior commits.

* fix(cli): drop empty info return from /recap interactive success path

The interactive success path inserts the away_recap history item
directly via ui.addItem and then returned `{type: 'message',
messageType: 'info', content: ''}`. The slash-command processor's
'message' case unconditionally calls addMessage, which adds another
HistoryItemInfo with empty text. The empty info renders as nothing
(StatusMessage early-returns null), but it still bloats the in-memory
history list and shows up in /export and saved sessions.

Return void on the interactive success path and on the abort path so
the processor's `if (result)` check skips the message-handler branch
entirely. Widen the action's return type to `void | SlashCommandActionReturn`
to match (same shape as btwCommand).

2026-04-19 21:38:48 +08:00


Session Recap Design

A 1-3 sentence "where did I leave off" summary surfaced when the user returns to an idle session, either on demand (/recap) or after the terminal has been blurred for 5+ minutes.

Overview

When a user /resumes an old session days later, scrolling back through pages of history to remember what they were doing and what came next is a real friction point. Just reloading messages does not solve this UX problem.

The goal is to proactively surface a 1-3 sentence recap when the user returns:

- High-level task (what they are doing) → next step (what to do next).
- Visually distinct from real assistant replies, so it is never mistaken for new model output.
- Best-effort: failures must be silent and never break the main flow.

Triggers

| Trigger | Conditions | Implementation |
| --- | --- | --- |
| Manual | User runs `/recap` | `recapCommand.ts` calls the same underlying service |
| Auto | Terminal blurred (DECSET 1004 focus protocol) for ≥ 5 min + focus returns + stream is `Idle` | `useAwaySummary.ts`: 5-minute blur timer + `useFocus` event listener |

Both paths funnel into a single function — generateSessionRecap() — to guarantee identical behavior. The auto-trigger is gated by general.showSessionRecap (default: on); the manual command ignores that setting.

Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                          AppContainer.tsx                              │
│   isFocused = useFocus()                                               │
│   isIdle = streamingState === Idle                                     │
│       │                                                                │
│       ├─→ useAwaySummary({enabled, config, isFocused, isIdle, addItem})│
│       │       │                                                        │
│       │       └─→ 5 min blur timer + idle/dedupe gates                 │
│       │              │                                                 │
│       │              ↓                                                 │
│       └─→ recapCommand (slash) ─→ generateSessionRecap(config, signal) │
│                                          │                             │
│                                          ↓                             │
│                              ┌─────────────────────────┐               │
│                              │ packages/core/services/ │               │
│                              │   sessionRecap.ts       │               │
│                              └─────────────────────────┘               │
│                                          │                             │
│                                          ↓                             │
│                              GeminiClient.generateContent              │
│                              (fastModel + tools:[])                    │
│                                                                        │
│   addItem({type: 'away_recap', text}) ─→ HistoryItemDisplay            │
│                                            └─ AwayRecapMessage         │
│                                               (dim color + ❯ prefix)   │
└────────────────────────────────────────────────────────────────────────┘

Files

| File | Responsibility |
| --- | --- |
| `packages/core/src/services/sessionRecap.ts` | One-shot LLM call + history filter + tag extraction |
| `packages/cli/src/ui/hooks/useAwaySummary.ts` | Auto-trigger React hook |
| `packages/cli/src/ui/commands/recapCommand.ts` | `/recap` manual entry point |
| `packages/cli/src/ui/components/messages/StatusMessages.tsx` | `AwayRecapMessage` dim renderer |
| `packages/cli/src/ui/types.ts` | `HistoryItemAwayRecap` type |
| `packages/cli/src/ui/components/HistoryItemDisplay.tsx` | Renderer dispatch |
| `packages/cli/src/config/settingsSchema.ts` | `general.showSessionRecap` setting |

Prompt Design

System Prompt

generationConfig.systemInstruction replaces the main agent's system prompt for this single call, so the model behaves only as a recap generator and not as a coding assistant.

Note that GeminiClient.generateContent() internally runs the prompt through getCustomSystemPrompt(), which appends the user's memory (QWEN.md / managed auto-memory) as a suffix. The final system prompt is therefore recap prompt + user memory — useful project context for the recap, not a leak.

Bullets below correspond 1:1 with RECAP_SYSTEM_PROMPT:

- 1 to 3 short sentences, plain prose (no markdown / lists / headings).
- First sentence: the high-level task. Then: the concrete next step.
- Explicitly forbid: listing what was done, reciting tool calls, status reports.
- Match the dominant language of the conversation (English or Chinese).
- Wrap output in <recap>...</recap>; nothing outside the tags.

Structured Output + Extraction

The model is instructed to wrap its answer in <recap>...</recap>:

<recap>Refactoring loopDetectionService.ts to address long-session OOM. Next step is to implement option B.</recap>

Why: some models (GLM family, reasoning models) write a "thinking" paragraph before the final answer. Returning the raw text would leak that reasoning into the UI.

extractRecap() has three fallback tiers:

1. Both tags present: take what is between <recap>...</recap> (preferred).
2. Only the open tag (e.g. maxOutputTokens truncated the close tag): take everything after the open tag.
3. Tag missing entirely: return empty string → service returns null → UI renders nothing.

The third tier is "skip rather than show the wrong thing" — surfacing the model's reasoning preamble is worse than showing no recap at all.
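The three tiers can be sketched as a small pure function. This is a minimal illustration of the behavior described above, not the actual body of extractRecap(); the constant names and the trimming of whitespace are assumptions.

```typescript
// Hypothetical sketch of the three-tier extraction; the real extractRecap() may differ.
const OPEN_TAG = '<recap>';
const CLOSE_TAG = '</recap>';

function extractRecap(raw: string): string {
  const open = raw.indexOf(OPEN_TAG);
  // Tier 3: no open tag at all, so return empty and let the caller skip rendering.
  if (open === -1) return '';
  const start = open + OPEN_TAG.length;
  const close = raw.indexOf(CLOSE_TAG, start);
  // Tier 1: both tags present. Tier 2: close tag missing (e.g. truncated output).
  const body = close === -1 ? raw.slice(start) : raw.slice(start, close);
  // Trimming is an assumption made for this sketch.
  return body.trim();
}
```

Anything the model writes before the open tag, such as a reasoning preamble, is discarded in every tier.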

Call Parameters

| Parameter | Value | Reason |
| --- | --- | --- |
| `model` | `getFastModel() ?? getModel()` | Recap doesn't need a frontier model |
| `tools` | `[]` | One-shot query, no tool use |
| `maxOutputTokens` | `300` | Enough for 1-3 sentences + tags; larger would encourage rambling |
| `temperature` | `0.3` | Mostly deterministic, with a bit of natural variation |
| `systemInstruction` | The recap-only prompt above | Replaces the main agent's role definition |
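As a rough illustration, the parameters in the table could be assembled like this. The `RecapCallParams` shape, the `buildRecapCallParams` helper, and the placeholder prompt text are all hypothetical; only the values come from the table, and the real request passed to GeminiClient.generateContent may be structured differently.

```typescript
// Placeholder text: the real RECAP_SYSTEM_PROMPT is not reproduced here.
const RECAP_SYSTEM_PROMPT =
  'Write a 1-3 sentence session recap wrapped in <recap> tags.';

// Hypothetical shape for this sketch only.
interface RecapCallParams {
  model: string;
  config: {
    tools: never[];
    maxOutputTokens: number;
    temperature: number;
    systemInstruction: string;
  };
}

function buildRecapCallParams(model: string): RecapCallParams {
  return {
    model,
    config: {
      tools: [], // one-shot query, no tool use
      maxOutputTokens: 300, // enough for 1-3 sentences plus the tags
      temperature: 0.3, // mostly deterministic
      systemInstruction: RECAP_SYSTEM_PROMPT, // replaces the agent's role definition
    },
  };
}
```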

History Filtering

geminiClient.getChat().getHistory() returns a Content[] that includes:

- user / model text messages
- model functionCall parts
- user functionResponse parts (which can hold full file contents)
- model thought parts (part.thought / part.thoughtSignature, the model's hidden reasoning)

filterToDialog() keeps only user / model parts that have non-empty text and are not thoughts. Two reasons:

- Tool calls / responses: a single functionResponse can be 10K+ tokens. 30 such messages would drown the recap LLM in irrelevant detail, both wasting tokens and biasing the recap toward implementation noise like "called X tool to read Y file".
- Thought parts: carry the model's internal reasoning. Including them risks treating hidden chain-of-thought as dialogue and surfacing it in the recap text.

After dropping empty messages, takeRecentDialog slices to the last 30 messages and refuses to start the slice on a dangling model/tool response.
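A minimal sketch of the two steps, using simplified stand-ins for the `Part` and `Content` types so the example is self-contained. The function bodies are assumptions that follow the description above; in particular, advancing the window to the next user message is one plausible way to avoid starting on a dangling response.

```typescript
// Simplified stand-ins for the @google/genai types (assumption for this sketch).
interface Part {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
  functionCall?: unknown;
  functionResponse?: unknown;
}
interface Content {
  role: 'user' | 'model';
  parts?: Part[];
}

function filterToDialog(history: Content[]): Content[] {
  const out: Content[] = [];
  for (const msg of history) {
    // Keep only visible, non-empty text; drop thoughts and tool traffic.
    const parts = (msg.parts ?? []).filter(
      (p) =>
        typeof p.text === 'string' &&
        p.text.trim() !== '' &&
        !p.thought &&
        !p.thoughtSignature,
    );
    // Messages left with no parts are dropped entirely.
    if (parts.length > 0) {
      out.push({ role: msg.role, parts: parts.map((p) => ({ text: p.text })) });
    }
  }
  return out;
}

function takeRecentDialog(dialog: Content[], max = 30): Content[] {
  let start = Math.max(0, dialog.length - max);
  // Move forward to a user message so the window never opens on a dangling reply.
  while (start < dialog.length && dialog[start]?.role !== 'user') start++;
  return dialog.slice(start);
}
```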

Concurrency and Edge Cases

Auto-trigger hook state machine

useAwaySummary keeps three refs:

| Ref | Meaning |
| --- | --- |
| `blurredAtRef` | Blur start time (not cleared until focus returns) |
| `recapPendingRef` | Whether an LLM call is in flight |
| `inFlightRef` | The current in-flight `AbortController` |

useEffect deps: [enabled, config, isFocused, isIdle, addItem].

| Event | Action |
| --- | --- |
| `!enabled \|\| !config` | Abort in-flight call + clear `inFlightRef` + clear `blurredAtRef` |
| `!isFocused` and `blurredAtRef === null` | Set `blurredAtRef = Date.now()` |
| `isFocused` and `blurredAtRef === null` | Return early (no blur cycle to handle: first render or right after a brief-blur reset) |
| `isFocused` and blur duration < 5 min | Clear `blurredAtRef`, wait for next blur cycle |
| `isFocused` and blur ≥ 5 min and `recapPendingRef` | Return (dedupe) |
| `isFocused` and blur ≥ 5 min and `!isIdle` | Preserve `blurredAtRef` and wait for the turn to finish (`isIdle` is in the deps, so the effect re-fires when streaming completes) |
| `isFocused` and all conditions met | Clear `blurredAtRef`, set `recapPendingRef = true`, create `AbortController`, send the LLM request |

The .then callback re-checks isIdleRef.current: if the user has started a new turn while the LLM was running, the late-arriving recap is dropped to avoid inserting it mid-turn.

The .finally clears recapPendingRef, and clears inFlightRef only if inFlightRef.current === controller (so it doesn't overwrite a newer controller).

A second useEffect aborts the in-flight controller on unmount.
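The event table in this subsection can be modeled as a pure decision function, which makes the ordering of the gates explicit. This is an illustrative sketch rather than the hook's code: the `AwayState` and `AwayAction` names are hypothetical, the 5-minute value for `AWAY_THRESHOLD_MS` is taken from the text above, and the case of an ongoing blur (`!isFocused` with a blur time already recorded) is assumed to be a no-op because the table does not list it.

```typescript
const AWAY_THRESHOLD_MS = 5 * 60 * 1000;

// Hypothetical names for this sketch.
type AwayAction =
  | 'abort-and-reset'
  | 'mark-blur'
  | 'noop'
  | 'reset-brief-blur'
  | 'dedupe'
  | 'wait-for-idle'
  | 'fire';

interface AwayState {
  enabled: boolean;
  hasConfig: boolean;
  isFocused: boolean;
  isIdle: boolean;
  blurredAt: number | null; // mirrors blurredAtRef.current
  recapPending: boolean; // mirrors recapPendingRef.current
  now: number;
}

function decideAwayAction(s: AwayState): AwayAction {
  if (!s.enabled || !s.hasConfig) return 'abort-and-reset';
  // Assumption: an ongoing blur with a recorded start time needs no action.
  if (!s.isFocused) return s.blurredAt === null ? 'mark-blur' : 'noop';
  if (s.blurredAt === null) return 'noop'; // first render or after a brief-blur reset
  if (s.now - s.blurredAt < AWAY_THRESHOLD_MS) return 'reset-brief-blur';
  if (s.recapPending) return 'dedupe';
  if (!s.isIdle) return 'wait-for-idle'; // blurredAt is preserved for the retry
  return 'fire';
}
```

Because `'wait-for-idle'` leaves the blur time untouched, re-running the function once `isIdle` becomes true yields `'fire'`, which corresponds to the wait-then-fire behavior described above.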

`/recap` gating

CommandContext.ui.isIdleRef exposes the current stream state (mirroring the existing btwAbortControllerRef pattern). In interactive mode, recapCommand refuses when !isIdleRef.current or pendingItem !== null. pendingItem alone is insufficient because a normal model reply runs with streamingState === Responding and a null pendingItem.
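The gate can be sketched as a single predicate. The `RecapUiContext` interface and the `canRunRecap` name are hypothetical; the actual `CommandContext.ui` type has more fields, and recapCommand may structure the check differently.

```typescript
// Hypothetical, reduced view of CommandContext.ui for this sketch.
interface RecapUiContext {
  isIdleRef: { current: boolean };
  pendingItem: object | null;
}

function canRunRecap(ui: RecapUiContext): boolean {
  // Both gates are needed: a streaming reply can have a null pendingItem.
  return ui.isIdleRef.current && ui.pendingItem === null;
}
```

The case that motivated the second gate is a normal streaming reply, where `pendingItem` is null but `isIdleRef.current` is false, so the predicate returns false.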

Configuration and Model Selection

User-facing knobs

| Setting | Default | Notes |
| --- | --- | --- |
| `general.showSessionRecap` | `true` | Auto-trigger only. Manual `/recap` ignores this. |
| `fastModel` | unset | Recommended (e.g. `qwen3-coder-flash`) for fast and cheap recaps. |

Model fallback

config.getFastModel() ?? config.getModel():

- User has a fastModel set and it is valid for the current auth type → use fastModel.
- Otherwise → fall back to the main session model (works, just costlier and slower).
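A minimal sketch of the fallback, assuming `getFastModel()` returns `undefined` when no usable fast model is configured. The `ModelConfig` interface and `pickRecapModel` helper are hypothetical reductions of the real config object, and `'main-model'` in the usage below is a placeholder name.

```typescript
// Hypothetical, reduced view of the config object for this sketch.
interface ModelConfig {
  getFastModel(): string | undefined;
  getModel(): string;
}

function pickRecapModel(config: ModelConfig): string {
  // Nullish coalescing: use the fast model when set, else the session model.
  return config.getFastModel() ?? config.getModel();
}

// Usage with placeholder model names:
const chosen = pickRecapModel({
  getFastModel: () => undefined,
  getModel: () => 'main-model',
});
console.log(chosen); // prints "main-model"
```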

Observability

createDebugLogger('SESSION_RECAP') emits:

- Caught exceptions from the recap path (debugLogger.warn).

No failure on the recap path is thrown into the UI; recap is an auxiliary feature and must not break the main flow. Developers can grep for the [SESSION_RECAP] tag in the debug log file, which is written by default to ~/.qwen/debug/<sessionId>.txt (latest.txt symlinks to the current session); set QWEN_DEBUG_LOG_FILE=0 to disable it.

Out of Scope

| Item | Why not |
| --- | --- |
| Progress UI for `/recap` (spinner / pendingItem) | A 3-5 second wait is tolerable; a progress UI would add complexity. |
| Automated tests | Service is small (~150 lines) and was end-to-end tested manually first; unit tests can land in a separate PR. |
| Localized prompts | The system prompt is for the model; English is the most reliable substrate. The model selects the output language from the conversation. |
| `QWEN_CODE_ENABLE_AWAY_SUMMARY` env var | Claude Code uses it to keep the feature on when telemetry is disabled; Qwen Code's current telemetry model doesn't need this. |
| Auto-recap on `/resume` completion | A natural follow-up but needs a hook point in `useResumeCommand`; out of scope for this PR. |
