feat(cli): add session recap with /recap and auto-show on return (#3434)

* feat(cli): add session recap with /recap and auto-show on return

Users often open an old session days later and need to scroll through pages to remember where they left off. This change adds a short "where did I leave off" recap (a 1-3 sentence summary generated by the fast model) so they can resume without re-reading the history.

Two triggers:

- /recap: manual slash command.
- Auto: when the terminal has been blurred for 5+ minutes and gets focused again (uses the existing DECSET 1004 focus protocol via useFocus). Gated on streamingState === Idle so it never interrupts an active turn. Only fires once per blur cycle.

The recap is rendered in dim color with a chevron prefix, visually distinct from assistant replies. A new `general.showSessionRecap` setting controls the auto-trigger (default on). /recap works independently of the setting.

Implementation notes:

- generateSessionRecap uses fastModel (falls back to main model), tools: [], maxOutputTokens: 300, and a tight system prompt. It strips tool calls / responses from history before sending, because tool responses can hold 10K+ tokens of file content that drown the recap in irrelevant detail. The 30-message window respects turn boundaries (slice never starts on a dangling model/tool response).
- Output is wrapped in <recap>...</recap> tags; the extractor returns empty (skips render) if the tag is missing, preventing model reasoning from leaking into the UI.
- All failures are silent (return null) and logged via a scoped debugLogger; recap is best-effort and must never break main flow.
- /recap refuses to run while a turn is pending.

* fix(cli): abort in-flight recap when showSessionRecap is disabled

If the user disables showSessionRecap while an auto-recap LLM call is already in flight, the previous code returned early without aborting. The pending .then would still pass its idle/abort guards and append the recap, producing an unwanted message after the user has opted out.
Abort the controller and clear it eagerly so the resolved promise no longer adds to history.

* fix(cli): gate /recap and auto-recap on streaming idle state

Two related issues from review:

1. /recap was only refusing when ui.pendingItem was set, but a normal model reply runs with streamingState === Responding and a null pendingItem. Invoking /recap mid-stream would generate a recap from a partial conversation and insert it between the user prompt and the assistant reply.
2. useAwaySummary cleared blurredAtRef before checking isIdle, so if focus returned during a still-streaming turn (after a >5min blur) the recap was permanently dropped. There was no later retry when the turn became idle, because isIdle was not in the effect deps.

Fixes:

- Expose isIdleRef on CommandContext.ui (mirrors btwAbortControllerRef pattern). Plumb it from AppContainer through useSlashCommandProcessor.
- recapCommand now refuses when isIdleRef.current is false OR pendingItem is non-null.
- useAwaySummary preserves blurredAtRef on the !isIdle bail and adds isIdle to the effect deps, so the trigger re-evaluates when the current turn finishes.
- Brief blurs (< AWAY_THRESHOLD_MS) still reset blurredAtRef.

Also seeds isIdleRef in nonInteractiveUi and mockCommandContext so the new field has a sensible default outside the interactive UI.

* docs: document /recap command, showSessionRecap setting, and design

- User docs: add /recap to the Session and Project Management table in features/commands.md and a dedicated subsection covering manual use, the auto-trigger, the dim-color rendering, and the fast-model tip.
- User docs: add general.showSessionRecap row to the configuration settings reference.
- Design doc: docs/design/session-recap/session-recap-design.md covers motivation, the two trigger paths, the per-file architecture, prompt design with the <recap> tag and three-tier extractor, history filtering rationale (functionResponse can be 10K+ tokens), the useAwaySummary state machine, the isIdleRef gating for /recap, model selection, observability, and out-of-scope items.

* fix(core): exclude thought parts from session recap context

filterToDialog kept any non-empty text part, but @google/genai's Part type also marks model reasoning with part.thought / part.thoughtSignature. That hidden chain-of-thought was being fed to the recap LLM and could get summarized as if it were user-visible dialogue. Drop parts where either flag is set.

Update the design doc's History Filtering section to call this out alongside the existing tool-call/response rationale.

* docs(session-recap): correct debug-logging guidance, fill in state machine, sharpen UX wording

Audit of the session recap docs against the implementation found three issues worth fixing:

- Design doc claimed debug logs were enabled via a QWEN_CODE_DEBUG_LOGGING env var. That var does not exist; debug logs are written to ~/.qwen/debug/<sessionId>.txt by default, gated by QWEN_DEBUG_LOG_FILE. Replace with the accurate path + opt-out behavior, and tell the reader to grep for the [SESSION_RECAP] tag.
- Design doc's useAwaySummary state machine table was missing the isFocused && blurredAtRef === null path (taken on first render and right after a brief-blur reset). Add the row.
- User doc's "Refuses to run ... failures are silent" line conflated the inline-error refusal with silent generation failures, and "(when the conversation is idle)" used internal jargon. Split the two cases and spell out what "idle" means, including the wait-then-fire behavior when focus returns mid-turn.
* docs(session-recap): correctly describe /recap vs auto-trigger failure modes

The previous wording said "Generation/network failures are silent: the recap simply does not appear", but recapCommand returns a user-facing info message ("Not enough conversation context for a recap yet.") in exactly that path, and also returns inline messages for the config-not-loaded and busy-turn guards. Only the auto-trigger path is truly silent (it just skips addItem when generateSessionRecap returns null).

Split the two paths in the doc so the manual command's "always responds with something" behavior is distinguished from the auto-trigger's no-op-on-failure behavior.

* docs(session-recap): align prompt-rules section with the actual prompt

Two doc-vs-code mismatches in the design doc's "System Prompt" section, caught with the same lens as yiliang114's failure-mode review:

- The bullet list claimed RECAP_SYSTEM_PROMPT forbids "speculating about user intent" and "addressing the user as 'you'". Those rules existed in an early draft but were dropped when the <recap> tag rules were added; the current prompt has no such restrictions. Replace with the actual rules and add a "corresponds 1:1 with RECAP_SYSTEM_PROMPT" marker so future edits stay in sync.
- The doc said systemInstruction "overrides" the main agent prompt. True for the agent prompt portion, but GeminiClient.generateContent internally calls getCustomSystemPrompt which appends user memory (QWEN.md / auto memory) as a suffix. Spell that out: the final system prompt is recap prompt + user memory, which is actually useful project context for the recap.

* docs(session-recap): translate design doc to English

The repo convention for docs/design is English (7 of 8 existing files; auto-memory/memory-system.md is the only Chinese one). The first version of this design doc followed the auto-memory example, which turned out to be the wrong sample.
Translate to English while preserving the existing structure, the state-machine table, the prompt-vs-doc 1:1 alignment, the QWEN_DEBUG_LOG_FILE description, and the failure-mode notes added in prior commits.

* fix(cli): drop empty info return from /recap interactive success path

The interactive success path inserts the away_recap history item directly via ui.addItem and then returned `{type: 'message', messageType: 'info', content: ''}`. The slash-command processor's 'message' case unconditionally calls addMessage, which adds another HistoryItemInfo with empty text. The empty info renders as nothing (StatusMessage early-returns null), but it still bloats the in-memory history list and shows up in /export and saved sessions.

Return void on the interactive success path and on the abort path so the processor's `if (result)` check skips the message-handler branch entirely. Widen the action's return type to `void | SlashCommandActionReturn` to match (same shape as btwCommand).
2026-04-28 11:41:04 +00:00 · 2026-04-19 21:38:48 +08:00 · 2026-04-19 21:38:48 +08:00 · 60a6dfc14c
commit 60a6dfc14c
parent 528fcfcff8
19 changed files with 702 additions and 4 deletions
--- a/docs/design/session-recap/session-recap-design.md
+++ b/docs/design/session-recap/session-recap-design.md
@ -0,0 +1,239 @@
+# Session Recap Design
+
+> A 1-3 sentence "where did I leave off" summary surfaced when the user
+> returns to an idle session, either on demand (`/recap`) or after the
+> terminal has been blurred for 5+ minutes.
+
+## Overview
+
+When a user `/resume`s an old session days later, scrolling back through
+pages of history to remember **what they were doing and what came next**
+is a real friction point. Just reloading messages does not solve this
+UX problem.
+
+The goal is to proactively surface a 1-3 sentence recap when the user
+returns:
+
+- **High-level task** (what they are doing) → **next step** (what to do next).
+- Visually distinct from real assistant replies, so it is never mistaken
+  for new model output.
+- **Best-effort**: failures must be silent and never break the main flow.
+
+## Triggers
+
+| Trigger    | Conditions                                                                                   | Implementation                                                    |
+| ---------- | -------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- |
+| **Manual** | User runs `/recap`                                                                           | `recapCommand.ts` calls the same underlying service               |
+| **Auto**   | Terminal blurred (DECSET 1004 focus protocol) for ≥ 5 min + focus returns + stream is `Idle` | `useAwaySummary.ts` — 5min blur timer + `useFocus` event listener |
+
+Both paths funnel into a single function — `generateSessionRecap()` — to
+guarantee identical behavior. The auto-trigger is gated by
+`general.showSessionRecap` (default: on); the manual command ignores
+that setting.
+
+## Architecture
+
+```
+┌────────────────────────────────────────────────────────────────────────┐
+│                          AppContainer.tsx                              │
+│   isFocused = useFocus()                                               │
+│   isIdle = streamingState === Idle                                     │
+│       │                                                                │
+│       ├─→ useAwaySummary({enabled, config, isFocused, isIdle, addItem})│
+│       │       │                                                        │
+│       │       └─→ 5 min blur timer + idle/dedupe gates                 │
+│       │              │                                                 │
+│       │              ↓                                                 │
+│       └─→ recapCommand (slash) ─→ generateSessionRecap(config, signal) │
+│                                          │                             │
+│                                          ↓                             │
+│                              ┌─────────────────────────┐               │
+│                              │ packages/core/services/ │               │
+│                              │   sessionRecap.ts       │               │
+│                              └─────────────────────────┘               │
+│                                          │                             │
+│                                          ↓                             │
+│                              GeminiClient.generateContent              │
+│                              (fastModel + tools:[])                    │
+│                                                                        │
+│   addItem({type: 'away_recap', text}) ─→ HistoryItemDisplay            │
+│                                            └─ AwayRecapMessage         │
+│                                               (dim color + ❯ prefix)   │
+└────────────────────────────────────────────────────────────────────────┘
+```
+
+### Files
+
+| File                                                         | Responsibility                                      |
+| ------------------------------------------------------------ | --------------------------------------------------- |
+| `packages/core/src/services/sessionRecap.ts`                 | One-shot LLM call + history filter + tag extraction |
+| `packages/cli/src/ui/hooks/useAwaySummary.ts`                | Auto-trigger React hook                             |
+| `packages/cli/src/ui/commands/recapCommand.ts`               | `/recap` manual entry point                         |
+| `packages/cli/src/ui/components/messages/StatusMessages.tsx` | `AwayRecapMessage` dim renderer                     |
+| `packages/cli/src/ui/types.ts`                               | `HistoryItemAwayRecap` type                         |
+| `packages/cli/src/ui/components/HistoryItemDisplay.tsx`      | Renderer dispatch                                   |
+| `packages/cli/src/config/settingsSchema.ts`                  | `general.showSessionRecap` setting                  |
+
+## Prompt Design
+
+### System Prompt
+
+`generationConfig.systemInstruction` replaces the main agent's system
+prompt for this single call, so the model behaves only as a recap
+generator and not as a coding assistant.
+
+Note that `GeminiClient.generateContent()` internally runs the prompt
+through `getCustomSystemPrompt()`, which appends the user's memory
+(QWEN.md / managed auto-memory) as a suffix. The final system prompt is
+therefore `recap prompt + user memory` — useful project context for the
+recap, not a leak.
+
+Bullets below correspond 1:1 with `RECAP_SYSTEM_PROMPT`:
+
+- 1 to 3 short sentences, plain prose (no markdown / lists / headings).
+- First sentence: the high-level task. Then: the concrete next step.
+- Explicitly forbid: listing what was done, reciting tool calls, status reports.
+- Match the dominant language of the conversation (English or Chinese).
+- Wrap output in `<recap>...</recap>`; nothing outside the tags.
+
+### Structured Output + Extraction
+
+The model is instructed to wrap its answer in `<recap>...</recap>`:
+
+```
+<recap>Refactoring loopDetectionService.ts to address long-session OOM. Next step is to implement option B.</recap>
+```
+
+Why: some models (GLM family, reasoning models) write a "thinking"
+paragraph before the final answer. Returning the raw text would leak
+that reasoning into the UI.
+
+`extractRecap()` has three fallback tiers:
+
+1. Both tags present: take what is between `<recap>...</recap>` (preferred).
+2. Only the open tag (e.g. `maxOutputTokens` truncated the close tag):
+   take everything after the open tag.
+3. Tag missing entirely: return empty string → service returns `null`
+   → UI renders nothing.
+
+The third tier is "skip rather than show the wrong thing" — surfacing
+the model's reasoning preamble is worse than showing no recap at all.
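
Illustration only, not part of this diff: the three fallback tiers described above could be sketched roughly as follows. The function name `extractRecapSketch` and the two tag constants are hypothetical stand-ins, and the actual `extractRecap()` in `sessionRecap.ts` may be written differently.

```typescript
// Hypothetical sketch of the three-tier extraction; not the shipped code.
const OPEN_TAG = '<recap>';
const CLOSE_TAG = '</recap>';

function extractRecapSketch(raw: string): string {
  const open = raw.indexOf(OPEN_TAG);
  // Tier 3: no open tag at all, so return empty and let the caller skip rendering.
  if (open === -1) return '';

  const start = open + OPEN_TAG.length;
  const close = raw.indexOf(CLOSE_TAG, start);
  // Tier 1: both tags present, take what is between them.
  // Tier 2: close tag missing (e.g. truncated output), take everything after the open tag.
  const body = close === -1 ? raw.slice(start) : raw.slice(start, close);
  return body.trim();
}
```

Anything the model writes before the open tag (such as a reasoning preamble) is discarded in tiers 1 and 2, which is the point of the tag.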
+
+### Call Parameters
+
+| Parameter           | Value                          | Reason                                                           |
+| ------------------- | ------------------------------ | ---------------------------------------------------------------- |
+| `model`             | `getFastModel() ?? getModel()` | Recap doesn't need a frontier model                              |
+| `tools`             | `[]`                           | One-shot query, no tool use                                      |
+| `maxOutputTokens`   | `300`                          | Enough for 1-3 sentences + tags; larger would encourage rambling |
+| `temperature`       | `0.3`                          | Mostly deterministic, with a bit of natural variation            |
+| `systemInstruction` | The recap-only prompt above    | Replaces the main agent's role definition                        |
+
+## History Filtering
+
+`geminiClient.getChat().getHistory()` returns a `Content[]` that
+includes:
+
+- `user` / `model` text messages
+- `model` `functionCall` parts
+- `user` `functionResponse` parts (which can hold full file contents)
+- `model` thought parts (`part.thought` / `part.thoughtSignature`,
+  the model's hidden reasoning)
+
+`filterToDialog()` keeps only `user` / `model` parts that have **non-empty
+text and are not thoughts**. Two reasons:
+
+- **Tool calls / responses**: a single `functionResponse` can be 10K+
+  tokens. 30 such messages would drown the recap LLM in irrelevant
+  detail, both wasting tokens and biasing the recap toward
+  implementation noise like "called X tool to read Y file".
+- **Thought parts**: carry the model's internal reasoning. Including
+  them risks treating hidden chain-of-thought as dialogue and
+  surfacing it in the recap text.
+
+After dropping empty messages, `takeRecentDialog` slices to the last 30
+messages and refuses to start the slice on a dangling model/tool
+response.
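
Illustration only, not part of this diff: a minimal sketch of the filtering and windowing described above. `PartLike` and `ContentLike` are simplified local stand-ins for the `@google/genai` types, the `*Sketch` function names are hypothetical, and "does not start on a dangling response" is interpreted here as "the window opens on a `user` message", which is an assumption about the real `takeRecentDialog`.

```typescript
// Simplified stand-ins for the Part / Content shapes (assumed, not imported).
interface PartLike {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
  functionCall?: unknown;
  functionResponse?: unknown;
}
interface ContentLike {
  role?: string;
  parts?: PartLike[];
}

// Keep only user/model messages, and within them only non-empty, non-thought text.
function filterToDialogSketch(history: ContentLike[]): ContentLike[] {
  const out: ContentLike[] = [];
  for (const msg of history) {
    if (msg.role !== 'user' && msg.role !== 'model') continue;
    const parts = (msg.parts ?? [])
      .filter(
        (p) =>
          typeof p.text === 'string' &&
          p.text.trim() !== '' &&
          !p.thought &&
          !p.thoughtSignature,
      )
      .map((p) => ({ text: p.text }));
    // Messages left with no parts (pure tool calls / responses) are dropped.
    if (parts.length > 0) out.push({ role: msg.role, parts });
  }
  return out;
}

// Take the last `max` messages, then advance to the first user message.
function takeRecentDialogSketch(dialog: ContentLike[], max = 30): ContentLike[] {
  let start = Math.max(0, dialog.length - max);
  while (start < dialog.length && dialog[start]?.role !== 'user') start++;
  return dialog.slice(start);
}
```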
+
+## Concurrency and Edge Cases
+
+### Auto-trigger hook state machine
+
+`useAwaySummary` keeps three refs:
+
+| Ref               | Meaning                                           |
+| ----------------- | ------------------------------------------------- |
+| `blurredAtRef`    | Blur start time (not cleared until focus returns) |
+| `recapPendingRef` | Whether an LLM call is in flight                  |
+| `inFlightRef`     | The current in-flight `AbortController`           |
+
+`useEffect` deps: `[enabled, config, isFocused, isIdle, addItem]`.
+
+| Event                                              | Action                                                                                                                                 |
+| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
+| `!enabled \|\| !config`                            | Abort in-flight call + clear `inFlightRef` + clear `blurredAtRef`                                                                      |
+| `!isFocused` and `blurredAtRef === null`           | Set `blurredAtRef = Date.now()`                                                                                                        |
+| `isFocused` and `blurredAtRef === null`            | Return early (no blur cycle to handle — first render or right after a brief-blur reset)                                                |
+| `isFocused` and blur duration < 5 min              | Clear `blurredAtRef`, wait for next blur cycle                                                                                         |
+| `isFocused` and blur ≥ 5 min and `recapPendingRef` | Return (dedupe)                                                                                                                        |
+| `isFocused` and blur ≥ 5 min and `!isIdle`         | **Preserve** `blurredAtRef` and wait for the turn to finish (`isIdle` is in the deps, so the effect re-fires when streaming completes) |
+| `isFocused` and all conditions met                 | Clear `blurredAtRef`, set `recapPendingRef = true`, create `AbortController`, send the LLM request                                     |
+
+The `.then` callback **re-checks** `isIdleRef.current`: if the user has
+started a new turn while the LLM was running, the late-arriving recap
+is dropped to avoid inserting it mid-turn.
+
+The `.finally` clears `recapPendingRef`, and clears `inFlightRef` only
+if `inFlightRef.current === controller` (so it doesn't overwrite a
+newer controller).
+
+A second `useEffect` aborts the in-flight controller on unmount.
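
Illustration only, not part of this diff: the event table above can be read as a pure decision function. The sketch below is a hypothetical restatement (the `HookState` shape, the `Action` labels, and `decideSketch` are invented for illustration); the real hook performs these steps inside a `useEffect` with refs. `AWAY_THRESHOLD_MS` is the name used in the commit message, and its value here follows the 5-minute threshold stated in the doc.

```typescript
const AWAY_THRESHOLD_MS = 5 * 60 * 1000;

type Action =
  | 'reset' // abort in-flight call, clear inFlightRef and blurredAtRef
  | 'mark-blur' // set blurredAtRef = now
  | 'noop' // nothing to do
  | 'clear-brief-blur' // clear blurredAtRef, wait for the next blur cycle
  | 'dedupe' // a recap call is already in flight
  | 'wait-for-idle' // preserve blurredAtRef until the turn finishes
  | 'fire'; // clear blurredAtRef, mark pending, send the request

interface HookState {
  enabled: boolean;
  hasConfig: boolean;
  isFocused: boolean;
  isIdle: boolean;
  blurredAt: number | null;
  recapPending: boolean;
}

function decideSketch(s: HookState, now: number): Action {
  if (!s.enabled || !s.hasConfig) return 'reset';
  if (!s.isFocused) return s.blurredAt === null ? 'mark-blur' : 'noop';
  if (s.blurredAt === null) return 'noop';
  if (now - s.blurredAt < AWAY_THRESHOLD_MS) return 'clear-brief-blur';
  if (s.recapPending) return 'dedupe';
  if (!s.isIdle) return 'wait-for-idle';
  return 'fire';
}
```

The ordering matters: checking `isIdle` last is what lets `blurredAt` survive a focus return that lands mid-turn, so the effect can fire once streaming completes.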
+
+### `/recap` gating
+
+`CommandContext.ui.isIdleRef` exposes the current stream state
+(mirroring the existing `btwAbortControllerRef` pattern). In
+interactive mode, `recapCommand` refuses when `!isIdleRef.current`
+**or** `pendingItem !== null`. `pendingItem` alone is insufficient
+because a normal model reply runs with `streamingState === Responding`
+and a null `pendingItem`.
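
Illustration only, not part of this diff: the gating condition above, reduced to a predicate. `RecapUiLike` and `canRunRecapSketch` are hypothetical names; the real check lives in `recapCommand` and reads from `CommandContext.ui`.

```typescript
// Minimal stand-in for the two fields the gate reads (assumed shape).
interface RecapUiLike {
  isIdleRef: { current: boolean };
  pendingItem: object | null;
}

function canRunRecapSketch(ui: RecapUiLike): boolean {
  // Both conditions are required: idle stream AND no pending item.
  return ui.isIdleRef.current && ui.pendingItem === null;
}

// A normal streaming reply: the stream is not idle, yet pendingItem is null.
// Checking pendingItem alone would wrongly allow /recap here.
const midStream: RecapUiLike = { isIdleRef: { current: false }, pendingItem: null };
console.log(canRunRecapSketch(midStream)); // false
```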
+
+## Configuration and Model Selection
+
+### User-facing knobs
+
+| Setting                    | Default | Notes                                                             |
+| -------------------------- | ------- | ----------------------------------------------------------------- |
+| `general.showSessionRecap` | `true`  | Auto-trigger only. Manual `/recap` ignores this.                  |
+| `fastModel`                | unset   | Recommended (e.g. `qwen3-coder-flash`) for fast and cheap recaps. |
+
+### Model fallback
+
+`config.getFastModel() ?? config.getModel()`:
+
+- User has a `fastModel` set and it is valid for the current auth type
+  → use `fastModel`.
+- Otherwise → fall back to the main session model (works, just costlier
+  and slower).
+
+## Observability
+
+`createDebugLogger('SESSION_RECAP')` emits:
+
+- caught exceptions from the recap path (`debugLogger.warn`).
+
+No failure in the recap path surfaces as a thrown error in the UI: recap
+is an auxiliary feature, so exceptions are caught and logged. Developers
+can grep for the `[SESSION_RECAP]` tag in the debug log file, which is
+written by default to `~/.qwen/debug/<sessionId>.txt` (`latest.txt`
+symlinks to the current session); disable it via `QWEN_DEBUG_LOG_FILE=0`.
+
+## Out of Scope
+
+| Item                                             | Why not                                                                                                                                  |
+| ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| Progress UI for `/recap` (spinner / pendingItem) | 3-5 second wait is tolerable; adds complexity.                                                                                           |
+| Automated tests                                  | Service is small (~150 lines), end-to-end tested manually first; unit tests can land in a separate PR.                                   |
+| Localized prompts                                | The system prompt is for the model; English is the most reliable substrate. The model selects the output language from the conversation. |
+| `QWEN_CODE_ENABLE_AWAY_SUMMARY` env var          | Claude Code uses it to keep the feature on when telemetry is disabled; Qwen Code's current telemetry model doesn't need this.            |
+| Auto-recap on `/resume` completion               | A natural follow-up but needs a hook point in `useResumeCommand`; out of scope for this PR.                                              |
--- a/docs/users/configuration/settings.md
+++ b/docs/users/configuration/settings.md
@ -82,6 +82,7 @@ Settings are organized into categories. All settings should be placed within the
 | `general.preferredEditor`       | string  | The preferred editor to open files in.                                                                                                                                          | `undefined` |
 | `general.vimMode`               | boolean | Enable Vim keybindings.                                                                                                                                                         | `false`     |
 | `general.enableAutoUpdate`      | boolean | Enable automatic update checks and installations on startup.                                                                                                                    | `true`      |
+| `general.showSessionRecap`      | boolean | Show a 1-3 sentence summary of where you left off when returning to the terminal after being away for 5+ minutes. Use `/recap` to trigger manually.                             | `true`      |
 | `general.gitCoAuthor`           | boolean | Automatically add a Co-authored-by trailer to git commit messages when commits are made through Qwen Code.                                                                      | `true`      |
 | `general.checkpointing.enabled` | boolean | Enable session checkpointing for recovery.                                                                                                                                      | `false`     |
 | `general.defaultFileEncoding`   | string  | Default encoding for new files. Use `"utf-8"` (default) for UTF-8 without BOM, or `"utf-8-bom"` for UTF-8 with BOM. Only change this if your project specifically requires BOM. | `"utf-8"`   |
--- a/docs/users/features/commands.md
+++ b/docs/users/features/commands.md
@ -24,6 +24,7 @@ These commands help you save, restore, and summarize work progress.
 | `/summary`  | Generate project summary based on conversation history    | `/summary`                           |
 | `/compress` | Replace chat history with summary to save Tokens          | `/compress`                          |
 | `/resume`   | Resume a previous conversation session                    | `/resume`                            |
+| `/recap`    | Show a 1-3 sentence "where you left off" summary          | `/recap`                             |
 | `/restore`  | Restore files to state before tool execution              | `/restore` (list) or `/restore <ID>` |

 ### 1.2 Interface and Workspace Control
@ -156,7 +157,58 @@ The `/btw` command allows you to ask quick side questions without interrupting o
 >
 > Use `/btw` when you need a quick answer without derailing your main task. It's especially useful for clarifying concepts, checking facts, or getting quick explanations while staying focused on your primary workflow.

-### 1.7 Information, Settings, and Help
+### 1.7 Session Recap (`/recap`)
+
+The `/recap` command generates a short "where you left off" summary of the
+current session, so you can resume an old conversation without scrolling
+back through pages of history.
+
+| Command  | Description                                      |
+| -------- | ------------------------------------------------ |
+| `/recap` | Generate and show a 1-3 sentence session summary |
+
+**How it works:**
+
+- Uses the configured fast model (`fastModel` setting) when available, falling
+  back to the main session model. A small, cheap model is enough for a recap.
+- The recent conversation (up to 30 messages, text only — tool calls and tool
+  responses are filtered out) is sent to the model with a tight system prompt.
+- The recap is rendered in dim color with a `❯` prefix so it stands apart
+  from real assistant replies.
+- Refuses with an inline error if a model turn is in flight or another command
+  is processing. If there is no usable conversation, or the underlying
+  generation fails, `/recap` shows a short info message instead of a recap —
+  the manual command always responds with something.
+
+**Auto-trigger when returning from being away:**
+
+If the terminal is blurred for **5+ minutes** and gets focused again, a recap
+is generated and shown automatically (only when no model response is in
+progress; otherwise it waits for the current turn to finish and then fires).
+Unlike the manual command, the auto-trigger is fully silent on failure: if
+generation errors or there is nothing to summarize, no message is added to
+the history. Controlled by the `general.showSessionRecap` setting
+(default: `true`); the manual `/recap` command always works regardless of
+this setting.
+
+**Example:**
+
+```
+> /recap
+
+❯ Refactoring loopDetectionService.ts to address long-session OOM caused by
+  unbounded streamContentHistory and contentStats. The next step is to
+  implement option B (LRU sliding window with FNV-1a) pending confirmation.
+```
+
+> [!tip]
+>
+> Configure a fast model via `/model --fast <model>` (e.g.
+> `qwen3-coder-flash`) to make `/recap` fast and cheap. Set
+> `general.showSessionRecap` to `false` to opt out of the auto-trigger
+> while keeping the manual command available.
+
+### 1.8 Information, Settings, and Help

 Commands for obtaining information and performing system settings.

@ -171,7 +223,7 @@ Commands for obtaining information and performing system settings.
 | `/copy`     | Copy last output content to clipboard           | `/copy`                          |
 | `/quit`     | Exit Qwen Code immediately                      | `/quit` or `/exit`               |

-### 1.8 Common Shortcuts
+### 1.9 Common Shortcuts

 | Shortcut           | Function                | Note                   |
 | ------------------ | ----------------------- | ---------------------- |
@ -181,7 +233,7 @@ Commands for obtaining information and performing system settings.
 | `Ctrl/cmd+Z`       | Undo input              | Text editing           |
 | `Ctrl/cmd+Shift+Z` | Redo input              | Text editing           |

-### 1.9 CLI Auth Subcommands
+### 1.10 CLI Auth Subcommands

 In addition to the in-session `/auth` slash command, Qwen Code provides standalone CLI subcommands for managing authentication directly from the terminal: