* feat(cli): add session recap with /recap and auto-show on return
Users often open an old session days later and need to scroll through
pages to remember where they left off. This change adds a short
"where did I leave off" recap — a 1-3 sentence summary generated by
the fast model — so they can resume without re-reading the history.
Two triggers:
- /recap: manual slash command.
- Auto: when the terminal has been blurred for 5+ minutes and gets
focused again (uses the existing DECSET 1004 focus protocol via
useFocus). Gated on streamingState === Idle so it never interrupts
an active turn. Only fires once per blur cycle.
The recap is rendered in dim color with a chevron prefix, visually
distinct from assistant replies. A new `general.showSessionRecap`
setting controls the auto-trigger (default on). /recap works
independently of the setting.
Implementation notes:
- generateSessionRecap uses fastModel (falls back to main model),
tools: [], maxOutputTokens: 300, and a tight system prompt. It
strips tool calls / responses from history before sending — tool
responses can hold 10K+ tokens of file content that drown the recap
in irrelevant detail. The 30-message window respects turn boundaries
(slice never starts on a dangling model/tool response).
- Output is wrapped in <recap>...</recap> tags; the extractor returns
empty (skips render) if the tag is missing, preventing model
reasoning from leaking into the UI.
- All failures are silent (return null) and logged via a scoped
debugLogger; recap is best-effort and must never break main flow.
- /recap refuses to run while a turn is pending.
* fix(cli): abort in-flight recap when showSessionRecap is disabled
If the user disables showSessionRecap while an auto-recap LLM call is
already in flight, the previous code returned early without aborting.
The pending .then would still pass its idle/abort guards and append the
recap, producing an unwanted message after the user has opted out.
Abort the controller and clear it eagerly so the resolved promise no
longer adds to history.
* fix(cli): gate /recap and auto-recap on streaming idle state
Two related issues from review:
1. /recap was only refusing when ui.pendingItem was set, but a normal
model reply runs with streamingState === Responding and a null
pendingItem. Invoking /recap mid-stream would generate a recap from
a partial conversation and insert it between the user prompt and
the assistant reply.
2. useAwaySummary cleared blurredAtRef before checking isIdle, so if
focus returned during a still-streaming turn (after a >5min blur)
the recap was permanently dropped — there was no later retry when
the turn became idle, because isIdle was not in the effect deps.
Fixes:
- Expose isIdleRef on CommandContext.ui (mirrors btwAbortControllerRef
pattern). Plumb it from AppContainer through useSlashCommandProcessor.
- recapCommand now refuses when isIdleRef.current is false OR
pendingItem is non-null.
- useAwaySummary preserves blurredAtRef on the !isIdle bail and adds
isIdle to the effect deps, so the trigger re-evaluates when the
current turn finishes.
- Brief blurs (< AWAY_THRESHOLD_MS) still reset blurredAtRef.
Also seeds isIdleRef in nonInteractiveUi and mockCommandContext so the
new field has a sensible default outside the interactive UI.
* docs: document /recap command, showSessionRecap setting, and design
- User docs: add /recap to the Session and Project Management table in
features/commands.md and a dedicated subsection covering manual use,
the auto-trigger, the dim-color rendering, and the fast-model tip.
- User docs: add general.showSessionRecap row to the configuration
settings reference.
- Design doc: docs/design/session-recap/session-recap-design.md covers
motivation, the two trigger paths, the per-file architecture, prompt
design with the <recap> tag and three-tier extractor, history
filtering rationale (functionResponse can be 10K+ tokens), the
useAwaySummary state machine, the isIdleRef gating for /recap, model
selection, observability, and out-of-scope items.
* fix(core): exclude thought parts from session recap context
filterToDialog kept any non-empty text part, but @google/genai's Part
type also marks model reasoning with part.thought / part.thoughtSignature.
That hidden chain-of-thought was being fed to the recap LLM and could
get summarized as if it were user-visible dialogue.
Drop parts where either flag is set. Update the design doc's
History Filtering section to call this out alongside the existing
tool-call/response rationale.
* docs(session-recap): correct debug-logging guidance, fill in state machine, sharpen UX wording
Audit of the session recap docs against the implementation found three
issues worth fixing:
- Design doc claimed debug logs were enabled via a QWEN_CODE_DEBUG_LOGGING
env var. That var does not exist; debug logs are written to
~/.qwen/debug/<sessionId>.txt by default, gated by QWEN_DEBUG_LOG_FILE.
Replace with the accurate path + opt-out behavior, and tell the reader
to grep for the [SESSION_RECAP] tag.
- Design doc's useAwaySummary state machine table was missing the
isFocused && blurredAtRef === null path (taken on first render and
right after a brief-blur reset). Add the row.
- User doc's "Refuses to run ... failures are silent" line conflated the
inline-error refusal with silent generation failures, and "(when the
conversation is idle)" used internal jargon. Split the two cases and
spell out what "idle" means, including the wait-then-fire behavior
when focus returns mid-turn.
* docs(session-recap): correctly describe /recap vs auto-trigger failure modes
The previous wording said "Generation/network failures are silent — the
recap simply does not appear", but recapCommand returns a user-facing
info message ("Not enough conversation context for a recap yet.") in
exactly that path, and also returns inline messages for the
config-not-loaded and busy-turn guards.
Only the auto-trigger path is truly silent (it just skips addItem when
generateSessionRecap returns null). Split the two paths in the doc so
the manual command's "always responds with something" behavior is
distinguished from the auto-trigger's no-op-on-failure behavior.
* docs(session-recap): align prompt-rules section with the actual prompt
Two doc-vs-code mismatches in the design doc's "System Prompt" section,
caught with the same lens as yiliang114's failure-mode review:
- The bullet list claimed RECAP_SYSTEM_PROMPT forbids "speculating about
the user's intent" and "addressing the user as 'you'". Those rules
existed in an early draft but were dropped when the <recap> tag rules
were added; the current prompt has no such restrictions. Replace with
the actual rules and add a "corresponds 1:1 with RECAP_SYSTEM_PROMPT"
marker so future edits stay in sync.
- The doc said systemInstruction "overrides" the main agent prompt. True
for the agent prompt portion, but GeminiClient.generateContent
internally calls getCustomSystemPrompt which appends user memory
(QWEN.md / auto memory) as a suffix. Spell that out: the final
system prompt is recap prompt + user memory, which is actually
useful project context for the recap.
* docs(session-recap): translate design doc to English
The repo convention for docs/design is English (7 of 8 existing files;
auto-memory/memory-system.md is the only Chinese one). The first version
of this design doc followed the auto-memory example, which turned out
to be the wrong sample.
Translate to English while preserving the existing structure, the
state-machine table, the prompt-vs-doc 1:1 alignment, the
QWEN_DEBUG_LOG_FILE description, and the failure-mode notes added in
prior commits.
* fix(cli): drop empty info return from /recap interactive success path
The interactive success path inserted the away_recap history item
directly via ui.addItem and then returned `{type: 'message',
messageType: 'info', content: ''}`. The slash-command processor's
'message' case unconditionally calls addMessage, which adds another
HistoryItemInfo with empty text. The empty info renders as nothing
(StatusMessage early-returns null), but it still bloats the in-memory
history list and shows up in /export and saved sessions.
Return void on the interactive success path and on the abort path so
the processor's `if (result)` check skips the message-handler branch
entirely. Widen the action's return type to `void | SlashCommandActionReturn`
to match (same shape as btwCommand).
Session Recap Design
A 1-3 sentence "where did I leave off" summary surfaced when the user returns to an idle session, either on demand (`/recap`) or after the terminal has been blurred for 5+ minutes.
Overview
When a user /resumes an old session days later, scrolling back through
pages of history to remember what they were doing and what came next
is a real friction point. Just reloading messages does not solve this
UX problem.
The goal is to proactively surface a 1-3 sentence recap when the user returns:
- High-level task (what they are doing) → next step (what to do next).
- Visually distinct from real assistant replies, so it is never mistaken for new model output.
- Best-effort: failures must be silent and never break the main flow.
Triggers
| Trigger | Conditions | Implementation |
|---|---|---|
| Manual | User runs `/recap` | `recapCommand.ts` calls the same underlying service |
| Auto | Terminal blurred (DECSET 1004 focus protocol) for ≥ 5 min + focus returns + stream is `Idle` | `useAwaySummary.ts`: 5 min blur timer + `useFocus` event listener |
Both paths funnel into a single function — generateSessionRecap() — to
guarantee identical behavior. The auto-trigger is gated by
general.showSessionRecap (default: on); the manual command ignores
that setting.
Architecture
┌────────────────────────────────────────────────────────────────────────┐
│ AppContainer.tsx │
│ isFocused = useFocus() │
│ isIdle = streamingState === Idle │
│ │ │
│ ├─→ useAwaySummary({enabled, config, isFocused, isIdle, addItem})│
│ │ │ │
│ │ └─→ 5 min blur timer + idle/dedupe gates │
│ │ │ │
│ │ ↓ │
│ └─→ recapCommand (slash) ─→ generateSessionRecap(config, signal) │
│ │ │
│ ↓ │
│ ┌─────────────────────────┐ │
│ │ packages/core/services/ │ │
│ │ sessionRecap.ts │ │
│ └─────────────────────────┘ │
│ │ │
│ ↓ │
│ GeminiClient.generateContent │
│ (fastModel + tools:[]) │
│ │
│ addItem({type: 'away_recap', text}) ─→ HistoryItemDisplay │
│ └─ AwayRecapMessage │
│ (dim color + ❯ prefix) │
└────────────────────────────────────────────────────────────────────────┘
Files
| File | Responsibility |
|---|---|
| `packages/core/src/services/sessionRecap.ts` | One-shot LLM call + history filter + tag extraction |
| `packages/cli/src/ui/hooks/useAwaySummary.ts` | Auto-trigger React hook |
| `packages/cli/src/ui/commands/recapCommand.ts` | `/recap` manual entry point |
| `packages/cli/src/ui/components/messages/StatusMessages.tsx` | `AwayRecapMessage` dim renderer |
| `packages/cli/src/ui/types.ts` | `HistoryItemAwayRecap` type |
| `packages/cli/src/ui/components/HistoryItemDisplay.tsx` | Renderer dispatch |
| `packages/cli/src/config/settingsSchema.ts` | `general.showSessionRecap` setting |
Prompt Design
System Prompt
generationConfig.systemInstruction replaces the main agent's system
prompt for this single call, so the model behaves only as a recap
generator and not as a coding assistant.
Note that GeminiClient.generateContent() internally runs the prompt
through getCustomSystemPrompt(), which appends the user's memory
(QWEN.md / managed auto-memory) as a suffix. The final system prompt is
therefore recap prompt + user memory — useful project context for the
recap, not a leak.
Bullets below correspond 1:1 with RECAP_SYSTEM_PROMPT:
- 1 to 3 short sentences, plain prose (no markdown / lists / headings).
- First sentence: the high-level task. Then: the concrete next step.
- Explicitly forbid: listing what was done, reciting tool calls, status reports.
- Match the dominant language of the conversation (English or Chinese).
- Wrap output in `<recap>...</recap>`; nothing outside the tags.
Structured Output + Extraction
The model is instructed to wrap its answer in <recap>...</recap>:
<recap>Refactoring loopDetectionService.ts to address long-session OOM. Next step is to implement option B.</recap>
Why: some models (GLM family, reasoning models) write a "thinking" paragraph before the final answer. Returning the raw text would leak that reasoning into the UI.
extractRecap() has three fallback tiers:
- Both tags present: take what is between `<recap>...</recap>` (preferred).
- Only the open tag (e.g. `maxOutputTokens` truncated the close tag): take everything after the open tag.
- Tag missing entirely: return empty string → service returns `null` → UI renders nothing.
The third tier is "skip rather than show the wrong thing" — surfacing the model's reasoning preamble is worse than showing no recap at all.
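The three tiers can be sketched as a single small function. This is a minimal illustration, not the code in `sessionRecap.ts`: the name `extractRecapSketch` and the trimming behavior are assumptions made for the example.

```typescript
// Hypothetical sketch of a three-tier extractor; the real extractRecap()
// may differ in details such as whitespace handling.
const OPEN_TAG = '<recap>';
const CLOSE_TAG = '</recap>';

function extractRecapSketch(raw: string): string {
  const open = raw.indexOf(OPEN_TAG);
  // Tier 3: no open tag at all, so return empty and let the caller skip rendering.
  if (open === -1) return '';

  const bodyStart = open + OPEN_TAG.length;
  const close = raw.indexOf(CLOSE_TAG, bodyStart);

  // Tier 1: both tags present, take the text between them.
  // Tier 2: close tag missing (e.g. truncated output), take everything after the open tag.
  const body = close === -1 ? raw.slice(bodyStart) : raw.slice(bodyStart, close);
  return body.trim();
}
```

Any reasoning preamble the model writes before the open tag is discarded in tiers 1 and 2, and a response with no tag at all yields an empty string rather than raw model text.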
Call Parameters
| Parameter | Value | Reason |
|---|---|---|
| `model` | `getFastModel() ?? getModel()` | Recap doesn't need a frontier model |
| `tools` | `[]` | One-shot query, no tool use |
| `maxOutputTokens` | `300` | Enough for 1-3 sentences + tags; larger would encourage rambling |
| `temperature` | `0.3` | Mostly deterministic, with a bit of natural variation |
| `systemInstruction` | The recap-only prompt above | Replaces the main agent's role definition |
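As a rough illustration, the parameters in the table could be assembled as follows. The `RecapCallParams` shape and the `buildRecapCallParams` helper are hypothetical; this document does not show the real `GeminiClient.generateContent()` signature, so treat the object layout as an assumption.

```typescript
// Hypothetical parameter bundle mirroring the table above.
interface RecapCallParams {
  model: string;
  tools: never[];
  maxOutputTokens: number;
  temperature: number;
  systemInstruction: string;
}

function buildRecapCallParams(
  fastModel: string | undefined, // stand-in for config.getFastModel()
  mainModel: string,             // stand-in for config.getModel()
  recapSystemPrompt: string,     // stand-in for RECAP_SYSTEM_PROMPT
): RecapCallParams {
  return {
    model: fastModel ?? mainModel, // prefer the fast model when one is set
    tools: [],                     // one-shot query, no tool use
    maxOutputTokens: 300,          // room for 1-3 sentences plus the tags
    temperature: 0.3,              // mostly deterministic
    systemInstruction: recapSystemPrompt,
  };
}
```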
History Filtering
geminiClient.getChat().getHistory() returns a Content[] that
includes:
- `user` / `model` text messages
- `model` `functionCall` parts
- `user` `functionResponse` parts (which can hold full file contents)
- `model` thought parts (`part.thought` / `part.thoughtSignature`, the model's hidden reasoning)
filterToDialog() keeps only user / model parts that have non-empty
text and are not thoughts. Two reasons:
- Tool calls / responses: a single `functionResponse` can be 10K+ tokens. 30 such messages would drown the recap LLM in irrelevant detail, both wasting tokens and biasing the recap toward implementation noise like "called X tool to read Y file".
- Thought parts: carry the model's internal reasoning. Including them risks treating hidden chain-of-thought as dialogue and surfacing it in the recap text.
After dropping empty messages, takeRecentDialog slices to the last 30
messages and refuses to start the slice on a dangling model/tool
response.
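The filter and the window can be sketched together. The types below are simplified stand-ins for the `@google/genai` `Content` / `Part` types, and the function names carry a `Sketch` suffix to mark them as illustrative. How the real `takeRecentDialog` handles a dangling leading response is not spelled out here; advancing to the first `user` message is an assumption.

```typescript
// Simplified stand-ins for the @google/genai Content / Part types.
interface PartSketch {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
  functionCall?: unknown;
  functionResponse?: unknown;
}
interface ContentSketch {
  role: 'user' | 'model';
  parts: PartSketch[];
}

// Keep only visible dialogue: non-empty text parts that are not thoughts.
function filterToDialogSketch(history: ContentSketch[]): ContentSketch[] {
  return history
    .map((msg) => ({
      role: msg.role,
      parts: msg.parts.filter(
        (p) =>
          typeof p.text === 'string' &&
          p.text.trim() !== '' &&
          !p.thought &&
          !p.thoughtSignature,
      ),
    }))
    .filter((msg) => msg.parts.length > 0); // drop messages left with no parts
}

// Take the last `max` messages, then skip ahead to the first user message so
// the window never opens on a dangling model response (assumed strategy).
function takeRecentDialogSketch(dialog: ContentSketch[], max = 30): ContentSketch[] {
  const recent = dialog.slice(-max);
  const firstUser = recent.findIndex((msg) => msg.role === 'user');
  return firstUser === -1 ? [] : recent.slice(firstUser);
}
```

Filtering before slicing matters: the 30-message budget is then spent on dialogue rather than on tool traffic that would be dropped anyway.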
Concurrency and Edge Cases
Auto-trigger hook state machine
useAwaySummary keeps three refs:
| Ref | Meaning |
|---|---|
| `blurredAtRef` | Blur start time (not cleared until focus returns) |
| `recapPendingRef` | Whether an LLM call is in flight |
| `inFlightRef` | The current in-flight `AbortController` |
useEffect deps: [enabled, config, isFocused, isIdle, addItem].
| Event | Action |
|---|---|
| `!enabled \|\| !config` | Abort in-flight call + clear `inFlightRef` + clear `blurredAtRef` |
| `!isFocused` and `blurredAtRef === null` | Set `blurredAtRef = Date.now()` |
| `isFocused` and `blurredAtRef === null` | Return early (no blur cycle to handle: first render or right after a brief-blur reset) |
| `isFocused` and blur duration < 5 min | Clear `blurredAtRef`, wait for next blur cycle |
| `isFocused` and blur ≥ 5 min and `recapPendingRef` | Return (dedupe) |
| `isFocused` and blur ≥ 5 min and `!isIdle` | Preserve `blurredAtRef` and wait for the turn to finish (`isIdle` is in the deps, so the effect re-fires when streaming completes) |
| `isFocused` and all conditions met | Clear `blurredAtRef`, set `recapPendingRef = true`, create `AbortController`, send the LLM request |
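The table above can be read as a pure decision function evaluated on each effect run. The sketch below is illustrative only: `decideAwayAction`, the `AwayState` shape, and the action names are invented for the example, and the `keep-waiting` case (still blurred with a start time already recorded) is an assumption, since the table does not list it.

```typescript
const AWAY_THRESHOLD_MS = 5 * 60 * 1000; // 5 minutes

type AwayAction =
  | 'abort-and-reset'
  | 'mark-blur-start'
  | 'keep-waiting'
  | 'no-blur-cycle'
  | 'reset-brief-blur'
  | 'dedupe'
  | 'wait-for-idle'
  | 'fire-recap';

interface AwayState {
  enabled: boolean;
  hasConfig: boolean;
  isFocused: boolean;
  isIdle: boolean;
  blurredAt: number | null; // value of blurredAtRef.current
  recapPending: boolean;    // value of recapPendingRef.current
  now: number;              // Date.now() at evaluation time
}

function decideAwayAction(s: AwayState): AwayAction {
  if (!s.enabled || !s.hasConfig) return 'abort-and-reset';
  if (!s.isFocused) {
    // Record the blur start once; later runs while blurred change nothing.
    return s.blurredAt === null ? 'mark-blur-start' : 'keep-waiting';
  }
  if (s.blurredAt === null) return 'no-blur-cycle';
  if (s.now - s.blurredAt < AWAY_THRESHOLD_MS) return 'reset-brief-blur';
  if (s.recapPending) return 'dedupe';
  if (!s.isIdle) return 'wait-for-idle'; // blurredAt is preserved for the retry
  return 'fire-recap';
}
```

Keeping `blurredAt` intact in the `wait-for-idle` case is what lets the effect fire the recap later, once `isIdle` flips and the effect re-runs.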
The .then callback re-checks isIdleRef.current: if the user has
started a new turn while the LLM was running, the late-arriving recap
is dropped to avoid inserting it mid-turn.
The .finally clears recapPendingRef, and clears inFlightRef only
if inFlightRef.current === controller (so it doesn't overwrite a
newer controller).
A second useEffect aborts the in-flight controller on unmount.
/recap gating
CommandContext.ui.isIdleRef exposes the current stream state
(mirroring the existing btwAbortControllerRef pattern). In
interactive mode, recapCommand refuses when !isIdleRef.current
or pendingItem !== null. pendingItem alone is insufficient
because a normal model reply runs with streamingState === Responding
and a null pendingItem.
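A minimal sketch of the gate, assuming a reduced view of `CommandContext.ui`; the `RecapUiSketch` interface and `canRunRecap` helper are hypothetical names used only for illustration.

```typescript
// Reduced, hypothetical view of the fields the gate reads.
interface RecapUiSketch {
  isIdleRef: { current: boolean };
  pendingItem: object | null;
}

// /recap may run only when the stream is idle AND nothing is pending.
function canRunRecap(ui: RecapUiSketch): boolean {
  return ui.isIdleRef.current && ui.pendingItem === null;
}
```

The middle case in practice is the one the original check missed: a streaming reply with `pendingItem === null` now fails the gate because `isIdleRef.current` is `false`.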
Configuration and Model Selection
User-facing knobs
| Setting | Default | Notes |
|---|---|---|
| `general.showSessionRecap` | `true` | Auto-trigger only. Manual `/recap` ignores this. |
| `fastModel` | unset | Recommended (e.g. `qwen3-coder-flash`) for fast and cheap recaps. |
Model fallback
config.getFastModel() ?? config.getModel():
- User has a `fastModel` set and it is valid for the current auth type → use `fastModel`.
- Otherwise → fall back to the main session model (works, just costlier and slower).
Observability
createDebugLogger('SESSION_RECAP') emits:
- caught exceptions from the recap path (`debugLogger.warn`).
No failure is surfaced to the user as an error: recap is an
auxiliary feature and never throws into the UI. Developers can grep for
the [SESSION_RECAP] tag in the debug log file, which is written by
default to ~/.qwen/debug/<sessionId>.txt (latest.txt symlinks to the
current session); disable it via QWEN_DEBUG_LOG_FILE=0.
Out of Scope
| Item | Why not |
|---|---|
| Progress UI for `/recap` (spinner / `pendingItem`) | 3-5 second wait is tolerable; adds complexity. |
| Automated tests | Service is small (~150 lines), end-to-end tested manually first; unit tests can land in a separate PR. |
| Localized prompts | The system prompt is for the model; English is the most reliable substrate. The model selects the output language from the conversation. |
| `QWEN_CODE_ENABLE_AWAY_SUMMARY` env var | Claude Code uses it to keep the feature on when telemetry is disabled; Qwen Code's current telemetry model doesn't need this. |
| Auto-recap on `/resume` completion | A natural follow-up but needs a hook point in `useResumeCommand`; out of scope for this PR. |