qwen-code/docs/design/session-title/session-title-design.md
Shaojin Wen d36f12c4c4
feat(session): auto-title sessions via fast model, add /rename --auto (#3540)
* feat(session): auto-title sessions via fast model, add /rename --auto

The /rename work in #3093 generates kebab-case titles only when the user
explicitly runs `/rename` with no args; until they do, the session picker
shows the first user prompt (often truncated or misleading). This change
adds a sentence-case auto-title that fires once per session after the
first assistant turn, using the configured fast model.

New service: `packages/core/src/services/sessionTitle.ts` —
`tryGenerateSessionTitle(config, signal)` returns a discriminated outcome
(`{ok: true, title, modelUsed}` | `{ok: false, reason}`) so callers can
either handle failures generically or map reasons to actionable messages.
Prompt shape: 3-7 words, sentence case, good/bad examples including a
CJK row, JSON schema enforced via `baseLlmClient.generateJson`.
`maxAttempts: 1` — titles are cosmetic metadata and shouldn't fight
rate limits.

Trigger point: `ChatRecordingService.maybeTriggerAutoTitle` runs after
`recordAssistantTurn`. Fire-and-forget promise, guarded by:

- `currentCustomTitle` — don't overwrite any existing title.
- `autoTitleController` doubles as in-flight flag; a second turn while
  the first is still pending is a no-op.
- `autoTitleAttempts` cap of 3 — the first assistant turn may be a
  pure tool-call with no user-visible text; retry for a handful of
  turns until a title lands. Cap bounds total waste.
- `!config.isInteractive()` — headless CLI (`qwen -p`, CI) never auto-
  titles; spending fast-model tokens on a one-shot session is waste.
- `autoTitleDisabledByEnv()` — `QWEN_DISABLE_AUTO_TITLE=1` opt-out.
- `config.getFastModel()` falsy — skip entirely rather than falling
  back to the main model; auto-titling on main-model tokens is too
  expensive to be silent.

Persistence: `CustomTitleRecordPayload` grows a `titleSource: 'auto' |
'manual'` field. Absent on pre-change records (treated as `undefined`
→ manual, safe default so a user's pre-upgrade `/rename` is never
silently reclassified). `SessionPicker` renders `titleSource === 'auto'`
titles in dim (secondary) color; manual stays full contrast. On resume,
the persisted source is rehydrated into `currentTitleSource` — without
this, finalize's re-append would rewrite an auto title as manual on
every resume cycle.

Cross-process manual-rename guard: when two CLI tabs target the same
JSONL, in-memory state can diverge. Before writing an auto record, the
IIFE re-reads the file via `sessionService.getSessionTitleInfo`. If a
`/rename` from another process landed as manual, bail and sync local
state — never clobber a deliberately-chosen manual title with a model
guess. Cost is one 64KB tail read per successful generation.

`finalize()` aborts the in-flight controller before re-appending the
title record. Session switch / shutdown doesn't have to wait on a slow
fast-model call.

New user-facing command: `/rename --auto` regenerates via the same
generator — explicit user trigger, overwrites whatever's there (manual
or auto) because the user asked. Errors route through
`autoFailureMessage(reason)` so `empty_history`, `model_error`,
`aborted`, etc. each get actionable guidance rather than a generic
"could not generate". `/rename -- --literal-name` is the sentinel for
titles that start with `--`; unknown `--flag` tokens error with a hint
pointing at the sentinel. Existing `/rename <name>` and bare `/rename`
(kebab-case via existing path) are unchanged, except the kebab path now
prefers fast model when available and runs its output through
`stripTerminalControlSequences` (same ANSI/OSC-8 hardening as the
sentence-case path).

New shared util: `packages/core/src/utils/terminalSafe.ts` —
`stripTerminalControlSequences(s)` strips OSC (\x1b]...\x07|\x1b\\), CSI
(\x1b[...[a-zA-Z]), SS2/SS3 leaders, and C0/C1/DEL as a backstop. A
model-returned `\x1b[2J` or OSC-8 hyperlink escape would otherwise
execute on every SessionPicker render; both sentence-case and kebab
paths now route titles through the helper before they reach the JSONL
or the UI.

Tail-read extractor: `extractLastJsonStringFields(text, primaryKey,
otherKeys, lineContains)` reads multiple fields from the same matching
line in a single pass. Two separate tail scans could return a mismatched
pair (primary from a newer record, secondary from an older one with only
the primary set); the new helper guarantees the pair is atomic. Validates
a proper closing quote on the primary value so a crash-truncated trailing
record can't win the latest-match race. `readLastJsonStringFieldsSync`
is its file-reading wrapper — same tail-window fast path and full-file
fallback as the single-field version, plus a `MAX_FULL_SCAN_BYTES = 64MB`
cap so a corrupt multi-GB session file can't freeze the picker. Session
reads now open with `O_NOFOLLOW` (falls back to plain RDONLY on Windows
where the constant isn't exposed) — defense in depth against a symlink
planted in `~/.qwen/projects/<proj>/chats/`.

Character handling: `flattenToTail` on the LLM prompt drops a dangling
low surrogate after `slice(-1000)` — otherwise a CJK supplementary char
or emoji cut mid-pair produces invalid UTF-16 that some providers 400.
`sanitizeTitle` applies the same surrogate scrub after max-length trim,
and strips paired CJK brackets (`「」 『』 【】 〈〉 《》`) as whole units so
a `【Draft】 Fix login` doesn't leave a dangling `】` after leading-char
strip. `lineContains` in the title reader is tightened from the loose
substring `'custom_title'` to `'"subtype":"custom_title"'` so user text
containing the literal `custom_title` can't shadow a real record.

Tests: 46 new unit tests across
- `sessionTitle.test.ts` (22): success/all-failure-reasons, tool-call
  filter, tail-slice, surrogate scrub, ANSI/OSC-8 strip, CJK brackets.
- `chatRecordingService.autoTitle.test.ts` (15): trigger/skip matrix,
  in-flight guard, abort propagation on finalize, manual/auto/legacy
  resume symmetry, cross-process race, env opt-out, retry-after-
  transient.
- `sessionStorageUtils.test.ts` (13): single-pass extractor, straddle
  boundary, truncated trailing record, lineContains, multi-field atom.
- `renameCommand.test.ts` (8): `--auto` success, all reasons, sentinel,
  unknown-flag hint, positional rejection, manual/SessionService
  fallbacks.

* docs(session): design doc for auto session titles

Matches the session-recap design doc shape (Overview / Triggers /
Architecture / Prompt Design / History Filtering / Persistence /
Concurrency / Configuration / Observability / Out of Scope) and adds a
Security Hardening section unique to the title path — titles render
directly in the picker and persist in user-readable JSONL, so
LLM-returned control sequences are an attack surface the recap path
doesn't have.

Captures decisions a code-only reader has to reverse-engineer:

- Why `maxAttempts: 1` (best-effort cosmetic metadata; no retry loop).
- Why `autoTitleAttempts` cap is 3 (first turn can be pure tool-call).
- Why the auto trigger does NOT fall back to the main model but
  session-recap does (auto-title fires on every turn; silently charging
  main-model tokens is a bill surprise).
- Why `titleSource: undefined` stays unwritten on legacy records (no
  rewrite risks silently reclassifying user intent).
- Why the cross-process re-read sits between the LLM await and the
  append (manual wins at both in-process and on-disk layers).
- Why `finalize()`'s abort tolerates a controller swap (in-flight
  identity check).
- Why JSON-schema function calling instead of tag extraction (avoid
  reasoning preamble bleed; cross-provider reliability).

Placed at docs/design/session-title/ alongside session-recap,
compact-mode, fork-subagent, and other per-feature design docs. No
sidebar index update required — the design folder is unindexed.

* test(rename): pin model choice in bare /rename kebab path

Addresses reviewer feedback: the bare `/rename` model selection
(`config.getFastModel() ?? config.getModel()`) had no test pinning
it either way. Previous tests mocked `getHistory: []`, which exits
the function before the model is ever chosen, so a silent regression
to either direction (always-main or always-fast) would pass CI.

Two explicit cases now:
- fastModel set → `generateContent` called with `model: 'qwen-turbo'`.
- fastModel unset → `generateContent` called with `model: 'main-model'`.

The tests intentionally mock a non-empty history so the kebab path
reaches the generateContent call site instead of bailing on empty input.
2026-04-23 20:37:05 +08:00

23 KiB

Session Title Design

A 3-7 word sentence-case session title generated by the fast model after the first assistant turn. Persisted in the session JSONL with a titleSource: 'auto' | 'manual' tag, surfaced in the session picker, and regeneratable on demand via /rename --auto.

Overview

/rename (#3093) lets a user label a session so they can find it again in the picker later, but until they run it the picker shows the first user prompt — often truncated mid-sentence, or describing a framing question rather than what the session actually became about. Manual renaming is optional friction most users never do.

The goal is to make session names useful by default:

  • Descriptive of what the session actually accomplished, not just the opening line. 3-7 words, sentence case, git-commit-subject style.
  • Best-effort: fires in the background after the first reply; if it fails the user never sees an error.
  • Deferential to the user: never clobber a /rename title the user chose deliberately, even across CLI tabs on the same session.
  • Explicitly regeneratable via /rename --auto for the "auto title became stale / I want a fresh one" case.

Triggers

Trigger Conditions Implementation
Auto After recordAssistantTurn fires. Skipped if an existing title is set, another attempt is in-flight, cap reached, non-interactive, env disabled, or no fast model. ChatRecordingService.maybeTriggerAutoTitle — fire-and-forget
Manual User runs /rename --auto renameCommand.ts via tryGenerateSessionTitle

Both paths funnel into a single function — tryGenerateSessionTitle(config, signal) — to guarantee identical prompt, schema, model selection, and sanitization. The auto trigger is a best-effort background call; the manual /rename --auto is a blocking user action that surfaces a reason-specific error on failure.

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                        packages/core/src/services/                      │
│                                                                         │
│  ┌──────────────────────────┐                                           │
│  │ chatRecordingService.ts  │                                           │
│  │                          │                                           │
│  │  recordAssistantTurn()   │                                           │
│  │     │                    │                                           │
│  │     ↓                    │                                           │
│  │  maybeTriggerAutoTitle() │── 6 guards ──→ IIFE(autoTitleController)  │
│  │     │                    │                       │                   │
│  │     └── resume hydrate   │                       ↓                   │
│  │         via              │          tryGenerateSessionTitle          │
│  │         getSessionTitle- │          (sessionTitle.ts)                │
│  │         Info             │                       │                   │
│  │                          │                       ↓                   │
│  └──────────────────────────┘          BaseLlmClient.generateJson       │
│                                        (fastModel + JSON schema)        │
│                                                       │                 │
│  ┌──────────────────────────┐                         ↓                 │
│  │ sessionService.ts        │         sanitizeTitle + sanity checks     │
│  │                          │                         │                 │
│  │  getSessionTitleInfo()   │◀── cross-process        ↓                 │
│  │      uses                │    re-read             recordCustomTitle  │
│  │  readLastJsonString-     │    before write        (…, 'auto')        │
│  │  FieldsSync              │                                           │
│  │  (sessionStorageUtils)   │                                           │
│  └──────────────────────────┘                                           │
│                                                                         │
│                          ┌─────────────────────┐                        │
│                          │ utils/terminalSafe  │                        │
│                          │ stripTerminalCtrl-  │                        │
│                          │ Sequences           │                        │
│                          └─────────────────────┘                        │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                     packages/cli/src/ui/                                │
│                                                                         │
│  commands/renameCommand.ts     ─── /rename <name>          → manual      │
│                                ─── /rename                 → kebab       │
│                                ─── /rename --auto          → auto       │
│                                ─── /rename -- --literal    → manual     │
│                                ─── /rename --unknown-flag  → error      │
│                                                                         │
│  components/SessionPicker.tsx  ── dims rows where                       │
│                                   session.titleSource === 'auto'        │
└─────────────────────────────────────────────────────────────────────────┘

Files

File Responsibility
packages/core/src/services/sessionTitle.ts One-shot LLM call + history filter + sanitize. Exports tryGenerateSessionTitle.
packages/core/src/services/chatRecordingService.ts maybeTriggerAutoTitle trigger, guards, cross-process re-read, abort-on-finalize.
packages/core/src/services/sessionService.ts getSessionTitleInfo public accessor; renameSession accepts titleSource.
packages/core/src/utils/sessionStorageUtils.ts extractLastJsonStringFields + readLastJsonStringFieldsSync atomic pair reader.
packages/core/src/utils/terminalSafe.ts stripTerminalControlSequences shared by sentence-case and kebab paths.
packages/cli/src/ui/commands/renameCommand.ts /rename --auto, sentinel parser, failure-reason message map.
packages/cli/src/ui/components/SessionPicker.tsx Dim styling for titleSource === 'auto'.

Prompt Design

System Prompt

Replaces the main agent's system prompt for this single call so the model only tries to label the session, not behave as a coding assistant.

Bullets below correspond 1:1 with TITLE_SYSTEM_PROMPT:

  • 3-7 words, sentence case (only first word and proper nouns capitalized).
  • No trailing punctuation, no markdown, no quotes.
  • Match the dominant language of the conversation; for Chinese, budget roughly 12-20 characters.
  • Be specific about the user's actual goal — name the feature, bug, or subject area. Avoid vague catch-alls like "Code changes" or "Help request".
  • Four good examples (three English + one Chinese) and four bad examples (too vague / too long / wrong case / trailing punctuation).
  • Return only a JSON object with a single title key.

Structured Output (JSON schema)

Instead of wrapping output in tags (as session-recap does), we use BaseLlmClient.generateJson with a function-calling schema:

const TITLE_SCHEMA = {
  type: 'object',
  properties: {
    title: {
      type: 'string',
      description:
        'A concise sentence-case session title, 3-7 words, no trailing punctuation.',
    },
  },
  required: ['title'],
};

Why function calling rather than free text + tag extraction:

  1. Cross-provider reliability — OpenAI-compatible endpoints, Gemini, and Qwen's native tool-calling all implement function calling; tag parsing would rely on every model respecting a text convention.
  2. No reasoning-preamble leakage — the function call arguments come back structured, so a "thinking" paragraph before the answer can't bleed into the title.
  3. Simpler post-processing — a single typeof result.title === 'string' check plus sanitizeTitle covers every realistic model drift.

The model may still return something the schema allows but the UX rejects (empty string, whitespace-only, 500 chars, markdown fencing, control chars). sanitizeTitle handles all of these and returns '' → service returns {ok: false, reason: 'empty_result'}.

Call Parameters

Parameter Value Reason
model getFastModel() — no fallback Auto-titling on main-model tokens is too expensive to be silent.
schema TITLE_SCHEMA Forces {title: string}; filters shape drift at the transport layer.
maxOutputTokens 100 More than enough for 7 words plus schema overhead.
temperature 0.2 Mostly deterministic — session titles benefit from stability across regeneration.
maxAttempts 1 Titles are best-effort cosmetic metadata; retries would queue behind user-visible main traffic.

Contrast with session-recap, which falls back to the main model. Title generation is triggered automatically and often; silently spending main-model tokens without a user opt-in is a real bill surprise. Manual /rename --auto explicitly fails with no_fast_model rather than fallback — forcing the user to make the fast-model choice consciously.

History Filtering

geminiClient.getChat().getHistory() returns Content[] that includes tool calls, tool responses (often 10K+ tokens of file content), and model thought parts. Feeding that raw into the title LLM would bias the label toward implementation noise like "Called grep on auth module".

filterToDialog keeps only user / model entries with non-empty text and no thought / thoughtSignature parts. takeRecentDialog slices to the last 20 messages and refuses to start on a dangling model/tool response. flattenToTail converts to "Role: text" lines and slices the last 1000 characters.

The 1000-character tail slice

A session that starts with help me debug X but pivots to refactoring Y should be titled about Y. Titling by the head locks in the opening framing; titling by the tail captures what the session became.

UTF-16 surrogate handling

.slice(-1000) on a UTF-16 code-unit boundary can orphan a high or low surrogate if a CJK supplementary char or emoji gets cut. Some providers respond to the resulting invalid UTF-16 with a 400 — which, without handling, would burn an attempt for no reason. flattenToTail drops a leading orphaned low surrogate; sanitizeTitle scrubs any orphaned surrogate after the max-length trim on the output path too.

Persistence

Record shape

CustomTitleRecordPayload grows an optional titleSource: 'auto' | 'manual' field:

{
  "type": "system",
  "subtype": "custom_title",
  "systemPayload": {
    "customTitle": "Debug login button on mobile",
    "titleSource": "auto",
  },
}

The field is optional, and absent-in-legacy records are treated as undefined. SessionPicker dims rows only on a strict === 'auto' match — a pre-change user /rename title is never silently reclassified as a model guess.

Resume hydration

On resume, ChatRecordingService constructor calls sessionService.getSessionTitleInfo(sessionId) to read both the title and its source. Without hydrating the source, finalize()'s re-append (which runs on every session lifecycle event) would rewrite auto as manual on every resume cycle — silently stripping the dim affordance.

Atomic pair read

extractLastJsonStringFields returns customTitle and titleSource from the same matching line in a single scan. Two separate readLastJsonStringFieldSync calls could land on different records if an older line has only the primary field, yielding a mismatched pair. The extractor also requires a proper closing quote on the primary value, so a crash-truncated trailing record can't win the latest-match race.

Full-file scan cap

Phase-2 (when the tail-window fast path misses) streams the whole file in 64KB chunks. Capped at MAX_FULL_SCAN_BYTES = 64 MB so a corrupt multi-GB JSONL can't freeze the session picker on the main event loop. The picker's latency envelope survives corruption.

Session reads open with O_NOFOLLOW (falls back to plain read-only on Windows, where the constant is not exposed). Defense in depth so a symlink planted in ~/.qwen/projects/<proj>/chats/ can't redirect a metadata read to an unrelated file.

Concurrency and Edge Cases

Trigger guard order

maybeTriggerAutoTitle checks six conditions in this exact order — each short-circuits the rest so the cheap ones run first:

  1. currentCustomTitle set → skip. Never overwrite manual / prior auto.
  2. autoTitleController !== undefined → skip. One attempt at a time.
  3. autoTitleAttempts >= 3 → skip. Cap bounds total waste.
  4. !config.isInteractive() → skip. Headless qwen -p / CI never spends fast-model tokens on a one-shot session.
  5. autoTitleDisabledByEnv() → skip. QWEN_DISABLE_AUTO_TITLE=1 explicit opt-out.
  6. !config.getFastModel() → skip. No fast-model → no-op.

Why the cap is 3, not 1

The first assistant turn can be a pure tool-call with no user-visible text (e.g. the model opens with a grep). tryGenerateSessionTitle returns {ok: false, reason: 'empty_history'} in that case. Without a retry window, an entire session's chance at a title would be burned on turn 1 before the user said anything interesting. Cap of 3 covers the common "first turn is noise" case while still bounding runaway retry on a persistently failing fast model.

Cross-process manual-rename race

Two CLI tabs on the same session file can diverge in memory. Tab A runs /rename foo and writes titleSource: manual. Tab B's ChatRecordingService has its own currentCustomTitle = undefined and would naively overwrite with an auto title.

After the LLM call resolves, the IIFE re-reads the JSONL via sessionService.getSessionTitleInfo. If the file shows source: 'manual', the IIFE bails AND syncs its in-memory state so subsequent turns respect the rename too. Cost: one 64KB tail read per successful generation; negligible.

Abort propagation on finalize()

autoTitleController doubles as the in-flight flag. finalize() (run on session switch and process shutdown) calls autoTitleController.abort() before re-appending the title record. The LLM socket is cancelled promptly; session switch doesn't wait on a slow fast-model call. The IIFE's finally block clears autoTitleController only if it's still the active one, so a finalize mid-flight doesn't race a concurrent recordAssistantTurn.

Manual /rename lands mid-flight

Between the IIFE's await completing and the recordCustomTitle('auto') call, the user could /rename foo. The IIFE re-checks this.currentTitleSource === 'manual' and bails. The in-process check AND the cross-process re-read both run; manual wins at both layers.

Configuration

User-facing knobs

Setting / env var Default Effect
fastModel unset Required for auto-titling. Unset → no-op (no main-model fallback).
QWEN_DISABLE_AUTO_TITLE=1 unset Opt out of the auto trigger without unsetting fastModel. /rename --auto still works on request.

No settings.json toggle — the env var is the only user-visible off-switch. Rationale: the feature is cosmetic and cheap; a settings toggle would add a UI surface for something that can live as a one-time env export for the few users who want to disable it.

Why auto doesn't fall back to the main model

Auto-titling is triggered unconditionally after every assistant turn. If a user without a fast model were silently charged main-model tokens for every new session's title, the cost delta is invisible until the monthly bill arrives. Failing quietly (no-op, no title, no cost) is the safer default. /rename --auto surfaces no_fast_model as an actionable error so the user can set one if they want to.

Observability

createDebugLogger('SESSION_TITLE') emits debugLogger.warn from the generator's catch block. Failures are fully transparent to the user — auto-title is an auxiliary feature and never throws into the UI.

Developers can grep for the [SESSION_TITLE] tag in the debug log (~/.qwen/debug/<sessionId>.txt; latest.txt symlinks to the current session). A working end-to-end call produces no log output; a failing one gets one WARN line with the underlying error message.

Security Hardening

The title value is rendered verbatim in the terminal (session picker) AND persisted in a user-readable JSONL file. Both surfaces are attack reachable if a compromised or prompt-injected fast model returns hostile text.

Concern Guard
ANSI / OSC-8 / CSI injection stripTerminalControlSequences before both JSONL write and picker render.
Clickable-link smuggle via OSC-8 Same — OSC sequences stripped as whole units, not just the ESC byte.
Invalid UTF-16 surrogates Scrubbed in flattenToTail (LLM input) and sanitizeTitle (LLM output after max-length trim).
Subtype-line spoof via user message content lineContains: '"subtype":"custom_title"' — user text that happens to contain the literal phrase can't shadow a real record.
Symlink redirect on session reads O_NOFOLLOW (no-op on Windows where the constant is missing).
Truncated trailing JSONL record extractLastJsonStringFields requires a closing quote before a record wins the latest-match race.
Pathological file size freezing the picker MAX_FULL_SCAN_BYTES = 64 MB cap on Phase-2 full-file scan.
Paired CJK bracket decorators (【Draft】) Stripped as a unit so a lone closing bracket doesn't dangle.

Out of Scope

Item Why not
Auto-regenerate when the title goes stale /rename --auto is the explicit user-triggered path. Silent mid-session title swaps would confuse users scrolling back through the picker.
WebUI / VSCode dim-styling parity Those surfaces read customTitle already and will show auto titles as if manual. A follow-up can wire the titleSource through.
Settings-dialog toggle for auto generation Env var is the single knob. Full settings UI is easy to add later if user demand surfaces.
i18n locale catalog entries for new strings Consistent with existing /rename strings, which fall through to English. A repo-wide i18n pass is out of scope.
Migration to re-classify legacy records Back-compat by design: absent titleSource is treated as manual. Rewriting old records would risk losing user intent.
Non-interactive auto-titling qwen -p / CI scripts throw the session away; fast-model tokens for a title no one will ever resume is pure waste.