qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-05 23:42:03 +00:00

Author	SHA1	Message	Date
Shaojin Wen	35fe97e0f6	feat(review): expand review pipeline + qwen review CLI subcommands (#3754) * feat(review): expand review pipeline + add `qwen review` CLI subcommands Review skill (SKILL.md) changes: - Step 4: 5 → 9 parallel agents (split Correctness/Security, add Test Coverage, 3 undirected personas: attacker / 3am-oncall / maintainer) - Step 5: verification "uncertain → reject" → "uncertain → low-confidence" (terminal-only "Needs Human Review" bucket; never posted as PR comments) - Step 6: single reverse audit → iterative (terminate on no-new-findings, hard cap 3 rounds) - Step 9: self-PR detection (downgrade APPROVE/REQUEST_CHANGES → COMMENT when GitHub forbids self-review with HTTP 422); CI status check (downgrade APPROVE → COMMENT on red/pending CI); existing-Qwen-comment classification with priority order Stale > Resolved > Overlap > NoConflict (only Overlap blocks for confirmation) `qwen review` CLI subcommands
(packages/cli/src/commands/review/): - fetch-pr — clean stale + fetch PR ref + create worktree + metadata - pr-context — emit Markdown context file with security preamble + already-discussed dedup section - load-rules — read review rules from base branch (4 source files) - deterministic— run tsc, eslint, ruff, cargo-clippy, go-vet, golangci-lint on changed files; filtered + structured findings JSON (TypeScript/JavaScript, Python, Rust, Go) - presubmit — self-PR + CI status + existing-comment classification in a single JSON report - cleanup — worktree + branch ref + per-target temp files (idempotent) Cross-platform: execFileSync (no shell), path.join, CRLF normalization, which/where for tool detection. Replaces bash-style inline commands in SKILL.md; works identically on macOS/Linux/Windows. Path consistency: SKILL.md temp files moved from /tmp/qwen-review-* to .qwen/tmp/qwen-review-* — matches what os.tmpdir() resolves to across platforms (macOS returns /var/folders/... not /tmp). DESIGN.md gains five "Why ..." sections explaining each design decision; docs/users/features/code-review.md synced for user-visible changes. * feat(review): expose full reply chains in pr-context output `qwen review pr-context` now renders each replied-to inline-comment thread as the original reviewer comment + chronological reply chain, instead of only listing the root-comment snippet. This lets review agents see at a glance whether a topic has been addressed (e.g. a "Fixed in <commit>" reply closes the thread) and avoids re-reporting already-resolved concerns without forcing the LLM driver to manually summarise each reply chain in agent prompts. 
- Walk `in_reply_to_id` chain to group replies under their root comment - Sort replies chronologically (by id, monotonic on GitHub) - Render thread block: root snippet as a quote + bulleted reply list - Sort threads by `(path, line)` for deterministic output - SKILL.md note updated to point agents at the new chain format * feat(review): include review-level summaries in pr-context output `qwen review pr-context` now also fetches `gh api repos/{owner}/{repo}/pulls/{n}/reviews` and renders a "Review summaries" section listing each reviewer's overall body (the comment they typed alongside an APPROVED / CHANGES_REQUESTED / COMMENTED submission). Closes a real gap found during the PR #3684 review: > "@wenshao [CHANGES_REQUESTED]: The previously identified exported > type rename issue no longer maps to the current PR diff, so this > review only includes the remaining high-confidence blocker." Without this section, the LLM driver's review agents would have missed that integration note from the prior reviewer. - New `RawReview` type + extra `ghApi` call - Filter: skip empty bodies + the canonical "No issues found. LGTM!" template the qwen-review pipeline auto-emits — those carry no agent-actionable content beyond the review state itself - Sort meaningful reviews by `submitted_at` for chronological output - Stdout summary now reports `M/N review summaries` (M = kept after filter) Smoke-tested on PR #3684: 30 inline, 3 issue, 1/30 review summaries correctly surfaces the @wenshao CHANGES_REQUESTED body and filters the 29 LGTM templates. * fix(review): paginate gh API calls to capture comments past page 1 `gh api <path>` defaults to per_page=30. Busy PRs cross that limit on inline comments, issue comments, and reviews — the latest entries (the ones most likely to contain new reviewer feedback or in-flight reply chains) end up on page 2+ and were silently truncated. 
Concrete bug found while re-reviewing PR #3684: Before: `30 inline, 3 issue comments, 1/30 review summaries` After: `97 inline, 3 issue comments, 6/67 review summaries` 5 additional reviewer-level summaries surfaced — including the @wenshao 2026-04-30 "Multi-agent re-review (Phase C)" body with the explicit verification notes that this PR's pipeline is supposed to chain forward into the next review. Changes: - `lib/gh.ts`: new `ghApiAll(path)` helper using `gh api --paginate`, which walks every `next` link and concatenates each page's array. - `pr-context.ts`: 3 fetches (inline / issue / reviews) → `ghApiAll`. - `presubmit.ts`: PR comments fetch → `ghApiAll` too (existing-comment classification was equally susceptible to dropping page 2+ overlap candidates). `check-runs` and `commits/<sha>/status` calls retain `ghApi` — those return objects (with embedded arrays) and rarely cross 30 entries. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-05-01 18:30:35 +08:00
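The reply-chain grouping described in the pr-context commit above can be sketched roughly as follows. This is a minimal illustration, not the code in `pr-context.ts`: the `InlineComment` shape is trimmed to the fields the grouping needs, and `groupThreads` and `Thread` are hypothetical names. Unlike the commit, which renders only replied-to threads, the sketch groups every inline comment.

```typescript
// Hypothetical, trimmed-down shape of an inline PR review comment.
// `id`, `in_reply_to_id`, `path`, `line` and `body` mirror GitHub's REST payload.
interface InlineComment {
  id: number;
  in_reply_to_id?: number;
  path: string;
  line: number | null;
  body: string;
}

interface Thread {
  root: InlineComment;
  replies: InlineComment[];
}

function groupThreads(comments: InlineComment[]): Thread[] {
  const byId = new Map<number, InlineComment>();
  for (const c of comments) byId.set(c.id, c);

  // Walk the in_reply_to_id chain until a comment with no known parent is reached.
  const findRoot = (c: InlineComment): InlineComment => {
    let cur = c;
    const seen = new Set<number>(); // guards against malformed cyclic data
    while (cur.in_reply_to_id !== undefined && !seen.has(cur.id)) {
      seen.add(cur.id);
      const parent = byId.get(cur.in_reply_to_id);
      if (!parent) break;
      cur = parent;
    }
    return cur;
  };

  const threads = new Map<number, Thread>();
  for (const c of comments) {
    const root = findRoot(c);
    let thread = threads.get(root.id);
    if (!thread) {
      thread = { root, replies: [] };
      threads.set(root.id, thread);
    }
    if (c.id !== root.id) thread.replies.push(c);
  }

  const result = [...threads.values()];
  // Replies in chronological order (ids are monotonic on GitHub).
  for (const t of result) t.replies.sort((a, b) => a.id - b.id);
  // Threads ordered by (path, line) for deterministic output.
  result.sort((a, b) =>
    a.root.path === b.root.path
      ? (a.root.line ?? 0) - (b.root.line ?? 0)
      : a.root.path < b.root.path ? -1 : 1,
  );
  return result;
}
```

Sorting by `(path, line)` keeps the rendered context file stable between runs, so an agent prompt built from it does not change when the API returns comments in a different order.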
jinye	431a87c384	Add background agent resume and continuation (#3739) * Add background agent resume support * Fix CLI typecheck against core workspace sources * Fix background agent resume hook and UI blocking * Honor folder trust when resuming agents * Fix background agent resume review follow-ups * Fix tasks command to include background agents * Harden background agent resume lifecycle * Fix background task cancellation persistence * Persist empty fork bootstrap transcripts * Align shell prompts with managed background mode * Guard session switches with background work * Preserve trailing user turns during resume --------- Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>	2026-05-01 12:14:33 +08:00
John London	4cd9f0cbe4	feat(core): add shared permission flow for tool execution unification (#3723 ) * docs: scaffold branch for #3247 tool execution unification Placeholder commit to establish the branch for PR creation. Actual refactoring will be done in subsequent commits. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(core): add shared permission flow for tool execution unification This addresses #3247 by consolidating duplicated tool execution behavior across Interactive, Non-Interactive, and ACP modes behind shared execution utilities. - Add permissionFlow.ts: shared L3→L4 permission evaluation logic - Add permissionFlow.test.ts: comprehensive test coverage (17 tests) - Export from index.ts for use across all execution modes Why: Permission handling logic was duplicated in CoreToolScheduler and Session.runTool(). This shared module ensures consistent behavior across all modes and provides a single source of truth for future fixes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(e2e): add bundle step to E2E workflow and fix canUseTool test - Add 'npm run bundle' to E2E workflow so dist/cli.js exists for SDK tests - Fix 'should handle control responses when stdin closes before replies' test: - Use helper.getPath() for absolute file path - Make prompt explicitly invoke write_file tool - Remove inputStreamDonePromise timeout that caused false failures - Add q.endInput() to signal stdin done - Assert canUseTool was called and file content is updated * fix(core): wire evaluatePermissionFlow() and address PR review feedback Address review feedback on PR #3723: - Wire evaluatePermissionFlow() in coreToolScheduler.ts (both call sites) - Wire evaluatePermissionFlow() in Session.ts (ACP mode) - Delete TOOL_EXECUTION_UNIFICATION.md (had literal \n artifacts) - Add PermissionFlowPermission union type for stronger typing - Document the 'default' permission state in docstring - Use needsConfirmation/isPlanModeBlocked/isAutoEditApproved helpers 
--------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 22:10:37 +08:00
Shaojin Wen	6efcf2b877	feat(core): add FileReadCache and short-circuit unchanged Reads (#3717 ) * feat(core): add FileReadCache and short-circuit unchanged Reads Track Read / Edit / WriteFile operations per session in a new FileReadCache, keyed by (dev, ino) so symlinks, hardlinks, and case-variant paths collapse to one entry. ReadFile consults the cache on entry: when a full Read of a text file is repeated against an unchanged inode (mtime+size match, no intervening recordWrite), it returns a short placeholder instead of re-emitting the file content. Range-scoped Reads, non-text payloads, and post-write reads always fall through to the full pipeline. The cache is a one-instance-per-Config field, which gives subagents an empty cache automatically. Edit / WriteFile do not consume it yet — a follow-up will wire prior-read enforcement onto the same cache. * fix(core): refine FileReadCache contract per PR review feedback Three changes addressing review feedback on PR #3717: 1. Truncated reads no longer arm the placeholder. A "full" Read whose output got truncated (line cap or character cap) means the model only saw the head of the file; returning `file_unchanged` next call would falsely imply "you've already seen everything", so we keep such entries non-cacheable and let the next call re-emit the truncated window. 2. Add a Config-level escape hatch (`fileReadCacheDisabled`, default false). When true, ReadFile bypasses both the fast-path lookup and the post-read record so behaviour matches the pre-cache build byte-for-byte. Intended for sessions that may undergo context compaction or transcript transformation, where the placeholder's "you saw the content earlier in this conversation" assumption becomes unreliable. 3. The `unchangedResult` placeholder now explicitly warns about three distinct retrieval failures: context compaction, subagent transcript transformation, and external mutation (shell / MCP / other process). The previous wording only covered the third. 
Also adds a `READ_FILE_CACHE` debug logger that emits `hit` / `miss` on every full-Read cache consultation, so cache-hit rate can be observed locally without committing to a full telemetry pipeline. * fix(core): clear FileReadCache on startNewSession The file-read cache backs ReadFile's `file_unchanged` placeholder, whose correctness depends on the model having seen the prior full read earlier in the current conversation. `/clear` and session resume both go through `startNewSession()`, which previously left cache entries from the outgoing session in place. Result: a follow-up full Read of an unchanged file in the new session could return the placeholder despite the new conversation never having received the file contents, leaving the model to reason about content it cannot retrieve. Calls `this.fileReadCache.clear()` from `startNewSession()` and adds a regression test asserting the cache is empty after a session restart. Reported by `pomelo-nwu` on PR #3717. * fix(core): tighten FileReadCache contract per 3rd review pass Six issues raised by the 3rd review on PR #3717, all addressed: 1. Subagent cache isolation (was the most critical bug). Every subagent / scoped-agent / fork path constructs its Config via `Object.create(parent)`, which does not run instance field initializers. The child therefore resolved `fileReadCache` through the prototype chain to the parent's instance — so a subagent's ReadFile would return the file_unchanged placeholder for files the subagent's own transcript had never received. Fixed centrally in `getFileReadCache()` with a lazy own-property check, so every `Object.create(Config)` site (6 of them today) automatically gets an isolated cache without each site needing to remember to override the field. New regression tests assert (a) `Object.create` children get a distinct cache and (b) repeated calls return the same instance. 2. Edit / WriteFile now call `cache.recordWrite(absPath, postWriteStats)` on the success path. 
Without this, low-resolution mtime filesystems (FAT/exFAT, NFS attribute caches, same-millisecond rewrites on POSIX) would leave the cache reporting `fresh` after an edit and ReadFile would serve the pre-edit placeholder. Best-effort: a stat failure here is non-fatal (the next Read will re-stat). 3. `tryCompressChat` (in `core/client.ts`) now clears the cache after `startChat(newHistory)` succeeds. Compaction rewrites the prompt history so prior full-Read tool results may no longer be in the model's context, but the cache previously kept claiming "the model has seen this file in this conversation." 4. ReadFile auto-memory paths skip the fast-path entirely. Auto-memory files (AGENTS.md and the auto-memory root) get a per-read `<system-reminder>` freshness note in the slow path; returning the placeholder would silently drop that staleness signal. These files are small; re-emitting them is cheap. 5. The cache's recorded fingerprint is now the post-read stat, not the pre-read one. processSingleFileContent does its own internal stat between the pre-read stat and the bytes that land in `result.llmContent`; if the file mutated in that window, the old code would record a fingerprint that did not correspond to the bytes actually emitted. A subsequent Read whose stat happened to match the recorded fingerprint would then serve a placeholder pointing at content the model never saw. 6. The empty `catch` around the pre-read stat now logs `stat-failed` with `err.code` so oncall can distinguish a transient stat failure from a genuine cache miss in the debug stream. One-line change, no behaviour difference. Reported by `pomelo-nwu` on PR #3717. * test(core): mock getFileReadCache in client.test.ts CI flagged 5 tryCompressChat tests as TypeError after the cache.clear() hook was added in `0471799fd` — the existing mock Config in client.test.ts predates the FileReadCache wiring and did not stub getFileReadCache(). 
Local test runs missed this because they were scoped to the cache / read-file / edit / write-file / config files. Adds the minimal getFileReadCache stub returning an object with a clear() method, matching the only call shape tryCompressChat needs.	2026-04-30 17:47:48 +08:00
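Two ideas from the FileReadCache commit above lend themselves to a short sketch: the `(dev, ino)` key with an mtime+size fingerprint, and the lazy own-property check that gives `Object.create(parent)` children their own cache. The code below is a simplified illustration under assumptions, not the shipped implementation: `StatLike`, `isUnchanged`, the stat-only `recordWrite` signature, and the "drop the entry on write" behaviour are all stand-ins chosen for brevity.

```typescript
// Subset of fs.Stats that the fingerprint needs (hypothetical helper type).
interface StatLike {
  dev: number;
  ino: number;
  mtimeMs: number;
  size: number;
}

class FileReadCache {
  private entries = new Map<string, { mtimeMs: number; size: number }>();

  // Keying by (dev, ino) collapses symlinks, hardlinks and case-variant paths.
  private key(s: StatLike): string {
    return `${s.dev}:${s.ino}`;
  }

  recordRead(s: StatLike): void {
    this.entries.set(this.key(s), { mtimeMs: s.mtimeMs, size: s.size });
  }

  // Assumption: a write simply invalidates the entry so the next Read is a full one.
  recordWrite(s: StatLike): void {
    this.entries.delete(this.key(s));
  }

  isUnchanged(s: StatLike): boolean {
    const e = this.entries.get(this.key(s));
    return e !== undefined && e.mtimeMs === s.mtimeMs && e.size === s.size;
  }

  clear(): void {
    this.entries.clear();
  }
}

class Config {
  // Field initializers run for `new Config()` but NOT for Object.create(parent).
  private fileReadCache = new FileReadCache();

  getFileReadCache(): FileReadCache {
    // Lazy own-property check: a child made with Object.create(parent) would
    // otherwise resolve `fileReadCache` through the prototype chain to the parent.
    if (!Object.prototype.hasOwnProperty.call(this, "fileReadCache")) {
      this.fileReadCache = new FileReadCache();
    }
    return this.fileReadCache;
  }
}
```

Putting the check inside the getter means every `Object.create(Config)` call site gets isolation for free, which is the reason the commit gives for fixing it centrally.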
qwen-code-ci-bot	3f0b47172a	chore(release): v0.15.6 (#3766) Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-04-30 15:59:35 +08:00
tanzhenxin	6c71b6b09c	chore(core): drop tool token usage tracking (#3727) The `tool_token_count` field was sourced from `toolUsePromptTokenCount` on the GenAI usage metadata, but none of the providers we adapt (OpenAI/DashScope, Anthropic) populate it, and Google's Gemini API only emits it for built-in server-side tools that qwen-code does not use. The metric was therefore always zero in practice, so the dedicated counter, telemetry field, UI row, and supporting plumbing are removed end-to-end (telemetry types, OTEL counter type, UI aggregation, model stats display, qwen-logger payload, VS Code session schema, and docs).	2026-04-30 15:35:01 +08:00
易良	49e462c021	fix(lsp): fix LSP docs and the isPathSafe restriction, and raise the LSP tool-call rate (#3615) * fix(docs): correct outdated and inaccurate LSP documentation - Remove reference to non-existent `packages/cli/LSP_DEBUGGING_GUIDE.md` - Remove reference to unimplemented `/lsp status` slash command - Replace incorrect `DEBUG=lsp` env var with actual debug log location (`~/.qwen/debug/` session files with `[LSP]` tag) - Remove external Claude Code documentation links (`code.claude.com`) - Document `isPathSafe` constraint: absolute paths outside workspace are blocked, users must add server binary directory to PATH - Add practical troubleshooting: `ps aux \| grep <server>` to check if the server process is actually running - Add clangd-specific guidance: `--background-index`, `compile_commands.json` location, and `--compile-commands-dir` usage - Simplify trust documentation (remove vague "configure in settings") fix(lsp): allow absolute paths in LSP server command configuration Previously, `isPathSafe` rejected any command containing a path separator that resolved outside the workspace directory. This blocked legitimate use cases where users specify absolute paths to language server binaries (e.g. `/usr/bin/clangd`, `/opt/tools/jdtls/bin/jdtls`). The fix allows: - Bare command names resolved via PATH (unchanged) - Absolute paths (explicit user intent, already gated by trust checks) - Relative paths within the workspace (unchanged) Only relative paths that traverse outside the workspace (e.g. `../../malicious-binary`) are still blocked. Closes: server silently fails to start when users configure absolute paths in `.lsp.json`, with only a debug log warning visible. * feat(lsp): inject LSP priority instruction into system prompt when enabled The model was not using the LSP tool because the system prompt's "Tool Usage" section never mentioned it.
The tool description alone ("ALWAYS use LSP as the PRIMARY tool") was insufficient — models follow system prompt instructions more reliably than tool descriptions. Changes: - getCoreSystemPrompt() accepts `options.lspEnabled` parameter - When LSP is enabled, injects an instruction in the Tool Usage section telling the model to ALWAYS use the LSP tool FIRST for code intelligence queries (definitions, references, hover, symbols, etc.) instead of falling back to grep/readfile - Updated client.ts to pass config.isLspEnabled() to the prompt builder - Updated test mocks and snapshots * feat(lsp): add symbolName parameter for position-free LSP queries The model avoided calling LSP for findReferences, hover, etc. because these operations required filePath + line + character which the user rarely provides. The model would read files directly instead. Changes: - Add `symbolName` optional parameter to LspTool - When symbolName is provided without line/character, auto-resolve the symbol's position via workspaceSymbol before executing the actual operation (findReferences, hover, goToImplementation, etc.) - Update tool description with examples showing symbolName usage - Move LSP priority instruction to top of system prompt for visibility - Add debug logging for LSP prompt injection This enables natural queries like: {operation: "findReferences", symbolName: "Calculator"} {operation: "hover", symbolName: "addShape"} without requiring the user to know exact file positions. * feat(lsp): add LSP reminder to grep/readfile tool descriptions When LSP is enabled, the model often chose grep or readfile instead of LSP for code intelligence queries. Now the competing tools' descriptions include a note reminding the model to use the LSP tool for definitions, references, symbols, hover, diagnostics, etc. 
This "push-pull" approach: - System prompt pushes toward LSP (top-level priority instruction) - Grep/ReadFile descriptions pull away from code intelligence usage * fix(docs): align LSP doc with isPathSafe change — absolute paths now supported The doc still said "absolute paths outside the workspace are not supported" but the code was changed to allow them. Updated all three places (Required Fields table, Troubleshooting, Debugging) to reflect that absolute paths are now accepted. * fix(lsp): improve symbol-based tool resolution * fix(lsp): normalize display paths across platforms * fix(lsp): narrow docs and path safety changes * fix(lsp): add edge-case tests for isPathSafe and fix Chinese comment - Add test for intermediate path traversal (./a/../../../etc/passwd) - Add test for forward-slash relative paths (tools/clangd) - Replace Chinese JSDoc with English on requestUserConsent * fix(lsp): rename requestUserConsent to checkWorkspaceTrust The method only checks workspace trust level and does not actually prompt the user for consent. Rename the method and update the JSDoc and call-site log message to accurately reflect the behavior.	2026-04-30 15:24:18 +08:00
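The path rule that the `isPathSafe` change above settles on (bare names and absolute paths allowed, relative paths allowed only while they stay inside the workspace) might look roughly like this. It is a sketch under assumptions: the two-argument signature is hypothetical, and `path.posix` is used so the example behaves the same on every platform, whereas the real code handles Windows paths as well.

```typescript
import * as path from "node:path";

// Hypothetical signature; the real isPathSafe may take different arguments.
function isPathSafe(command: string, workspaceRoot: string): boolean {
  // Bare command names (no separator) are resolved via PATH.
  if (!command.includes("/") && !command.includes("\\")) return true;

  // Absolute paths express explicit user intent (trust checks happen elsewhere).
  if (path.posix.isAbsolute(command)) return true;

  // Relative paths must still resolve inside the workspace.
  const resolved = path.posix.resolve(workspaceRoot, command);
  const rel = path.posix.relative(workspaceRoot, resolved);
  return rel !== ".." && !rel.startsWith("../") && !path.posix.isAbsolute(rel);
}
```

Resolving before comparing is what catches intermediate traversal such as `./a/../../../etc/passwd`, one of the edge cases the commit adds tests for.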
Fu Yuchen	2f1b52d3d3	fix(core): preserve reasoning_content in rewind, compression, and merge paths (#3579) (#3737) * fix(core): preserve reasoning_content in rewind, compression, and merge paths (#3579) * chore(core): remove dead stripThoughtsFromHistory methods (#3579) * revert(pr): remove redundant reasoning merge per review feedback (#3737) Per tanzhenxin's review: the compressed ack is plain text without tool_calls so the thought-part injection is unnecessary, and the converter reasoning merge is redundant given #3729's canonical ensureReasoningContentOnToolCalls in the deepseek provider. Both paths are now handled at the request boundary, not in history transformation.	2026-04-30 10:25:00 +08:00
tanzhenxin	da2936336b	fix(core): replay DeepSeek reasoning_content on all assistant turns (#3747) Extend the DeepSeek reasoning_content normalization (introduced in #3729) to assistant turns without tool_calls. The DeepSeek API rejects follow-up requests in thinking mode whenever any prior assistant turn omits reasoning_content, not just turns that carried tool_calls.	2026-04-30 07:43:16 +08:00
顾盼	65a1503e13	fix(memory): use project transcript path for dream (#3722) * fix(memory): use project transcript path for dream Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(memory): quote dream transcript grep path Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test(memory): make dream prompt quoting test portable Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-29 17:56:53 +08:00
tanzhenxin	762f603e9b	fix(core): inject reasoning_content on DeepSeek tool-call replays (#3729 ) DeepSeek's thinking-mode API requires every prior assistant turn that carried tool_calls to replay reasoning_content in subsequent requests, or it returns HTTP 400 ("The reasoning_content in the thinking mode must be passed back to the API"). The model can legitimately return a tool round without any reasoning text — qwen-code then stored no thought parts and rebuilt the next request with reasoning_content absent, tripping the API's check. The DeepSeek provider now normalizes outgoing assistant messages so any turn carrying tool_calls always has reasoning_content set (empty string when none was emitted). Other providers are unaffected. Refs #3695	2026-04-29 16:28:29 +08:00
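The normalization described in #3729 can be illustrated with a small pure function. The name `ensureReasoningContentOnToolCalls` is the one mentioned in the log; the `ChatMessage` shape and the function signature are assumptions made for this sketch, and the real provider code may differ.

```typescript
// Assumed minimal OpenAI-compatible message shape for illustration only.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: unknown[];
  reasoning_content?: string;
}

// Returns a copy in which every assistant turn carrying tool_calls has
// reasoning_content set, using an empty string when the model emitted none.
function ensureReasoningContentOnToolCalls(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((m) =>
    m.role === "assistant" &&
    m.tool_calls !== undefined &&
    m.tool_calls.length > 0 &&
    m.reasoning_content === undefined
      ? { ...m, reasoning_content: "" }
      : m,
  );
}
```

Doing this on the outgoing request rather than in stored history keeps other providers unaffected, which matches the commit's note that the fix lives in the DeepSeek provider.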
Shaojin Wen	7b3d36e1f3	feat(cli): wire background shells into combined Background tasks dialog (#3720 ) * feat(cli): wire background shells into combined Background tasks dialog Phase B follow-up #2: surface managed background shells in the same overlay that already shows local subagents, so users get one unified view instead of having to remember /tasks for shells. - BackgroundShellRegistry: add setRegisterCallback/setStatusChangeCallback and requestCancel(id), mirroring BackgroundTaskRegistry's contract. register() also fires statusChange so subscribers see the lifecycle start, not just transitions. - useBackgroundTaskView: subscribe to both registries, merge entries by startTime, attach a `kind` discriminator (DialogEntry union) so renderers can dispatch on agent vs shell. - BackgroundTasksPill: group running counts by kind ("2 shells, 1 local agent"); when all entries are terminal, collapse to "N task(s) done". - BackgroundTasksDialog: replace per-kind section header with a single "Background tasks" header; ListBody renders shell rows as "[shell] <command>"; DetailBody dispatches to AgentDetailBody (the original) or a new ShellDetailBody (cwd / output file / pid / exit). - Context cancelSelected switches by kind: agents go through cancel(), shells through requestCancel() — only aborts, lets the spawn settle path record the real terminal state (mirrors task_stop in #3687). Tests: 8 pill cases (singular/plural per kind, mixed, terminal-only), 4 dialog cases (auto-fallback on running→terminal, cancel flow, already-terminal stays in detail, selectedIndex clamp); shell registry gains 5 callback tests + 3 requestCancel tests. * fix(cli): refresh detail-body agent fields between status changes useBackgroundTaskView shallow-copies agent entries into DialogEntry so each entry can carry a `kind` discriminator. 
The copy detaches `recentActivities` from the registry: BackgroundTaskRegistry.appendActivity mutates `entry.recentActivities = next` on the registry object and emits `activityChange`, but the dialog's activity callback only bumps a local counter — so the snapshot's `recentActivities` reference goes stale and the Progress block keeps rendering the old array until the next status-driven refresh. Resolve `selectedEntry` against the registry on each render when the selected entry is an agent, with `activityTick` as a useMemo dep so it recomputes on every activity callback. Snapshot remains the source of truth for the list (no churn on the pill / AppContainer); only the detail body re-reads live. Also rename the non-empty list section header from "Local agents" to "Background tasks" to match the empty-state branch and the unified multi-kind contents. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-29 16:06:36 +08:00
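As a rough illustration of the merged view and pill text described in #3720, the sketch below tags entries with a `kind` discriminator, merges them by `startTime`, and builds the pill label. `BaseEntry`, `mergeEntries`, and `pillLabel` are hypothetical names, and the singular forms ("1 shell", "1 task done") are assumptions; only "2 shells, 1 local agent" and "N task(s) done" come from the commit message.

```typescript
interface BaseEntry {
  id: string;
  startTime: number;
  running: boolean;
}

// `kind` lets renderers dispatch on agent vs shell.
type DialogEntry = BaseEntry & { kind: "agent" | "shell" };

function mergeEntries(agents: BaseEntry[], shells: BaseEntry[]): DialogEntry[] {
  const tagged: DialogEntry[] = [
    ...agents.map((a) => ({ ...a, kind: "agent" as const })),
    ...shells.map((s) => ({ ...s, kind: "shell" as const })),
  ];
  return tagged.sort((a, b) => a.startTime - b.startTime);
}

function plural(n: number, word: string): string {
  return `${n} ${word}${n === 1 ? "" : "s"}`;
}

function pillLabel(entries: DialogEntry[]): string {
  if (entries.length === 0) return "";
  const running = entries.filter((e) => e.running);
  // When every entry is terminal, collapse to a single "done" summary.
  if (running.length === 0) return `${plural(entries.length, "task")} done`;
  const shells = running.filter((e) => e.kind === "shell").length;
  const agents = running.filter((e) => e.kind === "agent").length;
  const parts: string[] = [];
  if (shells > 0) parts.push(plural(shells, "shell"));
  if (agents > 0) parts.push(plural(agents, "local agent"));
  return parts.join(", ");
}
```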
Shaojin Wen	1b9e8ec45d	feat(core): wire background shells into the task_stop tool (#3687 ) * feat(core): wire background shells into the task_stop tool Phase B follow-up #1 from #3634, unblocked by #3471 (control plane) merging in. The model can now cancel a managed background shell with the same `task_stop` tool it uses for subagents — no more falling back to `kill <pid>` via BashTool. Lookup order: subagent registry first (existing behavior), then the background shell registry as a fallback. Agent IDs follow `<subagentName>-<suffix>` and shell IDs follow `bg_<8 hex chars>`, so the two namespaces cannot collide in practice; the order is fixed for determinism (a defensive test pins agent-wins-over-shell). The shell cancel path resolves through the entry's own AbortController (which `BackgroundShellRegistry.cancel` triggers); the child process exit handler then settles the registry to `cancelled` and the on-disk output file is preserved for inspection via `/tasks` or a direct `Read`. This matches Phase B's "registry's own AbortController is the cancellation source of truth" design without needing the in-flight notification framework that subagents use. Tests: 7 task-stop tests (was 4) — added cancel-shell happy path, NOT_RUNNING for already-exited shell, and a defensive agent-takes-precedence-on-id-collision case. * fix(core): defer shell terminal transition until spawn handler settles @doudouOUC noticed that the previous task_stop path called `BackgroundShellRegistry.cancel(id, Date.now())`, which marked the entry `cancelled` immediately. The spawn handler's settle path only records real exit info via cancel/complete/fail when the entry is still `running`, so the cancel-vs-exit race could permanently hide a real completed/failed result and `/tasks` would show a terminal endTime while the process was still draining. 
Add a `requestCancel(id)` method to `BackgroundShellRegistry` that triggers the entry's AbortController only; status stays `running` until the settle path observes the abort and records the real terminal state. The immediate-mark `cancel(id, endTime)` is reserved for `abortAll()` / shutdown, where the CLI process is tearing down anyway and there is no settle handler to wait for. Tests updated: - `task-stop.test.ts` cancel-shell happy path now asserts the entry stays `running` with `endTime` undefined post-stop, and the abort signal fires (the settle path's contract, not task_stop's, is the one that flips status). - 3 new `requestCancel` tests in `backgroundShellRegistry.test.ts`: running → abort+still-running, terminal entry no-op, unknown id no-op. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-29 10:10:41 +08:00
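The difference between `requestCancel(id)` and the immediate-mark `cancel(id, endTime)` is easiest to see in a stripped-down registry. The sketch below is an assumption-laden illustration, not the real `BackgroundShellRegistry`: the entry shape, `register`, and `complete` are simplified, and the spawn settle path is simulated by calling `complete` directly.

```typescript
type ShellStatus = "running" | "completed" | "failed" | "cancelled";

interface ShellEntry {
  id: string;
  status: ShellStatus;
  endTime?: number;
  abortController: AbortController;
}

class BackgroundShellRegistry {
  private entries = new Map<string, ShellEntry>();

  register(id: string): ShellEntry {
    const entry: ShellEntry = { id, status: "running", abortController: new AbortController() };
    this.entries.set(id, entry);
    return entry;
  }

  get(id: string): ShellEntry | undefined {
    return this.entries.get(id);
  }

  // Only signals the abort; status stays `running` until the settle path
  // observes the abort and records the real terminal state.
  requestCancel(id: string): void {
    const e = this.entries.get(id);
    if (!e || e.status !== "running") return;
    e.abortController.abort();
  }

  // Immediate mark, reserved for shutdown where no settle handler will run.
  cancel(id: string, endTime: number): void {
    const e = this.entries.get(id);
    if (!e || e.status !== "running") return;
    e.abortController.abort();
    e.status = "cancelled";
    e.endTime = endTime;
  }

  // Stand-in for the settle path: only a still-running entry can be settled.
  complete(id: string, endTime: number): void {
    const e = this.entries.get(id);
    if (!e || e.status !== "running") return;
    e.status = "completed";
    e.endTime = endTime;
  }
}
```

Because settle methods only act on `running` entries, marking `cancelled` too early would discard a real completed or failed result, which is the race the second commit in #3687 fixes.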
pomelo	2ee014e347	fix(cli): refresh static header on model switch (#3667) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-29 07:16:26 +08:00
易良	8de1bcb279	chore(release): bump version to 0.15.3 (#3708) Update all package versions from 0.15.2 to 0.15.3 across the monorepo including root package.json, package-lock.json, and all sub-packages (channels, cli, core, vscode-ide-companion, web-templates, webui). Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-28 21:04:52 +08:00
tanzhenxin	8807c02676	fix(core): set DeepSeek V4 context to 1M and output to 384K (#3693) DeepSeek V4 (flash, pro) ships with a 1M context window and 384K max output, but the generic deepseek pattern was capping it at 128K input / 8K-64K output. Add a dedicated v4 pattern that takes precedence over the generic deepseek fallback. Fixes #3679	2026-04-28 16:44:20 +08:00
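The "dedicated pattern takes precedence over the generic fallback" idea in this commit amounts to an ordered, first-match-wins lookup. A minimal sketch, assuming hypothetical regexes, a hypothetical `lookupLimits` helper, and limits kept as the labels used in the commit message rather than exact token counts:

```typescript
interface TokenLimit {
  pattern: RegExp;
  context: string; // label from the commit message, not an exact token count
  maxOutput: string;
}

// Order matters: the first matching pattern wins, so the specific v4 entry
// must come before the generic deepseek fallback. Both regexes are assumptions.
const LIMITS: TokenLimit[] = [
  { pattern: /^deepseek-v4/, context: '1M', maxOutput: '384K' },
  { pattern: /^deepseek/, context: '128K', maxOutput: '8K-64K' },
];

function lookupLimits(model: string): TokenLimit | undefined {
  const normalized = model.toLowerCase();
  return LIMITS.find((limit) => limit.pattern.test(normalized));
}
```

If the two entries were swapped, every DeepSeek V4 model name would match the generic pattern first, which is the capping behavior the commit fixes.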
tanzhenxin	784b3cef66	fix(core): treat ask_user_question multiSelect as optional (#3699) The schema advertised `default: false` but also listed the field as required, and the validator hard-rejected calls where the model omitted it. Models read the default annotation and reasonably skipped the field, then got the error `Question 1: "multiSelect" must be a boolean.` and could not recover. Make multiSelect optional in the schema and the input types, and only error when the field is present with a non-boolean value. Existing UI consumers already coalesce a missing value to false. Fixes #3218	2026-04-28 16:40:26 +08:00
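The validation rule described in this commit (accept a missing field, reject only a present non-boolean) can be illustrated as below. The function names and the question shape are hypothetical; only the error text is taken from the commit message.

```typescript
// Returns an error message, or null when the question is acceptable.
function validateMultiSelect(
  question: Record<string, unknown>,
  index: number,
): string | null {
  const value = question['multiSelect'];
  if (value === undefined) return null; // omitted: allowed, treated as false downstream
  if (typeof value !== 'boolean') {
    return `Question ${index + 1}: "multiSelect" must be a boolean.`;
  }
  return null;
}

// Consumers coalesce a missing value to false.
function resolveMultiSelect(question: { multiSelect?: boolean }): boolean {
  return question.multiSelect ?? false;
}
```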
Shaojin Wen	aac2e96ec3	feat(core): managed background shell pool with /tasks command (#3642) * feat(core): managed background shell pool with /bashes command Replace shell.ts's `&` fork-and-detach background path with a managed process registry. Background shells now have observable lifecycle, captured output, and explicit cancellation — matching the pattern used by background subagents (#3076). Phase B from #3634 (background task management roadmap). What changes - New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts): per-process entry with status (running / completed / failed / cancelled), AbortController, output file path. State transitions are one-shot (terminal status sticks; late callbacks no-op). Mirrors the lifecycle shape of #3471's BackgroundTaskRegistry so the two can be unified later. 
- `shell.ts` is_background path rewritten as `executeBackground`: - Spawns the unwrapped command (no '&', no pgrep envelope) - Streams stdout to `<projectDir>/tasks/<sessionId>/shell-<id>.output` (path layout aligns with the direction sketched in #3471 review) - Bridges the external abort signal into the entry's AbortController so a single source of truth governs cancellation - Returns immediately with id + output path; agent's turn isn't blocked - Settles the registry entry asynchronously when ShellExecutionService resolves: complete (clean exit) / fail (error) / cancel (aborted) - Removes ~120 lines of dead bg-specific code from shell.ts: pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows early-return path, bg PID parsing, tempFile cleanup - New `/bashes` slash command: lists registered shells with id, status, runtime, command, output path. Empty state prints a friendly message. What this PR doesn't do - Footer pill / dialog integration — gated on #3488 landing - task_stop / send_message integration — gated on #3471 landing - Auto-backgrounding heuristics for long foreground bash — Phase D Test plan - 11 registry unit tests (state machine + idempotent terminal transitions) - 4 background-path tests in shell.test.ts (spawn no-wrap + complete / fail / cancel settle paths) - 2 /bashes command tests (empty + populated) - Full core suite: 247 files / 6075 passed (existing tests unaffected) * fix(core): address PR #3642 review feedback Three [Critical] from the auto review + naming alignment with Claude Code: - shell.ts settle: non-zero exit code or termination signal now bucket into `failed` instead of `completed`. The previous `if (result.error) fail else complete()` would misreport `false` / failed `npm test` as success because ShellExecutionService surfaces ordinary command failures as a non-zero exitCode with `error: null`. Failure reason carries the exit code or signal so `/tasks` shows the real cause. 
- ShellExecutionService.childProcessFallback: add `streamStdout` mode that emits each decoded chunk through the existing onOutputEvent path. The default (foreground) path continues to buffer + emit the cleaned final blob, so existing in-line shell calls are unaffected. executeBackground opts in via `{ streamStdout: true }`, which is what makes the captured output file actually useful for long-running processes (dev servers, watchers) — without it the file stayed empty until the process exited. - shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'` but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the build; switched to `signal: null`. Added a test that explicitly covers the new "non-zero exit → failed" path so the bucketing change has regression coverage. - shell.ts comment: explicitly document why background shells force `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead weight for fire-and-forget commands). - /bashes → /tasks (alias bashes), description "List and manage background tasks" — matches Claude Code's command name. Currently lists shells only; will surface other task kinds (subagents, monitor) as those registries land via #3471 / #3488. * fix(core): address PR #3642 second-round review feedback - shellExecutionService streaming: drop stdout/stderr buffer + outputChunks accumulation in streaming mode. Each decoded chunk goes straight to onOutputEvent and is GC-eligible immediately. Long-running background commands (dev servers, watchers) no longer accumulate unbounded memory proportional to total output. Buffered (foreground) mode is unchanged. - shell.ts executeBackground: stripAnsi each chunk before writing to the output file. Dev servers / build tools spam color codes and cursor-move sequences that would render as garbage in the file the agent reads. 
- bashesCommand: command description "List and manage" → "List background tasks" — current implementation only supports listing, cancellation follows when the unified task_stop tool from #3471 is wired in. Replace the hand-rolled formatRuntime helper with the shared formatDuration utility (uses hideTrailingZeros for parity with the previous output). - backgroundShellRegistry: add a comment documenting the lack of an eviction policy as a known limitation. LRU / age-based / capped-size eviction (and on-disk output rotation) is left as a follow-up alongside the broader output-file lifecycle story. * fix(core): address PR #3642 third-round review feedback - shell.ts executeBackground: add 'error' listener on the output write stream. fs.createWriteStream surfaces write failures (disk full, permission, fs going away) as 'error' events; without a listener Node treats it as an uncaught exception and kills the entire CLI session. Log + drop is the sane default — the registry still settles via resultPromise so /tasks shows the right terminal status. - shell.ts executeBackground: store the abort handler reference and removeEventListener in the settle callback. Background shells outlive the turn signal; the dangling listener was keeping `entryAc` (and transitively `outputStream`) reachable until the turn signal itself was GC'd, which for long sessions would never happen. - shell.test.ts: extend the createWriteStream mock with an `on` stub so the new error-listener wiring doesn't crash the test suite. * refactor(cli): drop /bashes alias and rename file to tasksCommand Per follow-up review: the slash command should be exclusively /tasks. Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` → `tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand` → `tasksCommand`, and cleans up the remaining `/bashes` references in backgroundShellRegistry.ts comments. No behavior change beyond the alias removal. 
* refactor(cli): finish tasksCommand rename — apply content changes The previous commit (`03c8503c8`) only captured the file rename via `git mv`; the export name change (`bashesCommand` → `tasksCommand`), the removal of `altNames: ['bashes']`, the import update in BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in backgroundShellRegistry.ts were unstaged when that commit landed. Squash candidate before merge. * fix(core): address PR #3642 fourth-round review feedback Four reviewer concerns from @wenshao + @doudouOUC: - [Critical] Config.shutdown() now also calls `backgroundShellRegistry.abortAll()`. Previously only the subagent registry was aborted, so a managed background shell could outlive the CLI process and orphan its child. Symmetric with how `BackgroundTaskRegistry.abortAll()` is wired in. - [P1] shell.ts executeBackground strips a trailing `&` from the command before spawn. The managed path is itself the backgrounding mechanism; forwarding `node server.js &` verbatim made bash exit immediately while the real child outlived the wrapper, causing the registry to settle as `completed` while the shell was still running and chunked output to land on a closed stream. Strip + warn. - [P2] Output file moves under `storage.getProjectTempDir()` (specifically `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`). `ReadFileTool` already auto-allows the project temp dir, so the LLM can `Read` the captured output without bouncing off a permission prompt — important because background-agent contexts can't surface interactive prompts. - [P2] Background shells are no longer killed when the current turn's AbortSignal fires. Forwarding the turn signal into the entry's AbortController meant a Ctrl+C on the turn would also terminate intentionally backgrounded dev servers / watchers, contradicting the independent-lifecycle promise. Cancellation now flows only through `entryAc` (driven by future `task_stop` integration via #3471). 
Tests: - New `abortAll` registry tests cover running / mixed / empty cases. - `runs background commands as managed pool entries` test stops asserting the wrapper-vs-entry signal identity since they're now structurally separate (no turn-to-entry forwarding). - New `does not forward the turn signal into the background shell` test pins the new behavior. - New `strips trailing & from the spawned command` test pins the strip. - Removed the cancel-via-outer-signal settle test — that path no longer exists; cancellation is exercised end-to-end via the registry's own `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`. * fix(core): tighten trailing & strip — narrow regex + ReDoS-safe Two reviewer concerns on the same line of #3642 round 4: - [Critical CodeQL] `\s&+\s$` is a polynomial-time regex on uncontrolled input (long all-`&` strings backtrack quadratically). - [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&` into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into `echo foo \` (eats the escaped literal). Only the bare bash background operator should be stripped. Replace the regex with a small linear-time helper `stripTrailingBackgroundAmp` that explicitly checks for the three "don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` / `slice` — no regex backtracking, and the intent reads off the page. Tests: - Existing strip-trailing-`&` test still passes. - New `does not strip a trailing &&` test pins the logical-AND case. - New `does not strip an escaped trailing \\&` test pins the escape case. * fix(core): keep binary-detection sniff in streaming mode @doudouOUC noted that `streamStdout` shortcut returned before the binary-sniff path, so a background command emitting binary bytes (`cat /bin/ls`, image dump, etc.) would be text-decoded and appended to the task output file unbounded. 
Restructure handleOutput so the sniff-and-cutover logic runs in both modes: - Both modes accumulate up to MAX_SNIFF_SIZE for the binary check. The accumulator is bounded; once the threshold is reached, it stops growing in streaming mode (dropped on binary detection / left inert on text confirmation) and continues to accumulate in buffered mode (existing foreground behavior). - Streaming mode emits 'binary_detected' as soon as `isBinary` trips so the consumer can stop writing the output file. Up to ~4KB of bytes may have been emitted as text chunks before detection — this is bounded and acceptable; the unbounded write is the pathology reviewers flagged. - Streaming text mode still emits each decoded chunk immediately and does not accumulate stdout/stderr strings, so long-running text streams remain GC-friendly. - Buffered (foreground) behavior is unchanged — the sniff accumulator is the same path the existing tests cover. Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57 shell.test.ts all pass; no regressions. * fix(core): tighten streaming sniff bound + Windows rmSync flake Two unrelated reds on the latest CI run: 1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks. The previous fix recomputed `sniffedBytes` from `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk — pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE (line-sized stdout, e.g. dev-server logs) the byte count never grew, the sniff branch stayed open forever, and `outputChunks` accumulated every later chunk — exactly the leak `streamStdout` was meant to prevent. Track sniffed bytes by running sum (`sniffedBytes += data.length`) so the bound is genuine. When sniff confirms text in streaming mode, drop the accumulator immediately so subsequent chunks fall through the streaming emit path without ever touching it. 2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows (ENOTEMPTY: directory not empty). 
The exporter's underlying write stream hasn't always released its handle by the time `rmSync` runs. Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through the brief Windows handle-release window instead of failing the test on a CI quirk. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-28 11:06:50 +08:00
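The `stripTrailingBackgroundAmp` helper named in this commit is described only in prose (plain `endsWith` / `slice`, three "don't touch" cases). A minimal sketch of what such a helper could look like; the body is an assumption, not the code from shell.ts:

```typescript
// Strip a single trailing bash background operator without using a regex.
// Leaves the command untouched for `&&`, an escaped `\&`, or no trailing `&`.
function stripTrailingBackgroundAmp(command: string): string {
  const trimmed = command.trimEnd();
  if (!trimmed.endsWith('&')) return command; // nothing to strip
  if (trimmed.endsWith('&&')) return command; // logical AND, leave as-is
  if (trimmed.endsWith('\\&')) return command; // escaped literal ampersand
  return trimmed.slice(0, -1).trimEnd();
}
```

Each check is a constant-size suffix comparison, so the work is linear in the command length and there is no backtracking. This sketch does not handle an escaped backslash before the ampersand (`\\&`); a real implementation may need to.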
tanzhenxin	03c88b7308	feat(cli): background-agent UI — pill, combined dialog, detail view (#3488) * feat(cli): background-task UI — pill, combined dialog, detail view Adds the user-facing surface for background tasks on top of the model-facing agent control primitives merged in #3471. A dedicated pill in the footer summarises running tasks, ↓ focuses it, and Enter opens a combined dialog listing every task with a detail view that shows the original prompt, live stats, and a rolling progress feed of recent tool invocations. Also renames BackgroundAgent* to BackgroundTask* for consistency with the user-facing terminology and the task_* tool family. * chore: trigger CI	2026-04-28 10:57:59 +08:00
Fu Yuchen	d09c19c0c5	fix(core,cli): stop stripping reasoning on switch and resume paths (#3682)	2026-04-28 09:22:17 +08:00
JerryLee	1befabe586	fix(core): handle shell line continuations in command splitting (#3600) Fixes #3158. `splitCommands()` previously handled backslash escapes before newline/operator splitting, so a chained command like `cd project && \\<LF>git add ...` produced segments starting with a backslash-newline pair, leaving the extracted command root empty and bypassing per-command permission checks for the chained sub-command. Treat `\\` followed by LF as a removed line continuation, while keeping `\\` followed by CRLF as a normal command separator (bash escapes only \r; the trailing \n still ends the command). This preserves the contract that every chained sub-command is visible to permission parsing and prevents an attacker from hiding a command behind a pseudo-continuation like `echo SAFE \\<CR><LF>rm -rf /`. Adds regression coverage for both the LF-continuation positive case and the escaped-CRLF safety case.	2026-04-27 23:31:37 +08:00
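The LF versus CRLF distinction in this commit can be shown with a deliberately simplified sketch. It is not `splitCommands()`: the helper names are hypothetical, and quoting, operators such as `&&`, and escaped backslashes are ignored here.

```typescript
// Remove backslash + LF pairs (a shell line continuation). A backslash
// followed by CR LF is left alone: the backslash escapes only the CR,
// so the LF still terminates the command.
function removeLineContinuations(command: string): string {
  return command.replace(/\\\n/g, '');
}

// Split into per-line commands after joining continuations.
function splitLines(command: string): string[] {
  return removeLineContinuations(command)
    .split(/\r?\n/)
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0);
}
```

With this rule, `cd project && \<LF>git add .` becomes one line whose second sub-command is still visible, while `echo SAFE \<CR><LF>rm -rf /` stays two separate commands, so `rm -rf /` cannot hide behind a pseudo-continuation.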
Bramha.dev	414b3304cd	fix(core): split tool-result media into follow-up user message for strict OpenAI compat (#3617) Fixes #3616. Adds opt-in `splitToolMedia` flag (default false). When enabled, media parts (image / audio / video / file) returned by MCP tool calls are split into a follow-up `role: "user"` message instead of being embedded in the `role: "tool"` message. Required for strict OpenAI-compatible servers (e.g., LM Studio) that reject non-text content on tool messages with HTTP 400 "Invalid 'messages' in payload". Media from parallel tool responses is accumulated and emitted as a single follow-up user message after all tool messages, preserving OpenAI's contiguity requirement for tool responses. Default behavior is unchanged for permissive providers.	2026-04-27 23:01:02 +08:00
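The message layout this commit describes (tool messages kept contiguous, media deferred into one trailing user message) can be sketched as follows. All type and function names are hypothetical simplifications; the real converter works on the provider's message types.

```typescript
type Part =
  | { kind: 'text'; text: string }
  | { kind: 'media'; mimeType: string; data: string };

interface ToolResult {
  toolCallId: string;
  parts: Part[];
}

type Message =
  | { role: 'tool'; tool_call_id: string; content: Part[] }
  | { role: 'user'; content: Part[] };

function buildToolMessages(results: ToolResult[], splitToolMedia: boolean): Message[] {
  const messages: Message[] = [];
  const deferredMedia: Part[] = [];

  for (const result of results) {
    if (!splitToolMedia) {
      // Default: media stays embedded in the tool message.
      messages.push({ role: 'tool', tool_call_id: result.toolCallId, content: result.parts });
      continue;
    }
    // Strict mode: keep only text on the tool message, defer the media.
    const textParts = result.parts.filter((part) => part.kind === 'text');
    deferredMedia.push(...result.parts.filter((part) => part.kind === 'media'));
    messages.push({ role: 'tool', tool_call_id: result.toolCallId, content: textParts });
  }

  // One follow-up user message after all tool messages keeps them contiguous.
  if (deferredMedia.length > 0) {
    messages.push({ role: 'user', content: deferredMedia });
  }
  return messages;
}
```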
qqqys	8a278767ed	fix(core): recover from `}{` glued records on session JSONL load (#3606) (#3656)	2026-04-27 22:50:17 +08:00
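The commit title does not say how glued `}{` records are recovered. One possible approach, shown here purely as an assumption, is a brace-depth scan that respects JSON strings, so a `}{` inside a string value is not treated as a record boundary. The function name is hypothetical.

```typescript
// Split a JSONL line that may contain several top-level objects glued
// together (e.g. `{"a":1}{"b":2}`) into individual JSON object strings.
function splitGluedRecords(line: string): string[] {
  const records: string[] = [];
  let depth = 0;
  let inString = false;
  let escaped = false;
  let start = 0;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      if (depth === 0) start = i; // a new top-level record begins
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) records.push(line.slice(start, i + 1));
    }
  }
  return records;
}
```

A loader could try `JSON.parse` on the whole line first and fall back to a scan like this only when parsing fails, so well-formed lines pay no extra cost.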
tanzhenxin	581c74d76e	feat(core): model-facing agent control (task_stop, send_message, per-agent transcript) (#3471) * feat(core): task_stop, send_message, and live transcripts for background agents Add two new tools (task_stop, send_message) and a plain-text transcript writer so the parent model can control and observe long-running background subagents. The agent lifecycle is also tightened so every background launch is paired with exactly one terminal task_notification — including under cancellation races and pathological tools that swallow AbortSignal. * feat(core): switch background-agent transcript to ChatRecord JSONL Replaces the plain-text per-agent transcript writer with one that emits the same ChatRecord schema as the main session log. Each background subagent now writes to <projectDir>/subagents/<sessionId>/agent-<id>.jsonl with a .meta.json sidecar; records carry agentId/agentName/agentColor and isSidechain so a single parser can reconstruct the parent session and its subagents as one tree. A new EXTERNAL_MESSAGE event is emitted when send_message injections are drained inside agent-core, so each follow-up message is persisted as a user-role record and the transcript remains a complete view of the run. read_file's auto-allow set is extended to <projectDir>/subagents/ so the model can keep polling the transcript path advertised in the launch response and the completion notification XML. * feat(core): emit full background agent result in task-notification Drop the 2000-char truncation on <result> in emitNotification. The agent output is already a model-generated summary; truncating it strips content the parent agent specifically asked for. The <output-file> path is still included for anyone who wants the structured transcript. 
* test(cli): add hasUnfinalizedAgents/abortAll to registry mock The nonInteractiveCli test stub was missing two methods that the runtime now calls when draining background agents on shutdown, causing every runNonInteractive test to fail with TypeError. * test(core): use path.join in agent-transcript path helper assertions Hard-coded forward slashes in expected paths failed on Windows where path.join produces backslashes. * fix(core): thread nested agent identity into sidecar metadata * feat(agent): improve background agent launch tool result - Add internal-ID qualifier, anti-duplication clause, and large-file reading strategy to the launch tool-result template, ported from claw-code. - Rename transcript_file to output_file for consistency. - Reference read_file and run_shell_command via ToolNames constants instead of raw strings. * fix(core): rename send_message target field * fix(core): exclude task_stop and send_message from subagents These are parent-side control-plane tools for managing background subagents. Subagents themselves cannot launch background agents (AGENT is already excluded), so they have no agent IDs to manage natively, and exposing the tools only widens the surface for cross-agent interference if an ID leaks via prompt or transcript. * refactor(core): generalize task_stop and send_message framing to "task" Today every BackgroundTaskRegistry entry is a subagent, but the control-plane tools were named and described as agent-only. Generalize so future task kinds (e.g. backgrounded shells, monitors) can share the same registry without a model-facing rename. - task_stop / send_message: descriptions, error messages, and ToolError enum values drop the "agent" framing in favor of "task". - send_message: parameter to -> task_id, matching task_stop for a uniform control-plane contract. - BackgroundTaskRegistry.hasUnfinalizedAgents -> hasUnfinalizedTasks. 
- agent-transcript: add a TODO at getSubagentSessionDir flagging that <projectDir>/subagents/ is part of the model-facing contract via <output-file>; future kinds should migrate to <projectDir>/tasks/. - Add a test for complete()-after-finalizeCancelled no-op to pin the one-notification-per-task SDK contract through the post-notified re-entry path.	2026-04-27 20:36:38 +08:00
Shaojin Wen	f420742831	feat(cli,core): LLM-generated summary labels for tool-call batches (#3538) * feat(cli,core): generate tool-use summaries for compact mode After each tool batch completes, fire a parallel fast-model call to generate a short git-commit-subject-style label summarizing what the batch accomplished (e.g. "Read txt files", "Searched in auth/"). In compact mode the label replaces the generic "Tool × N" header so N parallel tool calls collapse to a single semantic row. The fast-model call (~1s) runs fire-and-forget, overlapped with the next turn's API stream, so there is no perceived latency. Missing fast model, aborted turns, and model failures all degrade silently to the existing rendering. The summary is also emitted as a `tool_use_summary` history entry with `precedingToolUseIds`, keeping the shape compatible with SDK clients that want to render collapsed tool views on their own. Gated by `experimental.emitToolUseSummaries` (default on). 
Can be overridden per-session with `QWEN_CODE_EMIT_TOOL_USE_SUMMARIES=0|1`. The system prompt and truncation rules (300 chars per tool field, 200 chars of trailing assistant text as intent prefix) match the existing behavior seen in other tools that emit the same message type, so SDK consumers see a consistent shape across clients. * fix(core): bound cleanSummary quote-strip regex to avoid ReDoS CodeQL js/polynomial-redos flagged the /^["'`]+|["'`]+$/g pattern in cleanSummary because its input comes from an LLM (treated as uncontrolled). The original regex is anchored and linear in practice, but tightening the quantifier to {1,10} both satisfies the static check and caps engine work on pathological model output with a long run of quotes. Ten opening/closing quotes is well past anything a real label would produce. * fix(cli): render tool_use_summary inline so full mode also shows the label The summary was only visible in compact mode because the full-mode ToolGroupMessage ignored the compactLabel prop. Compact mode got away with this because mergeCompactToolGroups triggers refreshStatic(), which re-renders the merged tool_group with its newly-looked-up label. Full mode has no such refresh path, so when the fast-model call resolves after the tool_group has been committed to the append-only <Static>, there is no way to retroactively decorate it. Switch to rendering `tool_use_summary` as its own inline history item (a single dim `● <label>` line). New items append cleanly to <Static>, so the summary flows in naturally once the fast-model call resolves. Compact mode still replaces the merged tool_group header with the label and hides the standalone summary line via the `compactMode` guard. With this, the feature works under the default `ui.compactMode: false` — not just the opt-in compact view. 
* docs: tool-use-summaries feature guide, settings entry, and design doc Three new docs matching the existing fast-model feature docs layout: - docs/users/features/tool-use-summaries.md — user-facing guide covering full + compact rendering, configuration (settings + env), failure modes, cost, and cross-links to followup-suggestions. - docs/users/configuration/settings.md — register the new experimental.emitToolUseSummaries setting next to the other fast-model-driven UI settings. - docs/design/tool-use-summary/tool-use-summary-design.md — deep dive matching the compact-mode-design.md competitive-analysis style. Documents the Claude Code port (prompt, truncation, timing, gate), the deviations (settings layer, default on, cleanSummary, dual render paths), and the Ink <Static> append-only rationale that drove the inline full-mode render vs header-replacement split. * docs: add Recommended pairing section to tool-use-summaries Full-mode rendering of the summary works, but for small same-type batches (Read × 3 and similar) the label visibly restates what the tool lines already show. Pairing with ui.compactMode: true folds the whole batch into a single labeled row, which is the cleanest transcript shape once the label is available. Adds a dedicated section showing the paired settings.json snippet and explicitly calling out when each mode wins (and when to turn the feature off instead). * fix: address review feedback on tool-use summary generation Addresses multiple issues from @chiga0's review: Blocking — compact-mode label invisible for single-batch turns. mergeCompactToolGroups's adjacency-only gating left a trailing tool_use_summary in the merged result whenever there was no second batch to merge across. That pushed mergedHistory.length lock-step with history.length and MainContent's refreshStatic heuristic (currMLen <= prevMLen) never fired, so Ink's append-only <Static> never repainted the tool_group with its newly-looked-up label. 
Drop tool_use_summary items unconditionally now; gemini_thought still survives to avoid unnecessary repaints. New tests cover the single-batch case and the summary-before-user-message case. Blocking — stale summary appears after Ctrl+C on the next turn. summarySignal captured the CURRENT turn's AbortController, but the summary resolves during the NEXT turn's streaming window. The next turn's submitQuery allocates a fresh controller, so the captured signal was never aborted — Ctrl+C during the new turn used to let the previous turn's summary land in the transcript seconds later. Fix: dedicated per-batch AbortController tracked in a ref set, aborted eagerly from cancelOngoingRequest; resolve-time check reads the live abort state and turnCancelledRef. High — summarizer input pollution. geminiTools contained error/cancelled tools; retry-loop warnings and "Cancelled by user" strings were feeding the fast model. cleanSummary can only reject error-shaped output, not prevent the model from hallucinating a plausible label from bad input (the PR's own tmux screenshot showed "Read txt files · 5 tools" where 4 of the 5 were prior-retry failures). Filter to status === 'success' before building the prompt; skip the call entirely if nothing's left. High — unstable label on merged groups. getCompactLabel iterated all callIds and returned the first hit, so asynchronous resolution order made the header visibly flip from SB to SA when batch A resolved after batch B. Lock onto item.tools[0].callId to keep stable "leading batch governs" semantics. High — force-expanded groups in compact mode had no label at all. Compact mode routes non-force-expand groups through CompactToolGroupDisplay (consumes compactLabel) and force-expand groups through the full ToolGroupMessage (ignores compactLabel); the standalone ● line was gated on !compactMode, creating a dead zone — exactly the diagnostically valuable case. 
MainContent now computes absorbedCallIds (which groups actually consume the header replacement) and passes summaryAbsorbed to HistoryItemDisplay; force-expand groups in compact mode get the standalone line as the label's only path to the screen. Medium — cleanSummary robustness. Extend quote-strip to Unicode curly + CJK corner brackets; strip markdown emphasis (bold, _italic_); broaden refusal-prefix rejection to curly-apostrophe "I can't", Chinese "我无法 / 我不能 / 抱歉 / 无法", and "Failed to / Sorry, / Request failed". 7 new cleanSummary tests cover the added cases. Low — concurrent-rendering safety. Move historyRef.current = history from render phase into useLayoutEffect so bailed renders can't leave a dropped value. Low — CompactToolGroupDisplay readability. Extract renderSummaryHeader / renderDefaultHeader helpers and document the toolCalls.length > 1 count-suffix guard so a future "fix" to >= 1 doesn't reintroduce "Read config.json · 1 tools". Docs — add Scope & Lifecycle section to tool-use-summaries.md covering (1) one generation per batch shared by both modes, (2) no backfill on toggle / session resume, (3) main-agent batches only with the Task-tool clarification. * fix: address second-round review feedback on tool-use summaries Critical — force-expand groups lost their summary entirely. Previous round's "drop tool_use_summary unconditionally" merge fix also stripped summaries for force-expanded groups, defeating the exact case (errors, confirmations, focused shell) where the standalone ● label is the label's only path to the screen. The merge function now takes an absorbedCallIds set: summaries whose preceding callIds are all absorbed by a compact tool_group header are dropped (so refreshStatic still fires), but force-expanded summaries pass through to be rendered standalone by HistoryItemDisplay. MainContent computes absorbedCallIds from raw history and passes it in. 
New tests cover both the absorbed-drop and the force-expand-preserve cases plus the empty-set default for callers that don't compute absorption. Suggestion — late-arriving summaries could land out of order. A slow fast-model call could resolve after the next turn's content was committed, planting the ● label between later items in full mode. The resolve callback now captures the first batch callId, locates the corresponding tool_group at resolve time, and drops the summary if a newer tool_group has already appeared in history. New test exercises this with a manually-resolved fast-model promise. Suggestion — truncateJson allocated full JSON for large strings. A 10MB ReadFile result was being JSON.stringify'd in full only to be sliced down to 300 chars. Added preTruncate that walks the value (depth-bounded to 4) and slices string leaves to maxLength before serialization. Tests verify the input never reaches its full pre-cap form. Suggestion — settings description over-claimed SDK emission. The description said summaries are emitted to SDK clients as a tool_use_summary message; the SDK plumbing isn't actually wired in this PR (the factory is exported for follow-up). Updated settings.json description and regenerated the vscode schema to state CLI-only scope explicitly. Suggestion — fastModel data-boundary not documented. When fastModel uses a different provider than the main session model, tool inputs/outputs cross a new auth boundary that users may not expect. Added "Data flow & privacy" section to the user feature doc spelling out: same-provider fast model = no scope change; different-provider = strictly larger sharing scope; two escape hatches (same-provider fast model OR feature off). Code-level mitigation (metadata-only mode) deferred.	2026-04-27 16:54:10 +08:00
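The `preTruncate` step described in the second-round fixes above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function signatures, the `[truncated]` placeholder used once the depth bound is reached, and the default cap of 300 are assumptions based only on the commit text.

```typescript
// Hypothetical sketch of pre-truncation before JSON serialization.
// Only the names preTruncate / truncateJson and the depth bound of 4 come
// from the commit message; everything else is assumed for illustration.
const MAX_DEPTH = 4;

function preTruncate(value: unknown, maxLength: number, depth = 0): unknown {
  if (typeof value === 'string') {
    // Slice string leaves so a huge string is never serialized in full.
    return value.length > maxLength ? value.slice(0, maxLength) : value;
  }
  if (value === null || typeof value !== 'object') {
    return value; // numbers, booleans, undefined pass through unchanged
  }
  if (depth >= MAX_DEPTH) {
    return '[truncated]'; // assumption: placeholder for overly deep values
  }
  if (Array.isArray(value)) {
    return value.map((item) => preTruncate(item, maxLength, depth + 1));
  }
  const source = value as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    out[key] = preTruncate(source[key], maxLength, depth + 1);
  }
  return out;
}

function truncateJson(value: unknown, maxLength = 300): string {
  if (value === undefined) return '';
  const json = JSON.stringify(preTruncate(value, maxLength));
  return json.length > maxLength ? json.slice(0, maxLength) : json;
}
```

With this shape, a 10MB string leaf is cut to `maxLength` characters before `JSON.stringify` runs, so the full serialized form is never allocated.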
pomelo	7fe853a782	Feat/openrouter auth (#3576 ) * feat(cli): add OpenRouter auth flow Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * feat(cli): add OpenRouter model management UI Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): align OpenRouter OAuth fallback session Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * refactor(cli): unify OpenRouter model setup flow Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * feat(auth): update OAuth description with provider examples and i18n support - Updated OAuth option description to include provider examples (OpenRouter, ModelScope) - Added internationalization support for new description text - Updated all language files (en, zh, de, fr, ja, pt, ru) with translations Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * docs: simplify OpenRouter design docs Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test(auth): fix OpenRouter OAuth mock typing Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test(auth): sync AuthDialog tests with new three-option main menu layout Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> Update assertions that referenced removed 'Qwen OAuth' and 'OpenRouter' options in the main/API-key views to match the refactored OAUTH / CODING_PLAN / API_KEY structure. * fix(i18n): add missing zh-TW translation for browser-based auth key Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> zh-TW.js was generated from main's en.js which had already removed this key, but the PR re-adds it in en.js. Sync zh-TW with the new translation. * feat(cli): Improve custom auth wizard with step indicators and cleaner advanced config (#3607) * feat(cli): Add custom API key auth wizard with 6-step setup flow Replace the documentation-only "Custom API Key" screen with an in-terminal wizard: Protocol select → Base URL input → API Key input → Model ID input → JSON review → Save. 
- Add 5 new ViewLevels and render functions in AuthDialog - Implement utility functions: generateCustomApiKeyEnvKey (normalization), normalizeCustomModelIds (split/trim/dedupe), maskApiKey (display) - Implement handleCustomApiKeySubmit in useAuth with backup, env key generation, modelProviders merge, auth refresh, and user feedback - Wire handler through UIActionsContext and AppContainer - Add 18 unit tests for utilities, 4 wizard flow integration tests * feat(cli): Improve custom auth wizard with step indicators and cleaner advanced config - Add step indicators (Step 1/6 · Protocol) to each wizard screen - Remove redundant Protocol/Endpoint context from each step for focus - Redesign advanced config: add descriptions to thinking/modality toggles - Remove max tokens option; keep only thinking and modality settings - Add ↑↓ arrow navigation with Space toggle and Enter to continue - Generation config flows through review JSON and final submit Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test: Fix Windows CI failures in fileUtils and AuthDialog tests - fileUtils.test.ts: Mock node:child_process execFile to prevent pdftotext spawn that times out on Windows (ENOENT, 5s timeout) - AuthDialog.test.tsx: Add char-by-char typeText() helper to work around Node 24.x + ink TextInput compatibility issue on Windows Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): Reset advanced wizard state and use JSON.stringify for settings preview - Reset advancedThinkingEnabled, advancedModalityEnabled, and focusedConfigIndex when re-entering custom wizard to prevent state leakage between configurations - Replace hand-rolled JSON string concatenation with JSON.stringify for settings.json preview to properly escape special characters in model IDs and base URLs Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): harden OpenRouter OAuth callback handling Co-authored-by: 
Qwen-Coder <qwen-coder@alibabacloud.com> * test(cli): stabilize OpenRouter state mismatch test Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test(cli): stabilize custom auth wizard navigation Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-27 14:47:44 +08:00
John London	ccb9857a5c	refactor(config): dedupe QWEN_CODE_API_TIMEOUT_MS env override logic (#3653 ) Extract duplicated timeout env override block into a shared helper applyTimeoutEnvOverride(), used by both resolveModelConfig() and resolveQwenOAuthConfig(). Preserves precedence: modelProvider > env > settings > default. Adds [Regression] and [Additional] tests guarding against the original OAuth-path bug and covering edge cases.	2026-04-27 08:44:18 +08:00
Dragon	534ca986eb	feat(cli): add argument-hint support for slash commands (#3593 ) Adds argument-hint support across the slash command pipeline. Skill and command authors specify an argument-hint field in markdown frontmatter, which renders as inline ghost text when the user has typed the command name but not yet provided arguments. Pipeline: - Skill parsing: SkillConfig.argumentHint parsed from SKILL.md frontmatter - Command loaders: propagated through SkillCommandLoader, BundledSkillLoader, FileCommandLoader, command-factory - UI: useCommandCompletion shows hint as ghost text with showCursorBeforeText layout; InputPrompt separates display text from Tab-accept text - ACP: passed as input.hint per spec - Bundled skills (batch, loop, qc-helper, review) get hints Hint is excluded from completion menu labels to keep the dropdown clean and disappears as soon as the user starts typing arguments.	2026-04-27 08:29:50 +08:00
jinye	3b0b6c052b	feat(cli): add API preconnect to reduce first-call latency (#3318 ) Fire a fire-and-forget HEAD request early in startup to warm the TCP+TLS connection. Subsequent SDK calls share an undici dispatcher with preconnect, reusing the warmed connection to save 100-200ms on the first request. Skip conditions: - NODE_EXTRA_CA_CERTS set (enterprise TLS inspection) - Sandbox mode (process-restart context) - Non-default baseUrl (mTLS / private deployment) - Non-Node runtimes (Bun) Disable via QWEN_CODE_DISABLE_PRECONNECT=1. Closes #3223	2026-04-27 06:54:55 +08:00
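The skip conditions listed in the preconnect commit above can be read as a single guard. The sketch below is a hedged illustration: `shouldPreconnect`, the `PreconnectInput` shape, and the way the runtime and default base URL are passed in are all hypothetical; only the two environment variable names and the four skip conditions come from the commit message.

```typescript
// Hypothetical guard mirroring the skip conditions in the commit message.
// The interface and function names are illustrative, not the project's API.
interface PreconnectInput {
  env: Record<string, string | undefined>;
  sandboxed: boolean;
  baseUrl: string | undefined; // undefined means "use the default"
  defaultBaseUrl: string;
  runtime: 'node' | 'bun' | 'other';
}

function shouldPreconnect(input: PreconnectInput): boolean {
  // Explicit opt-out.
  if (input.env['QWEN_CODE_DISABLE_PRECONNECT'] === '1') return false;
  // Enterprise TLS inspection: a custom CA bundle is configured.
  if (input.env['NODE_EXTRA_CA_CERTS']) return false;
  // Sandbox mode restarts the process, so a warmed socket would be lost.
  if (input.sandboxed) return false;
  // Non-default endpoints may need mTLS or live in a private deployment.
  if (input.baseUrl !== undefined && input.baseUrl !== input.defaultBaseUrl) {
    return false;
  }
  // Only the Node runtime is covered.
  return input.runtime === 'node';
}
```

Keeping the decision in one pure function makes each skip condition easy to cover with a unit test, independent of the actual HEAD request.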
John London	70127b5cd8	fix(config): support QWEN_CODE_API_TIMEOUT_MS across OAuth and non-OAuth paths (#3629 ) * feat(config): support API timeout env override Adds support for QWEN_CODE_API_TIMEOUT_MS as an environment override for model generation timeout. Qwen Code already supports timeout configuration via: settings.model.generationConfig.timeout This change introduces an env-based override for users running slow local/OpenAI-compatible backends where editing config is less convenient. Precedence: modelProvider > env var > settings > default (120000ms) Behavior: - Valid positive env values override configured timeout - Invalid values are ignored - Default behavior remains unchanged (applied in buildClient()) Note: The 5-minute timeout reported in #1045 originally came from undici's default bodyTimeout, which is now disabled (bodyTimeout:0). The modelConfigResolver default is 120000ms (2 minutes). Includes unit tests covering precedence and validation. Closes #1045 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(core): add edge-case tests for QWEN_CODE_API_TIMEOUT_MS Covers: large timeout values, whitespace-padded env values, negative env values, and reinforces provider > env > settings precedence. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(config): support QWEN_CODE_API_TIMEOUT_MS override Adds support for QWEN_CODE_API_TIMEOUT_MS as an environment override for model generation timeout. Closes #13 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 05:59:06 +08:00
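The precedence rule stated above (modelProvider > env var > settings > default of 120000ms) can be sketched as a small resolver. The names `parseTimeoutEnv` and `resolveTimeoutMs` are hypothetical, and accepting whitespace-padded values by trimming them is an assumption; the commit only says such values are covered by tests.

```typescript
// Hypothetical sketch of the timeout precedence described in the commit.
// Not the project's resolver; names and the trimming rule are assumptions.
const DEFAULT_TIMEOUT_MS = 120000;

function parseTimeoutEnv(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const trimmed = raw.trim(); // assumption: padded values are accepted
  if (!/^\d+$/.test(trimmed)) return undefined; // rejects "", "abc", "-5"
  const parsed = Number(trimmed);
  return Number.isSafeInteger(parsed) && parsed > 0 ? parsed : undefined;
}

interface TimeoutSources {
  modelProvider?: number; // per-provider timeout, highest priority
  env?: string; // raw QWEN_CODE_API_TIMEOUT_MS value
  settings?: number; // settings.model.generationConfig.timeout
}

function resolveTimeoutMs(sources: TimeoutSources): number {
  return (
    sources.modelProvider ??
    parseTimeoutEnv(sources.env) ??
    sources.settings ??
    DEFAULT_TIMEOUT_MS
  );
}
```

An invalid environment value falls through to the next source instead of raising, which matches the "invalid values are ignored" behavior in the commit text.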
Shaojin Wen	29887ddfef	fix(core): match DeepSeek provider by model name for sglang/vllm (#3613 ) (#3620 ) Some OpenAI-compatible servers (notably sglang's deepseek-v4 jinja template) crash on the array form of message content even when it carries a single text block, with `TypeError: sequence item 0: expected str instance, list found` at `encoding_dsv4.py:336`. The DeepSeekOpenAICompatibleProvider already flattens content arrays into joined strings in buildRequest, but isDeepSeekProvider only matched on the official api.deepseek.com baseUrl. DeepSeek models served behind sglang / vllm / ollama / etc. bypass the workaround and hit the bug. Extend the matcher to also detect by model name (case-insensitive substring 'deepseek'), so any OpenAI-compatible endpoint serving a DeepSeek model picks up the same content-format flattening. Fixes #3613 Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-26 13:17:34 +08:00
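The broadened matcher described above might look roughly like this. It is a sketch under assumptions: `looksLikeDeepSeek` and `flattenContent` are hypothetical names, the text-part shape is simplified, and joining parts with a newline is a guess at the separator rather than the provider's confirmed behavior.

```typescript
// Hypothetical sketch: detect DeepSeek either by the official host or by
// model name, then flatten array-form content into a plain string.
interface TextPart {
  type: 'text';
  text: string;
}

function looksLikeDeepSeek(
  baseUrl: string | undefined,
  model: string | undefined,
): boolean {
  const byUrl = (baseUrl ?? '').toLowerCase().includes('api.deepseek.com');
  // Case-insensitive substring match, as described in the commit message.
  const byModel = (model ?? '').toLowerCase().includes('deepseek');
  return byUrl || byModel;
}

function flattenContent(content: string | TextPart[]): string {
  if (typeof content === 'string') return content;
  // Assumption: parts are joined with a newline.
  return content.map((part) => part.text).join('\n');
}
```

Matching on the model name means a DeepSeek model served behind sglang or vllm on a local URL gets the same string-form content as the official endpoint.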
Shaojin Wen	569cfe10fa	fix(telemetry): use safeJsonStringify in FileExporter to avoid circular reference crash (#3630 ) When --telemetry-outfile is configured, FileSpanExporter.serialize called JSON.stringify directly on OTel ReadableSpan instances. The spans hold a back-reference to BatchSpanProcessor (._shutdownOnce -> BindOnceFuture._that -> BatchSpanProcessor), which forms a cycle and triggers "TypeError: Converting circular structure to JSON" on every export. Combined with DiagConsoleLogger, the error was repeatedly printed to stderr and polluted the Ink TUI. Switch FileExporter.serialize to the existing safeJsonStringify utility, matching the upstream gemini-cli fix so future merges stay clean. Add a focused regression test that mimics the BatchSpanProcessor cycle shape; broader cycle behavior is already covered by safeJsonStringify.test.ts. Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-26 12:55:39 +08:00
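To illustrate the failure mode and the fix described above, here is a minimal cycle-tolerant serializer. It is not the project's `safeJsonStringify`; `safeStringifySketch` is a hypothetical stand-in that uses a `WeakSet` replacer and, as a simplification, also marks repeated non-cyclic references as `[Circular]`.

```typescript
// Hypothetical stand-in for a cycle-safe stringify. A plain JSON.stringify
// call throws "Converting circular structure to JSON" on the object below.
function safeStringifySketch(value: unknown): string {
  const seen = new WeakSet<object>();
  return JSON.stringify(value, (_key: string, current: unknown) => {
    if (typeof current === 'object' && current !== null) {
      // Simplification: any object seen before is replaced, even if the
      // second reference is not strictly part of a cycle.
      if (seen.has(current)) return '[Circular]';
      seen.add(current);
    }
    return current;
  });
}

// Mimic the span -> processor -> span back-reference shape.
const processorLike: Record<string, unknown> = {};
const spanLike: Record<string, unknown> = {
  name: 'span',
  processor: processorLike,
};
processorLike['last'] = spanLike;

const serializedSpan = safeStringifySketch(spanLike);
```

Here `serializedSpan` is a valid JSON string in which the back-reference is replaced by a marker instead of triggering a `TypeError`.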
jinye	f7cfe53c6a	fix(core): preserve settings-sourced apiKey when registry model envKey is absent (#3495 ) * fix(core): preserve settings-sourced apiKey when registry model envKey is absent (#3417) On restart, `applyResolvedModelDefaults` unconditionally cleared the apiKey resolved from `settings.security.auth.apiKey` (layer 4 fallback) and only read from `process.env[model.envKey]`. When the provider-specific env var was absent (e.g. key stored only in settings), the correctly resolved key was discarded, causing a 401 error. Now capture the previously-resolved apiKey before clearing and fall back to it when `process.env[model.envKey]` is empty, but only for safe source kinds (`settings` and general `env` without `via.modelProviders`). Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): also preserve CLI-sourced apiKey during syncAfterAuthRefresh Address review feedback: keys passed via CLI flags (e.g. --openaiApiKey) were dropped on restart because source kind 'cli' was not in the fallback allowlist. Add 'cli' to the condition and a regression test. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): move apiKey preservation from applyResolvedModelDefaults to syncAfterAuthRefresh The previous fallback logic inside applyResolvedModelDefaults could leak a settings/cli-sourced apiKey to a different provider when switching models within the same authType (e.g. dashscope → openai). This is a credential safety issue because the two providers may have different baseUrls. Move the save/restore logic to syncAfterAuthRefresh Step 1, guarded by an `isUnchanged` check (same authType AND same modelId). This ensures: - Restart scenario: apiKey preserved (same model, no change) - Cross-provider switch: apiKey cleared (different modelId) Also adds two cross-provider switch tests (settings-sourced and CLI-sourced) per review feedback. 
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): replace non-null assertion with truthiness guard and add cold-start test - Replace `savedApiKeySource!` with a truthiness guard for safer source restoration - Add test for cold-start scenario (previousAuthType undefined) to verify no key preservation occurs on first syncAfterAuthRefresh - Fix stale "short-circuit" comment in programmatic key test Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): detect provider config hot-reload in isUnchanged check When a model provider config is hot-reloaded (e.g. via Coding Plan update) changing envKey or baseUrl while keeping the same model id, the save/restore logic must not preserve the old apiKey. Extend the isUnchanged guard to compare apiKeyEnvKey and baseUrl against the resolved model, but only after applyResolvedModelDefaults has run at least once (apiKeyEnvKey !== undefined). On first startup call these fields are still unset, so the check is skipped to preserve the settings/cli-sourced key correctly. Adds two hot-reload tests (envKey change and baseUrl change). Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): use baseUrl source as hasBeenApplied signal for provider change detection Replace `apiKeyEnvKey !== undefined` guard with `baseUrl source === 'modelProviders'` to reliably detect whether applyResolvedModelDefaults has been called before. This fixes two edge cases: 1. No-envKey models: hot-reload changing baseUrl was undetected because apiKeyEnvKey remained undefined. Now baseUrl source is checked. 2. Startup with envKey but omitted baseUrl: undefined !== default URL could falsely trigger isProviderChanged. Now skipped at startup since baseUrl source is not yet 'modelProviders'. Updates hot-reload test fixtures to simulate post-apply state (baseUrl source as 'modelProviders') and adds no-envKey hot-reload test. 
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): shallow-clone savedApiKeySource to avoid mutation risk Copy the ConfigSource object before applyResolvedModelDefaults runs, so a future refactor that mutates source objects in place won't break the save/restore logic. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com> Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-26 07:37:56 +08:00
易良	b127258328	fix(review): respect /language output setting for local reviews (#3611 ) The /review skill's language rule "match the language of the PR" has no applicable target during local reviews (no PR exists). When a user sets an output language via /language, local review output now honors that preference instead of defaulting to English. PR reviews remain unchanged — they continue matching the PR's language since findings may be published as inline comments visible to all collaborators. Closes #3594	2026-04-25 22:27:30 +08:00
jinye	c406c73509	feat(cli): add conversation rewind feature with double-ESC and /rewind command (#3441 ) * feat(cli): add conversation rewind feature with double-ESC and /rewind command (#3186) Add the ability to rewind conversation to a previous user turn, similar to Claude Code's message selector. Users can trigger rewind via: - Double-ESC on empty prompt while idle - /rewind (or /rollback) slash command The RewindSelector component provides a two-phase UI: a scrollable pick-list of user turns followed by a confirmation dialog. On confirm, both UI history and API history are truncated consistently, the terminal is re-rendered, and the original prompt text is pre-populated in the input for editing. Key implementation details: - historyMapping.ts correctly handles tool-call loops (functionResponse entries) and the startup context pair when mapping UI turns to API Content[] indices - useDoublePress hook provides generic double-press detection with 800ms timeout and proper cleanup on unmount - ESC handler guards against WaitingForConfirmation state to prevent accidental rewind during tool approval - Chat recording service records rewind events with tree-branching via parentUuid for session replay support Closes #3186 Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: call recordRewind() in handleRewindConfirm and simplify payload - Actually invoke chatRecordingService.recordRewind() after rewind - Remove tree-branching from recordRewind (no UI-to-recording UUID mapping exists yet) to avoid corrupting the parentUuid chain - Simplify RewindRecordPayload to just truncatedCount Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test: add tmux-based E2E script for rewind feature Automated verification of all 5 manual test items from PR description: 1. /rewind command flow (pick turn, confirm, verify truncation) 2. Double-ESC opens selector (with btw dismiss handling) 3. ESC during streaming cancels (no rewind) 4. 
/rewind with no history (guard blocks) 5. After rewind, model ignores removed turns Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(rewind): resolve resume persistence and IDE mode issues - chatRecordingService: add turnParentUuids tracking and rewindRecording() which re-roots the parentUuid chain so rewound messages land on a dead branch; reconstructHistory() then skips them automatically on resume. Add rebuildTurnBoundaries() for re-populating the index after /resume. - AppContainer: fix truncatedCount bug (was always 0 after loadHistory), wire handleRewindConfirm to rewindRecording() with correct targetTurnIndex, add config.getIdeMode() guard to openRewindSelector so rewind is disabled in IDE sessions where extra user Content entries break the API boundary mapping. - useResumeCommand: call rebuildTurnBoundaries() after startNewSession so rewind works correctly within resumed sessions. - resumeHistoryUtils: surface "Conversation rewound." info item when a rewind record is encountered during history reconstruction. - historyMapping.test.ts: add 9 unit tests for computeApiTruncationIndex covering normal flow, startup context pair, tool responses, and compression fallback. - Copyright headers: standardize new files to "Copyright 2025 Qwen Code". 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): close slash-command, compression, and IDE bypass holes Three bugs found by Codex review: 1. P1: `/rewind` slash command bypassed the IDE-mode guard because `slashCommandActions.openRewindSelector` called `setIsRewindSelectorOpen` directly. Fixed by introducing a ref bridge (`openRewindSelectorRef`) that delegates to the guarded callback. 2. P1: Slash-command invocations (`/help`, `/stats`, etc.) are stored as `type: 'user'` in UI history but never reach the API or recording service. The turn-index counter in `handleRewindConfirm` and `computeApiTruncationIndex` counted them, producing off-by-N errors. 
Added `isRealUserTurn()` helper that excludes items starting with `/` or `?`, applied in all three counting sites (AppContainer, historyMapping, RewindSelector). 3. P2: After chat compression, `computeApiTruncationIndex` returned `apiHistory.length` when the target turn was unreachable, silently keeping the full API history while the UI was truncated. Changed to return `-1`; `handleRewindConfirm` now aborts with an error message when the target turn was absorbed by compression. Tests: 14 unit tests for historyMapping (including slash-command and compression cases), full suite 616/616 passed. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --------- Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com> Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-25 22:12:29 +08:00
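The turn-counting fix above can be sketched as follows. Only the helper name `isRealUserTurn` and the rule "exclude items starting with `/` or `?`" come from the commit message; the `UiHistoryItem` shape and `countRealUserTurns` are simplified, hypothetical stand-ins.

```typescript
// Hypothetical sketch: slash-command and '?' entries are stored as
// type 'user' in UI history but never reach the API, so they must not be
// counted as conversation turns.
interface UiHistoryItem {
  type: 'user' | 'assistant' | 'info';
  text: string;
}

function isRealUserTurn(item: UiHistoryItem): boolean {
  if (item.type !== 'user') return false;
  return !(item.text.startsWith('/') || item.text.startsWith('?'));
}

function countRealUserTurns(history: UiHistoryItem[]): number {
  return history.filter(isRealUserTurn).length;
}
```

Using the same predicate at every counting site keeps the UI turn index and the API history index in step, which is what prevents the off-by-N truncation.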
ChiGao	54465b0c02	fix(cli): add TUI flicker foundation fixes (#3591 ) * fix(cli): reduce main screen flicker * fix(cli): pre-slice large tool text output * fix(cli): slice tool output by visual height * fix(core): preserve shell transcript across narrow wraps * fix(core): suppress soft-wrap-only shell rerenders * fix(core): compare default shell output by logical wraps * fix(cli): gate synchronized terminal output --------- Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>	2026-04-25 10:13:34 +08:00
MikeWang0316tw	12b26ba063	feat(cli): add Traditional Chinese (zh-TW) as a UI language option (#3569 ) * feat(cli): add Traditional Chinese (zh-TW) as a UI language option Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: use upstream unused-keys-only-in-locales.json to resolve conflict Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * revert: remove check-i18n.ts changes to avoid pre-existing zh.js issues Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * feat(cli): add Traditional Chinese (zh-TW) as a UI language option Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): add WITTY_LOADING_PHRASES to zh-TW locale Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): sync zh-TW.js with en.js keys, fix double-escape, fix check-i18n.ts Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: resolve conflict in unused-keys-only-in-locales.json Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): add missing Performance translation to zh-TW Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): add quotes to Performance key in zh-TW Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): regenerate zh-TW.js with correct multi-line value parsing Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: resolve conflict in unused-keys-only-in-locales.json Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): regenerate zh-TW.js with correct multi-line value parsing Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(cli): standardize zh-TW.js key quoting and sync zh.js keys - Convert zh-TW.js keys from double-quoted to single-quoted to match en.js style - Fix zh.js key mismatches: add missing keys (Value:, No server selected, prompts, required, Enum) and remove extra keys (The name of the extension to update, Session (temporary)) - Regenerate unused-keys-only-in-locales.json Co-authored-by: Qwen-Coder 
<qwen-coder@alibabacloud.com> * fix(cli): update loading phrases when UI language changes Add getCurrentLanguage() to useMemo deps in usePhraseCycler so that WITTY_LOADING_PHRASES re-evaluates after a /language switch instead of staying locked to the language active at mount time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(i18n): normalize locale separators and fix case-insensitive language lookup - detectSystemLanguage(): normalize POSIX locales (e.g. zh_TW.UTF-8 → zh-tw) by replacing underscores with hyphens and lowercasing before matching, so users with LANG=zh_TW.UTF-8 correctly detect zh-TW instead of falling through to zh - getLanguageNameFromLocale(): compare codes case-insensitively so that normalizeOutputLanguage('zh-TW') resolves to 'Traditional Chinese' instead of falling back to 'English' - Add test cases for zh-TW / zh-tw / ZH-TW in normalizeOutputLanguage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): update getLanguageNameFromLocale mock to include zh-TW Add 'zh-tw' entry to the mock map and normalize locale input with toLowerCase() so the mock mirrors the real case-insensitive implementation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 21:34:46 +08:00
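The locale normalization described above (for example `zh_TW.UTF-8` to `zh-tw`) can be sketched like this. The function names and the fallback from a regional code to its base language are assumptions for illustration; they are not the project's `detectSystemLanguage` implementation.

```typescript
// Hypothetical sketch of POSIX locale normalization and case-insensitive
// matching against a list of supported language codes.
function normalizeLocale(raw: string): string {
  // Drop an encoding suffix such as ".UTF-8".
  const withoutEncoding = raw.split('.')[0] ?? raw;
  // Replace underscores with hyphens and lowercase before matching.
  return withoutEncoding.replace(/_/g, '-').toLowerCase();
}

function pickLanguage(raw: string, supported: string[]): string | undefined {
  const locale = normalizeLocale(raw);
  // Prefer an exact regional match (zh-tw) over the base language (zh).
  const exact = supported.find((code) => code.toLowerCase() === locale);
  if (exact !== undefined) return exact;
  // Assumption: otherwise fall back to the base language, if supported.
  const base = locale.split('-')[0] ?? locale;
  return supported.find((code) => code.toLowerCase() === base);
}
```

Checking the exact regional code first is what keeps `LANG=zh_TW.UTF-8` from falling through to `zh`.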
Shaojin Wen	609b4324f6	perf(core): cut runtime sync I/O on tool hot path by 91% (#3581 ) * perf(core): make chat recording writes async Every recorded chat event (user message, assistant turn, tool call, tool result, slash command, etc.) was issuing 4 sync fs syscalls on the main event loop: existsSync(dir) + mkdirSync(dir) + existsSync(file) + appendFileSync(file). For a tool-heavy prompt this added ~88 sync I/O calls per session, blocking the UI render and keypress handler during each one. - chatRecordingService.appendRecord: cache ensure-flags so dir/file creation runs once per session, then enqueue the actual write on a per-instance promise chain (writeChain). lastRecordUuid is updated synchronously so chained createBaseRecord still sees the right parentUuid without waiting for the previous write. - chatRecordingService.flush: drains the chain; wired into Config.shutdown so no records are lost on exit. - jsonl-utils.writeLine: now actually async (fs.promises.mkdir + fs.promises.appendFile) with per-dir mkdir cache. The existing per-file mutex still serializes writes correctly. - Tests updated to await flush() before assertions. Trace measurement on a single tool-heavy prompt: 110 → 20 sync I/O calls (-82%), with chatRecordingService dropping from 88 to 0. * perf(core): cache repeated fs lookups on tool hot path Each tool invocation went through validatePath → isPathWithinWorkspace → fullyResolvedPath, plus its own existence/dir checks. The same paths got re-resolved across back-to-back tool calls, and ripGrep re-discovered .qwenignore on every Grep. - workspaceContext.fullyResolvedPath: bounded LRU on input path (1024, FIFO). Failed resolutions are NOT cached so retries work. - paths.validatePath: cache positive isDirectory results; ENOENT falls through every time so a freshly created file is picked up immediately. - ripGrep: module-level caches for searchPath-is-dir and per-dir .qwenignore presence (256 each, FIFO). 
- fileUtils.processSingleFileContent: drop the existsSync gate; let fs.promises.stat throw ENOENT and convert to FILE_NOT_FOUND in catch. Trace: 20 → 10 sync I/O calls. Cumulative reduction since the chat-recording change: 110 → 10, -91%. All 6057 core tests pass. * test(core): cache reset hooks + regression-guards from audit Self-review pass on the previous two perf commits surfaced a few follow-ups worth pinning down before they bite: - Module-level caches (paths.isDirectoryCache, ripGrep dirIsDir/qwenIgnore, jsonl-utils.ensuredDirs) persisted across vitest cases silently. Added underscore-prefixed `_resetForTest` exports and wired one into the validatePath describe block so future cases mutating the same absolute paths can't pass by accident. - Documented the parentUuid-chain tradeoff on chatRecordingService.appendRecord: when the async write rejects, lastRecordUuid was already set sync, so subsequent records reference an absent ancestor; readers like sessionService.reconstructHistory then silently drop those descendants. Same observable failure mode as the prior sync code's caught-and-logged throw. - Documented the dir<->file mutation and mid-session .qwenignore staleness windows for the validatePath / ripGrep caches. - Added regression tests: validatePath does NOT cache ENOENT (Edit-then-Read works); validatePath skips re-stat on cache hit (perf assertion); flush() resolves immediately on a fresh service; a rejected writeLine does not block the next record. Full core suite: 6061 pass, 2 skipped, no regressions. * fix(core): cache chatsDirEnsured only on mkdir success Pre-fix, the flag flipped to true even when mkdirSync threw, so a single transient failure (NFS EACCES, sandbox mount race, parent dir briefly missing) would short-circuit every subsequent appendRecord and silently drop the rest of the session's transcript with no error surfaced. Reported by zhangxy-zju on #3581. 
* fix(cli): destroy stdout instead of process.exit on EPIPE Routine CLI patterns like `qwen -p ... | head -1` / `| less` / `| grep -m1` close the downstream pipe and trigger EPIPE. The previous handler called process.exit(0), which bypassed the caller's runExitCleanup -> Config.shutdown -> chat-recording flush() chain and silently dropped queued JSONL writes (most recent assistant turn + tool results). Destroying stdout instead lets writes fail fast and the natural function return drive cleanup. We deliberately do not also abortController.abort() here: the abort path runs handleCancellationError which itself calls process.exit(130), re-introducing the same bypass. Reported by zhangxy-zju on #3581. * fix(cli): bound runExitCleanup with per-fn + wall-clock timeouts Pre-fix, runExitCleanup was an unbounded series of awaits. After the async-jsonl change moved chat-recording writes off the calling thread (Config.shutdown now `await flush()`s the queue), any hung syscall (slow disk, dead NFS mount, stuck MCP socket, telemetry HTTP stall) would hang process exit indefinitely: sync writes were inherently bounded by syscall return; async writes are not. Adds per-cleanup 2s + overall 5s wall-clock failsafes on the same shape as Claude Code's gracefulShutdown.ts. Also replaces dead test-isolation code (`global['cleanupFunctions']` was never on global, the array is module-private) with a `_resetCleanupFunctionsForTest` hook matching the convention from `d6485964c`. Follow-up flagged by zhangxy-zju on #3581. --------- Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>	2026-04-24 21:17:51 +08:00
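The path cache in this commit is described as bounded (1024 entries) with FIFO eviction, and failed resolutions are deliberately not cached so retries work. A minimal sketch of that pattern follows; `FifoCache` and `cachedResolve` are hypothetical names, and the real code in workspaceContext may be structured differently.

```typescript
// Hypothetical bounded FIFO cache. Map preserves insertion order, so the
// first key returned by keys() is always the oldest entry.
class FifoCache<V> {
  private readonly map = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    return this.map.get(key);
  }

  set(key: string, value: V): void {
    if (!this.map.has(key) && this.map.size >= this.capacity) {
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
    // Updating an existing key keeps its original position (FIFO, not LRU).
    this.map.set(key, value);
  }
}

function cachedResolve(
  cache: FifoCache<string>,
  path: string,
  resolve: (p: string) => string,
): string {
  const hit = cache.get(path);
  if (hit !== undefined) return hit;
  // If resolve throws, nothing is stored, so a later retry runs it again.
  const resolved = resolve(path);
  cache.set(path, resolved);
  return resolved;
}
```

Only successful results are stored, so a path that fails once (for example, before the file exists) is resolved afresh on the next call.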
易良	44b482928b	chore(release): bump version to 0.15.2 (#3596 ) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> Update version from 0.15.1 to 0.15.2 across all packages and lockfile	2026-04-24 19:55:12 +08:00
Fu Yuchen	93cbad24b1	fix(core): preserve reasoning_content during session resume and active sessions (GH#3579) (#3590 ) * fix(core): preserve reasoning_content during session resume and active sessions (GH#3579) * chore(core): remove dead thinkingThresholdMinutes config after latch removal (GH#3579)	2026-04-24 17:49:05 +08:00
harsh	5c1e636dbe	fix: Strengthen error handling in qwenOAuth2.ts to prevent unhandled 'error' event (#3481 ) * fix: strengthen error handling in launchBrowser to prevent unhandled events * fix: strengthen error handling with ChildProcess type and debugLogger * fix: use type-only import for ChildProcess	2026-04-24 14:43:13 +08:00
tanzhenxin	53293e4d85	refactor(core): make OpenAI converter stateless (follow-up to #3525 ) (#3550 ) * refactor(core): make OpenAI converter stateless to prevent shared-state races Follow-up to #3525. #3516 showed that OpenAIContentConverter's long-lived per-pipeline state raced between concurrent streams; #3525 scoped the streaming tool-call parser, this removes the remaining shared state. - OpenAIContentConverter is now a module of stand-alone functions; the exported symbol is a namespace object preserved for call-site compatibility. - New RequestContext (in types.ts, alongside PipelineConfig and ErrorHandler) carries model, modalities, startTime, and an optional per-stream toolCallParser. The pipeline builds one per request and threads it through every conversion call. - errorHandler drops duration/isStreaming; duration is recomputed from startTime at error time and troubleshooting text is uniform. - convertOpenAIChunkToGemini now throws if toolCallParser is missing so future misuse surfaces loudly instead of silently constructing a one-shot parser per chunk. * test(core): align timeout expectations	2026-04-24 12:28:03 +08:00
顾盼	aeeb2976d6	feat(web-search): remove built-in web_search tool, replace with MCP-based approach (#3502 ) * feat(web-search): add GLM (ZhipuAI) web search provider - Add GlmProvider class implementing BaseWebSearchProvider using the ZhipuAI Web Search API (https://open.bigmodel.cn/api/paas/v4/web_search) - Support multiple search engines: search_std, search_pro, search_pro_sogou, search_pro_quark - Support optional config: maxResults, searchIntent, searchRecencyFilter, contentSize, searchDomainFilter - Truncate query to 70 characters per API limit - Register 'glm' in the provider discriminated union (types.ts) and createProvider() switch (index.ts) - Add GlmProviderConfig to settingsSchema, ConfigParams, and Config class - Add --glm-api-key CLI flag and GLM_API_KEY env var support in webSearch.ts - Forward GLM_API_KEY in sandbox environment - Update provider priority list: Tavily > Google > GLM > DashScope - Add 17 unit tests for GlmProvider and 4 integration tests in index.test.ts - Update docs/developers/tools/web-search.md with GLM configuration, env vars, CLI args, pricing, and corrected DashScope billing info - Fix stale OAuth/free-tier references in web-search.md Closes #3496 * docs(web-search): fix DashScope note and add GLM server-side limitations * fix(web-search): make DashScope provider work with standard API key, remove qwen-oauth dependency - DashScopeProvider.isAvailable() now checks config.apiKey instead of authType - Remove OAuth credential file reading and resource_url requirement - Use standard DashScope endpoint: dashscope.aliyuncs.com/api/v1/indices/plugin/web_search - Read DASHSCOPE_API_KEY env var and --dashscope-api-key CLI flag - Forward DASHSCOPE_API_KEY into sandbox environment - Update integration test to detect DASHSCOPE_API_KEY - Update docs to reflect new API key based configuration * feat(web-search): remove built-in web search tool The web_search tool and all related provider implementations are removed. 
Web search functionality will be provided via MCP integrations instead, which is the direction the broader agent ecosystem is moving. Removed: - packages/core/src/tools/web-search/ (entire directory) - packages/cli/src/config/webSearch.ts - integration-tests/cli/web_search.test.ts - ToolNames.WEB_SEARCH, ToolErrorCode.WEB_SEARCH_FAILED - webSearch config in ConfigParams, Config class, settingsSchema - CLI options: --tavily-api-key, --google-api-key, --google-search-engine-id, --glm-api-key, --dashscope-api-key, --web-search-default - Sandbox env forwarding for TAVILY/GLM/DASHSCOPE/GOOGLE search keys - web_search from rule-parser, permission-manager, speculation gate, microcompact tool set, and builtin-agents tool list * fix: remove websearch reference * docs: remove websearch tool * docs: add break change guide * fix review	2026-04-24 11:29:02 +08:00
Shaojin Wen	d36f12c4c4	feat(session): auto-title sessions via fast model, add /rename --auto (#3540 ) * feat(session): auto-title sessions via fast model, add /rename --auto The /rename work in #3093 generates kebab-case titles only when the user explicitly runs `/rename` with no args; until they do, the session picker shows the first user prompt (often truncated or misleading). This change adds a sentence-case auto-title that fires once per session after the first assistant turn, using the configured fast model. New service: `packages/core/src/services/sessionTitle.ts` — `tryGenerateSessionTitle(config, signal)` returns a discriminated outcome (`{ok: true, title, modelUsed}` \| `{ok: false, reason}`) so callers can either handle failures generically or map reasons to actionable messages. Prompt shape: 3-7 words, sentence case, good/bad examples including a CJK row, JSON schema enforced via `baseLlmClient.generateJson`. `maxAttempts: 1` — titles are cosmetic metadata and shouldn't fight rate limits. Trigger point: `ChatRecordingService.maybeTriggerAutoTitle` runs after `recordAssistantTurn`. Fire-and-forget promise, guarded by: - `currentCustomTitle` — don't overwrite any existing title. - `autoTitleController` doubles as in-flight flag; a second turn while the first is still pending is a no-op. - `autoTitleAttempts` cap of 3 — the first assistant turn may be a pure tool-call with no user-visible text; retry for a handful of turns until a title lands. Cap bounds total waste. - `!config.isInteractive()` — headless CLI (`qwen -p`, CI) never auto- titles; spending fast-model tokens on a one-shot session is waste. - `autoTitleDisabledByEnv()` — `QWEN_DISABLE_AUTO_TITLE=1` opt-out. - `config.getFastModel()` falsy — skip entirely rather than falling back to the main model; auto-titling on main-model tokens is too expensive to be silent. Persistence: `CustomTitleRecordPayload` grows a `titleSource: 'auto' \| 'manual'` field. 
Absent on pre-change records (treated as `undefined` → manual, safe default so a user's pre-upgrade `/rename` is never silently reclassified). `SessionPicker` renders `titleSource === 'auto'` titles in dim (secondary) color; manual stays full contrast. On resume, the persisted source is rehydrated into `currentTitleSource` — without this, finalize's re-append would rewrite an auto title as manual on every resume cycle. Cross-process manual-rename guard: when two CLI tabs target the same JSONL, in-memory state can diverge. Before writing an auto record, the IIFE re-reads the file via `sessionService.getSessionTitleInfo`. If a `/rename` from another process landed as manual, bail and sync local state — never clobber a deliberately-chosen manual title with a model guess. Cost is one 64KB tail read per successful generation. `finalize()` aborts the in-flight controller before re-appending the title record. Session switch / shutdown doesn't have to wait on a slow fast-model call. New user-facing command: `/rename --auto` regenerates via the same generator — explicit user trigger, overwrites whatever's there (manual or auto) because the user asked. Errors route through `autoFailureMessage(reason)` so `empty_history`, `model_error`, `aborted`, etc. each get actionable guidance rather than a generic "could not generate". `/rename -- --literal-name` is the sentinel for titles that start with `--`; unknown `--flag` tokens error with a hint pointing at the sentinel. Existing `/rename <name>` and bare `/rename` (kebab-case via existing path) are unchanged, except the kebab path now prefers fast model when available and runs its output through `stripTerminalControlSequences` (same ANSI/OSC-8 hardening as the sentence-case path). New shared util: `packages/core/src/utils/terminalSafe.ts` — `stripTerminalControlSequences(s)` strips OSC (\x1b]...\x07\|\x1b\\), CSI (\x1b[...[a-zA-Z]), SS2/SS3 leaders, and C0/C1/DEL as a backstop. 
A model-returned `\x1b[2J` or OSC-8 hyperlink escape would otherwise execute on every SessionPicker render; both sentence-case and kebab paths now route titles through the helper before they reach the JSONL or the UI. Tail-read extractor: `extractLastJsonStringFields(text, primaryKey, otherKeys, lineContains)` reads multiple fields from the same matching line in a single pass. Two separate tail scans could return a mismatched pair (primary from a newer record, secondary from an older one with only the primary set); the new helper guarantees the pair is atomic. Validates a proper closing quote on the primary value so a crash-truncated trailing record can't win the latest-match race. `readLastJsonStringFieldsSync` is its file-reading wrapper — same tail-window fast path and full-file fallback as the single-field version, plus a `MAX_FULL_SCAN_BYTES = 64MB` cap so a corrupt multi-GB session file can't freeze the picker. Session reads now open with `O_NOFOLLOW` (falls back to plain RDONLY on Windows where the constant isn't exposed) — defense in depth against a symlink planted in `~/.qwen/projects/<proj>/chats/`. Character handling: `flattenToTail` on the LLM prompt drops a dangling low surrogate after `slice(-1000)` — otherwise a CJK supplementary char or emoji cut mid-pair produces invalid UTF-16 that some providers 400. `sanitizeTitle` applies the same surrogate scrub after max-length trim, and strips paired CJK brackets (`「」『』【】〈〉《》`) as whole units so a `【Draft】 Fix login` doesn't leave a dangling `】` after leading-char strip. `lineContains` in the title reader is tightened from the loose substring `'custom_title'` to `'"subtype":"custom_title"'` so user text containing the literal `custom_title` can't shadow a real record. Tests: 46 new unit tests across - `sessionTitle.test.ts` (22): success/all-failure-reasons, tool-call filter, tail-slice, surrogate scrub, ANSI/OSC-8 strip, CJK brackets. 
- `chatRecordingService.autoTitle.test.ts` (15): trigger/skip matrix, in-flight guard, abort propagation on finalize, manual/auto/legacy resume symmetry, cross-process race, env opt-out, retry-after- transient. - `sessionStorageUtils.test.ts` (13): single-pass extractor, straddle boundary, truncated trailing record, lineContains, multi-field atom. - `renameCommand.test.ts` (8): `--auto` success, all reasons, sentinel, unknown-flag hint, positional rejection, manual/SessionService fallbacks. * docs(session): design doc for auto session titles Matches the session-recap design doc shape (Overview / Triggers / Architecture / Prompt Design / History Filtering / Persistence / Concurrency / Configuration / Observability / Out of Scope) and adds a Security Hardening section unique to the title path — titles render directly in the picker and persist in user-readable JSONL, so LLM-returned control sequences are an attack surface the recap path doesn't have. Captures decisions a code-only reader has to reverse-engineer: - Why `maxAttempts: 1` (best-effort cosmetic metadata; no retry loop). - Why `autoTitleAttempts` cap is 3 (first turn can be pure tool-call). - Why the auto trigger does NOT fall back to the main model but session-recap does (auto-title fires on every turn; silently charging main-model tokens is a bill surprise). - Why `titleSource: undefined` stays unwritten on legacy records (no rewrite risks silently reclassifying user intent). - Why the cross-process re-read sits between the LLM await and the append (manual wins at both in-process and on-disk layers). - Why `finalize()`'s abort tolerates a controller swap (in-flight identity check). - Why JSON-schema function calling instead of tag extraction (avoid reasoning preamble bleed; cross-provider reliability). Placed at docs/design/session-title/ alongside session-recap, compact-mode, fork-subagent, and other per-feature design docs. No sidebar index update required — the design folder is unindexed. 
* test(rename): pin model choice in bare /rename kebab path Addresses reviewer feedback: the bare `/rename` model selection (`config.getFastModel() ?? config.getModel()`) had no test pinning it either way. Previous tests mocked `getHistory: []`, which exits the function before the model is ever chosen, so a silent regression to either direction (always-main or always-fast) would pass CI. Two explicit cases now: - fastModel set → `generateContent` called with `model: 'qwen-turbo'`. - fastModel unset → `generateContent` called with `model: 'main-model'`. The tests intentionally mock a non-empty history so the kebab path reaches the generateContent call site instead of bailing on empty input.	2026-04-23 20:37:05 +08:00
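The terminal-hardening step described in the auto-title entry above can be approximated with a few regular expressions. The sketch below is an assumption-laden illustration of the stated categories (OSC, CSI, SS2/SS3 leaders, C0/C1/DEL backstop); `stripControlSequencesSketch` is a hypothetical name and the patterns are not taken from the real `stripTerminalControlSequences` helper.

```typescript
// Illustrative patterns only; the real helper may differ in coverage and order.
const OSC_SEQUENCE = /\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g; // OSC ... terminated by BEL or ST
const CSI_SEQUENCE = /\x1b\[[0-9;?]*[a-zA-Z]/g; // e.g. \x1b[2J, \x1b[31m
const SS2_SS3_LEADER = /\x1b[NO]/g; // single-shift leaders
const CONTROL_CHARS = /[\x00-\x1f\x7f-\x9f]/g; // C0, DEL, and C1 as a backstop

function stripControlSequencesSketch(s: string): string {
  return s
    .replace(OSC_SEQUENCE, "")
    .replace(CSI_SEQUENCE, "")
    .replace(SS2_SS3_LEADER, "")
    .replace(CONTROL_CHARS, "");
}
```

Whole sequences are removed first so that their payload (for example an OSC-8 URL) does not survive as visible text; the final pass only catches stray control bytes.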
zhangxy-zju	d14ce16b95	fix(core): treat empty 'pages' parameter as unset in ReadFile (#3559 ) params.pages !== undefined let "" fall through to parsePDFPageRange(''), which returns null and surfaced "Invalid pages parameter: ''" for every read_file call from models that default optional strings to "". Switch to a truthy check so "" behaves the same as an omitted field, and add a regression test. Fixes #3558	2026-04-23 20:02:05 +08:00
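The difference between the `!== undefined` check and the truthy check in the entry above can be shown with a small sketch. Both functions here are hypothetical stand-ins: `parsePageRangeSketch` only imitates a parser that returns null on invalid input, and `resolvePages` is not the actual ReadFile code.

```typescript
// Hypothetical stand-in for the PDF page-range parser: returns null on invalid input.
function parsePageRangeSketch(pages: string): { start: number; end: number } | null {
  const m = /^(\d+)(?:-(\d+))?$/.exec(pages.trim());
  if (!m) return null;
  const start = Number(m[1]);
  const end = m[2] !== undefined ? Number(m[2]) : start;
  return start >= 1 && end >= start ? { start, end } : null;
}

function resolvePages(pages?: string): { start: number; end: number } | undefined {
  // Truthy check: "" and undefined both mean "no page range requested".
  // With `pages !== undefined`, "" would reach the parser and be rejected.
  if (!pages) return undefined;
  const range = parsePageRangeSketch(pages);
  if (range === null) throw new Error(`Invalid pages parameter: '${pages}'`);
  return range;
}
```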
顾盼	9010c09123	chore: bump version to 0.15.1 (#3541 ) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-23 11:06:07 +08:00
zhangxy-zju	d40fe7cdba	fix(core): scope StreamingToolCallParser per stream, not per Converter (#3516 ) (#3525 ) * fix(core): scope StreamingToolCallParser per stream, not per Converter Issue #3516 reports subagent failures with `Model stream ended with empty response text` whose real root cause is concurrent streams racing on a single shared tool-call parser. Architecture before this change: Config (singleton) └── contentGenerator (OpenAIContentGenerator) └── ContentGenerationPipeline └── OpenAIContentConverter └── streamingToolCallParser ← shared! Any caller of `Config.getContentGenerator()` (foreground turns, fork subagents, `run_in_background: true` subagents, ACP concurrent Agent calls (PR #3463)) ends up using the same parser instance. 
When two streams run concurrently, `processStreamWithLogging`'s stream-start `resetStreamingToolCalls()` wipes the other stream's in-flight buffers, and their chunks interleave at `index: 0`, producing corrupt JSON like `{"file_path": "/A{"file_path": "/B...` that even jsonrepair cannot salvage. The corrupted tool calls are dropped entirely and the stream surfaces upstream as `NO_RESPONSE_TEXT`. Fix: move parser state from Converter instance field into per-stream local state. - Add `ConverterStreamContext` and `createStreamContext()` factory on `OpenAIContentConverter`. Each call returns a fresh context holding its own `StreamingToolCallParser`. - `convertOpenAIChunkToGemini(chunk, ctx)` now takes the context as an explicit arg; all internal parser calls route through it. - `ContentGenerationPipeline.processStreamWithLogging` creates one context at stream entry and passes it to every chunk conversion. - Drop `OpenAIContentConverter.streamingToolCallParser` field. - Drop `resetStreamingToolCalls()` — the context has stream-local lifetime, no manual reset needed. The two call sites in the pipeline (stream entry and error path) are removed. Tests: - Replace the `resetStreamingToolCalls` suite with a `createStreamContext` suite asserting that distinct contexts are independent and writes to one never leak into the other. - Add a regression test simulating two concurrent streams with interleaved chunks through the same Converter instance; both tool calls close cleanly with correct arguments and ids. - All existing single-stream tests updated to obtain a context via `createStreamContext()` and pass it through to chunk conversion. - `pipeline.test.ts` mocks updated accordingly. packages/core test suite: 841 passed. No stale references to `resetStreamingToolCalls` or the private parser field remain. 
Refs #3516 * docs(core): clarify GC wording in per-stream context comment (copilot review) * test(core): add pipeline-level integration test for concurrent streams Complements the unit tests in converter.test.ts by driving the real ContentGenerationPipeline + real OpenAIContentConverter (no mocks on converter) through two streams that interleave on the event loop via `setImmediate`-paced async generators. Two scenarios: 1. Happy path — two concurrent executeStream invocations with their own tool-call chunks. Assert each stream emits its own function call with the correct id and args (not cross-contaminated from the sibling stream). 2. Error isolation — one stream hits `error_finish` mid-flight while a sibling stream is still accumulating tool-call chunks. Assert the sibling's function call still emits cleanly, covering the removed `resetStreamingToolCalls()` call in the error path of processStreamWithLogging. Verified as a positive control: with the per-stream context fix reverted (origin/main state), both tests fail with exactly the bug shape users reported — one stream's function call is either overwritten by the other's id/args, or is swallowed entirely when the sibling stream's error path wipes the shared parser buffer. Refs #3516	2026-04-22 20:32:30 +08:00
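The race described in the entry above, and why a per-stream context removes it, can be reproduced with a toy model. This is a heavily simplified sketch: `FragmentParser`, `StreamContextSketch`, and `createStreamContextSketch` are hypothetical stand-ins that only concatenate argument fragments, unlike the real `StreamingToolCallParser`.

```typescript
// Hypothetical, heavily simplified parser: concatenates argument fragments per tool-call index.
class FragmentParser {
  private readonly buffers = new Map<number, string>();

  addChunk(index: number, fragment: string): void {
    this.buffers.set(index, (this.buffers.get(index) ?? "") + fragment);
  }

  argumentsFor(index: number): string {
    return this.buffers.get(index) ?? "";
  }
}

interface StreamContextSketch {
  toolCallParser: FragmentParser;
}

// Each call returns a fresh context that owns its own parser.
function createStreamContextSketch(): StreamContextSketch {
  return { toolCallParser: new FragmentParser() };
}

// Feeds two streams' chunks in interleaved order, both at index 0.
function interleave(a: StreamContextSketch, b: StreamContextSketch): [string, string] {
  a.toolCallParser.addChunk(0, '{"file_path": "/A');
  b.toolCallParser.addChunk(0, '{"file_path": "/B');
  a.toolCallParser.addChunk(0, '"}');
  b.toolCallParser.addChunk(0, '"}');
  return [a.toolCallParser.argumentsFor(0), b.toolCallParser.argumentsFor(0)];
}

// Pre-fix shape: both streams write into one parser and corrupt each other.
const sharedContext = createStreamContextSketch();
const sharedResult = interleave(sharedContext, sharedContext);

// Post-fix shape: one context per stream keeps the buffers independent.
const isolatedResult = interleave(createStreamContextSketch(), createStreamContextSketch());
```

In the shared case the buffer starts with `{"file_path": "/A{"file_path": "/B`, the same corrupt shape the commit message quotes; in the isolated case each stream yields valid JSON.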
易良	f2fac208ff	chore(release): bump version to 0.15.0 (#3526 ) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> Upgrade all package versions from 0.14.5 to 0.15.0 across the monorepo, including package-lock.json and sandbox image references.	2026-04-22 19:26:13 +08:00
顾盼	2710bdec0d	feat(cli): Phase 2 — slash command multi-mode expansion, ACP fixes, and UX improvements (#3377 ) * refactor(cli): replace slash command whitelist with capability-based filtering (Phase 1) ## Summary Replace the hardcoded ALLOWED_BUILTIN_COMMANDS_NON_INTERACTIVE whitelist with a unified, capability-based command metadata model. This is Phase 1 of the slash command architecture refactor described in docs/design/slash-command/. ## Key changes ### New types (types.ts) - Add ExecutionMode ('interactive' \| 'non_interactive' \| 'acp') - Add CommandSource ('builtin-command' \| 'bundled-skill' \| 'skill-dir-command' \| 'plugin-command' \| 'mcp-prompt') - Add CommandType ('prompt' \| 'local' \| 'local-jsx') - Extend SlashCommand interface with: source, sourceLabel, commandType, supportedModes, userInvocable, modelInvocable, argumentHint, whenToUse, examples (all optional, backward-compatible) ### New module (commandUtils.ts + commandUtils.test.ts) - getEffectiveSupportedModes(): 3-priority inference (explicit supportedModes > commandType > CommandKind fallback) - filterCommandsForMode(): replaces filterCommandsForNonInteractive() - 18 unit tests ### Whitelist removal (nonInteractiveCliCommands.ts) - Remove ALLOWED_BUILTIN_COMMANDS_NON_INTERACTIVE constant - Remove filterCommandsForNonInteractive() function - Replace with CommandService.getCommandsForMode(mode) ### CommandService enhancements (CommandService.ts) - Add getCommandsForMode(mode: ExecutionMode): filters by mode, excludes hidden - Add getModelInvocableCommands(): reserved for Phase 3 model tool-call use ### Built-in command annotations (41 files) Annotate every built-in command with commandType: - commandType='local' + supportedModes all-modes: btw, bug, compress, context, init, summary (replaces the 6-command whitelist) - commandType='local' interactive-only: export, memory, plan, insight - commandType='local-jsx' interactive-only: all remaining ~31 commands ### Loader metadata injection (4 files) 
Each loader stamps source/sourceLabel/commandType/modelInvocable on every command it emits: - BuiltinCommandLoader: source='builtin-command', modelInvocable=false - BundledSkillLoader: source='bundled-skill', commandType='prompt', modelInvocable=true - command-factory (FileCommandLoader): source per extension/user origin, commandType='prompt', modelInvocable=!extensionName - McpPromptLoader: source='mcp-prompt', commandType='prompt', modelInvocable=true ### Bug fix MCP_PROMPT commands were incorrectly excluded from non-interactive/ACP modes by the old whitelist logic. commandType='prompt' now correctly allows them in all modes. ### Session.ts / nonInteractiveHelpers.ts - ACP session calls getAvailableCommands with explicit 'acp' mode - Remove allowedBuiltinCommandNames parameter from buildSystemMessage() — capability filtering is now self-contained in CommandService * fix test ci * feat(cli): Phase 2 slash command expansion + ACP fixes + UX improvements Phase 2.1 - Command mode expansion: - Extend 13 built-in commands to support non_interactive/acp modes - A class: export, plan, statusline - supportedModes only - A+ class: language, copy, restore - add non-interactive branches - A' class: model, approvalMode - handle dialog paths in non-interactive - B class: about, stats, insight, docs, clear - full non-interactive branches - context: format output as readable Markdown instead of raw JSON - export: use HTML as default format when no subcommand given Phase 2.2 - SkillTool integration: - SkillTool now consumes CommandService.getModelInvocableCommands() Phase 2.3 - Mid-input slash ghost text: - Replace mid-input dropdown completion with inline ghost text - Match Claude Code behavior: gray dimmed completion hint in input box - Tab accepts the ghost text completion - Add findMidInputSlashCommand() and getBestSlashCommandMatch() utilities ACP session bug fixes: - Fix executionMode undefined in interactive mode (slashCommandProcessor) - Fix slash command output not 
visible in Zed (use emitAgentMessage) - Fix newline rendering in Zed (Markdown hard line-break) - Fix history replay merging consecutive user messages (recordSlashCommand) - Fix /clear not clearing model context (dynamic chat reference) * feat: inline complete only for modelInvocable * fix memory command * fix: pass 'non_interactive' mode explicitly to getAvailableCommands - Fix critical bug in nonInteractiveHelpers.ts: loadSlashCommandNames was calling getAvailableCommands without specifying mode, causing it to default to 'acp' instead of 'non_interactive'. Commands with supportedModes that include 'non_interactive' but not 'acp' would be silently excluded. - Apply the same fix in systemController.ts for the same reason. - Update test mock to delegate filtering to production filterCommandsForMode() instead of duplicating the logic inline, preventing divergence. Fixes review comments by wenshao and tanzhenxin on PR #3283. * fix: resolve TypeScript type error in nonInteractiveHelpers.test.ts * fix test ci * fix mcp prompt in skill manager * revert pr#3345 * fix test ci * feat(cli): adapt /insight for non_interactive mode with message return - non_interactive: run generateStaticInsight() synchronously with no-op progress callback, return { type: 'message' } with output path - acp: keep existing stream_messages path with progress streaming - interactive: unchanged Add tests for non_interactive success and error paths. Update phase2-technical-design.md and roadmap.md to reflect the three-way mode split and clarify that MCP prompts do not need modelInvocable (they are called via native MCP tool call mechanism). * fix(cli): ghost text only shown when cursor is at end of slash token Use strict equality (!==) instead of > in findMidInputSlashCommand so that ghost text is only computed and Tab-accepted when the cursor sits exactly at the trailing edge of the partial command token. Previously, with the cursor inside an already-typed token (e.g. 
/re|view), the ghost text suffix would still be shown and pressing Tab would insert it at the cursor position, producing a duplicated tail. Using strict equality makes ghost text disappear as soon as the cursor moves inside the token. Add unit tests for findMidInputSlashCommand covering cursor-at-end, cursor-inside-token, cursor-past-token, start-of-line, and no-space-before-slash cases. * fix(cli): support /model <model-id> in non-interactive and ACP modes Previously, /model <model-id> (without --fast) fell through to the non-interactive branch that only returned the current model info and incorrectly told users to use --fast. Now: - /model <model-id> → sets the main model via settings + config.setModel() - /model → shows current model with correct usage hint - /model --fast <id> → unchanged (sets fast model) Fixes the inconsistency flagged in PR review: the help text said to use '/model <model-id>' but the command returned a dialog action which is unsupported in non-interactive mode. * fix(cli): declare supportedModes on doctorCommand to enable non-interactive and ACP The command's action already had non-interactive handling (returns a JSON message with check results), but without supportedModes declared the BUILT_IN fallback restricted it to interactive-only so it was never registered in non_interactive or acp sessions. 
* feat(skills): add SkillCommandLoader for user/project/extension skills as slash commands - New SkillCommandLoader loads user, project, and extension level SKILL.md files as slash commands (previously only bundled skills were slash-invocable) - Extension skills follow plugin-command rules: modelInvocable only when description or whenToUse is present - User/project skills are always modelInvocable (matching bundled behavior) - skill-manager now injects extensionName when loading extension-level skills - Add when_to_use and disable-model-invocation frontmatter support to SKILL.md and .md command files (SkillConfig, markdown-command-parser, command-factory, BundledSkillLoader, FileCommandLoader) - SkillTool filters out skills with disableModelInvocation and includes whenToUse in the skill description shown to the model - 16 unit tests for SkillCommandLoader covering all cases * docs: update phase2 design doc to reflect final decisions on plan/statusline/copy/restore These four commands are intentionally kept as interactive-only by design: - /plan and /statusline: tightly coupled with interactive multi-turn UI - /copy and /restore: clipboard and snapshot restore are inherently interactive Update design doc classification table, section 4.2, 4.3, 5.2, 5.3, file change summary, test requirements, behavior analysis table, and implementation batch descriptions to reflect this decision. * feat(cli): re-implement slashCommands.disabled denylist based on current refactored code Adapts the feature originally introduced in pr#3445 to the current CommandService / Phase-2 refactored code. 
Sources (merged, de-duplicated, case-insensitive): - settings key slashCommands.disabled (string[], UNION merge) - --disabled-slash-commands CLI flag (comma-separated or repeated) - QWEN_DISABLED_SLASH_COMMANDS environment variable Enforcement points: - CommandService.create() accepts optional disabledNames: ReadonlySet<string> and removes matching commands post-rename, so disabled commands never appear in autocomplete, mid-input ghost text, or model-invocable commands list. - slashCommandProcessor (interactive TUI) passes the denylist to CommandService.create so disabled commands are absent from dropdown/ghost text. - nonInteractiveCliCommands.handleSlashCommand() keeps allCommands unfiltered to distinguish disabled vs unknown; disabled commands return unsupported with a "disabled by the current configuration" reason (not no_command). - getAvailableCommands() (ACP) passes the denylist to CommandService.create. Config plumbing: - core/Config: ConfigParameters.disabledSlashCommands + getDisabledSlashCommands() - cli/config: CliArgs.disabledSlashCommands + yargs option + loadCliConfig merge - settingsSchema: slashCommands.disabled (MergeStrategy.UNION) - settings.schema.json: regenerated Tests: 28 pass (CommandService x4, nonInteractiveCliCommands x3 new cases) * feat(cli): complete slashCommands.disabled coverage from pr#3445 Fill in the three items that were missing from the initial re-implementation: - packages/cli/src/config/settings.test.ts: add UNION-merge test for slashCommands.disabled across user and workspace scopes - packages/cli/src/nonInteractiveCli.test.ts: add getDisabledSlashCommands mock to the shared mockConfig fixture - docs/users/configuration/settings.md: add slashCommands section (table + example + note) and --disabled-slash-commands row in the CLI args table * fix(cli): match disabled slash commands by alias as well as primary name The denylist previously only checked cmd.name (the primary/canonical name), so disabling a command by its alias 
(e.g. 'about' for the 'status' command) had no effect. Fix both CommandService.create() and the isDisabled() helper in nonInteractiveCliCommands.ts to also check altNames. Also improve the user-facing error message to show the token the user actually typed (e.g. /about) instead of always showing the primary name (/status).	2026-04-22 19:12:44 +08:00
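The denylist behaviour described in the entry above (sources merged case-insensitively, matching on the primary name and on aliases) might look roughly like this. It is a sketch under assumptions: `CommandLike`, `buildDisabledSet`, `isDisabledSketch`, and `filterEnabled` are hypothetical names, and trimming whitespace from tokens is an added assumption.

```typescript
interface CommandLike {
  name: string;
  altNames?: string[];
}

// Merge denylist sources (settings, CLI flag, env var) into one
// case-insensitive, de-duplicated set.
function buildDisabledSet(...sources: Array<readonly string[] | undefined>): Set<string> {
  const out = new Set<string>();
  for (const source of sources) {
    for (const raw of source ?? []) {
      const token = raw.trim().toLowerCase();
      if (token) out.add(token);
    }
  }
  return out;
}

// A command is disabled if its primary name or any alias is on the denylist.
function isDisabledSketch(cmd: CommandLike, disabled: ReadonlySet<string>): boolean {
  return [cmd.name, ...(cmd.altNames ?? [])].some((n) => disabled.has(n.toLowerCase()));
}

function filterEnabled<T extends CommandLike>(cmds: T[], disabled: ReadonlySet<string>): T[] {
  return cmds.filter((c) => !isDisabledSketch(c, disabled));
}
```

With `status` carrying the alias `about`, disabling `About` from any source removes the `status` command, which is the case the final fix in this entry addresses.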
jinye	685296e978	fix(core): reject truncated subagent write_file calls (#3505 ) * fix(core): reject truncated subagent write_file calls Propagate MAX_TOKENS truncation from subagent responses into tool requests and reject truncated edit calls before schema validation can surface misleading missing-parameter errors. * fix(core): reset per-attempt stream state on retry in agent-core When a streaming response hits MAX_TOKENS and is retried, accumulated state variables (functionCalls, wasOutputTruncated, roundText, etc.) were not cleared. This caused a successful retry to inherit the stale wasOutputTruncated=true flag from the failed attempt, incorrectly rejecting all Edit tool calls with a truncation error. Reset all per-attempt state on retry events, matching the existing behaviour in turn.ts. Add a regression test covering the truncated-then-retried-successfully scenario. 
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-22 15:01:42 +08:00
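The retry fix in the entry above comes down to discarding accumulated per-attempt state whenever a retry event arrives. The sketch below illustrates that pattern only; the event shapes (`AttemptEvent`) and the `consume` function are hypothetical, while the field names `roundText`, `functionCalls`, and `wasOutputTruncated` are borrowed from the commit message.

```typescript
// Hypothetical event shapes, for illustration only.
type AttemptEvent =
  | { type: "retry" }
  | { type: "text"; text: string }
  | { type: "function_call"; callName: string }
  | { type: "finish"; truncated: boolean };

interface AttemptState {
  roundText: string;
  functionCalls: string[];
  wasOutputTruncated: boolean;
}

function freshState(): AttemptState {
  return { roundText: "", functionCalls: [], wasOutputTruncated: false };
}

function consume(events: AttemptEvent[]): AttemptState {
  let state = freshState();
  for (const ev of events) {
    switch (ev.type) {
      case "retry":
        // Drop everything from the failed attempt, including the truncation flag.
        state = freshState();
        break;
      case "text":
        state.roundText += ev.text;
        break;
      case "function_call":
        state.functionCalls.push(ev.callName);
        break;
      case "finish":
        state.wasOutputTruncated = ev.truncated;
        break;
    }
  }
  return state;
}
```

Without the reset in the `retry` branch, a truncated first attempt would leave `wasOutputTruncated` set to true and its partial function calls in the list even after a clean second attempt.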

2217 commits