Commit graph

5636 commits

Author SHA1 Message Date
Shaojin Wen
58836f1c3d fix(skills): per-tool extraction dispatcher (LSP URI + grep glob + integration test)
Four findings from /review on the activation extractor:

C1 (Critical): LSP allowlisted but the extractor pushed `filePath`
  through unchanged. The LSP tool accepts non-file URI schemes
  (`http://`, `git://`, etc.); forwarding any of those to
  SkillActivationRegistry as a project-relative candidate let an
  LSP call against a non-file resource activate path-gated skills
  without the model touching a real project file. Fix is two-part:
  decode `file://` URIs via `fileURLToPath` (so a project file
  expressed as a URI still activates correctly) and silently drop
  any string containing `://` that's not `file://`.

S1: LSP `incomingCalls` / `outgoingCalls` operate on
  `callHierarchyItem.uri`, not the top-level `filePath`. After
  `prepareCallHierarchy` returns a file-backed item, following the
  hierarchy with that item produced no candidate, so path-gated
  skills for that file stayed dormant. Same URI-aware extraction is
  applied to the nested `uri` field.

S2: grep_search has a path-shaped `glob` field
  (`GrepToolParams.glob`) — distinct from `pattern`, which is a
  regex on contents. The extractor previously ignored `glob`, so
  `grep_search({ pattern, glob: 'src/**/*.ts' })` produced no
  activation candidate even though the call walked every file under
  `src/**/*.ts`. Same `path + glob` join treatment as GLOB.

S3: No scheduler-side integration test covered the
  extractToolFilePaths → matchAndActivateByPaths → reminder-append
  wiring, so a regression there could land while extractor and
  registry unit tests still passed. Added three integration tests
  covering: (a) reminder appended when SkillTool present,
  (b) reminder suppressed when SkillTool absent (subagent case),
  (c) hook not invoked for non-FS tools.

Restructured `extractToolFilePaths` from a generic
`file_path/filePath/path/paths` extractor into a per-tool
dispatcher (`switch` on canonical tool name). The previous generic
shape was overly permissive — every FS tool got every field name,
including ones it doesn't accept — and it was the wrong shape to
add LSP URI semantics to. Per-tool means each branch reflects the
actual `XToolParams` interface.
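
The per-tool shape described above can be sketched roughly as follows. This is an illustration, not the project's actual `extractToolFilePaths`: the names `lspCandidate` and `sketchExtractPaths` are hypothetical, the field names are the ones mentioned in this log, and the real `XToolParams` interfaces may carry more fields.

```typescript
import { fileURLToPath } from 'node:url';

// Hypothetical helper: map an LSP path-or-URI string to a filesystem
// candidate, or undefined when it must not reach the activation registry.
function lspCandidate(value: unknown): string | undefined {
  if (typeof value !== 'string' || value === '') return undefined;
  if (value.startsWith('file://')) {
    try {
      return fileURLToPath(value); // decode a file-backed URI to a path
    } catch {
      return undefined; // malformed file URI: drop it
    }
  }
  // Any other scheme (http://, git://, ...) does not name a project file.
  return value.includes('://') ? undefined : value;
}

// Hypothetical dispatcher: each branch reads only the fields that tool accepts.
function sketchExtractPaths(
  toolName: string,
  input: Record<string, unknown>,
): string[] {
  const out: string[] = [];
  const push = (v: unknown): void => {
    if (typeof v === 'string' && v !== '') out.push(v);
  };
  switch (toolName) {
    case 'read_file':
    case 'edit':
    case 'write_file':
      push(input['file_path']);
      break;
    case 'list_directory':
      push(input['path']);
      break;
    case 'grep_search': {
      const dir = input['path'];
      const glob = input['glob'];
      push(dir);
      // `glob` is path-shaped; `pattern` is a content regex and is skipped.
      if (typeof glob === 'string' && glob !== '') {
        push(typeof dir === 'string' && dir !== '' ? `${dir}/${glob}` : glob);
      }
      break;
    }
    case 'lsp': {
      push(lspCandidate(input['filePath']));
      const item = input['callHierarchyItem'];
      if (typeof item === 'object' && item !== null) {
        push(lspCandidate((item as { uri?: unknown }).uri));
      }
      break;
    }
    default:
      break; // non-FS tools contribute no candidates
  }
  return out;
}
```

A `switch` keeps each tool's accepted fields visible in one place, so a field that a tool does not accept simply has no branch that reads it.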

Test reshape:
- Removed tests asserting cross-tool field acceptance (e.g. grep
  reading `filePath` / `paths`); those documented inaccurate input.
- Added per-tool realistic tests for grep glob, lsp file:// URI,
  lsp callHierarchyItem.uri, lsp non-file scheme dropped.
- Plus the three CoreToolScheduler activation wiring tests.

639 tests pass (was 632); types and lint clean.

DEFERRED

S4: Activation driven from input selector rather than concrete
  matched files. For `glob({ pattern: '**/*.ts' })` the selector
  itself may not match a skill scoped narrower than the query.
  Real concern, but the fix needs typed result-path metadata
  feedback from each tool — a cross-cutting addition to every FS
  tool's return shape. Logged for follow-up.
2026-05-02 05:13:58 +08:00
Shaojin Wen
18e30aae39 fix(skills): activate broad globs on dotfiles + cross-ref FS allowlist
Two more findings from /review:

- S13: picomatch was compiled with `dot: false`, so a broad glob like
  `paths: ['**/*.js']` silently excluded `.eslintrc.js`, `.env`,
  `.github/foo.yml`, etc. The hidden-file exclusion is gitignore-style
  semantics — wrong for activation, where the question is "did the
  model touch a file matching this glob." Switch to `dot: true`.

- S14: `FS_PATH_TOOL_NAMES` is a manually maintained allowlist with no
  compile-time guard — adding a new FS tool without updating the set
  silently drops the tool out of the activation pipeline. Add a
  cross-ref comment at the top of `ToolNames` in `tool-names.ts`
  pointing maintainers at the allowlist site, plus a TODO noting the
  long-term fix is per-declaration `pathFields?: string[]`. The
  cross-cutting refactor is its own PR.

Adds one regression test (`activates broad globs on dotfiles too`)
that pins the dot:true semantics on `**/*.js` matching
`.eslintrc.js`. 211 skill-area tests pass.
2026-05-01 22:44:30 +08:00
Shaojin Wen
09248c0745 fix(skills): comprehensive review pass — security, correctness, robustness
Eleven findings from /qreview (claude-opus-4-7), grouped by area:

CORRECTNESS

- C1: appendAdditionalContext silently dropped reminders for any tool
  whose llmContent is a single non-array Part (read-file returning
  inlineData for images / PDFs is the canonical case). Both the
  ConditionalRulesRegistry rule reminder and the path-conditional
  skill activation reminder were lost. Wrap the single-Part case
  into an array so the addition still lands.
- S2: Legacy tool-name aliases (`replace` → `edit`,
  `search_file_content` → `grep_search`, `task` → `agent`) bypassed
  FS_PATH_TOOL_NAMES. The registry resolves the alias at execute time
  but `request.name` keeps the alias, so `replace({ file_path: ... })`
  produced empty candidates and missed activation. Canonicalize via
  `ToolNamesMigration` before the allowlist check.
- S5: `new SkillActivationRegistry(...)` ran picomatch unguarded —
  pathological patterns (oversize / broken extglob) could throw and
  abort all of `refreshCache`. Wrap each picomatch call in try/catch
  inside the constructor; drop the bad pattern, keep the rest of
  the skill, log via debugLogger.
- S7: Extension parser (skill-load.ts) silently dropped
  `disable-model-invocation` and `when_to_use`. Now that we have
  `paths:`, that meant an extension SKILL.md with both `paths:` and
  `disable-model-invocation: true` would still fire path-activation
  reminders for a skill the model can't invoke — directly
  contradicting the bug_004 fix at the project/user level.
- S8: SkillTool discarded the `addChangeListener` cleanup function
  and had no `dispose()`. Subagents share the parent's SkillManager
  via `InProcessBackend.createPerAgentConfig`, so each per-subagent
  SkillTool registered another listener; with the listener pipeline
  now async, every path activation serialized through every stale
  subagent's refresh chain. Mirror AgentTool: store the cleanup,
  expose `dispose()`.
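
As an illustration of the S2 fix, the sketch below canonicalizes a requested tool name before the allowlist check. The alias pairs and the allowlist members are the ones named in this log; `LEGACY_ALIASES`, `canonicalToolName`, and `isFsPathTool` are hypothetical names, and the real `ToolNamesMigration` may be shaped differently.

```typescript
// Legacy alias -> canonical name, as listed in the S2 finding.
const LEGACY_ALIASES = new Map<string, string>([
  ['replace', 'edit'],
  ['search_file_content', 'grep_search'],
  ['task', 'agent'],
]);

// Stand-in for the FS allowlist described elsewhere in this log.
const FS_TOOLS = new Set<string>([
  'read_file',
  'edit',
  'write_file',
  'grep_search',
  'glob',
  'list_directory',
  'lsp',
]);

// Resolve an alias to its canonical name; unknown names pass through.
function canonicalToolName(requestName: string): string {
  return LEGACY_ALIASES.get(requestName) ?? requestName;
}

// The allowlist check runs on the canonical name, never on the raw alias.
function isFsPathTool(requestName: string): boolean {
  return FS_TOOLS.has(canonicalToolName(requestName));
}
```

With this ordering a call made under the `replace` alias is treated exactly like `edit`, while `task` resolves to `agent` and stays outside the allowlist.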

SECURITY / SUPPLY-CHAIN

- S11: `validateSkillName`'s `/^[a-zA-Z0-9_:.-]+$/` rejected every
  non-ASCII name on upgrade, silently dropping CJK / Cyrillic /
  accented Latin skills. The structural-injection guard targets
  `<>"'/\n\r\t` etc; entire Unicode planes are not the threat.
  Widen to `/^[\p{L}\p{N}_:.-]+$/u`. Update docs/users/features/
  skills.md to match.
- S10: `parsePathsField` only validated shape (must-be-array). Now
  also reject leading-slash absolute patterns and `..` parent-escape
  patterns at parse time — these silently never match anything in
  the activation registry, so an author who writes `paths:
  ['/etc/passwd']` or `['../*.ts']` would otherwise see the skill in
  /skills and never understand why it never activates.
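
The two parse-time checks above might look like the following sketch. The name pattern is the one quoted in S11; `isValidSkillName` and `pathPatternProblem` are hypothetical names, and treating any `..` path segment as a parent escape is an assumption about how the real `parsePathsField` decides.

```typescript
// Built with the RegExp constructor; the pattern is the one quoted in S11.
const SKILL_NAME_PATTERN = new RegExp('^[\\p{L}\\p{N}_:.-]+$', 'u');

// Letters and digits from any script pass; tag-forming characters do not.
function isValidSkillName(name: string): boolean {
  return SKILL_NAME_PATTERN.test(name);
}

// Hypothetical parse-time check: returns a reason for a rejected `paths:`
// entry, or undefined when the entry is acceptable.
function pathPatternProblem(pattern: string): string | undefined {
  if (pattern.startsWith('/')) {
    return 'absolute pattern never matches a project-relative path';
  }
  // Assumption: any ".." segment counts as a parent escape.
  if (pattern.split('/').includes('..')) {
    return 'pattern escapes the project root via ".."';
  }
  return undefined;
}
```

Returning a reason string rather than a boolean lets the loader surface why a skill author's pattern was rejected.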

ROBUSTNESS

- S3: `coreToolScheduler` emitted "skill X is now available via the
  Skill tool" even when the calling subagent's tool registry did not
  expose SkillTool (subagent's `tools:` allowlist excluded `skill`).
  Gate the reminder on `toolRegistry.getTool(ToolNames.SKILL)`.
- S4: `extensionManager.refreshMemory` used `Promise.all` so a
  rejection from skill or subagent refresh nuked the other leg AND
  the hierarchical-memory refresh below it. Switch to
  `Promise.allSettled`, log each rejection, and `await` the
  hierarchical refresh too (the comment justified awaiting; the
  code didn't).
- S9 / S12: `docs/users/features/skills.md` claimed `paths:` only
  gates model discovery and slash invocation always works. True for
  the user-side path itself, but if the model then tries to chain
  off the user's invocation (call `Skill { skill: ... }` itself),
  validateToolParams returns "gated by path-based activation" —
  contradicting the doc. Rephrase to call out the model-side
  limitation explicitly.
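
The S4 change from `Promise.all` to `Promise.allSettled` can be illustrated with the sketch below. `Leg` and `refreshAllSettled` are hypothetical names; the real `refreshMemory` logs each rejection, whereas this version returns the failure messages so the behaviour is easy to inspect.

```typescript
// One independent refresh step (skills, subagents, ...). Hypothetical shape.
type Leg = { name: string; run: () => Promise<void> };

// Run every leg to completion; a rejection in one leg does not cancel or
// hide the others. Returns one message per failed leg.
async function refreshAllSettled(legs: Leg[]): Promise<string[]> {
  const results = await Promise.allSettled(legs.map((leg) => leg.run()));
  const failures: string[] = [];
  results.forEach((result, i) => {
    const leg = legs[i];
    if (leg !== undefined && result.status === 'rejected') {
      failures.push(`${leg.name}: ${String(result.reason)}`);
    }
  });
  return failures;
}
```

Because the returned promise never rejects, code placed after the `await` (the hierarchical refresh in the commit's case) still runs when one leg fails.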

DEFERRED

- S6: notifyChangeListeners swallows per-listener errors and the
  reminder still fires. Real concern but the fix needs an API
  shape change (listener-failure signal back to the scheduler);
  worth its own design discussion. Logged here for follow-up.

Adds 12 regression tests across the 7 affected files. 632 tests
pass; types and lint clean.
2026-05-01 17:21:37 +08:00
wenshao
2771aac489 fix(skills): silence CodeQL ReDoS flag on trailing-separator trim
CodeQL #145 flagged `pathField.replace(/[\\/]+$/, '')` as a
polynomial regex on uncontrolled data — the regex is anchored
and uses a single character class with `+`, so worst case is
linear in trailing-separator length, but the scanner is
conservative about `+` quantifiers on inputs that flow from
tool invocation parameters.

Replace the regex with an explicit `endsWith` loop. Same O(n)
behavior on the trailing run, no regex for CodeQL to chew on.
Existing trailing-slash test (forward and back) still passes.
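
A minimal sketch of the regex-free trim described above. The function name `trimTrailingSeparators` is hypothetical and the project's helper may differ in detail.

```typescript
// Drop every trailing '/' or '\' without using a regular expression.
function trimTrailingSeparators(value: string): string {
  let result = value;
  while (result.endsWith('/') || result.endsWith('\\')) {
    result = result.slice(0, -1);
  }
  return result;
}
```

Only the trailing run is touched, so separators inside the path are preserved.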
2026-04-30 23:09:45 +08:00
wenshao
c108de89a3 fix(skills): join glob.path with glob.pattern as effective selector
Two coupled fixes for the glob-pattern extraction landed in 7cb7145bb:

1. **Windows CI failure.** `path.join('src', '**/*.ts')` returns
   `'src\\**\\*.ts'` on Windows (OS-aware separator). The new
   regression tests asserted the forward-slash form, so the
   ubuntu/macos matrix was green but all three Windows jobs
   (20.x/22.x/24.x) failed. The downstream registry also matches
   against forward-slash relative paths (after `replace(/\\/g, '/')`),
   so the Windows-shaped candidate would have silently failed to
   activate any skill at runtime — not just in tests.

2. **`..` normalization.** `path.join('src', '../*.ts')` collapses to
   `'*.ts'`, losing the information that the glob actually escaped
   its `path` root. The audit notes this can both miss the real
   touched subtree and false-activate a skill keyed on a wrong
   subtree. Concat preserves the selector verbatim.

Replace `path.join(pathField, patternField)` with
`${pathField.replace(/[\\/]+$/, '')}/${patternField}` per the
audit's exact suggestion. Trims trailing forward-slash and
backslash so `path: 'src/'` and `path: 'src\\'` both produce
`src/<pattern>` instead of `src//<pattern>` or `src\\/<pattern>`.

Adds three tests covering: `..` preservation, forward-slash on
all OSes (the Windows CI regression), and trailing-slash
trimming for both `/` and `\` variants.
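
The two failure modes of `path.join` and the concat that replaces it can be reproduced on any host with Node's `path.win32` and `path.posix` variants. `joinSelector` is a hypothetical name for the expression quoted above.

```typescript
import * as path from 'node:path';

// The concat from this commit: trim trailing separators, then join with '/'.
function joinSelector(pathField: string, patternField: string): string {
  return `${pathField.replace(/[\\/]+$/, '')}/${patternField}`;
}

// Windows-flavoured join: separators come back as backslashes.
const windowsJoin = path.win32.join('src', '**/*.ts');

// Any-platform join: the `..` segment is normalised away.
const collapsed = path.posix.join('src', '../*.ts');

// Plain concat keeps the selector exactly as the caller wrote it.
const verbatim = joinSelector('src', '../*.ts');

console.log({ windowsJoin, collapsed, verbatim });
```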
2026-04-30 23:02:19 +08:00
wenshao
7cb7145bbe fix(skills): join glob.path with glob.pattern as effective selector
Caught by /review on 599490b91: my earlier glob extraction pushed
`path` and `pattern` as separate candidates. `glob({ path: 'src',
pattern: '**/*.ts' })` produced `['src', '**/*.ts']` — neither
component matches a skill keyed on `paths: ['src/**/*.ts']` in
isolation, so activation silently broke for the most common
two-arg glob shape.

The glob call actually searches `<path>/<pattern>`. Replace the
standalone pattern push with `path.join(pathField, patternField)`,
falling back to bare pattern when no path is provided. The
generic block above still emits the bare `path` candidate, so a
broad skill keyed on `paths: ['src/**']` (directory-level)
continues to activate too. Combined output for the regression
example: `['src', 'src/**/*.ts']` — covers both the directory-
level and file-level skill cases.

Adds three tests: an updated unit test pinning the joined
effective selector, an absolute-`path` variant whose joined form
gets rejected downstream by the project-root guard
(`/tmp/external/**/*.ts`), and the audit-suggested integration
regression that pipes `extractToolFilePaths` output straight into
`SkillActivationRegistry` and verifies a `paths: ['src/**/*.ts']`
skill activates from `glob({ path: 'src', pattern: '**/*.ts' })`.
2026-04-30 20:29:53 +08:00
wenshao
599490b916 fix(skills): glob pattern activation + verifiable Windows guard
Two follow-ups from the latest /review pass:

1. `extractToolFilePaths` now extracts `pattern` for `ToolNames.GLOB`
   in addition to the existing `path` field. The shape
   `glob({ pattern: 'src/**/*.tsx' })` (no `path`) was producing an
   empty candidate set, so a skill keyed on the same glob never
   activated from a glob call. Pattern extraction is gated to GLOB
   only — grep_search also has a `pattern` field, but it's a regex
   and would false-match if treated as a path-shaped selector.

2. The relative-path normalization is extracted into a pure helper
   `resolveProjectRelativePath(filePath, projectRoot, pathModule)`.
   The previous Windows cross-drive regression test
   (`/totally/other/place/file.ts` against `/project`) actually
   exercised the older `..` outside-root branch on POSIX runners,
   so the new `path.isAbsolute(rawRelativePath)` guard could have
   been removed without the test failing. The helper is now
   parameterized over a `path` module so a unit test can pass
   `path.win32` directly and pin the cross-drive case
   (`D:\\other\\file.ts` against `C:\\project`) deterministically
   on any host OS.

Adds 6 tests: glob pattern extraction (with and without path),
grep regex pattern not extracted, and four
resolveProjectRelativePath cases covering POSIX in-project, POSIX
outside-root, Windows cross-drive (the new branch), and Windows
in-project backslash normalization.
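
A sketch of a helper parameterized over the path module, in the spirit of `resolveProjectRelativePath`. The body, the `null` return for rejected paths, and the names `PathModuleSketch` and `resolveRelativeSketch` are assumptions made for illustration; only the parameter order and the cross-drive guard come from the description above.

```typescript
import * as path from 'node:path';

// Only the pieces of the path module that the helper needs.
type PathModuleSketch = Pick<
  path.PlatformPath,
  'resolve' | 'relative' | 'isAbsolute' | 'sep'
>;

// Returns a forward-slash project-relative path, or null when the file
// lies outside the project root.
function resolveRelativeSketch(
  filePath: string,
  projectRoot: string,
  pathModule: PathModuleSketch = path,
): string | null {
  const absolute = pathModule.resolve(projectRoot, filePath);
  const relative = pathModule.relative(projectRoot, absolute);
  // On Windows, a path on another drive comes back absolute.
  if (pathModule.isAbsolute(relative)) return null;
  // A leading `..` segment means the file is above the project root.
  if (relative === '..' || relative.startsWith('..' + pathModule.sep)) {
    return null;
  }
  return relative.replace(/\\/g, '/');
}
```

Passing `path.win32` or `path.posix` explicitly makes each branch testable on any host OS, which is the point of the parameterization.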
2026-04-30 17:34:01 +08:00
wenshao
aaeaa7ba18 fix(skills): security/perf/robustness pass on activation pipeline
Six findings from /review (claude-opus-4-7), all rooted in the new
path-conditional activation code:

1. extractToolFilePaths now requires a `toolName` and gates on a
   closed FS_PATH_TOOL_NAMES allowlist (read_file, edit, write_file,
   grep_search, glob, list_directory, lsp). MCP / non-FS tools that
   reuse `path` / `paths` for HTTP routes, JSON keys, search queries
   would otherwise feed those values into the activation pipeline,
   where `path.resolve(projectRoot, …)` would normalise them to
   project-relative strings and false-match a skill with broad
   globs (e.g. `paths: ['**']`). Concrete attack noted by /review:
   `{ path: 'https://api.example.com/users/123' }` → activates a
   skill on every MCP call.

2. Skill `name` validated at parse time against
   `/^[a-zA-Z0-9_:.-]+$/`. The value flows verbatim into multiple
   model-trusted sinks: `<available_skills>` description, the
   path-activation `<system-reminder>`, the SkillTool schema, and
   UI listings. Reject characters that could close a tag and open a
   forged one (`name: "ok</system-reminder><system-reminder>…"`).

3. SkillManager.matchAndActivateByPaths(filePaths) added. The
   per-path notify in coreToolScheduler caused N successive
   SkillTool.refreshSkills() / geminiClient.setTools() round-trips
   for a single ripGrep-style multi-path call; the batch entry
   point activates across all paths and fires listeners exactly
   once with the union. matchAndActivateByPath delegates to it for
   call-site compatibility.

4. SkillManager.refreshCache uses Promise.allSettled at the
   levels boundary so a fatal error on one level (FS hang,
   permission denial, missing config dir) no longer nukes the
   other three; warns with the level + reason for the failed slot.

5. parsePathsField accepts explicit `null` (the YAML `paths:`
   no-value shorthand) the same way as omission, instead of
   throwing and dropping the whole skill via parseErrors.
   Matches the leniency of `argumentHint` and `whenToUse`.

6. SkillActivationRegistry adds a `SKILL_ACTIVATION` debug logger
   for the operational pain noted in the audit: per-path resolved
   relative-path, project-root-rejection reason, and per-skill
   activation. Also gives oncall a grep target for "why did/didn't
   skill X activate?" without source-reading.
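
The batch entry point from item 3 can be sketched as below. `BatchActivationSketch` and `Matcher` are hypothetical; the real registry compiles glob patterns, whereas plain predicate functions stand in for them here, and a counter stands in for the listener chain.

```typescript
// Stand-in for a compiled glob: true when the path matches the skill.
type Matcher = (relativePath: string) => boolean;

class BatchActivationSketch {
  private readonly rules: Map<string, Matcher>;
  private readonly active = new Set<string>();
  // Counts how often listeners would have been notified.
  notifications = 0;

  constructor(rules: Map<string, Matcher>) {
    this.rules = rules;
  }

  // Activate across all paths, then notify once with the union.
  matchAndActivateByPaths(paths: string[]): string[] {
    const newlyActive: string[] = [];
    for (const [skill, matches] of this.rules) {
      if (!this.active.has(skill) && paths.some((p) => matches(p))) {
        this.active.add(skill);
        newlyActive.push(skill);
      }
    }
    if (newlyActive.length > 0) this.notifications += 1;
    return newlyActive;
  }

  // Single-path form delegates to the batch form for compatibility.
  matchAndActivateByPath(filePath: string): string[] {
    return this.matchAndActivateByPaths([filePath]);
  }
}
```

A multi-path call therefore costs one notification regardless of how many paths or skills matched, and a call that activates nothing costs none.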

Test mocks (agent-headless, config) now expose
matchAndActivateByPaths alongside matchAndActivateByPath. New
tests: parsePathsField null, validateSkillName allow/reject pairs
(including the closing-tag attack literal), batch activation
firing listeners exactly once, batch with no matches not firing
listeners, and an extractToolFilePaths regression for MCP / web /
skill tool inputs being filtered out.
2026-04-30 16:54:39 +08:00
wenshao
d4b1b3491b fix(extension): await skill + subagent cache refresh in refreshMemory
Caught by /review on the previous async-listener change: this PR
made `SkillManager.refreshCache()` resolve only after the
change-listener chain (notably `SkillTool.refreshSkills` and
`geminiClient.setTools()`) settles. `ExtensionManager.refreshMemory`
was firing it without `await`, so callers like `refreshTools` would
return while the skill cache and tool description were still
updating, and any rejection from the listener chain was silently
detached.

Wrap skill + subagent refreshes in a single `Promise.all` so they
still run concurrently, but the parent `refreshMemory` Promise only
resolves once both side-effects have landed. Hierarchical memory
refresh is left as-is (pre-existing fire-and-forget pattern,
unchanged by this PR).
2026-04-30 14:08:50 +08:00
wenshao
b0c2fb13ea fix(skills): await listener refresh during path activation
Race surfaced by /review: matchAndActivateByPath synchronously
notified change listeners, but the SkillTool listener was a
fire-and-forget `void this.refreshSkills()`. The activation hook
in CoreToolScheduler then appended the "skill X is now available"
<system-reminder> and the tool result was sent to the model
without waiting — so the next turn could land with the
<available_skills> listing still showing the pre-activation set,
and the model's first invocation of the announced skill would
hit validateToolParams's "not found" branch.

Make the listener pipeline awaitable end-to-end:

- addChangeListener now accepts `() => void | Promise<void>`.
- notifyChangeListeners is async and awaits each listener's
  return, so any returned Promise (e.g. SkillTool.refreshSkills)
  is held before the call resolves.
- refreshCache awaits the notification it was already firing.
- matchAndActivateByPath becomes async and awaits notification
  when at least one new activation occurred. The CoreToolScheduler
  hook awaits the call so the system-reminder lands strictly
  after the tool description has been refreshed.
- SkillTool's listener returns the refresh Promise directly
  instead of stranding it under `void`.
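
A minimal sketch of an awaitable listener pipeline along the lines listed above. `ListenerHubSketch` is a hypothetical stand-in for the manager; per-listener error handling is omitted, and the returned cleanup function is included because a later commit in this log relies on it.

```typescript
type ChangeListener = () => void | Promise<void>;

class ListenerHubSketch {
  private readonly listeners = new Set<ChangeListener>();

  // Register a listener; the returned function unregisters it.
  addChangeListener(listener: ChangeListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Await each listener in turn, so the caller resumes only after every
  // returned Promise has settled.
  async notifyChangeListeners(): Promise<void> {
    for (const listener of [...this.listeners]) {
      await listener();
    }
  }
}
```

A caller that awaits `notifyChangeListeners()` before appending its reminder gets the ordering this commit is after: refresh first, announcement second.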

Existing test mocks for `addChangeListener` accept any return
value, so no mock changes are needed. The four
matchAndActivateByPath direct-call tests in skill-manager.test
are updated to `await` the new Promise return.
2026-04-30 11:35:19 +08:00
wenshao
fcf5b0eb08 test: stub matchAndActivateByPath in SkillManager test mocks
The path-conditional skill activation hook in
CoreToolScheduler.executeSingleToolCall now fires on every tool
invocation that names a filesystem path. With the widened
extractToolFilePaths coverage, that includes the `path: '.'`
input shape used by the AgentHeadless tool-execution tests.

Two SkillManager mocks predate the activation API and stubbed
only watcher / listener methods, so the scheduler hook crashed
with "matchAndActivateByPath is not a function" on any tool
invocation in those test files. Local runs still hit it on this
branch (no `path:` field tools were exercised pre-merge), and CI
caught the regression in agent-headless.test.ts across all 9
matrix combos.

Stub the method to return [] in both mocks (agent-headless and
config), matching the watcher-method pattern. Production code is
unchanged — the existing SkillManager has the method and the
real path through Config wires it up correctly.
2026-04-30 09:37:00 +08:00
wenshao
c632f0a046 fix(skills): widen activation coverage and tighten dedup edges
Three fixes from the latest /review pass on the activation
pipeline, all touching the same hook surface:

1. Activation only fired on `file_path` — read-file / edit /
   write-file. Tools that touch the filesystem under different
   parameter names (`path` for ls and ripGrep, `filePath` for
   grep and lsp, `paths` array for ripGrep multi-path) silently
   skipped both ConditionalRulesRegistry and SkillActivationRegistry.
   Extract `extractToolFilePaths(toolInput)` and route every
   recognised path through both registries; coalesce skill
   activations from one tool call into a single system-reminder.

2. SkillTool's model-invocable-commands dedup set was built from
   every file-based skill name, including ones marked
   `disable-model-invocation: true`. A hidden file skill could
   suppress an unrelated MCP prompt or command of the same name
   that was never meant to overlap with it. Filter the dedup set
   to model-invocable skills only; pending conditional skills
   stay reserved (correct contract), disabled skills no longer
   block unrelated commands.

3. SkillActivationRegistry's project-root guard rejected `..` /
   `../` prefixes but accepted absolute results. On Windows,
   `path.relative('C:\\proj', 'D:\\elsewhere')` returns an
   absolute path; after normalising backslashes a broad glob like
   `**/*.ts` would activate a project-scoped skill for an
   off-project file. Reject absolute relative results before
   normalising slashes.

Adds regression tests for each:
- 7 cases for `extractToolFilePaths` (each field name + combos
  + non-object / wrong-shape inputs).
- 1 SkillTool case proving a `disable-model-invocation` skill no
  longer suppresses a same-name MCP prompt.
- 1 SkillActivationRegistry case for the absolute-relative-path
  guard. (220 skill-area tests pass total.)
2026-04-29 22:23:32 +08:00
wenshao
f6f3f3d5dd Merge origin/main into feat/skills-parallel-load-and-path-activation
Brings the branch up to date (36 commits behind). Two source-level
conflicts in the skills area, both resolved by keeping additions
from both sides — the new `argument-hint:` parsing from #3593 and
this branch's `paths:` / `parsePathsField` work are independent
features that touch the same parser entry points but do not
interact:

- packages/core/src/skills/skill-manager.test.ts: combined the two
  new branches in `mockParseYaml.mockImplementation` (one for
  `argument-hint:`, one for `paths:`) so both feature paths get
  exercised, and combined the two new `parseSkillContent` test
  blocks (argument-hint frontmatter + paths frontmatter variants).

Other touched files (skill-manager.ts, skill-load.ts,
skill-load.test.ts, types.ts) auto-merged cleanly: argumentHint
extraction lives next to model/whenToUse, and the parsePathsField
helper call is the last optional-field extraction before the
SkillConfig is constructed.

Validation after merge:
- 210 skill-area tests pass (skill-manager + skill-activation +
  skill-load + tools/skill + coreToolScheduler).
- typecheck clean.
- lint clean.
2026-04-28 14:00:28 +08:00
Shaojin Wen
aac2e96ec3 feat(core): managed background shell pool with /tasks command (#3642)
* feat(core): managed background shell pool with /bashes command

Replace shell.ts's `&` fork-and-detach background path with a managed
process registry. Background shells now have observable lifecycle, captured
output, and explicit cancellation — matching the pattern used by background
subagents (#3076).

Phase B from #3634 (background task management roadmap).

What changes
- New `BackgroundShellRegistry` (services/backgroundShellRegistry.ts):
  per-process entry with status (running / completed / failed / cancelled),
  AbortController, output file path. State transitions are one-shot
  (terminal status sticks; late callbacks no-op). Mirrors the lifecycle
  shape of #3471's BackgroundTaskRegistry so the two can be unified later.
- `shell.ts` is_background path rewritten as `executeBackground`:
  - Spawns the unwrapped command (no '&', no pgrep envelope)
  - Streams stdout to `<projectDir>/tasks/<sessionId>/shell-<id>.output`
    (path layout aligns with the direction sketched in #3471 review)
  - Bridges the external abort signal into the entry's AbortController so
    a single source of truth governs cancellation
  - Returns immediately with id + output path; agent's turn isn't blocked
  - Settles the registry entry asynchronously when ShellExecutionService
    resolves: complete (clean exit) / fail (error) / cancel (aborted)
- Removes ~120 lines of dead bg-specific code from shell.ts:
  pgrep wrapping, '&' appending, Windows ampersand cleanup, Windows
  early-return path, bg PID parsing, tempFile cleanup
- New `/bashes` slash command: lists registered shells with id, status,
  runtime, command, output path. Empty state prints a friendly message.
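
The one-shot lifecycle described for the registry can be sketched as follows. `ShellRegistrySketch` and its entry shape are hypothetical; the real `BackgroundShellRegistry` also tracks output paths and timing, which are left out here.

```typescript
type ShellStatus = 'running' | 'completed' | 'failed' | 'cancelled';

interface ShellEntrySketch {
  id: string;
  command: string;
  status: ShellStatus;
  reason?: string;
  abortController: AbortController;
}

class ShellRegistrySketch {
  private readonly entries = new Map<string, ShellEntrySketch>();

  register(id: string, command: string): ShellEntrySketch {
    const entry: ShellEntrySketch = {
      id,
      command,
      status: 'running',
      abortController: new AbortController(),
    };
    this.entries.set(id, entry);
    return entry;
  }

  get(id: string): ShellEntrySketch | undefined {
    return this.entries.get(id);
  }

  // Only a running entry can transition; late callbacks are no-ops.
  private settle(id: string, status: ShellStatus, reason?: string): boolean {
    const entry = this.entries.get(id);
    if (entry === undefined || entry.status !== 'running') return false;
    entry.status = status;
    if (reason !== undefined) entry.reason = reason;
    return true;
  }

  complete(id: string): boolean {
    return this.settle(id, 'completed');
  }

  fail(id: string, reason: string): boolean {
    return this.settle(id, 'failed', reason);
  }

  // Cancelling also aborts the entry's own controller.
  cancel(id: string): boolean {
    const settled = this.settle(id, 'cancelled');
    if (settled) this.entries.get(id)?.abortController.abort();
    return settled;
  }
}
```

Routing every transition through one guarded method is what makes the terminal status stick, whichever callback arrives first.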

What this PR doesn't do
- Footer pill / dialog integration — gated on #3488 landing
- task_stop / send_message integration — gated on #3471 landing
- Auto-backgrounding heuristics for long foreground bash — Phase D

Test plan
- 11 registry unit tests (state machine + idempotent terminal transitions)
- 4 background-path tests in shell.test.ts (spawn no-wrap + complete /
  fail / cancel settle paths)
- 2 /bashes command tests (empty + populated)
- Full core suite: 247 files / 6075 passed (existing tests unaffected)

* fix(core): address PR #3642 review feedback

Three [Critical] from the auto review + naming alignment with Claude Code:

- shell.ts settle: non-zero exit code or termination signal now bucket into
  `failed` instead of `completed`. The previous `if (result.error) fail else
  complete()` would misreport `false` / failed `npm test` as success because
  ShellExecutionService surfaces ordinary command failures as a non-zero
  exitCode with `error: null`. Failure reason carries the exit code or signal
  so `/tasks` shows the real cause.

- ShellExecutionService.childProcessFallback: add `streamStdout` mode that
  emits each decoded chunk through the existing onOutputEvent path. The
  default (foreground) path continues to buffer + emit the cleaned final
  blob, so existing in-line shell calls are unaffected. executeBackground
  opts in via `{ streamStdout: true }`, which is what makes the captured
  output file actually useful for long-running processes (dev servers,
  watchers) — without it the file stayed empty until the process exited.

- shell.ts test fixture: cancel-settle test was using `signal: 'SIGTERM'`
  but `ShellExecutionResult.signal` is `number | null`. TS2322 broke the
  build; switched to `signal: null`. Added a test that explicitly covers
  the new "non-zero exit → failed" path so the bucketing change has
  regression coverage.

- shell.ts comment: explicitly document why background shells force
  `shouldUseNodePty=false` (no terminal, no human; node-pty would be dead
  weight for fire-and-forget commands).

- /bashes → /tasks (alias bashes), description "List and manage background
  tasks" — matches Claude Code's command name. Currently lists shells only;
  will surface other task kinds (subagents, monitor) as those registries
  land via #3471 / #3488.
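
The settle bucketing from the first bullet might be expressed as the sketch below. `ShellResultSketch`, `Outcome`, and `classifyShellResult` are hypothetical; the `signal: number | null` and `error: null` details come from the text above, while the `aborted` flag and the reason wording are assumptions.

```typescript
// Hypothetical subset of a shell execution result.
interface ShellResultSketch {
  error: Error | null;
  exitCode: number | null;
  signal: number | null;
  aborted: boolean;
}

type Outcome =
  | { status: 'completed' }
  | { status: 'failed'; reason: string }
  | { status: 'cancelled' };

// Map a finished process to a terminal registry status.
function classifyShellResult(result: ShellResultSketch): Outcome {
  if (result.aborted) return { status: 'cancelled' };
  if (result.error !== null) {
    return { status: 'failed', reason: result.error.message };
  }
  if (result.signal !== null) {
    return { status: 'failed', reason: `terminated by signal ${result.signal}` };
  }
  // Ordinary command failures arrive as a non-zero exit code, not an error.
  if (result.exitCode !== null && result.exitCode !== 0) {
    return { status: 'failed', reason: `exited with code ${result.exitCode}` };
  }
  return { status: 'completed' };
}
```

Checking the exit code and signal in addition to `error` is what keeps a failing `npm test` from being reported as a success.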

* fix(core): address PR #3642 second-round review feedback

- shellExecutionService streaming: drop stdout/stderr buffer + outputChunks
  accumulation in streaming mode. Each decoded chunk goes straight to
  onOutputEvent and is GC-eligible immediately. Long-running background
  commands (dev servers, watchers) no longer accumulate unbounded memory
  proportional to total output. Buffered (foreground) mode is unchanged.

- shell.ts executeBackground: stripAnsi each chunk before writing to the
  output file. Dev servers / build tools spam color codes and cursor-move
  sequences that would render as garbage in the file the agent reads.

- bashesCommand: command description "List and manage" → "List background
  tasks" — current implementation only supports listing, cancellation
  follows when the unified task_stop tool from #3471 is wired in. Replace
  the hand-rolled formatRuntime helper with the shared formatDuration
  utility (uses hideTrailingZeros for parity with the previous output).

- backgroundShellRegistry: add a comment documenting the lack of an
  eviction policy as a known limitation. LRU / age-based / capped-size
  eviction (and on-disk output rotation) is left as a follow-up alongside
  the broader output-file lifecycle story.

* fix(core): address PR #3642 third-round review feedback

- shell.ts executeBackground: add 'error' listener on the output write
  stream. fs.createWriteStream surfaces write failures (disk full,
  permission, fs going away) as 'error' events; without a listener Node
  treats it as an uncaught exception and kills the entire CLI session.
  Log + drop is the sane default — the registry still settles via
  resultPromise so /tasks shows the right terminal status.

- shell.ts executeBackground: store the abort handler reference and
  removeEventListener in the settle callback. Background shells outlive
  the turn signal; the dangling listener was keeping `entryAc` (and
  transitively `outputStream`) reachable until the turn signal itself was
  GC'd, which for long sessions would never happen.

- shell.test.ts: extend the createWriteStream mock with an `on` stub so
  the new error-listener wiring doesn't crash the test suite.

* refactor(cli): drop /bashes alias and rename file to tasksCommand

Per follow-up review: the slash command should be exclusively /tasks.
Removes the `bashes` altName, renames `bashesCommand{,.test}.ts` →
`tasksCommand{,.test}.ts`, renames the exported binding `bashesCommand`
→ `tasksCommand`, and cleans up the remaining `/bashes` references in
backgroundShellRegistry.ts comments. No behavior change beyond the
alias removal.

* refactor(cli): finish tasksCommand rename — apply content changes

The previous commit (03c8503c8) only captured the file rename via
`git mv`; the export name change (`bashesCommand` → `tasksCommand`),
the removal of `altNames: ['bashes']`, the import update in
BuiltinCommandLoader, and the `/bashes` → `/tasks` comments in
backgroundShellRegistry.ts were unstaged when that commit landed.
Squash candidate before merge.

* fix(core): address PR #3642 fourth-round review feedback

Four reviewer concerns from @wenshao + @doudouOUC:

- [Critical] Config.shutdown() now also calls
  `backgroundShellRegistry.abortAll()`. Previously only the subagent
  registry was aborted, so a managed background shell could outlive the
  CLI process and orphan its child. Symmetric with how
  `BackgroundTaskRegistry.abortAll()` is wired in.

- [P1] shell.ts executeBackground strips a trailing `&` from the command
  before spawn. The managed path is itself the backgrounding mechanism;
  forwarding `node server.js &` verbatim made bash exit immediately while
  the real child outlived the wrapper, causing the registry to settle as
  `completed` while the shell was still running and chunked output to
  land on a closed stream. Strip + warn.

- [P2] Output file moves under `storage.getProjectTempDir()` (specifically
  `<projectTempDir>/background-shells/<sessionId>/shell-<id>.output`).
  `ReadFileTool` already auto-allows the project temp dir, so the LLM
  can `Read` the captured output without bouncing off a permission
  prompt — important because background-agent contexts can't surface
  interactive prompts.

- [P2] Background shells are no longer killed when the current turn's
  AbortSignal fires. Forwarding the turn signal into the entry's
  AbortController meant a Ctrl+C on the turn would also terminate
  intentionally backgrounded dev servers / watchers, contradicting the
  independent-lifecycle promise. Cancellation now flows only through
  `entryAc` (driven by future `task_stop` integration via #3471).

Tests:
- New `abortAll` registry tests cover running / mixed / empty cases.
- `runs background commands as managed pool entries` test stops asserting
  the wrapper-vs-entry signal identity since they're now structurally
  separate (no turn-to-entry forwarding).
- New `does not forward the turn signal into the background shell` test
  pins the new behavior.
- New `strips trailing & from the spawned command` test pins the strip.
- Removed the cancel-via-outer-signal settle test — that path no longer
  exists; cancellation is exercised end-to-end via the registry's own
  `cancel` and `abortAll` tests in `backgroundShellRegistry.test.ts`.

* fix(core): tighten trailing & strip — narrow regex + ReDoS-safe

Two reviewer concerns on the same line of #3642 round 4:

- [Critical CodeQL] `\s*&+\s*$` is a polynomial-time regex on
  uncontrolled input (long all-`&` strings backtrack quadratically).
- [P2 doudouOUC] `&+` is too greedy: it also rewrites `npm run dev &&`
  into `npm run dev` (breaks logical AND syntax) and `echo foo \&` into
  `echo foo \` (eats the escaped literal). Only the bare bash background
  operator should be stripped.

Replace the regex with a small linear-time helper
`stripTrailingBackgroundAmp` that explicitly checks for the three
"don't touch" cases (`&&`, `\&`, no trailing `&`). Plain `endsWith` /
`slice` — no regex backtracking, and the intent reads off the page.

Tests:
- Existing strip-trailing-`&` test still passes.
- New `does not strip a trailing &&` test pins the logical-AND case.
- New `does not strip an escaped trailing \\&` test pins the escape case.
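
The three "don't touch" cases above can be sketched roughly as follows. This is a minimal illustration, not the code that landed in shell.ts: the helper name comes from the commit message, but the body, the use of `trimEnd`, and the decision to trim whitespace left before the stripped `&` are assumptions.

```typescript
// Hypothetical sketch of stripTrailingBackgroundAmp; the real helper may differ.
// Only endsWith / slice / trimEnd are used, so there is no regex backtracking.
function stripTrailingBackgroundAmp(command: string): string {
  const trimmed = command.trimEnd();
  if (!trimmed.endsWith('&')) return command; // no trailing `&`: nothing to strip
  if (trimmed.endsWith('&&')) return command; // logical AND, leave untouched
  if (trimmed.endsWith('\\&')) return command; // escaped literal `\&`, leave untouched
  // Bare background operator: drop it and any whitespace that preceded it.
  return trimmed.slice(0, -1).trimEnd();
}
```

Under these assumptions `node server.js &` becomes `node server.js`, while `npm run dev &&` and `echo foo \&` are returned unchanged.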

* fix(core): keep binary-detection sniff in streaming mode

@doudouOUC noted that `streamStdout` shortcut returned before the
binary-sniff path, so a background command emitting binary bytes
(`cat /bin/ls`, image dump, etc.) would be text-decoded and appended
to the task output file unbounded.

Restructure handleOutput so the sniff-and-cutover logic runs in both
modes:

- Both modes accumulate up to MAX_SNIFF_SIZE for the binary check.
  The accumulator is bounded; once the threshold is reached, it stops
  growing in streaming mode (dropped on binary detection / left
  inert on text confirmation) and continues to accumulate in buffered
  mode (existing foreground behavior).
- Streaming mode emits 'binary_detected' as soon as `isBinary` trips
  so the consumer can stop writing the output file. Up to ~4KB of
  bytes may have been emitted as text chunks before detection — this
  is bounded and acceptable; the unbounded write is the pathology
  reviewers flagged.
- Streaming text mode still emits each decoded chunk immediately and
  does not accumulate stdout/stderr strings, so long-running text
  streams remain GC-friendly.
- Buffered (foreground) behavior is unchanged — the sniff accumulator
  is the same path the existing tests cover.

Tests: 50 shellExecutionService + 11 backgroundShellRegistry + 57
shell.test.ts all pass; no regressions.

* fix(core): tighten streaming sniff bound + Windows rmSync flake

Two unrelated reds on the latest CI run:

1. [P1 doudouOUC] Streaming sniff buffer leaks on small chunks.
   The previous fix recomputed `sniffedBytes` from
   `Buffer.concat(outputChunks.slice(0, 20)).length` on every chunk —
   pinned to the first 20 chunks. If those total under MAX_SNIFF_SIZE
   (line-sized stdout, e.g. dev-server logs) the byte count never grew,
   the sniff branch stayed open forever, and `outputChunks` accumulated
   every later chunk — exactly the leak `streamStdout` was meant to
   prevent.

   Track sniffed bytes by running sum (`sniffedBytes += data.length`)
   so the bound is genuine. When sniff confirms text in streaming mode,
   drop the accumulator immediately so subsequent chunks fall through
   the streaming emit path without ever touching it.
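
   A rough sketch of the running-sum bound, for illustration only. The class name, the `push` return value, and the 4096-byte value of `MAX_SNIFF_SIZE` are assumptions (the earlier commit only says "~4KB"), and the binary check itself is left out.

   ```typescript
   // Assumed value; the commit log only mentions a bound of roughly 4KB.
   const MAX_SNIFF_SIZE = 4096;

   // Hypothetical accumulator showing only the bounding logic.
   class StreamingSniffer {
     private sniffChunks: Uint8Array[] | null = [];
     private sniffedBytes = 0;

     /** Returns true if the chunk was retained for sniffing, false once sniffing is over. */
     push(data: Uint8Array): boolean {
       if (this.sniffChunks === null) return false; // accumulator already dropped
       this.sniffChunks.push(data);
       this.sniffedBytes += data.length; // running sum, independent of chunk count
       if (this.sniffedBytes >= MAX_SNIFF_SIZE) {
         this.sniffChunks = null; // bound reached: drop the accumulator
       }
       return true;
     }

     get retainedChunks(): number {
       return this.sniffChunks?.length ?? 0;
     }
   }
   ```

   With line-sized 64-byte chunks, the accumulator in this sketch is dropped after 64 chunks instead of growing for the lifetime of the process.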

2. file-exporters.test.ts afterEach `fs.rmSync` flaked on Windows
   (ENOTEMPTY: directory not empty). The exporter's underlying write
   stream hasn't always released its handle by the time `rmSync` runs.
   Pass `maxRetries: 5, retryDelay: 50` so the cleanup retries through
   the brief Windows handle-release window instead of failing the test
   on a CI quirk.

---------

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
2026-04-28 11:06:50 +08:00
tanzhenxin
03c88b7308
feat(cli): background-agent UI — pill, combined dialog, detail view (#3488)
* feat(cli): background-task UI — pill, combined dialog, detail view

Adds the user-facing surface for background tasks on top of the
model-facing agent control primitives merged in #3471. A dedicated
pill in the footer summarises running tasks, ↓ focuses it, and Enter
opens a combined dialog listing every task with a detail view that
shows the original prompt, live stats, and a rolling progress feed
of recent tool invocations.

Also renames BackgroundAgent* to BackgroundTask* for consistency with
the user-facing terminology and the task_* tool family.

* chore: trigger CI
2026-04-28 10:57:59 +08:00
Fu Yuchen
d09c19c0c5
fix(core,cli): stop stripping reasoning on switch and resume paths (#3682)
2026-04-28 09:22:17 +08:00
jinye
4ac9ec07c3
fix(cli): recognize OpenAI-compatible providers in qwen auth status (#3623)
* fix(cli): recognize OpenAI-compatible providers in `qwen auth status`

Previously `qwen auth status` treated all `selectedType=openai` setups
as Coding Plan, checking only `BAILIAN_CODING_PLAN_API_KEY`. Users who
configured generic OpenAI-compatible providers (e.g. Xunfei, DeepSeek,
Ollama) via `modelProviders.openai` saw a misleading "Alibaba Cloud
Coding Plan (Incomplete)" even though their provider worked correctly.

Split the USE_OPENAI branch into two paths:
- Coding Plan: detected by `codingPlan.region` or `CODING_PLAN_ENV_KEY`
- Generic OpenAI-compatible: checks API key from modelProviders envKey,
  OPENAI_API_KEY, or settings.security.auth.apiKey; displays provider
  info including model name and base URL.

Closes #3612

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): improve Coding Plan detection and align API key check semantics

Address review feedback:

1. Detect Coding Plan via isCodingPlanConfig(baseUrl, envKey) on the
   active modelProviders entry instead of checking BAILIAN_CODING_PLAN_API_KEY
   env var presence. A stale env key from a previous setup no longer
   misclassifies a generic OpenAI-compatible provider as Coding Plan.

2. When modelProviders entry has an explicit envKey, only check that key
   without falling back to OPENAI_API_KEY or settings.security.auth.apiKey.
   This mirrors hasApiKeyForAuth() semantics in auth.ts, preventing
   status from reporting "configured" when the actual provider key is
   missing.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): fix TS2367 type error in displayRegion comparison

`detectedCodingPlanRegion` is `CodingPlanRegion | false`, so
`!== true` comparison is invalid. Simplify to truthiness check.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): refine model fallback and Coding Plan key detection

- Only fall back to models[0] when model.name is unset; when set but
  not matching any provider entry, treat as unmanaged to avoid binding
  status to an unrelated provider's envKey/baseUrl.
- Simplify hasCodingPlanKey to only check CODING_PLAN_ENV_KEY, not
  activeModelConfig.envKey, preventing a generic provider key from
  being mistaken as Coding Plan credentials when codingPlan.region
  is stale.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): prioritize active model config over stale codingPlan.region

When activeModelConfig exists, trust isCodingPlanConfig() result over
potentially stale codingPlan.region from a previous setup. This prevents
a user who switched from Coding Plan to a generic provider from still
seeing "Alibaba Cloud Coding Plan" in auth status.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): avoid stale Coding Plan fallback in auth status

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-28 09:06:09 +08:00
JerryLee
1befabe586
fix(core): handle shell line continuations in command splitting (#3600)
Fixes #3158.

`splitCommands()` previously handled backslash escapes before newline/operator splitting, so a chained command like `cd project && \\<LF>git add ...` produced segments starting with a backslash-newline pair, leaving the extracted command root empty and bypassing per-command permission checks for the chained sub-command.

Treat `\\` followed by LF as a removed line continuation, while keeping `\\` followed by CRLF as a normal command separator (bash escapes only \r; the trailing \n still ends the command). This preserves the contract that every chained sub-command is visible to permission parsing and prevents an attacker from hiding a command behind a pseudo-continuation like `echo SAFE \\<CR><LF>rm -rf /`.

Adds regression coverage for both the LF-continuation positive case and the escaped-CRLF safety case.
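
The LF versus CRLF distinction can be illustrated with a small pre-pass. This is a simplified sketch under stated assumptions, not the actual `splitCommands()` change: the function name is hypothetical, and quoting rules (for example, backslashes inside single quotes) are deliberately ignored.

```typescript
// Hypothetical helper: removes backslash + LF pairs (bash line continuations)
// and leaves every other escape pair, including backslash + CR, untouched.
function removeLineContinuations(input: string): string {
  let out = '';
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch !== '\\' || i + 1 >= input.length) {
      out += ch;
      continue;
    }
    const next = input[i + 1];
    i++; // consume the escaped character together with the backslash
    if (next === '\n') continue; // line continuation: drop both characters
    out += ch + next; // any other escape pair is copied verbatim
  }
  return out;
}
```

In this sketch `cd project && \<LF>git add .` collapses into one line, so the `git` segment has a visible command root, while `echo SAFE \<CR><LF>rm -rf /` keeps its LF and `rm -rf /` still appears as a separate command for permission checks.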
2026-04-27 23:31:37 +08:00
Bramha.dev
414b3304cd
fix(core): split tool-result media into follow-up user message for strict OpenAI compat (#3617)
Fixes #3616.

Adds opt-in `splitToolMedia` flag (default false). When enabled, media parts (image / audio / video / file) returned by MCP tool calls are split into a follow-up `role: "user"` message instead of being embedded in the `role: "tool"` message. Required for strict OpenAI-compatible servers (e.g., LM Studio) that reject non-text content on tool messages with HTTP 400 "Invalid 'messages' in payload".

Media from parallel tool responses is accumulated and emitted as a single follow-up user message after all tool messages, preserving OpenAI's contiguity requirement for tool responses.

Default behavior is unchanged for permissive providers.
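
A minimal sketch of the split, assuming simplified message and part shapes. All type names and the `buildToolMessages` function are hypothetical and stand in for the real converter.

```typescript
// Hypothetical, simplified shapes; the real request types are richer.
type TextPart = { type: 'text'; text: string };
type MediaPart = { type: 'media'; mimeType: string; data: string };
type Part = TextPart | MediaPart;

interface ToolResult {
  toolCallId: string;
  parts: Part[];
}

type Message =
  | { role: 'tool'; tool_call_id: string; content: Part[] }
  | { role: 'user'; content: Part[] };

function buildToolMessages(results: ToolResult[], splitToolMedia = false): Message[] {
  const messages: Message[] = [];
  const deferredMedia: MediaPart[] = [];
  for (const result of results) {
    let content: Part[] = result.parts;
    if (splitToolMedia) {
      // Keep only text on the tool message; hold media back for later.
      content = result.parts.filter((p) => p.type === 'text');
      for (const p of result.parts) {
        if (p.type === 'media') deferredMedia.push(p);
      }
    }
    messages.push({ role: 'tool', tool_call_id: result.toolCallId, content });
  }
  // One follow-up user message after all tool messages keeps them contiguous.
  if (deferredMedia.length > 0) {
    messages.push({ role: 'user', content: deferredMedia });
  }
  return messages;
}
```

With the flag off, the sketch returns one tool message per result with media embedded, matching the unchanged default.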
2026-04-27 23:01:02 +08:00
qqqys
8a278767ed
fix(core): recover from }{ glued records on session JSONL load (#3606) (#3656)
2026-04-27 22:50:17 +08:00
jinye
f0e8601982
fix(cli): add API Key option to qwen auth interactive menu (#3624)
* fix(cli): add "API Key" option to `qwen auth` interactive menu

The `qwen auth` CLI command only showed 2 options (Coding Plan, Qwen OAuth),
while the interactive `/auth` dialog showed 3 (Coding Plan, API Key, Qwen OAuth).
Users following the README instructions to configure OpenRouter/Fireworks via
`qwen auth` had no API Key entry point.

- Add "API Key" option to the `runInteractiveAuth` menu with two sub-paths:
  "Alibaba Cloud ModelStudio Standard API Key" (guided flow) and
  "Custom API Key" (prints docs link)
- Add `qwen auth api-key` yargs subcommand for direct access
- Extract `createMinimalArgv` / `loadAuthConfig` helpers to eliminate duplicated
  CliArgs boilerplate
- Extract `promptForInput` to share raw-mode stdin logic between `promptForKey`
  and `promptForModelIds`
- Improve `showAuthStatus` to distinguish Coding Plan, Standard API Key, and
  generic OpenAI-compatible configurations
- Align menu labels and descriptions with the interactive `/auth` dialog

Closes #3413

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs: add `qwen auth api-key` to auth subcommand tables

Update documentation to reflect the new `qwen auth api-key` subcommand:
- auth.md: add to subcommands table, examples, and interactive menu display
- commands.md: add to CLI Auth Subcommands table
- quickstart.md: add to quick-reference command table

* fix(cli): restore incomplete Coding Plan warning in showAuthStatus

When selectedType is USE_OPENAI and Coding Plan metadata exists but
the API key is missing, show the incomplete warning instead of falling
through to the generic "OpenAI-compatible" status.

* refactor(cli): use endpoint constants in region selector and fix status formatting

- Use ALIBABA_STANDARD_API_KEY_ENDPOINTS constants for region
  descriptions instead of hardcoded URLs
- Restore trailing newline in showAuthStatus "no auth" command list
  for consistent spacing

* fix(cli): determine active auth method from model config in showAuthStatus

Previously showAuthStatus checked which env keys exist to determine
the auth method, causing false reports when users switch providers
(e.g., Coding Plan key still present after switching to Standard API Key).

Now it inspects the active model's provider config (baseUrl/envKey) to
determine the actual method, and validates the corresponding key exists:
- Coding Plan: check via isCodingPlanConfig + CODING_PLAN_ENV_KEY
- Standard API Key: check via DASHSCOPE_STANDARD_API_KEY_ENV_KEY + endpoints
- Generic OpenAI-compatible: check if the model's envKey is set

Also clear stale Coding Plan metadata (codingPlan.region/version and
process.env) when switching to Standard API Key.

* fix(cli): add legacy fallback in showAuthStatus and clear persisted Coding Plan env

- When no active model config is found (legacy setups without
  modelProviders), fall back to env key / metadata checks for
  Coding Plan status detection. Fixes CI test failures.
- When activeConfig exists but has no envKey, report incomplete
  status instead of false positive "Configured".
- Clear persisted env.BAILIAN_CODING_PLAN_API_KEY from settings
  when switching to Standard API Key, not just process.env.

* fix(cli): also remove Coding Plan model entries when switching to Standard API Key

When switching to Standard API Key, filter out existing Coding Plan
model entries from modelProviders.openai in addition to old Standard
entries. Previously these were preserved but their credential source
(BAILIAN_CODING_PLAN_API_KEY) was cleared, leaving broken model
entries visible in /model.

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-27 22:01:47 +08:00
tanzhenxin
581c74d76e
feat(core): model-facing agent control (task_stop, send_message, per-agent transcript) (#3471)
* feat(core): task_stop, send_message, and live transcripts for background agents

Add two new tools (task_stop, send_message) and a plain-text transcript
writer so the parent model can control and observe long-running background
subagents. The agent lifecycle is also tightened so every background launch
is paired with exactly one terminal task_notification — including under
cancellation races and pathological tools that swallow AbortSignal.

* feat(core): switch background-agent transcript to ChatRecord JSONL

Replaces the plain-text per-agent transcript writer with one that emits
the same ChatRecord schema as the main session log. Each background
subagent now writes to <projectDir>/subagents/<sessionId>/agent-<id>.jsonl
with a .meta.json sidecar; records carry agentId/agentName/agentColor
and isSidechain so a single parser can reconstruct the parent session
and its subagents as one tree.

A new EXTERNAL_MESSAGE event is emitted when send_message injections are
drained inside agent-core, so each follow-up message is persisted as a
user-role record and the transcript remains a complete view of the run.

read_file's auto-allow set is extended to <projectDir>/subagents/ so the
model can keep polling the transcript path advertised in the launch
response and the completion notification XML.

* feat(core): emit full background agent result in task-notification

Drop the 2000-char truncation on <result> in emitNotification. The agent
output is already a model-generated summary; truncating it strips content
the parent agent specifically asked for. The <output-file> path is still
included for anyone who wants the structured transcript.

* test(cli): add hasUnfinalizedAgents/abortAll to registry mock

The nonInteractiveCli test stub was missing two methods that the
runtime now calls when draining background agents on shutdown,
causing every runNonInteractive test to fail with TypeError.

* test(core): use path.join in agent-transcript path helper assertions

Hard-coded forward slashes in expected paths failed on Windows where
path.join produces backslashes.

* fix(core): thread nested agent identity into sidecar metadata

* feat(agent): improve background agent launch tool result

- Add internal-ID qualifier, anti-duplication clause, and large-file reading strategy to the launch tool-result template, ported from claw-code.
- Rename transcript_file to output_file for consistency.
- Reference read_file and run_shell_command via ToolNames constants instead of raw strings.

* fix(core): rename send_message target field

* fix(core): exclude task_stop and send_message from subagents

These are parent-side control-plane tools for managing background
subagents. Subagents themselves cannot launch background agents
(AGENT is already excluded), so they have no agent IDs to manage
natively, and exposing the tools only widens the surface for
cross-agent interference if an ID leaks via prompt or transcript.

* refactor(core): generalize task_stop and send_message framing to "task"

Today every BackgroundTaskRegistry entry is a subagent, but the
control-plane tools were named and described as agent-only. Generalize
so future task kinds (e.g. backgrounded shells, monitors) can share
the same registry without a model-facing rename.

- task_stop / send_message: descriptions, error messages, and ToolError
  enum values drop the "agent" framing in favor of "task".
- send_message: parameter to -> task_id, matching task_stop for a
  uniform control-plane contract.
- BackgroundTaskRegistry.hasUnfinalizedAgents -> hasUnfinalizedTasks.
- agent-transcript: add a TODO at getSubagentSessionDir flagging that
  <projectDir>/subagents/ is part of the model-facing contract via
  <output-file>; future kinds should migrate to <projectDir>/tasks/.
- Add a test for complete()-after-finalizeCancelled no-op to pin the
  one-notification-per-task SDK contract through the post-notified
  re-entry path.
2026-04-27 20:36:38 +08:00
dreamWB
596b5a3fd7
feat(vscode): add tab dot indicator and notification system (#3106) (#3661)
* feat(vscode): add tab dot indicator and notification system (#3106)

Add a three-layer notification system for the VSCode extension:

- Tab dot indicator: orange (task completed) / blue (needs user input,
  higher priority). Appears when the tab is not the active editor.
- VS Code notification bubble with a "Show" button that focuses the
  Qwen Code panel when clicked.
- Platform notification sound (macOS: Glass.aiff via afplay,
  Windows: SystemSounds.Asterisk, Linux: canberra-gtk-play / paplay).

Notification rules (aligned with CLI useAttentionNotifications):
- Task completed: notify when user is not watching the panel AND task
  took >= 20 seconds. idleNotificationSent guard prevents duplicates
  across multi-turn endTurn events.
- Permission / askUserQuestion: notify immediately when user is not
  watching (no duration threshold). attentionNotified guard prevents
  duplicates per request.
- "Watching" = VS Code window focused AND panel visible.

New settings: qwen-code.dotIndicator (boolean) and
qwen-code.notifications (boolean), both default to true.

Also refactors three duplicate onDidReceiveMessage handlers into a
shared handleCommonWebviewMessage method.

* fix(vscode): address PR review — improve JSDoc, add sidebar no-op comment, add paplay error logging
2026-04-27 20:28:16 +08:00
dreamWB
00ba2ef600
feat(cli): add OSC notification support for iTerm2, Kitty, and Ghostty (#3562)
* feat(cli): add OSC notification support for iTerm2, Kitty, and Ghostty

Replace the basic terminal bell with protocol-specific OSC notifications
that display rich system notifications with title and message content.

- Add terminal detection (TERM → TERM_PROGRAM → KITTY_WINDOW_ID fallback)
- Add OSC 9 (iTerm2), OSC 99 (Kitty 3-step), OSC 777 (Ghostty/cmux)
- Add tmux/screen DCS passthrough with ESC byte doubling
- Add notification routing service with auto terminal detection
- Add dynamic tool name in approval notifications
- Refactor useTerminalProgress to use shared osc.ts module
- 42 unit tests covering all detection paths and protocols
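
For illustration, the tmux passthrough with ESC doubling might look like the sketch below. The function names are hypothetical and not taken from osc.ts; the byte sequences follow the commonly documented iTerm2 OSC 9 form and the tmux `ESC P tmux; ... ESC \` passthrough wrapper.

```typescript
const ESC = '\x1b';
const BEL = '\x07';

// OSC 9 notification: ESC ] 9 ; <message> BEL
function osc9(message: string): string {
  return `${ESC}]9;${message}${BEL}`;
}

// tmux DCS passthrough: every ESC inside the payload is doubled so tmux
// forwards it to the outer terminal instead of interpreting it.
function wrapForTmux(sequence: string): string {
  const doubled = sequence.split(ESC).join(ESC + ESC);
  return `${ESC}Ptmux;${doubled}${ESC}\\`;
}
```

Writing `wrapForTmux(osc9('Build finished'))` to a TTY inside tmux would pass the notification through, provided the tmux configuration allows passthrough.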

Closes #2528

* fix(cli): address PR review feedback for OSC notification system

- Reorder terminal detection: TERM_PROGRAM first, TERM fallback, KITTY_WINDOW_ID last
- Add TTY guard in sendNotification to skip OSC when stdout is piped
- Add sanitizeOscPayload to prevent control character injection in OSC payloads
- Replace hand-rolled PendingToolCall with imported TrackedToolCall type
- Extract awaiting tool name via useMemo to avoid useEffect re-fires
- Unify brand name to "Qwen Code" in all notification messages
- Remove unused TerminalWriteContext/Provider/Hook exports
- Fix docstring: OSC_PREFIX 9;4 → OSC 9;4

* fix(cli): base64-encode Kitty OSC 99 payloads and fix screen ST conflict

- Add encodeKittyPayload() to base64-encode UTF-8 text for Kitty OSC 99
- oscKittyNotify() now uses e=1 flag with base64-encoded title/body
- osc() falls back to BEL terminator for Kitty inside GNU screen to
  avoid ST conflicting with the DCS passthrough wrapper's own ST
2026-04-27 20:28:07 +08:00
wenshao
9a9a8ade65
refactor(skills): extract parsePathsField + tighten paths mock pattern
The `paths:` frontmatter parser was duplicated across
`skill-manager.ts:parseSkillContent` and
`skill-load.ts:parseSkillContent`. Future validation tweaks
(e.g. minimum length, character whitelist, glob pre-check) would
have to land in both places, with no compile-time link to keep
them in sync.

Extract `parsePathsField(frontmatter)` into `types.ts` next to the
existing `parseModelField`, and call it from both parsers. Same
contract: returns the cleaned array, or `undefined` when omitted /
empty / all-whitespace; throws when present but not an array.
Adds 8 tests in `skill-load.test.ts` covering the contract.

Also tighten the `paths:` branch in the `skill-manager.test.ts`
mock yaml parser. The previous `yamlString.includes('paths:')`
also matches incidental occurrences of `paths:` inside skill body
text. No bundled fixture currently has that, but the substring
check is a footgun for future tests; switch to `^paths:` (multiline
start anchor) so only a frontmatter-level field triggers the
branch.
2026-04-27 17:41:37 +08:00
Shaojin Wen
f420742831
feat(cli,core): LLM-generated summary labels for tool-call batches (#3538)
* feat(cli,core): generate tool-use summaries for compact mode

After each tool batch completes, fire a parallel fast-model call to
generate a short git-commit-subject-style label summarizing what the
batch accomplished (e.g. "Read txt files", "Searched in auth/"). In
compact mode the label replaces the generic "Tool × N" header so N
parallel tool calls collapse to a single semantic row.

The fast-model call (~1s) runs fire-and-forget, overlapped with the
next turn's API stream, so there is no perceived latency. Missing
fast model, aborted turns, and model failures all degrade silently to
the existing rendering.

The summary is also emitted as a `tool_use_summary` history entry
with `precedingToolUseIds`, keeping the shape compatible with SDK
clients that want to render collapsed tool views on their own.

Gated by `experimental.emitToolUseSummaries` (default on). Can be
overridden per-session with `QWEN_CODE_EMIT_TOOL_USE_SUMMARIES=0|1`.

The system prompt and truncation rules (300 chars per tool field,
200 chars of trailing assistant text as intent prefix) match the
existing behavior seen in other tools that emit the same message
type, so SDK consumers see a consistent shape across clients.

* fix(core): bound cleanSummary quote-strip regex to avoid ReDoS

CodeQL js/polynomial-redos flagged the /^["'`]+|["'`]+$/g pattern in
cleanSummary because its input comes from an LLM (treated as
uncontrolled). The original regex is anchored and linear in practice,
but tightening the quantifier to {1,10} both satisfies the static
check and caps engine work on pathological model output with a long
run of quotes. Ten opening/closing quotes is well past anything a real
label would produce.
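
A small sketch of the bounded pattern in use. The wrapper function name is hypothetical and covers only the quote-strip step, not the rest of cleanSummary.

```typescript
// Bounded quantifier: at most 10 quote characters are consumed at each end.
const QUOTE_STRIP = /^["'`]{1,10}|["'`]{1,10}$/g;

// Hypothetical wrapper showing only the quote-stripping step.
function stripWrappingQuotes(label: string): string {
  return label.trim().replace(QUOTE_STRIP, '');
}
```

A label such as `"Read txt files"` comes back without its quotes. A pathological run of more than ten quotes is only partially stripped, which is the accepted trade-off for capping the work the regex engine does.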

* fix(cli): render tool_use_summary inline so full mode also shows the label

The summary was only visible in compact mode because the full-mode
ToolGroupMessage ignored the compactLabel prop. Compact mode got away
with this because mergeCompactToolGroups triggers refreshStatic(),
which re-renders the merged tool_group with its newly-looked-up
label. Full mode has no such refresh path, so when the fast-model
call resolves *after* the tool_group has been committed to the
append-only <Static>, there is no way to retroactively decorate it.

Switch to rendering `tool_use_summary` as its own inline history item
(a single dim `● <label>` line). New items append cleanly to <Static>,
so the summary flows in naturally once the fast-model call resolves.
Compact mode still replaces the merged tool_group header with the
label and hides the standalone summary line via the `compactMode`
guard.

With this, the feature works under the default `ui.compactMode: false`
— not just the opt-in compact view.

* docs: tool-use-summaries feature guide, settings entry, and design doc

Three new docs matching the existing fast-model feature docs layout:

- docs/users/features/tool-use-summaries.md — user-facing guide
  covering full + compact rendering, configuration (settings + env),
  failure modes, cost, and cross-links to followup-suggestions.

- docs/users/configuration/settings.md — register the new
  experimental.emitToolUseSummaries setting next to the other
  fast-model-driven UI settings.

- docs/design/tool-use-summary/tool-use-summary-design.md — deep dive
  matching the compact-mode-design.md competitive-analysis style.
  Documents the Claude Code port (prompt, truncation, timing, gate),
  the deviations (settings layer, default on, cleanSummary, dual
  render paths), and the Ink <Static> append-only rationale that
  drove the inline full-mode render vs header-replacement split.

* docs: add Recommended pairing section to tool-use-summaries

Full-mode rendering of the summary works, but for small same-type
batches (Read × 3 and similar) the label visibly restates what the
tool lines already show. Pairing with ui.compactMode: true folds
the whole batch into a single labeled row, which is the cleanest
transcript shape once the label is available.

Adds a dedicated section showing the paired settings.json snippet
and explicitly calling out when each mode wins (and when to turn
the feature off instead).

* fix: address review feedback on tool-use summary generation

Addresses multiple issues from @chiga0's review:

Blocking — compact-mode label invisible for single-batch turns.
mergeCompactToolGroups's adjacency-only gating left a trailing
tool_use_summary in the merged result whenever there was no second
batch to merge across. That pushed mergedHistory.length lock-step
with history.length and MainContent's refreshStatic heuristic
(currMLen <= prevMLen) never fired, so Ink's append-only <Static>
never repainted the tool_group with its newly-looked-up label.
Drop tool_use_summary items unconditionally now; gemini_thought
still survives to avoid unnecessary repaints. New tests cover
the single-batch case and the summary-before-user-message case.

Blocking — stale summary appears after Ctrl+C on the next turn.
summarySignal captured the CURRENT turn's AbortController, but the
summary resolves during the NEXT turn's streaming window. The next
turn's submitQuery allocates a fresh controller, so the captured
signal was never aborted — Ctrl+C during the new turn used to let
the previous turn's summary land in the transcript seconds later.
Fix: dedicated per-batch AbortController tracked in a ref set,
aborted eagerly from cancelOngoingRequest; resolve-time check reads
the live abort state and turnCancelledRef.

High — summarizer input pollution.
geminiTools contained error/cancelled tools; retry-loop warnings
and "Cancelled by user" strings were feeding the fast model.
cleanSummary can only reject error-shaped output, not prevent the
model from hallucinating a plausible label from bad input (the PR's
own tmux screenshot showed "Read txt files · 5 tools" where 4 of
the 5 were prior-retry failures). Filter to status === 'success'
before building the prompt; skip the call entirely if nothing's
left.

High — unstable label on merged groups.
getCompactLabel iterated all callIds and returned the first hit,
so asynchronous resolution order made the header visibly flip
from SB to SA when batch A resolved after batch B. Lock onto
item.tools[0].callId to keep stable "leading batch governs"
semantics.

High — force-expanded groups in compact mode had no label at all.
Compact mode routes non-force-expand groups through
CompactToolGroupDisplay (consumes compactLabel) and force-expand
groups through the full ToolGroupMessage (ignores compactLabel);
the standalone ● line was gated on !compactMode, creating a dead
zone — exactly the diagnostically valuable case. MainContent now
computes absorbedCallIds (which groups actually consume the
header replacement) and passes summaryAbsorbed to
HistoryItemDisplay; force-expand groups in compact mode get the
standalone line as the label's only path to the screen.

Medium — cleanSummary robustness.
Extend quote-strip to Unicode curly + CJK corner brackets; strip
markdown emphasis (**bold**, _italic_); broaden refusal-prefix
rejection to curly-apostrophe "I can't", Chinese "我无法 / 我不能 /
抱歉 / 无法", and "Failed to / Sorry, / Request failed". 7 new
cleanSummary tests cover the added cases.

Low — concurrent-rendering safety.
Move historyRef.current = history from render phase into
useLayoutEffect so bailed renders can't leave a dropped value.

Low — CompactToolGroupDisplay readability.
Extract renderSummaryHeader / renderDefaultHeader helpers and
document the toolCalls.length > 1 count-suffix guard so a future
"fix" to >= 1 doesn't reintroduce "Read config.json · 1 tools".
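
The guard can be illustrated with a small formatter; `formatSummaryHeader` is a hypothetical stand-in, not the actual helper signature:

```typescript
// Appends the count suffix only for groups with more than one tool call.
// The comparison is strictly `> 1`: a single call renders the bare label,
// which avoids output such as "Read config.json · 1 tools".
function formatSummaryHeader(label: string, toolCount: number): string {
  return toolCount > 1 ? `${label} · ${toolCount} tools` : label;
}
```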

Docs — add Scope & Lifecycle section to tool-use-summaries.md
covering (1) one generation per batch shared by both modes,
(2) no backfill on toggle / session resume, (3) main-agent batches
only with the Task-tool clarification.

* fix: address second-round review feedback on tool-use summaries

Critical — force-expand groups lost their summary entirely.
Previous round's "drop tool_use_summary unconditionally" merge fix
also stripped summaries for force-expanded groups, defeating the
exact case (errors, confirmations, focused shell) where the
standalone ● line is the label's only path to the screen. The
merge function now takes an absorbedCallIds set: summaries whose
preceding callIds are all absorbed by a compact tool_group header
are dropped (so refreshStatic still fires), but force-expanded
summaries pass through to be rendered standalone by
HistoryItemDisplay. MainContent computes absorbedCallIds from raw
history and passes it in. New tests cover both the absorbed-drop
and the force-expand-preserve cases plus the empty-set default
for callers that don't compute absorption.

Suggestion — late-arriving summaries could land out of order.
A slow fast-model call could resolve after the next turn's
content was committed, planting the ● label between later items
in full mode. The resolve callback now captures the first batch
callId, locates the corresponding tool_group at resolve time,
and drops the summary if a newer tool_group has already appeared
in history. New test exercises this with a manually-resolved
fast-model promise.

Suggestion — truncateJson allocated full JSON for large strings.
A 10MB ReadFile result was being JSON.stringify'd in full only to
be sliced down to 300 chars. Added preTruncate that walks the
value (depth-bounded to 4) and slices string leaves to maxLength
before serialization. Tests verify the input never reaches its
full pre-cap form.
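
A sketch of the pre-truncation idea under stated assumptions: the depth limit of 4 comes from the description above, while the `"[truncated]"` placeholder for deeper values and the exact function signatures are hypothetical:

```typescript
const MAX_DEPTH = 4;

// Walks the value and slices string leaves to maxLength before any
// serialization happens. Values nested deeper than MAX_DEPTH are replaced
// by a placeholder (an assumption; the real code may handle depth differently).
function preTruncate(value: unknown, maxLength: number, depth = 0): unknown {
  if (typeof value === "string") {
    return value.length > maxLength ? value.slice(0, maxLength) : value;
  }
  if (value === null || typeof value !== "object") {
    return value;
  }
  if (depth >= MAX_DEPTH) {
    return "[truncated]";
  }
  if (Array.isArray(value)) {
    return value.map((entry) => preTruncate(entry, maxLength, depth + 1));
  }
  const out: Record<string, unknown> = {};
  for (const [key, entry] of Object.entries(value)) {
    out[key] = preTruncate(entry, maxLength, depth + 1);
  }
  return out;
}

// Serializes the already-shrunk value, then applies the final cap.
function truncateJson(value: unknown, maxLength: number): string {
  const json: string | undefined = JSON.stringify(preTruncate(value, maxLength));
  if (json === undefined) {
    return "";
  }
  return json.length > maxLength ? json.slice(0, maxLength) : json;
}
```

The point of the two-step shape is that `JSON.stringify` only ever sees strings of at most `maxLength` characters, so a multi-megabyte tool result is never serialized in full.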

Suggestion — settings description over-claimed SDK emission.
The description said summaries are emitted to SDK clients as a
tool_use_summary message; the SDK plumbing isn't actually wired
in this PR (the factory is exported for follow-up). Updated
settings.json description and regenerated the vscode schema to
state CLI-only scope explicitly.

Suggestion — fastModel data-boundary not documented.
When fastModel uses a different provider than the main session
model, tool inputs/outputs cross a new auth boundary that users
may not expect. Added "Data flow & privacy" section to the user
feature doc spelling out: same-provider fast model = no scope
change; different-provider = strictly larger sharing scope; two
escape hatches (same-provider fast model OR feature off).
Code-level mitigation (metadata-only mode) deferred.
2026-04-27 16:54:10 +08:00
pomelo
7fe853a782
Feat/openrouter auth (#3576)
* feat(cli): add OpenRouter auth flow

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(cli): add OpenRouter model management UI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): align OpenRouter OAuth fallback session

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* refactor(cli): unify OpenRouter model setup flow

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* feat(auth): update OAuth description with provider examples and i18n support

- Updated OAuth option description to include provider examples (OpenRouter, ModelScope)
- Added internationalization support for new description text
- Updated all language files (en, zh, de, fr, ja, pt, ru) with translations

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs: simplify OpenRouter design docs

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(auth): fix OpenRouter OAuth mock typing

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(auth): sync AuthDialog tests with new three-option main menu layout

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

Update assertions that referenced removed 'Qwen OAuth' and 'OpenRouter' options in the main/API-key views to match the refactored OAUTH / CODING_PLAN / API_KEY structure.

* fix(i18n): add missing zh-TW translation for browser-based auth key

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

zh-TW.js was generated from main's en.js which had already removed this key, but the PR re-adds it in en.js. Sync zh-TW with the new translation.

* feat(cli): Improve custom auth wizard with step indicators and cleaner advanced config (#3607)

* feat(cli): Add custom API key auth wizard with 6-step setup flow

Replace the documentation-only "Custom API Key" screen with an
in-terminal wizard: Protocol select → Base URL input → API Key input →
Model ID input → JSON review → Save.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

- Add 5 new ViewLevels and render functions in AuthDialog
- Implement utility functions: generateCustomApiKeyEnvKey (normalization),
  normalizeCustomModelIds (split/trim/dedupe), maskApiKey (display)
- Implement handleCustomApiKeySubmit in useAuth with backup, env key
  generation, modelProviders merge, auth refresh, and user feedback
- Wire handler through UIActionsContext and AppContainer
- Add 18 unit tests for utilities, 4 wizard flow integration tests
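
Two of the utilities named above can be sketched as follows. The comma separator and the "keep four characters at each end" mask format are assumptions; the commit only states split/trim/dedupe and display masking:

```typescript
// Splits a comma-separated list (assumed separator), trims each entry,
// drops empty entries, and removes duplicates while keeping first-seen order.
function normalizeCustomModelIds(input: string): string[] {
  const seen = new Set<string>();
  for (const part of input.split(",")) {
    const id = part.trim();
    if (id.length > 0) {
      seen.add(id);
    }
  }
  return Array.from(seen);
}

// Masks a key for display. Short keys are fully masked; longer keys keep
// the first and last four characters (a hypothetical format).
function maskApiKey(key: string): string {
  if (key.length <= 8) {
    return "*".repeat(key.length);
  }
  return `${key.slice(0, 4)}${"*".repeat(key.length - 8)}${key.slice(-4)}`;
}
```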

* feat(cli): Improve custom auth wizard with step indicators and cleaner advanced config

- Add step indicators (Step 1/6 · Protocol) to each wizard screen
- Remove redundant Protocol/Endpoint context from each step for focus
- Redesign advanced config: add descriptions to thinking/modality toggles
- Remove max tokens option; keep only thinking and modality settings
- Add ↑↓ arrow navigation with Space toggle and Enter to continue
- Generation config flows through review JSON and final submit

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test: Fix Windows CI failures in fileUtils and AuthDialog tests

- fileUtils.test.ts: Mock node:child_process execFile to prevent
  pdftotext spawn that times out on Windows (ENOENT, 5s timeout)
- AuthDialog.test.tsx: Add char-by-char typeText() helper to work
  around Node 24.x + ink TextInput compatibility issue on Windows

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): Reset advanced wizard state and use JSON.stringify for settings preview

- Reset advancedThinkingEnabled, advancedModalityEnabled, and
  focusedConfigIndex when re-entering custom wizard to prevent
  state leakage between configurations
- Replace hand-rolled JSON string concatenation with
  JSON.stringify for settings.json preview to properly escape
  special characters in model IDs and base URLs
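
To illustrate why `JSON.stringify` is the safer choice for the preview, here is a small sketch; the `CustomModelPreview` shape and the nesting under `modelProviders.openai` are hypothetical:

```typescript
// Hypothetical preview entry; the real settings shape may differ.
interface CustomModelPreview {
  id: string;
  baseUrl: string;
  envKey: string;
}

// JSON.stringify escapes quotes and backslashes, so the preview is always
// valid JSON even when a model ID or base URL contains special characters.
function renderSettingsPreview(model: CustomModelPreview): string {
  return JSON.stringify({ modelProviders: { openai: [model] } }, null, 2);
}
```

Hand-built concatenation would emit a model ID such as `my "quoted" model` unescaped and produce an invalid preview.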

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): harden OpenRouter OAuth callback handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): stabilize OpenRouter state mismatch test

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test(cli): stabilize custom auth wizard navigation

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-27 14:47:44 +08:00
jinye
96bc874197
chore(gitignore): add .codex directory to gitignore (#3665)
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
2026-04-27 14:38:56 +08:00
John London
ccb9857a5c
refactor(config): dedupe QWEN_CODE_API_TIMEOUT_MS env override logic (#3653)
Extract duplicated timeout env override block into a shared helper
applyTimeoutEnvOverride(), used by both resolveModelConfig() and
resolveQwenOAuthConfig(). Preserves precedence:
modelProvider > env > settings > default.

Adds [Regression] and [Additional] tests guarding against the
original OAuth-path bug and covering edge cases.
2026-04-27 08:44:18 +08:00
Dragon
534ca986eb
feat(cli): add argument-hint support for slash commands (#3593)
Adds argument-hint support across the slash command pipeline. Skill and command authors specify an argument-hint field in markdown frontmatter, which renders as inline ghost text when the user has typed the command name but not yet provided arguments.

Pipeline:
- Skill parsing: SkillConfig.argumentHint parsed from SKILL.md frontmatter
- Command loaders: propagated through SkillCommandLoader, BundledSkillLoader, FileCommandLoader, command-factory
- UI: useCommandCompletion shows hint as ghost text with showCursorBeforeText layout; InputPrompt separates display text from Tab-accept text
- ACP: passed as input.hint per spec
- Bundled skills (batch, loop, qc-helper, review) get hints

Hint is excluded from completion menu labels to keep the dropdown clean and disappears as soon as the user starts typing arguments.
2026-04-27 08:29:50 +08:00
jinye
3b0b6c052b
feat(cli): add API preconnect to reduce first-call latency (#3318)
Send a fire-and-forget HEAD request early in startup to warm the TCP+TLS connection. Subsequent SDK calls share an undici dispatcher with the preconnect request, reusing the warmed connection to save 100-200ms on the first request.

Skip conditions:
- NODE_EXTRA_CA_CERTS set (enterprise TLS inspection)
- Sandbox mode (process-restart context)
- Non-default baseUrl (mTLS / private deployment)
- Non-Node runtimes (Bun)

Disable via QWEN_CODE_DISABLE_PRECONNECT=1.
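
The skip conditions can be read as a single predicate. This is a sketch only: `PreconnectContext` and its fields are hypothetical, and the environment is passed in explicitly so the logic stays testable:

```typescript
// Hypothetical inputs; the real startup code gathers these differently.
interface PreconnectContext {
  env: Record<string, string | undefined>;
  sandboxed: boolean;
  baseUrl: string;
  defaultBaseUrl: string;
  isBun: boolean;
}

// Returns true only when none of the documented skip conditions apply.
function shouldPreconnect(ctx: PreconnectContext): boolean {
  if (ctx.env["QWEN_CODE_DISABLE_PRECONNECT"] === "1") return false; // explicit opt-out
  if (ctx.env["NODE_EXTRA_CA_CERTS"]) return false; // enterprise TLS inspection
  if (ctx.sandboxed) return false; // process-restart context
  if (ctx.baseUrl !== ctx.defaultBaseUrl) return false; // mTLS / private deployment
  if (ctx.isBun) return false; // non-Node runtime
  return true;
}
```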

Closes #3223
2026-04-27 06:54:55 +08:00
John London
70127b5cd8
fix(config): support QWEN_CODE_API_TIMEOUT_MS across OAuth and non-OAuth paths (#3629)
* feat(config): support API timeout env override

Adds support for QWEN_CODE_API_TIMEOUT_MS as an environment override
for model generation timeout.

Qwen Code already supports timeout configuration via:
  settings.model.generationConfig.timeout

This change introduces an env-based override for users running slow
local/OpenAI-compatible backends where editing config is less convenient.

Precedence: modelProvider > env var > settings > default (120000ms)

Behavior:
- Valid positive env values override configured timeout
- Invalid values are ignored
- Default behavior remains unchanged (applied in buildClient())
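
The precedence and validation rules above can be sketched as follows. The function names are hypothetical, and two details are assumptions not stated in the commit: whitespace-padded values are trimmed before parsing, and only whole numbers are accepted:

```typescript
const DEFAULT_TIMEOUT_MS = 120000;

// Parses the env override; anything that is not a positive integer is ignored.
function parseTimeoutEnv(raw: string | undefined): number | undefined {
  if (raw === undefined) {
    return undefined;
  }
  const trimmed = raw.trim();
  if (!/^\d+$/.test(trimmed)) {
    return undefined;
  }
  const value = Number(trimmed);
  return value > 0 ? value : undefined;
}

// Precedence: modelProvider > env var > settings > default.
function resolveTimeout(sources: {
  providerTimeout?: number;
  envValue?: string;
  settingsTimeout?: number;
}): number {
  return (
    sources.providerTimeout ??
    parseTimeoutEnv(sources.envValue) ??
    sources.settingsTimeout ??
    DEFAULT_TIMEOUT_MS
  );
}
```

In the real code the env value would come from `QWEN_CODE_API_TIMEOUT_MS`; here it is passed in as a plain string to keep the sketch free of process state.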

Note: The 5-minute timeout reported in #1045 originally came from
undici's default bodyTimeout, which is now disabled (bodyTimeout:0).
The modelConfigResolver default is 120000ms (2 minutes).

Includes unit tests covering precedence and validation.

Closes #1045

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(core): add edge-case tests for QWEN_CODE_API_TIMEOUT_MS

Covers: large timeout values, whitespace-padded env values,
negative env values, and reinforces provider > env > settings precedence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(config): support QWEN_CODE_API_TIMEOUT_MS override

Adds support for QWEN_CODE_API_TIMEOUT_MS as an environment
override for model generation timeout.

Closes #13

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 05:59:06 +08:00
易良
310eb88fba
fix(cli): guard gradient rendering without colors (#3640)
2026-04-27 00:52:56 +08:00
易良
04afc610ea
fix(vscode-companion): slash command completion not triggering after message submit (#3609)
* fix(vscode-companion): slash command completion not triggering after message submit

After submitting a message, the input field is cleared with a zero-width
space (\u200B) to maintain contentEditable height. When the user then
types "/", the DOM content becomes "\u200B/" and the trigger character
lands at position 1 instead of 0. The word boundary check only recognized
regular space and newline, so the zero-width space was rejected as an
invalid boundary — preventing the completion popup from appearing.

Add \u200B to the valid word boundary characters so "/" and "@" triggers
work correctly after message submission without requiring an extra
backspace.

Closes #3592

* refactor(webui): extract zero-width space placeholder into shared constant

Replace scattered `\u200B` magic strings with a shared `ZERO_WIDTH_SPACE`
constant and `stripZeroWidthSpaces()` helper exported from @qwen-code/webui.

This also improves the slash command completion fix: instead of adding
\u200B to the word boundary check, strip it at the source in handleInput
(consistent with InputForm's onInput handler) and clamp the cursor
position to the stripped text length.
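
A sketch of the strip-and-clamp approach. `ZERO_WIDTH_SPACE` and `stripZeroWidthSpaces` are the names from the commit; `normalizeInput` and its return shape are hypothetical:

```typescript
const ZERO_WIDTH_SPACE = "\u200B";

// Removes every zero-width space placeholder from the text.
function stripZeroWidthSpaces(text: string): string {
  return text.split(ZERO_WIDTH_SPACE).join("");
}

// Strips the placeholder at the source and clamps the cursor so it never
// points past the end of the stripped text.
function normalizeInput(
  rawText: string,
  rawCursor: number,
): { text: string; cursor: number } {
  const text = stripZeroWidthSpaces(rawText);
  const cursor = Math.min(Math.max(rawCursor, 0), text.length);
  return { text, cursor };
}
```

After normalization, typing "/" into a freshly cleared field yields the text "/" with the trigger at index 0, so the usual word-boundary check applies unchanged.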

Closes #3592

* test: add tests for zero-width space handling and shouldSendMessage

- Add unit tests for ZERO_WIDTH_SPACE constant and stripZeroWidthSpaces
  helper (via @qwen-code/webui import)
- Add shouldSendMessage tests covering empty, whitespace, zero-width
  space, and attachment scenarios
- Add parseExportSlashCommand tests for zero-width space input

* fix(test): use correct ImageAttachment type in shouldSendMessage tests

Fix CI lint failure by providing all required ImageAttachment fields
(id, name, type, size, data, timestamp) instead of non-existent
mediaType property.
2026-04-26 22:27:54 +08:00
Jordi Mas
b5c7abd28e
feat: Adds Catalan language support (#3643)
* Initial version

* Some fixes

* Fix sentences

* More fixes

* Fix

* Latest fixes
2026-04-26 22:26:53 +08:00
tanzhenxin
a6b0b7e579
Revert "fix(cli): respect OPENAI_MODEL precedence in CLI model resolution (#3567)" (#3633)
This reverts commit 007a109db8.

The change made `OPENAI_MODEL` outrank `settings.model.name` when looking
up the active entry in `settings.modelProviders`. Combined with the core
resolver's `modelProvider > cli > env > settings` priority, this caused
a regression: a `/model` selection (which writes `settings.model.name`)
was silently overridden whenever `OPENAI_MODEL` was set in the user's
shell, with no warning surfaced.

Restoring the previous behavior — looking up the provider entry by
`argv.model || settings.model?.name` — preserves the implicit contract
that an explicit `modelProviders` config takes precedence over stale
shell defaults. Users without a `modelProviders` config are unaffected:
env vars still drive model selection through the core resolver.
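
The restored lookup can be sketched like this; `ProviderEntry` and `findActiveProviderEntry` are hypothetical names, and only the `argv.model || settings.model?.name` rule comes from the text above:

```typescript
// Hypothetical entry shape for settings.modelProviders.
interface ProviderEntry {
  id: string;
  baseUrl: string;
}

// Looks up the active entry by CLI flag first, then by the persisted
// settings name. OPENAI_MODEL is deliberately not consulted here.
function findActiveProviderEntry(
  argvModel: string | undefined,
  settingsModelName: string | undefined,
  entries: ProviderEntry[],
): ProviderEntry | undefined {
  const lookupId = argvModel || settingsModelName;
  if (!lookupId) {
    return undefined;
  }
  return entries.find((entry) => entry.id === lookupId);
}
```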

See discussion on #3567.
2026-04-26 14:29:18 +08:00
Shaojin Wen
29887ddfef
fix(core): match DeepSeek provider by model name for sglang/vllm (#3613) (#3620)
Some OpenAI-compatible servers (notably sglang's deepseek-v4 jinja
template) crash on the array form of message content even when it
carries a single text block, with `TypeError: sequence item 0:
expected str instance, list found` at `encoding_dsv4.py:336`.

The DeepSeekOpenAICompatibleProvider already flattens content arrays
into joined strings in buildRequest, but isDeepSeekProvider only
matched on the official api.deepseek.com baseUrl. DeepSeek models
served behind sglang / vllm / ollama / etc. bypass the workaround
and hit the bug.

Extend the matcher to also detect by model name (case-insensitive
substring 'deepseek'), so any OpenAI-compatible endpoint serving a
DeepSeek model picks up the same content-format flattening.
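
A sketch of the extended matcher together with the content flattening it enables. The signatures are hypothetical, and joining parts with a newline is an assumption; the commit only says content arrays are flattened into joined strings:

```typescript
interface TextPart {
  type: "text";
  text: string;
}

type MessageContent = string | TextPart[];

// Matches either the official endpoint or any model whose name contains
// "deepseek" (case-insensitive), e.g. when served behind sglang or vllm.
function isDeepSeekProvider(
  baseUrl: string | undefined,
  model: string | undefined,
): boolean {
  const hostMatch = (baseUrl ?? "").toLowerCase().includes("api.deepseek.com");
  const modelMatch = (model ?? "").toLowerCase().includes("deepseek");
  return hostMatch || modelMatch;
}

// Collapses the array form of message content into a plain string.
function flattenContent(content: MessageContent): string {
  if (typeof content === "string") {
    return content;
  }
  return content.map((part) => part.text).join("\n");
}
```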

Fixes #3613

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
2026-04-26 13:17:34 +08:00
Shaojin Wen
569cfe10fa
fix(telemetry): use safeJsonStringify in FileExporter to avoid circular reference crash (#3630)
When --telemetry-outfile is configured, FileSpanExporter.serialize called
JSON.stringify directly on OTel ReadableSpan instances. The spans hold a
back-reference to BatchSpanProcessor (._shutdownOnce -> BindOnceFuture._that
-> BatchSpanProcessor), which forms a cycle and triggers
"TypeError: Converting circular structure to JSON" on every export. Combined
with DiagConsoleLogger, the error was repeatedly printed to stderr and
polluted the Ink TUI.

Switch FileExporter.serialize to the existing safeJsonStringify utility,
matching the upstream gemini-cli fix so future merges stay clean. Add a
focused regression test that mimics the BatchSpanProcessor cycle shape;
broader cycle behavior is already covered by safeJsonStringify.test.ts.
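
For readers unfamiliar with the failure mode, here is a minimal cycle-tolerant serializer. It is an illustrative sketch, not the project's `safeJsonStringify`: it copies the value while tracking the chain of ancestor objects, and it ignores `toJSON` and special objects such as `Date`:

```typescript
// Copies the value, replacing any reference back to an ancestor with a
// placeholder. Shared (non-circular) references are copied normally.
function decycle(value: unknown, ancestors: object[]): unknown {
  if (typeof value !== "object" || value === null) {
    return value;
  }
  if (ancestors.includes(value)) {
    return "[Circular]";
  }
  const chain = [...ancestors, value];
  if (Array.isArray(value)) {
    return value.map((entry) => decycle(entry, chain));
  }
  const out: Record<string, unknown> = {};
  for (const [key, entry] of Object.entries(value)) {
    out[key] = decycle(entry, chain);
  }
  return out;
}

function safeStringifySketch(value: unknown): string {
  return JSON.stringify(decycle(value, []));
}
```

A span that points back to itself through a processor-like object then serializes with the back-reference replaced instead of throwing.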

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
2026-04-26 12:55:39 +08:00
Yan Shen
eea4e10eea
feat(cli): add sticky todo panel to app layouts (#3507)
* feat(cli): add sticky todo panel to app layouts

* fix(cli): hide sticky todos during feedback dialog
2026-04-26 12:21:30 +08:00
jinye
4be0234d10
docs(telemetry): clarify Alibaba Cloud console entry (#3498)
* docs(telemetry): clarify Alibaba Cloud console entry

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(telemetry): fix unreachable intl console URL and split new/legacy console guidance

- Replace unreachable tracing-sgnew.console.alibabacloud.com with the
  verified arms.console.alibabacloud.com for international users
- Separate OTLP endpoint retrieval steps by console version: new console
  uses Integration Center, legacy console uses Cluster Configurations →
  Access point information

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* docs(telemetry): align target example with current implementation

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(telemetry): clarify Alibaba Cloud OTLP setup

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(telemetry): remove stale TOC entry

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-26 07:40:35 +08:00
jinye
f7cfe53c6a
fix(core): preserve settings-sourced apiKey when registry model envKey is absent (#3495)
* fix(core): preserve settings-sourced apiKey when registry model envKey is absent (#3417)

On restart, `applyResolvedModelDefaults` unconditionally cleared the
apiKey resolved from `settings.security.auth.apiKey` (layer 4 fallback)
and only read from `process.env[model.envKey]`. When the provider-specific
env var was absent (e.g. key stored only in settings), the correctly
resolved key was discarded, causing a 401 error.

Now capture the previously-resolved apiKey before clearing and fall back
to it when `process.env[model.envKey]` is empty, but only for safe source
kinds (`settings` and general `env` without `via.modelProviders`).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): also preserve CLI-sourced apiKey during syncAfterAuthRefresh

Address review feedback: keys passed via CLI flags (e.g. --openaiApiKey)
were dropped on restart because source kind 'cli' was not in the
fallback allowlist. Add 'cli' to the condition and a regression test.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): move apiKey preservation from applyResolvedModelDefaults to syncAfterAuthRefresh

The previous fallback logic inside applyResolvedModelDefaults could leak
a settings/cli-sourced apiKey to a different provider when switching
models within the same authType (e.g. dashscope → openai). This is a
credential safety issue because the two providers may have different
baseUrls.

Move the save/restore logic to syncAfterAuthRefresh Step 1, guarded by
an `isUnchanged` check (same authType AND same modelId). This ensures:
- Restart scenario: apiKey preserved (same model, no change)
- Cross-provider switch: apiKey cleared (different modelId)
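
The guard described in this commit can be sketched as below. All names except `isUnchanged` are hypothetical, and later commits in this PR refine the check further (provider hot-reload, source kinds), which this sketch does not cover:

```typescript
interface ModelSelection {
  authType: string | undefined;
  modelId: string | undefined;
}

// Same authType AND same modelId means nothing changed (restart scenario).
function isUnchanged(previous: ModelSelection, next: ModelSelection): boolean {
  return previous.authType === next.authType && previous.modelId === next.modelId;
}

// Prefers the provider-specific env value; otherwise restores the saved key
// only when the selection is unchanged, so a key never follows a model switch.
function resolveApiKeyAfterRefresh(
  previous: ModelSelection,
  next: ModelSelection,
  savedApiKey: string | undefined,
  envApiKey: string | undefined,
): string | undefined {
  if (envApiKey) {
    return envApiKey;
  }
  return isUnchanged(previous, next) ? savedApiKey : undefined;
}
```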

Also adds two cross-provider switch tests (settings-sourced and
CLI-sourced) per review feedback.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): replace non-null assertion with truthiness guard and add cold-start test

- Replace `savedApiKeySource!` with a truthiness guard for safer
  source restoration
- Add test for cold-start scenario (previousAuthType undefined) to
  verify no key preservation occurs on first syncAfterAuthRefresh
- Fix stale "short-circuit" comment in programmatic key test

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): detect provider config hot-reload in isUnchanged check

When a model provider config is hot-reloaded (e.g. via Coding Plan
update) changing envKey or baseUrl while keeping the same model id,
the save/restore logic must not preserve the old apiKey. Extend the
isUnchanged guard to compare apiKeyEnvKey and baseUrl against the
resolved model, but only after applyResolvedModelDefaults has run at
least once (apiKeyEnvKey !== undefined). On first startup call these
fields are still unset, so the check is skipped to preserve the
settings/cli-sourced key correctly.

Adds two hot-reload tests (envKey change and baseUrl change).

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): use baseUrl source as hasBeenApplied signal for provider change detection

Replace `apiKeyEnvKey !== undefined` guard with `baseUrl source ===
'modelProviders'` to reliably detect whether applyResolvedModelDefaults
has been called before. This fixes two edge cases:

1. No-envKey models: hot-reload changing baseUrl was undetected because
   apiKeyEnvKey remained undefined. Now baseUrl source is checked.
2. Startup with envKey but omitted baseUrl: undefined !== default URL
   could falsely trigger isProviderChanged. Now skipped at startup
   since baseUrl source is not yet 'modelProviders'.

Updates hot-reload test fixtures to simulate post-apply state (baseUrl
source as 'modelProviders') and adds no-envKey hot-reload test.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): shallow-clone savedApiKeySource to avoid mutation risk

Copy the ConfigSource object before applyResolvedModelDefaults runs,
so a future refactor that mutates source objects in place won't break
the save/restore logic.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-26 07:37:56 +08:00
jinye
72c31d378d
fix(test): update rewind E2E Test 1 assertion after isRealUserTurn fix (#3622)
Test 1 asserted `say exactly GAMMA3` after pressing Up once in the
rewind selector, but that only passed because `/rewind` was incorrectly
counted as a user turn. After `isRealUserTurn()` excluded slash commands,
the turn list is [ALPHA1, BETA2, GAMMA3] and Up from the initial
selection (GAMMA3) lands on BETA2. Update the assertion to match.

Ref: https://github.com/QwenLM/qwen-code/pull/3441#issuecomment-4319798259

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-26 06:49:42 +08:00
qqqys
83d1e6dcae
feat: adds a Space-to-preview affordance to the /resume session picker (#3605)
* feat(cli): add Space-to-preview in resume session picker

Press Space on a highlighted session to open a read-only transcript
preview; Enter resumes, Esc returns. Works from both in-session
`/resume` and standalone `qwen --resume`.

The standalone path runs before `loadCliConfig`, so no real Config /
LoadedSettings exist when its render tree mounts. `StandaloneSessionPicker`
wraps the picker in stub Providers — every downstream access in the
preview render path is either optional-chained or gated on states
(Confirming / Executing) that never occur in resumed session data, so
the stubs' methods are only read, never invoked for real work. Tool
descriptions degrade to the raw function-call name in preview; users
get full fidelity after pressing Enter to resume.

Co-Authored-By: Qwen-Coder <noreply@qwen.ai>

* fix(cli): guard SessionPreview separator width on narrow terminals

`'─'.repeat(boxWidth - 2)` would throw RangeError when columns < 6
(tmux splits, small panes). Clamp boxWidth to a safe minimum and
compute separatorWidth with Math.max(0, …).
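
A sketch of the clamp. The `columns - 4` derivation of `boxWidth` and the minimum of 4 are assumptions chosen to match the "columns < 6" failure described above:

```typescript
// Hypothetical minimum; the real constant may differ.
const MIN_BOX_WIDTH = 4;

// String.prototype.repeat throws RangeError for negative counts, so both
// the box width and the separator width are clamped before repeating.
function separatorLine(columns: number): string {
  const boxWidth = Math.max(MIN_BOX_WIDTH, columns - 4);
  const separatorWidth = Math.max(0, boxWidth - 2);
  return "─".repeat(separatorWidth);
}
```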

Co-Authored-By: Qwen-Coder <noreply@alibabacloud.com>

* fix(cli): gate Space-to-preview behind enablePreview prop

`SessionPicker` is shared by the resume dialog and the delete-session
dialog. Preview's Enter shortcut forwards to `onSelect`, which for
delete is `handleDelete` — so Space → preview → Enter would silently
delete the session while the preview UI still says "Enter to resume".

Add `enablePreview?: boolean` (default false). Resume callers (the
in-app resume dialog and `--resume` standalone) opt in; the delete
dialog stays opt-out and behaves exactly as before. Footer hint and
preview render branch are both gated on the prop. Add a regression
test that emulates the delete dialog and asserts Space is a no-op,
the hint is absent, and Enter still flows straight to onSelect.

Co-Authored-By: Qwen-Coder <noreply@alibabacloud.com>

---------

Co-authored-by: Qwen-Coder <noreply@qwen.ai>
Co-authored-by: Qwen-Coder <noreply@alibabacloud.com>
2026-04-25 22:41:03 +08:00
Reid
70cebc46a8
test(arena): cover select dialog key actions (#3614)
Add ArenaSelectDialog tests for Escape, discard, and winner selection key paths.

Verify Escape only closes the dialog, x discards without applying changes, Enter applies the highlighted successful agent, and failed agents remain inert when selected.
2026-04-25 22:30:11 +08:00
易良
b127258328
fix(review): respect /language output setting for local reviews (#3611)
The /review skill's language rule "match the language of the PR" has no
applicable target during local reviews (no PR exists). When a user sets
an output language via /language, local review output now honors that
preference instead of defaulting to English.

PR reviews remain unchanged — they continue matching the PR's language
since findings may be published as inline comments visible to all
collaborators.

Closes #3594
2026-04-25 22:27:30 +08:00
wenshao
e20247bafb docs(skills): document path-conditional activation and the model/user view gap
@yiliang114 noted that asking the model "what skills do you have?"
returns only currently active skills, while `/skills` shows the
fuller list — a path-gated skill stays out of the model's listing
until a matching file is touched, so users may incorrectly conclude
the skill is missing.

Add an "Optional: gate a Skill on file paths (`paths:`)" subsection
under the field requirements, covering glob semantics, scope, the
session-lifetime activation, that user invocation is unaffected, and
the disable-model-invocation interaction. Also add an admonition in
the "View available Skills" section calling out the model-vs-user
distinction explicitly and pointing at the `/skills` slash command
as the always-complete browse path.
2026-04-25 22:16:44 +08:00
jinye
c406c73509
feat(cli): add conversation rewind feature with double-ESC and /rewind command (#3441)
* feat(cli): add conversation rewind feature with double-ESC and /rewind command (#3186)

Add the ability to rewind conversation to a previous user turn, similar
to Claude Code's message selector. Users can trigger rewind via:
- Double-ESC on empty prompt while idle
- /rewind (or /rollback) slash command

The RewindSelector component provides a two-phase UI: a scrollable
pick-list of user turns followed by a confirmation dialog. On confirm,
both UI history and API history are truncated consistently, the terminal
is re-rendered, and the original prompt text is pre-populated in the
input for editing.

Key implementation details:
- historyMapping.ts correctly handles tool-call loops (functionResponse
  entries) and the startup context pair when mapping UI turns to API
  Content[] indices
- useDoublePress hook provides generic double-press detection with
  800ms timeout and proper cleanup on unmount
- ESC handler guards against WaitingForConfirmation state to prevent
  accidental rewind during tool approval
- Chat recording service records rewind events with tree-branching
  via parentUuid for session replay support
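
The double-press detection can be sketched without React. This is not the `useDoublePress` hook itself: `DoublePressDetector` is a hypothetical class that takes timestamps explicitly, and only the 800ms window comes from the description above:

```typescript
const DOUBLE_PRESS_TIMEOUT_MS = 800;

// Reports true when a press arrives within the timeout of the previous one.
// The state resets after a detected double press, so a third quick press
// starts a new sequence instead of chaining.
class DoublePressDetector {
  private lastPressAt: number | undefined = undefined;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number = DOUBLE_PRESS_TIMEOUT_MS) {
    this.timeoutMs = timeoutMs;
  }

  press(now: number): boolean {
    const isDouble =
      this.lastPressAt !== undefined && now - this.lastPressAt <= this.timeoutMs;
    this.lastPressAt = isDouble ? undefined : now;
    return isDouble;
  }

  // Mirrors the cleanup a hook would run on unmount.
  reset(): void {
    this.lastPressAt = undefined;
  }
}
```

A hook built on this idea would typically use a timer to clear the pending state and cancel that timer on unmount, which is the cleanup the commit refers to.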

Closes #3186

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix: call recordRewind() in handleRewindConfirm and simplify payload

- Actually invoke chatRecordingService.recordRewind() after rewind
- Remove tree-branching from recordRewind (no UI-to-recording UUID
  mapping exists yet) to avoid corrupting the parentUuid chain
- Simplify RewindRecordPayload to just truncatedCount

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* test: add tmux-based E2E script for rewind feature

Automated verification of all 5 manual test items from PR description:
  1. /rewind command flow (pick turn, confirm, verify truncation)
  2. Double-ESC opens selector (with btw dismiss handling)
  3. ESC during streaming cancels (no rewind)
  4. /rewind with no history (guard blocks)
  5. After rewind, model ignores removed turns

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(rewind): resolve resume persistence and IDE mode issues

- chatRecordingService: add turnParentUuids tracking and rewindRecording()
  which re-roots the parentUuid chain so rewound messages land on a dead
  branch; reconstructHistory() then skips them automatically on resume.
  Add rebuildTurnBoundaries() for re-populating the index after /resume.
- AppContainer: fix truncatedCount bug (was always 0 after loadHistory),
  wire handleRewindConfirm to rewindRecording() with correct targetTurnIndex,
  add config.getIdeMode() guard to openRewindSelector so rewind is disabled
  in IDE sessions where extra user Content entries break the API boundary
  mapping.
- useResumeCommand: call rebuildTurnBoundaries() after startNewSession so
  rewind works correctly within resumed sessions.
- resumeHistoryUtils: surface "Conversation rewound." info item when a
  rewind record is encountered during history reconstruction.
- historyMapping.test.ts: add 9 unit tests for computeApiTruncationIndex
  covering normal flow, startup context pair, tool responses, and
  compression fallback.
- Copyright headers: standardize new files to "Copyright 2025 Qwen Code".

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(rewind): close slash-command, compression, and IDE bypass holes

Three bugs found by Codex review:

1. P1: `/rewind` slash command bypassed the IDE-mode guard because
   `slashCommandActions.openRewindSelector` called `setIsRewindSelectorOpen`
   directly. Fixed by introducing a ref bridge (`openRewindSelectorRef`)
   that delegates to the guarded callback.

2. P1: Slash-command invocations (`/help`, `/stats`, etc.) are stored as
   `type: 'user'` in UI history but never reach the API or recording
   service. The turn-index counter in `handleRewindConfirm` and
   `computeApiTruncationIndex` counted them, producing off-by-N errors.
   Added `isRealUserTurn()` helper that excludes items starting with
   `/` or `?`, applied in all three counting sites (AppContainer,
   historyMapping, RewindSelector).

3. P2: After chat compression, `computeApiTruncationIndex` returned
   `apiHistory.length` when the target turn was unreachable, silently
   keeping the full API history while the UI was truncated. Changed to
   return `-1`; `handleRewindConfirm` now aborts with an error message
   when the target turn was absorbed by compression.
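
As a rough illustration of fix 2, the filter might look like the sketch below. The `HistoryItem` shape and the `countRealUserTurns` helper are assumptions for the example, not the project's actual types; only the rule itself (skip `type: 'user'` items whose text starts with `/` or `?`) comes from the description above.

```typescript
// Hypothetical UI history item; the real type carries more fields.
interface HistoryItem {
  type: string;
  text: string;
}

// A "real" user turn is a user item that is not a slash or "?" command,
// since those never reach the API history or the recording service.
function isRealUserTurn(item: HistoryItem): boolean {
  if (item.type !== 'user') return false;
  return !(item.text.startsWith('/') || item.text.startsWith('?'));
}

// Illustrative helper: every counting site would apply the same predicate
// so that UI turn indices and API turn indices stay aligned.
function countRealUserTurns(history: HistoryItem[]): number {
  return history.filter(isRealUserTurn).length;
}
```

Sharing one predicate across the counting sites is what removes the off-by-N drift: a session containing `/help` between two prompts still counts as two turns everywhere.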

Tests: 14 unit tests for historyMapping (including slash-command and
compression cases), full suite 616/616 passed.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

---------

Co-authored-by: jinye.djy <jinye.djy@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-25 22:12:29 +08:00
Shaojin Wen
e937d15b90
fix(cli): drain runExitCleanup before process.exit in error handlers (#3602)
handleError / handleCancellationError / handleMaxTurnsExceededError all
called process.exit synchronously, bypassing the caller's runExitCleanup
-> Config.shutdown -> chat-recording flush() chain on SIGINT, max-turn,
and fatal-error paths. Same family as the EPIPE bypass fixed in
bf24fff1f, just on a different code path.

Makes the three handlers async and routes the exit through a shared
exitAfterCleanup() helper that awaits runExitCleanup() before the actual
process.exit. The helper carries an exit-once latch so a SIGINT racing a
stream rejection (handleCancellationError + handleError fired
concurrently) doesn't end up running cleanup twice or interleaving exit
calls — only the first caller drains and exits, the second parks on a
never-resolving promise that's killed when process.exit fires.
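
A minimal sketch of the exit-once latch, under stated assumptions: the factory `createExitAfterCleanup` and the injected `exit` function are hypothetical and exist only so the example runs without terminating the process. The real helper presumably calls `runExitCleanup()` and `process.exit` directly.

```typescript
type Cleanup = () => Promise<void>;
type Exit = (code: number) => void;

// Hypothetical factory: returns a helper that drains cleanup, then exits,
// and guarantees that only the first caller ever does either.
function createExitAfterCleanup(runExitCleanup: Cleanup, exit: Exit) {
  let exiting = false; // the exit-once latch

  return async function exitAfterCleanup(code: number): Promise<void> {
    if (exiting) {
      // A later caller parks on a promise that never resolves, so it cannot
      // run cleanup again or race the first caller's exit.
      return new Promise<void>(() => {});
    }
    exiting = true;
    await runExitCleanup(); // flush pending work before exiting
    exit(code);
  };
}
```

With `process.exit` passed as `exit`, the parked promise is simply discarded when the process terminates, which is why it never needs to resolve.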

Text-mode handleError still throws to the caller (unchanged behavior),
but now drains the queue first so the unhandled-rejection path doesn't
lose chat-recording records.

Five call sites in nonInteractiveCli.ts updated to await. Existing 11
errors.test.ts cases adapted to async + rejects.toThrow; added 5 new
regression guards covering cleanup-before-exit ordering for each
handler plus the concurrent-handler race.

Co-authored-by: wenshao <wenshao@U-K7F6PQY3-2157.local>
2026-04-25 11:07:30 +08:00
ChiGao
54465b0c02
fix(cli): add TUI flicker foundation fixes (#3591)
* fix(cli): reduce main screen flicker

* fix(cli): pre-slice large tool text output

* fix(cli): slice tool output by visual height

* fix(core): preserve shell transcript across narrow wraps

* fix(core): suppress soft-wrap-only shell rerenders

* fix(core): compare default shell output by logical wraps

* fix(cli): gate synchronized terminal output

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
2026-04-25 10:13:34 +08:00
wenshao
160462344c fix(skills): scope path activation to visible, model-invocable skills
Two issues caught by review of the new conditional-skill activation
path, both rooted in `refreshCache()` building the activation
registry from the raw concatenation of every level's skills:

- Cross-level shadow: when the same skill name exists at multiple
  levels with different `paths:` globs, `listSkills()` picks the
  highest-precedence copy (project > user > extension > bundled),
  but the registry compiled every copy. A path matching only the
  shadowed copy's glob would still flip the visible copy to
  "active" — the model would see it appear in `<available_skills>`
  even though the touched file was outside its declared paths.

- Disabled-with-paths: a skill carrying both `paths:` and
  `disable-model-invocation: true` would enter the registry, fire
  the "skill is now available" `<system-reminder>` on path match,
  and then SkillTool would reject the invocation because the
  disabled flag hid it from `availableSkills` and
  `pendingConditionalSkillNames`. The model gets a generic "not
  found" after being told the skill exists.

Fix both at the registry-build site by walking levels in precedence
order, deduping by name (keep the first/highest-precedence copy),
and dropping `disableModelInvocation` skills before splitting on
`paths`. Adds two regression tests in `skill-manager.test.ts`.
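
The registry-build fix can be sketched roughly as follows. The `Skill` shape, the `selectActivationCandidates` name, and the returned `conditional` / `unconditional` split are assumptions for illustration; only the ordering (walk levels by precedence, dedupe by name, drop model-disabled skills, then split on `paths`) follows the description above.

```typescript
type SkillLevel = 'project' | 'user' | 'extension' | 'bundled';

// Hypothetical skill record; the real config has more fields.
interface Skill {
  name: string;
  level: SkillLevel;
  paths?: string[];
  disableModelInvocation?: boolean;
}

// Highest precedence first, matching the order stated in the commit message.
const PRECEDENCE: SkillLevel[] = ['project', 'user', 'extension', 'bundled'];

function selectActivationCandidates(skills: Skill[]): {
  conditional: Skill[];
  unconditional: Skill[];
} {
  const seen = new Set<string>();
  const visible: Skill[] = [];
  for (const level of PRECEDENCE) {
    for (const skill of skills) {
      if (skill.level !== level || seen.has(skill.name)) continue;
      seen.add(skill.name); // lower-precedence copies are shadowed
      visible.push(skill);
    }
  }
  // Drop model-disabled skills only after deduping, so a disabled
  // high-precedence copy still shadows lower-precedence ones.
  const invocable = visible.filter((s) => !s.disableModelInvocation);
  const hasPaths = (s: Skill) => (s.paths?.length ?? 0) > 0;
  return {
    conditional: invocable.filter(hasPaths),
    unconditional: invocable.filter((s) => !hasPaths(s)),
  };
}
```

Only the `conditional` list would feed the path-activation registry in this sketch, so a shadowed copy's globs and a disabled skill's globs can no longer trigger an activation reminder.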
2026-04-25 09:31:32 +08:00