qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-24 05:26:21 +00:00

Author	SHA1	Message	Date
pomelo-nwu	f4e01a409e	fix(auth): address PR #4287 review (critical + suggestion) vscode AuthMessageHandler (Critical): - Add the missing protocol-selection step so custom-provider users can pick Anthropic/Gemini instead of being silently locked to OpenAI. - Validate free-form base URL with the same /^https?:\/\// check the CLI uses; reject file:/javascript: schemes. vscode AuthMessageHandler (Suggestion): - Stop filtering separator entries from the provider QuickPick so groups (Alibaba Cloud / Third Party / Custom) actually show as headers instead of a flat list. - Treat a null authInteractiveHandler as an error: surface an authError + cancellation notification instead of silently dropping the user's input. - Call notifyAuthCancelled when validateApiKey rejects so the webview state resets and the user can retry. core/providers/presets/openrouter.ts (Critical): - Replace the substring includes() in ownsModel with a URL-hostname match so paths like https://api.example.com/openrouter.ai/v1 stop being misidentified as OpenRouter models (and getting removed on re-install). vscode/services/settingsWriter.ts (Critical): - stripTrailingCommas() so JSONC files with trailing commas (VSCode's default style) parse instead of silently returning {} and then overwriting the entire settings file. - readSettings() distinguishes ENOENT (return {}) from parse errors (log + rethrow) so a malformed file never gets clobbered. - writeSettings() writes through a temp file + fs.renameSync atomic rename, eliminating the half-written file window on EACCES / disk-full / crash. - setValue() refuses to overwrite a scalar at an intermediate path segment (would have silently destroyed e.g. {"env": "legacy-string"}). core/providers/install.ts (Suggestion): - Move settings.backup?.() inside the try block so a backup failure still triggers the env-rollback path in catch. cli/config/loadedSettingsAdapter.ts (Suggestion): - Add the same UNSAFE_KEY_PARTS guard the vscode adapter has, so __proto__/constructor/prototype segments are rejected before reaching the underlying setNestedPropertySafe walker. Defense in depth: not exploitable today but the utility has no built-in guard. vscode/webview/providers/WebViewProvider.ts (Suggestion): - Hoist buildInstallPlan / applyProviderInstallPlanToFile to static imports (both modules already top-level imported); drops two per-call await import() round-trips. cli/utils/doctorChecks.ts (Suggestion): - Whitespace nit before the comma in the qwen-code-core import. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-19 13:58:25 +08:00
pomelo-nwu	18b35b48ec	i18n(cli): translate "Connect an LLM provider" in all locales Strict-parity locales (zh, zh-TW) require every built-in command description to be translated; the renamed /auth description was falling back to English and breaking the must-translate test. Add translations for zh / zh-TW (required) and refresh the other seven locales (en, ru, de, ja, fr, ca, pt) so the old "Configure authentication information for login" key is removed everywhere rather than left as a dangling dictionary entry. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 22:36:09 +08:00
pomelo-nwu	1b0b7f93a8	refactor(cli): rename /auth description to "Connect an LLM provider" The old description ("Configure authentication information for login") implied a Qwen-account login. After the /auth refactor it's really about picking an LLM provider and entering credentials, so the menu entry should say that. Also add 'connect' as an alt-name alongside the existing 'login' so users can type /connect when 'auth' feels wrong. Keep 'login' for muscle-memory compatibility. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 22:12:36 +08:00
pomelo-nwu	ed08f7296b	fix(auth): show base URL default as placeholder, not prefilled value In Custom Provider Step 2/6 (and on protocol switch), the base URL input started with the protocol's default URL pre-filled. Users who wanted a non-default endpoint had to manually clear the field first. Switch to placeholder semantics: the input starts empty, the default URL is shown as a hint, and submitting blank falls back to that default (then writes it back to baseUrl so downstream steps see a real value). Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 21:56:28 +08:00
pomelo-nwu	0d8fe738f5	fix(auth): default Audio modality to off in provider advanced config In the /auth Custom Provider advanced-config step, "Enable modality" should default to Image + Video only. Audio was on by default, which implied the model accepts audio input even though most providers people configure here don't. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 21:49:31 +08:00
pomelo-nwu	efcfce5b6b	Merge remote-tracking branch 'origin/main' into refactor/unify-provider-config-to-core # Conflicts: # packages/cli/src/ui/components/ManageModelsDialog.test.tsx # packages/cli/src/ui/components/ManageModelsDialog.tsx # packages/cli/src/utils/apiPreconnect.ts # packages/core/src/providers/all-providers.ts	2026-05-18 20:49:31 +08:00
pomelo-nwu	6804a18f01	refactor(auth): unify applyProviderInstallPlan in core, drop cli/auth CLI and vscode now share core's applyProviderInstallPlan instead of keeping two parallel implementations. The CLI-only env rollback (snapshot process.env, restore on error) is folded into the core version so vscode also benefits from it. CLI ships a LoadedSettingsAdapter that maps LoadedSettings to core's ProviderSettingsAdapter contract. Backup/restore is layered: write a .orig file, structuredClone settings + originalSettings, then recomputeMerged() on restore — same guarantees as before, just routed through the adapter. Tests for the install logic are migrated to core and rewritten against the adapter mock (more focused than the previous LoadedSettings/Config mocks). packages/cli/src/auth/ is gone entirely. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 18:14:12 +08:00
pomelo-nwu	423e453069	refactor(cli): drop OpenRouter OAuth + /manage-models, simplify /auth OpenRouter now uses the standard API-key flow under "Third-party Providers" (issue #4108). The whole OpenRouter OAuth implementation (PKCE, callback server, model auto-install) and the /manage-models command (only OpenRouter was wired in; /auth Step 2 already covers model selection) are removed. /auth is renamed around the "Connect a Provider" mental model: - Dialog title is now "Connect a Provider"; the OAuth main entry is gone - handleAuthSelect (mixed close + auth trigger) is split into a single-purpose closeAuthDialog; legacy wrappers (handleSubscriptionPlanSubmit, handleApiKeyProviderSubmit, handleCustomApiKeySubmit, ...) are dropped in favor of the unified handleProviderSubmit Core: openRouterProvider switches to authMethod='input', uiGroup='third-party', ships with two recommended free models, and is reordered to the end of the third-party list to keep DeepSeek as the default highlight. Net diff: 34 files, +124 / -3835. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 17:54:40 +08:00
jinye	33b2e0dccc	fix(serve): normalize Windows path separators in workspace file read responses (#4279 ) `workspaceRelative` returned the platform-native separator from `path.relative`, leaking backslashes into `/file`, `/stat`, `/list`, and `/glob` response paths on Windows. Surfaced as a Windows-only CI failure in the `GET /glob > scopes glob matches to cwd` test (`['sub\\inside.ts']` vs expected `['sub/inside.ts']`). Always emit POSIX-style separators so SDK consumers see the same shape across platforms. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)	2026-05-18 17:53:14 +08:00
jifeng	b208f34455	feat(serve): add /demo debug page for qwen serve daemon (#4132 ) * feat(serve): add /demo debug page for qwen serve daemon Add a self-contained HTML debug page at GET /demo that provides a browser-based UI for exercising all daemon routes: session create/attach, prompt send/cancel, SSE event streaming, model switching, permission voting, and health/capabilities checks. Also add a same-origin exemption middleware (before the CORS deny layer) so browser fetch calls from the demo page pass through while external Origins remain blocked. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(serve): address CR feedback for /demo page security and robustness - Fix XSS: build permission buttons with DOM APIs instead of innerHTML - Fix SSE: move currentEvent outside read loop for cross-chunk frames - Fix SSE: handle stream end (flush trailing buffer, update UI status) - Security: move /demo route behind hostAllowlist and bearerAuth guards - Security: add host.docker.internal to same-origin Origin allowlist - Add Auth Token input and include Authorization header in API/SSE calls - Add try/catch to /demo route handler with writeStderrLine logging - Check API result before removing permission card from UI - Add 7 tests for /demo route and Origin-stripping middleware * fix(serve): move /demo before bearerAuth so browsers can reach it Browsers cannot attach Authorization headers on address-bar navigation, so /demo behind bearerAuth was unreachable when --token was set. Move the /demo route after CORS + Host allowlist but before bearerAuth. The static HTML shell contains no secrets; all API/SSE routes remain bearer-protected and the in-page token input authenticates them. * feat(serve): show 401 token hint on demo page When an API call returns 401 Unauthorized, highlight the Auth Token input field with a yellow border and display a hint message guiding the user to enter their bearer token. Applies to both API calls and SSE connections. The hint auto-dismisses after 6 seconds. * fix(serve): address round-2 CR feedback for /demo security - Loopback-gate /demo like /health: pre-auth on loopback, post-auth on non-loopback. Prevents unauthenticated access on public interfaces. - Add X-Frame-Options: DENY + CSP frame-ancestors to /demo response to prevent clickjacking via cross-origin iframe embedding. - Cache selfOrigins Set in Origin-stripping middleware (rebuild only when port changes) instead of allocating per-request. - Clear pendingPerms + reset currentAssistantBubble on session switch to prevent stale permission cards from the previous session. - Update tests: loopback vs non-loopback /demo auth, anti-clickjacking headers, rename CORS test, add localhost and [::1] origin coverage. * fix(serve/demo): address CR round-2 feedback for demo page UX - Replace innerHTML with DOM APIs in addLog() to prevent XSS - Add MAX_LOG_ENTRIES=500 pruning to prevent unbounded DOM growth - Add concurrent-send guard (promptInFlight) to prevent double submits - Show error feedback in Chat tab when sendPrompt or createSession fails - Disable prompt controls on SSE failure (both catch and non-OK paths) - Add toolCall context detail (command/input/path/diff) to permission cards --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-18 16:31:03 +08:00
jinye	52d2850c7f	feat(serve): safe workspace file read routes (#4175 PR 19) (#4269 ) * refactor(serve/fs): glob audit hashes workspace + emits pattern Closes PR #4250 follow-up #4. Hashing the per-call cwd for glob audit produced a different pathHash for every subdirectory glob without giving operators any actionable difference (raw paths are privacy-gated). Replace the hash basis with the bound workspace itself and surface the literal pattern on a new schema field, so every glob row carries a stable workspace marker and a per-call pattern. The pattern field also fires on parse_error denials (path-escape patterns, non-relative patterns) so audit consumers debugging a production glob rejection can see the exact rejected pattern without needing QWEN_AUDIT_RAW_PATHS=1. * feat(serve): safe workspace file read routes (#4175 PR 19) Add four read-only HTTP routes that consume PR 18's per-request WorkspaceFileSystem boundary: - GET /file?path=... text content + meta (encoding/BOM/lineEnding) - GET /list?path=... directory entries (name/kind/ignored) - GET /glob?pattern=... workspace-relative match paths - GET /stat?path=... file/directory metadata The routes share one error envelope (sendFsError) that maps FsError.kind through the boundary's existing DEFAULT_STATUS_BY_KIND table to a typed JSON response. All four 200 responses set Cache-Control: no-store and X-Content-Type-Options: nosniff so a browser-adjacent client cannot cache or sniff source content. Routes are advertised under a single workspace_file_read capability tag — the four endpoints share the same backing boundary and the same failure shape, so per-route tags would force four simultaneous registry entries with no operator-meaningful difference between them. Mutating routes will ship in PR 20 under their own workspace_file_write tag. Trust gate is unchanged: read intents pass on untrusted workspaces per PR 18's policy.ts. Auth follows the global bearer flow only; read routes never run mutate(), since none of them mutate state. * feat(serve): runQwenServe injects fsFactory + emit pipeline Closes PR #4250 follow-up #2. runQwenServe now constructs a WorkspaceFileSystem factory from the bound workspace, threads its emit hook through to the read routes, and exposes the trust snapshot via deps.trustedWorkspace. Test additions pin the wiring contract: - audit events emitted on success / denial flow back through the test-supplied fsAuditEmit hook - deps.fsFactory override is honored (built-in default does not silently shadow injection) - trust snapshot defaults to true (operator-chosen workspace) - trust=false routes through to the boundary and trips untrusted_workspace on write intents Default emit stays a stderr warning so a wiring regression that drops events remains visible. PR 21's SSE fan-out will replace the default with a workspace-scoped event channel. * fixup(serve): address PR #4269 round-1 review feedback Closes 8 findings from Copilot inline review + Codex review on PR #4269 (5 P0, 3 P1): P0 (correctness / privacy / operations) - runQwenServe.ts: throttle the default fsAuditEmit by reusing the exported `createDefaultFsAuditEmit` from server.ts. The earlier per-event `writeStderrLine` would print one line for every /file/list/glob/stat audit event under normal traffic. Now warns once + every 100th drop with payload context, so a wiring regression is still visible without flooding logs. (Copilot runQwenServe.ts:316; Codex runQwenServe.ts:305) - routes/workspaceFileRead.ts: probe glob with maxResults+1 and trim, so `truncated` reflects whether the boundary actually had more matches. Earlier `length === maxResults` heuristic false-positived when the workspace happened to hold exactly N matches. (Copilot workspaceFileRead.ts:399) - routes/workspaceFileRead.ts: glob `relMatches` now flows through the shared `workspaceRelative` helper. Root match (`pattern=.`) renders as "." rather than the empty string `path.relative` returns; helper also covers the boundWorkspace-undefined edge case so the route no longer carries its own fallback branch. (Copilot workspaceFileRead.ts:388; review summary HIGH-1) - fs/audit.ts: `pattern` field now rides on the same privacy gate as `relPath` / `message`. Glob patterns commonly carry workspace-relative or absolute path fragments (`src/secrets/.env`, rejected `/Users/alice/ws/`), so emitting them in privacy mode bypassed the same redaction the other path-bearing fields honor. Operators wanting full forensic context opt in via QWEN_AUDIT_RAW_PATHS=1. (Codex audit.ts:249) - routes/workspaceFileRead.ts: cwd resolves with intent='list' rather than 'glob'. The orchestrator's `recordAndWrap` auto-derives `data.pattern` from `intent === 'glob'`, which turned cwd-resolution failures into rows where the cwd string masqueraded as the glob pattern (`?cwd=../outside` → `pattern: ../outside` in audit). Switching to 'list' is the correct semantic shape (cwd is a directory we intend to walk) with identical trust + path-resolution behavior. (Codex workspaceFileSystem.ts:941) P1 (cosmetic / comment accuracy) - server.test.ts: `honors deps.fsFactory override` test comment rewritten to match the actual failure mode (a regression would 404 on a.txt, not 200 against package.json). (Copilot server.test.ts:3219) - routes/workspaceFileRead.ts: `limit` error message uses the MAX_LIST_ENTRIES constant instead of the literal 2000. (review summary MEDIUM) - fs/audit.ts: expanded the JSDoc explaining why the AuditPublisher request types Omit four fields and pass `pattern` through. (review summary MEDIUM) Test additions / adjustments - audit.test.ts: split the existing pattern tests into raw-paths and privacy-default cases; added two new privacy-mode assertions that strip pattern under default config. - workspaceFileSystem.test.ts: harness accepts `includeRawPaths`; glob audit suite runs with raw paths to observe `pattern`; new `glob audit privacy default` suite asserts pattern + relPath are stripped without the env opt-in. - workspaceFileRead.test.ts: new GET /glob cases for the truncated edge case (count == maxResults → false; count > maxResults → true) and root-match normalization. Not adopted (with rationale) - review summary HIGH-2 (glob pathHash uses boundWorkspace): this is the deliberate follow-up #4 contract from PR 18; pattern is the per-call signal, pathHash is the workspace marker. - review summary MEDIUM-1 (parseIntInRange three-state return): matches `parseMaxQueuedQuery` in server.ts; consistency wins. - review summary LOW-1/2/3 (capabilities comment length, CSP header, reverse truncated:false assertion): rationale already documented in code, CSP belongs in a hardening PR, the reverse assertion already exists. 518/518 serve tests pass; typecheck + eslint clean within src/serve/. fix(serve): address workspace file read review Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(serve): tighten workspace file read review followups Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 16:17:14 +08:00
易良	8f9b979ec0	test: reduce wait-dependent UI test delays (#3987 ) * test: reduce wait-dependent UI test delays * test: address UI test review feedback * test: address remaining UI test review feedback - Replace runPendingRetryTimer's polling loop with vi.advanceTimersToNextTimerAsync to decouple from connect()'s internal async structure. - Document that newSessionWithRetry's 300ms auth-delay timer is unreachable under the current mocks so future tests advance it via the same helper instead of restoring shouldAdvanceTime. - Stop swallowing runPendingRetryTimer errors behind connectPromise.catch(e => e); attach a noop catch only to mute the unhandled-rejection guard and assert directly on the original promise. - Comment why flush() yields twice (Ink 7 + React 19 split a render across two microtask ticks) so it is not "optimized" to a single yield. - Move the trailing unmount() into the finally block in the shell-history placeholder test so an assertion failure does not leak the Ink instance.	2026-05-18 15:34:34 +08:00
jinye	103090669e	feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) (#4249 ) * feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) Adds the first Wave 4 mutation route surface: workspace-scoped memory and subagent CRUD over HTTP. Remote clients (TUI / channels / web / IDE adapters) can now list, read, create, update, and delete subagent definitions and read / append / replace QWEN.md without disturbing session state. Routes: - GET /workspace/memory (read-only snapshot) - POST /workspace/memory (append/replace, strict-gated) - GET /workspace/agents (list project + user + builtin) - POST /workspace/agents (create-only; 409 on collision) - GET /workspace/agents/:agentType (full detail incl. systemPrompt) - POST /workspace/agents/:agentType (update; 403 read-only on builtin) - DELETE /workspace/agents/:agentType (idempotent for SDK callers) Mutation paths use mutate({ strict: true }) from PR 15 so they refuse unauthenticated requests even on no-token loopback defaults. Workspace mutations validate X-Qwen-Client-Id against bridge.knownClientIds() and stamp originatorClientId on emitted events. Capability tags added: workspace_memory, workspace_agents. New typed events fanned out via bridge.publishWorkspaceEvent (best- effort to every active session bus; read-after-write is the contract): - memory_changed { scope, filePath, mode, bytesWritten } - agent_changed { change, name, level } writeContextFile.ts is the new core helper that resolves QWEN.md placement (workspace vs ~/.qwen) and append-vs-replace semantics. Whitespace-only appends short-circuit before fs.writeFile, so a no-op POST does not bump mtime or fan out a misleading event. SubagentManager is wrapped with a CRUD-scoped Config stub via Proxy: only getSdkMode / getProjectRoot / getActiveExtensions are stubbed (verified against subagent-manager.ts; getToolRegistry is execution- path only). Any future Config method touched on a CRUD path throws immediately so dependency creep is visible. Auto-memory CRUD, persistent audit log, and the EACCES → NOT_FOUND unlink mapping in core SubagentManager.deleteSubagent are explicit follow-ups (PR 16.5 / PR 24 / separate fix). Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 348/348, writeContextFile 10/10, SDK 335/335 - eslint: clean * fix(serve): address Codex P2 review on PR 16 (#4175 Wave 4 PR 16 follow-up) Three correctness issues Codex flagged on the just-shipped workspace memory + agents CRUD surface: 1. Concurrent POST /workspace/memory append no longer loses writes. Two simultaneous appends would each read the same existing file, compose new content in JS memory, then race the fs.writeFile — the later write silently overwrote the earlier appended entry. Add a per-resolved-path Mutex map (mirroring jsonl-utils.ts's fileLocks pattern) and wrap the entire read-compose-write sequence in runExclusive. 2. GET /workspace/agents now reflects out-of-band file changes. SubagentManager.listSubagents() default served the in-memory cache; developer / IDE adapter edits to .qwen/agents/.md never appeared even though GET /workspace/agents/:agentType always reads disk. Pass { force: true } so the LIST route walks disk every call, matching the detail route's "filesystem is the source of truth" contract. 3. Reject builtin agent names on POST /workspace/agents to prevent undeleteable shadow files. A client could write a project-level agent named "general-purpose" — list/load resolved the shadow first, but SubagentManager.deleteSubagent's name-based builtin guard (subagent-manager.ts:302) rejected DELETE forever. Add a BuiltinAgentRegistry.isBuiltinAgent check in parseAgentConfig so the conflict surfaces at create time instead of trapping the file beyond the API. The check is case-insensitive, matching the resolver's case-insensitive cascade. New tests: - writeContextFile.test.ts: 10 parallel appends, all 10 entries must survive in the final file (would fail without the mutex). - workspaceAgents.test.ts: GET /workspace/agents observes a freshly-written agent file on the second call (force-refresh proof); POST with name="general-purpose" returns 422 + the case-insensitive variant "explore" too. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 351/351 (was 348, +3 new), writeContextFile 11/11 - eslint: clean fix(serve): apply round-1 review fold-in 2a (HIGH + CodeQL) on PR 16 Round-1 inline review (#4249) flagged ~28 items across Copilot, wenshao, and CodeQL. This commit lands the HIGH-severity correctness fixes plus the two CodeQL polynomial-regex warnings. Validation tighten — `parseAgentConfig` + `parseAgentUpdates`: - Trim leading/trailing whitespace on `name` before passing to SubagentManager. `" tester "` would otherwise create a frontmatter name with spaces that case-insensitive lookups can never find. - Fail-closed (422 invalid_config) on present-but-wrong-type optional scalars: `model`, `color`, `approvalMode`, `background`. Previously malformed values silently dropped through validation, masking client-serialization bugs. - Validate `approvalMode` against the `APPROVAL_MODES` enum on both create and update; an unknown value used to 201 with the field silently omitted from the saved file. - `runConfig` is now whitelist-sanitized to `{ max_time_minutes, max_turns }` only; unknown keys are dropped, malformed values return 422. Previously the whole input object was persisted verbatim into YAML frontmatter. - `?scope=` query is fail-closed for repeated values (`?scope=workspace&scope=global`) — Express parses these as arrays which the previous `typeof === 'string'` check silently treated as absent, broadening DELETE/UPDATE semantics from one level to both. - Empty update body returns 400 invalid_config (previously rewrote the file + emitted a misleading `agent_changed` event). - No-op updates (every supplied field already matches `existing`) return 200 + `changed: false` and SKIP the file rewrite + event fan-out. Memory write helper — `writeContextFile.ts`: - Move whitespace-only no-op detection BEFORE `fs.mkdir`. Without this, an empty POST still created the parent directory and bumped its mtime even though `changed: false` was reported. - Replace two polynomial regex patterns flagged by CodeQL (`/^\s+\|\s+$/g` and `/^\n+\|\n+$/g`) with hand-rolled `while` loops. Same pattern auth.ts:120-125 already uses for the same CodeQL rule. SDK — `DaemonClient.ts` + `types.ts`: - `DaemonWriteMemoryResult` gains optional `changed?: boolean` so typed callers can suppress redundant cache invalidation on no-op appends. Optional for forward-compat with daemons that predate the field — undefined treats as "changed: true" (legacy contract). - `deleteWorkspaceAgent` only swallows 404 when the body's `code` is `agent_not_found`. A bare 404 (older daemon, misrouted proxy, generic gateway page) now throws — previously the SDK silently reported success even when the request never reached a route that understands workspace agents. - `updateWorkspaceAgent` adds an optional `scope` parameter mirroring `deleteWorkspaceAgent`, so callers can target the user- level definition when a project-level agent shadows it. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 357/357 + writeContextFile 12/12 = 369/369 passing (was 362; +7 new) - eslint: clean Explicitly NOT applying (out of scope per issue #4175 PR 16 review-resolution policy): - Copilot's "strict gate after body parser" finding — already documented as PR 15 review-resolved tradeoff at auth.ts:256-269. * fix(serve): apply round-1 review fold-in 2b (MEDIUM + tests) on PR 16 MEDIUM hardening: - Fix the JSDoc on `collectWorkspaceMemoryStatus` to match the workspace-root-only discovery the implementation actually does today. The 32-iteration upward walk is reserved for a future hierarchical mode but breaks after iteration 1 in v1. - Lower the depth limit on `walkWorkspaceForMemory` from 32 → 12. Realistic project depth sits well below 8; 12 leaves headroom without amplifying blast radius from symlink cycles. - Daemon `Config` Proxy now defines a `has` trap symmetric to the existing `get` trap. Without it, a future SubagentManager path doing `'someMethod' in this.config` would silently get `false` and bypass the safety net the throw-on-unknown-property design installed. - Preflight `manager.loadSubagent(name, level)` before `manager.createSubagent`. The default-path collision check inside SubagentManager would otherwise miss same-frontmatter-name + different-filename collisions; the preflight makes 409 agent_already_exists deterministic. - Multi-level DELETE now emits one `agent_changed` event per level that actually had a file removed. Previously an unscoped DELETE removing both project and user shadows would publish only one event with one level — misleading subscribers using event metadata for toasts / audit / echo-suppression. Test additions (covers the new event types + bridge fan-out + SDK helpers): - `daemonEvents.test.ts`: predicate narrowing for `memory_changed` / `agent_changed` (rejects malformed scope/mode/level), reducer records `lastWorkspaceMutation` + `lastWorkspaceMutationType` with latest-event-wins semantics and stays non-terminal. - `httpAcpBridge.test.ts`: `publishWorkspaceEvent` fans out to every active session bus; `knownClientIds()` aggregates clientIds across sessions and the returned set is a snapshot (mutating it does not affect future calls). - `workspaceAgents.test.ts`: success-path test stamping `originatorClientId` on the create / update / delete events for a known client. - `DaemonClient.test.ts`: 7 round-trip tests for the new SDK helpers (workspaceMemory, writeWorkspaceMemory, listWorkspaceAgents, getWorkspaceAgent, createWorkspaceAgent, updateWorkspaceAgent with scope query, deleteWorkspaceAgent: 204 / structured 404 / bare 404 triage). - `writeContextFile.test.ts`: replace the 30ms-mtime test with a `vi.spyOn(fs, 'writeFile')` assertion that the no-op path never invokes writeFile. Deterministic on every filesystem. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 363/363 + writeContextFile 12/12 + SDK 347/347 - eslint: clean Reviewer guide: combined with fold-in 2a (commit `134c43c82`), PR 16's round-1 review feedback is closed except for the explicitly- deferred Copilot finding on "strict gate after body parser" (already documented as PR 15 review-resolved tradeoff at auth.ts:256-269). The DRY refactor wenshao suggested for `resolveOriginatorClientId` is left as a future sweep — it touches multiple Wave 4 routes and should land alongside PR 17/19/20/21 to keep the helper's shape informed by all consumers. * docs(serve): apply round-1 review fold-in 2c (doc/type tightening) on PR 16 Two doc-only fixes that close the last open Copilot threads on PR #4249 — both are JSDoc/tsdoc corrections where the wording promised broader behavior than the implementation actually delivers, so a maintainer or SDK consumer reading the type would form a wrong mental model. 1. `DaemonAgentLevel` (sdk-typescript) and `ServeAgentLevel` (cli serve) keep `'extension'` + `'session'` on the union for forward- compat but the JSDoc now explicitly says the daemon does NOT return either today. The `'extension'` case is gated by the daemon's stub `Config.getActiveExtensions()` returning `[]`; `'session'` is a runtime-only `SubagentManager` cache the CRUD routes don't read. Both arms stay so a future PR exposing either source is not a breaking SDK change. 2. `DaemonClient.workspaceMemory()` tsdoc no longer says "hierarchical" — v1 only discovers files at the bound workspace root + the global `~/.qwen` directory, no parent-directory walk. The 12-iteration upward-walk loop body inside `walkWorkspaceForMemory` is reserved for PR 16.5 hierarchical mode and breaks after iteration 1 today; the SDK doc now states that explicitly so callers don't expect more than they receive. No runtime change. Validation: - typecheck: cli + sdk-typescript clean - vitest: 363/363 serve + 12/12 writeContextFile + SDK unchanged - eslint: clean * fix(serve): apply round-2 review fold-in 2d on PR 16 wenshao round-2 (4 inline comments at 16:51-16:53Z): three real bugs + one performance-tradeoff doc note. 1. `composeAppendedContent` now inserts inside the MEMORY section, not at EOF. Previously a QWEN.md whose `## Qwen Added Memories` block was followed by another `## ...` heading would silently land each new entry past the next heading — moving entries into the wrong section. Walk the memory header forward, find the next `\n## ` heading, and insert just before it. Fall back to the EOF append when the memory section is the last block. 2. `parseAgentUpdates` now matches the create-side trim/empty rule for `description` (rejects whitespace-only) and ensures `systemPrompt` rejects the empty string. Update path used to silently accept `" "` and overwrite the field with blank content — divergent from create which 422s the same payload. 3. `isNoOpUpdate`'s runConfig comparison no longer false-positives on partial updates. Comparing every known runConfig field against `existing` treated absent keys as `undefined` while existing had real values — so `{max_time_minutes: 30}` against `{max_time_minutes: 30, max_turns: 10}` claimed non-no-op and re-emitted `agent_changed`. Fixed to only compare keys actually present in `updates.runConfig`, matching `mergeConfigurations` semantics (existing values preserved when not in updates). 4. JSDoc on the LIST-route `force: true` call now explains the tradeoff (no TTL cache / no fs.watch invalidation): re-introducing caching would re-introduce the stale-list bug Codex P2 #2 fixed, `fs.watch` is platform-fragile, and PR 24's audit/policy layer is the proper home for request rate limiting. Sub-millisecond cost per request on local SSD; revisit if profiling flags it. Tests: - writeContextFile.test.ts: section-boundary insertion + EOF fallback - workspaceAgents.test.ts: whitespace-only description rejected; partial runConfig no-op detection; partial runConfig real change preserves omitted keys via mergeConfigurations Validation: - typecheck: cli + sdk-typescript clean - vitest: 368/368 (was 363, +5 new) - eslint: clean * fix(serve): apply round-3 review fold-in 2e on PR 16 wenshao round-3 (5 inline [Suggestion]s, all real correctness or forward-compat issues; one item carried over from round-2): 1. `parseAgentConfig` rejects whitespace-only `systemPrompt` on create, matching the description field's `trim().length === 0` rule. Pure-whitespace prompts collapse to nothing on YAML serialization and the agent can't operate without instructions — 422 at the boundary is friendlier than the downstream "agent does nothing" failure. 2. `parseAgentUpdates` mirrors the same `trim()` check on the update path so `{systemPrompt: " "}` returns 422 rather than silently blanking the field. 3. `POST /workspace/memory` `file_error` 500 response now carries `scope`, `mode`, optional `osCode` (`EACCES`/`EROFS`/`ENOSPC`/...) and a redacted `errorMessage`. Previous shape was just `{error, code: 'file_error'}` — callers had nothing to branch on. 4. `composeAppendedContent` runs `fs.stat` before `fs.readFile` and refuses with a typed `WorkspaceMemoryFileTooLargeError` when the existing file exceeds 16 MB. Without this cap a pathological QWEN.md would be loaded into the daemon heap on every append. The route maps the typed error to a 413 with `code: 'memory_file_too_large'` plus `bytes` / `limit` so callers can decide whether to trim or switch to mode=replace. 5. `toDetail` no longer spreads `config.runConfig` with a cast. Explicit field-by-field pick of `max_time_minutes` / `max_turns` ensures any future `SubagentConfig.runConfig` field requires a deliberate route-schema update rather than silently leaking through the HTTP API. Tests: - workspaceAgents.test.ts: whitespace-only systemPrompt rejected on create AND update; toDetail.runConfig only emits whitelisted keys - existing tests still cover the description-side trim and the partial runConfig no-op detection from fold-in 2d Validation: - typecheck: cli + sdk-typescript clean - vitest: 371/371 (was 368, +3 new) - eslint: clean Reviewer note: response shape on 500 file_error is additive (`scope`/`mode`/`osCode`/`errorMessage` are new fields), so SDK callers that only consumed `{error, code}` keep working. The new 413 `memory_file_too_large` is a new error code SDK consumers can branch on but that pre-PR-16 daemons never emitted, so adding it is also additive. * fix(sdk): expose `changed` on DaemonAgentMutationResult (PR 16 round-4) wenshao round-4 review (single inline at types.ts:434): the agent update route emits `changed: true` for real updates and `changed: false` for no-op short-circuits (introduced in fold-in 2a alongside the no-op detection), but `DaemonAgentMutationResult` in the SDK type still only exposed `{ ok, agent }`. Typed callers of `updateWorkspaceAgent()` couldn't observe the no-op signal even though `DaemonClient` already returns the raw JSON at runtime. Add optional `changed?: boolean` matching the shape introduced for `DaemonWriteMemoryResult.changed` in fold-in 2a. Optional for forward-compat with daemons that predate the field; SDK consumers should treat `undefined` as `true` (the legacy contract — every successful create / update was a write before fold-in 2a's no-op short-circuit landed). Test: - `DaemonClient.test.ts`: round-trip asserts the typed result surfaces `changed: false` from the wire payload. Validation: - typecheck: cli + sdk-typescript clean - vitest: 82/82 in DaemonClient.test.ts (was 81; +1 new) - eslint: clean * fix(serve): apply round-6 review fold-in 2g on PR 16 Round-6 review (gpt-5.5 [Critical] + 5 wenshao [Suggestion]s). [Critical] Per-level delete verification (workspaceAgents.ts): - gpt-5.5 flagged that `SubagentManager.deleteSubagent` swallows per-level `fs.unlink()` failures (subagent-manager.ts:332-336) and returns success as long as ANY level was removed. Trusting that signal would let the route publish `agent_changed`/`deleted` for a file still on disk under EACCES/EBUSY/EPERM — the client UI would drop a still-active definition from cache. - Route now runs `fs.access` on each pre-checked level's file path AFTER `manager.deleteSubagent` returns and partitions into `removed` / `remaining`. Events are emitted ONLY for confirmed removals; if any level still has its file, the route returns 500 `agent_delete_partial` with `removedLevels` + `remainingLevels` so callers can act precisely. - New test installs a 0o555 chmod on the user-level agents directory so `fs.unlink` raises EACCES while the project-level unlink succeeds, asserting both the 500 response and that exactly one `agent_changed` event fired for the level that actually went away. Concurrency consistency (writeContextFile.ts): - Whitespace-only no-op detection now happens INSIDE the per-file mutex's `runExclusive` block. The pre-fix layout did the short-circuit `fs.stat` outside the lock; under concurrent POSTs (one whitespace-only, one with real content) the no-op's `bytesWritten` could lag the post-write reality. Functional behavior was already correct; this aligns the snapshot with the post-write state. Defense-in-depth + DRY (workspaceAgents.ts): - `validateAgentType(req, res)` regex-validates `:agentType` URL parameter at the route boundary against the same `^[\\p{L}\\p{N}_-]+$/u` pattern as `SubagentValidator.validateName`, with a 64-char cap. `findSubagentByNameAtLevel`'s readdir scan already prevented path traversal, but failing fast at the boundary keeps surprising inputs out of downstream code paths. Two new tests cover `..%2Fetc%2Fpasswd` and over-long names. - `parseScopeQuery(req, res)` extracts the duplicated `?scope=` query parser from the POST update + DELETE handlers. Same fail-closed semantics on repeated/non-string values. - `assertMutableLevel(found, agentType, res)` extracts the duplicated `isBuiltin \|\| level === 'builtin' \|\| 'extension' \|\| 'session'` 403 guard. Future Wave 4 mutation routes (PR 17 / 19 / 20) call this helper instead of re-implementing the predicate. Client-id helper consistency (workspaceMemory.ts): - `resolveWorkspaceClientId` removed; the inline branch in the POST handler now mirrors `workspaceAgents.ts:resolveOriginatorClientId` (validate against `bridge.knownClientIds()`, send 400 directly, return so the caller short-circuits). Previously this file threw `InvalidClientIdError` and caught it locally — wenshao round-6 flagged the throw-vs-direct-400 inconsistency between the two files. The deeper full-extraction DRY refactor remains deferred to the cross-Wave-4 sweep with PR 17/19/20/21. Won't-fix doc note (workspaceMemory.ts): - Mount-point JSDoc now explicitly explains why the route returns absolute on-disk paths (success / 413 / GET list): clients pre-flight `caps.workspaceCwd` to learn the bound workspace and can compute relative paths if they want; the global scope's `~/.qwen/QWEN.md` is NOT under the workspace root, so a workspace-relative form would lose information. Path redaction for multi-tenant deployments belongs to PR 24's `--redact-errors` policy work, not a per-route default flip in PR 16. Validation: - typecheck: cli + sdk-typescript clean - vitest: 374/374 (was 371, +3 new) - eslint: clean * fix(serve): apply round-7 review fold-in 2h on PR 16 glm-5.1 round-7: 2 [Critical] + 5 [Suggestion] inline comments. Five applied as code changes; one is a stale-snapshot false positive (workspaceMemory.ts no longer has the InvalidClientIdError call site glm-5.1 referenced — fold-in 2g already replaced it with inline 400); one is rationale-replied (INVALID_CONFIG → 422 mapping suggestion is based on incorrect premise about manager semantics). [Critical] Code-fence-aware section-boundary detection (writeContextFile.ts): - The naive `\n## ` indexOf scan would split user-authored memory entries that quote markdown documentation containing `##` headings inside fenced code blocks. New `findNextTopLevelHeading` helper tracks fence state line-by-line and only accepts matches outside fences. Two new tests: (a) entry containing a fenced `## Request Body` keeps its body intact; (b) real `## post` heading outside fences still acts as the section boundary. [Suggestion] errorMessage + filePath gating (workspaceMemory.ts): - 500 `file_error` and 413 `memory_file_too_large` responses now omit `errorMessage` and `filePath` unless `QWEN_SERVE_DEBUG` is set. Default response carries `error / code / scope / mode / osCode` — enough for SDK callers to branch without leaking absolute filesystem paths. New test asserts both modes round-trip the right shape. [Suggestion] publishWorkspaceEvent visibility (httpAcpBridge.ts): - Catch block now writes to stderr unconditionally during normal operation; only downgrades to the debug channel when `shuttingDown` is true. `EventBus.publish` is documented never to throw, so a hit in normal ops is by definition a regression that must be visible in production logs — silencing via debug-gate could let a true bug succeed at the route layer (200 OK) while SSE subscribers stop receiving events. [Suggestion] Log-injection defense for `agentType` (workspaceAgents.ts): - New `safeLogValue` helper wraps `agentType` interpolations in `JSON.stringify(...).slice(0, 82)` before stderr writes (mirrors `server.ts:1340`). The route's `validateAgentType` regex already rejects names with control chars, but defense-in-depth covers legacy on-disk shadows and future fields. Five `writeStderrLine` call sites updated (GET / POST / DELETE failure, reload-failure, partial-delete, create-reload-failure). [Suggestion] Simplify walkWorkspaceForMemory (workspaceMemory.ts): - Replaced the 12-iteration loop with a straightforward single-pass stat-each-filename. The `seen` Set, `cursor = parent` walk, and filesystem-root guard were dead code (the loop unconditionally broke on first iteration). PR 16.5's hierarchical mode lands as a fresh upward walk rather than re-enabling commented-out code. Validation: - typecheck: cli + sdk-typescript clean - vitest: 377/377 (was 374, +3 new) - eslint: clean Reviewer notes (NOT adopting): - glm-5.1's "InvalidClientIdError('workspace', ...)" message-confusion Critical: stale-snapshot false positive — fold-in 2g already removed `resolveWorkspaceClientId` and inlined a 400 with the correct "registered for this workspace" wording. Only a comment reference remains. - glm-5.1's "INVALID_CONFIG → 422" suggestion: SubagentManager only ever throws INVALID_CONFIG for read-only conditions (built-in / extension / session) — not for malformed config (which uses VALIDATION_ERROR). The current 403 mapping in update + delete is correct for the manager's actual semantics. * fix(serve): apply round-8 review fold-in 2i on PR 16 wenshao round-8: 2 [Critical] path-disclosure + 5 [Suggestion] (name regex, per-field caps, mutex timeout, test gaps, tilde fence). All adopted. [Critical] C1 — 413 `err.message` path disclosure (workspaceMemory.ts): - The 413 `memory_file_too_large` response sent `err.message` unconditionally as the `error` field. The `WorkspaceMemoryFileTooLargeError` constructor embeds the absolute file path in its message ("Existing memory file at /Users/<x>/.qwen/QWEN.md is ..."), bypassing the `debugMode()` gating that already hid the `filePath` field. Same gating now applies to both `error` and `filePath`; default response carries a generic string + structured `code` / `bytes` / `limit` so SDK callers can branch without the path leak. [Critical] C2 — workspaceAgents FILE_ERROR `err.message` (workspaceAgents.ts): - Two catch blocks (create + update) sent `SubagentError(FILE_ERROR)` messages directly in the response. Node fs errors embed paths like "ENOENT: ... '/Users/<x>/.qwen/agents/foo.md'". Both now gate behind `isServeDebugMode()`; default response is the generic "Failed to write workspace agent file" envelope. Shared `isServeDebugMode` helper (debugMode.ts new): - Moved from inlined copies in workspaceMemory.ts to a small shared module so both route files (and future Wave 4 mutation routes) share one canonical predicate. [Suggestion] S1 — POST body `name` validation (workspaceAgents.ts): - `parseAgentConfig` now applies the same regex + length contract as `validateAgentType` (`^[\p{L}\p{N}_-]+$/u`, 2-64 chars). A client posting `name: "my/agent"` or 100-char name now fails at the body-validation boundary with a 422 `invalid_config` instead of bubbling a less-specific `SubagentValidator` error. [Suggestion] S2 — Per-field size caps (workspaceAgents.ts): - `description` / `systemPrompt`: 256 KB each - `tools` / `disallowedTools`: 256 entries, each at most 256 chars Applied on both create + update; matches workspaceMemory's `MAX_MEMORY_CONTENT_BYTES = 1 MB` posture and keeps `GET /workspace/agents` list-response cost bounded. [Suggestion] S3 — Mutex timeout (writeContextFile.ts): - `getFileLock` now wraps each Mutex with `withTimeout(..., 30_000)` so a wedged filesystem (NFS hiccup, OneDrive lock, kernel I/O hang) cannot indefinitely hold the per-file lock. The `E_TIMEOUT` sentinel is caught and re-thrown as a typed `WorkspaceMemoryWriteTimeoutError`; the route maps it to 500 `memory_write_timeout` with `timeoutMs` so SDK callers can branch on stalled-fs without parsing a generic 500. [Suggestion] S4 — Test gaps: - `DELETE /workspace/agents/:id?scope=workspace` happy path: removes only the project shadow, leaves user file on disk, emits exactly one `agent_changed` event with `level: project`. - `POST /workspace/agents/:id?scope=global` happy path: updates user shadow, leaves project file untouched. - 413 `memory_file_too_large`: write a 17 MB QWEN.md externally, POST append fails with the structured 413 payload (`bytes` / `limit`, no `filePath` / no path-embedding error message in default response). [Nice] N1 — Tilde fence support (writeContextFile.ts): - `findNextTopLevelHeading` now toggles fence state on both ``` ` and `~~~` openers (CommonMark allows both). A `## heading` inside a `~~~` fenced block no longer counts as the section boundary. Validation: - typecheck: cli + sdk-typescript clean - vitest: 380/380 (was 377, +3 new) - eslint: clean * fix(serve): apply round-9 review fold-in 2j on PR 16 Two real correctness fixes from wenshao's 2026-05-18 review: 1. resolveContextFilePath now uses getCurrentGeminiMdFilename() so POST /workspace/memory writes to the same file GET surfaces. Without this, a deployment that ran setGeminiMdFilename('AGENTS.md') saw GET list AGENTS.md while POST kept appending to a stale QWEN.md — clients then observed "I just wrote content but it's missing from /workspace/memory". 2. runWrite no-op branch now returns bytesWritten: 0 instead of the existing file's stat.size. The prior value conflated "bytes I wrote" with "current file size"; clients accumulating writes via sum(bytesWritten) added the file size for every whitespace POST. changed: false already signals the no-op; the byte count should match its field name. JSDoc updated on both WriteContextFileResult.bytesWritten and DaemonWriteMemoryResult.bytesWritten so the contract is explicit. New test covers setGeminiMdFilename(AGENTS.md) round-trip; existing no-op test updated for the new bytesWritten semantics. Round-8 thread PRRT_kwDOPB-92c6Cpyap (DRY resolveOriginatorClientId) stays open as the cross-Wave-4 tracking marker. CodeQL "missing rate limiting" alert deferred to PR 24's audit/policy layer (bearer + max-connections + mutation gate provide v1 mitigations). * fix(serve): skip two Windows-incompatible test fixtures on win32 Both tests rely on `fs.chmod(dir, 0o555)` to trigger EACCES on a subsequent write/unlink. Windows ignores Unix-style permission bits passed to `fs.chmod`, so the directory stays writable, the operation succeeds, and the error path the test exercises is unreachable — the test then sees the success status (200 / 204) instead of the expected 500. CI failed on Windows runner only; Ubuntu + macOS pass. Route logic is platform-agnostic — these tests validate that: - `workspaceMemory.test.ts` POST returns the structured 500 envelope (no `errorMessage` / `filePath` leakage outside QWEN_SERVE_DEBUG). - `workspaceAgents.test.ts` DELETE returns 500 `agent_delete_partial` when one level's `fs.unlink` silently fails inside SubagentManager. Both invariants are still covered by the Ubuntu + macOS runs. We can't swap in a `vi.spyOn(fs, 'unlink')` mock for the agents case either — `SubagentManager` does `import * as fs from 'fs/promises'`, creating a sealed ESM namespace object vitest can't redefine. Skip pattern mirrors `customBanner.test.ts:232` (`if (process.platform === 'win32') return;`).	2026-05-18 14:26:59 +08:00
jinye	495d11f016	refactor(serve): add FileSystemService boundary (#4175 Wave 4 PR 18) (#4250 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details * refactor(serve): add FileSystemService boundary (#4175 PR 18) Introduce a per-request workspace filesystem boundary inside the `qwen serve` daemon. The boundary centralizes path canonicalization, symlink-aware boundary checks, ignore/trust policy, size/binary limits, and audit hooks behind a single typed surface — preparing PR 19 (read-only file routes) and PR 20 (write/edit routes) to share a guarded chokepoint instead of re-implementing path safety per route. Wave 4 PR 18 of #4175 — pure refactor, no new HTTP routes; depends on PR 12 (#4241) and PR 15 (#4236), both merged. New module under `packages/cli/src/serve/fs/`: - `paths.ts` extracts `canonicalizeWorkspace` from `httpAcpBridge.ts` (re-exported there for backward compatibility) and adds: - `ResolvedPath` brand and `Intent` union (read/write/edit/list/ glob/stat) with exhaustiveness checks at the trust gate - `hasSuspiciousPathPattern` — detects NTFS ADS, 8.3 short names, long-path prefixes, UNC paths, trailing dots, DOS device names, and three-or-more-dot path components (claude-code-style) - `findExistingAncestor` with explicit ENOTDIR rejection so a regular file in a path component throws `parse_error` rather than passing boundary inspection and 500-ing later - `resolveWithinWorkspace` running a chain-aware realpath check with ENOENT-tolerant ancestor walk for write/stat intents - `errors.ts` defines `FsError` / `FsErrorKind` plus `wrapAsFsError`, which categorizes raw `fs.promises` errnos (EACCES → permission_ denied, ELOOP → symlink_escape, ENOTDIR → parse_error, etc.) so body-level failures emit audit events instead of escaping uncategorized - `policy.ts` carries `MAX_READ_BYTES` (256 KiB), `MAX_WRITE_BYTES` (5 MiB), `BINARY_PROBE_BYTES` (4 KiB), `shouldIgnore` (file/ directory aware), and `assertTrustedForIntent` with an exhaustive switch over `Intent` - `audit.ts` emits typed `fs.access` / `fs.denied` `BridgeEvent` frames with SHA-256-hashed paths, optional raw-path passthrough via `QWEN_AUDIT_RAW_PATHS=1`, and discriminator `kind` fields so SDK consumers can exhaustively narrow `event.data` - `workspaceFileSystem.ts` — `WorkspaceFileSystem` interface + `createWorkspaceFileSystemFactory` with eight methods (resolve, stat, readText, readBytes, list, glob, writeText, edit). Every body method funnels failures through `recordAndWrap`, which wraps raw fs errors and always emits an `fs.denied` audit event before rethrowing. `readText` enforces `MAX_READ_BYTES` before delegating to the slurping core service so unbounded requests against multi-gigabyte files can no longer OOM the daemon. `glob` realpath-checks each hit against the canonical workspace and reports filtered escapes via a single aggregated `fs.denied` event with the dropped count - `index.ts` is the barrel re-export PR 19/20 will import from Modified files: - `packages/cli/src/serve/httpAcpBridge.ts` — extracted `canonicalizeWorkspace` to `fs/paths.ts`; the bridge re-exports it so existing callers in `server.ts` and `runQwenServe.ts` keep working - `packages/cli/src/serve/server.ts` — added `fsFactory?: WorkspaceFileSystemFactory` to `ServeAppDeps`; `createServeApp` builds a strict default (`trusted: false`, warn-once no-op `emit`) when none is injected so a future refactor that forgets `fsFactory` injection cannot silently allow writes against an untrusted workspace; factory parked on `app.locals` for PR 19/20 route handlers - `packages/core/src/index.ts` — re-exports `Ignore`, `loadIgnoreRules`, and `LoadIgnoreRulesOptions` from `utils/filesearch/ignore.js` for cli consumption 411 serve tests pass; typecheck clean. Engineering principles checklist: - [x] Independently mergeable (no new routes, no new capability tag) - [x] Backward compatible (no removed routes / event fields / CLI behavior) - [x] Default off (no public surface change; PR 19/20 will activate routes) - [x] qwen serve Stage 1 routes preserved - [x] Gradual migration (PR 19/20 will adopt the boundary) - [x] Reversible (single PR rollback) - [x] Tests-first (101 unit tests across the new module + contract test) 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): address PR review feedback (#4250) Codex + Copilot review found 8 substantive issues against `7b0db4c3a`; this commit fixes all of them. The first two are P1 build-breakers introduced by the pre-commit `eslint --fix` auto-promoting value imports to `import type` — 31 fs tests were failing post-commit until this fix. Issue list (links to PR comments at #4250): 1. Import type erased runtime values — workspaceFileSystem.ts:10-15. eslint's consistent-type-imports rule rewrote `import { Ignore, StandardFileSystemService, loadIgnoreRules, type WriteTextFileOptions }` -> `import type { ... }` because it saw type-only usage of `Ignore`. That erased `loadIgnoreRules` (called at runtime) and `StandardFileSystemService` (constructed at runtime), causing TS1361/TS2206 and runtime ReferenceErrors. Restored as a value import with inline `type` modifiers per-symbol and an eslint-disable line + comment so future autofixes don't repeat the regression. 2. Same import-type erasure in contract.test.ts:14 — `isFsError` was lumped under `import type` even though it's called at runtime. Same fix shape. 3. edit() OOM hole — workspaceFileSystem.ts. The earlier review pass added a pre-stat MAX_READ_BYTES gate to readText but missed edit, which fsp.readFile's the whole file before any size check. Multi-GB targets inside the workspace could OOM the daemon. Now stat-first; refuses above the cap with file_too_large; also rejects binary files (string indexOf over arbitrary bytes is meaningless). 4. glob accepted absolute / device patterns — workspaceFileSystem.ts. The ..-segment check stopped lexical traversal but /etc/ / C:\\Users\\foo\\ / \\\\server\\share\\ / //server/share/ still reached globAsync, walking outside the workspace before per-hit filtering dropped the results. Now rejects these patterns up-front with parse_error so no I/O happens outside. 5. glob ignore filter probed every hit as `file` — workspaceFileSystem.ts. The underlying `ignore` library needs a trailing-slash probe for dist/ / .git/-style directory patterns; probing as `file` silently leaked directory matches. Now lstats each hit and routes 'directory' vs 'file' to shouldIgnore so dir-only ignore rules actually match. 6. ReadTextOptions.line off-by-one — workspaceFileSystem.ts. The public option was documented as 1-based but forwarded as-is to readFileWithLineAndLimit, which is 0-based. A request with line: 1 returned content starting at the second line. Now converts 1-based -> 0-based at the boundary; doc clarified; truncation check uses the converted index. 7. ServeAppDeps.fsFactory JSDoc said trusted=true — server.ts:96. Stale from before the strict-default refactor in the same review pass. Rewrote to match the actual trusted: false + warn-once emit behavior. 8. MAX_READ_BYTES JSDoc said reads above cap return truncated — policy.ts:18. Stale from before the hard-cap refactor; now correctly states the cap throws file_too_large and that soft truncation only applies under the cap via enforceReadSize. 7 new tests cover the new behaviors: - POSIX-absolute pattern rejection - Win32 / UNC pattern rejection (4 variants) - directory-pattern ignore (dist/) - edit file-too-large - edit binary refusal - readText line: 1 returns from first line - readText line: 2 starts from second line 418/418 serve tests pass; typecheck + eslint clean. Deferred follow-ups (per PR review reply): - glob maxResults is applied after globAsync materializes every match. A streaming iterator (glob.iterate) would bound the walk too. Non-trivial; tracked as a separate hardening follow-up since current behavior is correctness-safe (just not optimal under huge trees). - Per-path glob escape hash in audit hint (currently aggregated count) — can revisit once PR 19 wires the routes and we see real audit volume. - EVENT_SCHEMA_VERSION migration mechanism — orthogonal; the whole BridgeEvent schema lacks one and that's a Wave 5+ concern. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 3 review-round-2 findings (#4250) The wenshao + DeepSeek reviewer pass on `a81ada43f` surfaced 3 more issues; this commit fixes them. 1. Dangling-symlink write escape (paths.ts) — Critical security bug. A request like `write /ws/escape` where `<ws>/escape` is a symlink whose target doesn't exist YET would pass the ENOENT-tolerant ancestor walk: realpath fails ENOENT, the walk-up returns `<ws>` as ancestor, the canonical becomes `<ws>/escape`, containment passes — but the eventual write follows the symlink and creates the file at the symlink target outside the workspace. lstat-then-readlink before the ancestor walk catches this; the symlink target is itself resolved via the deepest existing ancestor so macOS /private/var canonicalization stays consistent with boundCanonical (an absolute target inside the workspace tmpdir would otherwise have been false-flagged on macOS). 2. glob realpath catch over-reported symlink_escape (workspaceFileSystem.ts) — every realpath failure inside the per-hit boundary loop was counted as `symlink_escape`. EIO, EACCES, ENAMETOOLONG, EBUSY are environmental failures, not security events; mislabeling them poisoned the audit signal for operators trying to investigate genuine escape attempts. Now distinguished: ENOENT/ELOOP count as escapes; other errnos count as transient errors and emit a separate aggregated `fs.denied` with errorKind: 'permission_denied'. 3. policy.ts:enforceReadSize JSDoc said the boundary "intentionally does NOT throw" — stale after a81ada43f's hard-cap refactor. Rewrote to clarify the helper is the soft truncation gate that only fires under the hard cap; readText itself enforces the hard cap with file_too_large via its pre-stat check. The readme/contract is now consistent with workspaceFileSystem.ts. 2 new tests: - dangling symlink targeting outside-workspace path → symlink_escape - dangling symlink targeting future-inside-workspace path → succeeds (ahead-of-mkdir flow for atomic-write-via-rename) 420/420 serve tests pass; typecheck clean. Remaining tracked follow-ups (per PR review reply): - list/glob brand cast (P2 deferred per PR description) - glob audit pathHash hashes pattern not paths - edit() TOCTOU read-modify-write race (atomic-via-temp + rename) - wrapAsFsError ENOSPC/EIO mapping to a distinct kind - runQwenServe → fsFactory injection integration test - glob maxResults streaming (glob.iterate) 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): cross-platform ENOTDIR detection (#4250 CI fix) Windows CI test failure on `a81ada43f` surfaced a real cross-platform bug in `findExistingAncestor`. POSIX returns `ENOTDIR` when `fs.stat` traverses through a non-directory in a path component (e.g. `${ws}/file.txt/leaf` where `file.txt` is a regular file). Windows returns `ENOENT` for the same case. The errno-based guard added in `a81ada43f` only branched on `ENOTDIR`, so the Windows path silently fell through to the ancestor walk and the boundary returned a "canonical" the eventual write could not honor — `WorkspaceFileSystem - audit always emits on body errors > rejects ENOTDIR ancestor walk with parse_error rather than passing boundary` failed with `expected false to be true` on the windows-latest runner. Fix: switch from errno-based detection (platform-divergent) to dirent-kind detection. After `fs.stat` succeeds during the walk-up, if the existing ancestor is NOT a directory AND there are unresolved tail components, throw `parse_error`. Both `ENOENT` and `ENOTDIR` from `fs.stat` are now treated as "the current path doesn't resolve, keep walking" — the post-walk kind check fires regardless of which errno surfaced. Cross-platform-safe. The local 110/110 fs tests still pass on macOS/Linux; the Windows case will exercise the kind-check branch on next CI run. macOS CI failures on the same workflow run (`InputPrompt.test.tsx` placeholder reuse, `SettingsDialog.test.tsx` 5s timeout) are pre- existing flaky UI tests, NOT touched by this PR. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 4 review-round-3 findings (#4250) DeepSeek's third review pass (after `b38e82157`) flagged four more issues; this commit fixes all of them. 1. Multi-hop dangling symlink bypass (paths.ts) — Critical security fix. The earlier single-readlink fix at `efd7a4611` was bypassed by chained dangling symlinks: <ws>/leak -> <ws>/middle -> /scratch/evil where every layer is a symlink and the final target doesn't exist. The fix only readlink'd the first hop (<ws>/middle), saw it was inside the workspace via findExistingAncestor, and let the chain through. The OS write at <ws>/leak would then follow both hops and create /scratch/evil. Now loops lstat + readlink up to MAX_ANCESTOR_HOPS, tracks visited inodes for cycle detection, and only validates containment on the fully-dereferenced leaf. Cycle detection rejects with symlink_escape; chains that exceed the hop bound also surface as symlink_escape with a "too long or contains a cycle" hint. 2. opts.line validation (workspaceFileSystem.ts) — the docstring committed to "1-based positive integer" but Infinity / floats / negative values flowed through to readFileWithLineAndLimit and degraded silently. Now enforces Number.isSafeInteger + line >= 1 at the boundary; everything else throws parse_error. Test covers Infinity, -Infinity, 0, -1, 1.5, NaN. 3. io_error FsErrorKind (errors.ts) — wrapAsFsError previously conflated EIO/EBUSY/ENAMETOOLONG/EMFILE/ENFILE/ENOSPC/ETXTBSY under permission_denied. Monitoring pipelines that key on errorKind for security alerting would page oncall on a full disk. New io_error kind (HTTP 503) maps the environmental- failure errnos with distinct hints. EACCES/EPERM stay on permission_denied (literal access denial); EIO (failing disk), EBUSY (busy file), ENAMETOOLONG (PATH_MAX), EMFILE/ENFILE (fd exhaustion), ENOSPC (df -h reporting 100%), ETXTBSY (text-busy) all route to io_error. 4. glob audit kind taxonomy (workspaceFileSystem.ts) — three-way classification mirrors wrapAsFsError so the per-hit realpath catch surfaces ENOENT/ELOOP -> symlink_escape, EACCES/EPERM -> permission_denied, everything else -> io_error. Each class emits its own aggregated fs.denied event. 5. edit() matchedIgnore (workspaceFileSystem.ts) — readText and writeText both stamp matchedIgnore in their access audit; edit didn't, so operators monitoring fs.access events couldn't distinguish edits to .gitignore'd files (build artifacts, logs) from edits to tracked source. Added the same shouldIgnore + matchedIgnore plumbing that readText uses. 8 new tests: - multi-hop dangling symlink (security) - symlink cycle (security) - ENOSPC/EIO/EBUSY/ETXTBSY/ENAMETOOLONG -> io_error mapping - io_error -> HTTP 503 - EMFILE/ENFILE updated to io_error (was permission_denied) - opts.line rejects Infinity/-Infinity/0/-1/1.5/NaN - edit() audit records matchedIgnore on .log file 426/426 serve tests pass; typecheck clean. Remaining tracked follow-ups (per PR review reply): - list/glob brand cast (P2 deferred per PR description) - glob audit pathHash hashes pattern not paths - edit() TOCTOU read-modify-write race (atomic-via-temp + rename) — pinned to PR 20 - runQwenServe → fsFactory injection integration test — pinned to PR 19 - glob maxResults streaming (glob.iterate) 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 3 review-round-4 critical findings (#4250) DeepSeek's fourth review pass surfaced three more critical bugs; all are fixed in-PR. 1. TOCTOU symlink substitution in readText/readBytes/edit (workspaceFileSystem.ts) — Critical. fsp.stat(p) and the subsequent lowFs.readTextFile(p) (or fsp.readFile / fsp.readFile for edit) are two separate syscalls. An attacker who can write into the workspace can swap p from a regular file to a symlink pointing outside between the two calls; pre-stat sees the original, the read follows the swap. Fix: assertInodeStableAfterRead(p, preIno) — after each read, re-lstat p; reject with symlink_escape if the path is now a symlink (post.isSymbolicLink()) or its inode changed (preIno !== post.ino, with a 0-ino fallback for procfs / virtual mounts that don't report meaningful inodes). Catches the swap-and-leave attack and the swap-and-keep-swapped attack. Residual race (attacker swaps back AFTER our read but BEFORE our lstat) is much smaller than the original window and outside PR 18's threat model; fd-based reading via fsp.open + fileHandle.read would close it entirely but requires a new variant of lowFs that takes a FileHandle — tracked as a follow-up. 2. UTF-8 truncation corruption in readText (workspaceFileSystem.ts) — Critical. buf.subarray(0, sizeOutcome.bytesToRead).toString ('utf-8') silently emits U+FFFD when bytesToRead falls mid-codepoint (CJK, emoji). A downstream consumer parsing JSON or source code over the truncated content would see broken trailing bytes; meta.truncated would be true but the prefix is corrupt. A subsequent edit() with the corrupted string as oldText would also fail to match the on-disk content. Fix: safeUtf8Truncate(buf, maxBytes) walks back from maxBytes through any continuation bytes (0b10xxxxxx), then verifies the leading byte's full sequence fits within the cap; drops the leading byte if it doesn't. The result is always a clean prefix at a valid codepoint boundary. Test pins '中文测试' (12 bytes / 3 bytes per char) truncated at 7 bytes -> '中文' (no U+FFFD). 3. glob opts.cwd bypasses workspace boundary (workspaceFileSystem.ts) — Critical. opts.cwd was used directly as the glob root with no validation against boundWorkspace. ResolvedPath is a brand cast and a stale or forged value lets a glob('*/', { cwd: '/etc' }) enumerate files outside the workspace. The pattern-side absolute / UNC checks added in `a81ada43f` only constrain the pattern; cwd is the actual hazard. Fix: at the entry point of glob(), path.resolve cwd and isWithinRoot-check against boundWorkspace. Throws path_outside_workspace if cwd is outside, even when the pattern itself is harmlessly relative. Test pins the case with cwd: scratch (outside workspace). 3 new tests: - readText with mid-operation symlink swap -> symlink_escape - safeUtf8Truncate keeps CJK codepoints intact at 7-byte cap - glob with opts.cwd outside workspace -> path_outside_workspace 429/429 serve tests pass; typecheck + eslint clean. Remaining tracked follow-ups (Post-PR-18 hardening, in #4175): - list/glob brand-cast contract (PR 19) - runQwenServe → fsFactory injection contract test (PR 19) - edit() write-side TOCTOU + atomic-via-temp + expectedHash (PR 20) - glob audit pathHash (independent audit.ts commit) - glob maxResults streaming (independent hardening) - glob pattern preflight refactor to reuse hasSuspiciousPathPattern (cosmetic) 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 7 review-round-5 findings (#4250) DeepSeek round-5 surfaced 9 comments; 7 are real findings (the other 2 — UTF-8 truncation + glob opts.cwd — already fixed in `546834267` from round-4 and replied-to inline). Stack on round-4. 1. edit() empty-oldText silent-prepend (workspaceFileSystem.ts) — Critical silent data corruption. JS `''.indexOf('')` returns 0, so without an empty-string guard `current.slice(0,0) + newText + current.slice(0)` = `newText + current` — silently prepends `newText` to the whole file with a success audit event. PR 20 routes that thread user-supplied `oldText` verbatim must not be able to trigger this. Now throws `parse_error` BEFORE the read with a hint explaining why empty matches are rejected. 2. DOS device name regex misses bare names + first-extension forms (paths.ts) — Windows attack surface. The earlier /\.(CON\|PRN\|AUX\|NUL\|COM[1-9]\|LPT[1-9])$/i only caught the last-extension form (file.CON). NTFS reserves these names regardless of extension: CON, NUL, CON.txt, NUL.dat, CON.foo.bar are ALL reserved device handles. New regex /(^\|\.)(CON\|...\|LPT[1-9])(\.\|$)/i covers all four positions (bare / first-ext / last-ext / middle-ext) while still admitting legitimate substrings (BACON, concat.txt, precon.go). 3. Buffer.byteLength replaces Buffer.from for size-only checks (workspaceFileSystem.ts) — 5 MB heap allocation per write eliminated. `writeText` and `edit()` previously materialized the entire UTF-8 payload (up to MAX_WRITE_BYTES) just to read `.length`; `Buffer.byteLength(content, 'utf-8')` returns the count without allocating. 4. edit() error message includes oldText snippet (workspaceFileSystem.ts) — Production debuggability. The earlier hint was just "edit() expects oldText to appear verbatim in the file" — at 3 AM an operator can't tell whether the mismatch is whitespace, a stale file, or a wrong target. Now includes a JSON-quoted truncated snippet (max 80 chars + ellipsis) of the searched-for text. 5. recordAndWrap forwards FsError message into fs.denied audit (workspaceFileSystem.ts + audit.ts) — Audit observability gap. Audit consumers debugging an incident saw `errorKind` + `hint` but lost the underlying OS error detail (path, errno text, byte count). FsDeniedAuditPayload now carries an optional `message` field; recordAndWrap forwards `fs.message` automatically. 6. EISDIR / ENOTDIR distinct hints (errors.ts) — UX. Both shared the same hint "a path component is not a directory where one was expected" — for EISDIR (path IS a directory but a file was expected) the wording was reversed. Now distinct hints with the errno name explicitly cited. 7. kindFromStats / kindFromDirent merged into kindFromStatLike (workspaceFileSystem.ts) — duplicate function bodies removed. Both fs.Stats and fs.Dirent expose the same isFile / isDirectory / isSymbolicLink interface, and both targets (FsStat['kind'], FsEntry['kind']) are the same 4-value union. Single helper avoids drift if the union grows. 4 new tests: - bare/multi-ext DOS device names (CON, NUL, CON.txt, CON.foo.bar) + legitimate substrings (BACON, concat, precon, contemplating) - edit() empty oldText -> parse_error + file unchanged - edit() not-found error includes searched snippet in hint - fs.denied audit payload carries FsError message Plus the 2 already-fixed items (UTF-8 boundary, glob opts.cwd) have new test coverage from round-4. 433/433 serve tests pass; typecheck + eslint clean. Stack: `7b0db4c3a` → `a81ada43f` → `efd7a4611` → `b38e82157` → `911cb8e5d` → `546834267` → THIS 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 7 review-round-6 findings (#4250) Round-6 review (wenshao + gpt-5.5) flagged 7 items including two Critical and one privacy regression I introduced in round-5. 1. writeText / edit pre-write symlink guard (workspaceFileSystem.ts) — Critical, two reviewers (wenshao #CsBcq + gpt-5.5 #CsB3M) independently flagged. `atomicWriteFile` (`packages/core/src/utils/atomicFileWrite.ts`) resolves symlinks at write time, so a swap between the boundary's `resolve()` and `lowFs.writeTextFile()` would let the write follow the symlink to outside the workspace. `writeText` had no read phase, so its window was wider than `edit()`'s. New helper `assertNotSymlinkBeforeWrite(p)` lstats the path immediately before each `lowFs.writeTextFile` call; ENOENT is fine (ahead-of-create flow), but an actual symlink throws `symlink_escape`. Used in `writeText` AND `edit()`. Residual race after this guard but before the write completes is the deferred PR 20 atomic-via-temp follow-up. 2. recordAndWrap message field bypassed privacy mode (audit.ts) — Critical privacy regression I introduced at `ebd9e78a1` (round-5). The new `FsDeniedAuditPayload.message` field forwarded `FsError.message` unconditionally — and many throw-sites embed `${p}` (absolute paths) or user-supplied `oldText` snippets into the message. Privacy-mode operators (without `QWEN_AUDIT_RAW_PATHS=1`) saw paths leak through the message field even though they explicitly disabled raw-path logging. Fixed: `message` is now gated behind `includeRawPaths` alongside `relPath`. Privacy mode = no path-bearing fields, period. Operators wanting forensic context opt in via `QWEN_AUDIT_RAW_PATHS=1` and accept both fields together. 3. glob opts.cwd via symlink (workspaceFileSystem.ts) — gpt-5.5 #CsB3P. Textual `path.resolve(cwd) + isWithinRoot` admits `<ws>/link` even when `<ws>/link → /etc` is a symlink to outside; `globAsync` walks `/etc` before the per-hit filter drops results. Switched to `fsp.realpath(path.resolve(cwd))` so the containment check sees the actual walk root. ENOENT on cwd surfaces as `path_not_found`. 4. readText OOM via concurrent file growth (workspaceFileSystem.ts) — wenshao #CsBeE. The pre-stat `MAX_READ_BYTES` gate only sees the size at stat time; a concurrent writer can grow the file before the actual `readFileWithLineAndLimit` slurp. Added post-read `Buffer.byteLength(result.content) > MAX_READ_BYTES` check. The proper fix (fd-based read tying size + read to the same inode) is a hardening follow-up; this byte-length check is the defense-in-depth layer. 5. readBytes maxBytes can widen past MAX_READ_BYTES (policy.ts) — wenshao #CsBj5. `enforceReadBytesSize(st.size, opts.maxBytes)` used the caller-supplied `maxBytes` as the ceiling, replacing rather than clamping `MAX_READ_BYTES`. A future PR 19/20 route forwarding `req.query.maxBytes` could blindly bypass the daemon's 256 KiB safety cap. Now clamps via `Math.min(maxBytes, MAX_READ_BYTES)`. 6. ENOENT_TOLERATING_INTENTS docstring + test (paths.ts) — wenshao #CsBk3. The Intent docstring only mentioned `'write'` tolerating ENOENT; `'stat'` was in the set undocumented. A future maintainer removing `'stat'` thinking it was a copy-paste error would silently change behavior (stat on a concurrently-deleted path would throw `path_not_found` from the resolver instead of letting `fsp.lstat` throw `ENOENT` naturally). Amended docstring to call out `'stat'`'s rationale explicitly + added contract corpus case. 6 new tests: - glob cwd via symlink to outside → path_outside_workspace - writeText with mid-operation symlink swap → symlink_escape + outside file unchanged - edit with mid-operation symlink swap → symlink_escape + outside file unchanged - readBytes opts.maxBytes attempting widening → file_too_large - fs.denied message field absent in privacy mode (default) - fs.denied message field present in raw-paths mode (forensic) - contract corpus: resolve('newdir/leaf', 'stat') succeeds for ENOENT path 439/439 serve tests pass; typecheck + eslint clean. Stack: `7b0db4c3a` → `a81ada43f` → `efd7a4611` → `b38e82157` → `911cb8e5d` → `546834267` → `ebd9e78a1` → THIS 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 5 review-round-7 findings (#4250) glm-5.1 round-7 review surfaced 6 items; 5 are real fixes, 1 was already addressed in `7f4f30d0b` (writeText pre-write symlink guard — reviewer looked at `ebd9e78a1` snapshot ~8h before the round-6 fix). Stack on round-6. 1. edit() bypassed lowFs.readTextFile (workspaceFileSystem.ts) — Critical encoding round-trip corruption. The earlier `fsp.readFile(p, 'utf-8')` included the UTF-8 BOM verbatim in `current`, breaking `oldText` matching even when the user passed the exact source text from a previous read; lost iconv-supported codepage handling (GBK / Big5 / Shift_JIS), so non-UTF-8 files would mojibake into `current` and round-trip-corrupt on write-back; and the subsequent `lowFs.writeTextFile` passed no `_meta`, stripping the BOM and forcing UTF-8 on write-back even when the original was BOM'd or non-UTF-8. Fix: read via `lowFs.readTextFile` AND forward `readResult._meta` into the write-back. Test pins UTF-8 BOM round-trip end-to-end. 2. hasSuspiciousPathPattern multi-digit and POSIX false positive (paths.ts) — `/~\\d/` only matched single-digit; NTFS allocates `~10+` on >9 collisions. POSIX false-positives: editor swap files, backup tools used `~N` legitimately. Fixed to `/~\\d+/` and gated behind `process.platform === 'win32'`, matching the ADS-colon check. 3. canonicalizeWorkspace redundant per-request syscall (paths.ts) — Performance. Factory canonicalizes once at build; every `resolveWithinWorkspace` also ran realpathSync on the same path, blocking the event loop. Added CANONICAL_BOUND_CACHE Map keyed on the input string; steady-state size = 1 per `1 daemon = 1 workspace`. 4. readBytes opts.maxBytes API contract (workspaceFileSystem.ts) — Semantic mismatch. Parameter name promised window-read; impl only used it as a hard reject gate. Now truncates the buffer post-read so `readBytes(p, { maxBytes: 1024 })` on a 200 KB file returns 1 KB. Hard `MAX_READ_BYTES` cap still throws for files above it. 5. glob walks node_modules and .git unnecessarily (workspaceFileSystem.ts) — Performance. Without an `ignore` option, `globAsync` traversed every file under those dirs before our per-hit `shouldIgnore` filter. Now passes `ignore: ['/node_modules/', '/.git/']` to short-circuit traversal. Post-filter via `shouldIgnore` remains authoritative. 5 new tests: - 8.3 short-name regex Windows / POSIX split - readBytes truncates 2048-byte file to 1024 with maxBytes - readBytes throws file_too_large only above hard cap - edit() preserves UTF-8 BOM round-trip - glob prunes node_modules and .git 442/442 serve tests pass; typecheck + eslint clean. Stack: `7b0db4c3a` → `a81ada43f` → `efd7a4611` → `b38e82157` → `911cb8e5d` → `546834267` → `ebd9e78a1` → `7f4f30d0b` → THIS 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 10 review-round-8/9 findings (#4250) Two more review passes (gpt-5.5 + DeepSeek + wenshao) flagged 13 items; 10 are real fixes, 3 are reviewer-stale-snapshot or already-tracked. Stack on round-7. Critical (5): 1. paths.ts symlink-escape hint embedded the symlink target (gpt-5.5) — Privacy regression sibling to round-6 audit `message` gate. `recordDenied` always forwards `hint` into `fs.denied` even with `QWEN_AUDIT_RAW_PATHS` off; the hint `'symlink points to /Users/alice/secret'` leaks the attacker's intended exfiltration path through audit. Hint is now path-free; operators wanting the resolved target enable `QWEN_AUDIT_RAW_PATHS=1` and read it from `relPath` / `message`. 2. paths.ts dangling-symlink chain discarded its verified canonical (DeepSeek) — After the multi-hop walk validated `cursor → canonicalTarget` was inside the workspace, the code fell through to `findExistingAncestor(absolute)`, re-walking from the original input and discarding the verified result. An attacker swapping an intermediate symlink between the verification and the re-walk could produce a different canonical than the one validated. The verified `canonicalTarget` is now captured in `symlinkResolvedCanonical` and used directly; the `findExistingAncestor(absolute)` fallthrough only runs when no symlink was traversed. 3. workspaceFileSystem.ts readBytes missing post-read size check (DeepSeek) — Same TOCTOU shape as `readText`'s round-6 fix. The pre-stat `enforceReadBytesSize` sees the size at stat time; a concurrent appender keeps the same inode but grows the file past the cap before `fsp.readFile` returns. `assertInodeStableAfterRead` catches inode changes but not same-inode growth. Added a post-read `buf.length > MAX_READ_BYTES` check matching `readText`'s defense-in-depth pattern. 4. errors.ts wrapAsFsError default = permission_denied (DeepSeek) — Misclassified non-errno errors (`TypeError`, programmer-error throws, native module exceptions) as security denials, paging security oncall for what should be a developer ticket. New `internal_error` kind (HTTP 500) is the new default; `permission_denied` reserved for actual `EACCES`/`EPERM`. 5. audit.ts AuditContext.sessionId not forwarded to BridgeEvent (DeepSeek) — Multi-session daemons couldn't trace audit events back to the session that triggered them. `originatorClientId` identifies the client, not the session. Added optional `sessionId` field to both `FsAccessAuditPayload` and `FsDeniedAuditPayload`, forwarded from `ctx.sessionId` when present. Improvements (4): 6. workspaceFileSystem.ts glob cwd realpath redundant when cwd === boundWorkspace (wenshao) — `boundWorkspace` is already canonicalized by the factory (`realpathSync.native` at build time), so calling `fsp.realpath` per-request when no `opts.cwd` was supplied is a redundant async syscall. Added a short-circuit. 7. workspaceFileSystem.ts kindFromStatLike JSDoc orphaned (wenshao) — Inserting `assertNotSymlinkBeforeWrite` between the JSDoc and `kindFromStatLike` left the doc floating above the wrong function. IDE hovers showed the wrong description. Moved the doc back to its function. 8. workspaceFileSystem.ts shared mutable Ignore object (DeepSeek) — `createWorkspaceFileSystemFactory` builds one `Ignore` instance and shares it across every `WorkspaceFileSystemImpl` returned by `forRequest()`. `Ignore.add(): this` is a public mutator. A future "per-session ignore rules" feature calling `.add()` from a request handler would silently corrupt all concurrent sessions. `Object.freeze` turns the cross-request mutation into a `TypeError` rather than a silent leak. 9. server.ts createDefaultFsAuditEmit one-shot warned (DeepSeek) — Permanent silent no-op after the first event; only logged the event `type` with no pathHash / errorKind / intent. If PR 19 forgets the real factory injection, every write 403s and audit is silent past the first warning — exactly the regression the warning exists to surface. Periodic warning (every 100th drop) + first-event context (errorKind, intent, pathHash) makes the regression actionable in production logs. Cleanup (1): 10. workspaceFileSystem.ts safeUtf8Truncate dead code (DeepSeek noted as "off-by-one") — The lead-byte seqLen-check block was dead code: `subarray(0, end)` already excludes the leading byte at `end`, so no further adjustment is ever needed. Removed the block; function is now 4 lines and still produces a valid codepoint prefix. Reviewer's suggested fix (`buf[end-1] → buf[end]`) was technically correct but redundant with the subarray cut. Already-fixed (3 reviewer-stale-snapshot, reply + resolve): - writeText pre-write symlink guard — fixed in `7f4f30d0b` - edit() read-modify-write race — already deferred to PR 20 atomic-via-temp follow-up - glob maxResults walk-bound — already follow-up #5 3 new tests + 2 updated: - wrapAsFsError unknown errno → internal_error (default change) - internal_error has HTTP 500 - non-Error throwables → internal_error (not permission_denied) - readBytes post-stat growth → file_too_large - existing wrapAsFsError test updated for new default 445/445 serve tests pass; typecheck + eslint clean. Stack: `7b0db4c3a` → `a81ada43f` → `efd7a4611` → `b38e82157` → `911cb8e5d` → `546834267` → `ebd9e78a1` → `7f4f30d0b` → `1dc9d2290` → THIS 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve/fs): close 3 review-round-10 findings (#4250) Round-10 review (wenshao) flagged 4 items, all marked non-blocking by the reviewer. 3 are landed in-PR; 1 (glob withFileTypes optimization) is moved to issue #4175 follow-ups since the reviewer themselves recommended "test it on a 100K-file workspace after PR 19 lands." 1. canonicalizeWorkspace docstring (paths.ts) — Documentation. The earlier FIXME warned about sync syscall blocking the event loop without mentioning the `CANONICAL_BOUND_CACHE` added in `1dc9d2290` brings steady-state cost to zero. Future reviewers reading the FIXME again would re-flag the already-mitigated concern. Added a paragraph noting cache hit rate = 100% under the `1 daemon = 1 workspace` model so per-request blocking only happens at boot or on fresh workspace values (e.g. tests). 2. enforceReadBytesSize dead `maxBytes` parameter (policy.ts) — Round-7 changed `readBytes` to use post-read truncation for the soft window cap, leaving the function's `maxBytes` parameter unused at the only callsite. The `Math.min(maxBytes, MAX_READ_BYTES)` clamp branch became dead code. Reviewer's option 1: tighten the signature to `enforceReadBytesSize(fileBytes: number): void`. Done — the function is now purely the hard-cap enforcer its name implies; soft-window truncation lives in the orchestrator's `buf.subarray(0, opts.maxBytes)` post-read step where it's visible alongside the cap check it complements. 3. relForAudit cross-drive sentinel (audit.ts) — Windows-only privacy edge case. `path.relative('C:\\ws', 'D:\\evil')` returns `'D:\\evil'` (an absolute path) because Win32 can't express cross-drive relatives. Even when raw-paths mode is ENABLED, the audit `relPath` field would carry the off-drive absolute path, exposing the attacker's drive letter + directory. Added a `path.isAbsolute(rel)` post-check that substitutes a `<cross-drive>` sentinel — audit consumers see the cross-drive case distinctly without leaking the offending path. This was previously P2 deferred in PR 18's description ("Windows cross-drive path.relative"); reviewer's "few lines beats deferring" assessment was right. Tracked as follow-up (not in this commit): 4. glob withFileTypes optimization — Replace per-hit `lstat` (line ~664) with `glob` v10's `withFileTypes: true` so each hit comes back as a `Path` object with `isDirectory()` / `isFile()` / `isSymbolicLink()` already available. Saves N syscalls in large workspaces. Non-trivial restructure (return type changes from `string[]` to `Path[]`). Reviewer themselves marked "[performance / non-blocking]" and said "test it on a 100K-file workspace after PR 19 lands so we know whether it's worth it." Added to issue #4175 body's Post-PR-18 hardening follow-ups. 446/446 serve tests pass; typecheck + eslint clean. Stack: `7b0db4c3a` → `a81ada43f` → `efd7a4611` → `b38e82157` → `911cb8e5d` → `546834267` → `ebd9e78a1` → `7f4f30d0b` → `1dc9d2290` → `a33d459df` → THIS 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)	2026-05-18 13:14:43 +08:00
jinye	96219924a0	feat(serve): MCP client guardrails (#4175 Wave 3 PR 14) (#4247 ) * feat(serve): MCP client guardrails (#4175 Wave 3 PR 14) Adds an in-process MCP client counter, slot-reservation enforcement at all 3 spawn sites (discoverAllMcpTools / discoverAllMcpToolsIncremental / readResource), new `--mcp-client-budget=N` + `--mcp-budget-mode={enforce,warn,off}` CLI flags forwarded to the ACP child via env, and additive `clientCount` / `clientBudget` / `budgetMode` / `budgets[]` fields plus `disabledReason: 'budget'` tagging on `GET /workspace/mcp`. Always-on capability tag `mcp_guardrails` with `modes: ['warn', 'enforce']` so SDK clients can pre-flight refusal semantics. Typed SSE push events (`mcp_budget_warning` / `mcp_child_refused_batch`) intentionally deferred to a small follow-up PR — the snapshot already exposes `budgets[0].status: 'warning'\|'error'` + `refusedCount` so operator visibility isn't blocked. * fixup(serve): address PR 14 review (#4247) findings 1-7 Addresses Codex + Copilot review feedback on #4247. Seven functional and forward-compat fixes; (8) `tcp` transport mapper vs createTransport deferred pending @wenshao direction (separate core/protocol decision). 1. Single-server rediscovery bypass — add `tryReserveSlot` at the top of `discoverMcpToolsForServerInternal`. Pre-fix a server refused at startup could be brought online later via `/mcp reconnect <name>` and exceed the cap in enforce mode. 2. Empty `budgets[]` when mode=off — early `return []` in `buildBudgetCells` when mode is `off`. Protocol docs / SDK types promise empty array; pre-fix emitted a synthetic noisy cell. 3. runQwenServe validation + env leakage — mirror CLI budget validation in `runQwenServe` (the embedded entry point); explicitly delete `QWEN_SERVE_MCP_` env vars when options are undefined so multiple daemons in one process don't leak prior budget config to subsequent ACP children. 4. Disabled-vs-refused precedence + stale refusal log* — config-disable wins over budget refusal in the per-server cell; `removeServer` + `disconnectServer` drop the entry from `lastRefusedServerNames` so operator action immediately clears the budget tag. 5. Incremental remove-before-reserve ordering — process config-removed servers FIRST in `discoverAllMcpToolsIncremental` so freed slots are visible to subsequent `tryReserveSlot` calls. Pre-fix scenario {a,b}→{a,c} with budget=2 wasted a slot. 6. `scope` forward-compat type widening — `'workspace' \| (string & {})` on both `ServeMcpBudgetStatusCell` and `DaemonMcpBudgetStatusCell` so SDK consumers don't break when PR 23 adds `scope: 'pool'` per the documented no-schema-bump contract. 7. Test comment alignment — fix "With budget=1" comment to match `clientBudget: 2` code. Plus 4 new core regression tests covering #1/#2/#4/#5, and 4 new serve tests covering #3 (boot rejection + env cleanup). 237/237 pass across the affected files (36 core mcp-client-manager + 50 acpAgent + 151 serve). * docs(serve): clarify v1 snapshot-based budget warning detection (#4247) Address github-actions review-summary finding (I) on PR #4247: v1 operators have no SSE push event for budget pressure yet (deferred to PR 14b), so the protocol doc should explicitly say how to detect warning / error states from the snapshot. Adds the three-way mapping `budgets[0].status` ↔ live/refused counts. * fixup(serve): address PR 14 review round 2 (#4247 wenshao) Addresses @wenshao review on PR #4247. Three critical safety fixes + four suggestion-level improvements. Critical (zombie slot leaks — would break `enforce` mode for the rest of the daemon's lifetime): - C2: `discoverAllMcpTools` connect() catch now releases reservedSlots + clients entry. Pre-fix one failed connect permanently consumed a budget slot. - C3: `readResource` wraps client.connect() in try/catch; on throw the slot + client entry are cleaned up before re-raising. Tracked `weReservedSlot` so the cleanup only fires for newly-created lazy spawns (reused already-CONNECTED clients are untouched). - (wenshao C1 was the rediscovery-bypass also caught by Codex + Copilot — already addressed in fixup `597f011e6`.) Suggestion: - S4: `readBudgetFromEnv` downgrades `mode='enforce'` → `'off'` when no budget is set, mirroring the CLI + `runQwenServe` invariant. Fail-closed on operator misconfiguration rather than silently bypassing enforcement. - S5: extract duplicated `mcp_budget_decision` telemetry into private `emitBudgetTelemetry(configuredCount)`. - S6: rename `BudgetExhaustedError` constructor param `liveCount` → `reservedCount`. `reservedSlots.size` is what's blocking the new server, not the live CONNECTED count (those differ when a reserved server is disconnected). - S7a: bump accounting-failure log level — `debugLogger.debug` (gated on debug=true) replaced by `process.stderr.write` so production daemons surface slot-leak / type-mismatch failures in journald/docker logs. (S7b — expose `reservedSlots[]` on the wire for slot-leak debugging — deferred as additive; will be in PR 14b alongside the typed events.) + 3 new core regression tests (C2 leak release, C3 lazy-spawn leak release, S4 env enforce-downgrade). 626/626 tests pass across the focused suite; typecheck + lint clean. * fixup(serve): address PR 14 review round 3 (#4247 wenshao second pass) Addresses @wenshao's second review pass on PR #4247 (submitted 15:56Z after round 2 fixup landed). Four code fixes + three doc clarifications. Code: - R3 #5: `readResource` lazy-spawn path now checks `isMcpServerDisabled` BEFORE the budget gate. Pre-existing gap: a server disabled via `mcpServers.<name>.disabled: true` or `/mcp disable <name>` could be resurrected by any resource read. Disabled precedence over budget mirrors the per-server cell logic. - R3 #6: `buildBudgetCells` now receives the post-disabled-filter `refusedCount` so the workspace cell matches the per-server cell precedence. Pre-fix a server disabled after being refused rendered `disabled` on its per-server row but `error: budget_exhausted` on the workspace row. - R3 #7: extract `MCP_BUDGET_WARN_FRACTION = 0.75` constant. Was hardcoded in `acpAgent.buildBudgetCells` AND `commands/serve.ts` stderr breadcrumb (the latter with `Math.ceil` divergence on non-integer multiples). Pre-extract so PR 14b's dual-threshold (0.75 warn + 0.375 rearm) lands in one file. - R3 #1: env-var enforce-without-budget downgrade (already fixed in round 2 `ba3e3febd` S4 — reply-only on the new thread). Docs: - R3 #2: docstring on `mcpTransportOf` now spells out the `tcp` vs `createTransport` divergence + records the deferred decision (PR 14b / future core). Closes the "comment claims X but code does Y" gap. - R3 #3: comments in both `discoverAllMcpTools` catch (release slot — stop() owns lifecycle) AND `discoverMcpToolsForServerInternal` catch (KEEP slot — operator intent + health-monitor retry). Different paths, different contracts, both explicit. - R3 #4: invariant note in `readResource` lookup→reserve sequence documenting the synchronous no-await guarantee that closes the TOCTOU window. + 3 new core regression tests (readResource disabled gate, disabled-wins-over-budget precedence, MCP_BUDGET_WARN_FRACTION pin). 629/629 tests pass; typecheck + lint clean. * fixup(serve): address PR 14 review round 4 (#4247 wenshao second + third pass) Addresses @wenshao's second + third review passes on PR #4247. One critical scope-correction (per-session vs per-workspace) + one zombie leak fix shared across three threads. Critical correction — per-session vs per-workspace (wenshao R3 line 117 docs): - Reality check: `acpAgent.newSessionConfig()` constructs a fresh `Config` + `ToolRegistry` + `McpClientManager` for EVERY ACP session. Each manager independently reads `QWEN_SERVE_MCP_CLIENT_BUDGET` env. So `--mcp-client-budget=10` with 5 sessions caps at 5 × 10 = 50 live MCP clients across the daemon, NOT 10. The "per-workspace" framing in v1 docs was incorrect. - Pragmatic v1 path (not the big refactor): rewrite docs + change `scope: 'workspace'` → `scope: 'session'` so the wire contract reflects reality. Wave 5 PR 23 (shared MCP pool) will introduce a workspace-scoped manager and add `scope: 'workspace'` cells alongside. - Files touched: `status.ts` + `sdk types.ts` (cell `scope` field widened to `'session' \| 'workspace' \| (string & {})` with v1 emitting `'session'`), `acpAgent.buildBudgetCells` (emits `'session'` + new code comment explaining the per-session truth), `docs/users/qwen-serve.md` (CLI flag + budget section relabel + ⚠️ v1 limitation callout), `docs/developers/qwen-serve-protocol.md` (capabilities section + JSON example + paragraph rewrite + per-session detection hint). Zombie leak fix — single weReserved-pattern fix in discoverMcpToolsForServerInternal closes wenshao R3 line 546 + R4 line 639 + R4 line 929: - Same pattern as R2 C3 (`readResource`): track `weReservedSlot = reservation === 'reserved' && this.reservedSlots.has(serverName)` (the set-membership guard distinguishes a real fresh reservation from `off`-mode's no-op return). On connect-failure, release slot + drop client only when `weReservedSlot`; an `'already_held'` reconnect keeps its slot so health-monitor retry doesn't compete for capacity. - Pre-fix a brand-new server connecting via /mcp reconnect / health monitor / incremental's serversToUpdate that failed on connect() would permanently consume a budget slot under enforce mode. - Updated R3's "always keep" doc comment to reflect the new two-mode cleanup (release on fresh + keep on reconnect). - Caught and added a tripwire test for the `off`-mode no-op edge case (`tryReserveSlot` returns `'reserved'` without adding to the set in off mode — without the has-guard, my fix would have broken the pre-existing "should restore health checks after failed server rediscovery" test by deleting the failed client even in unbudgeted operation). + 2 new core regression tests (fresh-reserve connect-failure releases slot, reconnect connect-failure keeps slot). 631/631 focused tests pass; typecheck + lint clean. * fixup(serve): address PR 14 review round 5 (#4247 wenshao fourth pass) Addresses @wenshao's fourth review pass on PR #4247. Two critical zombie-leak / staleness fixes; three reviewer findings deferred or already-addressed (replied + resolved on the threads). Critical fixes: - R5 line 956: `runWithDiscoveryTimeout` timeout handler now releases `reservedSlots.delete(serverName)` and drops the stale `lastRefusedServerNames` entry alongside the existing `clients.delete`. Pre-fix a timed-out server in `enforce` mode permanently held its budget slot; N consecutive timeouts permanently degraded daemon capacity. + regression test. - R5 line 1268-1: `readResource` lazy-spawn path drops the server from `lastRefusedServerNames` when `tryReserveSlot` returns `'reserved'` (a successful late re-reservation). Pre-fix a server refused at discovery but later re-reserved via `readResource` (e.g., after another server freed a slot) kept its stale `disabledReason: 'budget'` tag in the snapshot. + regression test. Reviewer findings deferred / already done (replied + resolved): - R5 line 1268-2 (`no try/catch around connect()` in readResource): stale view — R2 C3 fixup `ba3e3febd` added the try/catch with the weReservedSlot cleanup pattern. - R5 line 1274 (`BudgetExhaustedError.liveCount` semantic mismatch): R2 S6 fixup `ba3e3febd` renamed the param + readonly field to `reservedCount`, exactly matching the proposed semantic. - R5 acpAgent.ts null line (`Math.ceil(0.75 * budget)` for small budgets): proposed fix is semantically a no-op for integer liveCount — `liveCount >= 0.75` and `liveCount >= Math.ceil(0.75) === 1` give identical results when liveCount is an integer. The underlying "small budgets jump ok→error" observation is a real but inherent limitation of percentage-based thresholds at small N; design tradeoff, not implementation bug. 46/46 core tests pass (44 prior + 2 new R5 regression). Typecheck + lint clean. * fixup(serve): address PR 14 review round 6 (#4247 wenshao fifth pass) Addresses @wenshao's fifth review pass on PR #4247. Two critical fixes (one TOCTOU race, one cross-daemon env leak). Critical fixes: - R6 Thread 2 (line 956): remove the duplicate pre-reservation block in `discoverAllMcpToolsIncremental`. The reservation already happens inside `discoverMcpToolsForServerInternal` (R1 fix #1). With both sites reserving, the timeout cleanup raced against the inner connect path — `runWithDiscoveryTimeout`'s timeout handler could release the slot mid-flight while the inner `connect()` later resolved successfully, leaving a CONNECTED client with NO reservation and breaking `enforce`-mode budget enforcement. With pre-reservation removed, the inner call owns the entire reservation lifecycle (reserve → connect → release-on-failure-via-weReservedSlot → cleared-by-timeout-if-fires) at a single site. Refusal behavior is observably identical from outside. - R6 Thread 1 (runQwenServe.ts:216): per-handle env passthrough via new `BridgeOptions.childEnvOverrides` instead of mutating global `process.env`. Pre-fix concurrent embedded `runQwenServe()` handles with different MCP budgets would race on the global env — `defaultSpawnChannelFactory` snapshots `process.env` AT SPAWN TIME, so the last `runQwenServe()` call to set the var would silently win for ALL daemon handles' subsequent ACP child spawns. Wire surface: - `ChannelFactory` signature: `(workspaceCwd, childEnvOverrides?) => Promise<AcpChannel>`. - `BridgeOptions.childEnvOverrides?: Readonly<Record<string, string \| undefined>>` — `undefined` value means "scrub this var from the child env" so an embedded caller can wipe a stale inherited var without touching global state. - `defaultSpawnChannelFactory` merges overrides AFTER `SCRUBBED_CHILD_ENV_KEYS` so the daemon-only secret list still wins (operators can't override the scrub). - `runQwenServe` closes over per-handle overrides; never touches `process.env`. + 3 new regression tests (incremental refusal post-pre-reservation-removal, runQwenServe-doesn't-mutate-process.env, bridge forwards childEnvOverrides to channelFactory with two concurrent bridges asserting isolation). 327/327 focused tests pass; typecheck + lint clean. * fixup(serve): address PR 14 review round 7 (#4247 wenshao sixth pass) Addresses @wenshao's sixth review pass on PR #4247 (glm-5.1 via Qwen Code /review). One critical staleness fix + four real bug fixes + one operator-visibility breadcrumb + one refactor. Critical: - R7 #1 line 612: `discoverMcpToolsForServerInternal` now drops the entry from `lastRefusedServerNames` on successful connect+discover. Pre-fix a previously-refused server that reconnects via `/mcp reconnect` (or health-monitor retry after another server frees capacity) left the snapshot reporting `error / disabledReason: 'budget'` for a CONNECTED, working server until the next discovery pass cleared the per-pass log. Real bugs: - R7 #2 line 528: disabled gate added to `discoverMcpToolsForServerInternal`. Reachable from `/mcp reconnect`, OAuth re-discovery, and health-monitor `reconnectServer` — none of which previously checked `isMcpServerDisabled`. Pre-fix a disabled server could be resurrected through any of these paths, wasting a budget slot and registering tools the operator told us to ignore. Mirrors the bulk-discovery + readResource patterns. Optional-chain on the call to stay defensive against test fixtures missing the method. - R7 #3 line 634: transport leak in the `discoverMcpToolsForServerInternal` connect-failure catch. Pre-fix when `connect()` succeeded (transport established) and `discover()` later threw, the catch deleted the client reference without calling `client.disconnect()`, leaking the stdio child / socket until Node exit. Best-effort `await client.disconnect()` added before the map cleanup. - R7 #4 line 1302: `readResource`'s `weReservedSlot` now uses the same `reservation === 'reserved' && this.reservedSlots.has(serverName)` guard as `discoverMcpToolsForServerInternal`. Distinguishes a real fresh reservation from `off`-mode's no-op return. Maintenance-trap fix; in `off` mode the cleanup branch never fires now. - R7 #5 line 1342: `readResource` re-checks `isMcpServerDisabled` on EVERY call, regardless of whether the client was just lazy-spawned or pre-existing. Pre-fix a server connected pre-disable and then operator-disabled mid-session via settings reload still served resource reads via its existing CONNECTED client until the next incremental discovery pass called `removeServer`. Polish: - R7 #6 line 191: `readBudgetFromEnv` now emits a stderr breadcrumb when env values are invalid (`QWEN_SERVE_MCP_CLIENT_BUDGET=abc`, `QWEN_SERVE_MCP_BUDGET_MODE=foo`). Pre-fix operator typos silently fell through to "no enforcement". Same pattern as the `--require-auth` boot log. - R7 #7 line 464: extracted `dropRefusalEntry` (4 sites) + `refuseAndLog` (3 sites) helpers. Pure refactor, zero behavior change. The `readResource` refusal path now calls `refuseAndLog` before throwing `BudgetExhaustedError` so operators get the same stderr trail as bulk-discovery refusals. + 5 new core regression tests (refusal-cleared-on-success, internal-disabled-gate, discover-throw-disconnects, env-typo-breadcrumb, existing-client-disabled-rejected). 52/52 core tests pass; typecheck + lint clean. * fixup(serve): address PR 14 review round 8 (#4247 wenshao seventh pass) Addresses @wenshao's seventh review pass on PR #4247 (gpt-5.5 + DeepSeek/deepseek-v4-pro via Qwen Code /review). One critical transport leak + three soundness/consistency fixes; one optional clarity refactor explicitly deferred. Critical: - R8 #1 line 532 (4 duplicate threads): bulk-path transport leak. Mirrors the R7 #3 fix but in `discoverAllMcpTools` instead of the per-server path. Pre-fix: when `connect()` succeeded (transport established) and `discover()` later threw, the bulk catch deleted the client reference without calling `client.disconnect()`, leaking the stdio child / WebSocket / HTTP socket for the rest of the daemon's lifetime (`stop()` can't see what we just removed from `this.clients`). Best-effort `await client.disconnect()` added before `clients.delete` + `reservedSlots.delete`. Updated the doc comment that misleadingly claimed `stop()` was the lifecycle owner — true only for slot bookkeeping, not transports. Soundness: - R8 #2 line 431: tighten `readBudgetFromEnv` mode-without-budget downgrade. Originally only `enforce` got downgraded to `off` when no budget was set; `warn` mode without a budget threshold reached `emitBudgetTelemetry` with `clientBudget: undefined`, contradicting the JSDoc invariant `mode !== 'off' ⇒ clientBudget defined`. Now both `enforce` AND `warn` downgrade to `off` when no budget is configured. The invariant comment was also weakened to match the actual `?? 0` defense-in-depth (the new R8 #5 constructor downgrade closes the remaining edge case). - R8 #5 line 302: constructor mirrors the `readBudgetFromEnv` downgrade for the direct `budgetConfig` parameter. All production callers (CLI, `runQwenServe`, env-var fallback) validate upfront, but a future code path that injects `budgetConfig` directly without re-validating would re-introduce the silent fail-open. Defense in depth. - R8 #4 line 1221: distinguish fresh vs `'already_held'` reservations in `runWithDiscoveryTimeout`'s timeout handler. New private `freshReservations: Set<string>` field marked when `weReservedSlot === true` inside `discoverMcpToolsForServerInternal` and cleared in finally / catch / success. Timeout handler now releases the slot ONLY when `freshReservations.has(serverName)` — meaning the slot was freshly reserved by THIS in-flight call. `'already_held'` reconnect timeouts (a previously-healthy server's transient hiccup) keep the slot so health-monitor retry doesn't have to compete for capacity with new servers admitted during the timeout window. Aligns the timeout handler with the connect-failure catch's `weReservedSlot` semantics — closes the asymmetry wenshao R8 #4 caught. Deferred: - R8 #3 line 332 (`tryReserveSlot` `'observed'` return value clarity): optional, non-blocking style improvement that ripples through 3 call sites + many tests for zero behavior change. Worth doing in a focused refactor PR; flagged as deferred polish, not in this fixup. + 3 new core regression tests (bulk discover-throw disconnects, warn-no-budget downgrade, constructor enforce downgrade). 679/679 focused tests pass; typecheck + lint clean. * fixup(serve): address PR 14 review round 9 (#4247 wenshao eighth pass) Addresses @wenshao's eighth review pass on PR #4247 (glm-5.1 via Qwen Code /review). Six actionable findings adopted; two threads explained as not-actionable (one stale-view, one reviewer hallucination). Critical / real bugs: - R9 #2 line 1534: `readResource` lazy-spawn connect-failure catch now does best-effort `await client.disconnect()` BEFORE `clients.delete` + `reservedSlots.delete`. Mirror of R7 #3 (per-server discovery) and R8 #1 (bulk discovery) — closes the same transport-leak class for the third spawn path. Pre-fix: connect() establishing the transport but throwing on a later handshake step would orphan the stdio child / socket. - R9 #6 line 1521: `readResource` lazy `client.connect()` now wraps in `Promise.race` against `discoveryTimeoutFor(serverConfig)` — same per-server timeout the bulk + incremental paths use. Pre-fix a hung MCP server during a resource-read spawn would block forever and permanently consume a budget slot under enforce mode, cascading into total budget exhaustion. `serverConfig` lookup hoisted to the top of `readResource` so both lazy-spawn and existing-client branches use identical timeout behavior. - R9 #8 line 1514: `readResource` lazy spawn now calls `this.startHealthCheck(serverName)` after a successful connect. Pre-fix a lazy-spawned server that later disconnected (crash, network) had no automatic reconnect — sat DISCONNECTED until the next readResource or incremental pass. Mirrors `discoverMcpToolsForServerInternal`'s finally-block pattern. Operator-visibility: - R9 #7 (general): `readBudgetFromEnv` now writes a stderr breadcrumb when the `(enforce\|warn)`-without-budget downgrade fires. Pre-fix a Docker Compose / k8s env that set `QWEN_SERVE_MCP_BUDGET_MODE=enforce` but forgot the matching `_BUDGET=N` would silently boot with enforcement off and `mcp_guardrails` capability advertised — operator only signal was the snapshot's `budgetMode: 'off'`. Now mirrors the R7 #6 invalid-value breadcrumb pattern. Doc fixes: - R9 #4 line 81: `McpBudgetConfig.clientBudget` JSDoc now reflects the R4 per-session scope correction. The doc was a leftover from the original "per-workspace" framing — every other doc surface (protocol doc, user doc, type comments on the snapshot cell, capability tag) was rewritten in R4 except this one. - R9 #5 line 870: `acpAgent.buildBudgetCells` now spells out the `liveCount` (`accounting.total`, CONNECTED only — operator observability) vs `reservedSlots.size` (all reserved including in-flight — enforcement) semantic distinction. The intentional gap was undocumented in the type signatures, JSDoc, and protocol doc; future PR 14b SSE event payloads should reference both. Not adopted: - R9 #1 acpAgent:15: claimed "MCP_BUDGET_WARN_FRACTION not exported + getMcpClient* methods don't exist + 4 tsc errors" — verified incorrect: the constant IS exported (mcp-client-manager.ts:61), the 3 methods ARE class members (lines 379, 407, 412), and `npm run typecheck` is clean across all 4 workspaces. Reviewer's tool hallucinated this critical finding. - R9 #3 mcp:410: reported the bulk-path transport leak that R8 #1 (commit `7228813c5`) had already closed. Reviewer was on the pre-R8 commit view. + 2 new core regression tests (readResource lazy connect-fail disconnects + R9 #7 stderr breadcrumb). 57/57 core tests + 679/679 focused suite pass. Typecheck + lint clean. * fixup(serve): address PR 14 review round 10 (#4247 wenshao ninth pass) Two non-blocking 🟢 nits — both adopted for symmetry / explicitness. - R10 line 357: constructor downgrade now emits the same stderr breadcrumb the env-var path got in R9 #7. Pre-R10 the `(enforce\|warn)`-without-budget downgrade was silent for the direct-`budgetConfig` path, so a future caller bypassing CLI / env-var validation would have shipped a daemon advertising `mcp_guardrails` while silently disabling enforcement. Now boot logs surface the misconfiguration uniformly across all three resolution paths. - R10 line 1572: documented the `McpClient.disconnect()` cancel-pending-connect contract that the timeout-race cleanup relies on across all three spawn paths (lazy `readResource`, bulk `discoverAllMcpTools`, per-server `discoverMcpToolsForServerInternal`). The bulk path's production stability since #3889 is implicit evidence the contract holds; comment makes the assumption discoverable to the next reader and notes a follow-up unit test would be valuable. No behavior change. 57/57 core tests pass. Typecheck + lint clean.	2026-05-18 12:07:23 +08:00
ChiGao	d07c958bb5	feat(tui): add daemon adapter spike (#4202 ) * docs(tui): draft daemon adapter plan * feat(tui): add daemon adapter spike * fix(tui): harden daemon adapter event handling * fix(tui): report daemon prompt failures * fix(tui): surface daemon terminal failures * fix(tui): harden daemon adapter state handling * fix(tui): harden daemon adapter lifecycle * fix(tui): harden daemon adapter follow-ups --------- Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>	2026-05-18 11:22:39 +08:00
jinye	f44ed09412	feat(serve): preflight and env diagnostics routes (#4175 Wave 3 PR 13) (#4251 ) * feat(serve): introduce ServeErrorKind and BridgeTimeoutError (#4175 Wave 3 PR 13) Lay the type foundation for `/workspace/preflight` and `/workspace/env` (and the eventual MCP guardrails route) so cells emitted by all three share a closed `errorKind` taxonomy: - `SERVE_ERROR_KINDS` literal-list + `ServeErrorKind` union — the seven values from #4175 (`missing_binary`, `blocked_egress`, `auth_env_error`, `init_timeout`, `protocol_error`, `missing_file`, `parse_error`). - `BridgeTimeoutError` typed class — `withTimeout` now rejects with this rather than a plain `Error`, letting `mapDomainErrorToErrorKind` recognize init / heartbeat / extMethod timeouts via `instanceof` instead of regex-matching message strings. Message format is preserved bit-for-bit. - `mapDomainErrorToErrorKind` helper — one place to classify `BridgeTimeoutError`, `SkillError`, fs ENOENT/EACCES/EPERM, ModelConfigError subclasses (recognized by `name` field — they aren't on the public surface of `@qwen-code/qwen-code-core`), `SyntaxError`, plus message-regex fallbacks for legacy throw sites (`agent channel closed`, missing CLI entry path). - `ServeStatusCell.errorKind` tightened from open `string` to the closed `ServeErrorKind` union. Backward compatible — PR 12 never assigned the field. - SDK mirrors: `DAEMON_ERROR_KINDS` const + `DaemonErrorKind` type; `DaemonStatusCell.errorKind` tightened. Tests: 11 new unit tests in `status.test.ts` covering each mapping rule plus the BridgeTimeoutError shape. No route changes; no behavior changes for any existing path. * feat(serve): add buildEnvStatusFromProcess helper (#4175 Wave 3 PR 13) Pure helper that constructs the `/workspace/env` payload from `process.` state. No I/O, no ACP roundtrip, no globals beyond `process.env`. The route itself lands in the next commit. - `ServeEnvKind` discriminant: `runtime \| platform \| sandbox \| proxy \| env_var` - `ServeEnvCell extends ServeStatusCell` with `name` + optional `present` / `value`. Cells with `kind: 'env_var'` are presence-only — `value` is ALWAYS omitted to keep secret env vars off the wire even by accident. - `ServeWorkspaceEnvStatus` envelope: `{ v, workspaceCwd, initialized: true, acpChannelLive, cells, errors? }`. `initialized` is structurally `true` because env answers from the daemon process directly; `acpChannelLive` reports whether a child is up but does not change the payload shape. Whitelist policy: - Auth/secret keys (presence-only): OPENAI/ANTHROPIC/GEMINI/GOOGLE/DASHSCOPE/ OPENROUTER `_API_KEY`, `QWEN_SERVER_TOKEN`. - Non-secret keys (also presence-only for shape uniformity): base URLs, locale, TZ, NODE_EXTRA_CA_CERTS, QWEN_CLI_ENTRY. - Proxy vars (`HTTP_PROXY`/`HTTPS_PROXY`/`NO_PROXY`/`ALL_PROXY` + lowercase variants): credentials stripped via `redactProxyCredentials`, then `URL().host` so the wire only carries `host:port`. NO_PROXY is a host list rather than a URL so we pass the redacted form verbatim. SDK mirrors: `DaemonEnvKind`, `DaemonEnvCell`, `DaemonWorkspaceEnvStatus`. Tests: 9 unit tests covering the proxy-credential redaction, lowercase env fallback, NO_PROXY pass-through, presence-only `env_var` invariant (`'value' in cell === false`), whitelist enforcement, runtime tag detection, and envelope shape. feat(serve): add GET /workspace/env route (#4175 Wave 3 PR 13) Wire `buildEnvStatusFromProcess` from the previous commit through the bridge, server, and SDK so remote clients can pre-flight the daemon's runtime environment without spawning an ACP child. - `workspace_env` capability tag (always advertised on a current daemon). - `bridge.getWorkspaceEnvStatus()` answers entirely from `process.` — the route never consults ACP. `acpChannelLive` reflects whether a child exists but does not change the payload, so an idle daemon and a busy one return the same env shape. - `app.get('/workspace/env', ...)` mirrors PR 12's one-liner pattern. - SDK: `DaemonClient.workspaceEnv()` returning `DaemonWorkspaceEnvStatus`. - Docs: bullet in `docs/users/qwen-serve.md` calling out the presence-only redaction policy and the no-ACP-spawn guarantee. Tests: server-level (env returned + `'value' in env_var === false` assertion), bridge-level (idle and live both answer locally without hitting ACP extMethod), SDK-level (recording-fetch round-trip on `/workspace/env`). The `workspace_env` tag is added to the `EXPECTED_STAGE1_FEATURES` capability list assertion. feat(serve): add /workspace/preflight daemon-cells path (#4175 Wave 3 PR 13) Wire the preflight route. Daemon-level cells are populated unconditionally from `process.` and `node:fs`; ACP-level cells fall back to `not_started` placeholders when no child is alive so a poll never spawns one. - `workspace_preflight` capability tag. - `ServePreflightKind` discriminant (12 values: node_version, cli_entry, workspace_dir, ripgrep, git, npm — daemon-level — plus auth, mcp_discovery, skills, providers, tool_registry, egress — ACP-level). - `ServePreflightCell extends ServeStatusCell` with `locality: 'daemon' \| 'acp'` + free-form `detail`. `ServeWorkspacePreflightStatus` envelope. - `createIdleAcpPreflightCells()` factory: emits the six ACP-level cells with `status: 'not_started'` + a uniform `hint` so the bridge can stitch them in alongside daemon cells without ever calling ACP. - `bridge.getWorkspacePreflightStatus()`: - Daemon cells via `buildDaemonPreflightCells` (Promise.all over Node-version, CLI-entry resolution mirroring `defaultSpawnChannelFactory`, `fs.stat` on `boundWorkspace` with ENOENT/EACCES/EPERM mapped to `missing_file`, best-effort `canUseRipgrep` / `getGitVersion` / `getNpmVersion` warnings). - ACP cells via `requestWorkspaceStatus` — idle factory returns the `not_started` placeholders; live path delegates to ACP via the `qwen/status/workspace/preflight` ext method (handler lands in next commit). Bridge-side timeout / channel-close while consulting ACP folds into envelope `errors[]` with `mapDomainErrorToErrorKind` classification; daemon cells still render. - `app.get('/workspace/preflight', ...)` route + JSDoc bullet. - SDK: `DaemonPreflightKind` / `DaemonPreflightCell` / `DaemonWorkspacePreflightStatus` mirrors; `DaemonClient.workspacePreflight()`. Tests: server-level (route returns the bridge payload), bridge-level (idle returns 6 daemon + 6 ACP `not_started` cells without spawning a channel), SDK-level (`workspacePreflight()` round-trip). Capability test updated. feat(serve): wire ACP-side preflight cells (#4175 Wave 3 PR 13) Populate the six ACP-level preflight cells inside the ACP child so `/workspace/preflight` returns real values for live sessions. - `extMethod(qwen/status/workspace/preflight, ...)` dispatches to a new `buildAcpPreflightCells(config)` private method. - Five cell builders, each returning a `ServePreflightCell` with `locality: 'acp'`: - `auth`: `validateAuthMethod(authType, config)` returning non-null string → `auth_env_error`. Missing auth method → warning. Throws classified via `mapDomainErrorToErrorKind` with `auth_env_error` fallback. - `mcp_discovery`: rolls up `getMCPDiscoveryState()` + per-server `getMCPServerStatus(name)` counts. `connecting > 0` or in-progress discovery → warning + `init_timeout`; `disconnected > 0` post-discovery → error + `protocol_error`. - `skills`: `SkillManager.listSkills()`; SkillError throws are mapped via the helper (`PARSE_ERROR` → `parse_error`, `FILE_ERROR` → `missing_file`). - `providers`: `getAllConfiguredModels()`; empty list with a configured `authType` → warning + `auth_env_error`. ModelConfigError throws map to `auth_env_error`. - `tool_registry`: null registry → error + `protocol_error`. Otherwise surfaces tool count. - `egress`: stays `not_started`. PR 14 plugs in the real probe. - `errorCell` private helper extended with optional `errorKind` parameter; defaults to `mapDomainErrorToErrorKind(error)` so existing call sites (`mcp` / `skills` / `providers` envelope errors) automatically gain classification. Tests: 2 new acpAgent tests — preflight returns the six expected ACP cells with correct locality + statuses; preflight surfaces a `SkillError` (`PARSE_ERROR`) on the `skills` cell as `errorKind: 'parse_error'`. The core `vi.mock` block adds a SkillError class for `instanceof` matching inside `mapDomainErrorToErrorKind`. * docs(serve): preflight and env protocol section (#4175 Wave 3 PR 13) Document `/workspace/env` and `/workspace/preflight` end-to-end: - Common-cell shape: tighten `errorKind` from open `string` to the closed `DaemonErrorKind` enum (seven literals from #4175). Add an explicit redaction-policy paragraph covering env-var presence-only, proxy host:port reduction, and the whitelisted-secrets list. - Capability-tag list: add `workspace_env` and `workspace_preflight`. - New `### GET /workspace/env` section with sample payload, `DaemonEnvKind` / `DaemonEnvCell` types, and the redaction-policy paragraph spelling out which secret env vars are enumerated and how proxy URLs are reduced to `host:port`. - New `### GET /workspace/preflight` section with idle sample payload, `DaemonPreflightKind` / `DaemonPreflightCell` types, the seven-value `errorKind` semantics table, and the bridge-error fallback contract (mid-request ACP channel close → cells drop to `not_started` + envelope carries one `errors[]` entry). - Source-layout table: extend the `status.ts` row to mention the new `ServeErrorKind` / `BridgeTimeoutError` / `mapDomainErrorToErrorKind` surface; add a new `envSnapshot.ts` row.	2026-05-18 07:29:05 +08:00
qqqys	f84ddd434b	feat(core): fail impossible goals (#4230 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details * feat(core): fail impossible goals * fix(core): refine impossible goal judgement * fix(core): include goal feedback when continuing * fix(core): clarify impossible goal terminal state * fix(core): harden impossible goal feedback * fix(core): log suppressed impossible verdicts * fix(goal): address review suggestions * test(goal): cover impossible parsing suggestions	2026-05-18 00:31:51 +08:00
Shaojin Wen	c93d66cd23	fix(serve): align build and integration test coverage (#4248 ) * fix(serve): align test coverage with build inputs Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * test(serve): address review feedback Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-18 00:01:47 +08:00
jinye	60fe594e8f	feat(serve): add read-only status routes (#4241 ) * feat(serve): add read-only status routes Add read-only daemon status endpoints for workspace MCP, skills, providers, session context, and session supported commands. Expose matching typed SDK helpers and document the new additive v1 status surface. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(serve): harden read-only status snapshots Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(serve): address read-only status review feedback Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-17 21:37:20 +08:00
jinye	aef35c390e	feat(serve): session metadata and close/delete lifecycle (#4175 Wave 2.5 PR 11) (#4240 ) * feat(serve): session metadata and close/delete lifecycle (#4175 Wave 2.5 PR 11) Add explicit session close and metadata management to the daemon serve infrastructure, closing the Stage 1 limitation that sessions could only end via child crash or daemon shutdown. - DELETE /session/:id — force-closes a live session (cancels active prompt, resolves pending permissions, publishes session_closed event) - PATCH /session/:id/metadata — update mutable displayName - Enriched GET /workspace/:id/sessions with createdAt, displayName, clientCount, hasActivePrompt - session_closed + session_metadata_updated SDK event types with validation, reducer, and terminal event priority - DaemonClient.closeSession / updateSessionMetadata + session client wrappers - Capabilities: session_close, session_metadata * fix(serve): address review feedback on session lifecycle PR - Fix JSDoc on closeSession: clarify that bridge throws SessionNotFoundError (SDK absorbs 404 for client-side idempotency) - Tighten event validators: isSessionClosedData checks closedBy type, isSessionMetadataUpdatedData checks displayName type - PATCH /session/:id/metadata now returns effective stored metadata instead of echoing request fields, avoiding ambiguous no-op responses - Only publish session_metadata_updated event when displayName changes - Update chooseTerminalEvent comment to reflect session_closed * fix: address PR 4240 review feedback Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: address remaining PR 4240 suggestions Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix: update serve sessions test mock Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-17 20:42:15 +08:00
JerryLee	9abd704e09	fix(cli): record mid-turn queued user prompts (#4215 )	2026-05-17 20:21:06 +08:00
jinye	4e06967c2b	feat(serve): mutation gating helper and --require-auth (#4236 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details * feat(serve): mutation gating helper and --require-auth Implements issue #4175 Wave 4 PR 15. Adds the centralized state-changing-route gate that Wave 4 follow-ups (memory CRUD, file edit, MCP restart, device-flow auth) will reuse, plus the `--require-auth` deployment knob that hardens the loopback developer default for shared dev hosts / CI runners. - `createMutationGate({ tokenConfigured, requireAuth })` factory in serve/auth.ts — per-route middleware with a 4-cell behavior matrix: pass-through under `requireAuth` or any token configured; `401 token_required` for `strict: true` routes on no-token loopback defaults; baseline pass-through otherwise. - Existing Wave 1-2 mutation routes (POST /session, /session/:id/{load, resume,prompt,cancel,model}, /permission/:requestId) opt into the default non-strict factory call as the centralization marker. Wave 4 routes will pass `{ strict: true }` to require a token even on loopback. - `--require-auth` CLI flag + `ServeOptions.requireAuth`. Boot refuses without a token; closes the `/health` exemption when on so loopback `/health` also requires bearer auth; stderr breadcrumb so the hardened mode is visible in journald/docker logs. - Conditional `require_auth` capability tag advertised only when the flag is on. New `CONDITIONAL_SERVE_FEATURES` registry primitive so future per-deployment toggles follow the same shape. - 5 new unit tests in auth.test.ts covering the gate matrix; 5 added in server.test.ts for capability advertisement, conditional tag, /health 401 under --require-auth, and runQwenServe boot refusal + happy path. 245/245 serve tests pass; typecheck + eslint clean. Refs: #4175 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4236 review feedback Three small follow-ups from the automated reviewers on PR #4236: 1. Drop misleading `--require-auth` from `token_required` error message (Copilot inline auth.ts:262). The strict-mode 401 listed three remediations but `--require-auth` is paired-required with a token at boot — naming it standalone would loop the operator into a different boot error. Keep the two valid standalone fixes (env var, --token); add inline note explaining the omission. `auth.test.ts` regex updated to `not.toMatch(/--require-auth/)` to anchor the new wording. 2. Mention `/health` gating in `--require-auth` CLI description (auto-reviewer Medium #2). Operators flipping the flag without reading the protocol doc would get paged when k8s/Compose probes start 401-ing. One sentence in the yargs description prevents that. 3. Drift insurance comment between registry and `CONDITIONAL_SERVE_FEATURES` (auto-reviewer Low #3). Document the four-step procedure for adding a new conditional tag so a future contributor doesn't update only the registry and silently advertise the tag unconditionally. Notes the Map<predicate> refactor as the right move when a second tag lands. Deferred (not in this fix-up): - Module-level PASSTHROUGH singleton (High #1) — micro-optimization, unmeasurable. - Map<feature, predicate> for conditional features (High #2) — premature abstraction with one tag. - Per-route `// non-strict marker` comments (Medium #1) — noise. - `@see` cross-ref in types.ts (Low #2) — sugar. - JSDoc bullet-list vs table (Low #1) — current format is fine. Refs: #4175 #4236 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4236 round-2 review feedback Five small follow-ups from @wenshao + DeepSeek (via Qwen Code /review) on PR #4236: 1. Map<predicate> refactor for `CONDITIONAL_SERVE_FEATURES` (review threads #3254467192 + #3254485912). Two reviewers asked for the same shape on the grounds that the `Set` + per-feature `if`-branch needed FOUR coordinated changes per new conditional tag and silently fail-CLOSED when the branch was missed. The Map collapses the predicate-decision and the set-membership into one entry per feature — adding a new conditional tag is now two coordinated changes (registry + Map entry) and a missing predicate is a TypeScript error rather than a silent omission. JSDoc updated. 2. Drift-insurance test that iterates `CONDITIONAL_SERVE_FEATURES` (review thread #3254467192 option 1, layered on top of #1). `server.test.ts` now walks every Map entry and asserts the predicate accepts/rejects as expected; future entries that don't add an assertion branch fail the test loudly so a missing predicate cannot ship silently. Adoption-of-record for the Map shape rather than relying on a hand-maintained invariant. 3. Cache `strictDenier` for allocation symmetry (review thread #3254467193). Wave 4 PRs will mount strict mode on multiple routes; without the cache each `mutate({strict:true})` call would allocate a fresh 401 closure. Now both the passthrough and the strict denier are pre-built singletons. Identity assertion in `auth.test.ts` anchors the cache so a future change that loses it surfaces in CI. 4. Doc cosmetic — extra blank line in qwen-serve.md (review thread #3254467198). Single blank line between the `>` quoted example and the following non-quoted bash block now. 5. Doc correctness — `require_auth` is post-auth confirmation (review thread #3254485910 from DeepSeek). When `--require-auth` is on, the global `bearerAuth` middleware gates every route including `/capabilities`, so an unauthenticated client cannot pre-flight `caps.features` to discover that auth is required — the discovery surface is the 401 response body itself. Both `qwen-serve.md` and `qwen-serve-protocol.md` rewritten to describe the tag as a post-authentication confirmation, matching the auth.ts JSDoc which already stated this correctly. Trade-offs documented (no code change): - Body-parser ordering (review thread #3254485915 from DeepSeek) noted as a comment block in `auth.ts`. Strict-mode 401 fires AFTER `express.json()` because the gate is per-route middleware. On loopback no-token defaults a strict route therefore parses the request body before refusing it — bounded by `express.json({limit: '10mb'})` × `--max-connections` (256 default). Strict routes Wave 4 actually adds carry small bodies in legitimate use, so this isn't a production hot path. Future routes accepting large bodies should lift the gate to app-level (maintain a strict-path Set in `createServeApp`); flagged as a Wave 4 follow-up rather than re-architecting the helper. - `bearerAuth` body-shape inconsistency (review thread #3254467197 from @wenshao) flagged as a Wave 4 cross-PR follow-up. `bearerAuth` returns `{error: 'Unauthorized'}` while the strict gate returns `{code: 'token_required', error: '...'}`; SDK clients have to branch on both shapes. Standardizing `bearerAuth` to also carry a `code` field is orthogonal to this PR's scope. Validation: 260/260 cli serve tests pass (was 258 — added the drift insurance test + strict denier identity test); typecheck + eslint clean. Refs: #4175 #4236 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-17 20:10:37 +08:00
易良	eef06ce376	feat(cli): add structured memory diagnostics JSON (#3785 ) * feat(cli): add memory diagnostics doctor command * fix(core): platform-aware maxRSS conversion and accurate risk message - Extract platform detection before building diagnostics so the correct unit conversion can be applied: multiply by 1024 on Linux (where process.resourceUsage().maxRSS is in KB) but leave the value unchanged on macOS/Windows (where it is already in bytes). - Correct the native-memory-pressure risk message to accurately state that the threshold is 2× heap used, not just "larger than heapUsed". - Add a dedicated test to assert that maxRSS is not multiplied on a non-Linux platform (darwin). All 3 core and 9 CLI tests pass; typecheck clean. Agent-Logs-Url: https://github.com/QwenLM/qwen-code/sessions/9b413337-68ed-4d5c-af99-0d42378900c3 * test(core): cover active request memory risk * fix(cli): address memory diagnostics review feedback * fix(cli): harden memory diagnostics review fixes * fix(memory-diagnostics): tighten risk thresholds and expand readable output - Add 64MB absolute floor on native-memory-pressure so cold processes don't trip the 2x ratio check; raise active-handles threshold from 100 to 256 - Show detachedContexts, nativeContexts, maxRSS, CPU times, smapsRollup availability, and v8HeapSpaces summary in the readable /doctor memory output - Validate unknown memory subcommand args with a usage hint instead of silently dropping them - Wrap human-readable strings in t(...) for i18n parity with the rest of doctor - Advertise the memory subcommand via /doctor argumentHint while keeping acceptsInput false so the parent still auto-submits - Document _getActiveHandles/_getActiveRequests as undocumented Node internals - Update tests for new thresholds, expanded output, unknown-arg path, and abort-during-json * fix(cli): harden memory doctor diagnostics * fix(core): correct maxRSS byte handling and heapRatio consistency - Remove incorrect * 1024 multiplier for maxRSS on Linux (Node.js >=14.10 returns bytes on all platforms) - Use v8HeapStats.usedHeapSize for heapRatio to avoid cross-API inconsistency - Update test expectations and rename "does not multiply" test * fix(cli): resolve rebase conflicts in memory diagnostics - Rename local formatMemoryDiagnostics to formatCoreDiagnostics to avoid naming conflict with the imported utility from memoryDiagnostics.js - Update Session.test.ts to use objectContaining for _meta field added in recent main commits - Align doctorCommand.test.ts assertions with current parent command state (argumentHint includes --sample/--snapshot from main) * fix(core): use null instead of undefined for optional probes, deduplicate active count helpers - optionalProbe/optionalSyncProbe now return null on failure so JSON.stringify preserves the keys instead of silently omitting them. - Merge getActiveHandlesCount/getActiveRequestsCount into a single parameterized getProcessInternalCount helper. - Update MemoryDiagnostics interface: v8HeapSpaces, openFileDescriptors, smapsRollup are now T \| null instead of T \| undefined. * fix(cli): finish memory diagnostics review fixes * fix(cli): address memory diagnostics review feedback --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-05-17 19:52:46 +08:00
Yan Shen	9985d91e08	feat(cli): add configurable plansDirectory for Plan Mode (#4062 ) * feat(cli): add configurable plansDirectory for Plan Mode Add a plansDirectory setting that allows users to define a custom directory for approved Plan Mode files. Relative paths are resolved against the project root and validated to prevent path traversal. - Storage: add isPathWithinDirectory() with realpathSync-based symlink resolution to prevent traversal bypass attacks (direct, intermediate, and cross-drive) - Config: cache plansDir at construction time, use atomic write (write-temp then rename) to prevent corrupted plan files on crash - CLI: respect bareMode by clearing plansDirectory in minimal mode - Docs: document plansDirectory with requiresRestart and gitignore hint - Tests: 26 new tests covering path validation, symlink attacks (direct and intermediate), Windows cross-drive paths, mixed separators, and configuration integration Closes #3548 * fix(core): align symlink test with return value * fix(core): harden plans directory handling * fix(config): address PR #4062 review findings for plansDirectory - Handle EXDEV during atomic plan writes (cross-device rename fallback) - Sanitize session IDs to prevent path traversal in plan filenames - Expand tilde (~) in configured plansDirectory paths - Preserve plansDirectory in bare mode - Add EACCES/EPERM handling to getPlanFileNames with user-visible warnings - Close TOCTOU gap with post-write path containment validation - Fix docs to clarify plansDirectory is a top-level key - Add happy-path I/O tests for configured plansDirectory	2026-05-17 19:43:24 +08:00
jinye	d2d426fad0	feat(serve): SSE replay sizing + slow_client_warning backpressure (#4175 Wave 2.5 PR 10) (#4237 ) * feat(serve): SSE replay sizing + slow_client_warning backpressure #4175 Wave 2.5 PR 10. Closes the SSE replay / backpressure knobs called out in #3803 §02 so chatty Stage 1 sessions get an honest reconnect window and operators get a heads-up signal before clients are summarily evicted. - `DEFAULT_RING_SIZE` 4000 → 8000. Per-session replay ring depth now matches the #3803 §02 target for chatty sessions. - `--event-ring-size <n>` CLI flag (default 8000) lets operators tune the ring per daemon. Threaded `ServeOptions` → `BridgeOptions.eventRingSize` → both `new EventBus()` construction sites (fresh sessions + restore path). Validation is fail-CLOSED (positive finite integer; 0 / NaN / negative throw at boot). - `slow_client_warning` SSE frame. When a subscriber's queue crosses 75% full the bus force-pushes a synthetic `slow_client_warning` to that subscriber once per overflow episode, carrying `{queueSize, maxQueued, lastEventId}`. The flag re-arms after the queue drains below 37.5% (hysteresis, no flap near threshold). If the queue actually overflows after the warning, the existing `client_evicted` terminal frame path still fires. Like `client_evicted`, the warning has no `id` (synthetic frame; must not burn a sequence slot for other subscribers). - `?maxQueued=N` query param on `GET /session/:id/events` (range `[16, 2048]`, default 256). Lets cold reconnect clients pre-size their per-subscriber backlog so a large `Last-Event-ID: 0` replay doesn't trip the warning on the first publish. Range rationale: lower bound 16 (smaller is useless for any replay); upper bound 2048 (so a single subscriber can't pin ~1 MB just by asking). Out-of-range / non-decimal returns `400 invalid_max_queued` BEFORE opening the SSE stream — clean 4xx beats half-opening a stream + emitting a `stream_error` (which EventSource would auto-reconnect on). - `slow_client_warning` capability tag — single source of truth for the warning frame + `?maxQueued` query param + ring-size knob. Old daemons silently lack all of these; pre-flight via `caps.features`. - SDK extensions (`@qwen-code/sdk`): typed `DaemonSlowClientWarningEvent` (added to known event union and `DaemonStreamLifecycleEvent`); schema-validated by a new `isSlowClientWarningData` predicate; reducer (`reduceDaemonSessionEvent`) increments `slowClientWarningCount` + stores `lastSlowClientWarning`. Warning is non-terminal — `alive` stays true (only `client_evicted` / `stream_error` / `session_died` close the stream). Re-exported from the public SDK entry. - Docs: `qwen-serve-protocol.md` updates the features list (adds `slow_client_warning` and the previously-missing `client_identity` to match reality post-#4231), documents the `?maxQueued` query param, adds the warning frame to the event table, and notes the new default ring size. `qwen-serve.md` adds the `--event-ring-size` flag row. Tests: 19 eventBus (4 new: warning at 75%, once per episode, no `id` on the synthetic frame, hysteresis re-arm), 106 bridge (2 new: validate eventRingSize accept/reject), 111 server (4 new: ?maxQueued accept/absent/non-decimal/out-of-range + EXPECTED_STAGE1_FEATURES update), 14 SDK daemonEvents (2 new: schema validation + non-terminal reducer behavior). 321 focused tests total, all green. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * refactor(serve): adopt PR #4237 review feedback (eventBus polish) Address the actionable items from the Qwen Code review bot's pass on PR #4237: - Pre-compute `warnThreshold` / `warnResetThreshold` per `InternalSub` at `subscribe()` time so `publish()`'s per-event hot path is one integer compare per subscriber instead of a multiply + compare. The `!warned` short-circuit still collapses the steady state to a single boolean read; this just shaves a multiply when the threshold check actually fires. - Document the back-of-queue ordering choice for the synthetic `slow_client_warning` frame in `EventBus.publish()`: front-push was considered but mid-stream front-insertion would mis-count `forcedInBuf` in `BoundedAsyncQueue.next()`, and `forcePush` already short-circuits via `resolvers.shift()` for the active-consumer case — the back-of-queue path only matters for stalled consumers, who can't drain regardless of warning position. - Reuse the existing `collect()` helper in the "default ring size 8000" test for consistency with the rest of the file; the new test also tightens the assertion by checking that the first retained event id is 2 (id=1 dropped by the ring) and the last is 8001. - Soften the "~500 B per session" magic number in `BridgeOptions.eventRingSize`'s JSDoc to a qualitative description (each retained `BridgeEvent` is a reference plus its serialized payload; ceiling scales as `ringSize × average-event-size`). Rejected: - Bot's claim that the error JSON contains `\`...\`` escape sequences — bot misread the JS template-literal source as the wire output; `JSON.stringify` does not escape backticks, and the existing `cwd` error messages use the same style. - Bot's "use `Record<string, never>` instead of `[key: string]: unknown`" suggestion on `DaemonSlowClientWarningData` — every other event-data type in `sdk-typescript/src/daemon/events.ts` carries the same index signature for additive-field compatibility. - Bot's "features list breaks alphabetical order" — the capability list is grouped by protocol lifecycle (health → capabilities → session lifecycle → events → permissions), not alphabetical. Tests: 139 focused tests across eventBus + httpAcpBridge + SDK daemon events — all passing. Behavior unchanged; this is hot-path micro-opt + comment polish only. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve): correct queue tagging + plumb maxQueued through SDK Address both P2 findings from the Codex review pass on PR #4237. Bug 1: `BoundedAsyncQueue.forcedInBuf` position-invariant break The previous `forcedInBuf` counter only tracked LIVE-vs-FORCED correctly when all forced entries lived at the FRONT of the buffer (subscribe-time `Last-Event-ID` replay). The new mid-stream `slow_client_warning` path force-pushes to the BACK of the queue while the queue is still open, which the existing accounting was not designed for: - publish 6 events at maxQueued=8 → 75% threshold trips → force-push warning at the back → buf=[1..6, warning], forcedInBuf=1. - consumer shifts `1` → forcedInBuf decremented to 0 (incorrect: `1` was a live frame, not the forced one). - consumer drains 2..6 + warning → buf=[], forcedInBuf=0, true live count = 0, but `size` getter and `push()` cap check then use `buf.length - forcedInBuf` which drifts over subsequent refills, causing premature warn / eviction before the cap is actually reached. Replace the position-dependent counter with a per-entry `{value, forced}` tag. `liveCount` is incremented in `push()` / decremented in `next()` only when the shifted entry was non-forced — position becomes irrelevant. `size` getter returns `liveCount` directly. The class doc comment is rewritten to call out that the new tag is the position-independent replacement for the old "forced frames must stay at the front" invariant. Regression test in `eventBus.test.ts` reproduces the codex trace (warn at 75%, drain past warning, refill to cap) and asserts no premature eviction. Bug 2: SDK does not expose `?maxQueued` `docs/users/qwen-serve.md` and `docs/developers/qwen-serve-protocol.md` both document `?maxQueued=N` as something SDK clients can request, but `SubscribeOptions` on `DaemonClient` only declared `lastEventId` + `signal`, and `subscribeEvents()` always fetched `/events` without a query string. Typed-SDK consumers had no way to opt in without hand-crafting URLs. - Add `SubscribeOptions.maxQueued?: number` with JSDoc noting the daemon range `[16, 2048]` and the pre-flight requirement on `caps.features.slow_client_warning`. - `DaemonClient.subscribeEvents` builds the URL with an optional `?maxQueued=<n>` segment. No client-side range validation — the daemon's `parseMaxQueuedQuery` is the source of truth and returns structured `400 invalid_max_queued`; duplicating the bounds in two layers would diverge on the next tweak. - `DaemonSessionSubscribeOptions extends SubscribeOptions` so the new field flows through `DaemonSessionClient` automatically. Three new SDK tests: - subscribeEvents appends `?maxQueued=N` when set - omits the query string when absent (existing behavior preserved) - propagates a `400 invalid_max_queued` unchanged Tests: 214 focused tests across eventBus / bridge / SDK DaemonClient / DaemonSessionClient / daemonEvents, plus 111 in the server suite. All green; the new eventBus regression case proves the position-invariant fix. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * refactor(serve): adopt PR #4237 copilot review feedback Address 6 of 8 copilot-reviewer findings on PR #4237; the other 2 (#1 forcedInBuf live-size corruption, #5 SDK lacks maxQueued) were already fixed in `bae42c88b` — replied on the threads with the commit hash. - [2] server.ts:1068 — `?maxQueued=` (present-but-empty) now fails closed with `400 invalid_max_queued` instead of silently falling back to the default queue cap. The API documents fail-closed for any malformed value before opening SSE, so an empty string is unambiguously malformed. New server.test.ts case locks this in. - [3] commands/serve.ts:93 — CLI help text for `--event-ring-size` no longer mis-shapes `Last-Event-ID` as a query parameter. It is an HTTP header, and the daemon's SSE route does not parse a `?Last-Event-ID=` query. - [4] docs/developers/qwen-serve-protocol.md:351 — clarify that `?maxQueued=N` controls the LIVE-event backlog cap. Replay frames are force-pushed and exempt from the cap; what consumes it is live events that arrive while the subscriber is still draining a cold-reconnect replay. Bumping for cold reconnects is still the right answer, but for the live tail, not for the replay frames themselves. - [6] eventBus.ts:214 — stale `ringSize=4000` performance comment updated to the new `ringSize=8000` default with a note about the O(n) `shift()` cost scaling. - [7] sdk-typescript events.ts:492 — `isSlowClientWarningData` now uses the existing `isFiniteNumber` helper instead of bare `typeof === 'number'`. Mirrors the sibling predicates and rejects `NaN` / `Infinity` payloads as schema garbage. New daemonEvents.test.ts assertions cover both. - [8] server.ts:127 — `createServeApp`'s default-bridge construction now also forwards `opts.eventRingSize` to `createHttpAcpBridge`, symmetric with the `runQwenServe.ts` path. Direct embeds / tests that called `createServeApp` without supplying their own bridge but did pass `ServeOptions.eventRingSize` were silently getting the default 8000 ring. Tests: 326 focused tests across eventBus / bridge / SDK DaemonClient / DaemonSessionClient / daemonEvents / server. All green; the new server.test.ts case + the extended daemonEvents.test.ts assertions cover the tightened guards. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * refactor(serve): adopt PR #4237 wenshao round-2 review feedback Six adopted findings from @wenshao's second review pass on PR #4237. The seventh ([10] forcedInBuf 3rd case invariant) was already fixed in `bae42c88b` — replied on that thread. - [9] + [14] server.ts — Sanitize attacker-controlled values before stderr interpolation in both `parseMaxQueuedQuery` and `parseLastEventId`. New `safeLogValue()` helper uses `JSON.stringify` to escape control characters (`\n`/`\r`/…) so a URL-encoded newline in `?maxQueued=%0a` can't inject extra log lines into journald/Loki/Splunk pipelines. Matches the `workspace_mismatch` sanitization style in `sendBridgeError`. Fixed in both helpers (the sibling pre-existing `parseLastEventId` had the same shape) so the file stays consistent. - [11] httpAcpBridge.ts — `!Number.isFinite(eventRingSize)` was redundant: `Number.isInteger(NaN)` and `Number.isInteger(Infinity)` both return `false`, so the sibling `!Number.isInteger` already catches both. Drop the dead guard. - [12] httpAcpBridge.ts — Add soft upper bound `MAX_EVENT_RING_SIZE = 1_000_000` on `eventRingSize` to catch operator typos (`--event-ring-size 80000000` vs `8000000`). At ~500 B per `BridgeEvent` an 1M-frame ring already pins ~500 MB per session — well past any realistic workload. Not a security boundary (operator-controlled flag), pure typo defense. Existing bridge construction test extended with an `80_000_000` case. - [13] commands/serve.ts — CLI `--event-ring-size` flag now sources its default from `DEFAULT_RING_SIZE` (imported from `serve/eventBus.js`) instead of the hardcoded literal `8000`. Without this, a future bump of the bus default would silently not take effect for daemons launched through the CLI because the flag always overrides — single source of truth fixes that. - [15] eventBus.ts — Drop unreachable `event.id ?? this.lastEventId` fallback in the `slow_client_warning` frame. `event` is locally constructed at the top of `publish()` with `id: this.nextId++` and is guaranteed defined. Use `event.id as number` directly + an inline note about the invariant. Tests: 197 (eventBus 20 / bridge 107 / SDK DaemonClient 57 / SDK daemonEvents 14) + 112 server. All green; the new upper-bound bridge case + the existing log assertions pin the changed behaviors. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-17 19:30:43 +08:00
jinye	0a4a08e443	feat(serve): add client heartbeat (#4175 Wave 2.5 PR 9) (#4235 ) * feat(serve): add client heartbeat route Adds POST /session/:id/heartbeat plus SDK helpers so long-lived adapters (TUI/IDE/web) can refresh the daemon's last-seen bookkeeping. Bridge stores per-session and per-client timestamps behind a getHeartbeatState() snapshot accessor that PR 12 read-only diagnostics and PR 24 revocation policy will consume. - Capability tag: client_heartbeat (advertised on /capabilities.features) - Identified clients must echo X-Qwen-Client-Id; the bridge validates the id BEFORE bumping any timestamp so a forged id can't mask client absence - Per-client entries are dropped together with the registration ref-count in unregisterClient, so churn doesn't leak stale ids - getHeartbeatState returns a snapshot Map; mutating it does not leak into bridge state - Anonymous heartbeats bump only the per-session watermark Errors mirror the rest of the routes — 404 SessionNotFoundError, 400 invalid_client_id (header malformed or unknown for this session). Roadmap PR 9 from #4175. Depends on PR 7 (#4231 client identity, merged) for the trusted clientId registry. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * feat(sdk): re-export HeartbeatResult from package root The published @qwen-code/sdk only exposes the root entrypoint via `exports`; daemon subpath imports are not part of the public API. Adding HeartbeatResult to packages/sdk-typescript/src/daemon/index.ts made it reachable internally but not for downstream consumers writing `import type { HeartbeatResult } from '@qwen-code/sdk'` — every other daemon result type (PromptResult, SetModelResult, DaemonSession, etc.) is forwarded through the root barrel, so HeartbeatResult was the only hole in the heartbeat helper's public surface. Inserted alphabetically between DaemonStreamLifecycleEvent and KnownDaemonEvent to match the existing ordering convention.	2026-05-17 18:57:28 +08:00
jinye	07e0e82258	feat(serve): advertise typed_event_schema + pin SDK public surface (#4175 PR 4 follow-up) (#4226 ) * feat(serve): advertise typed_event_schema capability Follow-up to #4217 (`feat(protocol): add typed daemon event schema v1`, Wave 1 PR 4 of #4175), which landed the SDK-side typed schema + `KnownDaemonEvent` union + reducer but did not register a daemon-side capability tag for it. Without the tag, non-SDK clients (web debug UI, third-party adapters, channel/IDE backends not yet on `@qwen-code/sdk`) have no way to detect at the protocol envelope level that the daemon promises to emit only `KnownDaemonEvent`-shaped frames — they would either pin against SDK version, or pre-flight every frame defensively. Add `typed_event_schema: { since: 'v1' }` to `SERVE_CAPABILITY_REGISTRY`, inserted right after `session_events` (the route that delivers the frames whose schema this tag describes). The capability is purely informational — `narrowDaemonEvent`/`asKnownDaemonEvent` already fall back to "unknown" for older daemons that don't advertise the tag, so the SDK does not gate any behavior off the tag. Sync `EXPECTED_STAGE1_FEATURES` (server.test.ts) and the integration test array (qwen-serve-routes.test.ts) with the registry order, the same lockstep discipline #4214 codified. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * test(sdk): pin typed event surface at the public SDK entry, point DaemonSessionClient docstring at it Two small follow-ups to #4217 (Wave 1 PR 4 of #4175). 1. Public-entry regression fence `@qwen-code/sdk` is a single-entry package: `package.json.exports` only exposes `.` (`dist/index.{cjs,mjs,d.ts}`), and the bundle is built from `src/index.ts`. Symbols re-exported only from `src/daemon/index.ts` are unreachable to consumers unless they are also forwarded by `src/index.ts`. #4217 forwards the typed event schema correctly today, but the two-layer chain has no compile-time test pinning it — a future daemon export that lands in `src/daemon/index.ts` but is missed by `src/index.ts` would ship invisibly. Add `test/unit/daemon-public-surface.test.ts` that imports `* as Public from '../../src/index.js'`, asserts at runtime that every PR 4 value is `typeof === 'function'` (or a primitive of the expected shape), round-trips a raw `DaemonEvent` through the public `asKnownDaemonEvent` to prove the wire-up actually works, and compile-imports every PR 4 type so any drift fails to build. 2. DaemonSessionClient docstring pointer The class docstring already deferred typed event consumption to "the protocol schema layer" without a concrete pointer. Now that #4217 has put `asKnownDaemonEvent` and `reduceDaemonSessionEvent` in `./events.js`, name them so future readers can find the typed surface without grepping. No code change. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)	2026-05-17 18:43:38 +08:00
kkhomej33-netizen	a5e4839e07	fix(cli): restore ACP prompt counter on resume (#4233 )	2026-05-17 18:32:44 +08:00
ChiGao	c25e22b575	feat(serve): add session-scoped permission route (#4232 ) Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>	2026-05-17 17:48:30 +08:00
ChiGao	4d9cbe49c0	feat(serve): add daemon-stamped client identity (#4231 ) * feat(serve): add daemon-stamped client identity * fix(serve): harden daemon client identity handling --------- Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>	2026-05-17 16:19:30 +08:00
kkhomej33-netizen	605e5eea16	fix(cli): include skill base dir in slash commands (#4224 )	2026-05-17 15:52:29 +08:00
jinye	2453b82add	[codex] Add daemon session load/resume (#4222 ) * feat(serve): add daemon session load resume Adds HTTP and SDK support for restoring persisted daemon sessions through load/resume routes, including replay buffering for load and guarded concurrent restore handling. Refs #4175 Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(serve): address review feedback on daemon session load/resume - Gate `defaultEntry` claim in `restoreSession` on `defaultSessionScope === 'single'`, mirroring `doSpawn`. Without the gate, a restored session silently became the omitted-scope attach target on `'thread'`-default daemons. - Rename advertised capability `session_resume` to `unstable_session_resume` to match the underlying ACP method (`connection.unstable_resumeSession`). `session_load` stays stable. - Seed `lastEventId: 0` in `DaemonSessionClient.resume`, symmetric with `load`. The agent's `unstable_resumeSession` schedules an `available_commands_update` via `setTimeout(0)`; without the seed the SDK consumer would miss that frame. - Add HTTP-level test for the `RestoreInProgressError → 409` envelope. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(serve): adopt review feedback comments on session load/resume - Cross-reference the `POST /session` disconnect-cleanup rationale from `restoreSessionHandler`'s `!res.writable` branch so future maintainers find the BQ9tV race + tanzhenxin attach-rollback context without grep. - Document `DaemonSessionState.{models, modes, configOptions}` in the SDK so callers can narrow to the ACP `SessionModelState` / `SessionModeState` / `SessionConfigOption` shapes. - Add JSDoc on `DaemonClient.restoreSession` explaining why `loadSession` and `resumeSession` collapse into one transport. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve): preserve restore state and harden in-flight restore races Address the four Critical findings from PR #4222 review (wenshao): - Coalesced restore waiters now observe the same ACP state the original restore caller did. `state: {}` in `restoreSession`'s coalesce branch was clobbering the spread `restored.state`, so concurrent callers got different payloads based purely on timing. Cache the load/resume response on `SessionEntry.restoreState` and return it from both the existing-byId early return and the coalesce branch. - Drop the `defaultEntry` promotion on restore. Explicit `session/load` / `session/resume` is "give me THIS id"; it must not become the implicit attach target for subsequent omitted-id `POST /session` callers under `single` scope. Reserves `defaultEntry` for sessions created through `doSpawn` only. - Reserve coalesced attaches synchronously via `InFlightRestore.coalesceState.count` so the spawn owner's `requireZeroAttaches` disconnect-reaper sees a non-zero `attachCount` on the freshly registered entry and skips the kill. Without this, B's `attachCount++` happened after `await inFlight.promise`, leaving a window where A's HTTP-disconnect cleanup could reap the session out from under B. - Include `pendingRestoreIds` in the `killSession` channel-teardown decision. The last live session leaving while a restore is in-flight on the same channel would otherwise SIGTERM the channel mid-restore. - Bump `RestoreInProgressError`'s `Retry-After` from 1s to 5s (matches `SessionLimitExceededError`); under the default `initTimeoutMs` of 10s, 1s pushed clients into tight loops. Tests: new bridge cases covering state propagation through coalesce, the spawn-owner-disconnect race, the pendingRestoreIds-aware channel teardown, and the no-promote- on-restore invariant. Existing "attaches twice" test rewritten to assert the cached restore state propagates. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * test(serve): cover acpAgent load/resume + restore route error mappings Close the test-coverage gaps wenshao called out in PR #4222 review: - acpAgent.test.ts gains a `QwenAgent loadSession / unstable_resumeSession` block that locks down the new contract end-to-end at the agent layer: * `loadSession` missing persisted session → throws `RequestError.resourceNotFound("session:<id>")` (code -32002 + `data.uri`). * `loadSession` existing session → returns LoadSessionResponse AND triggers `session.replayHistory(messages)` so SSE subscribers see the persisted turns. * `unstable_resumeSession` missing session → same resourceNotFound contract. * `unstable_resumeSession` existing session → returns the response WITHOUT replaying history (resume restores model context internally; UI replay is intentionally suppressed). Required extending the mocked `RequestError` with `resourceNotFound`, and mocking `SessionService` per case. - server.test.ts adds the missing restore-route wire mappings: `WorkspaceMismatchError → 400 workspace_mismatch` and `SessionLimitExceededError → 503 + Retry-After: 5`. Combined with the existing 409 case for `RestoreInProgressError`, the route layer now has full structured-error coverage. - Updated the 409 test's `Retry-After` expectation from `1` to `5` to match the bumped retry hint. Disconnect-cleanup tests for the restore route were intentionally not added — the cleanup branch is line-for-line identical to `POST /session`'s handler (which itself ships without route-level disconnect tests due to flaky supertest + Node http close-event timing). 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(serve): document daemon session load/resume routes Sync the docs to the routes that landed via PR #4222: - `docs/developers/qwen-serve-protocol.md`: * Add `session_load` and `unstable_session_resume` to the advertised features list, with a note on the `unstable_` prefix mirroring ACP's underlying method name. * Document `POST /session/:id/load` and `POST /session/:id/resume` — request body, response shape (including the cached `state` field that late attachers observe), and the full error envelope: 404 unknown id, 400 workspace_mismatch, 503 session_limit_exceeded (counts in-flight restores), 409 restore_in_progress (cross-action race). * Note the SSE replay ring bound (4000 frames default) and the "subscribe immediately after load" guidance for long histories. - `docs/users/qwen-serve.md`: * Add a "Loading and resuming a persisted session" section with the SDK example (`DaemonSessionClient.load` / `DaemonSessionClient.resume`) and the load-vs-resume decision table. * Update the durability model — sessions are still ephemeral across daemon restarts in Stage 1, but persisted sessions on disk can now be reloaded. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(test): use _meta payload to satisfy ACP SessionConfigOption types The two new state-propagation tests in `httpAcpBridge.test.ts` used `{ id, name, value }` as a `SessionConfigOption`, but ACP's actual `SessionConfigSelect` shape requires `currentValue` + `options`. vitest runs through esbuild and skips strict typechecking, so the local `vitest run` passed; CI's `tsc --build` (run during `npm run prepare`) caught it. Switch the fixture to `_meta: { tag: '...' }` instead — `_meta` is typed as `Record<string, unknown> \| null` on the ACP response shapes, so any payload survives. The assertions only need the bridge to forward the state object intact, which `_meta` proves equally well without committing the test to the full SessionConfigOption union. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve): symmetric restore coalesce guard + transportClosed leak + defensive cleanup Address the two new Critical findings + the test/cosmetic gaps from wenshao's second review pass on PR #4222 (`a3f38da3a`): - [Critical] Symmetric coalesce guard. The previous guard only rejected `load`-on-`resume`; `resume` arriving while a `load` was in flight silently coalesced and inherited the load's history- replay frames over SSE — directly violating resume's "no UI replay" contract (made worse by `DaemonSessionClient.resume()` seeding `lastEventId: 0`). Tighten the guard to `action !== inFlight.action` so any cross-action race throws `RestoreInProgressError`. Same-action coalescing is unaffected. - [Critical] `transportClosed` dangling rejection. When `withTimeout` wins the `Promise.race` against `channel.exited`, the `.then(throw)` chain on `channel.exited` stays pending. A later channel exit (next session boundary, daemon shutdown, agent crash) fires the `throw` with no observer attached — Node 22 logs `unhandledRejection`, and `--unhandled-rejections=throw` deployments crash the daemon. Add `transportClosed.catch(() => {})` to suppress the dangling rejection after the race settles. - `isAcpSessionResourceNotFound` exact-match fallback. The message-fallback path used `message.includes(expectedUri)`, which would falsely match a sessionId of `"a"` against a message containing `"session:abc"`. Tighten to exact equality on the canonical `Resource not found: <uri>` form. The primary `data.uri` path remains the dominant code path. - `loadSession` mcpServers default symmetry. `loadSession` now uses `params.mcpServers ?? []` to mirror `unstable_resumeSession`. Defends against a future ACP schema loosening that makes `LoadSessionRequest.mcpServers` optional — without the null-coalesce, `newSessionConfig` would `TypeError` on iteration. Tests added: - `httpAcpBridge.test.ts`: `resume`-on-`load` rejection (mirror of the existing `load`-on-`resume` test); regression for the dangling `unhandledRejection` (resolves `channel.exited` after the restore promise has already settled and asserts no `unhandledRejection` event); shutdown-awaits-restore via `Promise.race`-based ordering. - `server.test.ts`: 400 for non-string and over-length `cwd` on the restore routes (mirroring the equivalent `POST /session` cases for `parseOptionalWorkspaceCwd`). - `acpAgent.test.ts`: load with `getResumedSessionData()` returning `undefined` — distinct code path that does NOT call `replayHistory`. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-17 12:58:47 +08:00
qqqys	2e773b0e60	[codex] Allow custom output directory for /export (#4193 ) * feat(cli): support export output directories * fix(cli): address export review feedback * test(cli): cover JSON export directory handling * fix(cli): constrain export output directories * test(cli): cover export edge cases * fix(cli): address export directory review feedback * fix(cli): revalidate export directory before write * fix(cli): validate export directory before mkdir * fix(cli): harden export target writes * fix(cli): refine export failure handling * fix(cli): clarify export directory mode * fix(cli): include export path context in errors * fix(cli): add export debug logging * fix(cli): make export tests path portable * fix(cli): refine export validation diagnostics * test(cli): cover export validation failures	2026-05-17 12:33:02 +08:00
Shaojin Wen	ef29700bce	fix(ui): trim background task results and show newest first (#4094 ) (#4125 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details * fix(ui): trim background task results and show newest first (#4094) Two related improvements to the background task pill and dialog: 1. Trim outdated terminal task results. `BackgroundTaskRegistry` and `BackgroundShellRegistry` now cap retained terminal entries at 32 each (mirroring `MonitorRegistry`'s existing `MAX_RETAINED_TERMINAL_MONITORS` pattern). Running, paused, and cancelled-but-not-yet-notified entries are never evicted — pruning a not-yet-notified entry would break the SDK contract that every `register` pairs with exactly one terminal `task-notification`. 2. Show newest tasks at the top of the dialog. `useBackgroundTaskView` now sorts entries by `startTime` descending so the dialog opens with the cursor on the most recently launched task. `LiveAgentPanel` reverses internally back to ASC for its own visual layout (newest row sits closest to the composer). * perf(shell-registry): batch abortAll prune + statusChange into one pass abortAll() previously delegated to cancel() per entry, so each running shell triggered its own pruneTerminalEntries() and statusChange wakeup. On shutdown / `/clear` with N running shells the only subscriber (useBackgroundTaskView) re-pulled getAll() N times for what is logically a single batch transition. Settle each entry inline via the new private settleAsCancelled() helper, then fire prune + statusChange exactly once after the loop. The split keeps the running-status guard at the public-API boundary so callers can't accidentally re-settle a terminal entry. * fix(ui): two-bucket sort so running tasks outrank fresh terminals The earlier startTime DESC sort surfaced the newest LAUNCH but let an older long-running / paused entry get pushed below a batch of newer terminal entries — the user opening the dialog to check on the running work would find it buried under stale completed rows. Split the merge into two buckets: - active (running + paused): sorted by startTime DESC so the most recent launch sits at the very top of the dialog. - terminal (completed / failed / cancelled): sorted by endTime DESC so the most recently FINISHED entry leads the terminal section (matches "what changed while I wasn't looking" intuition; a long task that just settled outranks an old quick task that finished hours ago). Pin the new behavior with two tests covering active-above-terminal and the endTime-vs-startTime distinction inside the terminal bucket. * fix: add missing outputFile and isBackgrounded to retention cap tests The merge brought in required fields on AgentTaskRegistration that the retention-cap test helpers were not supplying. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-17 09:13:24 +08:00
qqqys	07165a095c	Add stop hook blocking cap (#4208 ) * feat(core): add stop hook blocking cap * fix(core): tighten stop hook cap behavior * fix(cli): show goal judge details * fix(core): bound stop hook blocking cap * fix(core): surface subagent stop cap warnings * fix(core): clean up stop hook cap loop * test(core): cover stop hook cap integrations * test(core): strengthen stop hook cap coverage	2026-05-17 06:52:56 +08:00
易良	ba77ddd81b	fix(lsp): expose status and startup diagnostics (#3649 ) * feat(lsp): add /lsp slash command to show server status Implements the /lsp command that displays the status of all configured LSP servers. Previously this was documented in the FAQ but never implemented, leaving users with no way to check if their language servers started successfully. Changes: - Add LspServerStatusInfo interface to lsp/types.ts - Add getServerStatus() to LspClient and NativeLspClient - Expose getServerHandles() from NativeLspService - Create lspCommand.ts with status table output - Register /lsp in BuiltinCommandLoader (only when LSP is enabled) The command shows: server name, command, languages, and status (NOT_STARTED / IN_PROGRESS / READY / FAILED + error message). * fix(lsp): expose status and startup diagnostics * fix(lsp): harden status command diagnostics * fix(lsp): add stderr error listener and harden initialization error handling - Add stderr 'error' event listener in LspConnectionFactory to prevent unhandled stream errors from crashing the process - Wrap setLspInitializationError calls in try-catch in config.ts to guard against post-initialization state changes that would throw	2026-05-17 01:42:28 +08:00
jinye	54fd5c50f0	feat(telemetry): add detailed sensitive span attributes (#4097 ) Layer detailed content attributes onto the existing hierarchical spans (qwen-code.interaction / qwen-code.llm_request / qwen-code.tool) gated by includeSensitiveSpanAttributes: - Interaction span: user prompt (new_context) - LLM request span: system prompt + hash + preview + length (full text deduped per session via SHA-256), tool schemas (per-tool tool_schema events, also hash-deduped), model output - Tool span: tool input, tool result on every exit path (success + pre-hook block + post-hook stop + tool error + try-block cancel + catch-block cancel + execution exception) All large content truncated at 60KB with _truncated and _original_length metadata. Heavy serialization (safeJsonStringify on tool I/O, partToString on user prompt) is guarded by the sensitive flag at the call site so it doesn't run when telemetry is off. Also adds: - getActiveInteractionSpan() helper for client.ts to attach prompt attributes to the interaction span. - Updated config schema description and docs (telemetry.md + settings.md) to reflect expanded scope and add security/cost notes. - 28 unit tests for detailed-span-attributes, 4 tests for getActiveInteractionSpan, integration mocks updated.	2026-05-17 00:36:48 +08:00
qqqys	daaa85e98e	feat(cli): add fork-session resume flag (#4159 ) * feat(cli): add fork-session resume flag * fix(cli): address fork-session review feedback * fix(cli): handle fork session copy failures * fix(cli): guard sandbox session handoff flag	2026-05-17 00:27:52 +08:00
Shaojin Wen	b9590283c0	fix(cli): pass rewind selector test props (#4211 ) Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-16 23:57:50 +08:00
jinye	878f35fc4f	feat(serve): per-request sessionScope override on POST /session (#4175 Wave 2 PR 5) (#4209 ) * feat(serve): per-request sessionScope override on POST /session Resolves the FIXME at httpAcpBridge.ts:BridgeOptions.sessionScope from #3803 — clients can now override the daemon-wide sessionScope per request instead of being stuck with whatever boot-time value the operator picked. A VSCode window that wants strict isolation can ask for `'thread'` against a default-`'single'` daemon, and vice versa. Wire change: - POST /session body accepts optional `sessionScope: 'single' \| 'thread'` - Per-request value wins; daemon-wide default remains the fallback when the field is omitted (bit-for-bit backward compat for every existing caller) - Invalid values yield 400 `{ code: 'invalid_session_scope' }` - New capability tag `session_scope_override` advertised on /capabilities.features for negotiation Bridge changes: - BridgeSpawnRequest gains optional `sessionScope` - spawnOrAttach validates the per-request value and resolves effectiveScope = req.sessionScope ?? defaultSessionScope - doSpawn now takes effectiveScope and only stamps `defaultEntry` (the single-scope attach slot) when the spawn is single-scope — fixes a mixed-scope leak where a thread-first call would let a later omitted-scope call attach to the supposedly-isolated session SDK: - CreateSessionRequest gains optional `sessionScope` - DaemonClient.createOrAttachSession conditionally spreads it into the JSON body so omitted callers send the same wire shape as before Tests: - 4 new bridge tests (override single→thread, override thread→single, mixed-scope leak regression, invalid-value rejection) - 3 new server tests (valid passthrough, invalid 400, omitted backward compat) - 2 new SDK tests (forwards/omits sessionScope on the wire) - EXPECTED_STAGE1_FEATURES updated for the new capability tag 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(serve): address Wave 2 PR 5 review findings Three independent review passes found three real issues: 1. Bridge `TypeError` on invalid `sessionScope` collapsed to opaque 500 in `sendBridgeError` instead of the typed `400 invalid_session_scope` the route layer guarantees. Direct embed / test / future entry-point callers bypassing the route would see a generic 500 with stack noise on stderr — disagreeing with the route contract. Fix: add `InvalidSessionScopeError` class (alongside `SessionNotFoundError` / `WorkspaceMismatchError` / `SessionLimitExceededError`); the `spawnOrAttach` validator now throws it, and `sendBridgeError` translates to the same `{ error, code: 'invalid_session_scope' }` shape. 2. SDK `DaemonClient.createOrAttachSession` used a truthy check (`req.sessionScope ? ...`) for the conditional spread, silently erasing falsy-but-defined values (`''`, `null`, `0`) on the wire. A buggy caller would never see the daemon's 400 — it'd inherit the daemon-wide default while believing it requested a specific scope. Fix: use `!== undefined` (matching the bridge's own validation shape). Same fix to the server-side spread for consistency. 3. JSDoc and docs referenced `serve --sessionScope` as if it were a shipping CLI flag. It isn't — `ServeOptions` has no field, neither `runQwenServe` nor `serve.ts` plumbs one, and the production daemon default is hardcoded to `'single'`. Strike the references; note that #4175 may add the flag in a follow-up. Test coverage expanded: - Cap-bypass guard: per-request `'thread'` overrides cannot bypass `maxSessions` on a daemon-default-`'single'` deployment. Without this, a future refactor that gated the cap on `defaultSessionScope` instead of `effectiveScope` would silently let `'thread'` overrides amplify past the limit — the exact N-amplification cliff #3803 was about. - Symmetric mixed-scope leak: daemon-default-`'thread'` + single-first-call followed by omitted-scope-second-call must produce distinct sessions. Mirrors the existing daemon-default-`'single'` + thread-first leak regression. - Concurrent mixed-scope coalescing: simultaneous single + thread `spawnOrAttach` against the same workspace under slow `initialize` must not collide on `inFlightSpawns` (tracker keys differ by scope). - Updated invalid-scope rejection test to assert `InvalidSessionScopeError` instance + carried `sessionScope` field. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)	2026-05-16 23:54:20 +08:00
Dragon	8f54ae9c0f	feat(cli): add built-in status line presets with interactive dialog (#4120 ) * chore(skills): add codex reproduce workflows * feat(cli): add built-in status line presets with interactive dialog Replace the shell-command-only status line with a preset system that renders structured session info (model, context usage, git branch, token counts, etc.) without external commands. Users can configure which items to display via a new interactive dialog accessible through /statusline or the settings UI. - Add statusLinePresets module with 16 built-in item types - Add StatusLineDialog component with search, multi-select, and preview - Update /statusline command to open the preset dialog - Extend settings schema to support { type: "preset", items: [...] } - Enhance MultiSelect with separator items, active marker, and customizable checked text - Update Footer to support theme-colored preset output * fix(cli): refresh status line preset after saving * chore: remove codex reproduce skills * fix(cli): address status line preset review feedback	2026-05-16 23:22:11 +08:00
dreamWB	966b040359	feat(cli): readline Ctrl+P/N for history and selection navigation (#4082 ) * feat(cli): readline Ctrl+P/N for history and selection navigation Adds GNU-readline-style Ctrl+P (previous) and Ctrl+N (next) shortcuts to the qwen-code TUI so users coming from bash/zsh, Emacs, or Claude Code feel at home. The change has three orthogonal behavior groups: 1. Input prompt, history-versus-line-motion two-step edge Ctrl+P / Ctrl+N and the arrow keys behave identically and apply a two-step edge transition that matches GNU readline and Claude Code: inside a multi-line buffer they move the cursor between visual rows; on the top row with the cursor away from column 0 the first Up press snaps the cursor to column 0 without changing history, and only the second press walks one entry back. The mirror rule holds for Down at the last row (snap to end of line, then advance). After navigateUp the buffer is parked at offset 0 (the "start of older entry" landing position); after navigateDown setText's default end-of-text positioning keeps the cursor at the end. The same two-step rule applies to single-line buffers so the reverse-direction case the issue called out works: pressing Ctrl+N immediately after Ctrl+P loaded a single-line older entry (cursor at col 0) first snaps the cursor to end-of-line, and only the next Ctrl+N moves forward through the history. Bare k/j inside the input prompt remain ordinary typed letters — the vim aliases are selection-list shortcuts, not text-editing ones. 2. Selection lists: arrows, k/j, and Ctrl+P/N are interchangeable A new pair of Command bindings, SELECTION_UP and SELECTION_DOWN, is wired into the shared useSelectionList hook and every dialog that used to hand-roll an "up/down arrow only" or "up/k arrow + vim only" navigation check. Covered surfaces: the main selection-list hook itself, the MCP / extensions / agents / hooks / background- tasks / rewind / plugin-choice / ask-user-question dialogs, the memory dialog (both its file list and the auto-memory and auto-cleanup toggle panel above the list), the settings dialog list (with the in-place value editor's "block other keys while editing" guard preserved), and the manage-models dialog's top tabs row. The auth-provider wizard's Advanced Config focus rows and the resume-session picker's cross-mode arrows are extended with the readline Ctrl+P / Ctrl+N synonyms while keeping their existing arrow-key and (for the session picker) vim k/j semantics intact. 3. Selection surfaces that wrap an active text input AskUserQuestionDialog's "Other / type a custom answer" field, manage-models' search input, the resume-session picker's search field, and the auth-wizard's Context-window number input all coexist with the selection list on the same screen. In those surfaces typing k or j has to land in the text buffer, not scroll the surrounding list. The fix is to scope the input-aware handler to unambiguous non-letter shortcuts only — arrow keys plus readline-style Ctrl+P / Ctrl+N escape the text field, while bare letters (including k / j / p / n) are delivered to the active input. The keyBinding-level fix that backs this is the `{ key: 'k', ctrl: false }` / `{ key: 'j', ctrl: false }` clauses on SELECTION_UP / SELECTION_DOWN, which prevent Ctrl+K from accidentally matching SELECTION_UP and thereby firing both the list-up handler and the KILL_LINE_RIGHT handler in the same keystroke (the P0 finding the quality-gate review surfaced). Focus-traversal tokens (the agent tab bar and the background-task pill) and chord shortcuts (Ctrl+Shift+Up/Down for embedded-shell history) are deliberately left untouched because their existing "any printable letter yields focus back to the composer" UX would break under the new vim-style letter bindings, and the Help viewer's scroll is a viewer rather than a selection list and is out of this PR's scope. Documentation: docs/users/reference/keyboard-shortcuts.md is updated so the Ctrl+P / Ctrl+N entries describe the two-step edge rule and the radio-button-select table mentions the new k/j and Ctrl+P/N aliases. Per-dialog on-screen hints (which still read "↑↓ to navigate") are intentionally not touched so the i18n string surface stays unchanged; the global reference doc is the authoritative source for the new shortcuts. Tests: - packages/cli/src/ui/keyMatchers.test.ts adds positive cases covering ↑ / ↓ / bare k / bare j / Ctrl+P / Ctrl+N matching SELECTION_UP / SELECTION_DOWN and negative cases asserting that Ctrl+K and Ctrl+J do NOT match (the conflict guard). - packages/cli/src/ui/components/InputPrompt.test.tsx adds a "two-step edge transition for history navigation" describe block with four cases: a mid-line Ctrl+P snaps to col 0 without invoking navigateUp; an at-col-0 Ctrl+P does invoke navigateUp and then parks the cursor via moveToOffset(0); a not-at-end Ctrl+N snaps to end-of-line without invoking navigateDown; and arrow Up obeys the same rule as Ctrl+P for keyboard-parity. The test file's mock buffer's setText was also corrected to mirror the real buffer's "cursor lands at the end of the new text" semantic so the cursor field is internally consistent during keypress assertions; the small InputPrompt render-frame snapshot in the same file's __snapshots__/ directory was regenerated to reflect the now- accurate cursor render position. Three pre-existing arrow-key navigation tests were updated to pre-position the mock cursor at the relevant edge before pressing the arrow, because the new two-step rule means the first arrow press at a non-edge position is a cursor snap, not a history step. Multi-line cursor-between- rows movement is covered indirectly by the keyBinding-level matcher tests plus the end-to-end manual demo plan. The work landed in three rounds against the planner's gate: round 1 added the unified SELECTION_UP / SELECTION_DOWN Command binding and the cursor-first dispatch in the input prompt; round 2 picked up the quality-gate review's P0 (the Ctrl+K double-fire in the "Other" custom-input field) and the user's hand-test feedback on the missing two-step edge in the reverse direction plus the MemoryDialog top-panel sections that weren't wired through SELECTION_; round 3 swept the remaining adjacent dialogs (SettingsDialog list, ManageModelsDialog tabs and search transitions, ProviderSetupSteps advancedConfig, useSessionPicker's cross-mode arrows) so the keyboard model is uniform across the TUI. The original issue also asks for Meta+B / Meta+F word motion and smarter Ctrl+H token-aware backspace among other readline conveniences. The user explicitly scoped this PR down to Ctrl+P / Ctrl+N at the planner approval gate; the remaining wish-list items are deferred to follow-up issues. Closes #3821 docs(cli): refine Ctrl+P/N input-history rows; fix Ctrl+J in selection-list comment Both items came from a non-blocking COMMENTED review on PR #4082 (https://github.com/QwenLM/qwen-code/pull/4082#pullrequestreview-4271527787), flagging two polish points in the readline Ctrl+P/Ctrl+N feature the parent commit `feat(cli): readline Ctrl+P/N for history and selection navigation` (`f66427b`) introduced. The `Up Arrow`, `Down Arrow`, `Ctrl+P`, and `Ctrl+N` rows of the Input Prompt table in `docs/users/reference/keyboard-shortcuts.md` are reworded to describe the three-phase keystroke sequence the implementation walks through — an intra-buffer visual-row step (a no-op in a single-line buffer, where there's exactly one visual row), a column-edge snap when the cursor reaches the buffer's first or last visual row with the cursor not already at column 0 (for the up-direction pair) or end-of-line (for the down-direction pair), and the readline-style previous-history or next-history walk on the press after the snap. The reviewer specifically pointed out that the prior wording described single-line input as "navigates the input history directly", which no longer matches the post-PR-#4082 behavior: single-line input also goes through the snap-then-walk two-press rule (the snap is a no-op when the cursor is already at the line's edge column, in which case the keystroke does the history walk on its first press). The new sentence covers the single-line and multi-line cases in one shape — single-line is the degenerate zero-row-walk-prefix instance of the same rule. The up-direction text is shared verbatim between the `Up Arrow` row (L31) and the `Ctrl+P` row (L43), and the down-direction text between the `Down Arrow` row (L27) and the `Ctrl+N` row (L42), so the keyboard- parity alias relationship is signaled by source-side text duplication rather than a prose cross-reference. The Input Prompt table's 234-byte canonical row width (the separator row's `\| <50-dash> \| <177-dash> \|` template, which sets the column-1 and column-2 source-side widths the file's existing untouched rows already align to) is preserved by trailing-ASCII-space padding inside the description column. The comment above `[Command.SELECTION_UP]` and `[Command.SELECTION_DOWN]` in `packages/cli/src/config/keyBindings.ts` previously read // Selection list navigation — up/k/Ctrl+P move selection up; down/j/Ctrl+N move selection down // ctrl: false on k/j ensures Ctrl+K (kill-line) and Ctrl+N (history-down) are not captured here The `Ctrl+N` half of the second line is wrong: `Ctrl+N` is intentionally matched here as the selection-down readline alias — the `{ key: 'n', ctrl: true }` entry in the `SELECTION_DOWN` array literal directly below the comment, mirroring the input-prompt-side `[Command.HISTORY_DOWN]: [{ key: 'n', ctrl: true }]` binding at L134 of the same file. The Ctrl-modified key the bare-letter `k` and `j` matchers actually guard against — the one already bound elsewhere whose double-match with the bare-letter selection-key the `ctrl: false` opt-out is preventing — is `Ctrl+J`, the ASCII line-feed (0x0A) encoding of the Enter family that appears as `{ key: 'j', ctrl: true }` inside the four-alternative `[Command.NEWLINE]` array a few lines below. The corrected one-liner is // Selection-list nav: arrows + k/j + Ctrl+P/Ctrl+N // ctrl: false on bare k/j skips Ctrl+K and Ctrl+J in the same terse no-trailing-period section-label style as the file's adjacent `// Screen control` (L129), `// History navigation` (L132), `// Auto-completion` (L213, post-edit numbering), and `// Text input` (L219) header comments. A 64-line block-comment that earlier in the review-fix cycle wrapped this same correct fact in dispatch-broadcast- model prose plus `keyMatchers.test.ts` backreferences was condensed to those two lines for cell-budget consistency with the rest of the file. No code behavior change. The local verification surface the reviewer named at the bottom of the review summary stays green: from `packages/cli`, npx vitest run \ src/ui/keyMatchers.test.ts \ src/config/keyBindings.test.ts \ src/ui/components/InputPrompt.test.tsx runs 178 cases with 177 passed and one unrelated skip (the implementation file `InputPrompt.tsx`'s feature flag for the keyboard- queue-input-editing case that was already skipped on the parent commit), including all four cases inside the `InputPrompt > two-step edge transition for history navigation` describe-block — `Ctrl+P with cursor mid-line snaps to col 0 without touching history`, `Ctrl+N with cursor not at end-of-line snaps to end without touching history`, `Ctrl+P at col 0 walks history and parks the cursor at offset 0`, and `arrow Up applies the same two-step rule as Ctrl+P (snap before navigate)`. Those four test-case names are the implementation-side anchors the new docs wording verbally mirrors. `npx tsc --noEmit -p .` in the same package directory reports zero diagnostics. * fix(cli): align readline history shortcuts with dialogs * test(cli): cover readline navigation aliases * fix(cli): guard readline shortcuts in dialog inputs * test(cli): cover readline aliases in more dialogs	2026-05-16 23:07:25 +08:00
tanzhenxin	8d765fec78	refactor(core): TaskBase envelope + foreground subagent persistence (#3970 ) * refactor(core): TaskBase envelope + foreground subagent persistence Establishes a shared `TaskBase` envelope across the agent / shell / monitor task registries with a mandatory `outputFile` field. Brings the foreground subagent path into compliance with the new contract, so it now leaves the same JSONL transcript + meta sidecar on disk that backgrounded subagents have always produced — closing the only gap where a registered task wrote nothing. Renames the agent-task discriminator from `flavor: 'foreground' \| 'background'` to claw-code's `isBackgrounded: boolean`; the deprecated names are kept as one-release type aliases. PR 1 of the task-registry-unification design. PR 2 will collapse the three per-kind registries into one thin TaskRegistry plus per-kind modules. * refactor(core): drop unused BackgroundTaskFlavor type alias The alias only preserved the type name; no in-tree caller used it, and after the field rename no realistic external consumer use survives (reading entry.flavor / writing { flavor: ... } both fail at the use site regardless of whether the alias resolves). Drop it instead of carrying a hollow shim. * fix(core): tighten foreground subagent launch path - Register before writing the meta sidecar so a register() failure can't leave an orphaned 'running' meta file behind. writeAgentMeta is best-effort and never throws, so the inverse failure mode (registry entry without sidecar) is a benign degradation. - Cache getGitBranch by cwd at the agent module level so foreground launches don't pay a fresh git rev-parse exec each time. Branches don't change within a process under normal use; the transcript annotation is best-effort audit metadata. - Document on cancel() that foreground entries take a partial path through the method — Map deletion is the caller's responsibility via unregisterForeground() in the tool-call's finally path. * fix(agent): correct foreground meta status mapping and register order The foreground finally block in agent.ts mapped any non-ERROR, non-CANCELLED terminate mode (including MAX_TURNS, TIMEOUT, SHUTDOWN) to 'completed' in the sidecar, so post-mortem readers and resume logic saw a successful status for runs that actually hit a guardrail. Flip the ternary to mirror the background path: GOAL -> completed, CANCELLED -> cancelled, else -> failed. Also reorder the background launch so registry.register() runs before writeAgentMeta(), matching the foreground path. Both paths now share the same orphaned-meta guarantee. * test(agent): rename stale foreground-flavor test The "default flavor (absent) behaves as background" test name and its backwards-compat comment referenced the old optional flavor field, but the registration shape has required isBackgrounded for a while now — there is no "absent" path to exercise. Rename it to describe what the assertion actually covers: that background entries fire a task- notification on complete. * refactor(core): alias BackgroundTaskStatus to TaskStatus The local `BackgroundTaskStatus` union was byte-identical to the new shared `TaskStatus` defined in `tasks/types.ts`. Replace it with a `@deprecated` type alias so external consumers (notably `nonInteractiveCli.ts`) keep compiling unchanged while the canonical name lives in one place. * refactor(core): tidy monitorRegistry signatures and document cancel ordering Two small consistency wins flagged in review: 1. `dispatchOwnerLifecycleWake` and `dispatchNotification` were the only methods on the registry still typed with the deprecated `MonitorEntry` alias. Rename their parameters to `MonitorTask` to match every other signature in the file. 2. `cancel()` orders `settle()` and `abort()` differently between its two branches, which is intentional (silent cancel locks the terminal status before abort listeners run; default cancel lets a naturally-completing operation settle through its own terminal path). Document that asymmetry in a JSDoc on the method so the next reader doesn't have to reverse-engineer it. * refactor(core): migrate internal BackgroundTaskStatus refs to TaskStatus The `BackgroundTaskStatus` alias was introduced in `91b59a8fb` as a `@deprecated` synonym for external SDK consumers (notably `nonInteractiveCli.ts`). New internal references in this PR's own file kept the old name; migrate them so the only remaining usage of the deprecated alias is the alias declaration itself. No behavior change — the alias is `= TaskStatus` so the union is identical. * test(agent): cover foreground failed-mode terminal status mapping The foreground finally block maps GOAL→completed, CANCELLED→cancelled, and everything else (ERROR, MAX_TURNS, TIMEOUT, SHUTDOWN) → failed. Only the GOAL branch was asserted; the failed-mode fallback had no coverage even though the same mapping recently regressed (d67da6d50) and had to be fixed by review. Adds a table-driven case mocking getTerminateMode to ERROR / MAX_TURNS / TIMEOUT and asserting patchAgentMeta receives status: 'failed'. CANCELLED is already covered by the "foreground CANCELLED prefixes the partial result" test below. * test(agent): cover foreground CANCELLED → cancelled meta mapping Extends the foreground terminate-mode it.each to assert that CANCELLED is recorded as `cancelled` in the on-disk sidecar — the existing cancel-prefix test only verified the LLM-visible payload, leaving the patchAgentMeta mapping uncovered. A regression flipping CANCELLED → 'failed' would now fail this case. * test(agent): make registry path assertions platform-agnostic The outputFile/metaPath regexes hardcoded forward slashes, so the foreground JSONL+meta reservation test failed on Windows where paths use backslashes. Accept either separator. * fix(core): guard executeBackground register-throw window; correct outputFile contract A throwing register() subscriber in executeBackground() would leak the already-spawned child + open output stream, unreachable by /tasks / task_stop. Mirror the promote path's defensive try/catch: abort the entry's controller, destroy the stream, and rethrow so the launch fails visibly. Also correct the TaskBase.outputFile contract: agent JSONL is materialized on the writer's first append, which is the launch prompt at attach time — not the first runtime event. A subagent cancelled before any event still leaves a prompt-only JSONL plus meta, not meta alone.	2026-05-16 22:53:08 +08:00
jinye	379d14ad00	feat(rewind): add file restoration support to /rewind command (#4064 ) * feat(rewind): add file restoration support to /rewind command (#3697) Previously /rewind only truncated conversation history — files modified by the assistant remained on disk. This adds a file-copy-based backup system (ported from claude-code's fileHistory) so users can optionally roll back file changes when rewinding. Core changes: - New FileHistoryService with snapshot/backup/restore lifecycle - trackEdit() called before each file write in edit and write-file tools - makeSnapshot() at each user turn boundary in client.ts - Three-phase RewindSelector UI: pick turn → choose restore option → execute - RestoreOption type: 'both' \| 'conversation' \| 'code' \| 'cancel' Closes #3697 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): replace findLast with reverse loop for ES2022 compat vscode-ide-companion targets ES2022 which lacks Array.findLast. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): add missing i18n translations and fix test expectation - Add file restore i18n keys to all 8 locale files (zh-TW, ca, de, fr, ja, pt, ru were missing) - Update useGeminiStream test to expect promptId in user history item 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): add getFileHistoryService mock to tool tests edit.test.ts and write-file.test.ts mock configs lacked the new getFileHistoryService method, causing trackEdit calls to throw. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): allow Esc during diff loading and add missing i18n footer strings Allow users to press Esc/Ctrl+C to cancel during diff stats loading phase. Add three missing footer navigation strings to all 9 locale files. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(rewind): address review feedback — restoreBackup correctness, missing promptId warning, dead code removal - restoreBackup now returns boolean; applySnapshot only counts a file as restored when the backup was actually applied (fixes misleading "Restored N file(s)" when backup is missing on disk) - Show warning when user selects file restore on a turn created before file checkpointing was enabled (promptId undefined) - Remove unused snapshotSequence field, canRestore(), and hasAnyChanges() methods that had no callers * fix(rewind): correct diff direction, truncate snapshots on rewind, add zero-files feedback - Swap diffLines args to diffLines(backup, current) so +/- stats match git convention (insertions = lines added since checkpoint) - Truncate snapshots after rewind to discard stale timeline state, preventing makeSnapshot from using wrong baseline - Show "No files needed restoration." when rewind finds files already at target state (all 9 locales) * test(tools): assert trackEdit is called before file writes * fix(i18n): add missing rewind UI locale keys across all 9 locales * fix(core): reset fileHistoryService on session change, clean up dead code - Reset fileHistoryService in startNewSession() so /clear gets a fresh instance with the new sessionId - Rebuild trackedFiles after rewind() to avoid stale stat() calls - Remove unused setCurrentPromptId/getCurrentPromptId dead API * fix(rewind): validate conversation before file restore, preserve snapshots for code-only - For 'both': validate conversation can be truncated before restoring files to prevent inconsistent state (files rolled back but conversation stays at newer state) - For 'code'-only: pass truncateHistory=false so snapshot timeline is preserved — conversation turns remain visible and their snapshots stay available for future rewinds * fix: correct trackEdit race comment — overwrite not orphan * fix(types): use HistoryItemWithoutId for addItem to preserve union member properties * fix(types): revert addItem type change, use cast at call site for promptId * fix(rewind): guard onRewind calls with .catch() to prevent unhandled rejection * fix(rewind): only truncate snapshot timeline when conversation truncation will execute * fix(rewind): address tanzhenxin review - gate, partial failure, tests 1. Disable file checkpointing for non-interactive (-p) mode by gating on `params.interactive !== false` in addition to `!params.sdkMode`. 2. Surface partial restore failures: `rewind()` now returns `RewindResult { filesChanged, filesFailed }`. In "both" mode, conversation truncation is skipped when any file fails to restore, preventing inconsistent state. 3. Add comprehensive unit tests for FileHistoryService (17 tests covering trackEdit, makeSnapshot, rewind, eviction, diffStats). * fix(rewind): defensive trackEdit + fix version collision on re-track 1. Wrap trackEdit calls in edit.ts and write-file.ts with try/catch so file history failures never break core tool operations. 2. Replace hardcoded version:1 in trackEdit with max-version lookup across all snapshots. Prevents backup file overwrite when the same file is re-tracked after a code-only rewind (truncateHistory=false). * fix(rewind): add missing i18n keys + fix makeSnapshot version collision 1. Add 'Failed to restore {{count}} file(s): {{files}}' to all 7 missing locales (ca, de, fr, ja, pt, ru, zh-TW). 2. Use global max-version scan in makeSnapshot (same as trackEdit) to prevent backup filename collisions after snapshot eviction. * fix(rewind): set hasRestoreFailure when promptId is missing In "both" mode, if the target turn has no promptId, conversation truncation was still proceeding because hasRestoreFailure was not set. Now correctly blocks truncation to prevent inconsistent state. * fix(rewind): show loading state during async restore, close selector in finally Defer setIsRewindSelectorOpen(false) to a try/finally block so the selector stays visible during async file restore. RewindSelector now manages its own isRestoring state: shows "Restoring..." text and disables all keypress handlers while the restore is in progress. This prevents the user from seeing a bare prompt with no progress indicator during slow restores, and eliminates the race where typing during restore could clobber the pre-filled prompt. * fix(rewind): skip timeline truncation on partial failure + fix wording 1. rewind() now only truncates the snapshot timeline when filesFailed is empty, preventing loss of future checkpoints when the caller skips conversation truncation due to failures. 2. Change "No files needed restoration." to the more idiomatic "No files needed to be restored." across all 9 locales. * fix(rewind): address review — TOCTOU in createBackup + outer catch in handleRewindConfirm - Extract safeCopyFile(src, dst) helper that distinguishes source-missing (TOCTOU: file deleted between stat and copyFile) from target-dir-missing, so trackEdit no longer silently fails when a file disappears mid-backup. Same helper now covers restoreBackup. - Wrap handleRewindConfirm with an outer catch that surfaces unexpected failures via historyManager error item; previously a sync throw from the post-rewind block would silently close the selector and leave 'both' mode in a half-applied state. - Add 'Rewind failed: {{error}}' i18n key in all 9 locales. * test(rewind): cover restoreFromSnapshots, trackEdit no-snapshot path, partial-failure timeline guard - restoreFromSnapshots: assert relative-path shortening + external-path preservation - trackEdit before any makeSnapshot: assert no-op early return - rewind truncation guard: assert snapshot timeline is preserved when filesFailed > 0 * fix(rewind): clean up orphaned backups, surface no-client states, polish - Per-eviction backup cleanup: when MAX_SNAPSHOTS overflow or rewind truncation drops snapshots, remove backup files no longer referenced by any surviving snapshot (best-effort, ENOENT-tolerant). Backup files are content-deduplicated across snapshots, so the live-set is computed from survivors before deletion. - Surface no-client failure modes in handleRewindConfirm: 'conversation' mode now shows an error instead of silently returning; 'both' mode shows an info message after restore so the user knows the conversation half was skipped. - i18n the previously hardcoded 'Conversation rewound...' message and add 3 new keys to all 9 locales. - Tighten createBackup signature (drop unreachable null branch). - Extract getMaxVersion helper to deduplicate identical loops in trackEdit and makeSnapshot. Tests added: orphan-cleanup on overflow, dedupe preservation, rewind truncation cleanup. All existing tests continue to pass (23 core, 71 AppContainer, 27 i18n). * fix(rewind): use path separator constant in maybeShortenFilePath The hardcoded '/' check meant Windows absolute paths (with '\') never matched the cwd prefix, so the shortening was a no-op on Windows. The new cleanup tests revealed this by asserting on the relative-path key: on Windows the key was the full absolute path, so trackedFileBackups lookups returned undefined. Switching to the platform sep also makes Windows snapshots use the relative key like POSIX, improving portability if cwd moves later. restoreFromSnapshots re-runs maybeShortenFilePath on every key, so existing on-disk sessions migrate transparently on resume. * test(rewind): cover trackEdit best-effort guarantees and unchanged-file rewind - edit.test.ts: assert tool still completes (file written, llmContent reflects the edit) when FileHistoryService.trackEdit rejects. - write-file.test.ts: same for the write_file tool. - fileHistoryService.test.ts: assert trackEdit swallows createBackup failures (forced via storageDir-replaced-with-file → ENOTDIR in recursive mkdir) without recording any backup. - fileHistoryService.test.ts: assert applySnapshot leaves a file untouched (mtime unchanged, filesChanged empty) when its content already matches the target backup — covers the checkOriginFileChanged short-circuit. * fix(rewind): align fileCheckpointing default + surface backup-missing on rewind Two issues from a Codex review pass: - Config: `fileCheckpointingEnabled` defaulted via `params.interactive !== false`, which resolves truthy when the caller omits `interactive` — but `this.interactive` itself defaults to `false`. Headless/programmatic callers that did not set `interactive` would silently start writing file-history backups under `~/.qwen/file-history/`. Use the same `?? false` default so the gate matches the resolved interactive value. - checkOriginFileChanged: when the on-disk backup AND the working file have both been removed externally, the function returned `false` ("unchanged"), so `applySnapshot` skipped `restoreBackup` and rewind reported success even though the target snapshot expected the file to exist. Treat any failure to stat the backup as "changed" so callers attempt the restore: applySnapshot surfaces the missing backup via restoreBackup → filesFailed, makeSnapshot creates a fresh backup. Added a regression test for the both-missing path. * fix(rewind): mark per-file backup failures so rewind surfaces them Two related issues from a /review pass: 1. Silent data loss in makeSnapshot inheritance: when the per-file backup attempt threw inside makeSnapshot, the catch block left the path missing from `trackedFileBackups`, and the inheritance loop then copied the previous snapshot's backup into the new snapshot. A later rewind to that snapshot would restore older content while reporting success. Now the catch records `{ failed: true, ... }` for the path. The inheritance loop skips paths already present in trackedFileBackups, so failed paths are no longer paved over by stale carryover. Both applySnapshot and getDiffStats honor `failed` — rewind pushes the path to filesFailed and the diff preview omits it. 2. Marketing/scope mismatch: the rewind UI offers "Restore code" but the feature only tracks edits made via the `edit` and `write_file` tools — shell-mediated changes (`sed -i`, `cp`, `rm`, `mv`, `npm`, etc.) and out-of-tool manual edits are not captured. Added a class-level JSDoc on FileHistoryService spelling out the scope, and an inline footer in the restore-options panel: "Rewinding does not affect files edited manually or via shell commands." (matching the upstream claude-code MessageSelector wording). New i18n key in all 9 locales. Test added: trackEdit/makeSnapshot per-file failure path. Asserts the new snapshot has `failed: true`, and that rewind to that snapshot reports the file as filesFailed instead of silently restoring the inherited stale backup. * fix(rewind): polish — i18n, type tightening, resumed-session UX hint Several small wins from the latest /review pass plus a UX mitigation for turns whose file-history snapshot is not present in memory (most often because the conversation came from a resumed session, but also when a turn has no captured edits): - AppContainer: wrap the "Cannot rewind to a turn that was compressed" error in t(); add the new key to all 9 locales. - RewindSelector: replace the inline `(+N -M in K file/files)` template literal with t() using two plural-aware keys; add to all 9 locales. - DiffStats.filesChanged: tighten from optional to required to match reality (every code path that returns a DiffStats sets it). Drops the `!.filesChanged!` non-null cascade in RewindSelector. - RewindSelector phase 2: when the option list does not contain code/both (i.e. no file-restore is actionable for this turn), show an explicit hint instead of leaving the user to guess why those options are missing. Same i18n key in all 9 locales. The mitigation hint covers the resumed-session case Tan raised (snapshots are not rehydrated by `/resume` today) without changing behavior — `getRestoreOptions` already gracefully degrades to conversation-only when `getDiffStats` returns undefined for a snapshot that is not in memory; we just surface the "why" to the user. * fix(rewind): unstick failed marker on the unchanged-file fast path The `failed: true` marker added in `d59838338` was sticky: once set, the no-change optimization in `makeSnapshot` would copy the failed entry forward into every subsequent snapshot for as long as the file stayed unchanged. A single transient I/O error therefore poisoned `/rewind` for that file until the user happened to modify the content again. Add `!latestBackup.failed` to the no-change reuse guard so a failed entry is never copied forward — the next snapshot retries the backup, which either heals (when the underlying I/O has recovered) or honestly records another failed entry. New regression test (`does not carry a failed marker forward when the file is unchanged`): - Snapshot p1 with file content X - Sabotage the storage dir → p2's per-file backup throws → p2 records failed: true - Restore the storage dir; file still equals X - p3 must NOT copy p2's failed entry; it must retry createBackup and produce a fresh non-failed entry that allows rewind to p3 to succeed	2026-05-16 22:16:01 +08:00
qqqys	0dde1ad704	feat(cli): add session-scoped /goal command with judge-driven turn continuation (#4123 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details * feat(cli): add session-scoped /goal command with judge-driven turn continuation `/goal <condition>` pins a free-form objective for the rest of the session. While a goal is active, an LLM judge runs at every Stop boundary and either lets the turn end (condition met) or feeds the judge's reason back as the next user prompt to keep the model working. Auto-clears on success; `/goal clear` cancels early. Same primitive as Anthropic's Claude Code 2.1.140 `/goal`, built on qwen-code's existing Stop-hook + function-hook plumbing — no new subsystem. Core (packages/core/src/goals/): - activeGoalStore: per-session active goal + last-terminal cache, with a terminal-observer channel the CLI subscribes to so achieved/aborted cards land in history. - goalJudge: side-query against a fast model, transcript-grounded system prompt + json_schema response + disabled thinking. Tolerant JSON extraction with fallback so a flaky judge can't kill the loop; 30s default timeout (vs. the 5s function-hook default that was silently killing real-world judge calls). - goalHook: function hook on Stop. Returns {decision:'block', reason} when not met (reusing client.ts's existing recursive continuation), {continue:true} when met. Self-clears active goal + notifies the terminal observer on met/aborted. MAX_GOAL_ITERATIONS=50 backstop. CLI: - goalCommand: /goal \| /goal <cond> \| /goal clear\|stop\|off\|reset\|none\| cancel. 4000-char cap, trust + disableAllHooks gates. Empty /goal shows running status, falls back to the last completed summary. - GoalPill: footer chip "◎ /goal active (12s)" — terse, claude-aligned. - GoalStatusMessage: set / checking / achieved / cleared / aborted history cards. "checking" replaces the generic stop_hook_loop chip for goal-driven iterations. - restoreGoal: on session resume, rehydrate the active goal hook + last-terminal cache from transcript so /goal survives /resume. Cross-cutting fixes: - HookSystem.hasHooksForEvent(eventName, sessionId?): also consults SessionHooksManager. Previously SDK / programmatic Stop function hooks were silently gated out by client.ts's fast-path check, so they never fired. - client.ts: yield StopHookLoop on every continuation iteration (was iter > 1) — first not-met turn is now visible in the UI. - useGeminiStream: commit pending item + clear thoughtBuffer / geminiMessageBuffer on every Finished event. Fixes a UI bug where a Stop-hook continuation's text bled into the prior turn's pending history item (cumulative "te" / "tes" rendering), even though the persisted transcript was clean. Co-authored-by: Qwen-Coder <noreply@qwen.ai> * test(cli): fix footer goal pill mock * fix(goal): persist terminal status on restore * fix(goal): harden judge hook * fix(goal): sanitize condition in instruction prompt and update matcher test - goalCommand.ts: collapse newlines and downgrade embedded double-quotes in the condition before splicing into the instruction prompt so the wrapping quote structure stays intact. - goalLoop.integration.test.ts: matcher assertion updated to '' to match the current registerGoalHook contract (previously ''). Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> feat(goal): surface judge reason on terminal cards Renders `Last check: <reason>` on the achieved / aborted history card and on the empty-`/goal` summary so the final view records why the judge ruled the goal complete. Uses a single inline-label Text instead of the flex-row split used for `Goal:` — the reason is capped at 240 chars and almost always wraps; the flex-row variant hangs the continuation at the value column's left edge (~12 cols of blank space, easily mistaken for a stray empty line). Single Text + natural wrap keeps the continuation flush. Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> * fix(goal): re-arm /goal on runtime /resume and /branch Cold boot path in AppContainer already calls restoreGoalFromHistory after loading session data, but the runtime /resume and /branch paths skipped it entirely. After /new + /resume back to a session that had an active /goal, the in-memory activeGoalStore entry still held the pre-/new setAt and a hookId pointing to a hook that config.startNewSession() had torn down — leaving the footer pill ticking from the original setAt (observable as "几十秒" elapsed immediately after resume) while the Stop hook was silently dead. Wire restoreGoalFromHistory into both handlers right after the session data lands so unregisterGoalHook clears the stale entry and registerGoalHook re-arms with a fresh setAt / hookId and re-installs the terminal observer. Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> * refactor(goal): reuse shared formatDuration utility Drop the duplicated local formatDuration from goalCommand.ts and GoalStatusMessage.tsx in favor of the shared formatters.ts version, called with { hideTrailingZeros: true }. The shared util already has its own test suite and matches Claude Code's ShellTimeDisplay style (round values drop zero-unit tails: `5m 0s` → `5m`). Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> * fix(goal): abort judge API call on judge timeout The judge-timeout path in judgeGoalWithTimeout only resolved a fallback verdict; the underlying judgeGoal generateContent call kept running because the hook context signal is never aborted by the timeout. Each timeout leaked one in-flight request that accumulated across goal-loop iterations. Link an AbortController into the judge signal and abort it when the timeout fires. Co-Authored-By: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(goal): harden judge continuation feedback * test(goal): align loop integration with safe continuation * fix(cli): harden goal resume lifecycle * fix(cli): address goal review blockers * fix(goal): guard stale same-condition callbacks --------- Co-authored-by: Qwen-Coder <noreply@qwen.ai> Co-authored-by: Qwen-Coder <noreply@alibabacloud.com> Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-16 18:14:13 +08:00
jinye	264ed82273	[codex] feat(serve): add capability registry protocol versions (#4191 ) * feat(serve): add capability registry protocol versions Introduce a serve capability registry and advertise protocolVersions from /capabilities while preserving the existing v1 envelope and Stage 1 feature aliases. Update SDK wire types, docs, and focused tests for old-daemon compatibility. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(serve): clarify capability advertisement semantics Address PR review feedback by preserving historical capability versions, separating registered and advertised feature helpers, testing protocol version metadata directly, and keeping runtime exports out of the serve types module. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> --------- Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-05-16 18:07:38 +08:00
qqqys	96b30ee427	feat(cli): add baseline /doctor memory diagnostics (#4180 ) * feat(cli): add baseline doctor memory diagnostics * fix(cli): address doctor memory review feedback * feat(cli): add doctor memory assessment * feat(cli): support doctor memory heap snapshots * feat(cli): add doctor memory sampling * fix(cli): harden doctor memory heap snapshots * fix(cli): harden doctor memory heap snapshots * fix(cli): harden memory heap snapshot diagnostics * fix(cli): harden doctor memory snapshots * fix(cli): stabilize heap snapshot cleanup ordering * fix(cli): harden heap snapshot cleanup * test(cli): cover memory snapshot fallbacks * fix(cli): harden doctor memory abort and disk checks	2026-05-16 17:19:50 +08:00
qqqys	372acf1444	feat(cli): argument hint + --auto completion for /rename (#4048 ) * feat(cli): argument hint + --auto completion for /rename Closes #4047. The /rename command supports a structured --auto flag (let the fast model generate a sentence-case title from the conversation), but unlike /model — which advertises --fast via argumentHint and a completion entry — /rename's flag was undocumented inline. Users had to either run the command incorrectly or check the docs to learn about --auto. - argumentHint: '[--auto] [<name>]' so the completion menu shows the shape when the user types `/rename` and tabs. - completion: returns null on empty / free-text input (don't shadow the user typing a title) and surfaces --auto when the partial arg is a prefix of it ('-', '--', '--a', '--au', '--auto'). Same shape as /model's --fast handling. Free-text titles intentionally don't auto-complete — there's nothing meaningful to suggest, and offering --auto on every keystroke would feel like noise on `/rename my-feature`. Tests: - pins argumentHint shape - empty partial → null - '-' / '--' / '--a' / '--au' / '--auto' all return the --auto suggestion - 'my-feature' / 'fix bug' / '-x' return null (free-text path) Co-Authored-By: Qwen-Coder <noreply@qwen.ai> * fix(core): fall back to text JSON when generateJson gets no tool call generateJson registers schemas as a respond_in_schema function declaration and walks parts[].functionCall for the result. When no tool_choice is set (the OpenAI-compatible converter never sets one) and the system prompt explicitly asks for text JSON — e.g. session-title generation's "Return ONLY a JSON object..." — some models honor the prompt and emit the answer as a plain text part instead of calling the tool. The answer is semantically correct; we just weren't reading it. This bottoms out in /rename --auto as "The fast model returned no usable title" on qwen3.6-max-preview, and likely affects every other generateJson caller (next-speaker checker, edit corrector, etc.) on the same class of model. Add a tolerant fallback: when no function call comes back, parse getResponseText(result) — which already skips thought parts — with a JSON-object extractor that strips optional ```json fences and reads the outermost {...} block. Strictly additive; the function-call path stays primary. Closes #4057. Co-Authored-By: Qwen-Coder <noreply@qwen.ai> * refactor(cli): unify /rename and /rename --auto pipelines Bare /rename (no args) used to call a private generateKebabTitle path that asked the fast model (or main-model fallback) for a 2-4 word kebab-case name via a plain text call. /rename --auto used the schema-enforced tryGenerateSessionTitle path for a 3-7 word sentence- case title. Two code paths, two prompts, two failure-message formats, two sanitizers — with the kebab path consistently lagging on history filtering, surrogate handling, and error specificity. Collapse to a single fast-model schema-enforced pipeline. Both bare /rename and /rename --auto now call tryGenerateSessionTitle and both record titleSource: 'auto' on success. The --auto flag stays as an explicit user-intent marker (preserves the existing argumentHint / completion / parseArgs surface) but no longer diverges semantically. Bare /rename now also hard-requires fastModel; users who relied on the main-model fallback need to either /model --fast <name> or pass a name explicitly (/rename <name>). The new failure message points at both options. Co-Authored-By: Qwen-Coder <noreply@qwen.ai> * fix(cli): clarify rename title failure * test(core): cover loose json fallback --------- Co-authored-by: Qwen-Coder <noreply@qwen.ai>	2026-05-16 16:47:15 +08:00
jinye	435f711e33	feat(cli): warn users that rewind is disabled in IDE mode (#4122 ) Some checks are pending Qwen Code CI / Classify PR (push) Waiting to run Details Qwen Code CI / Lint (push) Blocked by required conditions Details Qwen Code CI / Test (macos-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (ubuntu-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Test (windows-latest, Node 22.x) (push) Blocked by required conditions Details Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions Details Qwen Code CI / CodeQL (push) Blocked by required conditions Details E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run Details E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run Details E2E Tests / E2E Test - macOS (push) Waiting to run Details	2026-05-15 20:27:37 +08:00

1 2 3 4 5 ...

2823 commits