mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-25 23:05:49 +00:00


jinye 103090669e feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) (#4249 ) * feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) Adds the first Wave 4 mutation route surface: workspace-scoped memory and subagent CRUD over HTTP. Remote clients (TUI / channels / web / IDE adapters) can now list, read, create, update, and delete subagent definitions and read / append / replace QWEN.md without disturbing session state. Routes: - GET /workspace/memory (read-only snapshot) - POST /workspace/memory (append/replace, strict-gated) - GET /workspace/agents (list project + user + builtin) - POST /workspace/agents (create-only; 409 on collision) - GET /workspace/agents/:agentType (full detail incl. systemPrompt) - POST /workspace/agents/:agentType (update; 403 read-only on builtin) - DELETE /workspace/agents/:agentType (idempotent for SDK callers) Mutation paths use mutate({ strict: true }) from PR 15 so they refuse unauthenticated requests even on no-token loopback defaults. Workspace mutations validate X-Qwen-Client-Id against bridge.knownClientIds() and stamp originatorClientId on emitted events. Capability tags added: workspace_memory, workspace_agents. New typed events fanned out via bridge.publishWorkspaceEvent (best- effort to every active session bus; read-after-write is the contract): - memory_changed { scope, filePath, mode, bytesWritten } - agent_changed { change, name, level } writeContextFile.ts is the new core helper that resolves QWEN.md placement (workspace vs ~/.qwen) and append-vs-replace semantics. Whitespace-only appends short-circuit before fs.writeFile, so a no-op POST does not bump mtime or fan out a misleading event. SubagentManager is wrapped with a CRUD-scoped Config stub via Proxy: only getSdkMode / getProjectRoot / getActiveExtensions are stubbed (verified against subagent-manager.ts; getToolRegistry is execution- path only). 
Any future Config method touched on a CRUD path throws immediately so dependency creep is visible. Auto-memory CRUD, persistent audit log, and the EACCES → NOT_FOUND unlink mapping in core SubagentManager.deleteSubagent are explicit follow-ups (PR 16.5 / PR 24 / separate fix). Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 348/348, writeContextFile 10/10, SDK 335/335 - eslint: clean * fix(serve): address Codex P2 review on PR 16 (#4175 Wave 4 PR 16 follow-up) Three correctness issues Codex flagged on the just-shipped workspace memory + agents CRUD surface: 1. Concurrent POST /workspace/memory append no longer loses writes. Two simultaneous appends would each read the same existing file, compose new content in JS memory, then race the fs.writeFile — the later write silently overwrote the earlier appended entry. Add a per-resolved-path Mutex map (mirroring jsonl-utils.ts's fileLocks pattern) and wrap the entire read-compose-write sequence in runExclusive. 2. GET /workspace/agents now reflects out-of-band file changes. SubagentManager.listSubagents() default served the in-memory cache; developer / IDE adapter edits to .qwen/agents/.md never appeared even though GET /workspace/agents/:agentType always reads disk. Pass { force: true } so the LIST route walks disk every call, matching the detail route's "filesystem is the source of truth" contract. 3. Reject builtin agent names on POST /workspace/agents to prevent undeleteable shadow files. A client could write a project-level agent named "general-purpose" — list/load resolved the shadow first, but SubagentManager.deleteSubagent's name-based builtin guard (subagent-manager.ts:302) rejected DELETE forever. Add a BuiltinAgentRegistry.isBuiltinAgent check in parseAgentConfig so the conflict surfaces at create time instead of trapping the file beyond the API. The check is case-insensitive, matching the resolver's case-insensitive cascade. 
New tests: - writeContextFile.test.ts: 10 parallel appends, all 10 entries must survive in the final file (would fail without the mutex). - workspaceAgents.test.ts: GET /workspace/agents observes a freshly-written agent file on the second call (force-refresh proof); POST with name="general-purpose" returns 422 + the case-insensitive variant "explore" too. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 351/351 (was 348, +3 new), writeContextFile 11/11 - eslint: clean fix(serve): apply round-1 review fold-in 2a (HIGH + CodeQL) on PR 16 Round-1 inline review (#4249) flagged ~28 items across Copilot, wenshao, and CodeQL. This commit lands the HIGH-severity correctness fixes plus the two CodeQL polynomial-regex warnings. Validation tighten — `parseAgentConfig` + `parseAgentUpdates`: - Trim leading/trailing whitespace on `name` before passing to SubagentManager. `" tester "` would otherwise create a frontmatter name with spaces that case-insensitive lookups can never find. - Fail-closed (422 invalid_config) on present-but-wrong-type optional scalars: `model`, `color`, `approvalMode`, `background`. Previously malformed values silently dropped through validation, masking client-serialization bugs. - Validate `approvalMode` against the `APPROVAL_MODES` enum on both create and update; an unknown value used to 201 with the field silently omitted from the saved file. - `runConfig` is now whitelist-sanitized to `{ max_time_minutes, max_turns }` only; unknown keys are dropped, malformed values return 422. Previously the whole input object was persisted verbatim into YAML frontmatter. - `?scope=` query is fail-closed for repeated values (`?scope=workspace&scope=global`) — Express parses these as arrays which the previous `typeof === 'string'` check silently treated as absent, broadening DELETE/UPDATE semantics from one level to both. - Empty update body returns 400 invalid_config (previously rewrote the file + emitted a misleading `agent_changed` event). 
- No-op updates (every supplied field already matches `existing`) return 200 + `changed: false` and SKIP the file rewrite + event fan-out. Memory write helper — `writeContextFile.ts`: - Move whitespace-only no-op detection BEFORE `fs.mkdir`. Without this, an empty POST still created the parent directory and bumped its mtime even though `changed: false` was reported. - Replace two polynomial regex patterns flagged by CodeQL (`/^\s+\|\s+$/g` and `/^\n+\|\n+$/g`) with hand-rolled `while` loops. Same pattern auth.ts:120-125 already uses for the same CodeQL rule. SDK — `DaemonClient.ts` + `types.ts`: - `DaemonWriteMemoryResult` gains optional `changed?: boolean` so typed callers can suppress redundant cache invalidation on no-op appends. Optional for forward-compat with daemons that predate the field — undefined treats as "changed: true" (legacy contract). - `deleteWorkspaceAgent` only swallows 404 when the body's `code` is `agent_not_found`. A bare 404 (older daemon, misrouted proxy, generic gateway page) now throws — previously the SDK silently reported success even when the request never reached a route that understands workspace agents. - `updateWorkspaceAgent` adds an optional `scope` parameter mirroring `deleteWorkspaceAgent`, so callers can target the user- level definition when a project-level agent shadows it. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 357/357 + writeContextFile 12/12 = 369/369 passing (was 362; +7 new) - eslint: clean Explicitly NOT applying (out of scope per issue #4175 PR 16 review-resolution policy): - Copilot's "strict gate after body parser" finding — already documented as PR 15 review-resolved tradeoff at auth.ts:256-269. * fix(serve): apply round-1 review fold-in 2b (MEDIUM + tests) on PR 16 MEDIUM hardening: - Fix the JSDoc on `collectWorkspaceMemoryStatus` to match the workspace-root-only discovery the implementation actually does today. 
The 32-iteration upward walk is reserved for a future hierarchical mode but breaks after iteration 1 in v1. - Lower the depth limit on `walkWorkspaceForMemory` from 32 → 12. Realistic project depth sits well below 8; 12 leaves headroom without amplifying blast radius from symlink cycles. - Daemon `Config` Proxy now defines a `has` trap symmetric to the existing `get` trap. Without it, a future SubagentManager path doing `'someMethod' in this.config` would silently get `false` and bypass the safety net the throw-on-unknown-property design installed. - Preflight `manager.loadSubagent(name, level)` before `manager.createSubagent`. The default-path collision check inside SubagentManager would otherwise miss same-frontmatter-name + different-filename collisions; the preflight makes 409 agent_already_exists deterministic. - Multi-level DELETE now emits one `agent_changed` event per level that actually had a file removed. Previously an unscoped DELETE removing both project and user shadows would publish only one event with one level — misleading subscribers using event metadata for toasts / audit / echo-suppression. Test additions (covers the new event types + bridge fan-out + SDK helpers): - `daemonEvents.test.ts`: predicate narrowing for `memory_changed` / `agent_changed` (rejects malformed scope/mode/level), reducer records `lastWorkspaceMutation` + `lastWorkspaceMutationType` with latest-event-wins semantics and stays non-terminal. - `httpAcpBridge.test.ts`: `publishWorkspaceEvent` fans out to every active session bus; `knownClientIds()` aggregates clientIds across sessions and the returned set is a snapshot (mutating it does not affect future calls). - `workspaceAgents.test.ts`: success-path test stamping `originatorClientId` on the create / update / delete events for a known client. 
- `DaemonClient.test.ts`: 7 round-trip tests for the new SDK helpers (workspaceMemory, writeWorkspaceMemory, listWorkspaceAgents, getWorkspaceAgent, createWorkspaceAgent, updateWorkspaceAgent with scope query, deleteWorkspaceAgent: 204 / structured 404 / bare 404 triage). - `writeContextFile.test.ts`: replace the 30ms-mtime test with a `vi.spyOn(fs, 'writeFile')` assertion that the no-op path never invokes writeFile. Deterministic on every filesystem. Validation: - typecheck: cli + sdk-typescript clean - vitest: serve 363/363 + writeContextFile 12/12 + SDK 347/347 - eslint: clean Reviewer guide: combined with fold-in 2a (commit `134c43c82`), PR 16's round-1 review feedback is closed except for the explicitly- deferred Copilot finding on "strict gate after body parser" (already documented as PR 15 review-resolved tradeoff at auth.ts:256-269). The DRY refactor wenshao suggested for `resolveOriginatorClientId` is left as a future sweep — it touches multiple Wave 4 routes and should land alongside PR 17/19/20/21 to keep the helper's shape informed by all consumers. * docs(serve): apply round-1 review fold-in 2c (doc/type tightening) on PR 16 Two doc-only fixes that close the last open Copilot threads on PR #4249 — both are JSDoc/tsdoc corrections where the wording promised broader behavior than the implementation actually delivers, so a maintainer or SDK consumer reading the type would form a wrong mental model. 1. `DaemonAgentLevel` (sdk-typescript) and `ServeAgentLevel` (cli serve) keep `'extension'` + `'session'` on the union for forward- compat but the JSDoc now explicitly says the daemon does NOT return either today. The `'extension'` case is gated by the daemon's stub `Config.getActiveExtensions()` returning `[]`; `'session'` is a runtime-only `SubagentManager` cache the CRUD routes don't read. Both arms stay so a future PR exposing either source is not a breaking SDK change. 2. 
`DaemonClient.workspaceMemory()` tsdoc no longer says "hierarchical" — v1 only discovers files at the bound workspace root + the global `~/.qwen` directory, no parent-directory walk. The 12-iteration upward-walk loop body inside `walkWorkspaceForMemory` is reserved for PR 16.5 hierarchical mode and breaks after iteration 1 today; the SDK doc now states that explicitly so callers don't expect more than they receive. No runtime change. Validation: - typecheck: cli + sdk-typescript clean - vitest: 363/363 serve + 12/12 writeContextFile + SDK unchanged - eslint: clean * fix(serve): apply round-2 review fold-in 2d on PR 16 wenshao round-2 (4 inline comments at 16:51-16:53Z): three real bugs + one performance-tradeoff doc note. 1. `composeAppendedContent` now inserts inside the MEMORY section, not at EOF. Previously a QWEN.md whose `## Qwen Added Memories` block was followed by another `## ...` heading would silently land each new entry past the next heading — moving entries into the wrong section. Walk the memory header forward, find the next `\n## ` heading, and insert just before it. Fall back to the EOF append when the memory section is the last block. 2. `parseAgentUpdates` now matches the create-side trim/empty rule for `description` (rejects whitespace-only) and ensures `systemPrompt` rejects the empty string. Update path used to silently accept `" "` and overwrite the field with blank content — divergent from create which 422s the same payload. 3. `isNoOpUpdate`'s runConfig comparison no longer false-positives on partial updates. Comparing every known runConfig field against `existing` treated absent keys as `undefined` while existing had real values — so `{max_time_minutes: 30}` against `{max_time_minutes: 30, max_turns: 10}` claimed non-no-op and re-emitted `agent_changed`. Fixed to only compare keys actually present in `updates.runConfig`, matching `mergeConfigurations` semantics (existing values preserved when not in updates). 4. 
JSDoc on the LIST-route `force: true` call now explains the tradeoff (no TTL cache / no fs.watch invalidation): re-introducing caching would re-introduce the stale-list bug Codex P2 #2 fixed, `fs.watch` is platform-fragile, and PR 24's audit/policy layer is the proper home for request rate limiting. Sub-millisecond cost per request on local SSD; revisit if profiling flags it. Tests: - writeContextFile.test.ts: section-boundary insertion + EOF fallback - workspaceAgents.test.ts: whitespace-only description rejected; partial runConfig no-op detection; partial runConfig real change preserves omitted keys via mergeConfigurations Validation: - typecheck: cli + sdk-typescript clean - vitest: 368/368 (was 363, +5 new) - eslint: clean * fix(serve): apply round-3 review fold-in 2e on PR 16 wenshao round-3 (5 inline [Suggestion]s, all real correctness or forward-compat issues; one item carried over from round-2): 1. `parseAgentConfig` rejects whitespace-only `systemPrompt` on create, matching the description field's `trim().length === 0` rule. Pure-whitespace prompts collapse to nothing on YAML serialization and the agent can't operate without instructions — 422 at the boundary is friendlier than the downstream "agent does nothing" failure. 2. `parseAgentUpdates` mirrors the same `trim()` check on the update path so `{systemPrompt: " "}` returns 422 rather than silently blanking the field. 3. `POST /workspace/memory` `file_error` 500 response now carries `scope`, `mode`, optional `osCode` (`EACCES`/`EROFS`/`ENOSPC`/...) and a redacted `errorMessage`. Previous shape was just `{error, code: 'file_error'}` — callers had nothing to branch on. 4. `composeAppendedContent` runs `fs.stat` before `fs.readFile` and refuses with a typed `WorkspaceMemoryFileTooLargeError` when the existing file exceeds 16 MB. Without this cap a pathological QWEN.md would be loaded into the daemon heap on every append. 
The route maps the typed error to a 413 with `code: 'memory_file_too_large'` plus `bytes` / `limit` so callers can decide whether to trim or switch to mode=replace. 5. `toDetail` no longer spreads `config.runConfig` with a cast. Explicit field-by-field pick of `max_time_minutes` / `max_turns` ensures any future `SubagentConfig.runConfig` field requires a deliberate route-schema update rather than silently leaking through the HTTP API. Tests: - workspaceAgents.test.ts: whitespace-only systemPrompt rejected on create AND update; toDetail.runConfig only emits whitelisted keys - existing tests still cover the description-side trim and the partial runConfig no-op detection from fold-in 2d Validation: - typecheck: cli + sdk-typescript clean - vitest: 371/371 (was 368, +3 new) - eslint: clean Reviewer note: response shape on 500 file_error is additive (`scope`/`mode`/`osCode`/`errorMessage` are new fields), so SDK callers that only consumed `{error, code}` keep working. The new 413 `memory_file_too_large` is a new error code SDK consumers can branch on but that pre-PR-16 daemons never emitted, so adding it is also additive. * fix(sdk): expose `changed` on DaemonAgentMutationResult (PR 16 round-4) wenshao round-4 review (single inline at types.ts:434): the agent update route emits `changed: true` for real updates and `changed: false` for no-op short-circuits (introduced in fold-in 2a alongside the no-op detection), but `DaemonAgentMutationResult` in the SDK type still only exposed `{ ok, agent }`. Typed callers of `updateWorkspaceAgent()` couldn't observe the no-op signal even though `DaemonClient` already returns the raw JSON at runtime. Add optional `changed?: boolean` matching the shape introduced for `DaemonWriteMemoryResult.changed` in fold-in 2a. 
Optional for forward-compat with daemons that predate the field; SDK consumers should treat `undefined` as `true` (the legacy contract — every successful create / update was a write before fold-in 2a's no-op short-circuit landed). Test: - `DaemonClient.test.ts`: round-trip asserts the typed result surfaces `changed: false` from the wire payload. Validation: - typecheck: cli + sdk-typescript clean - vitest: 82/82 in DaemonClient.test.ts (was 81; +1 new) - eslint: clean * fix(serve): apply round-6 review fold-in 2g on PR 16 Round-6 review (gpt-5.5 [Critical] + 5 wenshao [Suggestion]s). [Critical] Per-level delete verification (workspaceAgents.ts): - gpt-5.5 flagged that `SubagentManager.deleteSubagent` swallows per-level `fs.unlink()` failures (subagent-manager.ts:332-336) and returns success as long as ANY level was removed. Trusting that signal would let the route publish `agent_changed`/`deleted` for a file still on disk under EACCES/EBUSY/EPERM — the client UI would drop a still-active definition from cache. - Route now runs `fs.access` on each pre-checked level's file path AFTER `manager.deleteSubagent` returns and partitions into `removed` / `remaining`. Events are emitted ONLY for confirmed removals; if any level still has its file, the route returns 500 `agent_delete_partial` with `removedLevels` + `remainingLevels` so callers can act precisely. - New test installs a 0o555 chmod on the user-level agents directory so `fs.unlink` raises EACCES while the project-level unlink succeeds, asserting both the 500 response and that exactly one `agent_changed` event fired for the level that actually went away. Concurrency consistency (writeContextFile.ts): - Whitespace-only no-op detection now happens INSIDE the per-file mutex's `runExclusive` block. The pre-fix layout did the short-circuit `fs.stat` outside the lock; under concurrent POSTs (one whitespace-only, one with real content) the no-op's `bytesWritten` could lag the post-write reality. 
Functional behavior was already correct; this aligns the snapshot with the post-write state. Defense-in-depth + DRY (workspaceAgents.ts): - `validateAgentType(req, res)` regex-validates `:agentType` URL parameter at the route boundary against the same `^[\\p{L}\\p{N}_-]+$/u` pattern as `SubagentValidator.validateName`, with a 64-char cap. `findSubagentByNameAtLevel`'s readdir scan already prevented path traversal, but failing fast at the boundary keeps surprising inputs out of downstream code paths. Two new tests cover `..%2Fetc%2Fpasswd` and over-long names. - `parseScopeQuery(req, res)` extracts the duplicated `?scope=` query parser from the POST update + DELETE handlers. Same fail-closed semantics on repeated/non-string values. - `assertMutableLevel(found, agentType, res)` extracts the duplicated `isBuiltin \|\| level === 'builtin' \|\| 'extension' \|\| 'session'` 403 guard. Future Wave 4 mutation routes (PR 17 / 19 / 20) call this helper instead of re-implementing the predicate. Client-id helper consistency (workspaceMemory.ts): - `resolveWorkspaceClientId` removed; the inline branch in the POST handler now mirrors `workspaceAgents.ts:resolveOriginatorClientId` (validate against `bridge.knownClientIds()`, send 400 directly, return so the caller short-circuits). Previously this file threw `InvalidClientIdError` and caught it locally — wenshao round-6 flagged the throw-vs-direct-400 inconsistency between the two files. The deeper full-extraction DRY refactor remains deferred to the cross-Wave-4 sweep with PR 17/19/20/21. Won't-fix doc note (workspaceMemory.ts): - Mount-point JSDoc now explicitly explains why the route returns absolute on-disk paths (success / 413 / GET list): clients pre-flight `caps.workspaceCwd` to learn the bound workspace and can compute relative paths if they want; the global scope's `~/.qwen/QWEN.md` is NOT under the workspace root, so a workspace-relative form would lose information. 
Path redaction for multi-tenant deployments belongs to PR 24's `--redact-errors` policy work, not a per-route default flip in PR 16. Validation: - typecheck: cli + sdk-typescript clean - vitest: 374/374 (was 371, +3 new) - eslint: clean * fix(serve): apply round-7 review fold-in 2h on PR 16 glm-5.1 round-7: 2 [Critical] + 5 [Suggestion] inline comments. Five applied as code changes; one is a stale-snapshot false positive (workspaceMemory.ts no longer has the InvalidClientIdError call site glm-5.1 referenced — fold-in 2g already replaced it with inline 400); one is rationale-replied (INVALID_CONFIG → 422 mapping suggestion is based on incorrect premise about manager semantics). [Critical] Code-fence-aware section-boundary detection (writeContextFile.ts): - The naive `\n## ` indexOf scan would split user-authored memory entries that quote markdown documentation containing `##` headings inside fenced code blocks. New `findNextTopLevelHeading` helper tracks fence state line-by-line and only accepts matches outside fences. Two new tests: (a) entry containing a fenced `## Request Body` keeps its body intact; (b) real `## post` heading outside fences still acts as the section boundary. [Suggestion] errorMessage + filePath gating (workspaceMemory.ts): - 500 `file_error` and 413 `memory_file_too_large` responses now omit `errorMessage` and `filePath` unless `QWEN_SERVE_DEBUG` is set. Default response carries `error / code / scope / mode / osCode` — enough for SDK callers to branch without leaking absolute filesystem paths. New test asserts both modes round-trip the right shape. [Suggestion] publishWorkspaceEvent visibility (httpAcpBridge.ts): - Catch block now writes to stderr unconditionally during normal operation; only downgrades to the debug channel when `shuttingDown` is true. 
`EventBus.publish` is documented never to throw, so a hit in normal ops is by definition a regression that must be visible in production logs — silencing via debug-gate could let a true bug succeed at the route layer (200 OK) while SSE subscribers stop receiving events. [Suggestion] Log-injection defense for `agentType` (workspaceAgents.ts): - New `safeLogValue` helper wraps `agentType` interpolations in `JSON.stringify(...).slice(0, 82)` before stderr writes (mirrors `server.ts:1340`). The route's `validateAgentType` regex already rejects names with control chars, but defense-in-depth covers legacy on-disk shadows and future fields. Five `writeStderrLine` call sites updated (GET / POST / DELETE failure, reload-failure, partial-delete, create-reload-failure). [Suggestion] Simplify walkWorkspaceForMemory (workspaceMemory.ts): - Replaced the 12-iteration loop with a straightforward single-pass stat-each-filename. The `seen` Set, `cursor = parent` walk, and filesystem-root guard were dead code (the loop unconditionally broke on first iteration). PR 16.5's hierarchical mode lands as a fresh upward walk rather than re-enabling commented-out code. Validation: - typecheck: cli + sdk-typescript clean - vitest: 377/377 (was 374, +3 new) - eslint: clean Reviewer notes (NOT adopting): - glm-5.1's "InvalidClientIdError('workspace', ...)" message-confusion Critical: stale-snapshot false positive — fold-in 2g already removed `resolveWorkspaceClientId` and inlined a 400 with the correct "registered for this workspace" wording. Only a comment reference remains. - glm-5.1's "INVALID_CONFIG → 422" suggestion: SubagentManager only ever throws INVALID_CONFIG for read-only conditions (built-in / extension / session) — not for malformed config (which uses VALIDATION_ERROR). The current 403 mapping in update + delete is correct for the manager's actual semantics. 
* fix(serve): apply round-8 review fold-in 2i on PR 16 wenshao round-8: 2 [Critical] path-disclosure + 5 [Suggestion] (name regex, per-field caps, mutex timeout, test gaps, tilde fence). All adopted. [Critical] C1 — 413 `err.message` path disclosure (workspaceMemory.ts): - The 413 `memory_file_too_large` response sent `err.message` unconditionally as the `error` field. The `WorkspaceMemoryFileTooLargeError` constructor embeds the absolute file path in its message ("Existing memory file at /Users/<x>/.qwen/QWEN.md is ..."), bypassing the `debugMode()` gating that already hid the `filePath` field. Same gating now applies to both `error` and `filePath`; default response carries a generic string + structured `code` / `bytes` / `limit` so SDK callers can branch without the path leak. [Critical] C2 — workspaceAgents FILE_ERROR `err.message` (workspaceAgents.ts): - Two catch blocks (create + update) sent `SubagentError(FILE_ERROR)` messages directly in the response. Node fs errors embed paths like "ENOENT: ... '/Users/<x>/.qwen/agents/foo.md'". Both now gate behind `isServeDebugMode()`; default response is the generic "Failed to write workspace agent file" envelope. Shared `isServeDebugMode` helper (debugMode.ts new): - Moved from inlined copies in workspaceMemory.ts to a small shared module so both route files (and future Wave 4 mutation routes) share one canonical predicate. [Suggestion] S1 — POST body `name` validation (workspaceAgents.ts): - `parseAgentConfig` now applies the same regex + length contract as `validateAgentType` (`^[\p{L}\p{N}_-]+$/u`, 2-64 chars). A client posting `name: "my/agent"` or 100-char name now fails at the body-validation boundary with a 422 `invalid_config` instead of bubbling a less-specific `SubagentValidator` error. 
[Suggestion] S2 — Per-field size caps (workspaceAgents.ts): - `description` / `systemPrompt`: 256 KB each - `tools` / `disallowedTools`: 256 entries, each at most 256 chars Applied on both create + update; matches workspaceMemory's `MAX_MEMORY_CONTENT_BYTES = 1 MB` posture and keeps `GET /workspace/agents` list-response cost bounded. [Suggestion] S3 — Mutex timeout (writeContextFile.ts): - `getFileLock` now wraps each Mutex with `withTimeout(..., 30_000)` so a wedged filesystem (NFS hiccup, OneDrive lock, kernel I/O hang) cannot indefinitely hold the per-file lock. The `E_TIMEOUT` sentinel is caught and re-thrown as a typed `WorkspaceMemoryWriteTimeoutError`; the route maps it to 500 `memory_write_timeout` with `timeoutMs` so SDK callers can branch on stalled-fs without parsing a generic 500. [Suggestion] S4 — Test gaps: - `DELETE /workspace/agents/:id?scope=workspace` happy path: removes only the project shadow, leaves user file on disk, emits exactly one `agent_changed` event with `level: project`. - `POST /workspace/agents/:id?scope=global` happy path: updates user shadow, leaves project file untouched. - 413 `memory_file_too_large`: write a 17 MB QWEN.md externally, POST append fails with the structured 413 payload (`bytes` / `limit`, no `filePath` / no path-embedding error message in default response). [Nice] N1 — Tilde fence support (writeContextFile.ts): - `findNextTopLevelHeading` now toggles fence state on both ``` ` and `~~~` openers (CommonMark allows both). A `## heading` inside a `~~~` fenced block no longer counts as the section boundary. Validation: - typecheck: cli + sdk-typescript clean - vitest: 380/380 (was 377, +3 new) - eslint: clean * fix(serve): apply round-9 review fold-in 2j on PR 16 Two real correctness fixes from wenshao's 2026-05-18 review: 1. resolveContextFilePath now uses getCurrentGeminiMdFilename() so POST /workspace/memory writes to the same file GET surfaces. 
Without this, a deployment that ran setGeminiMdFilename('AGENTS.md') saw GET list AGENTS.md while POST kept appending to a stale QWEN.md — clients then observed "I just wrote content but it's missing from /workspace/memory". 2. runWrite no-op branch now returns bytesWritten: 0 instead of the existing file's stat.size. The prior value conflated "bytes I wrote" with "current file size"; clients accumulating writes via sum(bytesWritten) added the file size for every whitespace POST. changed: false already signals the no-op; the byte count should match its field name. JSDoc updated on both WriteContextFileResult.bytesWritten and DaemonWriteMemoryResult.bytesWritten so the contract is explicit. New test covers setGeminiMdFilename(AGENTS.md) round-trip; existing no-op test updated for the new bytesWritten semantics. Round-8 thread PRRT_kwDOPB-92c6Cpyap (DRY resolveOriginatorClientId) stays open as the cross-Wave-4 tracking marker. CodeQL "missing rate limiting" alert deferred to PR 24's audit/policy layer (bearer + max-connections + mutation gate provide v1 mitigations). * fix(serve): skip two Windows-incompatible test fixtures on win32 Both tests rely on `fs.chmod(dir, 0o555)` to trigger EACCES on a subsequent write/unlink. Windows ignores Unix-style permission bits passed to `fs.chmod`, so the directory stays writable, the operation succeeds, and the error path the test exercises is unreachable — the test then sees the success status (200 / 204) instead of the expected 500. CI failed on Windows runner only; Ubuntu + macOS pass. Route logic is platform-agnostic — these tests validate that: - `workspaceMemory.test.ts` POST returns the structured 500 envelope (no `errorMessage` / `filePath` leakage outside QWEN_SERVE_DEBUG). - `workspaceAgents.test.ts` DELETE returns 500 `agent_delete_partial` when one level's `fs.unlink` silently fails inside SubagentManager. Both invariants are still covered by the Ubuntu + macOS runs. 
We can't swap in a `vi.spyOn(fs, 'unlink')` mock for the agents case either — `SubagentManager` does `import * as fs from 'fs/promises'`, creating a sealed ESM namespace object vitest can't redefine. Skip pattern mirrors `customBanner.test.ts:232` (`if (process.platform === 'win32') return;`).		2026-05-18 14:26:59 +08:00
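The fence-aware section-boundary detection described in the round-7 and round-8 notes above can be sketched roughly as follows. The function name `findNextTopLevelHeading` comes from the commit message, but the body is an illustrative reconstruction, not the repository's implementation. It assumes `from` points at the start of a line and treats any line beginning with ``` or `~~~` as a fence toggle, which is simpler than the full CommonMark rules.

```typescript
// Illustrative sketch only: returns the index in `text` of the first line
// starting with "## " that lies outside a fenced code block, searching from
// `from` (assumed to be the start of a line). Returns -1 if there is none.
function findNextTopLevelHeading(text: string, from = 0): number {
  let offset = from;
  // Holds the opener ("```" or "~~~") while the scan is inside a fence.
  let openFence: string | null = null;

  for (const line of text.slice(from).split('\n')) {
    const trimmed = line.trimStart();
    const marker = trimmed.startsWith('```')
      ? '```'
      : trimmed.startsWith('~~~')
        ? '~~~'
        : null;

    if (marker !== null) {
      if (openFence === null) {
        openFence = marker; // entering a fenced block
      } else if (openFence === marker) {
        openFence = null; // only the same fence character closes the block
      }
    } else if (openFence === null && line.startsWith('## ')) {
      return offset; // a real heading outside any fence
    }

    offset += line.length + 1; // +1 for the newline consumed by split()
  }
  return -1;
}
```

With a helper along these lines, an append can be inserted just before the returned index, falling back to an end-of-file append when the result is -1.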
scripts	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860)	2026-05-11 17:29:50 +08:00
src	feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) (#4249)	2026-05-18 14:26:59 +08:00
test/unit	feat(serve): workspace memory and agents CRUD (#4175 Wave 4 PR 16) (#4249)	2026-05-18 14:26:59 +08:00
package.json	feat(core,cli): add generic atomicWriteFile, wire into Write/Edit tools, upgrade @types/node (#4096)	2026-05-15 17:52:50 +08:00
README.md	feat(protocol): add typed daemon event schema v1 (#4217)	2026-05-17 12:31:16 +08:00
tsconfig.build.json	chore: keep comments for queryOptions	2025-12-04 17:10:21 +08:00
tsconfig.json	Add Gemini provider, remove legacy Google OAuth, and tune generation defaults	2025-12-19 16:26:54 +08:00
vitest.config.ts	fix: enhance 429 error handling and fix failed cases	2025-12-04 17:10:23 +08:00

README.md

@qwen-code/sdk

A minimal, experimental TypeScript SDK for programmatic access to Qwen Code.

Feel free to submit a feature request/issue/PR.

Installation

npm install @qwen-code/sdk

Requirements

Node.js >= 22.0.0

As of v0.1.1, the CLI is bundled with the SDK, so no standalone CLI installation is needed.

Quick Start

import { query } from '@qwen-code/sdk';

// Single-turn query
const result = query({
  prompt: 'What files are in the current directory?',
  options: {
    cwd: '/path/to/project',
  },
});

// Iterate over messages
for await (const message of result) {
  if (message.type === 'assistant') {
    console.log('Assistant:', message.message.content);
  } else if (message.type === 'result') {
    console.log('Result:', message.result);
  }
}

API Reference

`query(config)`

Creates a new query session with Qwen Code.

Parameters

prompt: string | AsyncIterable<SDKUserMessage> - The prompt to send. Use a string for single-turn queries or an async iterable for multi-turn conversations.
options: QueryOptions - Configuration options for the query session.

QueryOptions

Option	Type	Default	Description
`cwd`	`string`	`process.cwd()`	The working directory for the query session. Determines the context in which file operations and commands are executed.
`model`	`string`	-	The AI model to use (e.g., `'qwen-max'`, `'qwen-plus'`, `'qwen-turbo'`). Takes precedence over `OPENAI_MODEL` and `QWEN_MODEL` environment variables.
`pathToQwenExecutable`	`string`	Auto-detected	Path to the Qwen Code executable. Supports multiple formats: `'qwen'` (native binary from PATH), `'/path/to/qwen'` (explicit path), `'/path/to/cli.js'` (Node.js bundle), `'node:/path/to/cli.js'` (force Node.js runtime), `'bun:/path/to/cli.js'` (force Bun runtime). If not provided, auto-detects from: `QWEN_CODE_CLI_PATH` env var, `~/.volta/bin/qwen`, `~/.npm-global/bin/qwen`, `/usr/local/bin/qwen`, `~/.local/bin/qwen`, `~/node_modules/.bin/qwen`, `~/.yarn/bin/qwen`.
`permissionMode`	`'default' \| 'plan' \| 'auto-edit' \| 'yolo'`	`'default'`	Permission mode controlling tool execution approval. See Permission Modes for details.
`canUseTool`	`CanUseTool`	-	Custom permission handler for tool execution approval. Invoked when a tool requires confirmation. Must respond within 60 seconds or the request will be auto-denied. See Custom Permission Handler.
`env`	`Record<string, string>`	-	Environment variables to pass to the Qwen Code process. Merged with the current process environment.
`systemPrompt`	`string \| QuerySystemPromptPreset`	-	System prompt configuration for the main session. Use a string to fully override the built-in Qwen Code system prompt, or a preset object to keep the built-in prompt and append extra instructions.
`mcpServers`	`Record<string, McpServerConfig>`	-	MCP (Model Context Protocol) servers to connect. Supports external servers (stdio/SSE/HTTP) and SDK-embedded servers. External servers are configured with transport options like `command`, `args`, `url`, `httpUrl`, etc. SDK servers use `{ type: 'sdk', name: string, instance: Server }`.
`abortController`	`AbortController`	-	Controller to cancel the query session. Call `abortController.abort()` to terminate the session and cleanup resources.
`debug`	`boolean`	`false`	Enable debug mode for verbose logging from the CLI process.
`maxSessionTurns`	`number`	`-1` (unlimited)	Maximum number of conversation turns before the session automatically terminates. A turn consists of a user message and an assistant response.
`coreTools`	`string[]`	-	Equivalent to `permissions.allow` in settings.json as an allowlist. If specified, only these tools will be available to the AI (all other tools are disabled at registry level). Supports tool name aliases and pattern matching. Example: `['Read', 'Edit', 'Bash(git *)']`.
`excludeTools`	`string[]`	-	Equivalent to `permissions.deny` in settings.json. Excluded tools return a permission error immediately. Takes priority over all other permission settings. Supports tool name aliases and pattern matching: tool name (`'write_file'`), shell command prefix (`'Bash(rm *)'`), or path patterns (`'Read(.env)'`, `'Edit(/src/**)'`).
`allowedTools`	`string[]`	-	Equivalent to `permissions.allow` in settings.json. Matching tools bypass `canUseTool` callback and execute automatically. Only applies when tool requires confirmation. Supports same pattern matching as `excludeTools`. Example: `['ShellTool(git status)', 'ShellTool(npm test)']`.
`authType`	`'openai' \| 'qwen-oauth'`	`'openai'`	Authentication type for the AI service. Using `'qwen-oauth'` in SDK is not recommended as credentials are stored in `~/.qwen` and may need periodic refresh.
`agents`	`SubagentConfig[]`	-	Configuration for subagents that can be invoked during the session. Subagents are specialized AI agents for specific tasks or domains.
`includePartialMessages`	`boolean`	`false`	When `true`, the SDK emits incomplete messages as they are being generated, allowing real-time streaming of the AI's response.
`resume`	`string`	-	Resume a previous session by providing its session ID. Equivalent to CLI's `--resume` flag.
`sessionId`	`string`	-	Specify a session ID for the new session. Ensures SDK and CLI use the same ID without resuming history. Equivalent to CLI's `--session-id` flag.
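
The `pathToQwenExecutable` formats in the table above can be read as a small prefix grammar. The sketch below is a hypothetical model of that grammar, not the SDK's actual resolution code: the names `ExecutableSpec` and `parseExecutableSpec` are invented for illustration, and treating any path ending in `.js` as a Node.js bundle is an assumption based on the table's wording.

```typescript
// Hypothetical model of the `pathToQwenExecutable` formats; not the SDK's own code.
type Runtime = 'native' | 'node' | 'bun';

interface ExecutableSpec {
  runtime: Runtime;
  path: string;
}

function parseExecutableSpec(spec: string): ExecutableSpec {
  // Explicit runtime prefixes: 'node:/path/to/cli.js' or 'bun:/path/to/cli.js'.
  for (const runtime of ['node', 'bun'] as const) {
    const prefix = `${runtime}:`;
    if (spec.startsWith(prefix)) {
      return { runtime, path: spec.slice(prefix.length) };
    }
  }
  // Assumption: a bare path ending in .js is a Node.js bundle.
  if (spec.endsWith('.js')) {
    return { runtime: 'node', path: spec };
  }
  // Anything else ('qwen', '/path/to/qwen') is treated as a native binary.
  return { runtime: 'native', path: spec };
}

console.log(parseExecutableSpec('bun:/path/to/cli.js'));
console.log(parseExecutableSpec('qwen'));
```

Checking the explicit prefixes first means a forced runtime always wins over the file-extension guess.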

Tip

If you need to configure coreTools, excludeTools, or allowedTools, it is strongly recommended to read the permissions configuration documentation first, especially the Tool name aliases and Rule syntax examples sections, to understand the available aliases and pattern matching syntax (e.g., Bash(git *), Read(.env), Edit(/src/**)).

Timeouts

The SDK enforces the following default timeouts:

Timeout	Default	Description
`canUseTool`	1 minute	Maximum time for `canUseTool` callback to respond. If exceeded, the tool request is auto-denied.
`mcpRequest`	1 minute	Maximum time for SDK MCP tool calls to complete.
`controlRequest`	1 minute	Maximum time for control operations like `initialize()`, `setModel()`, `setPermissionMode()`, and `interrupt()` to complete.
`streamClose`	1 minute	Maximum time to wait for initialization to complete before closing CLI stdin in multi-turn mode with SDK MCP servers.

You can customize these timeouts via the timeout option:

import { query } from '@qwen-code/sdk';

const result = query({
  prompt: 'Your prompt',
  options: {
    timeout: {
      canUseTool: 60000, // 60 seconds for permission callback
      mcpRequest: 600000, // 10 minutes for MCP tool calls
      controlRequest: 60000, // 60 seconds for control requests
      streamClose: 15000, // 15 seconds for stream close wait
    },
  },
});
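
To make the defaults concrete, here is a minimal sketch of how a partial `timeout` object could be merged with the documented one-minute defaults. `TimeoutOptions`, `DEFAULT_TIMEOUTS`, and `resolveTimeouts` are hypothetical names used only for this illustration; the SDK's internal merge logic may differ.

```typescript
// Hypothetical sketch: merge a partial `timeout` option with the documented defaults.
interface TimeoutOptions {
  canUseTool: number;
  mcpRequest: number;
  controlRequest: number;
  streamClose: number;
}

const ONE_MINUTE_MS = 60_000;

// Defaults taken from the Timeouts table: every timeout is 1 minute.
const DEFAULT_TIMEOUTS: TimeoutOptions = {
  canUseTool: ONE_MINUTE_MS,
  mcpRequest: ONE_MINUTE_MS,
  controlRequest: ONE_MINUTE_MS,
  streamClose: ONE_MINUTE_MS,
};

function resolveTimeouts(
  overrides: Partial<TimeoutOptions> = {},
): TimeoutOptions {
  const resolved: TimeoutOptions = { ...DEFAULT_TIMEOUTS };
  for (const key of Object.keys(DEFAULT_TIMEOUTS) as Array<keyof TimeoutOptions>) {
    const value = overrides[key];
    // Skip undefined so `{ mcpRequest: undefined }` does not erase a default.
    if (value !== undefined) {
      resolved[key] = value;
    }
  }
  return resolved;
}

console.log(resolveTimeouts({ mcpRequest: 600_000 }));
```

Only the keys you override change; the rest keep their one-minute default.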

Experimental Daemon Session Client

DaemonSessionClient is an experimental wrapper for clients that talk to a running qwen serve daemon over HTTP + SSE. It binds one daemon session so TUI, channel, IDE, or web backend adapters do not need to pass sessionId into every call.

import { DaemonClient, DaemonSessionClient } from '@qwen-code/sdk';

const daemon = new DaemonClient({
  baseUrl: 'http://127.0.0.1:4170',
  token: process.env['QWEN_SERVER_TOKEN'],
});

const caps = await daemon.capabilities();
const session = await DaemonSessionClient.createOrAttach(daemon, {
  workspaceCwd: caps.workspaceCwd,
});

const eventController = new AbortController();
const eventTask = (async () => {
  for await (const event of session.events({
    signal: eventController.signal,
  })) {
    console.log(event.type, event.data);
  }
})();

const result = await session.prompt({
  prompt: [{ type: 'text', text: 'Summarize this workspace.' }],
});

eventController.abort();
await eventTask;
console.log(result.stopReason);

session.events() tracks the last seen SSE event id and reuses it on the next subscription by default. Pass { resume: false } to start a fresh subscription without sending Last-Event-ID.

When createOrAttach() is called with modelServiceId, the returned session client seeds its first event subscription with Last-Event-ID: 0. This replays the daemon ring from the oldest available event so adapters can observe attach-time model_switch_failed or model_switched events that are not reported on the create/attach HTTP response. Raw DaemonClient callers should pass { lastEventId: 0 } on their first subscribeEvents() call when they use modelServiceId.
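
The resume behaviour described above can be modelled as a small cursor that remembers the last event id and decides whether to send `Last-Event-ID`. This is a sketch under stated assumptions, not the SDK's implementation: `EventCursor` and its methods are hypothetical names, and event ids are assumed to be numeric.

```typescript
// Hypothetical model of Last-Event-ID tracking; not the SDK's implementation.
interface SubscribeOptions {
  resume?: boolean; // assumed to default to true, as described in the text
}

class EventCursor {
  private lastEventId: number | undefined;

  // seedFromStart models the modelServiceId case, where the first
  // subscription sends Last-Event-ID: 0 to replay the daemon ring.
  constructor(seedFromStart = false) {
    this.lastEventId = seedFromStart ? 0 : undefined;
  }

  // Record the id of each event as it is consumed.
  observe(eventId: number): void {
    this.lastEventId = eventId;
  }

  // Build the headers for the next subscription request.
  headersFor(options: SubscribeOptions = {}): Record<string, string> {
    const resume = options.resume ?? true;
    if (!resume || this.lastEventId === undefined) {
      return {};
    }
    return { 'Last-Event-ID': String(this.lastEventId) };
  }
}

const cursor = new EventCursor(true);
console.log(cursor.headersFor()); // { 'Last-Event-ID': '0' }
cursor.observe(17);
console.log(cursor.headersFor({ resume: false })); // {}
```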

The raw event envelope remains available as DaemonEvent with data: unknown. Adapters that want a v1 typed view can layer the schema helpers on top without changing the wire stream:

import {
  asKnownDaemonEvent,
  createDaemonSessionViewState,
  reduceDaemonSessionEvent,
} from '@qwen-code/sdk';

let view = createDaemonSessionViewState();
for await (const event of session.events()) {
  view = reduceDaemonSessionEvent(view, event);

  const known = asKnownDaemonEvent(event);
  if (known?.type === 'permission_request') {
    console.log(known.data.requestId);
  }
}
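
For readers curious what narrowing a `data: unknown` envelope involves, the following self-contained sketch shows the general technique with a hand-written guard. `RawEvent`, `PermissionRequestEvent`, and `asPermissionRequest` are hypothetical stand-ins, and the assumption that `requestId` is a string is not confirmed by this README; the SDK's `asKnownDaemonEvent` may validate differently.

```typescript
// Hypothetical stand-ins for the raw envelope and one typed event shape.
interface RawEvent {
  type: string;
  data: unknown;
}

interface PermissionRequestEvent {
  type: 'permission_request';
  data: { requestId: string }; // assumption: requestId is a string
}

// Return a typed view when the payload matches, otherwise undefined.
function asPermissionRequest(
  event: RawEvent,
): PermissionRequestEvent | undefined {
  if (event.type !== 'permission_request') return undefined;
  const data = event.data;
  if (typeof data !== 'object' || data === null) return undefined;
  const requestId = (data as { requestId?: unknown }).requestId;
  if (typeof requestId !== 'string') return undefined;
  return { type: 'permission_request', data: { requestId } };
}

const typed = asPermissionRequest({
  type: 'permission_request',
  data: { requestId: 'req-1' },
});
console.log(typed?.data.requestId); // req-1
```

Returning `undefined` for unknown or malformed events lets adapters ignore event types they do not understand without touching the wire stream.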

Message Types

The SDK provides type guards to identify different message types:

import {
  isSDKUserMessage,
  isSDKAssistantMessage,
  isSDKSystemMessage,
  isSDKResultMessage,
  isSDKPartialAssistantMessage,
} from '@qwen-code/sdk';

for await (const message of result) {
  if (isSDKAssistantMessage(message)) {
    // Handle assistant message
  } else if (isSDKResultMessage(message)) {
    // Handle result message
  }
}

Query Instance Methods

The Query instance returned by query() provides several methods:

const q = query({ prompt: 'Hello', options: {} });

// Get session ID
const sessionId = q.getSessionId();

// Check if closed
const closed = q.isClosed();

// Interrupt the current operation
await q.interrupt();

// Change permission mode mid-session
await q.setPermissionMode('yolo');

// Change model mid-session
await q.setModel('qwen-max');

// Close the session
await q.close();

Permission Modes

The SDK supports different permission modes for controlling tool execution:

default: Write tools are denied unless approved via canUseTool callback or in allowedTools. Read-only tools execute without confirmation.
plan: Blocks all write tools, instructing AI to present a plan first.
auto-edit: Auto-approves edit tools (edit, write_file); other tools require confirmation.
yolo: All tools execute automatically without confirmation.

Permission Priority Chain

Decision priority (highest first): deny > ask > allow > (default/interactive mode)

The first matching rule wins.

excludeTools / permissions.deny - Blocks tools completely (returns permission error)
permissions.ask - Always requires user confirmation
permissionMode: 'plan' - Blocks all non-read-only tools
permissionMode: 'yolo' - Auto-approves all tools
allowedTools / permissions.allow - Auto-approves matching tools
canUseTool callback - Custom approval logic (if provided, not called for allowed tools)
Default behavior - Auto-deny in SDK mode (write tools require explicit approval)
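
The chain above can be sketched as a first-match-wins function. This is an illustrative model only: `decide`, `ToolCall`, and `PermissionConfig` are hypothetical names, rules are matched by exact tool name (the alias and pattern syntax is not modelled), `auto-edit` is left out because the list does not place it, and the position of the read-only check is an assumption drawn from the `default` mode description.

```typescript
// Hypothetical first-match-wins model of the permission priority chain.
type Decision = 'deny' | 'ask' | 'allow' | 'callback';

interface ToolCall {
  name: string;
  readOnly: boolean;
}

interface PermissionConfig {
  excludeTools?: string[];
  askTools?: string[]; // stands in for permissions.ask
  permissionMode?: 'default' | 'plan' | 'yolo';
  allowedTools?: string[];
  hasCanUseTool?: boolean;
}

function decide(tool: ToolCall, config: PermissionConfig): Decision {
  const mode = config.permissionMode ?? 'default';
  if (config.excludeTools?.includes(tool.name)) return 'deny'; // 1. deny rules
  if (config.askTools?.includes(tool.name)) return 'ask'; // 2. ask rules
  if (mode === 'plan' && !tool.readOnly) return 'deny'; // 3. plan mode
  if (mode === 'yolo') return 'allow'; // 4. yolo mode
  if (config.allowedTools?.includes(tool.name)) return 'allow'; // 5. allow rules
  // Assumption: read-only tools run without confirmation at this point.
  if (tool.readOnly) return 'allow';
  if (config.hasCanUseTool) return 'callback'; // 6. defer to canUseTool
  return 'deny'; // 7. SDK default: auto-deny
}

const writeFile: ToolCall = { name: 'write_file', readOnly: false };
console.log(decide(writeFile, { excludeTools: ['write_file'], permissionMode: 'yolo' })); // deny
console.log(decide(writeFile, { allowedTools: ['write_file'] })); // allow
console.log(decide(writeFile, {})); // deny
```

The first call shows why `excludeTools` is a reliable guard: it is checked before `yolo` mode gets a chance to approve anything.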

Examples

Multi-turn Conversation

import { query, type SDKUserMessage } from '@qwen-code/sdk';

async function* generateMessages(): AsyncIterable<SDKUserMessage> {
  yield {
    type: 'user',
    session_id: 'my-session',
    message: { role: 'user', content: 'Create a hello.txt file' },
    parent_tool_use_id: null,
  };

  // Wait for some condition or user input
  yield {
    type: 'user',
    session_id: 'my-session',
    message: { role: 'user', content: 'Now read the file back' },
    parent_tool_use_id: null,
  };
}

const result = query({
  prompt: generateMessages(),
  options: {
    permissionMode: 'auto-edit',
  },
});

for await (const message of result) {
  console.log(message);
}

Custom Permission Handler

import { query, type CanUseTool } from '@qwen-code/sdk';

const canUseTool: CanUseTool = async (toolName, input, { signal }) => {
  // Allow all read operations
  if (toolName.startsWith('read_')) {
    return { behavior: 'allow', updatedInput: input };
  }

  // Prompt the user for all other operations. `promptUser` is a placeholder
  // for your application's own confirmation UI; the SDK does not provide it.
  const userApproved = await promptUser(`Allow ${toolName}?`);

  if (userApproved) {
    return { behavior: 'allow', updatedInput: input };
  }

  return { behavior: 'deny', message: 'User denied the operation' };
};

const result = query({
  prompt: 'Create a new file',
  options: {
    canUseTool,
  },
});

With External MCP Servers

import { query } from '@qwen-code/sdk';

const result = query({
  prompt: 'Use the custom tool from my MCP server',
  options: {
    mcpServers: {
      'my-server': {
        command: 'node',
        args: ['path/to/mcp-server.js'],
        env: { PORT: '3000' },
      },
    },
  },
});

Override the System Prompt

import { query } from '@qwen-code/sdk';

const result = query({
  prompt: 'Say hello in one sentence.',
  options: {
    systemPrompt: 'You are a terse assistant. Answer in exactly one sentence.',
  },
});

Append to the Built-in System Prompt

import { query } from '@qwen-code/sdk';

const result = query({
  prompt: 'Review the current directory.',
  options: {
    systemPrompt: {
      type: 'preset',
      preset: 'qwen_code',
      append: 'Be terse and focus on concrete findings.',
    },
  },
});

With SDK-Embedded MCP Servers

The SDK provides tool and createSdkMcpServer to create MCP servers that run in the same process as your SDK application. This is useful when you want to expose custom tools to the AI without running a separate server process.

`tool(name, description, inputSchema, handler)`

Creates a tool definition with Zod schema type inference.

Parameter	Type	Description
`name`	`string`	Tool name (1-64 chars, starts with letter, alphanumeric and underscores)
`description`	`string`	Human-readable description of what the tool does
`inputSchema`	`ZodRawShape`	Zod schema object defining the tool's input parameters
`handler`	`(args, extra) => Promise<Result>`	Async function that executes the tool and returns MCP content blocks
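
The `name` constraint in the table can be checked up front with a regular expression. The sketch below encodes one reading of the rule (1 to 64 characters, first character a letter, remaining characters letters, digits, or underscores); `isValidToolName` is a hypothetical helper, and the SDK's own validation may be stricter or looser.

```typescript
// Assumed reading of the rule: 1-64 chars, starts with a letter,
// then letters, digits, or underscores only.
const TOOL_NAME_PATTERN = /^[A-Za-z][A-Za-z0-9_]{0,63}$/;

function isValidToolName(name: string): boolean {
  return TOOL_NAME_PATTERN.test(name);
}

console.log(isValidToolName('calculate_sum')); // true
console.log(isValidToolName('2fast')); // false
console.log(isValidToolName('has-dash')); // false
```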

The handler must return a CallToolResult object with the following structure:

{
  content: Array<
    | { type: 'text'; text: string }
    | { type: 'image'; data: string; mimeType: string }
    | { type: 'resource'; uri: string; mimeType?: string; text?: string }
  >;
  isError?: boolean;
}
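
Small helpers can keep handlers terse while producing results in this shape. The following sketch redeclares the structure locally so it runs on its own; `textResult` and `errorResult` are hypothetical helpers, not SDK exports.

```typescript
// Local copy of the result shape shown above, so the sketch is self-contained.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'image'; data: string; mimeType: string }
  | { type: 'resource'; uri: string; mimeType?: string; text?: string };

interface CallToolResult {
  content: ContentBlock[];
  isError?: boolean;
}

// Wrap a plain string as a single text content block.
function textResult(text: string): CallToolResult {
  return { content: [{ type: 'text', text }] };
}

// Report a failure as a text block flagged with isError.
function errorResult(error: unknown): CallToolResult {
  const message = error instanceof Error ? error.message : String(error);
  return { content: [{ type: 'text', text: message }], isError: true };
}

console.log(JSON.stringify(textResult(String(42 + 17))));
console.log(JSON.stringify(errorResult(new Error('division by zero'))));
```

Returning an `isError` result rather than throwing keeps the failure inside the tool result structure shown above.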

`createSdkMcpServer(options)`

Creates an SDK-embedded MCP server instance.

Option	Type	Default	Description
`name`	`string`	Required	Unique name for the MCP server
`version`	`string`	`'1.0.0'`	Server version
`tools`	`SdkMcpToolDefinition[]`	-	Array of tools created with `tool()`

Returns a McpSdkServerConfigWithInstance object that can be passed directly to the mcpServers option.

Example

import { z } from 'zod';
import { query, tool, createSdkMcpServer } from '@qwen-code/sdk';

// Define a tool with Zod schema
const calculatorTool = tool(
  'calculate_sum',
  'Add two numbers',
  { a: z.number(), b: z.number() },
  async (args) => ({
    content: [{ type: 'text', text: String(args.a + args.b) }],
  }),
);

// Create the MCP server
const server = createSdkMcpServer({
  name: 'calculator',
  tools: [calculatorTool],
});

// Use the server in a query
const result = query({
  prompt: 'What is 42 + 17?',
  options: {
    permissionMode: 'yolo',
    mcpServers: {
      calculator: server,
    },
  },
});

for await (const message of result) {
  console.log(message);
}

Abort a Query

import { query, isAbortError } from '@qwen-code/sdk';

const abortController = new AbortController();

const result = query({
  prompt: 'Long running task...',
  options: {
    abortController,
  },
});

// Abort after 5 seconds
setTimeout(() => abortController.abort(), 5000);

try {
  for await (const message of result) {
    console.log(message);
  }
} catch (error) {
  if (isAbortError(error)) {
    console.log('Query was aborted');
  } else {
    throw error;
  }
}

Error Handling

The SDK provides an AbortError class for handling aborted queries:

import { AbortError, isAbortError } from '@qwen-code/sdk';

try {
  // ... query operations
} catch (error) {
  if (isAbortError(error)) {
    // Handle abort
  } else {
    // Handle other errors
  }
}

FAQ / Troubleshooting

Version 0.1.0 Requirements

If you're using SDK version 0.1.0, please note the following requirements:

Qwen Code Installation Required

Version 0.1.0 requires Qwen Code >= 0.4.0 to be installed separately and accessible in your PATH.

# Install Qwen Code globally
npm install -g qwen-code@^0.4.0

Note: From version 0.1.1 onwards, the CLI is bundled with the SDK, so no separate Qwen Code installation is needed.

License

Apache-2.0 - see LICENSE for details.