* refactor(serve): 1 daemon = 1 workspace (#3803 §02)

Stage 1 shipped with M-workspaces-per-daemon routing (`byWorkspaceChannel` Map keyed by request `cwd`). The §02 architectural revision in `docs/comparison/qwen-code-daemon-design/02-architectural-decisions.md` narrows the bridge to 1 daemon = 1 workspace × N sessions: each daemon binds to one canonical workspace path at boot; `POST /session` with a mismatched `cwd` returns 400 `workspace_mismatch`. Multi-workspace deployments run multiple daemon processes (one per workspace, supervised externally: systemd / docker-compose / k8s / `qwen-coordinator`).

Bridge state collapses from maps to single optional slots:

- `byWorkspaceChannel: Map<string, ChannelInfo>` → `channelInfo?: ChannelInfo`
- `inFlightChannelSpawns: Map<string, Promise>` → `inFlightChannelSpawn?: Promise`
- `byWorkspace: Map<string, SessionEntry>` → `defaultEntry?: SessionEntry`
- `liveChannels: Set<ChannelInfo>` → not needed; `channelInfo` is the live reference, cleared only by `channel.exited` (preserves the tanzhenxin BkUyD invariant that `killAllSync` finds a target mid-SIGTERM-grace)

`BridgeOptions.boundWorkspace` becomes required. `WorkspaceMismatchError` is thrown from `spawnOrAttach` when the request's canonical cwd doesn't match the bound path, translated to 400 `workspace_mismatch` (with both paths in the body) by the route layer. `CapabilitiesEnvelope.workspaceCwd` surfaces the bound path so clients pre-flight check + omit `cwd` from `POST /session` (it falls back to the bound workspace).

A new `--workspace <path>` CLI flag lets operators override `process.cwd()` at boot. The previous `--http-bridge` / `--multi-workspace` opt-in was never shipped; nothing changes for default users running `qwen serve` in their project directory.

Removed code path: ~150 LOC of multi-workspace map machinery in `httpAcpBridge.ts` plus the test cases that exercised it.
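A minimal sketch of the bound-workspace check described above, not the actual code in `httpAcpBridge.ts`. It assumes a simplified `canonicalizeWorkspace` and a hypothetical `resolveSessionCwd` helper; the `WorkspaceMismatchError` class is a stand-in whose shape is an assumption.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Stand-in for the bridge's error class; the real one may carry other fields.
class WorkspaceMismatchError extends Error {
  bound: string;
  requested: string;
  constructor(bound: string, requested: string) {
    super(`workspace_mismatch: daemon is bound to ${bound}, request asked for ${requested}`);
    this.name = 'WorkspaceMismatchError';
    this.bound = bound;
    this.requested = requested;
  }
}

// Resolve symlinks when the path exists; fall back to a lexical resolve on
// ENOENT only. Any other filesystem error propagates to the caller.
function canonicalizeWorkspace(p: string): string {
  try {
    return fs.realpathSync.native(p);
  } catch (err) {
    if ((err as { code?: string }).code === 'ENOENT') return path.resolve(p);
    throw err;
  }
}

// Hypothetical helper: choose the effective cwd for a POST /session request.
function resolveSessionCwd(boundWorkspace: string, requestedCwd?: string): string {
  // An omitted cwd falls back to the workspace the daemon bound at boot.
  if (requestedCwd === undefined) return boundWorkspace;
  const canonical = canonicalizeWorkspace(requestedCwd);
  if (canonical !== boundWorkspace) {
    throw new WorkspaceMismatchError(boundWorkspace, canonical);
  }
  return canonical;
}
```

In this sketch the route layer would catch `WorkspaceMismatchError` and translate it into the 400 `workspace_mismatch` response carrying both paths.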
Test surgery:

- New `makeBridge()` helper in `httpAcpBridge.test.ts` injects `boundWorkspace: WS_A` by default; tests that need a different bind (the mismatch test) pass it explicitly.
- `does NOT reuse across workspaces` → `rejects cross-workspace requests with WorkspaceMismatchError` (the new semantics under §02).
- `shutdown kills every live channel` retargeted to single-channel multi-session shutdown.
- `killAllSync force-kills channels even after shutdown cleared byWorkspaceChannel (BkUyD)` retargeted to single-channel: the invariant is the same (channel reference must outlive eager shutdown clearing), the surface is just smaller.
- `listWorkspaceSessions` cross-workspace assertion now expects empty for the un-bound path.
- `--max-sessions` cap test uses two thread-scope sessions on `WS_A` instead of WS_A + WS_B.

Closes #3803 §02.

* fix(serve): address review findings on the §02 refactor

Two correctness fixes + four doc/test polish items surfaced by the multi-agent review of #4113:

1. `killSession` → `spawnOrAttach` race (Critical). After killing the last session, `channel.kill()` runs through a 5s SIGTERM grace before SIGKILL. During that window a concurrent `spawnOrAttach` used to hit `ensureChannel`, find `channelInfo` still set, and reuse the dying transport, either landing the caller with a sessionId that 404s on every follow-up once `channel.exited` fires, or hanging until the newSession timeout.

   Fix: add an `isDying: boolean` flag on `ChannelInfo`, set synchronously by `killSession` / `doSpawn`-newSession-failure / `shutdown` BEFORE awaiting `channel.kill()`. `ensureChannel` treats a dying channel as absent and spawns a fresh one. The tanzhenxin BkUyD invariant ("`channelInfo` reference must outlive the kill-await for `killAllSync` mid-grace") is preserved: we set `isDying` but don't clear `channelInfo` until the OS reaps the child via `channel.exited`.
   A regression test in `httpAcpBridge.test.ts` pins the invariant: a never-resolving `kill()` keeps the SIGTERM grace open while a concurrent spawn verifies the factory was called twice (two distinct handles).

2. `boundWorkspace` canonicalization divergence (Critical). `server.ts` and `runQwenServe.ts` each computed `opts.workspace ?? process.cwd()` independently. The bridge canonicalized that string via `realpathSync.native` (resolving symlinks, case-folding on case-insensitive filesystems); the callers retained the raw form. On macOS HFS+ / APFS or any symlinked path, `/capabilities.workspaceCwd` advertised one spelling while the bridge enforced against another; clients echoing the advertised path back saw `POST /session` succeed but the response carry a different `workspaceCwd`.

   Fix: export `canonicalizeWorkspace` from `httpAcpBridge.ts` and call it once in `runQwenServe` (after the existence check) and once in `createServeApp`. Both paths land on the same canonical form; the bridge's own re-canonicalize is now a no-op (idempotent).

3. Reject `--workspace` pointing at non-existent directories at boot (Suggestion). `canonicalizeWorkspace`'s ENOENT fallback to `path.resolve` previously let the daemon boot pointed at a path that didn't exist; every `POST /session` then spawned a `qwen --acp` child with that cwd and the agent failed with an opaque ENOENT. Now `runQwenServe` `statSync`s the bound path at boot and rejects "directory does not exist" / "not a directory" with a clear message.

4. Stale docstrings (Nice to have). `types.ts` `ServeMode` JSDoc said "one `qwen --acp` child PER WORKSPACE", which directly contradicted the new `workspace` field's doc in the same file. `commands/serve.ts` `--http-bridge` description said "per workspace", which directly contradicted the `--workspace` flag's help in the same yargs builder. Both updated to "per daemon (the daemon binds to ONE workspace at boot)".

5. Stale `byWorkspace` comment references (Nice to have).
   `server.ts:188` ("orphaned in byId / byWorkspace") and `httpAcpBridge.test.ts:1210` ("still in byId/byWorkspace at the moment of crash") referenced the removed Map. Updated to `defaultEntry`.

6. `/capabilities` curl example in the Authentication section of `docs/users/qwen-serve.md` was missing the new `workspaceCwd` field: the Quickstart's curl example was updated but the parallel one in the auth section was not. Synced.

Tests added:

- `killSession marks the channel dying so concurrent spawnOrAttach gets a fresh channel`: pins fix (1).
- `--workspace flows end-to-end and surfaces on /capabilities`: exercises the runQwenServe → server.ts → bridge plumbing that no prior test covered.
- `rejects --workspace pointing at a non-existent directory` and `rejects --workspace pointing at a regular file`: pin fix (3).
- `rejects relative --workspace at boot`: covers the absoluteness check that exists but was untested.

Net: +238 / -24 across 8 files. All 149 serve tests pass.

* fix(serve): BkUyD overwrite race + Windows-fragile test + doSpawn-failure coverage

Round-2 review of #4113 caught three follow-up issues introduced by or left open after round-1's fixes:

1. **BkUyD invariant overwrite race (Critical).** Round-1's `isDying` flag lets `ensureChannel` skip a dying channel and spawn a fresh one. When the fresh spawn completes, `channelInfo = info` overwrote the dying channel's reference, leaving NO global pointer to it. `killAllSync()` then iterated only `channelInfo` (the fresh one) and missed the dying child entirely. A double-Ctrl+C arriving mid-SIGTERM-grace would call `process.exit(1)` before the dying child's per-channel SIGKILL escalation timer fired, orphaning the child.

   Restore an `aliveChannels: Set<ChannelInfo>` (parallel to the original Stage 1 design, but justified by single-workspace too). Entries added in `ensureChannel`, removed by each channel's `channel.exited` handler. `killAllSync` iterates the SET, not the single attach-target slot.
   `shutdown` does the same: it snapshots every alive channel and kills each, not just the current `channelInfo`.

   New regression test pins the invariant: spawn → killSession (channel marked dying, kill hangs) → spawnOrAttach (fresh channel overwrites `channelInfo`) → `killAllSync`; expect BOTH channels' `killSync` to fire. Pre-fix only the fresh one would have fired.

2. **Windows-fragile test path.** The new `rejects --workspace pointing at a regular file` test used `new URL(import.meta.url).pathname` to get a path to the test file. On Windows that returns `/C:/path/...` (leading slash); `fs.statSync` then resolves it as path-from-current-drive-root, fails with ENOENT, and the test sees the "does not exist" error message instead of the expected "not a directory" branch. CI runs `windows-latest`. Fix: `fileURLToPath(import.meta.url)` from `node:url`.

3. **doSpawn newSession-failure isDying path was untested.** The round-1 fix added `ci.isDying = true` to both `killSession` AND `doSpawn`'s newSession-failure catch, but only the killSession path had a regression test. Added a parallel one for the doSpawn path: thread-scope bridge with a `newSessionImpl` that throws on the first call → captures the rejection without awaiting it (the bridge's `await ci.channel.kill()` hangs in the test), yields enough cycles for the `isDying = true` sync prefix to settle, then confirms (a) the next `spawnOrAttach` produces a fresh channel and (b) `killAllSync` finds both channels in `aliveChannels`.

   Also added a `newSessionImpl` option to the test FakeAgent: the existing `initializeThrows` hook covered handshake-time failures, but post-init `newSession` rejections (auth, bad config, mid-init crashes) had no test affordance.

All 151 serve tests pass.

* docs(serve): update daemon-client-quickstart for §02 single-workspace

Round-3 review caught that the SDK example doc was the only one of the three serve-related docs that the §02 refactor didn't touch.
Updated:

- Boot log example now shows the `, workspace=/path/to/your-project` suffix that `runQwenServe` emits after the §02 changes.
- The "Hello daemon" example now reads `caps.workspaceCwd` off `/capabilities` and passes it back as `workspaceCwd` on session creation, illustrating the documented pre-flight pattern, not a hand-written literal that may not match the daemon's actual bind.
- Shared-session example makes the prerequisite explicit: the daemon must be bound to `/work/repo` (via `--workspace` or `cd`); under §02 two clients can only share a session if they're both hitting a daemon already bound to that workspace.
- New "Workspace mismatch" section shows how to handle the `400 workspace_mismatch` error class: catching `DaemonHttpError`, branching on `body.code`, surfacing `boundWorkspace` / `requestedWorkspace` for the operator. This is a new error class SDK consumers' error handlers should branch on.

No code changes; docs only.

* feat(sdk,test): align SDK types + integration tests with §02 single-workspace

Round-4 review caught one type-drift gap + a set of integration-test assumptions that the §02 refactor invalidated.

**SDK type drift.** `DaemonCapabilities` in `packages/sdk-typescript/src/daemon/types.ts` was the SDK-side mirror of `CapabilitiesEnvelope` on the daemon side. The §02 PR added `workspaceCwd: string` to the daemon envelope (and the round-3 doc example reads `caps.workspaceCwd` off the SDK client) but the SDK type wasn't updated. A TypeScript consumer copying the doc snippet verbatim would hit `TS2339 'workspaceCwd' does not exist on type 'DaemonCapabilities'`. The wire field is present so JS consumers wouldn't notice, but the SDK is marketed as a TypeScript quickstart, so this is a real onboarding break. Fix: add `workspaceCwd: string` to `DaemonCapabilities` (parallel to `DaemonSession.workspaceCwd` which is already there). The SDK unit test for `client.capabilities()` was updated to put the new field in the mocked response.
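The "Workspace mismatch" handling mentioned in the docs commit above could look roughly like the sketch below. It is an illustration only: the `DaemonHttpError` class here is a hypothetical stand-in (the SDK's real class may expose different members), and `asWorkspaceMismatch` / `describeSessionError` are invented helper names. Only the body fields `code`, `boundWorkspace`, and `requestedWorkspace` come from the text.

```typescript
// Assumed wire shape of the 400 body; the three field names come from the text above.
interface WorkspaceMismatchBody {
  code: 'workspace_mismatch';
  boundWorkspace: string;
  requestedWorkspace: string;
}

// Hypothetical stand-in for the SDK's DaemonHttpError.
class DaemonHttpError extends Error {
  status: number;
  body: unknown;
  constructor(status: number, body: unknown) {
    super(`daemon responded with HTTP ${status}`);
    this.name = 'DaemonHttpError';
    this.status = status;
    this.body = body;
  }
}

// Return the structured body when the error is a workspace mismatch, else undefined.
function asWorkspaceMismatch(err: unknown): WorkspaceMismatchBody | undefined {
  if (!(err instanceof DaemonHttpError) || err.status !== 400) return undefined;
  const body = err.body as Partial<WorkspaceMismatchBody> | null;
  if (body === null || typeof body !== 'object') return undefined;
  if (body.code !== 'workspace_mismatch') return undefined;
  if (typeof body.boundWorkspace !== 'string') return undefined;
  if (typeof body.requestedWorkspace !== 'string') return undefined;
  return body as WorkspaceMismatchBody;
}

// Build an operator-facing message; unrelated errors keep their own message.
function describeSessionError(err: unknown): string {
  const mismatch = asWorkspaceMismatch(err);
  if (mismatch) {
    return `daemon is bound to ${mismatch.boundWorkspace}, but the request asked for ${mismatch.requestedWorkspace}`;
  }
  return err instanceof Error ? err.message : String(err);
}
```

Branching on `body.code` rather than on the HTTP status alone keeps other 400 responses (for example a malformed `cwd`) on the generic path.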
**Integration tests.** `qwen-serve-routes.test.ts` spawns a real `qwen serve` daemon in `beforeAll`. Three breakages exposed:

1. The daemon was launched without `--workspace`, so it inherited the test runner's `cwd`. Tests then POST `workspaceCwd: REPO_ROOT` assuming the daemon is bound to the repo root: true when run via `npm test` from the repo, brittle from IDEs / launchers that have a different `cwd`. Added `'--workspace', REPO_ROOT` to the spawn args so the bound workspace is deterministic regardless of where the test runner is launched.
2. The `bad modelServiceId` test used `cwd: '/tmp'`. Under §02 this would now return 400 workspace_mismatch before the session was spawned. Switched to `REPO_ROOT` and softened the `attached` assertion (REPO_ROOT may already have a session from earlier tests in the suite under sessionScope:single).
3. Added three new integration tests pinning the §02 surface end-to-end through a real daemon process:
   - `rejects cross-workspace cwd with 400 workspace_mismatch`: posts `/tmp` and asserts the full structured error body (`code`, `boundWorkspace`, `requestedWorkspace`).
   - `omits cwd → falls back to bound workspace`: posts an empty body and asserts the response's `workspaceCwd` matches REPO_ROOT (verifies the runQwenServe → createServeApp → bridge fallback plumbing).
   - `GET /capabilities surfaces workspaceCwd`: asserts the new SDK type field is populated correctly off the wire.

All 422 unit tests pass (cli serve + sdk). Integration tests typecheck clean.

* fix(serve): address /review feedback from gpt-5.5 + deepseek-v4-pro

Process the 7 inline /review comments on PR #4113:

- C1+C3 (SDK): make `DaemonCapabilities.workspaceCwd` and `CreateSessionRequest.workspaceCwd` optional in the SDK types. `workspaceCwd` is an additive field on the v=1 envelope per #3803 §02; the protocol's "bump v only on incompatible changes" stance is honored by leaving the field optional at the type level.
  `DaemonClient.createOrAttachSession` now omits `cwd` from the body when `workspaceCwd` isn't passed, matching the PR description's "SDK accepts bound path or none". Adds a unit test pinning the empty-body shape.
- C2 (docs/users/qwen-serve.md): the `--http-bridge` row described the pre-§02 per-session model; updated to reflect one child per daemon with N sessions multiplexed via ACP `newSession()`.
- C4 (server.ts): `WorkspaceMismatchError` was silently 400'ing without a stderr breadcrumb, leaving operators blind to cross-workspace routing drift. Now logs one, mirroring the SessionLimitExceeded / InvalidPermissionOption observability pattern.
- C5 (server.test.ts): the `/capabilities` fallback test compared `res.body.workspaceCwd` against raw `process.cwd()`; on macOS default tmpdir flows (`/var/folders/...` → `/private/var/...`) the canonicalize-once route value diverges. Use `realpathSync.native(process.cwd())` to match the route's canonicalization.
- C6 (server.ts): the cwd-not-absolute error said "cwd is required and must be an absolute path" but cwd is now optional under §02. Tightened wording to "must be an absolute path when provided".
- C7 (runQwenServe.ts): the `statSync` catch only wrapped ENOENT with a friendly diagnostic; EACCES / EPERM (typical for SIP-protected dirs on macOS or root-owned paths the daemon's UID can't traverse) re-threw as raw `SystemError`. Wrap both codes with a `--workspace`-context message so the boot failure points at the flag the operator set.

Docs: quickstart shows the explicit-pass-or-omit options side by side; protocol reference notes `workspaceCwd` is additive to v=1.

* fix(serve/test): make /work/bound literals Windows-portable

Windows CI failed on this PR's two new tests because the `/work/bound` literal resolves to a drive-relative absolute path there (the CI failure below shows `D:\work\bound`), so the route's canonicalize step diverged from the hardcoded literal.
Mirror the WS_A/WS_B pattern already used in httpAcpBridge.test.ts: define WS_BOUND / WS_DIFFERENT via `path.resolve(path.sep, …)` and use the constants everywhere. The 400 workspace_mismatch test would still have passed (mock controls both throw + assertion) but I aligned it for consistency.

Failures from CI run 25806528710:

    expected 'D:\work\bound' to be '/work/bound' (Object.is)

Affected tests:

- createServeApp > GET /capabilities > reports the bound workspace
- createServeApp > POST /session > 200 when cwd is omitted

* fix(serve): address second /review round (gpt-5.5 + deepseek-v4-pro)

Four new inline findings from the latest /review pass:

- N1 (integration-tests/cli/qwen-serve-routes.test.ts), Critical: the `workspace_mismatch` assertion compared `requestedWorkspace` against the literal `'/tmp'`, but the bridge canonicalizes via `realpathSync.native` and on macOS `/tmp` is a symlink to `/private/tmp`. Compare against `realpathSync.native('/tmp')` so the assertion is portable.
- N2 (packages/cli/src/serve/types.ts): `CapabilitiesEnvelope.workspaceCwd: string` (server side) diverged from the SDK's `DaemonCapabilities.workspaceCwd?: string`. Made the server type optional too: matches the SDK, matches the protocol doc's "additive to v=1" framing, doesn't change runtime emission (the post-§02 server still always populates the field).
- N3 + N4 (packages/cli/src/serve/server.ts + sdk-typescript/.../DaemonClient.ts): the route's `cwd` validation treated every non-string body value (`null`, `123`, `{}`, `[]`) the same as omitted, silently falling back to `boundWorkspace`. That hid client/orchestrator serialization bugs as "session attached to wrong workspace". Now the route uses `'cwd' in body` to detect presence and rejects presence-but-not-a-string with `400 'cwd must be a string absolute path when provided'`.
  Empty string still hits the existing `path.isAbsolute` branch ("must be an absolute path when provided"), so an SDK caller passing `workspaceCwd: ''` no longer silently lands in the daemon's bound workspace.

  SDK side: reverted my conditional spread to `cwd: req.workspaceCwd` unconditional. `JSON.stringify` strips `undefined` automatically (so omitted `workspaceCwd` becomes "no `cwd` key" on the wire, as before), but empty-string is now forwarded verbatim and the server's 400 surfaces the bug instead of the SDK swallowing it. Added a unit test pinning the empty-string-forwarded shape.

Server tests:

- `400 when cwd is present but not a string` covers null / number / object / array via a sub-loop.
- `400 when cwd is the empty string` pins the isAbsolute path.

bridge: 73/73; server: 80/80 (was 78, +2 new); SDK: 40/40 (was 39, +1 empty-string test). tsc clean for SDK and PR-touched CLI files.

* fix(serve): use const cwd in POST /session (prefer-const lint)

CI lint failed with packages/cli/src/serve/server.ts:199:9 prefer-const: 'cwd' is never reassigned. The wave-4 rewrite collapsed the original 'let cwd; if (!cwd) cwd = boundWorkspace' into a single ternary, which removes the only mutation path; the variable should be const accordingly.

* fix(serve): address third /review round (gpt-5.5 + glm-5.1 + deepseek-v4-pro)

Five new inline findings; M1 was already resolved in 1c7f5f069.

- M2 (httpAcpBridge.ts): drop the dead `ChannelInfo.workspaceCwd` field. Pre-§02 it was the routing key for `byWorkspaceChannel.get`; after the §02 collapse all reads target `SessionEntry.workspaceCwd` and `ChannelInfo.workspaceCwd` was only written, never read. Per-channel storage also suggests variance the "1 daemon = 1 workspace" model forbids. Removing the field encodes the single-workspace invariant in the type itself; left a stub comment so future readers don't reintroduce it.
- M3 (httpAcpBridge.ts): fast-path `canonicalizeWorkspace` when `req.workspaceCwd === boundWorkspace`.
  The §02 recommended client flow is `caps.workspaceCwd` → POST `cwd: caps.workspaceCwd`, and the omit-cwd route in server.ts synthesizes the same equality. Both hit the equality check and skip the sync `realpathSync.native` syscall. Non-equal inputs fall through to the full canonicalize (clients sending `/work/./bound`, mixed casing on case-insensitive FS, symlink aliases) so correctness is unchanged.
- M4 (httpAcpBridge.ts): operator stderr breadcrumb in the `channel.exited` handler. An agent crash (OOM / segfault) used to be silent on the daemon side: the child-stderr forwarder caught whatever the child wrote before dying (often nothing on SIGKILL/segfault), and SSE subscribers saw `session_died` frames but operators reading `qwen serve`'s own output had no signal that the agent process was gone. Log code+signal+affected-session-count so the line is the canonical "agent disappeared" indicator.
- M5 (server.ts): documentation-only. The reviewer wanted `createServeApp` to validate `opts.workspace` exists + is a directory (currently only `runQwenServe` does). Trade-off: doing that breaks 4 existing tests which pass synthetic `/work/bound` on purpose to exercise route-layer behavior without a real directory. Deferred the helper extraction; added a JSDoc note pinning the contract so future entry points binding `createServeApp` to user input know to replicate the validation.
- M6 (runQwenServe.ts): pass the already-canonical `boundWorkspace` into `createServeApp` via `opts.workspace`. `canonicalizeWorkspace` is idempotent so the server-side recanonicalize is a no-op today, but if a future refactor ever makes it non-idempotent the values the route advertises on `/capabilities` and the bridge enforces would diverge, landing clients in a "/capabilities says X, POST /session/X returns workspace_mismatch" contradiction. Removes the drift risk.

bridge: 73/73; server: 80/80; tsc clean for PR-touched files.
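The M3 fast path in the commit above can be sketched as follows. This is a simplified illustration, not the bridge's code: `makeBoundMatcher` is a hypothetical name, and the canonicalizer is injected as a parameter so the example does not depend on the filesystem.

```typescript
// Build a predicate that answers "does this requested path refer to the bound
// workspace?". The canonicalizer is only invoked when the cheap string
// comparison fails, so the common echo-back case costs no syscall.
function makeBoundMatcher(
  boundWorkspace: string,
  canonicalize: (p: string) => string,
): (requested: string) => boolean {
  return (requested: string): boolean => {
    // Fast path: the client echoed the advertised path verbatim.
    if (requested === boundWorkspace) return true;
    // Slow path: aliases such as "/work/./bound" or symlinks need the
    // full canonicalization before comparing.
    return canonicalize(requested) === boundWorkspace;
  };
}
```

Because unequal inputs still go through the full canonicalization, the fast path only changes cost, not which requests are accepted.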
* fix(serve,sdk): address fourth /review round (deepseek-v4-pro x2)

Two new inline findings:

- O1 (server.ts): the POST /session route uses `'cwd' in body` against `safeBody`'s `Object.create(null)` output to distinguish "client omitted cwd" from "client sent cwd". The semantics quietly couple to `safeBody`'s literal strip list (`__proto__/constructor/prototype`). If a future maintainer adds a user-facing key (e.g. `cwd`) to that strip list, the route's presence-check would silently flip to "absent → fallback", masking the bug as "wrong workspace bound." Extracted `PROTOTYPE_POLLUTION_KEYS: ReadonlySet<string>` as a named module-scope constant; safeBody uses `.has()` on it (behavior unchanged); the route's comment now cross-references the const so the coupling is documented at both ends. The const's JSDoc spells out what to do if the strip set ever has to grow into user-key territory.
- O2 (sdk-typescript): `DaemonCapabilities.workspaceCwd` is `string | undefined` (additive to v=1; pre-§02 daemons omit). SDK consumers that pass it into a `string` context get a TS strict error or, against an old daemon, a runtime `Cannot read properties of undefined`. Added a `requireWorkspaceCwd` helper + `DaemonCapabilityMissingError` so consumers can opt into an actionable `DaemonCapabilities.workspaceCwd is missing — introduced in #3803 §02 …` error instead. Exported both from `@qwen-code/sdk`'s top-level module + the `daemon/` sub-module. Unit tests cover populated, missing, and empty-string inputs.

bridge: 73/73; server: 80/80; SDK DaemonClient: 43/43 (was 40, +3 new requireWorkspaceCwd cases). tsc clean for SDK and PR-touched CLI files.

* fix(serve): address tanzhenxin REQUEST_CHANGES (cold-spawn + streaming-test bind)

Two findings from the CHANGES_REQUESTED review on PR #4113.
- T1 (integration-tests/cli/qwen-serve-streaming.test.ts), high severity: the daemon spawn in `beforeAll` did not pass `--workspace REPO_ROOT`, so under §02 the daemon bound to whatever cwd the test runner was invoked from. Every later `createOrAttachSession({ workspaceCwd: REPO_ROOT })` then 400'd with `workspace_mismatch`, and the entire file (child-crash recovery, multi-client first-responder permission, Last-Event-ID resume) silently no-op'd once `SKIP_LLM_TESTS` was unset. The sibling `qwen-serve-routes.test.ts` got the same fix earlier in this PR; this file was missed in that pass. Added the flag with a comment pointing at the rationale so the omission can't recur.
- T2 (packages/cli/src/serve/httpAcpBridge.ts), medium severity: cold-spawn window orphans the agent child on double-Ctrl+C. The `qwen --acp` child exists from the moment `channelFactory` spawns it, but pre-fix the bridge only added the channel to `aliveChannels` AFTER `connection.initialize()` returned. During the up-to-`initTimeoutMs` (default 10s) handshake window `aliveChannels` was empty, and a double-Ctrl+C in that window played out as: first SIGINT entered `shutdown()` and awaited the in-flight spawn; second SIGINT called `killAllSync()` against an empty set; `process.exit(1)` orphaned the child. Same class of bug the BkUyD invariant set out to close: the post-init overwrite race was covered, the pre-init handshake window wasn't.

  Fix: move `info` creation + `aliveChannels.add(info)` + the `channel.exited` handler registration BEFORE the `initialize` await. Init-failure / late-shutdown / child-crash-during-handshake all converge on the same cleanup path: mark `isDying = true`, `await channel.kill()`, let the exited handler `aliveChannels.delete(info)` once the OS reaps the process. `channelInfo` (the attach target) is still assigned LAST so `ensureChannel`'s fast-path never returns a still-handshaking channel.
  Regression test: `killAllSync force-kills the channel during the initialize handshake` uses a bespoke factory whose agent's `initialize` never resolves and asserts `killAllSync` fires killSync against the channel during the handshake window. Pre-fix the test would observe an empty `killSyncCalls` array.

bridge: 74/74 (was 73, +1 cold-spawn test); server: 80/80; tsc clean for PR-touched files.

* fix(serve): address third /review round (gpt-5.5 + glm-5.1 + deepseek-v4-pro)

Eight new inline findings; six applied, two deferred-with-reply.

- P1 (httpAcpBridge.ts init-failure isDying comment): my comment overstated what `info.isDying` accomplishes on the init-failure path: concurrent `ensureChannel()` callers don't bypass via `isDying`, they coalesce on `inFlightChannelSpawn` and observe the same rejection. Reworded to describe the actual cross-path invariant marker.
- P2 (server.ts workspace_mismatch log injection): doudouOUC flagged log injection via `err.requested` (user-controlled). `path.resolve` + `realpathSync.native` preserve control chars in path segments, so a body `{"cwd": "/legit/path\nqwen serve: FAKE LOG"}` would emit two valid-looking daemon log lines on stderr, weaponizing line-based log shippers (Splunk / Loki / journald → SIEM). Wrapping both `err.bound` and `err.requested` in `JSON.stringify` in the log line escapes control chars + quotes the values, making any injection attempt visible-as-quoted-noise rather than forged-line. Bound is operator-controlled and inherently safe but quoted symmetrically for readability. The defense-in-depth alternative (reject control chars in canonicalizeWorkspace) is deferred: this single log site was the actionable interpolation; future workspace-path-into-stderr / -JSON / -templated-SQL flows can pick up the rejection if they ship.
- P3 (httpAcpBridge.test.ts): refactor the cross-workspace WorkspaceMismatchError test to a single `.catch((e) => e)` capture rather than firing the rejection twice (once for the `rejects.toBeInstanceOf` matcher, once for the field assertions). Logic unchanged.
- P4 (httpAcpBridge.ts channel.exited log): the `qwen serve: channel exited (...)` line fired on every channel exit including planned shutdown, which is alarming for operators who Ctrl+C'd a healthy daemon. Guarded with `if (!shuttingDown)` so the planned-shutdown case (operator already saw `received SIGINT, draining...`) stays silent. The killSession path (last session leaves, daemon stays up, no top-level context line) still logs, since the line is the only signal that the cleanup actually ran.
- P5 (httpAcpBridge.ts): light trim of the "pre-fix" narrative voice in two comment blocks (cold-spawn ensureChannel layout + BkUyD killAllSync aliveChannels iteration). Kept the invariant explanations (those carry maintenance value); dropped the "pre-fix the code did X" framing that's review-context not future-reader context.
- P6 (server.ts + runQwenServe.ts): `createServeApp` now accepts a pre-canonicalized `deps.boundWorkspace` to skip its own `canonicalizeWorkspace` syscall when the caller (runQwenServe) already did the work. Replaces my earlier `{...opts, workspace: boundWorkspace}` opts-mutation hack: cleaner separation of concerns + drops one `realpathSync.native` per boot. Direct callers (tests, embeds) that omit `deps.boundWorkspace` still get the in-body canonicalize path.
- P8 (httpAcpBridge.ts): defensive `aliveChannels.size > 2` warning. The set is intentionally multi-entry to cover the killSession-then-spawnOrAttach overlap window (size 2 is legitimate). Anything higher implies a `channel.exited` handler never fired for a prior channel, a real leak we'd otherwise catch only as gradually-growing RSS. The warning surfaces it the moment it happens.
- P7 (CreateSessionRequest.workspaceCwd optional): deferred with reply rationale. Making the field optional is the §02 design ("SDK accepts bound path or none"); the JSDoc already explains the omit-vs-explicit choice; Stage 1 has no shipping SDK consumers so there's no breakage to call out in a changelog file. No code change.

bridge: 74/74 (cross-workspace test refactor + behavioral assertions unchanged); server: 80/80; SDK 43/43. tsc clean for PR-touched files.

* fix(serve): apply auto-fixes from /review (#4113)

- canonicalizeWorkspace: narrow catch to ENOENT only, propagate other filesystem errors
- listWorkspaceSessions: add fast-path string equality to avoid realpathSync on every poll
- GET /workspace/:id/sessions: return 400 workspace_mismatch for cross-workspace queries
- SessionNotFoundError: accept optional extra message; clarify agent-crash-on-spawn case
- requireWorkspaceCwd: distinguish empty-string (post-§02 bug) from absent (pre-§02 daemon)

* fix(serve/test): bind workspace explicitly in GET /workspace tests

Wave-5 commit 0c6e963cd ("apply auto-fixes from /review (#4113)") added a 400 workspace_mismatch reject path to GET /workspace/:id/sessions for cross-workspace queries, but the existing two happy-path tests queried `/work/a` / `/work/idle` against an unbound daemon (which falls back to `process.cwd()`). Both turned to 400 in CI.

Bind the daemon to WS_BOUND in both happy-path tests and query the same path. Add a third regression test that pins the §02 cross-workspace rejection contract: `code: workspace_mismatch`, both paths in the body, bridge.listCalls untouched (no silent fallback regression). Brings server.test.ts from 80 → 82 tests, all passing.

* fix(serve,sdk): address fourth /review round (deepseek-v4-pro x2)

Six new inline findings; five applied, one defer-with-reply.

- Q1 (httpAcpBridge.ts + server.ts + tests): cwd length amplification through WorkspaceMismatchError.
  The error constructor interpolates `requested` into `.message` TWICE; `sendBridgeError` echoes it on stderr (now JSON.stringify-wrapped); `res.json` echoes it again. A ~10 MB `cwd` body (right under express.json's 10 MB cap) would amplify to ~60 MB per request × maxConnections (default 256). On loopback-default-no-token deployments this is pre-auth. Added `MAX_WORKSPACE_PATH_LENGTH = 4096` (Linux PATH_MAX); route rejects oversized `cwd` with a 400 BEFORE the bridge is touched, and the `WorkspaceMismatchError` constructor truncates `requested` as defense-in-depth for non-route callers (tests, embeds, future entry points that throw the error directly). Three new tests pin the route 400, the constructor truncation, and the normal-path passthrough.
- Q2 + Q5 (httpAcpBridge.ts docs): the `channelInfo` declaration comment + `ChannelInfo.sessionIds` JSDoc + `ChannelInfo.isDying` JSDoc all overstated when `channelInfo` is cleared. Post-§02 the BkUyD invariant is "ONLY `channel.exited` clears `channelInfo`": teardown initiators (killSession last-session-leaving, doSpawn-newSession-failure, ensureChannel init-failure/late-shutdown, shutdown) set `isDying = true` but LEAVE `channelInfo` pointing at the dying channel until OS reap, so `killAllSync` can still reach it through `aliveChannels`. A future maintainer reading the old phrasing might "fix" killSession to also clear `channelInfo` and silently break the double-Ctrl+C force-kill path. Rewrote all three sites to describe the actual invariant + enumerate the 5 isDying set-sites + spell out the BkUyD rationale in one place (the `isDying` JSDoc) that other comments point at.
- Q3 (runQwenServe.ts): the "listening on …" boot summary goes to stdout but every other operational diagnostic (bearer auth, the workspace_mismatch breadcrumb, channel-exited, bridge errors) goes to stderr.
Operators capturing only stderr (systemd / docker / k8s default) miss the `workspace=` indicator, which is the single piece of information they need most when triaging §02 migration issues. Added a `qwen serve: bound to workspace "X"` stderr line alongside the stdout one — keeps stdout untouched (integration tests + scripts parse it) while making the breadcrumb visible to stderr-only log shippers. `JSON.stringify` the boundWorkspace value (operator-controlled but cheap defense-in-depth against any future flow that lands a control char in the path). - Q4 (integration-tests/tsconfig.json): the `paths` entry resolved `@qwen-code/sdk` to the SDK's built `dist/` directory; `dist/` is gitignored and stale dist (no `npm run build` first) yields TS2339 errors on the integration tests' imports of new SDK fields. Pointed `paths` at SDK source instead — `tsc -p integration-tests/tsconfig.json` no longer requires a prior rebuild. The vitest config's runtime alias still resolves to `dist/index.mjs` so the actual test execution exercises the published-bundle shape; this paths entry only affects type resolution. - Q6 (httpAcpBridge.ts): `createHttpAcpBridge` constructor called `canonicalizeWorkspace(opts.boundWorkspace)` even when the caller (`runQwenServe`) had already canonicalized and threaded the same value through `deps.boundWorkspace` into `createServeApp`. Two independent `realpathSync.native` calls can theoretically diverge on NFS-transient / mid-rename filesystems, landing the bridge with a canonical form different from what `/capabilities` advertises and from `createServeApp`'s view. Dropped the bridge's re-canonicalize; kept `path.isAbsolute` (structural, not a syscall); documented the caller contract on `BridgeOptions .boundWorkspace` ("MUST be pre-canonicalized; tests/embeds call `canonicalizeWorkspace` first"). Tests use `path.resolve(path.sep, ...)` which is already canonical-or- fallback for non-existent paths, so no test changes needed. 
bridge: 76/76 (was 74, +2 WorkspaceMismatchError truncation tests); server: 82/82 (was 80, +2 length cap + the auto-applied helper). tsc clean for SDK, CLI PR-touched files, and integration-tests' qwen-serve-*.
DaemonClient quickstart (TypeScript)
A minimal end-to-end example: start a qwen serve daemon in another terminal, then drive it from a Node script with the SDK's DaemonClient. See also: Daemon mode user guide and HTTP protocol reference.
Setup
In one terminal:
cd your-project/
qwen serve --port 4170
# → qwen serve listening on http://127.0.0.1:4170 (mode=http-bridge, workspace=/path/to/your-project)
Per #3803 §02, each daemon binds to one workspace at boot: the current working directory by default, or the path given with --workspace /path/to/dir. The daemon advertises its bound path as workspaceCwd in the /capabilities response, so clients can pre-flight check it and then omit cwd from POST /session.
In another:
npm install @qwen-code/sdk
Hello daemon
import { DaemonClient, type DaemonEvent } from '@qwen-code/sdk';

const client = new DaemonClient({
  baseUrl: 'http://127.0.0.1:4170',
  // token: process.env.QWEN_SERVER_TOKEN, // required for non-loopback binds
});

// 1. Confirm we can reach the daemon, gate UI on its features, and
//    read back the daemon's bound workspace (#3803 §02).
const caps = await client.capabilities();
console.log('Daemon features:', caps.features);
console.log('Daemon workspace:', caps.workspaceCwd); // canonical bound path

// 2. Spawn-or-attach a session. Two equally-valid shapes:
//    (a) pass `workspaceCwd: caps.workspaceCwd` to be explicit, or
//    (b) omit `workspaceCwd` entirely; the SDK then sends no `cwd`
//        field and the daemon route falls back to its bound
//        workspace. The (b) shape is concise but assumes you trust
//        `caps.workspaceCwd` to be whatever you intended.
//    A non-empty `workspaceCwd` that doesn't canonicalize to the
//    daemon's bound path yields `400 workspace_mismatch` (see
//    "Workspace mismatch" below).
const session = await client.createOrAttachSession({
  workspaceCwd: caps.workspaceCwd,
});
console.log(`session=${session.sessionId} attached=${session.attached}`);

// 3. Subscribe to the event stream. Pass `lastEventId: 0` so the daemon
//    replays everything from the session's start. Without it, there's
//    a TOCTOU window between `subscribeEvents()` returning the iterator
//    and the underlying SSE connection actually opening (one fetch
//    round-trip), during which a fast-starting agent can emit events
//    that go into the per-session ring but won't be streamed to a fresh
//    no-cursor subscriber. `lastEventId: 0` makes the replay buffer
//    cover that gap (and any reconnect later; see below).
const abort = new AbortController();
const subscription = (async () => {
  for await (const event of client.subscribeEvents(session.sessionId, {
    signal: abort.signal,
    lastEventId: 0,
  })) {
    handleEvent(event);
  }
})();

// 4. Send a prompt and wait for it to settle. (Order-of-operations
//    note: even if `prompt()` fires before the SSE handshake
//    completes, step 3's `lastEventId: 0` guarantees every event
//    lands in the iterator.)
const result = await client.prompt(session.sessionId, {
  prompt: [{ type: 'text', text: 'Summarize src/main.ts in one sentence.' }],
});
console.log('stop reason:', result.stopReason);

// 5. Tear down the subscription so the script can exit.
abort.abort();
await subscription;

function handleEvent(event: DaemonEvent): void {
  switch (event.type) {
    case 'session_update': {
      const data = event.data as {
        sessionUpdate: string;
        content?: { text?: string };
      };
      if (data.sessionUpdate === 'agent_message_chunk' && data.content?.text) {
        process.stdout.write(data.content.text);
      }
      break;
    }
    case 'permission_request':
      // See "Voting on permissions" below for first-responder semantics.
      console.log('\n[needs permission]', event.data);
      break;
    case 'permission_resolved':
      console.log('\n[permission resolved]', event.data);
      break;
    case 'session_died':
      console.error('\n[agent crashed]', event.data);
      break;
    default:
      console.log(`\n[${event.type}]`, event.data);
  }
}
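The cast in handleEvent trusts the payload shape. As a sketch only, a type guard can check that shape at runtime instead. The `AgentMessageChunk` interface and both function names below are hypothetical: they are inferred from the fields this quickstart reads and are not types exported by the SDK.

```typescript
// Hypothetical shape, inferred from the fields handleEvent reads.
interface AgentMessageChunk {
  sessionUpdate: 'agent_message_chunk';
  content: { text: string };
}

// Runtime check that narrows an unknown payload to AgentMessageChunk.
function isAgentMessageChunk(data: unknown): data is AgentMessageChunk {
  if (typeof data !== 'object' || data === null) return false;
  const d = data as { sessionUpdate?: unknown; content?: unknown };
  if (d.sessionUpdate !== 'agent_message_chunk') return false;
  if (typeof d.content !== 'object' || d.content === null) return false;
  return typeof (d.content as { text?: unknown }).text === 'string';
}

// Concatenate the text of every agent message chunk in a batch of payloads,
// silently skipping anything that does not match the expected shape.
function collectText(payloads: unknown[]): string {
  return payloads
    .filter(isAgentMessageChunk)
    .map((p) => p.content.text)
    .join('');
}
```

Inside the `session_update` case, `if (isAgentMessageChunk(event.data)) process.stdout.write(event.data.content.text);` would then replace the cast.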
Reconnect with Last-Event-ID
If the event stream drops mid-session, or your client process restarts with a saved cursor, replay the events you missed:
let cursor: number | undefined;

for await (const event of client.subscribeEvents(session.sessionId, {
  signal: abort.signal,
  lastEventId: cursor, // resume from after this id; undefined = live only
})) {
  if (typeof event.id === 'number') cursor = event.id;
  handleEvent(event);
}
The daemon retains the last 4000 events per session in a ring buffer; events that have aged out of that window cannot be replayed.
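To see why a stale cursor cannot always be fully replayed, here is a toy ring buffer with the retention behaviour described above. It is an illustration, not the daemon's implementation: the `EventRing` class, its method names, and the assumption that ids start at 1 and increase by one are all made up for the example.

```typescript
interface BufferedEvent<T> {
  id: number;
  data: T;
}

// Toy per-session event ring (hypothetical; not the daemon's code).
class EventRing<T> {
  private readonly events: BufferedEvent<T>[] = [];
  private readonly capacity: number;
  private nextId = 1; // assumption: ids start at 1 and increase by one

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Append an event, evicting the oldest one once capacity is exceeded.
  push(data: T): number {
    const id = this.nextId++;
    this.events.push({ id, data });
    if (this.events.length > this.capacity) this.events.shift();
    return id;
  }

  // Events newer than the cursor that are still retained.
  replayAfter(lastEventId: number): BufferedEvent<T>[] {
    return this.events.filter((e) => e.id > lastEventId);
  }

  // True when at least one event after the cursor has already been evicted.
  hasGapAfter(lastEventId: number): boolean {
    const oldest = this.events[0];
    return oldest !== undefined && oldest.id > lastEventId + 1;
  }
}
```

With a capacity of 3 and five pushed events, only ids 3 to 5 remain: a client resuming from cursor 2 gets everything it missed, while one resuming from cursor 1 has permanently lost event 2.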
Voting on permissions
When the agent asks for permission to run a tool, every connected client sees the permission_request event. First responder wins — once one client votes, the rest get 404 if they try to vote on the same requestId.
// Goes inside the `switch` in handleEvent. Because it uses `await`,
// handleEvent must be `async` (returning `Promise<void>`) for this to compile.
case 'permission_request': {
  const req = event.data as {
    requestId: string;
    options: Array<{ optionId: string; name: string; kind: string }>;
  };
  // Prefer the option whose kind is `allow_once`; otherwise fall back to
  // the first option offered.
  const choice = req.options.find((o) => o.kind === 'allow_once') ?? req.options[0];
  const accepted = await client.respondToPermission(req.requestId, {
    outcome: { outcome: 'selected', optionId: choice.optionId },
  });
  if (!accepted) {
    console.log('Another client voted first; nothing to do.');
  }
  break;
}
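If falling back to the first listed option is too permissive for your client, one alternative is to vote only when a preferred kind is actually on offer. `pickOption` below is a hypothetical helper, not part of the SDK, and the only kind string taken from this guide is `allow_once`.

```typescript
interface PermissionOption {
  optionId: string;
  name: string;
  kind: string;
}

// Return the first option matching the caller's kinds, in preference order.
// Returns undefined when none match, so the caller can ask the user or
// leave the vote to another connected client.
function pickOption(
  options: PermissionOption[],
  preferredKinds: string[],
): PermissionOption | undefined {
  for (const kind of preferredKinds) {
    const match = options.find((o) => o.kind === kind);
    if (match) return match;
  }
  return undefined;
}
```

In the case block above, `const choice = pickOption(req.options, ['allow_once']);` followed by an early `break` when `choice` is undefined would avoid voting for whatever happens to be listed first.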
Shared-session collaboration
Two clients pointed at the same daemon end up on the same session. Per #3803 §02 each daemon is bound to ONE workspace at boot, so the daemon launched as qwen serve --workspace /work/repo (or cd /work/repo && qwen serve) is what both clients connect to:
// Daemon was launched as `qwen serve --workspace /work/repo` so
// `caps.workspaceCwd === '/work/repo'` for both clients.
// Client A (e.g. an IDE plugin)
const a = await clientA.createOrAttachSession({ workspaceCwd: '/work/repo' });
console.log(a.attached); // false — A spawned the agent
// Client B (e.g. a web UI on the same machine)
const b = await clientB.createOrAttachSession({ workspaceCwd: '/work/repo' });
console.log(b.attached); // true — B joined A's session
console.log(a.sessionId === b.sessionId); // true
Both clients see the same session_update / permission_request stream. Either can send a prompt; prompts are queued FIFO, per the agent's "one active prompt per session" guarantee.
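The spawn-or-attach behaviour can be pictured with a toy model. This is a sketch under simplifying assumptions, not the bridge's code: `ToyDaemon` is a made-up class, it compares paths as plain strings (the real daemon canonicalizes them), and it models only the shared default session.

```typescript
interface SessionResult {
  sessionId: string;
  attached: boolean;
}

// Toy model of one daemon bound to one workspace (hypothetical).
class ToyDaemon {
  private readonly boundWorkspace: string;
  private sessionId: string | undefined; // single slot: one shared session

  constructor(boundWorkspace: string) {
    this.boundWorkspace = boundWorkspace;
  }

  createOrAttach(workspaceCwd?: string): SessionResult {
    // An omitted cwd falls back to the bound workspace.
    const cwd = workspaceCwd ?? this.boundWorkspace;
    if (cwd !== this.boundWorkspace) {
      throw new Error(
        `workspace_mismatch: bound=${this.boundWorkspace} requested=${cwd}`,
      );
    }
    if (this.sessionId !== undefined) {
      return { sessionId: this.sessionId, attached: true }; // join existing
    }
    this.sessionId = 'session-1';
    return { sessionId: this.sessionId, attached: false }; // first caller spawns
  }
}
```

The first caller gets `attached: false`, every later caller gets `attached: true` with the same `sessionId`, and a different path is rejected rather than routed elsewhere.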
Workspace mismatch
If workspaceCwd doesn't match the daemon's bound workspace, createOrAttachSession rejects with DaemonHttpError carrying status 400 and a structured body:
import { DaemonHttpError } from '@qwen-code/sdk';
try {
  await client.createOrAttachSession({ workspaceCwd: '/some/other/project' });
} catch (err) {
  // Only the workspace mismatch is handled here; rethrow everything else
  // so unrelated failures are not silently swallowed.
  if (!(err instanceof DaemonHttpError) || err.status !== 400) throw err;
  const body = err.body as {
    code?: string;
    boundWorkspace?: string;
    requestedWorkspace?: string;
  };
  if (body.code !== 'workspace_mismatch') throw err;
  console.error(
    `This daemon is bound to ${body.boundWorkspace}, ` +
      `not ${body.requestedWorkspace}. Start a separate daemon ` +
      `for that workspace, or route to the right one.`,
  );
}
Multi-workspace deployments run one daemon per workspace on separate ports — there's no intra-daemon routing under §02. An orchestrator (or the user's launcher) picks the right daemon based on the project the client wants to talk to.
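How a launcher picks the right daemon is left to the deployment. As a minimal sketch, it could keep a small registry and look up the daemon by workspace path. Everything here is hypothetical (the `DaemonEntry` shape, the helper names, the ports), and the path comparison is purely lexical, whereas the daemon itself compares canonical paths.

```typescript
// Hypothetical registry entry: one daemon per workspace, each on its own port.
interface DaemonEntry {
  workspace: string;
  baseUrl: string;
}

// Crude lexical normalization: only strips trailing slashes. It does not
// resolve symlinks or `..` segments the way a realpath call would.
function normalizePath(p: string): string {
  const trimmed = p.replace(/\/+$/, '');
  return trimmed === '' ? '/' : trimmed;
}

// Find the daemon bound to the given project path, if one is registered.
function pickDaemon(
  registry: DaemonEntry[],
  projectPath: string,
): DaemonEntry | undefined {
  const wanted = normalizePath(projectPath);
  return registry.find((d) => normalizePath(d.workspace) === wanted);
}
```

The returned `baseUrl` would feed `new DaemonClient({ baseUrl })`; comparing the daemon's advertised `caps.workspaceCwd` afterwards remains the authoritative check.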
Authentication
When the daemon was started with a token (any non-loopback bind requires one):
const client = new DaemonClient({
  baseUrl: 'https://your-host:4170',
  token: process.env.QWEN_SERVER_TOKEN,
});
Wrong or missing tokens return 401 with a uniform body. The SDK throws DaemonHttpError on any 4xx/5xx from a route handler:
import { DaemonHttpError } from '@qwen-code/sdk';
try {
  await client.health();
} catch (err) {
  if (err instanceof DaemonHttpError) {
    console.error(`Daemon error ${err.status}:`, err.body);
  } else {
    throw err;
  }
}
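Once the status and body are in hand, a client usually wants to turn them into something actionable. The helper below is a hypothetical sketch: `HttpFailure` merely mirrors the `status` and `body` fields read from `DaemonHttpError` above, and the messages cover only the status codes mentioned in this guide.

```typescript
// Structural stand-in for the two DaemonHttpError fields used in this guide.
interface HttpFailure {
  status: number;
  body: unknown;
}

// Map a failed daemon response to a short, user-facing hint.
function describeFailure(failure: HttpFailure): string {
  const code =
    typeof failure.body === 'object' && failure.body !== null
      ? (failure.body as { code?: unknown }).code
      : undefined;

  if (failure.status === 401) return 'Authentication failed: check the token.';
  if (failure.status === 400 && code === 'workspace_mismatch') {
    return 'This daemon is bound to a different workspace.';
  }
  if (failure.status === 404) return 'Not found: the session or request no longer exists.';
  if (failure.status >= 500) return 'The daemon reported an internal error.';
  return `Unexpected HTTP ${failure.status}.`;
}
```

In the catch block above, `console.error(describeFailure(err))` would work as written, assuming `DaemonHttpError` exposes `status` and `body` as shown.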
Cancel an in-flight prompt
If your user hits Esc:
await client.cancel(session.sessionId);
// In the event stream you'll see the prompt resolve with stopReason: "cancelled"
Cancel only winds down the active prompt — anything you'd already POSTed and that's still queued behind it will continue to run. (See protocol reference for the rationale.)
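The interaction between the FIFO prompt queue and cancel can be modelled in a few lines. This is a toy state machine written for illustration, with a made-up class and state names; it is not how the daemon or agent is implemented.

```typescript
type PromptState = 'queued' | 'active' | 'done' | 'cancelled';

// Toy model: one active prompt per session, the rest wait in FIFO order.
class ToyPromptQueue {
  private readonly states = new Map<string, PromptState>();
  private readonly pending: string[] = []; // active prompt first, then queued

  submit(id: string): void {
    this.pending.push(id);
    this.states.set(id, this.pending.length === 1 ? 'active' : 'queued');
  }

  // Settle the active prompt and promote the next queued one, if any.
  private settle(outcome: 'done' | 'cancelled'): void {
    const active = this.pending.shift();
    if (active === undefined) return;
    this.states.set(active, outcome);
    const next = this.pending[0];
    if (next !== undefined) this.states.set(next, 'active');
  }

  finishActive(): void {
    this.settle('done');
  }

  // Cancel touches only the active prompt; queued prompts still run.
  cancelActive(): void {
    this.settle('cancelled');
  }

  stateOf(id: string): PromptState | undefined {
    return this.states.get(id);
  }
}
```

Submitting p1, p2, p3 and then cancelling leaves p1 cancelled, p2 active, and p3 still queued, which is why a client that wants to stop everything has to cancel again as each queued prompt becomes active.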
What's next
- HTTP protocol reference — full route spec with status codes
- Daemon mode user guide — operator-side docs
- Source: packages/sdk-typescript/src/daemon/