vrr/qwen-code: mirror of https://github.com/QwenLM/qwen-code?tab=readme-ov-file#-free-options-available

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-19 16:28:28 +00:00

mirror of https://github.com/QwenLM/qwen-code?tab=readme-ov-file#-free-options-available

Find a file

jinye 36760ca63c feat(serve): auth device-flow route (#4175 Wave 4 PR 21) (#4255 ) * feat(serve): auth device-flow route Implements issue #4175 Wave 4 PR 21. Brokers OAuth 2.0 Device Authorization Grant (RFC 8628) through the `qwen serve` daemon so a remote SDK client can trigger a Qwen-account login whose tokens land on the daemon filesystem, not on the client. The daemon polls the IdP itself; the client's only job is to display the verification URL + user code. Runtime locality (#4175 §11): the daemon NEVER spawns a browser or calls `open(url)` — even when running locally. Static-source grep test fails the build on `node:child_process` / `open` / `xdg-open` / `shell.openExternal` / `execa` / `shelljs` / `process.spawn` and their dynamic-import / require variants. - `POST /workspace/auth/device-flow` — strict mutation gate; returns 201 fresh / 200 idempotent take-over with `attached: true`. Per per-`providerId` singleton: a second POST while pending takes over rather than allocating a new `device_code`. - `GET /workspace/auth/device-flow/:id` — public state read. Pending entries echo `userCode/verificationUri/expiresAt/intervalMs`; terminal entries (5-min grace) drop them and surface `status/errorKind/hint`. - `DELETE /workspace/auth/device-flow/:id` — strict; idempotent (terminal → 204 no-op; unknown → 404). - `GET /workspace/auth/status` — pending flows + supported providers snapshot. v1 stub for `providers: []` (populated in fold-in 1). `DeviceFlowRegistry` (`packages/cli/src/serve/auth/deviceFlow.ts`) is the in-memory state holder: - per-`providerId` singleton with idempotent take-over - workspace-wide cap of 4 active flows (abuse defense) - 5-min terminal grace so SDK reconnects can still observe results - TTL sweeper evicts grace-expired entries every 30s - in-flight `Promise` map coalesces concurrent `start()` calls so two parallel POSTs don't double-allocate IdP `device_code` - `transitionTerminal` returns `boolean` so caller-side emit/audit guard prevents sweeper × poll-tick double-fire - `dispose()` wired into `runQwenServe.close()`'s shutdown drain; cancels `provider.poll()` mid-flight via `cancelController`, records `lost_success` audit when an IdP-minted token is dropped by transition `DeviceFlowProvider` interface accepts `start({signal})` + `poll(state, {signal})`. `QwenOAuthDeviceFlowProvider` wraps the existing `QwenOAuth2Client.requestDeviceAuthorization` / `pollDeviceToken` primitives directly (NOT `authWithQwenDeviceFlow`, which calls `open(url)`). PKCE is provider-required by Qwen but optional in the interface for future non-PKCE providers. `success.persist()` writes to disk FIRST, then updates the in-process client — a failed disk write no longer leaves the daemon with a zombie in-memory token. Maps RFC 8628 errors via an anchored regex (`^Device token poll failed: (expired_token\|access_denied\|invalid_grant)`) so an `error_description` containing one of those literals can't mis-classify an unrelated upstream error. `BrandedSecret<T extends string>` holds the `device_code` and PKCE verifier. Earlier draft used `new String()` wrapper which leaked through `+` / template literals (`Symbol.toPrimitive` → `valueOf` returned the primitive). Final shape: frozen plain object + `WeakMap` indirection + 4-way redaction (`toString` / `toJSON` / `Symbol.toPrimitive` / numeric coercion → `'[redacted]'` or `NaN`) + `unique symbol` brand. 6 leak-path tests: `JSON.stringify` / `String()` / concat / template / `+x` / reveal-roundtrip. 5 new daemon events (workspace-scoped, fanned out to every active session bus via `bridge.broadcastWorkspaceEvent`): - `auth_device_flow_started` — `{deviceFlowId, providerId, expiresAt}` (no userCode/verificationUri — see PR 21 design §3) - `auth_device_flow_throttled` — `{deviceFlowId, intervalMs}`, emitted only on upstream `slow_down` interval bumps - `auth_device_flow_authorized` — `{deviceFlowId, providerId, expiresAt?, accountAlias?}`; `accountAlias` is best-effort non-PII (never email/phone) - `auth_device_flow_failed` — `{deviceFlowId, errorKind, hint?}` with `errorKind ∈ {expired_token, access_denied, invalid_grant, upstream_error, persist_failed}` - `auth_device_flow_cancelled` — `{deviceFlowId}` (DELETE on pending) Workspace-scoped reducer `reduceDaemonAuthEvent` produces `DaemonAuthState { flows: Partial<Record<ProviderId, ...>> }` — parallel to `reduceDaemonSessionEvent`. Session reducer no-ops on auth events (workspace-scoped state belongs in its own reducer). `bridge.broadcastWorkspaceEvent` is intentionally distinct from PR 16's `publishWorkspaceEvent` to avoid merge conflict; collapses to the shared helper as a fold-in once #4249 lands (~25 LoC). `@qwen-code/sdk` (`packages/sdk-typescript/`): - 4 new `DaemonClient` methods: `startDeviceFlow`, `getDeviceFlow`, `cancelDeviceFlow`, `getAuthStatus` — typed against the wire shapes, errors mapped through the existing `DaemonHttpError`. - High-level `client.auth` getter (lazy `DaemonAuthFlow` singleton) exposes a `start(...).awaitCompletion()` shape mirroring `gh auth login`'s UX: print code first, let the SDK consumer decide where to open the browser. `awaitCompletion` polls GET on the daemon-supplied `intervalMs`, honors `slow_down` bumps, and fall-back-recovers from 404 (entry evicted post-grace). POST + DELETE flow through PR 15's `mutate({strict: true})` — 401 `token_required` on token-less loopback defaults. GET routes use only the global `bearerAuth`. Every state transition (`started/authorized/failed/cancelled/expired/lost_success`) records a structured stderr breadcrumb (`[serve] auth.device-flow: provider=... deviceFlowId=abc12... clientId=... status=...`) since `mutate()` doesn't carry an audit hook — events alone aren't enough since SDK can silently drop them; stderr → journald/docker logs is the unfalsifiable record. `auth_device_flow` advertised unconditionally on `/capabilities.features`. Supported providers list lives on `/workspace/auth/status` to keep the registry descriptor uniform. - `packages/core/src/qwen/qwenOAuth2.ts`: - exports `cacheQwenCredentials` (was a private function; needed by the daemon's device-flow registry) - `cacheQwenCredentials` now calls `SharedTokenManager.clearCache()` after writing, folding what was previously a paired call site at L820+L829. Idempotent change. - file mode `0o600` on `oauth_creds.json` (was default 0o666 + umask). Mirrors opencode's `auth/index.ts`. - `packages/cli/src/serve/runQwenServe.ts`: device-flow registry `dispose()` wired into the shutdown drain (BEFORE `bridge.shutdown()`). - `auth/deviceFlow.test.ts` — 21 tests: BrandedSecret leak paths, state machine (slow_down / success / error), terminal grace, concurrent-start coalescing, dispose, cancel idempotency, static- source grep against browser-spawn primitives. - `server.test.ts` — 10 device-flow integration tests: POST 201/200 take-over, strict 401, 400 `unsupported_provider`, GET / DELETE / `/workspace/auth/status`, 502 `upstream_error` mapping, sweeper-driven auto-expiry with controlled clock, capability advertisement. - `daemonEvents.test.ts` — 5 SDK reducer tests: type guards, per- provider state projection, `failed` always → `status: 'error'` (errorKind carries the kind, including new `persist_failed`), session reducer no-ops on auth events. 369/369 serve + SDK tests pass; typecheck + `eslint --max-warnings 0` clean across 14 PR 21 files. - [x] Independently mergeable (depends only on merged PR 4 / PR 7 / PR 12 / PR 15) - [x] Backward compatible (4 new routes + 1 capability tag + 5 typed events + 4 SDK helpers; existing routes/events untouched) - [x] Default off (capability advertised but no client is forced to use it; CLI `qwen` OAuth flow unchanged) - [x] `qwen serve` Stage 1 routes / SDK behavior preserved - [x] Gradual migration (v1 only `qwen-oauth`; future providers register through the `DeviceFlowProvider` interface) - [x] Reversible (revert removes 4 routes + 1 tag + 5 events with no schema migration) - [x] Tests-first (28 new tests across 3 layers) - Inline `bridge.broadcastWorkspaceEvent` → fold-in to PR 16 (#4249) `publishWorkspaceEvent` once that lands - `/workspace/auth/status` vs PR 12 `/workspace/providers` boundary — separate route in v1; merge alternative discussed - Wave 4 PRs 17/19/20 should adopt the same mutate-strict + workspace event-fan-out pattern 5 items from pre-PR specialist passes parked for a focused follow-up: `DeviceFlowEntry` discriminated union, single-source SDK status / ProviderId unions, `awaitCompletion` memoization, broadcast-100%-fail stderr elevation, SDK 404 → `not_found_or_evicted` errorKind. Refs: #4175 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 round-1 review feedback Eleven items from copilot-pull-request-reviewer's round-1 pass on #4255 — 4 inline threads + 7 from the PR-level review summary. ## Adopted (11 items, code/doc changes) - `lastSeenAt` → `lastSeenEventId` (`events.ts`, `DaemonDeviceFlowReducerState`). The field was set from `rawEvent.id` (SSE event id) but documented as "epoch ms" — a real semantic mismatch that would mislead consumers into time-based logic against a monotonic counter. Rename + tighten the JSDoc to describe it as an event-id counter; reducer cases updated. - `DEVICE_FLOW_EXPIRY_GRACE_MS = 30_000` extracted in `DaemonAuthFlow.ts` (was a magic number on `start.expiresAt + 30_000`). `AwaitCompletionOptions.timeoutMs` doc now describes the actual grace-past-expiry behavior + the rationale (clock skew + daemon sweeper interval + network latency) instead of the wrong "defaults to expiresAt - Date.now()" claim. - Explicit `chmod 0o600` in `cacheQwenCredentials` after every write. `fs.writeFile`'s `mode` only applies on file creation; a pre-existing `oauth_creds.json` written under a broader umask kept its old permissions across upgrades. The chmod now tightens it on every write; chmod failure (Windows / hardened FS) surfaces via `debugLogger.warn` instead of silently dropping the invariant. - `SharedTokenManager.clearCache()` failure now logs `debugLogger.warn` (was a silent `try { } catch { }`). In production a swallowed clearCache means in-process callers serve stale credentials until the SharedTokenManager mtime watcher catches up — a recoverable degradation worth a log line. - Protocol doc lists `persist_failed` in the `auth_device_flow_failed.errorKind` union (was added to the type but missed in the doc). - `pollDeviceToken({signal})` plumbed through `IQwenOAuth2Client` interface + `QwenOAuth2Client` impl + the Qwen device-flow provider. Cancel / dispose during a slow IdP response now aborts the in-flight HTTP socket immediately instead of waiting for the upstream timeout. Two new registry tests assert `cancel()` / `dispose()` propagate abort to the signal observed by `provider.poll`. - `revealSecret` error message clarified: was "secret has been GC-evicted" (impossible — WeakMap doesn't evict reachable keys). Now points at the actual reachable failure modes (forged shape / serialize+reparse losing the WeakMap binding). - `transitionTerminal` JSDoc clarifies that the PRIMARY guard against late timer secret leaks is the `entry.status !== 'pending'` check at the top of `runPollTick`; secret-clearing here is defense-in-depth. - `DeviceFlowErrorKind` JSDoc'd per variant so consumers can tell when each fires (RFC 8628 distinctions + `persist_failed` vs `upstream_error` boundary). - Stale "PR 16 / PR 21 §3" temporal references in `DaemonAuthFlow.ts:124` rephrased to be timeless ("workspace-scoped events fan out through whatever session buses happen to be live" — no PR number references that rot when those PRs merge). ## Not adopted (4 items, replied to in-thread) - `authWithQwenDeviceFlow` browser-launch separation — correct architectural advice but out of #4255 scope (would refactor a CLI auth UX module that PR 21 only touched additively). Tracked as a Wave 5 follow-up. - Copyright header year range — repo-wide convention "2025"; not introduced by this PR. - Spread `...(x ? {x} : {})` → `x: x ?? undefined` — the two are not semantically equivalent. The current form omits the key entirely on falsy `x`; the suggested form always includes the key. Tests assert object shape and would break under the change. - Eager `client.auth` getter — public API boundary. Lazy construction matches `DaemonSessionClient` precedent + saves the module load for SDK consumers that never touch auth. Refs: #4175 #4255 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-1 review feedback 15 items from @wenshao's review batches on #4255. Catches a handful of real bugs that the earlier round (commit 3d9f082f5) didn't surface. ## Critical fixes - C1 — `pollUntilTerminal` providerId pass-through (`DaemonAuthFlow.ts:185`). The synthetic 404 fallback hardcoded `providerId: 'qwen-oauth'`; the parent `awaitCompletion` already receives the real providerId via `start.providerId` but `pollUntilTerminal`'s parameter type stripped it. Add the field to the param type, propagate. - C2 — open `errorKind` allowlist (`events.ts`). The closed 5-value union in the type guard silently dropped any `failed` event whose errorKind the daemon added without mirroring SDK-side (e.g. a future `rate_limited`). The flow's reducer state would never transition to terminal, leaving SDK consumers stuck on `pending` forever. Open the union with `(string & {})` and accept any non-empty string in the runtime guard. Updated test asserts forward-compat behavior + still rejects the truly-malformed empty-string case. - C3 — `persist()` timeout + signal (`deviceFlow.ts`). A wedged disk I/O (NFS stall, encrypted-volume contention) without bounds would pin the entry in `pending` until the upstream `expires_in` elapsed (potentially minutes). The registry now passes its `cancelController.signal` AND arms a hard `DEVICE_FLOW_PERSIST_TIMEOUT_MS = 30_000` timer; persist failure surfaces as `persist_failed` immediately. The `DeviceFlowPollResult` `success` variant signature changed to `persist({signal})`. - C4 — cancel × success race rollback (`deviceFlow.ts` + Qwen provider). Today, if `cancel()` transitions while `persist()` is in flight, the credentials get written but the flow's status is `cancelled`. User sees cancelled, daemon disk has a valid token. `DeviceFlowPollResult.success` gains an optional `unpersist()` callback the registry calls when `transitionTerminal(authorized)` fails — the Qwen provider wires it to `clearQwenCredentials()`. Rollback failure is audited but not propagated (re-running auth would overwrite anyway). - C5 — don't `unref()` the `awaitCompletion` sleep timer (`DaemonAuthFlow.ts`). On a standalone Node CLI/script doing just `client.auth.start().awaitCompletion()`, the unref'd between-poll timer was the only event-loop handle, so Node could exit before the user finished authorization. The poll wait is foreground work the caller explicitly awaits — keep it ref'd. ## Information-leak fixes - S1 — sanitize `persist_failed` hint. `err.message` from `cacheQwenCredentials` embeds the full `~/.qwen/oauth_creds.json` path. Broadcast via SSE, that path leaks the daemon's home layout to every connected session subscriber. Replace user-facing hint with `"credentials could not be written to the daemon filesystem — check disk space and permissions"`; full err goes to stderr audit only. - S2 — sanitize upstream `pollDeviceToken` hint. The class embedded the entire raw IdP response body (which can be an HTML error page from a reverse proxy) into the thrown message. Same broadcast leak path. Replace upstream-error hint with `"unexpected response from identity provider"`; RFC 8628 errors use `"Qwen IdP returned ${kind}"`. ## Cleanup / forward-compat - D1 — drop duplicate `clearCache()` at `qwenOAuth2.ts:840`. The paired call became redundant once `cacheQwenCredentials` folded the clearCache in (PR #4255 fold-in 1). The fold-in 1 message said this would be done; the duplicate slipped through. - S3 — drop unused `DeviceFlowNotFoundError` (`deviceFlow.ts`). Exported but never imported; route handlers do inline 404 JSON. - S4 — single-source SDK status / errorKind unions (`types.ts`). `DaemonAuthDeviceFlowSdkStatus` / `DaemonAuthDeviceFlowSdkErrorKind` were parallel literal copies of the canonical events.ts definitions — drift waiting to happen. Now imported + aliased as type-only re-exports. - S5 — broadcast 100% fail elevates to stderr (`httpAcpBridge.ts`). Per-session bus failures stay debug-only, but a broadcast where EVERY session bus refused is operationally interesting (clients won't see the event). Track success / fail counts; `writeStderrLine` when `successCount === 0`. - S6 — `this.disposed` check after `await provider.start()` (`deviceFlow.ts`). `dispose()` mid-start would orphan the freshly- inserted entry (`schedulePoll` guards on `disposed` so no poll fires; the entry never transitions). Throw post-await if disposed. - W1 — thread `signal` into `requestDeviceAuthorization` (`qwenOAuth2.ts` + Qwen provider). `start()` had the same cancellation gap that `pollDeviceToken` had — a slow device-authorization request couldn't be aborted during shutdown. Now plumbed end-to-end. - W2 — split `invalid_request` from `unsupported_provider` (`server.ts`). Conflating them surfaced misleading remediation hints to SDK consumers branching on `code` ("this provider isn't supported here" when the real cause was a serializer dropping the field). Bad-shape now returns `code: 'invalid_request'`; unknown-but-well-formed stays `unsupported_provider`. - W3 — drop never-populated `accountAlias` (Qwen provider). The field was wired through types / events / reducer / audit but the Qwen IdP's token response doesn't carry one (no `name` / `email` / `sub`). Returning only `{expiresAt}` makes the field type-honestly absent rather than always-undefined. Future provider with an alias-bearing response can populate it. - W4 — `DaemonAuthFlow` JSDoc accuracy. Doc claimed "first attempts to consume an SSE event stream … falls back to GET-based polling"; actual is GET-only with SSE as a real-time hint for clients already subscribed to a session stream. - W5 — clearer unit arithmetic in interval normalization. The `(_INTERVAL_MS / 1000) * 1000` cancelation hid the s↔ms boundary; expanded form makes both branches unit-explicit. ## Test changes - `daemonEvents.test.ts` updated to match the now-OPEN errorKind union (forward-compat assertion + empty-string still rejected). - `deviceFlow.test.ts` `FakeProvider.poll` aligned with the new `persist({signal})` signature + optional `unpersist`. ## Validation - `npm run typecheck --workspace packages/cli --workspace packages/sdk-typescript --workspace packages/core` — clean - `npx vitest run packages/cli/src/serve/ packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 368/368 - `npx eslint --max-warnings 0` over the 11 PR 21 surface files — clean Refs: #4175 #4255 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-2 review feedback 10 new threads from @wenshao's second deep-review pass on #4255. Verified status: 5 real issues, 1 improvement, 3 stale (already fixed; comments lagged), 1 false alarm (typecheck demonstrably clean). ## Critical fixes - fold-in 2 C4 REVERSED: when `provider.poll()` returns success AND `cancel()` / `dispose()` transitioned the entry mid-`persist()`, the registry now FORCES the entry to `authorized` and keeps the on-disk credentials. The earlier rollback (`unpersist()`) wasted the user's IdP approval because the RFC 8628 `device_code` is single-use — re-running the flow would force them through the whole browser-prompt + paste-code dance again for a click whose intent was likely "stop the wait" rather than "undo my already- completed approval". Aligns with gh CLI / Auth0 SDK / git- credential-manager. Audit captures the race via `hint: 'lost_success_kept ...'`. `DeviceFlowPollResult.success.unpersist` field + Qwen provider's `clearQwenCredentials` rollback removed. - #1 GET /workspace/auth/device-flow/:id strict gate: this GET surfaces `userCode` / `verificationUri` for pending entries, which on the loopback no-token default were readable by any local process. POST + DELETE were already strict; aligning GET closes the information-disclosure asymmetry. `/workspace/auth/status` stays bearer-only (its `pendingDeviceFlows` entries intentionally omit `userCode`). - #2 `inFlightStarts` hard timeout: a hung `provider.start()` (network partition, unresponsive IdP) used to leave the per- `providerId` slot in `inFlightStarts` occupied forever, blocking every subsequent POST until daemon restart. New `DEVICE_FLOW_START_TIMEOUT_MS = 30_000` arms a timer that `cancelController.abort()`s the start; the rejected promise unwinds through the `try/finally` clearing the slot. - #10 chain-completing the C3 persist-timeout: the earlier C3 fix armed a 30s timer that fired `cancelController.abort()` then `await result.persist({signal})`, but the chain ended at the registry boundary — `cacheQwenCredentials` didn't take a signal, so `fs.writeFile` couldn't be aborted. Now `cacheQwenCredentials` accepts an optional `{signal}` and threads it into `fs.writeFile(..., {signal})` (Node native). The Qwen provider's `persist({signal})` forwards the entry's `cancelController.signal` end-to-end. ## Improvement (#4): 404 fallback errorKind `pollUntilTerminal`'s 404 catch used to synthesize `{status: 'expired'}` for ALL evicted entries — conflating "your flow expired during your disconnect", "the daemon was restarted", and "your deviceFlowId was wrong". Now returns `status: 'error'` + `errorKind: 'not_found_or_evicted'` + a `hint` so SDK consumers branching on errorKind can distinguish. ## Information leak (#9): start() path raw IdP message S2 (fold-in 2) sanitized `poll()`'s upstream-error hint, but `start()` still embedded the raw `err.message` (full IdP response, potentially HTML from a reverse proxy / WAF) into the `UpstreamDeviceFlowError` that flowed to SDK clients via the 502. Now uses static messages for the SDK-visible errors; raw detail goes through `writeStderrLine` for operator audit only. Mirrors S2's approach. ## Stale comments cleaned (#5, #7) `qwenDeviceFlowProvider.ts:177` claimed `cacheQwenCredentials` "doesn't currently take a signal — that's a follow-up". After #10 above, that's no longer true; the comment is replaced with the actual end-to-end signal-threading note. ## Not adopted (1 false alarm) - Thread on `types.ts:330` claimed type-only-import-after- declarations breaks `tsc` and fails `daemonEvents.test.ts:670` with TS2345. Demonstrably false: `npx tsc -p packages/sdk-typescript/tsconfig.json --noEmit` exits 0; `daemonEvents.test.ts` is the post-fold-in-2 file with the open-allowlist assertion (test 28/28 passes). The reviewer may have been looking at a transient state during their analysis. ## Validation - `npm run typecheck --workspace packages/cli --workspace packages/sdk-typescript --workspace packages/core` — clean - `npx vitest run packages/cli/src/serve/ packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398 pass - `npx eslint --max-warnings 0` over the PR 21 surface — clean Refs: #4175 #4255 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-3 review feedback 5 new threads from the third deep-review pass on #4255. 3 real issues fixed; 1 stale (already done in fold-in 3); 1 deferred as non-blocking design suggestion. - A — `expiresIn` / `interval` non-finite guard (`deviceFlow.ts`). The provider contract types both as `number`, but a misbehaving / future provider could hand `undefined` / `NaN` / `Infinity`. `Math.max(0, NaN) * 1000` is `NaN`, then `now() + NaN` is `NaN`, then `now >= NaN` is always `false` — the sweeper would NEVER evict the entry, pinning an upstream `device_code` slot until daemon restart. Same hazard on `interval * 1000` (NaN → `setTimeout(NaN)` fires immediately, Infinity → scheduler clamps to TIMEOUT_MAX). Now both fields go through `Number.isFinite(x) && x > 0`; missing/bad values fall back to RFC 8628's recommended ceilings (10 min for expiry, 5s for interval). - D — typed `app.locals` accessor (`deviceFlow.ts` + writer/reader call sites). The `app.locals['deviceFlowRegistry']` string key was shared between `createServeApp` (writer) and `runQwenServe` (reader); a typo on either side would compile cleanly and the shutdown dispose call would silently no-op, leaving polling timers running until the `unref()` rescue. New `setDeviceFlowRegistry(app, registry)` / `getDeviceFlowRegistry(app)` pair gives both call sites type-checked access; the string literal is encapsulated in one module. - E — `UnsupportedDeviceFlowProviderError` docstring (`deviceFlow.ts`). After fold-in 2's W2 fix split `invalid_request` from `unsupported_provider`, the route layer screens unknown ids against `DEVICE_FLOW_SUPPORTED_PROVIDERS` before reaching the registry — so this error is now reachable ONLY on a daemon-internal invariant violation (id is declared supported but not registered in the runtime provider map). Docstring + thrown message updated to reflect that this branch signals a programmer error, not user input. - B claimed `cacheQwenCredentials(credentials)` doesn't forward signal to `fs.writeFile`. Verified: fold-in 3 (#10) at `qwenDeviceFlowProvider.ts:204` calls `cacheQwenCredentials(credentials, { signal: persistOpts.signal })` and the core helper threads it into `fs.writeFile(..., {mode, signal})`. The reviewer was looking at the comment block above (lines 174-181) without scrolling to the actual call site. - C — SDK `cancelDeviceFlow` lossy 204/404 collapse. Suggested returning `{existed: boolean; alreadyTerminal: boolean}` instead of resolving void on both 204 and 404. Real signal-loss but tagged "[非阻塞]" by the reviewer; changing requires a daemon route shape change (200 + body instead of 204) which is better as a focused follow-up PR. Acknowledged in-thread; deferred to a fold-in PR after #4255 lands. - `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}` - `npx vitest run packages/cli/src/serve/ packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398 - `npx eslint --max-warnings 0` over the PR 21 surface — clean Refs: #4175 #4255 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-4 review feedback 4 threads from the fourth review pass on #4255. 3 adopted + 1 deferred (out-of-scope rename of PR 15's `mutate` helper). ## Adopted ### #1 — `persistInFlight` flag suppresses cancel × persist event-stream UX trap When `provider.poll()` returns success and we await `persist()`, a concurrent `cancel()` would synchronously transition the entry to `cancelled` and emit `auth_device_flow_cancelled` — then `persist()` resolves and (per fold-in 3 C4) force-overrides to `authorized` + emits `auth_device_flow_authorized`. The reducer state correctly last-write-wins on `authorized`, but DIRECT event-stream consumers (close-dialog handlers, telemetry, UI cleanup) race onto an unmounted UI when the second event lands. Now: while persist is in-flight, `cancel()` and the sweeper SKIP the state transition + event emit. They register intent (set `cancelRequestedDuringPersist=true` for cancel; sweeper just no-ops) and let the persist resolution decide: - persist succeeds → `authorized` (IdP wins per fold-in 3 C4) - persist fails AND cancel was requested → `cancelled` - persist fails AND `now >= expiresAt` → `expired` / `expired_token` - persist fails otherwise → `error` / `persist_failed` Result: at most one terminal event per flow. Imperative SSE consumers no longer see oscillating terminal states. Audit captures the race (`hint: 'lost_success_kept ...'`) for incident-response correlation. ### #2 — `revealSecret` → `unsafeRevealSecret` rename The earlier JSDoc claimed "the `unsafeReveal_` naming is intentional: greppable in code review, easy to allowlist in lint rules, hard to invoke by accident" — but the actual function was named `revealSecret`. The promised safety properties didn't exist; a code reviewer wouldn't single out `revealSecret` as suspicious, and a `no-restricted-syntax` ESLint rule wouldn't flag it. Renamed to `unsafeRevealSecret` so the JSDoc-promised "greppable" / "lintable" property is now actually true. Two call sites in the Qwen provider + 4 test references updated. Internal symbol; not exposed through the SDK package. ### #4 — `QwenOAuthPollError` typed class replaces substring regex The earlier RFC 8628 error mapper used an anchored regex against the thrown error message text — an implicit cross-file string contract between `qwenOAuth2.ts` (throws) and `qwenDeviceFlowProvider.ts` (matches). If `qwenOAuth2.ts` ever changed its message format, ALL RFC 8628 errors (`expired_token` / `access_denied` / `invalid_grant`) would silently fall through to `upstream_error` — wrong errorKind flowing through telemetry with no test or type-system check to catch the drift. Now `QwenOAuth2Client.pollDeviceToken` throws a structured `QwenOAuthPollError extends Error` with `oauthError` / `description` / `status` fields. The provider branches on `instanceof QwenOAuthPollError` and reads `.oauthError` directly via a dedicated `mapRfc8628OAuthCode(code)` switch. The drift hazard is gone: a future code change that touches the typed class will fail tsc until both sides are updated. Message format preserved for any pre-existing log-parsing / substring matchers. ## Not adopted ### #3 — `mutate({strict:true})` semantic awkwardness on GET Reviewer correctly noted that `mutate` is named for state-changing routes, but `GET /workspace/auth/device-flow/:id` uses it for an information-disclosure defense (only reachable code path is reading state). Suggested rename: `mutate` → `strictHttpGate`. Deferred: the rename touches PR 15's helper which has many call sites in `server.ts` and is shared infrastructure for Wave 4 PRs 17/19/20. PR 21 is the first / only consumer of the strict-on-GET form so far; widening the rename to a Wave 4 follow-up keeps the fold-in scope tight. Replied in-thread. ## Validation - `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}` - `npx vitest run packages/cli/src/serve/ packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 544/544 - `npx eslint --max-warnings 0` over the PR 21 surface — clean Refs: #4175 #4255 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-5 review feedback Five small adopt items from the round-5 review pass; one stale thread already addressed in `b5b77ee90` (fold-in 5). #2 — `as const` + derived type for DEVICE_FLOW_SUPPORTED_PROVIDERS so adding/removing a provider id requires touching exactly ONE site. Mirrors `SERVE_ERROR_KINDS` / `ServeErrorKind` in `status.ts`. #3 — Clarify `DEVICE_FLOW_EXPIRY_GRACE_MS` JSDoc to distinguish the daemon's 30s SWEEP cadence (what the grace tracks) from the 5-min TERMINAL_GRACE_MS reconnect window (which awaitCompletion does NOT need to wait through). #4 — Add `@remarks` block on `DeviceFlowProvider.poll()` warning future provider authors that thrown `err.message` flows verbatim into the SSE-broadcast `auth_device_flow_failed` hint, and must be sanitized. Two equally-correct paths documented (typed `error` result vs sanitized thrown message). #5 — Truncate raw IdP detail in `qwenDeviceFlowProvider.ts` stderr audit lines to 2 KiB. WAFs / reverse proxies can return MB-sized HTML error pages, and container log aggregators (Loki, Fluent Bit, Stackdriver) typically truncate or drop lines past 4-32 KiB — losing the useful prefix downstream. 2 KiB retains structured JSON envelopes while staying well below every aggregator's per-line cap. #6 — Track latest `originatorClientId` on per-provider singleton take-over via new `entry.lastOriginatorClientId` field + `recordTakeover()` helper. When a second SDK client posts `POST /workspace/auth/device-flow` for an already-pending provider (or one being created in `inFlightStarts`) with a different `initiatorClientId`, an audit breadcrumb records the take-over so incident response can correlate "client A started, client B took over at 12:34". Event-routing intentionally still uses the original `initiatorClientId` (events are workspace-broadcast and changing the originator field mid-flow would break SDK reducers that key on it). Two new tests cover the differing-id audit + same-id no-op. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-6 review feedback Six "Critical" findings from a gpt-5.5 /review pass — all real liveness/correctness defects in the daemon's auth device-flow path and the SDK's awaitCompletion polling loop. #1 — Make `provider.start()` timeout authoritative via `Promise.race` in `DeviceFlowRegistry.doStart`. The earlier shape only ABORTED the signal on timeout; a provider that ignores `signal` (non-abortable I/O, future implementer who forgets to thread it to `fetch`) would leave the `await` hanging until daemon restart, pinning the `inFlightStarts` slot for that providerId. Race against a rejecting timer makes the timeout authoritative regardless of provider cooperation; abort still fires for cooperative cleanup. #2 — Same shape for `result.persist()` in the success branch of `runPollTick`. A future provider whose persist performs non-abortable steps (mkdir/chmod/mv outside the abortable fs.writeFile path) would otherwise hang the poll tick until process restart. Race against rejecting timer; rejection maps to `persist_failed`. #3 — Clamp `expiresIn` and `interval` upper bounds. Previous `Number.isFinite + > 0` guards stopped NaN/Infinity but a finite extreme like `1e12` was still accepted — pinning the per-provider singleton for ~30,000 years (`expires_in`) or scheduling a TIMEOUT_MAX-clamped poll that never fires within `expiresAt` (`interval`). Two new constants (`DEVICE_FLOW_MAX_EXPIRES_IN_SEC = 3600`, `DEVICE_FLOW_MAX_INTERVAL_MS = 60_000`) cap the worst case. #4 — Extract `getDeviceFlowOrSynthetic404(...)` helper in `DaemonAuthFlow.ts` and route BOTH the loop body and the timeout-ceiling final read through it. Previously the ceiling read went directly through `client.getDeviceFlow` and a 404 at the boundary (entry evicted just as the timeout fired) would reject with `DaemonHttpError(404)` instead of returning the structured `{ status: 'error', errorKind: 'not_found_or_evicted' }` that the rest of `awaitCompletion` promises. #5 — Validate `AwaitCompletionOptions.timeoutMs` and `pollOverrideMs` with `Number.isFinite + > 0`. NaN slipped past the previous `?? default` form (NaN is truthy-ish in that position) and produced a `ceiling` of `NaN` (loop runs forever — `now >= NaN` always false) or a `setTimeout(NaN)` (Node clamps to 1ms — tight polling loop). Sanitize to `undefined` so the documented defaults take effect. #6 — Thread `signal` into `DaemonClient.getDeviceFlow` and forward to `fetchWithTimeout` (which already composes caller + timeout signals). awaitCompletion now passes `opts.signal` from both GET sites. Without this, an `awaitCompletion` caller that aborts mid- poll could not cancel an in-flight stalled GET; it would have to wait for the daemon-side `fetchTimeoutMs` (30s default) to fire. Four new tests in `deviceFlow.test.ts` pin the new behaviors: hanging-start timeout (#1), hanging-persist → persist_failed (#2), extreme-expiresIn clamp (#3), extreme-interval clamp (#3). FakeProvider gained a `startHangs` flag for the non-cooperative provider scenario. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-7 review feedback Two findings from a DeepSeek /review pass; both small but legitimate defense-in-depth gaps. #1 — `runPollTick`'s catch block forwarded `err.message` verbatim into the SSE-broadcast `hint`. The provider's `@remarks` contract (fold-in 6 #4) requires throwers to sanitize, but if violated the unbounded raw payload would reach every SSE subscriber. Added `DEVICE_FLOW_POLL_HINT_MAX_LEN = 256` + `truncatePollHint()`, applied to the catch's `result.hint`. Full raw `err.message` is still routed to the audit trail (`audit?.record({hint: 'provider.poll() threw (raw): ...'})`) so operator visibility for incident response is preserved. Belt-and-suspenders: the contract is now structurally enforced rather than relying on every future provider author to read the JSDoc. #2 — `updateMatchingFlow` (and the `started`/`authorized` handlers in `reduceDaemonAuthEvent`) unconditionally overwrote state without comparing `rawEvent.id` against the existing flow's `lastSeenEventId`. The field's JSDoc documented it as a monotonic counter to prevent stale frames from overwriting newer state, but the code didn't enforce that contract. SSE reconnect with `Last-Event-ID < terminal-frame-id` would replay older frames; if any of them were for the same `deviceFlowId` (e.g. a delayed `failed` arriving after `authorized`) the stale frame would overwrite the terminal. Daemon-side `transitionTerminal` makes the exact reachable scenario thin, but the documented contract should match the code. Threaded `rawEventId` into `updateMatchingFlow` and added the gate there + in the `started` and `authorized` handlers (the two cases that don't go through `updateMatchingFlow`). Synthetic frames without an envelope `id` (`rawEventId === undefined`) bypass the gate — they originate inside SDK reducer machinery and aren't subject to replay ordering. Three new tests pin the contracts: - `runPollTick catch truncates the SSE hint and preserves raw on the audit (fold-in 8 #1)` — `pollThrowsWith` flag on FakeProvider models a non-conforming provider; SSE hint < 400 chars + contains 'truncated'; audit hint contains the full 4_000-char raw. - `reduceDaemonAuthEvent rejects out-of-order frames (fold-in 8 #2 monotonicity)` — stale `failed`(id=7) does NOT overwrite `authorized`(id=10); stale `started`(id=4) for a different flow also rejected. - `reduceDaemonAuthEvent passes synthetic frames (no envelope id) through the gate` — SDK-internal frames without `id` are honored. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-8 review feedback Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5 review pass. Tests deferred to fold-in 10 (separate, larger commit). CRITICAL CORRECTNESS #7 — `provider.persist()` Promise.race could publish `persist_failed` to SSE while a non-cooperative provider was still committing credentials to disk. Added an independent tracker on the original persist promise: if the race timed out (`persistTimedOut === true`) AND the underlying persist later resolved successfully, audit a `lost_success_after_timeout` breadcrumb so operators see the inconsistency. Tightened the persist `@remarks` contract to require signal honoring end-to-end. Qwen provider already complies (fold-in 3 #10); this is forward-defense for future providers. #11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`, `createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event / data / state types) was re-exported from `src/daemon/index.ts` but NEVER from the published SDK entry `src/index.ts`. SDK consumers got `undefined` for everything except `client.auth.start()` (which traveled through the already-exported `DaemonClient`). Added the missing exports and pinned via `daemon-public-surface.test.ts`. #12 — `core/src/qwen/qwenOAuth2.ts:373`'s `debugLogger.debug('Device authorization result:', result)` writes the raw `device_code` (RFC 8628 bearer-equivalent credential) to stderr / journald, bypassing the `BrandedSecret` redaction layer. Pre-existing on main but PR 21 expanded the exposure surface. Sanitized to log only `{ ok, expires_in }` on success / `{ ok, error }` on error. #13 — `runPollTick` success-branch persist-failure × past-`expiresAt` classified as `expired_token` instead of `persist_failed`, routing operators toward "tell user to retry" (RFC 8628 expiry) when the actual root cause was disk I/O. Reclassified to `persist_failed` with a `persist_also_failed_past_expiry` audit hint to preserve the timing detail for incident response. SMALL CORRECTNESS #1 — `runPollTick` catch hint replaced with a STATIC bounded message ("provider.poll() failed; see daemon audit log for details"). The fold-in 8 truncated-prefix approach could still leak the first 256 chars of provider-templated raw text including secret material. Full raw still routed to audit channel for operator visibility. #5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred- cancel branch in `cancel()` now stamps it on the entry, and the persist-resolution `cancelled` event publish uses `entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers that suppress self-emitted events can now attribute the cancel correctly. #6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented "settle immediately, return current daemon view" contract) was treated as falsy by the `?` ternary, falling back to the default. `sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling computation uses `!== undefined` instead of truthy check. #8 — `EventBus.publish()` returns `undefined` for closed buses (it does NOT throw). `broadcastWorkspaceEvent` previously counted that path as success, hiding the all-buses-dropped operator alarm. Folded the closed-bus-as-failure check into the canonical `publishWorkspaceEvent` (see #X below). #9 — start-timeout Promise.race rejected with a plain `Error`, falling through `sendBridgeError` to a generic 500. Switched to `UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502 (matching the envelope every other IdP start failure uses). STRUCTURAL #3 — Three identical `transitionTerminal + publish + audit` expired_token blocks in `runPollTick`/`sweep`/(removed by #13) deduplicated into a private `expireEntry()` helper. Future event- shape changes are now a one-edit operation. #X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent` was kept distinct only to avoid the merge conflict; once PR 16 landed, it became a fold-in candidate. Folded the closed-bus + all-failed-stderr-escalation operator-visibility features (PR 21's S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped `broadcastWorkspaceEvent` from the bridge interface + impl + test mocks. PR 21's deviceFlowEventSink now calls `bridge.publishWorkspaceEvent` — single canonical workspace fan-out. DOC #16 — Added a "Cross-client take-over" paragraph to `docs/users/qwen-serve.md` explaining that two clients on the same daemon for the same provider get the per-provider singleton with `attached: true`/`false` distinguishing them; no separate event fires (both eventually observe the same `auth_device_flow_authorized`). 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao round-9 review feedback Two small non-blocking items from the round-9 pass; defensive shape + docs only. The 4 deferred test-coverage threads (#1-4 of round-8) are still tracked for fold-in 10. #6 — `lastSeenEventId` typed `number` with `?? 0` defaults in the `auth_device_flow_started` reducer case. The daemon-side `EventBus` assigns ids ≥ 1 so the `0` sentinel has no real-traffic meaning, but the monotonic gate (`rawEventId <= flow.lastSeenEventId`) would reject any future SDK-internal synthetic frame using `id: 0`. Changed the field type to `number \| undefined` and dropped the `?? 0` from the started case. The `updateMatchingFlow` / `auth_device_flow_authorized` guards already short-circuit on `existing.lastSeenEventId !== undefined`, so undefined is safe end-to-end. Existing 34 reducer tests still pass unchanged. #7 — Added `@remarks` block to `DeviceFlowErrorKind.persist_failed`'s JSDoc explaining the lost-success retry UX. When fold-in 9 #7's `lost_success_after_timeout` audit fires (non-conforming provider violates signal contract; disk write succeeds after registry published `persist_failed`), a naive SDK retry hits the IdP a second time with a fresh `device_code` and prompts the user twice — but the first credential set is already valid. JSDoc now documents the mitigation: SDK consumers writing retry logic on `persist_failed` should call `client.auth.getStatus()` BEFORE re-prompting; operators can grep stderr/audit for `lost_success_after_timeout` to detect occurrences. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * test(serve): fold-in 10 — auth device-flow test bundle (#4255) Lands the four deferred test-coverage items the round-8 review flagged (and round-9 re-surfaced) as a hard merge prerequisite. Net +41 tests across registry / SDK helper / client HTTP / HTTP route layers. #1 — `deviceFlow.test.ts` `persist failure paths` describe (3 tests, +3). The success arm's three terminal mappings — pure `persist_failed`, `cancelled` (cancel during persist), and `persist_failed` past `expiresAt` (the fold-in 9 #13 reclassification with `persist_also_failed_past_expiry` audit hint) — were 0-covered. Now pinned. Test #2 also asserts the fold-in 9 #5 cancellerClientId routing on the deferred `cancelled` event. #2 — new `DaemonAuthFlow.test.ts` (+14 tests). Mock DaemonClient with sequenced `getDeviceFlow` replies. Covers happy-path polling → `authorized`; `slow_down`-driven `intervalMs` bump firing `onThrottled`; `signal.abort()` rejection; `signal` propagation through `client.getDeviceFlow` (fold-in 7 #6); `timeoutMs` ceiling final-read; `timeoutMs:0` immediate-return (round-9 #6); NaN/Infinity → `sanitizePositiveMs` fallback to default ceiling (fold-in 7 #5); 404 → synthetic `error`/`not_found_or_evicted` (fold-in 3 #4) at BOTH the loop body AND the timeoutMs ceiling read (fold-in 7 #4); non-404 DaemonHttpError rethrown; `cancel()` and top-level `status()`/`cancel()` wrappers forward correctly. #3 — `DaemonClient.test.ts` `device-flow methods` describe (+11 tests). POSTs `/workspace/auth/device-flow` happy path + clientId header + body shape; 200/201 acceptance; non-2xx → `DaemonHttpError`. GETs URL-encode the deviceFlowId; forward `opts.signal` to `fetchWithTimeout`'s composed signal (fold-in 7 #6 — verified by aborting caller signal and observing the fetch's signal flip to `aborted`); 404 throws. DELETEs swallow 204 + 404 (idempotent, mirrors `closeSession`); non- 204/404 throws. `getAuthStatus` plain GET. `client.auth` lazy-instantiated singleton. #4 — `server.test.ts` 5 supplementary contract tests (+5). The existing 8 `it()`s cover happy paths + take-over + 401 POST + DELETE pending/terminal/unknown + 502 upstream + sweeper. This commit plugs gaps: 400 `invalid_request` for missing / non-string providerId (fold-in W2 split this from `unsupported_provider`); 409 `too_many_active_flows` (via injected fake registry); 401 `token_required` on DELETE without bearer; the asymmetric GET posture (`/workspace/auth/device-flow/:id` IS strict-gated to prevent peer-process userCode shoulder-surf; `/workspace/auth/status` stays read-only because its `pendingDeviceFlows` entries intentionally redact `userCode`). Validation: cli serve 631/631 (+8 from #1, #4); sdk 384/384 (+25 from #2, #3, +/- some pre-existing churn). Typecheck + lint clean. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(qwen): atomic temp+chmod+rename in cacheQwenCredentials (PR #4255 round-11 #2) gpt-5.5 /review flagged a real correctness/security gap: the post-write `chmod` ordering left a window where freshly-written credentials could land in a broadly-readable existing `oauth_creds.json` before the chmod tightened it. On POSIX, a chmod failure additionally degraded to a debug warning while the broadly-readable tokens stayed on disk. New shape mirrors the standard atomic-write idiom: 1. Write `${filePath}.tmp.${pid}.${randomUUID()}` with `mode: 0o600`. The temp path doesn't exist beforehand, so the `mode` flag actually applies on creation (it doesn't on existing files, which was the root of the original race). 2. Defensive `chmod` on the temp file. POSIX failure is now a HARD ERROR (refuses to publish broad-perm credentials to the canonical filename). Windows logs a debug breadcrumb and proceeds, since chmod is a no-op on most NTFS volumes (perms go through ACLs). 3. Atomic `fs.rename` over `filePath`. The canonical path is ALWAYS at `0o600` from the moment it contains the new tokens; readers see either the old creds or the new creds, never a partially-written or broadly-readable state. 4. Best-effort `fs.unlink` of the temp file on any failure path so failed writes don't leave `.tmp.<pid>.<uuid>` litter on disk. Test mock in `qwenOAuth2.test.ts` extended with `chmod` + `rename` no-op stubs so the existing 158 core/qwen tests still pass; no test behavior change beyond the mock surface. Validation: typecheck clean (cli + core + sdk-typescript); core qwen 158/158; cli serve 643/643; sdk 384/384. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): address PR #4255 wenshao + gpt-5.5 round-12 review feedback Eight findings from a wenshao + gpt-5.5 /review pass: 1 critical correctness, 2 real defensive defects, 4 edge cases / minor hardening, 1 test gap. All adopted. CRITICAL CORRECTNESS #1 CzSpN — `dispose()` race: after `await provider.poll(...)` the post-await guard checked only `entry.status !== 'pending'`, NOT `this.disposed`. `dispose()` clears the registry maps and aborts the entry's signal but doesn't mutate `entry.status`, so a provider whose poll already resolved (or doesn't honor abort) could enter the success branch and call `result.persist({...})` — committing credentials on a shutting-down daemon. Added the `if (this.disposed) return;` guard symmetric with the top-of-method check. REAL DEFENSIVE DEFECTS #2 Cy_ZG — sync-throw escape: the `result.persist({signal})` call happens BEFORE the `new Promise` constructor that captures it (`persistTracker` is closed-over inside the constructor). A non-conforming provider whose persist throws synchronously (e.g. top-of-function validation) would escape past the outer `try/catch (await new Promise(...))` and become an `unhandledRejection` since `runPollTick` is fire-and-forget via `void`. Wrapped the persist invocation in a try/catch that routes the sync-throw into the same `persistError` branch. #3 CzSpe — runtime provider map: provider validation hardcoded `DEVICE_FLOW_SUPPORTED_PROVIDERS` even though `deps.deviceFlowProviders` is the documented extension hook for tests/future providers. Switched both POST validation and `/workspace/auth/status` `supportedDeviceFlowProviders` to derive from `deviceFlowProviderMap.keys()` — single source of truth matches the registry's `resolveProvider`. EDGE CASES / MINOR HARDENING #4 Cy_Y9 — `slow_down` re-clamp: `intervalMs += SLOW_DOWN_BUMP_MS` can push past `DEVICE_FLOW_MAX_INTERVAL_MS` (the bound that keeps `setTimeout` from clamping to TIMEOUT_MAX). Wrapped in `Math.min(MAX_INTERVAL_MS, ...)` symmetric with the doStart clamp. #5 Cy_ZF — `expiresInSec` lower bound: `0.5` was finite-positive and produced `expiresAt = now() + 500 ms` — first poll (clamped at ≥1 s) fires AFTER expiresAt → flow expires before any user could authorize. Added `DEVICE_FLOW_MIN_EXPIRES_IN_SEC = 30` (RFC 8628 §3.2 calls 5–30 minutes "reasonable"; sub-30s is non-compliant). #6 CzHOK — take-over response privacy: `initiatorClientId` was echoed to ANY take-over POST caller, including those with no `X-Qwen-Client-Id` header. Bearer-gated already, but the asymmetry "anonymous caller learns who started it" violated the no-header-as-privacy-signal contract. Now only echoed when the caller's id matches the entry's initiator. #7 CzSpd — production audit visibility: production audit sink dropped `line.hint`, but the registry uses hints for operator-only breadcrumbs (`provider.poll() threw (raw)...`, `lost_success_after_timeout`, `persist_also_failed_past_expiry`, take-over correlation, `deferred (persist in flight; ...)`). The documented troubleshooting trail was invisible in production stderr. Now included with a 1 KiB bound + JSON-quoted so multi- word hints stay parseable. TEST GAP #8 Cy_ZH — `lost_success_after_timeout` audit: the fold-in 9 #7 split-brain detector for non-cooperative providers had no test pinning it. Added a controllable `latePersist` Promise + test that drives poll → success → enters persist race → fires PERSIST_TIMEOUT (registry publishes persist_failed) → resolves persist late → asserts the lost_success audit fires. Validation: typecheck + lint clean; cli serve 644/644 (+1 from the new test); sdk-typescript 384/384. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fixup(serve): close concurrent multi-provider cap bypass (PR #4255 round-13 #1) gpt-5.5 /review caught a real workspace-wide cap bypass: `countActive()` only counted entries already installed in `byProvider`, but the cap check at the top of `start()` runs before any provider's `inFlightStarts` slot completes `provider.start()`. A burst of fresh starts for `DEVICE_FLOW_MAX_CONCURRENT + 1` distinct providers all run synchronously to the cap check (each `start()` is async but runs to its first await — the await happens AFTER the cap check), all observe `count === 0` (no `byProvider` entries installed yet), and all pass — eventually installing more than the documented four pending flows. Fix: include `inFlightStarts.size` in `countActive()`. The two maps are disjoint by construction (the existing-pending fast-path catches any provider with both), so simple addition cannot double-count. The second concurrent caller sees count=1, the third count=2, …, and the (MAX+1)th caller is rejected with `TooManyActiveDeviceFlowsError`. Test: `caps at DEVICE_FLOW_MAX_CONCURRENT under CONCURRENT distinct-provider starts`. Fires `MAX+1` concurrent starts via `Promise.allSettled`, asserts exactly `MAX` fulfilled + exactly 1 rejected with the typed error. Pre-fix this test fails (all `MAX+1` succeed); post-fix it passes. Validation: typecheck clean across all 4 workspaces; deviceFlow.test.ts 35/35 (was 34); cli serve 645/645. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)		2026-05-18 22:05:53 +08:00
.github	ci(deps): bump docker/* actions to Node 24 majors (#4131 )	2026-05-14 10:54:01 +08:00
.husky	Sync upstream Gemini-CLI v0.8.2 (#838 )	2025-10-23 09:27:04 +08:00
.qwen	feat(cli): support batch deletion of sessions in /delete (#3733 )	2026-05-13 14:34:39 +08:00
.vscode	Merge branch 'main' into feat/sandbox-config-improvements	2026-03-06 14:38:39 +08:00
docs	feat(serve): auth device-flow route (#4175 Wave 4 PR 21) (#4255 )	2026-05-18 22:05:53 +08:00
docs-site	feat: update docs	2025-12-15 09:47:03 +08:00
eslint-rules	pre-release commit	2025-07-22 23:26:01 +08:00
integration-tests	fix(serve): sync E2E baseline capabilities with registry (#4284 )	2026-05-18 20:35:58 +08:00
packages	feat(serve): auth device-flow route (#4175 Wave 4 PR 21) (#4255 )	2026-05-18 22:05:53 +08:00
scripts	perf(cli): code-split lowlight to cut startup V8 parse cost (#4070 )	2026-05-15 17:26:18 +08:00
.dockerignore	fix(cli): skip stdin read for ACP mode	2026-03-27 11:47:01 +00:00
.editorconfig	pre-release commit	2025-07-22 23:26:01 +08:00
.gitattributes	pre-release commit	2025-07-22 23:26:01 +08:00
.gitignore	feat(skills): Add codegraph skill for PR review risk analysis and conflict detection (#3910 )	2026-05-11 18:14:40 +08:00
.npmrc	chore: remove google registry	2025-08-08 20:45:54 +08:00
.nvmrc	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860 )	2026-05-11 17:29:50 +08:00
.prettierignore	Merge branch 'main' into feat/add-vscode-settings-json-schema	2026-03-03 11:21:57 +08:00
.prettierrc.json	pre-release commit	2025-07-22 23:26:01 +08:00
.yamllint.yml	Sync upstream Gemini-CLI v0.8.2 (#838 )	2025-10-23 09:27:04 +08:00
AGENTS.md	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860 )	2026-05-11 17:29:50 +08:00
CONTRIBUTING.md	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860 )	2026-05-11 17:29:50 +08:00
Dockerfile	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860 )	2026-05-11 17:29:50 +08:00
esbuild.config.js	perf(cli): code-split lowlight to cut startup V8 parse cost (#4070 )	2026-05-15 17:26:18 +08:00
eslint.config.js	feat(cli,sdk): qwen serve daemon (Stage 1) (#3889 )	2026-05-13 14:47:47 +08:00
LICENSE	Sync upstream Gemini-CLI v0.8.2 (#838 )	2025-10-23 09:27:04 +08:00
Makefile	feat: update docs	2025-12-22 21:11:33 +08:00
package-lock.json	feat(ide): add daemon connection spike (#4199 )	2026-05-18 10:38:26 +08:00
package.json	fix: add cache limits to prevent OOM during build/test (#4188 )	2026-05-17 23:01:23 +08:00
README.md	feat(cli,sdk): qwen serve daemon (Stage 1) (#3889 )	2026-05-13 14:47:47 +08:00
SECURITY.md	fix: update security vulnerability reporting channel	2026-02-24 14:22:47 +08:00
tsconfig.json	# 🚀 Sync Gemini CLI v0.2.1 - Major Feature Update (#483 )	2025-09-01 14:48:55 +08:00
vitest.config.ts	test(channels): add comprehensive test suites for channel adapters	2026-03-27 15:26:39 +00:00

README.md

An open-source AI agent that lives in your terminal.

🎉 News

2026-04-15: Qwen OAuth free tier has been discontinued. To continue using Qwen Code, switch to Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key. Run qwen auth to configure.
2026-04-13: Qwen OAuth free tier policy update: daily quota adjusted to 100 requests/day (from 1,000).
2026-04-02: Qwen3.6-Plus is now live! Get an API key from Alibaba Cloud ModelStudio to access it through the OpenAI-compatible API.
2026-02-16: Qwen3.5-Plus is now live!

Why Qwen Code?

Qwen Code is an open-source AI agent for the terminal, optimized for Qwen series models. It helps you understand large codebases, automate tedious work, and ship faster.

Multi-protocol, flexible providers: use OpenAI / Anthropic / Gemini-compatible APIs, Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key.
Open-source, co-evolving: both the framework and the Qwen3-Coder model are open-source—and they ship and evolve together.
Agentic workflow, feature-rich: rich built-in tools (Skills, SubAgents) for a full agentic workflow and a Claude Code-like experience.
Terminal-first, IDE-friendly: built for developers who live in the command line, with optional integration for VS Code, Zed, and JetBrains IDEs.

Installation

Quick Install (Recommended)

Linux / macOS

bash -c "$(curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh)"

Windows (Run as Administrator)

Works in both Command Prompt and PowerShell:

powershell -Command "Invoke-WebRequest 'https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.bat' -OutFile (Join-Path $env:TEMP 'install-qwen.bat'); & (Join-Path $env:TEMP 'install-qwen.bat')"

Note

: It's recommended to restart your terminal after installation to ensure environment variables take effect.

Manual Installation

Prerequisites

Make sure you have Node.js 22 or later installed. Download it from nodejs.org.

NPM

npm install -g @qwen-code/qwen-code@latest

Homebrew (macOS, Linux)

brew install qwen-code

Quick Start

# Start Qwen Code (interactive)
qwen

# Then, in the session:
/help
/auth

On first use, you'll be prompted to sign in. You can run /auth anytime to switch authentication methods.

Example prompts:

What does this project do?
Explain the codebase structure.
Help me refactor this function.
Generate unit tests for this module.

Click to watch a demo video

🦞 Use Qwen Code for Coding Tasks in Claw

Copy the prompt below and paste it into your agent:

Use Qwen Code as a coding agent. First, fetch and read this skill reference:
https://raw.githubusercontent.com/QwenLM/qwen-code/refs/heads/main/.qwen/skills/qwen-code-claw/SKILL.md
Then follow the instructions inside to install, authenticate, and use Qwen Code via acpx for all coding tasks.

Authentication

Qwen Code supports the following authentication methods:

API Key (recommended): use an API key from Alibaba Cloud Model Studio (Beijing / intl) or any supported provider (OpenAI, Anthropic, Google GenAI, and other compatible endpoints).
Coding Plan: subscribe to the Alibaba Cloud Coding Plan (Beijing / intl) for a fixed monthly fee with higher quotas.

⚠️ Qwen OAuth was discontinued on April 15, 2026. If you were previously using Qwen OAuth, please switch to one of the methods above. Run qwen and then /auth to reconfigure.

API Key (recommended)

Use an API key to connect to Alibaba Cloud Model Studio or any supported provider. Supports multiple protocols:

OpenAI-compatible: Alibaba Cloud ModelStudio, ModelScope, OpenAI, OpenRouter, and other OpenAI-compatible providers
Anthropic: Claude models
Google GenAI: Gemini models

The recommended way to configure models and providers is by editing ~/.qwen/settings.json (create it if it doesn't exist). This file lets you define all available models, API keys, and default settings in one place.

Quick Setup in 3 Steps

Step 1: Create or edit ~/.qwen/settings.json

Here is a complete example:

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "description": "Qwen3-Coder via Dashscope",
        "envKey": "DASHSCOPE_API_KEY"
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Step 2: Understand each field

Field	What it does
`modelProviders`	Declares which models are available and how to connect to them. Keys like `openai`, `anthropic`, `gemini` represent the API protocol.
`modelProviders[].id`	The model ID sent to the API (e.g. `qwen3.6-plus`, `gpt-4o`).
`modelProviders[].envKey`	The name of the environment variable that holds your API key.
`modelProviders[].baseUrl`	The API endpoint URL (required for non-default endpoints).
`env`	A fallback place to store API keys (lowest priority; prefer `.env` files or `export` for sensitive keys).
`security.auth.selectedType`	The protocol to use on startup (`openai`, `anthropic`, `gemini`, `vertex-ai`).
`model.name`	The default model to use when Qwen Code starts.

Step 3: Start Qwen Code — your configuration takes effect automatically:

qwen

Use the /model command at any time to switch between all configured models.

More Examples

Coding Plan (Alibaba Cloud ModelStudio) — fixed monthly fee, higher quotas

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.6-plus from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY"
      },
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.5-plus with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "glm-4.7",
        "name": "glm-4.7 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "glm-4.7 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "kimi-k2.5",
        "name": "kimi-k2.5 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "kimi-k2.5 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "BAILIAN_CODING_PLAN_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Subscribe to the Coding Plan and get your API key at Alibaba Cloud ModelStudio(Beijing) or Alibaba Cloud ModelStudio(intl).

Multiple providers (OpenAI + Anthropic + Gemini)

{
  "modelProviders": {
    "openai": [
      {
        "id": "gpt-4o",
        "name": "GPT-4o",
        "envKey": "OPENAI_API_KEY",
        "baseUrl": "https://api.openai.com/v1"
      }
    ],
    "anthropic": [
      {
        "id": "claude-sonnet-4-20250514",
        "name": "Claude Sonnet 4",
        "envKey": "ANTHROPIC_API_KEY"
      }
    ],
    "gemini": [
      {
        "id": "gemini-2.5-pro",
        "name": "Gemini 2.5 Pro",
        "envKey": "GEMINI_API_KEY"
      }
    ]
  },
  "env": {
    "OPENAI_API_KEY": "sk-xxxxxxxxxxxxx",
    "ANTHROPIC_API_KEY": "sk-ant-xxxxxxxxxxxxx",
    "GEMINI_API_KEY": "AIzaxxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "gpt-4o"
  }
}

Enable thinking mode (for supported models like qwen3.5-plus)

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (thinking)",
        "envKey": "DASHSCOPE_API_KEY",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.5-plus"
  }
}

Tip: You can also set API keys via export in your shell or .env files, which take higher priority than settings.json → env. See the authentication guide for full details.

Security note: Never commit API keys to version control. The ~/.qwen/settings.json file is in your home directory and should stay private.

Local Model Setup (Ollama / vLLM)

You can also run models locally — no API key or cloud account needed. This is not an authentication method; instead, configure your local model endpoint in ~/.qwen/settings.json using the modelProviders field.

Set generationConfig.contextWindowSize inside the matching provider entry and adjust it to the context length configured on your local server.

Ollama setup

Install Ollama from ollama.com
Pull a model: ollama pull qwen3:32b
Configure ~/.qwen/settings.json:

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3:32b",
        "name": "Qwen3 32B (Ollama)",
        "baseUrl": "http://localhost:11434/v1",
        "description": "Qwen3 32B running locally via Ollama",
        "generationConfig": {
          "contextWindowSize": 131072
        }
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3:32b"
  }
}

vLLM setup

Install vLLM: pip install vllm
Start the server: vllm serve Qwen/Qwen3-32B
Configure ~/.qwen/settings.json:

{
  "modelProviders": {
    "openai": [
      {
        "id": "Qwen/Qwen3-32B",
        "name": "Qwen3 32B (vLLM)",
        "baseUrl": "http://localhost:8000/v1",
        "description": "Qwen3 32B running locally via vLLM",
        "generationConfig": {
          "contextWindowSize": 131072
        }
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "Qwen/Qwen3-32B"
  }
}

Usage

As an open-source terminal agent, you can use Qwen Code in five primary ways:

Interactive mode (terminal UI)
Headless mode (scripts, CI)
IDE integration (VS Code, Zed)
SDKs (TypeScript, Python, Java)
Daemon mode — qwen serve exposes ACP over HTTP+SSE so multiple clients share one agent (experimental)

Interactive mode

cd your-project/
qwen

Run qwen in your project folder to launch the interactive terminal UI. Use @ to reference local files (for example @src/main.ts).

Headless mode

cd your-project/
qwen -p "your question"

Use -p to run Qwen Code without the interactive UI—ideal for scripts, automation, and CI/CD. Learn more: Headless mode.

IDE integration

Use Qwen Code inside your editor (VS Code, Zed, and JetBrains IDEs):

Daemon mode (`qwen serve`, experimental)

cd your-project/
qwen serve
# → qwen serve listening on http://127.0.0.1:4170 (mode=http-bridge)

Run Qwen Code as a local HTTP daemon so IDE plugins, web UIs, CI scripts and custom CLIs all share one agent session over HTTP+SSE — instead of each spawning their own subprocess. Loopback bind has no auth by default (set QWEN_SERVER_TOKEN to enable bearer auth even on loopback); remote binds (--hostname 0.0.0.0) require a token — boot refuses without one. See:

SDKs

Build on top of Qwen Code with the available SDKs:

Python SDK example:

import asyncio

from qwen_code_sdk import is_sdk_result_message, query


async def main() -> None:
    result = query(
        "Summarize the repository layout.",
        {
            "cwd": "/path/to/project",
            "path_to_qwen_executable": "qwen",
        },
    )

    async for message in result:
        if is_sdk_result_message(message):
            print(message["result"])


asyncio.run(main())

Commands & Shortcuts

Session Commands

/help - Display available commands
/clear - Clear conversation history
/compress - Compress history to save tokens
/stats - Show current session information
/bug - Submit a bug report
/exit or /quit - Exit Qwen Code

Keyboard Shortcuts

Ctrl+C - Cancel current operation
Ctrl+D - Exit (on empty line)
Up/Down - Navigate command history

Learn more about Commands

Tip: In YOLO mode (--yolo), vision switching happens automatically without prompts when images are detected. Learn more about Approval Mode

Configuration

Qwen Code can be configured via settings.json, environment variables, and CLI flags.

File	Scope	Description
`~/.qwen/settings.json`	User (global)	Applies to all your Qwen Code sessions. Recommended for `modelProviders` and `env`.
`.qwen/settings.json`	Project	Applies only when running Qwen Code in this project. Overrides user settings.

The most commonly used top-level fields in settings.json:

Field	Description
`modelProviders`	Define available models per protocol (`openai`, `anthropic`, `gemini`, `vertex-ai`).
`env`	Fallback environment variables (e.g. API keys). Lower priority than shell `export` and `.env` files.
`security.auth.selectedType`	The protocol to use on startup (e.g. `openai`).
`model.name`	The default model to use when Qwen Code starts.

See the Authentication section above for complete settings.json examples, and the settings reference for all available options.

Benchmark Results

Terminal-Bench Performance

Agent	Model	Accuracy
Qwen Code	Qwen3-Coder-480A35	37.5%
Qwen Code	Qwen3-Coder-30BA3B	31.3%

Ecosystem

Looking for a graphical interface?

AionUi A modern GUI for command-line AI tools including Qwen Code
Gemini CLI Desktop A cross-platform desktop/web/mobile UI for Qwen Code

Troubleshooting

If you encounter issues, check the troubleshooting guide.

Common issues:

Qwen OAuth free tier was discontinued on 2026-04-15: Qwen OAuth is no longer available. Run qwen → /auth and switch to API Key or Coding Plan. See the Authentication section above for setup instructions.

To report a bug from within the CLI, run /bug and include a short title and repro steps.

Connect with Us

Acknowledgments

This project is based on Google Gemini CLI. We acknowledge and appreciate the excellent work of the Gemini CLI team. Our main contribution focuses on parser-level adaptations to better support Qwen-Coder models.