feat(serve): auth device-flow route (#4175 Wave 4 PR 21) (#4255)

* feat(serve): auth device-flow route

Implements issue #4175 Wave 4 PR 21. Brokers OAuth 2.0 Device
Authorization Grant (RFC 8628) through the `qwen serve` daemon so a
remote SDK client can trigger a Qwen-account login whose tokens land
on the **daemon** filesystem, not on the client. The daemon polls the
IdP itself; the client's only job is to display the verification URL +
user code.

Runtime locality (#4175 §11): the daemon NEVER spawns a browser or
calls `open(url)` — even when running locally. Static-source grep
test fails the build on `node:child_process` / `open` / `xdg-open` /
`shell.openExternal` / `execa` / `shelljs` / `process.spawn` and
their dynamic-import / require variants.

- `POST /workspace/auth/device-flow` — strict mutation gate; returns
  201 fresh / 200 idempotent take-over with `attached: true`.
  Per-`providerId` singleton: a second POST while pending takes over
  rather than allocating a new `device_code`.
- `GET /workspace/auth/device-flow/:id` — public state read. Pending
  entries echo `userCode/verificationUri/expiresAt/intervalMs`;
  terminal entries (5-min grace) drop them and surface
  `status/errorKind/hint`.
- `DELETE /workspace/auth/device-flow/:id` — strict; idempotent
  (terminal → 204 no-op; unknown → 404).
- `GET /workspace/auth/status` — pending flows + supported providers
  snapshot. v1 stub for `providers: []` (populated in fold-in 1).

`DeviceFlowRegistry` (`packages/cli/src/serve/auth/deviceFlow.ts`)
is the in-memory state holder:
- per-`providerId` singleton with idempotent take-over
- workspace-wide cap of 4 active flows (abuse defense)
- 5-min terminal grace so SDK reconnects can still observe results
- TTL sweeper evicts grace-expired entries every 30s
- in-flight `Promise` map coalesces concurrent `start()` calls so two
  parallel POSTs don't double-allocate IdP `device_code`
- `transitionTerminal` returns `boolean` so caller-side emit/audit
  guard prevents sweeper × poll-tick double-fire
- `dispose()` wired into `runQwenServe.close()`'s shutdown drain;
  cancels `provider.poll()` mid-flight via `cancelController`,
  records `lost_success` audit when an IdP-minted token is dropped
  by transition
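
A minimal sketch of the in-flight coalescing described in the list above, assuming a heavily simplified registry. `CoalescingStarter`, `allocate`, and `StartResult` are hypothetical names for illustration; the cap, grace window, sweeper, and take-over logic are left out.

```typescript
// Hypothetical, simplified stand-in for the registry: only coalescing is shown.
type StartResult = { deviceFlowId: string };

class CoalescingStarter {
  private readonly inFlight = new Map<string, Promise<StartResult>>();
  private readonly allocate: (providerId: string) => Promise<StartResult>;

  constructor(allocate: (providerId: string) => Promise<StartResult>) {
    this.allocate = allocate;
  }

  start(providerId: string): Promise<StartResult> {
    const pending = this.inFlight.get(providerId);
    if (pending) return pending; // a parallel caller joins the first allocation

    const promise = this.allocate(providerId);
    this.inFlight.set(providerId, promise);

    // Clear the slot on success or failure, but only if it is still ours.
    const clear = (): void => {
      if (this.inFlight.get(providerId) === promise) {
        this.inFlight.delete(providerId);
      }
    };
    promise.then(clear, clear);
    return promise;
  }
}

let allocations = 0;
const starter = new CoalescingStarter(async (providerId) => {
  allocations += 1; // stands in for one upstream device_code allocation
  return { deviceFlowId: `${providerId}-${allocations}` };
});

const first = starter.start('qwen-oauth');
const second = starter.start('qwen-oauth');
console.log(first === second, allocations);
```

Both callers receive the same promise, so the upstream allocation runs once per provider while a start is pending.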

`DeviceFlowProvider` interface accepts `start({signal})` +
`poll(state, {signal})`. `QwenOAuthDeviceFlowProvider` wraps the
existing `QwenOAuth2Client.requestDeviceAuthorization` /
`pollDeviceToken` primitives directly (NOT
`authWithQwenDeviceFlow`, which calls `open(url)`). PKCE is
provider-required by Qwen but optional in the interface for future
non-PKCE providers. `success.persist()` writes to disk FIRST, then
updates the in-process client — a failed disk write no longer
leaves the daemon with a zombie in-memory token. Maps RFC 8628
errors via an anchored regex (`^Device token poll failed:
(expired_token|access_denied|invalid_grant)`) so an
`error_description` containing one of those literals can't
mis-classify an unrelated upstream error.
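
To illustrate why the anchor matters, here is a small sketch of the classification. The function name and type names are hypothetical, and the single space after the colon is an assumption about how the wrapped pattern above joins.

```typescript
type Rfc8628ErrorKind = 'expired_token' | 'access_denied' | 'invalid_grant';
type PollErrorKind = Rfc8628ErrorKind | 'upstream_error';

// Anchored at the start of the message: a literal that only appears later
// (for example inside an error_description) cannot match.
const POLL_ERROR_PATTERN =
  /^Device token poll failed: (expired_token|access_denied|invalid_grant)/;

function classifyPollMessage(message: string): PollErrorKind {
  const match = POLL_ERROR_PATTERN.exec(message);
  if (match === null) return 'upstream_error';
  return match[1] as Rfc8628ErrorKind;
}

console.log(classifyPollMessage('Device token poll failed: access_denied'));
console.log(classifyPollMessage('Bad gateway: body mentioned expired_token'));
```

The second message contains an RFC 8628 literal but does not start with the expected prefix, so it falls through to `upstream_error`.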

`BrandedSecret<T extends string>` holds the `device_code` and PKCE
verifier. An earlier draft used a `new String()` wrapper, which leaked
through `+` / template literals (`Symbol.toPrimitive` →
`valueOf` returned the primitive). Final shape: frozen plain object
+ `WeakMap` indirection + 4-way redaction
(`toString` / `toJSON` / `Symbol.toPrimitive` / numeric coercion →
`'[redacted]'` or `NaN`) + `unique symbol` brand. 6 leak-path
tests: `JSON.stringify` / `String()` / concat / template / `+x` /
reveal-roundtrip.
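
A rough sketch of the frozen-object + `WeakMap` shape described above. The helper names `makeSecret` and `revealSecret`, and the exact type layout, are assumptions for illustration rather than the code in `deviceFlow.ts`.

```typescript
const REDACTED = '[redacted]';

// The real value lives only in this module-private map, keyed by the handle.
const vault = new WeakMap<object, string>();

// Compile-time-only brand; nothing is emitted for it at runtime.
declare const secretBrand: unique symbol;

type BrandedSecret<T extends string> = {
  readonly [secretBrand]: T;
  toString(): string;
  toJSON(): string;
  [Symbol.toPrimitive](hint: string): string | number;
};

function makeSecret<T extends string>(value: T): BrandedSecret<T> {
  const handle = Object.freeze({
    toString: () => REDACTED,
    toJSON: () => REDACTED,
    // Covers `+x`, string concatenation, and template literals.
    [Symbol.toPrimitive]: (hint: string) => (hint === 'number' ? NaN : REDACTED),
  });
  vault.set(handle, value);
  return handle as unknown as BrandedSecret<T>;
}

function revealSecret<T extends string>(secret: BrandedSecret<T>): T {
  const value = vault.get(secret);
  if (value === undefined) {
    throw new Error('not a secret minted by makeSecret');
  }
  return value as T;
}

const deviceCode = makeSecret('example-device-code');
console.log(JSON.stringify({ deviceCode }), `${deviceCode}`, Number(deviceCode));
```

Because the handle itself carries no data, an accidental log line or JSON body can only ever show the redaction marker; the value is reachable solely through the explicit reveal call.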

5 new daemon events (workspace-scoped, fanned out to every active
session bus via `bridge.broadcastWorkspaceEvent`):

- `auth_device_flow_started` — `{deviceFlowId, providerId, expiresAt}`
  (no userCode/verificationUri — see PR 21 design §3)
- `auth_device_flow_throttled` — `{deviceFlowId, intervalMs}`,
  emitted only on upstream `slow_down` interval bumps
- `auth_device_flow_authorized` — `{deviceFlowId, providerId,
  expiresAt?, accountAlias?}`; `accountAlias` is best-effort
  non-PII (never email/phone)
- `auth_device_flow_failed` — `{deviceFlowId, errorKind, hint?}`
  with `errorKind ∈ {expired_token, access_denied, invalid_grant,
  upstream_error, persist_failed}`
- `auth_device_flow_cancelled` — `{deviceFlowId}` (DELETE on pending)

Workspace-scoped reducer `reduceDaemonAuthEvent` produces
`DaemonAuthState { flows: Partial<Record<ProviderId, ...>> }` —
parallel to `reduceDaemonSessionEvent`. Session reducer no-ops on
auth events (workspace-scoped state belongs in its own reducer).

`bridge.broadcastWorkspaceEvent` is intentionally distinct from PR
16's `publishWorkspaceEvent` to avoid merge conflict; collapses to
the shared helper as a fold-in once #4249 lands (~25 LoC).

`@qwen-code/sdk` (`packages/sdk-typescript/`):

- 4 new `DaemonClient` methods: `startDeviceFlow`, `getDeviceFlow`,
  `cancelDeviceFlow`, `getAuthStatus` — typed against the wire
  shapes, errors mapped through the existing `DaemonHttpError`.
- High-level `client.auth` getter (lazy `DaemonAuthFlow` singleton)
  exposes a `start(...).awaitCompletion()` shape mirroring `gh auth
  login`'s UX: print code first, let the SDK consumer decide where
  to open the browser. `awaitCompletion` polls GET on the
  daemon-supplied `intervalMs`, honors `slow_down` bumps, and
  fall-back-recovers from 404 (entry evicted post-grace).

POST + DELETE flow through PR 15's `mutate({strict: true})` —
401 `token_required` on token-less loopback defaults. GET routes
use only the global `bearerAuth`. Every state transition
(`started/authorized/failed/cancelled/expired/lost_success`)
records a structured stderr breadcrumb (`[serve] auth.device-flow:
provider=... deviceFlowId=abc12... clientId=... status=...`)
because `mutate()` doesn't carry an audit hook. Events alone aren't
enough, since an SDK can silently drop them; stderr → journald/docker
logs is the unfalsifiable record.

`auth_device_flow` advertised unconditionally on
`/capabilities.features`. Supported providers list lives on
`/workspace/auth/status` to keep the registry descriptor uniform.

- `packages/core/src/qwen/qwenOAuth2.ts`:
  - exports `cacheQwenCredentials` (was a private function; needed
    by the daemon's device-flow registry)
  - `cacheQwenCredentials` now calls `SharedTokenManager.clearCache()`
    after writing, folding what was previously a paired call site at
    L820+L829. Idempotent change.
  - file mode `0o600` on `oauth_creds.json` (was default 0o666 +
    umask). Mirrors opencode's `auth/index.ts`.
- `packages/cli/src/serve/runQwenServe.ts`: device-flow registry
  `dispose()` wired into the shutdown drain (BEFORE
  `bridge.shutdown()`).

- `auth/deviceFlow.test.ts` — 21 tests: BrandedSecret leak paths,
  state machine (slow_down / success / error), terminal grace,
  concurrent-start coalescing, dispose, cancel idempotency, static-
  source grep against browser-spawn primitives.
- `server.test.ts` — 10 device-flow integration tests:
  POST 201/200 take-over, strict 401, 400 `unsupported_provider`,
  GET / DELETE / `/workspace/auth/status`, 502 `upstream_error`
  mapping, sweeper-driven auto-expiry with controlled clock,
  capability advertisement.
- `daemonEvents.test.ts` — 5 SDK reducer tests: type guards, per-
  provider state projection, `failed` always → `status: 'error'`
  (errorKind carries the kind, including new `persist_failed`),
  session reducer no-ops on auth events.

369/369 serve + SDK tests pass; typecheck + `eslint
--max-warnings 0` clean across 14 PR 21 files.

- [x] Independently mergeable (depends only on merged PR 4 / PR 7 /
      PR 12 / PR 15)
- [x] Backward compatible (4 new routes + 1 capability tag + 5 typed
      events + 4 SDK helpers; existing routes/events untouched)
- [x] Default off (capability advertised but no client is forced to
      use it; CLI `qwen` OAuth flow unchanged)
- [x] `qwen serve` Stage 1 routes / SDK behavior preserved
- [x] Gradual migration (v1 only `qwen-oauth`; future providers
      register through the `DeviceFlowProvider` interface)
- [x] Reversible (revert removes 4 routes + 1 tag + 5 events with no
      schema migration)
- [x] Tests-first (28 new tests across 3 layers)

- Inline `bridge.broadcastWorkspaceEvent` → fold-in to PR 16 (#4249)
  `publishWorkspaceEvent` once that lands
- `/workspace/auth/status` vs PR 12 `/workspace/providers` boundary
  — separate route in v1; merge alternative discussed
- Wave 4 PRs 17/19/20 should adopt the same mutate-strict +
  workspace event-fan-out pattern

5 items from pre-PR specialist passes parked for a focused
follow-up: `DeviceFlowEntry` discriminated union, single-source SDK
status / ProviderId unions, `awaitCompletion` memoization,
broadcast-100%-fail stderr elevation, SDK 404 →
`not_found_or_evicted` errorKind.

Refs: #4175

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 round-1 review feedback

Eleven items from copilot-pull-request-reviewer's round-1 pass on
#4255 — 4 inline threads + 7 from the PR-level review summary.

## Adopted (11 items, code/doc changes)

- **`lastSeenAt` → `lastSeenEventId`** (`events.ts`,
  `DaemonDeviceFlowReducerState`). The field was set from
  `rawEvent.id` (SSE event id) but documented as "epoch ms" — a real
  semantic mismatch that would mislead consumers into time-based
  logic against a monotonic counter. Rename + tighten the JSDoc to
  describe it as an event-id counter; reducer cases updated.
- **`DEVICE_FLOW_EXPIRY_GRACE_MS = 30_000` extracted** in
  `DaemonAuthFlow.ts` (was a magic number on `start.expiresAt +
  30_000`). `AwaitCompletionOptions.timeoutMs` doc now describes the
  actual grace-past-expiry behavior + the rationale (clock skew +
  daemon sweeper interval + network latency) instead of the wrong
  "defaults to expiresAt - Date.now()" claim.
- **Explicit `chmod 0o600`** in `cacheQwenCredentials` after every
  write. `fs.writeFile`'s `mode` only applies on file creation; a
  pre-existing `oauth_creds.json` written under a broader umask kept
  its old permissions across upgrades. The chmod now tightens it on
  every write; chmod failure (Windows / hardened FS) surfaces via
  `debugLogger.warn` instead of silently dropping the invariant.
- **`SharedTokenManager.clearCache()` failure now logs**
  `debugLogger.warn` (was a silent `try { } catch { }`). In
  production a swallowed clearCache means in-process callers serve
  stale credentials until the SharedTokenManager mtime watcher
  catches up — a recoverable degradation worth a log line.
- **Protocol doc** lists `persist_failed` in the
  `auth_device_flow_failed.errorKind` union (was added to the type
  but missed in the doc).
- **`pollDeviceToken({signal})`** plumbed through
  `IQwenOAuth2Client` interface + `QwenOAuth2Client` impl + the Qwen
  device-flow provider. Cancel / dispose during a slow IdP response
  now aborts the in-flight HTTP socket immediately instead of
  waiting for the upstream timeout. Two new registry tests assert
  `cancel()` / `dispose()` propagate abort to the signal observed by
  `provider.poll`.
- **`revealSecret` error message** clarified: was "secret has been
  GC-evicted" (impossible — WeakMap doesn't evict reachable keys).
  Now points at the actual reachable failure modes (forged shape /
  serialize+reparse losing the WeakMap binding).
- **`transitionTerminal` JSDoc** clarifies that the PRIMARY guard
  against late timer secret leaks is the `entry.status !== 'pending'`
  check at the top of `runPollTick`; secret-clearing here is
  defense-in-depth.
- **`DeviceFlowErrorKind` JSDoc'd per variant** so consumers can tell
  when each fires (RFC 8628 distinctions + `persist_failed` vs
  `upstream_error` boundary).
- **Stale "PR 16 / PR 21 §3" temporal references** in
  `DaemonAuthFlow.ts:124` rephrased to be timeless ("workspace-scoped
  events fan out through whatever session buses happen to be live"
  — no PR number references that rot when those PRs merge).

## Not adopted (4 items, replied to in-thread)

- **`authWithQwenDeviceFlow` browser-launch separation** — correct
  architectural advice but out of #4255 scope (would refactor a CLI
  auth UX module that PR 21 only touched additively). Tracked as a
  Wave 5 follow-up.
- **Copyright header year range** — repo-wide convention "2025"; not
  introduced by this PR.
- **Spread `...(x ? {x} : {})` → `x: x ?? undefined`** — the two are
  not semantically equivalent. The current form omits the key
  entirely on falsy `x`; the suggested form always includes the key.
  Tests assert object shape and would break under the change.
- **Eager `client.auth` getter** — public API boundary. Lazy
  construction matches `DaemonSessionClient` precedent + saves the
  module load for SDK consumers that never touch auth.
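
The difference behind the declined spread suggestion can be shown in a few lines. `Failure`, `withSpread`, and `withNullish` are illustrative names only.

```typescript
type Failure = { status: 'error'; hint?: string };

// Conditional spread: the key is omitted entirely when `hint` is falsy.
function withSpread(hint?: string): Failure {
  return { status: 'error', ...(hint ? { hint } : {}) };
}

// Nullish form: the key is always present, possibly with value `undefined`.
function withNullish(hint?: string): Failure {
  return { status: 'error', hint: hint ?? undefined };
}

console.log(Object.keys(withSpread(undefined)));  // [ 'status' ]
console.log(Object.keys(withNullish(undefined))); // [ 'status', 'hint' ]
```

The two forms also diverge on an empty string: the spread drops it because it is falsy, while `??` keeps it because it is not nullish.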

Refs: #4175 #4255

* fixup(serve): address PR #4255 wenshao round-1 review feedback

15 items from @wenshao's review batches on #4255. Catches a handful
of real bugs that the earlier round (commit 3d9f082f5) didn't
surface.

## Critical fixes

- **C1 — `pollUntilTerminal` providerId pass-through**
  (`DaemonAuthFlow.ts:185`). The synthetic 404 fallback hardcoded
  `providerId: 'qwen-oauth'`; the parent `awaitCompletion` already
  receives the real providerId via `start.providerId` but
  `pollUntilTerminal`'s parameter type stripped it. Add the field to
  the param type, propagate.
- **C2 — open `errorKind` allowlist** (`events.ts`). The closed
  5-value union in the type guard silently dropped any `failed`
  event whose errorKind the daemon added without mirroring SDK-side
  (e.g. a future `rate_limited`). The flow's reducer state would
  never transition to terminal, leaving SDK consumers stuck on
  `pending` forever. Open the union with `(string & {})` and accept
  any non-empty string in the runtime guard. Updated test asserts
  forward-compat behavior + still rejects the truly-malformed
  empty-string case.
- **C3 — `persist()` timeout + signal**
  (`deviceFlow.ts`). A wedged disk I/O (NFS stall, encrypted-volume
  contention) without bounds would pin the entry in `pending` until
  the upstream `expires_in` elapsed (potentially minutes). The
  registry now passes its `cancelController.signal` AND arms a hard
  `DEVICE_FLOW_PERSIST_TIMEOUT_MS = 30_000` timer; persist failure
  surfaces as `persist_failed` immediately. The
  `DeviceFlowPollResult` `success` variant signature changed to
  `persist({signal})`.
- **C4 — cancel × success race rollback**
  (`deviceFlow.ts` + Qwen provider). Today, if `cancel()`
  transitions while `persist()` is in flight, the credentials get
  written but the flow's status is `cancelled`. User sees cancelled,
  daemon disk has a valid token. `DeviceFlowPollResult.success`
  gains an optional `unpersist()` callback the registry calls when
  `transitionTerminal(authorized)` fails — the Qwen provider wires
  it to `clearQwenCredentials()`. Rollback failure is audited but
  not propagated (re-running auth would overwrite anyway).
- **C5 — don't `unref()` the `awaitCompletion` sleep timer**
  (`DaemonAuthFlow.ts`). On a standalone Node CLI/script doing just
  `client.auth.start().awaitCompletion()`, the unref'd between-poll
  timer was the only event-loop handle, so Node could exit before
  the user finished authorization. The poll wait is foreground work
  the caller explicitly awaits — keep it ref'd.
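
As an illustration of C2, an open union plus a permissive runtime guard might look like the sketch below. The payload shape follows the `auth_device_flow_failed` fields listed earlier; the names `FailedPayload` and `isFailedPayload` are hypothetical.

```typescript
type KnownErrorKind =
  | 'expired_token'
  | 'access_denied'
  | 'invalid_grant'
  | 'upstream_error'
  | 'persist_failed';

// `string & {}` keeps autocomplete for the known literals while still
// accepting any other string the daemon may add later.
type DeviceFlowErrorKind = KnownErrorKind | (string & {});

interface FailedPayload {
  deviceFlowId: string;
  errorKind: DeviceFlowErrorKind;
  hint?: string;
}

function isFailedPayload(value: unknown): value is FailedPayload {
  if (typeof value !== 'object' || value === null) return false;
  const { deviceFlowId, errorKind, hint } = value as Record<string, unknown>;
  return (
    typeof deviceFlowId === 'string' &&
    typeof errorKind === 'string' &&
    errorKind.length > 0 && // empty string is still treated as malformed
    (hint === undefined || typeof hint === 'string')
  );
}

console.log(isFailedPayload({ deviceFlowId: 'f1', errorKind: 'rate_limited' }));
console.log(isFailedPayload({ deviceFlowId: 'f1', errorKind: '' }));
```

With a closed allowlist the first payload would be dropped and the flow would never reach a terminal state on the SDK side; the open guard lets it through while still rejecting the empty string.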

## Information-leak fixes

- **S1 — sanitize `persist_failed` hint**. `err.message` from
  `cacheQwenCredentials` embeds the full `~/.qwen/oauth_creds.json`
  path. Broadcast via SSE, that path leaks the daemon's home layout
  to every connected session subscriber. Replace user-facing hint
  with `"credentials could not be written to the daemon filesystem
  — check disk space and permissions"`; full err goes to stderr
  audit only.
- **S2 — sanitize upstream `pollDeviceToken` hint**. The class
  embedded the entire raw IdP response body (which can be an HTML
  error page from a reverse proxy) into the thrown message. Same
  broadcast leak path. Replace upstream-error hint with
  `"unexpected response from identity provider"`; RFC 8628 errors
  use `"Qwen IdP returned ${kind}"`.

## Cleanup / forward-compat

- **D1 — drop duplicate `clearCache()`** at `qwenOAuth2.ts:840`. The
  paired call became redundant once `cacheQwenCredentials` folded
  the clearCache in (PR #4255 fold-in 1). The fold-in 1 message
  said this would be done; the duplicate slipped through.
- **S3 — drop unused `DeviceFlowNotFoundError`** (`deviceFlow.ts`).
  Exported but never imported; route handlers do inline 404 JSON.
- **S4 — single-source SDK status / errorKind unions**
  (`types.ts`). `DaemonAuthDeviceFlowSdkStatus` /
  `DaemonAuthDeviceFlowSdkErrorKind` were parallel literal copies
  of the canonical events.ts definitions — drift waiting to happen.
  Now imported + aliased as type-only re-exports.
- **S5 — broadcast 100% fail elevates to stderr**
  (`httpAcpBridge.ts`). Per-session bus failures stay debug-only,
  but a broadcast where EVERY session bus refused is operationally
  interesting (clients won't see the event). Track success / fail
  counts; `writeStderrLine` when `successCount === 0`.
- **S6 — `this.disposed` check after `await provider.start()`**
  (`deviceFlow.ts`). `dispose()` mid-start would orphan the freshly-
  inserted entry (`schedulePoll` guards on `disposed` so no poll
  fires; the entry never transitions). Throw post-await if disposed.
- **W1 — thread `signal` into `requestDeviceAuthorization`**
  (`qwenOAuth2.ts` + Qwen provider). `start()` had the same
  cancellation gap that `pollDeviceToken` had — a slow
  device-authorization request couldn't be aborted during shutdown.
  Now plumbed end-to-end.
- **W2 — split `invalid_request` from `unsupported_provider`**
  (`server.ts`). Conflating them surfaced misleading remediation
  hints to SDK consumers branching on `code` ("this provider isn't
  supported here" when the real cause was a serializer dropping the
  field). Bad-shape now returns `code: 'invalid_request'`;
  unknown-but-well-formed stays `unsupported_provider`.
- **W3 — drop never-populated `accountAlias`**
  (Qwen provider). The field was wired through types / events /
  reducer / audit but the Qwen IdP's token response doesn't carry
  one (no `name` / `email` / `sub`). Returning only `{expiresAt}`
  makes the field type-honestly absent rather than always-undefined.
  Future provider with an alias-bearing response can populate it.
- **W4 — `DaemonAuthFlow` JSDoc accuracy**. Doc claimed "first
  attempts to consume an SSE event stream … falls back to GET-based
  polling"; actual is GET-only with SSE as a real-time hint for
  clients already subscribed to a session stream.
- **W5 — clearer unit arithmetic** in interval normalization. The
  `(_INTERVAL_MS / 1000) * 1000` cancellation hid the s↔ms boundary;
  expanded form makes both branches unit-explicit.

## Test changes

- `daemonEvents.test.ts` updated to match the now-OPEN errorKind
  union (forward-compat assertion + empty-string still rejected).
- `deviceFlow.test.ts` `FakeProvider.poll` aligned with the new
  `persist({signal})` signature + optional `unpersist`.

## Validation

- `npm run typecheck --workspace packages/cli --workspace
  packages/sdk-typescript --workspace packages/core` — clean
- `npx vitest run packages/cli/src/serve/
  packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 368/368
- `npx eslint --max-warnings 0` over the 11 PR 21 surface files —
  clean

Refs: #4175 #4255

* fixup(serve): address PR #4255 wenshao round-2 review feedback

10 new threads from @wenshao's second deep-review pass on #4255.
Verified status: 5 real issues, 1 improvement, 3 stale (already
fixed; comments lagged), 1 false alarm (typecheck demonstrably
clean).

## Critical fixes

- **fold-in 2 C4 REVERSED**: when `provider.poll()` returns success
  AND `cancel()` / `dispose()` transitioned the entry mid-`persist()`,
  the registry now FORCES the entry to `authorized` and keeps the
  on-disk credentials. The earlier rollback (`unpersist()`) wasted
  the user's IdP approval because the RFC 8628 `device_code` is
  single-use — re-running the flow would force them through the
  whole browser-prompt + paste-code dance again for a click whose
  intent was likely "stop the wait" rather than "undo my already-
  completed approval". Aligns with gh CLI / Auth0 SDK / git-
  credential-manager. Audit captures the race via `hint:
  'lost_success_kept ...'`. `DeviceFlowPollResult.success.unpersist`
  field + Qwen provider's `clearQwenCredentials` rollback removed.
- **#1 GET /workspace/auth/device-flow/:id strict gate**: this GET
  surfaces `userCode` / `verificationUri` for pending entries, which
  on the loopback no-token default were readable by any local
  process. POST + DELETE were already strict; aligning GET closes
  the information-disclosure asymmetry. `/workspace/auth/status`
  stays bearer-only (its `pendingDeviceFlows` entries intentionally
  omit `userCode`).
- **#2 `inFlightStarts` hard timeout**: a hung `provider.start()`
  (network partition, unresponsive IdP) used to leave the per-
  `providerId` slot in `inFlightStarts` occupied forever, blocking
  every subsequent POST until daemon restart. New
  `DEVICE_FLOW_START_TIMEOUT_MS = 30_000` arms a timer that
  `cancelController.abort()`s the start; the rejected promise
  unwinds through the `try/finally` clearing the slot.
- **#10 chain-completing the C3 persist-timeout**: the earlier C3
  fix armed a 30s timer that fired `cancelController.abort()` then
  `await result.persist({signal})`, but the chain ended at the
  registry boundary — `cacheQwenCredentials` didn't take a signal,
  so `fs.writeFile` couldn't be aborted. Now `cacheQwenCredentials`
  accepts an optional `{signal}` and threads it into
  `fs.writeFile(..., {signal})` (Node native). The Qwen provider's
  `persist({signal})` forwards the entry's
  `cancelController.signal` end-to-end.
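
The `inFlightStarts` timeout in #2 could be sketched roughly as follows. `ProviderStart`, `runStart`, `startFlow`, and `hungProvider` are hypothetical names; the sketch relies on the provider honoring the abort signal, since aborting alone cannot unblock a provider that ignores it.

```typescript
type ProviderStart = (opts: { signal: AbortSignal }) => Promise<string>;

const inFlightStarts = new Map<string, Promise<string>>();

async function runStart(
  providerId: string,
  providerStart: ProviderStart,
  timeoutMs: number,
): Promise<string> {
  // Yield once so the caller has stored the slot before any cleanup can run.
  await Promise.resolve();
  const cancelController = new AbortController();
  const timer = setTimeout(() => cancelController.abort(), timeoutMs);
  try {
    return await providerStart({ signal: cancelController.signal });
  } finally {
    clearTimeout(timer);
    inFlightStarts.delete(providerId); // slot is freed on success and on failure
  }
}

function startFlow(
  providerId: string,
  providerStart: ProviderStart,
  timeoutMs = 30_000,
): Promise<string> {
  const existing = inFlightStarts.get(providerId);
  if (existing) return existing;
  const promise = runStart(providerId, providerStart, timeoutMs);
  inFlightStarts.set(providerId, promise);
  return promise;
}

// A provider that never answers on its own but does reject on abort.
const hungProvider: ProviderStart = ({ signal }) =>
  new Promise<string>((_resolve, reject) => {
    signal.addEventListener('abort', () => reject(new Error('start aborted')), {
      once: true,
    });
  });

const slowStart = startFlow('qwen-oauth', hungProvider, 10);
slowStart.catch((err: Error) => {
  console.log(err.message, inFlightStarts.has('qwen-oauth'));
});
```

Once the timer fires, the rejection unwinds through `finally`, so a later POST for the same provider is no longer blocked by the stale slot.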

## Improvement (#4): 404 fallback errorKind

`pollUntilTerminal`'s 404 catch used to synthesize
`{status: 'expired'}` for ALL evicted entries — conflating "your
flow expired during your disconnect", "the daemon was restarted",
and "your deviceFlowId was wrong". Now returns
`status: 'error'` + `errorKind: 'not_found_or_evicted'` + a `hint`
so SDK consumers branching on errorKind can distinguish.

## Information leak (#9): start() path raw IdP message

S2 (fold-in 2) sanitized `poll()`'s upstream-error hint, but
`start()` still embedded the raw `err.message` (full IdP response,
potentially HTML from a reverse proxy / WAF) into the
`UpstreamDeviceFlowError` that flowed to SDK clients via the 502.
Now uses static messages for the SDK-visible errors; raw detail
goes through `writeStderrLine` for operator audit only. Mirrors
S2's approach.

## Stale comments cleaned (#5, #7)

`qwenDeviceFlowProvider.ts:177` claimed
`cacheQwenCredentials` "doesn't currently take a signal — that's
a follow-up". After #10 above, that's no longer true; the comment
is replaced with the actual end-to-end signal-threading note.

## Not adopted (1 false alarm)

- Thread on `types.ts:330` claimed type-only-import-after-
  declarations breaks `tsc` and fails `daemonEvents.test.ts:670`
  with TS2345. Demonstrably false: `npx tsc -p
  packages/sdk-typescript/tsconfig.json --noEmit` exits 0;
  `daemonEvents.test.ts` is the post-fold-in-2 file with the
  open-allowlist assertion (test 28/28 passes). The reviewer may
  have been looking at a transient state during their analysis.

## Validation

- `npm run typecheck --workspace packages/cli --workspace
  packages/sdk-typescript --workspace packages/core` — clean
- `npx vitest run packages/cli/src/serve/
  packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398
  pass
- `npx eslint --max-warnings 0` over the PR 21 surface — clean

Refs: #4175 #4255

* fixup(serve): address PR #4255 wenshao round-3 review feedback

5 new threads from the third deep-review pass on #4255. 3 real
issues fixed; 1 stale (already done in fold-in 3); 1 deferred as
non-blocking design suggestion.

- **A — `expiresIn` / `interval` non-finite guard**
  (`deviceFlow.ts`). The provider contract types both as `number`,
  but a misbehaving / future provider could hand `undefined` /
  `NaN` / `Infinity`. `Math.max(0, NaN) * 1000` is `NaN`, then
  `now() + NaN` is `NaN`, then `now >= NaN` is always `false` —
  the sweeper would NEVER evict the entry, pinning an upstream
  `device_code` slot until daemon restart. Same hazard on
  `interval * 1000` (Node coerces both `setTimeout(NaN)` and
  `setTimeout(Infinity)` to a 1 ms delay, so the poll fires almost
  immediately). Now both fields go
  through `Number.isFinite(x) && x > 0`; missing/bad values fall
  back to RFC 8628's recommended ceilings (10 min for expiry, 5s
  for interval).

- **D — typed `app.locals` accessor**
  (`deviceFlow.ts` + writer/reader call sites). The
  `app.locals['deviceFlowRegistry']` string key was shared between
  `createServeApp` (writer) and `runQwenServe` (reader); a typo on
  either side would compile cleanly and the shutdown dispose call
  would silently no-op, leaving polling timers running until the
  `unref()` rescue. New `setDeviceFlowRegistry(app, registry)` /
  `getDeviceFlowRegistry(app)` pair gives both call sites
  type-checked access; the string literal is encapsulated in one
  module.

- **E — `UnsupportedDeviceFlowProviderError` docstring**
  (`deviceFlow.ts`). After fold-in 2's W2 fix split
  `invalid_request` from `unsupported_provider`, the route layer
  screens unknown ids against `DEVICE_FLOW_SUPPORTED_PROVIDERS`
  before reaching the registry — so this error is now reachable
  ONLY on a daemon-internal invariant violation (id is declared
  supported but not registered in the runtime provider map).
  Docstring + thrown message updated to reflect that this branch
  signals a programmer error, not user input.

- **B** claimed `cacheQwenCredentials(credentials)` doesn't forward
  signal to `fs.writeFile`. Verified: fold-in 3 (#10) at
  `qwenDeviceFlowProvider.ts:204` calls
  `cacheQwenCredentials(credentials, { signal: persistOpts.signal })`
  and the core helper threads it into `fs.writeFile(..., {mode,
  signal})`. The reviewer was looking at the comment block above
  (lines 174-181) without scrolling to the actual call site.

- **C — SDK `cancelDeviceFlow` lossy 204/404 collapse**.
  Suggested returning `{existed: boolean; alreadyTerminal: boolean}`
  instead of resolving void on both 204 and 404. Real signal-loss
  but tagged "[非阻塞]" (non-blocking) by the reviewer; changing requires a
  daemon route shape change (200 + body instead of 204) which is
  better as a focused follow-up PR. Acknowledged in-thread;
  deferred to a fold-in PR after #4255 lands.
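
The non-finite guard from item A above can be sketched as a small normalizer. The function and constant names are hypothetical; the fallback values (10 min, 5 s) are the ones stated in item A.

```typescript
const FALLBACK_EXPIRES_IN_S = 10 * 60; // 10 min
const FALLBACK_INTERVAL_S = 5; // 5 s

function positiveFiniteOr(value: unknown, fallback: number): number {
  return typeof value === 'number' && Number.isFinite(value) && value > 0
    ? value
    : fallback;
}

function normalizeTimings(
  raw: { expiresIn?: unknown; interval?: unknown },
  nowMs: number,
): { expiresAt: number; intervalMs: number } {
  const expiresInS = positiveFiniteOr(raw.expiresIn, FALLBACK_EXPIRES_IN_S);
  const intervalS = positiveFiniteOr(raw.interval, FALLBACK_INTERVAL_S);
  // Seconds are converted to milliseconds exactly once, here.
  return { expiresAt: nowMs + expiresInS * 1000, intervalMs: intervalS * 1000 };
}

// The unguarded arithmetic: NaN propagates and the expiry check never fires.
const unguardedExpiresAt = 1_000 + Math.max(0, NaN) * 1000;
console.log(Number.isNaN(unguardedExpiresAt), 2_000 >= unguardedExpiresAt);

console.log(normalizeTimings({ expiresIn: NaN, interval: Infinity }, 1_000));
```

Every comparison against `NaN` is false, which is why the unguarded entry would never look expired to a sweeper that tests `now >= expiresAt`.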

- `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}`
- `npx vitest run packages/cli/src/serve/
  packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 398/398
- `npx eslint --max-warnings 0` over the PR 21 surface — clean

Refs: #4175 #4255

* fixup(serve): address PR #4255 wenshao round-4 review feedback

4 threads from the fourth review pass on #4255. 3 adopted + 1
deferred (out-of-scope rename of PR 15's `mutate` helper).

## Adopted

### #1 — `persistInFlight` flag suppresses cancel × persist event-stream UX trap

When `provider.poll()` returns success and we await `persist()`, a
concurrent `cancel()` would synchronously transition the entry to
`cancelled` and emit `auth_device_flow_cancelled` — then `persist()`
resolves and (per fold-in 3 C4) force-overrides to `authorized` +
emits `auth_device_flow_authorized`. The reducer state correctly
last-write-wins on `authorized`, but DIRECT event-stream consumers
(close-dialog handlers, telemetry, UI cleanup) race onto an unmounted
UI when the second event lands.

Now: while persist is in-flight, `cancel()` and the sweeper SKIP the
state transition + event emit. They register intent (set
`cancelRequestedDuringPersist=true` for cancel; sweeper just no-ops)
and let the persist resolution decide:

- persist succeeds → `authorized` (IdP wins per fold-in 3 C4)
- persist fails AND cancel was requested → `cancelled`
- persist fails AND `now >= expiresAt` → `expired` / `expired_token`
- persist fails otherwise → `error` / `persist_failed`

Result: at most one terminal event per flow. Imperative SSE
consumers no longer see oscillating terminal states. Audit captures
the race (`hint: 'lost_success_kept ...'`) for incident-response
correlation.
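
The resolution rules above reduce to a small pure function. The sketch below uses hypothetical names (`PersistOutcome`, `resolveAfterPersist`) and assumes the listed order is also the precedence order, so a requested cancel wins over expiry when both apply.

```typescript
type TerminalState =
  | { status: 'authorized' }
  | { status: 'cancelled' }
  | { status: 'expired'; errorKind: 'expired_token' }
  | { status: 'error'; errorKind: 'persist_failed' };

interface PersistOutcome {
  persistSucceeded: boolean;
  cancelRequestedDuringPersist: boolean;
  nowMs: number;
  expiresAtMs: number;
}

function resolveAfterPersist(outcome: PersistOutcome): TerminalState {
  if (outcome.persistSucceeded) return { status: 'authorized' };
  if (outcome.cancelRequestedDuringPersist) return { status: 'cancelled' };
  if (outcome.nowMs >= outcome.expiresAtMs) {
    return { status: 'expired', errorKind: 'expired_token' };
  }
  return { status: 'error', errorKind: 'persist_failed' };
}

console.log(
  resolveAfterPersist({
    persistSucceeded: true,
    cancelRequestedDuringPersist: true,
    nowMs: 0,
    expiresAtMs: 1,
  }),
);
```

Because the decision is made once, after `persist()` settles, exactly one terminal state (and therefore one terminal event) comes out of it.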

### #2 — `revealSecret` → `unsafeRevealSecret` rename

The earlier JSDoc claimed "the `unsafeReveal_` naming is intentional:
greppable in code review, easy to allowlist in lint rules, hard to
invoke by accident" — but the actual function was named
`revealSecret`. The promised safety properties didn't exist; a code
reviewer wouldn't single out `revealSecret` as suspicious, and a
`no-restricted-syntax` ESLint rule wouldn't flag it.

Renamed to `unsafeRevealSecret` so the JSDoc-promised "greppable" /
"lintable" property is now actually true. Two call sites in the
Qwen provider + 4 test references updated. Internal symbol; not
exposed through the SDK package.

### #4 — `QwenOAuthPollError` typed class replaces substring regex

The earlier RFC 8628 error mapper used an anchored regex against the
thrown error message text — an implicit cross-file string contract
between `qwenOAuth2.ts` (throws) and `qwenDeviceFlowProvider.ts`
(matches). If `qwenOAuth2.ts` ever changed its message format, ALL
RFC 8628 errors (`expired_token` / `access_denied` / `invalid_grant`)
would silently fall through to `upstream_error` — wrong errorKind
flowing through telemetry with no test or type-system check to catch
the drift.

Now `QwenOAuth2Client.pollDeviceToken` throws a structured
`QwenOAuthPollError extends Error` with `oauthError` / `description`
/ `status` fields. The provider branches on `instanceof
QwenOAuthPollError` and reads `.oauthError` directly via a
dedicated `mapRfc8628OAuthCode(code)` switch. The drift hazard is
gone: a future code change that touches the typed class will
fail tsc until both sides are updated. Message format preserved
for any pre-existing log-parsing / substring matchers.
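
A hedged sketch of the typed-error approach. The field names come from the description above; the constructor signature, the message text, and the `classifyPollFailure` helper are assumptions made for illustration.

```typescript
class QwenOAuthPollError extends Error {
  readonly oauthError: string;
  readonly description: string | undefined;
  readonly status: number;

  constructor(oauthError: string, status: number, description?: string) {
    super(`Device token poll failed: ${oauthError}`);
    // Keeps `instanceof` reliable even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = 'QwenOAuthPollError';
    this.oauthError = oauthError;
    this.description = description;
    this.status = status;
  }
}

type MappedErrorKind =
  | 'expired_token'
  | 'access_denied'
  | 'invalid_grant'
  | 'upstream_error';

function mapRfc8628OAuthCode(code: string): MappedErrorKind {
  switch (code) {
    case 'expired_token':
      return 'expired_token';
    case 'access_denied':
      return 'access_denied';
    case 'invalid_grant':
      return 'invalid_grant';
    default:
      return 'upstream_error';
  }
}

function classifyPollFailure(err: unknown): MappedErrorKind {
  return err instanceof QwenOAuthPollError
    ? mapRfc8628OAuthCode(err.oauthError)
    : 'upstream_error';
}

console.log(classifyPollFailure(new QwenOAuthPollError('access_denied', 400)));
console.log(classifyPollFailure(new Error('Device token poll failed: access_denied')));
```

The second call shows the point of the change: a plain `Error` with a look-alike message is no longer classified from its text.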

## Not adopted

### #3 — `mutate({strict:true})` semantic awkwardness on GET

Reviewer correctly noted that `mutate` is named for state-changing
routes, but `GET /workspace/auth/device-flow/:id` uses it for an
information-disclosure defense (only reachable code path is reading
state). Suggested rename: `mutate` → `strictHttpGate`.

Deferred: the rename touches PR 15's helper which has many call
sites in `server.ts` and is shared infrastructure for Wave 4 PRs
17/19/20. PR 21 is the first / only consumer of the strict-on-GET
form so far; widening the rename to a Wave 4 follow-up keeps the
fold-in scope tight. Replied in-thread.

## Validation

- `npm run typecheck` — clean across `packages/{cli,sdk-typescript,core}`
- `npx vitest run packages/cli/src/serve/
  packages/sdk-typescript/test/unit/daemonEvents.test.ts` — 544/544
- `npx eslint --max-warnings 0` over the PR 21 surface — clean

Refs: #4175 #4255

* fixup(serve): address PR #4255 wenshao round-5 review feedback

Five small adopt items from the round-5 review pass; one stale thread
already addressed in b5b77ee90 (fold-in 5).

#2 — `as const` + derived type for DEVICE_FLOW_SUPPORTED_PROVIDERS so
adding/removing a provider id requires touching exactly ONE site.
Mirrors `SERVE_ERROR_KINDS` / `ServeErrorKind` in `status.ts`.
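
The single-site pattern looks roughly like this. The constant name is from the text above; `DeviceFlowProviderId` and `isSupportedProvider` are illustrative names, and the list holds only `qwen-oauth` as in v1.

```typescript
// The one place a provider id is added or removed.
const DEVICE_FLOW_SUPPORTED_PROVIDERS = ['qwen-oauth'] as const;

// Derived from the tuple, so the type can never drift from the runtime list.
type DeviceFlowProviderId = (typeof DEVICE_FLOW_SUPPORTED_PROVIDERS)[number];

function isSupportedProvider(id: unknown): id is DeviceFlowProviderId {
  const known: readonly string[] = DEVICE_FLOW_SUPPORTED_PROVIDERS;
  return typeof id === 'string' && known.indexOf(id) !== -1;
}

console.log(isSupportedProvider('qwen-oauth'), isSupportedProvider('other'));
```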

#3 — Clarify `DEVICE_FLOW_EXPIRY_GRACE_MS` JSDoc to distinguish the
daemon's 30s SWEEP cadence (what the grace tracks) from the 5-min
TERMINAL_GRACE_MS reconnect window (which awaitCompletion does NOT
need to wait through).

#4 — Add `@remarks` block on `DeviceFlowProvider.poll()` warning
future provider authors that thrown `err.message` flows verbatim
into the SSE-broadcast `auth_device_flow_failed` hint, and must be
sanitized. Two equally-correct paths documented (typed `error`
result vs sanitized thrown message).

#5 — Truncate raw IdP detail in `qwenDeviceFlowProvider.ts` stderr
audit lines to 2 KiB. WAFs / reverse proxies can return MB-sized
HTML error pages, and container log aggregators (Loki, Fluent Bit,
Stackdriver) typically truncate or drop lines past 4-32 KiB —
losing the useful prefix downstream. 2 KiB retains structured JSON
envelopes while staying well below every aggregator's per-line cap.

#6 — Track latest `originatorClientId` on per-provider singleton
take-over via new `entry.lastOriginatorClientId` field +
`recordTakeover()` helper. When a second SDK client posts
`POST /workspace/auth/device-flow` for an already-pending provider
(or one being created in `inFlightStarts`) with a different
`initiatorClientId`, an audit breadcrumb records the take-over so
incident response can correlate "client A started, client B took
over at 12:34". Event-routing intentionally still uses the original
`initiatorClientId` (events are workspace-broadcast and changing
the originator field mid-flow would break SDK reducers that key on
it). Two new tests cover the differing-id audit + same-id no-op.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 wenshao round-6 review feedback

Six "Critical" findings from a gpt-5.5 /review pass — all real
liveness/correctness defects in the daemon's auth device-flow path
and the SDK's awaitCompletion polling loop.

#1 — Make `provider.start()` timeout authoritative via `Promise.race`
in `DeviceFlowRegistry.doStart`. The earlier shape only ABORTED the
signal on timeout; a provider that ignores `signal` (non-abortable
I/O, future implementer who forgets to thread it to `fetch`) would
leave the `await` hanging until daemon restart, pinning the
`inFlightStarts` slot for that providerId. Race against a rejecting
timer makes the timeout authoritative regardless of provider
cooperation; abort still fires for cooperative cleanup.
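
The race described in #1 could look roughly like the sketch below. `raceWithTimeout` and its parameters are hypothetical names; the real `DeviceFlowRegistry.doStart` is not reproduced here and may be shaped differently.

```typescript
// Hypothetical helper: the timer both aborts (cooperative cleanup) and
// rejects (authoritative), so a provider that ignores `signal` cannot
// keep the caller waiting past `timeoutMs`.
async function raceWithTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  timeoutError: Error,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // lets cooperative providers stop their I/O
      reject(timeoutError); // unblocks the caller even if they do not
    }, timeoutMs);
  });
  try {
    return await Promise.race([work(controller.signal), timeout]);
  } finally {
    clearTimeout(timer); // no stray timer once either side settles
  }
}
```

The key design point is that the rejection does not depend on the provider observing the abort; the abort is only a courtesy.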

#2 — Same shape for `result.persist()` in the success branch of
`runPollTick`. A future provider whose persist performs
non-abortable steps (mkdir/chmod/mv outside the abortable
fs.writeFile path) would otherwise hang the poll tick until process
restart. Race against rejecting timer; rejection maps to
`persist_failed`.

#3 — Clamp `expiresIn` and `interval` upper bounds. Previous
`Number.isFinite + > 0` guards stopped NaN/Infinity but a finite
extreme like `1e12` was still accepted — pinning the per-provider
singleton for ~30,000 years (`expires_in`) or scheduling a
TIMEOUT_MAX-clamped poll that never fires within `expiresAt`
(`interval`). Two new constants (`DEVICE_FLOW_MAX_EXPIRES_IN_SEC =
3600`, `DEVICE_FLOW_MAX_INTERVAL_MS = 60_000`) cap the worst case.
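
As a rough illustration of the upper-bound clamp in #3: the two constants and their values are from this commit, while `clampPositive` is a hypothetical helper name and the registry's actual validation code may differ.

```typescript
const DEVICE_FLOW_MAX_EXPIRES_IN_SEC = 3600;
const DEVICE_FLOW_MAX_INTERVAL_MS = 60_000;

// Hypothetical helper: rejects NaN / Infinity / non-positive input
// (returns undefined) and caps finite extremes at `max`.
function clampPositive(value: number, max: number): number | undefined {
  if (!Number.isFinite(value) || value <= 0) return undefined;
  return Math.min(value, max);
}

// 1e12 seconds is roughly 31,700 years; the clamp reduces it to one hour.
const expiresInSec = clampPositive(1e12, DEVICE_FLOW_MAX_EXPIRES_IN_SEC);
const intervalMs = clampPositive(1e12, DEVICE_FLOW_MAX_INTERVAL_MS);
```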

#4 — Extract `getDeviceFlowOrSynthetic404(...)` helper in
`DaemonAuthFlow.ts` and route BOTH the loop body and the
timeout-ceiling final read through it. Previously the ceiling read
went directly through `client.getDeviceFlow` and a 404 at the
boundary (entry evicted just as the timeout fired) would reject with
`DaemonHttpError(404)` instead of returning the structured `{ status:
'error', errorKind: 'not_found_or_evicted' }` that the rest of
`awaitCompletion` promises.
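
One possible shape for the helper in #4, assuming a 404 surfaces as a `DaemonHttpError` with a numeric `status`. The `DaemonHttpError` class and `DeviceFlowView` type below are simplified stand-ins, not the SDK's real definitions, and the helper's real parameter list is elided in the commit text.

```typescript
// Stand-in for the SDK error type (assumed shape).
class DaemonHttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`daemon responded with HTTP ${status}`);
    this.status = status;
  }
}

// Simplified stand-in for the GET response body.
interface DeviceFlowView {
  status: 'pending' | 'authorized' | 'error';
  errorKind?: string;
}

// A 404 becomes a structured result; every other failure is rethrown.
async function getDeviceFlowOrSynthetic404(
  getDeviceFlow: (deviceFlowId: string) => Promise<DeviceFlowView>,
  deviceFlowId: string,
): Promise<DeviceFlowView> {
  try {
    return await getDeviceFlow(deviceFlowId);
  } catch (err) {
    if (err instanceof DaemonHttpError && err.status === 404) {
      return { status: 'error', errorKind: 'not_found_or_evicted' };
    }
    throw err;
  }
}
```

Routing both the loop body and the final ceiling read through one function means the two call sites cannot disagree on how an evicted entry is reported.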

#5 — Validate `AwaitCompletionOptions.timeoutMs` and `pollOverrideMs`
with `Number.isFinite + > 0`. NaN slipped past the previous `??
default` form (NaN is not nullish, so `??` keeps it) and produced a
`ceiling` of `NaN` (loop runs forever, since `now >= NaN` is always
false) or a `setTimeout(NaN)` (Node clamps to 1ms, giving a tight
polling loop). Sanitize to `undefined` so the documented defaults
take effect.
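
The failure mode in #5 can be reproduced in a few lines. `sanitizePositiveMs` is named later in this PR, but its signature here is assumed; `DEFAULT_TIMEOUT_MS` and the two `resolveTimeout*` functions are placeholders for illustration only.

```typescript
const DEFAULT_TIMEOUT_MS = 300_000; // placeholder default, not the SDK's value

// Pre-fix shape: `??` only replaces null/undefined, so NaN survives.
function resolveTimeoutUnsafe(timeoutMs?: number): number {
  return timeoutMs ?? DEFAULT_TIMEOUT_MS;
}

// Assumed shape of the sanitizer: anything that is not a finite,
// positive number becomes undefined so the default can apply.
function sanitizePositiveMs(value?: number): number | undefined {
  return value !== undefined && Number.isFinite(value) && value > 0
    ? value
    : undefined;
}

function resolveTimeout(timeoutMs?: number): number {
  return sanitizePositiveMs(timeoutMs) ?? DEFAULT_TIMEOUT_MS;
}

// With a NaN ceiling this comparison is false for every `now`,
// which is why the unsanitized loop never terminates.
const everReachesCeiling = Date.now() >= resolveTimeoutUnsafe(Number.NaN);
```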

#6 — Thread `signal` into `DaemonClient.getDeviceFlow` and forward
to `fetchWithTimeout` (which already composes caller + timeout
signals). awaitCompletion now passes `opts.signal` from both GET
sites. Without this, an `awaitCompletion` caller that aborts mid-
poll could not cancel an in-flight stalled GET; it would have to
wait for the daemon-side `fetchTimeoutMs` (30s default) to fire.

Four new tests in `deviceFlow.test.ts` pin the new behaviors:
hanging-start timeout (#1), hanging-persist → persist_failed (#2),
extreme-expiresIn clamp (#3), extreme-interval clamp (#3).
FakeProvider gained a `startHangs` flag for the non-cooperative
provider scenario.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 wenshao round-7 review feedback

Two findings from a DeepSeek /review pass; both small but legitimate
defense-in-depth gaps.

#1 — `runPollTick`'s catch block forwarded `err.message` verbatim
into the SSE-broadcast `hint`. The provider's `@remarks` contract
(fold-in 6 #4) requires throwers to sanitize, but if violated the
unbounded raw payload would reach every SSE subscriber. Added
`DEVICE_FLOW_POLL_HINT_MAX_LEN = 256` + `truncatePollHint()`,
applied to the catch's `result.hint`. Full raw `err.message` is
still routed to the audit trail (`audit?.record({hint: 'provider.poll()
threw (raw): ...'})`) so operator visibility for incident response
is preserved. Belt-and-suspenders: the contract is now structurally
enforced rather than relying on every future provider author to
read the JSDoc.

#2 — `updateMatchingFlow` (and the `started`/`authorized` handlers
in `reduceDaemonAuthEvent`) unconditionally overwrote state without
comparing `rawEvent.id` against the existing flow's
`lastSeenEventId`. The field's JSDoc documented it as a monotonic
counter to prevent stale frames from overwriting newer state, but
the code didn't enforce that contract. SSE reconnect with
`Last-Event-ID < terminal-frame-id` would replay older frames; if
any of them were for the same `deviceFlowId` (e.g. a delayed
`failed` arriving after `authorized`) the stale frame would
overwrite the terminal. Daemon-side `transitionTerminal` makes the
exact reachable scenario thin, but the documented contract should
match the code.

Threaded `rawEventId` into `updateMatchingFlow` and added the gate
there + in the `started` and `authorized` handlers (the two cases
that don't go through `updateMatchingFlow`). Synthetic frames
without an envelope `id` (`rawEventId === undefined`) bypass the
gate — they originate inside SDK reducer machinery and aren't
subject to replay ordering.
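
The gate can be summarized as a small predicate. This is a sketch under assumptions: `shouldApplyFrame` is a hypothetical name, and the real reducer applies the check inline in `updateMatchingFlow` and the two handlers rather than through a shared function.

```typescript
// Returns true when a frame may overwrite the stored flow state.
function shouldApplyFrame(
  lastSeenEventId: number | undefined,
  rawEventId: number | undefined,
): boolean {
  // Synthetic SDK-internal frames carry no envelope id and bypass the gate.
  if (rawEventId === undefined) return true;
  // Nothing recorded yet for this flow, so there is nothing to be stale against.
  if (lastSeenEventId === undefined) return true;
  // Replayed or delayed frames (id <= last seen) are dropped.
  return rawEventId > lastSeenEventId;
}
```

For example, a delayed `failed` frame with id 7 arriving after `authorized` with id 10 is rejected, while a synthetic frame with no id is still honored.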

Three new tests pin the contracts:
- `runPollTick catch truncates the SSE hint and preserves raw on
  the audit (fold-in 8 #1)` — `pollThrowsWith` flag on FakeProvider
  models a non-conforming provider; SSE hint < 400 chars + contains
  'truncated'; audit hint contains the full 4_000-char raw.
- `reduceDaemonAuthEvent rejects out-of-order frames (fold-in 8 #2
  monotonicity)` — stale `failed`(id=7) does NOT overwrite
  `authorized`(id=10); stale `started`(id=4) for a different flow
  also rejected.
- `reduceDaemonAuthEvent passes synthetic frames (no envelope id)
  through the gate` — SDK-internal frames without `id` are honored.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 wenshao round-8 review feedback

Twelve correctness + structural fixes from a wenshao + DeepSeek + gpt-5.5
review pass. Tests deferred to fold-in 10 (separate, larger commit).

CRITICAL CORRECTNESS

#7 — `provider.persist()` Promise.race could publish `persist_failed`
to SSE while a non-cooperative provider was still committing
credentials to disk. Added an independent tracker on the original
persist promise: if the race timed out (`persistTimedOut === true`)
AND the underlying persist later resolved successfully, audit a
`lost_success_after_timeout` breadcrumb so operators see the
inconsistency. Tightened the persist `@remarks` contract to require
signal honoring end-to-end. Qwen provider already complies (fold-in
3 #10); this is forward-defense for future providers.

#11 — auth surface (`DaemonAuthFlow`, `reduceDaemonAuthEvent`,
`createDaemonAuthState`, `DEVICE_FLOW_EXPIRY_GRACE_MS`, all event /
data / state types) was re-exported from `src/daemon/index.ts` but
NEVER from the published SDK entry `src/index.ts`. SDK consumers got
`undefined` for everything except `client.auth.start()` (which
traveled through the already-exported `DaemonClient`). Added the
missing exports and pinned via `daemon-public-surface.test.ts`.

#12 — `core/src/qwen/qwenOAuth2.ts:373`'s
`debugLogger.debug('Device authorization result:', result)` writes
the raw `device_code` (RFC 8628 bearer-equivalent credential) to
stderr / journald, bypassing the `BrandedSecret` redaction layer.
Pre-existing on main but PR 21 expanded the exposure surface.
Sanitized to log only `{ ok, expires_in }` on success / `{ ok,
error }` on error.

#13 — `runPollTick` success-branch persist-failure × past-`expiresAt`
classified as `expired_token` instead of `persist_failed`, routing
operators toward "tell user to retry" (RFC 8628 expiry) when the
actual root cause was disk I/O. Reclassified to `persist_failed`
with a `persist_also_failed_past_expiry` audit hint to preserve the
timing detail for incident response.

SMALL CORRECTNESS

#1 — `runPollTick` catch hint replaced with a STATIC bounded message
("provider.poll() failed; see daemon audit log for details"). The
fold-in 8 truncated-prefix approach could still leak the first 256
chars of provider-templated raw text including secret material. Full
raw still routed to audit channel for operator visibility.

#5 — `cancellerClientId` field added to `DeviceFlowEntry`; deferred-
cancel branch in `cancel()` now stamps it on the entry, and the
persist-resolution `cancelled` event publish uses
`entry.cancellerClientId ?? entry.initiatorClientId`. SSE consumers
that suppress self-emitted events can now attribute the cancel
correctly.

#6 — `AwaitCompletionOptions.timeoutMs === 0` (the documented
"settle immediately, return current daemon view" contract) was
treated as falsy by the `?` ternary, falling back to the default.
`sanitizePositiveMs` now takes an `allowZero` opt-in; the ceiling
computation uses `!== undefined` instead of truthy check.

#8 — `EventBus.publish()` returns `undefined` for closed buses (it
does NOT throw). `broadcastWorkspaceEvent` previously counted that
path as success, hiding the all-buses-dropped operator alarm.
Folded the closed-bus-as-failure check into the canonical
`publishWorkspaceEvent` (see #X below).

#9 — start-timeout Promise.race rejected with a plain `Error`,
falling through `sendBridgeError` to a generic 500. Switched to
`UpstreamDeviceFlowError` so a hung IdP correctly surfaces as 502
(matching the envelope every other IdP start failure uses).

STRUCTURAL

#3 — Three identical `transitionTerminal + publish + audit`
expired_token blocks in `runPollTick`/`sweep`/(removed by #13)
deduplicated into a private `expireEntry()` helper. Future event-
shape changes are now a one-edit operation.

#X — PR 16 (#4249) merged on 2026-05-18 06:27Z. Per the inline
comment at httpAcpBridge.ts:501, PR 21's `broadcastWorkspaceEvent`
was kept distinct only to avoid the merge conflict; once PR 16
landed, it became a fold-in candidate. Folded the closed-bus +
all-failed-stderr-escalation operator-visibility features (PR 21's
S5 + fold-in 9 #8) INTO `publishWorkspaceEvent`; dropped
`broadcastWorkspaceEvent` from the bridge interface + impl + test
mocks. PR 21's deviceFlowEventSink now calls
`bridge.publishWorkspaceEvent` — single canonical workspace fan-out.

DOC

#16 — Added a "Cross-client take-over" paragraph to
`docs/users/qwen-serve.md` explaining that two clients on the same
daemon for the same provider get the per-provider singleton with
`attached: true`/`false` distinguishing them; no separate event
fires (both eventually observe the same `auth_device_flow_authorized`).

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 wenshao round-9 review feedback

Two small non-blocking items from the round-9 pass; defensive shape +
docs only. The 4 deferred test-coverage threads (#1-4 of round-8) are
still tracked for fold-in 10.

#6 — `lastSeenEventId` typed `number` with `?? 0` defaults in the
`auth_device_flow_started` reducer case. The daemon-side `EventBus`
assigns ids ≥ 1 so the `0` sentinel has no real-traffic meaning, but
the monotonic gate (`rawEventId <= flow.lastSeenEventId`) would
reject any future SDK-internal synthetic frame using `id: 0`.
Changed the field type to `number | undefined` and dropped the
`?? 0` from the started case. The `updateMatchingFlow` /
`auth_device_flow_authorized` guards already short-circuit on
`existing.lastSeenEventId !== undefined`, so undefined is safe
end-to-end. Existing 34 reducer tests still pass unchanged.

#7 — Added `@remarks` block to `DeviceFlowErrorKind.persist_failed`'s
JSDoc explaining the lost-success retry UX. When fold-in 9 #7's
`lost_success_after_timeout` audit fires (non-conforming provider
violates signal contract; disk write succeeds after registry
published `persist_failed`), a naive SDK retry hits the IdP a
second time with a fresh `device_code` and prompts the user
twice — but the first credential set is already valid. JSDoc now
documents the mitigation: SDK consumers writing retry logic on
`persist_failed` should call `client.auth.getStatus()` BEFORE
re-prompting; operators can grep stderr/audit for
`lost_success_after_timeout` to detect occurrences.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* test(serve): fold-in 10 — auth device-flow test bundle (#4255)

Lands the four deferred test-coverage items the round-8 review
flagged (and round-9 re-surfaced) as a hard merge prerequisite.
Net +41 tests across registry / SDK helper / client HTTP /
HTTP route layers.

#1 — `deviceFlow.test.ts` `persist failure paths` describe (3
tests, +3). The success arm's three terminal mappings — pure
`persist_failed`, `cancelled` (cancel during persist), and
`persist_failed` past `expiresAt` (the fold-in 9 #13
reclassification with `persist_also_failed_past_expiry` audit
hint) — were 0-covered. Now pinned. Test #2 also asserts the
fold-in 9 #5 cancellerClientId routing on the deferred
`cancelled` event.

#2 — new `DaemonAuthFlow.test.ts` (+14 tests). Mock DaemonClient
with sequenced `getDeviceFlow` replies. Covers happy-path
polling → `authorized`; `slow_down`-driven `intervalMs` bump
firing `onThrottled`; `signal.abort()` rejection; `signal`
propagation through `client.getDeviceFlow` (fold-in 7 #6);
`timeoutMs` ceiling final-read; `timeoutMs:0` immediate-return
(fold-in 9 #6); NaN/Infinity → `sanitizePositiveMs` fallback to
default ceiling (fold-in 7 #5); 404 → synthetic
`error`/`not_found_or_evicted` (fold-in 3 #4) at BOTH the loop
body AND the timeoutMs ceiling read (fold-in 7 #4); non-404
DaemonHttpError rethrown; `cancel()` and top-level
`status()`/`cancel()` wrappers forward correctly.

#3 — `DaemonClient.test.ts` `device-flow methods` describe
(+11 tests). POSTs `/workspace/auth/device-flow` happy path +
clientId header + body shape; 200/201 acceptance; non-2xx →
`DaemonHttpError`. GETs URL-encode the deviceFlowId; forward
`opts.signal` to `fetchWithTimeout`'s composed signal (fold-in
7 #6 — verified by aborting caller signal and observing the
fetch's signal flip to `aborted`); 404 throws. DELETEs
swallow 204 + 404 (idempotent, mirrors `closeSession`); non-
204/404 throws. `getAuthStatus` plain GET. `client.auth`
lazy-instantiated singleton.

#4 — `server.test.ts` 5 supplementary contract tests (+5).
The existing 8 `it()`s cover happy paths + take-over + 401
POST + DELETE pending/terminal/unknown + 502 upstream + sweeper.
This commit plugs gaps: 400 `invalid_request` for missing /
non-string providerId (fold-in W2 split this from
`unsupported_provider`); 409 `too_many_active_flows` (via
injected fake registry); 401 `token_required` on DELETE
without bearer; the asymmetric GET posture
(`/workspace/auth/device-flow/:id` IS strict-gated to prevent
peer-process userCode shoulder-surf; `/workspace/auth/status`
stays read-only because its `pendingDeviceFlows` entries
intentionally redact `userCode`).

Validation: cli serve 631/631 (+8 from #1, #4); sdk 384/384
(+25 from #2, #3, +/- some pre-existing churn). Typecheck +
lint clean.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(qwen): atomic temp+chmod+rename in cacheQwenCredentials (PR #4255 round-11 #2)

gpt-5.5 /review flagged a real correctness/security gap: the
post-write `chmod` ordering left a window where freshly-written
credentials could land in a broadly-readable existing
`oauth_creds.json` before the chmod tightened it. On POSIX, a
chmod failure additionally degraded to a debug warning while the
broadly-readable tokens stayed on disk.

New shape mirrors the standard atomic-write idiom:

  1. Write `${filePath}.tmp.${pid}.${randomUUID()}` with `mode: 0o600`.
     The temp path doesn't exist beforehand, so the `mode` flag
     actually applies on creation (it doesn't on existing files,
     which was the root of the original race).
  2. Defensive `chmod` on the temp file. POSIX failure is now a
     HARD ERROR (refuses to publish broad-perm credentials to the
     canonical filename). Windows logs a debug breadcrumb and
     proceeds, since chmod is a no-op on most NTFS volumes (perms
     go through ACLs).
  3. Atomic `fs.rename` over `filePath`. The canonical path is
     ALWAYS at `0o600` from the moment it contains the new tokens;
     readers see either the old creds or the new creds, never a
     partially-written or broadly-readable state.
  4. Best-effort `fs.unlink` of the temp file on any failure path
     so failed writes don't leave `.tmp.<pid>.<uuid>` litter on
     disk.
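
The four steps above can be sketched as follows. This is not the actual `cacheQwenCredentials` body: the function name is hypothetical, and the exclusive-create flag `'wx'` is an extra assumption of this sketch rather than something the commit states.

```typescript
import { randomUUID } from 'node:crypto';
import { promises as fs } from 'node:fs';

// Hypothetical helper illustrating temp + chmod + rename.
async function writeCredentialsAtomically(
  filePath: string,
  contents: string,
): Promise<void> {
  const tempPath = `${filePath}.tmp.${process.pid}.${randomUUID()}`;
  try {
    // 1. New file, so `mode` applies at creation. 'wx' (assumption)
    //    additionally fails if the temp path somehow already exists.
    await fs.writeFile(tempPath, contents, { mode: 0o600, flag: 'wx' });
    // 2. Defensive chmod: hard error on POSIX, tolerated on Windows.
    try {
      await fs.chmod(tempPath, 0o600);
    } catch (err) {
      if (process.platform !== 'win32') throw err;
    }
    // 3. Publish: readers see the old file or the new one, never a mix.
    await fs.rename(tempPath, filePath);
  } catch (err) {
    // 4. Best-effort cleanup; the original error is what gets reported.
    await fs.unlink(tempPath).catch(() => undefined);
    throw err;
  }
}
```

After a successful rename the unlink is never reached, so the cleanup only runs on failure paths.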

Test mock in `qwenOAuth2.test.ts` extended with `chmod` + `rename`
no-op stubs so the existing 158 core/qwen tests still pass; no test
behavior change beyond the mock surface.

Validation: typecheck clean (cli + core + sdk-typescript); core
qwen 158/158; cli serve 643/643; sdk 384/384.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): address PR #4255 wenshao + gpt-5.5 round-12 review feedback

Eight findings from a wenshao + gpt-5.5 /review pass: 1 critical
correctness, 2 real defensive defects, 4 edge cases / minor
hardening, 1 test gap. All adopted.

CRITICAL CORRECTNESS

#1 CzSpN — `dispose()` race: after `await provider.poll(...)` the
post-await guard checked only `entry.status !== 'pending'`, NOT
`this.disposed`. `dispose()` clears the registry maps and aborts
the entry's signal but doesn't mutate `entry.status`, so a
provider whose poll already resolved (or doesn't honor abort) could
enter the success branch and call `result.persist({...})` —
committing credentials on a shutting-down daemon. Added the
`if (this.disposed) return;` guard symmetric with the top-of-method
check.

REAL DEFENSIVE DEFECTS

#2 Cy_ZG — sync-throw escape: the `result.persist({signal})` call
happens BEFORE the `new Promise` constructor that captures it
(`persistTracker` is closed-over inside the constructor). A
non-conforming provider whose persist throws synchronously (e.g.
top-of-function validation) would escape past the outer
`try/catch (await new Promise(...))` and become an
`unhandledRejection` since `runPollTick` is fire-and-forget via
`void`. Wrapped the persist invocation in a try/catch that routes
the sync-throw into the same `persistError` branch.
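
A minimal sketch of the idea behind #2, assuming `persist` is typed as returning a promise. `invokePersistSafely` is a hypothetical name; the actual fix routes the error into the existing `persistError` branch instead of returning a rejected promise.

```typescript
// Converts a synchronous throw from a non-conforming provider into an
// ordinary rejection, so one downstream error path handles both cases.
function invokePersistSafely(persist: () => Promise<void>): Promise<void> {
  try {
    return persist();
  } catch (err) {
    return Promise.reject(err);
  }
}
```

This matters for fire-and-forget callers: a rejection can be caught by the surrounding `await`, whereas a throw that happens before the promise exists bypasses it.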

#3 CzSpe — runtime provider map: provider validation hardcoded
`DEVICE_FLOW_SUPPORTED_PROVIDERS` even though `deps.deviceFlowProviders`
is the documented extension hook for tests/future providers.
Switched both POST validation and `/workspace/auth/status`
`supportedDeviceFlowProviders` to derive from
`deviceFlowProviderMap.keys()` — single source of truth matches
the registry's `resolveProvider`.

EDGE CASES / MINOR HARDENING

#4 Cy_Y9 — `slow_down` re-clamp: `intervalMs += SLOW_DOWN_BUMP_MS`
can push past `DEVICE_FLOW_MAX_INTERVAL_MS` (the bound that keeps
`setTimeout` from clamping to TIMEOUT_MAX). Wrapped in
`Math.min(MAX_INTERVAL_MS, ...)` symmetric with the doStart clamp.

#5 Cy_ZF — `expiresInSec` lower bound: `0.5` was finite-positive
and produced `expiresAt = now() + 500 ms` — first poll (clamped at
≥1 s) fires AFTER expiresAt → flow expires before any user could
authorize. Added `DEVICE_FLOW_MIN_EXPIRES_IN_SEC = 30` (RFC 8628
§3.2 calls 5–30 minutes "reasonable"; sub-30s is non-compliant).

#6 CzHOK — take-over response privacy: `initiatorClientId` was
echoed to ANY take-over POST caller, including those with no
`X-Qwen-Client-Id` header. Bearer-gated already, but the
asymmetry "anonymous caller learns who started it" violated the
no-header-as-privacy-signal contract. Now only echoed when the
caller's id matches the entry's initiator.

#7 CzSpd — production audit visibility: production audit sink
dropped `line.hint`, but the registry uses hints for operator-only
breadcrumbs (`provider.poll() threw (raw)...`,
`lost_success_after_timeout`, `persist_also_failed_past_expiry`,
take-over correlation, `deferred (persist in flight; ...)`). The
documented troubleshooting trail was invisible in production
stderr. Now included with a 1 KiB bound + JSON-quoted so multi-
word hints stay parseable.

TEST GAP

#8 Cy_ZH — `lost_success_after_timeout` audit: the
fold-in 9 #7 split-brain detector for non-cooperative providers
had no test pinning it. Added a controllable `latePersist` Promise
+ test that drives poll → success → enters persist race → fires
PERSIST_TIMEOUT (registry publishes persist_failed) → resolves
persist late → asserts the lost_success audit fires.

Validation: typecheck + lint clean; cli serve 644/644 (+1 from
the new test); sdk-typescript 384/384.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fixup(serve): close concurrent multi-provider cap bypass (PR #4255 round-13 #1)

gpt-5.5 /review caught a real workspace-wide cap bypass:
`countActive()` only counted entries already installed in
`byProvider`, but the cap check at the top of `start()` runs
before any provider's `inFlightStarts` slot completes
`provider.start()`. A burst of fresh starts for
`DEVICE_FLOW_MAX_CONCURRENT + 1` distinct providers all run
synchronously to the cap check (each `start()` is async but
runs to its first await — the await happens AFTER the cap
check), all observe `count === 0` (no `byProvider` entries
installed yet), and all pass — eventually installing more
than the documented four pending flows.

Fix: include `inFlightStarts.size` in `countActive()`. The
two maps are disjoint by construction (the existing-pending
fast-path catches any provider with both), so simple
addition cannot double-count. The second concurrent caller
sees count=1, the third count=2, …, and the (MAX+1)th caller
is rejected with `TooManyActiveDeviceFlowsError`.
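
The effect of counting `inFlightStarts` can be shown with a toy registry. `MiniRegistry` is a hypothetical stand-in that keeps only the two maps and the cap check; the error class here is a bare placeholder for the real one, and the value 4 is the documented cap.

```typescript
const DEVICE_FLOW_MAX_CONCURRENT = 4;

class TooManyActiveDeviceFlowsError extends Error {}

class MiniRegistry {
  private readonly byProvider = new Map<string, string>();
  private readonly inFlightStarts = new Map<string, Promise<string>>();

  // Installed flows plus starts that have not installed an entry yet.
  countActive(): number {
    return this.byProvider.size + this.inFlightStarts.size;
  }

  start(providerId: string): Promise<string> {
    const inFlight = this.inFlightStarts.get(providerId);
    if (inFlight) return inFlight; // coalesce concurrent starts
    const installed = this.byProvider.get(providerId);
    if (installed !== undefined) return Promise.resolve(installed);
    if (this.countActive() >= DEVICE_FLOW_MAX_CONCURRENT) {
      return Promise.reject(new TooManyActiveDeviceFlowsError('cap reached'));
    }
    const pending = this.doStart(providerId);
    this.inFlightStarts.set(providerId, pending);
    return pending;
  }

  private async doStart(providerId: string): Promise<string> {
    try {
      await Promise.resolve(); // stand-in for provider.start()
      const flowId = `flow-${providerId}`;
      // Move between maps in one synchronous step so they stay disjoint.
      this.inFlightStarts.delete(providerId);
      this.byProvider.set(providerId, flowId);
      return flowId;
    } finally {
      this.inFlightStarts.delete(providerId);
    }
  }
}
```

If `countActive()` returned only `byProvider.size`, every caller in a synchronous burst would observe zero and the cap check would pass for all of them.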

Test: `caps at DEVICE_FLOW_MAX_CONCURRENT under CONCURRENT
distinct-provider starts`. Fires `MAX+1` concurrent starts
via `Promise.allSettled`, asserts exactly `MAX` fulfilled +
exactly 1 rejected with the typed error. Pre-fix this test
fails (all `MAX+1` succeed); post-fix it passes.

Validation: typecheck clean across all 4 workspaces;
deviceFlow.test.ts 35/35 (was 34); cli serve 645/645.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
jinye 2026-05-18 22:05:53 +08:00 committed by GitHub
parent d14ffd469a
commit 36760ca63c
GPG key ID: B5690EEEBB952194
22 changed files with 6172 additions and 51 deletions


@@ -1007,6 +1007,97 @@ Response:
After a successful vote, every connected client sees `permission_resolved` with the same `requestId` and the chosen `outcome`.
### Auth device-flow routes (issue #4175 PR 21)
The daemon brokers an OAuth 2.0 Device Authorization Grant (RFC 8628) so a remote SDK client can trigger a login whose tokens land on the **daemon** filesystem — not on the client. The daemon polls the IdP itself; the client's only job is to display the verification URL + user code and (optionally) subscribe to SSE for completion events.
Capability tag: `auth_device_flow` (always advertised). Supported providers in v1: `qwen-oauth`.
**Runtime locality.** The daemon never spawns a browser — even if it can. The client decides whether to call `open(verificationUri)` locally; on a headless pod (the canonical Mode B deployment) the user opens the URL on whatever device they have a browser on. See `docs/users/qwen-serve.md` for the recommended UX.
**No token leakage in events.** `auth_device_flow_started` carries `{deviceFlowId, providerId, expiresAt}` only. The user code and verification URL come back point-to-point in the POST 201 body and via `GET /workspace/auth/device-flow/:id`; they are never broadcast on SSE.
**Per-provider singleton.** A second `POST` for the same provider while a flow is pending is an idempotent take-over — it returns the existing entry with `attached: true` rather than starting a fresh IdP request.
#### `POST /workspace/auth/device-flow`
Strict mutation gate: requires a bearer token even on token-less loopback defaults (`401 token_required`).
Request:
```json
{ "providerId": "qwen-oauth" }
```
Response (`201` fresh start, `200` idempotent take-over):
```json
{
  "deviceFlowId": "fa07c61b-…",
  "providerId": "qwen-oauth",
  "status": "pending",
  "userCode": "USER-1",
  "verificationUri": "https://chat.qwen.ai/api/v1/oauth2/device",
  "verificationUriComplete": "https://chat.qwen.ai/api/v1/oauth2/device?user_code=USER-1",
  "expiresAt": 1700000600000,
  "intervalMs": 5000,
  "attached": false
}
```
Errors:
- `400 unsupported_provider` — unknown `providerId` (response includes `supportedProviders`)
- `409 too_many_active_flows` — workspace cap (4) reached; cancel one with `DELETE`
- `401 token_required` — strict gate denied a token-less request
- `502 upstream_error` — IdP returned an unexpected error
#### `GET /workspace/auth/device-flow/:id`
Read the current state. Pending entries echo `userCode/verificationUri/expiresAt/intervalMs`; terminal entries (5-min grace) drop them and surface `status` + optional `errorKind/hint`.
Returns `404 device_flow_not_found` for unknown ids and post-grace evicted entries.
#### `DELETE /workspace/auth/device-flow/:id`
Idempotent cancel:
- pending entry → `204` + emit `auth_device_flow_cancelled`
- terminal entry → `204` no-op (no event re-emit)
- unknown id → `404`
#### `GET /workspace/auth/status`
Snapshot of pending flows + supported providers:
```json
{
  "v": 1,
  "workspaceCwd": "/work/bound",
  "providers": [],
  "pendingDeviceFlows": [
    {
      "deviceFlowId": "fa07c61b-…",
      "providerId": "qwen-oauth",
      "expiresAt": 1700000600000
    }
  ],
  "supportedDeviceFlowProviders": ["qwen-oauth"]
}
```
#### Device-flow SSE events
Five typed events (workspace-scoped, fanned out to every active session bus):
- `auth_device_flow_started` `{deviceFlowId, providerId, expiresAt}` — POST succeeded; SDK should subscribe (no userCode here, fetch via GET if needed)
- `auth_device_flow_throttled` `{deviceFlowId, intervalMs}` — daemon honored upstream `slow_down`; clients polling GET should bump their interval to match
- `auth_device_flow_authorized` `{deviceFlowId, providerId, expiresAt?, accountAlias?}` — credentials persisted; `accountAlias` is a non-PII label (never email/phone)
- `auth_device_flow_failed` `{deviceFlowId, errorKind, hint?}` — terminal; `errorKind` is one of `expired_token | access_denied | invalid_grant | upstream_error | persist_failed`. `persist_failed` is daemon-internal: the IdP exchange succeeded but the daemon couldn't durably store credentials (EACCES / EROFS / ENOSPC). The user should retry once the underlying disk condition is fixed.
- `auth_device_flow_cancelled` `{deviceFlowId}` — DELETE succeeded against a pending entry
> **Not MCP-compatible.** The MCP authorization spec (2025-06-18) mandates OAuth 2.1 + PKCE auth-code with a redirect callback, which doesn't work for headless-pod daemons. Mode B's device-flow surface is daemon-private — clients targeting MCP-compliant servers should use a different auth path.
## Streaming wire format
Events are emitted as standard EventSource frames. The daemon writes one `data:` line per frame (the JSON has no embedded newlines after `JSON.stringify`); the SDK parser at `packages/sdk-typescript/src/daemon/sse.ts` handles both that and the spec-allowed multi-`data:` form on the receive side.


@@ -360,6 +360,53 @@ The bridge keeps **one channel per daemon** (one daemon per workspace, per §02)
**Peer agents (Cursor / Continue / Claude Code / OpenCode / Gemini CLI) all do single-process multi-session.** qwen-code matches them at the agent layer; the Stage 1 bridge in this PR makes the same architecture visible over HTTP.
## Logging in to a remote daemon (issue #4175 PR 21)
When the daemon runs on a remote pod (no shared display with you), you can still log in to a Qwen account by triggering an OAuth device flow over HTTP. The daemon polls the IdP itself; your job is just to open a URL on whatever device has a browser.
```bash
# 1. Start a flow. The daemon contacts the IdP, returns a code + URL.
curl -X POST http://127.0.0.1:4170/workspace/auth/device-flow \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"providerId":"qwen-oauth"}'
# → 201 {
# "deviceFlowId": "fa07c61b-…",
# "userCode": "USER-1",
# "verificationUri": "https://chat.qwen.ai/api/v1/oauth2/device",
# "verificationUriComplete": "https://chat.qwen.ai/...?user_code=USER-1",
# "expiresAt": 1700000600000,
# "intervalMs": 5000,
# "attached": false
# }
# 2. Visit the URL on your phone / laptop, enter the user code.
# 3. Poll for completion (or subscribe to SSE for the auth_device_flow_authorized event):
curl http://127.0.0.1:4170/workspace/auth/device-flow/fa07c61b-… \
  -H "Authorization: Bearer $TOKEN"
# → status transitions: pending → authorized
```
The TypeScript SDK wraps both steps into a single helper:
```ts
import { DaemonClient } from '@qwen-code/sdk';
const client = new DaemonClient({ baseUrl, token });
const flow = await client.auth.start({ providerId: 'qwen-oauth' });
console.log(`Open ${flow.verificationUri}\nCode: ${flow.userCode}`);
const abortCtrl = new AbortController(); // lets the caller cancel the wait
const result = await flow.awaitCompletion({ signal: abortCtrl.signal });
// result.status === 'authorized'
```
**The daemon never opens a browser on your behalf.** Even when running locally, the daemon stays passive — it returns the URL and lets the SDK / user choose where to open it. This is intentional: a daemon on a headless pod that called `xdg-open` would silently fail, masking the actual auth surface. Mirror `gh auth login`'s "Press Enter to open browser" UX in your client.
**`--require-auth` and dev convenience.** The device-flow routes use the strict mutation gate (PR 15), which means a token-less loopback default returns `401 token_required`. Locally, the simplest way around this during development is `qwen serve --token=dev-token`; you don't need `--require-auth` unless you're hardening the loopback default.
**Cross-daemon limitation.** `oauth_creds.json` is daemon-shared (`~/.qwen/oauth_creds.json`), so a successful login in daemon A is automatically picked up by daemon B's next token refresh — but daemon B's SDK clients won't receive the `auth_device_flow_authorized` event (events are per-daemon).
**Cross-client take-over.** Two SDK clients on the same daemon that both `POST /workspace/auth/device-flow` for the same provider get the per-provider singleton: the first call starts a fresh IdP request and returns `attached: false`; the second call returns the EXISTING in-flight entry with `attached: true`. The take-over is recorded on the audit trail (under the second client's `X-Qwen-Client-Id`) but does NOT emit a separate event — both clients eventually observe the SAME `auth_device_flow_authorized` once the user finishes the IdP page. If your UI distinguishes "I started this" from "someone else's flow I joined", branch on the `attached` field returned by `start()`.
## What's next
- **Build a client?** See the [DaemonClient TypeScript quickstart](../developers/examples/daemon-client-quickstart.md) and the [HTTP protocol reference](../developers/qwen-serve-protocol.md).


View file

@ -0,0 +1,304 @@
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import {
cacheQwenCredentials,
generatePKCEPair,
isDeviceAuthorizationSuccess,
isDeviceTokenPending,
isDeviceTokenSuccess,
QwenOAuth2Client,
QwenOAuthPollError,
type DeviceTokenPendingData,
type IQwenOAuth2Client,
type QwenCredentials,
} from '@qwen-code/qwen-code-core';
import { writeStderrLine } from '../../utils/stdioHelpers.js';
import {
brandSecret,
unsafeRevealSecret,
UpstreamDeviceFlowError,
type BrandedSecret,
type DeviceFlowErrorKind,
type DeviceFlowPollResult,
type DeviceFlowProvider,
type DeviceFlowProviderId,
type DeviceFlowStartResult,
} from './deviceFlow.js';
const QWEN_OAUTH_SCOPE = 'openid profile email model.completion';
/**
* Maximum length of raw IdP detail written to stderr for operator
* audit. PR #4255 fold-in 6 review thread #5: the raw `err.message`
* from `QwenOAuth2Client` embeds the full upstream response body,
* which on a misbehaving reverse proxy / WAF can be megabytes of
* HTML. Container log-aggregation pipelines (Loki, Fluent Bit,
* Stackdriver) typically truncate or reject lines past 432 KiB,
* meaning the *useful* prefix is lost downstream. Truncate here so
* the kept prefix is the part with the actual IdP error code /
* description, with a `…[+N bytes truncated]` tail so the reader
* knows how much was dropped. N is counted via `String.length`
* (UTF-16 code units), which equals bytes only for ASCII detail.
* 2 KiB is comfortably below every aggregator's per-line
* cap and large enough to retain a structured JSON error envelope.
*/
const STDERR_DETAIL_MAX = 2_048;
function truncateForStderr(detail: string): string {
if (detail.length <= STDERR_DETAIL_MAX) return detail;
const dropped = detail.length - STDERR_DETAIL_MAX;
return `${detail.slice(0, STDERR_DETAIL_MAX)}…[+${dropped} bytes truncated]`;
}
/**
* Qwen-OAuth implementation of `DeviceFlowProvider` for `qwen serve`.
*
* Uses the lower-level `QwenOAuth2Client` primitives (`requestDeviceAuthorization`
* / `pollDeviceToken`) directly rather than the high-level
* `authWithQwenDeviceFlow` because that helper invokes `open(url)` to launch
* a browser on the daemon host. PR 21 design §8 #1 forbids browser-spawning
* from the daemon; only the SDK/user side may decide to open a URL.
*/
export class QwenOAuthDeviceFlowProvider implements DeviceFlowProvider {
readonly providerId: DeviceFlowProviderId = 'qwen-oauth';
private readonly client: IQwenOAuth2Client;
constructor(client?: IQwenOAuth2Client) {
this.client = client ?? new QwenOAuth2Client();
}
async start(opts: { signal: AbortSignal }): Promise<DeviceFlowStartResult> {
const { code_verifier, code_challenge } = generatePKCEPair();
let auth;
try {
// PR #4255 review W1: thread `signal` into the IdP fetch so a
// dispose / cancel during the device-authorization request
// aborts the in-flight socket immediately. Pre-existing CLI
// callers don't pass a signal; the optional second arg keeps
// them compatible.
auth = await this.client.requestDeviceAuthorization(
{
scope: QWEN_OAUTH_SCOPE,
code_challenge,
code_challenge_method: 'S256',
},
{ signal: opts.signal },
);
} catch (err: unknown) {
// Network / parse / non-2xx errors from the Qwen IdP. Wrap so the
// route layer maps to `502 upstream_error` rather than the generic
// `500` fall-through in `sendBridgeError`.
//
// PR #4255 fold-in 3 (#9): the raw `err.message` from the
// QwenOAuth2Client embeds the full IdP response body (which can
// be HTML from a reverse proxy / WAF — hundreds of bytes,
// potentially leaking infrastructure detail). Use a stable
// bounded message for the route response; the original err
// detail goes through stderr audit only via the registry's
// standard error path (qwenOAuth2.ts logs via `debugLogger`
// when needed).
const detail = err instanceof Error ? err.message : String(err);
writeStderrLine(
`[serve] qwen device-flow start failed (raw): ${truncateForStderr(detail)}`,
);
throw new UpstreamDeviceFlowError(
'Qwen IdP device authorization request failed',
);
}
if (opts.signal.aborted) {
throw new UpstreamDeviceFlowError('device-flow start aborted');
}
if (!isDeviceAuthorizationSuccess(auth)) {
// PR #4255 fold-in 3 (#9): same sanitization as the catch above
// — well-formed but unsuccessful IdP responses can carry
// arbitrary `error_description` text that we don't want in the
// SDK-visible 502 hint. Static message; raw envelope to stderr.
const errorData = auth as { error?: string; error_description?: string };
writeStderrLine(
truncateForStderr(
`[serve] qwen device-flow start error envelope (raw): error=${
errorData?.error ?? 'unknown'
} description=${errorData?.error_description ?? '(none)'}`,
),
);
throw new UpstreamDeviceFlowError(
'Qwen IdP rejected the device authorization request',
);
}
return {
deviceCode: brandSecret(auth.device_code),
pkceVerifier: brandSecret(code_verifier),
userCode: auth.user_code,
verificationUri: auth.verification_uri,
verificationUriComplete: auth.verification_uri_complete,
expiresIn: auth.expires_in,
// Qwen IdP doesn't return `interval`; registry falls back to the
// RFC 8628 default (5s) when this is undefined.
};
}
async poll(
state: {
deviceCode: BrandedSecret<string>;
pkceVerifier?: BrandedSecret<string>;
},
opts: { signal: AbortSignal },
): Promise<DeviceFlowPollResult> {
if (!state.pkceVerifier) {
// Qwen *requires* PKCE; missing verifier is a programmer error.
return {
kind: 'error',
errorKind: 'invalid_grant',
hint: 'Qwen device-flow requires a PKCE verifier',
};
}
if (opts.signal.aborted) {
// Caller already gave up. Returning `pending` is the correct
// semantic — the registry's post-await guard will see entry.status
// !== 'pending' and skip emit/audit.
return { kind: 'pending' };
}
let response: Awaited<ReturnType<IQwenOAuth2Client['pollDeviceToken']>>;
try {
// Pass `signal` through to the IdP fetch so cancel / dispose
// during a slow upstream response aborts the in-flight socket
// immediately instead of waiting for the IdP's own timeout.
// The post-await abort check is still useful: an early cancel
// can land before fetch even starts, in which case the abort
// throws synchronously into our catch block below.
response = await this.client.pollDeviceToken(
{
device_code: unsafeRevealSecret(state.deviceCode),
code_verifier: unsafeRevealSecret(state.pkceVerifier),
},
{ signal: opts.signal },
);
} catch (err: unknown) {
// The class throws on non-OAuth error responses (network, malformed
// upstream payloads) and on RFC 8628 terminal errors that aren't
// `authorization_pending` or `slow_down`. Map RFC 8628 errors to
// structured terminal results; everything else is `upstream_error`.
// PR #4255 review S2: do NOT echo the raw thrown message into
// `hint` — `qwenOAuth2.ts` embeds the entire IdP responseText
// (which can be an HTML error page from a reverse proxy / WAF
// running into hundreds of bytes) into the message, and that
// would flow through `publishWorkspaceEvent` to every SSE
// subscriber. Use a stable bounded summary; full detail goes
// through the registry's stderr audit only.
//
// PR #4255 fold-in 5 (#4): branch on `instanceof
// QwenOAuthPollError` and read the structured `oauthError`
// field instead of substring-matching the message text. The
// earlier regex was a fragile cross-file string contract that
// would silently degrade to `upstream_error` if `qwenOAuth2.ts`
// ever changed its message format. The typed class makes the
// contract explicit + tsc-checkable.
const errorKind: DeviceFlowErrorKind =
err instanceof QwenOAuthPollError
? mapRfc8628OAuthCode(err.oauthError)
: 'upstream_error';
return {
kind: 'error',
errorKind,
hint:
errorKind === 'upstream_error'
? 'unexpected response from identity provider'
: `Qwen IdP returned ${errorKind}`,
};
}
if (isDeviceTokenSuccess(response)) {
const tokenData = response;
const credentials: QwenCredentials = {
access_token: tokenData.access_token!,
refresh_token: tokenData.refresh_token ?? undefined,
token_type: tokenData.token_type,
resource_url: tokenData.resource_url,
expiry_date: tokenData.expires_in
? Date.now() + tokenData.expires_in * 1000
: undefined,
};
const expiresAt = credentials.expiry_date;
const client = this.client;
return {
kind: 'success',
// PR #4255 review C3 + fold-in 3 (#10): `persist({signal})`
// is now threaded end-to-end. The registry passes its
// per-entry `cancelController.signal`; we forward it to
// `cacheQwenCredentials({signal})` which forwards to
// `fs.writeFile(..., {signal})`. A wedged disk write aborts
// immediately when `cancel()` / `dispose()` / the
// 30s `DEVICE_FLOW_PERSIST_TIMEOUT_MS` fires, instead of
// hanging until the OS-level timeout.
async persist(persistOpts: { signal: AbortSignal }) {
// Order matters: write to disk FIRST. If `cacheQwenCredentials`
// throws (EACCES, EROFS, ENOSPC) we MUST NOT update the
// in-process client — otherwise the daemon enters a zombie
// state where this session "remembers" the token but a
// restart loses it.
await cacheQwenCredentials(credentials, {
signal: persistOpts.signal,
});
try {
client.setCredentials(credentials);
} catch {
// ignore — disk file is the durable record; in-process
// refresh happens on next SharedTokenManager mtime poll
}
// PR #4255 review W3: `accountAlias` USED to be wired
// through events / reducer / audit but the Qwen IdP token
// response doesn't carry one (see DeviceTokenData shape in
// `qwenOAuth2.ts:152-160` — no `name` / `email` / `sub`
// field). Returning only `{expiresAt}` makes the field
// type-honestly absent rather than always-undefined. A
// future provider whose token response carries an alias
// can populate it; the type stays optional.
return { expiresAt };
},
// PR #4255 fold-in 3: `unpersist` was removed in favor of
// honoring the IdP's already-completed approval over a
// microsecond cancel/dispose race. See registry success
// branch for the rationale + audit hint.
};
}
if (isDeviceTokenPending(response)) {
const pending = response as DeviceTokenPendingData;
return pending.slowDown ? { kind: 'slow_down' } : { kind: 'pending' };
}
// The `QwenOAuth2Client.pollDeviceToken` implementation in
// `qwenOAuth2.ts:386-393` THROWS on every non-pending non-success
// response (it never returns a structured error envelope from the
// success path). So this fall-through is reached only if a future
// refactor changes that contract. Map defensively to
// `upstream_error` with a bounded hint (PR #4255 review S2 — never
// forward the raw IdP response body to SDK clients).
return {
kind: 'error',
errorKind: 'upstream_error',
hint: 'unexpected response from identity provider',
};
}
}
/**
* Map a structured RFC 8628 OAuth error code (from
* `QwenOAuthPollError.oauthError`) to the registry's
* `DeviceFlowErrorKind` taxonomy. Unknown / missing codes fall
* through to `upstream_error`. PR #4255 fold-in 5 (#4) replaced the
* earlier substring-regex match against the message text, which was
* an implicit string contract with `qwenOAuth2.ts` that would
* silently degrade if the message format changed.
*/
function mapRfc8628OAuthCode(code: string | undefined): DeviceFlowErrorKind {
switch (code) {
case 'expired_token':
return 'expired_token';
case 'access_denied':
return 'access_denied';
case 'invalid_grant':
return 'invalid_grant';
default:
return 'upstream_error';
}
}
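The `pending` / `slow_down` results above feed the poll cadence. RFC 8628 specifies a 5-second default when the IdP sends no `interval`, and says the interval must grow by 5 seconds on every `slow_down`. The sketch below shows that rule in isolation. It is not the registry's scheduling code, which is not part of this file, and `nextPollIntervalMs` is a hypothetical name.

```ts
type NonTerminalPollKind = 'pending' | 'slow_down';

const RFC8628_DEFAULT_INTERVAL_MS = 5_000;
const RFC8628_SLOW_DOWN_STEP_MS = 5_000;

// Returns the delay to wait before the next poll.
// `currentMs` is undefined when the IdP returned no `interval`.
function nextPollIntervalMs(
  currentMs: number | undefined,
  kind: NonTerminalPollKind,
): number {
  const base = currentMs ?? RFC8628_DEFAULT_INTERVAL_MS;
  // `slow_down` applies to this and all subsequent requests, so the
  // caller should store the returned value as the new current interval.
  return kind === 'slow_down' ? base + RFC8628_SLOW_DOWN_STEP_MS : base;
}

console.log(nextPollIntervalMs(undefined, 'slow_down'));
```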

View file

@ -117,6 +117,15 @@ export const SERVE_CAPABILITY_REGISTRY = {
// defaults (no flag) omit the tag, preserving the bit-for-bit shape
// older clients expect.
require_auth: { since: 'v1' },
// Issue #4175 PR 21. Daemon exposes the device-flow auth surface
// (`POST /workspace/auth/device-flow`, GET/DELETE on `/:id`, and
// `GET /workspace/auth/status`). Advertised UNCONDITIONALLY: the
// routes themselves return `400 unsupported_provider` if the daemon
// can't satisfy a specific provider, so clients always probe via the
// route. The list of supported providers is surfaced through the
// status route (extension data on `/capabilities` would inflate the
// descriptor shape; we keep the registry uniform).
auth_device_flow: { since: 'v1' },
} as const satisfies Record<string, ServeCapabilityDescriptor>;
export type ServeFeature = keyof typeof SERVE_CAPABILITY_REGISTRY;
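Because the tag is advertised unconditionally, a client needs both the capability tag and the status route's provider list before it can assume a provider is usable. A minimal sketch of that two-step check follows. The body shapes are hypothetical subsets: `features` and `supportedDeviceFlowProviders` are the field names the integration tests assert on, and `canStartDeviceFlow` is an illustrative helper, not an SDK function.

```ts
// Hypothetical subsets of the `/capabilities` and
// `/workspace/auth/status` response bodies.
interface CapabilitiesBody {
  features: string[];
}
interface AuthStatusBody {
  supportedDeviceFlowProviders: string[];
}

// True only when the daemon exposes the surface AND lists the provider.
function canStartDeviceFlow(
  caps: CapabilitiesBody,
  status: AuthStatusBody,
  providerId: string,
): boolean {
  return (
    caps.features.indexOf('auth_device_flow') !== -1 &&
    status.supportedDeviceFlowProviders.indexOf(providerId) !== -1
  );
}

console.log(
  canStartDeviceFlow(
    { features: ['auth_device_flow'] },
    { supportedDeviceFlowProviders: ['qwen-oauth'] },
    'qwen-oauth',
  ),
);
```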

View file

@ -487,6 +487,24 @@ export interface HttpAcpBridge {
/** Close all live child processes; called on daemon shutdown. */
shutdown(): Promise<void>;
/**
* Issue #4175 PR 21: best-effort fan-out of a workspace-scoped event
* (no `sessionId`) to every live session bus. Used by routes that
* make workspace-level state changes (e.g. device-flow auth) so SSE
* subscribers attached to any session learn about the change.
*
* **Best-effort semantics:** swallowed bus failures (closed bus,
* subscriber overflow) do NOT throw. Workspace events are
* authoritative via the GET routes; SSE is the convenience path.
*
* Removed in PR #4255 fold-in 9: PR 16 (#4249) landed
* `publishWorkspaceEvent` with identical fan-out semantics; the
* closed-bus + all-failed-stderr operator-visibility features
* that PR 21 added here have been folded INTO
* `publishWorkspaceEvent`. Use that helper for all workspace-
* scoped fan-outs (memory, agents, auth device-flow, future).
*/
}
/**
@ -3264,8 +3282,7 @@ export function createHttpAcpBridge(opts: BridgeOptions): HttpAcpBridge {
// (mid-shutdown, or evicted under load) is silently skipped, same
// posture as `permission_resolved` at line 1717.
//
-// We deliberately do NOT track delivery success per session here:
-// the route handler's contract is "read-after-write" and any SSE
+// The route handler's contract is "read-after-write" and any SSE
// subscriber that misses the event can re-fetch via the route's
// GET sibling. Stage 5 PR 24 PermissionMediator can layer a
// proper workspace event bus on top if adapters need stricter
@ -3280,10 +3297,32 @@ export function createHttpAcpBridge(opts: BridgeOptions): HttpAcpBridge {
// route layer (200 OK) while SSE subscribers stop seeing
// events. The shutdown gate keeps the common race noise out of
// the production log without hiding actual bugs.
-for (const entry of byId.values()) {
//
// PR #4255 fold-in 9: track per-session success/fail. A
// closed-bus return (`undefined` from `EventBus.publish`;
// see eventBus.ts:195-207) counts as a failure (operator
// signal), distinct from a thrown exception (regression
// signal). When at least one session is active and every bus
// dropped the event (and the daemon is not shutting down), we
// elevate to unconditional stderr so monitoring catches the
// all-buses-dropped scenario. Inherited from the (now removed)
// `broadcastWorkspaceEvent` that PR 21 added; PR 16's helper is
// now the single fan-out.
const sessions = Array.from(byId.values());
let successCount = 0;
let failureCount = 0;
for (const entry of sessions) {
try {
-entry.events.publish(event);
+const published = entry.events.publish(event);
if (published === undefined) {
failureCount += 1;
writeServeDebugLine(
`publishWorkspaceEvent: publish on session ${entry.sessionId} no-op (bus closed)`,
);
} else {
successCount += 1;
}
} catch (err) {
failureCount += 1;
const detail =
`publishWorkspaceEvent: bus publish failed for session ` +
`${JSON.stringify(entry.sessionId)} (type=${event.type}): ` +
@ -3295,6 +3334,11 @@ export function createHttpAcpBridge(opts: BridgeOptions): HttpAcpBridge {
}
}
}
if (sessions.length > 0 && successCount === 0 && !shuttingDown) {
writeStderrLine(
`qwen serve: publishWorkspaceEvent type=${event.type} dropped on ALL ${failureCount} session bus(es); SSE subscribers will miss this event (GET fallback still authoritative)`,
);
}
},
knownClientIds() {
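The counting rule in the hunk above can be read in isolation as the sketch below. It is a simplified stand-in, not the bridge code: `Bus`, `FanOutResult`, and `fanOut` are hypothetical names, and the shutdown gate and logging calls are left out.

```ts
// Hypothetical bus: returns `undefined` when closed, may also throw.
interface Bus {
  publish(event: unknown): unknown;
}

interface FanOutResult {
  successCount: number;
  failureCount: number;
  allDropped: boolean;
}

// Publishes to every bus, counting closed-bus no-ops and thrown
// errors as failures. `allDropped` is true only when at least one
// bus exists and none of them accepted the event.
function fanOut(buses: Bus[], event: unknown): FanOutResult {
  let successCount = 0;
  let failureCount = 0;
  for (const bus of buses) {
    try {
      if (bus.publish(event) === undefined) {
        failureCount += 1;
      } else {
        successCount += 1;
      }
    } catch {
      failureCount += 1;
    }
  }
  return {
    successCount,
    failureCount,
    allDropped: buses.length > 0 && successCount === 0,
  };
}

const closedBus: Bus = { publish: () => undefined };
console.log(fanOut([closedBus], { type: 'demo' }));
```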

View file

@ -9,6 +9,7 @@ import { type Server } from 'node:http';
import * as path from 'node:path';
import { writeStderrLine, writeStdoutLine } from '../utils/stdioHelpers.js';
import type { BridgeEvent } from './eventBus.js';
import { getDeviceFlowRegistry } from './auth/deviceFlow.js';
import {
canonicalizeWorkspace,
createHttpAcpBridge,
@ -320,6 +321,16 @@ export async function runQwenServe(
boundWorkspace,
fsFactory,
});
// Issue #4175 PR 21 — `createServeApp` parks the device-flow registry
// on `app.locals` when it constructs (or accepts) one. Pull it back
// out so the close hook can dispose it before `bridge.shutdown()`,
// ensuring polling timers + cancel controllers are torn down BEFORE
// we tell agent children to exit (otherwise a stuck IdP fetch could
// pin the drain). `unref()`'d timers mean the process WILL exit
// either way; explicit dispose is for cleanliness + audit
// visibility. Typed accessor (fold-in 4 review thread D) prevents
// a key-name typo from silently nulling out the dispose path.
const deviceFlowRegistry = getDeviceFlowRegistry(app);
// Node's `app.listen()` wants the unbracketed IPv6 literal (`::1`) but
// operators conventionally type `[::1]` (or copy/paste from URLs that
@ -542,6 +553,21 @@ export async function runQwenServe(
else res();
};
// PR 21: dispose the device-flow registry FIRST so any
// in-flight IdP poll is cancelled and timers are cleared
// before the bridge tear-down (which would otherwise race
// with the still-polling registry on shared HTTP agents).
if (deviceFlowRegistry) {
try {
deviceFlowRegistry.dispose();
} catch (err) {
writeStderrLine(
`qwen serve: device-flow registry dispose error: ${
err instanceof Error ? err.message : String(err)
}`,
);
}
}
bridge
.shutdown()
.catch((err) => {

View file

@ -114,17 +114,21 @@ const EXPECTED_STAGE1_FEATURES = [
// Issue #4175 PR 19. Always-on. Daemon exposes the read-only file
// surface: `GET /file`, `GET /list`, `GET /glob`, `GET /stat`.
'workspace_file_read',
// Issue #4175 PR 21 — auth device-flow surface advertised unconditionally.
'auth_device_flow',
] as const;
// Issue #4175 PR 15. `require_auth` is registered but conditionally
// advertised (only when `--require-auth` is set), so the registry list
-// is a strict superset of the always-on list. Kept as a separate
-// constant rather than appended to `EXPECTED_STAGE1_FEATURES` so the
-// existing "advertised features" assertions stay tight against
-// surprise additions.
+// is a strict superset of the always-on list. The registry's source-of-
+// truth ORDER puts `require_auth` between PR 11 (`session_metadata`)
+// and PR 21 (`auth_device_flow`); reflect that here so the assertion
+// matches the real ordering.
const EXPECTED_REGISTERED_FEATURES = [
-...EXPECTED_STAGE1_FEATURES,
+// Same order as `SERVE_CAPABILITY_REGISTRY` declaration:
+...EXPECTED_STAGE1_FEATURES.filter((f) => f !== 'auth_device_flow'),
'require_auth',
'auth_device_flow',
] as const;
interface FakeBridgeOpts {
@ -4022,3 +4026,399 @@ describe('createServeApp ServeAppDeps.fsFactory wiring (#4175 PR 18)', () => {
}
});
});
// -- Issue #4175 PR 21 — auth device-flow integration tests ----------------
describe('auth device-flow routes', () => {
// Build a fake provider whose `start` returns deterministic values and
// whose `poll` is scripted per-test. Lives at the top of the suite so
// every `it()` can compose it with the registry.
function makeFakeProvider(): {
provider: import('./auth/deviceFlow.js').DeviceFlowProvider;
startCount: () => number;
} {
let starts = 0;
return {
provider: {
providerId: 'qwen-oauth' as const,
async start() {
starts += 1;
return {
deviceCode:
// Use the brandSecret helper so the secret follows the same
// redaction shape the production provider produces.
(await import('./auth/deviceFlow.js')).brandSecret(
`device-${starts}`,
),
pkceVerifier: (await import('./auth/deviceFlow.js')).brandSecret(
`pkce-${starts}`,
),
userCode: `USER-${starts}`,
verificationUri: 'https://idp.example/verify',
verificationUriComplete: 'https://idp.example/verify?u=AB12',
expiresIn: 600,
};
},
async poll(_state: unknown, _opts: { signal: AbortSignal }) {
// Stays pending forever — tests don't need the upstream to
// succeed for the route-layer assertions to be meaningful.
return { kind: 'pending' as const };
},
},
startCount: () => starts,
};
}
function buildApp(
overrides: Partial<ServeOptions> = {},
fakeProvider = makeFakeProvider(),
) {
const bridge = fakeBridge();
const app = createServeApp({ ...baseOpts, ...overrides }, undefined, {
bridge,
deviceFlowProviders: [fakeProvider.provider],
});
return { app, bridge, fakeProvider };
}
it('POST /workspace/auth/device-flow returns 201 on fresh start with redacted body', async () => {
const { app, fakeProvider } = buildApp({ token: 'tkn' });
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(res.status).toBe(201);
expect(res.body.providerId).toBe('qwen-oauth');
expect(res.body.userCode).toBe('USER-1');
expect(res.body.attached).toBe(false);
expect(typeof res.body.deviceFlowId).toBe('string');
// Critical: response body never contains device_code / pkce_verifier.
const json = JSON.stringify(res.body);
expect(json).not.toContain('device-1');
expect(json).not.toContain('pkce-1');
expect(fakeProvider.startCount()).toBe(1);
});
it('POST is rejected with 401 token_required on token-less loopback (strict gate)', async () => {
const { app } = buildApp({ token: undefined });
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(res.status).toBe(401);
expect(res.body.code).toBe('token_required');
});
it('POST with unknown providerId returns 400 unsupported_provider', async () => {
const { app } = buildApp({ token: 'tkn' });
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'totally-fake' });
expect(res.status).toBe(400);
expect(res.body.code).toBe('unsupported_provider');
expect(res.body.supportedProviders).toContain('qwen-oauth');
});
it('POST is idempotent take-over for the same providerId — second POST returns 200 + attached:true', async () => {
const { app, fakeProvider } = buildApp({ token: 'tkn' });
const first = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(first.status).toBe(201);
const second = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(second.status).toBe(200);
expect(second.body.attached).toBe(true);
expect(second.body.deviceFlowId).toBe(first.body.deviceFlowId);
// Critical: provider.start is NOT called twice — the take-over is
// a daemon-internal operation, not a re-auth round trip.
expect(fakeProvider.startCount()).toBe(1);
});
it('GET /workspace/auth/device-flow/:id returns 200 for known + 404 for unknown', async () => {
const { app } = buildApp({ token: 'tkn' });
const post = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
const id = post.body.deviceFlowId as string;
const ok = await request(app)
.get(`/workspace/auth/device-flow/${id}`)
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(ok.status).toBe(200);
expect(ok.body.deviceFlowId).toBe(id);
expect(ok.body.status).toBe('pending');
const missing = await request(app)
.get('/workspace/auth/device-flow/nonexistent-id')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(missing.status).toBe(404);
expect(missing.body.code).toBe('device_flow_not_found');
});
it('DELETE on pending → 204; idempotent on already-cancelled → 204; unknown → 404', async () => {
const { app } = buildApp({ token: 'tkn' });
const post = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
const id = post.body.deviceFlowId as string;
const first = await request(app)
.delete(`/workspace/auth/device-flow/${id}`)
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(first.status).toBe(204);
const second = await request(app)
.delete(`/workspace/auth/device-flow/${id}`)
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
// Idempotent: terminal entries return 204 no-op.
expect(second.status).toBe(204);
const missing = await request(app)
.delete('/workspace/auth/device-flow/nonexistent-id')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(missing.status).toBe(404);
});
it('GET /workspace/auth/status surfaces pending flows and supported providers', async () => {
const { app } = buildApp({ token: 'tkn' });
const start = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
const id = start.body.deviceFlowId as string;
const status = await request(app)
.get('/workspace/auth/status')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(status.status).toBe(200);
expect(status.body.v).toBe(1);
expect(status.body.supportedDeviceFlowProviders).toContain('qwen-oauth');
expect(status.body.pendingDeviceFlows).toHaveLength(1);
expect(status.body.pendingDeviceFlows[0].deviceFlowId).toBe(id);
// Status payload MUST NOT echo userCode/verificationUri.
const json = JSON.stringify(status.body);
expect(json).not.toContain('USER-1');
expect(json).not.toContain('idp.example');
});
it('capability tag auth_device_flow is advertised unconditionally', async () => {
const { app } = buildApp({ token: 'tkn' });
const res = await request(app)
.get('/capabilities')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(res.status).toBe(200);
expect(res.body.features).toContain('auth_device_flow');
});
it('upstream provider.start failure → 502 upstream_error, not 500', async () => {
// PR 21 fold-in 0 P1-14: provider throwing UpstreamDeviceFlowError
// must surface as 502 with code:'upstream_error' instead of falling
// through `sendBridgeError`'s generic 500 path. Build a fake
// provider whose start always throws.
const { UpstreamDeviceFlowError } = await import('./auth/deviceFlow.js');
const failingProvider: import('./auth/deviceFlow.js').DeviceFlowProvider = {
providerId: 'qwen-oauth',
async start() {
throw new UpstreamDeviceFlowError('mocked upstream outage');
},
async poll() {
return { kind: 'pending' as const };
},
};
const bridge = fakeBridge();
const app = createServeApp({ ...baseOpts, token: 'tkn' }, undefined, {
bridge,
deviceFlowProviders: [failingProvider],
});
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(res.status).toBe(502);
expect(res.body.code).toBe('upstream_error');
expect(res.body.error).toContain('mocked upstream outage');
});
it('sweeper-driven auto-expiry transitions a stale entry to status:expired and surfaces over GET', async () => {
// PR 21 fold-in 0 P1-13: cover the time-based expiry path via an
// injected registry with a controlled clock + manual sweeper trigger.
const { DeviceFlowRegistry, brandSecret } = await import(
'./auth/deviceFlow.js'
);
const fakeProvider: import('./auth/deviceFlow.js').DeviceFlowProvider = {
providerId: 'qwen-oauth',
async start() {
return {
deviceCode: brandSecret('device-1'),
pkceVerifier: brandSecret('pkce-1'),
userCode: 'USER-1',
verificationUri: 'https://idp.example/verify',
expiresIn: 60, // 60 seconds
};
},
async poll() {
// Stays pending; the sweeper drives terminal state via expiresAt.
return { kind: 'pending' as const };
},
};
let now = 1_700_000_000_000;
const intervalsRegistered: Array<{ cb: () => void }> = [];
const registry = new DeviceFlowRegistry({
events: { publish: () => {} },
resolveProvider: (id) => (id === 'qwen-oauth' ? fakeProvider : undefined),
now: () => now,
// Run polls forever-deferred; sweeper interval is what we drive.
schedule: (_ms, _cb) => ({ cancelled: false }) as never,
clearScheduled: () => {},
scheduleInterval: (_ms, cb) => {
const handle = { cb, cancelled: false };
intervalsRegistered.push(handle);
return handle as never;
},
clearScheduledInterval: () => {},
});
const bridge = fakeBridge();
const app = createServeApp({ ...baseOpts, token: 'tkn' }, undefined, {
bridge,
deviceFlowRegistry: registry,
});
const startRes = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(startRes.status).toBe(201);
const id = startRes.body.deviceFlowId as string;
// Drive the clock past expiresAt and trigger the sweeper.
now += 61_000;
for (const interval of intervalsRegistered) interval.cb();
const stateRes = await request(app)
.get(`/workspace/auth/device-flow/${id}`)
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(stateRes.status).toBe(200);
// Time-based expiry transitions to status='expired' with errorKind='expired_token'.
expect(stateRes.body.status).toBe('expired');
expect(stateRes.body.errorKind).toBe('expired_token');
registry.dispose();
});
// PR #4255 fold-in 10 #4 — HTTP route contract coverage. Round-8
// wenshao thread `Cvx93` flagged that the existing 4 it()'s
// covered the happy paths but missed the malformed-input,
// resource-cap, and strict-bearer error envelopes that SDK
// consumers depend on for retry / surface routing. Each case
// here is a supertest one-liner asserting status code + `code:`
// discriminator.
it('POST with missing providerId returns 400 invalid_request', async () => {
// PR 21 fold-in W2 split the 400 envelope into `invalid_request`
// (caller-shape error: missing/non-string body field) vs
// `unsupported_provider` (well-shaped but the providerId isn't
// in the supported tuple). This pins that split.
const { app } = buildApp({ token: 'tkn' });
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({}); // no providerId at all
expect(res.status).toBe(400);
expect(res.body.code).toBe('invalid_request');
expect(res.body.error).toContain('providerId');
});
it('POST with non-string providerId returns 400 invalid_request', async () => {
const { app } = buildApp({ token: 'tkn' });
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 42 });
expect(res.status).toBe(400);
expect(res.body.code).toBe('invalid_request');
});
it('POST returns 409 too_many_active_flows when registry cap is reached', async () => {
// Inject a fake registry whose `start` always throws the cap error.
const { TooManyActiveDeviceFlowsError } = await import(
'./auth/deviceFlow.js'
);
const fakeRegistry = {
start: async () => {
throw new TooManyActiveDeviceFlowsError();
},
get: () => undefined,
cancel: () => undefined,
listPending: () => [],
dispose: () => {},
} as unknown as import('./auth/deviceFlow.js').DeviceFlowRegistry;
const bridge = fakeBridge();
const app = createServeApp({ ...baseOpts, token: 'tkn' }, undefined, {
bridge,
deviceFlowRegistry: fakeRegistry,
});
const res = await request(app)
.post('/workspace/auth/device-flow')
.set('Authorization', 'Bearer tkn')
.set('Host', `127.0.0.1:${baseOpts.port}`)
.send({ providerId: 'qwen-oauth' });
expect(res.status).toBe(409);
expect(res.body.code).toBe('too_many_active_flows');
});
it('DELETE without bearer is rejected 401 token_required (strict-mutation gate)', async () => {
const { app } = buildApp({ token: undefined });
const res = await request(app)
.delete('/workspace/auth/device-flow/some-id')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(res.status).toBe(401);
expect(res.body.code).toBe('token_required');
});
it('GET /workspace/auth/device-flow/:id is strict-gated; GET /workspace/auth/status is read-only', async () => {
// The two GETs have ASYMMETRIC auth posture by design:
// - `GET /workspace/auth/device-flow/:id` returns `userCode` for
// pending entries, which is shoulder-surf-able if a peer process
// on the same host can read it. fold-in (round-4 #1) added
// `mutate({strict:true})` to close the info-disclosure
// asymmetry vs. the strict POST/DELETE.
// - `GET /workspace/auth/status` intentionally redacts userCode
// (lists only deviceFlowId/providerId/expiresAt) so it stays
// bearer-only (passthrough on loopback no-token default).
const { app } = buildApp({ token: undefined });
const flowGet = await request(app)
.get('/workspace/auth/device-flow/no-such-id')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(flowGet.status).toBe(401);
expect(flowGet.body.code).toBe('token_required');
// Status, by contrast, is reachable on loopback without a token.
const status = await request(app)
.get('/workspace/auth/status')
.set('Host', `127.0.0.1:${baseOpts.port}`);
expect(status.status).toBe(200);
});
});
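The two 400 tests above pin a split between a malformed request and a well-formed request naming an unknown provider. A minimal sketch of that classification as a pure function follows; `classifyProviderId` and `ProviderCheck` are hypothetical names used only for illustration, not helpers from this PR.

```typescript
// Hypothetical helper: mirrors the invalid_request / unsupported_provider
// split the tests above pin. Not part of the PR's actual code.
type ProviderCheck =
  | { ok: true; providerId: string }
  | { ok: false; status: 400; code: 'invalid_request' | 'unsupported_provider' };

function classifyProviderId(
  raw: unknown,
  supported: ReadonlySet<string>,
): ProviderCheck {
  // Shape error: the field is missing, not a string, or empty.
  if (typeof raw !== 'string' || raw.length === 0) {
    return { ok: false, status: 400, code: 'invalid_request' };
  }
  // Well-shaped, but the value is not in the runtime provider set.
  if (!supported.has(raw)) {
    return { ok: false, status: 400, code: 'unsupported_provider' };
  }
  return { ok: true, providerId: raw };
}
```

Keeping the check free of Express types makes the split unit-testable without standing up the app.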

@ -14,6 +14,18 @@ import {
denyBrowserOriginCors,
hostAllowlist,
} from './auth.js';
import {
DeviceFlowRegistry,
setDeviceFlowRegistry,
TooManyActiveDeviceFlowsError,
UnsupportedDeviceFlowProviderError,
UpstreamDeviceFlowError,
type DeviceFlowEventSink,
type DeviceFlowProvider,
type DeviceFlowProviderId,
type DeviceFlowPublicView,
} from './auth/deviceFlow.js';
import { QwenOAuthDeviceFlowProvider } from './auth/qwenDeviceFlowProvider.js';
import { isLoopbackBind } from './loopbackBinds.js';
import {
canonicalizeWorkspace,
@ -119,6 +131,22 @@ export interface ServeAppDeps {
* per-session EventBus.
*/
fsFactory?: WorkspaceFileSystemFactory;
/**
* Issue #4175 PR 21 device-flow auth registry. Tests inject a fake
* (`now` / `schedule` overrides for deterministic timer control,
* stubbed providers, captured event sink). Production callers omit
* this and `createServeApp` constructs a default wired to the
* shipped Qwen provider, the bridge's `publishWorkspaceEvent`,
* and a stderr audit sink.
*/
deviceFlowRegistry?: DeviceFlowRegistry;
/**
* Issue #4175 PR 21 extra device-flow providers for tests / future
* extensions. Production builds register only `QwenOAuthDeviceFlowProvider`;
* passing extra entries here registers them in addition to the default
* Qwen provider. Used by tests that stub the OAuth flow.
*/
deviceFlowProviders?: DeviceFlowProvider[];
}
/**
@ -276,6 +304,90 @@ export function createServeApp(
// and the bridge enforces — keeping every layer in agreement.
(app.locals as { boundWorkspace?: string }).boundWorkspace = boundWorkspace;
// Issue #4175 PR 21 — wire the device-flow registry. Default builds
// a single Qwen provider; tests inject `deps.deviceFlowRegistry`
// wholesale (with controlled clock/scheduler) or
// `deps.deviceFlowProviders` to stub the OAuth client only.
const deviceFlowProviderMap = new Map<
DeviceFlowProviderId,
DeviceFlowProvider
>();
for (const provider of deps.deviceFlowProviders ?? []) {
deviceFlowProviderMap.set(provider.providerId, provider);
}
if (!deviceFlowProviderMap.has('qwen-oauth')) {
deviceFlowProviderMap.set('qwen-oauth', new QwenOAuthDeviceFlowProvider());
}
const deviceFlowEventSink: DeviceFlowEventSink = {
publish(emission, originatorClientId) {
// PR #4255 fold-in 9: PR 16 (#4249) landed
// `publishWorkspaceEvent` with the same fan-out semantics as
// PR 21's `broadcastWorkspaceEvent`. The closed-bus +
// all-failed-stderr operator-visibility features that PR 21
// added have been folded INTO `publishWorkspaceEvent`; PR 21
// now uses the canonical helper.
bridge.publishWorkspaceEvent({
type: `auth_device_flow_${emission.type}`,
data: emission.data,
...(originatorClientId ? { originatorClientId } : {}),
});
},
};
const deviceFlowRegistry =
deps.deviceFlowRegistry ??
new DeviceFlowRegistry({
events: deviceFlowEventSink,
audit: {
record(line) {
// Structured stderr breadcrumb; deviceFlowId truncated to first
// 8 chars (mirrors PR 16 audit-event-stamp shape) so log
// skimmers can follow a flow without retaining full uuids.
const id = line.deviceFlowId.slice(0, 8);
const parts = [
`[serve] auth.device-flow:`,
`provider=${line.providerId}`,
`deviceFlowId=${id}...`,
line.clientId ? `clientId=${line.clientId}` : 'clientId=-',
`status=${line.status}`,
];
if (line.errorKind) parts.push(`errorKind=${line.errorKind}`);
if (line.expiresInMs !== undefined) {
parts.push(`expiresInMs=${Math.max(0, line.expiresInMs)}`);
}
// PR #4255 round-12 #7 (gpt-5.5 review CzSpd): include
// `line.hint` in the production stderr line. The
// registry uses the hint slot for operator-only
// breadcrumbs that aren't surfaced over SSE: the static
// catch-all hint "provider.poll() threw (raw): ..."
// (round-8 #1), `lost_success_after_timeout` (round-8
// #7's split-brain detector), `persist_also_failed_past_expiry`
// (round-8 #13), `take-over` audit on per-provider
// singleton, and `deferred (persist in flight; ...)` on
// cancel-during-persist. Without echoing here, the
// documented troubleshooting trail is invisible in
// production. Bound at 1 KiB so a misbehaving caller
// can't spam stderr.
if (line.hint) {
const STDERR_HINT_MAX = 1_024;
const hint =
line.hint.length > STDERR_HINT_MAX
? `${line.hint.slice(0, STDERR_HINT_MAX)}…[+${line.hint.length - STDERR_HINT_MAX} bytes truncated]`
: line.hint;
// Quote the hint so multi-word values stay parseable.
parts.push(`hint=${JSON.stringify(hint)}`);
}
writeStderrLine(parts.join(' '));
},
},
resolveProvider: (providerId) => deviceFlowProviderMap.get(providerId),
});
// Park the registry on `app.locals` so request handlers can reach it
// without closure capture (and so future helper extracts can find it
// without threading it through their args). Typed accessor (fold-in 4
// review thread D) prevents a string-key typo from silently
// detaching `runQwenServe`'s shutdown dispose call.
setDeviceFlowRegistry(app, deviceFlowRegistry);
// Order matters: rejection guards (CORS / Host allowlist / bearer auth)
// run BEFORE the JSON body parser. Otherwise an unauthenticated POST
// gets a full 10MB `JSON.parse` before the 401 fires — a trivially
@ -508,6 +620,170 @@ export function createServeApp(
parseClientId: parseClientIdHeader,
});
// -- Issue #4175 PR 21 — auth device-flow routes ------------------------
app.post(
'/workspace/auth/device-flow',
mutate({ strict: true }),
async (req, res) => {
const body = safeBody(req);
const providerIdRaw = body['providerId'];
// PR #4255 review W2: split `invalid_request` (request shape is
// wrong — missing/non-string field) from `unsupported_provider`
// (the field is well-formed but its value isn't in the
// daemon's known set). Conflating the two surfaced misleading
// remediation hints to SDK consumers branching on `code`
// ("this provider isn't supported here" when the actual cause
// was a serializer dropping the field).
if (typeof providerIdRaw !== 'string' || providerIdRaw.length === 0) {
res.status(400).json({
error: '`providerId` must be a non-empty string',
code: 'invalid_request',
});
return;
}
// PR #4255 round-12 #3 (gpt-5.5 review CzSpe): validate
// against the runtime provider map, not the static
// `DEVICE_FLOW_SUPPORTED_PROVIDERS` tuple. The static tuple
// is the SDK-facing default; `deps.deviceFlowProviders` is
// the documented extension hook for tests / future
// providers. Hardcoding the static tuple here meant
// injected providers were rejected at the route while still
// being registered in `deviceFlowProviderMap` — easy to
// break when adding a second provider.
if (!deviceFlowProviderMap.has(providerIdRaw as DeviceFlowProviderId)) {
res.status(400).json({
error: `Unsupported device-flow provider: ${providerIdRaw}`,
code: 'unsupported_provider',
supportedProviders: Array.from(deviceFlowProviderMap.keys()),
});
return;
}
const providerId = providerIdRaw as DeviceFlowProviderId;
const clientId = parseClientIdHeader(req, res);
if (clientId === null) return;
try {
const { view, attached } = await deviceFlowRegistry.start({
providerId,
...(clientId !== undefined ? { initiatorClientId: clientId } : {}),
});
// Idempotent take-over → 200 with `attached: true`. Fresh start →
// 201 + `attached: false`. The registry is the source of truth on
// which branch fired (it's the one that decided not to call
// `provider.start()` again).
res
.status(attached ? 200 : 201)
.json(toDeviceFlowStartResponseBody(view, attached, clientId));
} catch (err) {
if (err instanceof UnsupportedDeviceFlowProviderError) {
res
.status(400)
.json({ error: err.message, code: 'unsupported_provider' });
return;
}
if (err instanceof TooManyActiveDeviceFlowsError) {
res
.status(409)
.json({ error: err.message, code: 'too_many_active_flows' });
return;
}
if (err instanceof UpstreamDeviceFlowError) {
// IdP-side failure (network / parse / non-2xx). 502 distinguishes
// "the upstream we depend on misbehaved" from a daemon bug (generic
// 500) so SDK clients can branch on retry strategy.
res.status(502).json({ error: err.message, code: 'upstream_error' });
return;
}
sendBridgeError(res, err, {
route: 'POST /workspace/auth/device-flow',
});
}
},
);
// PR #4255 fold-in 3: this GET surfaces `userCode` /
// `verificationUri` / `verificationUriComplete` for pending entries
// — material an attacker on the same loopback host could use to
// shoulder-surf the IdP approval flow. POST + DELETE are already
// strict; aligning GET to `mutate({ strict: true })` closes the
// information-disclosure asymmetry (the sibling
// `GET /workspace/auth/status` stays bearer-only because its
// pendingDeviceFlows entries intentionally omit `userCode`).
app.get(
'/workspace/auth/device-flow/:id',
mutate({ strict: true }),
async (req, res) => {
const id = req.params['id'];
if (!id) {
res.status(404).json({
error: 'Device-flow id required',
code: 'device_flow_not_found',
});
return;
}
const view = deviceFlowRegistry.get(id);
if (!view) {
res.status(404).json({
error: `Device-flow ${id} not found`,
code: 'device_flow_not_found',
});
return;
}
res.status(200).json(toDeviceFlowStateBody(view));
},
);
app.delete(
'/workspace/auth/device-flow/:id',
mutate({ strict: true }),
(req, res) => {
const id = req.params['id'];
if (!id) {
res.status(404).json({
error: 'Device-flow id required',
code: 'device_flow_not_found',
});
return;
}
const clientId = parseClientIdHeader(req, res);
if (clientId === null) return;
const result = deviceFlowRegistry.cancel(id, clientId);
if (result === undefined) {
res.status(404).json({
error: `Device-flow ${id} not found`,
code: 'device_flow_not_found',
});
return;
}
// Both freshly-cancelled and already-terminal are 204 (idempotent).
res.status(204).end();
},
);
app.get('/workspace/auth/status', (_req, res) => {
const pending = deviceFlowRegistry.listPending();
res.status(200).json({
v: 1,
workspaceCwd: boundWorkspace,
// GET /workspace/auth/status read-side intentionally minimal in
// this PR: a future PR can broaden the per-provider view (e.g.
// by reading SharedTokenManager.getCachedSnapshot for an `ok` /
// `expired` cell), but landing the additive route shape now
// unblocks SDK clients that need to know "is there a flow
// running?" without subscribing to SSE.
providers: [],
pendingDeviceFlows: pending.map((view) => ({
deviceFlowId: view.deviceFlowId,
providerId: view.providerId,
...(view.expiresAt !== undefined ? { expiresAt: view.expiresAt } : {}),
})),
// PR #4255 round-12 #3: derive from runtime provider map so
// injected providers are surfaced. Single source of truth
// matches the POST validation above.
supportedDeviceFlowProviders: Array.from(deviceFlowProviderMap.keys()),
});
});
app.post('/session', mutate(), async (req, res) => {
const body = safeBody(req);
// #3803 §02: 1 daemon = 1 workspace. Three input shapes:
@ -1434,6 +1710,75 @@ function parseOptionalWorkspaceCwd(
return cwd;
}
/**
* PR 21: translate the registry's redacted `DeviceFlowPublicView` into
* the wire shape declared by `DaemonDeviceFlowStartResult`. Splitting
* "start response" from "state body" preserves the `attached` field
* the start route needs without polluting the GET shape.
*/
function toDeviceFlowStartResponseBody(
view: DeviceFlowPublicView,
attached: boolean,
callerClientId?: string,
): Record<string, unknown> {
const body: Record<string, unknown> = {
deviceFlowId: view.deviceFlowId,
providerId: view.providerId,
status: view.status,
userCode: view.userCode ?? '',
verificationUri: view.verificationUri ?? '',
expiresAt: view.expiresAt ?? 0,
intervalMs: view.intervalMs ?? 0,
attached,
};
if (view.verificationUriComplete) {
body['verificationUriComplete'] = view.verificationUriComplete;
}
// PR #4255 round-12 #6 (gpt-5.5 review CzHOK): minor info-leak
// close-out. Echo `initiatorClientId` back only when the caller
// sent `X-Qwen-Client-Id` and it matches the client that started
// the flow. An anonymous take-over caller (no `X-Qwen-Client-Id`)
// gets no echo of the original starter's id; this preserves the
// symmetry "the daemon respects the absence of `X-Qwen-Client-Id`
// as a privacy signal." The route is already bearer-gated, so the
// blast radius was small, but the asymmetry is now closed.
if (
view.initiatorClientId &&
callerClientId !== undefined &&
callerClientId === view.initiatorClientId
) {
body['initiatorClientId'] = view.initiatorClientId;
}
return body;
}
function toDeviceFlowStateBody(
view: DeviceFlowPublicView,
): Record<string, unknown> {
const body: Record<string, unknown> = {
deviceFlowId: view.deviceFlowId,
providerId: view.providerId,
status: view.status,
createdAt: view.createdAt,
};
if (view.errorKind) body['errorKind'] = view.errorKind;
if (view.hint) body['hint'] = view.hint;
if (view.userCode) body['userCode'] = view.userCode;
if (view.verificationUri) body['verificationUri'] = view.verificationUri;
if (view.verificationUriComplete) {
body['verificationUriComplete'] = view.verificationUriComplete;
}
if (view.expiresAt !== undefined) body['expiresAt'] = view.expiresAt;
if (view.intervalMs !== undefined) body['intervalMs'] = view.intervalMs;
if (view.lastPolledAt !== undefined) body['lastPolledAt'] = view.lastPolledAt;
if (view.initiatorClientId) {
body['initiatorClientId'] = view.initiatorClientId;
}
return body;
}
function parseClientIdHeader(
req: import('express').Request,
res: import('express').Response,
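The audit sink above builds one stderr line per registry event: a truncated flow id, optional fields, and a length-capped, JSON-quoted hint. A self-contained sketch of that formatting is below. `AuditLine`, `formatAuditLine`, and `HINT_MAX` are illustrative names, and the truncation suffix counts UTF-16 code units because that is what `String.prototype.length` measures.

```typescript
// Illustrative shape; the PR's real audit-line type may carry other fields.
interface AuditLine {
  providerId: string;
  deviceFlowId: string;
  status: string;
  clientId?: string;
  errorKind?: string;
  expiresInMs?: number;
  hint?: string;
}

const HINT_MAX = 1_024; // cap on hint length, in UTF-16 code units

function formatAuditLine(line: AuditLine): string {
  const parts: string[] = [
    '[serve] auth.device-flow:',
    `provider=${line.providerId}`,
    // Only the first 8 characters of the id are logged.
    `deviceFlowId=${line.deviceFlowId.slice(0, 8)}...`,
    line.clientId ? `clientId=${line.clientId}` : 'clientId=-',
    `status=${line.status}`,
  ];
  if (line.errorKind) parts.push(`errorKind=${line.errorKind}`);
  if (line.expiresInMs !== undefined) {
    // Clamp so an already-expired flow never logs a negative value.
    parts.push(`expiresInMs=${Math.max(0, line.expiresInMs)}`);
  }
  if (line.hint) {
    const over = line.hint.length - HINT_MAX;
    const hint =
      over > 0
        ? `${line.hint.slice(0, HINT_MAX)}...[+${over} code units truncated]`
        : line.hint;
    // JSON-quote so multi-word hints stay parseable as one key=value pair.
    parts.push(`hint=${JSON.stringify(hint)}`);
  }
  return parts.join(' ');
}
```

Returning the string instead of writing it directly keeps the formatter testable; the caller decides where the line goes.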

@ -110,6 +110,11 @@ vi.mock('node:fs', () => ({
writeFile: vi.fn(),
unlink: vi.fn(),
mkdir: vi.fn().mockResolvedValue(undefined),
// PR #4255 round-11 #2 (gpt-5.5 review): atomic write uses
// temp-file → chmod → rename. Tests need chmod + rename in the
// mocked fs surface; both default to no-op success.
chmod: vi.fn().mockResolvedValue(undefined),
rename: vi.fn().mockResolvedValue(undefined),
},
}));

@ -109,6 +109,44 @@ export class CredentialsClearRequiredError extends Error {
}
}
/**
* Typed error thrown by `QwenOAuth2Client.pollDeviceToken` for upstream
* RFC 8628 errors that aren't `authorization_pending` / `slow_down`.
*
* Earlier the class threw a plain `Error` with the OAuth code embedded
* in the message text; downstream callers (notably PR #4255's
* device-flow registry provider) had to substring-match the message
* to extract the error code, an implicit cross-file contract that
* silently degrades to `upstream_error` if the message format ever
* changes. The structured `oauthError` / `description` / `status`
* fields make the contract explicit + type-checked.
*
* The thrown `message` keeps the same `"Device token poll failed:
* ${error} - ${description}"` shape so existing log-parsing /
* substring-matching code continues to work; new code should branch
* on `instanceof QwenOAuthPollError` + read fields directly.
*/
export class QwenOAuthPollError extends Error {
readonly status?: number;
readonly oauthError?: string;
readonly description?: string;
constructor(opts: {
oauthError?: string;
description?: string;
status?: number;
}) {
super(
`Device token poll failed: ${opts.oauthError ?? 'Unknown error'} - ${
opts.description ?? '(no description)'
}`,
);
this.name = 'QwenOAuthPollError';
this.oauthError = opts.oauthError;
this.description = opts.description;
this.status = opts.status;
}
}
/**
* Qwen OAuth2 credentials interface
*/
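The doc comment on `QwenOAuthPollError` recommends branching on `instanceof` plus structured fields rather than substring-matching the message. A hedged sketch of such a caller follows. `PollError` is a local stand-in with the same field shape, and both the `ErrorKind` union and the mapping are assumptions for illustration, not the registry's actual table.

```typescript
// Local stand-in mirroring the fields of QwenOAuthPollError.
class PollError extends Error {
  readonly oauthError?: string;
  readonly description?: string;
  readonly status?: number;
  constructor(opts: { oauthError?: string; description?: string; status?: number }) {
    super(
      `Device token poll failed: ${opts.oauthError ?? 'Unknown error'} - ${
        opts.description ?? '(no description)'
      }`,
    );
    this.name = 'PollError';
    this.oauthError = opts.oauthError;
    this.description = opts.description;
    this.status = opts.status;
  }
}

// Assumed error kinds; the real registry may use a different set.
type ErrorKind = 'access_denied' | 'expired_token' | 'upstream_error';

function classifyPollFailure(err: unknown): ErrorKind {
  if (err instanceof PollError) {
    // Branch on the structured field, never on the message text.
    if (err.oauthError === 'access_denied') return 'access_denied';
    if (err.oauthError === 'expired_token') return 'expired_token';
  }
  // Plain errors and unrecognized OAuth codes fall back to upstream_error.
  return 'upstream_error';
}
```

Because the message format is unchanged, older substring matchers keep working while new callers move to the typed branch.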
@ -237,15 +275,21 @@ export interface IQwenOAuth2Client {
setCredentials(credentials: QwenCredentials): void;
getCredentials(): QwenCredentials;
getAccessToken(): Promise<{ token?: string }>;
requestDeviceAuthorization(options: {
scope: string;
code_challenge: string;
code_challenge_method: string;
}): Promise<DeviceAuthorizationResponse>;
pollDeviceToken(options: {
device_code: string;
code_verifier: string;
}): Promise<DeviceTokenResponse>;
requestDeviceAuthorization(
options: {
scope: string;
code_challenge: string;
code_challenge_method: string;
},
fetchOpts?: { signal?: AbortSignal },
): Promise<DeviceAuthorizationResponse>;
pollDeviceToken(
options: {
device_code: string;
code_verifier: string;
},
fetchOpts?: { signal?: AbortSignal },
): Promise<DeviceTokenResponse>;
refreshAccessToken(): Promise<TokenRefreshResponse>;
}
@ -287,11 +331,14 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
}
}
async requestDeviceAuthorization(options: {
scope: string;
code_challenge: string;
code_challenge_method: string;
}): Promise<DeviceAuthorizationResponse> {
async requestDeviceAuthorization(
options: {
scope: string;
code_challenge: string;
code_challenge_method: string;
},
fetchOpts?: { signal?: AbortSignal },
): Promise<DeviceAuthorizationResponse> {
const bodyData = {
client_id: QWEN_OAUTH_CLIENT_ID,
scope: options.scope,
@ -307,6 +354,12 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
'x-request-id': randomUUID(),
},
body: objectToUrlEncoded(bodyData),
// PR #4255 — daemon device-flow registry passes its
// `cancelController.signal` so dispose / cancel during a slow
// device-authorization request actually aborts the in-flight
// socket immediately. Pre-existing CLI callers omit it; the
// optional shape preserves backward compatibility.
...(fetchOpts?.signal ? { signal: fetchOpts.signal } : {}),
});
if (!response.ok) {
@ -317,7 +370,28 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
}
const result = (await response.json()) as DeviceAuthorizationResponse;
debugLogger.debug('Device authorization result:', result);
// PR #4255 fold-in 9 review thread #12: do NOT log the full
// result. `device_code` is an RFC 8628 bearer-equivalent
// credential — anyone holding it within the grant's lifetime
// can complete the token exchange. The daemon device-flow
// registry's `BrandedSecret` keeps `device_code` out of HTTP
// bodies / events / logs, but a debug-mode `console.log(result)`
// here would write the raw `device_code` to stderr / journald,
// bypassing the entire redaction layer. Log only the
// operationally useful fields (success flag plus `expires_in`, or
// the error code on failure); secrets stay in memory.
if (isDeviceAuthorizationSuccess(result)) {
debugLogger.debug('Device authorization result (sanitized):', {
ok: true,
expires_in: result.expires_in,
});
} else {
const errorData = result as ErrorData;
debugLogger.debug('Device authorization result (sanitized):', {
ok: false,
error: errorData?.error,
});
}
// Check if the response indicates success
if (!isDeviceAuthorizationSuccess(result)) {
@ -330,10 +404,13 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
return result;
}
async pollDeviceToken(options: {
device_code: string;
code_verifier: string;
}): Promise<DeviceTokenResponse> {
async pollDeviceToken(
options: {
device_code: string;
code_verifier: string;
},
fetchOpts?: { signal?: AbortSignal },
): Promise<DeviceTokenResponse> {
const bodyData = {
grant_type: QWEN_OAUTH_GRANT_TYPE,
client_id: QWEN_OAUTH_CLIENT_ID,
@ -348,6 +425,11 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
Accept: 'application/json',
},
body: objectToUrlEncoded(bodyData),
// PR #4255 — daemon device-flow registry passes its per-entry
// `cancelController.signal` so cancel() / dispose() during a
// slow IdP response actually aborts the in-flight socket
// instead of waiting for the upstream timeout.
...(fetchOpts?.signal ? { signal: fetchOpts.signal } : {}),
});
if (!response.ok) {
@ -386,12 +468,16 @@ export class QwenOAuth2Client implements IQwenOAuth2Client {
// Handle other 400 errors (access_denied, expired_token, etc.) as real errors
// For other errors, throw with proper error information
const error = new Error(
`Device token poll failed: ${errorData.error || 'Unknown error'} - ${errorData.error_description}`,
);
(error as Error & { status?: number }).status = response.status;
throw error;
// For other errors, throw a typed `QwenOAuthPollError` so
// downstream callers (PR #4255 device-flow registry) can branch
// on `instanceof` + structured fields instead of substring-
// matching the message text. The message format is preserved
// for log-readers + any pre-existing substring matchers.
throw new QwenOAuthPollError({
oauthError: errorData.error,
description: errorData.error_description,
status: response.status,
});
}
return (await response.json()) as DeviceTokenResponse;
@ -816,22 +902,13 @@ async function authWithQwenDeviceFlow(
client.setCredentials(credentials);
// Cache the new tokens
// Cache the new tokens. `cacheQwenCredentials` itself folds
// in `SharedTokenManager.clearCache()` (PR #4255 review D1) so
// we no longer need a paired call here — the previous explicit
// post-cache clear was a duplicate that fired clearCache twice
// on the success path.
await cacheQwenCredentials(credentials);
// IMPORTANT:
// SharedTokenManager maintains an in-memory cache and throttles file checks.
// If we only write the creds file here, a subsequent `getQwenOAuthClient()`
// call in the same process (within the throttle window) may not re-read the
// updated file and could incorrectly re-trigger device auth.
// Clearing the cache forces the next call to reload from disk.
try {
SharedTokenManager.getInstance().clearCache();
} catch {
// In unit tests we sometimes mock SharedTokenManager.getInstance() with a
// minimal stub; cache invalidation is best-effort and should not break auth.
}
emitAuthProgress(
'success',
'Authentication successful! Access token obtained.',
@ -973,13 +1050,120 @@ async function authWithQwenDeviceFlow(
}
}
async function cacheQwenCredentials(credentials: QwenCredentials) {
// PR 21 (#4175 Wave 4): exported so the `qwen serve` device-flow registry can
// persist credentials acquired through the daemon's HTTP route. Mode 0o600
// matches opencode's `auth.json` to keep tokens unreadable by other users on
// shared hosts. The constant is exported so tests/auditors can assert intent
// rather than re-deriving it from a raw octal literal.
export const QWEN_CREDENTIAL_FILE_MODE = 0o600;
export async function cacheQwenCredentials(
credentials: QwenCredentials,
opts?: { signal?: AbortSignal },
) {
const filePath = getQwenCachedCredentialPath();
try {
await fs.mkdir(path.dirname(filePath), { recursive: true });
const credString = JSON.stringify(credentials, null, 2);
await fs.writeFile(filePath, credString);
// PR #4255 round-11 #2 (gpt-5.5 review): atomic write with
// permission hardening BEFORE the secret payload becomes
// accessible at the canonical filename. The earlier shape was
// 1. fs.writeFile(filePath, creds, {mode: 0o600}) ← creates
// with 0o600 OR retains existing broader perms
// 2. fs.chmod(filePath, 0o600) ← post-hoc
// tightening
// which left a window where, if `oauth_creds.json` already
// existed with broader perms (operator pre-creation, prior
// version's looser write), the freshly-written tokens were
// momentarily readable by other principals before the chmod
// closed the gap. A chmod failure on POSIX previously degraded
// to a warning while the broadly-readable tokens stayed.
//
// New shape: write to a temp file (created with 0o600 atomically
// via the `mode` flag — which DOES apply on creation since the
// path didn't exist), re-apply perms with `chmod`, then `rename`
// over the canonical filename. `fs.rename` is atomic on POSIX
// (within a filesystem); on Windows it replaces the destination,
// but atomicity there depends on the filesystem. The canonical
// filename never contains the new tokens until they're already at
// 0o600.
//
// PR #4255 fold-in 3 (#10): `signal` threading is preserved —
// both `writeFile` AND the temp-file path honor the registry's
// persist-timeout + cancelController.
const tempPath = `${filePath}.tmp.${process.pid}.${randomUUID()}`;
try {
await fs.writeFile(tempPath, credString, {
mode: QWEN_CREDENTIAL_FILE_MODE,
...(opts?.signal ? { signal: opts.signal } : {}),
});
// Defensive: if the platform ignored `mode` on creation
// (some Windows FSes), explicit chmod tightens the temp BEFORE
// it's renamed into place. On POSIX, failure here is a HARD
// ERROR: we refuse to publish broadly-readable tokens to the
// canonical path. A non-cooperative FS that can't tighten a
// 0o600 file shouldn't be serving credentials anyway.
try {
await fs.chmod(tempPath, QWEN_CREDENTIAL_FILE_MODE);
} catch (chmodErr) {
if (process.platform !== 'win32') {
throw new Error(
`cacheQwenCredentials: refusing to publish credentials — chmod 0o${QWEN_CREDENTIAL_FILE_MODE.toString(8)} on temp file failed: ${
chmodErr instanceof Error ? chmodErr.message : String(chmodErr)
}`,
);
}
// Windows: chmod's a no-op on most NTFS volumes; permissions
// there go through ACLs which we don't manage from here.
// Surface a debug breadcrumb for operators on exotic Windows
// filesystems but allow the rename to proceed.
debugLogger.warn(
`cacheQwenCredentials: chmod 0o${QWEN_CREDENTIAL_FILE_MODE.toString(8)} on Windows temp file ${tempPath} failed; relying on NTFS ACL: ${
chmodErr instanceof Error ? chmodErr.message : String(chmodErr)
}`,
);
}
// Atomic rename. Replaces any existing file at `filePath` in
// a single inode swap; readers either see the old creds or
// the new creds, never a partial mix.
await fs.rename(tempPath, filePath);
} catch (writeErr) {
// Best-effort cleanup of the temp file — if rename succeeded
// there's nothing to clean (path no longer points anywhere);
// if it failed there's a leftover .tmp.<pid>.<uuid> file we
// shouldn't leave on disk. Swallow ENOENT (already-renamed)
// and any other unlink errors since they're not user-actionable.
try {
await fs.unlink(tempPath);
} catch {
/* best-effort */
}
throw writeErr;
}
// SharedTokenManager throttles file checks and serves an in-memory cache;
// without an explicit invalidation a follow-up `getValidCredentials` in
// the same process can stay on the previous (often empty) cache and
// re-trigger device auth despite the just-written file. The original
// device-flow site (L820+L829) paired write+clear; folding the clear
// here keeps every caller (#4255 daemon device-flow registry included)
// correct without re-pairing the call.
try {
SharedTokenManager.getInstance().clearCache();
} catch (clearErr) {
// In production, a failed cache clear means subsequent
// `getValidCredentials` reads in the same process may serve
// stale (pre-write) credentials until the SharedTokenManager
// mtime watcher catches up. That's a recoverable degradation
// (worst case: device auth re-prompts), but the silent swallow
// it used to be made the symptom invisible. Warn so logs show
// it. Unit tests stubbing `SharedTokenManager.getInstance()`
// with a minimal shape will also flow through here — acceptable
// noise for the production-visibility win.
debugLogger.warn(
`cacheQwenCredentials: SharedTokenManager.clearCache failed; in-process callers may serve stale credentials until the next mtime poll: ${
clearErr instanceof Error ? clearErr.message : String(clearErr)
}`,
);
}
} catch (error: unknown) {
// Handle file system errors (e.g., EACCES permission denied)
const errorMessage = error instanceof Error ? error.message : String(error);
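The temp-file, chmod, rename sequence described in the comments above can be sketched in isolation. The version below is a simplified, synchronous illustration under stated assumptions: it omits the `AbortSignal` threading and the Windows warning path, and `writeSecretAtomically` and `FILE_MODE` are hypothetical names rather than the PR's exports.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { randomUUID } from 'node:crypto';

const FILE_MODE = 0o600; // owner read/write only

function writeSecretAtomically(filePath: string, contents: string): void {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  // A unique temp name means `mode` applies at creation: the path is new.
  const tempPath = `${filePath}.tmp.${process.pid}.${randomUUID()}`;
  try {
    fs.writeFileSync(tempPath, contents, { mode: FILE_MODE });
    if (process.platform !== 'win32') {
      // Tighten before publishing; a failure here aborts the write.
      fs.chmodSync(tempPath, FILE_MODE);
    }
    // Swap the temp file over the canonical path in one step.
    fs.renameSync(tempPath, filePath);
  } catch (err) {
    try {
      fs.unlinkSync(tempPath); // best-effort cleanup of the leftover temp
    } catch {
      /* nothing actionable */
    }
    throw err;
  }
}

// Usage: overwrite a pre-existing, more permissive file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'creds-'));
const target = path.join(dir, 'oauth_creds.json');
fs.writeFileSync(target, 'old', { mode: 0o644 });
writeSecretAtomically(target, JSON.stringify({ demo: true }));
```

Because the rename replaces the directory entry, the looser permissions of the pre-existing file do not carry over to the new contents.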

@ -0,0 +1,339 @@
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { DaemonHttpError, type DaemonClient } from './DaemonClient.js';
import type { DaemonAuthProviderId, DaemonDeviceFlowState } from './types.js';
/**
* Grace period added past the daemon-stated `expiresAt` before
* `awaitCompletion` gives up. Covers (a) clock skew between SDK and
* daemon, (b) the daemon's own sweep interval (so we don't bail one
* tick before the daemon would surface a synthetic `expired`
* terminal), and (c) per-poll network latency.
*
* **Why 30 s, and which daemon constant it relates to.** The relevant
* daemon-side constant is `DEVICE_FLOW_SWEEP_INTERVAL_MS` (the
* interval at which the registry's sweeper RUNS, currently 30 s),
* NOT `DEVICE_FLOW_TERMINAL_GRACE_MS` (the 5-minute window during
* which terminal entries remain GET-able before eviction). One sweep
* cycle past `expiresAt` is enough to flip the entry to a synthetic
* `expired`/`expired_token` terminal state; once that happens the
* SDK's GET poll will return it immediately. Waiting any longer
* client-side just delays the inevitable. PR #4255 fold-in 6 review
* thread #3.
*
* **Not** to be confused with `TERMINAL_GRACE_MS`: terminal entries
* remain queryable for 5 minutes after they go terminal, but that's
* a reconnect-affordance for SDK clients that want to *re-read* a
* settled state, not a window `awaitCompletion` needs to wait
* through. Keep this aligned with `SWEEP_INTERVAL_MS`; if the daemon
* ever lengthens its sweep interval, raise this in lockstep.
*/
export const DEVICE_FLOW_EXPIRY_GRACE_MS = 30_000;
/**
* High-level convenience wrapper around the four `client.*DeviceFlow*` HTTP
* helpers. SDK users should normally write:
*
* const flow = await client.auth.start({ providerId: 'qwen-oauth' });
* console.log(`Open ${flow.verificationUri}\nCode: ${flow.userCode}`);
* const result = await flow.awaitCompletion({ signal });
*
* `awaitCompletion` polls `client.getDeviceFlow(...)` at the daemon-
* supplied `intervalMs`, honors `slow_down`-driven interval bumps via
* `getDeviceFlow`'s response, and terminates when the daemon's view
* reaches a terminal status (`authorized`, `expired`, `error`,
* `cancelled`). The same `auth_device_flow_*` SSE events are emitted
* by the daemon for clients that ARE already subscribed to a session
* stream; those provide a real-time hint, but `awaitCompletion`
* itself does not require an SSE subscription and works against any
* client that can hit the GET endpoint.
*
* Issue #4175 PR 21.
*/
export interface DaemonAuthFlowHandle {
deviceFlowId: string;
providerId: DaemonAuthProviderId;
userCode: string;
verificationUri: string;
verificationUriComplete?: string;
expiresAt: number;
intervalMs: number;
/** True iff the daemon returned an existing pending entry rather than
* starting a fresh IdP request. */
attached: boolean;
/** Block until the daemon settles the flow into a terminal state, then
* return the final state. The promise rejects on `signal.abort()`. */
awaitCompletion(
opts?: AwaitCompletionOptions,
): Promise<DaemonDeviceFlowState>;
/** Cancel the in-flight device flow on the daemon. Idempotent. */
cancel(): Promise<void>;
}
export interface AwaitCompletionOptions {
/** Aborts both SSE consumption and GET-fallback polling. */
signal?: AbortSignal;
/** Called whenever the daemon reports an upstream `slow_down` (mirroring
* the `auth_device_flow_throttled` event). The new effective interval
* is the value the SDK will use for the next GET poll. */
onThrottled?: (intervalMs: number) => void;
/** Optional override of the GET-fallback interval. Defaults to the
* daemon-supplied `intervalMs` from `start(...)` and respects bumps
* from `slow_down`. */
pollOverrideMs?: number;
/** Hard ceiling on `awaitCompletion`'s wall-clock duration, in ms.
* When omitted, `awaitCompletion` runs until the daemon-stated
* `expiresAt` plus `DEVICE_FLOW_EXPIRY_GRACE_MS` (default 30s),
* which lets the daemon's own sweeper surface the authoritative
* terminal state instead of timing out client-side. Set explicitly
* to clamp the wait shorter; values past `expiresAt` will still see
* the daemon return `expired` once its sweeper fires. */
timeoutMs?: number;
}
const TERMINAL_STATUSES: ReadonlySet<DaemonDeviceFlowState['status']> = new Set(
['authorized', 'expired', 'error', 'cancelled'],
);
export class DaemonAuthFlow {
constructor(private readonly client: DaemonClient) {}
async start(opts: {
providerId: DaemonAuthProviderId;
clientId?: string;
}): Promise<DaemonAuthFlowHandle> {
const initial = await this.client.startDeviceFlow(opts);
const handleClient = this.client;
const handle: DaemonAuthFlowHandle = {
deviceFlowId: initial.deviceFlowId,
providerId: initial.providerId,
userCode: initial.userCode,
verificationUri: initial.verificationUri,
verificationUriComplete: initial.verificationUriComplete,
expiresAt: initial.expiresAt,
intervalMs: initial.intervalMs,
attached: initial.attached,
cancel: () =>
handleClient.cancelDeviceFlow(initial.deviceFlowId, {
clientId: opts.clientId,
}),
awaitCompletion: async (waitOpts = {}) => {
const finalState = await awaitCompletion(
handleClient,
initial,
opts.clientId,
waitOpts,
);
return finalState;
},
};
return handle;
}
status(deviceFlowId: string, opts?: { clientId?: string }) {
return this.client.getDeviceFlow(deviceFlowId, opts);
}
cancel(deviceFlowId: string, opts?: { clientId?: string }) {
return this.client.cancelDeviceFlow(deviceFlowId, opts);
}
}
async function awaitCompletion(
client: DaemonClient,
start: {
deviceFlowId: string;
intervalMs: number;
expiresAt: number;
providerId: DaemonAuthProviderId;
},
clientId: string | undefined,
opts: AwaitCompletionOptions,
): Promise<DaemonDeviceFlowState> {
// Workspace-scoped events fan out through whatever session buses
// happen to be live, but `awaitCompletion` is workspace-level (no
// session id) — so attaching to a single SSE stream isn't a stable
// contract here. GET polling against the daemon's authoritative
// device-flow state is the universal path; `auth_device_flow_*`
// events remain a real-time hint for clients that ARE already
// subscribed to a session stream.
return await pollUntilTerminal(client, start, clientId, opts);
}
/**
* Read the daemon's view of a device flow, mapping a 404 from the
* GET endpoint to a synthetic terminal `error`/`not_found_or_evicted`
* state instead of letting `DaemonHttpError(404)` escape. PR #4255
* fold-in 7 review thread #4: extracted from the inline catch in
* `pollUntilTerminal` so the timeout-ceiling final read uses the same
* logic. Without this, the ceiling read would reject with a raw
* `DaemonHttpError` if the daemon evicted the entry exactly at the
* boundary, breaking `awaitCompletion`'s "always returns a settled
* `DaemonDeviceFlowState`" contract.
*/
async function getDeviceFlowOrSynthetic404(
client: DaemonClient,
start: {
deviceFlowId: string;
providerId: DaemonAuthProviderId;
},
clientId: string | undefined,
signal: AbortSignal | undefined,
): Promise<DaemonDeviceFlowState> {
try {
return await client.getDeviceFlow(start.deviceFlowId, {
clientId,
signal,
});
} catch (err: unknown) {
if (err instanceof DaemonHttpError && err.status === 404) {
// PR #4255 fold-in 3 (#4): a 404 here can mean (a) the entry
// expired and the sweeper reaped it past the terminal grace
// window, (b) the daemon was restarted and lost the registry,
// (c) the deviceFlowId was wrong / spoofed. The earlier
// synthetic `'expired'` status conflated all three. Surface
// `status: 'error'` + `errorKind: 'not_found_or_evicted'` so
// SDK consumers can distinguish "your flow expired during your
// disconnect" from "this id was never valid on this daemon."
return {
deviceFlowId: start.deviceFlowId,
providerId: start.providerId,
status: 'error',
errorKind: 'not_found_or_evicted',
hint: 'device-flow not found on daemon (evicted past terminal grace, daemon restart, or unknown deviceFlowId)',
createdAt: Date.now(),
};
}
throw err;
}
}
/**
* Validate an `AwaitCompletionOptions` numeric field. PR #4255
* fold-in 7 review thread #5: `NaN` / `Infinity` from a misbehaving
* caller would otherwise produce a `ceiling` of `NaN` (so `now >=
* ceiling` is always `false` — the loop runs forever) or a
* `setTimeout(NaN)` (Node clamps to a 1 ms delay, giving a tight
* polling loop). Reject values that are not finite and positive; when
* the caller's intent was sloppy ("a long timeout") they fall back to
* the documented default rather than getting a pathological loop.
*/
function sanitizePositiveMs(
raw: number | undefined,
opts: { allowZero?: boolean } = {},
): number | undefined {
if (raw === undefined) return undefined;
if (!Number.isFinite(raw)) return undefined;
// PR #4255 fold-in 9 review thread #6: `timeoutMs: 0` is the
// documented "settle immediately, return current daemon view"
// contract — must be honored, not collapsed to falsy. Opt-in via
// `allowZero` so `pollOverrideMs: 0` still falls back to the
// default (a 0 ms poll interval is a tight loop, not a useful
// contract).
if (opts.allowZero ? raw < 0 : raw <= 0) return undefined;
return raw;
}
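The validation rule and the ceiling it feeds can be tried in isolation. The following is a minimal standalone sketch, not the SDK's code: `sanitizeMs`, `computeCeiling`, and `GRACE_MS` are hypothetical names (the 30 000 ms value follows the documented default for `DEVICE_FLOW_EXPIRY_GRACE_MS`), and `now` is passed in explicitly so the result is deterministic.

```typescript
// Hypothetical stand-in for DEVICE_FLOW_EXPIRY_GRACE_MS (documented default: 30 s).
const GRACE_MS = 30_000;

// Returns the value only when it is a finite, positive number of milliseconds
// (or zero, when `allowZero` is set); otherwise `undefined`, meaning
// "fall back to the default".
function sanitizeMs(
  raw: number | undefined,
  allowZero = false,
): number | undefined {
  if (raw === undefined || !Number.isFinite(raw)) return undefined;
  if (allowZero ? raw < 0 : raw <= 0) return undefined;
  return raw;
}

// Hypothetical helper: wall-clock deadline for the polling loop.
function computeCeiling(
  now: number,
  expiresAt: number,
  timeoutMs?: number,
): number {
  const sanitized = sanitizeMs(timeoutMs, true);
  return sanitized !== undefined ? now + sanitized : expiresAt + GRACE_MS;
}

const now = 1_000_000;
const expiresAt = now + 600_000;

console.log(computeCeiling(now, expiresAt, 0)); // 1000000 (settle immediately)
console.log(computeCeiling(now, expiresAt, NaN)); // 1630000 (default ceiling)
console.log(computeCeiling(now, expiresAt, 5_000)); // 1005000
```

Under these assumptions, a non-finite or negative `timeoutMs` is treated as if it had been omitted, while `0` keeps its "return the current view" meaning.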
async function pollUntilTerminal(
client: DaemonClient,
start: {
deviceFlowId: string;
intervalMs: number;
expiresAt: number;
/** Carried through from the parent `start` so the synthetic 404
* fallback below reports the actual provider rather than the
* hardcoded `'qwen-oauth'` (PR #4255 review C1). */
providerId: DaemonAuthProviderId;
},
clientId: string | undefined,
opts: AwaitCompletionOptions,
): Promise<DaemonDeviceFlowState> {
const signal = opts.signal;
// PR #4255 fold-in 7 review thread #5: validate caller-supplied
// numeric inputs BEFORE composing the ceiling / interval. NaN /
// Infinity slip past the original `?? default` form (neither is
// nullish) and break the loop's wall-clock guard.
const sanitizedTimeoutMs = sanitizePositiveMs(opts.timeoutMs, {
allowZero: true,
});
const sanitizedPollOverrideMs = sanitizePositiveMs(opts.pollOverrideMs);
// PR #4255 fold-in 9 review thread #6: use `!== undefined` (not
// truthy check) so `timeoutMs: 0` produces a `ceiling = Date.now()`
// — which the loop's `now >= ceiling` guard will satisfy on the
// very first iteration, returning the daemon's current snapshot
// immediately. The earlier `?` form treated 0 as falsy and
// silently fell back to the default.
const ceiling =
sanitizedTimeoutMs !== undefined
? Date.now() + sanitizedTimeoutMs
: start.expiresAt + DEVICE_FLOW_EXPIRY_GRACE_MS;
let interval = Math.max(
1_000,
sanitizedPollOverrideMs ?? start.intervalMs ?? 5_000,
);
let lastIntervalMs = interval;
while (true) {
if (signal?.aborted) {
throw signalAbortError(signal);
}
const now = Date.now();
if (now >= ceiling) {
// PR #4255 fold-in 7 #4: route the ceiling read through the
// same 404-aware helper as the loop body. A 404 at the
// boundary is a settled state, not a throw.
return await getDeviceFlowOrSynthetic404(client, start, clientId, signal);
}
const snapshot = await getDeviceFlowOrSynthetic404(
client,
start,
clientId,
signal,
);
if (snapshot.intervalMs && snapshot.intervalMs !== lastIntervalMs) {
lastIntervalMs = snapshot.intervalMs;
interval = snapshot.intervalMs;
opts.onThrottled?.(snapshot.intervalMs);
}
if (TERMINAL_STATUSES.has(snapshot.status)) return snapshot;
await waitFor(interval, signal);
}
}
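The interval bookkeeping inside the loop can be looked at separately from the network calls. The following is a minimal sketch under stated assumptions: `IntervalState` and `applySnapshotInterval` are hypothetical names that mirror the `snapshot.intervalMs !== lastIntervalMs` check, and the snapshot values are simulated rather than read from a daemon.

```typescript
interface IntervalState {
  interval: number; // delay before the next GET poll, in ms
  lastIntervalMs: number; // last interval seen, used to detect changes
}

// Applies the `intervalMs` reported by one snapshot. Returns the next state
// and whether the throttle callback should fire (only when the value changed).
function applySnapshotInterval(
  state: IntervalState,
  snapshotIntervalMs: number | undefined,
): { state: IntervalState; throttled: boolean } {
  if (snapshotIntervalMs && snapshotIntervalMs !== state.lastIntervalMs) {
    return {
      state: {
        interval: snapshotIntervalMs,
        lastIntervalMs: snapshotIntervalMs,
      },
      throttled: true,
    };
  }
  return { state, throttled: false };
}

// Simulated snapshots: 5 s, then a bump to 10 s that is reported twice,
// then a terminal-style snapshot that omits `intervalMs`.
let state: IntervalState = { interval: 5_000, lastIntervalMs: 5_000 };
const notifications: number[] = [];
for (const reported of [5_000, 10_000, 10_000, undefined]) {
  const step = applySnapshotInterval(state, reported);
  state = step.state;
  if (step.throttled) notifications.push(state.interval);
}

console.log(notifications); // [ 10000 ]
console.log(state.interval); // 10000
```

In this sketch the callback fires once per change, not once per poll, and a snapshot without `intervalMs` leaves the current interval untouched.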
async function waitFor(ms: number, signal?: AbortSignal): Promise<void> {
if (signal?.aborted) throw signalAbortError(signal);
await new Promise<void>((resolve, reject) => {
// PR #4255 review C5: do NOT `unref()` this timer. The earlier
// version did, which on a standalone Node CLI/script that does
// `await client.auth.start().awaitCompletion()` and nothing else
// could leave Node with no remaining ref'd handles between polls
// and exit the process before the user finishes authorization.
// This sleep is foreground work the caller explicitly awaits;
// unref'ing it broke the contract.
const handle = setTimeout(() => {
cleanup();
resolve();
}, ms);
const onAbort = () => {
cleanup();
reject(signalAbortError(signal));
};
function cleanup() {
clearTimeout(handle);
signal?.removeEventListener('abort', onAbort);
}
if (signal) {
signal.addEventListener('abort', onAbort, { once: true });
}
});
}
function signalAbortError(signal: AbortSignal | undefined): Error {
const reason = signal?.reason;
if (reason instanceof Error) return reason;
if (typeof reason === 'string') return new Error(reason);
return new Error('aborted');
}

@ -4,11 +4,16 @@
* SPDX-License-Identifier: Apache-2.0
*/
import { DaemonAuthFlow } from './DaemonAuthFlow.js';
import { parseSseStream } from './sse.js';
import type {
DaemonAgentMutationResult,
DaemonAuthProviderId,
DaemonAuthStatusSnapshot,
DaemonCapabilities,
DaemonCreateAgentRequest,
DaemonDeviceFlowStartResult,
DaemonDeviceFlowState,
DaemonEvent,
DaemonSessionContextStatus,
DaemonRestoredSession,
@ -175,6 +180,21 @@ export class DaemonClient {
private readonly token: string | undefined;
private readonly _fetch: typeof globalThis.fetch;
private readonly fetchTimeoutMs: number;
// Lazy singleton so clients that never touch auth pay no allocation cost.
// Exposed via the readonly `auth` accessor below.
private _authFlow?: DaemonAuthFlow;
/**
* High-level auth helper (issue #4175 PR 21). Wraps the four
* `*DeviceFlow*` methods with a `start(...).awaitCompletion()` shape
* for the common "log in remotely" UX. Lazy-constructed.
*/
get auth(): DaemonAuthFlow {
if (!this._authFlow) {
this._authFlow = new DaemonAuthFlow(this);
}
return this._authFlow;
}
constructor(opts: DaemonClientOptions) {
this.baseUrl = stripTrailingSlashes(opts.baseUrl);
@ -1038,6 +1058,115 @@ export class DaemonClient {
);
}
// -- Auth device-flow (issue #4175 PR 21) -------------------------------
/**
* Start an OAuth device-flow login for the given provider. The daemon
* polls the IdP in the background and emits typed `auth_device_flow_*`
* SSE events; callers can also poll `getDeviceFlow(...)`.
*
* Per-provider singleton: a repeat call while a flow is already pending
* for the same provider is an idempotent take-over and returns the
* existing entry rather than starting a fresh IdP request. The
* `attached` field on the result distinguishes the two cases.
*/
async startDeviceFlow(opts: {
providerId: DaemonAuthProviderId;
clientId?: string;
}): Promise<DaemonDeviceFlowStartResult> {
return await this.fetchWithTimeout(
`${this.baseUrl}/workspace/auth/device-flow`,
{
method: 'POST',
headers: this.headers(
{ 'Content-Type': 'application/json' },
opts.clientId,
),
body: JSON.stringify({ providerId: opts.providerId }),
},
async (res) => {
if (res.status !== 200 && res.status !== 201) {
throw await this.failOnError(res, 'POST /workspace/auth/device-flow');
}
return (await res.json()) as DaemonDeviceFlowStartResult;
},
);
}
async getDeviceFlow(
deviceFlowId: string,
opts: { clientId?: string; signal?: AbortSignal } = {},
): Promise<DaemonDeviceFlowState> {
// PR #4255 fold-in 7 review thread #6: forward `signal` into
// `fetchWithTimeout`, which composes it with the per-request
// `fetchTimeoutMs` controller. Without this, an `awaitCompletion`
// caller that aborts mid-poll could not cancel the in-flight GET
// — only the post-await guard would notice, but that runs only
// after the body is already settled (or the client's per-request
// `fetchTimeoutMs` fires, which can be 30s+).
return await this.fetchWithTimeout(
`${this.baseUrl}/workspace/auth/device-flow/${encodeURIComponent(deviceFlowId)}`,
{ headers: this.headers({}, opts.clientId), signal: opts.signal },
async (res) => {
if (!res.ok) {
throw await this.failOnError(
res,
'GET /workspace/auth/device-flow/:id',
);
}
return (await res.json()) as DaemonDeviceFlowState;
},
);
}
/**
* Cancel a pending device-flow. Idempotent: terminal entries return
* 204 (no-op); unknown ids return 404. Both resolve here, matching
* the SDK's `closeSession` shape.
*/
async cancelDeviceFlow(
deviceFlowId: string,
opts: { clientId?: string } = {},
): Promise<void> {
return await this.fetchWithTimeout(
`${this.baseUrl}/workspace/auth/device-flow/${encodeURIComponent(deviceFlowId)}`,
{
method: 'DELETE',
headers: this.headers({}, opts.clientId),
},
async (res) => {
if (res.status === 204 || res.status === 404) {
try {
await res.body?.cancel();
} catch {
/* body already consumed or no body */
}
return;
}
throw await this.failOnError(
res,
'DELETE /workspace/auth/device-flow/:id',
);
},
);
}
/** Snapshot of persisted auth credentials + currently pending device-flows. */
async getAuthStatus(
opts: { clientId?: string } = {},
): Promise<DaemonAuthStatusSnapshot> {
return await this.fetchWithTimeout(
`${this.baseUrl}/workspace/auth/status`,
{ headers: this.headers({}, opts.clientId) },
async (res) => {
if (!res.ok) {
throw await this.failOnError(res, 'GET /workspace/auth/status');
}
return (await res.json()) as DaemonAuthStatusSnapshot;
},
);
}
// -- Session metadata ----------------------------------------------------
/**

@ -25,6 +25,15 @@ const DAEMON_KNOWN_EVENT_TYPE_VALUES = [
// updated" toasts. Read-after-write remains the correctness contract.
'memory_changed',
'agent_changed',
// Issue #4175 PR 21 — workspace-scoped auth device-flow events.
// These are NOT session-keyed; the session reducer no-ops on them
// and `reduceDaemonAuthEvent` projects them into a workspace-level
// state shape (one entry per provider).
'auth_device_flow_started',
'auth_device_flow_throttled',
'auth_device_flow_authorized',
'auth_device_flow_failed',
'auth_device_flow_cancelled',
] as const;
const DAEMON_KNOWN_EVENT_TYPES: ReadonlySet<string> = new Set<string>(
@ -158,6 +167,77 @@ export interface DaemonAgentChangedData {
[key: string]: unknown;
}
/** Issue #4175 PR 21 — auth device-flow event payloads. */
/** Provider id. Open string union for forward-compatible providers; `qwen-oauth`
* is the only value v1 currently emits. */
export type DaemonAuthDeviceFlowProviderId = 'qwen-oauth' | (string & {});
export type DaemonAuthDeviceFlowStatus =
| 'pending'
| 'authorized'
| 'expired'
| 'error'
| 'cancelled';
/**
* Known errorKind values surfaced on `auth_device_flow_failed`. The
* trailing `(string & {})` keeps this as an OPEN union so a daemon
* adding a new errorKind doesn't get its event silently dropped by an
* older SDK's type guard. Consumers branching exhaustively on the
* known literals get the same narrowing as before, while unknown
* future kinds fall through to a `string` fallback rather than failing
* `isAuthDeviceFlowFailedData` and being filtered out by
* `asKnownDaemonEvent` (PR #4255 review C2).
*/
export type DaemonAuthDeviceFlowErrorKind =
| 'expired_token'
| 'access_denied'
| 'invalid_grant'
| 'upstream_error'
/** Disk-write / `provider.persist()` failure path. The IdP-side token
* exchange succeeded but the daemon couldn't durably store credentials
* (EACCES, EROFS, ENOSPC, etc.). Distinct from `upstream_error`. */
| 'persist_failed'
| (string & {});
export interface DaemonAuthDeviceFlowStartedData {
deviceFlowId: string;
providerId: DaemonAuthDeviceFlowProviderId;
/** Daemon-clock epoch ms when the flow's `device_code` expires. */
expiresAt: number;
[key: string]: unknown;
}
export interface DaemonAuthDeviceFlowThrottledData {
deviceFlowId: string;
/** Bumped polling interval after the daemon honored an upstream `slow_down`. */
intervalMs: number;
[key: string]: unknown;
}
export interface DaemonAuthDeviceFlowAuthorizedData {
deviceFlowId: string;
providerId: DaemonAuthDeviceFlowProviderId;
/** Credential expiry, daemon clock. Undefined when the IdP omitted `expires_in`. */
expiresAt?: number;
/** Best-effort non-PII account label (nickname / uid hash); never email/phone. */
accountAlias?: string;
[key: string]: unknown;
}
export interface DaemonAuthDeviceFlowFailedData {
deviceFlowId: string;
errorKind: DaemonAuthDeviceFlowErrorKind;
hint?: string;
[key: string]: unknown;
}
export interface DaemonAuthDeviceFlowCancelledData {
deviceFlowId: string;
[key: string]: unknown;
}
export type DaemonSessionUpdateEvent = DaemonEventEnvelope<
'session_update',
DaemonSessionUpdateData
@ -215,6 +295,34 @@ export type DaemonAgentChangedEvent = DaemonEventEnvelope<
DaemonAgentChangedData
>;
export type DaemonAuthDeviceFlowStartedEvent = DaemonEventEnvelope<
'auth_device_flow_started',
DaemonAuthDeviceFlowStartedData
>;
export type DaemonAuthDeviceFlowThrottledEvent = DaemonEventEnvelope<
'auth_device_flow_throttled',
DaemonAuthDeviceFlowThrottledData
>;
export type DaemonAuthDeviceFlowAuthorizedEvent = DaemonEventEnvelope<
'auth_device_flow_authorized',
DaemonAuthDeviceFlowAuthorizedData
>;
export type DaemonAuthDeviceFlowFailedEvent = DaemonEventEnvelope<
'auth_device_flow_failed',
DaemonAuthDeviceFlowFailedData
>;
export type DaemonAuthDeviceFlowCancelledEvent = DaemonEventEnvelope<
'auth_device_flow_cancelled',
DaemonAuthDeviceFlowCancelledData
>;
export type DaemonAuthEvent =
| DaemonAuthDeviceFlowStartedEvent
| DaemonAuthDeviceFlowThrottledEvent
| DaemonAuthDeviceFlowAuthorizedEvent
| DaemonAuthDeviceFlowFailedEvent
| DaemonAuthDeviceFlowCancelledEvent;
export type DaemonSessionEvent =
| DaemonSessionUpdateEvent
| DaemonModelSwitchedEvent
@ -246,7 +354,8 @@ export type KnownDaemonEvent =
| DaemonSessionEvent
| DaemonControlEvent
| DaemonStreamLifecycleEvent
| DaemonWorkspaceMutationEvent
| DaemonAuthEvent;
export interface DaemonSessionViewState {
lastEventId?: number;
@ -397,6 +506,26 @@ export function asKnownDaemonEvent(
return isAgentChangedData(event.data)
? (event as DaemonAgentChangedEvent)
: undefined;
case 'auth_device_flow_started':
return isAuthDeviceFlowStartedData(event.data)
? (event as DaemonAuthDeviceFlowStartedEvent)
: undefined;
case 'auth_device_flow_throttled':
return isAuthDeviceFlowThrottledData(event.data)
? (event as DaemonAuthDeviceFlowThrottledEvent)
: undefined;
case 'auth_device_flow_authorized':
return isAuthDeviceFlowAuthorizedData(event.data)
? (event as DaemonAuthDeviceFlowAuthorizedEvent)
: undefined;
case 'auth_device_flow_failed':
return isAuthDeviceFlowFailedData(event.data)
? (event as DaemonAuthDeviceFlowFailedEvent)
: undefined;
case 'auth_device_flow_cancelled':
return isAuthDeviceFlowCancelledData(event.data)
? (event as DaemonAuthDeviceFlowCancelledEvent)
: undefined;
default:
return undefined;
}
@ -551,6 +680,16 @@ export function reduceDaemonSessionEvent(
lastWorkspaceMutation: event.data,
lastWorkspaceMutationType: 'agent_changed',
};
// Auth device-flow events are workspace-scoped; the session reducer
// is a no-op (consume `lastEventId` via `base` and otherwise pass
// state through). Workspace-level state lives in `DaemonAuthState`
// and is projected by `reduceDaemonAuthEvent`.
case 'auth_device_flow_started':
case 'auth_device_flow_throttled':
case 'auth_device_flow_authorized':
case 'auth_device_flow_failed':
case 'auth_device_flow_cancelled':
return base;
default: {
const _exhaustive: never = event;
return _exhaustive;
@ -567,6 +706,227 @@ export function reduceDaemonSessionEvents(
return state;
}
/** Issue #4175 PR 21: workspace-scoped auth device-flow state. One entry
* per provider; the registry's per-provider singleton constraint is
* reflected here so adapters can render `state.flows[providerId]` without
* worrying about concurrent flows for the same provider. */
export interface DaemonDeviceFlowReducerState {
deviceFlowId: string;
status: DaemonAuthDeviceFlowStatus;
errorKind?: DaemonAuthDeviceFlowErrorKind;
hint?: string;
/** Most recent `intervalMs` reported by `auth_device_flow_throttled`. */
intervalMs?: number;
/** Most recent SSE event id observed for this flow (NOT a wall-clock
* timestamp). Used as a monotonic counter so out-of-order delivery
* doesn't let a stale frame overwrite a newer one. `undefined` if
* the underlying envelope omitted `id` (synthetic / SDK-internal
* frames). PR #4255 round-9 #6: changed from `number` (defaulting
* to 0) to `number | undefined`. The daemon-side EventBus assigns
* ids >= 1, so `0` is a sentinel that has no meaning in real
* traffic, but the monotonic gate (`rawEventId <= lastSeenEventId`)
* would reject any future synthetic frame using `id: 0`. The gate
* already short-circuits on `existing.lastSeenEventId !== undefined`,
* so undefined is safe. */
lastSeenEventId: number | undefined;
/** Set on `authorized` to the credential's expiry, when known. */
authorizedExpiresAt?: number;
/** Best-effort non-PII account label echoed from `authorized`. */
accountAlias?: string;
}
export interface DaemonAuthState {
flows: Partial<
Record<DaemonAuthDeviceFlowProviderId, DaemonDeviceFlowReducerState>
>;
}
export function createDaemonAuthState(
seed: Partial<DaemonAuthState> = {},
): DaemonAuthState {
return { flows: { ...(seed.flows ?? {}) } };
}
/**
* Apply a single auth device-flow event to a workspace-scoped auth state.
* Non-auth events (sessions, control, lifecycle) pass through unchanged so
* adapters can fan one event stream into both `reduceDaemonSessionEvent`
* (per session) and `reduceDaemonAuthEvent` (workspace-wide) without
* filtering ahead of time.
*
* Edge cases:
* - `throttled` / `authorized` / `failed` / `cancelled` for a deviceFlowId
* not matching the current `flows[providerId]` are dropped: by the time
* they arrive, that flow's terminal-grace window has already expired or
* the SDK has rebased onto a newer flow. Silently ignoring stale events
* is the correct behavior here (events are non-authoritative; the
* daemon's GET .../device-flow/:id is the source of truth).
*/
export function reduceDaemonAuthEvent(
state: DaemonAuthState,
rawEvent: DaemonEvent,
): DaemonAuthState {
const event = asKnownDaemonEvent(rawEvent);
if (!event) return state;
switch (event.type) {
case 'auth_device_flow_started': {
// PR #4255 fold-in 8 review thread #2: gate stale `started`
// frames the same way as the matching-flow handlers. SSE
// reconnect with `Last-Event-ID < started.id` would otherwise
// replay an old started for the SAME deviceFlowId after the
// SDK reducer already advanced to a terminal state, resetting
// the visible status to 'pending'. A stale started for an
// OLDER flow (different deviceFlowId, lower id than the
// current flow's lastSeenEventId) similarly gets ignored.
const providerId = event.data.providerId;
const existing = state.flows[providerId];
if (
existing !== undefined &&
rawEvent.id !== undefined &&
existing.lastSeenEventId !== undefined &&
rawEvent.id <= existing.lastSeenEventId
) {
return state;
}
return {
flows: {
...state.flows,
[providerId]: {
deviceFlowId: event.data.deviceFlowId,
status: 'pending',
lastSeenEventId: rawEvent.id ?? existing?.lastSeenEventId,
},
},
};
}
case 'auth_device_flow_throttled': {
const updated = updateMatchingFlow(
state,
event.data.deviceFlowId,
rawEvent.id,
(flow) => ({
...flow,
intervalMs: event.data.intervalMs,
lastSeenEventId: rawEvent.id ?? flow.lastSeenEventId,
}),
);
return updated ?? state;
}
case 'auth_device_flow_authorized': {
const providerId = event.data.providerId;
const existing = state.flows[providerId];
if (!existing || existing.deviceFlowId !== event.data.deviceFlowId) {
return state;
}
// PR #4255 fold-in 8 review thread #2: enforce monotonicity
// here too. The deviceFlowId equality check above narrows to
// "this frame is for the current flow"; the id gate then
// refuses out-of-order replay (e.g. a delayed `authorized`
// arriving after a more recent `failed` for the same flow,
// which the daemon's transitionTerminal would never produce
// but a malformed/synthetic stream could).
if (
rawEvent.id !== undefined &&
existing.lastSeenEventId !== undefined &&
rawEvent.id <= existing.lastSeenEventId
) {
return state;
}
const next: DaemonDeviceFlowReducerState = {
...existing,
status: 'authorized',
authorizedExpiresAt: event.data.expiresAt,
accountAlias: event.data.accountAlias,
errorKind: undefined,
lastSeenEventId: rawEvent.id ?? existing.lastSeenEventId,
};
return { flows: { ...state.flows, [providerId]: next } };
}
case 'auth_device_flow_failed': {
// The daemon's status machine reserves 'expired' for the time-based
// path (now >= expiresAt). Upstream RFC 8628 errors — including
// `expired_token` — go to 'error' with `errorKind` carrying the
// distinction. Earlier drafts collapsed `errorKind: 'expired_token'`
// to status 'expired', which gave SDK consumers a different
// status than the daemon's GET endpoint reported. Code-reviewer
// P1-9 / silent-failure D2: align with daemon, surface errorKind
// separately.
const updated = updateMatchingFlow(
state,
event.data.deviceFlowId,
rawEvent.id,
(flow) => ({
...flow,
status: 'error',
errorKind: event.data.errorKind,
hint: event.data.hint,
lastSeenEventId: rawEvent.id ?? flow.lastSeenEventId,
}),
);
return updated ?? state;
}
case 'auth_device_flow_cancelled': {
const updated = updateMatchingFlow(
state,
event.data.deviceFlowId,
rawEvent.id,
(flow) => ({
...flow,
status: 'cancelled',
lastSeenEventId: rawEvent.id ?? flow.lastSeenEventId,
}),
);
return updated ?? state;
}
default:
return state;
}
}
export function reduceDaemonAuthEvents(
events: Iterable<DaemonEvent>,
initialState: DaemonAuthState = createDaemonAuthState(),
): DaemonAuthState {
let state = initialState;
for (const event of events) state = reduceDaemonAuthEvent(state, event);
return state;
}
function updateMatchingFlow(
state: DaemonAuthState,
deviceFlowId: string,
rawEventId: number | undefined,
patch: (flow: DaemonDeviceFlowReducerState) => DaemonDeviceFlowReducerState,
): DaemonAuthState | undefined {
const entries = Object.entries(state.flows) as Array<
[DaemonAuthDeviceFlowProviderId, DaemonDeviceFlowReducerState | undefined]
>;
for (const [providerId, flow] of entries) {
if (flow && flow.deviceFlowId === deviceFlowId) {
// PR #4255 fold-in 8 review thread #2: enforce the
// monotonicity guarantee that `lastSeenEventId`'s JSDoc
// documents. Out-of-order delivery (SSE replay-then-live
// mixing) could otherwise let a stale frame overwrite a
// newer terminal state. Synthetic frames without an
// envelope `id` (rawEventId === undefined) bypass the
// gate — they originate inside the SDK reducer machinery
// (e.g. fallback paths) and aren't subject to replay
// ordering.
if (
rawEventId !== undefined &&
flow.lastSeenEventId !== undefined &&
rawEventId <= flow.lastSeenEventId
) {
return state;
}
return {
flows: { ...state.flows, [providerId]: patch(flow) },
};
}
}
return undefined;
}
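The monotonic gate used by the reducer can be illustrated on a reduced state shape. This is a minimal sketch, not the SDK's reducer: `FlowState`, `isStale`, and `applyStatus` are hypothetical names, and the event ids are made up for the example.

```typescript
interface FlowState {
  status: 'pending' | 'authorized' | 'error' | 'cancelled';
  lastSeenEventId: number | undefined;
}

// True when a frame should be ignored because an equal or newer event id
// was already applied. Frames without an id bypass the gate.
function isStale(flow: FlowState, eventId: number | undefined): boolean {
  return (
    eventId !== undefined &&
    flow.lastSeenEventId !== undefined &&
    eventId <= flow.lastSeenEventId
  );
}

// Applies a status change unless the frame is stale. An id-less frame keeps
// the previously recorded event id.
function applyStatus(
  flow: FlowState,
  status: FlowState['status'],
  eventId: number | undefined,
): FlowState {
  if (isStale(flow, eventId)) return flow;
  return { status, lastSeenEventId: eventId ?? flow.lastSeenEventId };
}

let flow: FlowState = { status: 'pending', lastSeenEventId: 3 };
flow = applyStatus(flow, 'authorized', 7); // newer id: applied
const afterReplay = applyStatus(flow, 'error', 5); // replayed older frame: ignored
const afterSynthetic = applyStatus(flow, 'cancelled', undefined); // no id: applied

console.log(afterReplay.status); // authorized
console.log(afterSynthetic.status, afterSynthetic.lastSeenEventId); // cancelled 7
```

Returning the same object for a stale frame keeps reference equality intact, which lets callers skip re-rendering when nothing changed.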
function isKnownDaemonEventTypeName(
type: string,
): type is DaemonKnownEventType {
@ -731,6 +1091,70 @@ function isAgentChangedData(value: unknown): value is DaemonAgentChangedData {
);
}
function isAuthDeviceFlowStartedData(
value: unknown,
): value is DaemonAuthDeviceFlowStartedData {
return (
isRecord(value) &&
isNonEmptyString(value['deviceFlowId']) &&
isNonEmptyString(value['providerId']) &&
isFiniteNumber(value['expiresAt'])
);
}
function isAuthDeviceFlowThrottledData(
value: unknown,
): value is DaemonAuthDeviceFlowThrottledData {
return (
isRecord(value) &&
isNonEmptyString(value['deviceFlowId']) &&
isFiniteNumber(value['intervalMs'])
);
}
function isAuthDeviceFlowAuthorizedData(
value: unknown,
): value is DaemonAuthDeviceFlowAuthorizedData {
return (
isRecord(value) &&
isNonEmptyString(value['deviceFlowId']) &&
isNonEmptyString(value['providerId']) &&
isOptionalNumber(value['expiresAt']) &&
isOptionalStringOrNull(value['accountAlias'])
);
}
function isAuthDeviceFlowFailedData(
value: unknown,
): value is DaemonAuthDeviceFlowFailedData {
return (
isRecord(value) &&
isNonEmptyString(value['deviceFlowId']) &&
isAuthDeviceFlowErrorKind(value['errorKind']) &&
isOptionalStringOrNull(value['hint'])
);
}
function isAuthDeviceFlowCancelledData(
value: unknown,
): value is DaemonAuthDeviceFlowCancelledData {
return isRecord(value) && isNonEmptyString(value['deviceFlowId']);
}
function isAuthDeviceFlowErrorKind(
value: unknown,
): value is DaemonAuthDeviceFlowErrorKind {
// Forward-compat: accept ANY non-empty string. The earlier closed
// allowlist would silently drop a daemon-emitted `failed` event with
// a future errorKind (e.g. `rate_limited`) — `asKnownDaemonEvent`
// would treat it as malformed and `reduceDaemonAuthEvent` never
// transitions the flow's status, leaving SDK consumers stuck on
// `pending` (PR #4255 review C2). The known literals still narrow
// exhaustively in consumer `switch` statements; unknown kinds fall
// into the `(string & {})` arm of the union for graceful handling.
return typeof value === 'string' && value.length > 0;
}
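The open-union pattern behind this guard can be shown with a consumer-side `switch`. This is a minimal sketch under stated assumptions: `ErrorKind`, `isErrorKind`, and `describeErrorKind` are hypothetical names, the union lists only a subset of the known literals, and the message strings are invented for illustration.

```typescript
// Known literals keep their narrowing; `(string & {})` keeps the union open
// so kinds added by a newer daemon still type-check.
type ErrorKind =
  | 'expired_token'
  | 'access_denied'
  | 'persist_failed'
  | (string & {});

// Accepts any non-empty string, so unknown future kinds are not rejected.
function isErrorKind(value: unknown): value is ErrorKind {
  return typeof value === 'string' && value.length > 0;
}

// Known kinds get a specific message; unknown kinds fall through to a
// generic one instead of being dropped.
function describeErrorKind(kind: ErrorKind): string {
  switch (kind) {
    case 'expired_token':
      return 'The device code expired; start a new flow.';
    case 'access_denied':
      return 'The login request was declined.';
    case 'persist_failed':
      return 'Login succeeded, but credentials could not be saved.';
    default:
      return `Login failed (${kind}).`;
  }
}

console.log(isErrorKind('rate_limited')); // true
console.log(isErrorKind('')); // false
console.log(describeErrorKind('rate_limited')); // Login failed (rate_limited).
```

The trade-off in this sketch is that the compiler can no longer flag a missing `case`, so the `default` branch has to carry a sensible fallback.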
function isPermissionOption(value: unknown): value is DaemonPermissionOption {
return isRecord(value) && isNonEmptyString(value['optionId']);
}

@ -13,6 +13,12 @@ export {
type RestoreSessionRequest,
type SubscribeOptions,
} from './DaemonClient.js';
export {
DaemonAuthFlow,
DEVICE_FLOW_EXPIRY_GRACE_MS,
type AwaitCompletionOptions,
type DaemonAuthFlowHandle,
} from './DaemonAuthFlow.js';
export {
DaemonSessionClient,
type DaemonSessionClientOptions,
@ -20,9 +26,12 @@ export {
} from './DaemonSessionClient.js';
export {
asKnownDaemonEvent,
createDaemonAuthState,
createDaemonSessionViewState,
isDaemonEventType,
isKnownDaemonEvent,
reduceDaemonAuthEvent,
reduceDaemonAuthEvents,
reduceDaemonSessionEvent,
reduceDaemonSessionEvents,
} from './events.js';
@ -70,6 +79,22 @@ export type {
DaemonStreamErrorEvent,
DaemonStreamLifecycleEvent,
DaemonWorkspaceMutationEvent,
DaemonAuthDeviceFlowProviderId,
DaemonAuthDeviceFlowStatus,
DaemonAuthDeviceFlowErrorKind,
DaemonAuthDeviceFlowStartedData,
DaemonAuthDeviceFlowStartedEvent,
DaemonAuthDeviceFlowThrottledData,
DaemonAuthDeviceFlowThrottledEvent,
DaemonAuthDeviceFlowAuthorizedData,
DaemonAuthDeviceFlowAuthorizedEvent,
DaemonAuthDeviceFlowFailedData,
DaemonAuthDeviceFlowFailedEvent,
DaemonAuthDeviceFlowCancelledData,
DaemonAuthDeviceFlowCancelledEvent,
DaemonAuthEvent,
DaemonDeviceFlowReducerState,
DaemonAuthState,
KnownDaemonEvent,
} from './events.js';
export type {
@ -90,6 +115,13 @@ export type {
DaemonProtocolVersions,
DaemonRestoredSession,
DaemonSession,
DaemonAuthProviderId,
DaemonAuthDeviceFlowSdkStatus,
DaemonAuthDeviceFlowSdkErrorKind,
DaemonAuthProviderStatus,
DaemonAuthStatusSnapshot,
DaemonDeviceFlowStartResult,
DaemonDeviceFlowState,
DaemonSessionContextStatus,
DaemonSessionState,
DaemonSessionSummary,

@ -621,6 +621,85 @@ export interface HeartbeatResult {
lastSeenAt: number;
}
/** Issue #4175 PR 21 — auth device-flow wire types. */
export type DaemonAuthProviderId = 'qwen-oauth' | (string & {});
// PR #4255 review S4: Sdk-prefixed aliases USED to be parallel literal
// unions, which silently diverged from the canonical event-side types
// the moment one was extended. Single-source the canonical definitions
// from `./events.js` so a single source of truth governs both layers
// (event payloads + REST wire shapes). TypeScript handles the
// circular type-only import cleanly because there is no runtime
// dependency direction. Local `type X = ...` aliases (rather than a
// re-export) make the symbols usable INSIDE this module too — required
// by `DaemonDeviceFlowState` / `DaemonAuthProviderStatus` below.
import type {
DaemonAuthDeviceFlowStatus,
DaemonAuthDeviceFlowErrorKind,
} from './events.js';
export type DaemonAuthDeviceFlowSdkStatus = DaemonAuthDeviceFlowStatus;
export type DaemonAuthDeviceFlowSdkErrorKind = DaemonAuthDeviceFlowErrorKind;
/** Returned from `POST /workspace/auth/device-flow`. */
export interface DaemonDeviceFlowStartResult {
deviceFlowId: string;
providerId: DaemonAuthProviderId;
status: DaemonAuthDeviceFlowSdkStatus;
userCode: string;
verificationUri: string;
verificationUriComplete?: string;
expiresAt: number;
intervalMs: number;
/** True iff the daemon returned an existing pending entry rather than
* starting a fresh flow (per-provider singleton take-over). */
attached: boolean;
initiatorClientId?: string;
}
/** Returned from `GET /workspace/auth/device-flow/:id`. */
export interface DaemonDeviceFlowState {
deviceFlowId: string;
providerId: DaemonAuthProviderId;
status: DaemonAuthDeviceFlowSdkStatus;
errorKind?: DaemonAuthDeviceFlowSdkErrorKind;
hint?: string;
userCode?: string;
verificationUri?: string;
verificationUriComplete?: string;
expiresAt?: number;
intervalMs?: number;
lastPolledAt?: number;
createdAt: number;
initiatorClientId?: string;
}
export interface DaemonAuthProviderStatus extends DaemonStatusCell {
kind: 'auth_provider';
providerId: DaemonAuthProviderId;
expiresAt?: number;
/** Best-effort non-PII account label. Never email/phone/username. */
accountAlias?: string;
}
/** Returned from `GET /workspace/auth/status`. */
export interface DaemonAuthStatusSnapshot {
v: 1;
workspaceCwd: string;
/** Currently registered providers and their auth status. */
providers: DaemonAuthProviderStatus[];
/** Pending flows; userCode/verificationUri intentionally redacted (the
* full record is fetched via GET /workspace/auth/device-flow/:id). */
pendingDeviceFlows: Array<{
deviceFlowId: string;
providerId: DaemonAuthProviderId;
expiresAt: number;
}>;
/** Provider ids the daemon advertises support for under
* `POST /workspace/auth/device-flow`. */
supportedDeviceFlowProviders: DaemonAuthProviderId[];
}
/** A frame in the SSE event stream. */
export interface DaemonEvent {
/**
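The `'qwen-oauth' | (string & {})` shape of `DaemonAuthProviderId` keeps editor completion for the known literal while still accepting ids a newer daemon might send. A minimal sketch of how a consumer might branch on such a type follows; `ProviderId` and `describeProvider` are hypothetical names used only for illustration and are not part of the SDK.

```typescript
// Hypothetical stand-in for DaemonAuthProviderId: one known literal plus an open string arm.
type ProviderId = 'qwen-oauth' | (string & {});

// Branch on the known literal first; every other string falls into the default arm.
function describeProvider(id: ProviderId): string {
  switch (id) {
    case 'qwen-oauth':
      return 'Qwen OAuth (device flow)';
    default:
      // Unknown ids are still valid values of the type, so handle them generically.
      return `unrecognized provider: ${id}`;
  }
}
```

Calling `describeProvider('future-idp')` type-checks and takes the default arm, which is the behavior the open arm is meant to allow.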


@ -108,6 +108,44 @@ export {
type SubscribeOptions,
} from './daemon/index.js';
// PR #4255 fold-in 9 review thread #11 — Issue #4175 PR 21 auth
// surface. These were re-exported from `./daemon/index.js` but the
// public SDK entry (this file) never re-exported them, so an
// `import { DaemonAuthFlow } from '@qwen-code/sdk'` resolved to
// undefined. The PR description lists `reduceDaemonAuthEvent` as
// SDK surface and `client.auth.start()` works only because
// `DaemonClient` (already exported above) constructs `DaemonAuthFlow`
// internally; every other API path was unreachable.
export {
DaemonAuthFlow,
DEVICE_FLOW_EXPIRY_GRACE_MS,
createDaemonAuthState,
reduceDaemonAuthEvent,
reduceDaemonAuthEvents,
type AwaitCompletionOptions,
type DaemonAuthDeviceFlowAuthorizedData,
type DaemonAuthDeviceFlowAuthorizedEvent,
type DaemonAuthDeviceFlowCancelledData,
type DaemonAuthDeviceFlowCancelledEvent,
type DaemonAuthDeviceFlowErrorKind,
type DaemonAuthDeviceFlowFailedData,
type DaemonAuthDeviceFlowFailedEvent,
type DaemonAuthDeviceFlowProviderId,
type DaemonAuthDeviceFlowStartedData,
type DaemonAuthDeviceFlowStartedEvent,
type DaemonAuthDeviceFlowStatus,
type DaemonAuthDeviceFlowThrottledData,
type DaemonAuthDeviceFlowThrottledEvent,
type DaemonAuthEvent,
type DaemonAuthFlowHandle,
type DaemonAuthProviderId,
type DaemonAuthState,
type DaemonAuthStatusSnapshot,
type DaemonDeviceFlowReducerState,
type DaemonDeviceFlowStartResult,
type DaemonDeviceFlowState,
} from './daemon/index.js';
// SDK MCP Server exports
export { tool } from './mcp/tool.js';
export { createSdkMcpServer } from './mcp/createSdkMcpServer.js';


@ -0,0 +1,411 @@
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { describe, expect, it } from 'vitest';
import { DaemonAuthFlow } from '../../src/daemon/DaemonAuthFlow.js';
import {
DaemonHttpError,
type DaemonClient,
} from '../../src/daemon/DaemonClient.js';
import type {
DaemonDeviceFlowStartResult,
DaemonDeviceFlowState,
} from '../../src/daemon/types.js';
// PR #4255 fold-in 10 #2: covers `DaemonAuthFlow`'s `start()` +
// `awaitCompletion()` state machine end-to-end. The class is the
// primary SDK entry point in PR 21's user-facing surface
// (`client.auth.start({providerId}).awaitCompletion()`); this file
// exercises the production paths the round-8 reviewer flagged as
// untested:
// - happy path polling → `authorized`
// - `slow_down` interval bumping + `onThrottled` callback
// - `AbortSignal` propagation through both polling and the GET
// - `timeoutMs` ceiling (incl. honoring `timeoutMs: 0`, round-9 #6)
// - 404 → synthetic `error`/`not_found_or_evicted` (loop AND ceiling)
// - `sanitizePositiveMs` edge cases (NaN / Infinity fallback)
// - `cancel()` wrapper forwards to `client.cancelDeviceFlow`
interface FakeClientCalls {
start: number;
get: Array<{
deviceFlowId: string;
clientId?: string;
signal?: AbortSignal;
}>;
cancel: Array<{ deviceFlowId: string; clientId?: string }>;
}
function makeFakeClient(opts: {
startResult?: DaemonDeviceFlowStartResult;
/** Sequenced replies for `getDeviceFlow`. The Nth call returns the
* Nth entry; if the list runs out, the LAST entry is repeated.
* Either a `DaemonDeviceFlowState` or a thrown error. */
getReplies: Array<DaemonDeviceFlowState | Error>;
}): { client: DaemonClient; calls: FakeClientCalls } {
const calls: FakeClientCalls = { start: 0, get: [], cancel: [] };
const startResult: DaemonDeviceFlowStartResult = opts.startResult ?? {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
userCode: 'USER-1',
verificationUri: 'https://idp.example/verify',
expiresAt: Date.now() + 60_000,
intervalMs: 50, // tests use small intervals so polling is fast
attached: false,
};
const replies = [...opts.getReplies];
const fake = {
async startDeviceFlow(_opts: {
providerId: string;
clientId?: string;
}): Promise<DaemonDeviceFlowStartResult> {
calls.start += 1;
return startResult;
},
async getDeviceFlow(
deviceFlowId: string,
callOpts: { clientId?: string; signal?: AbortSignal } = {},
): Promise<DaemonDeviceFlowState> {
calls.get.push({
deviceFlowId,
...(callOpts.clientId !== undefined
? { clientId: callOpts.clientId }
: {}),
...(callOpts.signal !== undefined ? { signal: callOpts.signal } : {}),
});
const reply = replies.length > 1 ? replies.shift()! : replies[0];
if (reply instanceof Error) throw reply;
return reply;
},
async cancelDeviceFlow(
deviceFlowId: string,
callOpts: { clientId?: string } = {},
): Promise<void> {
calls.cancel.push({
deviceFlowId,
...(callOpts.clientId !== undefined
? { clientId: callOpts.clientId }
: {}),
});
},
};
return { client: fake as unknown as DaemonClient, calls };
}
describe('DaemonAuthFlow.start (fold-in 10 #2)', () => {
it('returns a handle pinned to the daemon-supplied start result', async () => {
const { client } = makeFakeClient({
getReplies: [
// Will only be called if awaitCompletion runs; this test only
// exercises start, so reply shape is irrelevant.
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
expect(handle.deviceFlowId).toBe('flow-A');
expect(handle.providerId).toBe('qwen-oauth');
expect(handle.userCode).toBe('USER-1');
expect(handle.attached).toBe(false);
expect(handle.intervalMs).toBe(50);
});
});
describe('DaemonAuthFlow.awaitCompletion (fold-in 10 #2)', () => {
it('polls until the daemon reports a terminal state (authorized)', async () => {
const expiresAt = Date.now() + 5_000;
const { client, calls } = makeFakeClient({
getReplies: [
// First two GETs: still pending.
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
// Third GET: terminal authorized.
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'authorized',
expiresAt,
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({ pollOverrideMs: 1_000 });
expect(final.status).toBe('authorized');
expect(final.expiresAt).toBe(expiresAt);
expect(calls.get.length).toBeGreaterThanOrEqual(3);
});
it('honors `slow_down`-driven intervalMs bumps via onThrottled callback', async () => {
const observedIntervals: number[] = [];
const { client } = makeFakeClient({
getReplies: [
// First GET: daemon reports a bumped interval (slow_down).
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
intervalMs: 10_000, // bumped from start's 50
createdAt: Date.now(),
},
// Second GET: terminal so the loop exits.
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'authorized',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
await handle.awaitCompletion({
onThrottled: (ms) => observedIntervals.push(ms),
pollOverrideMs: 1_000,
});
expect(observedIntervals).toContain(10_000);
});
it('rejects when opts.signal is aborted mid-poll', async () => {
// Replies stream forever as `pending` — caller's abort must be the
// exit path.
const { client } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const ctrl = new AbortController();
const completion = handle.awaitCompletion({
signal: ctrl.signal,
pollOverrideMs: 1_000,
});
// Fire abort on the very next microtask so the loop's signal check
// sees it before issuing another GET.
queueMicrotask(() => ctrl.abort(new Error('test-cancel')));
await expect(completion).rejects.toThrowError(/test-cancel/);
});
it('forwards opts.signal into client.getDeviceFlow on every GET (fold-in 7 #6)', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'authorized',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const ctrl = new AbortController();
await handle.awaitCompletion({ signal: ctrl.signal });
expect(calls.get.length).toBeGreaterThanOrEqual(1);
expect(calls.get[0]?.signal).toBe(ctrl.signal);
});
it('returns the final GET snapshot at the timeoutMs ceiling', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
// Stays pending forever; timeoutMs ceiling is what exits.
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({
timeoutMs: 60, // very short; ceiling fires after a few ticks
pollOverrideMs: 1_000,
});
expect(final.status).toBe('pending');
expect(calls.get.length).toBeGreaterThanOrEqual(1);
});
it('honors timeoutMs:0 — returns the daemon snapshot immediately (round-9 #6)', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({ timeoutMs: 0 });
expect(final.status).toBe('pending');
// Exactly one GET — the immediate ceiling-read path.
expect(calls.get.length).toBe(1);
});
it('falls back to default ceiling when timeoutMs is NaN (sanitizePositiveMs)', async () => {
// NaN was the bug fold-in 7 #5 fixed: previously `?? default`
// accepted NaN and produced `ceiling = NaN`, looping forever
// (`now >= NaN` is always false). The sanitized form drops NaN
// to undefined which falls back to `expiresAt + GRACE`.
//
// Test pins the contract by using a start result whose
// `expiresAt` is FAR in the past — `expiresAt + GRACE` is then
// also in the past, so the ceiling check on iteration 1 fires
// immediately and the test bails fast. If sanitization broke
// and NaN slipped through, the loop would never exit.
const { client } = makeFakeClient({
startResult: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
userCode: 'USER-1',
verificationUri: 'https://idp.example/verify',
expiresAt: Date.now() - 60_000, // expiresAt + GRACE is still in the past → bail
intervalMs: 50,
attached: false,
},
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({ timeoutMs: NaN });
expect(final.status).toBe('pending');
});
it('synthesizes error/not_found_or_evicted on a 404 from getDeviceFlow (fold-in 3 #4)', async () => {
const { client } = makeFakeClient({
getReplies: [new DaemonHttpError(404, null, 'not found')],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({ pollOverrideMs: 1_000 });
expect(final.status).toBe('error');
expect(final.errorKind).toBe('not_found_or_evicted');
expect(final.providerId).toBe('qwen-oauth');
});
it('routes the timeoutMs:0 ceiling read through the same 404 helper (fold-in 7 #4)', async () => {
// Pre-fold-in-7 #4, the ceiling read called getDeviceFlow
// directly and a 404 there would reject `awaitCompletion` with
// `DaemonHttpError(404)` instead of returning the structured
// synthetic state. With timeoutMs:0 the FIRST read is the
// ceiling read — verify the 404 still synthesizes.
const { client } = makeFakeClient({
getReplies: [new DaemonHttpError(404, null, 'evicted')],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
const final = await handle.awaitCompletion({ timeoutMs: 0 });
expect(final.status).toBe('error');
expect(final.errorKind).toBe('not_found_or_evicted');
});
it('rethrows non-404 DaemonHttpErrors so the SDK consumer sees the daemon-side failure', async () => {
const { client } = makeFakeClient({
getReplies: [new DaemonHttpError(500, null, 'daemon exploded')],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({ providerId: 'qwen-oauth' });
await expect(
handle.awaitCompletion({ pollOverrideMs: 1_000 }),
).rejects.toBeInstanceOf(DaemonHttpError);
});
});
describe('DaemonAuthFlow.cancel (fold-in 10 #2)', () => {
it('forwards to client.cancelDeviceFlow with the captured deviceFlowId + clientId', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const handle = await auth.start({
providerId: 'qwen-oauth',
clientId: 'sdk-client-X',
});
await handle.cancel();
expect(calls.cancel).toEqual([
{ deviceFlowId: 'flow-A', clientId: 'sdk-client-X' },
]);
});
it('top-level cancel(deviceFlowId) wrapper also forwards to the client', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
await auth.cancel('flow-Z', { clientId: 'admin-1' });
expect(calls.cancel).toEqual([
{ deviceFlowId: 'flow-Z', clientId: 'admin-1' },
]);
});
it('top-level status(deviceFlowId) wrapper forwards to client.getDeviceFlow', async () => {
const { client, calls } = makeFakeClient({
getReplies: [
{
deviceFlowId: 'flow-Q',
providerId: 'qwen-oauth',
status: 'authorized',
createdAt: Date.now(),
},
],
});
const auth = new DaemonAuthFlow(client);
const result = await auth.status('flow-Q', { clientId: 'admin-1' });
expect(result.status).toBe('authorized');
expect(calls.get).toEqual([
{ deviceFlowId: 'flow-Q', clientId: 'admin-1' },
]);
});
});
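The `timeoutMs` tests above pin down how the polling ceiling is chosen: `0` is honored, while `NaN` and `Infinity` fall back to `expiresAt + GRACE`. A minimal sketch of that rule, under stated assumptions, is shown here. `sanitizeMs`, `computeCeiling`, and the `GRACE_MS` value are hypothetical; the real `sanitizePositiveMs` helper and the value of `DEVICE_FLOW_EXPIRY_GRACE_MS` are not shown in this diff.

```typescript
// Assumed grace window for illustration only; the SDK's actual constant may differ.
const GRACE_MS = 30_000;

// Hypothetical helper: keep only finite, non-negative millisecond values.
function sanitizeMs(value: number | undefined): number | undefined {
  return typeof value === 'number' && Number.isFinite(value) && value >= 0
    ? value
    : undefined;
}

// Absolute deadline (epoch ms) after which polling stops and the last snapshot is returned.
function computeCeiling(now: number, expiresAt: number, timeoutMs?: number): number {
  const sanitized = sanitizeMs(timeoutMs);
  // A bare `timeoutMs ?? fallback` would let NaN through, and `now >= NaN` is always false.
  return sanitized !== undefined ? now + sanitized : expiresAt + GRACE_MS;
}
```

With this shape, `computeCeiling(now, expiresAt, 0)` equals `now`, so the first ceiling check fires immediately, which matches the single-GET expectation in the `timeoutMs:0` test.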


@ -1576,4 +1576,203 @@ describe('DaemonClient', () => {
}
});
});
// PR #4255 fold-in 10 #3 — device-flow HTTP method coverage. The
// round-8 reviewer flagged that `startDeviceFlow` /
// `getDeviceFlow` / `cancelDeviceFlow` / `getAuthStatus` plus the
// `client.auth` lazy getter had zero unit tests; this block
// exercises route paths, HTTP methods, signal forwarding (fold-in
// 7 #6), and the `failOnError` → `DaemonHttpError` mapping.
describe('device-flow methods (fold-in 10 #3)', () => {
it('startDeviceFlow POSTs /workspace/auth/device-flow + forwards body / clientId header', async () => {
const { fetch, calls } = recordingFetch(() =>
jsonResponse(201, {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
userCode: 'USER-1',
verificationUri: 'https://idp.example/verify',
expiresAt: 1_700_000_000_000,
intervalMs: 5_000,
attached: false,
}),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
const res = await client.startDeviceFlow({
providerId: 'qwen-oauth',
clientId: 'sdk-X',
});
expect(res.deviceFlowId).toBe('flow-A');
expect(res.attached).toBe(false);
const call = calls[0];
expect(call?.url).toBe('http://daemon/workspace/auth/device-flow');
expect(call?.method).toBe('POST');
expect(call?.headers['x-qwen-client-id']).toBe('sdk-X');
expect(JSON.parse(call?.body ?? '{}')).toEqual({
providerId: 'qwen-oauth',
});
});
it('startDeviceFlow accepts 200 (take-over branch) and 201 (fresh) identically', async () => {
const body = {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
userCode: 'USER-1',
verificationUri: 'https://idp.example/verify',
expiresAt: 1_700_000_000_000,
intervalMs: 5_000,
attached: true,
};
for (const status of [200, 201]) {
const { fetch } = recordingFetch(() => jsonResponse(status, body));
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
await expect(
client.startDeviceFlow({ providerId: 'qwen-oauth' }),
).resolves.toMatchObject({ attached: true });
}
});
it('startDeviceFlow throws DaemonHttpError on non-2xx (e.g. 502 upstream_error)', async () => {
const { fetch } = recordingFetch(() =>
jsonResponse(502, { error: 'upstream', code: 'upstream_error' }),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
await expect(
client.startDeviceFlow({ providerId: 'qwen-oauth' }),
).rejects.toBeInstanceOf(DaemonHttpError);
});
it('getDeviceFlow GETs /workspace/auth/device-flow/:id with URL-encoded id', async () => {
const { fetch, calls } = recordingFetch(() =>
jsonResponse(200, {
deviceFlowId: 'flow with space',
providerId: 'qwen-oauth',
status: 'authorized',
createdAt: 1_700_000_000_000,
}),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
const res = await client.getDeviceFlow('flow with space');
expect(res.status).toBe('authorized');
// RFC 3986 / encodeURIComponent — `' '` → `%20`.
expect(calls[0]?.url).toBe(
'http://daemon/workspace/auth/device-flow/flow%20with%20space',
);
expect(calls[0]?.method).toBe('GET');
});
it('getDeviceFlow forwards opts.signal into fetch (fold-in 7 #6)', async () => {
const ctrl = new AbortController();
let observedSignal: AbortSignal | undefined;
const fetchImpl = vi.fn(
async (_input: RequestInfo | URL, init?: RequestInit) => {
observedSignal = init?.signal ?? undefined;
return jsonResponse(200, {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
status: 'pending',
createdAt: 1_700_000_000_000,
});
},
) as unknown as typeof globalThis.fetch;
const client = new DaemonClient({
baseUrl: 'http://daemon',
fetch: fetchImpl,
});
await client.getDeviceFlow('flow-A', { signal: ctrl.signal });
// The signal that reaches fetch is COMPOSED with the per-request
// timeout controller (composeAbortSignals), so we can't assert
// identity. Instead verify that aborting the caller's signal
// propagates to the signal fetch received.
expect(observedSignal).toBeDefined();
expect(observedSignal!.aborted).toBe(false);
ctrl.abort(new Error('caller-cancel'));
expect(observedSignal!.aborted).toBe(true);
});
it('getDeviceFlow throws DaemonHttpError(404) on missing/evicted id', async () => {
const { fetch } = recordingFetch(() =>
jsonResponse(404, {
error: 'not found',
code: 'device_flow_not_found',
}),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
const err = await client
.getDeviceFlow('nonexistent')
.catch((e: unknown) => e);
expect(err).toBeInstanceOf(DaemonHttpError);
expect((err as DaemonHttpError).status).toBe(404);
});
it('cancelDeviceFlow DELETEs /workspace/auth/device-flow/:id and resolves on 204', async () => {
const { fetch, calls } = recordingFetch(
() =>
new Response(null, {
status: 204,
}),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
await expect(
client.cancelDeviceFlow('flow-A', { clientId: 'sdk-Y' }),
).resolves.toBeUndefined();
expect(calls[0]?.method).toBe('DELETE');
expect(calls[0]?.headers['x-qwen-client-id']).toBe('sdk-Y');
});
it('cancelDeviceFlow swallows 404 idempotently (matches closeSession contract)', async () => {
// Per `cancelDeviceFlow`'s JSDoc + the daemon's DELETE route:
// both 204 (terminal-grace no-op) and 404 (unknown / evicted)
// resolve to undefined so retries from an SDK that has lost track
// are safe. Only statuses other than 204/404 are surfaced as errors.
const { fetch } = recordingFetch(() =>
jsonResponse(404, {
error: 'not found',
code: 'device_flow_not_found',
}),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
await expect(client.cancelDeviceFlow('nope')).resolves.toBeUndefined();
});
it('cancelDeviceFlow throws DaemonHttpError on non-204/404 (e.g. 500)', async () => {
const { fetch } = recordingFetch(() =>
jsonResponse(500, { error: 'daemon exploded' }),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
await expect(client.cancelDeviceFlow('flow-A')).rejects.toBeInstanceOf(
DaemonHttpError,
);
});
it('getAuthStatus GETs /workspace/auth/status and returns the snapshot', async () => {
const snapshot = {
v: 1 as const,
workspaceCwd: '/work/bound',
providers: [],
pendingDeviceFlows: [],
supportedDeviceFlowProviders: ['qwen-oauth' as const],
};
const { fetch, calls } = recordingFetch(() =>
jsonResponse(200, snapshot),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
const res = await client.getAuthStatus();
expect(res).toEqual(snapshot);
expect(calls[0]?.url).toBe('http://daemon/workspace/auth/status');
expect(calls[0]?.method).toBe('GET');
});
it('client.auth is a lazy DaemonAuthFlow instance (constructed on first access, then cached)', async () => {
const { fetch } = recordingFetch(() =>
jsonResponse(200, { status: 'ok' }),
);
const client = new DaemonClient({ baseUrl: 'http://daemon', fetch });
const a = client.auth;
const b = client.auth;
// Same instance on subsequent reads — singleton allocation.
expect(a).toBe(b);
});
});
});
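The signal-forwarding test above asserts propagation rather than identity because the caller's signal is combined with a per-request timeout signal before it reaches `fetch`. The sketch below shows one way such a combination can work. `composeSignals` is a hypothetical helper written for illustration; the SDK's own `composeAbortSignals` implementation is not shown in this diff and may differ.

```typescript
// Hypothetical helper: returns a signal that aborts as soon as any input signal aborts.
function composeSignals(...signals: AbortSignal[]): AbortSignal {
  const composed = new AbortController();
  for (const signal of signals) {
    if (signal.aborted) {
      // An input was already aborted, so the composed signal starts out aborted.
      composed.abort(signal.reason);
      break;
    }
    // `once` removes the listener after it fires; a second abort() call is a no-op.
    signal.addEventListener('abort', () => composed.abort(signal.reason), {
      once: true,
    });
  }
  return composed.signal;
}
```

Because the returned signal belongs to a fresh controller, it is never `===` to the caller's signal, which is why the test checks `aborted` before and after `ctrl.abort()` instead.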


@ -100,4 +100,17 @@ describe('public SDK entry — typed daemon event surface (#4217)', () => {
expectTypeOf<DaemonStreamErrorData>().not.toBeNever();
expectTypeOf<DaemonPermissionOption>().not.toBeNever();
});
it('exposes the PR 21 auth device-flow surface at the public entry', () => {
// PR #4255 fold-in 9 review thread #11: the auth surface had
// been re-exported from `src/daemon/index.ts` but never from
// the published `src/index.ts`, so SDK consumers got
// `undefined` for everything except `client.auth.start()`
// (which traveled through the already-exported `DaemonClient`).
expect(typeof Public.DaemonAuthFlow).toBe('function');
expect(typeof Public.reduceDaemonAuthEvent).toBe('function');
expect(typeof Public.reduceDaemonAuthEvents).toBe('function');
expect(typeof Public.createDaemonAuthState).toBe('function');
expect(typeof Public.DEVICE_FLOW_EXPIRY_GRACE_MS).toBe('number');
});
});


@ -7,8 +7,11 @@
import { describe, expect, it } from 'vitest';
import {
asKnownDaemonEvent,
createDaemonAuthState,
createDaemonSessionViewState,
isDaemonEventType,
reduceDaemonAuthEvent,
reduceDaemonAuthEvents,
reduceDaemonSessionEvent,
reduceDaemonSessionEvents,
} from '../../src/daemon/events.js';
@ -872,3 +875,326 @@ describe('daemon event schema', () => {
});
});
});
describe('PR 21 — auth device-flow events', () => {
it('narrows the 5 device-flow event types', () => {
const types = [
'auth_device_flow_started',
'auth_device_flow_throttled',
'auth_device_flow_authorized',
'auth_device_flow_failed',
'auth_device_flow_cancelled',
] as const;
const datas: Record<(typeof types)[number], unknown> = {
auth_device_flow_started: {
deviceFlowId: 'flow-1',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_000_000,
},
auth_device_flow_throttled: {
deviceFlowId: 'flow-1',
intervalMs: 10_000,
},
auth_device_flow_authorized: {
deviceFlowId: 'flow-1',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_900_000,
accountAlias: 'user-A',
},
auth_device_flow_failed: {
deviceFlowId: 'flow-1',
errorKind: 'access_denied',
},
auth_device_flow_cancelled: {
deviceFlowId: 'flow-1',
},
};
for (const [i, type] of types.entries()) {
const event: DaemonEvent = {
id: i + 1,
v: 1,
type,
data: datas[type],
};
expect(isDaemonEventType(event, type)).toBe(true);
expect(asKnownDaemonEvent(event)?.type).toBe(type);
}
});
it('rejects malformed device-flow data via type guards', () => {
expect(
asKnownDaemonEvent({
id: 1,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'x',
providerId: 'qwen-oauth' /* missing expiresAt */,
},
}),
).toBeUndefined();
// PR #4255 fold-in 2 (C2): unknown errorKind is no longer a
// narrowing failure — the open `(string & {})` arm of the
// DaemonAuthDeviceFlowErrorKind union accepts ANY non-empty
// string so a daemon adding a new kind isn't silently dropped.
// The data IS valid; consumers branching on the known literals
// still narrow exhaustively, with unknown kinds falling into the
// string fallback arm.
const futureKind = asKnownDaemonEvent({
id: 2,
v: 1,
type: 'auth_device_flow_failed',
data: { deviceFlowId: 'x', errorKind: 'rate_limited' },
});
expect(futureKind).toBeDefined();
expect(futureKind?.type).toBe('auth_device_flow_failed');
// Empty string still rejected (truly malformed).
expect(
asKnownDaemonEvent({
id: 3,
v: 1,
type: 'auth_device_flow_failed',
data: { deviceFlowId: 'x', errorKind: '' },
}),
).toBeUndefined();
});
it('reduceDaemonAuthEvent: started → throttled → authorized projects per-provider state', () => {
const events: DaemonEvent[] = [
{
id: 1,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_900_000,
},
},
{
id: 2,
v: 1,
type: 'auth_device_flow_throttled',
data: { deviceFlowId: 'flow-A', intervalMs: 10_000 },
},
{
id: 3,
v: 1,
type: 'auth_device_flow_authorized',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_999_000,
accountAlias: 'user-A',
},
},
];
const state = reduceDaemonAuthEvents(events);
const flow = state.flows['qwen-oauth'];
expect(flow).toBeDefined();
expect(flow?.status).toBe('authorized');
expect(flow?.intervalMs).toBe(10_000);
expect(flow?.authorizedExpiresAt).toBe(1_700_000_999_000);
expect(flow?.accountAlias).toBe('user-A');
});
it('reduceDaemonAuthEvent: failed event always projects status:error + errorKind (aligned with daemon)', () => {
// Issue #4175 PR 21 fold-in 0 P1-10: SDK reducer now mirrors the
// daemon's status machine — every `failed` event resolves to
// `status: 'error'`, regardless of `errorKind`. The error nature
// (expired vs denied vs persist failure) lives in `errorKind`,
// not `status`. Earlier drafts collapsed `expired_token` to
// `status: 'expired'`, diverging from the daemon's GET response.
const expired = reduceDaemonAuthEvent(
reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 1,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-X',
providerId: 'qwen-oauth',
expiresAt: 0,
},
}),
{
id: 2,
v: 1,
type: 'auth_device_flow_failed',
data: { deviceFlowId: 'flow-X', errorKind: 'expired_token' },
},
);
expect(expired.flows['qwen-oauth']?.status).toBe('error');
expect(expired.flows['qwen-oauth']?.errorKind).toBe('expired_token');
const denied = reduceDaemonAuthEvent(
reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 3,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-Y',
providerId: 'qwen-oauth',
expiresAt: 0,
},
}),
{
id: 4,
v: 1,
type: 'auth_device_flow_failed',
data: { deviceFlowId: 'flow-Y', errorKind: 'access_denied' },
},
);
expect(denied.flows['qwen-oauth']?.status).toBe('error');
expect(denied.flows['qwen-oauth']?.errorKind).toBe('access_denied');
// P1-10 cousin: new `persist_failed` errorKind also lands as
// `status: 'error'`, with the kind preserved.
const persistFailed = reduceDaemonAuthEvent(
reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 5,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-Z',
providerId: 'qwen-oauth',
expiresAt: 0,
},
}),
{
id: 6,
v: 1,
type: 'auth_device_flow_failed',
data: { deviceFlowId: 'flow-Z', errorKind: 'persist_failed' },
},
);
expect(persistFailed.flows['qwen-oauth']?.status).toBe('error');
expect(persistFailed.flows['qwen-oauth']?.errorKind).toBe('persist_failed');
});
it('reduceDaemonAuthEvent ignores stale events that do not match the current flow', () => {
const seeded = reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 1,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 100,
},
});
const stale = reduceDaemonAuthEvent(seeded, {
id: 2,
v: 1,
type: 'auth_device_flow_authorized',
data: {
deviceFlowId: 'flow-OTHER',
providerId: 'qwen-oauth',
expiresAt: 200,
},
});
expect(stale.flows['qwen-oauth']?.status).toBe('pending');
});
it('reduceDaemonAuthEvent rejects out-of-order frames (fold-in 8 #2 monotonicity)', () => {
// Live: started(id=5) → authorized(id=10). Replay then injects a
// stale `failed` (id=7) for the same flow — without monotonicity
// it would overwrite `authorized` back to `error`/`upstream_error`.
let state = reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 5,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_900_000,
},
});
state = reduceDaemonAuthEvent(state, {
id: 10,
v: 1,
type: 'auth_device_flow_authorized',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_001_000_000,
},
});
expect(state.flows['qwen-oauth']?.status).toBe('authorized');
expect(state.flows['qwen-oauth']?.lastSeenEventId).toBe(10);
const replayedStale = reduceDaemonAuthEvent(state, {
id: 7, // stale: less than the current lastSeenEventId (10)
v: 1,
type: 'auth_device_flow_failed',
data: {
deviceFlowId: 'flow-A',
errorKind: 'upstream_error',
},
});
// Stale frame must NOT overwrite the authorized terminal.
expect(replayedStale.flows['qwen-oauth']?.status).toBe('authorized');
expect(replayedStale.flows['qwen-oauth']?.lastSeenEventId).toBe(10);
expect(replayedStale.flows['qwen-oauth']?.errorKind).toBeUndefined();
// A replayed `started` (id=4 < 10) for a DIFFERENT flow under the same
// providerId is also rejected as stale: the SDK has already
// observed the newer flow's authorized state, so the lower-id
// `started` must be a replay of an older flow that was superseded.
const replayedStartedStale = reduceDaemonAuthEvent(state, {
id: 4,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-OLD',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_500_000,
},
});
expect(replayedStartedStale.flows['qwen-oauth']?.deviceFlowId).toBe(
'flow-A',
);
expect(replayedStartedStale.flows['qwen-oauth']?.status).toBe('authorized');
});
it('reduceDaemonAuthEvent passes synthetic frames (no envelope id) through the gate', () => {
// Synthetic frames originate inside SDK reducer machinery and
// aren't subject to replay ordering — gate must let them
// through even when state's lastSeenEventId is set.
let state = reduceDaemonAuthEvent(createDaemonAuthState(), {
id: 5,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_900_000,
},
});
state = reduceDaemonAuthEvent(state, {
// No `id`: synthetic / fallback path.
v: 1,
type: 'auth_device_flow_cancelled',
data: { deviceFlowId: 'flow-A' },
});
expect(state.flows['qwen-oauth']?.status).toBe('cancelled');
});
it('reduceDaemonSessionEvent no-ops on auth events (workspace-scoped)', () => {
const initial = createDaemonSessionViewState();
const next = reduceDaemonSessionEvent(initial, {
id: 1,
v: 1,
type: 'auth_device_flow_started',
data: {
deviceFlowId: 'flow-A',
providerId: 'qwen-oauth',
expiresAt: 1_700_000_900_000,
},
});
// Only `lastEventId` advanced; everything else is the seeded zero state.
expect(next.lastEventId).toBe(1);
expect(next.alive).toBe(true);
expect(next.terminalEvent).toBeUndefined();
expect(next.unrecognizedKnownEventCount).toBe(0);
});
});
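The reducer tests above describe three gates: frames with an envelope id lower than the last one seen are dropped, frames without an id (synthetic) bypass that check, and terminal frames for a different `deviceFlowId` are ignored. A minimal sketch of that gating logic follows. `FlowState`, `Frame`, and `applyFrame` are hypothetical, trimmed-down shapes rather than the SDK's `reduceDaemonAuthEvent`, and treating an equal id as a duplicate is an assumption.

```typescript
// Hypothetical, trimmed-down shapes; the SDK's real reducer state carries more fields.
type FlowStatus = 'pending' | 'authorized' | 'error' | 'cancelled';
interface FlowState {
  deviceFlowId: string;
  status: FlowStatus;
  lastSeenEventId?: number;
}
interface Frame {
  id?: number; // absent on synthetic frames
  deviceFlowId: string;
  status: FlowStatus; // 'pending' stands in for a `started` event
}

function applyFrame(current: FlowState | undefined, frame: Frame): FlowState | undefined {
  const seen = current?.lastSeenEventId;
  // Monotonicity gate: drop frames whose envelope id is not newer than the last one seen.
  if (frame.id !== undefined && seen !== undefined && frame.id <= seen) {
    return current;
  }
  // A non-start frame for a different flow is stale; ignore it.
  if (frame.status !== 'pending' && current?.deviceFlowId !== frame.deviceFlowId) {
    return current;
  }
  const lastSeenEventId = frame.id ?? seen;
  return {
    deviceFlowId: frame.deviceFlowId,
    status: frame.status,
    ...(lastSeenEventId !== undefined ? { lastSeenEventId } : {}),
  };
}
```

Keeping the id check ahead of the flow-id check means a replayed `started` for an older flow is rejected before it can replace the current entry.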