mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-18 23:42:43 +00:00

wenshao 66ffd7cc66 feat(serve): close 3 chiga0 audit items: ringSize 4000, --max-sessions, /health?deep=1 (#3803)

Three "30-minute" items from chiga0's external architecture audit
(2026-05-11). All actionable within Stage 1 scope; remaining items
in chiga0's review (SaaS positioning, multi-token to Stage 1.5,
acp-bridge package extraction, reference orchestrator) are larger
scoping decisions deferred to Stage 1.5/2.

DEFAULT_RING_SIZE 1000 → 4000 (Risk 4):
- A single long turn can emit hundreds of frames (test plan reports
  13 for a SHORT turn, real workloads can be 10× that). 1000 was
  exhausted by a moderate turn before a 5s reconnect window
  finished. 4000 gives ~30× headroom over a typical busy turn at
  the cost of a few hundred KB RAM/session. Updated user + protocol
  docs and the daemon-client-quickstart example.

--max-sessions <n> (default 20) (Rec 3):
- New `ServeOptions.maxSessions` + matching `BridgeOptions`. Bridge
  throws `SessionLimitExceededError` when `byId.size +
  inFlightSpawns.size >= max` BEFORE issuing a fresh spawn. Attaches
  to existing sessions (single scope) bypass the cap so an idle
  daemon's reconnects keep working at-capacity. `0` disables.
  Default of 20 sized below the design's N≈50 cliff (per-session
  ~30–50 MB RSS + FD pressure). HTTP route maps to 503 with
  `Retry-After: 5` and `code: session_limit_exceeded`. Tests cover:
  cap rejection under thread scope, attach-not-counted under single
  scope, `0` disables. Documented in CLI flags table + protocol
  Common-error section.

/health?deep=1 (Risk 3):
- Default `/health` stays cheap (no bridge access). With `?deep=1`
  the response includes `sessions` and `pendingPermissions` from
  the bridge — touches state so a wedged bridge surfaces as 503
  `{status: "degraded"}` instead of "200 ok" on a zombie daemon
  (the `k8s rolling deploy will see healthy` failure mode chiga0
  flagged). Loopback-vs-non-loopback bearer-exempt logic from the
  earlier A8dZT fix is preserved via a shared handler. Tests cover:
  cheap default, deep response shape, throwing-getter → 503.

2026-05-12 07:35:52 +08:00

6.3 KiB

DaemonClient quickstart (TypeScript)

A minimal end-to-end example: start a qwen serve daemon in another terminal, then drive it from a Node script with the SDK's DaemonClient. See also: Daemon mode user guide and HTTP protocol reference.

Setup

In one terminal:

cd your-project/
qwen serve --port 4170
# → qwen serve listening on http://127.0.0.1:4170 (mode=http-bridge)

In another:

npm install @qwen-code/sdk
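
If your script starts right after the daemon, it can help to wait until the daemon answers before doing anything else. The following is a minimal sketch, not part of the SDK: `waitUntilReady` and its options are hypothetical names, and `check` stands in for any call that fails while the daemon is down (for example `() => client.health()`, which the Authentication section of this guide uses).

```typescript
// Hypothetical helper, not part of @qwen-code/sdk: retries `check` until it
// resolves or the attempt budget runs out.
async function waitUntilReady(
  check: () => Promise<unknown>,
  { attempts = 20, delayMs = 250 }: { attempts?: number; delayMs?: number } = {},
): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await check();
      return attempt; // how many attempts it took to get an answer
    } catch (err) {
      lastError = err;
      // Sleep between attempts, but not after the final failure.
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw new Error(
    `daemon not ready after ${attempts} attempts: ${String(lastError)}`,
  );
}
```

With the client from the next section, usage would look like `await waitUntilReady(() => client.health())`.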

Hello daemon

import { DaemonClient, type DaemonEvent } from '@qwen-code/sdk';

const client = new DaemonClient({
  baseUrl: 'http://127.0.0.1:4170',
  // token: process.env.QWEN_SERVER_TOKEN, // required for non-loopback binds
});

// 1. Confirm we can reach the daemon and gate UI on its features.
const caps = await client.capabilities();
console.log('Daemon features:', caps.features);

// 2. Spawn-or-attach a session for the current workspace.
const session = await client.createOrAttachSession({
  workspaceCwd: process.cwd(),
});
console.log(`session=${session.sessionId} attached=${session.attached}`);

// 3. Subscribe to the event stream. Pass `lastEventId: 0` so the daemon
//    replays everything from the session's start — without it, there's
//    a TOCTOU window between `subscribeEvents()` returning the iterator
//    and the underlying SSE connection actually opening (one fetch
//    round-trip), during which a fast-starting agent can emit events
//    that go into the per-session ring but won't be streamed to a fresh
//    no-cursor subscriber. `lastEventId: 0` makes the replay buffer
//    cover that gap (and any reconnect later — see below).
const abort = new AbortController();
const subscription = (async () => {
  for await (const event of client.subscribeEvents(session.sessionId, {
    signal: abort.signal,
    lastEventId: 0,
  })) {
    handleEvent(event);
  }
})();

// 4. Send a prompt and wait for it to settle. (Order-of-operations
//    note: even if `prompt()` fires before the SSE handshake
//    completes, step 3's `lastEventId: 0` guarantees every event
//    lands in the iterator.)
const result = await client.prompt(session.sessionId, {
  prompt: [{ type: 'text', text: 'Summarize src/main.ts in one sentence.' }],
});
console.log('stop reason:', result.stopReason);

// 5. Tear down the subscription so the script can exit.
abort.abort();
await subscription;

function handleEvent(event: DaemonEvent): void {
  switch (event.type) {
    case 'session_update': {
      const data = event.data as {
        sessionUpdate: string;
        content?: { text?: string };
      };
      if (data.sessionUpdate === 'agent_message_chunk' && data.content?.text) {
        process.stdout.write(data.content.text);
      }
      break;
    }
    case 'permission_request':
      // See "Voting on permissions" below for first-responder semantics.
      console.log('\n[needs permission]', event.data);
      break;
    case 'permission_resolved':
      console.log('\n[permission resolved]', event.data);
      break;
    case 'session_died':
      console.error('\n[agent crashed]', event.data);
      break;
    default:
      console.log(`\n[${event.type}]`, event.data);
  }
}
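
The handler above prints chunks as they arrive. If you would rather collect the agent's reply into one string, a small reducer is enough. This is a sketch that relies only on the event fields shown above (`type`, `data.sessionUpdate`, `data.content.text`); `EventLike` and `appendAgentText` are hypothetical names, and `EventLike` is a local stand-in for the SDK's `DaemonEvent` so the snippet compiles on its own.

```typescript
// Local structural stand-in for DaemonEvent (assumption: only these fields
// are needed here; the real type comes from '@qwen-code/sdk').
interface EventLike {
  id?: number;
  type: string;
  data: unknown;
}

// Returns `buffer` with the event's text appended when the event is an
// agent message chunk, and `buffer` unchanged for every other event.
function appendAgentText(buffer: string, event: EventLike): string {
  if (event.type !== 'session_update') return buffer;
  const data = event.data as {
    sessionUpdate?: string;
    content?: { text?: string };
  } | null;
  if (data?.sessionUpdate === 'agent_message_chunk' && data.content?.text) {
    return buffer + data.content.text;
  }
  return buffer;
}
```

Inside the `for await` loop you could keep `reply = appendAgentText(reply, event)`; over an already-collected array, `events.reduce(appendAgentText, '')` gives the same result.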

Reconnect with `Last-Event-ID`

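If the connection drops mid-session, resume from the last event id you saw to replay the events you missed. To survive a restart of the client process itself, persist the cursor somewhere durable: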

let cursor: number | undefined;

for await (const event of client.subscribeEvents(session.sessionId, {
  signal: abort.signal,
  lastEventId: cursor, // resume from after this id; undefined = live only
})) {
  if (typeof event.id === 'number') cursor = event.id;
  handleEvent(event);
}

The daemon retains the last 4000 events per session in a ring buffer; events older than that window can no longer be replayed.
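
The loop above ends if the stream throws. One way to make it self-healing is to wrap the subscription in a generator that reopens it from the last seen id. This is a sketch under stated assumptions: `resumable` and its options are hypothetical names, `open` stands in for something like `(lastEventId) => client.subscribeEvents(sessionId, { signal, lastEventId })`, and it assumes a dropped connection surfaces as a thrown error from the iterator.

```typescript
interface IdEvent {
  id?: number;
}

// Hypothetical wrapper, not part of the SDK: re-opens the stream from the
// last seen event id whenever the underlying iterator throws.
async function* resumable<E extends IdEvent>(
  open: (lastEventId: number | undefined) => AsyncIterable<E>,
  {
    startFrom,
    maxRetries = 5,
    delayMs = 500,
    signal,
  }: {
    startFrom?: number;
    maxRetries?: number;
    delayMs?: number;
    signal?: AbortSignal;
  } = {},
): AsyncGenerator<E> {
  let cursor = startFrom;
  let failures = 0;
  for (;;) {
    try {
      for await (const event of open(cursor)) {
        if (typeof event.id === 'number') cursor = event.id;
        failures = 0; // any delivered event resets the retry budget
        yield event;
      }
      return; // the stream ended cleanly
    } catch (err) {
      if (signal?.aborted) return; // caller asked to stop; do not retry
      failures += 1;
      if (failures > maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Replay after a reconnect is still bounded by the ring buffer, so a client that stays disconnected for more than 4000 events will resume with a hole in its history.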

Voting on permissions

When the agent asks for permission to run a tool, every connected client sees the permission_request event. First responder wins — once one client votes, the rest get 404 if they try to vote on the same requestId.

case 'permission_request': {
  const req = event.data as {
    requestId: string;
    options: Array<{ optionId: string; name: string; kind: string }>;
  };
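  // Pick whichever option you want (`proceed_once`, `allow`, etc.). This
  // example prefers an option whose `kind` is `allow_once` and falls back to
  // the first one offered.
  const choice = req.options.find((o) => o.kind === 'allow_once') ?? req.options[0];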
  const accepted = await client.respondToPermission(req.requestId, {
    outcome: { outcome: 'selected', optionId: choice.optionId },
  });
  if (!accepted) {
    console.log('Another client voted first; nothing to do.');
  }
  break;
}
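
For unattended scripts, falling back to the first option may approve something you did not intend. A stricter selection can be sketched as a small pure function. `pickOption` is a hypothetical helper, and the option shape is copied from the snippet above; the `kind` values other than `allow_once` are illustrative assumptions, not taken from this guide.

```typescript
interface PermissionOption {
  optionId: string;
  name: string;
  kind: string;
}

// Hypothetical helper: returns the first option matching the preferred kinds,
// in preference order, or undefined when none match (no silent fallback).
function pickOption(
  options: PermissionOption[],
  preferredKinds: string[],
): PermissionOption | undefined {
  for (const kind of preferredKinds) {
    const match = options.find((o) => o.kind === kind);
    if (match) return match;
  }
  return undefined;
}
```

When `pickOption` returns `undefined`, the script can simply skip voting and leave the decision to another connected client.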

Shared-session collaboration

Two clients pointed at the same daemon and cwd end up on the same session:

// Client A (e.g. an IDE plugin)
const a = await clientA.createOrAttachSession({ workspaceCwd: '/work/repo' });
console.log(a.attached); // false — A spawned the agent

// Client B (e.g. a web UI on the same machine)
const b = await clientB.createOrAttachSession({ workspaceCwd: '/work/repo' });
console.log(b.attached); // true — B joined A's session
console.log(a.sessionId === b.sessionId); // true

Both clients see the same session_update / permission_request stream. Either can send a prompt; prompts are queued FIFO because the agent guarantees one active prompt per session.
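
To make the queueing behavior concrete, here is a toy model of "one active task at a time, first in, first out". It is an illustration only and says nothing about how the daemon implements its queue; `SerialQueue` is a hypothetical name.

```typescript
// Toy model, not the daemon's implementation: runs async tasks one at a
// time in the order they were enqueued.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it has settled.
    const run = this.tail.then(() => task());
    // Swallow failures on the chain so one failed task does not block the rest.
    this.tail = run.catch(() => undefined);
    return run;
  }
}
```

In this model, a prompt from client B that arrives while client A's prompt is running starts only once A's prompt has settled, which is the behavior both clients would observe in the shared event stream.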

Authentication

When the daemon was started with a token (any non-loopback bind requires one):

const client = new DaemonClient({
  baseUrl: 'https://your-host:4170',
  token: process.env.QWEN_SERVER_TOKEN,
});

A wrong or missing token returns 401 with a uniform body. The SDK throws DaemonHttpError on any 4xx/5xx from a route handler:

import { DaemonHttpError } from '@qwen-code/sdk';

try {
  await client.health();
} catch (err) {
  if (err instanceof DaemonHttpError) {
    console.error(`Daemon error ${err.status}:`, err.body);
  } else {
    throw err;
  }
}
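
Beyond logging, a caller usually wants to decide what to do next. The sketch below maps an error's `status` and `body` to a coarse action. It is hedged in several ways: `adviseOnError` and `Advice` are hypothetical names, `HttpErrorLike` is a local stand-in for `DaemonHttpError`, and it assumes `body` is the parsed JSON response. The 503 case follows the session cap introduced with `--max-sessions` (`code: session_limit_exceeded`, `Retry-After: 5`).

```typescript
// Local structural stand-in for DaemonHttpError (assumption: `body` is the
// parsed JSON response body).
interface HttpErrorLike {
  status: number;
  body?: unknown;
}

type Advice =
  | { action: 'fix_token' }
  | { action: 'retry'; afterMs: number }
  | { action: 'fail' };

// Hypothetical helper: turns an HTTP error into a next step for the caller.
function adviseOnError(err: HttpErrorLike): Advice {
  if (err.status === 401) return { action: 'fix_token' };
  const code =
    typeof err.body === 'object' && err.body !== null
      ? (err.body as { code?: unknown }).code
      : undefined;
  if (err.status === 503 && code === 'session_limit_exceeded') {
    // Mirrors the daemon's `Retry-After: 5` header for this error.
    return { action: 'retry', afterMs: 5000 };
  }
  return { action: 'fail' };
}
```

In the `catch` block above, `adviseOnError(err)` could replace the plain `console.error` once `err instanceof DaemonHttpError` has been checked.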

Cancel an in-flight prompt

If your user hits Esc:

await client.cancel(session.sessionId);
// In the event stream you'll see the prompt resolve with stopReason: "cancelled"

Cancel only winds down the active prompt. Prompts you have already POSTed that are still queued behind it will continue to run. (See the protocol reference for the rationale.)
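
The same call works for a deadline instead of a keypress. Here is a minimal sketch, assuming only the `prompt(sessionId, request)` and `cancel(sessionId)` methods used in this guide; `promptWithDeadline` and `PromptClient` are hypothetical names, and the structural interface lets the snippet compile without the SDK.

```typescript
// Structural view of the two client methods this sketch needs.
interface PromptClient<P, R> {
  prompt(sessionId: string, request: P): Promise<R>;
  cancel(sessionId: string): Promise<unknown>;
}

// Hypothetical helper: sends a prompt and asks the daemon to cancel it if it
// has not settled within `deadlineMs`.
async function promptWithDeadline<P, R>(
  client: PromptClient<P, R>,
  sessionId: string,
  request: P,
  deadlineMs: number,
): Promise<R> {
  const timer = setTimeout(() => {
    // Fire and forget: the prompt promise itself reports the outcome.
    client.cancel(sessionId).catch(() => undefined);
  }, deadlineMs);
  try {
    return await client.prompt(sessionId, request);
  } finally {
    clearTimeout(timer); // no cancel is sent once the prompt has settled
  }
}
```

Because cancel only affects the active prompt, a deadline that fires while your prompt is still queued behind another client's prompt would cancel that other prompt instead, so use this with care on shared sessions.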

What's next

HTTP protocol reference — full route spec with status codes
Daemon mode user guide — operator-side docs
Source: packages/sdk-typescript/src/daemon/