qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-17 03:57:18 +00:00


ChiGao d343e2c15e	feat(perf): progressive MCP availability — MCP no longer blocks first input (#3994)

* feat(perf): progressive MCP availability — MCP no longer blocks first input

Today `Config.initialize()` runs MCP discovery synchronously and the cli can't accept input until every configured MCP server finishes its discover handshake. One slow or hung server bottlenecks every user with MCP configured. Validated by the profiler instrumentation added in this PR (set `QWEN_CODE_PROFILE_STARTUP=1` to reproduce):

| User scenario       | Time to first prompt input |
| ------------------- | -------------------------- |
| No MCP              | ~480 ms                    |
| 1 fast MCP          | ~875 ms                    |
| 2 fast + 1 slow MCP | ~7.1 s                     |
| 1 hung MCP server   | ~10.5 s                    |

(Measured on macOS arm64 / Node 24.15, n=30/fixture, p50.)

`Config.initialize()` now passes `{ skipDiscovery: true }` to `createToolRegistry` by default and kicks off MCP discovery in a fire-and-forget background path. As each server completes discover, the cli's `AppContainer` debounces `setTools()` calls into one-frame (16 ms) batches so the model sees the consolidated tool list shortly after each server settles.
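The one-frame batching described above can be illustrated with a small helper. This is a minimal sketch under stated assumptions, not the code in `AppContainer.tsx`: the names `createFrameBatcher`, `Schedule`, and `Cancel` are hypothetical, and the scheduler is injectable only so the behavior can be exercised without real timers.

```typescript
// Hypothetical sketch: coalesce a burst of update notifications into one flush per frame.
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by setTimeout; tests can inject a manual one instead.
const timerSchedule: Schedule = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

function createFrameBatcher(
  flush: () => void,
  frameMs = 16,
  schedule: Schedule = timerSchedule,
) {
  let cancel: Cancel | null = null;

  return {
    // Called on every update event; only the first call in a frame arms the timer.
    notify(): void {
      if (cancel !== null) return;
      cancel = schedule(() => {
        cancel = null;
        flush();
      }, frameMs);
    },
    // Run a pending flush immediately instead of waiting for the frame to end.
    flushNow(): void {
      if (cancel === null) return;
      cancel();
      cancel = null;
      flush();
    },
    // Drop any pending flush, for example when the subscriber is torn down.
    dispose(): void {
      cancel?.();
      cancel = null;
    },
  };
}
```

With this shape, a subscriber would call `notify()` from each `mcp-client-update` handler and pass a `flush` callback that performs the single consolidated `setTools()` call; several servers settling within the same 16 ms window then cost one call rather than one per server.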
Rollback: `QWEN_CODE_LEGACY_MCP_BLOCKING=1`.

- `packages/core/src/config/config.ts`: `Config.initialize` switches to `skipDiscovery: true` plus a new `startMcpDiscoveryInBackground()` (defensive against partially-stubbed `ToolRegistry` in tests). Adds `MCPServerConfig.discoveryTimeoutMs` (last positional ctor param, so it doesn't shift existing call sites). Tool-call timeout is untouched.
- `packages/core/src/tools/tool-registry.ts`: new `getMcpClientManager()` getter so the background path can call the incremental discover directly without going through `discoverMcpTools` (which would wipe already-registered tools).
- `packages/core/src/tools/mcp-client-manager.ts`: `discoverAllMcpToolsIncremental` now emits `mcp-client-update` after the IN_PROGRESS transition, wraps each per-server discover in a discovery-only timeout (stdio 30s, remote 5s), and emits a trailing `mcp-client-update` after COMPLETED so UI subscribers see the terminal state.
- `packages/cli/src/ui/AppContainer.tsx`: new `useEffect` (gated on `isConfigInitialized`) subscribes to `mcp-client-update` and 16ms-batches `setTools()` calls. The same effect also defers `finalizeStartupProfile` until MCP settles (or a 35s hard cap), so startup-perf profiles capture the full MCP timeline.

Profiling is activated only by `QWEN_CODE_PROFILE_STARTUP=1`; when unset, every profiler entry point short-circuits in a single null/flag check and returns. Heisenberg overhead measured at -1.12% Δp50 between profile-on vs profile-off (Welch p=0.092, n=30/config × 3 configs), which is within statistical noise.

- `packages/cli/src/utils/startupProfiler.ts`: extended with an `events` array (multi-fire), `recordStartupEvent`, `setInteractiveMode`, `derivedPhases`, per-checkpoint heap snapshots, a `MAX_EVENTS` cap, and `QWEN_CODE_PROFILE_STARTUP_OUTER` / NO_HEAP env opt-ins. + 7 new tests.
- `packages/core/src/utils/startupEventSink.ts` (new): minimal cross-package sink so `core` can emit profiler events without reverse-depending on `cli`. No-op when no sink registered. + 4 tests.
- `packages/core/src/index.ts`: export `setStartupEventSink` / `recordStartupEvent` / type aliases.
- `packages/cli/src/gemini.tsx`: registers the sink at `main()` entry, adds a `first_paint` checkpoint after Ink render, calls `setInteractiveMode(true)` in the interactive branch.
- `packages/core/src/config/config.ts`: emits `tool_registry_created`.
- `packages/core/src/core/client.ts`: emits `gemini_tools_updated` at the end of `setTools()`.
- `packages/core/src/tools/mcp-client-manager.ts`: emits `mcp_discovery_start`, `mcp_server_ready:<name>`, `mcp_first_tool_registered`, `mcp_all_servers_settled`.
- `packages/cli/src/ui/AppContainer.tsx`: emits `config_initialize_start`, `config_initialize_end`, `input_enabled`.

`Config.initialize()` now returns BEFORE MCP discovery completes. Things to check:

- Any code path that assumed "after `config.initialize()`, all MCP tools exist in the registry": these will see only built-in tools initially; new tools appear via `mcp-client-update` events.
- `MCPDiscoveryState.COMPLETED` is now set asynchronously instead of synchronously after `initialize()` resolves.
- Model requests issued before MCP settles see only built-in tools; subsequent requests see the full set as servers come online.
- Tests that assert MCP tool count immediately after `config.initialize()` should wait for the `mcp-client-update` with COMPLETED discoveryState instead.

- 313 impacted-area tests green (config / mcp-client-manager / client / startupProfiler 18 / startupEventSink 4).
- `tsc --noEmit` clean for `packages/core` and `packages/cli`.
- `eslint` clean on touched files.
- Manual: `QWEN_CODE_PROFILE_STARTUP=1 SANDBOX=1` interactive run produces a JSON profile in `~/.qwen/startup-perf/` containing `first_paint`, `config_initialize_start/end`, `input_enabled`, MCP per-server events, and `gemini_tools_updated`. See PR description's "How to validate" section.
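The cross-package sink idea (a lower-level package emits events through a registered callback and does nothing when no callback is registered) might look roughly like the sketch below. The function names match those listed above, but the signatures, the `StartupEventSink` type, and the error handling are assumptions for illustration; the real `startupEventSink.ts` may differ.

```typescript
// Hypothetical shapes: the actual signatures in startupEventSink.ts are not shown in the source.
type StartupEventSink = (name: string, attrs?: Record<string, unknown>) => void;

// Module-level slot; stays null unless a higher-level package registers a sink.
let sink: StartupEventSink | null = null;

function setStartupEventSink(next: StartupEventSink | null): void {
  sink = next;
}

function recordStartupEvent(name: string, attrs?: Record<string, unknown>): void {
  // No sink registered: return after a single null check.
  if (sink === null) return;
  try {
    sink(name, attrs);
  } catch (err) {
    // Assumed policy: a faulty sink must not break startup, so report and continue.
    console.error(`startup event sink failed for "${name}":`, err);
  }
}
```

In this arrangement the dependency points one way only: the higher-level package imports `setStartupEventSink` and registers its profiler callback, while the lower-level package calls `recordStartupEvent` without knowing who, if anyone, is listening.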
Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): harden progressive MCP discovery against silent regressions

Addresses review feedback on PR #3994:

- Skip user-disabled servers in discoverAllMcpToolsIncremental. The new incremental path used to iterate Object.entries(servers) without consulting isMcpServerDisabled, so a server the user had explicitly turned off would still get connected and its tools registered. Mirrors the existing protection in discoverAllMcpTools.
- Disconnect the underlying client when runWithDiscoveryTimeout fires. Without this, the inner discoverMcpToolsForServer kept running after the timeout rejected the outer promise; if discover() eventually succeeded it would register the late server's tools into the live toolRegistry (a silent registration vector, especially exploitable with a 0/negative discoveryTimeoutMs override).
- Clamp discoveryTimeoutMs to [100ms, 300_000ms]. 0/negative/Infinity values previously passed through to setTimeout unvalidated and made the silent-registration bug above trivially reachable.
- Classify the `tcp` (WebSocket) transport field as remote so hung WS handshakes use the 5s default instead of the 30s stdio default.
- Defensive delete of serverDiscoveryPromises[name] in the per-server catch so a doomed/orphan entry can't briefly short-circuit a subsequent discoverMcpToolsForServer call.

Adds focused tests for each fix.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): restore runtime.json sidecar and harden non-interactive MCP visibility

Addresses review feedback on PR #3994:

- Restore writeRuntimeStatus + markRuntimeStatusEnabled in startInteractiveUI. The progressive-MCP diff inadvertently dropped the runtime.json sidecar write from the interactive entry point, leaving Config.refreshSessionId()'s session-swap refresh as dead code and silently breaking external integrations (terminal multiplexers, IDE integrations, status daemons) that map PID → sessionId via runtime.json.
- Add Config.getFailedMcpServerNames() and surface a stderr warning in --prompt / stream-json / ACP entry points when one or more MCP servers failed during background discovery. Per-server errors are caught inside discoverAllMcpToolsIncremental and never reached a TTY otherwise, so a script using non-interactive mode with broken MCP config would silently run with only built-in tools, a regression vs the legacy synchronous path.
- Pass the parsed `settings` object through to runNonInteractiveStreamJson. The new call site dropped the argument, falling back to createMinimalSettings() and losing any user-configured permission / approval / hook setup for stream-json sessions. Added regression assertion to gemini.test.tsx.
- Move finalizeStartupProfile out of gemini.tsx's stream-json branch and into Session.ensureConfigInitialized so it runs AFTER config.initialize() / waitForMcpReady() in stream-json. Previously the profile was finalized before any MCP / config_initialize_* events were emitted, producing empty stream-json profiles.
- Gate setStartupEventSink registration on isStartupProfilerEnabled() so core-side recordStartupEvent calls short-circuit at the first null-check when profiling is disabled, instead of going through an arrow wrapper and the profiler's own enabled gate.
- Tighten the type-unsafe ToolRegistry cast in startMcpDiscoveryInBackground to preserve the typed return signature so a rename of getMcpClientManager would be flagged at this call site (kept the optional-chain guard for tests that stub ToolRegistry as a plain object).
- Re-document first_paint as "render call returned" so consumers don't confuse Ink's synchronous render() return with literal pixel paint. Kept the checkpoint name for backward compatibility with collected profiles.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(cli): restore resize repaint and pin gemini_tools_lag capture in AppContainer

Addresses review feedback on PR #3994:

- Restore the terminal-resize useEffect that calls repaintStaticViewport() when terminalWidth changes. The progressive-MCP diff removed previousTerminalWidthRef + the repaint useCallback + the resize useEffect, so tmux pane resizes and fullscreen toggles leave the static region rendered at the old width; header content visibly tears until something else triggers refreshStatic.
- Pin the gemini_tools_lag startup metric. The previous onMcpUpdate handler called finalizeOnce() synchronously when discovery reached COMPLETED, but the pending setTools() batch was still 16ms away. setTools() emits `gemini_tools_updated`; when finalize ran first, the profile's `finalized` guard suppressed that event, so gemini_tools_lag came out undefined in interactive mode. New onMcpUpdate flushes setTools() NOW on COMPLETED and only finalizes after the flush resolves, guaranteeing the event lands.
- Log setTools() batch-flush errors via debugLogger instead of silently swallowing them. GeminiClient.setTools() has no try/catch around warmAll() / getFunctionDeclarations() / getChat().setTools(); the previous `.catch(() => {})` would have hidden production tool-registration regressions completely.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): correct MCP failure visibility and incremental cleanup

Addresses three review findings on PR #3994:

- McpClient.discover() now flips the client status to DISCONNECTED before re-throwing. Previously, a server that connected successfully but whose discoverPrompts / discoverTools then rejected (or that returned no prompts and no tools) would remain CONNECTED in the global status registry. Config.getFailedMcpServerNames() filters by `status !== CONNECTED`, so such servers were silently omitted from the non-interactive failure banner and the Footer's MCP health pill kept counting them as healthy.
- discoverAllMcpToolsIncremental no longer records `outcome: 'ready'` for servers whose connect/discover threw. The inner discoverMcpToolsForServerInternal catches errors without re-throwing (best-effort discovery semantics), so the try block resolved even for failures; only the runWithDiscoveryTimeout path reached the catch. Auth errors, server crashes, and missing-tools responses were therefore recorded as success in the startup profile. We now consult the actual server status (now correctly DISCONNECTED after the first fix) before emitting `ready`, and emit `outcome: 'failed'` otherwise. `mcp_first_tool_registered` is gated on the same check so a failed server can't pollute that user-facing metric.
- discoverAllMcpToolsIncremental tears down enabled→disabled mid-session transitions. When a previously-connected server is disabled (e.g. via `/mcp disable foo` or by editing settings), the incremental path used to just `continue` past it, leaving its client, tools, health check, and global status entry in place. Now calls removeServer() for any already-known client we encounter in the disabled branch.

Adds focused tests for each fix.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* docs(core): clarify ToolRegistry cast comment in startMcpDiscoveryInBackground

Addresses review feedback on PR #3994. The previous comment claimed the call site uses "no defensive cast" but the code still casts via `as ToolRegistry & { getMcpClientManager?: ... }`. Reword to explain the cast's actual purpose: it exists only because some tests stub ToolRegistry as a plain object, so we use optional chaining to avoid crashing the init path when those tests run. Also note that the inner shape now uses `ReturnType<ToolRegistry['getMcpClientManager']>`, so a future rename of the production method still surfaces as a type error at this call site rather than silently falling through to the `if (!manager)` branch.

Comment-only change; no behavior diff.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(core): close MCP timeout TOCTOU race and propagate disconnect status

Addresses two critical findings on PR #3994 round 6:

- runWithDiscoveryTimeout no longer uses fire-and-forget disconnect. The prior `void client.disconnect()` returned before `transport.close()` landed, leaving a window where an in-flight `discover()` could pump `tools/list` through the transport and synchronously register tools into the live registry BEFORE the close took effect. The earlier fix comment described this as a "remote-exploitable silent-tool-registration vector"; the await closes the timing window but doesn't help if tools already landed, so we also drop them with `removeMcpToolsByServer()` after the disconnect resolves. No-op when discover hadn't reached registration yet.
- McpClient.disconnect() now writes DISCONNECTED to the global registry directly. Previously, `isDisconnecting = true` was set BEFORE the internal `updateStatus(DISCONNECTED)` call, and `updateStatus`'s guard (designed to suppress LATE writes from a stale `connect()` catch) silently swallowed the write. The global stayed CONNECTED forever for timeout-disconnected servers, so `Config.getFailedMcpServerNames()` (which filters `status !== CONNECTED`) omitted them from the non-interactive failure banner and the Footer's MCP health pill kept counting them as healthy. This invalidated the round-5 `getMCPServerStatus === CONNECTED` gate, which would always pass the "ready" check for timed-out servers. The guard stays in place for its original purpose; the legitimate disconnect→DISCONNECTED notification now bypasses it by writing the registry directly.

Also adds the `config_initialize_start` / `_end` profiler checkpoints to `Session.ensureConfigInitialized()` so stream-json startup profiles include the same derived `config_initialize_dur` phase as the non-stream-json branch in gemini.tsx (round 6 [Suggestion]).

Tests cover (a) the disconnect-and-cleanup path on timeout and (b) the intentional-disconnect global registry propagation regression.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

* fix(mcp): surface failures + prevent health-check resurrection of timed-out servers

Round-7 review follow-ups:

- AppContainer (interactive): MCP startup failures now route through debugLogger.warn on COMPLETED. This was previously silent: only debug logs / profile events surfaced failures, so regular interactive users got no indication their MCP servers failed. Mirrors the non-interactive stderr warning, adjusted to debugLogger so it doesn't collide with Ink's rendered output.
- acpAgent per-session: `QwenAgent.initializeConfig()` now emits the same `Warning: MCP server(s) failed to start` stderr line as the top-level `runAcpAgent` path. Previously per-session ACP configs with failed MCP servers silently fell back to built-in tools.
- mcp-client-manager timeout handler: after disconnecting an intentionally timed-out server, also drop it from `this.clients` and stop any pending health-check timer. Without this the discovery `finally` block would arm a health-check that detected DISCONNECTED status and called `reconnectServer()` → `discoverMcpToolsForServer()` directly, bypassing `runWithDiscoveryTimeout` entirely and silently resurrecting the slow server. `startHealthCheck` also early-returns for unknown servers so the trailing finally-block call is a no-op.
- startupEventSink: silent `catch {}` now logs via `debugLogger.error` so a corrupted sink doesn't silently drop every subsequent event. Quiet by default; visible under `QWEN_CODE_DEBUG=1`.

Tests:

- mcp-client-manager.test.ts: regression for the timeout → no-reconnect invariant (clients map purged + health-check timer absent).
- acpAgent.test.ts: per-session newSession surfaces failures to stderr, and stays safe when Config lacks `getFailedMcpServerNames`.

Declines (with reasoning in PR reply):

- [Critical] AppContainer batch-flush useEffect untested → re-flag of the round-5 deferral that wenshao acknowledged at the time. Lower-layer invariants (this PR's mcp-client-manager + mcp-client tests) pin the dependent contracts. The component-test harness for timers + event emitters in this file is non-trivial and out of scope; tracked for a follow-up.

Generated with AI

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>

2026-05-13 22:17:16 +08:00
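The discovery-timeout rules stated in the commit message above (30s default for stdio, 5s for remote transports including `tcp`, and a clamp to [100ms, 300_000ms]) can be sketched as a pure function. This is an illustration under assumptions, not the project's implementation: `ServerConfigSketch`, `isRemote`, and `resolveDiscoveryTimeoutMs` are hypothetical names, the `url` and `httpUrl` fields are assumed, and falling back to the default for a missing or NaN override is a choice made here.

```typescript
// Hypothetical config shape; only `tcp` and `discoveryTimeoutMs` are named in the commit message.
interface ServerConfigSketch {
  command?: string; // stdio transport (assumed field name)
  url?: string; // remote transport (assumed field name)
  httpUrl?: string; // remote transport (assumed field name)
  tcp?: string; // WebSocket transport, classified as remote
  discoveryTimeoutMs?: number;
}

const STDIO_DEFAULT_MS = 30_000;
const REMOTE_DEFAULT_MS = 5_000;
const MIN_TIMEOUT_MS = 100;
const MAX_TIMEOUT_MS = 300_000;

// Any network-style field marks the server as remote.
function isRemote(cfg: ServerConfigSketch): boolean {
  return Boolean(cfg.url || cfg.httpUrl || cfg.tcp);
}

function resolveDiscoveryTimeoutMs(cfg: ServerConfigSketch): number {
  const fallback = isRemote(cfg) ? REMOTE_DEFAULT_MS : STDIO_DEFAULT_MS;
  const requested = cfg.discoveryTimeoutMs;
  // Assumption: a missing or NaN override falls back to the transport default.
  if (requested === undefined || Number.isNaN(requested)) return fallback;
  // Clamp so 0, negative, or Infinity values never reach setTimeout unvalidated.
  return Math.min(MAX_TIMEOUT_MS, Math.max(MIN_TIMEOUT_MS, requested));
}
```

Keeping the resolution step separate from the timer itself makes the edge cases (zero, negative, infinite overrides) easy to cover with plain unit tests.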
..
design	docs(auth): add custom API key wizard PRD (#3583)	2026-05-13 14:04:41 +08:00
developers	feat(cli,sdk): qwen serve daemon (Stage 1) (#3889)	2026-05-13 14:47:47 +08:00
plans	feat(vscode-ide-companion): add agent execution tool display (#2590)	2026-04-18 23:39:26 +08:00
users	feat(perf): progressive MCP availability — MCP no longer blocks first input (#3994)	2026-05-13 22:17:16 +08:00
_meta.ts	feat: refactor docs	2025-12-05 10:51:57 +08:00
index.md	fix: lint issues	2025-12-19 15:52:11 +08:00