qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-19 16:28:28 +00:00

tanzhenxin d7a25682e6 refactor(core): route side-query LLM calls through runSideQuery chokepoint (#3775)		2026-05-11 19:03:14 +08:00

* refactor(core): route side-query LLM calls through runSideQuery chokepoint

Folds every one-shot side-query call site through a single `runSideQuery` entry point with `thinkingConfig.includeThoughts: false` and `fastModel` (falling back to main) as the default policy. Adds a text-mode sibling to the existing JSON-mode helper, plus a `BaseLlmClient.generateText` primitive that calls `ContentGenerator.generateContent` directly so side queries get neither user-memory wrapping nor the main-prompt fallback that `geminiClient.generateContent` applies.

Migrated call sites: session title, recap, tool-use summary, /rename, follow-up suggestion (direct path), ACP rewrite, project /summary, arena approach summary, chat compression, web-fetch, insight analysis, subagent spec generation.

Six call sites override the helper defaults explicitly (subagent gen, suggestion, ACP rewrite, /summary, compression, insight) where main-model quality or caller-supplied model matters.

The /summary path additionally fixes a latent bug: text extraction previously did not strip thought parts, so on thinking models the saved `.qwen/PROJECT_SUMMARY.md` could leak `reasoning_content` into the file. The chokepoint now strips thought parts and the request itself goes out with thinking off.

Best-effort cosmetic callers (recap, tool-use summary, kebab rename, suggestion) opt into `maxAttempts: 1` so transient outages don't burn seven retries on output the user will likely never see.

`isInternalPromptId` recognises the `side-query:` prefix automatically so new call sites are filtered without per-site allowlist updates.

Removes the `getAgentContentGenerator` workaround in `InProcessBackend` and the `getAgentSummaryGenerator` indirection in `ArenaManager`; arena approach summaries now run through the chokepoint against `fastModel`, giving every agent a neutral arbiter rather than a self-summary on its own model.

* fix(core): guard isInternalPromptId against undefined prompt_id

logToolCall calls isInternalPromptId(event.prompt_id), and tool-call events from useToolScheduler can carry an undefined prompt_id. The side-query refactor added promptId.startsWith(SIDE_QUERY_PROMPT_PREFIX) without a falsy guard, so the missing id crashed the logger and broke six useToolScheduler tests across all OS / Node matrix entries on CI.

* fix(cli,core): polish runSideQuery callers from review feedback

- Cap web-fetch, chat-compression, and ACP rewrite at maxAttempts: 1. These paths degrade gracefully on failure (tool error, NOOP fallback, null return), so 7 retries only delays the user-visible outcome.
- /summary now carries the main session's system instruction so the summarizer keeps the coding-assistant role, project context, and user memory instead of summarizing the chat in isolation.
- Add isInternalPromptId tests for the side-query: prefix so future callers minted via runSideQuery stay filtered out of recordings.

* refactor(core): document runSideQuery defaults and surface promptId in errors

- Add JSDoc on the model and config fields of SideQueryJsonOptions and SideQueryTextOptions so the fastModel-first defaulting and the thinkingConfig.includeThoughts: false default are visible at the API surface, not buried in resolveDefaultModel / applyThinkingDefault.
- BaseLlmClient.generate{Json,Text} error wraps now include promptId in the message and pass { cause: error }, so a side-query failure identifies which call site failed and preserves the original stack.
- Add tests covering maxAttempts forwarding (present + omitted) and rejection propagation for both JSON and text modes; the conditional spread is non-trivial and was previously unverified.

* fix(core): preserve per-model provider routing in side queries

BaseLlmClient was bound to the main session's ContentGenerator and only swapped the request `model` field, so side queries targeting a fast or alternate model inherited the main provider's baseUrl, credentials, and sampling settings, breaking cross-provider configurations.

Move per-model generator/authType resolution out of GeminiClient and into BaseLlmClient as `resolveForModel`. Both generateJson and generateText now build a per-model ContentGenerator (with cache) when the request targets a non-main model and pass the resolved retry authType through to retryWithBackoff. GeminiClient.generateContent delegates to the same resolver so there is a single source of truth.

Also pin the /forget destructive selector to the main model; the runSideQuery default moved to fast model in this branch, but /forget acts on the selection without confirmation, so a weaker fast model could silently delete the wrong managed-memory entries.

* test(core): assert thinkingConfig/maxAttempts/model forwarding in compression

The compression caller of runSideQuery sets thinkingConfig.includeThoughts=true and maxAttempts=1. A future refactor that silently drops either would degrade compression quality without test failure; this assertion locks the contract.

* fix(cli): route dynamic localization through side query

* refactor(core): remove unused memory governance review
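
Two mechanisms from the commit message above, the falsy guard in `isInternalPromptId` and the conditional spread used to forward `maxAttempts`, might look roughly like the sketch below. This is an illustration under assumptions, not the code from the repository: `isInternalPromptId`, `SIDE_QUERY_PROMPT_PREFIX`, and the `side-query:` prefix are named in the commit message, while `INTERNAL_PROMPT_IDS`, its sample entry, `RetryOptions`, and `buildRetryOptions` are hypothetical names introduced here.

```typescript
// Prefix named in the commit message; the value follows the "side-query:" wording there.
const SIDE_QUERY_PROMPT_PREFIX = 'side-query:';

// Hypothetical allowlist for internal ids that do not use the prefix.
// The entry is a placeholder, not a real id from the codebase.
const INTERNAL_PROMPT_IDS = new Set<string>(['example_internal_id']);

function isInternalPromptId(promptId: string | undefined): boolean {
  // Falsy guard: tool-call events may carry an undefined prompt_id,
  // and calling startsWith on undefined would throw.
  if (!promptId) return false;
  return (
    promptId.startsWith(SIDE_QUERY_PROMPT_PREFIX) ||
    INTERNAL_PROMPT_IDS.has(promptId)
  );
}

// Hypothetical options shape for a retry helper.
interface RetryOptions {
  maxAttempts?: number;
}

function buildRetryOptions(maxAttempts?: number): RetryOptions {
  // Conditional spread: the key is present only when the caller set it,
  // so an omitted value falls through to the retry helper's own default
  // instead of overriding it with undefined.
  return { ...(maxAttempts !== undefined ? { maxAttempts } : {}) };
}
```

With this shape, an id such as `side-query:recap` is treated as internal without an allowlist update, and `buildRetryOptions()` returns an object with no `maxAttempts` key at all.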
src	refactor(core): route side-query LLM calls through runSideQuery chokepoint (#3775)	2026-05-11 19:03:14 +08:00
index.ts	fix(cli): stop double-wrapping and double-printing API errors in non-interactive mode (#3749)	2026-05-03 08:39:31 +08:00
package.json	chore(deps): upgrade ink 6.2.3 → 7.0.2 + bump Node engine to 22 (#3860)	2026-05-11 17:29:50 +08:00
test-setup.ts	fix: prevent bogus shell permission rules in tests	2026-03-20 17:55:33 +08:00
tsconfig.json	Add background agent resume and continuation (#3739)	2026-05-01 12:14:33 +08:00
vitest.config.ts	refactor(core): Unify package exports and improve dev experience	2026-02-01 11:59:05 +08:00