qwen-code

vrr/qwen-code

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-05-19 16:28:28 +00:00

Commit graph

Author

SHA1

Message

Date

tanzhenxin

cc800d0132

fix(core): support cross-auth fast side queries (#4117)

* fix(core): support cross-auth fast side queries

* refactor(core): hoist resolveForModel selector and refresh side-query docs

Compute the model selector once at the top of `resolveForModel` and pass
it through to `createContentGeneratorForModel` and
`resolveModelAcrossAuthTypes`. This eliminates the redundant selector
resolution that happened up to five times per cross-auth side query
(once per call, plus once inside each downstream helper).

Also update the JSDoc for `SideQueryJsonOptions.model` and
`SideQueryTextOptions.model` to reflect the actual fallback chain
(`getFastModelForSideQuery` → `getFastModel` → `getModel` →
`DEFAULT_QWEN_MODEL`) introduced in this PR.
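
The fallback chain described in this commit can be sketched roughly as below. This is a minimal illustration, not the project's code: the `SideQueryConfig` interface, the zero-argument getter signatures, the `explicit` override parameter, and the placeholder value of `DEFAULT_QWEN_MODEL` are all assumptions made for the example. Only the getter names and their order come from the commit message.

```typescript
// Hypothetical config surface: the getter names come from the commit message,
// but their signatures and return types are assumptions made for this sketch.
interface SideQueryConfig {
  getFastModelForSideQuery(): string | undefined;
  getFastModel(): string | undefined;
  getModel(): string | undefined;
}

// Placeholder value; the real constant is defined in the qwen-code sources.
const DEFAULT_QWEN_MODEL = "placeholder-default-model";

// Walk the chain in order and return the first non-empty model name.
// `explicit` stands in for a caller-supplied `model` option (assumed).
function resolveSideQueryModel(
  config: SideQueryConfig,
  explicit?: string,
): string {
  const candidates = [
    explicit,
    config.getFastModelForSideQuery(),
    config.getFastModel(),
    config.getModel(),
  ];
  for (const candidate of candidates) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate;
    }
  }
  return DEFAULT_QWEN_MODEL;
}
```

Resolving the name once and passing the result down, rather than re-running this lookup in each helper, is the kind of hoisting the commit describes.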

2026-05-14 19:22:12 +08:00

tanzhenxin

d7a25682e6

refactor(core): route side-query LLM calls through runSideQuery chokepoint (#3775)

* refactor(core): route side-query LLM calls through runSideQuery chokepoint

Folds every one-shot side-query call site through a single `runSideQuery`
entry point with `thinkingConfig.includeThoughts: false` and `fastModel`
(falling back to main) as the default policy. Adds a text-mode sibling
to the existing JSON-mode helper, plus a `BaseLlmClient.generateText`
primitive that calls `ContentGenerator.generateContent` directly so
side queries get neither user-memory wrapping nor the main-prompt
fallback that `geminiClient.generateContent` applies.

Migrated call sites: session title, recap, tool-use summary, /rename,
follow-up suggestion (direct path), ACP rewrite, project /summary,
arena approach summary, chat compression, web-fetch, insight analysis,
subagent spec generation. Six call sites override the helper defaults
explicitly (subagent gen, suggestion, ACP rewrite, /summary, compression,
insight) where main-model quality or caller-supplied model matters.

The /summary path additionally fixes a latent bug: text extraction
previously did not strip thought parts, so on thinking models the
saved `.qwen/PROJECT_SUMMARY.md` could leak `reasoning_content` into
the file. The chokepoint now strips thought parts and the request
itself goes out with thinking off.
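
A rough sketch of what stripping thought parts during text extraction might look like. The `ResponsePart` shape and the `extractVisibleText` name are hypothetical stand-ins for this example. The real response types in the codebase are not shown in the commit message and may differ.

```typescript
// Minimal local model of a response part. The `thought` flag is an assumed
// marker for reasoning output; the real part type may carry more fields.
interface ResponsePart {
  text?: string;
  thought?: boolean;
}

// Concatenate only the user-visible text, skipping any part flagged as a thought.
function extractVisibleText(parts: ResponsePart[]): string {
  return parts
    .filter((part) => !part.thought && typeof part.text === "string")
    .map((part) => part.text as string)
    .join("");
}
```

Without the `thought` filter, the reasoning text would be concatenated into the saved summary, which is the leak the commit describes.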

Best-effort cosmetic callers (recap, tool-use summary, kebab rename,
suggestion) opt into `maxAttempts: 1` so transient outages don't burn
seven retries on output the user will likely never see. `isInternalPromptId`
recognises the `side-query:` prefix automatically so new call sites are
filtered without per-site allowlist updates.

Removes the `getAgentContentGenerator` workaround in `InProcessBackend`
and the `getAgentSummaryGenerator` indirection in `ArenaManager` —
arena approach summaries now run through the chokepoint against
`fastModel`, giving every agent a neutral arbiter rather than a
self-summary on its own model.

* fix(core): guard isInternalPromptId against undefined prompt_id

logToolCall calls isInternalPromptId(event.prompt_id), and tool-call
events from useToolScheduler can carry an undefined prompt_id. The
side-query refactor added promptId.startsWith(SIDE_QUERY_PROMPT_PREFIX)
without a falsy guard, so the missing id crashed the logger and broke
six useToolScheduler tests across all OS / Node matrix entries on CI.
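
A minimal sketch of the guarded check, assuming the prefix constant holds the `side-query:` string mentioned earlier. The real `isInternalPromptId` presumably also consults an allowlist of known internal ids, which is omitted here.

```typescript
// Assumed value, taken from the `side-query:` prefix named in the commit message.
const SIDE_QUERY_PROMPT_PREFIX = "side-query:";

// Return false for a missing id instead of calling startsWith on undefined.
function isInternalPromptId(promptId: string | undefined): boolean {
  if (!promptId) {
    return false;
  }
  return promptId.startsWith(SIDE_QUERY_PROMPT_PREFIX);
}
```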

* fix(cli,core): polish runSideQuery callers from review feedback

- Cap web-fetch, chat-compression, and ACP rewrite at maxAttempts: 1.
  These paths degrade gracefully on failure (tool error, NOOP fallback,
  null return), so 7 retries only delay the user-visible outcome.
- /summary now carries the main session's system instruction so the
  summarizer keeps the coding-assistant role, project context, and
  user memory instead of summarizing the chat in isolation.
- Add isInternalPromptId tests for the side-query: prefix so future
  callers minted via runSideQuery stay filtered out of recordings.

* refactor(core): document runSideQuery defaults and surface promptId in errors

- Add JSDoc on the model and config fields of SideQueryJsonOptions and
  SideQueryTextOptions so the fastModel-first defaulting and the
  thinkingConfig.includeThoughts: false default are visible at the API
  surface, not buried in resolveDefaultModel / applyThinkingDefault.
- BaseLlmClient.generate{Json,Text} error wraps now include promptId
  in the message and pass { cause: error }, so a side-query failure
  identifies which call site failed and preserves the original stack.
- Add tests covering maxAttempts forwarding (present + omitted) and
  rejection propagation for both JSON and text modes — the conditional
  spread is non-trivial and was previously unverified.
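
The conditional spread mentioned in the last bullet can be illustrated as follows. The option and request shapes here are hypothetical. The point of the sketch is that the `maxAttempts` key is forwarded only when the caller set it, so that a downstream default can still apply when it is omitted.

```typescript
// Hypothetical shapes for illustration; the real option types have more fields.
interface SideQueryOptions {
  promptId: string;
  maxAttempts?: number;
}

interface GenerateRequest {
  promptId: string;
  maxAttempts?: number;
}

function buildRequest(options: SideQueryOptions): GenerateRequest {
  return {
    promptId: options.promptId,
    // Add the key only when present; an explicit `maxAttempts: undefined`
    // could otherwise shadow the default applied further down the call chain.
    ...(options.maxAttempts !== undefined
      ? { maxAttempts: options.maxAttempts }
      : {}),
  };
}
```

Testing both the present and omitted cases matters because the two branches produce objects with different key sets, not just different values.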

* fix(core): preserve per-model provider routing in side queries

BaseLlmClient was bound to the main session's ContentGenerator and only
swapped the request `model` field, so side queries targeting a fast or
alternate model inherited the main provider's baseUrl, credentials, and
sampling settings — breaking cross-provider configurations.

Move per-model generator/authType resolution out of GeminiClient and into
BaseLlmClient as `resolveForModel`. Both generateJson and generateText
now build a per-model ContentGenerator (with cache) when the request
targets a non-main model and pass the resolved retry authType through
to retryWithBackoff. GeminiClient.generateContent delegates to the same
resolver so there is a single source of truth.
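
One way to picture per-model resolution with a cache is sketched below. Everything other than the `resolveForModel` name is an assumption: the `ContentGenerator` and `Resolved` shapes are simplified stand-ins, and the `PerModelResolver` class and its factory callback are hypothetical.

```typescript
// Hypothetical stand-ins; the real ContentGenerator and auth types are richer.
interface ContentGenerator {
  readonly providerLabel: string;
}

interface Resolved {
  generator: ContentGenerator;
  authType: string;
}

class PerModelResolver {
  private readonly cache = new Map<string, Resolved>();
  private readonly mainModel: string;
  private readonly main: Resolved;
  private readonly create: (model: string) => Resolved;

  constructor(
    mainModel: string,
    main: Resolved,
    create: (model: string) => Resolved,
  ) {
    this.mainModel = mainModel;
    this.main = main;
    this.create = create;
  }

  // Reuse the main generator for the main model; otherwise build a generator
  // for the target model once and serve later requests from the cache.
  resolveForModel(model: string): Resolved {
    if (model === this.mainModel) {
      return this.main;
    }
    const cached = this.cache.get(model);
    if (cached !== undefined) {
      return cached;
    }
    const created = this.create(model);
    this.cache.set(model, created);
    return created;
  }
}
```

Returning the auth type together with the generator lets the caller hand the matching value to its retry logic, in the spirit of the commit's note about passing the resolved retry authType through.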

Also pin the /forget destructive selector to the main model — the
runSideQuery default moved to fast model in this branch, but /forget
acts on the selection without confirmation, so a weaker fast model
could silently delete the wrong managed-memory entries.

* test(core): assert thinkingConfig/maxAttempts/model forwarding in compression

The compression caller of runSideQuery sets thinkingConfig.includeThoughts=true
and maxAttempts=1. A future refactor that silently drops either would degrade
compression quality without test failure; this assertion locks the contract.

* fix(cli): route dynamic localization through side query

* refactor(core): remove unused memory governance review

2026-05-11 19:03:14 +08:00

Yan Shen

9bd5a0180b

feat(cli): core built-in i18n coverage (#3871)

* feat(i18n): expand built-in locale coverage

* feat(cli): add dynamic slash command translation

* test(cli): stabilize session picker assertions

* fix(core): close jsonl readers before cleanup

* fix: address i18n review regressions

* fix(cli): address dynamic i18n review findings

* fix(cli): address i18n review follow-ups

* fix(cli): address i18n review feedback

* test(cli): align i18n parity coverage with strict locales

* fix(cli): address i18n review findings

2026-05-10 22:35:03 +08:00

3 commits