From 1c9ff66c765c10e665e274531ac14b778c5276dc Mon Sep 17 00:00:00 2001 From: "mingholy.lmh" Date: Tue, 24 Feb 2026 22:18:50 +0800 Subject: [PATCH 1/6] docs: enhance modelProviders configuration documentation - Add comprehensive configuration examples for all auth types (openai, anthropic, gemini, vertex-ai) - Add local self-hosted model examples (vLLM, Ollama, LM Studio) - Clarify generation config layering with impermeable provider layer concept - Add Provider Model vs Runtime Model explanation - Document duplicate model ID limitation - Deprecate security.auth.apiKey and security.auth.baseUrl settings - Add notes about extra_body parameter support limitations Co-authored-by: Qwen-Coder --- docs/users/configuration/settings.md | 367 ++++++++++++++++++++++++--- 1 file changed, 327 insertions(+), 40 deletions(-) diff --git a/docs/users/configuration/settings.md b/docs/users/configuration/settings.md index 0094f411d..aff42545d 100644 --- a/docs/users/configuration/settings.md +++ b/docs/users/configuration/settings.md @@ -184,7 +184,17 @@ The `extra_body` field allows you to add custom parameters to the request body s Use `modelProviders` to declare curated model lists per auth type that the `/model` picker can switch between. Keys must be valid auth types (`openai`, `anthropic`, `gemini`, `vertex-ai`, etc.). Each entry requires an `id` and **must include `envKey`**, with optional `name`, `description`, `baseUrl`, and `generationConfig`. Credentials are never persisted in settings; the runtime reads them from `process.env[envKey]`. Qwen OAuth models remain hard-coded and cannot be overridden. -##### Example +> [!note] +> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, Vertex AI, etc., must be defined via `modelProviders`. The `/auth` command intentionally lists only the built-in Qwen OAuth and OpenAI flows. 
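+
+A minimal entry needs only the required `id` and `envKey` fields; everything else is optional. The sketch below uses placeholder values (`my-model`, `MY_PROVIDER_API_KEY`) purely for illustration:
+
+```json
+{
+  "modelProviders": {
+    "openai": [
+      {
+        "id": "my-model",
+        "envKey": "MY_PROVIDER_API_KEY"
+      }
+    ]
+  }
+}
+```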
+ +> [!warning] +> **Duplicate model IDs within the same authType:** Defining multiple models with the same `id` under a single `authType` (e.g., two entries with `"id": "gpt-4o"` in `openai`) is currently not supported. If duplicates exist, **the first occurrence wins** and subsequent duplicates are skipped with a warning. Note that the `id` field is used both as the configuration identifier and as the actual model name sent to the API, so using unique IDs (e.g., `gpt-4o-creative`, `gpt-4o-balanced`) is not a viable workaround. This is a known limitation that we plan to address in a future release. + +##### Configuration Examples by Auth Type + +Below are comprehensive configuration examples for different authentication types, showing the available parameters and their combinations: + +**OpenAI-compatible providers** (`openai`): ```json { @@ -198,47 +208,213 @@ Use `modelProviders` to declare curated model lists per auth type that the `/mod "generationConfig": { "timeout": 60000, "maxRetries": 3, + "enableCacheControl": true, + "contextWindowSize": 128000, "customHeaders": { - "X-Model-Version": "v1.0", - "X-Request-Priority": "high" + "X-Request-ID": "req-123", + "X-User-ID": "user-456" }, "extra_body": { - "enable_thinking": true + "enable_thinking": true, + "service_tier": "priority" }, - "samplingParams": { "temperature": 0.2 } + "samplingParams": { + "temperature": 0.2, + "top_p": 0.8, + "max_tokens": 4096, + "presence_penalty": 0.1, + "frequency_penalty": 0.1 + } } - } - ], - "anthropic": [ + }, { - "id": "claude-3-5-sonnet", - "envKey": "ANTHROPIC_API_KEY", - "baseUrl": "https://api.anthropic.com/v1" - } - ], - "gemini": [ - { - "id": "gemini-2.0-flash", - "name": "Gemini 2.0 Flash", - "envKey": "GEMINI_API_KEY", - "baseUrl": "https://generativelanguage.googleapis.com" - } - ], - "vertex-ai": [ - { - "id": "gemini-1.5-pro-vertex", - "envKey": "GOOGLE_API_KEY", - "baseUrl": "https://generativelanguage.googleapis.com" + "id": "gpt-4o-mini", + "name": "GPT-4o 
Mini", + "envKey": "OPENAI_API_KEY", + "baseUrl": "https://api.openai.com/v1", + "generationConfig": { + "timeout": 30000, + "samplingParams": { + "temperature": 0.5, + "max_tokens": 2048 + } + } } ] } } ``` -> [!note] -> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, Vertex AI, etc., must be defined via `modelProviders`. The `/auth` command intentionally lists only the built-in Qwen OAuth and OpenAI flows. +**Anthropic** (`anthropic`): -##### Resolution layers and atomicity +```json +{ + "modelProviders": { + "anthropic": [ + { + "id": "claude-3-5-sonnet", + "name": "Claude 3.5 Sonnet", + "envKey": "ANTHROPIC_API_KEY", + "baseUrl": "https://api.anthropic.com/v1", + "generationConfig": { + "timeout": 120000, + "maxRetries": 3, + "contextWindowSize": 200000, + "customHeaders": { + "anthropic-version": "2023-06-01" + }, + "samplingParams": { + "temperature": 0.7, + "max_tokens": 8192, + "top_p": 0.9 + } + } + }, + { + "id": "claude-3-opus", + "name": "Claude 3 Opus", + "envKey": "ANTHROPIC_API_KEY", + "baseUrl": "https://api.anthropic.com/v1", + "generationConfig": { + "timeout": 180000, + "samplingParams": { + "temperature": 0.3, + "max_tokens": 4096 + } + } + } + ] + } +} +``` + +**Google Gemini** (`gemini`): + +```json +{ + "modelProviders": { + "gemini": [ + { + "id": "gemini-2.0-flash", + "name": "Gemini 2.0 Flash", + "envKey": "GEMINI_API_KEY", + "baseUrl": "https://generativelanguage.googleapis.com", + "capabilities": { + "vision": true + }, + "generationConfig": { + "timeout": 60000, + "maxRetries": 2, + "contextWindowSize": 1000000, + "schemaCompliance": "auto", + "samplingParams": { + "temperature": 0.4, + "top_p": 0.95, + "max_tokens": 8192, + "top_k": 40 + } + } + } + ] + } +} +``` + +**Google Vertex AI** (`vertex-ai`): + +```json +{ + "modelProviders": { + "vertex-ai": [ + { + "id": "gemini-1.5-pro-vertex", + "name": "Gemini 1.5 Pro (Vertex AI)", + "envKey": "GOOGLE_API_KEY", + "baseUrl": 
"https://generativelanguage.googleapis.com", + "generationConfig": { + "timeout": 90000, + "contextWindowSize": 2000000, + "samplingParams": { + "temperature": 0.2, + "max_tokens": 8192 + } + } + } + ] + } +} +``` + +**Local Self-Hosted Models (via OpenAI-compatible API)**: + +Most local inference servers (vLLM, Ollama, LM Studio, etc.) provide an OpenAI-compatible API endpoint. Configure them using the `openai` auth type with a local `baseUrl`: + +```json +{ + "modelProviders": { + "openai": [ + { + "id": "qwen2.5-7b", + "name": "Qwen2.5 7B (Ollama)", + "envKey": "OLLAMA_API_KEY", + "baseUrl": "http://localhost:11434/v1", + "generationConfig": { + "timeout": 300000, + "maxRetries": 1, + "contextWindowSize": 32768, + "samplingParams": { + "temperature": 0.7, + "top_p": 0.9, + "max_tokens": 4096 + } + } + }, + { + "id": "llama-3.1-8b", + "name": "Llama 3.1 8B (vLLM)", + "envKey": "VLLM_API_KEY", + "baseUrl": "http://localhost:8000/v1", + "generationConfig": { + "timeout": 120000, + "maxRetries": 2, + "contextWindowSize": 128000, + "samplingParams": { + "temperature": 0.6, + "max_tokens": 8192 + } + } + }, + { + "id": "local-model", + "name": "Local Model (LM Studio)", + "envKey": "LMSTUDIO_API_KEY", + "baseUrl": "http://localhost:1234/v1", + "generationConfig": { + "timeout": 60000, + "samplingParams": { + "temperature": 0.5 + } + } + } + ] + } +} +``` + +For local servers that don't require authentication, you can use any placeholder value for the API key: + +```bash +# For Ollama (no auth required) +export OLLAMA_API_KEY="ollama" + +# For vLLM (if no auth is configured) +export VLLM_API_KEY="not-needed" +``` + +> [!note] +> The `extra_body` parameter is **only supported for OpenAI-compatible providers** (`openai`, `qwen-oauth`). It is ignored for Anthropic, Gemini, and Vertex AI providers. + +##### Resolution Layers and Atomicity The effective auth/model/credential values are chosen per field using the following precedence (first present wins). 
You can combine `--auth-type` with `--model` to point directly at a provider entry; these CLI flags run before other layers. @@ -253,28 +429,139 @@ The effective auth/model/credential values are chosen per field using the follow \*When present, CLI auth flags override settings. Otherwise, `security.auth.selectedType` or the implicit default determine the auth type. Qwen OAuth and OpenAI are the only auth types surfaced without extra configuration. -Model-provider sourced values are applied atomically: once a provider model is active, every field it defines is protected from lower layers until you manually clear credentials via `/auth`. The final `generationConfig` is the projection across all layers—lower layers only fill gaps left by higher ones, and the provider layer remains impenetrable. +> [!warning] +> **Deprecation of `security.auth.apiKey` and `security.auth.baseUrl`:** Directly configuring API credentials via `security.auth.apiKey` and `security.auth.baseUrl` in `settings.json` is deprecated. These settings were used in historical versions for credentials entered through the UI, but the credential input flow was removed in version 0.10.1. These fields will be fully removed in a future release. **It is strongly recommended to migrate to `modelProviders`** for all model and credential configurations. Use `envKey` in `modelProviders` to reference environment variables for secure credential management instead of hardcoding credentials in settings files. -The merge strategy for `modelProviders` is REPLACE: the entire `modelProviders` from project settings will override the corresponding section in user settings, rather than merging the two. +##### Generation Config Layering: The Impermeable Provider Layer -##### Generation config layering +The configuration resolution follows a strict layering model with one crucial rule: **the modelProvider layer is impermeable**. -Per-field precedence for `generationConfig`: +**How it works:** -1. Programmatic overrides (e.g. 
runtime `/model`, `/auth` changes) -2. `modelProviders[authType][].generationConfig` -3. `settings.model.generationConfig` -4. Content-generator defaults (`getDefaultGenerationConfig` for OpenAI, `getParameterValue` for Gemini, etc.) +1. **When a modelProvider model IS selected** (e.g., via `/model` command choosing a provider-configured model): + - The entire `generationConfig` from the provider is applied **atomically** + - **The provider layer is completely impermeable** — lower layers (CLI, env, settings) do not participate in generationConfig resolution at all + - All fields defined in `modelProviders[].generationConfig` use the provider's values + - All fields **not defined** by the provider are set to `undefined` (not inherited from settings) + - This ensures provider configurations act as a complete, self-contained "sealed package" -`samplingParams`, `customHeaders`, and `extra_body` are all treated atomically; provider values replace the entire object. If `modelProviders[].generationConfig` defines these fields, they are used directly; otherwise, values from `model.generationConfig` are used. No merging occurs between provider and global configuration levels. Defaults from the content generator apply last so each provider retains its tuned baseline. +2. 
**When NO modelProvider model is selected** (e.g., using `--model` with a raw model ID, or using CLI/env/settings directly): + - The resolution falls through to lower layers + - Fields are populated from CLI → env → settings → defaults + - This creates a **Runtime Model** (see next section) -##### Selection persistence and recommendations +**Per-field precedence for `generationConfig`:** + +| Priority | Source | Behavior | +|----------|--------|----------| +| 1 | Programmatic overrides | Runtime `/model`, `/auth` changes | +| 2 | `modelProviders[authType][].generationConfig` | **Impermeable layer** - completely replaces all generationConfig fields; lower layers do not participate | +| 3 | `settings.model.generationConfig` | Only used for **Runtime Models** (when no provider model is selected) | +| 4 | Content-generator defaults | Provider-specific defaults (e.g., OpenAI vs Gemini) - only for Runtime Models | + +**Atomic field treatment:** + +The following fields are treated as atomic objects - provider values completely replace the entire object, no merging occurs: + +- `samplingParams` - Temperature, top_p, max_tokens, etc. 
+- `customHeaders` - Custom HTTP headers +- `extra_body` - Extra request body parameters + +**Example:** + +```json +// User settings (~/.qwen/settings.json) +{ + "model": { + "generationConfig": { + "timeout": 30000, + "samplingParams": { "temperature": 0.5, "max_tokens": 1000 } + } + } +} + +// modelProviders configuration +{ + "modelProviders": { + "openai": [{ + "id": "gpt-4o", + "envKey": "OPENAI_API_KEY", + "generationConfig": { + "timeout": 60000, + "samplingParams": { "temperature": 0.2 } + } + }] + } +} +``` + +When `gpt-4o` is selected from modelProviders: +- `timeout` = 60000 (from provider, overrides settings) +- `samplingParams.temperature` = 0.2 (from provider, completely replaces settings object) +- `samplingParams.max_tokens` = **undefined** (not defined in provider, and provider layer does not inherit from settings — fields are explicitly set to undefined if not provided) + +When using a raw model via `--model gpt-4` (not from modelProviders, creates a Runtime Model): +- `timeout` = 30000 (from settings) +- `samplingParams.temperature` = 0.5 (from settings) +- `samplingParams.max_tokens` = 1000 (from settings) + +The merge strategy for `modelProviders` itself is REPLACE: the entire `modelProviders` from project settings will override the corresponding section in user settings, rather than merging the two. 
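+
+For example, given the two hypothetical catalogs below, the project-level `modelProviders` replaces the user-level one wholesale: the user's `anthropic` list does not survive, because the two sections are never merged:
+
+```json
+// User settings (~/.qwen/settings.json)
+{
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o", "envKey": "OPENAI_API_KEY" }],
+    "anthropic": [{ "id": "claude-3-5-sonnet", "envKey": "ANTHROPIC_API_KEY" }]
+  }
+}
+
+// Project settings: the entire user-level object above is discarded in favor of this one
+{
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o-mini", "envKey": "OPENAI_API_KEY" }]
+  }
+}
+```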
+ +##### Provider Models vs Runtime Models + +Qwen Code distinguishes between two types of model configurations: + +**Provider Model**: +- Defined in `modelProviders` configuration +- Has a complete, atomic configuration package +- When selected, its configuration is applied as an impermeable layer +- Appears in `/model` command list with full metadata (name, description, capabilities) +- Recommended for multi-model workflows and team consistency + +**Runtime Model**: +- Created dynamically when using raw model IDs via CLI (`--model`), environment variables, or settings +- Not defined in `modelProviders` +- Configuration is built by "projecting" through resolution layers (CLI → env → settings → defaults) +- Automatically captured as a **RuntimeModelSnapshot** when a complete configuration is detected +- Allows reuse without re-entering credentials + +**RuntimeModelSnapshot lifecycle:** + +When you configure a model without using `modelProviders`, Qwen Code automatically creates a RuntimeModelSnapshot to preserve your configuration: + +```bash +# This creates a RuntimeModelSnapshot with ID: $runtime|openai|my-custom-model +qwen --auth-type openai --model my-custom-model --openaiApiKey $KEY --openaiBaseUrl https://api.example.com/v1 +``` + +The snapshot: +- Captures model ID, API key, base URL, and generation config +- Persists across sessions (stored in memory during runtime) +- Appears in the `/model` command list as a runtime option +- Can be switched to using `/model $runtime|openai|my-custom-model` + +**Key differences:** + +| Aspect | Provider Model | Runtime Model | +|--------|---------------|---------------| +| Configuration source | `modelProviders` in settings | CLI, env, settings layers | +| Configuration atomicity | Complete, impermeable package | Layered, each field resolved independently | +| Reusability | Always available in `/model` list | Captured as snapshot, appears if complete | +| Team sharing | Yes (via committed settings) | No (user-local) | +| 
Credential storage | Reference via `envKey` only | May capture actual key in snapshot | + +**When to use each:** + +- **Use Provider Models** when: You have standard models shared across a team, need consistent configurations, or want to prevent accidental overrides +- **Use Runtime Models** when: Quickly testing a new model, using temporary credentials, or working with ad-hoc endpoints + +##### Selection Persistence and Recommendations > [!important] > Define `modelProviders` in the user-scope `~/.qwen/settings.json` whenever possible and avoid persisting credential overrides in any scope. Keeping the provider catalog in user settings prevents merge/override conflicts between project and user scopes and ensures `/auth` and `/model` updates always write back to a consistent scope. - `/model` and `/auth` persist `model.name` (where applicable) and `security.auth.selectedType` to the closest writable scope that already defines `modelProviders`; otherwise they fall back to the user scope. This keeps workspace/user files in sync with the active provider catalog. -- Without `modelProviders`, the resolver mixes CLI/env/settings layers, which is fine for single-provider setups but cumbersome when frequently switching. Define provider catalogs whenever multi-model workflows are common so that switches stay atomic, source-attributed, and debuggable. +- Without `modelProviders`, the resolver mixes CLI/env/settings layers, creating Runtime Models. This is fine for single-provider setups but cumbersome when frequently switching. Define provider catalogs whenever multi-model workflows are common so that switches stay atomic, source-attributed, and debuggable. 
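+
+Following the recommendation above, a user-scope file that keeps the catalog together with the fields `/model` and `/auth` write back might look like this (values are illustrative):
+
+```json
+// ~/.qwen/settings.json
+{
+  "security": { "auth": { "selectedType": "openai" } },
+  "model": { "name": "gpt-4o" },
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o", "envKey": "OPENAI_API_KEY" }]
+  }
+}
+```
+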
#### context From 9cb624f79f0f648a6fc6ffe9da8d691162ecc378 Mon Sep 17 00:00:00 2001 From: "mingholy.lmh" Date: Wed, 25 Feb 2026 10:07:41 +0800 Subject: [PATCH 2/6] docs: extract modelProviders section to standalone model-providers.md document - Create new model-providers.md with complete model provider configuration guide - Add Bailian Coding Plan documentation with setup and auto-update details - Remove modelProviders content from settings.md to avoid duplication - Document reserved envKey BAILIAN_CODING_PLAN_API_KEY and security recommendations Co-authored-by: Qwen-Coder --- docs/users/configuration/model-providers.md | 521 ++++++++++++++++++++ docs/users/configuration/settings.md | 386 +-------------- 2 files changed, 522 insertions(+), 385 deletions(-) create mode 100644 docs/users/configuration/model-providers.md diff --git a/docs/users/configuration/model-providers.md b/docs/users/configuration/model-providers.md new file mode 100644 index 000000000..2e6265917 --- /dev/null +++ b/docs/users/configuration/model-providers.md @@ -0,0 +1,521 @@ +# Model Providers + +Qwen Code allows you to configure multiple model providers through the `modelProviders` setting in your `settings.json`. This enables you to switch between different AI models and providers using the `/model` command. + +## Overview + +Use `modelProviders` to declare curated model lists per auth type that the `/model` picker can switch between. Keys must be valid auth types (`openai`, `anthropic`, `gemini`, `vertex-ai`, etc.). Each entry requires an `id` and **must include `envKey`**, with optional `name`, `description`, `baseUrl`, and `generationConfig`. Credentials are never persisted in settings; the runtime reads them from `process.env[envKey]`. Qwen OAuth models remain hard-coded and cannot be overridden. + +> [!note] +> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, Vertex AI, etc., must be defined via `modelProviders`. 
The `/auth` command intentionally lists only the built-in Qwen OAuth and OpenAI flows. + +> [!warning] +> **Duplicate model IDs within the same authType:** Defining multiple models with the same `id` under a single `authType` (e.g., two entries with `"id": "gpt-4o"` in `openai`) is currently not supported. If duplicates exist, **the first occurrence wins** and subsequent duplicates are skipped with a warning. Note that the `id` field is used both as the configuration identifier and as the actual model name sent to the API, so using unique IDs (e.g., `gpt-4o-creative`, `gpt-4o-balanced`) is not a viable workaround. This is a known limitation that we plan to address in a future release. + +## Configuration Examples by Auth Type + +Below are comprehensive configuration examples for different authentication types, showing the available parameters and their combinations. + +### Supported Auth Types + +The `modelProviders` object keys must be valid `authType` values. Currently supported auth types are: + +| Auth Type | Description | +| ------------ | --------------------------------------------------------------------------------------- | +| `openai` | OpenAI-compatible APIs (OpenAI, Azure OpenAI, local inference servers like vLLM/Ollama) | +| `anthropic` | Anthropic Claude API | +| `gemini` | Google Gemini API | +| `vertex-ai` | Google Vertex AI | +| `qwen-oauth` | Qwen OAuth (hard-coded, cannot be overridden in `modelProviders`) | + +> [!warning] +> If an invalid auth type key is used (e.g., a typo like `"openai-custom"`), the configuration will be **silently skipped** and the models will not appear in the `/model` picker. Always use one of the supported auth type values listed above. 
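+
+For example, in the hypothetical catalog below only the `openai` models appear in the `/model` picker; the misspelled `openai-custom` key is skipped silently and its model never surfaces:
+
+```json
+{
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o", "envKey": "OPENAI_API_KEY" }],
+    "openai-custom": [{ "id": "my-model", "envKey": "MY_API_KEY" }]
+  }
+}
+```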
+ +### SDKs Used for API Requests + +Qwen Code uses the following official SDKs to send requests to each provider: + +| Auth Type | SDK Package | +| ---------------------- | ----------------------------------------------------------------------------------------------- | +| `openai` | [`openai`](https://www.npmjs.com/package/openai) - Official OpenAI Node.js SDK | +| `anthropic` | [`@anthropic-ai/sdk`](https://www.npmjs.com/package/@anthropic-ai/sdk) - Official Anthropic SDK | +| `gemini` / `vertex-ai` | [`@google/genai`](https://www.npmjs.com/package/@google/genai) - Official Google GenAI SDK | +| `qwen-oauth` | [`openai`](https://www.npmjs.com/package/openai) with custom provider (DashScope-compatible) | + +This means the `baseUrl` you configure should be compatible with the corresponding SDK's expected API format. For example, when using `openai` auth type, the endpoint must accept OpenAI API format requests. + +### OpenAI-compatible providers (`openai`) + +This auth type supports not only OpenAI's official API but also any OpenAI-compatible endpoint, including aggregated model providers like OpenRouter. 
+ +```json +{ + "modelProviders": { + "openai": [ + { + "id": "gpt-4o", + "name": "GPT-4o", + "envKey": "OPENAI_API_KEY", + "baseUrl": "https://api.openai.com/v1", + "generationConfig": { + "timeout": 60000, + "maxRetries": 3, + "enableCacheControl": true, + "contextWindowSize": 128000, + "customHeaders": { + "X-Client-Request-ID": "req-123" + }, + "extra_body": { + "enable_thinking": true, + "service_tier": "priority" + }, + "samplingParams": { + "temperature": 0.2, + "top_p": 0.8, + "max_tokens": 4096, + "presence_penalty": 0.1, + "frequency_penalty": 0.1 + } + } + }, + { + "id": "gpt-4o-mini", + "name": "GPT-4o Mini", + "envKey": "OPENAI_API_KEY", + "baseUrl": "https://api.openai.com/v1", + "generationConfig": { + "timeout": 30000, + "samplingParams": { + "temperature": 0.5, + "max_tokens": 2048 + } + } + }, + { + "id": "openai/gpt-4o", + "name": "GPT-4o (via OpenRouter)", + "envKey": "OPENROUTER_API_KEY", + "baseUrl": "https://openrouter.ai/api/v1", + "generationConfig": { + "timeout": 120000, + "maxRetries": 3, + "samplingParams": { + "temperature": 0.7 + } + } + } + ] + } +} +``` + +### Anthropic (`anthropic`) + +```json +{ + "modelProviders": { + "anthropic": [ + { + "id": "claude-3-5-sonnet", + "name": "Claude 3.5 Sonnet", + "envKey": "ANTHROPIC_API_KEY", + "baseUrl": "https://api.anthropic.com/v1", + "generationConfig": { + "timeout": 120000, + "maxRetries": 3, + "contextWindowSize": 200000, + "samplingParams": { + "temperature": 0.7, + "max_tokens": 8192, + "top_p": 0.9 + } + } + }, + { + "id": "claude-3-opus", + "name": "Claude 3 Opus", + "envKey": "ANTHROPIC_API_KEY", + "baseUrl": "https://api.anthropic.com/v1", + "generationConfig": { + "timeout": 180000, + "samplingParams": { + "temperature": 0.3, + "max_tokens": 4096 + } + } + } + ] + } +} +``` + +### Google Gemini (`gemini`) + +```json +{ + "modelProviders": { + "gemini": [ + { + "id": "gemini-2.0-flash", + "name": "Gemini 2.0 Flash", + "envKey": "GEMINI_API_KEY", + "baseUrl": 
"https://generativelanguage.googleapis.com", + "capabilities": { + "vision": true + }, + "generationConfig": { + "timeout": 60000, + "maxRetries": 2, + "contextWindowSize": 1000000, + "schemaCompliance": "auto", + "samplingParams": { + "temperature": 0.4, + "top_p": 0.95, + "max_tokens": 8192, + "top_k": 40 + } + } + } + ] + } +} +``` + +### Google Vertex AI (`vertex-ai`) + +```json +{ + "modelProviders": { + "vertex-ai": [ + { + "id": "gemini-1.5-pro-vertex", + "name": "Gemini 1.5 Pro (Vertex AI)", + "envKey": "GOOGLE_API_KEY", + "baseUrl": "https://generativelanguage.googleapis.com", + "generationConfig": { + "timeout": 90000, + "contextWindowSize": 2000000, + "samplingParams": { + "temperature": 0.2, + "max_tokens": 8192 + } + } + } + ] + } +} +``` + +### Local Self-Hosted Models (via OpenAI-compatible API) + +Most local inference servers (vLLM, Ollama, LM Studio, etc.) provide an OpenAI-compatible API endpoint. Configure them using the `openai` auth type with a local `baseUrl`: + +```json +{ + "modelProviders": { + "openai": [ + { + "id": "qwen2.5-7b", + "name": "Qwen2.5 7B (Ollama)", + "envKey": "OLLAMA_API_KEY", + "baseUrl": "http://localhost:11434/v1", + "generationConfig": { + "timeout": 300000, + "maxRetries": 1, + "contextWindowSize": 32768, + "samplingParams": { + "temperature": 0.7, + "top_p": 0.9, + "max_tokens": 4096 + } + } + }, + { + "id": "llama-3.1-8b", + "name": "Llama 3.1 8B (vLLM)", + "envKey": "VLLM_API_KEY", + "baseUrl": "http://localhost:8000/v1", + "generationConfig": { + "timeout": 120000, + "maxRetries": 2, + "contextWindowSize": 128000, + "samplingParams": { + "temperature": 0.6, + "max_tokens": 8192 + } + } + }, + { + "id": "local-model", + "name": "Local Model (LM Studio)", + "envKey": "LMSTUDIO_API_KEY", + "baseUrl": "http://localhost:1234/v1", + "generationConfig": { + "timeout": 60000, + "samplingParams": { + "temperature": 0.5 + } + } + } + ] + } +} +``` + +For local servers that don't require authentication, you can use any 
placeholder value for the API key: + +```bash +# For Ollama (no auth required) +export OLLAMA_API_KEY="ollama" + +# For vLLM (if no auth is configured) +export VLLM_API_KEY="not-needed" +``` + +> [!note] +> The `extra_body` parameter is **only supported for OpenAI-compatible providers** (`openai`, `qwen-oauth`). It is ignored for Anthropic, Gemini, and Vertex AI providers. + +## Bailian Coding Plan + +Bailian Coding Plan provides a pre-configured set of Qwen models optimized for coding tasks. This feature is available for users with Bailian API access and offers a simplified setup experience with automatic model configuration updates. + +### Overview + +When you authenticate with a Bailian Coding Plan API key using the `/auth` command, Qwen Code automatically configures the following models: + +| Model ID | Name | Description | +| ---------------------- | -------------------- | -------------------------------------- | +| `qwen3.5-plus` | qwen3.5-plus | Advanced model with thinking enabled | +| `qwen3-coder-plus` | qwen3-coder-plus | Optimized for coding tasks | +| `qwen3-max-2026-01-23` | qwen3-max-2026-01-23 | Latest max model with thinking enabled | + +### Setup + +1. Obtain a Bailian Coding Plan API key: + - **China**: + - **International**: +2. Run the `/auth` command in Qwen Code +3. Select the API-KEY authentication method +4. Select your region (China or Global/International) +5. Enter your API key when prompted + +The models will be automatically configured and added to your `/model` picker. 
+ +### Regions + +Bailian Coding Plan supports two regions: + +| Region | Endpoint | Description | +| -------------------- | ----------------------------------------------- | ----------------------- | +| China | `https://coding.dashscope.aliyuncs.com/v1` | Mainland China endpoint | +| Global/International | `https://coding-intl.dashscope.aliyuncs.com/v1` | International endpoint | + +The region is selected during authentication and stored in `settings.json` under `codingPlan.region`. To switch regions, re-run the `/auth` command and select a different region. + +### API Key Storage + +When you configure Coding Plan through the `/auth` command, the API key is stored using the reserved environment variable name `BAILIAN_CODING_PLAN_API_KEY`. By default, it is stored in the `settings.env` field of your `settings.json` file. + +> [!warning] +> **Security Recommendation**: For better security, it is recommended to move the API key from `settings.json` to a separate `.env` file and load it as an environment variable. For example: +> +> ```bash +> # ~/.qwen/.env +> BAILIAN_CODING_PLAN_API_KEY=your-api-key-here +> ``` +> +> Then ensure this file is added to your `.gitignore` if you're using project-level settings. + +### Automatic Updates + +Coding Plan model configurations are versioned. When Qwen Code detects a newer version of the model template, you will be prompted to update. Accepting the update will: + +- Replace the existing Coding Plan model configurations with the latest versions +- Preserve any custom model configurations you've added manually +- Automatically switch to the first model in the updated configuration + +The update process ensures you always have access to the latest model configurations and features without manual intervention. 
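+
+Combining the details above, the Coding Plan state written by `/auth` might look roughly like this in `settings.json` (illustrative; the exact field layout may differ between versions):
+
+```json
+{
+  "codingPlan": {
+    "region": "international"
+  },
+  "env": {
+    "BAILIAN_CODING_PLAN_API_KEY": "your-api-key-here"
+  }
+}
+```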
+
+### Manual Configuration (Advanced)
+
+If you prefer to manually configure Coding Plan models, you can add them to your `settings.json` like any OpenAI-compatible provider:
+
+```json
+{
+  "modelProviders": {
+    "openai": [
+      {
+        "id": "qwen3-coder-plus",
+        "name": "qwen3-coder-plus",
+        "description": "Qwen3-Coder via Bailian Coding Plan",
+        "envKey": "YOUR_CUSTOM_ENV_KEY",
+        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1"
+      }
+    ]
+  }
+}
+```
+
+> [!note]
+> When using manual configuration:
+>
+> - You can use any environment variable name for `envKey`
+> - You do not need to configure `codingPlan.*`
+> - **Automatic updates will not apply** to manually configured Coding Plan models
+
+> [!warning]
+> If you also use automatic Coding Plan configuration, automatic updates may overwrite your manual configurations if they use the same `envKey` and `baseUrl` as the automatic configuration. To avoid this, ensure your manual configuration uses a different `envKey` if possible.
+
+## Resolution Layers and Atomicity
+
+The effective auth/model/credential values are chosen per field using the following precedence (first present wins). You can combine `--auth-type` with `--model` to point directly at a provider entry; these CLI flags run before other layers.
+ +| Layer (highest → lowest) | authType | model | apiKey | baseUrl | apiKeyEnvKey | proxy | +| -------------------------- | ----------------------------------- | ----------------------------------------------- | --------------------------------------------------- | ---------------------------------------------------- | ---------------------- | --------------------------------- | +| Programmatic overrides | `/auth` | `/auth` input | `/auth` input | `/auth` input | — | — | +| Model provider selection | — | `modelProvider.id` | `env[modelProvider.envKey]` | `modelProvider.baseUrl` | `modelProvider.envKey` | — | +| CLI arguments | `--auth-type` | `--model` | `--openaiApiKey` (or provider-specific equivalents) | `--openaiBaseUrl` (or provider-specific equivalents) | — | — | +| Environment variables | — | Provider-specific mapping (e.g. `OPENAI_MODEL`) | Provider-specific mapping (e.g. `OPENAI_API_KEY`) | Provider-specific mapping (e.g. `OPENAI_BASE_URL`) | — | — | +| Settings (`settings.json`) | `security.auth.selectedType` | `model.name` | `security.auth.apiKey` | `security.auth.baseUrl` | — | — | +| Default / computed | Falls back to `AuthType.QWEN_OAUTH` | Built-in default (OpenAI ⇒ `qwen3-coder-plus`) | — | — | — | `Config.getProxy()` if configured | + +\*When present, CLI auth flags override settings. Otherwise, `security.auth.selectedType` or the implicit default determine the auth type. Qwen OAuth and OpenAI are the only auth types surfaced without extra configuration. + +> [!warning] +> **Deprecation of `security.auth.apiKey` and `security.auth.baseUrl`:** Directly configuring API credentials via `security.auth.apiKey` and `security.auth.baseUrl` in `settings.json` is deprecated. These settings were used in historical versions for credentials entered through the UI, but the credential input flow was removed in version 0.10.1. These fields will be fully removed in a future release. 
**It is strongly recommended to migrate to `modelProviders`** for all model and credential configurations. Use `envKey` in `modelProviders` to reference environment variables for secure credential management instead of hardcoding credentials in settings files. + +## Generation Config Layering: The Impermeable Provider Layer + +The configuration resolution follows a strict layering model with one crucial rule: **the modelProvider layer is impermeable**. + +### How it works + +1. **When a modelProvider model IS selected** (e.g., via `/model` command choosing a provider-configured model): + - The entire `generationConfig` from the provider is applied **atomically** + - **The provider layer is completely impermeable** — lower layers (CLI, env, settings) do not participate in generationConfig resolution at all + - All fields defined in `modelProviders[].generationConfig` use the provider's values + - All fields **not defined** by the provider are set to `undefined` (not inherited from settings) + - This ensures provider configurations act as a complete, self-contained "sealed package" + +2. 
**When NO modelProvider model is selected** (e.g., using `--model` with a raw model ID, or using CLI/env/settings directly): + - The resolution falls through to lower layers + - Fields are populated from CLI → env → settings → defaults + - This creates a **Runtime Model** (see next section) + +### Per-field precedence for `generationConfig` + +| Priority | Source | Behavior | +| -------- | --------------------------------------------- | -------------------------------------------------------------------------------------------------------- | +| 1 | Programmatic overrides | Runtime `/model`, `/auth` changes | +| 2 | `modelProviders[authType][].generationConfig` | **Impermeable layer** - completely replaces all generationConfig fields; lower layers do not participate | +| 3 | `settings.model.generationConfig` | Only used for **Runtime Models** (when no provider model is selected) | +| 4 | Content-generator defaults | Provider-specific defaults (e.g., OpenAI vs Gemini) - only for Runtime Models | + +### Atomic field treatment + +The following fields are treated as atomic objects - provider values completely replace the entire object, no merging occurs: + +- `samplingParams` - Temperature, top_p, max_tokens, etc. 
+- `customHeaders` - Custom HTTP headers +- `extra_body` - Extra request body parameters + +### Example + +```json +// User settings (~/.qwen/settings.json) +{ + "model": { + "generationConfig": { + "timeout": 30000, + "samplingParams": { "temperature": 0.5, "max_tokens": 1000 } + } + } +} + +// modelProviders configuration +{ + "modelProviders": { + "openai": [{ + "id": "gpt-4o", + "envKey": "OPENAI_API_KEY", + "generationConfig": { + "timeout": 60000, + "samplingParams": { "temperature": 0.2 } + } + }] + } +} +``` + +When `gpt-4o` is selected from modelProviders: + +- `timeout` = 60000 (from provider, overrides settings) +- `samplingParams.temperature` = 0.2 (from provider, completely replaces settings object) +- `samplingParams.max_tokens` = **undefined** (not defined in provider, and provider layer does not inherit from settings — fields are explicitly set to undefined if not provided) + +When using a raw model via `--model gpt-4` (not from modelProviders, creates a Runtime Model): + +- `timeout` = 30000 (from settings) +- `samplingParams.temperature` = 0.5 (from settings) +- `samplingParams.max_tokens` = 1000 (from settings) + +The merge strategy for `modelProviders` itself is REPLACE: the entire `modelProviders` from project settings will override the corresponding section in user settings, rather than merging the two. 
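+
+For example, suppose the user scope and the project scope both define `modelProviders` (the file locations and model IDs below are illustrative):
+
+```json
+// User scope (~/.qwen/settings.json)
+{
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o", "envKey": "OPENAI_API_KEY" }],
+    "anthropic": [{ "id": "claude-3-5-sonnet", "envKey": "ANTHROPIC_API_KEY" }]
+  }
+}
+
+// Project scope settings.json
+{
+  "modelProviders": {
+    "openai": [{ "id": "gpt-4o-mini", "envKey": "OPENAI_API_KEY" }]
+  }
+}
+```
+
+Because the project-scope section replaces the user-scope section wholesale, the effective catalog contains only `gpt-4o-mini`: the user-scope `gpt-4o` entry and the entire `anthropic` list are dropped rather than merged.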
+ +## Provider Models vs Runtime Models + +Qwen Code distinguishes between two types of model configurations: + +### Provider Model + +- Defined in `modelProviders` configuration +- Has a complete, atomic configuration package +- When selected, its configuration is applied as an impermeable layer +- Appears in `/model` command list with full metadata (name, description, capabilities) +- Recommended for multi-model workflows and team consistency + +### Runtime Model + +- Created dynamically when using raw model IDs via CLI (`--model`), environment variables, or settings +- Not defined in `modelProviders` +- Configuration is built by "projecting" through resolution layers (CLI → env → settings → defaults) +- Automatically captured as a **RuntimeModelSnapshot** when a complete configuration is detected +- Allows reuse without re-entering credentials + +### RuntimeModelSnapshot lifecycle + +When you configure a model without using `modelProviders`, Qwen Code automatically creates a RuntimeModelSnapshot to preserve your configuration: + +```bash +# This creates a RuntimeModelSnapshot with ID: $runtime|openai|my-custom-model +qwen --auth-type openai --model my-custom-model --openaiApiKey $KEY --openaiBaseUrl https://api.example.com/v1 +``` + +The snapshot: + +- Captures model ID, API key, base URL, and generation config +- Persists across sessions (stored in memory during runtime) +- Appears in the `/model` command list as a runtime option +- Can be switched to using `/model $runtime|openai|my-custom-model` + +### Key differences + +| Aspect | Provider Model | Runtime Model | +| ----------------------- | --------------------------------- | ------------------------------------------ | +| Configuration source | `modelProviders` in settings | CLI, env, settings layers | +| Configuration atomicity | Complete, impermeable package | Layered, each field resolved independently | +| Reusability | Always available in `/model` list | Captured as snapshot, appears if complete | +| 
Team sharing | Yes (via committed settings) | No (user-local) | +| Credential storage | Reference via `envKey` only | May capture actual key in snapshot | + +### When to use each + +- **Use Provider Models** when: You have standard models shared across a team, need consistent configurations, or want to prevent accidental overrides +- **Use Runtime Models** when: Quickly testing a new model, using temporary credentials, or working with ad-hoc endpoints + +## Selection Persistence and Recommendations + +> [!important] +> Define `modelProviders` in the user-scope `~/.qwen/settings.json` whenever possible and avoid persisting credential overrides in any scope. Keeping the provider catalog in user settings prevents merge/override conflicts between project and user scopes and ensures `/auth` and `/model` updates always write back to a consistent scope. + +- `/model` and `/auth` persist `model.name` (where applicable) and `security.auth.selectedType` to the closest writable scope that already defines `modelProviders`; otherwise they fall back to the user scope. This keeps workspace/user files in sync with the active provider catalog. +- Without `modelProviders`, the resolver mixes CLI/env/settings layers, creating Runtime Models. This is fine for single-provider setups but cumbersome when frequently switching. Define provider catalogs whenever multi-model workflows are common so that switches stay atomic, source-attributed, and debuggable. diff --git a/docs/users/configuration/settings.md b/docs/users/configuration/settings.md index aff42545d..82db2b319 100644 --- a/docs/users/configuration/settings.md +++ b/docs/users/configuration/settings.md @@ -148,8 +148,7 @@ Settings are organized into categories. 
All settings should be placed within the "contextWindowSize": 128000, "enableCacheControl": true, "customHeaders": { - "X-Request-ID": "req-123", - "X-User-ID": "user-456" + "X-Client-Request-ID": "req-123" }, "extra_body": { "enable_thinking": true @@ -180,389 +179,6 @@ The `extra_body` field allows you to add custom parameters to the request body s - `"./custom-logs"` - Logs to `./custom-logs` relative to current directory - `"/tmp/openai-logs"` - Logs to absolute path `/tmp/openai-logs` -#### modelProviders - -Use `modelProviders` to declare curated model lists per auth type that the `/model` picker can switch between. Keys must be valid auth types (`openai`, `anthropic`, `gemini`, `vertex-ai`, etc.). Each entry requires an `id` and **must include `envKey`**, with optional `name`, `description`, `baseUrl`, and `generationConfig`. Credentials are never persisted in settings; the runtime reads them from `process.env[envKey]`. Qwen OAuth models remain hard-coded and cannot be overridden. - -> [!note] -> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, Vertex AI, etc., must be defined via `modelProviders`. The `/auth` command intentionally lists only the built-in Qwen OAuth and OpenAI flows. - -> [!warning] -> **Duplicate model IDs within the same authType:** Defining multiple models with the same `id` under a single `authType` (e.g., two entries with `"id": "gpt-4o"` in `openai`) is currently not supported. If duplicates exist, **the first occurrence wins** and subsequent duplicates are skipped with a warning. Note that the `id` field is used both as the configuration identifier and as the actual model name sent to the API, so using unique IDs (e.g., `gpt-4o-creative`, `gpt-4o-balanced`) is not a viable workaround. This is a known limitation that we plan to address in a future release. 
- -##### Configuration Examples by Auth Type - -Below are comprehensive configuration examples for different authentication types, showing the available parameters and their combinations: - -**OpenAI-compatible providers** (`openai`): - -```json -{ - "modelProviders": { - "openai": [ - { - "id": "gpt-4o", - "name": "GPT-4o", - "envKey": "OPENAI_API_KEY", - "baseUrl": "https://api.openai.com/v1", - "generationConfig": { - "timeout": 60000, - "maxRetries": 3, - "enableCacheControl": true, - "contextWindowSize": 128000, - "customHeaders": { - "X-Request-ID": "req-123", - "X-User-ID": "user-456" - }, - "extra_body": { - "enable_thinking": true, - "service_tier": "priority" - }, - "samplingParams": { - "temperature": 0.2, - "top_p": 0.8, - "max_tokens": 4096, - "presence_penalty": 0.1, - "frequency_penalty": 0.1 - } - } - }, - { - "id": "gpt-4o-mini", - "name": "GPT-4o Mini", - "envKey": "OPENAI_API_KEY", - "baseUrl": "https://api.openai.com/v1", - "generationConfig": { - "timeout": 30000, - "samplingParams": { - "temperature": 0.5, - "max_tokens": 2048 - } - } - } - ] - } -} -``` - -**Anthropic** (`anthropic`): - -```json -{ - "modelProviders": { - "anthropic": [ - { - "id": "claude-3-5-sonnet", - "name": "Claude 3.5 Sonnet", - "envKey": "ANTHROPIC_API_KEY", - "baseUrl": "https://api.anthropic.com/v1", - "generationConfig": { - "timeout": 120000, - "maxRetries": 3, - "contextWindowSize": 200000, - "customHeaders": { - "anthropic-version": "2023-06-01" - }, - "samplingParams": { - "temperature": 0.7, - "max_tokens": 8192, - "top_p": 0.9 - } - } - }, - { - "id": "claude-3-opus", - "name": "Claude 3 Opus", - "envKey": "ANTHROPIC_API_KEY", - "baseUrl": "https://api.anthropic.com/v1", - "generationConfig": { - "timeout": 180000, - "samplingParams": { - "temperature": 0.3, - "max_tokens": 4096 - } - } - } - ] - } -} -``` - -**Google Gemini** (`gemini`): - -```json -{ - "modelProviders": { - "gemini": [ - { - "id": "gemini-2.0-flash", - "name": "Gemini 2.0 Flash", - "envKey": 
"GEMINI_API_KEY", - "baseUrl": "https://generativelanguage.googleapis.com", - "capabilities": { - "vision": true - }, - "generationConfig": { - "timeout": 60000, - "maxRetries": 2, - "contextWindowSize": 1000000, - "schemaCompliance": "auto", - "samplingParams": { - "temperature": 0.4, - "top_p": 0.95, - "max_tokens": 8192, - "top_k": 40 - } - } - } - ] - } -} -``` - -**Google Vertex AI** (`vertex-ai`): - -```json -{ - "modelProviders": { - "vertex-ai": [ - { - "id": "gemini-1.5-pro-vertex", - "name": "Gemini 1.5 Pro (Vertex AI)", - "envKey": "GOOGLE_API_KEY", - "baseUrl": "https://generativelanguage.googleapis.com", - "generationConfig": { - "timeout": 90000, - "contextWindowSize": 2000000, - "samplingParams": { - "temperature": 0.2, - "max_tokens": 8192 - } - } - } - ] - } -} -``` - -**Local Self-Hosted Models (via OpenAI-compatible API)**: - -Most local inference servers (vLLM, Ollama, LM Studio, etc.) provide an OpenAI-compatible API endpoint. Configure them using the `openai` auth type with a local `baseUrl`: - -```json -{ - "modelProviders": { - "openai": [ - { - "id": "qwen2.5-7b", - "name": "Qwen2.5 7B (Ollama)", - "envKey": "OLLAMA_API_KEY", - "baseUrl": "http://localhost:11434/v1", - "generationConfig": { - "timeout": 300000, - "maxRetries": 1, - "contextWindowSize": 32768, - "samplingParams": { - "temperature": 0.7, - "top_p": 0.9, - "max_tokens": 4096 - } - } - }, - { - "id": "llama-3.1-8b", - "name": "Llama 3.1 8B (vLLM)", - "envKey": "VLLM_API_KEY", - "baseUrl": "http://localhost:8000/v1", - "generationConfig": { - "timeout": 120000, - "maxRetries": 2, - "contextWindowSize": 128000, - "samplingParams": { - "temperature": 0.6, - "max_tokens": 8192 - } - } - }, - { - "id": "local-model", - "name": "Local Model (LM Studio)", - "envKey": "LMSTUDIO_API_KEY", - "baseUrl": "http://localhost:1234/v1", - "generationConfig": { - "timeout": 60000, - "samplingParams": { - "temperature": 0.5 - } - } - } - ] - } -} -``` - -For local servers that don't require 
authentication, you can use any placeholder value for the API key: - -```bash -# For Ollama (no auth required) -export OLLAMA_API_KEY="ollama" - -# For vLLM (if no auth is configured) -export VLLM_API_KEY="not-needed" -``` - -> [!note] -> The `extra_body` parameter is **only supported for OpenAI-compatible providers** (`openai`, `qwen-oauth`). It is ignored for Anthropic, Gemini, and Vertex AI providers. - -##### Resolution Layers and Atomicity - -The effective auth/model/credential values are chosen per field using the following precedence (first present wins). You can combine `--auth-type` with `--model` to point directly at a provider entry; these CLI flags run before other layers. - -| Layer (highest → lowest) | authType | model | apiKey | baseUrl | apiKeyEnvKey | proxy | -| -------------------------- | ----------------------------------- | ----------------------------------------------- | --------------------------------------------------- | ---------------------------------------------------- | ---------------------- | --------------------------------- | -| Programmatic overrides | `/auth ` | `/auth` input | `/auth` input | `/auth` input | — | — | -| Model provider selection | — | `modelProvider.id` | `env[modelProvider.envKey]` | `modelProvider.baseUrl` | `modelProvider.envKey` | — | -| CLI arguments | `--auth-type` | `--model` | `--openaiApiKey` (or provider-specific equivalents) | `--openaiBaseUrl` (or provider-specific equivalents) | — | — | -| Environment variables | — | Provider-specific mapping (e.g. `OPENAI_MODEL`) | Provider-specific mapping (e.g. `OPENAI_API_KEY`) | Provider-specific mapping (e.g. 
`OPENAI_BASE_URL`) | — | — | -| Settings (`settings.json`) | `security.auth.selectedType` | `model.name` | `security.auth.apiKey` | `security.auth.baseUrl` | — | — | -| Default / computed | Falls back to `AuthType.QWEN_OAUTH` | Built-in default (OpenAI ⇒ `qwen3-coder-plus`) | — | — | — | `Config.getProxy()` if configured | - -\*When present, CLI auth flags override settings. Otherwise, `security.auth.selectedType` or the implicit default determine the auth type. Qwen OAuth and OpenAI are the only auth types surfaced without extra configuration. - -> [!warning] -> **Deprecation of `security.auth.apiKey` and `security.auth.baseUrl`:** Directly configuring API credentials via `security.auth.apiKey` and `security.auth.baseUrl` in `settings.json` is deprecated. These settings were used in historical versions for credentials entered through the UI, but the credential input flow was removed in version 0.10.1. These fields will be fully removed in a future release. **It is strongly recommended to migrate to `modelProviders`** for all model and credential configurations. Use `envKey` in `modelProviders` to reference environment variables for secure credential management instead of hardcoding credentials in settings files. - -##### Generation Config Layering: The Impermeable Provider Layer - -The configuration resolution follows a strict layering model with one crucial rule: **the modelProvider layer is impermeable**. - -**How it works:** - -1. 
**When a modelProvider model IS selected** (e.g., via `/model` command choosing a provider-configured model): - - The entire `generationConfig` from the provider is applied **atomically** - - **The provider layer is completely impermeable** — lower layers (CLI, env, settings) do not participate in generationConfig resolution at all - - All fields defined in `modelProviders[].generationConfig` use the provider's values - - All fields **not defined** by the provider are set to `undefined` (not inherited from settings) - - This ensures provider configurations act as a complete, self-contained "sealed package" - -2. **When NO modelProvider model is selected** (e.g., using `--model` with a raw model ID, or using CLI/env/settings directly): - - The resolution falls through to lower layers - - Fields are populated from CLI → env → settings → defaults - - This creates a **Runtime Model** (see next section) - -**Per-field precedence for `generationConfig`:** - -| Priority | Source | Behavior | -|----------|--------|----------| -| 1 | Programmatic overrides | Runtime `/model`, `/auth` changes | -| 2 | `modelProviders[authType][].generationConfig` | **Impermeable layer** - completely replaces all generationConfig fields; lower layers do not participate | -| 3 | `settings.model.generationConfig` | Only used for **Runtime Models** (when no provider model is selected) | -| 4 | Content-generator defaults | Provider-specific defaults (e.g., OpenAI vs Gemini) - only for Runtime Models | - -**Atomic field treatment:** - -The following fields are treated as atomic objects - provider values completely replace the entire object, no merging occurs: - -- `samplingParams` - Temperature, top_p, max_tokens, etc. 
-- `customHeaders` - Custom HTTP headers -- `extra_body` - Extra request body parameters - -**Example:** - -```json -// User settings (~/.qwen/settings.json) -{ - "model": { - "generationConfig": { - "timeout": 30000, - "samplingParams": { "temperature": 0.5, "max_tokens": 1000 } - } - } -} - -// modelProviders configuration -{ - "modelProviders": { - "openai": [{ - "id": "gpt-4o", - "envKey": "OPENAI_API_KEY", - "generationConfig": { - "timeout": 60000, - "samplingParams": { "temperature": 0.2 } - } - }] - } -} -``` - -When `gpt-4o` is selected from modelProviders: -- `timeout` = 60000 (from provider, overrides settings) -- `samplingParams.temperature` = 0.2 (from provider, completely replaces settings object) -- `samplingParams.max_tokens` = **undefined** (not defined in provider, and provider layer does not inherit from settings — fields are explicitly set to undefined if not provided) - -When using a raw model via `--model gpt-4` (not from modelProviders, creates a Runtime Model): -- `timeout` = 30000 (from settings) -- `samplingParams.temperature` = 0.5 (from settings) -- `samplingParams.max_tokens` = 1000 (from settings) - -The merge strategy for `modelProviders` itself is REPLACE: the entire `modelProviders` from project settings will override the corresponding section in user settings, rather than merging the two. 
- -##### Provider Models vs Runtime Models - -Qwen Code distinguishes between two types of model configurations: - -**Provider Model**: -- Defined in `modelProviders` configuration -- Has a complete, atomic configuration package -- When selected, its configuration is applied as an impermeable layer -- Appears in `/model` command list with full metadata (name, description, capabilities) -- Recommended for multi-model workflows and team consistency - -**Runtime Model**: -- Created dynamically when using raw model IDs via CLI (`--model`), environment variables, or settings -- Not defined in `modelProviders` -- Configuration is built by "projecting" through resolution layers (CLI → env → settings → defaults) -- Automatically captured as a **RuntimeModelSnapshot** when a complete configuration is detected -- Allows reuse without re-entering credentials - -**RuntimeModelSnapshot lifecycle:** - -When you configure a model without using `modelProviders`, Qwen Code automatically creates a RuntimeModelSnapshot to preserve your configuration: - -```bash -# This creates a RuntimeModelSnapshot with ID: $runtime|openai|my-custom-model -qwen --auth-type openai --model my-custom-model --openaiApiKey $KEY --openaiBaseUrl https://api.example.com/v1 -``` - -The snapshot: -- Captures model ID, API key, base URL, and generation config -- Persists across sessions (stored in memory during runtime) -- Appears in the `/model` command list as a runtime option -- Can be switched to using `/model $runtime|openai|my-custom-model` - -**Key differences:** - -| Aspect | Provider Model | Runtime Model | -|--------|---------------|---------------| -| Configuration source | `modelProviders` in settings | CLI, env, settings layers | -| Configuration atomicity | Complete, impermeable package | Layered, each field resolved independently | -| Reusability | Always available in `/model` list | Captured as snapshot, appears if complete | -| Team sharing | Yes (via committed settings) | No (user-local) | -| 
Credential storage | Reference via `envKey` only | May capture actual key in snapshot | - -**When to use each:** - -- **Use Provider Models** when: You have standard models shared across a team, need consistent configurations, or want to prevent accidental overrides -- **Use Runtime Models** when: Quickly testing a new model, using temporary credentials, or working with ad-hoc endpoints - -##### Selection Persistence and Recommendations - -> [!important] -> Define `modelProviders` in the user-scope `~/.qwen/settings.json` whenever possible and avoid persisting credential overrides in any scope. Keeping the provider catalog in user settings prevents merge/override conflicts between project and user scopes and ensures `/auth` and `/model` updates always write back to a consistent scope. - -- `/model` and `/auth` persist `model.name` (where applicable) and `security.auth.selectedType` to the closest writable scope that already defines `modelProviders`; otherwise they fall back to the user scope. This keeps workspace/user files in sync with the active provider catalog. -- Without `modelProviders`, the resolver mixes CLI/env/settings layers, creating Runtime Models. This is fine for single-provider setups but cumbersome when frequently switching. Define provider catalogs whenever multi-model workflows are common so that switches stay atomic, source-attributed, and debuggable. 
- #### context | Setting | Type | Description | Default | From c4f5c82468b8eef6485acebefd98748acc1840ec Mon Sep 17 00:00:00 2001 From: "mingholy.lmh" Date: Wed, 25 Feb 2026 10:11:33 +0800 Subject: [PATCH 3/6] docs: update link to model-providers.md in auth.md Fix broken link from settings.md#modelproviders to new model-providers.md Co-authored-by: Qwen-Coder --- docs/users/configuration/auth.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/users/configuration/auth.md b/docs/users/configuration/auth.md index 2b56c1fb6..0d51a5715 100644 --- a/docs/users/configuration/auth.md +++ b/docs/users/configuration/auth.md @@ -205,7 +205,7 @@ Edit `~/.qwen/settings.json` (create it if it doesn't exist). You can mix multip > > When using the `env` field in `settings.json`, credentials are stored in plain text. For better security, prefer `.env` files or shell `export` — see [Step 2](#step-2-set-environment-variables). -For the full `modelProviders` schema and advanced options like `generationConfig`, `customHeaders`, and `extra_body`, see [Settings Reference → modelProviders](settings.md#modelproviders). +For the full `modelProviders` schema and advanced options like `generationConfig`, `customHeaders`, and `extra_body`, see [Model Providers Reference](model-providers.md). 
#### Step 2: Set environment variables From d12a593aa8969260ae0c69cd5f5662d0c04c4d06 Mon Sep 17 00:00:00 2001 From: "mingholy.lmh" Date: Wed, 25 Feb 2026 10:12:25 +0800 Subject: [PATCH 4/6] fix(openrouter): correct header name for OpenRouter provider Change X-Title to X-OpenRouter-Title for proper OpenRouter API compatibility Co-authored-by: Qwen-Coder --- .../core/src/core/openaiContentGenerator/provider/openrouter.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/packages/core/src/core/openaiContentGenerator/provider/openrouter.ts b/packages/core/src/core/openaiContentGenerator/provider/openrouter.ts index 7eb9d55af..9bf8716f2 100644 --- a/packages/core/src/core/openaiContentGenerator/provider/openrouter.ts +++ b/packages/core/src/core/openaiContentGenerator/provider/openrouter.ts @@ -25,7 +25,7 @@ export class OpenRouterOpenAICompatibleProvider extends DefaultOpenAICompatibleP return { ...baseHeaders, 'HTTP-Referer': 'https://github.com/QwenLM/qwen-code.git', - 'X-Title': 'Qwen Code', + 'X-OpenRouter-Title': 'Qwen Code', }; } } From b749ef302e12b1f40885dec1376c260993a1b488 Mon Sep 17 00:00:00 2001 From: "mingholy.lmh" Date: Wed, 25 Feb 2026 10:41:39 +0800 Subject: [PATCH 5/6] test(openrouter): update test expectations for X-OpenRouter-Title header Co-authored-by: Qwen-Coder --- .../openaiContentGenerator/provider/openrouter.test.ts | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/packages/core/src/core/openaiContentGenerator/provider/openrouter.test.ts b/packages/core/src/core/openaiContentGenerator/provider/openrouter.test.ts index cfd9c59d7..385cdd563 100644 --- a/packages/core/src/core/openaiContentGenerator/provider/openrouter.test.ts +++ b/packages/core/src/core/openaiContentGenerator/provider/openrouter.test.ts @@ -105,7 +105,7 @@ describe('OpenRouterOpenAICompatibleProvider', () => { expect(headers).toEqual({ 'User-Agent': `QwenCode/1.0.0 (${process.platform}; ${process.arch})`, 'HTTP-Referer': 
'https://github.com/QwenLM/qwen-code.git', - 'X-Title': 'Qwen Code', + 'X-OpenRouter-Title': 'Qwen Code', }); }); @@ -125,7 +125,7 @@ describe('OpenRouterOpenAICompatibleProvider', () => { expect(headers).toEqual({ 'User-Agent': 'ParentAgent/1.0.0', 'HTTP-Referer': 'https://github.com/QwenLM/qwen-code.git', // OpenRouter-specific value should override - 'X-Title': 'Qwen Code', + 'X-OpenRouter-Title': 'Qwen Code', }); parentBuildHeaders.mockRestore(); @@ -142,7 +142,7 @@ describe('OpenRouterOpenAICompatibleProvider', () => { expect(headers['HTTP-Referer']).toBe( 'https://github.com/QwenLM/qwen-code.git', ); - expect(headers['X-Title']).toBe('Qwen Code'); + expect(headers['X-OpenRouter-Title']).toBe('Qwen Code'); }); }); @@ -215,7 +215,7 @@ describe('OpenRouterOpenAICompatibleProvider', () => { expect(headers['HTTP-Referer']).toBe( 'https://github.com/QwenLM/qwen-code.git', ); // OpenRouter-specific - expect(headers['X-Title']).toBe('Qwen Code'); // OpenRouter-specific + expect(headers['X-OpenRouter-Title']).toBe('Qwen Code'); // OpenRouter-specific }); }); }); From 4013c7c5dbb5b98bf5c2e9121ce5997ee479a9fd Mon Sep 17 00:00:00 2001 From: pomelo-nwu Date: Wed, 25 Feb 2026 11:58:01 +0800 Subject: [PATCH 6/6] docs: update roadmap.md --- docs/developers/roadmap.md | 90 +++++++++++++++++++++----------------- 1 file changed, 49 insertions(+), 41 deletions(-) diff --git a/docs/developers/roadmap.md b/docs/developers/roadmap.md index 125a4d36e..83cd42355 100644 --- a/docs/developers/roadmap.md +++ b/docs/developers/roadmap.md @@ -2,13 +2,13 @@ > **Objective**: Catch up with Claude Code's product functionality, continuously refine details, and enhance user experience. 
-| Category | Phase 1 | Phase 2 | -| ------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | -| User Experience | ✅ Terminal UI
✅ Support OpenAI Protocol
✅ Settings
✅ OAuth
✅ Cache Control
✅ Memory
✅ Compress
✅ Theme | Better UI
OnBoarding
LogView
✅ Session
Permission
🔄 Cross-platform Compatibility | -| Coding Workflow | ✅ Slash Commands
✅ MCP
✅ PlanMode
✅ TodoWrite
✅ SubAgent
✅ Multi Model
✅ Chat Management
✅ Tools (WebFetch, Bash, TextSearch, FileReadFile, EditFile) | 🔄 Hooks
SubAgent (enhanced)
✅ Skill
✅ Headless Mode
✅ Tools (WebSearch) | -| Building Open Capabilities | ✅ Custom Commands | ✅ QwenCode SDK
Extension | -| Integrating Community Ecosystem | | ✅ VSCode Plugin
🔄 ACP/Zed
✅ GHA | -| Administrative Capabilities | ✅ Stats
✅ Feedback | Costs
Dashboard | +| Category | Phase 1 | Phase 2 | +| ------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| User Experience | ✅ Terminal UI
✅ Support OpenAI Protocol
✅ Settings
✅ OAuth
✅ Cache Control
✅ Memory
✅ Compress
✅ Theme | Better UI
OnBoarding
LogView
✅ Session
Permission
🔄 Cross-platform Compatibility
✅ Coding Plan
✅ Anthropic Provider
✅ Multimodal Input
✅ Unified WebUI | +| Coding Workflow | ✅ Slash Commands
✅ MCP
✅ PlanMode
✅ TodoWrite
✅ SubAgent
✅ Multi Model
✅ Chat Management
✅ Tools (WebFetch, Bash, TextSearch, FileReadFile, EditFile) | 🔄 Hooks
✅ Skill
✅ Headless Mode
✅ Tools (WebSearch)
✅ LSP Support
✅ Concurrent Runner | +| Building Open Capabilities | ✅ Custom Commands | ✅ QwenCode SDK
✅ Extension System | +| Integrating Community Ecosystem | | ✅ VSCode Plugin
✅ ACP/Zed
✅ GHA | +| Administrative Capabilities | ✅ Stats
✅ Feedback | Costs
Dashboard
✅ User Feedback Dialog |

 > For more details, please see the list below.
@@ -16,39 +16,48 @@

 #### Completed Features

-| Feature                 | Version   | Description                                             | Category                        |
-| ----------------------- | --------- | ------------------------------------------------------- | ------------------------------- |
-| Skill                   | `V0.6.0`  | Extensible custom AI skills                             | Coding Workflow                 |
-| Github Actions          | `V0.5.0`  | qwen-code-action and automation                         | Integrating Community Ecosystem |
-| VSCode Plugin           | `V0.5.0`  | VSCode extension plugin                                 | Integrating Community Ecosystem |
-| QwenCode SDK            | `V0.4.0`  | Open SDK for third-party integration                    | Building Open Capabilities      |
-| Session                 | `V0.4.0`  | Enhanced session management                             | User Experience                 |
-| i18n                    | `V0.3.0`  | Internationalization and multilingual support           | User Experience                 |
-| Headless Mode           | `V0.3.0`  | Headless mode (non-interactive)                         | Coding Workflow                 |
-| ACP/Zed                 | `V0.2.0`  | ACP and Zed editor integration                          | Integrating Community Ecosystem |
-| Terminal UI             | `V0.1.0+` | Interactive terminal user interface                     | User Experience                 |
-| Settings                | `V0.1.0+` | Configuration management system                         | User Experience                 |
-| Theme                   | `V0.1.0+` | Multi-theme support                                     | User Experience                 |
-| Support OpenAI Protocol | `V0.1.0+` | Support for OpenAI API protocol                         | User Experience                 |
-| Chat Management         | `V0.1.0+` | Session management (save, restore, browse)              | Coding Workflow                 |
-| MCP                     | `V0.1.0+` | Model Context Protocol integration                      | Coding Workflow                 |
-| Multi Model             | `V0.1.0+` | Multi-model support and switching                       | Coding Workflow                 |
-| Slash Commands          | `V0.1.0+` | Slash command system                                    | Coding Workflow                 |
-| Tool: Bash              | `V0.1.0+` | Shell command execution tool (with is_background param) | Coding Workflow                 |
-| Tool: FileRead/EditFile | `V0.1.0+` | File read/write and edit tools                          | Coding Workflow                 |
-| Custom Commands         | `V0.1.0+` | Custom command loading                                  | Building Open Capabilities      |
-| Feedback                | `V0.1.0+` | Feedback mechanism (/bug command)                       | Administrative Capabilities     |
-| Stats                   | `V0.1.0+` | Usage statistics and quota display                      | Administrative Capabilities     |
-| Memory                  | `V0.0.9+` | Project-level and global memory management              | User Experience                 |
-| Cache Control           | `V0.0.9+` | Prompt caching control (Anthropic, DashScope)           | User Experience                 |
-| PlanMode                | `V0.0.14` | Task planning mode                                      | Coding Workflow                 |
-| Compress                | `V0.0.11` | Chat compression mechanism                              | User Experience                 |
-| SubAgent                | `V0.0.11` | Dedicated sub-agent system                              | Coding Workflow                 |
-| TodoWrite               | `V0.0.10` | Task management and progress tracking                   | Coding Workflow                 |
-| Tool: TextSearch        | `V0.0.8+` | Text search tool (grep, supports .qwenignore)           | Coding Workflow                 |
-| Tool: WebFetch          | `V0.0.7+` | Web content fetching tool                               | Coding Workflow                 |
-| Tool: WebSearch         | `V0.0.7+` | Web search tool (using Tavily API)                      | Coding Workflow                 |
-| OAuth                   | `V0.0.5+` | OAuth login authentication (Qwen OAuth)                 | User Experience                 |
+| Feature                 | Version   | Description                                             | Category                        | Phase |
+| ----------------------- | --------- | ------------------------------------------------------- | ------------------------------- | ----- |
+| **Coding Plan**         | `V0.10.0` | Bailian Coding Plan authentication & models             | User Experience                 | 2     |
+| Unified WebUI           | `V0.9.0`  | Shared WebUI component library for VSCode/CLI           | User Experience                 | 2     |
+| Export Chat             | `V0.8.0`  | Export sessions to Markdown/HTML/JSON/JSONL             | User Experience                 | 2     |
+| Extension System        | `V0.8.0`  | Full extension management with slash commands           | Building Open Capabilities      | 2     |
+| LSP Support             | `V0.7.0`  | Experimental LSP service (`--experimental-lsp`)         | Coding Workflow                 | 2     |
+| Anthropic Provider      | `V0.7.0`  | Anthropic API provider support                          | User Experience                 | 2     |
+| User Feedback Dialog    | `V0.7.0`  | In-app feedback collection with fatigue mechanism       | Administrative Capabilities     | 2     |
+| Concurrent Runner       | `V0.6.0`  | Batch CLI execution with Git integration                | Coding Workflow                 | 2     |
+| Multimodal Input        | `V0.6.0`  | Image, PDF, audio, video input support                  | User Experience                 | 2     |
+| Skill                   | `V0.6.0`  | Extensible custom AI skills (experimental)              | Coding Workflow                 | 2     |
+| Github Actions          | `V0.5.0`  | qwen-code-action and automation                         | Integrating Community Ecosystem | 1     |
+| VSCode Plugin           | `V0.5.0`  | VSCode extension plugin                                 | Integrating Community Ecosystem | 1     |
+| QwenCode SDK            | `V0.4.0`  | Open SDK for third-party integration                    | Building Open Capabilities      | 1     |
+| Session                 | `V0.4.0`  | Enhanced session management                             | User Experience                 | 1     |
+| i18n                    | `V0.3.0`  | Internationalization and multilingual support           | User Experience                 | 1     |
+| Headless Mode           | `V0.3.0`  | Headless mode (non-interactive)                         | Coding Workflow                 | 1     |
+| ACP/Zed                 | `V0.2.0`  | ACP and Zed editor integration                          | Integrating Community Ecosystem | 1     |
+| Terminal UI             | `V0.1.0+` | Interactive terminal user interface                     | User Experience                 | 1     |
+| Settings                | `V0.1.0+` | Configuration management system                         | User Experience                 | 1     |
+| Theme                   | `V0.1.0+` | Multi-theme support                                     | User Experience                 | 1     |
+| Support OpenAI Protocol | `V0.1.0+` | Support for OpenAI API protocol                         | User Experience                 | 1     |
+| Chat Management         | `V0.1.0+` | Session management (save, restore, browse)              | Coding Workflow                 | 1     |
+| MCP                     | `V0.1.0+` | Model Context Protocol integration                      | Coding Workflow                 | 1     |
+| Multi Model             | `V0.1.0+` | Multi-model support and switching                       | Coding Workflow                 | 1     |
+| Slash Commands          | `V0.1.0+` | Slash command system                                    | Coding Workflow                 | 1     |
+| Tool: Bash              | `V0.1.0+` | Shell command execution tool (with is_background param) | Coding Workflow                 | 1     |
+| Tool: FileRead/EditFile | `V0.1.0+` | File read/write and edit tools                          | Coding Workflow                 | 1     |
+| Custom Commands         | `V0.1.0+` | Custom command loading                                  | Building Open Capabilities      | 1     |
+| Feedback                | `V0.1.0+` | Feedback mechanism (/bug command)                       | Administrative Capabilities     | 1     |
+| Stats                   | `V0.1.0+` | Usage statistics and quota display                      | Administrative Capabilities     | 1     |
+| Memory                  | `V0.0.9+` | Project-level and global memory management              | User Experience                 | 1     |
+| Cache Control           | `V0.0.9+` | Prompt caching control (Anthropic, DashScope)           | User Experience                 | 1     |
+| PlanMode                | `V0.0.14` | Task planning mode                                      | Coding Workflow                 | 1     |
+| Compress                | `V0.0.11` | Chat compression mechanism                              | User Experience                 | 1     |
+| SubAgent                | `V0.0.11` | Dedicated sub-agent system                              | Coding Workflow                 | 1     |
+| TodoWrite               | `V0.0.10` | Task management and progress tracking                   | Coding Workflow                 | 1     |
+| Tool: TextSearch        | `V0.0.8+` | Text search tool (grep, supports .qwenignore)           | Coding Workflow                 | 1     |
+| Tool: WebFetch          | `V0.0.7+` | Web content fetching tool                               | Coding Workflow                 | 1     |
+| Tool: WebSearch         | `V0.0.7+` | Web search tool (using Tavily API)                      | Coding Workflow                 | 1     |
+| OAuth                   | `V0.0.5+` | OAuth login authentication (Qwen OAuth)                 | User Experience                 | 1     |

 #### Features to Develop

@@ -60,7 +69,6 @@
 | Cross-platform Compatibility | P1 | In Progress | Windows/Linux/macOS compatibility | User Experience             |
 | LogView                      | P2 | Planned     | Log viewing and debugging feature | User Experience             |
 | Hooks                        | P2 | In Progress | Extension hooks system            | Coding Workflow             |
-| Extension                    | P2 | Planned     | Extension system                  | Building Open Capabilities  |
 | Costs                        | P2 | Planned     | Cost tracking and analysis        | Administrative Capabilities |
 | Dashboard                    | P2 | Planned     | Management dashboard              | Administrative Capabilities |