Mirror of https://github.com/QwenLM/qwen-code.git (synced 2026-04-28 11:41:04 +00:00)
feat: add extra_body support for OpenAI-compatible providers

Add extra_body configuration option to model.generationConfig for passing custom parameters to OpenAI-compatible API request bodies.

- Add extra_body to ContentGeneratorConfig type
- Add extra_body to MODEL_GENERATION_CONFIG_FIELDS and ModelGenerationConfig
- Implement extra_body merging in DefaultOpenAICompatibleProvider
- Implement extra_body merging in DashScopeOpenAICompatibleProvider
- Update documentation with examples and provider compatibility notes
- Note: This feature is only for OpenAI-compatible providers (openai, qwen-oauth)

Resolves #1647
Resolves #1644

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
parent 561be0eb42
commit 532d97670b
8 changed files with 140 additions and 13 deletions
@@ -96,18 +96,18 @@ Settings are organized into categories. All settings should be placed within the
#### model

| Setting | Type | Description | Default |
| -------------------------------------------------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| `model.name` | string | The Qwen model to use for conversations. | `undefined` |
| `model.maxSessionTurns` | number | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited. | `-1` |
| `model.summarizeToolOutput` | object | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}` | `undefined` |
| `model.generationConfig` | object | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `disableCacheControl`, and `customHeaders` (custom HTTP headers for API requests), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
| `model.chatCompression.contextPercentageThreshold` | number | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely. | `0.7` |
| `model.skipNextSpeakerCheck` | boolean | Skip the next speaker check. | `false` |
| `model.skipLoopDetection` | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions. | `false` |
| `model.skipStartupContext` | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup. | `false` |
| `model.enableOpenAILogging` | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files. | `false` |
| `model.openAILoggingDir` | string | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory). | `undefined` |

| Setting | Type | Description | Default |
| -------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| `model.name` | string | The Qwen model to use for conversations. | `undefined` |
| `model.maxSessionTurns` | number | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited. | `-1` |
| `model.summarizeToolOutput` | object | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}` | `undefined` |
| `model.generationConfig` | object | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `disableCacheControl`, `customHeaders` (custom HTTP headers for API requests), and `extra_body` (additional body parameters for OpenAI-compatible API requests only), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
| `model.chatCompression.contextPercentageThreshold` | number | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely. | `0.7` |
| `model.skipNextSpeakerCheck` | boolean | Skip the next speaker check. | `false` |
| `model.skipLoopDetection` | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions. | `false` |
| `model.skipStartupContext` | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup. | `false` |
| `model.enableOpenAILogging` | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files. | `false` |
| `model.openAILoggingDir` | string | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory). | `undefined` |

**Example model.generationConfig:**
@@ -121,6 +121,9 @@ Settings are organized into categories. All settings should be placed within the
    "X-Request-ID": "req-123",
    "X-User-ID": "user-456"
  },
  "extra_body": {
    "enable_thinking": true
  },
  "samplingParams": {
    "temperature": 0.2,
    "top_p": 0.8,
@@ -133,6 +136,8 @@ Settings are organized into categories. All settings should be placed within the

The `customHeaders` field allows you to add custom HTTP headers to all API requests. This is useful for request tracing, monitoring, API gateway routing, or when different models require different headers. If `customHeaders` is defined in `modelProviders[].generationConfig.customHeaders`, it will be used directly; otherwise, headers from `model.generationConfig.customHeaders` will be used. No merging occurs between the two levels.

The `extra_body` field allows you to add custom parameters to the request body sent to the API. This is useful for provider-specific options that are not covered by the standard configuration fields. **Note: This field is only supported for OpenAI-compatible providers (`openai`, `qwen-oauth`). It is ignored for Anthropic and Gemini providers.** If `extra_body` is defined in `modelProviders[].generationConfig.extra_body`, it will be used directly; otherwise, values from `model.generationConfig.extra_body` will be used.
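
As a rough illustration of the behavior described above, the following TypeScript sketch spreads `extra_body` into a request body for OpenAI-compatible providers and skips it for the others. The names `AuthType`, `ExtraBodyConfig`, `buildRequestBody`, and the placeholder model name are hypothetical, and the choice to apply `extra_body` keys last (so they could override standard fields) is an assumption; this is not the actual provider implementation from the commit.

```typescript
// Hypothetical types for illustration; not the actual qwen-code definitions.
type AuthType = 'openai' | 'qwen-oauth' | 'anthropic' | 'gemini';

interface ExtraBodyConfig {
  extra_body?: Record<string, unknown>;
}

// Providers the documentation lists as OpenAI-compatible.
const OPENAI_COMPATIBLE: ReadonlySet<AuthType> = new Set<AuthType>([
  'openai',
  'qwen-oauth',
]);

function buildRequestBody(
  authType: AuthType,
  baseBody: Record<string, unknown>,
  config: ExtraBodyConfig,
): Record<string, unknown> {
  // extra_body is ignored for providers that are not OpenAI-compatible.
  if (!OPENAI_COMPATIBLE.has(authType) || !config.extra_body) {
    return { ...baseBody };
  }
  // Assumption: extra_body keys are applied last and may override base fields.
  return { ...baseBody, ...config.extra_body };
}

const config: ExtraBodyConfig = { extra_body: { enable_thinking: true } };
const baseBody = { model: 'example-model', messages: [] };

const openaiBody = buildRequestBody('openai', baseBody, config);
const geminiBody = buildRequestBody('gemini', baseBody, config);

console.log(JSON.stringify(openaiBody));
console.log(JSON.stringify(geminiBody));
```

In this sketch, only the `openai` request gains the `enable_thinking` key; the `gemini` request is left unchanged.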
**model.openAILoggingDir examples:**

- `"~/qwen-logs"` - Logs to `~/qwen-logs` directory
@@ -161,6 +166,9 @@ Use `modelProviders` to declare curated model lists per auth type that the `/mod
      "X-Model-Version": "v1.0",
      "X-Request-Priority": "high"
    },
    "extra_body": {
      "enable_thinking": true
    },
    "samplingParams": { "temperature": 0.2 }
  }
}
@@ -222,7 +230,7 @@ Per-field precedence for `generationConfig`:
3. `settings.model.generationConfig`
4. Content-generator defaults (`getDefaultGenerationConfig` for OpenAI, `getParameterValue` for Gemini, etc.)

`samplingParams` and `customHeaders` are both treated atomically; provider values replace the entire object. If `modelProviders[].generationConfig` defines these fields, they are used directly; otherwise, values from `model.generationConfig` are used. No merging occurs between provider and global configuration levels. Defaults from the content generator apply last so each provider retains its tuned baseline.

`samplingParams`, `customHeaders`, and `extra_body` are all treated atomically; provider values replace the entire object. If `modelProviders[].generationConfig` defines these fields, they are used directly; otherwise, values from `model.generationConfig` are used. No merging occurs between provider and global configuration levels. Defaults from the content generator apply last so each provider retains its tuned baseline.
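
The atomic replacement rule can be sketched in TypeScript as follows. This is a minimal illustration covering only the provider and global levels; `LayeredGenerationConfig`, `resolveAtomicFields`, and the key `some_other_flag` are hypothetical names, and the real resolution code in the repository may be structured differently.

```typescript
// Hypothetical shape for illustration; the real config has more fields.
interface LayeredGenerationConfig {
  samplingParams?: Record<string, unknown>;
  customHeaders?: Record<string, string>;
  extra_body?: Record<string, unknown>;
}

// Resolve each atomic field independently: take the whole object from the
// provider level if it is defined, otherwise fall back to the global level.
// Keys inside the objects are never merged across levels.
function resolveAtomicFields(
  providerConfig: LayeredGenerationConfig | undefined,
  globalConfig: LayeredGenerationConfig | undefined,
): LayeredGenerationConfig {
  return {
    samplingParams:
      providerConfig?.samplingParams ?? globalConfig?.samplingParams,
    customHeaders: providerConfig?.customHeaders ?? globalConfig?.customHeaders,
    extra_body: providerConfig?.extra_body ?? globalConfig?.extra_body,
  };
}

const resolved = resolveAtomicFields(
  // Provider level (modelProviders[].generationConfig)
  { extra_body: { enable_thinking: true } },
  // Global level (model.generationConfig)
  {
    extra_body: { some_other_flag: 1 },
    customHeaders: { 'X-Request-ID': 'req-123' },
  },
);

console.log(JSON.stringify(resolved));
```

Here `extra_body` comes entirely from the provider level, so `some_other_flag` from the global level is not carried over, while `customHeaders` falls back to the global value because the provider level does not define it.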
##### Selection persistence and recommendations