mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-04-28 19:52:02 +00:00
refactor: remove maxOutputTokens user configuration
- Remove maxOutputTokens field from ContentGeneratorConfig - Remove maxOutputTokens auto-calculation logic in contentGenerator.ts - Remove maxOutputTokens sync logic in config.ts hot-update - Update dashscope.ts to use tokenLimit() function for dynamic calculation - Remove maxOutputTokens from documentation (settings.md) - Update all test cases to use correct model output limits - All 143 tests passing (config.test.ts: 105, dashscope.test.ts: 38) Changes: - Users can no longer configure maxOutputTokens via settings - Output token limits are now always dynamically calculated based on model - contextWindowSize (input) remains user-configurable - Prevents long text output truncation caused by default value settings
This commit is contained in:
parent
111356b5d5
commit
e77a67cd5e
6 changed files with 36 additions and 96 deletions
|
|
@ -96,18 +96,18 @@ Settings are organized into categories. All settings should be placed within the
|
|||
|
||||
#### model
|
||||
|
||||
| Setting | Type | Description | Default |
|
||||
| -------------------------------------------------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
|
||||
| `model.name` | string | The Qwen model to use for conversations. | `undefined` |
|
||||
| `model.maxSessionTurns` | number | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited. | `-1` |
|
||||
| `model.summarizeToolOutput` | object | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}` | `undefined` |
|
||||
| `model.generationConfig` | object | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `disableCacheControl`, `contextWindowSize` (override model's context window size), `maxOutputTokens` (override model's maximum output tokens), and `customHeaders` (custom HTTP headers for API requests), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
|
||||
| `model.chatCompression.contextPercentageThreshold` | number | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely. | `0.7` |
|
||||
| `model.skipNextSpeakerCheck` | boolean | Skip the next speaker check. | `false` |
|
||||
| `model.skipLoopDetection` | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions. | `false` |
|
||||
| `model.skipStartupContext` | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup. | `false` |
|
||||
| `model.enableOpenAILogging` | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files. | `false` |
|
||||
| `model.openAILoggingDir` | string | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory). | `undefined` |
|
||||
| Setting | Type | Description | Default |
|
||||
| -------------------------------------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
|
||||
| `model.name` | string | The Qwen model to use for conversations. | `undefined` |
|
||||
| `model.maxSessionTurns` | number | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited. | `-1` |
|
||||
| `model.summarizeToolOutput` | object | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}` | `undefined` |
|
||||
| `model.generationConfig` | object | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `disableCacheControl`, `contextWindowSize` (override model's context window size), and `customHeaders` (custom HTTP headers for API requests), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
|
||||
| `model.chatCompression.contextPercentageThreshold` | number | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely. | `0.7` |
|
||||
| `model.skipNextSpeakerCheck` | boolean | Skip the next speaker check. | `false` |
|
||||
| `model.skipLoopDetection` | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions. | `false` |
|
||||
| `model.skipStartupContext` | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup. | `false` |
|
||||
| `model.enableOpenAILogging` | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files. | `false` |
|
||||
| `model.openAILoggingDir` | string | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory). | `undefined` |
|
||||
|
||||
**Example model.generationConfig:**
|
||||
|
||||
|
|
@ -118,7 +118,6 @@ Settings are organized into categories. All settings should be placed within the
|
|||
"timeout": 60000,
|
||||
"disableCacheControl": false,
|
||||
"contextWindowSize": 128000,
|
||||
"maxOutputTokens": 8192,
|
||||
"customHeaders": {
|
||||
"X-Request-ID": "req-123",
|
||||
"X-User-ID": "user-456"
|
||||
|
|
@ -137,10 +136,6 @@ Settings are organized into categories. All settings should be placed within the
|
|||
|
||||
Overrides the default context window size for the selected model. Qwen Code determines the context window using built-in defaults based on model name matching, with a constant fallback value. Use this setting when a provider's effective context limit differs from Qwen Code's default. This value defines the model's assumed maximum context capacity, not a per-request token limit.
|
||||
|
||||
**maxOutputTokens:**
|
||||
|
||||
Overrides the default maximum output tokens for the selected model. Qwen Code determines the maximum output tokens using built-in defaults based on model name matching, with a constant fallback value of 8,192 tokens. Use this setting when a provider's effective output limit differs from Qwen Code's default. This value defines the maximum number of tokens the model can generate in a single response.
|
||||
|
||||
**customHeaders:**
|
||||
|
||||
Allows you to add custom HTTP headers to all API requests. This is useful for request tracing, monitoring, API gateway routing, or when different models require different headers. If `customHeaders` is defined in `modelProviders[].generationConfig.customHeaders`, it will be used directly; otherwise, headers from `model.generationConfig.customHeaders` will be used. No merging occurs between the two levels.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue