fix: improve /model --fast description clarity and prevent accidental activation (#3077)

Replace vague "background tasks" with specific "prompt suggestions and speculative
execution" in the --fast flag description across all i18n locales, docs, and VS Code
schema. Update example model name from qwen3.5-flash to qwen3-coder-flash. Also fix
completion logic to require a non-empty partial arg before suggesting --fast, preventing
Tab+Enter from accidentally entering fast model mode.
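
The completion fix described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `completeModelArgs` and the constant `FAST_FLAG` are hypothetical, and the real completion handler may have a different signature.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's actual API.
const FAST_FLAG = "--fast";

/**
 * Returns completion candidates for the argument typed after `/model`.
 * Assumes the caller passes only the text of the argument being completed.
 */
function completeModelArgs(partialArg: string): string[] {
  const partial = partialArg.trim();

  // With nothing typed yet, offer no flag, so that Tab followed by Enter
  // cannot silently select `--fast`.
  if (partial.length === 0) {
    return [];
  }

  // Suggest the flag only when the typed text is a prefix of it.
  return FAST_FLAG.startsWith(partial) ? [FAST_FLAG] : [];
}

console.log(completeModelArgs(""));     // no suggestions
console.log(completeModelArgs("--f"));  // suggests the --fast flag
console.log(completeModelArgs("qwen")); // no suggestions
```

Requiring at least one typed character means the flag is offered only after the user has started to type it.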

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Shaojin Wen 2026-04-10 12:09:46 +08:00 committed by GitHub
parent 746f67f436
commit 5482044e59
12 changed files with 41 additions and 37 deletions

@@ -204,9 +204,9 @@ The `extra_body` field allows you to add custom parameters to the request body s
 #### fastModel
-| Setting | Type | Description | Default |
-| ----------- | ------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
-| `fastModel` | string | Model for background tasks ([suggestion generation](../features/followup-suggestions), speculation). Leave empty to use the main model. A smaller/faster model (e.g., `qwen3.5-flash`) reduces latency and cost. Can also be set via `/model --fast`. | `""` |
+| Setting | Type | Description | Default |
+| ----------- | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| `fastModel` | string | Model used for generating [prompt suggestions](../features/followup-suggestions) and speculative execution. Leave empty to use the main model. A smaller/faster model (e.g., `qwen3-coder-flash`) reduces latency and cost. Can also be set via `/model --fast`. | `""` |
 #### context