fix: improve /model --fast description clarity and prevent accidental activation (#3077)

Replace vague "background tasks" with specific "prompt suggestions and speculative execution" in the --fast flag description across all i18n locales, docs, and VS Code schema. Update example model name from qwen3.5-flash to qwen3-coder-flash.

Also fix completion logic to require a non-empty partial arg before suggesting --fast, preventing Tab+Enter from accidentally entering fast model mode.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
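
The completion fix described in this message can be sketched as follows. This is a minimal illustration, not the code from this commit: the function name `completeModelArgs` and the constant `FAST_FLAG` are hypothetical, and the real completion hook may take different arguments.

```typescript
// Hypothetical sketch of the completion guard; names are illustrative only.
const FAST_FLAG = "--fast";

/**
 * Returns completion candidates for the argument typed after `/model`.
 * Assumption: `partialArg` is the text the user has typed so far.
 */
function completeModelArgs(partialArg: string): string[] {
  const partial = partialArg.trim();

  // Nothing typed yet: offer no flag, so Tab followed by Enter
  // cannot select --fast by accident.
  if (partial.length === 0) {
    return [];
  }

  // Suggest the flag only when the typed text is a prefix of it.
  return FAST_FLAG.startsWith(partial) ? [FAST_FLAG] : [];
}
```

Under this sketch, an empty argument yields no suggestions, while a typed prefix such as `--f` still completes to `--fast`.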
2026-04-30 04:30:48 +00:00 · 2026-04-10 12:09:46 +08:00
commit 5482044e59
parent 746f67f436
12 changed files with 41 additions and 37 deletions
--- a/docs/users/features/followup-suggestions.md
+++ b/docs/users/features/followup-suggestions.md
@@ -49,7 +49,7 @@ By default, suggestions use the same model as your main conversation. For faster
 ### Via command

 ```
-/model --fast qwen3.5-flash
+/model --fast qwen3-coder-flash
 ```

 Or use `/model --fast` (without a model name) to open a selection dialog.
@@ -58,11 +58,11 @@ Or use `/model --fast` (without a model name) to open a selection dialog.

 ```json
 {
-  "fastModel": "qwen3.5-flash"
+  "fastModel": "qwen3-coder-flash"
 }
 ```

-The fast model is used for background tasks like suggestion generation. When not configured, the main conversation model is used as fallback.
+The fast model is used for prompt suggestions and speculative execution. When not configured, the main conversation model is used as fallback.

 Thinking/reasoning mode is automatically disabled for all background tasks (suggestion generation and speculation), regardless of your main model's thinking configuration. This avoids wasting tokens on internal reasoning that isn't needed for these tasks.

@@ -75,13 +75,13 @@ These settings can be configured in `settings.json`:
 | `ui.enableFollowupSuggestions` | boolean | `true`  | Enable or disable followup suggestions                             |
 | `ui.enableCacheSharing`        | boolean | `true`  | Use cache-aware forked queries to reduce cost (experimental)       |
 | `ui.enableSpeculation`         | boolean | `false` | Speculatively execute suggestions before submission (experimental) |
-| `fastModel`                    | string  | `""`    | Model for background tasks (suggestion generation, speculation)    |
+| `fastModel`                    | string  | `""`    | Model for prompt suggestions and speculative execution             |

 ### Example

 ```json
 {
-  "fastModel": "qwen3.5-flash",
+  "fastModel": "qwen3-coder-flash",
  "ui": {
    "enableFollowupSuggestions": true,
    "enableCacheSharing": true