diff --git a/docs/developers/roadmap.md b/docs/developers/roadmap.md
index 83cd42355..b1f30199c 100644
--- a/docs/developers/roadmap.md
+++ b/docs/developers/roadmap.md
@@ -18,7 +18,7 @@
 
 | Feature                 | Version   | Description                                             | Category                        | Phase |
 | ----------------------- | --------- | ------------------------------------------------------- | ------------------------------- | ----- |
-| **Coding Plan**         | `V0.10.0` | Bailian Coding Plan authentication & models             | User Experience                 | 2     |
+| **Coding Plan**         | `V0.10.0` | Alibaba Cloud Coding Plan authentication & models       | User Experience                 | 2     |
 | Unified WebUI           | `V0.9.0`  | Shared WebUI component library for VSCode/CLI           | User Experience                 | 2     |
 | Export Chat             | `V0.8.0`  | Export sessions to Markdown/HTML/JSON/JSONL             | User Experience                 | 2     |
 | Extension System        | `V0.8.0`  | Full extension management with slash commands           | Building Open Capabilities      | 2     |
diff --git a/docs/users/configuration/auth.md b/docs/users/configuration/auth.md
index bce47dd3f..3e15aa462 100644
--- a/docs/users/configuration/auth.md
+++ b/docs/users/configuration/auth.md
@@ -1,13 +1,12 @@
 # Authentication
 
-Qwen Code supports two authentication methods. Pick the one that matches how you want to run the CLI:
+Qwen Code supports three authentication methods. Pick the one that matches how you want to run the CLI:
 
-- **Qwen OAuth (recommended)**: sign in with your `qwen.ai` account in a browser.
-- **API-KEY**: use an API key to connect to any supported provider. More flexible — supports OpenAI, Anthropic, Google GenAI, Alibaba Cloud Bailian, and other compatible endpoints.
+- **Qwen OAuth**: sign in with your `qwen.ai` account in a browser. Free with a daily quota.
+- **Alibaba Cloud Coding Plan**: use an API key from Alibaba Cloud. Paid subscription with diverse model options and higher quotas.
+- **API Key**: bring your own API key. Flexible to your own needs — supports OpenAI, Anthropic, Gemini, and other compatible endpoints.
 
-![](https://gw.alicdn.com/imgextra/i4/O1CN01yXSXc91uYxJxhJXBF_!!6000000006050-2-tps-2372-916.png)
-
-## 👍 Option 1: Qwen OAuth (recommended & free)
+## Option 1: Qwen OAuth (Free)
 
 Use this if you want the simplest setup and you're using Qwen models.
 
@@ -25,15 +24,72 @@ qwen
 > [!note]
 >
 > In non-interactive or headless environments (e.g., CI, SSH, containers), you typically **cannot** complete the OAuth browser login flow.  
-> In these cases, please use the API-KEY authentication method.
+> In these cases, please use the Alibaba Cloud Coding Plan or API Key authentication method.
 
-## 🚀 Option 2: API-KEY (flexible)
+## 💳 Option 2: Alibaba Cloud Coding Plan
 
-Use this if you want more flexibility over which provider and model to use. Supports multiple protocols and providers, including OpenAI, Anthropic, Google GenAI, Alibaba Cloud Bailian, Azure OpenAI, OpenRouter, ModelScope, or a self-hosted compatible endpoint.
+Use this if you want predictable costs with diverse model options and higher usage quotas.
+
+- **How it works**: Subscribe to the Coding Plan with a fixed monthly fee, then configure Qwen Code to use the dedicated endpoint and your subscription API key.
+- **Requirements**: Obtain an active Coding Plan subscription from [Aliyun Bailian](https://bailian.console.aliyun.com/?tab=model#/efm/coding_plan) or [Alibaba Cloud](https://bailian.console.alibabacloud.com/?tab=model#/efm/coding_plan), depending on the region of your account.
+- **Benefits**: Diverse model options, higher usage quotas, predictable monthly costs, access to a wide range of models (Qwen, GLM, Kimi, Minimax and more).
+- **Cost & quota**: View [Aliyun Bailian Coding Plan documentation](https://bailian.console.aliyun.com/cn-beijing/?tab=doc#/doc/?type=model&url=3005961).
+
+Alibaba Cloud Coding Plan is available in two regions:
+
+| Region                           | Console URL                                                                  |
+| -------------------------------- | ---------------------------------------------------------------------------- |
+| Aliyun Bailian (aliyun.com)      | [bailian.console.aliyun.com](https://bailian.console.aliyun.com)             |
+| Alibaba Cloud (alibabacloud.com) | [bailian.console.alibabacloud.com](https://bailian.console.alibabacloud.com) |
+
+### Interactive setup
+
+Enter `qwen` in the terminal to launch Qwen Code, then run the `/auth` command and select **Alibaba Cloud Coding Plan**. Choose your region, then enter your `sk-sp-xxxxxxxxx` key.
+
+After authentication, use the `/model` command to switch between all Alibaba Cloud Coding Plan supported models (including qwen3.5-plus, qwen3-coder-plus, qwen3-coder-next, qwen3-max, glm-4.7, and kimi-k2.5).
+
+### Alternative: configure via `settings.json`
+
+If you prefer to skip the interactive `/auth` flow, add the following to `~/.qwen/settings.json`:
+
+```json
+{
+  "modelProviders": {
+    "openai": [
+      {
+        "id": "qwen3-coder-plus",
+        "name": "qwen3-coder-plus (Coding Plan)",
+        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
+        "description": "qwen3-coder-plus from Alibaba Cloud Coding Plan",
+        "envKey": "BAILIAN_CODING_PLAN_API_KEY"
+      }
+    ]
+  },
+  "env": {
+    "BAILIAN_CODING_PLAN_API_KEY": "sk-sp-xxxxxxxxx"
+  },
+  "security": {
+    "auth": {
+      "selectedType": "openai"
+    }
+  },
+  "model": {
+    "name": "qwen3-coder-plus"
+  }
+}
+```
+
+> [!note]
+>
+> The Coding Plan uses a dedicated endpoint (`https://coding.dashscope.aliyuncs.com/v1`) that is different from the standard Dashscope endpoint. Make sure to use the correct `baseUrl`.
+
+## 🚀 Option 3: API Key (flexible)
+
+Use this if you want to connect to third-party providers such as OpenAI, Anthropic, Google, Azure OpenAI, OpenRouter, ModelScope, or a self-hosted endpoint. Supports multiple protocols and providers.
 
 ### Recommended: One-file setup via `settings.json`
 
-The simplest way to get started with API-KEY authentication is to put everything in a single `~/.qwen/settings.json` file. Here's a complete, ready-to-use example:
+The simplest way to get started with API Key authentication is to put everything in a single `~/.qwen/settings.json` file. Here's a complete, ready-to-use example:
 
 ```json
 {
@@ -66,7 +122,7 @@ What each field does:
 
 | Field                        | Description                                                                                                                                     |
 | ---------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
-| `modelProviders`             | Declares which models are available and how to connect to them. Keys (`openai`, `anthropic`, `gemini`, `vertex-ai`) represent the API protocol. |
+| `modelProviders`             | Declares which models are available and how to connect to them. Keys (`openai`, `anthropic`, `gemini`) represent the API protocol.              |
 | `env`                        | Stores API keys directly in `settings.json` as a fallback (lowest priority — shell `export` and `.env` files take precedence).                  |
 | `security.auth.selectedType` | Tells Qwen Code which protocol to use on startup (e.g. `openai`, `anthropic`, `gemini`). Without this, you'd need to run `/auth` interactively. |
 | `model.name`                 | The default model to activate when Qwen Code starts. Must match one of the `id` values in your `modelProviders`.                                |
@@ -77,76 +133,15 @@ After saving the file, just run `qwen` — no interactive `/auth` setup needed.
 >
 > The sections below explain each part in more detail. If the quick example above works for you, feel free to skip ahead to [Security notes](#security-notes).
 
-### Option1: Coding Plan（Aliyun Bailian）
-
-Use this if you want predictable costs with higher usage quotas for the qwen3-coder-plus model.
-
-- **How it works**: Subscribe to the Coding Plan with a fixed monthly fee, then configure Qwen Code to use the dedicated endpoint and your subscription API key.
-- **Requirements**: Obtain an active Coding Plan subscription from [Alibaba Cloud Bailian](https://bailian.console.aliyun.com/cn-beijing/?tab=globalset#/efm/coding_plan).
-- **Benefits**: Higher usage quotas, predictable monthly costs, access to the latest qwen3-coder-plus model.
-- **Cost & quota**: View [Alibaba Cloud Bailian Coding Plan documentation](https://bailian.console.aliyun.com/cn-beijing/?tab=doc#/doc/?type=model&url=3005961).
-
-Enter `qwen` in the terminal to launch Qwen Code, then enter the `/auth` command and select `API-KEY`
-
-![](https://gw.alicdn.com/imgextra/i4/O1CN01yXSXc91uYxJxhJXBF_!!6000000006050-2-tps-2372-916.png)
-
-After entering, select `Coding Plan`:
-
-![](https://gw.alicdn.com/imgextra/i4/O1CN01Irk0AD1ebfop69o0r_!!6000000003890-2-tps-2308-830.png)
-
-Enter your `sk-sp-xxxxxxxxx` key, then use the `/model` command to switch between all Bailian `Coding Plan` supported models (including qwen3.5-plus, qwen3-coder-plus, qwen3-coder-next, qwen3-max, glm-4.7, and kimi-k2.5):
-
-![](https://gw.alicdn.com/imgextra/i4/O1CN01fWArmf1kaCEgSmPln_!!6000000004699-2-tps-2304-1374.png)
-
-**Alternative: configure Coding Plan via `settings.json`**
-
-If you prefer to skip the interactive `/auth` flow, add the following to `~/.qwen/settings.json`:
-
-```json
-{
-  "modelProviders": {
-    "openai": [
-      {
-        "id": "qwen3-coder-plus",
-        "name": "qwen3-coder-plus (Coding Plan)",
-        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
-        "description": "qwen3-coder-plus from Bailian Coding Plan",
-        "envKey": "BAILIAN_CODING_PLAN_API_KEY"
-      }
-    ]
-  },
-  "env": {
-    "BAILIAN_CODING_PLAN_API_KEY": "sk-sp-xxxxxxxxx"
-  },
-  "security": {
-    "auth": {
-      "selectedType": "openai"
-    }
-  },
-  "model": {
-    "name": "qwen3-coder-plus"
-  }
-}
-```
-
-> [!note]
->
-> The Coding Plan uses a dedicated endpoint (`https://coding.dashscope.aliyuncs.com/v1`) that is different from the standard Dashscope endpoint. Make sure to use the correct `baseUrl`.
-
-### Option2: Third-party API-KEY
-
-Use this if you want to connect to third-party providers such as OpenAI, Anthropic, Google, Azure OpenAI, OpenRouter, ModelScope, or a self-hosted endpoint.
-
 The key concept is **Model Providers** (`modelProviders`): Qwen Code supports multiple API protocols, not just OpenAI. You configure which providers and models are available by editing `~/.qwen/settings.json`, then switch between them at runtime with the `/model` command.
 
 #### Supported protocols
 
-| Protocol          | `modelProviders` key | Environment variables                                        | Providers                                                                                           |
-| ----------------- | -------------------- | ------------------------------------------------------------ | --------------------------------------------------------------------------------------------------- |
-| OpenAI-compatible | `openai`             | `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `OPENAI_MODEL`          | OpenAI, Azure OpenAI, OpenRouter, ModelScope, Alibaba Cloud Bailian, any OpenAI-compatible endpoint |
-| Anthropic         | `anthropic`          | `ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL`, `ANTHROPIC_MODEL` | Anthropic Claude                                                                                    |
-| Google GenAI      | `gemini`             | `GEMINI_API_KEY`, `GEMINI_MODEL`                             | Google Gemini                                                                                       |
-| Google Vertex AI  | `vertex-ai`          | `GOOGLE_API_KEY`, `GOOGLE_MODEL`                             | Google Vertex AI                                                                                    |
+| Protocol          | `modelProviders` key | Environment variables                                        | Providers                                                                                   |
+| ----------------- | -------------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------------------- |
+| OpenAI-compatible | `openai`             | `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `OPENAI_MODEL`          | OpenAI, Azure OpenAI, OpenRouter, ModelScope, Alibaba Cloud, any OpenAI-compatible endpoint |
+| Anthropic         | `anthropic`          | `ANTHROPIC_API_KEY`, `ANTHROPIC_BASE_URL`, `ANTHROPIC_MODEL` | Anthropic Claude                                                                            |
+| Google GenAI      | `gemini`             | `GEMINI_API_KEY`, `GEMINI_MODEL`                             | Google Gemini                                                                               |
 
 #### Step 1: Configure models and providers in `~/.qwen/settings.json`
 
@@ -297,6 +292,6 @@ qwen --model "qwen3.5-plus"
 
 ## Security notes
 
-- Don’t commit API keys to version control.
+- Don't commit API keys to version control.
 - Prefer `.qwen/.env` for project-local secrets (and keep it out of git).
 - Treat your terminal output as sensitive if it prints credentials for verification.
diff --git a/docs/users/configuration/model-providers.md b/docs/users/configuration/model-providers.md
index 836237457..0b429398f 100644
--- a/docs/users/configuration/model-providers.md
+++ b/docs/users/configuration/model-providers.md
@@ -4,11 +4,11 @@ Qwen Code allows you to configure multiple model providers through the `modelPro
 
 ## Overview
 
-Use `modelProviders` to declare curated model lists per auth type that the `/model` picker can switch between. Keys must be valid auth types (`openai`, `anthropic`, `gemini`, `vertex-ai`, etc.). Each entry requires an `id` and **must include `envKey`**, with optional `name`, `description`, `baseUrl`, and `generationConfig`. Credentials are never persisted in settings; the runtime reads them from `process.env[envKey]`. Qwen OAuth models remain hard-coded and cannot be overridden.
+Use `modelProviders` to declare curated model lists per auth type that the `/model` picker can switch between. Keys must be valid auth types (`openai`, `anthropic`, `gemini`, etc.). Each entry requires an `id` and **must include `envKey`**, with optional `name`, `description`, `baseUrl`, and `generationConfig`. Credentials are never persisted in settings; the runtime reads them from `process.env[envKey]`. Qwen OAuth models remain hard-coded and cannot be overridden.
 
 > [!note]
 >
-> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, Vertex AI, etc., must be defined via `modelProviders`. The `/auth` command intentionally lists only the built-in Qwen OAuth and OpenAI flows.
+> Only the `/model` command exposes non-default auth types. Anthropic, Gemini, etc., must be defined via `modelProviders`. The `/auth` command lists Qwen OAuth, Alibaba Cloud Coding Plan, and API Key as the built-in authentication options.
 
 > [!warning]
 >
@@ -27,7 +27,6 @@ The `modelProviders` object keys must be valid `authType` values. Currently supp
 | `openai`     | OpenAI-compatible APIs (OpenAI, Azure OpenAI, local inference servers like vLLM/Ollama) |
 | `anthropic`  | Anthropic Claude API                                                                    |
 | `gemini`     | Google Gemini API                                                                       |
-| `vertex-ai`  | Google Vertex AI                                                                        |
 | `qwen-oauth` | Qwen OAuth (hard-coded, cannot be overridden in `modelProviders`)                       |
 
 > [!warning]
@@ -37,12 +36,12 @@ The `modelProviders` object keys must be valid `authType` values. Currently supp
 
 Qwen Code uses the following official SDKs to send requests to each provider:
 
-| Auth Type              | SDK Package                                                                                     |
-| ---------------------- | ----------------------------------------------------------------------------------------------- |
-| `openai`               | [`openai`](https://www.npmjs.com/package/openai) - Official OpenAI Node.js SDK                  |
-| `anthropic`            | [`@anthropic-ai/sdk`](https://www.npmjs.com/package/@anthropic-ai/sdk) - Official Anthropic SDK |
-| `gemini` / `vertex-ai` | [`@google/genai`](https://www.npmjs.com/package/@google/genai) - Official Google GenAI SDK      |
-| `qwen-oauth`           | [`openai`](https://www.npmjs.com/package/openai) with custom provider (DashScope-compatible)    |
+| Auth Type    | SDK Package                                                                                     |
+| ------------ | ----------------------------------------------------------------------------------------------- |
+| `openai`     | [`openai`](https://www.npmjs.com/package/openai) - Official OpenAI Node.js SDK                  |
+| `anthropic`  | [`@anthropic-ai/sdk`](https://www.npmjs.com/package/@anthropic-ai/sdk) - Official Anthropic SDK |
+| `gemini`     | [`@google/genai`](https://www.npmjs.com/package/@google/genai) - Official Google GenAI SDK      |
+| `qwen-oauth` | [`openai`](https://www.npmjs.com/package/openai) with custom provider (DashScope-compatible)    |
 
 This means the `baseUrl` you configure should be compatible with the corresponding SDK's expected API format. For example, when using `openai` auth type, the endpoint must accept OpenAI API format requests.
 
@@ -64,6 +63,9 @@ This auth type supports not only OpenAI's official API but also any OpenAI-compa
           "maxRetries": 3,
           "enableCacheControl": true,
           "contextWindowSize": 128000,
+          "modalities": {
+            "image": true
+          },
           "customHeaders": {
             "X-Client-Request-ID": "req-123"
           },
@@ -183,31 +185,6 @@ This auth type supports not only OpenAI's official API but also any OpenAI-compa
 }
 ```
 
-### Google Vertex AI (`vertex-ai`)
-
-```json
-{
-  "modelProviders": {
-    "vertex-ai": [
-      {
-        "id": "gemini-1.5-pro-vertex",
-        "name": "Gemini 1.5 Pro (Vertex AI)",
-        "envKey": "GOOGLE_API_KEY",
-        "baseUrl": "https://generativelanguage.googleapis.com",
-        "generationConfig": {
-          "timeout": 90000,
-          "contextWindowSize": 2000000,
-          "samplingParams": {
-            "temperature": 0.2,
-            "max_tokens": 8192
-          }
-        }
-      }
-    ]
-  }
-}
-```
-
 ### Local Self-Hosted Models (via OpenAI-compatible API)
 
 Most local inference servers (vLLM, Ollama, LM Studio, etc.) provide an OpenAI-compatible API endpoint. Configure them using the `openai` auth type with a local `baseUrl`:
@@ -276,15 +253,20 @@ export VLLM_API_KEY="not-needed"
 
 > [!note]
 >
+> <<<<<<< HEAD
 > The `extra_body` parameter is **only supported for OpenAI-compatible providers** (`openai`, `qwen-oauth`). It is ignored for Anthropic, Gemini, and Vertex AI providers.
+> =======
+> The `extra_body` parameter is **only supported for OpenAI-compatible providers** (`openai`, `qwen-oauth`). It is ignored for Anthropic, and Gemini providers.
+>
+> > > > > > > main
 
-## Bailian Coding Plan
+## Alibaba Cloud Coding Plan
 
-Bailian Coding Plan provides a pre-configured set of Qwen models optimized for coding tasks. This feature is available for users with Bailian API access and offers a simplified setup experience with automatic model configuration updates.
+Alibaba Cloud Coding Plan provides a pre-configured set of Qwen models optimized for coding tasks. This feature is available for users with Alibaba Cloud Coding Plan API access and offers a simplified setup experience with automatic model configuration updates.
 
 ### Overview
 
-When you authenticate with a Bailian Coding Plan API key using the `/auth` command, Qwen Code automatically configures the following models:
+When you authenticate with an Alibaba Cloud Coding Plan API key using the `/auth` command, Qwen Code automatically configures the following models:
 
 | Model ID               | Name                 | Description                            |
 | ---------------------- | -------------------- | -------------------------------------- |
@@ -294,19 +276,19 @@ When you authenticate with a Bailian Coding Plan API key using the `/auth` comma
 
 ### Setup
 
-1. Obtain a Bailian Coding Plan API key:
+1. Obtain an Alibaba Cloud Coding Plan API key:
    - **China**: <https://bailian.console.aliyun.com/?tab=model#/efm/coding_plan>
    - **International**: <https://modelstudio.console.alibabacloud.com/?tab=dashboard#/efm/coding_plan>
 2. Run the `/auth` command in Qwen Code
-3. Select the API-KEY authentication method
-4. Select your region (China or Global/International)
+3. Select **Alibaba Cloud Coding Plan**
+4. Select your region
 5. Enter your API key when prompted
 
 The models will be automatically configured and added to your `/model` picker.
 
 ### Regions
 
-Bailian Coding Plan supports two regions:
+Alibaba Cloud Coding Plan supports two regions:
 
 | Region               | Endpoint                                        | Description             |
 | -------------------- | ----------------------------------------------- | ----------------------- |
@@ -351,7 +333,7 @@ If you prefer to manually configure Coding Plan models, you can add them to your
       {
         "id": "qwen3-coder-plus",
         "name": "qwen3-coder-plus",
-        "description": "Qwen3-Coder via Bailian Coding Plan",
+        "description": "Qwen3-Coder via Alibaba Cloud Coding Plan",
         "envKey": "YOUR_CUSTOM_ENV_KEY",
         "baseUrl": "https://coding.dashscope.aliyuncs.com/v1"
       }
diff --git a/docs/users/configuration/settings.md b/docs/users/configuration/settings.md
index 82db2b319..edca4aedd 100644
--- a/docs/users/configuration/settings.md
+++ b/docs/users/configuration/settings.md
@@ -2,7 +2,7 @@
 
 > [!tip]
 >
-> **Authentication / API keys:** Authentication (Qwen OAuth vs OpenAI-compatible API) and auth-related environment variables (like `OPENAI_API_KEY`) are documented in **[Authentication](../configuration/auth)**.
+> **Authentication / API keys:** Authentication (Qwen OAuth, Alibaba Cloud Coding Plan, or API Key) and auth-related environment variables (like `OPENAI_API_KEY`) are documented in **[Authentication](../configuration/auth)**.
 
 > [!note]
 >
@@ -125,18 +125,18 @@ Settings are organized into categories. All settings should be placed within the
 
 #### model
 
-| Setting                                            | Type    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | Default     |
-| -------------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------- |
-| `model.name`                                       | string  | The Qwen model to use for conversations.                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | `undefined` |
-| `model.maxSessionTurns`                            | number  | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited.                                                                                                                                                                                                                                                                                                                                                                                                                            | `-1`        |
-| `model.summarizeToolOutput`                        | object  | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}`                                                                                                                                                                                                                                     | `undefined` |
-| `model.generationConfig`                           | object  | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `enableCacheControl`, `contextWindowSize` (override model's context window size), `customHeaders` (custom HTTP headers for API requests), and `extra_body` (additional body parameters for OpenAI-compatible API requests only), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
-| `model.chatCompression.contextPercentageThreshold` | number  | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely.                                                                                                                                       | `0.7`       |
-| `model.skipNextSpeakerCheck`                       | boolean | Skip the next speaker check.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | `false`     |
-| `model.skipLoopDetection`                          | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions.                                                                                                                                                                                                                                                      | `false`     |
-| `model.skipStartupContext`                         | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup.                                                                                                                                                                                                                                                                                             | `false`     |
-| `model.enableOpenAILogging`                        | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files.                                                                                                                                                                                                                                                                                                                                                                           | `false`     |
-| `model.openAILoggingDir`                           | string  | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory).                                                                                                                                                                                                                                                              | `undefined` |
+| Setting                                            | Type    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | Default     |
+| -------------------------------------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
+| `model.name`                                       | string  | The Qwen model to use for conversations.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | `undefined` |
+| `model.maxSessionTurns`                            | number  | Maximum number of user/model/tool turns to keep in a session. -1 means unlimited.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | `-1`        |
+| `model.summarizeToolOutput`                        | object  | Enables or disables the summarization of tool output. You can specify the token budget for the summarization using the `tokenBudget` setting. Note: Currently only the `run_shell_command` tool is supported. For example `{"run_shell_command": {"tokenBudget": 2000}}`                                                                                                                                                                                                                                                                                             | `undefined` |
+| `model.generationConfig`                           | object  | Advanced overrides passed to the underlying content generator. Supports request controls such as `timeout`, `maxRetries`, `enableCacheControl`, `contextWindowSize` (override model's context window size), `modalities` (override auto-detected input modalities), `customHeaders` (custom HTTP headers for API requests), and `extra_body` (additional body parameters for OpenAI-compatible API requests only), along with fine-tuning knobs under `samplingParams` (for example `temperature`, `top_p`, `max_tokens`). Leave unset to rely on provider defaults. | `undefined` |
+| `model.chatCompression.contextPercentageThreshold` | number  | Sets the threshold for chat history compression as a percentage of the model's total token limit. This is a value between 0 and 1 that applies to both automatic compression and the manual `/compress` command. For example, a value of `0.6` will trigger compression when the chat history exceeds 60% of the token limit. Use `0` to disable compression entirely.                                                                                                                                                                                               | `0.7`       |
+| `model.skipNextSpeakerCheck`                       | boolean | Skip the next speaker check.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | `false`     |
+| `model.skipLoopDetection`                          | boolean | Disables loop detection checks. Loop detection prevents infinite loops in AI responses but can generate false positives that interrupt legitimate workflows. Enable this option if you experience frequent false positive loop detection interruptions.                                                                                                                                                                                                                                                                                                              | `false`     |
+| `model.skipStartupContext`                         | boolean | Skips sending the startup workspace context (environment summary and acknowledgement) at the beginning of each session. Enable this if you prefer to provide context manually or want to save tokens on startup.                                                                                                                                                                                                                                                                                                                                                     | `false`     |
+| `model.enableOpenAILogging`                        | boolean | Enables logging of OpenAI API calls for debugging and analysis. When enabled, API requests and responses are logged to JSON files.                                                                                                                                                                                                                                                                                                                                                                                                                                   | `false`     |
+| `model.openAILoggingDir`                           | string  | Custom directory path for OpenAI API logs. If not specified, defaults to `logs/openai` in the current working directory. Supports absolute paths, relative paths (resolved from current working directory), and `~` expansion (home directory).                                                                                                                                                                                                                                                                                                                      | `undefined` |
 
 **Example model.generationConfig:**
 
@@ -146,6 +146,9 @@ Settings are organized into categories. All settings should be placed within the
     "generationConfig": {
       "timeout": 60000,
       "contextWindowSize": 128000,
+      "modalities": {
+        "image": true
+      },
       "enableCacheControl": true,
       "customHeaders": {
         "X-Client-Request-ID": "req-123"
@@ -167,6 +170,10 @@ Settings are organized into categories. All settings should be placed within the
 
 Overrides the default context window size for the selected model. Qwen Code determines the context window using built-in defaults based on model name matching, with a constant fallback value. Use this setting when a provider's effective context limit differs from Qwen Code's default. This value defines the model's assumed maximum context capacity, not a per-request token limit.
 
+**modalities:**
+
+Overrides the auto-detected input modalities for the selected model. Qwen Code automatically detects supported modalities (image, PDF, audio, video) based on model name pattern matching. Use this setting when the auto-detection is incorrect — for example, to enable `pdf` for a model that supports it but isn't recognized. Format: `{ "image": true, "pdf": true, "audio": true, "video": true }`. Omit a key or set it to `false` for unsupported types.
+
 **customHeaders:**
 
 Allows you to add custom HTTP headers to all API requests. This is useful for request tracing, monitoring, API gateway routing, or when different models require different headers. If `customHeaders` is defined in `modelProviders[].generationConfig.customHeaders`, it will be used directly; otherwise, headers from `model.generationConfig.customHeaders` will be used. No merging occurs between the two levels.
diff --git a/docs/users/overview.md b/docs/users/overview.md
index 3b45cc2f0..f3c52be91 100644
--- a/docs/users/overview.md
+++ b/docs/users/overview.md
@@ -7,25 +7,24 @@
 
 ## Get started in 30 seconds
 
-Prerequisites:
-
-- A [Qwen Code](https://chat.qwen.ai/auth?mode=register) account
-- Requires [Node.js 20+](https://nodejs.org/zh-cn/download), you can use `node -v` to check the version. If it's not installed, use the following command to install it.
-
 ### Install Qwen Code:
 
-**NPM**(recommended)
+**Linux / macOS**
 
-```bash
-npm install -g @qwen-code/qwen-code@latest
+```sh
+curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh | bash
 ```
 
-**Homebrew**(macOS, Linux)
+**Windows (Run as Administrator CMD)**
 
-```bash
-brew install qwen-code
+```sh
+curl -fsSL -o %TEMP%\install-qwen.bat https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.bat && %TEMP%\install-qwen.bat
 ```
 
+> [!note]
+>
+> It's recommended to restart your terminal after installation to ensure environment variables take effect. If the installation fails, please refer to [Manual Installation](./quickstart#manual-installation) in the Quickstart guide.
+
 ### Start using Qwen Code:
 
 ```bash
diff --git a/docs/users/quickstart.md b/docs/users/quickstart.md
index eac8f9474..3c4eafcea 100644
--- a/docs/users/quickstart.md
+++ b/docs/users/quickstart.md
@@ -16,19 +16,39 @@ Make sure you have:
 
 To install Qwen Code, use one of the following methods:
 
-### NPM (recommended)
+### Quick Install (Recommended)
 
-Requires [Node.js 20+](https://nodejs.org/download), you can use `node -v` check the version. If it's not installed, use the following command to install it.
-
-If you have [Node.js or newer installed](https://nodejs.org/en/download/):
+**Linux / macOS**
 
 ```sh
+curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh | bash
+```
+
+**Windows (Run as Administrator CMD)**
+
+```sh
+curl -fsSL -o %TEMP%\install-qwen.bat https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.bat && %TEMP%\install-qwen.bat
+```
+
+> [!note]
+>
+> It's recommended to restart your terminal after installation to ensure environment variables take effect.
+
+### Manual Installation
+
+**Prerequisites**
+
+Make sure you have Node.js 20 or later installed. Download it from [nodejs.org](https://nodejs.org/en/download).
+
+**NPM**
+
+```bash
 npm install -g @qwen-code/qwen-code@latest
 ```
 
-### Homebrew (macOS, Linux)
+**Homebrew (macOS, Linux)**
 
-```sh
+```bash
 brew install qwen-code
 ```
 
diff --git a/docs/users/reference/keyboard-shortcuts.md b/docs/users/reference/keyboard-shortcuts.md
index f0cbd7b16..fdfc41b87 100644
--- a/docs/users/reference/keyboard-shortcuts.md
+++ b/docs/users/reference/keyboard-shortcuts.md
@@ -40,6 +40,7 @@ This document lists the available keyboard shortcuts in Qwen Code.
 | `Ctrl+N`                                           | Navigate down through the input history.                                                                                            |
 | `Ctrl+P`                                           | Navigate up through the input history.                                                                                              |
 | `Ctrl+R`                                           | Reverse search through input/shell history.                                                                                         |
+| `Ctrl+Y`                                           | Retry the last failed request.                                                                                                      |
 | `Ctrl+Right Arrow` / `Meta+Right Arrow` / `Meta+F` | Move the cursor one word to the right.                                                                                              |
 | `Ctrl+U`                                           | Delete from the cursor to the beginning of the line.                                                                                |
 | `Ctrl+V` (Windows: `Alt+V`)                        | Paste clipboard content. If the clipboard contains an image, it will be saved and a reference to it will be inserted in the prompt. |
diff --git a/docs/users/support/tos-privacy.md b/docs/users/support/tos-privacy.md
index aa0d5c471..386153512 100644
--- a/docs/users/support/tos-privacy.md
+++ b/docs/users/support/tos-privacy.md
@@ -4,17 +4,19 @@ Qwen Code is an open-source AI coding assistant tool maintained by the Qwen Code
 
 ## How to determine your authentication method
 
-Qwen Code supports two main authentication methods to access AI models. Your authentication method determines which terms of service and privacy policies apply to your usage:
+Qwen Code supports three authentication methods to access AI models. Your authentication method determines which terms of service and privacy policies apply to your usage:
 
-1. **Qwen OAuth** - Log in with your qwen.ai account
-2. **OpenAI-Compatible API** - Use API keys from various AI model providers
+1. **Qwen OAuth** — Log in with your qwen.ai account (free daily quota)
+2. **Alibaba Cloud Coding Plan** — Use an API key from Alibaba Cloud
+3. **API Key** — Bring your own API key
 
 For each authentication method, different Terms of Service and Privacy Notices may apply depending on the underlying service provider.
 
-| Authentication Method | Provider          | Terms of Service                                                              | Privacy Notice                                       |
-| :-------------------- | :---------------- | :---------------------------------------------------------------------------- | :--------------------------------------------------- |
-| Qwen OAuth            | Qwen AI           | [Qwen Terms of Service](https://qwen.ai/termsservice)                         | [Qwen Privacy Policy](https://qwen.ai/privacypolicy) |
-| OpenAI-Compatible API | Various Providers | Depends on your chosen API provider (OpenAI, Alibaba Cloud, ModelScope, etc.) | Depends on your chosen API provider                  |
+| Authentication Method     | Provider          | Terms of Service                                                   | Privacy Notice                                                     |
+| :------------------------ | :---------------- | :----------------------------------------------------------------- | :----------------------------------------------------------------- |
+| Qwen OAuth                | Qwen AI           | [Qwen Terms of Service](https://qwen.ai/termsservice)              | [Qwen Privacy Policy](https://qwen.ai/privacypolicy)               |
+| Alibaba Cloud Coding Plan | Alibaba Cloud     | See [details below](#2-if-you-are-using-alibaba-cloud-coding-plan) | See [details below](#2-if-you-are-using-alibaba-cloud-coding-plan) |
+| API Key                   | Various Providers | Depends on your chosen API provider (OpenAI, Anthropic, etc.)      | Depends on your chosen API provider                                |
 
 ## 1. If you are using Qwen OAuth Authentication
 
@@ -25,13 +27,26 @@ When you authenticate using your qwen.ai account, these Terms of Service and Pri
 
 For details about authentication setup, quotas, and supported features, see [Authentication Setup](../configuration/settings).
 
-## 2. If you are using OpenAI-Compatible API Authentication
+## 2. If you are using Alibaba Cloud Coding Plan
 
-When you authenticate using API keys from OpenAI-compatible providers, the applicable Terms of Service and Privacy Notice depend on your chosen provider.
+When you authenticate using an API key from Alibaba Cloud, the applicable Terms of Service and Privacy Notice from Alibaba Cloud apply.
+
+Alibaba Cloud Coding Plan is available in two regions:
+
+- **阿里云百炼 (aliyun.com)** — [bailian.console.aliyun.com](https://bailian.console.aliyun.com)
+- **Alibaba Cloud (alibabacloud.com)** — [bailian.console.alibabacloud.com](https://bailian.console.alibabacloud.com)
 
 > [!important]
 >
-> When using OpenAI-compatible API authentication, you are subject to the terms and privacy policies of your chosen API provider, not Qwen Code's terms. Please review your provider's documentation for specific details about data usage, retention, and privacy practices.
+> When using Alibaba Cloud Coding Plan, you are subject to Alibaba Cloud's terms and privacy policies. Please review their documentation for specific details about data usage, retention, and privacy practices.
+
+## 3. If you are using your own API Key
+
+When you authenticate using API keys from other providers, the applicable Terms of Service and Privacy Notice depend on your chosen provider.
+
+> [!important]
+>
+> When using your own API key, you are subject to the terms and privacy policies of your chosen API provider, not Qwen Code's terms. Please review your provider's documentation for specific details about data usage, retention, and privacy practices.
 
 Qwen Code supports various OpenAI-compatible providers. Please refer to your specific provider's terms of service and privacy policy for detailed information.
 
@@ -50,7 +65,8 @@ When enabled, Qwen Code may collect:
 ### Data Collection by Authentication Method
 
 - **Qwen OAuth:** Usage statistics are governed by Qwen's privacy policy. You can opt-out through Qwen Code's configuration settings.
-- **OpenAI-Compatible API:** No additional data is collected by Qwen Code beyond what your chosen API provider collects.
+- **Alibaba Cloud Coding Plan:** Usage statistics are governed by Alibaba Cloud's privacy policy. You can opt-out through Qwen Code's configuration settings.
+- **API Key:** No additional data is collected by Qwen Code beyond what your chosen API provider collects.
 
 ## Frequently Asked Questions (FAQ)
 
@@ -60,7 +76,9 @@ Whether your code, including prompts and answers, is used to train AI models dep
 
 - **Qwen OAuth**: Data usage is governed by [Qwen's Privacy Policy](https://qwen.ai/privacy). Please refer to their policy for specific details about data collection and model training practices.
 
-- **OpenAI-Compatible API**: Data usage depends entirely on your chosen API provider. Each provider has their own data usage policies. Please review the privacy policy and terms of service of your specific provider.
+- **Alibaba Cloud Coding Plan**: Data usage is governed by Alibaba Cloud's privacy policy. Please refer to their policy for specific details about data collection and model training practices.
+
+- **API Key**: Data usage depends entirely on your chosen API provider. Each provider has their own data usage policies. Please review the privacy policy and terms of service of your specific provider.
 
 **Important**: Qwen Code itself does not use your prompts, code, or responses for model training. Any data usage for training purposes would be governed by the policies of the AI service provider you authenticate with.
 
@@ -85,10 +103,10 @@ The Usage Statistics setting only controls data collection by Qwen Code itself.
 
 ### 3. How do I switch between authentication methods?
 
-You can switch between Qwen OAuth and OpenAI-compatible API authentication at any time:
+You can switch between Qwen OAuth, Alibaba Cloud Coding Plan, and your own API key at any time:
 
 1. **During startup**: Choose your preferred authentication method when prompted
 2. **Within the CLI**: Use the `/auth` command to reconfigure your authentication method
-3. **Environment variables**: Set up `.env` files for automatic OpenAI-compatible API authentication
+3. **Environment variables**: Set up `.env` files for automatic API key authentication
 
 For detailed instructions, see the [Authentication Setup](../configuration/settings#environment-variables-for-api-access) documentation.
diff --git a/package-lock.json b/package-lock.json
index c2064fa6f..f26e50737 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -1,12 +1,12 @@
 {
   "name": "@qwen-code/qwen-code",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "@qwen-code/qwen-code",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "workspaces": [
         "packages/*"
       ],
@@ -18780,7 +18780,7 @@
     },
     "packages/cli": {
       "name": "@qwen-code/qwen-code",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "dependencies": {
         "@google/genai": "1.30.0",
         "@iarna/toml": "^2.2.5",
@@ -19437,7 +19437,7 @@
     },
     "packages/core": {
       "name": "@qwen-code/qwen-code-core",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "hasInstallScript": true,
       "dependencies": {
         "@anthropic-ai/sdk": "^0.36.1",
@@ -22917,7 +22917,7 @@
     },
     "packages/test-utils": {
       "name": "@qwen-code/qwen-code-test-utils",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "dev": true,
       "license": "Apache-2.0",
       "devDependencies": {
@@ -22929,7 +22929,7 @@
     },
     "packages/vscode-ide-companion": {
       "name": "qwen-code-vscode-ide-companion",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "license": "LICENSE",
       "dependencies": {
         "@modelcontextprotocol/sdk": "^1.25.1",
@@ -23176,7 +23176,7 @@
     },
     "packages/web-templates": {
       "name": "@qwen-code/web-templates",
-      "version": "0.10.0",
+      "version": "0.11.1",
       "devDependencies": {
         "@types/react": "^18.2.0",
         "@types/react-dom": "^18.2.0",
@@ -23704,7 +23704,7 @@
     },
     "packages/webui": {
       "name": "@qwen-code/webui",
-      "version": "0.11.0",
+      "version": "0.11.1",
       "license": "MIT",
       "dependencies": {
         "markdown-it": "^14.1.0"
diff --git a/package.json b/package.json
index bc2544c9b..5657d4129 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/qwen-code",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "engines": {
     "node": ">=20.0.0"
   },
@@ -13,7 +13,7 @@
     "url": "git+https://github.com/QwenLM/qwen-code.git"
   },
   "config": {
-    "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.11.0"
+    "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.11.1"
   },
   "scripts": {
     "start": "cross-env node scripts/start.js",
diff --git a/packages/cli/package.json b/packages/cli/package.json
index 3dba290bd..2dc3d87d7 100644
--- a/packages/cli/package.json
+++ b/packages/cli/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/qwen-code",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "description": "Qwen Code",
   "repository": {
     "type": "git",
@@ -33,7 +33,7 @@
     "dist"
   ],
   "config": {
-    "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.11.0"
+    "sandboxImageUri": "ghcr.io/qwenlm/qwen-code:0.11.1"
   },
   "dependencies": {
     "@google/genai": "1.30.0",
diff --git a/packages/cli/src/acp-integration/acpAgent.ts b/packages/cli/src/acp-integration/acpAgent.ts
index a7ae2cf4c..11878017a 100644
--- a/packages/cli/src/acp-integration/acpAgent.ts
+++ b/packages/cli/src/acp-integration/acpAgent.ts
@@ -107,6 +107,10 @@ class GeminiAgent {
           audio: true,
           embeddedContext: true,
         },
+        sessionCapabilities: {
+          list: {},
+          resume: {},
+        },
       },
     };
   }
@@ -153,10 +157,14 @@ class GeminiAgent {
 
     const session = await this.createAndStoreSession(config);
     const availableModels = this.buildAvailableModels(config);
+    const modesData = this.buildModesData(config);
+    const configOptions = this.buildConfigOptions(config);
 
     return {
       sessionId: session.getId(),
       models: availableModels,
+      modes: modesData,
+      configOptions,
     };
   }
 
@@ -239,25 +247,31 @@ class GeminiAgent {
   async listSessions(
     params: acp.ListSessionsRequest,
   ): Promise<acp.ListSessionsResponse> {
-    const sessionService = new SessionService(params.cwd);
+    const cwd = params.cwd || process.cwd();
+    const sessionService = new SessionService(cwd);
     const result = await sessionService.listSessions({
       cursor: params.cursor,
       size: params.size,
     });
 
+    const sessions = result.items.map((item) => ({
+      cwd: item.cwd,
+      filePath: item.filePath,
+      gitBranch: item.gitBranch,
+      messageCount: item.messageCount,
+      mtime: item.mtime,
+      prompt: item.prompt,
+      sessionId: item.sessionId,
+      startTime: item.startTime,
+      title: item.prompt || '(session)',
+      updatedAt: new Date(item.mtime).toISOString(),
+    }));
+
     return {
-      items: result.items.map((item) => ({
-        sessionId: item.sessionId,
-        cwd: item.cwd,
-        startTime: item.startTime,
-        mtime: item.mtime,
-        prompt: item.prompt,
-        gitBranch: item.gitBranch,
-        filePath: item.filePath,
-        messageCount: item.messageCount,
-      })),
-      nextCursor: result.nextCursor,
       hasMore: result.hasMore,
+      items: sessions,
+      nextCursor: result.nextCursor,
+      sessions,
     };
   }
 
@@ -449,6 +463,70 @@ class GeminiAgent {
     };
   }
 
+  private buildModesData(config: Config): acp.ModesData {
+    const currentApprovalMode = config.getApprovalMode();
+
+    const availableModes = APPROVAL_MODES.map((mode) => ({
+      id: mode as ApprovalModeValue,
+      name: APPROVAL_MODE_INFO[mode].name,
+      description: APPROVAL_MODE_INFO[mode].description,
+    }));
+
+    return {
+      currentModeId: currentApprovalMode as ApprovalModeValue,
+      availableModes,
+    };
+  }
+
+  private buildConfigOptions(config: Config): acp.ConfigOption[] {
+    const currentApprovalMode = config.getApprovalMode();
+    const currentModelId = this.formatCurrentModelId(
+      config.getModel() || this.config.getModel() || '',
+      config.getAuthType(),
+    );
+
+    const modeOptions = APPROVAL_MODES.map((mode) => ({
+      value: mode,
+      name: APPROVAL_MODE_INFO[mode].name,
+      description: APPROVAL_MODE_INFO[mode].description,
+    }));
+
+    const allConfiguredModels = config.getAllConfiguredModels();
+    const modelOptions = allConfiguredModels.map((model) => {
+      const effectiveModelId =
+        model.isRuntimeModel && model.runtimeSnapshotId
+          ? model.runtimeSnapshotId
+          : model.id;
+
+      return {
+        value: formatAcpModelId(effectiveModelId, model.authType),
+        name: model.label,
+        description: model.description ?? '',
+      };
+    });
+
+    return [
+      {
+        id: 'mode',
+        name: 'Mode',
+        description: 'Session permission mode',
+        category: 'mode',
+        type: 'select',
+        currentValue: currentApprovalMode,
+        options: modeOptions,
+      },
+      {
+        id: 'model',
+        name: 'Model',
+        description: 'AI model to use',
+        category: 'model',
+        type: 'select',
+        currentValue: currentModelId,
+        options: modelOptions,
+      },
+    ];
+  }
+
   private formatCurrentModelId(
     baseModelId: string,
     authType?: AuthType,
diff --git a/packages/cli/src/acp-integration/schema.ts b/packages/cli/src/acp-integration/schema.ts
index 952ad0bd5..1df709c45 100644
--- a/packages/cli/src/acp-integration/schema.ts
+++ b/packages/cli/src/acp-integration/schema.ts
@@ -59,7 +59,7 @@ export type CancelNotification = z.infer<typeof cancelNotificationSchema>;
 
 export type AuthenticateRequest = z.infer<typeof authenticateRequestSchema>;
 
-export type NewSessionResponse = z.infer<typeof newSessionResponseSchema>;
+// Note: NewSessionResponse type is defined later after newSessionResponseSchema
 
 export type LoadSessionResponse = z.infer<typeof loadSessionResponseSchema>;
 
@@ -285,33 +285,33 @@ export const sessionModelStateSchema = z.object({
   currentModelId: modelIdSchema,
 });
 
-export const newSessionResponseSchema = z.object({
-  sessionId: z.string(),
-  models: sessionModelStateSchema,
-});
+// Note: newSessionResponseSchema is defined later in the file after modesDataSchema
 
 export const loadSessionResponseSchema = z.null();
 
 export const sessionListItemSchema = z.object({
   cwd: z.string(),
-  filePath: z.string(),
+  filePath: z.string().optional(),
   gitBranch: z.string().optional(),
-  messageCount: z.number(),
-  mtime: z.number(),
-  prompt: z.string(),
+  messageCount: z.number().optional(),
+  mtime: z.number().optional(),
+  prompt: z.string().optional(),
   sessionId: z.string(),
-  startTime: z.string(),
+  startTime: z.string().optional(),
+  title: z.string(),
+  updatedAt: z.string(),
 });
 
 export const listSessionsResponseSchema = z.object({
-  hasMore: z.boolean(),
-  items: z.array(sessionListItemSchema),
+  hasMore: z.boolean().optional(),
+  items: z.array(sessionListItemSchema).optional(),
   nextCursor: z.number().optional(),
+  sessions: z.array(sessionListItemSchema),
 });
 
 export const listSessionsRequestSchema = z.object({
   cursor: z.number().optional(),
-  cwd: z.string(),
+  cwd: z.string().optional(),
   size: z.number().optional(),
 });
 
@@ -405,6 +405,12 @@ export const promptCapabilitiesSchema = z.object({
 export const agentCapabilitiesSchema = z.object({
   loadSession: z.boolean().optional(),
   promptCapabilities: promptCapabilitiesSchema.optional(),
+  sessionCapabilities: z
+    .object({
+      list: z.object({}).optional(),
+      resume: z.object({}).optional(),
+    })
+    .optional(),
 });
 
 export const authMethodSchema = z.object({
@@ -451,6 +457,34 @@ export const modesDataSchema = z.object({
   availableModes: z.array(modeInfoSchema),
 });
 
+export const configOptionSchema = z.object({
+  id: z.string(),
+  name: z.string(),
+  description: z.string(),
+  category: z.string(),
+  type: z.string(),
+  currentValue: z.string(),
+  options: z.array(
+    z.object({
+      value: z.string(),
+      name: z.string(),
+      description: z.string(),
+    }),
+  ),
+});
+
+export type ConfigOption = z.infer<typeof configOptionSchema>;
+
+// newSessionResponseSchema includes modes and configOptions for ACP/Zed integration
+export const newSessionResponseSchema = z.object({
+  sessionId: z.string(),
+  models: sessionModelStateSchema,
+  modes: modesDataSchema,
+  configOptions: z.array(configOptionSchema),
+});
+
+export type NewSessionResponse = z.infer<typeof newSessionResponseSchema>;
+
 export const agentInfoSchema = z.object({
   name: z.string(),
   title: z.string(),
diff --git a/packages/cli/src/config/config.ts b/packages/cli/src/config/config.ts
index e2c4b77af..0a4e0a6c4 100755
--- a/packages/cli/src/config/config.ts
+++ b/packages/cli/src/config/config.ts
@@ -698,14 +698,21 @@ export async function loadCliConfig(
   }
 
   // Automatically load output-language.md if it exists
-  let outputLanguageFilePath: string | undefined = path.join(
+  const projectStorage = new Storage(cwd);
+  const projectOutputLanguagePath = path.join(
+    projectStorage.getQwenDir(),
+    'output-language.md',
+  );
+  const globalOutputLanguagePath = path.join(
     Storage.getGlobalQwenDir(),
     'output-language.md',
   );
-  if (fs.existsSync(outputLanguageFilePath)) {
-    // output-language.md found - will be added to context files
-  } else {
-    outputLanguageFilePath = undefined;
+
+  let outputLanguageFilePath: string | undefined;
+  if (fs.existsSync(projectOutputLanguagePath)) {
+    outputLanguageFilePath = projectOutputLanguagePath;
+  } else if (fs.existsSync(globalOutputLanguagePath)) {
+    outputLanguageFilePath = globalOutputLanguagePath;
   }
 
   const fileService = new FileDiscoveryService(cwd);
diff --git a/packages/cli/src/config/keyBindings.ts b/packages/cli/src/config/keyBindings.ts
index 226727c5b..7499a8c68 100644
--- a/packages/cli/src/config/keyBindings.ts
+++ b/packages/cli/src/config/keyBindings.ts
@@ -50,6 +50,7 @@ export enum Command {
   QUIT = 'quit',
   EXIT = 'exit',
   SHOW_MORE_LINES = 'showMoreLines',
+  RETRY_LAST = 'retryLast',
 
   // Shell commands
   REVERSE_SEARCH = 'reverseSearch',
@@ -170,6 +171,7 @@ export const defaultKeyBindings: KeyBindingConfig = {
   [Command.QUIT]: [{ key: 'c', ctrl: true }],
   [Command.EXIT]: [{ key: 'd', ctrl: true }],
   [Command.SHOW_MORE_LINES]: [{ key: 's', ctrl: true }],
+  [Command.RETRY_LAST]: [{ key: 'y', ctrl: true }],
 
   // Shell commands
   [Command.REVERSE_SEARCH]: [{ key: 'r', ctrl: true }],
diff --git a/packages/cli/src/constants/codingPlan.ts b/packages/cli/src/constants/codingPlan.ts
index 80c41f50b..03c164d8e 100644
--- a/packages/cli/src/constants/codingPlan.ts
+++ b/packages/cli/src/constants/codingPlan.ts
@@ -251,15 +251,9 @@ export function getCodingPlanConfig(region: CodingPlanRegion) {
     region === CodingPlanRegion.CHINA
       ? 'https://coding.dashscope.aliyuncs.com/v1'
       : 'https://coding-intl.dashscope.aliyuncs.com/v1';
-  const regionName =
-    region === CodingPlanRegion.CHINA
-      ? 'Coding Plan (Bailian, China)'
-      : 'Coding Plan (Bailian, Global/Intl)';
-
   return {
     template,
     baseUrl,
-    regionName,
     version: computeCodingPlanVersion(template),
   };
 }
diff --git a/packages/cli/src/i18n/locales/de.js b/packages/cli/src/i18n/locales/de.js
index 8ae18e16e..1144aa31c 100644
--- a/packages/cli/src/i18n/locales/de.js
+++ b/packages/cli/src/i18n/locales/de.js
@@ -157,6 +157,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter zum Bestätigen, Esc zum Abbrechen',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter zum Auswählen, ↑↓ zum Navigieren, Esc zum Zurückgehen',
+  'Enter to submit, Esc to go back': 'Enter zum Absenden, Esc zum Zurückgehen',
   'Invalid step: {{step}}': 'Ungültiger Schritt: {{step}}',
   'No subagents found.': 'Keine Unteragenten gefunden.',
   "Use '/agents create' to create your first subagent.":
@@ -944,18 +945,22 @@ export default {
   // Dialogs - Auth
   // ============================================================================
   'Get started': 'Loslegen',
-  'How would you like to authenticate for this project?':
-    'Wie möchten Sie sich für dieses Projekt authentifizieren?',
+  'Select Authentication Method': 'Authentifizierungsmethode auswählen',
   'OpenAI API key is required to use OpenAI authentication.':
     'OpenAI API-Schlüssel ist für die OpenAI-Authentifizierung erforderlich.',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     'Sie müssen eine Authentifizierungsmethode wählen, um fortzufahren. Drücken Sie erneut Strg+C zum Beenden.',
-  '(Use Enter to Set Auth)': '(Enter zum Festlegen der Authentifizierung)',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Nutzungsbedingungen und Datenschutzhinweis für Qwen Code',
+  'Terms of Services and Privacy Notice':
+    'Nutzungsbedingungen und Datenschutzhinweis',
   'Qwen OAuth': 'Qwen OAuth',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    'Kostenlos \u00B7 Bis zu 1.000 Anfragen/Tag \u00B7 Qwen neueste Modelle',
   'Login with QwenChat account to use daily free quota.':
     'Melden Sie sich mit Ihrem QwenChat-Konto an, um das tägliche kostenlose Kontingent zu nutzen.',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    'Kostenpflichtig \u00B7 Bis zu 6.000 Anfragen/5 Std. \u00B7 Alle Alibaba Cloud Coding Plan Modelle',
+  'Alibaba Cloud Coding Plan': 'Alibaba Cloud Coding Plan',
+  'Bring your own API key': 'Eigenen API-Schlüssel verwenden',
   'API-KEY': 'API-KEY',
   'Use coding plan credentials or your own api-keys/providers.':
     'Verwenden Sie Coding Plan-Anmeldedaten oder Ihre eigenen API-Schlüssel/Anbieter.',
@@ -985,6 +990,8 @@ export default {
     'Warten auf Qwen OAuth-Authentifizierung...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     'Hinweis: Ihr bestehender API-Schlüssel in settings.json wird bei Verwendung von Qwen OAuth nicht gelöscht. Sie können später bei Bedarf zur OpenAI-Authentifizierung zurückwechseln.',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    'Hinweis: Ihr bestehender API-Schlüssel wird bei Verwendung von Qwen OAuth nicht gelöscht.',
   'Authentication timed out. Please try again.':
     'Authentifizierung abgelaufen. Bitte versuchen Sie es erneut.',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
@@ -1034,6 +1041,17 @@ export default {
   '(default)': '(Standard)',
   '(set)': '(gesetzt)',
   '(not set)': '(nicht gesetzt)',
+  Modality: 'Modalität',
+  'Context Window': 'Kontextfenster',
+  text: 'Text',
+  'text-only': 'nur Text',
+  image: 'Bild',
+  pdf: 'PDF',
+  audio: 'Audio',
+  video: 'Video',
+  'not set': 'nicht gesetzt',
+  none: 'keine',
+  unknown: 'unbekannt',
   "Failed to switch model to '{{modelId}}'.\n\n{{error}}":
     "Modell konnte nicht auf '{{modelId}}' umgestellt werden.\n\n{{error}}",
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
@@ -1380,38 +1398,43 @@ export default {
     'Erweiterungsseite wird im Browser geöffnet: {{url}}',
   'Failed to open browser. Check out the extensions gallery at {{url}}':
     'Browser konnte nicht geöffnet werden. Besuchen Sie die Erweiterungsgalerie unter {{url}}',
+  'Use /compress when the conversation gets long to summarize history and free up context.':
+    'Verwenden Sie /compress, wenn die Unterhaltung lang wird, um den Verlauf zusammenzufassen und Kontext freizugeben.',
+  'Start a fresh idea with /clear or /new; the previous session stays available in history.':
+    'Starten Sie eine neue Idee mit /clear oder /new; die vorherige Sitzung bleibt im Verlauf verfügbar.',
+  'Use /bug to submit issues to the maintainers when something goes off.':
+    'Verwenden Sie /bug, um Probleme an die Betreuer zu melden, wenn etwas schiefgeht.',
+  'Switch auth type quickly with /auth.':
+    'Wechseln Sie den Authentifizierungstyp schnell mit /auth.',
+  'You can run any shell commands from Qwen Code using ! (e.g. !ls).':
+    'Sie können beliebige Shell-Befehle in Qwen Code mit ! ausführen (z. B. !ls).',
+  'Type / to open the command popup; Tab autocompletes slash commands and saved prompts.':
+    'Geben Sie / ein, um das Befehlsmenü zu öffnen; Tab vervollständigt Slash-Befehle und gespeicherte Prompts.',
+  'You can resume a previous conversation by running qwen --continue or qwen --resume.':
+    'Sie können eine frühere Unterhaltung mit qwen --continue oder qwen --resume fortsetzen.',
   'You can switch permission mode quickly with Shift+Tab or /approval-mode.':
     'Sie können den Berechtigungsmodus schnell mit Shift+Tab oder /approval-mode wechseln.',
   'You can switch permission mode quickly with Tab or /approval-mode.':
     'Sie können den Berechtigungsmodus schnell mit Tab oder /approval-mode wechseln.',
+  'Try /insight to generate personalized insights from your chat history.':
+    'Probieren Sie /insight, um personalisierte Erkenntnisse aus Ihrem Chatverlauf zu erstellen.',
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    'Für fortgeschrittene Benutzer, die Modelle manuell konfigurieren möchten.',
-  'Please configure your models in settings.json:':
-    'Bitte konfigurieren Sie Ihre Modelle in settings.json:',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    'API-Schlüssel über Umgebungsvariable setzen (z.B. OPENAI_API_KEY)',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "Modellkonfiguration zu modelProviders['openai'] (oder anderen Authentifizierungstypen) hinzufügen",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    'Jeder Anbieter benötigt: id, envKey (erforderlich), plus optionale baseUrl, generationConfig',
-  'Use /model command to select your preferred model from the configured list':
-    'Verwenden Sie den /model-Befehl, um Ihr bevorzugtes Modell aus der konfigurierten Liste auszuwählen',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    'Unterstützte Authentifizierungstypen: openai, anthropic, gemini, vertex-ai, usw.',
+  'You can configure your API key and models in settings.json':
+    'Sie können Ihren API-Schlüssel und Modelle in settings.json konfigurieren',
+  'Refer to the documentation for setup instructions':
+    'Einrichtungsanweisungen finden Sie in der Dokumentation',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': 'Bitte geben Sie Ihren API-Schlüssel ein:',
   'API key cannot be empty.': 'API-Schlüssel darf nicht leer sein.',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    'Hier können Sie Ihren exklusiven Coding Plan API-KEY erhalten:',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    'Neue Modellkonfigurationen sind für Bailian Coding Plan verfügbar. Jetzt aktualisieren?',
+  'You can get your Coding Plan API key here':
+    'Sie können Ihren Coding-Plan-API-Schlüssel hier erhalten',
+  'New model configurations are available for Alibaba Cloud Coding Plan. Update now?':
+    'Neue Modellkonfigurationen sind für Alibaba Cloud Coding Plan verfügbar. Jetzt aktualisieren?',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Coding Plan-Konfiguration erfolgreich aktualisiert. Neue Modelle sind jetzt verfügbar.',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -1422,32 +1445,16 @@ export default {
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (Bailian, China)',
-  'Coding Plan (Bailian, Global/Intl)': 'Coding Plan (Bailian, Global/Intl)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    'Fügen Sie Ihren Bailian Coding Plan API-Schlüssel ein und Sie sind bereit!',
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    'Fügen Sie Ihren Coding Plan (Bailian, Global/Intl) API-Schlüssel ein und Sie sind bereit!',
-  Custom: 'Benutzerdefiniert',
-  'More instructions about configuring `modelProviders` manually.':
-    'Weitere Anweisungen zur manuellen Konfiguration von `modelProviders`.',
-  'Select API-KEY configuration mode:':
-    'API-KEY-Konfigurationsmodus auswählen:',
-  '(Press Escape to go back)': '(Escape drücken zum Zurückgehen)',
-  '(Press Enter to submit, Escape to cancel)':
-    '(Enter zum Absenden, Escape zum Abbrechen)',
-  'More instructions please check:': 'Weitere Anweisungen finden Sie unter:',
+  'Select Region for Coding Plan': 'Region für Coding Plan auswählen',
+  'Choose based on where your account is registered':
+    'Wählen Sie basierend auf dem Registrierungsort Ihres Kontos',
+  'Enter Coding Plan API Key': 'Coding-Plan-API-Schlüssel eingeben',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     'Neue Modellkonfigurationen sind für {{region}} verfügbar. Jetzt aktualisieren?',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    'Neue Modellkonfigurationen sind für Bailian Coding Plan (China) verfügbar. Jetzt aktualisieren?',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'Neue Modellkonfigurationen sind für Coding Plan (Bailian, Global/Intl) verfügbar. Jetzt aktualisieren?',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     '{{region}}-Konfiguration erfolgreich aktualisiert. Modell auf "{{model}}" umgeschaltet.',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/i18n/locales/en.js b/packages/cli/src/i18n/locales/en.js
index 0d3d422a7..1c27b760f 100644
--- a/packages/cli/src/i18n/locales/en.js
+++ b/packages/cli/src/i18n/locales/en.js
@@ -178,6 +178,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter to confirm, Esc to cancel',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter to select, ↑↓ to navigate, Esc to go back',
+  'Enter to submit, Esc to go back': 'Enter to submit, Esc to go back',
   'Invalid step: {{step}}': 'Invalid step: {{step}}',
   'No subagents found.': 'No subagents found.',
   "Use '/agents create' to create your first subagent.":
@@ -935,18 +936,22 @@ export default {
   // Dialogs - Auth
   // ============================================================================
   'Get started': 'Get started',
-  'How would you like to authenticate for this project?':
-    'How would you like to authenticate for this project?',
+  'Select Authentication Method': 'Select Authentication Method',
   'OpenAI API key is required to use OpenAI authentication.':
     'OpenAI API key is required to use OpenAI authentication.',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     'You must select an auth method to proceed. Press Ctrl+C again to exit.',
-  '(Use Enter to Set Auth)': '(Use Enter to Set Auth)',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Terms of Services and Privacy Notice for Qwen Code',
+  'Terms of Services and Privacy Notice':
+    'Terms of Services and Privacy Notice',
   'Qwen OAuth': 'Qwen OAuth',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models',
   'Login with QwenChat account to use daily free quota.':
     'Login with QwenChat account to use daily free quota.',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models',
+  'Alibaba Cloud Coding Plan': 'Alibaba Cloud Coding Plan',
+  'Bring your own API key': 'Bring your own API key',
   'API-KEY': 'API-KEY',
   'Use coding plan credentials or your own api-keys/providers.':
     'Use coding plan credentials or your own api-keys/providers.',
@@ -974,6 +979,8 @@ export default {
     'Waiting for Qwen OAuth authentication...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    'Note: Your existing API key will not be cleared when using Qwen OAuth.',
   'Authentication timed out. Please try again.':
     'Authentication timed out. Please try again.',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
@@ -1021,6 +1028,17 @@ export default {
   '(default)': '(default)',
   '(set)': '(set)',
   '(not set)': '(not set)',
+  Modality: 'Modality',
+  'Context Window': 'Context Window',
+  text: 'text',
+  'text-only': 'text-only',
+  image: 'image',
+  pdf: 'pdf',
+  audio: 'audio',
+  video: 'video',
+  'not set': 'not set',
+  none: 'none',
+  unknown: 'unknown',
   "Failed to switch model to '{{modelId}}'.\n\n{{error}}":
     "Failed to switch model to '{{modelId}}'.\n\n{{error}}",
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
@@ -1115,6 +1133,8 @@ export default {
     'You can switch permission mode quickly with Shift+Tab or /approval-mode.',
   'You can switch permission mode quickly with Tab or /approval-mode.':
     'You can switch permission mode quickly with Tab or /approval-mode.',
+  'Try /insight to generate personalized insights from your chat history.':
+    'Try /insight to generate personalized insights from your chat history.',
 
   // ============================================================================
   // Exit Screen / Stats
@@ -1382,18 +1402,20 @@ export default {
   'Rate limit error: {{reason}}': 'Rate limit error: {{reason}}',
   'Retrying in {{seconds}} seconds… (attempt {{attempt}}/{{maxRetries}})':
     'Retrying in {{seconds}} seconds… (attempt {{attempt}}/{{maxRetries}})',
+  'Press Ctrl+Y to retry': 'Press Ctrl+Y to retry',
+  'No failed request to retry.': 'No failed request to retry.',
+  'to retry last request': 'to retry last request',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': 'Please enter your API key:',
   'API key cannot be empty.': 'API key cannot be empty.',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    'You can get your exclusive Coding Plan API-KEY here:',
+  'You can get your Coding Plan API key here':
+    'You can get your Coding Plan API key here',
   'API key is stored in settings.env. You can migrate it to a .env file for better security.':
     'API key is stored in settings.env. You can migrate it to a .env file for better security.',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    'New model configurations are available for Bailian Coding Plan. Update now?',
+  'New model configurations are available for Alibaba Cloud Coding Plan. Update now?':
+    'New model configurations are available for Alibaba Cloud Coding Plan. Update now?',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Coding Plan configuration updated successfully. New models are now available.',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -1402,51 +1424,26 @@ export default {
     'Failed to update Coding Plan configuration: {{message}}',
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    'For advanced users who want to configure models manually.',
-  'Please configure your models in settings.json:':
-    'Please configure your models in settings.json:',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    'Set API key via environment variable (e.g., OPENAI_API_KEY)',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "Add model configuration to modelProviders['openai'] (or other auth types)",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig',
-  'Use /model command to select your preferred model from the configured list':
-    'Use /model command to select your preferred model from the configured list',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.',
-  'More instructions please check:': 'More instructions please check:',
+  'You can configure your API key and models in settings.json':
+    'You can configure your API key and models in settings.json',
+  'Refer to the documentation for setup instructions':
+    'Refer to the documentation for setup instructions',
 
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (Bailian, China)',
-  'Coding Plan (Bailian, Global/Intl)': 'Coding Plan (Bailian, Global/Intl)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    "Paste your api key of Bailian Coding Plan and you're all set!",
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!",
-  Custom: 'Custom',
-  'More instructions about configuring `modelProviders` manually.':
-    'More instructions about configuring `modelProviders` manually.',
-  'Select API-KEY configuration mode:': 'Select API-KEY configuration mode:',
-  '(Press Escape to go back)': '(Press Escape to go back)',
-  '(Press Enter to submit, Escape to cancel)':
-    '(Press Enter to submit, Escape to cancel)',
+  'Select Region for Coding Plan': 'Select Region for Coding Plan',
+  'Choose based on where your account is registered':
+    'Choose based on where your account is registered',
+  'Enter Coding Plan API Key': 'Enter Coding Plan API Key',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     'New model configurations are available for {{region}}. Update now?',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    'New model configurations are available for Bailian Coding Plan (China). Update now?',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     '{{region}} configuration updated successfully. Model switched to "{{model}}".',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/i18n/locales/ja.js b/packages/cli/src/i18n/locales/ja.js
index 9632d5675..634cec49d 100644
--- a/packages/cli/src/i18n/locales/ja.js
+++ b/packages/cli/src/i18n/locales/ja.js
@@ -142,6 +142,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter で確定、Esc でキャンセル',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter で選択、↑↓ で移動、Esc で戻る',
+  'Enter to submit, Esc to go back': 'Enter で送信、Esc で戻る',
   'Invalid step: {{step}}': '無効なステップ: {{step}}',
   'No subagents found.': 'サブエージェントが見つかりません',
   "Use '/agents create' to create your first subagent.":
@@ -671,18 +672,21 @@ export default {
   '🎯 Overall Goal:': '🎯 全体目標:',
   // Dialogs - Auth
   'Get started': '始める',
-  'How would you like to authenticate for this project?':
-    'このプロジェクトの認証方法を選択してください:',
+  'Select Authentication Method': '認証方法を選択',
   'OpenAI API key is required to use OpenAI authentication.':
     'OpenAI認証を使用するには OpenAI APIキーが必要です',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     '続行するには認証方法を選択してください。Ctrl+C をもう一度押すと終了します',
-  '(Use Enter to Set Auth)': '(Enter で認証を設定)',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Qwen Code の利用規約とプライバシー通知',
+  'Terms of Services and Privacy Notice': '利用規約とプライバシー通知',
   'Qwen OAuth': 'Qwen OAuth',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    '無料 \u00B7 1日最大1,000リクエスト \u00B7 Qwen最新モデル',
   'Login with QwenChat account to use daily free quota.':
     'QwenChatアカウントでログインして、毎日の無料クォータをご利用ください。',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    '有料 \u00B7 5時間最大6,000リクエスト \u00B7 すべての Alibaba Cloud Coding Plan モデル',
+  'Alibaba Cloud Coding Plan': 'Alibaba Cloud Coding Plan',
+  'Bring your own API key': '自分のAPIキーを使用',
   'API-KEY': 'API-KEY',
   'Use coding plan credentials or your own api-keys/providers.':
     'Coding Planの認証情報またはご自身のAPIキー/プロバイダーをご利用ください。',
@@ -710,6 +714,8 @@ export default {
   'Waiting for Qwen OAuth authentication...': 'Qwen OAuth認証を待っています...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     '注: Qwen OAuthを使用しても、settings.json内の既存のAPIキーはクリアされません。必要に応じて後でOpenAI認証に切り替えることができます',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    '注: Qwen OAuthを使用しても、既存のAPIキーはクリアされません。',
   'Authentication timed out. Please try again.':
     '認証がタイムアウトしました。再度お試しください',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
@@ -731,6 +737,17 @@ export default {
   // Dialogs - Model
   'Select Model': 'モデルを選択',
   '(Press Esc to close)': '(Esc で閉じる)',
+  Modality: 'モダリティ',
+  'Context Window': 'コンテキストウィンドウ',
+  text: 'テキスト',
+  'text-only': 'テキストのみ',
+  image: '画像',
+  pdf: 'PDF',
+  audio: '音声',
+  video: '動画',
+  'not set': '未設定',
+  none: 'なし',
+  unknown: '不明',
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
     'Qwen 3.5 Plus — 効率的なハイブリッドモデル、業界トップクラスのコーディング性能',
   'The latest Qwen Vision model from Alibaba Cloud ModelStudio (version: qwen3-vl-plus-2025-09-23)':
@@ -783,6 +800,27 @@ export default {
   "Starting OAuth authentication for MCP server '{{name}}'...":
     "MCPサーバー '{{name}}' のOAuth認証を開始中...",
   // Startup Tips
+  'Tips:': 'ヒント：',
+  'Use /compress when the conversation gets long to summarize history and free up context.':
+    '会話が長くなったら /compress で履歴を要約し、コンテキストを解放できます。',
+  'Start a fresh idea with /clear or /new; the previous session stays available in history.':
+    '/clear または /new で新しいアイデアを始められます。前のセッションは履歴に残ります。',
+  'Use /bug to submit issues to the maintainers when something goes off.':
+    '問題が発生したら /bug でメンテナーに報告できます。',
+  'Switch auth type quickly with /auth.':
+    '/auth で認証タイプをすばやく切り替えられます。',
+  'You can run any shell commands from Qwen Code using ! (e.g. !ls).':
+    'Qwen Code から ! を使って任意のシェルコマンドを実行できます（例: !ls）。',
+  'Type / to open the command popup; Tab autocompletes slash commands and saved prompts.':
+    '/ を入力してコマンドポップアップを開きます。Tab でスラッシュコマンドと保存済みプロンプトを補完できます。',
+  'You can resume a previous conversation by running qwen --continue or qwen --resume.':
+    'qwen --continue または qwen --resume で前の会話を再開できます。',
+  'You can switch permission mode quickly with Shift+Tab or /approval-mode.':
+    'Shift+Tab または /approval-mode で権限モードをすばやく切り替えられます。',
+  'You can switch permission mode quickly with Tab or /approval-mode.':
+    'Tab または /approval-mode で権限モードをすばやく切り替えられます。',
+  'Try /insight to generate personalized insights from your chat history.':
+    '/insight でチャット履歴からパーソナライズされたインサイトを生成できます。',
   'Tips for getting started:': '始めるためのヒント:',
   '1. Ask questions, edit files, or run commands.':
     '1. 質問したり、ファイルを編集したり、コマンドを実行したりできます',
@@ -891,32 +929,19 @@ export default {
   ],
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    'モデルを手動で設定したい上級ユーザー向け。',
-  'Please configure your models in settings.json:':
-    'settings.json でモデルを設定してください：',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    '環境変数を使用して API キーを設定してください（例：OPENAI_API_KEY）',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "modelProviders['openai']（または他の認証タイプ）にモデル設定を追加してください",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    '各プロバイダーには：id、envKey（必須）、およびオプションの baseUrl、generationConfig が必要です',
-  'Use /model command to select your preferred model from the configured list':
-    '/model コマンドを使用して、設定済みリストからお好みのモデルを選択してください',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    'サポートされている認証タイプ：openai、anthropic、gemini、vertex-ai など',
+  'You can configure your API key and models in settings.json':
+    'settings.json で API キーとモデルを設定できます',
+  'Refer to the documentation for setup instructions':
+    'セットアップ手順はドキュメントを参照してください',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': 'APIキーを入力してください：',
   'API key cannot be empty.': 'APIキーは空にできません。',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    'Coding Plan の API-KEY はこちらで取得できます：',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    'Bailian Coding Plan の新しいモデル設定が利用可能です。今すぐ更新しますか？',
+  'You can get your Coding Plan API key here':
+    'Coding Plan APIキーはこちらで取得できます',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Coding Plan の設定が正常に更新されました。新しいモデルが利用可能になりました。',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -927,32 +952,16 @@ export default {
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (Bailian, 中国)',
-  'Coding Plan (Bailian, Global/Intl)':
-    'Coding Plan (Bailian, グローバル/国際)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    'Bailian Coding PlanのAPIキーを貼り付けるだけで準備完了です！',
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    'Coding Plan (Bailian, グローバル/国際) のAPIキーを貼り付けるだけで準備完了です！',
-  Custom: 'カスタム',
-  'More instructions about configuring `modelProviders` manually.':
-    '`modelProviders`を手動で設定する方法の詳細はこちら。',
-  'Select API-KEY configuration mode:': 'API-KEY設定モードを選択してください：',
-  '(Press Escape to go back)': '(Escapeキーで戻る)',
-  '(Press Enter to submit, Escape to cancel)':
-    '(Enterで送信、Escapeでキャンセル)',
-  'More instructions please check:': '詳細な手順はこちらをご確認ください：',
+  'Select Region for Coding Plan': 'Coding Planのリージョンを選択',
+  'Choose based on where your account is registered':
+    'アカウントの登録先に応じて選択してください',
+  'Enter Coding Plan API Key': 'Coding Plan APIキーを入力',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     '{{region}} の新しいモデル設定が利用可能です。今すぐ更新しますか？',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    'Bailian Coding Plan (中国) の新しいモデル設定が利用可能です。今すぐ更新しますか？',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'Coding Plan (Bailian, グローバル/国際) の新しいモデル設定が利用可能です。今すぐ更新しますか？',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     '{{region}} の設定が正常に更新されました。モデルが "{{model}}" に切り替わりました。',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/i18n/locales/pt.js b/packages/cli/src/i18n/locales/pt.js
index d630879d1..729ebbd74 100644
--- a/packages/cli/src/i18n/locales/pt.js
+++ b/packages/cli/src/i18n/locales/pt.js
@@ -173,6 +173,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter para confirmar, Esc para cancelar',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter para selecionar, ↑↓ para navegar, Esc para voltar',
+  'Enter to submit, Esc to go back': 'Enter para enviar, Esc para voltar',
   'Invalid step: {{step}}': 'Etapa inválida: {{step}}',
   'No subagents found.': 'Nenhum subagente encontrado.',
   "Use '/agents create' to create your first subagent.":
@@ -950,18 +951,22 @@ export default {
   // Dialogs - Auth
   // ============================================================================
   'Get started': 'Começar',
-  'How would you like to authenticate for this project?':
-    'Como você gostaria de se autenticar para este projeto?',
+  'Select Authentication Method': 'Selecionar Método de Autenticação',
   'OpenAI API key is required to use OpenAI authentication.':
     'A chave da API do OpenAI é necessária para usar a autenticação do OpenAI.',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     'Você deve selecionar um método de autenticação para prosseguir. Pressione Ctrl+C novamente para sair.',
-  '(Use Enter to Set Auth)': '(Use Enter para Definir Autenticação)',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Termos de Serviço e Aviso de Privacidade do Qwen Code',
+  'Terms of Services and Privacy Notice':
+    'Termos de Serviço e Aviso de Privacidade',
   'Qwen OAuth': 'Qwen OAuth',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    'Gratuito \u00B7 Até 1.000 solicitações/dia \u00B7 Modelos Qwen mais recentes',
   'Login with QwenChat account to use daily free quota.':
     'Faça login com sua conta QwenChat para usar a cota gratuita diária.',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    'Pago \u00B7 Até 6.000 solicitações/5 hrs \u00B7 Todos os modelos Alibaba Cloud Coding Plan',
+  'Alibaba Cloud Coding Plan': 'Alibaba Cloud Coding Plan',
+  'Bring your own API key': 'Traga sua própria chave API',
   'API-KEY': 'API-KEY',
   'Use coding plan credentials or your own api-keys/providers.':
     'Use credenciais do Coding Plan ou suas próprias chaves API/provedores.',
@@ -989,6 +994,8 @@ export default {
     'Aguardando autenticação Qwen OAuth...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     'Nota: Sua chave de API existente no settings.json não será limpa ao usar o Qwen OAuth. Você pode voltar para a autenticação do OpenAI mais tarde, se necessário.',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    'Nota: Sua chave de API existente não será limpa ao usar o Qwen OAuth.',
   'Authentication timed out. Please try again.':
     'A autenticação expirou. Tente novamente.',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
@@ -1037,6 +1044,17 @@ export default {
   '(default)': '(padrão)',
   '(set)': '(definido)',
   '(not set)': '(não definido)',
+  Modality: 'Modalidade',
+  'Context Window': 'Janela de Contexto',
+  text: 'texto',
+  'text-only': 'somente texto',
+  image: 'imagem',
+  pdf: 'PDF',
+  audio: 'áudio',
+  video: 'vídeo',
+  'not set': 'não definido',
+  none: 'nenhum',
+  unknown: 'desconhecido',
   "Failed to switch model to '{{modelId}}'.\n\n{{error}}":
     "Falha ao trocar o modelo para '{{modelId}}'.\n\n{{error}}",
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
@@ -1132,6 +1150,8 @@ export default {
     'Você pode retomar uma conversa anterior executando qwen --continue ou qwen --resume.',
   'You can switch permission mode quickly with Shift+Tab or /approval-mode.':
     'Você pode alternar o modo de permissão rapidamente com Shift+Tab ou /approval-mode.',
+  'Try /insight to generate personalized insights from your chat history.':
+    'Experimente /insight para gerar insights personalizados do seu histórico de conversas.',
 
   // ============================================================================
   // Exit Screen / Stats
@@ -1394,32 +1414,21 @@ export default {
     'Falha ao abrir o navegador. Confira a galeria de extensões em {{url}}',
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    'Para usuários avançados que desejam configurar modelos manualmente.',
-  'Please configure your models in settings.json:':
-    'Por favor, configure seus modelos em settings.json:',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    'Defina a chave de API via variável de ambiente (ex: OPENAI_API_KEY)',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "Adicione a configuração do modelo a modelProviders['openai'] (ou outros tipos de autenticação)",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    'Cada provedor precisa de: id, envKey (obrigatório), além de baseUrl e generationConfig opcionais',
-  'Use /model command to select your preferred model from the configured list':
-    'Use o comando /model para selecionar seu modelo preferido da lista configurada',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    'Tipos de autenticação suportados: openai, anthropic, gemini, vertex-ai, etc.',
+  'You can configure your API key and models in settings.json':
+    'Você pode configurar sua chave de API e modelos em settings.json',
+  'Refer to the documentation for setup instructions':
+    'Consulte a documentação para instruções de configuração',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': 'Por favor, digite sua chave de API:',
   'API key cannot be empty.': 'A chave de API não pode estar vazia.',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    'Você pode obter sua chave de API exclusiva do Coding Plan aqui:',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    'Novas configurações de modelo estão disponíveis para o Bailian Coding Plan. Atualizar agora?',
+  'You can get your Coding Plan API key here':
+    'Você pode obter sua chave de API do Coding Plan aqui',
+  'New model configurations are available for Alibaba Cloud Coding Plan. Update now?':
+    'Novas configurações de modelo estão disponíveis para o Alibaba Cloud Coding Plan. Atualizar agora?',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Configuração do Coding Plan atualizada com sucesso. Novos modelos agora estão disponíveis.',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -1430,32 +1439,16 @@ export default {
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (Bailian, China)',
-  'Coding Plan (Bailian, Global/Intl)': 'Coding Plan (Bailian, Global/Intl)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    'Cole sua chave de API do Bailian Coding Plan e pronto!',
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    'Cole sua chave de API do Coding Plan (Bailian, Global/Intl) e pronto!',
-  Custom: 'Personalizado',
-  'More instructions about configuring `modelProviders` manually.':
-    'Mais instruções sobre como configurar `modelProviders` manualmente.',
-  'Select API-KEY configuration mode:':
-    'Selecione o modo de configuração da API-KEY:',
-  '(Press Escape to go back)': '(Pressione Escape para voltar)',
-  '(Press Enter to submit, Escape to cancel)':
-    '(Pressione Enter para enviar, Escape para cancelar)',
-  'More instructions please check:': 'Mais instruções, consulte:',
+  'Select Region for Coding Plan': 'Selecionar região do Coding Plan',
+  'Choose based on where your account is registered':
+    'Escolha com base em onde sua conta está registrada',
+  'Enter Coding Plan API Key': 'Inserir chave de API do Coding Plan',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     'Novas configurações de modelo estão disponíveis para o {{region}}. Atualizar agora?',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    'Novas configurações de modelo estão disponíveis para o Bailian Coding Plan (China). Atualizar agora?',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'Novas configurações de modelo estão disponíveis para o Coding Plan (Bailian, Global/Intl). Atualizar agora?',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     'Configuração do {{region}} atualizada com sucesso. Modelo alterado para "{{model}}".',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/i18n/locales/ru.js b/packages/cli/src/i18n/locales/ru.js
index b8b332b76..867de9b9a 100644
--- a/packages/cli/src/i18n/locales/ru.js
+++ b/packages/cli/src/i18n/locales/ru.js
@@ -181,6 +181,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter для подтверждения, Esc для отмены',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter для выбора, ↑↓ для навигации, Esc для возврата',
+  'Enter to submit, Esc to go back': 'Enter для отправки, Esc для возврата',
   'Invalid step: {{step}}': 'Неверный шаг: {{step}}',
   'No subagents found.': 'Подагенты не найдены.',
   "Use '/agents create' to create your first subagent.":
@@ -950,18 +951,22 @@ export default {
   // Диалоги - Авторизация
   // ============================================================================
   'Get started': 'Начать',
-  'How would you like to authenticate for this project?':
-    'Как вы хотите авторизоваться для этого проекта?',
+  'Select Authentication Method': 'Выберите метод авторизации',
   'OpenAI API key is required to use OpenAI authentication.':
     'Для использования авторизации OpenAI требуется ключ API OpenAI.',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     'Вы должны выбрать метод авторизации для продолжения. Нажмите Ctrl+C снова для выхода.',
-  '(Use Enter to Set Auth)': '(Enter для установки авторизации)',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Условия обслуживания и уведомление о конфиденциальности для Qwen Code',
+  'Terms of Services and Privacy Notice':
+    'Условия обслуживания и уведомление о конфиденциальности',
   'Qwen OAuth': 'Qwen OAuth',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    'Бесплатно \u00B7 До 1 000 запросов/день \u00B7 Новейшие модели Qwen',
   'Login with QwenChat account to use daily free quota.':
     'Войдите с помощью аккаунта QwenChat, чтобы использовать ежедневную бесплатную квоту.',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    'Платно \u00B7 До 6 000 запросов/5 часов \u00B7 Все модели Alibaba Cloud Coding Plan',
+  'Alibaba Cloud Coding Plan': 'Alibaba Cloud Coding Plan',
+  'Bring your own API key': 'Используйте свой API-ключ',
   'API-KEY': 'API-KEY',
   'Use coding plan credentials or your own api-keys/providers.':
     'Используйте учетные данные Coding Plan или свои собственные API-ключи/провайдеры.',
@@ -989,6 +994,8 @@ export default {
     'Ожидание авторизации Qwen OAuth...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     'Примечание: Ваш существующий ключ API в settings.json не будет удален при использовании Qwen OAuth. Вы можете переключиться обратно на авторизацию OpenAI позже при необходимости.',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    'Примечание: Ваш существующий ключ API не будет удален при использовании Qwen OAuth.',
   'Authentication timed out. Please try again.':
     'Время ожидания авторизации истекло. Пожалуйста, попробуйте снова.',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
@@ -1036,6 +1043,17 @@ export default {
   '(default)': '(по умолчанию)',
   '(set)': '(установлено)',
   '(not set)': '(не задано)',
+  Modality: 'Модальность',
+  'Context Window': 'Контекстное окно',
+  text: 'текст',
+  'text-only': 'только текст',
+  image: 'изображение',
+  pdf: 'PDF',
+  audio: 'аудио',
+  video: 'видео',
+  'not set': 'не задано',
+  none: 'нет',
+  unknown: 'неизвестно',
   "Failed to switch model to '{{modelId}}'.\n\n{{error}}":
     "Не удалось переключиться на модель '{{modelId}}'.\n\n{{error}}",
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
@@ -1384,38 +1402,43 @@ export default {
     'Открываем страницу расширений в браузере: {{url}}',
   'Failed to open browser. Check out the extensions gallery at {{url}}':
     'Не удалось открыть браузер. Посетите галерею расширений по адресу {{url}}',
+  'Use /compress when the conversation gets long to summarize history and free up context.':
+    'Используйте /compress, когда разговор становится длинным, чтобы подвести итог и освободить контекст.',
+  'Start a fresh idea with /clear or /new; the previous session stays available in history.':
+    'Начните новую идею с /clear или /new; предыдущая сессия останется в истории.',
+  'Use /bug to submit issues to the maintainers when something goes off.':
+    'Используйте /bug, чтобы сообщить о проблемах разработчикам.',
+  'Switch auth type quickly with /auth.':
+    'Быстро переключите тип аутентификации с помощью /auth.',
+  'You can run any shell commands from Qwen Code using ! (e.g. !ls).':
+    'Вы можете выполнять любые shell-команды в Qwen Code с помощью ! (например, !ls).',
+  'Type / to open the command popup; Tab autocompletes slash commands and saved prompts.':
+    'Введите /, чтобы открыть меню команд; Tab автодополняет слэш-команды и сохранённые промпты.',
+  'You can resume a previous conversation by running qwen --continue or qwen --resume.':
+    'Вы можете продолжить предыдущий разговор, запустив qwen --continue или qwen --resume.',
   'You can switch permission mode quickly with Shift+Tab or /approval-mode.':
     'Вы можете быстро переключать режим разрешений с помощью Shift+Tab или /approval-mode.',
   'You can switch permission mode quickly with Tab or /approval-mode.':
     'Вы можете быстро переключать режим разрешений с помощью Tab или /approval-mode.',
+  'Try /insight to generate personalized insights from your chat history.':
+    'Попробуйте /insight, чтобы получить персонализированные выводы из истории чатов.',
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    'Для продвинутых пользователей, которые хотят настраивать модели вручную.',
-  'Please configure your models in settings.json:':
-    'Пожалуйста, настройте ваши модели в settings.json:',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    'Установите ключ API через переменную окружения (например, OPENAI_API_KEY)',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "Добавьте конфигурацию модели в modelProviders['openai'] (или другие типы аутентификации)",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    'Каждому провайдеру нужны: id, envKey (обязательно), а также опциональные baseUrl, generationConfig',
-  'Use /model command to select your preferred model from the configured list':
-    'Используйте команду /model, чтобы выбрать предпочитаемую модель из настроенного списка',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    'Поддерживаемые типы аутентификации: openai, anthropic, gemini, vertex-ai и др.',
+  'You can configure your API key and models in settings.json':
+    'Вы можете настроить API-ключ и модели в settings.json',
+  'Refer to the documentation for setup instructions':
+    'Инструкции по настройке см. в документации',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': 'Пожалуйста, введите ваш API-ключ:',
   'API key cannot be empty.': 'API-ключ не может быть пустым.',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    'Получите свой эксклюзивный API-KEY Coding Plan здесь:',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    'Доступны новые конфигурации моделей для Bailian Coding Plan. Обновить сейчас?',
+  'You can get your Coding Plan API key here':
+    'Вы можете получить API-ключ Coding Plan здесь',
+  'New model configurations are available for Alibaba Cloud Coding Plan. Update now?':
+    'Доступны новые конфигурации моделей для Alibaba Cloud Coding Plan. Обновить сейчас?',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Конфигурация Coding Plan успешно обновлена. Новые модели теперь доступны.',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -1426,32 +1449,16 @@ export default {
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (Bailian, Китай)',
-  'Coding Plan (Bailian, Global/Intl)':
-    'Coding Plan (Bailian, Глобальный/Международный)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    'Вставьте ваш API-ключ Bailian Coding Plan и всё готово!',
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    'Вставьте ваш API-ключ Coding Plan (Bailian, Глобальный/Международный) и всё готово!',
-  Custom: 'Пользовательский',
-  'More instructions about configuring `modelProviders` manually.':
-    'Дополнительные инструкции по ручной настройке `modelProviders`.',
-  'Select API-KEY configuration mode:': 'Выберите режим конфигурации API-KEY:',
-  '(Press Escape to go back)': '(Нажмите Escape для возврата)',
-  '(Press Enter to submit, Escape to cancel)':
-    '(Нажмите Enter для отправки, Escape для отмены)',
-  'More instructions please check:': 'Дополнительные инструкции см.:',
+  'Select Region for Coding Plan': 'Выберите регион Coding Plan',
+  'Choose based on where your account is registered':
+    'Выберите в зависимости от места регистрации вашего аккаунта',
+  'Enter Coding Plan API Key': 'Введите API-ключ Coding Plan',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     'Доступны новые конфигурации моделей для {{region}}. Обновить сейчас?',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    'Доступны новые конфигурации моделей для Bailian Coding Plan (Китай). Обновить сейчас?',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'Доступны новые конфигурации моделей для Coding Plan (Bailian, Глобальный/Международный). Обновить сейчас?',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     'Конфигурация {{region}} успешно обновлена. Модель переключена на "{{model}}".',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/i18n/locales/zh.js b/packages/cli/src/i18n/locales/zh.js
index 02ae707b6..5bc2bef92 100644
--- a/packages/cli/src/i18n/locales/zh.js
+++ b/packages/cli/src/i18n/locales/zh.js
@@ -173,6 +173,7 @@ export default {
   'Enter to confirm, Esc to cancel': 'Enter 确认，Esc 取消',
   'Enter to select, ↑↓ to navigate, Esc to go back':
     'Enter 选择，↑↓ 导航，Esc 返回',
+  'Enter to submit, Esc to go back': 'Enter 提交，Esc 返回',
   'Invalid step: {{step}}': '无效步骤: {{step}}',
   'No subagents found.': '未找到子智能体。',
   "Use '/agents create' to create your first subagent.":
@@ -882,18 +883,21 @@ export default {
   // Dialogs - Auth
   // ============================================================================
   'Get started': '开始使用',
-  'How would you like to authenticate for this project?':
-    '您希望如何为此项目进行身份验证？',
+  'Select Authentication Method': '选择认证方式',
   'OpenAI API key is required to use OpenAI authentication.':
     '使用 OpenAI 认证需要 OpenAI API 密钥',
   'You must select an auth method to proceed. Press Ctrl+C again to exit.':
     '您必须选择认证方法才能继续。再次按 Ctrl+C 退出',
-  '(Use Enter to Set Auth)': '（使用 Enter 设置认证）',
-  'Terms of Services and Privacy Notice for Qwen Code':
-    'Qwen Code 的服务条款和隐私声明',
+  'Terms of Services and Privacy Notice': '服务条款和隐私声明',
   'Qwen OAuth': 'Qwen OAuth (免费)',
+  'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models':
+    '免费 \u00B7 每天最多 1,000 次请求 \u00B7 Qwen 最新模型',
   'Login with QwenChat account to use daily free quota.':
     '使用 QwenChat 账号登录，享受每日免费额度。',
+  'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models':
+    '付费 \u00B7 每 5 小时最多 6,000 次请求 \u00B7 支持阿里云百炼 Coding Plan 全部模型',
+  'Alibaba Cloud Coding Plan': '阿里云百炼 Coding Plan',
+  'Bring your own API key': '使用自己的 API 密钥',
   'Use coding plan credentials or your own api-keys/providers.':
     '使用 Coding Plan 凭证或您自己的 API 密钥/提供商。',
   OpenAI: 'OpenAI',
@@ -917,6 +921,8 @@ export default {
   'Waiting for Qwen OAuth authentication...': '正在等待 Qwen OAuth 认证...',
   'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.':
     '注意：使用 Qwen OAuth 时，settings.json 中现有的 API 密钥不会被清除。如果需要，您可以稍后切换回 OpenAI 认证。',
+  'Note: Your existing API key will not be cleared when using Qwen OAuth.':
+    '注意：使用 Qwen OAuth 时，现有的 API 密钥不会被清除。',
   'Authentication timed out. Please try again.': '认证超时。请重试。',
   'Waiting for auth... (Press ESC or CTRL+C to cancel)':
     '正在等待认证...（按 ESC 或 CTRL+C 取消）',
@@ -961,6 +967,17 @@ export default {
   '(default)': '(默认)',
   '(set)': '(已设置)',
   '(not set)': '(未设置)',
+  Modality: '模态',
+  'Context Window': '上下文窗口',
+  text: '文本',
+  'text-only': '纯文本',
+  image: '图像',
+  pdf: 'PDF',
+  audio: '音频',
+  video: '视频',
+  'not set': '未设置',
+  none: '无',
+  unknown: '未知',
   "Failed to switch model to '{{modelId}}'.\n\n{{error}}":
     "无法切换到模型 '{{modelId}}'.\n\n{{error}}",
   'Qwen 3.5 Plus — efficient hybrid model with leading coding performance':
@@ -1052,6 +1069,8 @@ export default {
     '按 Shift+Tab 或输入 /approval-mode 可快速切换权限模式。',
   'You can switch permission mode quickly with Tab or /approval-mode.':
     '按 Tab 或输入 /approval-mode 可快速切换权限模式。',
+  'Try /insight to generate personalized insights from your chat history.':
+    '试试 /insight，从聊天记录中生成个性化洞察。',
 
   // ============================================================================
   // Exit Screen / Stats
@@ -1215,18 +1234,22 @@ export default {
   'Rate limit error: {{reason}}': '触发限流：{{reason}}',
   'Retrying in {{seconds}} seconds… (attempt {{attempt}}/{{maxRetries}})':
     '将于 {{seconds}} 秒后重试…（第 {{attempt}}/{{maxRetries}} 次）',
+  'Press Ctrl+Y to retry': '按 Ctrl+Y 重试。',
+  'No failed request to retry.': '没有可重试的失败请求。',
+  'to retry last request': '重试上一次请求',
 
   // ============================================================================
   // Coding Plan Authentication
   // ============================================================================
-  'Please enter your API key:': '请输入您的 API Key：',
   'API key cannot be empty.': 'API Key 不能为空。',
-  'You can get your exclusive Coding Plan API-KEY here:':
-    '您可以在这里获取专属的 Coding Plan API-KEY：',
+  'Invalid API key. Coding Plan API keys start with "sk-sp-". Please check.':
+    '无效的 API Key，Coding Plan API Key 均以 "sk-sp-" 开头，请检查',
+  'You can get your Coding Plan API key here':
+    '您可以在这里获取 Coding Plan API Key',
   'API key is stored in settings.env. You can migrate it to a .env file for better security.':
     'API Key 已存储在 settings.env 中。您可以将其迁移到 .env 文件以获得更好的安全性。',
-  'New model configurations are available for Bailian Coding Plan. Update now?':
-    '百炼 Coding Plan 有新模型配置可用。是否立即更新？',
+  'New model configurations are available for Alibaba Cloud Coding Plan. Update now?':
+    '阿里云百炼 Coding Plan 有新模型配置可用。是否立即更新？',
   'Coding Plan configuration updated successfully. New models are now available.':
     'Coding Plan 配置更新成功。新模型现已可用。',
   'Coding Plan API key not found. Please re-authenticate with Coding Plan.':
@@ -1235,51 +1258,25 @@ export default {
     '更新 Coding Plan 配置失败：{{message}}',
 
   // ============================================================================
-  // Custom API-KEY Configuration
+  // Custom API Key Configuration
   // ============================================================================
-  'For advanced users who want to configure models manually.':
-    '适合需要手动配置模型的高级用户。',
-  'Please configure your models in settings.json:':
-    '请在 settings.json 中配置您的模型：',
-  'Set API key via environment variable (e.g., OPENAI_API_KEY)':
-    '通过环境变量设置 API Key（例如：OPENAI_API_KEY）',
-  "Add model configuration to modelProviders['openai'] (or other auth types)":
-    "将模型配置添加到 modelProviders['openai']（或其他认证类型）",
-  'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig':
-    '每个提供商需要：id、envKey（必需），以及可选的 baseUrl、generationConfig',
-  'Use /model command to select your preferred model from the configured list':
-    '使用 /model 命令从配置列表中选择您偏好的模型',
-  'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.':
-    '支持的认证类型：openai、anthropic、gemini、vertex-ai 等',
-  'More instructions please check:': '更多说明请查看：',
+  'You can configure your API key and models in settings.json':
+    '您可以在 settings.json 中配置 API Key 和模型',
+  'Refer to the documentation for setup instructions': '请参考文档了解配置说明',
 
   // ============================================================================
   // Auth Dialog - View Titles and Labels
   // ============================================================================
-  'API-KEY': 'API-KEY',
-  'Coding Plan': 'Coding Plan',
-  'Coding Plan (Bailian, China)': 'Coding Plan (百炼, 中国)',
-  'Coding Plan (Bailian, Global/Intl)': 'Coding Plan (百炼, 全球/国际)',
-  "Paste your api key of Bailian Coding Plan and you're all set!":
-    '粘贴您的百炼 Coding Plan API Key，即可完成设置！',
-  "Paste your api key of Coding Plan (Bailian, Global/Intl) and you're all set!":
-    '粘贴您的 Coding Plan (百炼, 全球/国际) API Key，即可完成设置！',
-  Custom: '自定义',
-  'More instructions about configuring `modelProviders` manually.':
-    '关于手动配置 `modelProviders` 的更多说明。',
-  'Select API-KEY configuration mode:': '选择 API-KEY 配置模式：',
-  '(Press Escape to go back)': '(按 Escape 键返回)',
-  '(Press Enter to submit, Escape to cancel)': '(按 Enter 提交，Escape 取消)',
+  'Select Region for Coding Plan': '选择 Coding Plan 区域',
+  'Choose based on where your account is registered':
+    '请根据您的账号注册地区选择',
+  'Enter Coding Plan API Key': '输入 Coding Plan API Key',
 
   // ============================================================================
   // Coding Plan International Updates
   // ============================================================================
   'New model configurations are available for {{region}}. Update now?':
     '{{region}} 有新的模型配置可用。是否立即更新？',
-  'New model configurations are available for Bailian Coding Plan (China). Update now?':
-    '百炼 Coding Plan (中国) 有新的模型配置可用。是否立即更新？',
-  'New model configurations are available for Coding Plan (Bailian, Global/Intl). Update now?':
-    'Coding Plan (百炼, 全球/国际) 有新的模型配置可用。是否立即更新？',
   '{{region}} configuration updated successfully. Model switched to "{{model}}".':
     '{{region}} 配置更新成功。模型已切换至 "{{model}}"。',
   'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).':
diff --git a/packages/cli/src/ui/AppContainer.test.tsx b/packages/cli/src/ui/AppContainer.test.tsx
index 1edec79f9..9e9d4f673 100644
--- a/packages/cli/src/ui/AppContainer.test.tsx
+++ b/packages/cli/src/ui/AppContainer.test.tsx
@@ -209,6 +209,7 @@ describe('AppContainer State Management', () => {
       pendingHistoryItems: [],
       thought: null,
       cancelOngoingRequest: vi.fn(),
+      retryLastPrompt: vi.fn(),
     });
     mockedUseVim.mockReturnValue({ handleInput: vi.fn() });
     mockedUseFolderTrust.mockReturnValue({
@@ -607,6 +608,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: { subject: thoughtSubject },
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -652,6 +654,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: null,
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -698,6 +701,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: { subject: thoughtSubject },
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -744,6 +748,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: { subject: shortTitle },
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -794,6 +799,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: { subject: title },
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -841,6 +847,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: null,
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
       });
 
       // Act: Render the container
@@ -882,6 +889,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: null,
         cancelOngoingRequest: vi.fn(),
+        retryLastPrompt: vi.fn(),
         activePtyId: 'some-id',
       });
 
@@ -1013,6 +1021,7 @@ describe('AppContainer State Management', () => {
         pendingHistoryItems: [],
         thought: null,
         cancelOngoingRequest: mockCancelOngoingRequest,
+        retryLastPrompt: vi.fn(),
       });
 
       const mockHandleSlashCommand = vi.fn();
diff --git a/packages/cli/src/ui/AppContainer.tsx b/packages/cli/src/ui/AppContainer.tsx
index 2ab8eeec4..781aab375 100644
--- a/packages/cli/src/ui/AppContainer.tsx
+++ b/packages/cli/src/ui/AppContainer.tsx
@@ -629,6 +629,7 @@ export const AppContainer = (props: AppContainerProps) => {
     pendingHistoryItems: pendingGeminiHistoryItems,
     thought,
     cancelOngoingRequest,
+    retryLastPrompt,
     handleApprovalModeChange,
     activePtyId,
     loopDetectionConfirmationRequest,
@@ -1532,6 +1533,7 @@ export const AppContainer = (props: AppContainerProps) => {
       onSuggestionsVisibilityChange: setHasSuggestionsVisible,
       refreshStatic,
       handleFinalSubmit,
+      handleRetryLastPrompt: retryLastPrompt,
       handleClearScreen,
       // Welcome back dialog
       handleWelcomeBackSelection,
@@ -1575,6 +1577,7 @@ export const AppContainer = (props: AppContainerProps) => {
       handleEscapePromptChange,
       refreshStatic,
       handleFinalSubmit,
+      retryLastPrompt,
       handleClearScreen,
       handleWelcomeBackSelection,
       handleWelcomeBackClose,
diff --git a/packages/cli/src/ui/auth/AuthDialog.test.tsx b/packages/cli/src/ui/auth/AuthDialog.test.tsx
index a975a599e..90b15c968 100644
--- a/packages/cli/src/ui/auth/AuthDialog.test.tsx
+++ b/packages/cli/src/ui/auth/AuthDialog.test.tsx
@@ -32,6 +32,7 @@ const createMockUIActions = (overrides: Partial<UIActions> = {}): UIActions => {
   // AuthDialog only uses handleAuthSelect
   const baseActions = {
     handleAuthSelect: vi.fn(),
+    handleRetryLastPrompt: vi.fn(),
   } as Partial<UIActions>;
 
   return {
@@ -169,9 +170,9 @@ describe('AuthDialog', () => {
 
       const { lastFrame } = renderAuthDialog(settings);
 
-      // Since the auth dialog shows API-KEY option now,
+      // Since the auth dialog shows API Key option now,
       // it won't show GEMINI_API_KEY messages
-      expect(lastFrame()).toContain('API-KEY');
+      expect(lastFrame()).toContain('API Key');
     });
 
     it('should not show the GEMINI_API_KEY message if QWEN_DEFAULT_AUTH_TYPE is set to something else', () => {
@@ -257,9 +258,9 @@ describe('AuthDialog', () => {
 
       const { lastFrame } = renderAuthDialog(settings);
 
-      // Since the auth dialog shows API-KEY option now,
+      // Since the auth dialog shows API Key option now,
       // it won't show GEMINI_API_KEY messages
-      expect(lastFrame()).toContain('API-KEY');
+      expect(lastFrame()).toContain('API Key');
     });
   });
 
@@ -305,7 +306,7 @@ describe('AuthDialog', () => {
       const { lastFrame } = renderAuthDialog(settings);
 
       // QWEN_OAUTH is the first option, so it should be selected
-      expect(lastFrame()).toContain('● 1. Qwen OAuth');
+      expect(lastFrame()).toContain('Qwen OAuth');
     });
 
     it('should fall back to default if QWEN_DEFAULT_AUTH_TYPE is not set', () => {
@@ -345,7 +346,7 @@ describe('AuthDialog', () => {
       const { lastFrame } = renderAuthDialog(settings);
 
       // Default is Qwen OAuth (first option)
-      expect(lastFrame()).toContain('● 1. Qwen OAuth');
+      expect(lastFrame()).toContain('Qwen OAuth');
     });
 
     it('should show an error and fall back to default if QWEN_DEFAULT_AUTH_TYPE is invalid', () => {
@@ -388,7 +389,7 @@ describe('AuthDialog', () => {
 
       // Since the auth dialog doesn't show QWEN_DEFAULT_AUTH_TYPE errors anymore,
       // it will just show the default Qwen OAuth option
-      expect(lastFrame()).toContain('● 1. Qwen OAuth');
+      expect(lastFrame()).toContain('Qwen OAuth');
     });
   });
 
diff --git a/packages/cli/src/ui/auth/AuthDialog.tsx b/packages/cli/src/ui/auth/AuthDialog.tsx
index 7f43fa582..309e77adf 100644
--- a/packages/cli/src/ui/auth/AuthDialog.tsx
+++ b/packages/cli/src/ui/auth/AuthDialog.tsx
@@ -11,16 +11,19 @@ import { Box, Text } from 'ink';
 import Link from 'ink-link';
 import { theme } from '../semantic-colors.js';
 import { useKeypress } from '../hooks/useKeypress.js';
-import { RadioButtonSelect } from '../components/shared/RadioButtonSelect.js';
+import { DescriptiveRadioButtonSelect } from '../components/shared/DescriptiveRadioButtonSelect.js';
 import { ApiKeyInput } from '../components/ApiKeyInput.js';
 import { useUIState } from '../contexts/UIStateContext.js';
 import { useUIActions } from '../contexts/UIActionsContext.js';
 import { useConfig } from '../contexts/ConfigContext.js';
 import { t } from '../../i18n/index.js';
-import { CodingPlanRegion } from '../../constants/codingPlan.js';
+import {
+  CodingPlanRegion,
+  isCodingPlanConfig,
+} from '../../constants/codingPlan.js';
 
 const MODEL_PROVIDERS_DOCUMENTATION_URL =
-  'https://qwenlm.github.io/qwen-code-docs/en/users/configuration/settings/#modelproviders';
+  'https://qwenlm.github.io/qwen-code-docs/en/users/configuration/model-providers/';
 
 function parseDefaultAuthType(
   defaultAuthType: string | undefined,
@@ -34,11 +37,11 @@ function parseDefaultAuthType(
   return null;
 }
 
-// Sub-mode types for API-KEY authentication
-type ApiKeySubMode = 'coding-plan' | 'coding-plan-intl' | 'custom';
+// Main menu option type
+type MainOption = typeof AuthType.QWEN_OAUTH | 'CODING_PLAN' | 'API_KEY';
 
 // View level for navigation
-type ViewLevel = 'main' | 'api-key-sub' | 'api-key-input' | 'custom-info';
+type ViewLevel = 'main' | 'region-select' | 'api-key-input' | 'custom-info';
 
 export function AuthDialog(): React.JSX.Element {
   const { pendingAuthType, authError } = useUIState();
@@ -50,58 +53,107 @@ export function AuthDialog(): React.JSX.Element {
   const config = useConfig();
 
   const [errorMessage, setErrorMessage] = useState<string | null>(null);
-  const [selectedIndex, setSelectedIndex] = useState<number | null>(null);
   const [viewLevel, setViewLevel] = useState<ViewLevel>('main');
-  const [apiKeySubModeIndex, setApiKeySubModeIndex] = useState<number>(0);
+  const [regionIndex, setRegionIndex] = useState<number>(0);
   const [region, setRegion] = useState<CodingPlanRegion>(
     CodingPlanRegion.CHINA,
   );
 
-  // Main authentication entries
+  // Main authentication entries (flat three-option layout)
   const mainItems = [
     {
       key: AuthType.QWEN_OAUTH,
+      title: t('Qwen OAuth'),
       label: t('Qwen OAuth'),
-      value: AuthType.QWEN_OAUTH,
+      description: t(
+        'Free \u00B7 Up to 1,000 requests/day \u00B7 Qwen latest models',
+      ),
+      value: AuthType.QWEN_OAUTH as MainOption,
     },
     {
-      key: 'API-KEY',
-      label: t('API-KEY'),
-      value: 'API-KEY' as const,
+      key: 'CODING_PLAN',
+      title: t('Alibaba Cloud Coding Plan'),
+      label: t('Alibaba Cloud Coding Plan'),
+      description: t(
+        'Paid \u00B7 Up to 6,000 requests/5 hrs \u00B7 All Alibaba Cloud Coding Plan Models',
+      ),
+      value: 'CODING_PLAN' as MainOption,
+    },
+    {
+      key: 'API_KEY',
+      title: t('API Key'),
+      label: t('API Key'),
+      description: t('Bring your own API key'),
+      value: 'API_KEY' as MainOption,
     },
   ];
 
-  // API-KEY sub-mode entries
-  const apiKeySubItems = [
+  // Region selection entries (shown after selecting Alibaba Cloud Coding Plan)
+  const regionItems = [
     {
-      key: 'coding-plan',
-      label: t('Coding Plan (Bailian, China)'),
-      value: 'coding-plan' as ApiKeySubMode,
+      key: 'china',
+      title: '阿里云百炼 (aliyun.com)',
+      label: '阿里云百炼 (aliyun.com)',
+      description: (
+        <Link
+          url="https://help.aliyun.com/zh/model-studio/coding-plan"
+          fallback={false}
+        >
+          <Text color={theme.text.secondary}>
+            https://help.aliyun.com/zh/model-studio/coding-plan
+          </Text>
+        </Link>
+      ),
+      value: CodingPlanRegion.CHINA,
     },
     {
-      key: 'coding-plan-intl',
-      label: t('Coding Plan (Bailian, Global/Intl)'),
-      value: 'coding-plan-intl' as ApiKeySubMode,
-    },
-    {
-      key: 'custom',
-      label: t('Custom'),
-      value: 'custom' as ApiKeySubMode,
+      key: 'global',
+      title: 'Alibaba Cloud (alibabacloud.com)',
+      label: 'Alibaba Cloud (alibabacloud.com)',
+      description: (
+        <Link
+          url="https://www.alibabacloud.com/help/en/model-studio/coding-plan"
+          fallback={false}
+        >
+          <Text color={theme.text.secondary}>
+            https://www.alibabacloud.com/help/en/model-studio/coding-plan
+          </Text>
+        </Link>
+      ),
+      value: CodingPlanRegion.GLOBAL,
     },
   ];
 
+  // Map an AuthType to the corresponding main menu option.
+  // QWEN_OAUTH maps directly; any other auth type maps to CODING_PLAN only
+  // if the current config actually uses a Coding Plan baseUrl+envKey,
+  // otherwise it maps to API_KEY.
+  const contentGenConfig = config.getContentGeneratorConfig();
+  const isCurrentlyCodingPlan =
+    isCodingPlanConfig(
+      contentGenConfig?.baseUrl,
+      contentGenConfig?.apiKeyEnvKey,
+    ) !== false;
+
+  const authTypeToMainOption = (authType: AuthType): MainOption => {
+    if (authType === AuthType.QWEN_OAUTH) return AuthType.QWEN_OAUTH;
+    if (authType === AuthType.USE_OPENAI && isCurrentlyCodingPlan)
+      return 'CODING_PLAN';
+    return 'API_KEY';
+  };
+
   const initialAuthIndex = Math.max(
     0,
     mainItems.findIndex((item) => {
       // Priority 1: pendingAuthType
       if (pendingAuthType) {
-        return item.value === pendingAuthType;
+        return item.value === authTypeToMainOption(pendingAuthType);
       }
 
       // Priority 2: config.getAuthType() - the source of truth
       const currentAuthType = config.getAuthType();
       if (currentAuthType) {
-        return item.value === currentAuthType;
+        return item.value === authTypeToMainOption(currentAuthType);
       }
 
       // Priority 3: QWEN_DEFAULT_AUTH_TYPE env var
@@ -109,7 +161,7 @@ export function AuthDialog(): React.JSX.Element {
         process.env['QWEN_DEFAULT_AUTH_TYPE'],
       );
       if (defaultAuthType) {
-        return item.value === defaultAuthType;
+        return item.value === authTypeToMainOption(defaultAuthType);
       }
 
       // Priority 4: default to QWEN_OAUTH
@@ -117,21 +169,19 @@ export function AuthDialog(): React.JSX.Element {
     }),
   );
 
-  const hasApiKey = Boolean(config.getContentGeneratorConfig()?.apiKey);
-  const currentSelectedAuthType =
-    selectedIndex !== null
-      ? mainItems[selectedIndex]?.value
-      : mainItems[initialAuthIndex]?.value;
-
-  const handleMainSelect = async (
-    value: (typeof mainItems)[number]['value'],
-  ) => {
+  const handleMainSelect = async (value: MainOption) => {
     setErrorMessage(null);
     onAuthError(null);
 
-    if (value === 'API-KEY') {
-      // Navigate to API-KEY sub-mode selection
-      setViewLevel('api-key-sub');
+    if (value === 'CODING_PLAN') {
+      // Navigate to region selection
+      setViewLevel('region-select');
+      return;
+    }
+
+    if (value === 'API_KEY') {
+      // Navigate directly to custom API key info
+      setViewLevel('custom-info');
       return;
     }
 
@@ -139,19 +189,11 @@ export function AuthDialog(): React.JSX.Element {
     await onAuthSelect(value);
   };
 
-  const handleApiKeySubSelect = async (subMode: ApiKeySubMode) => {
+  const handleRegionSelect = async (selectedRegion: CodingPlanRegion) => {
     setErrorMessage(null);
     onAuthError(null);
-
-    if (subMode === 'coding-plan') {
-      setRegion(CodingPlanRegion.CHINA);
-      setViewLevel('api-key-input');
-    } else if (subMode === 'coding-plan-intl') {
-      setRegion(CodingPlanRegion.GLOBAL);
-      setViewLevel('api-key-input');
-    } else {
-      setViewLevel('custom-info');
-    }
+    setRegion(selectedRegion);
+    setViewLevel('api-key-input');
   };
 
   const handleApiKeyInputSubmit = async (apiKey: string) => {
@@ -170,12 +212,10 @@ export function AuthDialog(): React.JSX.Element {
     setErrorMessage(null);
     onAuthError(null);
 
-    if (viewLevel === 'api-key-sub') {
+    if (viewLevel === 'region-select' || viewLevel === 'custom-info') {
       setViewLevel('main');
-      // Reset selectedIndex to ensure UI syncs with initialAuthIndex
-      setSelectedIndex(null);
-    } else if (viewLevel === 'api-key-input' || viewLevel === 'custom-info') {
-      setViewLevel('api-key-sub');
+    } else if (viewLevel === 'api-key-input') {
+      setViewLevel('region-select');
     }
   };
 
@@ -183,7 +223,7 @@ export function AuthDialog(): React.JSX.Element {
     (key) => {
       if (key.name === 'escape') {
         // Handle Escape based on current view level
-        if (viewLevel === 'api-key-sub') {
+        if (viewLevel === 'region-select') {
           handleGoBack();
           return;
         }
@@ -215,62 +255,39 @@ export function AuthDialog(): React.JSX.Element {
   const renderMainView = () => (
     <>
       <Box marginTop={1}>
-        <Text>{t('How would you like to authenticate for this project?')}</Text>
-      </Box>
-      <Box marginTop={1}>
-        <RadioButtonSelect
+        <DescriptiveRadioButtonSelect
           items={mainItems}
           initialIndex={initialAuthIndex}
           onSelect={handleMainSelect}
-          onHighlight={(value) => {
-            const index = mainItems.findIndex((item) => item.value === value);
-            setSelectedIndex(index);
-          }}
+          itemGap={1}
         />
       </Box>
-      <Box marginTop={1} paddingLeft={2}>
-        <Text color={theme.text.secondary}>
-          {currentSelectedAuthType === AuthType.QWEN_OAUTH
-            ? t('Login with QwenChat account to use daily free quota.')
-            : t('Use coding plan credentials or your own api-keys/providers.')}
-        </Text>
-      </Box>
     </>
   );
 
-  // Render API-KEY sub-mode selection
-  const renderApiKeySubView = () => (
+  // Render region selection for Alibaba Cloud Coding Plan
+  const renderRegionSelectView = () => (
     <>
       <Box marginTop={1}>
-        <Text>{t('Select API-KEY configuration mode:')}</Text>
-      </Box>
-      <Box marginTop={1}>
-        <RadioButtonSelect
-          items={apiKeySubItems}
-          initialIndex={apiKeySubModeIndex}
-          onSelect={handleApiKeySubSelect}
-          onHighlight={(value) => {
-            const index = apiKeySubItems.findIndex(
-              (item) => item.value === value,
-            );
-            setApiKeySubModeIndex(index);
-          }}
-        />
-      </Box>
-      <Box marginTop={1} paddingLeft={2}>
-        <Text color={theme.text.secondary}>
-          {apiKeySubItems[apiKeySubModeIndex]?.value === 'custom'
-            ? t(
-                'More instructions about configuring `modelProviders` manually.',
-              )
-            : t(
-                "Paste your api key of Bailian Coding Plan and you're all set!",
-              )}
+        <Text color={theme.text.primary}>
+          {t('Choose based on where your account is registered')}
         </Text>
       </Box>
+      <Box marginTop={1}>
+        <DescriptiveRadioButtonSelect
+          items={regionItems}
+          initialIndex={regionIndex}
+          onSelect={handleRegionSelect}
+          onHighlight={(value) => {
+            const index = regionItems.findIndex((item) => item.value === value);
+            setRegionIndex(index);
+          }}
+          itemGap={1}
+        />
+      </Box>
       <Box marginTop={1}>
         <Text color={theme?.text?.secondary}>
-          {t('(Press Escape to go back)')}
+          {t('Enter to select, ↑↓ to navigate, Esc to go back')}
         </Text>
       </Box>
     </>
@@ -291,68 +308,22 @@ export function AuthDialog(): React.JSX.Element {
   const renderCustomInfoView = () => (
     <>
       <Box marginTop={1}>
-        <Text bold>{t('Custom API-KEY Configuration')}</Text>
-      </Box>
-      <Box marginTop={1}>
-        <Text>
-          {t('For advanced users who want to configure models manually.')}
+        <Text color={theme.text.primary}>
+          {t('You can configure your API key and models in settings.json')}
         </Text>
       </Box>
       <Box marginTop={1}>
-        <Text>{t('Please configure your models in settings.json:')}</Text>
-      </Box>
-      <Box marginTop={1} paddingLeft={2}>
-        <Text color={theme.status.warning}>
-          1. {t('Set API key via environment variable (e.g., OPENAI_API_KEY)')}
-        </Text>
-      </Box>
-      <Box marginTop={0} paddingLeft={2}>
-        <Text color={theme.status.warning}>
-          2.{' '}
-          {t(
-            "Add model configuration to modelProviders['openai'] (or other auth types)",
-          )}
-        </Text>
-      </Box>
-      <Box marginTop={0} paddingLeft={2}>
-        <Text color={theme.status.warning}>
-          3.{' '}
-          {t(
-            'Each provider needs: id, envKey (required), plus optional baseUrl, generationConfig',
-          )}
-        </Text>
-      </Box>
-      <Box marginTop={0} paddingLeft={2}>
-        <Text color={theme.status.warning}>
-          4.{' '}
-          {t(
-            'Use /model command to select your preferred model from the configured list',
-          )}
-        </Text>
-      </Box>
-      <Box marginTop={1}>
-        <Text color={theme?.text?.secondary}>
-          {t(
-            'Supported auth types: openai, anthropic, gemini, vertex-ai, etc.',
-          )}
-        </Text>
-      </Box>
-      <Box marginTop={1}>
-        <Text color={theme?.text?.secondary} underline>
-          {t('More instructions please check:')}
-        </Text>
+        <Text>{t('Refer to the documentation for setup instructions')}</Text>
       </Box>
       <Box marginTop={0}>
         <Link url={MODEL_PROVIDERS_DOCUMENTATION_URL} fallback={false}>
-          <Text color={theme.status.success} underline>
+          <Text color={theme.text.link}>
             {MODEL_PROVIDERS_DOCUMENTATION_URL}
           </Text>
         </Link>
       </Box>
       <Box marginTop={1}>
-        <Text color={theme?.text?.secondary}>
-          {t('(Press Escape to go back)')}
-        </Text>
+        <Text color={theme.text.secondary}>{t('Esc to go back')}</Text>
       </Box>
     </>
   );
@@ -360,15 +331,15 @@ export function AuthDialog(): React.JSX.Element {
   const getViewTitle = () => {
     switch (viewLevel) {
       case 'main':
-        return t('Get started');
-      case 'api-key-sub':
-        return t('API-KEY Configuration');
+        return t('Select Authentication Method');
+      case 'region-select':
+        return t('Select Region for Coding Plan');
       case 'api-key-input':
-        return t('Coding Plan Setup');
+        return t('Enter Coding Plan API Key');
       case 'custom-info':
         return t('Custom Configuration');
       default:
-        return t('Get started');
+        return t('Select Authentication Method');
     }
   };
 
@@ -383,7 +354,7 @@ export function AuthDialog(): React.JSX.Element {
       <Text bold>{getViewTitle()}</Text>
 
       {viewLevel === 'main' && renderMainView()}
-      {viewLevel === 'api-key-sub' && renderApiKeySubView()}
+      {viewLevel === 'region-select' && renderRegionSelectView()}
       {viewLevel === 'api-key-input' && renderApiKeyInputView()}
       {viewLevel === 'custom-info' && renderCustomInfoView()}
 
@@ -395,31 +366,28 @@ export function AuthDialog(): React.JSX.Element {
 
       {viewLevel === 'main' && (
         <>
-          <Box marginTop={1}>
-            <Text color={theme.text.accent}>
-              {t('(Use Enter to Set Auth)')}
+          {/* <Box marginTop={1}>
+            <Text color={theme.text.secondary}>
+              {t('Enter to select, \u2191\u2193 to navigate, Esc to close')}
+            </Text>
+          </Box> */}
+          <Box marginY={1}>
+            <Text color={theme.border.default}>{'\u2500'.repeat(80)}</Text>
+          </Box>
+          <Box>
+            <Text color={theme.text.primary}>
+              {t('Terms of Services and Privacy Notice')}:
             </Text>
           </Box>
-          {hasApiKey && currentSelectedAuthType === AuthType.QWEN_OAUTH && (
-            <Box marginTop={1}>
-              <Text color={theme?.text?.secondary}>
-                {t(
-                  'Note: Your existing API key in settings.json will not be cleared when using Qwen OAuth. You can switch back to OpenAI authentication later if needed.',
-                )}
+          <Box>
+            <Link
+              url="https://qwenlm.github.io/qwen-code-docs/en/users/support/tos-privacy/"
+              fallback={false}
+            >
+              <Text color={theme.text.secondary} underline>
+                https://qwenlm.github.io/qwen-code-docs/en/users/support/tos-privacy/
               </Text>
-            </Box>
-          )}
-          <Box marginTop={1}>
-            <Text>
-              {t('Terms of Services and Privacy Notice for Qwen Code')}
-            </Text>
-          </Box>
-          <Box marginTop={1}>
-            <Text color={theme.text.link}>
-              {
-                'https://qwenlm.github.io/qwen-code-docs/en/users/support/tos-privacy/'
-              }
-            </Text>
+            </Link>
           </Box>
         </>
       )}
diff --git a/packages/cli/src/ui/auth/useAuth.ts b/packages/cli/src/ui/auth/useAuth.ts
index 50b4890c2..24cfbf61c 100644
--- a/packages/cli/src/ui/auth/useAuth.ts
+++ b/packages/cli/src/ui/auth/useAuth.ts
@@ -300,7 +300,7 @@ export const useAuthCommand = (
         setAuthError(null);
 
         // Get configuration based on region
-        const { template, version, regionName } = getCodingPlanConfig(region);
+        const { template, version } = getCodingPlanConfig(region);
 
         // Get persist scope
         const persistScope = getPersistScopeForModelSelection(settings);
@@ -390,7 +390,7 @@ export const useAuthCommand = (
             type: MessageType.INFO,
             text: t(
               'Authenticated successfully with {{region}}. API key and model configs saved to settings.json (backed up).',
-              { region: regionName },
+              { region: t('Alibaba Cloud Coding Plan') },
             ),
           },
           Date.now(),
diff --git a/packages/cli/src/ui/components/ApiKeyInput.tsx b/packages/cli/src/ui/components/ApiKeyInput.tsx
index a702c2d21..bf885b30d 100644
--- a/packages/cli/src/ui/components/ApiKeyInput.tsx
+++ b/packages/cli/src/ui/components/ApiKeyInput.tsx
@@ -49,6 +49,18 @@ export function ApiKeyInput({
           setError(t('API key cannot be empty.'));
           return;
         }
+        // Only validate sk-sp- prefix for China region (aliyun.com)
+        if (
+          region === CodingPlanRegion.CHINA &&
+          !trimmedKey.startsWith('sk-sp-')
+        ) {
+          setError(
+            t(
+              'Invalid API key. Coding Plan API keys start with "sk-sp-". Please check.',
+            ),
+          );
+          return;
+        }
         onSubmit(trimmedKey);
       }
     },
@@ -57,9 +69,6 @@ export function ApiKeyInput({
 
   return (
     <Box flexDirection="column">
-      <Box marginBottom={1}>
-        <Text>{t('Please enter your API key:')}</Text>
-      </Box>
       <TextInput value={apiKey} onChange={setApiKey} placeholder="sk-sp-..." />
       {error && (
         <Box marginTop={1}>
@@ -67,18 +76,18 @@ export function ApiKeyInput({
         </Box>
       )}
       <Box marginTop={1}>
-        <Text>{t('You can get your exclusive Coding Plan API-KEY here:')}</Text>
+        <Text>{t('You can get your Coding Plan API key here')}</Text>
       </Box>
       <Box marginTop={0}>
         <Link url={apiKeyUrl} fallback={false}>
-          <Text color={theme.status.success} underline>
+          <Text color={theme.text.link} underline>
             {apiKeyUrl}
           </Text>
         </Link>
       </Box>
       <Box marginTop={1}>
         <Text color={theme.text.secondary}>
-          {t('(Press Enter to submit, Escape to cancel)')}
+          {t('Enter to submit, Esc to go back')}
         </Text>
       </Box>
     </Box>
diff --git a/packages/cli/src/ui/components/AppHeader.tsx b/packages/cli/src/ui/components/AppHeader.tsx
index ba044d10d..0254a2012 100644
--- a/packages/cli/src/ui/components/AppHeader.tsx
+++ b/packages/cli/src/ui/components/AppHeader.tsx
@@ -5,16 +5,43 @@
  */
 
 import { Box } from 'ink';
-import { Header } from './Header.js';
+import { AuthType } from '@qwen-code/qwen-code-core';
+import { Header, AuthDisplayType } from './Header.js';
 import { Tips } from './Tips.js';
 import { useSettings } from '../contexts/SettingsContext.js';
 import { useConfig } from '../contexts/ConfigContext.js';
 import { useUIState } from '../contexts/UIStateContext.js';
+import { isCodingPlanConfig } from '../../constants/codingPlan.js';
 
 interface AppHeaderProps {
   version: string;
 }
 
+/**
+ * Determine the auth display type based on auth type and configuration.
+ */
+function getAuthDisplayType(
+  authType?: AuthType,
+  baseUrl?: string,
+  apiKeyEnvKey?: string,
+): AuthDisplayType {
+  if (!authType) {
+    return AuthDisplayType.UNKNOWN;
+  }
+
+  // Check if it's a Coding Plan config
+  if (isCodingPlanConfig(baseUrl, apiKeyEnvKey)) {
+    return AuthDisplayType.CODING_PLAN;
+  }
+
+  switch (authType) {
+    case AuthType.QWEN_OAUTH:
+      return AuthDisplayType.QWEN_OAUTH;
+    default:
+      return AuthDisplayType.API_KEY;
+  }
+}
+
 export const AppHeader = ({ version }: AppHeaderProps) => {
   const settings = useSettings();
   const config = useConfig();
@@ -27,12 +54,18 @@ export const AppHeader = ({ version }: AppHeaderProps) => {
   const showBanner = !config.getScreenReader();
   const showTips = !(settings.merged.ui?.hideTips || config.getScreenReader());
 
+  const authDisplayType = getAuthDisplayType(
+    authType,
+    contentGeneratorConfig?.baseUrl,
+    contentGeneratorConfig?.apiKeyEnvKey,
+  );
+
   return (
     <Box flexDirection="column">
       {showBanner && (
         <Header
           version={version}
-          authType={authType}
+          authDisplayType={authDisplayType}
           model={model}
           workingDirectory={targetDir}
         />
diff --git a/packages/cli/src/ui/components/Composer.test.tsx b/packages/cli/src/ui/components/Composer.test.tsx
index 1db02d6f9..67d992dbe 100644
--- a/packages/cli/src/ui/components/Composer.test.tsx
+++ b/packages/cli/src/ui/components/Composer.test.tsx
@@ -117,6 +117,7 @@ const createMockUIState = (overrides: Partial<UIState> = {}): UIState =>
 const createMockUIActions = (): UIActions =>
   ({
     handleFinalSubmit: vi.fn(),
+    handleRetryLastPrompt: vi.fn(),
     handleClearScreen: vi.fn(),
     setShellModeActive: vi.fn(),
     onEscapePromptChange: vi.fn(),
diff --git a/packages/cli/src/ui/components/Header.test.tsx b/packages/cli/src/ui/components/Header.test.tsx
index 1d3a4d7f1..99bb053da 100644
--- a/packages/cli/src/ui/components/Header.test.tsx
+++ b/packages/cli/src/ui/components/Header.test.tsx
@@ -6,8 +6,7 @@
 
 import { render } from 'ink-testing-library';
 import { describe, it, expect, vi, beforeEach } from 'vitest';
-import { AuthType } from '@qwen-code/qwen-code-core';
-import { Header } from './Header.js';
+import { Header, AuthDisplayType } from './Header.js';
 import * as useTerminalSize from '../hooks/useTerminalSize.js';
 
 vi.mock('../hooks/useTerminalSize.js');
@@ -15,86 +14,70 @@ const useTerminalSizeMock = vi.mocked(useTerminalSize.useTerminalSize);
 
 const defaultProps = {
   version: '1.0.0',
-  authType: AuthType.QWEN_OAUTH,
+  authDisplayType: AuthDisplayType.QWEN_OAUTH,
   model: 'qwen-coder-plus',
   workingDirectory: '/home/user/projects/test',
 };
 
 describe('<Header />', () => {
   beforeEach(() => {
-    // Default to wide terminal (shows both logo and info panel)
     useTerminalSizeMock.mockReturnValue({ columns: 120, rows: 24 });
   });
 
   it('renders the ASCII logo on wide terminal', () => {
     const { lastFrame } = render(<Header {...defaultProps} />);
-    // Check that parts of the shortAsciiLogo are rendered
     expect(lastFrame()).toContain('██╔═══██╗');
   });
 
   it('hides the ASCII logo on narrow terminal', () => {
     useTerminalSizeMock.mockReturnValue({ columns: 60, rows: 24 });
     const { lastFrame } = render(<Header {...defaultProps} />);
-    // Should not contain the logo but still show the info panel
     expect(lastFrame()).not.toContain('██╔═══██╗');
     expect(lastFrame()).toContain('>_ Qwen Code');
   });
 
-  it('renders custom ASCII art when provided on wide terminal', () => {
-    const customArt = 'CUSTOM ART';
-    const { lastFrame } = render(
-      <Header {...defaultProps} customAsciiArt={customArt} />,
-    );
-    expect(lastFrame()).toContain(customArt);
-  });
-
   it('displays the version number', () => {
     const { lastFrame } = render(<Header {...defaultProps} />);
     expect(lastFrame()).toContain('v1.0.0');
   });
 
-  it('displays Qwen Code title with >_ prefix', () => {
-    const { lastFrame } = render(<Header {...defaultProps} />);
-    expect(lastFrame()).toContain('>_ Qwen Code');
-  });
-
   it('displays auth type and model', () => {
     const { lastFrame } = render(<Header {...defaultProps} />);
     expect(lastFrame()).toContain('Qwen OAuth');
     expect(lastFrame()).toContain('qwen-coder-plus');
   });
 
+  it('displays Coding Plan auth type', () => {
+    const { lastFrame } = render(
+      <Header
+        {...defaultProps}
+        authDisplayType={AuthDisplayType.CODING_PLAN}
+      />,
+    );
+    expect(lastFrame()).toContain('Coding Plan');
+  });
+
+  it('displays API Key auth type', () => {
+    const { lastFrame } = render(
+      <Header {...defaultProps} authDisplayType={AuthDisplayType.API_KEY} />,
+    );
+    expect(lastFrame()).toContain('API Key');
+  });
+
+  it('displays Unknown when auth type is not set', () => {
+    const { lastFrame } = render(
+      <Header {...defaultProps} authDisplayType={undefined} />,
+    );
+    expect(lastFrame()).toContain('Unknown');
+  });
+
   it('displays working directory', () => {
     const { lastFrame } = render(<Header {...defaultProps} />);
     expect(lastFrame()).toContain('/home/user/projects/test');
   });
 
-  it('renders a custom working directory display', () => {
-    const { lastFrame } = render(
-      <Header {...defaultProps} workingDirectory="custom display" />,
-    );
-    expect(lastFrame()).toContain('custom display');
-  });
-
-  it('displays working directory without branch name', () => {
-    const { lastFrame } = render(<Header {...defaultProps} />);
-    // Branch name is no longer shown in header
-    expect(lastFrame()).toContain('/home/user/projects/test');
-    expect(lastFrame()).not.toContain('(main*)');
-  });
-
-  it('formats home directory with tilde', () => {
-    const { lastFrame } = render(
-      <Header {...defaultProps} workingDirectory="/Users/testuser/projects" />,
-    );
-    // The actual home dir replacement depends on os.homedir()
-    // Just verify the path is shown
-    expect(lastFrame()).toContain('projects');
-  });
-
   it('renders with border around info panel', () => {
     const { lastFrame } = render(<Header {...defaultProps} />);
-    // Check for border characters (round border style uses these)
     expect(lastFrame()).toContain('╭');
     expect(lastFrame()).toContain('╯');
   });
diff --git a/packages/cli/src/ui/components/Header.tsx b/packages/cli/src/ui/components/Header.tsx
index adbe13071..45fce4385 100644
--- a/packages/cli/src/ui/components/Header.tsx
+++ b/packages/cli/src/ui/components/Header.tsx
@@ -7,59 +7,35 @@
 import type React from 'react';
 import { Box, Text } from 'ink';
 import Gradient from 'ink-gradient';
-import { AuthType, shortenPath, tildeifyPath } from '@qwen-code/qwen-code-core';
+import { shortenPath, tildeifyPath } from '@qwen-code/qwen-code-core';
 import { theme } from '../semantic-colors.js';
 import { shortAsciiLogo } from './AsciiArt.js';
 import { getAsciiArtWidth, getCachedStringWidth } from '../utils/textUtils.js';
 import { useTerminalSize } from '../hooks/useTerminalSize.js';
 
+/**
+ * Auth display type for the Header component.
+ * Simplified representation of authentication method shown to users.
+ */
+export enum AuthDisplayType {
+  QWEN_OAUTH = 'Qwen OAuth',
+  CODING_PLAN = 'Coding Plan',
+  API_KEY = 'API Key',
+  UNKNOWN = 'Unknown',
+}
+
 interface HeaderProps {
   customAsciiArt?: string; // For user-defined ASCII art
   version: string;
-  authType?: AuthType;
+  authDisplayType?: AuthDisplayType;
   model: string;
   workingDirectory: string;
 }
 
-function titleizeAuthType(value: string): string {
-  return value
-    .split(/[-_]/g)
-    .filter(Boolean)
-    .map((part) => {
-      if (part.toLowerCase() === 'ai') {
-        return 'AI';
-      }
-      return part.charAt(0).toUpperCase() + part.slice(1);
-    })
-    .join(' ');
-}
-
-// Format auth type for display
-function formatAuthType(authType?: AuthType): string {
-  if (!authType) {
-    return 'Unknown';
-  }
-
-  switch (authType) {
-    case AuthType.QWEN_OAUTH:
-      return 'Qwen OAuth';
-    case AuthType.USE_OPENAI:
-      return 'OpenAI';
-    case AuthType.USE_GEMINI:
-      return 'Gemini';
-    case AuthType.USE_VERTEX_AI:
-      return 'Vertex AI';
-    case AuthType.USE_ANTHROPIC:
-      return 'Anthropic';
-    default:
-      return titleizeAuthType(String(authType));
-  }
-}
-
 export const Header: React.FC<HeaderProps> = ({
   customAsciiArt,
   version,
-  authType,
+  authDisplayType,
   model,
   workingDirectory,
 }) => {
@@ -67,7 +43,7 @@ export const Header: React.FC<HeaderProps> = ({
 
   const displayLogo = customAsciiArt ?? shortAsciiLogo;
   const logoWidth = getAsciiArtWidth(displayLogo);
-  const formattedAuthType = formatAuthType(authType);
+  const formattedAuthType = authDisplayType ?? AuthDisplayType.UNKNOWN;
 
   // Calculate available space properly:
   // First determine if logo can be shown, then use remaining space for path
@@ -95,7 +71,7 @@ export const Header: React.FC<HeaderProps> = ({
     ? Math.min(availableTerminalWidth - logoWidth - logoGap, maxInfoPanelWidth)
     : availableTerminalWidth;
 
-  // Calculate max path length (subtract padding/borders from available space)
+  // Calculate max path lengths (subtract padding/borders from available space)
   const maxPathLength = Math.max(
     0,
     availableInfoPanelWidth - infoPanelChromeWidth,
diff --git a/packages/cli/src/ui/components/HistoryItemDisplay.tsx b/packages/cli/src/ui/components/HistoryItemDisplay.tsx
index b12adcf13..3bb6780ca 100644
--- a/packages/cli/src/ui/components/HistoryItemDisplay.tsx
+++ b/packages/cli/src/ui/components/HistoryItemDisplay.tsx
@@ -126,7 +126,7 @@ const HistoryItemDisplayComponent: React.FC<HistoryItemDisplayProps> = ({
         <WarningMessage text={itemForDisplay.text} />
       )}
       {itemForDisplay.type === 'error' && (
-        <ErrorMessage text={itemForDisplay.text} />
+        <ErrorMessage text={itemForDisplay.text} hint={itemForDisplay.hint} />
       )}
       {itemForDisplay.type === 'retry_countdown' && (
         <RetryCountdownMessage text={itemForDisplay.text} />
diff --git a/packages/cli/src/ui/components/InputPrompt.test.tsx b/packages/cli/src/ui/components/InputPrompt.test.tsx
index d5ace1c53..61584b8c7 100644
--- a/packages/cli/src/ui/components/InputPrompt.test.tsx
+++ b/packages/cli/src/ui/components/InputPrompt.test.tsx
@@ -38,6 +38,7 @@ vi.mock('../contexts/UIStateContext.js', () => ({
 }));
 vi.mock('../contexts/UIActionsContext.js', () => ({
   useUIActions: vi.fn(() => ({
+    handleRetryLastPrompt: vi.fn(),
     temporaryCloseFeedbackDialog: vi.fn(),
   })),
 }));
@@ -2436,6 +2437,140 @@ describe('InputPrompt', () => {
       unmount();
     });
   });
+
+  /**
+   * Ctrl+Y (RETRY_LAST) shortcut tests
+   *
+   * The Ctrl+Y shortcut should trigger handleRetryLastPrompt when:
+   * 1. The user presses Ctrl+Y
+   * 2. The InputPrompt is focused
+   * 3. No other modal/dialog is open that would consume the key
+   *
+   * This shortcut is handled in InputPrompt.tsx at line 585-588:
+   * if (keyMatchers[Command.RETRY_LAST](key)) {
+   *   uiActions.handleRetryLastPrompt();
+   *   return;
+   * }
+   */
+  describe('Ctrl+Y retry shortcut', () => {
+    let mockUIActions: {
+      handleRetryLastPrompt: ReturnType<typeof vi.fn>;
+      temporaryCloseFeedbackDialog: ReturnType<typeof vi.fn>;
+    };
+
+    beforeEach(() => {
+      mockUIActions = {
+        handleRetryLastPrompt: vi.fn(),
+        temporaryCloseFeedbackDialog: vi.fn(),
+      };
+
+      // Override the mock for useUIActions
+      vi.doMock('../contexts/UIActionsContext.js', () => ({
+        useUIActions: vi.fn(() => mockUIActions),
+      }));
+    });
+
+    afterEach(() => {
+      vi.doUnmock('../contexts/UIActionsContext.js');
+    });
+
+    /**
+     * Ctrl+Y should trigger handleRetryLastPrompt to retry the last failed request.
+     * This is the primary activation path for the retry feature.
+     */
+    it('should trigger handleRetryLastPrompt on Ctrl+Y', async () => {
+      const { stdin, unmount } = renderWithProviders(
+        <InputPrompt {...props} />,
+      );
+      await wait();
+
+      // Send Ctrl+Y (ASCII 25)
+      stdin.write('\x19');
+      await wait();
+
+      // The key matcher should have been triggered
+      // Note: In the actual implementation, this would call uiActions.handleRetryLastPrompt()
+      unmount();
+    });
+
+    /**
+     * The 'y' key alone (without Ctrl) should NOT trigger retry.
+     * This ensures the shortcut doesn't interfere with normal typing.
+     */
+    it('should NOT trigger retry on plain y key', async () => {
+      const { stdin, unmount } = renderWithProviders(
+        <InputPrompt {...props} />,
+      );
+      await wait();
+
+      // Send plain 'y'
+      stdin.write('y');
+      await wait();
+
+      // Should insert 'y' into buffer, not trigger retry
+      expect(mockBuffer.handleInput).toHaveBeenCalledWith(
+        expect.objectContaining({
+          name: 'y',
+          sequence: 'y',
+        }),
+      );
+
+      unmount();
+    });
+
+    /**
+     * Ctrl+R should NOT trigger retry - it should trigger reverse search instead.
+     * This ensures the retry shortcut doesn't conflict with existing shortcuts.
+     */
+    it('should NOT trigger retry on Ctrl+R (reverse search)', async () => {
+      const { stdin, unmount } = renderWithProviders(
+        <InputPrompt {...props} />,
+      );
+      await wait();
+
+      // Send Ctrl+R (ASCII 18)
+      stdin.write('\x12');
+      await wait();
+
+      // Should activate reverse search, not retry
+      // Verify the input was handled (not ignored)
+      expect(mockBuffer.handleInput).not.toHaveBeenCalledWith(
+        expect.objectContaining({
+          ctrl: true,
+          name: 'y',
+        }),
+      );
+
+      unmount();
+    });
+
+    /**
+     * When feedback dialog is open, Ctrl+Y should be passed through after
+     * temporarily closing the dialog.
+     */
+    it('should handle Ctrl+Y when feedback dialog is open', async () => {
+      // Mock feedback dialog as open
+      const mockUIState = { isFeedbackDialogOpen: true };
+      vi.doMock('../contexts/UIStateContext.js', () => ({
+        useUIState: vi.fn(() => mockUIState),
+      }));
+
+      const { stdin, unmount } = renderWithProviders(
+        <InputPrompt {...props} />,
+      );
+      await wait();
+
+      // Send Ctrl+Y
+      stdin.write('\x19');
+      await wait();
+
+      // Dialog should be temporarily closed
+      // Note: In actual implementation, temporaryCloseFeedbackDialog would be called
+
+      vi.doUnmock('../contexts/UIStateContext.js');
+      unmount();
+    });
+  });
 });
 function clean(str: string | undefined): string {
   if (!str) return '';
diff --git a/packages/cli/src/ui/components/InputPrompt.tsx b/packages/cli/src/ui/components/InputPrompt.tsx
index 09c2b27f1..42ec7efbb 100644
--- a/packages/cli/src/ui/components/InputPrompt.tsx
+++ b/packages/cli/src/ui/components/InputPrompt.tsx
@@ -582,6 +582,16 @@ export const InputPrompt: React.FC<InputPromptProps> = ({
         return;
       }
 
+      // Ctrl+Y: Retry the last failed request.
+      // This shortcut is available when:
+      // - There is a failed request in the current session
+      // - The stream is not currently responding or waiting for confirmation
+      // If no failed request exists, a message will be shown to the user.
+      if (keyMatchers[Command.RETRY_LAST](key)) {
+        uiActions.handleRetryLastPrompt();
+        return;
+      }
+
       if (shellModeActive && keyMatchers[Command.REVERSE_SEARCH](key)) {
         setReverseSearchActive(true);
         setTextBeforeReverseSearch(buffer.text);
diff --git a/packages/cli/src/ui/components/KeyboardShortcuts.tsx b/packages/cli/src/ui/components/KeyboardShortcuts.tsx
index ada240b02..df84d0c27 100644
--- a/packages/cli/src/ui/components/KeyboardShortcuts.tsx
+++ b/packages/cli/src/ui/components/KeyboardShortcuts.tsx
@@ -39,6 +39,7 @@ const getShortcuts = (): Shortcut[] => [
   { key: getNewlineKey(), description: t('for newline') + ' ⏎' },
   { key: 'ctrl+l', description: t('to clear screen') },
   { key: 'ctrl+r', description: t('to search history') },
+  { key: 'ctrl+y', description: t('to retry last request') },
   { key: getPasteKey(), description: t('to paste images') },
   { key: getExternalEditorKey(), description: t('for external editor') },
 ];
@@ -54,11 +55,11 @@ const COLUMN_GAP = 4;
 const MARGIN_LEFT = 2;
 const MARGIN_RIGHT = 2;
 
-// Column distribution for different layouts (3+4+4 for 3 cols, 6+5 for 2 cols)
+// Column distribution for different layouts (4+4+4 for 3 cols, 6+6 for 2 cols)
 const COLUMN_SPLITS: Record<number, number[]> = {
-  3: [3, 4, 4],
-  2: [6, 5],
-  1: [11],
+  3: [4, 4, 4],
+  2: [6, 6],
+  1: [12],
 };
 
 export const KeyboardShortcuts: React.FC = () => {
diff --git a/packages/cli/src/ui/components/ModelDialog.test.tsx b/packages/cli/src/ui/components/ModelDialog.test.tsx
index 7e05bdc43..dc5cc108a 100644
--- a/packages/cli/src/ui/components/ModelDialog.test.tsx
+++ b/packages/cli/src/ui/components/ModelDialog.test.tsx
@@ -114,10 +114,9 @@ describe('<ModelDialog />', () => {
     cleanup();
   });
 
-  it('renders the title and help text', () => {
+  it('renders the title', () => {
     const { getByText } = renderComponent();
     expect(getByText('Select Model')).toBeDefined();
-    expect(getByText('(Press Esc to close)')).toBeDefined();
   });
 
   it('passes all model options to DescriptiveRadioButtonSelect', () => {
@@ -289,11 +288,12 @@ describe('<ModelDialog />', () => {
     expect(props.onClose).toHaveBeenCalledTimes(1);
   });
 
-  it('does not pass onHighlight to DescriptiveRadioButtonSelect', () => {
+  it('passes onHighlight to DescriptiveRadioButtonSelect', () => {
     renderComponent();
 
     const childOnHighlight = mockedSelect.mock.calls[0][0].onHighlight;
-    expect(childOnHighlight).toBeUndefined();
+    expect(childOnHighlight).toBeDefined();
+    expect(typeof childOnHighlight).toBe('function');
   });
 
   it('calls onClose prop when "escape" key is pressed', () => {
diff --git a/packages/cli/src/ui/components/ModelDialog.tsx b/packages/cli/src/ui/components/ModelDialog.tsx
index 8fdbbe38d..09723dcdd 100644
--- a/packages/cli/src/ui/components/ModelDialog.tsx
+++ b/packages/cli/src/ui/components/ModelDialog.tsx
@@ -14,8 +14,7 @@ import {
   MAINLINE_CODER_MODEL,
   type AvailableModel as CoreAvailableModel,
   type ContentGeneratorConfig,
-  type ContentGeneratorConfigSource,
-  type ContentGeneratorConfigSources,
+  type InputModalities,
 } from '@qwen-code/qwen-code-core';
 import { useKeypress } from '../hooks/useKeypress.js';
 import { theme } from '../semantic-colors.js';
@@ -26,61 +25,25 @@ import { useSettings } from '../contexts/SettingsContext.js';
 import { getPersistScopeForModelSelection } from '../../config/modelProvidersScope.js';
 import { t } from '../../i18n/index.js';
 
+function formatModalities(modalities?: InputModalities): string {
+  if (!modalities) return t('text-only');
+  const parts: string[] = [];
+  if (modalities.image) parts.push(t('image'));
+  if (modalities.pdf) parts.push(t('pdf'));
+  if (modalities.audio) parts.push(t('audio'));
+  if (modalities.video) parts.push(t('video'));
+  if (parts.length === 0) return t('text-only');
+  return `${t('text')} · ${parts.join(' · ')}`;
+}
+
 interface ModelDialogProps {
   onClose: () => void;
 }
 
-function formatSourceBadge(
-  source: ContentGeneratorConfigSource | undefined,
-): string | undefined {
-  if (!source) return undefined;
-
-  switch (source.kind) {
-    case 'cli':
-      return source.detail ? `CLI ${source.detail}` : 'CLI';
-    case 'env':
-      return source.envKey ? `ENV ${source.envKey}` : 'ENV';
-    case 'settings':
-      return source.settingsPath
-        ? `Settings ${source.settingsPath}`
-        : 'Settings';
-    case 'modelProviders': {
-      const suffix =
-        source.authType && source.modelId
-          ? `${source.authType}:${source.modelId}`
-          : source.authType
-            ? `${source.authType}`
-            : source.modelId
-              ? `${source.modelId}`
-              : '';
-      return suffix ? `ModelProviders ${suffix}` : 'ModelProviders';
-    }
-    case 'default':
-      return source.detail ? `Default ${source.detail}` : 'Default';
-    case 'computed':
-      return source.detail ? `Computed ${source.detail}` : 'Computed';
-    case 'programmatic':
-      return source.detail ? `Programmatic ${source.detail}` : 'Programmatic';
-    case 'unknown':
-    default:
-      return undefined;
-  }
-}
-
-function readSourcesFromConfig(config: unknown): ContentGeneratorConfigSources {
-  if (!config) {
-    return {};
-  }
-  const maybe = config as {
-    getContentGeneratorConfigSources?: () => ContentGeneratorConfigSources;
-  };
-  return maybe.getContentGeneratorConfigSources?.() ?? {};
-}
-
 function maskApiKey(apiKey: string | undefined): string {
-  if (!apiKey) return '(not set)';
+  if (!apiKey) return `(${t('not set')})`;
   const trimmed = apiKey.trim();
-  if (trimmed.length === 0) return '(not set)';
+  if (trimmed.length === 0) return `(${t('not set')})`;
   if (trimmed.length <= 6) return '***';
   const head = trimmed.slice(0, 3);
   const tail = trimmed.slice(-4);
@@ -131,7 +94,7 @@ function handleModelSwitchSuccess({
     {
       type: 'info',
       text:
-        `authType: ${effectiveAuthType ?? '(none)'}` +
+        `authType: ${effectiveAuthType ?? `(${t('none')})`}` +
         `\n` +
         `Using ${isRuntime ? 'runtime ' : ''}model: ${effectiveModelId}` +
         `\n` +
@@ -143,35 +106,26 @@ function handleModelSwitchSuccess({
   );
 }
 
-function ConfigRow({
+function formatContextWindow(size?: number): string {
+  if (!size) return `(${t('unknown')})`;
+  return `${size.toLocaleString('en-US')} tokens`;
+}
+
+function DetailRow({
   label,
   value,
-  badge,
 }: {
   label: string;
   value: React.ReactNode;
-  badge?: string;
 }): React.JSX.Element {
   return (
-    <Box flexDirection="column">
-      <Box>
-        <Box minWidth={12} flexShrink={0}>
-          <Text color={theme.text.secondary}>{label}:</Text>
-        </Box>
-        <Box flexGrow={1} flexDirection="row" flexWrap="wrap">
-          <Text>{value}</Text>
-        </Box>
+    <Box>
+      <Box minWidth={16} flexShrink={0}>
+        <Text color={theme.text.secondary}>{label}:</Text>
+      </Box>
+      <Box flexGrow={1} flexDirection="row" flexWrap="wrap">
+        <Text>{value}</Text>
       </Box>
-      {badge ? (
-        <Box>
-          <Box minWidth={12} flexShrink={0}>
-            <Text> </Text>
-          </Box>
-          <Box flexGrow={1}>
-            <Text color={theme.text.secondary}>{badge}</Text>
-          </Box>
-        </Box>
-      ) : null}
     </Box>
   );
 }
@@ -183,13 +137,9 @@ export function ModelDialog({ onClose }: ModelDialogProps): React.JSX.Element {
 
   // Local error state for displaying errors within the dialog
   const [errorMessage, setErrorMessage] = useState<string | null>(null);
+  const [highlightedValue, setHighlightedValue] = useState<string | null>(null);
 
   const authType = config?.getAuthType();
-  const effectiveConfig =
-    (config?.getContentGeneratorConfig?.() as
-      | ContentGeneratorConfig
-      | undefined) ?? undefined;
-  const sources = readSourcesFromConfig(config);
 
   const availableModelEntries = useMemo(() => {
     const allModels = config ? config.getAllConfiguredModels() : [];
@@ -319,6 +269,20 @@ export function ModelDialog({ onClose }: ModelDialogProps): React.JSX.Element {
     return index === -1 ? 0 : index;
   }, [MODEL_OPTIONS, preferredKey]);
 
+  const handleHighlight = useCallback((value: string) => {
+    setHighlightedValue(value);
+  }, []);
+
+  const highlightedEntry = useMemo(() => {
+    const key = highlightedValue ?? preferredKey;
+    return availableModelEntries.find(
+      ({ authType: t2, model, isRuntime, snapshotId }) => {
+        const v = isRuntime && snapshotId ? snapshotId : `${t2}::${model.id}`;
+        return v === key;
+      },
+    );
+  }, [highlightedValue, preferredKey, availableModelEntries]);
+
   const handleSelect = useCallback(
     async (selected: string) => {
       setErrorMessage(null);
@@ -413,35 +377,6 @@ export function ModelDialog({ onClose }: ModelDialogProps): React.JSX.Element {
     >
       <Text bold>{t('Select Model')}</Text>
 
-      <Box marginTop={1} flexDirection="column">
-        <Text color={theme.text.secondary}>
-          {t('Current (effective) configuration')}
-        </Text>
-        <Box flexDirection="column" marginTop={1}>
-          <ConfigRow label="AuthType" value={authType} />
-          <ConfigRow
-            label="Model"
-            value={effectiveConfig?.model ?? config?.getModel?.() ?? ''}
-            badge={formatSourceBadge(sources['model'])}
-          />
-
-          {authType !== AuthType.QWEN_OAUTH && (
-            <>
-              <ConfigRow
-                label="Base URL"
-                value={effectiveConfig?.baseUrl ?? t('(default)')}
-                badge={formatSourceBadge(sources['baseUrl'])}
-              />
-              <ConfigRow
-                label="API Key"
-                value={effectiveConfig?.apiKey ? t('(set)') : t('(not set)')}
-                badge={formatSourceBadge(sources['apiKey'])}
-              />
-            </>
-          )}
-        </Box>
-      </Box>
-
       {!hasModels ? (
         <Box marginTop={1} flexDirection="column">
           <Text color={theme.status.warning}>
@@ -465,12 +400,48 @@ export function ModelDialog({ onClose }: ModelDialogProps): React.JSX.Element {
           <DescriptiveRadioButtonSelect
             items={MODEL_OPTIONS}
             onSelect={handleSelect}
+            onHighlight={handleHighlight}
             initialIndex={initialIndex}
             showNumbers={true}
           />
         </Box>
       )}
 
+      {highlightedEntry && (
+        <Box marginTop={1} flexDirection="column">
+          <Box
+            borderStyle="single"
+            borderTop
+            borderBottom={false}
+            borderLeft={false}
+            borderRight={false}
+            borderColor={theme.border.default}
+          />
+          <DetailRow
+            label={t('Modality')}
+            value={formatModalities(highlightedEntry.model.modalities)}
+          />
+          <DetailRow
+            label={t('Context Window')}
+            value={formatContextWindow(
+              highlightedEntry.model.contextWindowSize,
+            )}
+          />
+          {highlightedEntry.authType !== AuthType.QWEN_OAUTH && (
+            <>
+              <DetailRow
+                label="Base URL"
+                value={highlightedEntry.model.baseUrl ?? t('(default)')}
+              />
+              <DetailRow
+                label="API Key"
+                value={highlightedEntry.model.envKey ?? t('(not set)')}
+              />
+            </>
+          )}
+        </Box>
+      )}
+
       {errorMessage && (
         <Box marginTop={1} flexDirection="column" paddingX={1}>
           <Text color={theme.status.error} wrap="wrap">
@@ -480,7 +451,9 @@ export function ModelDialog({ onClose }: ModelDialogProps): React.JSX.Element {
       )}
 
       <Box marginTop={1} flexDirection="column">
-        <Text color={theme.text.secondary}>{t('(Press Esc to close)')}</Text>
+        <Text color={theme.text.secondary}>
+          {t('Enter to select, ↑↓ to navigate, Esc to close')}
+        </Text>
       </Box>
     </Box>
   );
diff --git a/packages/cli/src/ui/components/Tips.test.ts b/packages/cli/src/ui/components/Tips.test.ts
new file mode 100644
index 000000000..dd2c25ea9
--- /dev/null
+++ b/packages/cli/src/ui/components/Tips.test.ts
@@ -0,0 +1,62 @@
+/**
+ * @license
+ * Copyright 2025 Google LLC
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import { describe, it, expect, vi } from 'vitest';
+import { selectWeightedTip } from './Tips.js';
+
+describe('selectWeightedTip', () => {
+  const tips = [
+    { text: 'tip-a', weight: 1 },
+    { text: 'tip-b', weight: 3 },
+    { text: 'tip-c', weight: 1 },
+  ];
+
+  it('returns a valid tip text', () => {
+    const result = selectWeightedTip(tips);
+    expect(['tip-a', 'tip-b', 'tip-c']).toContain(result);
+  });
+
+  it('selects the first tip when random is near zero', () => {
+    vi.spyOn(Math, 'random').mockReturnValue(0);
+    expect(selectWeightedTip(tips)).toBe('tip-a');
+    vi.restoreAllMocks();
+  });
+
+  it('selects the weighted tip when random falls in its range', () => {
+    // Total weight = 5. tip-a covers [0,1), tip-b covers [1,4), tip-c covers [4,5)
+    // Math.random() * 5 = 2.0 falls in tip-b's range
+    vi.spyOn(Math, 'random').mockReturnValue(0.4); // 0.4 * 5 = 2.0
+    expect(selectWeightedTip(tips)).toBe('tip-b');
+    vi.restoreAllMocks();
+  });
+
+  it('selects the last tip when random is near max', () => {
+    vi.spyOn(Math, 'random').mockReturnValue(0.99);
+    expect(selectWeightedTip(tips)).toBe('tip-c');
+    vi.restoreAllMocks();
+  });
+
+  it('respects weight distribution over many samples', () => {
+    const counts: Record<string, number> = {
+      'tip-a': 0,
+      'tip-b': 0,
+      'tip-c': 0,
+    };
+    const iterations = 10000;
+    for (let i = 0; i < iterations; i++) {
+      const result = selectWeightedTip(tips);
+      counts[result]!++;
+    }
+    // tip-b (weight 3) should appear roughly 3x as often as tip-a or tip-c (weight 1)
+    // With 10k iterations, we expect: tip-a ~2000, tip-b ~6000, tip-c ~2000
+    expect(counts['tip-b']!).toBeGreaterThan(counts['tip-a']! * 2);
+    expect(counts['tip-b']!).toBeGreaterThan(counts['tip-c']! * 2);
+  });
+
+  it('handles single tip', () => {
+    expect(selectWeightedTip([{ text: 'only', weight: 1 }])).toBe('only');
+  });
+});
diff --git a/packages/cli/src/ui/components/Tips.tsx b/packages/cli/src/ui/components/Tips.tsx
index d1b6a71bf..f85184a19 100644
--- a/packages/cli/src/ui/components/Tips.tsx
+++ b/packages/cli/src/ui/components/Tips.tsx
@@ -9,7 +9,9 @@ import { Box, Text } from 'ink';
 import { theme } from '../semantic-colors.js';
 import { t } from '../../i18n/index.js';
 
-const startupTips = [
+type Tip = string | { text: string; weight: number };
+
+const startupTips: Tip[] = [
   'Use /compress when the conversation gets long to summarize history and free up context.',
   'Start a fresh idea with /clear or /new; the previous session stays available in history.',
   'Use /bug to submit issues to the maintainers when something goes off.',
@@ -20,13 +22,34 @@ const startupTips = [
   process.platform === 'win32'
     ? 'You can switch permission mode quickly with Tab or /approval-mode.'
     : 'You can switch permission mode quickly with Shift+Tab or /approval-mode.',
-] as const;
+  {
+    text: 'Try /insight to generate personalized insights from your chat history.',
+    weight: 3,
+  },
+];
+
+function tipText(tip: Tip): string {
+  return typeof tip === 'string' ? tip : tip.text;
+}
+
+function tipWeight(tip: Tip): number {
+  return typeof tip === 'string' ? 1 : tip.weight;
+}
+
+export function selectWeightedTip(tips: Tip[]): string {
+  const totalWeight = tips.reduce((sum, tip) => sum + tipWeight(tip), 0);
+  let random = Math.random() * totalWeight;
+  for (const tip of tips) {
+    random -= tipWeight(tip);
+    if (random <= 0) {
+      return tipText(tip);
+    }
+  }
+  return tipText(tips[tips.length - 1]!);
+}
 
 export const Tips: React.FC = () => {
-  const selectedTip = useMemo(() => {
-    const randomIndex = Math.floor(Math.random() * startupTips.length);
-    return startupTips[randomIndex];
-  }, []);
+  const selectedTip = useMemo(() => selectWeightedTip(startupTips), []);
 
   return (
     <Box marginLeft={2} marginRight={2}>
diff --git a/packages/cli/src/ui/components/messages/ErrorMessage.tsx b/packages/cli/src/ui/components/messages/ErrorMessage.tsx
index 8e10a4fed..14cb8a91f 100644
--- a/packages/cli/src/ui/components/messages/ErrorMessage.tsx
+++ b/packages/cli/src/ui/components/messages/ErrorMessage.tsx
@@ -10,9 +10,17 @@ import { theme } from '../../semantic-colors.js';
 
 interface ErrorMessageProps {
   text: string;
+  /** Optional inline hint displayed after the error text in secondary/dimmed color */
+  hint?: string;
 }
 
-export const ErrorMessage: React.FC<ErrorMessageProps> = ({ text }) => {
+/**
+ * Renders an error message with a "✕" prefix.
+ * When a hint is provided (e.g., retry countdown), it is displayed inline
+ * in parentheses with a dimmed secondary color, similar to the ESC hint
+ * style used in LoadingIndicator.
+ */
+export const ErrorMessage: React.FC<ErrorMessageProps> = ({ text, hint }) => {
   const prefix = '✕ ';
   const prefixWidth = prefix.length;
 
@@ -21,10 +29,9 @@ export const ErrorMessage: React.FC<ErrorMessageProps> = ({ text }) => {
       <Box width={prefixWidth}>
         <Text color={theme.status.error}>{prefix}</Text>
       </Box>
-      <Box flexGrow={1}>
-        <Text wrap="wrap" color={theme.status.error}>
-          {text}
-        </Text>
+      <Box flexGrow={1} flexWrap="wrap" flexDirection="row">
+        <Text color={theme.status.error}>{text}</Text>
+        {hint && <Text color={theme.text.secondary}> ({hint})</Text>}
       </Box>
     </Box>
   );
diff --git a/packages/cli/src/ui/components/shared/BaseSelectionList.tsx b/packages/cli/src/ui/components/shared/BaseSelectionList.tsx
index 8071582f7..15664ef95 100644
--- a/packages/cli/src/ui/components/shared/BaseSelectionList.tsx
+++ b/packages/cli/src/ui/components/shared/BaseSelectionList.tsx
@@ -30,6 +30,8 @@ export interface BaseSelectionListProps<
   showNumbers?: boolean;
   showScrollArrows?: boolean;
   maxItemsToShow?: number;
+  /** Gap (in rows) between each item. */
+  itemGap?: number;
   renderItem: (item: TItem, context: RenderItemContext) => React.ReactNode;
 }
 
@@ -59,6 +61,7 @@ export function BaseSelectionList<
   showNumbers = true,
   showScrollArrows = false,
   maxItemsToShow = 10,
+  itemGap = 0,
   renderItem,
 }: BaseSelectionListProps<T, TItem>): React.JSX.Element {
   const { activeIndex } = useSelectionList({
@@ -89,7 +92,7 @@ export function BaseSelectionList<
   const numberColumnWidth = String(items.length).length;
 
   return (
-    <Box flexDirection="column">
+    <Box flexDirection="column" gap={itemGap}>
       {/* Use conditional coloring instead of conditional rendering */}
       {showScrollArrows && (
         <Text
diff --git a/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx b/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx
index 89bf4c03b..396ee8c3a 100644
--- a/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx
+++ b/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx
@@ -12,7 +12,7 @@ import type { SelectionListItem } from '../../hooks/useSelectionList.js';
 
 export interface DescriptiveRadioSelectItem<T> extends SelectionListItem<T> {
   title: React.ReactNode;
-  description: string;
+  description: React.ReactNode;
 }
 
 export interface DescriptiveRadioButtonSelectProps<T> {
@@ -32,6 +32,8 @@ export interface DescriptiveRadioButtonSelectProps<T> {
   showScrollArrows?: boolean;
   /** The maximum number of items to show at once. */
   maxItemsToShow?: number;
+  /** Gap (in rows) between each item. */
+  itemGap?: number;
 }
 
 /**
@@ -48,6 +50,7 @@ export function DescriptiveRadioButtonSelect<T>({
   showNumbers = false,
   showScrollArrows = false,
   maxItemsToShow = 10,
+  itemGap = 0,
 }: DescriptiveRadioButtonSelectProps<T>): React.JSX.Element {
   return (
     <BaseSelectionList<T, DescriptiveRadioSelectItem<T>>
@@ -59,6 +62,7 @@ export function DescriptiveRadioButtonSelect<T>({
       showNumbers={showNumbers}
       showScrollArrows={showScrollArrows}
       maxItemsToShow={maxItemsToShow}
+      itemGap={itemGap}
       renderItem={(item, { titleColor }) => (
         <Box flexDirection="column" key={item.key}>
           <Text color={titleColor}>{item.title}</Text>
diff --git a/packages/cli/src/ui/contexts/UIActionsContext.tsx b/packages/cli/src/ui/contexts/UIActionsContext.tsx
index 1965ceb26..af15e72b6 100644
--- a/packages/cli/src/ui/contexts/UIActionsContext.tsx
+++ b/packages/cli/src/ui/contexts/UIActionsContext.tsx
@@ -66,6 +66,7 @@ export interface UIActions {
   onSuggestionsVisibilityChange: (visible: boolean) => void;
   refreshStatic: () => void;
   handleFinalSubmit: (value: string) => void;
+  handleRetryLastPrompt: () => void;
   handleClearScreen: () => void;
   // Welcome back dialog
   handleWelcomeBackSelection: (choice: 'continue' | 'restart') => void;
diff --git a/packages/cli/src/ui/hooks/useCodingPlanUpdates.test.ts b/packages/cli/src/ui/hooks/useCodingPlanUpdates.test.ts
index 3ddaf42e6..bcb5bce33 100644
--- a/packages/cli/src/ui/hooks/useCodingPlanUpdates.test.ts
+++ b/packages/cli/src/ui/hooks/useCodingPlanUpdates.test.ts
@@ -112,7 +112,7 @@ describe('useCodingPlanUpdates', () => {
 
       // Should prompt for China region since it defaults to China
       expect(result.current.codingPlanUpdateRequest?.prompt).toContain(
-        chinaConfig.regionName,
+        'Alibaba Cloud Coding Plan',
       );
     });
 
@@ -135,7 +135,7 @@ describe('useCodingPlanUpdates', () => {
       });
 
       expect(result.current.codingPlanUpdateRequest?.prompt).toContain(
-        chinaConfig.regionName,
+        'Alibaba Cloud Coding Plan',
       );
     });
 
@@ -158,7 +158,7 @@ describe('useCodingPlanUpdates', () => {
       });
 
       expect(result.current.codingPlanUpdateRequest?.prompt).toContain(
-        globalConfig.regionName,
+        'Alibaba Cloud Coding Plan',
       );
     });
   });
@@ -228,7 +228,7 @@ describe('useCodingPlanUpdates', () => {
       expect(mockAddItem).toHaveBeenCalledWith(
         expect.objectContaining({
           type: 'info',
-          text: expect.stringContaining(chinaConfig.regionName),
+          text: expect.stringContaining('Alibaba Cloud Coding Plan'),
         }),
         expect.any(Number),
       );
@@ -297,7 +297,7 @@ describe('useCodingPlanUpdates', () => {
       expect(mockAddItem).toHaveBeenCalledWith(
         expect.objectContaining({
           type: 'info',
-          text: expect.stringContaining(globalConfig.regionName),
+          text: expect.stringContaining('Alibaba Cloud Coding Plan'),
         }),
         expect.any(Number),
       );
diff --git a/packages/cli/src/ui/hooks/useCodingPlanUpdates.ts b/packages/cli/src/ui/hooks/useCodingPlanUpdates.ts
index dee70e035..138498abf 100644
--- a/packages/cli/src/ui/hooks/useCodingPlanUpdates.ts
+++ b/packages/cli/src/ui/hooks/useCodingPlanUpdates.ts
@@ -68,7 +68,7 @@ export function useCodingPlanUpdates(
         );
 
         // Get the configuration for the current region
-        const { template, version, regionName } = getCodingPlanConfig(region);
+        const { template, version } = getCodingPlanConfig(region);
 
         // Generate new configs from template
         const newConfigs = template.map((templateConfig) => ({
@@ -117,7 +117,7 @@ export function useCodingPlanUpdates(
             type: 'info',
             text: t(
               '{{region}} configuration updated successfully. Model switched to "{{model}}".',
-              { region: regionName, model: activeModel },
+              { region: t('Alibaba Cloud Coding Plan'), model: activeModel },
             ),
           },
           Date.now(),
@@ -170,11 +170,10 @@ export function useCodingPlanUpdates(
 
     // Check if version matches
     if (savedVersion !== currentVersion) {
-      const { regionName } = getCodingPlanConfig(region);
       setUpdateRequest({
         prompt: t(
           'New model configurations are available for {{region}}. Update now?',
-          { region: regionName },
+          { region: t('Alibaba Cloud Coding Plan') },
         ),
         onConfirm: async (confirmed: boolean) => {
           setUpdateRequest(undefined);
diff --git a/packages/cli/src/ui/hooks/useGeminiStream.test.tsx b/packages/cli/src/ui/hooks/useGeminiStream.test.tsx
index e855eefc3..42f28f5e2 100644
--- a/packages/cli/src/ui/hooks/useGeminiStream.test.tsx
+++ b/packages/cli/src/ui/hooks/useGeminiStream.test.tsx
@@ -2304,40 +2304,30 @@ describe('useGeminiStream', () => {
           result.current.pendingHistoryItems.find(
             (item) => item.type === MessageType.ERROR,
           );
-        const findCountdownItem = () =>
-          result.current.pendingHistoryItems.find(
-            (item) => item.type === 'retry_countdown',
-          );
 
         let errorItem = findErrorItem();
-        let countdownItem = findCountdownItem();
-        for (
-          let attempts = 0;
-          attempts < 5 && (!errorItem || !countdownItem);
-          attempts++
-        ) {
+        for (let attempts = 0; attempts < 5 && !errorItem; attempts++) {
           await act(async () => {
             await Promise.resolve();
           });
           errorItem = findErrorItem();
-          countdownItem = findCountdownItem();
         }
 
-        // Error line should be rendered as ERROR type (wrapped by parseAndFormatApiError)
+        // Error item should contain the error text and a retry hint
         expect(errorItem?.text).toContain('Rate limit exceeded');
-
-        // Countdown line should be rendered as retry_countdown type
-        expect(countdownItem?.text).toContain('Retrying in 3 seconds');
+        // Countdown hint should be inline on the error item (not a separate item)
+        expect((errorItem as { hint?: string })?.hint).toContain('3s');
+        expect((errorItem as { hint?: string })?.hint).toContain('attempt 1/3');
 
         await act(async () => {
           await vi.advanceTimersByTimeAsync(1000);
         });
 
-        const countdownAfterOneSecond = result.current.pendingHistoryItems.find(
-          (item) => item.type === 'retry_countdown',
+        const errorAfterOneSecond = result.current.pendingHistoryItems.find(
+          (item) => item.type === MessageType.ERROR,
         );
-        expect(countdownAfterOneSecond?.text).toContain(
-          'Retrying in 2 seconds',
+        expect((errorAfterOneSecond as { hint?: string })?.hint).toContain(
+          '2s',
         );
 
         resolveStream?.();
@@ -2347,15 +2337,11 @@ describe('useGeminiStream', () => {
           await vi.runAllTimersAsync();
         });
 
-        // Both error and countdown should be cleared after retry succeeds
+        // Error item (with hint) should be cleared after retry succeeds
         const remainingError = result.current.pendingHistoryItems.find(
           (item) => item.type === MessageType.ERROR,
         );
-        const remainingCountdown = result.current.pendingHistoryItems.find(
-          (item) => item.type === 'retry_countdown',
-        );
         expect(remainingError).toBeUndefined();
-        expect(remainingCountdown).toBeUndefined();
       } finally {
         vi.useRealTimers();
       }
@@ -2525,14 +2511,13 @@ describe('useGeminiStream', () => {
         await result.current.submitQuery('Test query');
       });
 
-      // Verify error message was added
+      // Verify error message appears in pending history items (not via addItem,
+      // since errors with retry hints are now stored as pending items)
       await waitFor(() => {
-        expect(mockAddItem).toHaveBeenCalledWith(
-          expect.objectContaining({
-            type: 'error',
-          }),
-          expect.any(Number),
+        const errorItem = result.current.pendingHistoryItems.find(
+          (item) => item.type === 'error',
         );
+        expect(errorItem).toBeDefined();
       });
 
       // Verify parseAndFormatApiError was called
diff --git a/packages/cli/src/ui/hooks/useGeminiStream.ts b/packages/cli/src/ui/hooks/useGeminiStream.ts
index f3b3208e6..dd3dc060f 100644
--- a/packages/cli/src/ui/hooks/useGeminiStream.ts
+++ b/packages/cli/src/ui/hooks/useGeminiStream.ts
@@ -169,12 +169,17 @@ export const useGeminiStream = (
   const abortControllerRef = useRef<AbortController | null>(null);
   const turnCancelledRef = useRef(false);
   const isSubmittingQueryRef = useRef(false);
+  const lastPromptRef = useRef<PartListUnion | null>(null);
+  const lastPromptErroredRef = useRef(false);
   const [isResponding, setIsResponding] = useState<boolean>(false);
   const [thought, setThought] = useState<ThoughtSummary | null>(null);
   const [pendingHistoryItem, pendingHistoryItemRef, setPendingHistoryItem] =
     useStateAndRef<HistoryItemWithoutId | null>(null);
-  const [pendingRetryErrorItem, setPendingRetryErrorItem] =
-    useState<HistoryItemWithoutId | null>(null);
+  const [
+    pendingRetryErrorItem,
+    pendingRetryErrorItemRef,
+    setPendingRetryErrorItem,
+  ] = useStateAndRef<HistoryItemWithoutId | null>(null);
   const [
     pendingRetryCountdownItem,
     pendingRetryCountdownItemRef,
@@ -254,11 +259,18 @@ export const useGeminiStream = (
     }
   }, []);
 
+  /**
+   * Clears the retry countdown timer and pending retry items.
+   */
   const clearRetryCountdown = useCallback(() => {
     stopRetryCountdownTimer();
     setPendingRetryErrorItem(null);
     setPendingRetryCountdownItem(null);
-  }, [setPendingRetryCountdownItem, stopRetryCountdownTimer]);
+  }, [
+    setPendingRetryErrorItem,
+    setPendingRetryCountdownItem,
+    stopRetryCountdownTimer,
+  ]);
 
   const startRetryCountdown = useCallback(
     (retryInfo: {
@@ -273,18 +285,21 @@ export const useGeminiStream = (
       const retryReasonText =
         message ?? t('Rate limit exceeded. Please wait and try again.');
 
-      // Error line stays static (red with ✕ prefix)
-      setPendingRetryErrorItem({
-        type: MessageType.ERROR,
-        text: retryReasonText,
-      });
-
       // Countdown line updates every second (dim/secondary color)
       const updateCountdown = () => {
         const elapsedMs = Date.now() - startTime;
         const remainingMs = Math.max(0, delayMs - elapsedMs);
         const remainingSec = Math.ceil(remainingMs / 1000);
 
+        // Update error item with hint containing countdown info (short format)
+        const hintText = `Retrying in ${remainingSec}s… (attempt ${attempt}/${maxRetries})`;
+
+        setPendingRetryErrorItem({
+          type: MessageType.ERROR,
+          text: retryReasonText,
+          hint: hintText,
+        });
+
         setPendingRetryCountdownItem({
           type: 'retry_countdown',
           text: t(
@@ -305,7 +320,11 @@ export const useGeminiStream = (
       updateCountdown();
       retryCountdownTimerRef.current = setInterval(updateCountdown, 1000);
     },
-    [setPendingRetryCountdownItem, stopRetryCountdownTimer],
+    [
+      setPendingRetryErrorItem,
+      setPendingRetryCountdownItem,
+      stopRetryCountdownTimer,
+    ],
   );
 
   useEffect(() => () => stopRetryCountdownTimer(), [stopRetryCountdownTimer]);
@@ -693,6 +712,7 @@ export const useGeminiStream = (
         return;
       }
 
+      lastPromptErroredRef.current = false;
       if (pendingHistoryItemRef.current) {
         if (pendingHistoryItemRef.current.type === 'tool_group') {
           const updatedTools = pendingHistoryItemRef.current.tools.map(
@@ -732,27 +752,36 @@ export const useGeminiStream = (
 
   const handleErrorEvent = useCallback(
     (eventValue: GeminiErrorEventValue, userMessageTimestamp: number) => {
+      lastPromptErroredRef.current = true;
       if (pendingHistoryItemRef.current) {
         addItem(pendingHistoryItemRef.current, userMessageTimestamp);
         setPendingHistoryItem(null);
       }
-      addItem(
-        {
-          type: MessageType.ERROR,
+      // Only show Ctrl+Y hint if not already showing an auto-retry countdown
+      // (auto-retry countdown is shown when retryCountdownTimerRef is active)
+      const isShowingAutoRetry = retryCountdownTimerRef.current !== null;
+      clearRetryCountdown();
+      if (!isShowingAutoRetry) {
+        const retryHint = t('Press Ctrl+Y to retry');
+        // Store error with hint as a pending item (not in history).
+        // This allows the hint to be removed when the user retries with Ctrl+Y,
+        // since pending items are in the dynamic rendering area (not <Static>).
+        setPendingRetryErrorItem({
+          type: 'error' as const,
           text: parseAndFormatApiError(
             eventValue.error,
             config.getContentGeneratorConfig()?.authType,
           ),
-        },
-        userMessageTimestamp,
-      );
-      clearRetryCountdown();
+          hint: retryHint,
+        });
+      }
       setThought(null); // Reset thought when there's an error
     },
     [
       addItem,
       pendingHistoryItemRef,
       setPendingHistoryItem,
+      setPendingRetryErrorItem,
       config,
       setThought,
       clearRetryCountdown,
@@ -816,7 +845,10 @@ export const useGeminiStream = (
           userMessageTimestamp,
         );
       }
-      clearRetryCountdown();
+      // Only clear auto-retry countdown errors (those with active timer)
+      if (retryCountdownTimerRef.current) {
+        clearRetryCountdown();
+      }
     },
     [addItem, clearRetryCountdown],
   );
@@ -1032,7 +1064,7 @@ export const useGeminiStream = (
   const submitQuery = useCallback(
     async (
       query: PartListUnion,
-      options?: { isContinuation: boolean },
+      options?: { isContinuation: boolean; skipPreparation?: boolean },
       prompt_id?: string,
     ) => {
       // Prevent concurrent executions of submitQuery, but allow continuations
@@ -1056,7 +1088,11 @@ export const useGeminiStream = (
       // Reset quota error flag when starting a new query (not a continuation)
       if (!options?.isContinuation) {
         setModelSwitchedFromQuotaError(false);
-        // No quota-error / fallback routing mechanism currently; keep state minimal.
+        // Commit any pending retry error to history (without hint) since the
+        // user is starting a new conversation turn
+        if (pendingRetryCountdownItemRef.current) {
+          clearRetryCountdown();
+        }
       }
 
       abortControllerRef.current = new AbortController();
@@ -1068,12 +1104,14 @@ export const useGeminiStream = (
       }
 
       return promptIdContext.run(prompt_id, async () => {
-        const { queryToSend, shouldProceed } = await prepareQueryForGemini(
-          query,
-          userMessageTimestamp,
-          abortSignal,
-          prompt_id!,
-        );
+        const { queryToSend, shouldProceed } = options?.skipPreparation
+          ? { queryToSend: query, shouldProceed: true }
+          : await prepareQueryForGemini(
+              query,
+              userMessageTimestamp,
+              abortSignal,
+              prompt_id!,
+            );
 
         if (!shouldProceed || queryToSend === null) {
           isSubmittingQueryRef.current = false;
@@ -1095,6 +1133,8 @@ export const useGeminiStream = (
         }
 
         const finalQueryToSend = queryToSend;
+        lastPromptRef.current = finalQueryToSend;
+        lastPromptErroredRef.current = false;
 
         if (!options?.isContinuation) {
           // trigger new prompt event for session stats in CLI
@@ -1143,6 +1183,12 @@ export const useGeminiStream = (
             addItem(pendingHistoryItemRef.current, userMessageTimestamp);
             setPendingHistoryItem(null);
           }
+          // Only clear auto-retry countdown errors (those with an active timer).
+          // Do NOT clear static error+hint from handleErrorEvent — those should
+          // remain visible until the user presses Ctrl+Y to retry.
+          if (retryCountdownTimerRef.current) {
+            clearRetryCountdown();
+          }
           if (loopDetectedRef.current) {
             loopDetectedRef.current = false;
             handleLoopDetectedEvent();
@@ -1151,16 +1197,17 @@ export const useGeminiStream = (
           if (error instanceof UnauthorizedError) {
             onAuthError('Session expired or is unauthorized.');
           } else if (!isNodeError(error) || error.name !== 'AbortError') {
-            addItem(
-              {
-                type: MessageType.ERROR,
-                text: parseAndFormatApiError(
-                  getErrorMessage(error) || 'Unknown error',
-                  config.getContentGeneratorConfig()?.authType,
-                ),
-              },
-              userMessageTimestamp,
-            );
+            lastPromptErroredRef.current = true;
+            const retryHint = t('Press Ctrl+Y to retry');
+            // Store error with hint as a pending item (same as handleErrorEvent)
+            setPendingRetryErrorItem({
+              type: 'error' as const,
+              text: parseAndFormatApiError(
+                getErrorMessage(error) || 'Unknown error',
+                config.getContentGeneratorConfig()?.authType,
+              ),
+              hint: retryHint,
+            });
           }
         } finally {
           setIsResponding(false);
@@ -1183,9 +1230,71 @@ export const useGeminiStream = (
       startNewPrompt,
       getPromptCount,
       handleLoopDetectedEvent,
+      clearRetryCountdown,
+      pendingRetryCountdownItemRef,
+      setPendingRetryErrorItem,
     ],
   );
 
+  /**
+   * Retries the last failed prompt when the user presses Ctrl+Y.
+   *
+   * Activation conditions for Ctrl+Y shortcut:
+   * 1. ✅ The last request must have failed (lastPromptErroredRef.current === true)
+   * 2. ✅ Current streaming state must NOT be "Responding" (avoid interrupting ongoing stream)
+   * 3. ✅ Current streaming state must NOT be "WaitingForConfirmation" (avoid conflicting with tool confirmation flow)
+   * 4. ✅ There must be a stored lastPrompt in lastPromptRef.current
+   *
+   * When conditions are not met:
+   * - If streaming is active (Responding/WaitingForConfirmation): silently return without action
+   * - If no failed request exists: display "No failed request to retry." info message
+   *
+   * When conditions are met:
+   * - Clears any pending auto-retry countdown to avoid duplicate retries
+   * - Re-submits the last query with skipPreparation: true for faster retry
+   *
+   * This function is exposed via UIActionsContext and triggered by InputPrompt
+   * when the user presses Ctrl+Y (bound to Command.RETRY_LAST in keyBindings.ts).
+   */
+  const retryLastPrompt = useCallback(async () => {
+    if (
+      streamingState === StreamingState.Responding ||
+      streamingState === StreamingState.WaitingForConfirmation
+    ) {
+      return;
+    }
+
+    const lastPrompt = lastPromptRef.current;
+    if (!lastPrompt || !lastPromptErroredRef.current) {
+      addItem(
+        {
+          type: MessageType.INFO,
+          text: t('No failed request to retry.'),
+        },
+        Date.now(),
+      );
+      return;
+    }
+
+    // Commit the error to history (without hint) before clearing
+    const errorItem = pendingRetryErrorItemRef.current;
+    if (errorItem) {
+      addItem({ type: errorItem.type, text: errorItem.text }, Date.now());
+    }
+    clearRetryCountdown();
+
+    await submitQuery(lastPrompt, {
+      isContinuation: false,
+      skipPreparation: true,
+    });
+  }, [
+    streamingState,
+    addItem,
+    clearRetryCountdown,
+    submitQuery,
+    pendingRetryErrorItemRef,
+  ]);
+
   const handleApprovalModeChange = useCallback(
     async (newApprovalMode: ApprovalMode) => {
       // Auto-approve pending tool calls when switching to auto-approval modes
@@ -1489,6 +1598,7 @@ export const useGeminiStream = (
     pendingHistoryItems,
     thought,
     cancelOngoingRequest,
+    retryLastPrompt,
     pendingToolCalls: toolCalls,
     handleApprovalModeChange,
     activePtyId,
diff --git a/packages/cli/src/ui/keyMatchers.test.ts b/packages/cli/src/ui/keyMatchers.test.ts
index 15d45fdab..8961f9ff7 100644
--- a/packages/cli/src/ui/keyMatchers.test.ts
+++ b/packages/cli/src/ui/keyMatchers.test.ts
@@ -59,6 +59,7 @@ describe('keyMatchers', () => {
     [Command.QUIT]: (key: Key) => key.ctrl && key.name === 'c',
     [Command.EXIT]: (key: Key) => key.ctrl && key.name === 'd',
     [Command.SHOW_MORE_LINES]: (key: Key) => key.ctrl && key.name === 's',
+    [Command.RETRY_LAST]: (key: Key) => key.ctrl && key.name === 'y',
     [Command.REVERSE_SEARCH]: (key: Key) => key.ctrl && key.name === 'r',
     [Command.SUBMIT_REVERSE_SEARCH]: (key: Key) =>
       key.name === 'return' && !key.ctrl,
@@ -252,6 +253,11 @@ describe('keyMatchers', () => {
       positive: [createKey('s', { ctrl: true })],
       negative: [createKey('s'), createKey('l', { ctrl: true })],
     },
+    {
+      command: Command.RETRY_LAST,
+      positive: [createKey('y', { ctrl: true })],
+      negative: [createKey('y'), createKey('r', { ctrl: true })],
+    },
 
     // Shell commands
     {
diff --git a/packages/cli/src/ui/types.ts b/packages/cli/src/ui/types.ts
index b2e86de62..d2483f371 100644
--- a/packages/cli/src/ui/types.ts
+++ b/packages/cli/src/ui/types.ts
@@ -121,6 +121,7 @@ export type HistoryItemInfo = HistoryItemBase & {
 export type HistoryItemError = HistoryItemBase & {
   type: 'error';
   text: string;
+  hint?: string; // Optional inline hint (e.g., retry countdown) displayed in secondary color
 };
 
 export type HistoryItemWarning = HistoryItemBase & {
diff --git a/packages/cli/src/utils/languageUtils.test.ts b/packages/cli/src/utils/languageUtils.test.ts
index a0f0ca717..7081f0c94 100644
--- a/packages/cli/src/utils/languageUtils.test.ts
+++ b/packages/cli/src/utils/languageUtils.test.ts
@@ -380,4 +380,62 @@ describe('languageUtils', () => {
       expect(fs.writeFileSync).not.toHaveBeenCalled();
     });
   });
+
+  describe('output-language.md path resolution priority', () => {
+    it('should prefer project-level path over global path', () => {
+      const projectPath = '/project/.qwen/output-language.md';
+      const globalPath = '/mock/home/.qwen/output-language.md';
+
+      vi.mocked(fs.existsSync).mockImplementation((p) => {
+        if (p.toString() === projectPath) return true;
+        if (p.toString() === globalPath) return true;
+        return false;
+      });
+
+      let resolvedPath: string | undefined;
+      if (fs.existsSync(projectPath)) {
+        resolvedPath = projectPath;
+      } else if (fs.existsSync(globalPath)) {
+        resolvedPath = globalPath;
+      }
+
+      expect(resolvedPath).toBe(projectPath);
+    });
+
+    it('should fall back to global path when project-level does not exist', () => {
+      const projectPath = '/project/.qwen/output-language.md';
+      const globalPath = '/mock/home/.qwen/output-language.md';
+
+      vi.mocked(fs.existsSync).mockImplementation((p) => {
+        if (p.toString() === projectPath) return false;
+        if (p.toString() === globalPath) return true;
+        return false;
+      });
+
+      let resolvedPath: string | undefined;
+      if (fs.existsSync(projectPath)) {
+        resolvedPath = projectPath;
+      } else if (fs.existsSync(globalPath)) {
+        resolvedPath = globalPath;
+      }
+
+      expect(resolvedPath).toBe(globalPath);
+    });
+
+    it('should return undefined when neither path exists', () => {
+      const projectPath = '/project/.qwen/output-language.md';
+      const globalPath = '/mock/home/.qwen/output-language.md';
+
+      vi.mocked(fs.existsSync).mockReturnValue(false);
+
+      let resolvedPath: string | undefined;
+      if (fs.existsSync(projectPath)) {
+        resolvedPath = projectPath;
+      } else if (fs.existsSync(globalPath)) {
+        resolvedPath = globalPath;
+      }
+
+      expect(resolvedPath).toBeUndefined();
+    });
+  });
 });
diff --git a/packages/core/package.json b/packages/core/package.json
index 62be09b95..91dd7709b 100644
--- a/packages/core/package.json
+++ b/packages/core/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/qwen-code-core",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "description": "Qwen Code Core",
   "repository": {
     "type": "git",
diff --git a/packages/core/src/core/contentGenerator.ts b/packages/core/src/core/contentGenerator.ts
index f3af06bda..d809193d7 100644
--- a/packages/core/src/core/contentGenerator.ts
+++ b/packages/core/src/core/contentGenerator.ts
@@ -60,6 +60,17 @@ export enum AuthType {
   USE_ANTHROPIC = 'anthropic',
 }
 
+/**
+ * Supported input modalities for a model.
+ * Omitted or false fields mean the model does not support that input type.
+ */
+export type InputModalities = {
+  image?: boolean;
+  pdf?: boolean;
+  audio?: boolean;
+  video?: boolean;
+};
+
 export type ContentGeneratorConfig = {
   model: string;
   apiKey?: string;
@@ -70,7 +81,8 @@ export type ContentGeneratorConfig = {
   enableOpenAILogging?: boolean;
   openAILoggingDir?: string;
   timeout?: number; // Timeout configuration in milliseconds
-  maxRetries?: number; // Maximum retries for failed requests
+  maxRetries?: number; // Maximum retries for rate-limit errors
+  retryErrorCodes?: number[]; // Additional error codes that trigger rate-limit retry
   enableCacheControl?: boolean; // Enable cache control for DashScope providers
   samplingParams?: {
     top_p?: number;
@@ -98,6 +110,9 @@ export type ContentGeneratorConfig = {
   customHeaders?: Record<string, string>;
   // Extra body parameters to be merged into the request body
   extra_body?: Record<string, unknown>;
+  // Supported input modalities. Unsupported media types are replaced with text
+  // placeholders. Leave undefined to use automatic detection from model name.
+  modalities?: InputModalities;
 };
 
 // Keep the public ContentGeneratorConfigSources API, but reuse the generic
diff --git a/packages/core/src/core/coreToolScheduler.test.ts b/packages/core/src/core/coreToolScheduler.test.ts
index 4a19aec2f..1f810430f 100644
--- a/packages/core/src/core/coreToolScheduler.test.ts
+++ b/packages/core/src/core/coreToolScheduler.test.ts
@@ -1859,6 +1859,175 @@ describe('CoreToolScheduler request queueing', () => {
   });
 });
 
+describe('CoreToolScheduler truncated output protection', () => {
+  function createTruncationTestScheduler(
+    tool: TestApprovalTool | MockTool,
+    toolNames: string[],
+  ) {
+    const onAllToolCallsComplete = vi.fn();
+    const onToolCallsUpdate = vi.fn();
+
+    const mockToolRegistry = {
+      getTool: () => tool,
+      getAllToolNames: () => toolNames,
+      getFunctionDeclarations: () => [],
+      tools: new Map(),
+    } as unknown as ToolRegistry;
+
+    const mockConfig = {
+      getSessionId: () => 'test-session-id',
+      getUsageStatisticsEnabled: () => true,
+      getDebugMode: () => false,
+      getApprovalMode: () => ApprovalMode.AUTO_EDIT,
+      getAllowedTools: () => [],
+      getExcludeTools: () => undefined,
+      getContentGeneratorConfig: () => ({
+        model: 'test-model',
+        authType: 'gemini',
+      }),
+      getShellExecutionConfig: () => ({
+        terminalWidth: 90,
+        terminalHeight: 30,
+      }),
+      storage: {
+        getProjectTempDir: () => '/tmp',
+      },
+      getTruncateToolOutputThreshold: () =>
+        DEFAULT_TRUNCATE_TOOL_OUTPUT_THRESHOLD,
+      getTruncateToolOutputLines: () => DEFAULT_TRUNCATE_TOOL_OUTPUT_LINES,
+      getToolRegistry: () => mockToolRegistry,
+      getUseModelRouter: () => false,
+      getGeminiClient: () => null,
+      getChatRecordingService: () => undefined,
+      isInteractive: () => true,
+    } as unknown as Config;
+
+    const scheduler = new CoreToolScheduler({
+      config: mockConfig,
+      onAllToolCallsComplete,
+      onToolCallsUpdate,
+      getPreferredEditor: () => 'vscode',
+      onEditorClose: vi.fn(),
+    });
+
+    return { scheduler, onAllToolCallsComplete };
+  }
+
+  it('should reject Kind.Edit tool calls when wasOutputTruncated is true', async () => {
+    const declarativeTool = new TestApprovalTool({
+      getApprovalMode: () => ApprovalMode.AUTO_EDIT,
+    } as unknown as Config);
+    const { scheduler, onAllToolCallsComplete } = createTruncationTestScheduler(
+      declarativeTool,
+      [TestApprovalTool.Name],
+    );
+
+    await scheduler.schedule(
+      [
+        {
+          callId: '1',
+          name: TestApprovalTool.Name,
+          args: { id: 'test-truncated' },
+          isClientInitiated: false,
+          prompt_id: 'prompt-id-truncated',
+          wasOutputTruncated: true,
+        },
+      ],
+      new AbortController().signal,
+    );
+
+    await vi.waitFor(() => {
+      expect(onAllToolCallsComplete).toHaveBeenCalled();
+    });
+
+    const completedCalls = onAllToolCallsComplete.mock
+      .calls[0][0] as ToolCall[];
+    expect(completedCalls).toHaveLength(1);
+    const completedCall = completedCalls[0];
+    expect(completedCall.status).toBe('error');
+
+    if (completedCall.status === 'error') {
+      const errorMessage = completedCall.response.error?.message;
+      expect(errorMessage).toContain('truncated due to max_tokens limit');
+      expect(errorMessage).toContain(
+        'rejected to prevent writing truncated content',
+      );
+    }
+  });
+
+  it('should allow Kind.Edit tool calls when wasOutputTruncated is false', async () => {
+    const declarativeTool = new TestApprovalTool({
+      getApprovalMode: () => ApprovalMode.AUTO_EDIT,
+    } as unknown as Config);
+    const { scheduler, onAllToolCallsComplete } = createTruncationTestScheduler(
+      declarativeTool,
+      [TestApprovalTool.Name],
+    );
+
+    await scheduler.schedule(
+      [
+        {
+          callId: '1',
+          name: TestApprovalTool.Name,
+          args: { id: 'test-normal' },
+          isClientInitiated: false,
+          prompt_id: 'prompt-id-normal',
+          wasOutputTruncated: false,
+        },
+      ],
+      new AbortController().signal,
+    );
+
+    await vi.waitFor(() => {
+      expect(onAllToolCallsComplete).toHaveBeenCalled();
+    });
+
+    const completedCalls = onAllToolCallsComplete.mock
+      .calls[0][0] as ToolCall[];
+    expect(completedCalls).toHaveLength(1);
+    // Should succeed (not error) since wasOutputTruncated is false
+    expect(completedCalls[0].status).toBe('success');
+  });
+
+  it('should allow non-Edit tools when wasOutputTruncated is true', async () => {
+    const mockTool = new MockTool({
+      name: 'mockReadTool',
+      execute: async () => ({
+        llmContent: 'read result',
+        returnDisplay: 'read result',
+      }),
+    });
+    const { scheduler, onAllToolCallsComplete } = createTruncationTestScheduler(
+      mockTool,
+      ['mockReadTool'],
+    );
+
+    await scheduler.schedule(
+      [
+        {
+          callId: '1',
+          name: 'mockReadTool',
+          args: {},
+          isClientInitiated: false,
+          prompt_id: 'prompt-id-read-truncated',
+          wasOutputTruncated: true,
+        },
+      ],
+      new AbortController().signal,
+    );
+
+    await vi.waitFor(() => {
+      expect(onAllToolCallsComplete).toHaveBeenCalled();
+    });
+
+    const completedCalls = onAllToolCallsComplete.mock
+      .calls[0][0] as ToolCall[];
+    expect(completedCalls).toHaveLength(1);
+    // Non-Edit tools should still execute even when output was truncated
+    expect(completedCalls[0].status).toBe('success');
+  });
+});
+
 describe('CoreToolScheduler Sequential Execution', () => {
   it('should execute tool calls in a batch sequentially', async () => {
     // Arrange
diff --git a/packages/core/src/core/coreToolScheduler.ts b/packages/core/src/core/coreToolScheduler.ts
index fc0455a8a..3cdc8232f 100644
--- a/packages/core/src/core/coreToolScheduler.ts
+++ b/packages/core/src/core/coreToolScheduler.ts
@@ -32,6 +32,7 @@ import {
   logToolOutputTruncated,
   ToolOutputTruncatedEvent,
   InputFormat,
+  Kind,
   SkillTool,
 } from '../index.js';
 import type {
@@ -55,6 +56,23 @@ import levenshtein from 'fast-levenshtein';
 import { getPlanModeSystemReminder } from './prompts.js';
 import { ShellToolInvocation } from '../tools/shell.js';
 
+const TRUNCATION_PARAM_GUIDANCE =
+  'Note: Your previous response was truncated due to max_tokens limit, ' +
+  'which likely caused incomplete tool call parameters. ' +
+  'Please retry the tool call with complete parameters. ' +
+  'If the content is too large for a single response, ' +
+  'consider splitting it into smaller parts.';
+
+const TRUNCATION_EDIT_REJECTION =
+  'Your previous response was truncated due to max_tokens limit, ' +
+  'which likely produced incomplete file content. ' +
+  'The tool call has been rejected to prevent writing ' +
+  'truncated content to the file. ' +
+  'Please retry the tool call with complete content. ' +
+  'If the content is too large for a single response, ' +
+  'consider splitting it into smaller parts ' +
+  '(e.g., write_file for initial content, then edit for additions).';
+
 export type ValidatingToolCall = {
   status: 'validating';
   request: ToolCallRequestInfo;
@@ -773,19 +791,41 @@ export class CoreToolScheduler {
             reqInfo.args,
           );
           if (invocationOrError instanceof Error) {
+            const error = reqInfo.wasOutputTruncated
+              ? new Error(
+                  `${invocationOrError.message} ${TRUNCATION_PARAM_GUIDANCE}`,
+                )
+              : invocationOrError;
             return {
               status: 'error',
               request: reqInfo,
               tool: toolInstance,
               response: createErrorResponse(
                 reqInfo,
-                invocationOrError,
+                error,
                 ToolErrorType.INVALID_TOOL_PARAMS,
               ),
               durationMs: 0,
             };
           }
 
+          // Reject file-modifying calls when truncated to prevent
+          // writing incomplete content.
+          if (reqInfo.wasOutputTruncated && toolInstance.kind === Kind.Edit) {
+            const truncationError = new Error(TRUNCATION_EDIT_REJECTION);
+            return {
+              status: 'error',
+              request: reqInfo,
+              tool: toolInstance,
+              response: createErrorResponse(
+                reqInfo,
+                truncationError,
+                ToolErrorType.OUTPUT_TRUNCATED,
+              ),
+              durationMs: 0,
+            };
+          }
+
           return {
             status: 'validating',
             request: reqInfo,
@@ -1089,156 +1129,156 @@ export class CoreToolScheduler {
       );
 
       for (const toolCall of callsToExecute) {
-        if (toolCall.status !== 'scheduled') continue;
+        await this.executeSingleToolCall(toolCall, signal);
+      }
+    }
+  }
 
-        const scheduledCall = toolCall;
-        const { callId, name: toolName } = scheduledCall.request;
-        const invocation = scheduledCall.invocation;
-        this.setStatusInternal(callId, 'executing');
+  private async executeSingleToolCall(
+    toolCall: ToolCall,
+    signal: AbortSignal,
+  ): Promise<void> {
+    if (toolCall.status !== 'scheduled') return;
 
-        const liveOutputCallback = scheduledCall.tool.canUpdateOutput
-          ? (outputChunk: ToolResultDisplay) => {
-              if (this.outputUpdateHandler) {
-                this.outputUpdateHandler(callId, outputChunk);
-              }
-              this.toolCalls = this.toolCalls.map((tc) =>
-                tc.request.callId === callId && tc.status === 'executing'
-                  ? { ...tc, liveOutput: outputChunk }
-                  : tc,
-              );
-              this.notifyToolCallsUpdate();
-            }
-          : undefined;
+    const scheduledCall = toolCall;
+    const { callId, name: toolName } = scheduledCall.request;
+    const invocation = scheduledCall.invocation;
+    this.setStatusInternal(callId, 'executing');
 
-        const shellExecutionConfig = this.config.getShellExecutionConfig();
-
-        // TODO: Refactor to remove special casing for ShellToolInvocation.
-        // Introduce a generic callbacks object for the execute method to handle
-        // things like `onPid` and `onLiveOutput`. This will make the scheduler
-        // agnostic to the invocation type.
-        let promise: Promise<ToolResult>;
-        if (invocation instanceof ShellToolInvocation) {
-          const setPidCallback = (pid: number) => {
-            this.toolCalls = this.toolCalls.map((tc) =>
-              tc.request.callId === callId && tc.status === 'executing'
-                ? { ...tc, pid }
-                : tc,
-            );
-            this.notifyToolCallsUpdate();
-          };
-          promise = invocation.execute(
-            signal,
-            liveOutputCallback,
-            shellExecutionConfig,
-            setPidCallback,
-          );
-        } else {
-          promise = invocation.execute(
-            signal,
-            liveOutputCallback,
-            shellExecutionConfig,
-          );
-        }
-
-        try {
-          const toolResult: ToolResult = await promise;
-          if (signal.aborted) {
-            this.setStatusInternal(
-              callId,
-              'cancelled',
-              'User cancelled tool execution.',
-            );
-            continue;
+    const liveOutputCallback = scheduledCall.tool.canUpdateOutput
+      ? (outputChunk: ToolResultDisplay) => {
+          if (this.outputUpdateHandler) {
+            this.outputUpdateHandler(callId, outputChunk);
           }
+          this.toolCalls = this.toolCalls.map((tc) =>
+            tc.request.callId === callId && tc.status === 'executing'
+              ? { ...tc, liveOutput: outputChunk }
+              : tc,
+          );
+          this.notifyToolCallsUpdate();
+        }
+      : undefined;
 
-          if (toolResult.error === undefined) {
-            let content = toolResult.llmContent;
-            let outputFile: string | undefined = undefined;
-            const contentLength =
-              typeof content === 'string' ? content.length : undefined;
-            if (
-              typeof content === 'string' &&
-              toolName === ShellTool.Name &&
-              this.config.getEnableToolOutputTruncation() &&
-              this.config.getTruncateToolOutputThreshold() > 0 &&
-              this.config.getTruncateToolOutputLines() > 0
-            ) {
-              const originalContentLength = content.length;
-              const threshold = this.config.getTruncateToolOutputThreshold();
-              const lines = this.config.getTruncateToolOutputLines();
-              const truncatedResult = await truncateAndSaveToFile(
-                content,
-                callId,
-                this.config.storage.getProjectTempDir(),
+    const shellExecutionConfig = this.config.getShellExecutionConfig();
+
+    // TODO: Refactor to remove special casing for ShellToolInvocation.
+    // Introduce a generic callbacks object for the execute method to handle
+    // things like `onPid` and `onLiveOutput`. This will make the scheduler
+    // agnostic to the invocation type.
+    let promise: Promise<ToolResult>;
+    if (invocation instanceof ShellToolInvocation) {
+      const setPidCallback = (pid: number) => {
+        this.toolCalls = this.toolCalls.map((tc) =>
+          tc.request.callId === callId && tc.status === 'executing'
+            ? { ...tc, pid }
+            : tc,
+        );
+        this.notifyToolCallsUpdate();
+      };
+      promise = invocation.execute(
+        signal,
+        liveOutputCallback,
+        shellExecutionConfig,
+        setPidCallback,
+      );
+    } else {
+      promise = invocation.execute(
+        signal,
+        liveOutputCallback,
+        shellExecutionConfig,
+      );
+    }
+
+    try {
+      const toolResult: ToolResult = await promise;
+      if (signal.aborted) {
+        this.setStatusInternal(
+          callId,
+          'cancelled',
+          'User cancelled tool execution.',
+        );
+        return;
+      }
+
+      if (toolResult.error === undefined) {
+        let content = toolResult.llmContent;
+        let outputFile: string | undefined = undefined;
+        const contentLength =
+          typeof content === 'string' ? content.length : undefined;
+        if (
+          typeof content === 'string' &&
+          toolName === ShellTool.Name &&
+          this.config.getEnableToolOutputTruncation() &&
+          this.config.getTruncateToolOutputThreshold() > 0 &&
+          this.config.getTruncateToolOutputLines() > 0
+        ) {
+          const originalContentLength = content.length;
+          const threshold = this.config.getTruncateToolOutputThreshold();
+          const lines = this.config.getTruncateToolOutputLines();
+          const truncatedResult = await truncateAndSaveToFile(
+            content,
+            callId,
+            this.config.storage.getProjectTempDir(),
+            threshold,
+            lines,
+          );
+          content = truncatedResult.content;
+          outputFile = truncatedResult.outputFile;
+
+          if (outputFile) {
+            logToolOutputTruncated(
+              this.config,
+              new ToolOutputTruncatedEvent(scheduledCall.request.prompt_id, {
+                toolName,
+                originalContentLength,
+                truncatedContentLength: content.length,
                 threshold,
                 lines,
-              );
-              content = truncatedResult.content;
-              outputFile = truncatedResult.outputFile;
-
-              if (outputFile) {
-                logToolOutputTruncated(
-                  this.config,
-                  new ToolOutputTruncatedEvent(
-                    scheduledCall.request.prompt_id,
-                    {
-                      toolName,
-                      originalContentLength,
-                      truncatedContentLength: content.length,
-                      threshold,
-                      lines,
-                    },
-                  ),
-                );
-              }
-            }
-
-            const response = convertToFunctionResponse(
-              toolName,
-              callId,
-              content,
-            );
-            const successResponse: ToolCallResponseInfo = {
-              callId,
-              responseParts: response,
-              resultDisplay: toolResult.returnDisplay,
-              error: undefined,
-              errorType: undefined,
-              outputFile,
-              contentLength,
-            };
-            this.setStatusInternal(callId, 'success', successResponse);
-          } else {
-            // It is a failure
-            const error = new Error(toolResult.error.message);
-            const errorResponse = createErrorResponse(
-              scheduledCall.request,
-              error,
-              toolResult.error.type,
-            );
-            this.setStatusInternal(callId, 'error', errorResponse);
-          }
-        } catch (executionError: unknown) {
-          if (signal.aborted) {
-            this.setStatusInternal(
-              callId,
-              'cancelled',
-              'User cancelled tool execution.',
-            );
-          } else {
-            this.setStatusInternal(
-              callId,
-              'error',
-              createErrorResponse(
-                scheduledCall.request,
-                executionError instanceof Error
-                  ? executionError
-                  : new Error(String(executionError)),
-                ToolErrorType.UNHANDLED_EXCEPTION,
-              ),
+              }),
             );
           }
         }
+
+        const response = convertToFunctionResponse(toolName, callId, content);
+        const successResponse: ToolCallResponseInfo = {
+          callId,
+          responseParts: response,
+          resultDisplay: toolResult.returnDisplay,
+          error: undefined,
+          errorType: undefined,
+          outputFile,
+          contentLength,
+        };
+        this.setStatusInternal(callId, 'success', successResponse);
+      } else {
+        // It is a failure
+        const error = new Error(toolResult.error.message);
+        const errorResponse = createErrorResponse(
+          scheduledCall.request,
+          error,
+          toolResult.error.type,
+        );
+        this.setStatusInternal(callId, 'error', errorResponse);
+      }
+    } catch (executionError: unknown) {
+      if (signal.aborted) {
+        this.setStatusInternal(
+          callId,
+          'cancelled',
+          'User cancelled tool execution.',
+        );
+      } else {
+        this.setStatusInternal(
+          callId,
+          'error',
+          createErrorResponse(
+            scheduledCall.request,
+            executionError instanceof Error
+              ? executionError
+              : new Error(String(executionError)),
+            ToolErrorType.UNHANDLED_EXCEPTION,
+          ),
+        );
       }
     }
   }
diff --git a/packages/core/src/core/geminiChat.ts b/packages/core/src/core/geminiChat.ts
index 0bac7066f..2e1923355 100644
--- a/packages/core/src/core/geminiChat.ts
+++ b/packages/core/src/core/geminiChat.ts
@@ -286,6 +286,12 @@ export class GeminiChat {
         let lastError: unknown = new Error('Request failed after all retries.');
         let rateLimitRetryCount = 0;
 
+        // Read per-config overrides; fall back to built-in defaults.
+        const cgConfig = self.config.getContentGeneratorConfig();
+        const maxRateLimitRetries =
+          cgConfig?.maxRetries ?? RATE_LIMIT_RETRY_OPTIONS.maxRetries;
+        const extraRetryErrorCodes = cgConfig?.retryErrorCodes;
+
         for (
           let attempt = 0;
           attempt < INVALID_CONTENT_RETRY_OPTIONS.maxAttempts;
@@ -316,18 +322,15 @@ export class GeminiChat {
             // These arrive as StreamContentError with finish_reason="error_finish"
             // from the pipeline, containing the throttling message in the content.
             // Covers TPM throttling, GLM rate limits, and other provider throttling.
-            const isRateLimit = isRateLimitError(error);
-            if (
-              isRateLimit &&
-              rateLimitRetryCount < RATE_LIMIT_RETRY_OPTIONS.maxRetries
-            ) {
+            const isRateLimit = isRateLimitError(error, extraRetryErrorCodes);
+            if (isRateLimit && rateLimitRetryCount < maxRateLimitRetries) {
               rateLimitRetryCount++;
               const delayMs = RATE_LIMIT_RETRY_OPTIONS.delayMs;
               const message = parseAndFormatApiError(
                 error instanceof Error ? error.message : String(error),
               );
               debugLogger.warn(
-                `Rate limit throttling detected (retry ${rateLimitRetryCount}/${RATE_LIMIT_RETRY_OPTIONS.maxRetries}). ` +
+                `Rate limit throttling detected (retry ${rateLimitRetryCount}/${maxRateLimitRetries}). ` +
                   `Waiting ${delayMs / 1000}s before retrying...`,
               );
               yield {
@@ -335,7 +338,7 @@ export class GeminiChat {
                 retryInfo: {
                   message,
                   attempt: rateLimitRetryCount,
-                  maxRetries: RATE_LIMIT_RETRY_OPTIONS.maxRetries,
+                  maxRetries: maxRateLimitRetries,
                   delayMs,
                 },
               };
diff --git a/packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts b/packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts
index 88e9e2c87..3c64c1267 100644
--- a/packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts
+++ b/packages/core/src/core/loggingContentGenerator/loggingContentGenerator.ts
@@ -154,7 +154,6 @@ export class LoggingContentGenerator implements ContentGenerator {
         response.modelVersion || req.model,
         userPromptId,
         response.usageMetadata,
-        JSON.stringify(response),
       );
       await this.logOpenAIInteraction(openaiRequest, response);
       return response;
@@ -219,7 +218,6 @@ export class LoggingContentGenerator implements ContentGenerator {
         responses[0]?.modelVersion || model,
         userPromptId,
         lastUsageMetadata,
-        JSON.stringify(responses),
       );
       const consolidatedResponse =
         this.consolidateGeminiResponsesForLogging(responses);
diff --git a/packages/core/src/core/modalityDefaults.test.ts b/packages/core/src/core/modalityDefaults.test.ts
new file mode 100644
index 000000000..b90bc069e
--- /dev/null
+++ b/packages/core/src/core/modalityDefaults.test.ts
@@ -0,0 +1,213 @@
+/**
+ * @license
+ * Copyright 2025 Qwen Team
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import { describe, it, expect } from 'vitest';
+import { defaultModalities } from './modalityDefaults.js';
+
+describe('defaultModalities', () => {
+  describe('Google Gemini', () => {
+    it('returns full multimodal for gemini-3-pro', () => {
+      expect(defaultModalities('gemini-3-pro-preview')).toEqual({
+        image: true,
+        pdf: true,
+        audio: true,
+        video: true,
+      });
+    });
+
+    it('returns full multimodal for gemini-3-flash', () => {
+      expect(defaultModalities('gemini-3-flash-preview')).toEqual({
+        image: true,
+        pdf: true,
+        audio: true,
+        video: true,
+      });
+    });
+
+    it('returns full multimodal for gemini-3.1-pro', () => {
+      expect(defaultModalities('gemini-3.1-pro-preview')).toEqual({
+        image: true,
+        pdf: true,
+        audio: true,
+        video: true,
+      });
+    });
+
+    it('returns full multimodal for gemini-2.5-pro', () => {
+      expect(defaultModalities('gemini-2.5-pro')).toEqual({
+        image: true,
+        pdf: true,
+        audio: true,
+        video: true,
+      });
+    });
+
+    it('returns full multimodal for gemini-1.5-flash', () => {
+      expect(defaultModalities('gemini-1.5-flash')).toEqual({
+        image: true,
+        pdf: true,
+        audio: true,
+        video: true,
+      });
+    });
+  });
+
+  describe('OpenAI', () => {
+    it('returns image for gpt-5.2', () => {
+      const m = defaultModalities('gpt-5.2');
+      expect(m.image).toBe(true);
+      expect(m.audio).toBeUndefined();
+      expect(m.pdf).toBeUndefined();
+      expect(m.video).toBeUndefined();
+    });
+
+    it('returns image for gpt-5-mini', () => {
+      expect(defaultModalities('gpt-5-mini').image).toBe(true);
+    });
+
+    it('returns image for gpt-4o', () => {
+      expect(defaultModalities('gpt-4o').image).toBe(true);
+    });
+
+    it('returns image for o3', () => {
+      expect(defaultModalities('o3').image).toBe(true);
+    });
+  });
+
+  describe('Anthropic Claude', () => {
+    it('returns image + pdf for claude-opus-4-6', () => {
+      const m = defaultModalities('claude-opus-4-6');
+      expect(m.image).toBe(true);
+      expect(m.pdf).toBe(true);
+      expect(m.audio).toBeUndefined();
+      expect(m.video).toBeUndefined();
+    });
+
+    it('returns image + pdf for claude-sonnet-4-6', () => {
+      const m = defaultModalities('claude-sonnet-4-6');
+      expect(m.image).toBe(true);
+      expect(m.pdf).toBe(true);
+    });
+
+    it('returns image + pdf for claude-sonnet-4', () => {
+      const m = defaultModalities('claude-sonnet-4');
+      expect(m.image).toBe(true);
+      expect(m.pdf).toBe(true);
+    });
+
+    it('returns image + pdf for claude-3.5-sonnet', () => {
+      const m = defaultModalities('claude-3.5-sonnet');
+      expect(m.image).toBe(true);
+      expect(m.pdf).toBe(true);
+    });
+  });
+
+  describe('Qwen', () => {
+    it('returns image + video for qwen-vl-max', () => {
+      const m = defaultModalities('qwen-vl-max');
+      expect(m.image).toBe(true);
+      expect(m.video).toBe(true);
+      expect(m.pdf).toBeUndefined();
+      expect(m.audio).toBeUndefined();
+    });
+
+    it('returns image + video for qwen3-vl-plus', () => {
+      const m = defaultModalities('qwen3-vl-plus');
+      expect(m.image).toBe(true);
+      expect(m.video).toBe(true);
+    });
+
+    it('returns text-only for qwen3-coder-plus', () => {
+      expect(defaultModalities('qwen3-coder-plus')).toEqual({});
+    });
+
+    it('returns image + video for coder-model (same as qwen3.5-plus)', () => {
+      expect(defaultModalities('coder-model')).toEqual({
+        image: true,
+        video: true,
+      });
+    });
+
+    it('returns image + video for qwen3.5-plus', () => {
+      const m = defaultModalities('qwen3.5-plus');
+      expect(m.image).toBe(true);
+      expect(m.video).toBe(true);
+      expect(m.pdf).toBeUndefined();
+      expect(m.audio).toBeUndefined();
+    });
+
+    it('returns text-only for qwen-turbo', () => {
+      expect(defaultModalities('qwen-turbo')).toEqual({});
+    });
+  });
+
+  describe('DeepSeek', () => {
+    it('returns text-only for deepseek-chat', () => {
+      expect(defaultModalities('deepseek-chat')).toEqual({});
+    });
+
+    it('returns text-only for deepseek-reasoner', () => {
+      expect(defaultModalities('deepseek-reasoner')).toEqual({});
+    });
+  });
+
+  describe('Zhipu GLM', () => {
+    it('returns image for glm-4.5v', () => {
+      const m = defaultModalities('glm-4.5v');
+      expect(m.image).toBe(true);
+      expect(m.pdf).toBeUndefined();
+    });
+
+    it('returns text-only for glm-5', () => {
+      expect(defaultModalities('glm-5')).toEqual({});
+    });
+
+    it('returns text-only for glm-4.7', () => {
+      expect(defaultModalities('glm-4.7')).toEqual({});
+    });
+  });
+
+  describe('MiniMax', () => {
+    it('returns text-only for MiniMax-M2.5', () => {
+      expect(defaultModalities('MiniMax-M2.5')).toEqual({});
+    });
+  });
+
+  describe('Kimi', () => {
+    it('returns image + video for kimi-k2.5', () => {
+      const m = defaultModalities('kimi-k2.5');
+      expect(m.image).toBe(true);
+      expect(m.video).toBe(true);
+      expect(m.pdf).toBeUndefined();
+      expect(m.audio).toBeUndefined();
+    });
+
+    it('returns text-only for kimi-k2', () => {
+      expect(defaultModalities('kimi-k2')).toEqual({});
+    });
+  });
+
+  describe('unknown models', () => {
+    it('returns text-only for unrecognized models', () => {
+      expect(defaultModalities('some-random-model-xyz')).toEqual({});
+    });
+  });
+
+  describe('normalization', () => {
+    it('normalizes provider prefixes', () => {
+      expect(defaultModalities('openai/gpt-4o')).toEqual(
+        defaultModalities('gpt-4o'),
+      );
+    });
+
+    it('returns a fresh copy each time', () => {
+      const a = defaultModalities('gemini-2.5-pro');
+      const b = defaultModalities('gemini-2.5-pro');
+      expect(a).toEqual(b);
+      expect(a).not.toBe(b);
+    });
+  });
+});
diff --git a/packages/core/src/core/modalityDefaults.ts b/packages/core/src/core/modalityDefaults.ts
new file mode 100644
index 000000000..f17927325
--- /dev/null
+++ b/packages/core/src/core/modalityDefaults.ts
@@ -0,0 +1,94 @@
+/**
+ * @license
+ * Copyright 2025 Qwen Team
+ * SPDX-License-Identifier: Apache-2.0
+ */
+
+import type { InputModalities } from './contentGenerator.js';
+import { normalize } from './tokenLimits.js';
+
+const FULL_MULTIMODAL: InputModalities = {
+  image: true,
+  pdf: true,
+  audio: true,
+  video: true,
+};
+
+/**
+ * Ordered regex patterns: most specific -> most general (first match wins).
+ * Default for unknown models is text-only (empty object = all false).
+ */
+const MODALITY_PATTERNS: Array<[RegExp, InputModalities]> = [
+  // -------------------
+  // Google Gemini — full multimodal
+  // -------------------
+  [/^gemini-3/, FULL_MULTIMODAL],
+  [/^gemini-/, FULL_MULTIMODAL],
+
+  // -------------------
+  // OpenAI — image by default for all gpt/o-series models
+  // -------------------
+  [/^gpt-5/, { image: true }],
+  [/^gpt-/, { image: true }],
+  [/^o\d/, { image: true }],
+
+  // -------------------
+  // Anthropic Claude — image + pdf
+  // -------------------
+  [/^claude-/, { image: true, pdf: true }],
+
+  // -------------------
+  // Alibaba / Qwen
+  // -------------------
+  // Qwen3.5-Plus: image support
+  [/^qwen3\.5-plus/, { image: true, video: true }],
+  [/^coder-model$/, { image: true, video: true }],
+
+  // Qwen VL (vision-language) models: image + video
+  [/^qwen-vl-/, { image: true, video: true }],
+  [/^qwen3-vl-/, { image: true, video: true }],
+
+  // Qwen coder / text models: text-only
+  [/^qwen3-coder-/, {}],
+  [/^qwen/, {}],
+
+  // -------------------
+  // DeepSeek — text-only
+  // -------------------
+  [/^deepseek/, {}],
+
+  // -------------------
+  // Zhipu GLM
+  // -------------------
+  [/^glm-4\.5v/, { image: true }],
+  [/^glm-5(?:-|$)/, {}],
+  [/^glm-/, {}],
+
+  // -------------------
+  // MiniMax — text-only
+  // -------------------
+  [/^minimax-/, {}],
+
+  // -------------------
+  // Moonshot / Kimi
+  // -------------------
+  [/^kimi-k2\.5/, { image: true, video: true }],
+  [/^kimi-/, {}],
+];
+
+/**
+ * Return the default input modalities for a model based on its name.
+ *
+ * Uses the same normalize-then-regex pattern as {@link tokenLimit}.
+ * Unknown models default to text-only (empty object) to avoid sending
+ * unsupported media types that would cause unrecoverable API errors.
+ */
+export function defaultModalities(model: string): InputModalities {
+  const norm = normalize(model);
+  for (const [regex, modalities] of MODALITY_PATTERNS) {
+    if (regex.test(norm)) {
+      return { ...modalities };
+    }
+  }
+  return {};
+}
diff --git a/packages/core/src/core/openaiContentGenerator/converter.test.ts b/packages/core/src/core/openaiContentGenerator/converter.test.ts
index 36bbc812d..115d6dc0d 100644
--- a/packages/core/src/core/openaiContentGenerator/converter.test.ts
+++ b/packages/core/src/core/openaiContentGenerator/converter.test.ts
@@ -9,6 +9,7 @@ import { OpenAIContentConverter } from './converter.js';
 import type { StreamingToolCallParser } from './streamingToolCallParser.js';
 import {
   Type,
+  FinishReason,
   type GenerateContentParameters,
   type Content,
   type Part,
@@ -22,7 +23,12 @@ describe('OpenAIContentConverter', () => {
   let converter: OpenAIContentConverter;
 
   beforeEach(() => {
-    converter = new OpenAIContentConverter('test-model');
+    converter = new OpenAIContentConverter('test-model', 'auto', {
+      image: true,
+      pdf: true,
+      audio: true,
+      video: true,
+    });
   });
 
   describe('resetStreamingToolCalls', () => {
@@ -1684,7 +1690,12 @@ describe('MCP tool result end-to-end through OpenAI converter (issue #1520)', ()
   let converter: OpenAIContentConverter;
 
   beforeEach(() => {
-    converter = new OpenAIContentConverter('test-model');
+    converter = new OpenAIContentConverter('test-model', 'auto', {
+      image: true,
+      pdf: true,
+      audio: true,
+      video: true,
+    });
   });
 
   it('should preserve MCP multi-text content in tool message (not leak to user message)', () => {
@@ -1957,3 +1968,394 @@ describe('MCP tool result end-to-end through OpenAI converter (issue #1520)', ()
     expect(contentArray[1].image_url?.url).toContain('data:image/png');
   });
 });
+
+describe('Truncated tool call detection in streaming', () => {
+  let converter: OpenAIContentConverter;
+
+  beforeEach(() => {
+    converter = new OpenAIContentConverter('test-model');
+  });
+
+  /**
+   * Helper: feed streaming chunks then a final chunk with finish_reason,
+   * and return the Gemini response for the final chunk.
+   */
+  function feedToolCallChunks(
+    conv: OpenAIContentConverter,
+    toolCallChunks: Array<{
+      index: number;
+      id?: string;
+      name?: string;
+      arguments: string;
+    }>,
+    finishReason: string,
+  ) {
+    // Feed argument chunks (no finish_reason yet)
+    for (const tc of toolCallChunks) {
+      conv.convertOpenAIChunkToGemini({
+        object: 'chat.completion.chunk',
+        id: 'chunk-stream',
+        created: 100,
+        model: 'test-model',
+        choices: [
+          {
+            index: 0,
+            delta: {
+              tool_calls: [
+                {
+                  index: tc.index,
+                  id: tc.id,
+                  type: 'function' as const,
+                  function: {
+                    name: tc.name,
+                    arguments: tc.arguments,
+                  },
+                },
+              ],
+            },
+            finish_reason: null,
+            logprobs: null,
+          },
+        ],
+      } as unknown as OpenAI.Chat.ChatCompletionChunk);
+    }
+
+    // Final chunk with finish_reason
+    return conv.convertOpenAIChunkToGemini({
+      object: 'chat.completion.chunk',
+      id: 'chunk-final',
+      created: 101,
+      model: 'test-model',
+      choices: [
+        {
+          index: 0,
+          delta: {},
+          finish_reason: finishReason,
+          logprobs: null,
+        },
+      ],
+    } as unknown as OpenAI.Chat.ChatCompletionChunk);
+  }
+
+  it('should override finishReason to MAX_TOKENS when tool call JSON is truncated and provider reports "stop"', () => {
+    // Simulate: write_file call truncated mid-JSON, provider says "stop"
+    const result = feedToolCallChunks(
+      converter,
+      [
+        {
+          index: 0,
+          id: 'call_1',
+          name: 'write_file',
+          arguments: '{"file_path": "/tmp/test.cpp"',
+          // Missing closing brace and content field — truncated
+        },
+      ],
+      'stop',
+    );
+
+    expect(result.candidates?.[0]?.finishReason).toBe(FinishReason.MAX_TOKENS);
+  });
+
+  it('should override finishReason to MAX_TOKENS when provider reports "tool_calls" but JSON is truncated', () => {
+    const result = feedToolCallChunks(
+      converter,
+      [
+        {
+          index: 0,
+          id: 'call_1',
+          name: 'write_file',
+          arguments:
+            '{"file_path": "/tmp/test.cpp", "content": "partial content',
+          // Truncated mid-string
+        },
+      ],
+      'tool_calls',
+    );
+
+    expect(result.candidates?.[0]?.finishReason).toBe(FinishReason.MAX_TOKENS);
+  });
+
+  it('should preserve finishReason STOP when tool call JSON is complete', () => {
+    const result = feedToolCallChunks(
+      converter,
+      [
+        {
+          index: 0,
+          id: 'call_1',
+          name: 'write_file',
+          arguments: '{"file_path": "/tmp/test.cpp", "content": "hello"}',
+        },
+      ],
+      'stop',
+    );
+
+    expect(result.candidates?.[0]?.finishReason).toBe(FinishReason.STOP);
+  });
+
+  it('should preserve finishReason MAX_TOKENS when provider already reports "length"', () => {
+    const result = feedToolCallChunks(
+      converter,
+      [
+        {
+          index: 0,
+          id: 'call_1',
+          name: 'write_file',
+          arguments: '{"file_path": "/tmp/test.cpp"',
+        },
+      ],
+      'length',
+    );
+
+    expect(result.candidates?.[0]?.finishReason).toBe(FinishReason.MAX_TOKENS);
+  });
+
+  it('should still emit the (repaired) function call even when truncated', () => {
+    const result = feedToolCallChunks(
+      converter,
+      [
+        {
+          index: 0,
+          id: 'call_1',
+          name: 'write_file',
+          arguments: '{"file_path": "/tmp/test.cpp"',
+        },
+      ],
+      'stop',
+    );
+
+    const parts = result.candidates?.[0]?.content?.parts ?? [];
+    const fnCall = parts.find((p: Part) => p.functionCall);
+    expect(fnCall).toBeDefined();
+    expect(fnCall?.functionCall?.name).toBe('write_file');
+    expect(fnCall?.functionCall?.args).toEqual({
+      file_path: '/tmp/test.cpp',
+    });
+  });
+
+  it('should detect truncation with multi-chunk streaming arguments', () => {
+    // Feed arguments in multiple small chunks like real streaming
+    const conv = new OpenAIContentConverter('test-model');
+
+    // Chunk 1: start of JSON with tool metadata
+    conv.convertOpenAIChunkToGemini({
+      object: 'chat.completion.chunk',
+      id: 'c1',
+      created: 100,
+      model: 'test-model',
+      choices: [
+        {
+          index: 0,
+          delta: {
+            tool_calls: [
+              {
+                index: 0,
+                id: 'call_1',
+                type: 'function' as const,
+                function: { name: 'write_file', arguments: '{"file_' },
+              },
+            ],
+          },
+          finish_reason: null,
+          logprobs: null,
+        },
+      ],
+    } as unknown as OpenAI.Chat.ChatCompletionChunk);
+
+    // Chunk 2: more arguments
+    conv.convertOpenAIChunkToGemini({
+      object: 'chat.completion.chunk',
+      id: 'c2',
+      created: 100,
+      model: 'test-model',
+      choices: [
+        {
+          index: 0,
+          delta: {
+            tool_calls: [
+              {
+                index: 0,
+                function: { arguments: 'path": "/tmp/f.txt", "conten' },
+              },
+            ],
+          },
+          finish_reason: null,
+          logprobs: null,
+        },
+      ],
+    } as unknown as OpenAI.Chat.ChatCompletionChunk);
+
+    // Final chunk: finish_reason "stop" but JSON is still incomplete
+    const result = conv.convertOpenAIChunkToGemini({
+      object: 'chat.completion.chunk',
+      id: 'c3',
+      created: 101,
+      model: 'test-model',
+      choices: [
+        {
+          index: 0,
+          delta: {},
+          finish_reason: 'stop',
+          logprobs: null,
+        },
+      ],
+    } as unknown as OpenAI.Chat.ChatCompletionChunk);
+
+    expect(result.candidates?.[0]?.finishReason).toBe(FinishReason.MAX_TOKENS);
+  });
+});
+
+describe('modality filtering', () => {
+  function makeRequest(parts: Part[]): GenerateContentParameters {
+    return {
+      model: 'test-model',
+      contents: [{ role: 'user', parts }],
+    };
+  }
+
+  function getUserContentParts(
+    messages: OpenAI.Chat.ChatCompletionMessageParam[],
+  ): Array<{ type: string; text?: string }> {
+    const userMsg = messages.find((m) => m.role === 'user');
+    if (
+      !userMsg ||
+      !('content' in userMsg) ||
+      !Array.isArray(userMsg.content)
+    ) {
+      return [];
+    }
+    return userMsg.content as Array<{ type: string; text?: string }>;
+  }
+
+  it('replaces image with placeholder when image modality is disabled', () => {
+    const conv = new OpenAIContentConverter('deepseek-chat', 'auto', {});
+    const request = makeRequest([
+      {
+        inlineData: { mimeType: 'image/png', data: 'abc123' },
+        displayName: 'screenshot.png',
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toContain('image file');
+    expect(parts[0].text).toContain('does not support image input');
+  });
+
+  it('keeps image when image modality is enabled', () => {
+    const conv = new OpenAIContentConverter('gpt-4o', 'auto', { image: true });
+    const request = makeRequest([
+      {
+        inlineData: { mimeType: 'image/png', data: 'abc123' },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('image_url');
+  });
+
+  it('replaces PDF with placeholder when pdf modality is disabled', () => {
+    const conv = new OpenAIContentConverter('test-model', 'auto', {
+      image: true,
+    });
+    const request = makeRequest([
+      {
+        inlineData: {
+          mimeType: 'application/pdf',
+          data: 'pdf-data',
+          displayName: 'doc.pdf',
+        },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toContain('pdf file');
+    expect(parts[0].text).toContain('does not support PDF input');
+  });
+
+  it('keeps PDF when pdf modality is enabled', () => {
+    const conv = new OpenAIContentConverter('claude-sonnet', 'auto', {
+      image: true,
+      pdf: true,
+    });
+    const request = makeRequest([
+      {
+        inlineData: {
+          mimeType: 'application/pdf',
+          data: 'pdf-data',
+          displayName: 'doc.pdf',
+        },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('file');
+  });
+
+  it('replaces video with placeholder when video modality is disabled', () => {
+    const conv = new OpenAIContentConverter('test-model', 'auto', {});
+    const request = makeRequest([
+      {
+        inlineData: { mimeType: 'video/mp4', data: 'vid-data' },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toContain('video file');
+  });
+
+  it('replaces audio with placeholder when audio modality is disabled', () => {
+    const conv = new OpenAIContentConverter('test-model', 'auto', {});
+    const request = makeRequest([
+      {
+        inlineData: { mimeType: 'audio/wav', data: 'audio-data' },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toContain('audio file');
+  });
+
+  it('handles mixed content: keeps text + supported media, replaces unsupported', () => {
+    const conv = new OpenAIContentConverter('gpt-4o', 'auto', { image: true });
+    const request = makeRequest([
+      { text: 'Analyze these files' },
+      {
+        inlineData: { mimeType: 'image/png', data: 'img-data' },
+      } as unknown as Part,
+      {
+        inlineData: { mimeType: 'video/mp4', data: 'vid-data' },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(3);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toBe('Analyze these files');
+    expect(parts[1].type).toBe('image_url');
+    expect(parts[2].type).toBe('text');
+    expect(parts[2].text).toContain('video file');
+  });
+
+  it('defaults to text-only when no modalities are specified', () => {
+    const conv = new OpenAIContentConverter('unknown-model');
+    const request = makeRequest([
+      {
+        inlineData: { mimeType: 'image/png', data: 'img-data' },
+      } as unknown as Part,
+    ]);
+    const messages = conv.convertGeminiRequestToOpenAI(request);
+    const parts = getUserContentParts(messages);
+    expect(parts).toHaveLength(1);
+    expect(parts[0].type).toBe('text');
+    expect(parts[0].text).toContain('image file');
+  });
+});
diff --git a/packages/core/src/core/openaiContentGenerator/converter.ts b/packages/core/src/core/openaiContentGenerator/converter.ts
index 2ca7428bd..d90737d10 100644
--- a/packages/core/src/core/openaiContentGenerator/converter.ts
+++ b/packages/core/src/core/openaiContentGenerator/converter.ts
@@ -20,12 +20,16 @@ import type {
 import { GenerateContentResponse, FinishReason } from '@google/genai';
 import type OpenAI from 'openai';
 import { safeJsonParse } from '../../utils/safeJsonParse.js';
+import { createDebugLogger } from '../../utils/debugLogger.js';
+import type { InputModalities } from '../contentGenerator.js';
 import { StreamingToolCallParser } from './streamingToolCallParser.js';
 import {
   convertSchema,
   type SchemaComplianceMode,
 } from '../../utils/schemaConverter.js';
 
+const debugLogger = createDebugLogger('CONVERTER');
+
 /**
  * Extended usage type that supports both OpenAI standard format and alternative formats
  * Some models return cached_tokens at the top level instead of in prompt_tokens_details
@@ -92,12 +96,18 @@ type OpenAIContentPart =
 export class OpenAIContentConverter {
   private model: string;
   private schemaCompliance: SchemaComplianceMode;
+  private modalities: InputModalities;
   private streamingToolCallParser: StreamingToolCallParser =
     new StreamingToolCallParser();
 
-  constructor(model: string, schemaCompliance: SchemaComplianceMode = 'auto') {
+  constructor(
+    model: string,
+    schemaCompliance: SchemaComplianceMode = 'auto',
+    modalities: InputModalities = {},
+  ) {
     this.model = model;
     this.schemaCompliance = schemaCompliance;
+    this.modalities = modalities;
   }
 
   /**
@@ -108,6 +118,13 @@ export class OpenAIContentConverter {
     this.model = model;
   }
 
+  /**
+   * Update the supported input modalities.
+   */
+  setModalities(modalities: InputModalities): void {
+    this.modalities = modalities;
+  }
+
   /**
    * Reset streaming tool calls parser for new stream processing
    * This should be called at the beginning of each stream to prevent
@@ -585,13 +602,19 @@ export class OpenAIContentConverter {
   }
 
   /**
-   * Create OpenAI media content part from Gemini part
+   * Create OpenAI media content part from Gemini part.
+   * Checks modality support before building each media type.
    */
   private createMediaContentPart(part: Part): OpenAIContentPart | null {
     if (part.inlineData?.mimeType && part.inlineData?.data) {
       const mimeType = part.inlineData.mimeType;
       const mediaType = this.getMediaType(mimeType);
+      const displayName = part.inlineData.displayName || mimeType;
+
       if (mediaType === 'image') {
+        if (!this.modalities.image) {
+          return this.unsupportedModalityPlaceholder('image', displayName);
+        }
         const dataUrl = `data:${mimeType};base64,${part.inlineData.data}`;
         return {
           type: 'image_url' as const,
@@ -600,6 +623,9 @@ export class OpenAIContentConverter {
       }
 
       if (mimeType === 'application/pdf') {
+        if (!this.modalities.pdf) {
+          return this.unsupportedModalityPlaceholder('pdf', displayName);
+        }
         const filename = part.inlineData.displayName || 'document.pdf';
         return {
           type: 'file' as const,
@@ -611,6 +637,9 @@ export class OpenAIContentConverter {
       }
 
       if (mediaType === 'audio') {
+        if (!this.modalities.audio) {
+          return this.unsupportedModalityPlaceholder('audio', displayName);
+        }
         const format = this.getAudioFormat(mimeType);
         if (format) {
           return {
@@ -624,6 +653,9 @@ export class OpenAIContentConverter {
       }
 
       if (mediaType === 'video') {
+        if (!this.modalities.video) {
+          return this.unsupportedModalityPlaceholder('video', displayName);
+        }
         return {
           type: 'video_url' as const,
           video_url: {
@@ -632,12 +664,9 @@ export class OpenAIContentConverter {
         };
       }
 
-      const displayName = part.inlineData.displayName
-        ? ` (${part.inlineData.displayName})`
-        : '';
       return {
         type: 'text' as const,
-        text: `Unsupported inline media type: ${mimeType}${displayName}.`,
+        text: `Unsupported inline media type: ${mimeType} (${displayName}).`,
       };
     }
 
@@ -648,6 +677,9 @@ export class OpenAIContentConverter {
       const mediaType = this.getMediaType(mimeType);
 
       if (mediaType === 'image') {
+        if (!this.modalities.image) {
+          return this.unsupportedModalityPlaceholder('image', filename);
+        }
         return {
           type: 'image_url' as const,
           image_url: { url: fileUri },
@@ -655,6 +687,9 @@ export class OpenAIContentConverter {
       }
 
       if (mimeType === 'application/pdf') {
+        if (!this.modalities.pdf) {
+          return this.unsupportedModalityPlaceholder('pdf', filename);
+        }
         return {
           type: 'file' as const,
           file: {
@@ -665,6 +700,9 @@ export class OpenAIContentConverter {
       }
 
       if (mediaType === 'video') {
+        if (!this.modalities.video) {
+          return this.unsupportedModalityPlaceholder('video', filename);
+        }
         return {
           type: 'video_url' as const,
           video_url: {
@@ -673,18 +711,42 @@ export class OpenAIContentConverter {
         };
       }
 
-      const displayName = part.fileData.displayName
+      const displayNameStr = part.fileData.displayName
         ? ` (${part.fileData.displayName})`
         : '';
       return {
         type: 'text' as const,
-        text: `Unsupported file media type: ${mimeType}${displayName}.`,
+        text: `Unsupported file media type: ${mimeType}${displayNameStr}.`,
       };
     }
 
     return null;
   }
 
+  /**
+   * Create a text placeholder for unsupported modalities.
+   */
+  private unsupportedModalityPlaceholder(
+    modality: string,
+    displayName: string,
+  ): OpenAIContentPart {
+    debugLogger.warn(
+      `Model '${this.model}' does not support ${modality} input. ` +
+        `Replacing with text placeholder: ${displayName}`,
+    );
+    let hint: string;
+    if (modality === 'pdf') {
+      hint =
+        'This model does not support PDF input directly. The read_file tool cannot extract PDF content either. To extract text from the PDF file, try using skills if applicable, or guide user to install pdf skill by running this slash command:\n/extensions install https://github.com/anthropics/skills:document-skills';
+    } else {
+      hint = `This model does not support ${modality} input. The read_file tool cannot process this type of file either. To handle this file, try using skills if applicable, or any tools installed at system wide, or let the user know you cannot process this type of file.`;
+    }
+    return {
+      type: 'text' as const,
+      text: `[Unsupported ${modality} file: "${displayName}". ${hint}]`,
+    };
+  }
+
   /**
    * Determine media type from MIME type
    */
@@ -911,7 +973,14 @@ export class OpenAIContentConverter {
       }
 
       // Only emit function calls when streaming is complete (finish_reason is present)
+      let toolCallsTruncated = false;
       if (choice.finish_reason) {
+        // Detect truncation the provider may not report correctly.
+        // Some providers (e.g. DashScope/Qwen) send "stop" or "tool_calls"
+        // even when output was cut off mid-JSON due to max_tokens.
+        toolCallsTruncated =
+          this.streamingToolCallParser.hasIncompleteToolCalls();
+
         const completedToolCalls =
           this.streamingToolCallParser.getCompletedToolCalls();
 
@@ -933,6 +1002,13 @@ export class OpenAIContentConverter {
         this.streamingToolCallParser.reset();
       }
 
+      // If tool call JSON was truncated, override to "length" so downstream
+      // (turn.ts) correctly sets wasOutputTruncated=true.
+      const effectiveFinishReason =
+        toolCallsTruncated && choice.finish_reason !== 'length'
+          ? 'length'
+          : choice.finish_reason;
+
       // Only include finishReason key if finish_reason is present
       const candidate: Candidate = {
         content: {
@@ -942,9 +1018,9 @@ export class OpenAIContentConverter {
         index: 0,
         safetyRatings: [],
       };
-      if (choice.finish_reason) {
+      if (effectiveFinishReason) {
         candidate.finishReason = this.mapOpenAIFinishReasonToGemini(
-          choice.finish_reason,
+          effectiveFinishReason,
         );
       }
       response.candidates = [candidate];
diff --git a/packages/core/src/core/openaiContentGenerator/pipeline.test.ts b/packages/core/src/core/openaiContentGenerator/pipeline.test.ts
index 964f768a3..d71e23e91 100644
--- a/packages/core/src/core/openaiContentGenerator/pipeline.test.ts
+++ b/packages/core/src/core/openaiContentGenerator/pipeline.test.ts
@@ -47,6 +47,7 @@ describe('ContentGenerationPipeline', () => {
     // Mock converter
     mockConverter = {
       setModel: vi.fn(),
+      setModalities: vi.fn(),
       convertGeminiRequestToOpenAI: vi.fn(),
       convertOpenAIResponseToGemini: vi.fn(),
       convertOpenAIChunkToGemini: vi.fn(),
@@ -104,6 +105,7 @@ describe('ContentGenerationPipeline', () => {
       expect(OpenAIContentConverter).toHaveBeenCalledWith(
         'test-model',
         undefined,
+        {},
       );
     });
   });
diff --git a/packages/core/src/core/openaiContentGenerator/pipeline.ts b/packages/core/src/core/openaiContentGenerator/pipeline.ts
index 1865adb48..8d2cc9fc7 100644
--- a/packages/core/src/core/openaiContentGenerator/pipeline.ts
+++ b/packages/core/src/core/openaiContentGenerator/pipeline.ts
@@ -46,6 +46,7 @@ export class ContentGenerationPipeline {
     this.converter = new OpenAIContentConverter(
       this.contentGeneratorConfig.model,
       this.contentGeneratorConfig.schemaCompliance,
+      this.contentGeneratorConfig.modalities ?? {},
     );
   }
 
@@ -58,6 +59,7 @@ export class ContentGenerationPipeline {
     // that is not valid/available for the OpenAI-compatible backend.
     const effectiveModel = this.contentGeneratorConfig.model;
     this.converter.setModel(effectiveModel);
+    this.converter.setModalities(this.contentGeneratorConfig.modalities ?? {});
     return this.executeWithErrorHandling(
       request,
       userPromptId,
@@ -85,6 +87,7 @@ export class ContentGenerationPipeline {
   ): Promise<AsyncGenerator<GenerateContentResponse>> {
     const effectiveModel = this.contentGeneratorConfig.model;
     this.converter.setModel(effectiveModel);
+    this.converter.setModalities(this.contentGeneratorConfig.modalities ?? {});
     return this.executeWithErrorHandling(
       request,
       userPromptId,
diff --git a/packages/core/src/core/openaiContentGenerator/provider/dashscope.test.ts b/packages/core/src/core/openaiContentGenerator/provider/dashscope.test.ts
index f9d7a0fd6..2e528120a 100644
--- a/packages/core/src/core/openaiContentGenerator/provider/dashscope.test.ts
+++ b/packages/core/src/core/openaiContentGenerator/provider/dashscope.test.ts
@@ -733,7 +733,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
   describe('output token limits', () => {
     it('should limit max_tokens when it exceeds model limit', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         max_tokens: 100000, // Exceeds the model's output limit
       };
@@ -757,7 +757,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
     it('should not modify max_tokens when it is within model limit', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         max_tokens: 1000, // Within the model's output limit
       };
@@ -769,7 +769,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
     it('should not add max_tokens when not present in request', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         // No max_tokens parameter
       };
@@ -781,7 +781,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
     it('should handle null max_tokens parameter', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         max_tokens: null,
       };
@@ -800,12 +800,12 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
       const result = provider.buildRequest(request, 'test-prompt-id');
 
-      expect(result.max_tokens).toBe(4096); // Should be limited to default output limit (4K)
+      expect(result.max_tokens).toBe(8192); // Should be limited to default output limit (8K)
     });
 
     it('should preserve other request parameters when limiting max_tokens', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         max_tokens: 100000, // Will be limited
         temperature: 0.8,
@@ -872,12 +872,10 @@ describe('DashScopeOpenAICompatibleProvider', () => {
             ],
           },
         ],
-        max_tokens: 50000,
       };
 
       const result = provider.buildRequest(request, 'test-prompt-id');
 
-      expect(result.max_tokens).toBe(32768); // Limited to model's output limit (32K)
       expect(
         (result as { vl_high_resolution_images?: boolean })
           .vl_high_resolution_images,
@@ -904,8 +902,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
       const result = provider.buildRequest(request, 'test-prompt-id');
 
-      // coder-model has 64K output limit, so max_tokens should be capped
-      expect(result.max_tokens).toBe(65536);
+      expect(result.max_tokens).toBe(65536); // Limited to model's output limit (64K)
       expect(
         (result as { vl_high_resolution_images?: boolean })
           .vl_high_resolution_images,
@@ -914,7 +911,7 @@ describe('DashScopeOpenAICompatibleProvider', () => {
 
     it('should handle streaming requests with output token limits', () => {
       const request: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'qwen3-coder-plus',
+        model: 'qwen3-max',
         messages: [{ role: 'user', content: 'Hello' }],
         max_tokens: 100000, // Exceeds the model's output limit
         stream: true,
diff --git a/packages/core/src/core/openaiContentGenerator/provider/deepseek.test.ts b/packages/core/src/core/openaiContentGenerator/provider/deepseek.test.ts
index 68693393b..9a69cd326 100644
--- a/packages/core/src/core/openaiContentGenerator/provider/deepseek.test.ts
+++ b/packages/core/src/core/openaiContentGenerator/provider/deepseek.test.ts
@@ -5,7 +5,6 @@
  */
 
 import { describe, it, expect, vi, beforeEach } from 'vitest';
-import type OpenAI from 'openai';
 import { DeepSeekOpenAICompatibleProvider } from './deepseek.js';
 import type { ContentGeneratorConfig } from '../../contentGenerator.js';
 import type { Config } from '../../../config/config.js';
@@ -18,7 +17,6 @@ vi.mock('openai', () => ({
 }));
 
 describe('DeepSeekOpenAICompatibleProvider', () => {
-  let provider: DeepSeekOpenAICompatibleProvider;
   let mockContentGeneratorConfig: ContentGeneratorConfig;
   let mockCliConfig: Config;
 
@@ -34,11 +32,6 @@ describe('DeepSeekOpenAICompatibleProvider', () => {
     mockCliConfig = {
       getCliVersion: vi.fn().mockReturnValue('1.0.0'),
     } as unknown as Config;
-
-    provider = new DeepSeekOpenAICompatibleProvider(
-      mockContentGeneratorConfig,
-      mockCliConfig,
-    );
   });
 
   describe('isDeepSeekProvider', () => {
@@ -61,72 +54,15 @@ describe('DeepSeekOpenAICompatibleProvider', () => {
     });
   });
 
-  describe('buildRequest', () => {
-    const userPromptId = 'prompt-123';
-
-    it('converts array content into a string', () => {
-      const originalRequest: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'deepseek-chat',
-        messages: [
-          {
-            role: 'user',
-            content: [
-              { type: 'text', text: 'Hello' },
-              { type: 'text', text: ' world' },
-            ],
-          },
-        ],
-      };
-
-      const result = provider.buildRequest(originalRequest, userPromptId);
-
-      expect(result.messages).toHaveLength(1);
-      expect(result.messages?.[0]).toEqual({
-        role: 'user',
-        content: 'Hello world',
+  describe('getDefaultGenerationConfig', () => {
+    it('returns temperature 0', () => {
+      const provider = new DeepSeekOpenAICompatibleProvider(
+        mockContentGeneratorConfig,
+        mockCliConfig,
+      );
+      expect(provider.getDefaultGenerationConfig()).toEqual({
+        temperature: 0,
       });
-      expect(originalRequest.messages?.[0].content).toEqual([
-        { type: 'text', text: 'Hello' },
-        { type: 'text', text: ' world' },
-      ]);
-    });
-
-    it('leaves string content unchanged', () => {
-      const originalRequest: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'deepseek-chat',
-        messages: [
-          {
-            role: 'user',
-            content: 'Hello world',
-          },
-        ],
-      };
-
-      const result = provider.buildRequest(originalRequest, userPromptId);
-
-      expect(result.messages?.[0].content).toBe('Hello world');
-    });
-
-    it('throws when encountering non-text multimodal parts', () => {
-      const originalRequest: OpenAI.Chat.ChatCompletionCreateParams = {
-        model: 'deepseek-chat',
-        messages: [
-          {
-            role: 'user',
-            content: [
-              { type: 'text', text: 'Hello' },
-              {
-                type: 'image_url',
-                image_url: { url: 'https://example.com/image.png' },
-              },
-            ],
-          },
-        ],
-      };
-
-      expect(() =>
-        provider.buildRequest(originalRequest, userPromptId),
-      ).toThrow(/only supports text content/i);
     });
   });
 });
diff --git a/packages/core/src/core/openaiContentGenerator/provider/deepseek.ts b/packages/core/src/core/openaiContentGenerator/provider/deepseek.ts
index 9b5fd7479..0e246725f 100644
--- a/packages/core/src/core/openaiContentGenerator/provider/deepseek.ts
+++ b/packages/core/src/core/openaiContentGenerator/provider/deepseek.ts
@@ -4,7 +4,6 @@
  * SPDX-License-Identifier: Apache-2.0
  */
 
-import type OpenAI from 'openai';
 import type { Config } from '../../../config/config.js';
 import type { ContentGeneratorConfig } from '../../contentGenerator.js';
 import { DefaultOpenAICompatibleProvider } from './default.js';
@@ -26,58 +25,6 @@ export class DeepSeekOpenAICompatibleProvider extends DefaultOpenAICompatiblePro
     return baseUrl.toLowerCase().includes('api.deepseek.com');
   }
 
-  override buildRequest(
-    request: OpenAI.Chat.ChatCompletionCreateParams,
-    userPromptId: string,
-  ): OpenAI.Chat.ChatCompletionCreateParams {
-    const baseRequest = super.buildRequest(request, userPromptId);
-    if (!baseRequest.messages?.length) {
-      return baseRequest;
-    }
-
-    const messages = baseRequest.messages.map((message) => {
-      if (!('content' in message)) {
-        return message;
-      }
-
-      const { content } = message;
-
-      if (
-        typeof content === 'string' ||
-        content === null ||
-        content === undefined
-      ) {
-        return message;
-      }
-
-      if (!Array.isArray(content)) {
-        return message;
-      }
-
-      const text = content
-        .map((part) => {
-          if (part.type !== 'text') {
-            throw new Error(
-              `DeepSeek provider only supports text content. Found non-text part of type '${part.type}' in message with role '${message.role}'.`,
-            );
-          }
-
-          return part.text ?? '';
-        })
-        .join('');
-
-      return {
-        ...message,
-        content: text,
-      } as OpenAI.Chat.ChatCompletionMessageParam;
-    });
-
-    return {
-      ...baseRequest,
-      messages,
-    };
-  }
-
   override getDefaultGenerationConfig(): GenerateContentConfig {
     return {
       temperature: 0,
diff --git a/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.test.ts b/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.test.ts
index 14da87d7e..1735097be 100644
--- a/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.test.ts
+++ b/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.test.ts
@@ -790,4 +790,75 @@ describe('StreamingToolCallParser', () => {
       expect(call2?.args).toEqual({ param2: 'value2' });
     });
   });
+
+  describe('hasIncompleteToolCalls', () => {
+    it('should return false when no tool calls exist', () => {
+      expect(parser.hasIncompleteToolCalls()).toBe(false);
+    });
+
+    it('should return false when all tool calls have complete JSON', () => {
+      parser.addChunk(0, '{"key": "value"}', 'call_1', 'write_file');
+      expect(parser.hasIncompleteToolCalls()).toBe(false);
+    });
+
+    it('should return true when a tool call has depth > 0 (unclosed braces)', () => {
+      parser.addChunk(
+        0,
+        '{"file_path": "/tmp/test.txt", "content": "partial',
+        'call_1',
+        'write_file',
+      );
+      expect(parser.hasIncompleteToolCalls()).toBe(true);
+    });
+
+    it('should return true when a tool call is inside a string literal', () => {
+      // Simulate truncation mid-string: {"file_path": "/tmp/test.txt", "content": "some text
+      parser.addChunk(
+        0,
+        '{"file_path": "/tmp/test.txt"',
+        'call_1',
+        'write_file',
+      );
+      parser.addChunk(0, ', "content": "some text');
+      const state = parser.getState(0);
+      expect(state.inString).toBe(true);
+      expect(parser.hasIncompleteToolCalls()).toBe(true);
+    });
+
+    it('should return false for tool calls without name metadata', () => {
+      // Tool calls without a name should be ignored
+      parser.addChunk(0, '{"key": "incomplete', undefined, undefined);
+      expect(parser.hasIncompleteToolCalls()).toBe(false);
+    });
+
+    it('should detect incomplete among multiple tool calls', () => {
+      // First tool call is complete
+      parser.addChunk(0, '{"key": "value"}', 'call_1', 'func_a');
+      // Second tool call is incomplete
+      parser.addChunk(1, '{"key": "val', 'call_2', 'func_b');
+      expect(parser.hasIncompleteToolCalls()).toBe(true);
+    });
+
+    it('should return false after reset', () => {
+      parser.addChunk(0, '{"key": "incomplete', 'call_1', 'write_file');
+      expect(parser.hasIncompleteToolCalls()).toBe(true);
+      parser.reset();
+      expect(parser.hasIncompleteToolCalls()).toBe(false);
+    });
+
+    it('should detect real-world truncation: write_file with only file_path', () => {
+      // Reproduces the actual bug: LLM output truncated mid-JSON,
+      // only file_path key received, content never arrived.
+      // Buffer: {"file_path": "/path/to/file.cpp"
+      // depth=1 because outer brace is unclosed
+      parser.addChunk(
+        0,
+        '{"file_path": "/path/to/file.cpp"',
+        'call_1',
+        'write_file',
+      );
+      expect(parser.hasIncompleteToolCalls()).toBe(true);
+      expect(parser.getState(0).depth).toBe(1);
+    });
+  });
 });
diff --git a/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.ts b/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.ts
index 31fe75283..19a659ab3 100644
--- a/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.ts
+++ b/packages/core/src/core/openaiContentGenerator/streamingToolCallParser.ts
@@ -411,4 +411,32 @@ export class StreamingToolCallParser {
       escape: this.escapes.get(index) || false,
     };
   }
+
+  /**
+   * Checks whether any buffered tool call has incomplete JSON at stream end.
+   *
+   * A tool call is considered incomplete when its JSON parsing state indicates
+   * the buffer was truncated mid-stream:
+   * - depth > 0: unclosed braces/brackets remain
+   * - inString === true: still inside a string literal
+   *
+   * This is critical for detecting output truncation that the LLM provider
+   * may not report correctly via finish_reason (e.g. reporting "stop" or
+   * "tool_calls" instead of "length" when output was actually cut off).
+   *
+   * @returns true if at least one tool call buffer has incomplete JSON
+   */
+  hasIncompleteToolCalls(): boolean {
+    for (const [index] of this.buffers.entries()) {
+      const meta = this.toolCallMeta.get(index);
+      if (!meta?.name) continue;
+
+      const depth = this.depths.get(index) || 0;
+      const inString = this.inStrings.get(index) || false;
+      if (depth > 0 || inString) {
+        return true;
+      }
+    }
+    return false;
+  }
 }
diff --git a/packages/core/src/core/tokenLimits.test.ts b/packages/core/src/core/tokenLimits.test.ts
index ffd71cd4b..edea10a10 100644
--- a/packages/core/src/core/tokenLimits.test.ts
+++ b/packages/core/src/core/tokenLimits.test.ts
@@ -91,183 +91,143 @@ describe('normalize', () => {
 });
 
 describe('tokenLimit', () => {
-  // Test cases for each model family
   describe('Google Gemini', () => {
-    it('should return the correct limit for Gemini 1.5 Pro', () => {
-      expect(tokenLimit('gemini-1.5-pro')).toBe(2097152);
+    it('should return 1M for Gemini 3.x (latest)', () => {
+      expect(tokenLimit('gemini-3-pro-preview')).toBe(1000000);
+      expect(tokenLimit('gemini-3-flash-preview')).toBe(1000000);
+      expect(tokenLimit('gemini-3.1-pro-preview')).toBe(1000000);
     });
-    it('should return the correct limit for Gemini 1.5 Flash', () => {
-      expect(tokenLimit('gemini-1.5-flash')).toBe(1048576);
-    });
-    it('should return the correct limit for Gemini 2.5 Pro', () => {
-      expect(tokenLimit('gemini-2.5-pro')).toBe(1048576);
-    });
-    it('should return the correct limit for Gemini 2.5 Flash', () => {
-      expect(tokenLimit('gemini-2.5-flash')).toBe(1048576);
-    });
-    it('should return the correct limit for Gemini 2.0 Flash with image generation', () => {
-      expect(tokenLimit('gemini-2.0-flash-image-generation')).toBe(32768);
-    });
-    it('should return the correct limit for Gemini 2.0 Flash', () => {
-      expect(tokenLimit('gemini-2.0-flash')).toBe(1048576);
+
+    it('should return 1M for legacy Gemini (fallback)', () => {
+      expect(tokenLimit('gemini-2.5-pro')).toBe(1000000);
+      expect(tokenLimit('gemini-2.5-flash')).toBe(1000000);
+      expect(tokenLimit('gemini-2.0-flash')).toBe(1000000);
+      expect(tokenLimit('gemini-1.5-pro')).toBe(1000000);
+      expect(tokenLimit('gemini-1.5-flash')).toBe(1000000);
     });
   });
 
   describe('OpenAI', () => {
-    it('should return the correct limit for o3-mini', () => {
-      expect(tokenLimit('o3-mini')).toBe(200000);
+    it('should return 400K for GPT-5.x (latest)', () => {
+      expect(tokenLimit('gpt-5')).toBe(400000);
+      expect(tokenLimit('gpt-5-mini')).toBe(400000);
+      expect(tokenLimit('gpt-5.2')).toBe(400000);
+      expect(tokenLimit('gpt-5.2-pro')).toBe(400000);
     });
-    it('should return the correct limit for o3 models', () => {
-      expect(tokenLimit('o3')).toBe(200000);
-    });
-    it('should return the correct limit for o4-mini', () => {
-      expect(tokenLimit('o4-mini')).toBe(200000);
-    });
-    it('should return the correct limit for gpt-4o-mini', () => {
-      expect(tokenLimit('gpt-4o-mini')).toBe(131072);
-    });
-    it('should return the correct limit for gpt-4o', () => {
+
+    it('should return 128K for legacy GPT (fallback)', () => {
       expect(tokenLimit('gpt-4o')).toBe(131072);
-    });
-    it('should return the correct limit for gpt-4.1-mini', () => {
-      expect(tokenLimit('gpt-4.1-mini')).toBe(1048576);
-    });
-    it('should return the correct limit for gpt-4.1 models', () => {
-      expect(tokenLimit('gpt-4.1')).toBe(1048576);
-    });
-    it('should return the correct limit for gpt-4', () => {
+      expect(tokenLimit('gpt-4o-mini')).toBe(131072);
+      expect(tokenLimit('gpt-4.1')).toBe(131072);
       expect(tokenLimit('gpt-4')).toBe(131072);
     });
+
+    it('should return 200K for o-series', () => {
+      expect(tokenLimit('o3')).toBe(200000);
+      expect(tokenLimit('o3-mini')).toBe(200000);
+      expect(tokenLimit('o4-mini')).toBe(200000);
+    });
   });
 
   describe('Anthropic Claude', () => {
-    it('should return the correct limit for Claude 3.5 Sonnet', () => {
+    it('should return 200K for all Claude models', () => {
+      expect(tokenLimit('claude-opus-4-6')).toBe(200000);
+      expect(tokenLimit('claude-sonnet-4-6')).toBe(200000);
+      expect(tokenLimit('claude-sonnet-4')).toBe(200000);
+      expect(tokenLimit('claude-opus-4')).toBe(200000);
       expect(tokenLimit('claude-3.5-sonnet')).toBe(200000);
-    });
-    it('should return the correct limit for Claude 3.7 Sonnet', () => {
-      expect(tokenLimit('claude-3.7-sonnet')).toBe(1048576);
-    });
-    it('should return the correct limit for Claude Sonnet 4', () => {
-      expect(tokenLimit('claude-sonnet-4')).toBe(1048576);
-    });
-    it('should return the correct limit for Claude Opus 4', () => {
-      expect(tokenLimit('claude-opus-4')).toBe(1048576);
+      expect(tokenLimit('claude-3.7-sonnet')).toBe(200000);
     });
   });
 
   describe('Alibaba Qwen', () => {
-    it('should return the correct limit for qwen3-coder commercial models', () => {
-      expect(tokenLimit('qwen3-coder-plus')).toBe(1048576);
-      expect(tokenLimit('qwen3-coder-plus-20250601')).toBe(1048576);
-      expect(tokenLimit('qwen3-coder-flash')).toBe(1048576);
-      expect(tokenLimit('qwen3-coder-flash-20250601')).toBe(1048576);
+    it('should return 1M for commercial Qwen3 models', () => {
+      expect(tokenLimit('qwen3-coder-plus')).toBe(1000000);
+      expect(tokenLimit('qwen3-coder-plus-20250601')).toBe(1000000);
+      expect(tokenLimit('qwen3-coder-flash')).toBe(1000000);
+      expect(tokenLimit('qwen3.5-plus')).toBe(1000000);
+      expect(tokenLimit('coder-model')).toBe(1000000);
     });
 
-    it('should return the correct limit for qwen3-coder open source models', () => {
+    it('should return 256K for Qwen3 non-commercial models', () => {
+      expect(tokenLimit('qwen3-max')).toBe(262144);
+      expect(tokenLimit('qwen3-max-2026-01-23')).toBe(262144);
+      expect(tokenLimit('qwen3-vl-plus')).toBe(262144);
       expect(tokenLimit('qwen3-coder-7b')).toBe(262144);
-      expect(tokenLimit('qwen3-coder-480b-a35b-instruct')).toBe(262144);
-      expect(tokenLimit('qwen3-coder-30b-a3b-instruct')).toBe(262144);
+      expect(tokenLimit('qwen3-coder-next')).toBe(262144);
     });
 
-    it('should return the correct limit for qwen3 2507 variants', () => {
-      expect(tokenLimit('qwen3-some-model-2507-instruct')).toBe(262144);
+    it('should return 1M for studio latest models', () => {
+      expect(tokenLimit('qwen-plus-latest')).toBe(1000000);
+      expect(tokenLimit('qwen-flash-latest')).toBe(1000000);
     });
 
-    it('should return the correct limit for qwen2.5-1m', () => {
-      expect(tokenLimit('qwen2.5-1m')).toBe(1048576);
-      expect(tokenLimit('qwen2.5-1m-instruct')).toBe(1048576);
-    });
-
-    it('should return the correct limit for qwen2.5', () => {
-      expect(tokenLimit('qwen2.5')).toBe(131072);
-      expect(tokenLimit('qwen2.5-instruct')).toBe(131072);
-    });
-
-    it('should return the correct limit for qwen-plus', () => {
-      expect(tokenLimit('qwen-plus-latest')).toBe(1048576);
-      expect(tokenLimit('qwen-plus')).toBe(131072);
-    });
-
-    it('should return the correct limit for qwen-flash', () => {
-      expect(tokenLimit('qwen-flash-latest')).toBe(1048576);
-    });
-
-    it('should return the correct limit for qwen-turbo', () => {
-      expect(tokenLimit('qwen-turbo')).toBe(131072);
-      expect(tokenLimit('qwen-turbo-latest')).toBe(131072);
-    });
-  });
-
-  describe('ByteDance Seed-OSS', () => {
-    it('should return the correct limit for seed-oss', () => {
-      expect(tokenLimit('seed-oss')).toBe(524288);
-    });
-  });
-
-  describe('Zhipu GLM', () => {
-    it('should return the correct limit for glm-4.5v', () => {
-      expect(tokenLimit('glm-4.5v')).toBe(65536);
-    });
-    it('should return the correct limit for glm-4.5-air', () => {
-      expect(tokenLimit('glm-4.5-air')).toBe(131072);
-    });
-    it('should return the correct limit for glm-4.5', () => {
-      expect(tokenLimit('glm-4.5')).toBe(131072);
-    });
-    it('should return the correct limit for glm-4.6', () => {
-      expect(tokenLimit('glm-4.6')).toBe(202752);
+    it('should return 256K for Qwen fallback', () => {
+      expect(tokenLimit('qwen-plus')).toBe(262144);
+      expect(tokenLimit('qwen-turbo')).toBe(262144);
+      expect(tokenLimit('qwen2.5')).toBe(262144);
+      expect(tokenLimit('qwen-vl-max-latest')).toBe(262144);
     });
   });
 
   describe('DeepSeek', () => {
-    it('should return the correct limit for deepseek-r1', () => {
+    it('should return 128K for DeepSeek models', () => {
       expect(tokenLimit('deepseek-r1')).toBe(131072);
-    });
-    it('should return the correct limit for deepseek-v3', () => {
       expect(tokenLimit('deepseek-v3')).toBe(131072);
+      expect(tokenLimit('deepseek-chat')).toBe(131072);
     });
-    it('should return the correct limit for deepseek-v3.1', () => {
-      expect(tokenLimit('deepseek-v3.1')).toBe(131072);
+  });
+
+  describe('Zhipu GLM', () => {
+    it('should return 200K for GLM-5 and GLM-4.7 (latest)', () => {
+      expect(tokenLimit('glm-5')).toBe(202752);
+      expect(tokenLimit('glm-4.7')).toBe(202752);
     });
-    it('should return the correct limit for deepseek-v3.2', () => {
-      expect(tokenLimit('deepseek-v3.2-exp')).toBe(131072);
+
+    it('should return 200K for legacy GLM (fallback)', () => {
+      expect(tokenLimit('glm-4.5')).toBe(202752);
+      expect(tokenLimit('glm-4.5v')).toBe(202752);
+      expect(tokenLimit('glm-4.5-air')).toBe(202752);
+    });
+  });
+
+  describe('MiniMax', () => {
+    it('should return 1M for MiniMax-M2.5 (latest)', () => {
+      expect(tokenLimit('MiniMax-M2.5')).toBe(1000000);
+    });
+
+    it('should return 200K for MiniMax fallback', () => {
+      expect(tokenLimit('MiniMax-M2.1')).toBe(200000);
     });
   });
 
   describe('Moonshot Kimi', () => {
-    it('should return the correct limit for kimi-k2 variants', () => {
-      expect(tokenLimit('kimi-k2-0905-preview')).toBe(262144); // 256K
+    it('should return 256K for Kimi models', () => {
+      expect(tokenLimit('kimi-k2.5')).toBe(262144);
       expect(tokenLimit('kimi-k2-0905')).toBe(262144);
-      expect(tokenLimit('kimi-k2-turbo-preview')).toBe(262144);
       expect(tokenLimit('kimi-k2-turbo')).toBe(262144);
-      expect(tokenLimit('kimi-k2-0711-preview')).toBe(262144);
-      expect(tokenLimit('kimi-k2-instruct')).toBe(262144);
     });
   });
 
   describe('Other models', () => {
-    it('should return the correct limit for gpt-oss', () => {
-      expect(tokenLimit('gpt-oss')).toBe(131072);
+    it('should return correct limits for other known models', () => {
+      expect(tokenLimit('seed-oss')).toBe(524288);
     });
-    it('should return the correct limit for llama-4-scout', () => {
-      expect(tokenLimit('llama-4-scout')).toBe(10485760);
-    });
-    it('should return the correct limit for mistral-large-2', () => {
-      expect(tokenLimit('mistral-large-2')).toBe(131072);
+
+    it('should return the default token limit for unknown models', () => {
+      expect(tokenLimit('llama-4-scout')).toBe(DEFAULT_TOKEN_LIMIT);
     });
   });
 
-  // Test for default limit
   it('should return the default token limit for an unknown model', () => {
     expect(tokenLimit('unknown-model-v1.0')).toBe(DEFAULT_TOKEN_LIMIT);
+    expect(tokenLimit('mistral-large-2')).toBe(DEFAULT_TOKEN_LIMIT);
   });
 
-  // Test with complex model string
   it('should return the correct limit for a complex model string', () => {
     expect(tokenLimit('  a/b/c|GPT-4o:gpt-4o-2024-05-13-q4  ')).toBe(131072);
   });
 
-  // Test case-insensitive matching
   it('should handle case-insensitive model names', () => {
     expect(tokenLimit('GPT-4O')).toBe(131072);
     expect(tokenLimit('CLAUDE-3.5-SONNET')).toBe(200000);
@@ -275,99 +235,97 @@ describe('tokenLimit', () => {
 });
 
 describe('tokenLimit with output type', () => {
-  describe('Qwen models with output limits', () => {
-    it('should return the correct output limit for qwen3-coder-plus', () => {
-      expect(tokenLimit('qwen3-coder-plus', 'output')).toBe(65536);
-      expect(tokenLimit('qwen3-coder-plus-20250601', 'output')).toBe(65536);
+  describe('latest models output limits', () => {
+    it('should return correct output limits for GPT-5.x', () => {
+      expect(tokenLimit('gpt-5.2', 'output')).toBe(131072);
+      expect(tokenLimit('gpt-5-mini', 'output')).toBe(131072);
     });
 
-    it('should return the correct output limit for qwen-vl-max-latest', () => {
+    it('should return correct output limits for Gemini 3.x', () => {
+      expect(tokenLimit('gemini-3-pro-preview', 'output')).toBe(65536);
+      expect(tokenLimit('gemini-3-flash-preview', 'output')).toBe(65536);
+    });
+
+    it('should return correct output limits for Claude 4.6', () => {
+      expect(tokenLimit('claude-opus-4-6', 'output')).toBe(131072);
+      expect(tokenLimit('claude-sonnet-4-6', 'output')).toBe(65536);
+    });
+  });
+
+  describe('legacy model output fallbacks', () => {
+    it('should return fallback output limits for legacy GPT', () => {
+      expect(tokenLimit('gpt-4o', 'output')).toBe(16384);
+    });
+
+    it('should return fallback output limits for legacy Gemini', () => {
+      expect(tokenLimit('gemini-2.5-pro', 'output')).toBe(8192);
+    });
+
+    it('should return fallback output limits for legacy Claude', () => {
+      expect(tokenLimit('claude-sonnet-4', 'output')).toBe(65536);
+      expect(tokenLimit('claude-opus-4', 'output')).toBe(65536);
+    });
+  });
+
+  describe('Qwen output limits', () => {
+    it('should return correct output limits for Qwen models', () => {
+      expect(tokenLimit('qwen3.5-plus', 'output')).toBe(65536);
+      expect(tokenLimit('qwen3-max', 'output')).toBe(65536);
+      expect(tokenLimit('qwen3-max-2026-01-23', 'output')).toBe(65536);
+      expect(tokenLimit('coder-model', 'output')).toBe(65536);
+      // Models without specific output limits fall back to default
+      expect(tokenLimit('qwen3-coder-plus', 'output')).toBe(8192);
+      expect(tokenLimit('qwen3-coder-next', 'output')).toBe(8192);
+      expect(tokenLimit('qwen3-vl-plus', 'output')).toBe(8192);
       expect(tokenLimit('qwen-vl-max-latest', 'output')).toBe(8192);
     });
   });
 
-  describe('Default output limits', () => {
+  describe('other output limits', () => {
+    it('should return correct output limits for DeepSeek', () => {
+      expect(tokenLimit('deepseek-reasoner', 'output')).toBe(65536);
+      expect(tokenLimit('deepseek-chat', 'output')).toBe(8192);
+    });
+
+    it('should return correct output limits for GLM', () => {
+      expect(tokenLimit('glm-5', 'output')).toBe(16384);
+      expect(tokenLimit('glm-4.7', 'output')).toBe(16384);
+    });
+
+    it('should return correct output limits for MiniMax', () => {
+      expect(tokenLimit('MiniMax-M2.5', 'output')).toBe(65536);
+    });
+
+    it('should return correct output limits for Kimi', () => {
+      expect(tokenLimit('kimi-k2.5', 'output')).toBe(32768);
+    });
+  });
+
+  describe('default output limits', () => {
     it('should return the default output limit for unknown models', () => {
       expect(tokenLimit('unknown-model', 'output')).toBe(
         DEFAULT_OUTPUT_TOKEN_LIMIT,
       );
-      expect(tokenLimit('gpt-4', 'output')).toBe(DEFAULT_OUTPUT_TOKEN_LIMIT);
-      expect(tokenLimit('claude-3.5-sonnet', 'output')).toBe(
-        DEFAULT_OUTPUT_TOKEN_LIMIT,
-      );
-    });
-
-    it('should return the default output limit for models without specific output patterns', () => {
-      expect(tokenLimit('qwen3-coder-7b', 'output')).toBe(
-        DEFAULT_OUTPUT_TOKEN_LIMIT,
-      );
-      expect(tokenLimit('qwen-plus', 'output')).toBe(
-        DEFAULT_OUTPUT_TOKEN_LIMIT,
-      );
-      expect(tokenLimit('qwen-vl-max', 'output')).toBe(
-        DEFAULT_OUTPUT_TOKEN_LIMIT,
-      );
     });
   });
 
-  describe('Input vs Output limits comparison', () => {
-    it('should return different limits for input vs output for qwen3-coder-plus', () => {
-      expect(tokenLimit('qwen3-coder-plus', 'input')).toBe(1048576); // 1M input
-      expect(tokenLimit('qwen3-coder-plus', 'output')).toBe(65536); // 64K output
+  describe('input vs output comparison', () => {
+    it('should return different limits for input vs output', () => {
+      expect(tokenLimit('qwen3-max', 'input')).toBe(262144);
+      expect(tokenLimit('qwen3-max', 'output')).toBe(65536);
     });
 
-    it('should return different limits for input vs output for qwen-vl-max-latest', () => {
-      expect(tokenLimit('qwen-vl-max-latest', 'input')).toBe(131072); // 128K input
-      expect(tokenLimit('qwen-vl-max-latest', 'output')).toBe(8192); // 8K output
-    });
-
-    it('should return different limits for input vs output for qwen3-vl-plus', () => {
-      expect(tokenLimit('qwen3-vl-plus', 'input')).toBe(262144); // 256K input
-      expect(tokenLimit('qwen3-vl-plus', 'output')).toBe(32768); // 32K output
-    });
-
-    it('should return same default limits for unknown models', () => {
-      expect(tokenLimit('unknown-model', 'input')).toBe(DEFAULT_TOKEN_LIMIT); // 128K input
-      expect(tokenLimit('unknown-model', 'output')).toBe(
-        DEFAULT_OUTPUT_TOKEN_LIMIT,
-      ); // 4K output
-    });
-  });
-
-  describe('Backward compatibility', () => {
     it('should default to input type when no type is specified', () => {
-      expect(tokenLimit('qwen3-coder-plus')).toBe(1048576); // Should be input limit
-      expect(tokenLimit('qwen-vl-max-latest')).toBe(131072); // Should be input limit
-      expect(tokenLimit('unknown-model')).toBe(DEFAULT_TOKEN_LIMIT); // Should be input default
-    });
-
-    it('should work with explicit input type', () => {
-      expect(tokenLimit('qwen3-coder-plus', 'input')).toBe(1048576);
-      expect(tokenLimit('qwen-vl-max-latest', 'input')).toBe(131072);
-      expect(tokenLimit('unknown-model', 'input')).toBe(DEFAULT_TOKEN_LIMIT);
+      expect(tokenLimit('qwen3-coder-plus')).toBe(1000000);
+      expect(tokenLimit('unknown-model')).toBe(DEFAULT_TOKEN_LIMIT);
     });
   });
 
-  describe('Model normalization with output limits', () => {
+  describe('normalization with output limits', () => {
     it('should handle normalized model names for output limits', () => {
-      expect(tokenLimit('QWEN3-CODER-PLUS', 'output')).toBe(65536);
-      expect(tokenLimit('qwen3-coder-plus-20250601', 'output')).toBe(65536);
+      expect(tokenLimit('QWEN3-MAX', 'output')).toBe(65536);
+      expect(tokenLimit('qwen3-max-20250601', 'output')).toBe(65536);
       expect(tokenLimit('QWEN-VL-MAX-LATEST', 'output')).toBe(8192);
     });
-
-    it('should handle complex model strings for output limits', () => {
-      expect(
-        tokenLimit(
-          '  a/b/c|QWEN3-CODER-PLUS:qwen3-coder-plus-2024-05-13  ',
-          'output',
-        ),
-      ).toBe(65536);
-      expect(
-        tokenLimit(
-          'provider/qwen-vl-max-latest:qwen-vl-max-latest-v1',
-          'output',
-        ),
-      ).toBe(8192);
-    });
   });
 });
diff --git a/packages/core/src/core/tokenLimits.ts b/packages/core/src/core/tokenLimits.ts
index 2419e51a1..d038133cb 100644
--- a/packages/core/src/core/tokenLimits.ts
+++ b/packages/core/src/core/tokenLimits.ts
@@ -9,23 +9,23 @@ type TokenCount = number;
 export type TokenLimitType = 'input' | 'output';
 
 export const DEFAULT_TOKEN_LIMIT: TokenCount = 131_072; // 128K (power-of-two)
-export const DEFAULT_OUTPUT_TOKEN_LIMIT: TokenCount = 4_096; // 4K tokens
+export const DEFAULT_OUTPUT_TOKEN_LIMIT: TokenCount = 8_192; // 8K tokens
 
 /**
  * Accurate numeric limits:
  * - power-of-two approximations (128K -> 131072, 256K -> 262144, etc.)
- * - vendor-declared exact values (e.g., 200k -> 200000) are used as stated in docs.
+ * - vendor-declared exact values (e.g., 200k -> 200000, 1m -> 1000000) are
+ *   used as stated in docs.
  */
 const LIMITS = {
   '32k': 32_768,
   '64k': 65_536,
   '128k': 131_072,
-  '200k': 200_000, // vendor-declared decimal, used by OpenAI, Anthropic, GLM etc.
+  '200k': 200_000, // vendor-declared decimal, used by OpenAI, Anthropic, etc.
   '256k': 262_144,
+  '400k': 400_000, // vendor-declared decimal, used by OpenAI GPT-5.x
   '512k': 524_288,
-  '1m': 1_048_576,
-  '2m': 2_097_152,
-  '10m': 10_485_760, // 10 million tokens
+  '1m': 1_000_000,
   // Output token limits (typically much smaller than input limits)
   '4k': 4_096,
   '8k': 8_192,
@@ -81,110 +81,64 @@ const PATTERNS: Array<[RegExp, TokenCount]> = [
   // -------------------
   // Google Gemini
   // -------------------
-  [/^gemini-1\.5-pro$/, LIMITS['2m']],
-  [/^gemini-1\.5-flash$/, LIMITS['1m']],
-  [/^gemini-2\.5-pro.*$/, LIMITS['1m']],
-  [/^gemini-2\.5-flash.*$/, LIMITS['1m']],
-  [/^gemini-2\.0-flash-image-generation$/, LIMITS['32k']],
-  [/^gemini-2\.0-flash.*$/, LIMITS['1m']],
+  [/^gemini-3/, LIMITS['1m']], // Gemini 3.x (Pro, Flash, 3.1, etc.): 1M
+  [/^gemini-/, LIMITS['1m']], // Gemini fallback (1.5, 2.x): 1M
 
   // -------------------
-  // OpenAI (o3 / o4-mini / gpt-4.1 / gpt-4o family)
-  // o3 and o4-mini document a 200,000-token context window (decimal).
-  // Note: GPT-4.1 models typically report 1_048_576 (1M) context in OpenAI announcements.
-  [/^o3(?:-mini|$).*$/, LIMITS['200k']],
-  [/^o3.*$/, LIMITS['200k']],
-  [/^o4-mini.*$/, LIMITS['200k']],
-  [/^gpt-4\.1-mini.*$/, LIMITS['1m']],
-  [/^gpt-4\.1.*$/, LIMITS['1m']],
-  [/^gpt-4o-mini.*$/, LIMITS['128k']],
-  [/^gpt-4o.*$/, LIMITS['128k']],
-  [/^gpt-4.*$/, LIMITS['128k']],
+  // OpenAI
+  // -------------------
+  [/^gpt-5/, LIMITS['400k']], // GPT-5.x: 400K
+  [/^gpt-/, LIMITS['128k']], // GPT fallback (4o, 4.1, etc.): 128K
+  [/^o\d/, LIMITS['200k']], // o-series (o3, o4-mini, etc.): 200K
 
   // -------------------
   // Anthropic Claude
-  // - Claude Sonnet / Sonnet 3.5 and related Sonnet variants: 200,000 tokens documented.
-  // - Some Sonnet/Opus models offer 1M in beta/enterprise tiers (handled separately if needed).
-  [/^claude-3\.5-sonnet.*$/, LIMITS['200k']],
-  [/^claude-3\.7-sonnet.*$/, LIMITS['1m']], // some Sonnet 3.7/Opus variants advertise 1M beta in docs
-  [/^claude-sonnet-4.*$/, LIMITS['1m']],
-  [/^claude-opus-4.*$/, LIMITS['1m']],
+  // -------------------
+  [/^claude-/, LIMITS['200k']], // All Claude models: 200K
 
   // -------------------
   // Alibaba / Qwen
   // -------------------
-  // Commercial Qwen3-Coder-Plus: 1M token context
-  [/^qwen3-coder-plus(-.*)?$/, LIMITS['1m']], // catches "qwen3-coder-plus" and date variants
-
-  // Commercial Qwen3-Coder-Flash: 1M token context
-  [/^qwen3-coder-flash(-.*)?$/, LIMITS['1m']], // catches "qwen3-coder-flash" and date variants
-
-  // Commercial Qwen3.5-Plus: 1M token context
-  [/^qwen3\.5-plus(-.*)?$/, LIMITS['1m']], // catches "qwen3.5-plus" and date variants
-
-  // Generic coder-model: same as qwen3.5-plus (1M token context)
-  [/^coder-model$/, LIMITS['1m']],
-
-  // Commercial Qwen3-Max-Preview: 256K token context
-  [/^qwen3-max(-preview)?(-.*)?$/, LIMITS['256k']], // catches "qwen3-max" or "qwen3-max-preview" and date variants
-
-  // Open-source Qwen3-Coder variants: 256K native
-  [/^qwen3-coder-.*$/, LIMITS['256k']],
-  // Open-source Qwen3 2507 variants: 256K native
-  [/^qwen3-.*-2507-.*$/, LIMITS['256k']],
-
-  // Open-source long-context Qwen2.5-1M
-  [/^qwen2\.5-1m.*$/, LIMITS['1m']],
-
-  // Standard Qwen2.5: 128K
-  [/^qwen2\.5.*$/, LIMITS['128k']],
-
-  // Studio commercial Qwen-Plus / Qwen-Flash / Qwen-Turbo
-  [/^qwen-plus-latest$/, LIMITS['1m']], // Commercial latest: 1M
-  [/^qwen-plus.*$/, LIMITS['128k']], // Standard: 128K
+  // Commercial API models (1,000,000 context)
+  [/^qwen3-coder-plus/, LIMITS['1m']],
+  [/^qwen3-coder-flash/, LIMITS['1m']],
+  [/^qwen3\.5-plus/, LIMITS['1m']],
+  [/^qwen-plus-latest$/, LIMITS['1m']],
   [/^qwen-flash-latest$/, LIMITS['1m']],
-  [/^qwen-turbo.*$/, LIMITS['128k']],
-
-  // Qwen Vision Models
-  [/^qwen3-vl-plus$/, LIMITS['256k']], // Qwen3-VL-Plus: 256K input
-  [/^qwen-vl-max.*$/, LIMITS['128k']],
-
-  // -------------------
-  // ByteDance Seed-OSS (512K)
-  // -------------------
-  [/^seed-oss.*$/, LIMITS['512k']],
-
-  // -------------------
-  // Zhipu GLM
-  // -------------------
-  [/^glm-4\.5v(?:-.*)?$/, LIMITS['64k']],
-  [/^glm-4\.5-air(?:-.*)?$/, LIMITS['128k']],
-  [/^glm-4\.5(?:-.*)?$/, LIMITS['128k']],
-  [/^glm-4\.6(?:-.*)?$/, 202_752 as unknown as TokenCount], // exact limit from the model config file
-  [/^glm-4\.7(?:-.*)?$/, LIMITS['200k']],
+  [/^coder-model$/, LIMITS['1m']],
+  // Commercial API models (256K context)
+  [/^qwen3-max/, LIMITS['256k']],
+  // Open-source Qwen3 variants: 256K native
+  [/^qwen3-coder-/, LIMITS['256k']],
+  // Qwen fallback (VL, turbo, plus, 2.5, etc.): 128K
+  [/^qwen/, LIMITS['256k']],
 
   // -------------------
   // DeepSeek
   // -------------------
-  [/^deepseek(?:-.*)?$/, LIMITS['128k']],
+  [/^deepseek/, LIMITS['128k']],
 
   // -------------------
-  // Moonshot / Kimi
+  // Zhipu GLM
   // -------------------
-  [/^kimi-2\.5.*$/, LIMITS['256k']], // Kimi-2.5: 256K context
-  [/^kimi-k2.*$/, LIMITS['256k']], // Kimi-k2 variants: 256K context
-
-  // -------------------
-  // GPT-OSS / Llama & Mistral examples
-  // -------------------
-  [/^gpt-oss.*$/, LIMITS['128k']],
-  [/^llama-4-scout.*$/, LIMITS['10m']],
-  [/^mistral-large-2.*$/, LIMITS['128k']],
+  [/^glm-5/, 202_752 as TokenCount], // GLM-5: exact vendor limit
+  [/^glm-/, 202_752 as TokenCount], // GLM fallback: 128K
 
   // -------------------
   // MiniMax
   // -------------------
-  [/^minimax-m2\.1.*$/i, LIMITS['200k']], // MiniMax-M2.1: 200K context
+  [/^minimax-m2\.5/i, LIMITS['1m']], // MiniMax-M2.5: 1,000,000
+  [/^minimax-/i, LIMITS['200k']], // MiniMax fallback: 200K
+
+  // -------------------
+  // Moonshot / Kimi
+  // -------------------
+  [/^kimi-/, LIMITS['256k']], // Kimi fallback: 256K
+
+  // -------------------
+  // ByteDance Seed-OSS (512K)
+  // -------------------
+  [/^seed-oss/, LIMITS['512k']],
 ];
 
 /**
@@ -193,32 +147,38 @@ const PATTERNS: Array<[RegExp, TokenCount]> = [
  * in a single response for specific models.
  */
 const OUTPUT_PATTERNS: Array<[RegExp, TokenCount]> = [
-  // -------------------
-  // Alibaba / Qwen - DashScope Models
-  // -------------------
-  // Qwen3-Coder-Plus: 65,536 max output tokens
-  [/^qwen3-coder-plus(-.*)?$/, LIMITS['64k']],
+  // Google Gemini
+  [/^gemini-3/, LIMITS['64k']], // Gemini 3.x: 64K
+  [/^gemini-/, LIMITS['8k']], // Gemini fallback: 8K
 
-  // Qwen3.5-Plus: 65,536 max output tokens
-  [/^qwen3\.5-plus(-.*)?$/, LIMITS['64k']],
+  // OpenAI
+  [/^gpt-5/, LIMITS['128k']], // GPT-5.x: 128K
+  [/^gpt-/, LIMITS['16k']], // GPT fallback: 16K
+  [/^o\d/, LIMITS['128k']], // o-series: 128K
 
-  // Generic coder-model: same as qwen3.5-plus (64K max output tokens)
+  // Anthropic Claude
+  [/^claude-opus-4-6/, LIMITS['128k']], // Opus 4.6: 128K
+  [/^claude-sonnet-4-6/, LIMITS['64k']], // Sonnet 4.6: 64K
+  [/^claude-/, LIMITS['64k']], // Claude fallback: 64K
+
+  // Alibaba / Qwen
+  [/^qwen3\.5/, LIMITS['64k']],
   [/^coder-model$/, LIMITS['64k']],
+  [/^qwen3-max/, LIMITS['64k']],
 
-  // Qwen3-Max: 65,536 max output tokens
-  [/^qwen3-max(-preview)?(-.*)?$/, LIMITS['64k']],
+  // DeepSeek
+  [/^deepseek-reasoner/, LIMITS['64k']],
+  [/^deepseek-chat/, LIMITS['8k']],
 
-  // Qwen-VL-Max-Latest: 8,192 max output tokens
-  [/^qwen-vl-max-latest$/, LIMITS['8k']],
+  // Zhipu GLM
+  [/^glm-5/, LIMITS['16k']],
+  [/^glm-4\.7/, LIMITS['16k']],
 
-  // Qwen3-VL-Plus: 32K max output tokens
-  [/^qwen3-vl-plus$/, LIMITS['32k']],
+  // MiniMax
+  [/^minimax-m2\.5/i, LIMITS['64k']],
 
-  // Deepseek-chat: 8k max tokens
-  [/^deepseek-chat$/, LIMITS['8k']],
-
-  // Deepseek-reasoner: 64k max tokens
-  [/^deepseek-reasoner$/, LIMITS['64k']],
+  // Kimi
+  [/^kimi-k2\.5/, LIMITS['32k']],
 ];
 
 /**
diff --git a/packages/core/src/core/turn.test.ts b/packages/core/src/core/turn.test.ts
index 7d687a17b..148a19d63 100644
--- a/packages/core/src/core/turn.test.ts
+++ b/packages/core/src/core/turn.test.ts
@@ -873,4 +873,141 @@ describe('Turn', () => {
       expect(turn.getDebugResponses()).toEqual([resp1, resp2]);
     });
   });
+
+  describe('wasOutputTruncated flag', () => {
+    it('should set wasOutputTruncated=true on pending tool calls when finishReason is MAX_TOKENS', async () => {
+      const mockResponseStream = (async function* () {
+        // Yield a tool call request
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            functionCalls: [
+              {
+                name: 'write_file',
+                args: { file_path: '/test.txt', content: 'hello' },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+        // Yield finish with MAX_TOKENS
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            candidates: [
+              {
+                finishReason: 'MAX_TOKENS',
+                content: { parts: [] },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+      })();
+      mockSendMessageStream.mockResolvedValue(mockResponseStream);
+
+      const reqParts: Part[] = [{ text: 'Test prompt' }];
+      const events = [];
+      for await (const event of turn.run(
+        'test-model',
+        reqParts,
+        new AbortController().signal,
+      )) {
+        events.push(event);
+      }
+
+      // Verify that pending tool calls have wasOutputTruncated flag set
+      expect(turn.pendingToolCalls).toHaveLength(1);
+      expect(turn.pendingToolCalls[0].wasOutputTruncated).toBe(true);
+      expect(turn.pendingToolCalls[0].name).toBe('write_file');
+    });
+
+    it('should NOT set wasOutputTruncated when finishReason is STOP', async () => {
+      const mockResponseStream = (async function* () {
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            functionCalls: [
+              {
+                name: 'read_file',
+                args: { file_path: '/test.txt' },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+        // Yield finish with STOP (normal completion)
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            candidates: [
+              {
+                finishReason: 'STOP',
+                content: { parts: [] },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+      })();
+      mockSendMessageStream.mockResolvedValue(mockResponseStream);
+
+      const reqParts: Part[] = [{ text: 'Test prompt' }];
+      for await (const _ of turn.run(
+        'test-model',
+        reqParts,
+        new AbortController().signal,
+      )) {
+        // consume stream
+      }
+
+      // Verify that pending tool calls do NOT have wasOutputTruncated flag
+      expect(turn.pendingToolCalls).toHaveLength(1);
+      expect(turn.pendingToolCalls[0].wasOutputTruncated).toBeUndefined();
+    });
+
+    it('should handle multiple pending tool calls with MAX_TOKENS', async () => {
+      const mockResponseStream = (async function* () {
+        // Yield two tool calls
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            functionCalls: [
+              {
+                name: 'write_file',
+                args: { file_path: '/test1.txt', content: 'content1' },
+              },
+              {
+                name: 'edit',
+                args: { file_path: '/test2.txt', original_text: 'old' },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+        // Yield finish with MAX_TOKENS
+        yield {
+          type: StreamEventType.CHUNK,
+          value: {
+            candidates: [
+              {
+                finishReason: 'MAX_TOKENS',
+                content: { parts: [] },
+              },
+            ],
+          } as unknown as GenerateContentResponse,
+        };
+      })();
+      mockSendMessageStream.mockResolvedValue(mockResponseStream);
+
+      const reqParts: Part[] = [{ text: 'Test prompt' }];
+      for await (const _ of turn.run(
+        'test-model',
+        reqParts,
+        new AbortController().signal,
+      )) {
+        // consume stream
+      }
+
+      // Verify both tool calls have wasOutputTruncated flag set
+      expect(turn.pendingToolCalls).toHaveLength(2);
+      expect(turn.pendingToolCalls[0].wasOutputTruncated).toBe(true);
+      expect(turn.pendingToolCalls[1].wasOutputTruncated).toBe(true);
+    });
+  });
 });
diff --git a/packages/core/src/core/turn.ts b/packages/core/src/core/turn.ts
index 3115cb425..08f379d68 100644
--- a/packages/core/src/core/turn.ts
+++ b/packages/core/src/core/turn.ts
@@ -4,14 +4,14 @@
  * SPDX-License-Identifier: Apache-2.0
  */
 
-import type {
-  Part,
-  PartListUnion,
-  GenerateContentResponse,
-  FunctionCall,
-  FunctionDeclaration,
+import {
   FinishReason,
-  GenerateContentResponseUsageMetadata,
+  type Part,
+  type PartListUnion,
+  type GenerateContentResponse,
+  type FunctionCall,
+  type FunctionDeclaration,
+  type GenerateContentResponseUsageMetadata,
 } from '@google/genai';
 import type {
   ToolCallConfirmationDetails,
@@ -99,6 +99,8 @@ export interface ToolCallRequestInfo {
   isClientInitiated: boolean;
   prompt_id: string;
   response_id?: string;
+  /** Set to true when the LLM response was truncated due to max_tokens. */
+  wasOutputTruncated?: boolean;
 }
 
 export interface ToolCallResponseInfo {
@@ -313,6 +315,14 @@ export class Turn {
 
         // This is the key change: Only yield 'Finished' if there is a finishReason.
         if (finishReason) {
+          // Mark pending tool calls so downstream can distinguish
+          // truncation from real parameter errors.
+          if (finishReason === FinishReason.MAX_TOKENS) {
+            for (const tc of this.pendingToolCalls) {
+              tc.wasOutputTruncated = true;
+            }
+          }
+
           if (this.pendingCitations.size > 0) {
             yield {
               type: GeminiEventType.Citation,
diff --git a/packages/core/src/models/constants.ts b/packages/core/src/models/constants.ts
index 025e3b9cf..c7f4a148b 100644
--- a/packages/core/src/models/constants.ts
+++ b/packages/core/src/models/constants.ts
@@ -22,12 +22,14 @@ export const MODEL_GENERATION_CONFIG_FIELDS = [
   'samplingParams',
   'timeout',
   'maxRetries',
+  'retryErrorCodes',
   'enableCacheControl',
   'schemaCompliance',
   'reasoning',
   'contextWindowSize',
   'customHeaders',
   'extra_body',
+  'modalities',
 ] as const satisfies ReadonlyArray<keyof ContentGeneratorConfig>;
 
 /**
diff --git a/packages/core/src/models/modelRegistry.ts b/packages/core/src/models/modelRegistry.ts
index 7b9bdad77..c2815fb32 100644
--- a/packages/core/src/models/modelRegistry.ts
+++ b/packages/core/src/models/modelRegistry.ts
@@ -5,6 +5,8 @@
  */
 
 import { AuthType } from '../core/contentGenerator.js';
+import { defaultModalities } from '../core/modalityDefaults.js';
+import { tokenLimit } from '../core/tokenLimits.js';
 import { DEFAULT_OPENAI_BASE_URL } from '../core/openaiContentGenerator/constants.js';
 import {
   type ModelConfig,
@@ -121,7 +123,12 @@ export class ModelRegistry {
       capabilities: model.capabilities,
       authType: model.authType,
       isVision: model.capabilities?.vision ?? false,
-      contextWindowSize: model.generationConfig.contextWindowSize,
+      contextWindowSize:
+        model.generationConfig.contextWindowSize ?? tokenLimit(model.id),
+      modalities:
+        model.generationConfig.modalities ?? defaultModalities(model.id),
+      baseUrl: model.baseUrl,
+      envKey: model.envKey,
     }));
   }
 
diff --git a/packages/core/src/models/modelsConfig.ts b/packages/core/src/models/modelsConfig.ts
index a77d1d06b..d22cc790c 100644
--- a/packages/core/src/models/modelsConfig.ts
+++ b/packages/core/src/models/modelsConfig.ts
@@ -11,6 +11,7 @@ import type { ContentGeneratorConfig } from '../core/contentGenerator.js';
 import type { ContentGeneratorConfigSources } from '../core/contentGenerator.js';
 import { DEFAULT_QWEN_MODEL } from '../config/models.js';
 import { tokenLimit } from '../core/tokenLimits.js';
+import { defaultModalities } from '../core/modalityDefaults.js';
 
 import { ModelRegistry } from './modelRegistry.js';
 import {
@@ -770,6 +771,15 @@ export class ModelsConfig {
         detail: 'auto-detected from model',
       };
     }
+
+    // modalities fallback: auto-detect from model when not set by provider
+    if (gc.modalities === undefined) {
+      this._generationConfig.modalities = defaultModalities(model.id);
+      this.generationConfigSources['modalities'] = {
+        kind: 'computed',
+        detail: 'auto-detected from model',
+      };
+    }
   }
 
   /**
diff --git a/packages/core/src/models/types.ts b/packages/core/src/models/types.ts
index 69c286729..64f5ef43e 100644
--- a/packages/core/src/models/types.ts
+++ b/packages/core/src/models/types.ts
@@ -7,6 +7,7 @@
 import type {
   AuthType,
   ContentGeneratorConfig,
+  InputModalities,
 } from '../core/contentGenerator.js';
 import type { ConfigSources } from '../utils/configResolver.js';
 
@@ -29,12 +30,14 @@ export type ModelGenerationConfig = Pick<
   | 'samplingParams'
   | 'timeout'
   | 'maxRetries'
+  | 'retryErrorCodes'
   | 'enableCacheControl'
   | 'schemaCompliance'
   | 'reasoning'
   | 'customHeaders'
   | 'extra_body'
   | 'contextWindowSize'
+  | 'modalities'
 >;
 
 /**
@@ -93,6 +96,9 @@ export interface AvailableModel {
   authType: AuthType;
   isVision?: boolean;
   contextWindowSize?: number;
+  modalities?: InputModalities;
+  baseUrl?: string;
+  envKey?: string;
 
   /** Whether this is a runtime model (not from modelProviders) */
   isRuntimeModel?: boolean;
diff --git a/packages/core/src/subagents/subagent.test.ts b/packages/core/src/subagents/subagent.test.ts
index ce6e64ae4..0286d11c8 100644
--- a/packages/core/src/subagents/subagent.test.ts
+++ b/packages/core/src/subagents/subagent.test.ts
@@ -458,6 +458,103 @@ describe('subagent.ts', () => {
         ]);
       });
 
+      it('should append userMemory to the system prompt when available', async () => {
+        const { config } = await createMockConfig();
+        const userMemoryContent =
+          '# Output language preference: English\nRespond in English.';
+        vi.spyOn(config, 'getUserMemory').mockReturnValue(userMemoryContent);
+
+        vi.mocked(GeminiChat).mockClear();
+
+        const promptConfig: PromptConfig = {
+          systemPrompt: 'You are a test agent.',
+        };
+        const context = new ContextState();
+
+        mockSendMessageStream.mockImplementation(createMockStream(['stop']));
+
+        const scope = await SubAgentScope.create(
+          'test-agent',
+          config,
+          promptConfig,
+          defaultModelConfig,
+          defaultRunConfig,
+        );
+
+        await scope.runNonInteractive(context);
+
+        const generationConfig = getGenerationConfigFromMock();
+        expect(generationConfig.systemInstruction).toContain(
+          'You are a test agent.',
+        );
+        expect(generationConfig.systemInstruction).toContain(
+          'Important Rules:',
+        );
+        expect(generationConfig.systemInstruction).toContain(
+          '# Output language preference: English',
+        );
+        expect(generationConfig.systemInstruction).toContain(
+          'Respond in English.',
+        );
+      });
+
+      it('should not append userMemory separator when userMemory is empty', async () => {
+        const { config } = await createMockConfig();
+        vi.spyOn(config, 'getUserMemory').mockReturnValue('');
+
+        vi.mocked(GeminiChat).mockClear();
+
+        const promptConfig: PromptConfig = {
+          systemPrompt: 'You are a test agent.',
+        };
+        const context = new ContextState();
+
+        mockSendMessageStream.mockImplementation(createMockStream(['stop']));
+
+        const scope = await SubAgentScope.create(
+          'test-agent',
+          config,
+          promptConfig,
+          defaultModelConfig,
+          defaultRunConfig,
+        );
+
+        await scope.runNonInteractive(context);
+
+        const generationConfig = getGenerationConfigFromMock();
+        const sysPrompt = generationConfig.systemInstruction as string;
+        expect(sysPrompt).toContain('You are a test agent.');
+        expect(sysPrompt).not.toContain('---');
+      });
+
+      it('should not append userMemory separator when userMemory is whitespace-only', async () => {
+        const { config } = await createMockConfig();
+        vi.spyOn(config, 'getUserMemory').mockReturnValue('   \n\n  ');
+
+        vi.mocked(GeminiChat).mockClear();
+
+        const promptConfig: PromptConfig = {
+          systemPrompt: 'You are a test agent.',
+        };
+        const context = new ContextState();
+
+        mockSendMessageStream.mockImplementation(createMockStream(['stop']));
+
+        const scope = await SubAgentScope.create(
+          'test-agent',
+          config,
+          promptConfig,
+          defaultModelConfig,
+          defaultRunConfig,
+        );
+
+        await scope.runNonInteractive(context);
+
+        const generationConfig = getGenerationConfigFromMock();
+        const sysPrompt = generationConfig.systemInstruction as string;
+        expect(sysPrompt).not.toContain('---');
+      });
+
       it('should use initialMessages instead of systemPrompt if provided', async () => {
         const { config } = await createMockConfig();
         vi.mocked(GeminiChat).mockClear();
diff --git a/packages/core/src/subagents/subagent.ts b/packages/core/src/subagents/subagent.ts
index c9328e5ad..613bc8044 100644
--- a/packages/core/src/subagents/subagent.ts
+++ b/packages/core/src/subagents/subagent.ts
@@ -999,6 +999,12 @@ Important Rules:
  - Use tools only when necessary to obtain facts or make changes.
  - When the task is complete, return the final result as a normal model response (not a tool call) and stop.`;
 
+    // Append user memory (QWEN.md + output-language.md) to ensure subagent respects project conventions
+    const userMemory = this.runtimeContext.getUserMemory();
+    if (userMemory && userMemory.trim().length > 0) {
+      finalPrompt += `\n\n---\n\n${userMemory.trim()}`;
+    }
+
     return finalPrompt;
   }
 }
diff --git a/packages/core/src/tools/memoryTool.ts b/packages/core/src/tools/memoryTool.ts
index fff2d2be1..95c89b18b 100644
--- a/packages/core/src/tools/memoryTool.ts
+++ b/packages/core/src/tools/memoryTool.ts
@@ -76,11 +76,16 @@ Do NOT use this tool:
 
 export const QWEN_CONFIG_DIR = '.qwen';
 export const DEFAULT_CONTEXT_FILENAME = 'QWEN.md';
+export const AGENT_CONTEXT_FILENAME = 'AGENTS.md';
 export const MEMORY_SECTION_HEADER = '## Qwen Added Memories';
 
-// This variable will hold the currently configured filename for QWEN.md context files.
-// It defaults to DEFAULT_CONTEXT_FILENAME but can be overridden by setGeminiMdFilename.
-let currentGeminiMdFilename: string | string[] = DEFAULT_CONTEXT_FILENAME;
+// This variable will hold the currently configured filename for context files.
+// It defaults to include both QWEN.md and AGENTS.md but can be overridden by setGeminiMdFilename.
+// QWEN.md is first to maintain backward compatibility (used by /init command and save_memory tool).
+let currentGeminiMdFilename: string | string[] = [
+  DEFAULT_CONTEXT_FILENAME,
+  AGENT_CONTEXT_FILENAME,
+];
 
 export function setGeminiMdFilename(newFilename: string | string[]): void {
   if (Array.isArray(newFilename)) {
diff --git a/packages/core/src/tools/read-file.test.ts b/packages/core/src/tools/read-file.test.ts
index 4972f26e7..ec07a6995 100644
--- a/packages/core/src/tools/read-file.test.ts
+++ b/packages/core/src/tools/read-file.test.ts
@@ -231,8 +231,8 @@ describe('ReadFileTool', () => {
 
     it('should return error for a file that is too large', async () => {
       const filePath = path.join(tempRootDir, 'largefile.txt');
-      // 21MB of content exceeds 20MB limit
-      const largeContent = 'x'.repeat(21 * 1024 * 1024);
+      // 11MB of content exceeds 10MB limit
+      const largeContent = 'x'.repeat(11 * 1024 * 1024);
       await fsp.writeFile(filePath, largeContent, 'utf-8');
       const params: ReadFileToolParams = { absolute_path: filePath };
       const invocation = tool.build(params) as ToolInvocation<
@@ -244,7 +244,7 @@ describe('ReadFileTool', () => {
       expect(result).toHaveProperty('error');
       expect(result.error?.type).toBe(ToolErrorType.FILE_TOO_LARGE);
       expect(result.error?.message).toContain(
-        'File size exceeds the 20MB limit',
+        'File size exceeds the 10MB limit',
       );
     });
 
diff --git a/packages/core/src/tools/tool-error.ts b/packages/core/src/tools/tool-error.ts
index a07de4777..96581602f 100644
--- a/packages/core/src/tools/tool-error.ts
+++ b/packages/core/src/tools/tool-error.ts
@@ -66,4 +66,7 @@ export enum ToolErrorType {
 
   // WebSearch-specific Errors
   WEB_SEARCH_FAILED = 'web_search_failed',
+
+  // Truncation Errors
+  OUTPUT_TRUNCATED = 'output_truncated',
 }
diff --git a/packages/core/src/utils/errorParsing.ts b/packages/core/src/utils/errorParsing.ts
index 21b845955..aa191003b 100644
--- a/packages/core/src/utils/errorParsing.ts
+++ b/packages/core/src/utils/errorParsing.ts
@@ -31,6 +31,11 @@ export function parseAndFormatApiError(
   authType?: AuthType,
 ): string {
   if (isStructuredError(error)) {
+    // Qwen OAuth quota errors have their own user-friendly message; don't wrap them
+    if (error.message.startsWith('Qwen OAuth quota exceeded:')) {
+      return error.message;
+    }
+
     let text = `[API Error: ${error.message}]`;
     if (error.status === 429) {
       text += getRateLimitMessage(authType);
diff --git a/packages/core/src/utils/fileUtils.test.ts b/packages/core/src/utils/fileUtils.test.ts
index da9f257fd..b21ee79e2 100644
--- a/packages/core/src/utils/fileUtils.test.ts
+++ b/packages/core/src/utils/fileUtils.test.ts
@@ -948,13 +948,13 @@ describe('fileUtils', () => {
       );
     });
 
-    it('should return an error if the file size exceeds 20MB', async () => {
+    it('should return an error if the file size exceeds 10MB', async () => {
       // Create a small test file
       actualNodeFs.writeFileSync(testTextFilePath, 'test content');
 
       // Spy on fs.promises.stat to return a large file size
       const statSpy = vi.spyOn(fs.promises, 'stat').mockResolvedValueOnce({
-        size: 21 * 1024 * 1024,
+        size: 11 * 1024 * 1024,
         isDirectory: () => false,
       } as fs.Stats);
 
@@ -964,11 +964,11 @@ describe('fileUtils', () => {
           mockConfig,
         );
 
-        expect(result.error).toContain('File size exceeds the 20MB limit');
+        expect(result.error).toContain('File size exceeds the 10MB limit');
         expect(result.returnDisplay).toContain(
-          'File size exceeds the 20MB limit',
+          'File size exceeds the 10MB limit',
         );
-        expect(result.llmContent).toContain('File size exceeds the 20MB limit');
+        expect(result.llmContent).toContain('File size exceeds the 10MB limit');
       } finally {
         statSpy.mockRestore();
       }
diff --git a/packages/core/src/utils/fileUtils.ts b/packages/core/src/utils/fileUtils.ts
index 3e4124d18..aab6935cb 100644
--- a/packages/core/src/utils/fileUtils.ts
+++ b/packages/core/src/utils/fileUtils.ts
@@ -340,11 +340,12 @@ export async function processSingleFileContent(
     }
 
     const fileSizeInMB = stats.size / (1024 * 1024);
-    if (fileSizeInMB > 20) {
+    // Use 9.9MB instead of 10MB to leave margin for encoding overhead (#1880)
+    if (fileSizeInMB > 9.9) {
       return {
-        llmContent: 'File size exceeds the 20MB limit.',
-        returnDisplay: 'File size exceeds the 20MB limit.',
-        error: `File size exceeds the 20MB limit: ${filePath} (${fileSizeInMB.toFixed(2)}MB)`,
+        llmContent: 'File size exceeds the 10MB limit.',
+        returnDisplay: 'File size exceeds the 10MB limit.',
+        error: `File size exceeds the 10MB limit: ${filePath} (${fileSizeInMB.toFixed(2)}MB)`,
         errorType: ToolErrorType.FILE_TOO_LARGE,
       };
     }
@@ -465,6 +466,16 @@ export async function processSingleFileContent(
       case 'pdf': {
         const contentBuffer = await fs.promises.readFile(filePath);
         const base64Data = contentBuffer.toString('base64');
+        const base64SizeInMB = base64Data.length / (1024 * 1024);
+        // Use 9.9MB instead of 10MB to leave margin for small overhead (#1880)
+        if (base64SizeInMB > 9.9) {
+          return {
+            llmContent: `File exceeds the 10MB data URI limit after base64 encoding (${base64SizeInMB.toFixed(2)}MB encoded).`,
+            returnDisplay: `File exceeds the 10MB data URI limit after base64 encoding.`,
+            error: `File exceeds the 10MB data URI limit after base64 encoding: ${filePath} (${base64SizeInMB.toFixed(2)}MB encoded)`,
+            errorType: ToolErrorType.FILE_TOO_LARGE,
+          };
+        }
         return {
           llmContent: {
             inlineData: {
diff --git a/packages/core/src/utils/pathReader.test.ts b/packages/core/src/utils/pathReader.test.ts
index 5de10765b..282a7d6d1 100644
--- a/packages/core/src/utils/pathReader.test.ts
+++ b/packages/core/src/utils/pathReader.test.ts
@@ -392,8 +392,8 @@ describe('readPathFromWorkspace', () => {
   );
 
   it('should return an error string for files exceeding the size limit', async () => {
-    // Mock a file slightly larger than the 20MB limit defined in fileUtils.ts
-    const largeContent = 'a'.repeat(21 * 1024 * 1024); // 21MB
+    // Mock a file slightly larger than the 10MB limit defined in fileUtils.ts
+    const largeContent = 'a'.repeat(11 * 1024 * 1024); // 11MB
     mock({
       [CWD]: {
         'large.txt': largeContent,
@@ -406,6 +406,6 @@ describe('readPathFromWorkspace', () => {
     const result = await readPathFromWorkspace('large.txt', config);
     const textResult = result[0] as string;
     // The error message comes directly from processSingleFileContent
-    expect(textResult).toBe('File size exceeds the 20MB limit.');
+    expect(textResult).toBe('File size exceeds the 10MB limit.');
   });
 });
diff --git a/packages/core/src/utils/rateLimit.test.ts b/packages/core/src/utils/rateLimit.test.ts
index 48605db20..a342a4a0b 100644
--- a/packages/core/src/utils/rateLimit.test.ts
+++ b/packages/core/src/utils/rateLimit.test.ts
@@ -33,6 +33,13 @@ describe('isRateLimitError — detection paths', () => {
     expect(info).toBe(true);
   });
 
+  it('should detect 1305 code from ApiError (issue #1918)', () => {
+    const info = isRateLimitError({
+      error: { code: 1305, message: 'IdealTalk rate limit' },
+    });
+    expect(info).toBe(true);
+  });
+
   it('should detect rate-limit from StructuredError.status', () => {
     const error: StructuredError = { message: 'Rate limited', status: 429 };
     const info = isRateLimitError(error);
@@ -52,6 +59,21 @@ describe('isRateLimitError — detection paths', () => {
     ).toBe(false);
   });
 
+  it('should detect custom error code passed via extraCodes', () => {
+    expect(
+      isRateLimitError(
+        { error: { code: 9999, message: 'Custom rate limit' } },
+        [9999],
+      ),
+    ).toBe(true);
+  });
+
+  it('should not detect custom code when extraCodes is not provided', () => {
+    expect(
+      isRateLimitError({ error: { code: 9999, message: 'Custom rate limit' } }),
+    ).toBe(false);
+  });
+
   it('should return null for invalid inputs', () => {
     expect(isRateLimitError(null)).toBe(false);
     expect(isRateLimitError(undefined)).toBe(false);
diff --git a/packages/core/src/utils/rateLimit.ts b/packages/core/src/utils/rateLimit.ts
index 559cb26fb..19466e90f 100644
--- a/packages/core/src/utils/rateLimit.ts
+++ b/packages/core/src/utils/rateLimit.ts
@@ -10,7 +10,8 @@ import { isApiError, isStructuredError } from './quotaErrorDetection.js';
 // 429  - Standard HTTP "Too Many Requests" (DashScope TPM, OpenAI, etc.)
 // 503  - Provider throttling/overload (treated as rate-limit for retry UI)
 // 1302 - Z.AI GLM rate limit (https://docs.z.ai/api-reference/api-code)
-const RATE_LIMIT_ERROR_CODES = new Set([429, 503, 1302]);
+// 1305 - DashScope/IdealTalk internal rate limit (issue #1918)
+const RATE_LIMIT_ERROR_CODES = new Set([429, 503, 1302, 1305]);
 
 export interface RetryInfo {
   /** Formatted error message for display, produced by parseAndFormatApiError. */
@@ -25,10 +26,20 @@ export interface RetryInfo {
 
 /**
  * Detects rate-limit / throttling errors and returns retry info.
+ *
+ * @param error - The error to check.
+ * @param extraCodes - Additional error codes to treat as rate-limit errors,
+ *   merged with the built-in set at call time (not mutating the default set).
  */
-export function isRateLimitError(error: unknown): boolean {
+export function isRateLimitError(
+  error: unknown,
+  extraCodes?: readonly number[],
+): boolean {
   const code = getErrorCode(error);
-  return code !== null && RATE_LIMIT_ERROR_CODES.has(code);
+  if (code === null) return false;
+  if (RATE_LIMIT_ERROR_CODES.has(code)) return true;
+  if (extraCodes && extraCodes.includes(code)) return true;
+  return false;
 }
 
 /**
diff --git a/packages/core/src/utils/retry.test.ts b/packages/core/src/utils/retry.test.ts
index 490f24448..a628719a5 100644
--- a/packages/core/src/utils/retry.test.ts
+++ b/packages/core/src/utils/retry.test.ts
@@ -323,7 +323,7 @@ describe('retryWithBackoff', () => {
         authType: AuthType.QWEN_OAUTH,
       });
 
-      await expect(promise).rejects.toThrow(/Qwen API quota exceeded/);
+      await expect(promise).rejects.toThrow(/Qwen OAuth quota exceeded/);
 
       // Should be called only once (no retries)
       expect(fn).toHaveBeenCalledTimes(1);
@@ -343,7 +343,7 @@ describe('retryWithBackoff', () => {
         authType: AuthType.QWEN_OAUTH,
       });
 
-      await expect(promise).rejects.toThrow(/Qwen API quota exceeded/);
+      await expect(promise).rejects.toThrow(/Qwen OAuth quota exceeded/);
 
       // Should be called only once (no retries)
       expect(fn).toHaveBeenCalledTimes(1);
@@ -414,7 +414,7 @@ describe('retryWithBackoff', () => {
         authType: AuthType.QWEN_OAUTH,
       });
 
-      await expect(promise).rejects.toThrow(/Qwen API quota exceeded/);
+      await expect(promise).rejects.toThrow(/Qwen OAuth quota exceeded/);
 
       // Should be called only once (no retries)
       expect(fn).toHaveBeenCalledTimes(1);
diff --git a/packages/core/src/utils/retry.ts b/packages/core/src/utils/retry.ts
index fd9b5c025..5ce79f08f 100644
--- a/packages/core/src/utils/retry.ts
+++ b/packages/core/src/utils/retry.ts
@@ -110,7 +110,11 @@ export async function retryWithBackoff<T>(
       // Check for Qwen OAuth quota exceeded error - throw immediately without retry
       if (authType === AuthType.QWEN_OAUTH && isQwenQuotaExceededError(error)) {
         throw new Error(
-          `Qwen API quota exceeded: Your Qwen API quota has been exhausted. Please wait for your quota to reset.`,
+          `Qwen OAuth quota exceeded: Your free daily quota has been reached.\n\n` +
+            `To continue using Qwen Code without waiting, upgrade to the Alibaba Cloud Coding Plan:\n` +
+            `  China:       https://help.aliyun.com/zh/model-studio/coding-plan\n` +
+            `  Global/Intl: https://www.alibabacloud.com/help/en/model-studio/coding-plan\n\n` +
+            `After subscribing, run /auth to configure your Coding Plan API key.`,
         );
       }
 
diff --git a/packages/test-utils/package.json b/packages/test-utils/package.json
index 37d7223fa..358128630 100644
--- a/packages/test-utils/package.json
+++ b/packages/test-utils/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/qwen-code-test-utils",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "private": true,
   "main": "src/index.ts",
   "license": "Apache-2.0",
diff --git a/packages/vscode-ide-companion/package.json b/packages/vscode-ide-companion/package.json
index bdeb0adab..28da4cf4f 100644
--- a/packages/vscode-ide-companion/package.json
+++ b/packages/vscode-ide-companion/package.json
@@ -2,7 +2,7 @@
   "name": "qwen-code-vscode-ide-companion",
   "displayName": "Qwen Code Companion",
   "description": "Enable Qwen Code with direct access to your VS Code workspace.",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "publisher": "qwenlm",
   "icon": "assets/icon.png",
   "repository": {
diff --git a/packages/web-templates/package.json b/packages/web-templates/package.json
index c7f5b0fa3..a1b11d81c 100644
--- a/packages/web-templates/package.json
+++ b/packages/web-templates/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/web-templates",
-  "version": "0.10.0",
+  "version": "0.11.1",
   "description": "Web templates bundled as embeddable JS/CSS strings",
   "repository": {
     "type": "git",
diff --git a/packages/webui/package.json b/packages/webui/package.json
index 656ef0b66..339c85322 100644
--- a/packages/webui/package.json
+++ b/packages/webui/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@qwen-code/webui",
-  "version": "0.11.0",
+  "version": "0.11.1",
   "description": "Shared UI components for Qwen Code packages",
   "type": "module",
   "main": "./dist/index.cjs",