fix(fileUtils): use config modalities instead of model-based defaults

Fixes session corruption caused by basing the modality check on the model name rather than on the resolved config. Behavior was inconsistent whenever the config's modalities differed from the model-based defaults.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-05-01 05:00:46 +00:00 · 2026-03-15 21:22:57 +08:00 · 2026-03-15 21:22:57 +08:00 · 6997636ba4
commit 6997636ba4
parent 0cede7bc5e
3 changed files with 65 additions and 65 deletions
--- a/docs/developers/tools/file-system.md
+++ b/docs/developers/tools/file-system.md
@@ -24,7 +24,7 @@ Qwen Code provides a comprehensive suite of tools for interacting with the local

 ## 2. `read_file` (ReadFile)

-`read_file` reads and returns the content of a specified file. This tool handles text and images (PNG, JPG, GIF, WEBP, SVG, BMP). For text files, it can read specific line ranges. PDF files are not supported directly - extract text externally first. Other binary file types are generally skipped.
+`read_file` reads and returns the content of a specified file. This tool handles text files and media files (images, PDFs, audio, video) whose modality is supported by the current model. For text files, it can read specific line ranges. Media files whose modality is not supported by the current model are rejected with a helpful error message. Other binary file types are generally skipped.

 - **Tool name:** `read_file`
 - **Display name:** ReadFile
@@ -35,13 +35,12 @@ Qwen Code provides a comprehensive suite of tools for interacting with the local
  - `limit` (number, optional): For text files, the maximum number of lines to read. If omitted, reads a default maximum (e.g., 2000 lines) or the entire file if feasible.
 - **Behavior:**
  - For text files: Returns the content. If `offset` and `limit` are used, returns only that slice of lines. Indicates if content was truncated due to line limits or line length limits.
-  - For image files: Returns the file content as a base64-encoded `inlineData` object suitable for model consumption.
-  - For PDF files: Returns an error message directing users to extract text externally.
+  - For media files (images, PDFs, audio, video): If the current model supports the file's modality, returns the file content as a base64-encoded `inlineData` object. If the model does not support the modality, returns an error message with guidance (e.g., suggesting skills or external tools).
  - For other binary files: Attempts to identify and skip them, returning a message indicating it's a generic binary file.
 - **Output:** (`llmContent`):
  - For text files: The file content, potentially prefixed with a truncation message (e.g., `[File content truncated: showing lines 1-100 of 500 total lines...]\nActual file content...`).
-  - For image files: An object containing `inlineData` with `mimeType` and base64 `data` (e.g., `{ inlineData: { mimeType: 'image/png', data: 'base64encodedstring' } }`).
-  - For PDF files: An error message string explaining that PDFs are not supported.
+  - For supported media files: An object containing `inlineData` with `mimeType` and base64 `data` (e.g., `{ inlineData: { mimeType: 'image/png', data: 'base64encodedstring' } }`).
+  - For unsupported media files: An error message string explaining that the current model does not support this modality, with suggestions for alternatives.
  - For other binary files: A message like `Cannot display content of binary file: /path/to/data.bin`.
 - **Confirmation:** No.
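
The media-file behavior documented above can be sketched roughly as follows. This is a minimal illustration, not the actual Qwen Code implementation: the names `ResolvedConfig`, `modalityOf`, and `readMedia`, the shape of the `modalities` field, and the error wording are all assumptions. The point it shows is that the decision is driven by the modalities in the resolved config rather than by defaults inferred from the model name.

```typescript
// Hypothetical types and names for illustration only; not the real Qwen Code API.
type Modality = "image" | "pdf" | "audio" | "video";

interface ResolvedConfig {
  // Assumed shape: modalities the resolved config marks as supported.
  modalities: Partial<Record<Modality, boolean>>;
}

type ReadResult =
  | { kind: "inlineData"; inlineData: { mimeType: string; data: string } }
  | { kind: "error"; message: string };

// Map a MIME type to a media modality, or undefined for non-media types.
function modalityOf(mimeType: string): Modality | undefined {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType === "application/pdf") return "pdf";
  if (mimeType.startsWith("audio/")) return "audio";
  if (mimeType.startsWith("video/")) return "video";
  return undefined;
}

// Decide what to return for a media file whose content is already base64-encoded.
// The check consults the resolved config, never the model name.
function readMedia(
  config: ResolvedConfig,
  mimeType: string,
  base64Data: string,
): ReadResult {
  const modality = modalityOf(mimeType);
  if (modality === undefined) {
    return { kind: "error", message: `Not a media type: ${mimeType}` };
  }
  if (config.modalities[modality] !== true) {
    return {
      kind: "error",
      message:
        `The current model does not support ${modality} input. ` +
        `Consider using a skill or an external tool to process this file.`,
    };
  }
  return { kind: "inlineData", inlineData: { mimeType, data: base64Data } };
}
```

With a config that enables only `image`, calling `readMedia(config, "image/png", data)` yields an `inlineData` result, while `readMedia(config, "application/pdf", data)` yields an error result, regardless of which model name is in use.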