docs(channels): document attachments and block streaming features

- Add Attachments interface docs with handling examples
- Document block streaming configuration and behavior
- Update architecture diagrams to show attachment resolution
- Add Attachment type to exported types reference
- Update plugin-example README

Covers new structured attachment support and block streaming
that delivers responses as multiple progressive messages.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
tanzhenxin 2026-03-27 14:57:17 +00:00
parent 3e0f213ea3
commit 39103eea5f
4 changed files with 161 additions and 41 deletions


@ -68,23 +68,61 @@ export class MyChannel extends ChannelBase {
The normalized message object you build from platform data. The boolean flags drive gate logic, so they must be accurate.

| Field            | Type         | Required | Notes                                                                      |
| ---------------- | ------------ | -------- | -------------------------------------------------------------------------- |
| `channelName`    | string       | Yes      | Use `this.name`                                                            |
| `senderId`       | string       | Yes      | Must be stable across messages (used for session routing + access control) |
| `senderName`     | string       | Yes      | Display name                                                               |
| `chatId`         | string       | Yes      | Must distinguish DMs from groups                                           |
| `text`           | string       | Yes      | Strip bot @mentions                                                        |
| `threadId`       | string       | No       | For `sessionScope: "thread"`                                               |
| `messageId`      | string       | No       | Platform message ID, useful for response correlation                       |
| `isGroup`        | boolean      | Yes      | GroupGate relies on this                                                   |
| `isMentioned`    | boolean      | Yes      | GroupGate relies on this                                                   |
| `isReplyToBot`   | boolean      | Yes      | GroupGate relies on this                                                   |
| `referencedText` | string       | No       | Quoted message, prepended as context                                       |
| `imageBase64`    | string       | No       | Base64-encoded image (legacy; prefer `attachments`)                        |
| `imageMimeType`  | string       | No       | e.g., `image/jpeg` (legacy; prefer `attachments`)                          |
| `attachments`    | Attachment[] | No       | Structured media attachments (see below)                                   |

### Attachments
Use the `attachments` array for images, files, audio, and video. `handleInbound()` resolves them automatically: images with base64 `data` are sent to the model as vision input, and files with a `filePath` have that path appended to the prompt so the agent can read them.
```typescript
interface Attachment {
  type: 'image' | 'file' | 'audio' | 'video';
  data?: string; // base64-encoded data (images, small files)
  filePath?: string; // absolute path to local file (large files saved to disk)
  mimeType: string; // e.g. 'application/pdf', 'image/jpeg'
  fileName?: string; // original file name from the platform
}
```
Example — handling a file upload in your adapter:
```typescript
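import { mkdirSync, writeFileSync } from 'node:fs';
import { basename, join } from 'node:path';
import { tmpdir } from 'node:os';

// `downloadFromPlatform`, `fileId`, `fileName`, and `envelope` come from your adapter.
const buf = await downloadFromPlatform(fileId);

// With `recursive: true`, mkdirSync does not throw if the directory already exists.
const dir = join(tmpdir(), 'channel-files');
mkdirSync(dir, { recursive: true });

// Use only the base name so a platform-supplied name cannot escape `dir`.
const filePath = join(dir, basename(fileName));
writeFileSync(filePath, buf);

envelope.attachments = [
  {
    type: 'file',
    filePath,
    mimeType: 'application/pdf', // use the MIME type reported by the platform
    fileName,
  },
];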
```
The legacy `imageBase64`/`imageMimeType` fields still work for backward compatibility, but `attachments` is preferred for new code.
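For the image case, a minimal sketch of building an inline attachment from downloaded bytes follows. The `Attachment` shape is copied from the interface above; the helper name `imageAttachment` is hypothetical and is not part of the channel API.

```typescript
import { Buffer } from 'node:buffer';

// Same shape as the Attachment interface documented above.
interface Attachment {
  type: 'image' | 'file' | 'audio' | 'video';
  data?: string;
  filePath?: string;
  mimeType: string;
  fileName?: string;
}

// Hypothetical helper: wrap raw image bytes as a base64 `data` attachment.
function imageAttachment(
  bytes: Uint8Array,
  mimeType: string,
  fileName?: string,
): Attachment {
  const attachment: Attachment = {
    type: 'image',
    data: Buffer.from(bytes).toString('base64'),
    mimeType,
  };
  if (fileName !== undefined) attachment.fileName = fileName;
  return attachment;
}
```

An adapter could then set `envelope.attachments = [imageAttachment(bytes, 'image/png', 'photo.png')]` before calling `handleInbound()`.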
## Extension Manifest
@ -126,7 +164,11 @@ override async handleInbound(envelope: Envelope): Promise<void> {
**Tool call hooks**: override `onToolCall()` to display agent activity (e.g., "Running shell command...").

**Streaming hooks**: override `onResponseChunk(chatId, chunk, sessionId)` for per-chunk progressive display (e.g., editing a message in-place). Override `onResponseComplete(chatId, fullText, sessionId)` to customize final delivery.

**Block streaming**: set `blockStreaming: "on"` in the channel config. The base class automatically splits responses into multiple messages at paragraph boundaries. No plugin code is needed, and it works alongside `onResponseChunk`.

**Media**: populate `envelope.attachments` with images/files. See [Attachments](#attachments) above.
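As an illustration of the streaming hooks, the sketch below accumulates chunks per session and re-renders one message in place. It is deliberately standalone: `InPlaceDisplay` and `EditFn` are hypothetical names, and the edit callback stands in for whatever "edit message" call your platform offers. Only the two method signatures mirror the `onResponseChunk` and `onResponseComplete` hooks named above.

```typescript
// Hypothetical platform call that replaces the text of an existing message.
type EditFn = (chatId: string, text: string) => void;

class InPlaceDisplay {
  private readonly textBySession = new Map<string, string>();
  private readonly edit: EditFn;

  constructor(edit: EditFn) {
    this.edit = edit;
  }

  // Intended to be called from an onResponseChunk(chatId, chunk, sessionId) override.
  onChunk(chatId: string, chunk: string, sessionId: string): void {
    const text = (this.textBySession.get(sessionId) ?? '') + chunk;
    this.textBySession.set(sessionId, text);
    this.edit(chatId, text);
  }

  // Intended to be called from an onResponseComplete(chatId, fullText, sessionId) override.
  onComplete(chatId: string, fullText: string, sessionId: string): void {
    this.textBySession.delete(sessionId);
    this.edit(chatId, fullText);
  }
}
```

A real adapter would likely throttle the edits, since most platforms rate-limit message updates.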
## Reference Implementations


@ -47,20 +47,23 @@ Channels are configured under the `channels` key in `settings.json`. Each channe
### Options

| Option                   | Required | Description                                                                                                                              |
| ------------------------ | -------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| `type`                   | Yes      | Channel type: `telegram`, `weixin`, `dingtalk`, or a custom type from an extension (see [Plugins](./plugins))                            |
| `token`                  | Telegram | Bot token. Supports `$ENV_VAR` syntax to read from environment variables. Not needed for WeChat or DingTalk                              |
| `clientId`               | DingTalk | DingTalk AppKey. Supports `$ENV_VAR` syntax                                                                                              |
| `clientSecret`           | DingTalk | DingTalk AppSecret. Supports `$ENV_VAR` syntax                                                                                           |
| `model`                  | No       | Model to use for this channel (e.g., `qwen3.5-plus`). Overrides the default model. Useful for multimodal models that support image input |
| `senderPolicy`           | No       | Who can talk to the bot: `allowlist` (default), `open`, or `pairing`                                                                     |
| `allowedUsers`           | No       | List of user IDs allowed to use the bot (used by `allowlist` and `pairing` policies)                                                     |
| `sessionScope`           | No       | How sessions are scoped: `user` (default), `thread`, or `single`                                                                         |
| `cwd`                    | No       | Working directory for the agent. Defaults to the current directory                                                                       |
| `instructions`           | No       | Custom instructions prepended to the first message of each session                                                                       |
| `groupPolicy`            | No       | Group chat access: `disabled` (default), `allowlist`, or `open`. See [Group Chats](#group-chats)                                         |
| `groups`                 | No       | Per-group settings. Keys are group chat IDs or `"*"` for defaults. See [Group Chats](#group-chats)                                       |
| `blockStreaming`         | No       | Progressive response delivery: `on` or `off` (default). See [Block Streaming](#block-streaming)                                          |
| `blockStreamingChunk`    | No       | Chunk size bounds: `{ "minChars": 400, "maxChars": 1000 }`. See [Block Streaming](#block-streaming)                                      |
| `blockStreamingCoalesce` | No       | Idle flush: `{ "idleMs": 1500 }`. See [Block Streaming](#block-streaming)                                                                |
### Sender Policy
@ -219,6 +222,34 @@ Files work with any model — no multimodal support required.
| Files    | Direct download via Bot API (20MB limit)     | CDN download with AES decryption | downloadCode API (two-step)                   |
| Captions | Photo/file captions included as message text | Not applicable                   | Rich text: mixed text + images in one message |
## Block Streaming
By default, the agent works for a while and then sends one large response. With block streaming enabled, the response arrives as multiple shorter messages while the agent is still working — similar to how ChatGPT or Claude show progressive output.
```json
{
  "channels": {
    "my-channel": {
      "type": "telegram",
      "blockStreaming": "on",
      "blockStreamingChunk": { "minChars": 400, "maxChars": 1000 },
      "blockStreamingCoalesce": { "idleMs": 1500 },
      ...
    }
  }
}
```
### How it works
- The agent's response is split into blocks at paragraph boundaries and sent as separate messages
- `minChars` (default 400) — don't send a block until it's at least this long, to avoid spamming tiny messages
- `maxChars` (default 1000) — if a block gets this long without a natural break, send it anyway
- `idleMs` (default 1500) — if the agent pauses (e.g., running a tool), send what's buffered so far
- When the agent finishes, any remaining text is sent immediately
Only `blockStreaming` is required. The chunk and coalesce settings are optional and have sensible defaults.
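To make the defaults concrete, here is a rough sketch of how the three settings could be collapsed into effective values. The function `resolveBlockStreaming` and both interface names are hypothetical, and the per-field fallback (supplying only `minChars`, for example) is an assumption rather than documented behavior. Only the default numbers come from the list above.

```typescript
// Subset of the channel options relevant to block streaming.
interface BlockStreamingSettings {
  blockStreaming?: 'on' | 'off';
  blockStreamingChunk?: { minChars?: number; maxChars?: number };
  blockStreamingCoalesce?: { idleMs?: number };
}

interface EffectiveBlockStreaming {
  enabled: boolean;
  minChars: number;
  maxChars: number;
  idleMs: number;
}

// Hypothetical helper: apply the documented defaults (off, 400, 1000, 1500).
function resolveBlockStreaming(
  cfg: BlockStreamingSettings,
): EffectiveBlockStreaming {
  return {
    enabled: cfg.blockStreaming === 'on',
    minChars: cfg.blockStreamingChunk?.minChars ?? 400,
    maxChars: cfg.blockStreamingChunk?.maxChars ?? 1000,
    idleMs: cfg.blockStreamingCoalesce?.idleMs ?? 1500,
  };
}
```

With only `{ "blockStreaming": "on" }` in the config, this sketch yields 400, 1000, and 1500 for the remaining values.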
## Slash Commands

Channels support slash commands. These are handled locally (no agent round-trip):


@ -59,15 +59,16 @@ For a complete working example, see [`@qwen-code/channel-plugin-example`](../plu
```
Inbound: Platform message
→ Envelope (with attachments)
→ GroupGate (group policy + mention gating)
→ SenderGate (allowlist / pairing / open)
→ Slash commands (/clear, /help, /status)
→ SessionRouter (resolve or create ACP session)
→ Resolve attachments (images → bridge, files → prompt text)
→ AcpBridge.prompt() → agent
Outbound: Agent response
ChannelBase BlockStreamer (if enabled: split into blocks at paragraph boundaries)
→ sendMessage() → platform
```
@ -81,6 +82,7 @@ Everything between `handleInbound()` and `sendMessage()` is handled by the base
| --------------- | ---------------------------------------------------------------- |
| `ChannelBase`   | Abstract base class; extend this to build a channel adapter      |
| `AcpBridge`     | Spawns and communicates with the `qwen-code --acp` agent process |
| `BlockStreamer` | Progressive multi-message delivery for block streaming           |
| `SessionRouter` | Maps senders to ACP sessions with configurable scoping           |
| `SenderGate`    | DM access control (allowlist / pairing / open)                   |
| `GroupGate`     | Group chat policy and @mention gating                            |
@ -90,6 +92,7 @@ Everything between `handleInbound()` and `sendMessage()` is handled by the base
| Type            | Description                                  |
| --------------- | -------------------------------------------- |
| `Attachment`    | Structured file/image/audio/video attachment |
| `ChannelConfig` | Channel configuration from `settings.json`   |
| `ChannelPlugin` | Plugin factory interface (what you export)   |
| `Envelope`      | Normalized inbound message format            |
@ -117,12 +120,16 @@ constructor(name: string, config: ChannelConfig, bridge: AcpBridge, options?: Ch
**Provided methods:**

| Method                                            | Description                                                                                                                       |
| ------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `handleInbound(envelope)`                         | Route an inbound message through the full pipeline (gate checks, commands, session, prompt). Call this from your message handler. |
| `setBridge(bridge)`                               | Replace the ACP bridge after crash recovery                                                                                       |
| `registerCommand(name, handler)`                  | Register a custom slash command (e.g. `/mycommand`)                                                                               |
| `onToolCall(chatId, event)`                       | Hook called on agent tool invocations; override to show indicators                                                                |
| `onResponseChunk(chatId, chunk, sessionId)`       | Hook called per streaming text chunk; override for progressive display (default: no-op)                                           |
| `onResponseComplete(chatId, fullText, sessionId)` | Hook called when the full response is ready; override to customize delivery (default: `sendMessage()`)                            |
**Block streaming:** When `blockStreaming: "on"` is set in the channel config, the base class automatically splits the agent's streaming response into multiple messages at paragraph boundaries. See [Block Streaming](#block-streaming) below.
**Built-in slash commands:** `/clear` (`/reset`, `/new`), `/help`, `/status`
@ -244,11 +251,44 @@ interface Envelope {
  isMentioned: boolean; // true if bot was @mentioned
  isReplyToBot: boolean; // true if replying to bot's message
  referencedText?: string; // quoted message text
  imageBase64?: string; // base64-encoded image (legacy; prefer attachments)
  imageMimeType?: string; // e.g. 'image/jpeg' (legacy; prefer attachments)
  attachments?: Attachment[]; // structured file/image/audio/video attachments
}

interface Attachment {
  type: 'image' | 'file' | 'audio' | 'video';
  data?: string; // base64-encoded data (images, small files)
  filePath?: string; // absolute path to local file (large files)
  mimeType: string; // e.g. 'application/pdf', 'image/jpeg'
  fileName?: string; // original file name from the platform
}
```
`handleInbound()` automatically resolves attachments: images with `data` are sent to the model as vision input, and files with `filePath` have their path appended to the prompt text so the agent can read them with its tools.
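The resolution step described above could look roughly like the sketch below. This is not the base class's code: `resolveAttachments`, `ResolvedPrompt`, and the `[Attached …]` note format are all hypothetical, chosen only to show the split between inline image data and file paths in the prompt.

```typescript
interface Attachment {
  type: 'image' | 'file' | 'audio' | 'video';
  data?: string;
  filePath?: string;
  mimeType: string;
  fileName?: string;
}

// Hypothetical result shape: prompt text plus images to pass as vision input.
interface ResolvedPrompt {
  text: string;
  images: { data: string; mimeType: string }[];
}

function resolveAttachments(
  text: string,
  attachments: Attachment[] = [],
): ResolvedPrompt {
  const images: { data: string; mimeType: string }[] = [];
  const notes: string[] = [];

  for (const a of attachments) {
    if (a.type === 'image' && a.data) {
      // Inline base64 images go to the model as vision input.
      images.push({ data: a.data, mimeType: a.mimeType });
    } else if (a.filePath) {
      // Anything saved to disk is referenced by path in the prompt text.
      notes.push(`[Attached ${a.type}: ${a.filePath}]`);
    }
  }

  const fullText = notes.length > 0 ? `${text}\n\n${notes.join('\n')}` : text;
  return { text: fullText, images };
}
```

Attachments that carry neither `data` nor `filePath` are silently skipped in this sketch; a real implementation might log them instead.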
## Block Streaming
When `blockStreaming: "on"` is set in a channel's config, the agent's response is delivered as multiple separate messages instead of one large wall of text. The `BlockStreamer` accumulates streaming chunks and emits completed blocks based on paragraph boundaries and size heuristics.
**Config fields** (on `ChannelConfig`):
| Field | Type | Default | Description |
| ------------------------ | ------------------------ | --------------- | --------------------------------------------------------------------------- |
| `blockStreaming` | `'on' \| 'off'` | `'off'` | Enable/disable block streaming |
| `blockStreamingChunk`    | `{ minChars, maxChars }` | `{ minChars: 400, maxChars: 1000 }` | `minChars`: don't emit until this size. `maxChars`: force-emit at this size |
| `blockStreamingCoalesce` | `{ idleMs }`             | `{ idleMs: 1500 }`                  | Emit buffered text after this many ms of silence from the agent             |
**How it works:**
1. Text accumulates as the agent streams its response
2. When the buffer reaches `minChars` and hits a paragraph break (`\n\n`), that block is sent as a separate message
3. If the buffer reaches `maxChars` without a paragraph break, it is force-split at the best available break point, preferring a newline over a space
4. If the agent goes quiet for `idleMs`, the buffer is flushed (as long as it's past `minChars`)
5. When the agent finishes, any remaining text is sent immediately regardless of `minChars`
Block streaming and `onResponseChunk` work independently — plugins can override `onResponseChunk` for their own purposes while block streaming handles delivery.
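The five steps above can be sketched as a small buffer class. This is an illustrative reimplementation under stated assumptions, not the actual `BlockStreamer`: the class name `BlockBuffer`, its method names, and the exact tie-breaking are hypothetical, and the idle timer is replaced by an explicit `idle()` call so the logic stays easy to test.

```typescript
interface ChunkConfig {
  minChars: number;
  maxChars: number;
}

class BlockBuffer {
  private buf = '';
  private readonly cfg: ChunkConfig;
  private readonly emit: (block: string) => void;

  constructor(cfg: ChunkConfig, emit: (block: string) => void) {
    this.cfg = cfg;
    this.emit = emit;
  }

  // Step 1: accumulate text, then emit any blocks that are ready.
  push(chunk: string): void {
    this.buf += chunk;
    this.drain();
  }

  // Step 4: call when the agent has been quiet for idleMs.
  idle(): void {
    if (this.buf.length >= this.cfg.minChars) this.flush();
  }

  // Step 5: the agent is done, so send the remainder regardless of minChars.
  finish(): void {
    this.flush();
  }

  private flush(): void {
    this.send(this.buf);
    this.buf = '';
  }

  private send(text: string): void {
    const block = text.trim();
    if (block.length > 0) this.emit(block);
  }

  private drain(): void {
    const minChars = this.cfg.minChars;
    const maxChars = Math.max(1, this.cfg.maxChars); // guard against a zero-size window
    for (;;) {
      // Step 2: first paragraph break that leaves a block of at least minChars.
      const para = this.buf.indexOf('\n\n', minChars);
      if (para !== -1 && para <= maxChars) {
        this.send(this.buf.slice(0, para));
        this.buf = this.buf.slice(para + 2);
        continue;
      }
      // Step 3: no usable paragraph break, so force-split (newline, then space).
      if (this.buf.length >= maxChars) {
        const head = this.buf.slice(0, maxChars);
        let cut = head.lastIndexOf('\n');
        if (cut <= 0) cut = head.lastIndexOf(' ');
        if (cut <= 0) cut = maxChars;
        this.send(this.buf.slice(0, cut));
        this.buf = this.buf.slice(cut).trimStart();
        continue;
      }
      break;
    }
  }
}
```

In a channel, `emit` would wrap `sendMessage()`, and a timer reset on every `push()` would call `idle()` after `idleMs` of silence.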
## Further reading

- [Channel Plugin Developer Guide](../../docs/developers/channel-plugins.md)


@ -94,4 +94,11 @@ See `src/MockPluginChannel.ts` for a working example. The key points:
3. Export a `plugin` object conforming to `ChannelPlugin`
4. Add a `qwen-extension.json` manifest
### Features you get for free
- **Block streaming** — enable `blockStreaming: "on"` in config and the agent's response is automatically split into multiple messages at paragraph boundaries
- **Attachments** — populate `envelope.attachments` with images/files and `handleInbound()` routes them to the agent (images as vision input, files as paths in the prompt)
- **Streaming hooks** — override `onResponseChunk()` for progressive display (e.g., editing a message in-place)
- Access control (allowlist, pairing, open), session routing, slash commands, crash recovery
Full guide: [Channel Plugin Developer Guide](../../docs/developers/channel-plugins.md)