qwen-code/docs/design/markdown-syntax-extension.md
ChiGao 7f0c9791b7
Some checks are pending
Qwen Code CI / Lint (push) Waiting to run
Qwen Code CI / Test (push) Blocked by required conditions
Qwen Code CI / Test-1 (push) Blocked by required conditions
Qwen Code CI / Test-2 (push) Blocked by required conditions
Qwen Code CI / Test-3 (push) Blocked by required conditions
Qwen Code CI / Test-4 (push) Blocked by required conditions
Qwen Code CI / Test-5 (push) Blocked by required conditions
Qwen Code CI / Test-6 (push) Blocked by required conditions
Qwen Code CI / Test-7 (push) Blocked by required conditions
Qwen Code CI / Test-8 (push) Blocked by required conditions
Qwen Code CI / Post Coverage Comment (push) Blocked by required conditions
Qwen Code CI / CodeQL (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:docker (push) Waiting to run
E2E Tests / E2E Test (Linux) - sandbox:none (push) Waiting to run
E2E Tests / E2E Test - macOS (push) Waiting to run
SDK Python / SDK Python (3.10) (push) Waiting to run
SDK Python / SDK Python (3.11) (push) Waiting to run
SDK Python / SDK Python (3.12) (push) Waiting to run
feat(cli): expand TUI markdown rendering (#3680)
* feat(cli): expand markdown rendering in tui

* fix(cli): render finalized mermaid images synchronously

* fix(cli): harden markdown rendering paths

* fix(cli): preserve mermaid source fallbacks

* test(cli): make mermaid image renderer mock cross-platform

* fix(cli): run windows mmdc shims through shell

* fix(cli): address markdown rendering review comments

* fix(cli): validate mermaid render timeout

* feat(cli): expose rendered markdown source blocks

* fix(cli): align mermaid source copy controls

* fix(cli): make markdown render toggle visible

* fix(cli): keep markdown render toggle quiet

* fix(cli): generalize markdown render mode

* fix(cli): broaden render mode and copy latex blocks

* fix(cli): align rendered copy source indices

* fix(cli): support copying inline latex expressions

* feat(cli): document markdown render controls

* test(cli): cover markdown render controls

* fix(cli): tighten markdown render fallbacks

* fix(cli): bound mermaid renderer cache

* fix(cli): address markdown render review feedback

* fix(cli): address markdown render review comments

* test(cli): strengthen render mode shortcut coverage

* fix(cli): address mermaid image review comments

* fix(cli): stabilize renderer output truncation

* test(cli): flush fake renderer stderr before exit

* fix(cli): address markdown renderer review feedback

* docs(cli): clarify mermaid image limits

* fix(cli): refresh mermaid images on height resize

* fix(cli): address markdown render review

* chore: revert unrelated review formatting churn

* fix(cli): avoid mermaid regex codeql alert

* fix(cli): silence mermaid operator codeql alert

* fix(cli): render inline math in markdown tables

* fix(cli): harden markdown visual renderers

* fix(cli): strip c1 controls from mermaid previews

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
2026-05-07 16:24:13 +08:00

236 lines
10 KiB
Markdown

# Markdown Syntax Extension Design
## Context
This document keeps the implementation references for the integrated Markdown
syntax extension PR. It is based on the TUI optimization research from
`origin/docs/tui-optimization-design`, especially:
- `docs/design/tui-optimization/00-overview.md`
- `docs/design/tui-optimization/03-rendering-extensibility.md`
- `docs/design/tui-optimization/04-gemini-cli-research.md`
- `docs/design/tui-optimization/05-claude-code-research.md`
- `docs/design/tui-optimization/06-implementation-rollout-checklist.md`
- `docs/design/tui-optimization/08-execution-plan-and-test-matrix.md`
The referenced research recommends a long-term Markdown architecture built
around an AST parser, block/token caching, stable-prefix streaming, bounded
detail panels, and terminal capability detection. This first implementation
keeps the runtime footprint small and makes the new behavior visible
immediately.
## Integrated PR Scope
This PR treats Markdown syntax expansion as one coherent renderer improvement,
not separate feature PRs.
Included in the first implementation:
- Mermaid code blocks render visually in the TUI.
- Mermaid diagrams render through PNG terminal images when image rendering is
explicitly enabled, `mmdc` is available, and the terminal supports an image
path.
- `flowchart` / `graph` Mermaid diagrams fall back to box-and-arrow previews.
- `sequenceDiagram` Mermaid diagrams fall back to participant-arrow previews.
- Basic `classDiagram`, `stateDiagram`, `erDiagram`, `gantt`, `pie`,
`journey`, `mindmap`, `gitGraph`, and `requirementDiagram` blocks fall back
to bounded text previews.
- Mermaid types without a text preview fall back to their original fenced
source so the user can still read and copy the diagram definition.
- Task list items render checked/unchecked markers.
- Blockquotes render with a visible quote bar.
- Inline `$...$` math and block `$$...$$` math render with common Unicode
substitutions.
- Existing Markdown tables continue to use `TableRenderer`.
- Existing non-Mermaid fenced code blocks continue to use `CodeColorizer`.
- Rendered visual blocks keep source reachable through `/copy mermaid N`,
`/copy latex N`, `/copy latex inline N`, and raw mode.
- `ui.renderMode` controls whether sessions start in rendered or raw/source
mode, while `Alt/Option+M` toggles the active session view.
## Mermaid Rendering Strategy
### First version: capability-gated image rendering plus text fallback
The implementation now treats Mermaid's own layout as the preferred path. When
the local environment supports it, the TUI renders Mermaid blocks through this
pipeline:
```text
Mermaid source
-> mmdc / Mermaid CLI
-> PNG
-> Kitty or iTerm2 terminal image protocol
```
If the terminal does not support inline images but `chafa` is installed, the
same PNG is rendered as ANSI block graphics. If neither image protocol nor
`chafa` is available, the renderer falls back to the synchronous terminal text
preview described below.
The image render is not attempted while a response is still streaming. During
streaming, Mermaid blocks show a bounded pending preview. Once the response is
finalized, the image path is attempted only when explicitly enabled. This keeps
slow `mmdc` startup, especially the opt-in `npx` path, out of the default
interactive render path.
PNG generation is cached independently from terminal placement. Repeated
renders of the same Mermaid source, including terminal resize updates, reuse
the generated PNG and only recompute the Kitty/iTerm2 placement dimensions.
The image path is intentionally opt-in and capability-gated instead of always
bundling or invoking Puppeteer/Chromium from the hot CLI path. A user can enable
the image path with `QWEN_CODE_MERMAID_IMAGE_RENDERING=1`, then provide
`@mermaid-js/mermaid-cli` by installing `mmdc` on `PATH` or by setting
`QWEN_CODE_MERMAID_MMD_CLI` to the binary path. For ad-hoc local verification,
`QWEN_CODE_MERMAID_ALLOW_NPX=1` allows the renderer to invoke
`npx -y @mermaid-js/mermaid-cli@11.12.0`; this is intentionally opt-in because
the first run may install Puppeteer/Chromium and block rendering. Repo-local
`node_modules/.bin` renderers are not auto-discovered unless
`QWEN_CODE_MERMAID_ALLOW_LOCAL_RENDERERS=1` is set. Terminal protocol selection
can be forced with `QWEN_CODE_MERMAID_IMAGE_PROTOCOL=kitty|iterm2|off`.
For Kitty-compatible terminals such as Ghostty, the renderer uses Kitty
Unicode placeholders instead of writing the image payload as Ink text. The PNG
is transmitted through raw stdout in quiet mode (`q=2`) with a virtual
placement (`U=1`), and the React tree renders the normal placeholder character
grid (`U+10EEEE`) with explicit row and column diacritics for each cell. This
keeps Ink responsible for layout and resize while preventing APC payload bytes
from being wrapped into visible base64 text.
### Fallback: resizable wireframe preview
The fallback avoids async work because Ink's `<Static>` path is append-only: a
finalized message cannot reliably wait for a background render job and then
update in place without forcing a full static refresh. The fallback must
therefore produce terminal output during the normal React render pass.
For `flowchart` / `graph` diagrams, the fallback builds a lightweight graph
model instead of printing one edge at a time:
- Nodes are normalized by Mermaid id, label, and basic shape.
- Node labels support Mermaid-style `\n` / `<br>` line breaks.
- Top-down diagrams are ranked into horizontal layers.
- Left-to-right diagrams are ranked into vertical columns when they fit.
- Multiple outgoing edges from the same node are drawn as one fork with
bracketed edge labels such as `[Yes]`, `[No]`, `[是]`, and `[否]`.
- Back edges and cycles are summarized in a `Cycles:` section with explicit
`↩ to <node>` markers. This avoids unstable long cross-diagram routes in
terminal fonts while keeping the loop semantics visible.
- The graph is recomputed from `contentWidth`, so resize changes node width,
spacing, and connector paths.
- Large previews are bounded before graph layout so very large Mermaid blocks
do not allocate an unbounded terminal canvas during render.
Example:
```mermaid
flowchart LR
A[Client] --> B[API]
```
renders as a terminal visual preview rather than Mermaid source.
Other common Mermaid diagram families use bounded text summaries rather than a
full layout engine: class relationships/members, state transitions, ER
entities/relationships, Gantt tasks, pie slices, journey steps, mindmap trees,
git graph entries, and requirement trees. If a diagram type is unknown or not
previewable, the renderer shows the original fenced Mermaid source rather than
a placeholder so the content remains readable and selectable/copyable in the
terminal. Rendered Mermaid headings also show the Mermaid-specific copy command,
for example `/copy mermaid 2`, so users can recover the original diagram source
without switching the whole view to raw mode.
The fallback is still not a complete Mermaid engine. It is a fast,
dependency-light preview layer for common LLM-generated diagrams when
high-fidelity rendering is not available.
### Future providers
The provider boundary is intentionally open for additional native image
providers:
- `mmdc` / `@mermaid-js/mermaid-cli` for SVG/PNG output.
- `terminal-image` for Kitty/iTerm2 plus ANSI fallback.
- `chafa` when present for Sixel/Kitty/iTerm2/Unicode mosaics.
This path should remain optional, cached, and capability-gated, with cache keys
based on source hash, terminal width, renderer provider, and terminal protocol.
It should not block startup or add bundled Mermaid/Puppeteer work to the hot TUI
path by default.
## AST Renderer Compatibility
The first version extends the existing parser to minimize blast radius. The
feature boundaries are still compatible with a future `marked` token pipeline:
- `code(lang=mermaid)` -> `MermaidDiagram`
- `code(lang=*)` -> existing `CodeColorizer`
- `table` -> existing `TableRenderer`
- `blockquote` -> quote block renderer
- `list(task=true)` -> task list renderer
- `paragraph/text` -> inline renderer with math/link/style support
The implementation does not cache React nodes. A future AST renderer should
cache tokens/blocks, then render from current width/theme/settings props.
## Safety And Performance
- Mermaid source is treated as untrusted input.
- The first renderer does not execute Mermaid JavaScript.
- Native image rendering must be opt-in or capability-gated.
- Future browser-based rendering must use timeouts and size limits.
- Rendering should degrade to terminal text instead of throwing.
- Large blocks should respect available height and width.
## Validation
Targeted unit verification:
```bash
cd packages/cli
npx vitest run \
src/config/settingsSchema.test.ts \
src/ui/AppContainer.test.tsx \
src/ui/utils/MarkdownDisplay.test.tsx \
src/ui/utils/mermaidImageRenderer.test.ts \
src/ui/commands/copyCommand.test.ts \
src/ui/components/BaseTextInput.test.tsx \
src/ui/keyMatchers.test.ts \
src/ui/contexts/KeypressContext.test.tsx
```
Broader verification before PR submission:
```bash
npm run build --workspace=packages/cli
npm run typecheck --workspace=packages/cli
npm run lint --workspace=packages/cli
git diff --check
```
Terminal-capture integration scenario:
```bash
npm run build && npm run bundle
cd integration-tests/terminal-capture
npm run capture:markdown-rendering
```
This scenario captures a Markdown-heavy model response, toggles raw/source mode
with `Alt/Option+M`, and verifies the visible source copy flows with
`/copy mermaid 1` and `/copy latex 1`.
Manual scenarios:
- Assistant response with a Mermaid `flowchart LR` block.
- Assistant response with a Mermaid `sequenceDiagram` block.
- Markdown table plus Mermaid in the same answer.
- Fenced JavaScript code block still showing code formatting.
- Narrow terminal width.
- Constrained tool/detail surface.
- `ui.renderMode: "raw"` starts a session in source-oriented mode.
- `Alt/Option+M` toggles the same response between rendered and raw/source
mode.
- Mermaid and LaTeX visual blocks expose copy hints that map to the actual
`/copy mermaid N` and `/copy latex N` source order.