qwen-code/integration-tests/terminal-capture/motivation.md
tanzhenxin b8a7ac830d feat(terminal-capture): add streaming capture with GIF generation
Add ability to capture multiple screenshots at intervals during
long-running terminal output (e.g., progress bars). Optionally
generates animated GIFs from captured frames using ffmpeg.

Features:
- Streaming capture at configurable intervals
- Early stop when output stabilizes (3 consecutive unchanged frames)
- Duplicate frame skipping
- Animated GIF generation via ffmpeg concat demuxer
- Auto-cleanup of output directory before each run
- Configurable delay before starting captures

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-03-05 17:46:09 +08:00


terminal-capture — Motivation and Positioning

1. Overview of Existing Testing System

| Layer | Tools | Coverage | Status |
| --- | --- | --- | --- |
| Unit Tests | Vitest + ink-testing-library | Ink components, Core logic, utilities | Mature, extensive .test.ts / .test.tsx |
| Integration Tests | Vitest + TestRig / SDKTestHelper | CLI E2E, SDK multi-turn, MCP, auth | Mature, supports none/docker/podman sandboxes |
| Terminal UI Snapshots | toMatchSnapshot() + ink-testing-library | Ink component render output (ANSI) | Exists, covers Footer, InputPrompt, MarkdownDisplay, etc. |
| Web UI Regression | Chromatic + Storybook | packages/webui components | Exists, but only covers Web UI |
| Terminal UI Visual | terminal-capture | CLI terminal real rendering screenshots | Implemented |

2. Problems Solved by terminal-capture

Limitations of Existing Ink Text Snapshots

The project uses toMatchSnapshot() to compare Ink component ANSI text output, which validates text content, but cannot verify:

  • Whether colors are correct (red separators? green highlights? Logo gradients?)
  • Whether layout is aligned (table borders? multi-column layout?)
  • Overall visual feel (component spacing? blank areas? overflow?)

These can only be seen by actually rendering to a terminal emulator.

Core Architecture

node-pty (pseudo-terminal)
  ↓ raw ANSI byte stream
xterm.js (running inside Playwright headless Chromium)
  ↓ perfect rendering: colors, bold, cursor, scrolling
Playwright element screenshot
  ↓ pixel-perfect screenshots (optional macOS window decorations)

Core Features

| Feature | Description |
| --- | --- |
| WYSIWYG | xterm.js fully renders ANSI, so no manual output cleaning is needed |
| Theme Support | 5 built-in themes (Dracula, One Dark, GitHub Dark, Monokai, Night Owl) |
| Full-length | captureFull() supports capturing scrollback buffer content |
| Streaming Capture | Captures multiple frames at intervals during execution (e.g., progress bars) |
| Animated GIF | Auto-generates a GIF from streaming frames via ffmpeg |
| Early Stop | Streaming stops early if output stabilizes; duplicate frames are skipped |
| Auto Cleanup | Output directory is cleared before each run to prevent stale screenshots |
| Deterministic Naming | Screenshot filenames are auto-generated by step sequence for easy regression comparison |
| Batch Execution | run.ts executes all scenarios in one command |
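
The early-stop and duplicate-skipping behavior can be illustrated with a small pure function. This is a minimal sketch, not the project's implementation: the name `selectStreamingFrames`, the `StreamingSelection` shape, and the idea of comparing frames by a string fingerprint (for example, a hash of each PNG) are all assumptions made for illustration. The threshold of 3 consecutive unchanged frames comes from the feature description.

```typescript
// Illustrative sketch only; names and the fingerprint approach are assumptions.
interface StreamingSelection {
  /** Indices of captured frames that differ from the previous capture. */
  kept: number[];
  /** True when capture ended because the output stopped changing. */
  stoppedEarly: boolean;
}

/**
 * Walks frame fingerprints in capture order. A frame identical to the
 * previous capture is skipped, and capture stops once `stableLimit`
 * consecutive captures are unchanged.
 */
function selectStreamingFrames(
  fingerprints: string[],
  stableLimit = 3,
): StreamingSelection {
  const kept: number[] = [];
  let previous: string | undefined;
  let unchanged = 0;

  for (let i = 0; i < fingerprints.length; i++) {
    const current = fingerprints[i];
    if (current === previous) {
      unchanged++;
      if (unchanged >= stableLimit) {
        return { kept, stoppedEarly: true };
      }
      continue; // duplicate frame: do not save it
    }
    unchanged = 0;
    previous = current;
    kept.push(i);
  }
  return { kept, stoppedEarly: false };
}

const demoSelection = selectStreamingFrames(['a', 'b', 'b', 'c', 'c', 'c', 'c', 'd']);
console.log(demoSelection); // { kept: [ 0, 1, 3 ], stoppedEarly: true }
```

In this example the final frame `'d'` is never examined, because three unchanged captures of `'c'` end the stream first. Whether the real runner counts stability in exactly this way is not stated in the source.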

3. Usage

TypeScript Configuration-Driven

Scenario config files (scenarios/*.ts) only need to declare type (text input) and key (keypress) steps; the runner handles the rest automatically:

  • Wait for CLI readiness
  • Auto-complete interference handling (/ commands auto-send Escape)
  • Auto-screenshot before/after input (01 = input state, 02 = result)
  • Auto-capture full-length image at last step (full-flow.png)
  • Special key interactions (Arrow keys / Tab / Enter, etc.)

```typescript
// integration-tests/terminal-capture/scenarios/about.ts
import type { ScenarioConfig } from '../scenario-runner.js';

export default {
  name: '/about',
  spawn: ['node', 'dist/cli.js', '--yolo'],
  terminal: { title: 'qwen-code', cwd: '../../..' },
  flow: [
    { type: 'Hi, can you help me understand this codebase?' },
    { type: '/about' },
  ],
} satisfies ScenarioConfig;
```
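
The automatic handling listed above can be pictured as expanding each flow step into a fixed sequence of actions. The following is a rough sketch under stated assumptions, not the actual runner. `FlowStep`, `PlannedAction`, and `planFlow` are hypothetical names, and the real `ScenarioConfig` type in scenario-runner may differ. It also assumes that Escape is sent right after a slash command is typed, that Enter submits the input, and that a `key` step produces a single screenshot.

```typescript
// Hypothetical types: the real ScenarioConfig in scenario-runner may differ.
type FlowStep = { type: string } | { key: string };

type PlannedAction =
  | { action: 'type'; text: string }
  | { action: 'press'; key: string }
  | { action: 'screenshot'; file: string };

const pad2 = (n: number): string => (n < 10 ? '0' : '') + n;

/** Expands declared steps into the actions a runner might perform. */
function planFlow(flow: FlowStep[]): PlannedAction[] {
  const plan: PlannedAction[] = [];
  flow.forEach((step, index) => {
    const prefix = pad2(index + 1);
    if ('type' in step) {
      plan.push({ action: 'type', text: step.type });
      if (step.type.startsWith('/')) {
        // Assumption: Escape dismisses the auto-complete popup for / commands.
        plan.push({ action: 'press', key: 'Escape' });
      }
      plan.push({ action: 'screenshot', file: `${prefix}-01.png` }); // input state
      plan.push({ action: 'press', key: 'Enter' });
      plan.push({ action: 'screenshot', file: `${prefix}-02.png` }); // result
    } else {
      // Assumption: a key step presses the key and captures one screenshot.
      plan.push({ action: 'press', key: step.key });
      plan.push({ action: 'screenshot', file: `${prefix}-01.png` });
    }
  });
  if (flow.length > 0) {
    plan.push({ action: 'screenshot', file: 'full-flow.png' }); // full-length image
  }
  return plan;
}

const aboutPlan = planFlow([
  { type: 'Hi, can you help me understand this codebase?' },
  { type: '/about' },
]);
console.log(aboutPlan.filter((a) => a.action === 'screenshot'));
```

For the `/about` scenario, the screenshot actions in this plan are `01-01.png`, `01-02.png`, `02-01.png`, `02-02.png`, and `full-flow.png`. The waits for CLI readiness and for output to settle are omitted here.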

Running

```bash
# From project root
npx tsx integration-tests/terminal-capture/run.ts integration-tests/terminal-capture/scenarios/

# Or inside terminal-capture directory
npm run capture
```

Screenshot Output

```
scenarios/screenshots/
  about/
    01-01.png            # Step 1 input state
    01-02.png            # Step 1 result
    02-01.png            # Step 2 input state
    02-02.png            # Step 2 result
    full-flow.png        # Final state full-length image
  streaming-shell/
    01-01.png            # Input state
    01-streaming-01.png  # Streaming frame 1
    01-streaming-02.png  # Streaming frame 2
    ...
    01-02.png            # Final result
    streaming.gif        # Animated GIF (requires ffmpeg)
    full-flow.png        # Final state full-length image
```
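
To show how streaming frames named as above could feed ffmpeg's concat demuxer, here is a hedged sketch that builds the list file text. The helper names `streamingFrameName` and `buildConcatList` and the 0.5 second frame duration are assumptions for illustration; the project's actual code and timings are not shown in the source. The list format itself (`file` and `duration` directives, with the last file repeated so that its duration is honored) follows ffmpeg's documented concat demuxer usage.

```typescript
// Sketch only: helper names and the 0.5 s frame duration are assumptions.
const two = (n: number): string => (n < 10 ? '0' : '') + n;

/** Builds a name such as "01-streaming-02.png" for a step and frame number. */
function streamingFrameName(step: number, frame: number): string {
  return `${two(step)}-streaming-${two(frame)}.png`;
}

/** Renders the text of an ffmpeg concat demuxer list for the given frames. */
function buildConcatList(frames: string[], secondsPerFrame: number): string {
  const lines: string[] = [];
  let lastFileLine: string | undefined;
  for (const frame of frames) {
    // A single quote inside a path is written as '\'' in the list file.
    const quoted = frame.replace(/'/g, "'\\''");
    lastFileLine = `file '${quoted}'`;
    lines.push(lastFileLine);
    lines.push(`duration ${secondsPerFrame}`);
  }
  if (lastFileLine !== undefined) {
    // The final entry is repeated without a duration so the last frame is shown.
    lines.push(lastFileLine);
  }
  return lines.join('\n') + '\n';
}

const concatText = buildConcatList(
  [streamingFrameName(1, 1), streamingFrameName(1, 2)],
  0.5,
);
console.log(concatText);
```

If this text were saved to a file (say `frames.txt`, a hypothetical name), a command along the lines of `ffmpeg -f concat -safe 0 -i frames.txt streaming.gif` would turn it into an animated GIF. The exact flags the project passes may differ.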

4. Position in Testing System

┌─────────────────────────────────────┐
│       Existing Testing System        │
├─────────────────────────────────────┤
│  Unit Tests (Vitest)                 │  ← Function/Component level
│  Text Snapshots (ink-testing-lib)    │  ← ANSI string comparison
│  Integration Tests (TestRig/SDK)     │  ← E2E functionality
│  Web UI Regression (Chromatic)       │  ← Only covers webui
├─────────────────────────────────────┤
│  terminal-capture                    │  ← Terminal UI visual layer
│  (xterm.js + Playwright)             │     Fills the gap
└─────────────────────────────────────┘

5. Future Directions

  1. Visual Regression: Integrate Playwright toHaveScreenshot() for pixel-level baseline comparison, so that CI automatically detects terminal UI changes
  2. PR Workflow Integration: Drive an agent via a Cursor Skill to check out the branch, build, capture screenshots, and attach them to a review comment automatically
  3. Complement to Chromatic: Chromatic covers the Web UI, while terminal-capture covers the CLI terminal UI