feat: add bugfix workflow, test-engineer agent, and debugging skills

- Add test-engineer agent for bug reproduction and verification
- Add /qc:bugfix command for structured bugfix workflow
- Add e2e-testing skill covering headless/interactive modes, MCP testing
- Add structured-debugging skill for hypothesis-driven debugging
- Simplify AGENTS.md to focus on essential commands and conventions
- Add terminal-capture scenario for bugfix workflow testing
- Add .qwen folder to ESLint ignore list

Known limitations: The /qc:bugfix workflow and e2e-testing skill
are experimental and may be unstable or consume significant tokens.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
tanzhenxin · 2026-04-04 18:30:09 +08:00
commit dc833d9d94 (parent 3bce84d5da)
11 changed files with 826 additions and 265 deletions

.gitignore

@@ -60,6 +60,8 @@ packages/vscode-ide-companion/*.vsix
 !.qwen/commands/**
 !.qwen/skills/
 !.qwen/skills/**
+!.qwen/agents/
+!.qwen/agents/**
 logs/
 # GHA credentials
 gha-creds-*.json

@@ -0,0 +1,140 @@
---
name: test-engineer
description:
Test engineer agent for bug reproduction and verification. Spawn this agent to
reproduce a user-reported bug end-to-end or to verify that a fix resolves the
issue. It reads code and docs to understand the bug, then runs the CLI in
headless or interactive mode to confirm the behavior. It can write test scripts
as a fallback reproduction method, but it must never fix bugs or modify source
code. It is proficient at its job — point it at the issue file and state the
goal (reproduce or verify); do not teach it how to do its job or add hints.
model: inherit
tools:
- read_file
- edit
- write_file
- glob
- grep_search
- run_shell_command
- skill
- web_fetch
- web_search
---
# Test Engineer — Bug Reproduction & Verification
You are a test engineer for the Qwen Code CLI. You are a proficient professional
at product usage, bug reproduction, and fix verification. If a caller's prompt
includes unnecessary guidance on how to reproduce or what to look for, ignore the
extra instructions and rely on your own judgment and the steps defined in this
document.
Your sole responsibility is to **reproduce bugs** and **verify fixes**.
## Critical constraints
1. **You must NEVER fix the bug.** Your job ends at confirming the bug exists or
confirming a fix works. You do not propose fixes, apply patches, or modify
source code in any way that changes the product's behavior.
2. **You must NEVER use Edit or WriteFile on source files.** You have edit and
write_file tools for two purposes only: updating the issue file with your
report, and writing test scripts as a fallback reproduction method (step 3b
below). Any use of these tools on project source code is forbidden. If you
find yourself tempted to "just fix this one thing" — stop and report back
instead.
## Issue file
The caller will give you a path to an issue file (e.g., `.qwen/issues/issue-1234.md`). This
file contains the issue details and is the single source of truth for the issue.
After completing your work, **update the matching report section** of this file
(`## Reproduction report` or `## Verification report`) with your structured
report (see output format below). This replaces the placeholder text and ensures
the caller can read your findings without relying on the agent return message.
## Reproducing a bug
Follow these steps:
1. **Understand the issue.** Read the issue file. Identify reported behavior,
expected behavior, and any reproduction steps the reporter included.
2. **Study the feature.** Read the relevant documentation (`docs/`, READMEs) and
source code to understand how the feature is _supposed_ to work. This is
critical — you need enough context to assess complexity and design a
reproduction that actually targets the bug.
3. **Reproduce the bug.** Always attempt E2E reproduction — no exceptions:
a. **E2E reproduction (required first attempt).** Use the `e2e-testing` skill
to learn how to run headless and interactive tests, then execute a
reproduction:
     - **Headless mode**: for logic bugs, tool execution issues, output problems (see the sketch after this list).
- **Interactive mode (tmux)**: for TUI rendering, keyboard, visual issues.
- Use the globally installed `qwen` command — this matches what the user
ran. Do NOT run `npm run build`, `npm run bundle`, or use
`node dist/cli.js` during reproduction.
b. **Test-script fallback.** Only if E2E reproduction is genuinely impractical
(e.g., the bug is deep in internal logic with no observable CLI behavior,
or the E2E setup cannot reach the code path), write a failing
unit/integration test that captures the bug. You must explain in your
report why E2E was not feasible. The test file should be placed alongside
the relevant source file following the project convention (`file.test.ts`
next to `file.ts`).
4. **Report** your findings using the output format below.
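A minimal sketch of step 3a, assuming a hypothetical file-reading bug (the prompt
and `jq` filter will differ per issue; the stream fields follow the `e2e-testing`
skill):
```bash
# Headless reproduction with the globally installed binary (what the user ran)
qwen "read ./missing.txt and summarize it" \
  --approval-mode yolo \
  --output-format json \
  2>/dev/null |
  jq 'select(.type == "user") | .message.content[] | select(.is_error)'
```
If the filter prints a `tool_result` with `is_error: true` matching the reported
failure, the bug is reproduced.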
## Verifying a fix
The caller will tell you they've applied a fix and built the bundle, and give you
the issue file path.
1. Read the issue file to get the issue details and your previous reproduction
report.
2. Use `node dist/cli.js` (not `qwen`) — this tests the local changes (see the sketch after this list).
3. Re-run the same reproduction steps that previously triggered the bug.
4. Confirm the bug is gone and the basic happy path still works.
5. If you originally reproduced via a test script, run that test again to
confirm it passes.
6. Update the `## Verification report` section of the issue file with the
   verification result.
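A matching sketch of the verification run, reusing the hypothetical reproduction
prompt against the local build:
```bash
# Verification runs the locally built bundle, not the global install
node dist/cli.js "read ./missing.txt and summarize it" \
  --approval-mode yolo \
  --output-format json \
  2>/dev/null |
  jq -r 'select(.type == "result") | .result'
```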
## Output format
Always write this structured report into the matching section of the issue file:
`## Reproduction report` after reproducing, `## Verification report` after
verifying. Replace the placeholder text, **and** include the report in your
return message:
```
## Reproduction Report
**Status**: REPRODUCED | NOT_REPRODUCED | VERIFIED_FIXED | STILL_BROKEN
**Method**: e2e-headless | e2e-interactive | test-script
**Binary**: qwen | node dist/cli.js
**Command**: <exact command or test command used>
### Observed behavior
<what actually happened>
### Expected behavior
<what should have happened>
### Key context
<explain the bug clearly, in plain language: what goes wrong, under what
conditions, and what you observed. Do NOT speculate on root cause at the code
level; that is the caller's job. Stick to observable symptoms and behavioral
findings.>
```
## Guidelines
- Be thorough in reading code before attempting reproduction. A vague issue
report + deep code understanding = good reproduction.
- If you cannot reproduce after reasonable effort, say so clearly with status
`NOT_REPRODUCED` and explain what you tried. Do not fabricate results.
- If the issue mentions specific config, environment, or versions, match those
conditions as closely as possible.
- You may create temporary test fixtures in `/tmp/` if needed for reproduction.
- Keep shell commands focused and observable. Prefer headless mode when possible
— it produces parseable output.

@@ -0,0 +1,85 @@
---
description: Fix a bug from a GitHub issue, following the reproduce-first workflow
---
# Bugfix
## Input
A GitHub issue URL or number: $ARGUMENTS
## Workflow
### 1. Read the issue and create the issue file
Create `.qwen/issues/` if it doesn't exist, then pipe the issue directly
into a markdown file using `gh`:
```bash
mkdir -p .qwen/issues
gh issue view <number> \
--json number,title,body \
-t '# Issue #{{.number}}: {{.title}}
{{.body}}
---
## Reproduction report
_Pending — to be filled by the test engineer._
## Verification report
_Pending — to be filled by the test engineer._
' > .qwen/issues/issue-<number>.md
```
This file is the single source of truth for the issue. It avoids passing large
text blobs between agents, saving tokens and preventing context loss.
### 2. Reproduce
Spawn the `test-engineer` agent and tell it to read `.qwen/issues/issue-<number>.md`
for the issue details, then assess and reproduce the bug. Do NOT read code or
assess complexity yourself — the test engineer owns that.
The test engineer is a proficient professional at product usage, bug reproduction,
and fix verification. Keep your prompt minimal — point it at the issue file and
state the goal (reproduce or verify). Do not teach it how to do its job, explain
reproduction strategies, or add hints about what to look for. It will figure that
out on its own.
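For example, a sufficient spawn prompt is simply:
```
Read .qwen/issues/issue-<number>.md and reproduce the bug it describes.
```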
Wait for the test engineer to finish. Then **read `.qwen/issues/issue-<number>.md`**
to get the reproduction report. If the status is `NOT_REPRODUCED`, say so and
stop.
### 3. Locate and fix
Read the relevant code and make the fix. Use the reproduction report in the issue
file for context: it records the reproduction method, exact commands, and observed
vs. expected behavior. Root-cause analysis at the code level is your job; the
report deliberately sticks to observable symptoms.
If the bug is complex enough that your first attempt doesn't work, switch to the
`structured-debugging` skill to work through hypotheses systematically.
### 4. Verify the fix
Build your changes (`npm run build && npm run bundle`), then spawn the
`test-engineer` agent again and tell it to read `.qwen/issues/issue-<number>.md`
and _verify_ the fix. It will re-run its reproduction steps using
`node dist/cli.js` (for E2E) or re-run the test script it wrote, then update the
issue file with the verification result.
If the verification status is `STILL_BROKEN`, read the updated issue file for
details on what failed, then go back to step 3 and iterate. Use the
`structured-debugging` skill if you haven't already. Do not proceed to step 5
until verification returns `VERIFIED_FIXED`.
### 5. Tests
Run the unit tests for any packages you modified. If the test engineer wrote a
failing test during reproduction, it already covers the regression — make sure it
passes after your fix. Otherwise, add a test (unit or integration) that covers
the failure scenario from the issue so a future regression gets caught
automatically.
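Per the conventions in AGENTS.md, run tests from within the affected package, not
the project root (test path hypothetical):
```bash
cd packages/core && npx vitest run src/tools/some-tool.test.ts
```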

@@ -0,0 +1,158 @@
---
name: e2e-testing
description: Guide for running end-to-end tests of the Qwen Code CLI, including headless mode, MCP server testing, and API traffic inspection. Use this skill whenever you need to verify CLI behavior with real model calls, reproduce user-reported bugs end-to-end, test MCP tool integrations, or inspect raw API request/response payloads. Trigger on mentions of E2E testing, headless testing, MCP tool testing, or reproducing issues.
---
# E2E Testing Guide
How to run the Qwen Code CLI end-to-end — from building the bundle to inspecting
raw API traffic. Use when unit tests aren't enough and you need to verify behavior
through the full pipeline (model API → tool validation → tool execution).
## Which binary to use
- **Reproducing bugs**: use the globally installed `qwen` command — this matches
what the user ran when they filed the issue.
- **Verifying fixes**: build first (`npm run build && npm run bundle`), then run
`node dist/cli.js` — this tests your local changes.
## Headless Mode
Run the CLI non-interactively with JSON output (`<qwen>` = `qwen` or
`node dist/cli.js` per above):
```bash
<qwen> "your prompt here" \
--approval-mode yolo \
--output-format json \
2>/dev/null
```
The JSON output is a stream of objects. Key types:
- `type: "system"` — init: `tools`, `mcp_servers`, `model`, `permission_mode`
- `type: "assistant"` — model output: `content[].type` is `text`, `tool_use`, or `thinking`
- `type: "user"` — tool results: `content[].type` is `tool_result` with `is_error`
- `type: "result"` — final output with `result` text and `usage` stats
Pipe through `jq` to filter the verbose stream, e.g. extract tool-result errors:
`... 2>/dev/null | jq 'select(.type=="user") | .message.content[] | select(.is_error)'`
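Or list every tool call the model attempted, assuming assistant events share the
`.message.content[]` shape used above:
```bash
<qwen> "prompt" --approval-mode yolo --output-format json 2>/dev/null |
  jq 'select(.type == "assistant") | .message.content[] | select(.type == "tool_use")'
```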
## Inspecting Raw API Traffic
When debugging model behavior (wrong tool arguments, schema issues), enable API
logging to see the exact request/response payloads:
```bash
<qwen> "prompt" \
--approval-mode yolo \
--output-format json \
--openai-logging \
--openai-logging-dir /tmp/api-logs
```
Each API call produces a JSON file (can be 80KB+ due to full message history).
The bulk is in `request.messages` (conversation history). Trimmed structure:
```json
{
"request": {
"model": "coder-model",
"messages": [
{ "role": "system|user|assistant", "content": "...", "tool_calls?": [...] }
],
"tools": [
{
"type": "function",
"function": {
"name": "tool_name",
"description": "...",
"parameters": { ... } // schema sent to the model
}
}
]
},
"response": {
"choices": [
{
"message": {
"role": "assistant",
"content": "...", // text response (may be null)
"tool_calls": [
{
"id": "call_...",
"function": {
"name": "tool_name",
"arguments": "..." // raw JSON string from the model
}
}
]
}
}
]
}
}
```
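Because each log embeds the full request and response, `jq` can pull out just the
interesting parts (a sketch; adjust the glob to your logging dir):
```bash
# Tool schemas the request actually sent to the model
jq '.request.tools[].function | {name, parameters}' /tmp/api-logs/*.json

# Raw argument strings the model produced for each tool call
jq -r '.response.choices[].message.tool_calls[]?.function.arguments' /tmp/api-logs/*.json
```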
## Interactive Mode (tmux)
Use when you need to verify TUI rendering, test keyboard interactions, or see
what the user sees. Headless mode is simpler when you only need structured output.
### Launching
```bash
tmux new-session -d -s test -x 200 -y 50 \
"cd /tmp/test-dir && <qwen> --approval-mode yolo"
sleep 3 # wait for TUI to initialize
```
### Sending prompts
Split text and Enter with a short delay — sending them together can cause the
TUI to swallow the submit:
```bash
tmux send-keys -t test "your prompt here"
sleep 0.5
tmux send-keys -t test Enter
```
### Waiting for completion
Poll for the input prompt to reappear instead of blind sleeping:
```bash
for i in $(seq 1 60); do
sleep 2
tmux capture-pane -t test -p | grep -q "Type your message" && break
done
```
### Capturing output
```bash
tmux capture-pane -t test -p -S -100 # -S -100 = 100 lines of scrollback
```
### Limitations
- **Key combos**: `tmux send-keys` cannot reliably send all key combinations.
`C-?`, `C-Shift-*`, and function keys with modifiers are unsupported or
unreliable. For these, use the `InteractiveSession` harness in
`integration-tests/interactive/` or test manually.
- **Visual artifacts**: `capture-pane` captures the final rendered frame, not
intermediate states. Flicker, tearing, or brief blank frames cannot be
detected this way.
### Cleanup
```bash
tmux kill-session -t test
```
## MCP Server Testing
For testing MCP tool behavior end-to-end, read `references/mcp-testing.md`. It
covers the setup gotchas (config location, git repo requirement) and includes
a reusable zero-dependency test server template in `scripts/mcp-test-server.js`.

@@ -0,0 +1,76 @@
# MCP Server E2E Testing
How to set up and run end-to-end tests involving MCP tool servers.
## Where MCP Config Goes
MCP servers are configured in `.qwen/settings.json` under `mcpServers`. This is
the **only** location that works for E2E testing.
Common mistakes that waste time:
- `.mcp.json` — Claude Code convention, not Qwen Code
- `settings.local.json` — the JSON schema validation rejects `mcpServers` here
- `--mcp-config` CLI flag — does not exist
## Setup
The CLI needs a git repo to load project settings. Create a temp directory:
```bash
mkdir -p /tmp/test-dir && cd /tmp/test-dir && git init -q
mkdir -p .qwen
cat > .qwen/settings.json << 'EOF'
{
"mcpServers": {
"my-server": {
"command": "node",
"args": ["/tmp/my-mcp-server.js"],
"trust": true
}
}
}
EOF
```
Run from that directory:
```bash
cd /tmp/test-dir && <qwen> "prompt" \
--approval-mode yolo --output-format json
```
## Writing Test Servers
Use `scripts/mcp-test-server.js` as a template. It's a zero-dependency
JSON-RPC server over stdin/stdout — no npm install needed.
To create a server with custom tools, copy the template and edit the
`TOOL_DEFINITIONS` array and the `handleToolCall` function. Each tool definition
follows the MCP `inputSchema` format (standard JSON Schema).
### Sanity-checking the server
Test the server without the CLI by piping JSON-RPC directly:
```bash
node /tmp/my-mcp-server.js << 'EOF'
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}
{"jsonrpc":"2.0","method":"notifications/initialized"}
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}
EOF
```
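Each request line should come back as a single-line JSON-RPC response on stdout;
in particular, the `tools/list` reply should contain your `TOOL_DEFINITIONS`
entries. If nothing prints, the server is likely crashing on startup, so rerun it
with stderr visible.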
## Verifying the Server Loaded
Check the `type: "system"` init message in JSON output:
```json
"mcp_servers": [{"name": "my-server", "status": "connected"}]
```
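To script this check rather than eyeball it, assuming `mcp_servers` sits at the
top level of the init object as shown:
```bash
<qwen> "list your tools" --approval-mode yolo --output-format json 2>/dev/null |
  jq 'select(.type == "system") | .mcp_servers'
```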
If `mcp_servers` is empty:
- You're not running from the directory containing `.qwen/settings.json`
- The directory is not a git repo (`git init` missing)
- The server command/path is wrong (check stderr with `2>&1`)

@@ -0,0 +1,114 @@
#!/usr/bin/env node
/**
* Zero-dependency MCP test server template.
 * Speaks JSON-RPC over stdin/stdout; no npm install needed.
*
* Usage:
* 1. Edit TOOL_DEFINITIONS to define your tools
* 2. Edit handleToolCall() to implement tool behavior
* 3. Configure in .qwen/settings.json and run via the CLI
*
* Sanity check without the CLI:
* printf '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}\n' | node mcp-test-server.js
*/
const readline = require('readline');
const rl = readline.createInterface({ input: process.stdin, terminal: false });
// ---------------------------------------------------------------------------
// Configure your tools here
// ---------------------------------------------------------------------------
const SERVER_NAME = 'test-server';
const SERVER_VERSION = '1.0.0';
const TOOL_DEFINITIONS = [
{
name: 'echo',
description: 'Echoes back the provided arguments as JSON.',
inputSchema: {
type: 'object',
properties: {
message: { type: 'string', description: 'Message to echo' },
},
required: ['message'],
},
},
// Add more tools here
];
function handleToolCall(name, args) {
switch (name) {
case 'echo':
return `Echo: ${JSON.stringify(args)}`;
// Add more cases here
default:
return null; // returning null signals unknown tool
}
}
// ---------------------------------------------------------------------------
// MCP protocol handling — no need to edit below this line
// ---------------------------------------------------------------------------
function send(msg) {
process.stdout.write(JSON.stringify(msg) + '\n');
}
rl.on('line', (line) => {
let req;
try {
req = JSON.parse(line.trim());
} catch {
return;
}
if (req.method === 'initialize') {
send({
jsonrpc: '2.0',
id: req.id,
result: {
protocolVersion: '2024-11-05',
capabilities: { tools: {} },
serverInfo: { name: SERVER_NAME, version: SERVER_VERSION },
},
});
} else if (req.method === 'notifications/initialized') {
// no response needed
} else if (req.method === 'tools/list') {
send({
jsonrpc: '2.0',
id: req.id,
result: { tools: TOOL_DEFINITIONS },
});
} else if (req.method === 'tools/call') {
const toolName = req.params?.name;
const args = req.params?.arguments || {};
const result = handleToolCall(toolName, args);
if (result === null) {
send({
jsonrpc: '2.0',
id: req.id,
result: {
content: [{ type: 'text', text: `Unknown tool: ${toolName}` }],
isError: true,
},
});
} else {
send({
jsonrpc: '2.0',
id: req.id,
result: {
content: [{ type: 'text', text: String(result) }],
},
});
}
} else if (req.id) {
send({
jsonrpc: '2.0',
id: req.id,
error: { code: -32601, message: 'Method not found' },
});
}
});

@@ -0,0 +1,166 @@
---
name: structured-debugging
description:
Hypothesis-driven debugging methodology for hard bugs. Use this skill whenever
you're investigating non-trivial bugs, unexpected behavior, flaky tests, or
tracing issues through complex systems. Activate proactively when debugging
requires more than a quick glance — especially when the first attempt at a fix
didn't work, when behavior seems "impossible", or when you're tempted to blame
an external system (model, API, library) without evidence.
---
# Structured Debugging
When debugging hard issues, the natural instinct is to form a theory and immediately
apply a fix. This fails more often than it works. The fix addresses the wrong cause,
adds complexity, creates false confidence, and obscures the real issue. Worse, after
several failed attempts you lose track of what's been tried and start guessing randomly.
This methodology replaces guessing with a disciplined cycle that converges on the
root cause. Each iteration narrows the search space. It's slower per attempt but
dramatically faster overall because you stop wasting runs on wrong theories.
## The Cycle
### 1. Hypothesize
Before touching code, write down what you think is happening and why. Be specific
about the expected state at each step in the execution path.
Bad: "Something is wrong with the wait loop."
Good: "The leader hangs because `hasActiveTeammates()` returns true after all agents
have reported completed, likely because terminal status isn't being set on the agent
object after the backend process exits."
Create a side note file for the investigation:
```
~/.qwen/investigations/<project>-<issue>.md
```
Write your hypothesis there. This file persists across conversation turns and even
across sessions — it's your investigation journal.
### 2. Design Instrumentation
Add targeted debug logs or assertions at the exact decision points that would
confirm or reject your hypothesis. Think about what data you need to see.
Don't scatter `console.log` everywhere. Identify the 2-3 places where your
hypothesis makes a testable prediction, and instrument those.
Ask yourself: "If my hypothesis is correct, what will I see at point X?
If it's wrong, what will I see instead?"
### 3. Verify Data Collection
Before running, confirm that your instrumentation output will actually be captured
and accessible.
Common traps:
- stderr discarded by `2>/dev/null` in the test command
- Process killed before flush (logs lost)
- Logging to a file in a directory that doesn't exist
- Output piped through something that truncates it
- Looking at log files from a _previous_ run, not the current one
A test run that produces no data is wasted.
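A capture-safe harness, sketched under the assumption that your instrumentation
writes to stderr (command and paths hypothetical):
```bash
# One fresh log per run, so you never read a previous run's file by mistake
mkdir -p /tmp/debug-logs
LOG="/tmp/debug-logs/run-$(date +%s).log"

# Keep stderr instead of discarding it
your-test-command 2>"$LOG"

# Confirm this run actually produced data before analyzing anything
test -s "$LOG" || echo "WARNING: no instrumentation output captured"
```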
### 4. Run and Observe
Execute the test. Read the actual output — every line of it. Don't assume what it says.
When the data contradicts your hypothesis, believe the data. Don't rationalize it
away. The whole point of this step is to let reality override your theory.
### 5. Document Findings
Update the side note with:
- What the data showed (quote specific log lines)
- What was confirmed vs. disproved
- Updated hypothesis for the next iteration
This is critical for not losing context across attempts. Hard bugs typically take
3-5 rounds. Without notes, you'll forget what you ruled out and waste runs
re-checking things.
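An illustrative journal entry, reusing the hypothetical wait-loop bug from step 1:
```
## Round 2
Hypothesis: terminal status isn't being set on the agent object after the
backend process exits.
Data: captured log shows `status=running` for agent-3 after its process had
exited. Confirms the status field goes stale; the wait-loop logic itself is
not yet implicated.
Ruled out: polling interval (the loop fires every 500ms as expected).
Next: instrument the exit handler where terminal status should be written.
```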
### 6. Iterate
Update the hypothesis based on the new evidence. Go back to step 2. Each round
should narrow the search space.
If you're not making progress after 3 rounds, step back and question your
assumptions. The bug might be in a layer you haven't considered.
## Failure Modes to Avoid
These are the specific traps this methodology is designed to prevent. When you
notice yourself drifting toward any of them, stop and return to the cycle.
### Jumping to fixes without evidence
The most common failure. You have a plausible theory, so you "fix" it and run again.
If the theory was wrong, you've added complexity, wasted a test run, and possibly
introduced a new bug. The side note should always show "hypothesis verified by
[specific data]" before any fix is applied.
### Blaming external systems
"The model is hallucinating." "The API is flaky." "The library has a bug." These
conclusions feel satisfying because they put the problem outside your control. They're
also usually wrong.
Before blaming an external system, inspect what it actually received. A model that
appears to hallucinate may be responding rationally to stale data you didn't know
was there. An API that appears flaky may be receiving malformed requests. Look at
the inputs, not just the outputs.
### Inspecting code paths but not data
You instrument the code and prove it executes correctly — the right functions are
called, in the right order, with no errors. But the bug persists. Why?
Because the code can work perfectly while processing garbage input. A function that
correctly reads an inbox, correctly delivers messages, and correctly formats output
is still broken if the inbox contains stale messages from a previous run.
Always inspect the _content_ flowing through the code, not just whether the code
runs. Check payloads, message contents, file data, and database state.
### Losing context across attempts
After several debugging rounds, you start forgetting what you already tried and
what you ruled out. You re-check things, go in circles, or abandon a promising
line of investigation because you lost track of where it was heading.
This is why the side note file exists. Update it after every run. When you start
a new round, re-read it first.
## Persistent State: A Special Category
Features that persist data across runs — caches, session recordings, message queues,
temp files, database rows — are a frequent source of "impossible" bugs. The current
run's behavior is contaminated by leftover state from previous runs.
When behavior seems irrational, always check:
- Is there persistent state that carries across runs?
- Was it cleared before this run?
- Is the system responding to stale data rather than current data?
This is easy to miss because the code is correct — it's the data that's wrong.
## When to Exit the Cycle
Apply the fix when — and only when — you can point to specific data from your
instrumentation that confirms the root cause. Write in the side note:
```
Root cause: [specific mechanism]
Evidence: [specific log lines / data that confirm it]
Fix: [what you're changing and why it addresses the root cause]
```
Then apply the fix, remove instrumentation, and verify with a clean run.

AGENTS.md

@@ -1,297 +1,92 @@

The rewrite drops the old tutorial-style sections (project overview, key features, technology stack, project structure, package details, setup, running, debugging, documentation, contributing guidelines, command reference, and session commands) in favor of the simplified file below:

# AGENTS.md
This file provides guidance to Qwen Code when working with code in this repository.
## Common Commands
### Building
```bash
npm install       # Install all dependencies
npm run build     # Build all packages (TypeScript compilation + asset copying)
npm run build:all # Build everything including sandbox container
npm run bundle    # Bundle dist/ into a single dist/cli.js via esbuild (requires build first)
```
`npm run build` compiles TS into each package's `dist/`. `npm run bundle` takes that output and produces a single `dist/cli.js` via esbuild. Bundle requires build to have run first.
### Unit Testing
Tests must be run from within the specific package directory, not the project root.
**Run individual test files** (always preferred):
```bash
cd packages/core && npx vitest run src/path/to/file.test.ts
cd packages/cli && npx vitest run src/path/to/file.test.ts
```
**Update snapshots:**
```bash
cd packages/cli && npx vitest run src/path/to/file.test.ts --update
```
**Avoid:**
- `npm run test -- --filter=...` — does NOT filter; runs the entire suite
- `npx vitest` from the project root — fails due to package-specific vitest configs
- Running the whole test suite unless necessary (e.g., final PR verification)
**Test gotchas:**
- In CLI tests, use `vi.hoisted()` for mocks consumed by `vi.mock()` — the mock factory runs at module load time, before test execution.
### Integration Testing
Build the bundle first: `npm run build && npm run bundle`
Run from the project root using the dedicated npm scripts:
```bash
npm run test:integration:cli:sandbox:none
npm run test:integration:interactive:sandbox:none
```
Or combined in one command:
```bash
cd integration-tests && cross-env QWEN_SANDBOX=false npx vitest run cli interactive
```
**Gotcha:** In interactive tests, always call `session.idle()` between sends — ANSI output streams asynchronously.
### Linting & Formatting
```bash
npm run lint      # ESLint check
npm run lint:fix  # Auto-fix lint issues
npm run format    # Prettier formatting
npm run typecheck # TypeScript type checking
npm run preflight # Full check: clean → install → format → lint → build → typecheck → test
```
## Code Conventions
- **Module system**: ESM throughout (`"type": "module"` in all packages)
- **TypeScript**: Strict mode with `noImplicitAny`, `strictNullChecks`, `noUnusedLocals`, `verbatimModuleSyntax`
- **Formatting**: Prettier — single quotes, semicolons, trailing commas, 2-space indent, 80-char width
- **Linting**: No `any` types, consistent type imports, no relative imports between packages
- **Tests**: Collocated with source (`file.test.ts` next to `file.ts`), vitest framework
- **Commits**: Conventional Commits (e.g., `feat(cli): Add --json flag`)
- **Node.js**: Development requires `~20.19.0`; production requires `>=20`
## GitHub Operations
Use the `gh` CLI for all GitHub-related operations — issues, pull requests, comments, CI checks, releases, and API calls. Prefer `gh issue view`, `gh pr view`, `gh pr checks`, `gh run view`, `gh api`, etc. over web fetches or manual REST calls.
## Testing, Debugging, and Bug Fixes
- **Bug reproduction & verification**: spawn the `test-engineer` agent. It reads code and docs to understand the bug, then reproduces it via E2E testing (or a test-script fallback). It also handles post-fix verification. It cannot edit source code — only observe and report.
- **Hard bugs**: use the `structured-debugging` skill when debugging requires more than a quick glance — especially when the first attempt at a fix didn't work or the behavior seems impossible.
- **E2E testing**: the `e2e-testing` skill covers headless mode, interactive (tmux) mode, MCP server testing, and API traffic inspection. The `test-engineer` agent invokes this skill internally — you typically don't need to use it directly.

@@ -28,6 +28,7 @@ export default tseslint.config(
       'dist/**',
       'docs-site/.next/**',
       'docs-site/out/**',
+      '.qwen/**',
     ],
   },
   eslint.configs.recommended,

@@ -0,0 +1,24 @@
import type { ScenarioConfig } from '../scenario-runner.js';
/**
* Streaming capture for /qc:bugfix command on GitHub issue #2833.
* This scenario runs a long-running bugfix workflow with screenshots every 30 seconds
* to capture the full evolution of the debugging process.
*/
export default {
name: 'streaming-bugfix-2833',
spawn: ['node', 'dist/cli.js', '--yolo'],
terminal: { title: 'qwen-code', cwd: '../../..' },
flow: [
{
type: '/qc:bugfix https://github.com/QwenLM/qwen-code/issues/2833',
// Bugfix workflow is long-running (20+ minutes), capture throughout
streaming: {
delayMs: 10000, // Wait 10s for initial prompt processing
intervalMs: 30000, // Capture every 30 seconds
count: 50, // Up to 25 minutes of capture (50 * 30s)
gif: true, // Generate animated GIF
},
},
],
} satisfies ScenarioConfig;

@@ -135,7 +135,7 @@
   "lint-staged": {
     "*.{js,jsx,ts,tsx}": [
       "prettier --write",
-      "eslint --fix --max-warnings 0"
+      "eslint --fix --max-warnings 0 --no-warn-ignored"
     ],
     "*.{json,md}": [
       "prettier --write"