vrr/qwen-code: mirror of https://github.com/QwenLM/qwen-code?tab=readme-ov-file#-free-options-available

mirror of https://github.com/QwenLM/qwen-code.git synced 2026-04-28 03:30:40 +00:00

README.md

An open-source AI agent that lives in your terminal.

🎉 News

2026-04-15: Qwen OAuth free tier has been discontinued. To continue using Qwen Code, switch to Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key. Run qwen auth to configure.
2026-04-13: Qwen OAuth free tier policy update: daily quota adjusted to 100 requests/day (from 1,000).
2026-04-02: Qwen3.6-Plus is now live! Get an API key from Alibaba Cloud ModelStudio to access it through the OpenAI-compatible API.
2026-02-16: Qwen3.5-Plus is now live!

Why Qwen Code?

Qwen Code is an open-source AI agent for the terminal, optimized for Qwen series models. It helps you understand large codebases, automate tedious work, and ship faster.

Multi-protocol, flexible providers: use OpenAI / Anthropic / Gemini-compatible APIs, Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key.
Open-source, co-evolving: both the framework and the Qwen3-Coder model are open-source—and they ship and evolve together.
Agentic workflow, feature-rich: rich built-in tools (Skills, SubAgents) for a full agentic workflow and a Claude Code-like experience.
Terminal-first, IDE-friendly: built for developers who live in the command line, with optional integration for VS Code, Zed, and JetBrains IDEs.

Installation

Quick Install (Recommended)

Linux / macOS

bash -c "$(curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh)"

Windows (Run as Administrator)

Works in both Command Prompt and PowerShell:

powershell -Command "Invoke-WebRequest 'https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.bat' -OutFile (Join-Path $env:TEMP 'install-qwen.bat'); & (Join-Path $env:TEMP 'install-qwen.bat')"

Note: It's recommended to restart your terminal after installation so that environment variables take effect.

Manual Installation

Prerequisites

Make sure you have Node.js 20 or later installed. Download it from nodejs.org.

NPM

npm install -g @qwen-code/qwen-code@latest

Homebrew (macOS, Linux)

brew install qwen-code

Quick Start

# Start Qwen Code (interactive)
qwen

# Then, in the session:
/help
/auth

On first use, you'll be prompted to sign in. You can run /auth anytime to switch authentication methods.

Example prompts:

What does this project do?
Explain the codebase structure.
Help me refactor this function.
Generate unit tests for this module.

🦞 Use Qwen Code for Coding Tasks in Claw

Copy the prompt below and paste it into your agent:

Use Qwen Code as a coding agent. First, fetch and read this skill reference:
https://raw.githubusercontent.com/QwenLM/qwen-code/refs/heads/main/.qwen/skills/qwen-code-claw/SKILL.md
Then follow the instructions inside to install, authenticate, and use Qwen Code via acpx for all coding tasks.

Authentication

Qwen Code supports the following authentication methods:

API Key (recommended): use an API key from Alibaba Cloud Model Studio (Beijing / intl) or any supported provider (OpenAI, Anthropic, Google GenAI, and other compatible endpoints).
Coding Plan: subscribe to the Alibaba Cloud Coding Plan (Beijing / intl) for a fixed monthly fee with higher quotas.

⚠️ Qwen OAuth was discontinued on April 15, 2026. If you were previously using Qwen OAuth, please switch to one of the methods above. Run qwen and then /auth to reconfigure.

API Key (recommended)

Use an API key to connect to Alibaba Cloud Model Studio or any supported provider. Supports multiple protocols:

OpenAI-compatible: Alibaba Cloud ModelStudio, ModelScope, OpenAI, OpenRouter, and other OpenAI-compatible providers
Anthropic: Claude models
Google GenAI: Gemini models

The recommended way to configure models and providers is by editing ~/.qwen/settings.json (create it if it doesn't exist). This file lets you define all available models, API keys, and default settings in one place.

Quick Setup in 3 Steps

Step 1: Create or edit ~/.qwen/settings.json

Here is a complete example:

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "description": "Qwen3.6-Plus via Dashscope",
        "envKey": "DASHSCOPE_API_KEY"
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Step 2: Understand each field

Field	What it does
`modelProviders`	Declares which models are available and how to connect to them. Keys like `openai`, `anthropic`, `gemini` represent the API protocol.
`modelProviders[].id`	The model ID sent to the API (e.g. `qwen3.6-plus`, `gpt-4o`).
`modelProviders[].envKey`	The name of the environment variable that holds your API key.
`modelProviders[].baseUrl`	The API endpoint URL (required for non-default endpoints).
`env`	A fallback place to store API keys (lowest priority; prefer `.env` files or `export` for sensitive keys).
`security.auth.selectedType`	The protocol to use on startup (`openai`, `anthropic`, `gemini`, `vertex-ai`).
`model.name`	The default model to use when Qwen Code starts.
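
The relationships between these fields can be checked with a short script. The following is a minimal sketch in Python and is not part of Qwen Code: the helper name `check_settings` and its messages are hypothetical. It only encodes what the table and the example above suggest: `model.name` should match an `id` declared under the selected protocol, and every `envKey` should resolve to a value. It ignores `.env` files, so a key stored there would be reported as missing.

```python
import os


def check_settings(settings, environ=None):
    """Return a list of human-readable problems found in a settings dict.

    Hypothetical helper: it mirrors the field table above, not Qwen Code's
    own validation logic.
    """
    environ = os.environ if environ is None else environ
    problems = []

    providers = settings.get("modelProviders", {})
    protocol = settings.get("security", {}).get("auth", {}).get("selectedType")
    default_model = settings.get("model", {}).get("name")

    if protocol not in providers:
        problems.append(f"selectedType {protocol!r} has no entry in modelProviders")
    else:
        ids = [model.get("id") for model in providers[protocol]]
        if default_model not in ids:
            problems.append(
                f"model.name {default_model!r} is not declared under {protocol!r}"
            )

    fallback_env = settings.get("env", {})
    for models in providers.values():
        for model in models:
            key = model.get("envKey")
            # Local endpoints may omit envKey entirely, so only check when set.
            if key and key not in environ and key not in fallback_env:
                problems.append(f"{key} is not set for model {model.get('id')!r}")
    return problems
```

To use it, load `~/.qwen/settings.json` with `json.load` and pass the resulting dict to `check_settings`; an empty list means none of these particular checks failed.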

Step 3: Start Qwen Code — your configuration takes effect automatically:

qwen

Use the /model command at any time to switch between all configured models.

More Examples

Coding Plan (Alibaba Cloud ModelStudio) — fixed monthly fee, higher quotas

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.6-plus from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY"
      },
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.5-plus with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "glm-4.7",
        "name": "glm-4.7 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "glm-4.7 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "kimi-k2.5",
        "name": "kimi-k2.5 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "kimi-k2.5 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "BAILIAN_CODING_PLAN_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Subscribe to the Coding Plan and get your API key at Alibaba Cloud ModelStudio (Beijing) or Alibaba Cloud ModelStudio (intl).

Multiple providers (OpenAI + Anthropic + Gemini)

{
  "modelProviders": {
    "openai": [
      {
        "id": "gpt-4o",
        "name": "GPT-4o",
        "envKey": "OPENAI_API_KEY",
        "baseUrl": "https://api.openai.com/v1"
      }
    ],
    "anthropic": [
      {
        "id": "claude-sonnet-4-20250514",
        "name": "Claude Sonnet 4",
        "envKey": "ANTHROPIC_API_KEY"
      }
    ],
    "gemini": [
      {
        "id": "gemini-2.5-pro",
        "name": "Gemini 2.5 Pro",
        "envKey": "GEMINI_API_KEY"
      }
    ]
  },
  "env": {
    "OPENAI_API_KEY": "sk-xxxxxxxxxxxxx",
    "ANTHROPIC_API_KEY": "sk-ant-xxxxxxxxxxxxx",
    "GEMINI_API_KEY": "AIzaxxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "gpt-4o"
  }
}

Enable thinking mode (for supported models like qwen3.5-plus)

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (thinking)",
        "envKey": "DASHSCOPE_API_KEY",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.5-plus"
  }
}

Tip: You can also set API keys via export in your shell or .env files, which take higher priority than settings.json → env. See the authentication guide for full details.
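
The priority rule in this tip can be pictured as a layered lookup. The sketch below is illustrative Python, not Qwen Code's implementation: `resolve_api_key` and its arguments are hypothetical names. The README only states that shell `export` and `.env` files both outrank the `env` block in settings.json; placing the shell above `.env` here is an assumption.

```python
def resolve_api_key(env_key, shell_env, dotenv_values, settings_env):
    """Return (value, source) for env_key, or (None, None) if it is unset.

    Each argument after env_key is a plain dict of variable names to values.
    """
    layers = [
        ("shell", shell_env),             # `export KEY=...` in the shell
        (".env", dotenv_values),          # values loaded from a .env file
        ("settings.json", settings_env),  # the `env` block, lowest priority
    ]
    for source, values in layers:
        value = values.get(env_key)
        if value:  # skip missing and empty values
            return value, source
    return None, None
```

For example, `resolve_api_key("DASHSCOPE_API_KEY", {}, {}, {"DASHSCOPE_API_KEY": "sk-xxx"})` falls through to the settings.json layer because neither higher layer defines the key.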

Security note: Never commit API keys to version control. The ~/.qwen/settings.json file is in your home directory and should stay private.

Local Model Setup (Ollama / vLLM)

You can also run models locally — no API key or cloud account needed. This is not an authentication method; instead, configure your local model endpoint in ~/.qwen/settings.json using the modelProviders field.

Ollama setup

Install Ollama from ollama.com
Pull a model: ollama pull qwen3:32b
Configure ~/.qwen/settings.json:

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3:32b",
        "name": "Qwen3 32B (Ollama)",
        "baseUrl": "http://localhost:11434/v1",
        "description": "Qwen3 32B running locally via Ollama"
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3:32b"
  }
}

vLLM setup

Install vLLM: pip install vllm
Start the server: vllm serve Qwen/Qwen3-32B
Configure ~/.qwen/settings.json:

{
  "modelProviders": {
    "openai": [
      {
        "id": "Qwen/Qwen3-32B",
        "name": "Qwen3 32B (vLLM)",
        "baseUrl": "http://localhost:8000/v1",
        "description": "Qwen3 32B running locally via vLLM"
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "Qwen/Qwen3-32B"
  }
}
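
The Ollama and vLLM configurations differ only in the model ID, display name, and base URL. As a minimal sketch, a small Python helper can generate either one; `local_settings` is a hypothetical name, and the output simply reproduces the structure of the two examples above without the optional `description` field.

```python
import json


def local_settings(model_id, base_url, display_name):
    """Build a settings dict for a local OpenAI-compatible endpoint."""
    return {
        "modelProviders": {
            "openai": [
                # No envKey: the local examples above do not use an API key.
                {"id": model_id, "name": display_name, "baseUrl": base_url}
            ]
        },
        "security": {"auth": {"selectedType": "openai"}},
        "model": {"name": model_id},
    }


# Print the Ollama variant; swap in the vLLM values to get the other example.
print(json.dumps(
    local_settings("qwen3:32b", "http://localhost:11434/v1", "Qwen3 32B (Ollama)"),
    indent=2,
))
```

The printed JSON can be saved as `~/.qwen/settings.json`. If that file already holds other providers, merge the new entry by hand rather than overwriting it.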

Usage

Qwen Code is an open-source terminal agent that you can use in four primary ways:

Interactive mode (terminal UI)
Headless mode (scripts, CI)
IDE integration (VS Code, Zed, JetBrains IDEs)
TypeScript SDK

Interactive mode

cd your-project/
qwen

Run qwen in your project folder to launch the interactive terminal UI. Use @ to reference local files (for example @src/main.ts).

Headless mode

cd your-project/
qwen -p "your question"

Use -p to run Qwen Code without the interactive UI—ideal for scripts, automation, and CI/CD. Learn more: Headless mode.
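
As a rough sketch of how headless mode might be driven from a script, the Python below wraps the `qwen -p` invocation shown above. The function names are hypothetical, and the code assumes that `qwen` is on `PATH` and that the answer is written to standard output; only the `-p` flag is taken from this README.

```python
import subprocess


def headless_command(prompt):
    """Return the argument list for a single headless invocation."""
    return ["qwen", "-p", prompt]


def run_headless(prompt, project_dir):
    """Run Qwen Code once in project_dir and return whatever it printed.

    Raises subprocess.CalledProcessError if the CLI exits with a non-zero code.
    """
    result = subprocess.run(
        headless_command(prompt),
        cwd=project_dir,       # equivalent to `cd your-project/` first
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the prompt as a separate list element avoids shell quoting problems when the question contains quotes or other special characters.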

IDE integration

Use Qwen Code inside your editor (VS Code, Zed, and JetBrains IDEs).

TypeScript SDK

Build on top of Qwen Code with the TypeScript SDK:

Use the Qwen Code SDK

Commands & Shortcuts

Session Commands

/help - Display available commands
/clear - Clear conversation history
/compress - Compress history to save tokens
/stats - Show current session information
/bug - Submit a bug report
/exit or /quit - Exit Qwen Code

Keyboard Shortcuts

Ctrl+C - Cancel current operation
Ctrl+D - Exit (on empty line)
Up/Down - Navigate command history

Learn more about Commands

Tip: In YOLO mode (--yolo), vision switching happens automatically without prompts when images are detected. Learn more about Approval Mode

Configuration

Qwen Code can be configured via settings.json, environment variables, and CLI flags.

File	Scope	Description
`~/.qwen/settings.json`	User (global)	Applies to all your Qwen Code sessions. Recommended for `modelProviders` and `env`.
`.qwen/settings.json`	Project	Applies only when running Qwen Code in this project. Overrides user settings.
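
One way to picture "project overrides user" is a merge in which project values win. The sketch below is an assumption about the behavior, not Qwen Code's actual merge logic: it merges nested objects key by key and replaces everything else, including lists such as the entries under `modelProviders`, wholesale. The name `merge_settings` is hypothetical.

```python
def merge_settings(user, project):
    """Return a new dict where project values override user values."""
    merged = dict(user)
    for key, value in project.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them key by key.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalars and lists from the project file replace the user value.
            merged[key] = value
    return merged
```

With this model, a project file containing only `{"model": {"name": "gpt-4o"}}` changes the default model while leaving the user's `modelProviders` and `env` untouched.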

The most commonly used top-level fields in settings.json:

Field	Description
`modelProviders`	Define available models per protocol (`openai`, `anthropic`, `gemini`, `vertex-ai`).
`env`	Fallback environment variables (e.g. API keys). Lower priority than shell `export` and `.env` files.
`security.auth.selectedType`	The protocol to use on startup (e.g. `openai`).
`model.name`	The default model to use when Qwen Code starts.

See the Authentication section above for complete settings.json examples, and the settings reference for all available options.

Benchmark Results

Terminal-Bench Performance

Agent	Model	Accuracy
Qwen Code	Qwen3-Coder-480B-A35B	37.5%
Qwen Code	Qwen3-Coder-30B-A3B	31.3%

Ecosystem

Looking for a graphical interface?

AionUi: A modern GUI for command-line AI tools including Qwen Code
Gemini CLI Desktop: A cross-platform desktop/web/mobile UI for Qwen Code

Troubleshooting

If you encounter issues, check the troubleshooting guide.

Common issues:

Qwen OAuth free tier was discontinued on 2026-04-15: Qwen OAuth is no longer available. Run qwen → /auth and switch to API Key or Coding Plan. See the Authentication section above for setup instructions.

To report a bug from within the CLI, run /bug and include a short title and repro steps.

Acknowledgments

This project is based on Google Gemini CLI. We acknowledge and appreciate the excellent work of the Gemini CLI team. Our main contribution focuses on parser-level adaptations to better support Qwen-Coder models.