Shaojin Wen 7f5b92b076
feat(core): hint to background long-running foreground bash commands (#3809)
* feat(core): hint to background long-running foreground bash commands

Phase D part (a) of Issue #3634. When a foreground `shell` tool call
runs ≥ 60 seconds and completes (succeeds or errors), append an
advisory line to the LLM-facing tool result suggesting re-running with
`is_background: true` next time.

Why: today a foreground bash that takes minutes (build watcher, soak
test, slow npm install, polling loop) blocks the agent indefinitely.
The user is already paying for the wait; the agent's next turn could
have started running in parallel under `is_background: true`. Sleep
interception (#3684) handled the egregious `sleep N` case at validate
time; this handles the legitimate-but-long case at result time.

Trade-offs:
- Threshold = 60s. Half the existing 120s foreground timeout. Long
  enough that normal `npm install` / `pytest` runs don't trigger;
  short enough that the hint surfaces before the timeout hard-kills.
- Advisory only — the command still runs to completion in the
  foreground for THIS invocation. The advice is for the agent's NEXT
  decision, not a corrective action on the current one.
- Fires on success AND error completions. The advice is the same
  ("background it next time") in both cases.
- Suppressed on aborted (timeout / user-cancel) — those paths already
  surface their own messaging and don't benefit from a "should have
  been background" reminder when the user / system already killed it.

Implementation:
- New constant `LONG_RUNNING_FOREGROUND_THRESHOLD_MS = 60000` in
  shell.ts, paired with the existing `DEFAULT_FOREGROUND_TIMEOUT_MS`.
- Helper `buildLongRunningForegroundHint(elapsedMs)` exported so
  future surfaces (UI, telemetry) can render the same text without
  duplicating the threshold logic.
- `Date.now()` bracketing around the spawn → `await resultPromise`
  block — mirrors what the background path already captures via
  `entry.startTime`.
- Append happens inside the existing non-aborted result builder;
  zero changes to the cancel / timeout arms.

Tests: 4 new cases — fires on long success, omits on short success,
fires on long error completion, omits on aborted. Uses vi fake timers
to drive wall-clock past the threshold without actually sleeping.

* fix(core): tighten long-run hint suppression + boundary tests + post-truncation insertion

Addresses 8 review threads on PR #3809 — 6 from /review bots, 2 from
copilot — covering doc accuracy, code quality, behavioural gaps, and
test coverage.

**Behavioural fixes (real bugs)**:

- **Suppress on external signal kills** (`result.signal != null` with
  `aborted: false`). `shellExecutionService` only sets `aborted` when
  the AbortSignal we passed was triggered, so SIGTERM from container
  shutdown / k8s eviction / OOM killer / sibling process-group reap
  falls through to the non-aborted branch. The advisory shouldn't fire
  there — the process didn't run to its conclusion, so "next time,
  background it" doesn't fit. New test pins this with `signal: 15`
  (SIGTERM), `aborted: false`.

- **Append AFTER `truncateToolOutput`**. Previously the hint was
  appended inside the non-aborted result builder, which meant for
  long outputs it got wrapped in the "Truncated part of the output:"
  envelope — the LLM might read the advisory as part of the command's
  own output. New post-truncation insertion + test that pins ordering
  by mocking `truncateToolOutput` directly (real path needs
  `fs.writeFile` to actually succeed for the replacement branch to
  fire).

- **Hint wording mode-aware**. The dialog mention dropped the
  unconditional "(footer pill + Enter)" specifics, which would mislead
  non-TTY users (`-p` headless / ACP / SDK consumers — no dialog or
  pill exists there). Now qualified as "in interactive mode the
  Background tasks dialog also has...". `/tasks` and the on-disk
  output file are mentioned without qualifier (work in any mode).
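
The suppression rules in this round can be read as a single predicate. The following is a minimal sketch, not the code in shell.ts: `FinishedCommand` and `shouldAppendLongRunHint` are hypothetical names, and only the `aborted` / `signal` fields and the inclusive `>=` comparison are taken from the description above.

```typescript
// Hypothetical shape: only `aborted` and `signal` come from the text above.
interface FinishedCommand {
  aborted: boolean; // true only when our own AbortSignal fired (timeout / user cancel)
  signal: number | null; // e.g. 15 for a SIGTERM delivered by something external
}

// Returns true when the advisory should be appended to the tool result.
function shouldAppendLongRunHint(
  result: FinishedCommand,
  elapsedMs: number,
  thresholdMs: number,
): boolean {
  if (result.aborted) return false; // timeout and user-cancel carry their own messaging
  if (result.signal !== null) return false; // killed externally, never ran to completion
  return elapsedMs >= thresholdMs; // inclusive: exactly the threshold fires
}

console.log(shouldAppendLongRunHint({ aborted: false, signal: null }, 60_000, 60_000)); // true
console.log(shouldAppendLongRunHint({ aborted: false, signal: 15 }, 90_000, 60_000)); // false
```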

**Code quality**:

- **Threshold programmatically coupled to timeout**:
  `LONG_RUNNING_FOREGROUND_THRESHOLD_MS = Math.floor(DEFAULT_FOREGROUND_TIMEOUT_MS / 2)`.
  If the timeout is tuned later, the threshold tracks automatically.

- **Docstring corrected**: removed the misleading "before it gets
  killed by the timeout" claim — the hint is on non-aborted path
  only, so timeout-killed commands never see it. The new docstring
  enumerates all suppression paths explicitly.

- **Removed stale line-number reference**: comment said "mirrors the
  background path's `entry.startTime` capture (line ~781)" which goes
  stale on file edits. Now refers conceptually.

**Test coverage gaps closed**:

- **Off-by-one boundary**: 59_999ms → no hint. Pairs with the existing
  60_000ms-exactly test (which fires) to pin the boundary tightly. A
  regression flipping `>=` to `>` would fail loudly.

- **Timeout path explicit**: previous "aborted" test exercised user-
  cancel only. With `vi.useFakeTimers({ toFake: ['Date'] })`,
  `AbortSignal.timeout()` doesn't fake (it depends on the real timer
  subsystem), so `combinedSignal.aborted` stayed false. New test
  follows the pre-existing `should handle timeout vs user cancellation
  correctly` pattern: stubs `AbortSignal.timeout` + `.any` to return
  an already-aborted combined signal, then verifies "Command timed out
  after Nms" appears AND no advisory.

* fix(core): per-invocation long-run threshold + debug-mode + test isolation

Six suggestions from /review's third pass on PR #3809:

**Real semantic fix**:
- Long-run threshold now scales with the EFFECTIVE timeout, not the
  fixed default. A user who sets `timeout: 600_000` (10 min) gets the
  advisory at 5 min, not at 60s — respects the explicit timeout
  intent. Replaced the `LONG_RUNNING_FOREGROUND_THRESHOLD_MS` constant
  with a per-invocation `longRunThresholdFor(effectiveTimeout)` helper.
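
As a rough illustration of the per-invocation coupling, the sketch below re-creates the helper named above using the halving rule from the earlier round. The body is an assumption (the version in shell.ts may differ), and `DEFAULT_TIMEOUT_MS` / `CUSTOM_TIMEOUT_MS` are illustrative names.

```typescript
// Assumed body: half of whatever timeout is in effect for this invocation.
function longRunThresholdFor(effectiveTimeoutMs: number): number {
  return Math.floor(effectiveTimeoutMs / 2);
}

const DEFAULT_TIMEOUT_MS = 120_000; // the 120s foreground default mentioned earlier
const CUSTOM_TIMEOUT_MS = 600_000; // a user-supplied `timeout: 600_000`

console.log(longRunThresholdFor(DEFAULT_TIMEOUT_MS)); // 60000
console.log(longRunThresholdFor(CUSTOM_TIMEOUT_MS)); // 300000

// A 100s run under the 10 min timeout stays below its threshold, so no hint.
console.log(100_000 >= longRunThresholdFor(CUSTOM_TIMEOUT_MS)); // false
```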

**Debug-mode visibility**:
- Debug mode previously snapshotted `returnDisplayMessage = llmContent`
  BEFORE the truncation + hint append, so debug-mode users saw the
  pre-hint content while the agent saw the advisory — agent suddenly
  suggesting `is_background: true` had no visible trigger in the TUI.
  Re-sync `returnDisplayMessage` after the hint append (debug-mode
  branch only) so the TUI mirrors what the agent sees.

**Type-safety footgun**:
- `if (typeof llmContent === 'string')` would silently drop the hint
  if `llmContent` ever becomes structured `Part[]`. Added an explicit
  `else` comment documenting the deliberate omission and the conditions
  under which to revisit (no non-string llmContent path exists today).

**Style**:
- Replaced the JSDoc `/** ... */` block on the (now-defunct) constant
  with a plain `//` comment block on the helper, matching the
  `DEFAULT_FOREGROUND_TIMEOUT_MS` / `OUTPUT_UPDATE_INTERVAL_MS` style.

**Test hygiene**:
- Wrapped both `vi.stubGlobal('AbortSignal', ...)` and
  `vi.spyOn(truncateToolOutput, ...)` in `try/finally` so failures
  during the test body don't leak the stub/spy into subsequent tests
  (would cause confusing cascading failures).
- Dropped the internal-roadmap "Phase D part (a)" reference from the
  test comment — future maintainers don't have the context.

**New test**:
- `threshold scales with the user-supplied timeout (not the default)`:
  sets `timeout: 600_000`, advances 100s, verifies no hint. Pins the
  per-invocation coupling so a regression to a fixed constant would
  fail loudly here.

* fix(core): tighten long-run hint suppression + boundary tests + post-truncation insertion (round 4)

Six suggestions from /review's pai/glm-5-fp8 pass on PR #3809:

**Behavioural / UX**:
- **Hint now visible in non-debug TUI too.** Previously only debug
  mode mirrored the hint into `returnDisplay`; non-debug users saw
  the agent suggest `is_background: true` with no visible trigger.
  Now the hint is appended to `returnDisplayMessage` in both modes
  (full mirror in debug, terse-append in non-debug to preserve the
  output-or-status form).

**Test coverage**:
- **Debug-mode re-sync test added.** All other long-run hint tests
  run with `getDebugMode → false`; this one flips it to true and
  asserts the hint appears in `returnDisplay` too. Pins the re-sync
  so a regression that drops the debug branch would fail loudly.
- **Threshold-scaling positive test added.** The negative case
  (`timeout: 600_000`, advance 100s, no hint) was already pinned;
  paired now with the positive case (advance 305s, hint fires) so a
  regression to a fixed 60s threshold is caught at both ends.

**Style / consistency**:
- **`result.signal === null` (was `== null`).** Strict equality to
  match the rest of the file. The `signal` field is typed
  `number | null` so loose equality has identical semantics, but the
  inconsistency was noise.

**Doc clarity (timing semantics)**:
- **Comment explains why elapsedMs is computed BEFORE truncation.**
  Two reviewers disagreed on the timing — one read it as before
  truncation (correct, slightly under-reports), the other as after
  (incorrect read). The intent is to report the COMMAND's runtime,
  not the tool call's total time. Truncation is post-processing,
  not part of "agent blocking time", so excluding it is the right
  semantic. Inline comment now spells this out so future readers
  don't have to infer.
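
The ordering described here (elapsed time fixed when the command settles, truncation next, hint appended last) can be sketched as follows. `truncateForModel` is a simplified stand-in for `truncateToolOutput`, and `buildModelContent` and the advisory wording are hypothetical.

```typescript
// Stand-in for truncateToolOutput; the real function also saves the full
// output to disk, which is omitted here.
function truncateForModel(output: string, maxChars: number): string {
  if (output.length <= maxChars) return output;
  return `Truncated part of the output:\n${output.slice(-maxChars)}`;
}

function buildModelContent(
  rawOutput: string,
  elapsedMs: number, // measured when the command settled, before post-processing
  thresholdMs: number,
  maxChars: number,
): string {
  let content = truncateForModel(rawOutput, maxChars);
  if (elapsedMs >= thresholdMs) {
    // Appended last so the advisory sits outside the truncation envelope.
    const seconds = Math.round(elapsedMs / 1000);
    content += `\n\n[advisory] Command ran for ${seconds}s in the foreground.`;
  }
  return content;
}

console.log(buildModelContent('x'.repeat(50), 75_000, 60_000, 10));
```

Because `elapsedMs` is passed in rather than measured inside the builder, the reported duration cannot include truncation time.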

* fix(core): error-path hint surfacing + clock-resilient elapsed + threshold floor + observability

Round 5 of PR #3809 review — 10 threads, mix of Critical and Suggestion:

**Critical fixes**:

1. **Hint survives the error path** (`#OWbA`). When result.error is
   set, coreToolScheduler builds the model-facing functionResponse
   from `error.message` ONLY (not llmContent — see
   convertToFunctionResponse + the toolResult.error branch in
   scheduler:1648-1724). My hint was being silently dropped on
   long-command-failed cases. Now the hint is appended to
   error.message too so the advisory survives whichever branch the
   scheduler takes.

2. **Hint wording disambiguated** (`#OU6o`). "prefer re-running with
   is_background: true" was ambiguous — model could read it as
   "re-run THIS command in the background", which on stateful
   commands (DB migrations, deploys, git push) would cause double
   side effects. Reworded to "Next time you run a SIMILAR
   long-running process..." with an explicit parenthetical that
   warns against re-running the just-completed command.

3. **Debug observability** (`#OU6s`). Added `debugLogger.debug` at
   the hint decision point with elapsedMs / threshold / aborted /
   signal — when a user reports "my 65s command didn't get the
   hint" the suppression branch is now visible in DEBUG output.

**Other behaviour fixes**:

4. **Threshold floor of 1000ms** (`#OU6r`). Pathological
   `timeout: 0` / `timeout: 1` would have given a 0-ms threshold,
   firing the hint on every invocation showing "ran for 0s".
   Floor at 1s makes that branch unreachable.

5. **`performance.now()` instead of `Date.now()`** (`#OU6v`). NTP
   corrections / VM clock drift between capture and read would
   silently make `elapsedMs` negative and skip the hint with no
   observable failure. Monotonic clock prevents that.

6. **Debug mode preserves truncation marker** (`#OU6w` / `#OWCq`).
   Previously `returnDisplayMessage = llmContent` after hint
   clobbered the "Output too long and was saved to: …" line
   appended during truncation. Switched to append-style re-sync in
   BOTH modes so prior content is preserved.
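
To illustrate why item 5 prefers a monotonic clock, here is a small sketch with an injectable clock. `startStopwatch` and `Clock` are hypothetical names, and the backwards step simulates a wall-clock correction rather than reproducing a real NTP event.

```typescript
import { performance } from 'node:perf_hooks';

type Clock = () => number; // returns milliseconds

// Starts timing and returns a function that reports elapsed milliseconds.
function startStopwatch(clock: Clock = () => performance.now()): () => number {
  const startedAt = clock();
  return () => clock() - startedAt;
}

// A wall clock that is stepped back 5s mid-run, as a clock correction might do.
let wallClockMs = 1_000_000;
const wallElapsed = startStopwatch(() => wallClockMs);
wallClockMs -= 5_000;
console.log(wallElapsed()); // -5000: below any threshold, so the hint would be skipped

// The monotonic default never goes backwards.
const monoElapsed = startStopwatch();
console.log(monoElapsed() >= 0); // true
```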

**Test coverage gaps closed**:

7. **Non-debug returnDisplay test** (`#OWCo`). Pinned that the
   user TUI gets the hint in the default (non-debug) mode too.

8. **Test rename** (`#OWCl`). The "debug-mode TUI mirror" test
   passed in non-debug too after the recent refactor; split into
   two tests, one per branch.

9. **Error-path hint test**. Added a test that pins `result.error?.message`
   contains both the original error text AND the hint, covering
   the scheduler-routing-via-error.message path that was silently
   broken before fix #1.

10. **Test: fake timers also fake `performance`**. Since we
    switched to `performance.now()`, `vi.useFakeTimers({ toFake:
    ['Date'] })` no longer covered the elapsed measurement;
    extended to `['Date', 'performance']` so the threshold tests
    can drive the wall-clock with `advanceTimersByTimeAsync`.

#OU6t (else-comment for the type guard) was already addressed in
the prior round — the explicit else-with-comment is in place;
adding logging there would be noise.

* test(core): cover the MIN_LONG_RUN_THRESHOLD_MS floor branch

PR #3809 review: the new `Math.max(MIN_LONG_RUN_THRESHOLD_MS, ...)`
floor in `longRunThresholdFor` was untested — only default-timeout
and large-custom-timeout cases existed. A regression that strips the
floor would let `timeout: 1` produce a 0ms threshold and fire a
"ran for 0s" advisory on every invocation; the test suite would not
catch it.

New test: build with `timeout: 1`, advance 500ms (below the 1000ms
floor), resolve with `aborted: false` to isolate the threshold logic
from the abort path. Asserts no hint appears. A regression that
removes the floor flips the assertion to fail.
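
A sketch of the floored helper, assuming the `Math.max(MIN_LONG_RUN_THRESHOLD_MS, ...)` form quoted above wraps the halving rule from the earlier rounds. The exact body in shell.ts may differ.

```typescript
const MIN_LONG_RUN_THRESHOLD_MS = 1_000; // the 1s floor described above

// Assumed body: half the effective timeout, but never below the floor.
function longRunThresholdFor(effectiveTimeoutMs: number): number {
  return Math.max(MIN_LONG_RUN_THRESHOLD_MS, Math.floor(effectiveTimeoutMs / 2));
}

console.log(longRunThresholdFor(1)); // 1000 (would be 0 without the floor)
console.log(longRunThresholdFor(120_000)); // 60000
console.log(500 >= longRunThresholdFor(1)); // false: a 500ms run gets no hint
```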

* fix(core): structured delimiter on error.message hint + tighten timeout floor comment

Two of three threads from the latest /review pass on PR #3809 (the
third — PR description / threshold scaling reconciliation — is fixed
in the PR description update, not in code):

- **`\n---\n` divider before hint in `error.message`** (`#Pt7C`).
  Downstream consumers of `error.message` (firePostToolUseFailureHook,
  telemetry grouping, SIEM alerting, hook-side error parsers) were
  receiving ~400 chars of advisory text mixed inline with the
  original error body — pattern-matching on error messages would
  absorb the advisory into the matched body. Added a `---` separator
  line so the boundary is unambiguous and split-able.

- **Threshold-floor comment narrowed to `timeout: 1`** (`#Pu9o`).
  The comment said the floor guards `timeout: 0` / `timeout: 1`, but
  `validateToolParamValues` rejects `timeout <= 0` at validate time,
  so `timeout: 0` can't reach `longRunThresholdFor`. Updated the
  comment to mention only the actually-allowed pathological case
  (`timeout: 1`, or any value `< 2`, which halves and floors to 0).

Test updated to assert the `---` divider format with `toMatch`.
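
For consumers of `error.message`, the divider makes the advisory separable again. The helpers below are a hypothetical sketch: only the `\n---\n` delimiter comes from the change above, and the split assumes the hint text does not itself contain that delimiter.

```typescript
const HINT_DIVIDER = '\n---\n';

// Producer side: keep the original error body first, advisory after the divider.
function appendHintToErrorMessage(errorMessage: string, hint: string): string {
  return `${errorMessage}${HINT_DIVIDER}${hint}`;
}

// Consumer side: recover the original error body for pattern matching.
function splitErrorAndHint(message: string): { error: string; hint: string | null } {
  const at = message.lastIndexOf(HINT_DIVIDER);
  if (at === -1) return { error: message, hint: null };
  return {
    error: message.slice(0, at),
    hint: message.slice(at + HINT_DIVIDER.length),
  };
}

const combined = appendHintToErrorMessage(
  'PTY initialization failed after 75s',
  'Next time you run a similar long-running process, consider is_background: true.',
);
console.log(splitErrorAndHint(combined).error); // PTY initialization failed after 75s
```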

* fix(core): capture executionStartTime AFTER spawn so PTY import isn't counted

PR #3809 review: copilot caught that `executionStartTime` was
captured BEFORE `await ShellExecutionService.execute(...)`, which
meant the elapsed measurement included `getPty()` dynamic-import
setup (~50-200ms on first call). The hint's "ran for Xs" reading was
slightly inflated, and the comment claiming "spawn → settle" wasn't
strictly accurate.

Moved the capture immediately after the execute() call returns its
{ result, pid } handle. The pid being set by that point confirms the
process has been spawned, so the subtraction is true post-spawn-to-
settle. Comment updated to reflect the actual semantics.

The displayed accuracy gain is small (50-200ms on a 60s+ threshold
is <1%), but the comment claim now matches what the code measures.
Tests unaffected — fakeTimers don't drive real dynamic imports, so
the threshold tests behave identically.

* fix(core): align long-run hint code/tests with ShellExecutionResult.error semantics

Four copilot threads on PR #3809 — all rooted in the same
observation: `ShellExecutionResult.error` is reserved for
spawn/setup failures (per the field's doc comment in
shellExecutionService.ts), NOT for non-zero exit codes. My existing
code/tests conflated the two, making the error-path coverage less
realistic and the inline comments inaccurate.

**Test shape fixes**:

- `appends the hint when a long-running foreground command exits
  with error` → `exits non-zero`. Changed `error: new Error('exit
  1')` to `error: null` (the realistic shape for a non-zero exit
  without spawn failure). Added a comment explaining the field
  contract so future test authors don't repeat the conflation.

- `hint survives the error path (appended to error.message)`:
  reframed the mock from `spawn ENOENT` (which would resolve in
  <1s in practice, making the long-elapsed scenario unrealistic)
  to `PTY initialization failed after 75s` — a slow-spawn-failure
  shape that COULD plausibly take 75s. Test still pins the same
  CODE PATH; comment now acknowledges the edge-case nature
  ("rare but real: PTY init dragging, remote-fs exec syscalls,
  security scanners interposing").

**Comment corrections**:

- `returnDisplayMessage` build-order comment was misleading. It
  said "the hint is appended after both the truncation block and
  the returnDisplayMessage build" — but `returnDisplayMessage` is
  built BEFORE truncation. Replaced with a chronological enumeration
  (1. initial value, 2. truncation marker append, 3. hint append)
  that matches what the code actually does.

- Error-path preservation comment now acknowledges the narrow
  applicability (spawn failures only, exit codes don't reach this
  branch). Code is unchanged — the path is still real, just rare.

* test(core): pin empty-output success + background-no-hint paths

Two defensive tests for the long-running foreground hint:

- empty-output success at >=60s — exercises the
  returnDisplayMessage='' → hint append branch (write-only commands
  like `tar czf` / `cp -r` produce no stdout). Asserts the user-
  facing returnDisplay still surfaces the advisory even when the
  command produced nothing else to show.

- background never includes the hint — the foreground hint logic
  lives in executeForeground only, so today this can't fail; the
  test guards against a future refactor hoisting the advisory into
  a shared post-execute path that would tag every background launch
  with a nonsensical "ran for 0s, consider is_background: true"
  suggestion.
2026-05-04 23:24:14 +08:00
.github feat(sdk-python): add PyPI release workflow (#3685) 2026-05-04 21:07:21 +08:00
.husky Sync upstream Gemini-CLI v0.8.2 (#838) 2025-10-23 09:27:04 +08:00
.qwen feat(skills): add tmux-real-user-testing skill for readable TUI test logs (#3577) 2026-04-29 11:19:00 +08:00
.vscode Merge branch 'main' into feat/sandbox-config-improvements 2026-03-06 14:38:39 +08:00
docs feat(core): support reasoning effort 'max' tier (DeepSeek extension) (#3800) 2026-05-04 22:42:23 +08:00
docs-site feat: update docs 2025-12-15 09:47:03 +08:00
eslint-rules pre-release commit 2025-07-22 23:26:01 +08:00
integration-tests fix(test): restore abort-and-lifecycle stdin-close test to pre-#3723 version (#3777) 2026-05-02 21:39:43 +08:00
packages feat(core): hint to background long-running foreground bash commands (#3809) 2026-05-04 23:24:14 +08:00
scripts feat(sdk-python): add PyPI release workflow (#3685) 2026-05-04 21:07:21 +08:00
.dockerignore fix(cli): skip stdin read for ACP mode 2026-03-27 11:47:01 +00:00
.editorconfig pre-release commit 2025-07-22 23:26:01 +08:00
.gitattributes pre-release commit 2025-07-22 23:26:01 +08:00
.gitignore Feat/openrouter auth (#3576) 2026-04-27 14:47:44 +08:00
.npmrc chore: remove google registry 2025-08-08 20:45:54 +08:00
.nvmrc chore: Expand node version test matrix (#2700) 2025-07-21 16:33:54 -07:00
.prettierignore Merge branch 'main' into feat/add-vscode-settings-json-schema 2026-03-03 11:21:57 +08:00
.prettierrc.json pre-release commit 2025-07-22 23:26:01 +08:00
.yamllint.yml Sync upstream Gemini-CLI v0.8.2 (#838) 2025-10-23 09:27:04 +08:00
AGENTS.md feat(vscode): expose /skills as slash command with secondary picker (#2548) 2026-04-24 23:28:53 +08:00
CONTRIBUTING.md docs: add Screenshots/Video Demo section to PR template 2026-03-20 16:59:53 +08:00
Dockerfile refactor: Extract web-templates package and unify build/pack workflow 2026-02-26 21:02:46 +08:00
esbuild.config.js feat: add wasm build config (#2985) 2026-04-09 14:21:00 +08:00
eslint.config.js feat(cli): add API preconnect to reduce first-call latency (#3318) 2026-04-27 06:54:55 +08:00
LICENSE Sync upstream Gemini-CLI v0.8.2 (#838) 2025-10-23 09:27:04 +08:00
Makefile feat: update docs 2025-12-22 21:11:33 +08:00
package-lock.json feat(sdk-python): add PyPI release workflow (#3685) 2026-05-04 21:07:21 +08:00
package.json feat(sdk-python): add PyPI release workflow (#3685) 2026-05-04 21:07:21 +08:00
README.md feat(SDK) Add Python SDK implementation for #3010 (#3494) 2026-04-25 07:02:58 +08:00
SECURITY.md fix: update security vulnerability reporting channel 2026-02-24 14:22:47 +08:00
tsconfig.json # 🚀 Sync Gemini CLI v0.2.1 - Major Feature Update (#483) 2025-09-01 14:48:55 +08:00
vitest.config.ts test(channels): add comprehensive test suites for channel adapters 2026-03-27 15:26:39 +00:00

An open-source AI agent that lives in your terminal.

中文 | Deutsch | français | 日本語 | Русский | Português (Brasil)

🎉 News

  • 2026-04-15: Qwen OAuth free tier has been discontinued. To continue using Qwen Code, switch to Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key. Run qwen auth to configure.

  • 2026-04-13: Qwen OAuth free tier policy update: daily quota adjusted to 100 requests/day (from 1,000).

  • 2026-04-02: Qwen3.6-Plus is now live! Get an API key from Alibaba Cloud ModelStudio to access it through the OpenAI-compatible API.

  • 2026-02-16: Qwen3.5-Plus is now live!

Why Qwen Code?

Qwen Code is an open-source AI agent for the terminal, optimized for Qwen series models. It helps you understand large codebases, automate tedious work, and ship faster.

  • Multi-protocol, flexible providers: use OpenAI / Anthropic / Gemini-compatible APIs, Alibaba Cloud Coding Plan, OpenRouter, Fireworks AI, or bring your own API key.
  • Open-source, co-evolving: both the framework and the Qwen3-Coder model are open-source—and they ship and evolve together.
  • Agentic workflow, feature-rich: rich built-in tools (Skills, SubAgents) for a full agentic workflow and a Claude Code-like experience.
  • Terminal-first, IDE-friendly: built for developers who live in the command line, with optional integration for VS Code, Zed, and JetBrains IDEs.

Installation

Linux / macOS

bash -c "$(curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh)"

Windows (Run as Administrator)

Works in both Command Prompt and PowerShell:

powershell -Command "Invoke-WebRequest 'https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.bat' -OutFile (Join-Path $env:TEMP 'install-qwen.bat'); & (Join-Path $env:TEMP 'install-qwen.bat')"

Note: It's recommended to restart your terminal after installation to ensure environment variables take effect.

Manual Installation

Prerequisites

Make sure you have Node.js 20 or later installed. Download it from nodejs.org.

NPM

npm install -g @qwen-code/qwen-code@latest

Homebrew (macOS, Linux)

brew install qwen-code

Quick Start

# Start Qwen Code (interactive)
qwen

# Then, in the session:
/help
/auth

On first use, you'll be prompted to sign in. You can run /auth anytime to switch authentication methods.

Example prompts:

What does this project do?
Explain the codebase structure.
Help me refactor this function.
Generate unit tests for this module.

🦞 Use Qwen Code for Coding Tasks in Claw

Copy the prompt below and paste it into your agent:

Use Qwen Code as a coding agent. First, fetch and read this skill reference:
https://raw.githubusercontent.com/QwenLM/qwen-code/refs/heads/main/.qwen/skills/qwen-code-claw/SKILL.md
Then follow the instructions inside to install, authenticate, and use Qwen Code via acpx for all coding tasks.

Authentication

Qwen Code supports the following authentication methods:

  • API Key (recommended): use an API key from Alibaba Cloud Model Studio (Beijing / intl) or any supported provider (OpenAI, Anthropic, Google GenAI, and other compatible endpoints).
  • Coding Plan: subscribe to the Alibaba Cloud Coding Plan (Beijing / intl) for a fixed monthly fee with higher quotas.

⚠️ Qwen OAuth was discontinued on April 15, 2026. If you were previously using Qwen OAuth, please switch to one of the methods above. Run qwen and then /auth to reconfigure.

Use an API key to connect to Alibaba Cloud Model Studio or any supported provider. Supports multiple protocols:

  • OpenAI-compatible: Alibaba Cloud ModelStudio, ModelScope, OpenAI, OpenRouter, and other OpenAI-compatible providers
  • Anthropic: Claude models
  • Google GenAI: Gemini models

The recommended way to configure models and providers is by editing ~/.qwen/settings.json (create it if it doesn't exist). This file lets you define all available models, API keys, and default settings in one place.

Quick Setup in 3 Steps

Step 1: Create or edit ~/.qwen/settings.json

Here is a complete example:

{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "description": "Qwen3-Coder via Dashscope",
        "envKey": "DASHSCOPE_API_KEY"
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Step 2: Understand each field

| Field | What it does |
| --- | --- |
| modelProviders | Declares which models are available and how to connect to them. Keys like openai, anthropic, gemini represent the API protocol. |
| modelProviders[].id | The model ID sent to the API (e.g. qwen3.6-plus, gpt-4o). |
| modelProviders[].envKey | The name of the environment variable that holds your API key. |
| modelProviders[].baseUrl | The API endpoint URL (required for non-default endpoints). |
| env | A fallback place to store API keys (lowest priority; prefer .env files or export for sensitive keys). |
| security.auth.selectedType | The protocol to use on startup (openai, anthropic, gemini, vertex-ai). |
| model.name | The default model to use when Qwen Code starts. |
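
For readers who prefer types to tables, the fields above can be sketched as TypeScript interfaces. This is an illustration rather than the project's actual schema: which fields are optional is a guess, and `findDefaultModel` is a hypothetical helper that assumes `model.name` refers to an entry's `id` under the selected protocol (which is how the examples in this README line up).

```typescript
// Field names come from the table above; optionality is an assumption.
interface ModelEntry {
  id: string;
  name?: string;
  baseUrl?: string;
  description?: string;
  envKey?: string;
}

interface QwenSettings {
  modelProviders: Record<string, ModelEntry[]>;
  env?: Record<string, string>;
  security?: { auth?: { selectedType?: string } };
  model?: { name?: string };
}

// Hypothetical helper: look up the entry that `model.name` points at.
function findDefaultModel(settings: QwenSettings): ModelEntry | undefined {
  const protocol = settings.security?.auth?.selectedType;
  const wanted = settings.model?.name;
  if (!protocol || !wanted) return undefined;
  return settings.modelProviders[protocol]?.find((m) => m.id === wanted);
}

const exampleSettings: QwenSettings = JSON.parse(`{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "envKey": "DASHSCOPE_API_KEY"
      }
    ]
  },
  "security": { "auth": { "selectedType": "openai" } },
  "model": { "name": "qwen3.6-plus" }
}`);

console.log(findDefaultModel(exampleSettings)?.envKey); // DASHSCOPE_API_KEY
```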

Step 3: Start Qwen Code — your configuration takes effect automatically:

qwen

Use the /model command at any time to switch between all configured models.

More Examples
Coding Plan (Alibaba Cloud ModelStudio) — fixed monthly fee, higher quotas
{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.6-plus",
        "name": "qwen3.6-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.6-plus from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY"
      },
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "qwen3.5-plus with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "glm-4.7",
        "name": "glm-4.7 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "glm-4.7 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      },
      {
        "id": "kimi-k2.5",
        "name": "kimi-k2.5 (Coding Plan)",
        "baseUrl": "https://coding.dashscope.aliyuncs.com/v1",
        "description": "kimi-k2.5 with thinking enabled from ModelStudio Coding Plan",
        "envKey": "BAILIAN_CODING_PLAN_API_KEY",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "BAILIAN_CODING_PLAN_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.6-plus"
  }
}

Subscribe to the Coding Plan and get your API key at Alibaba Cloud ModelStudio (Beijing) or Alibaba Cloud ModelStudio (intl).

Multiple providers (OpenAI + Anthropic + Gemini)
{
  "modelProviders": {
    "openai": [
      {
        "id": "gpt-4o",
        "name": "GPT-4o",
        "envKey": "OPENAI_API_KEY",
        "baseUrl": "https://api.openai.com/v1"
      }
    ],
    "anthropic": [
      {
        "id": "claude-sonnet-4-20250514",
        "name": "Claude Sonnet 4",
        "envKey": "ANTHROPIC_API_KEY"
      }
    ],
    "gemini": [
      {
        "id": "gemini-2.5-pro",
        "name": "Gemini 2.5 Pro",
        "envKey": "GEMINI_API_KEY"
      }
    ]
  },
  "env": {
    "OPENAI_API_KEY": "sk-xxxxxxxxxxxxx",
    "ANTHROPIC_API_KEY": "sk-ant-xxxxxxxxxxxxx",
    "GEMINI_API_KEY": "AIzaxxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "gpt-4o"
  }
}
Enable thinking mode (for supported models like qwen3.5-plus)
{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3.5-plus",
        "name": "qwen3.5-plus (thinking)",
        "envKey": "DASHSCOPE_API_KEY",
        "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "generationConfig": {
          "extra_body": {
            "enable_thinking": true
          }
        }
      }
    ]
  },
  "env": {
    "DASHSCOPE_API_KEY": "sk-xxxxxxxxxxxxx"
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3.5-plus"
  }
}

Tip: You can also set API keys via export in your shell or .env files, which take higher priority than the env field in settings.json. See the authentication guide for full details.
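
The priority order in the tip can be pictured with a small lookup. This is a sketch of the documented precedence only, not Qwen Code's implementation: `resolveApiKey` is a hypothetical name, and `processEnv` stands for the shell environment after any .env files have been loaded into it.

```typescript
type EnvMap = Record<string, string | undefined>;

// The process environment wins; the `env` block of settings.json is the fallback.
function resolveApiKey(
  envKey: string,
  processEnv: EnvMap,
  settingsEnv: EnvMap = {},
): string | undefined {
  return processEnv[envKey] ?? settingsEnv[envKey];
}

const settingsEnv = { DASHSCOPE_API_KEY: 'sk-from-settings' };

console.log(resolveApiKey('DASHSCOPE_API_KEY', { DASHSCOPE_API_KEY: 'sk-from-shell' }, settingsEnv)); // sk-from-shell
console.log(resolveApiKey('DASHSCOPE_API_KEY', {}, settingsEnv)); // sk-from-settings
```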

Security note: Never commit API keys to version control. The ~/.qwen/settings.json file is in your home directory and should stay private.

Local Model Setup (Ollama / vLLM)

You can also run models locally — no API key or cloud account needed. This is not an authentication method; instead, configure your local model endpoint in ~/.qwen/settings.json using the modelProviders field.

Ollama setup
  1. Install Ollama from ollama.com
  2. Pull a model: ollama pull qwen3:32b
  3. Configure ~/.qwen/settings.json:
{
  "modelProviders": {
    "openai": [
      {
        "id": "qwen3:32b",
        "name": "Qwen3 32B (Ollama)",
        "baseUrl": "http://localhost:11434/v1",
        "description": "Qwen3 32B running locally via Ollama"
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "qwen3:32b"
  }
}
vLLM setup
  1. Install vLLM: pip install vllm
  2. Start the server: vllm serve Qwen/Qwen3-32B
  3. Configure ~/.qwen/settings.json:
{
  "modelProviders": {
    "openai": [
      {
        "id": "Qwen/Qwen3-32B",
        "name": "Qwen3 32B (vLLM)",
        "baseUrl": "http://localhost:8000/v1",
        "description": "Qwen3 32B running locally via vLLM"
      }
    ]
  },
  "security": {
    "auth": {
      "selectedType": "openai"
    }
  },
  "model": {
    "name": "Qwen/Qwen3-32B"
  }
}

Usage

Qwen Code is an open-source terminal agent that you can use in four primary ways:

  1. Interactive mode (terminal UI)
  2. Headless mode (scripts, CI)
  3. IDE integration (VS Code, Zed, JetBrains IDEs)
  4. SDKs (TypeScript, Python, Java)

Interactive mode

cd your-project/
qwen

Run qwen in your project folder to launch the interactive terminal UI. Use @ to reference local files (for example @src/main.ts).

Headless mode

cd your-project/
qwen -p "your question"

Use -p to run Qwen Code without the interactive UI—ideal for scripts, automation, and CI/CD. Learn more: Headless mode.
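
As one way to script headless mode, the sketch below calls the CLI from Node.js with `execFile`. It assumes `qwen` is on your PATH and already authenticated; `headlessArgs` and `askQwen` are hypothetical helper names, and `-p` is the only flag taken from this README.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Builds the argument list for a headless run.
function headlessArgs(prompt: string): string[] {
  return ['-p', prompt];
}

// Runs `qwen -p <prompt>` inside the given project directory and resolves
// with whatever the CLI wrote to stdout.
async function askQwen(prompt: string, cwd: string): Promise<string> {
  const { stdout } = await execFileAsync('qwen', headlessArgs(prompt), {
    cwd,
    encoding: 'utf8',
  });
  return stdout;
}

console.log(headlessArgs('What does this project do?'));
```

Passing the prompt as a separate argument (rather than building a shell string) avoids quoting problems when the prompt contains spaces or quotes.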

IDE integration

Use Qwen Code inside your editor (VS Code, Zed, and JetBrains IDEs):

SDKs

Build on top of Qwen Code with the available SDKs:

Python SDK example:

import asyncio

from qwen_code_sdk import is_sdk_result_message, query


async def main() -> None:
    result = query(
        "Summarize the repository layout.",
        {
            "cwd": "/path/to/project",
            "path_to_qwen_executable": "qwen",
        },
    )

    async for message in result:
        if is_sdk_result_message(message):
            print(message["result"])


asyncio.run(main())

Commands & Shortcuts

Session Commands

  • /help - Display available commands
  • /clear - Clear conversation history
  • /compress - Compress history to save tokens
  • /stats - Show current session information
  • /bug - Submit a bug report
  • /exit or /quit - Exit Qwen Code

Keyboard Shortcuts

  • Ctrl+C - Cancel current operation
  • Ctrl+D - Exit (on empty line)
  • Up/Down - Navigate command history

Learn more about Commands

Tip: In YOLO mode (--yolo), vision switching happens automatically without prompts when images are detected. Learn more about Approval Mode

Configuration

Qwen Code can be configured via settings.json, environment variables, and CLI flags.

| File | Scope | Description |
| --- | --- | --- |
| ~/.qwen/settings.json | User (global) | Applies to all your Qwen Code sessions. Recommended for modelProviders and env. |
| .qwen/settings.json | Project | Applies only when running Qwen Code in this project. Overrides user settings. |
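
To picture "project overrides user", here is a minimal merge sketch. The actual merge rules are not described in this README, so treat the details as assumptions: nested objects are merged key by key, while arrays and scalars from the project file replace the user value wholesale. `mergeSettings` is a hypothetical name.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

function isPlainObject(value: Json | undefined): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Project values win; nested objects are merged recursively (assumed behaviour).
function mergeSettings(user: JsonObject, project: JsonObject): JsonObject {
  const merged: JsonObject = { ...user };
  for (const [key, value] of Object.entries(project)) {
    const base = merged[key];
    merged[key] =
      isPlainObject(base) && isPlainObject(value) ? mergeSettings(base, value) : value;
  }
  return merged;
}

const userSettings: JsonObject = {
  model: { name: 'qwen3.6-plus' },
  security: { auth: { selectedType: 'openai' } },
};
const projectSettings: JsonObject = { model: { name: 'qwen3.5-plus' } };

console.log(JSON.stringify(mergeSettings(userSettings, projectSettings)));
```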

The most commonly used top-level fields in settings.json:

| Field | Description |
| --- | --- |
| modelProviders | Define available models per protocol (openai, anthropic, gemini, vertex-ai). |
| env | Fallback environment variables (e.g. API keys). Lower priority than shell export and .env files. |
| security.auth.selectedType | The protocol to use on startup (e.g. openai). |
| model.name | The default model to use when Qwen Code starts. |

See the Authentication section above for complete settings.json examples, and the settings reference for all available options.

Benchmark Results

Terminal-Bench Performance

| Agent | Model | Accuracy |
| --- | --- | --- |
| Qwen Code | Qwen3-Coder-480A35 | 37.5% |
| Qwen Code | Qwen3-Coder-30BA3B | 31.3% |

Ecosystem

Looking for a graphical interface?

  • AionUi: A modern GUI for command-line AI tools including Qwen Code
  • Gemini CLI Desktop: A cross-platform desktop/web/mobile UI for Qwen Code

Troubleshooting

If you encounter issues, check the troubleshooting guide.

Common issues:

  • Qwen OAuth free tier was discontinued on 2026-04-15: Qwen OAuth is no longer available. Run qwen, then /auth, and switch to API Key or Coding Plan. See the Authentication section above for setup instructions.

To report a bug from within the CLI, run /bug and include a short title and repro steps.

Connect with Us

Acknowledgments

This project is based on Google Gemini CLI. We acknowledge and appreciate the excellent work of the Gemini CLI team. Our main contribution focuses on parser-level adaptations to better support Qwen-Coder models.