* refactor: centralize IDE diff interaction in CoreToolScheduler
- Move openDiff/confirmation handling from edit.ts and write-file.ts into
CoreToolScheduler.openIdeDiffIfEnabled(), called after permission hooks
- Use structuredClone in buildInvocation to prevent params mutation leaking
to LLM history (fixes #2709 token waste)
- Use confirmationDetails as single data source for IDE diff content,
only rely on ModifyContext.createUpdatedParams() for parameter transform
- Skip inline modify when IDE content unchanged, preserving original tool
params for multi-edit-on-same-file scenarios (mitigates #2702)
- Remove ideConfirmation field from ToolEditConfirmationDetails
- Remove dead resolveIdeDiffForOutcome from ACP Session.ts
- Fix memory tool scope fallback in createUpdatedParams
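
A minimal sketch of the structuredClone point above. The `buildInvocation` signature and the param shape here are hypothetical; the real ones in CoreToolScheduler are not shown in this log.

```typescript
// Hypothetical shapes; the real tool param types are not shown in this log.
interface EditParams {
  file_path: string;
  new_string: string;
}

interface Invocation {
  params: EditParams;
}

// Clone before handing params to the tool, so later in-place edits
// (for example after the user modifies the diff) do not touch the
// object that is also referenced from the LLM history.
function buildInvocation(params: EditParams): Invocation {
  return { params: structuredClone(params) };
}

const originalParams: EditParams = { file_path: 'a.ts', new_string: 'x' };
const invocation = buildInvocation(originalParams);
invocation.params.new_string = 'a much longer, user-modified replacement';
```

After the mutation, `originalParams.new_string` is still `'x'`, which is the property the commit relies on to keep modified content out of the history.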
Closes #2709
Closes #2673
* fix(core): fall back to CLI confirmation when IDE diff open fails
* fix(core): narrow IDE diff error handling scope
---------
Co-authored-by: 胡玮文 <huweiwen.hww@alibaba-inc.com>
Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com>
When the @ autocomplete triggers RecursiveFileSearch, the crawler
materialises the entire project tree into memory with no upper bound.
For very large workspaces (missing .gitignore, huge node_modules, home
directory as cwd) this pushes Node.js past its heap limit and crashes.
- Add `maxFiles` option to CrawlOptions; use fdir's withMaxFiles() to
stop traversal early instead of post-hoc truncation
- Apply file-level ignore patterns during crawl via fdir filter() so
ignored files don't consume the maxFiles budget
- Include maxFiles in the crawl cache key for correctness
- Set MAX_CRAWL_FILES = 100 000 in RecursiveFileSearch (caps peak
memory at ~50 MB for the file list)
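
The interaction between the file cap and the ignore filter can be sketched as below. This does not use fdir; it walks a hypothetical in-memory tree, and `crawl`, `TreeNode`, and `isIgnored` are illustrative names only.

```typescript
// Hypothetical in-memory tree standing in for the file system.
type TreeNode = { name: string; children?: TreeNode[] };

// Walk the tree depth-first, skipping ignored files before they are
// counted, and stop as soon as maxFiles paths have been collected.
function crawl(
  root: TreeNode,
  maxFiles: number,
  isIgnored: (path: string) => boolean,
): string[] {
  const files: string[] = [];
  const stack: Array<{ node: TreeNode; path: string }> = [
    { node: root, path: root.name },
  ];
  while (stack.length > 0 && files.length < maxFiles) {
    const { node, path } = stack.pop()!;
    if (node.children) {
      // Push in reverse so children are visited in their listed order.
      for (let i = node.children.length - 1; i >= 0; i--) {
        const child = node.children[i];
        stack.push({ node: child, path: `${path}/${child.name}` });
      }
    } else if (!isIgnored(path)) {
      files.push(path);
    }
  }
  return files;
}

const sampleTree: TreeNode = {
  name: 'repo',
  children: [
    { name: 'a.log' },
    { name: 'src', children: [{ name: 'x.ts' }, { name: 'y.ts' }] },
    { name: 'z.ts' },
  ],
};
const crawled = crawl(sampleTree, 2, (p) => p.endsWith('.log'));
```

Because the ignored `a.log` is dropped before counting, the budget of two is spent on `x.ts` and `y.ts`, and traversal stops before `z.ts` is reached.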
Fixes #3130
* fix: use latest assistant token count on resume instead of stale compression checkpoint
When resuming a session that had /compress followed by more messages,
getResumePromptTokenCount would return the compression checkpoint's
newTokenCount instead of the more recent assistant message's
totalTokenCount. This caused the status line to show a stale context
usage value until the first new API call.
Fixes #3107
* fix: simplify getResumePromptTokenCount with early returns and zero-guard
Restructure to return early for both branches (assistant usage and
compression checkpoint) instead of accumulating a fallback. Skip
zero/placeholder assistant usage so it doesn't override a valid
compression checkpoint. Add tests for the two key scenarios.
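
A rough sketch of the early-return shape described above, assuming a simplified record list. The `SessionRecord` type is hypothetical and the real function's inputs are not shown here.

```typescript
// Hypothetical record shapes; the real session record types are not shown.
type SessionRecord =
  | { kind: 'assistant'; totalTokenCount: number }
  | { kind: 'compression'; newTokenCount: number }
  | { kind: 'user' };

// Scan from newest to oldest and return the first usable count.
function getResumePromptTokenCount(
  records: SessionRecord[],
): number | undefined {
  for (let i = records.length - 1; i >= 0; i--) {
    const record = records[i];
    if (record.kind === 'assistant') {
      // Zero-guard: placeholder usage must not hide an older checkpoint.
      if (record.totalTokenCount > 0) return record.totalTokenCount;
      continue;
    }
    if (record.kind === 'compression') return record.newTokenCount;
  }
  return undefined;
}
```

With a checkpoint followed by a later assistant message, the assistant count wins; if that assistant usage is zero, the scan continues back to the checkpoint.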
* fix: prevent statusline script from corrupting settings.json
Some models generate shell commands with complex quoting (e.g. single-quote
escaping like '\'') that break JSON syntax when written to settings.json,
causing qwen-code to fail to start with a FatalConfigError.
This adds four layers of defense:
1. **Agent prompt** (builtin-agents.ts): Require commands using jq/pipes/quotes
to be saved as script files instead of inline in settings.json. Mark examples
as script-only to prevent models from copying them inline.
2. **Write validation** (commentJson.ts): Validate JSON output before writing
to disk in updateSettingsFilePreservingFormat.
3. **Startup recovery** (settings.ts): When settings.json has invalid JSON,
try .orig backup first, then degrade gracefully to empty settings instead
of crashing. Rename corrupted file to .corrupted for manual recovery.
Show warning to user via migrationWarnings.
4. **Test update** (settings.test.ts): Update test to verify graceful
degradation behavior instead of expecting FatalConfigError.
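
Layers 2 and 3 can be sketched roughly as follows. File I/O is left out: the functions take file contents as strings, and `isValidJson`, `loadSettings`, and the warning texts are illustrative, not the project's actual code.

```typescript
// Returns true when text parses as JSON. Real settings files may contain
// comments; plain JSON.parse is used here only to keep the sketch small.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

interface LoadResult {
  settings: Record<string, unknown>;
  warnings: string[];
}

// Try the primary file content, then the backup, then fall back to {}.
function loadSettings(primary: string, backup: string | undefined): LoadResult {
  if (isValidJson(primary)) {
    return { settings: JSON.parse(primary), warnings: [] };
  }
  if (backup !== undefined && isValidJson(backup)) {
    return {
      settings: JSON.parse(backup),
      warnings: ['settings.json was invalid; recovered from backup'],
    };
  }
  return {
    settings: {},
    warnings: ['settings.json was invalid; starting with empty settings'],
  };
}
```

The same `isValidJson` check could guard the write path, so a serialized result that fails to parse is never written to disk.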
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address review comments on statusline JSON corruption fix
1. Backup recovery now surfaces warning via migrationWarnings (reviewer: P2 correctness)
2. Corrupted file uses timestamped suffix to avoid overwriting (reviewer: P2 robustness)
3. Remove misleading underscore prefix on used catch variable (reviewer: P2 code quality)
4. updateSettingsFilePreservingFormat returns boolean (reviewer: P2 correctness)
5. Add 3 new tests: backup recovery, both-corrupted, rename-failure (reviewer: P2 testing)
6. Consistent shebang lines in agent prompt examples (reviewer: P3 nit)
7. Improve catch block error message for backup recovery (reviewer: P2 correctness)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: warningMsg says "renamed" even when rename fails
Move warningMsg construction after renameSync so the message accurately
reflects the outcome: "renamed to X" on success, "fix manually" on failure.
Add assertion to rename-failure test verifying the fallback message.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): improve markdown table rendering in terminal
* fix(cli): restore theme colors and inline markdown rendering in tables
Improvements over previous commit:
- Restore theme.border.default color for table borders
- Restore theme.text.link color + bold for table headers
- Add renderMarkdownToAnsi() to render **bold**, `code`, *italic*,
~~strikethrough~~, <u>underline</u>, [links](url), and bare URLs
as ANSI-styled text in table cells (mirrors RenderInline behavior)
- Use raw ANSI escape codes instead of chalk (chalk.level=0 in tests)
- Remove dead code: INLINE_MARKDOWN_REGEX, hasInlineMarkdown,
ANSI_BOLD_START/END constants, unused vi/beforeEach in tests
- Update 8 snapshots to reflect themed output
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address Copilot review comments on table rendering
- renderRowLines: normalize cells to exactly colCount (pad/truncate)
to prevent undefined access when row has fewer cells than headers
- calculateMaxRowLines: iterate colCount instead of row.length to
prevent undefined columnWidths access for extra cells
- tableSeparatorRegex: add (?=.*\|) lookahead to require at least one
pipe character, preventing `---` (horizontal rule) from being
mis-parsed as a table separator
- Add test: horizontal rule after pipe line is not a table separator
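
A small sketch of the two guards above. `normalizeRow` and `SEPARATOR_RE` are illustrative; in particular the regex is not the project's actual tableSeparatorRegex, only a pattern that uses the same lookahead idea.

```typescript
// Pad with empty strings or truncate so every row has exactly colCount cells.
function normalizeRow(row: string[], colCount: number): string[] {
  const cells = row.slice(0, colCount);
  while (cells.length < colCount) cells.push('');
  return cells;
}

// Illustrative separator pattern: the (?=.*\|) lookahead rejects lines that
// contain no pipe at all, so a bare --- horizontal rule is not treated as
// a table separator.
const SEPARATOR_RE = /^(?=.*\|)\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$/;

const shortRow = normalizeRow(['a'], 3);
const longRow = normalizeRow(['a', 'b', 'c', 'd'], 2);
const ruleIsSeparator = SEPARATOR_RE.test('---');
const tableIsSeparator = SEPARATOR_RE.test('|---|:---:|');
```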
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address Copilot round-2 review on table rendering
- idealWidths: use getRenderedWidth() (markdown→ANSI→stripAnsi→stringWidth)
instead of getPlainTextLength() so link URLs are accounted for in
column width calculation
- calculateMaxRowLines: use getFormattedCellText() (same as renderRowLines)
so vertical fallback decision matches actual rendered row height
- renderVerticalFormat: normalize row to colCount (pad/truncate) for
consistency with horizontal format
- renderVerticalFormat: render markdown in labels via renderMarkdownToAnsi()
instead of showing raw syntax
- Remove unused getCellPlainText helper and getPlainTextLength import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address Copilot round-3 review on table rendering
- Early return empty <Box /> when headers is empty (colCount === 0)
to prevent malformed border output
- Always apply theme.text.link color to header cells regardless of
ANSI content, matching original Ink implementation behavior
- Validate separator column count matches header column count before
entering table mode, preventing mismatched separators like
`| A | B |` followed by `|---|` from creating invalid tables
- Add test for column count mismatch detection
- Update 2 snapshots for consistent header link color
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address Copilot round-4 review on table rendering
- getMinWordWidth: use renderMarkdownToAnsi output so link URLs are
included as unbreakable tokens in minimum column width calculation
- Remove now-unused stripInlineMarkdown function
- Header alignment: respect explicit alignment markers from separator;
only default to center when no alignment is specified for the column
- Header color nesting: re-apply theme.text.link color after inner
foreground resets (from inline code/links) to match Ink's nested
color behavior where parent color is restored after child resets
- Add getColorCode() helper for extracting raw ANSI color escape
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address Copilot round-5 review on table rendering
- Apply theme.text.primary color to non-header cells and re-apply
after inner foreground resets, matching header recolor behavior
- Use nullish coalescing (??) for vertical format labels so empty
header strings are preserved instead of replaced with Column N
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): re-apply cell color after full ANSI reset (\x1b[0m)
Add recolorAfterResets() helper that handles both \x1b[39m (foreground
reset) and \x1b[0m (full SGR reset). Applies to both header and body
cells so mixed ANSI content keeps consistent theme coloring.
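
One way such a helper could look, as a sketch only. The real recolorAfterResets signature is not shown in this log, so the parameters here are assumptions.

```typescript
const FG_RESET = '\x1b[39m';
const FULL_RESET = '\x1b[0m';

// Re-emit colorCode after every full SGR reset and every foreground reset
// found inside text, so the surrounding cell color resumes afterwards.
function recolorAfterResets(text: string, colorCode: string): string {
  return text
    .split(FULL_RESET)
    .join(FULL_RESET + colorCode)
    .split(FG_RESET)
    .join(FG_RESET + colorCode);
}

const cyan = '\x1b[36m';
const afterFullReset = recolorAfterResets('a\x1b[1mb\x1b[0mc', cyan);
const afterFgReset = recolorAfterResets('x\x1b[39my', cyan);
```

The split/join form avoids regex escaping of the escape sequences; text without any reset is returned unchanged.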
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): apply recolorAfterResets to vertical format labels
Vertical fallback labels with inline markdown (code, URLs) now
re-apply link color after SGR resets, consistent with horizontal
header/body cell behavior.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): apply primary color to vertical format values
Vertical fallback values now get theme.text.primary color with
recolorAfterResets, consistent with horizontal body cell styling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): preserve internal blank lines in wrapped cell content
wrapText now only trims trailing empty lines (wrap-ansi artifacts)
instead of filtering all empty lines, preserving intentional blank
lines within multi-paragraph cell content.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): validate hex colors and deduplicate applyColor/getColorCode
- Add HEX_COLOR_RE validation; invalid hex like #ff00 or #gg0000
now returns unchanged text instead of producing NaN in ANSI escapes
- Refactor applyColor to delegate to getColorCode, eliminating
duplicated hex parsing logic
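
A sketch of the validation and delegation described above. It assumes only six-digit `#rrggbb` values are accepted and that colors are emitted as 24-bit foreground escapes; the project's HEX_COLOR_RE and helpers may differ.

```typescript
// Assumption: only six-digit #rrggbb values are accepted in this sketch.
const HEX_COLOR_RE = /^#[0-9a-fA-F]{6}$/;

// Returns a 24-bit foreground escape, or undefined for invalid input.
function getColorCode(hex: string): string | undefined {
  if (!HEX_COLOR_RE.test(hex)) return undefined;
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return `\x1b[38;2;${r};${g};${b}m`;
}

// Delegates to getColorCode so hex parsing lives in one place; invalid
// colors leave the text unchanged instead of producing NaN in the escape.
function applyColor(text: string, hex: string): string {
  const code = getColorCode(hex);
  return code === undefined ? text : `${code}${text}\x1b[39m`;
}
```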
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): precompute cell metrics and fix column width overflow
- Precompute per-cell rendered text, visible width, and min word width
once via computeMetrics(), eliminating repeated renderMarkdownToAnsi
calls across width calculation, max-row-lines check, and rendering
- Add post-pass in totalMin > availableWidth branch: shave wider
columns until sum(columnWidths) <= availableWidth, preventing
MIN_COLUMN_WIDTH floor from causing unnecessary vertical fallback
- Remove now-unused getMinWordWidth standalone function
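
The post-pass could be sketched like this. `shaveToFit` is a hypothetical name, and shaving one character at a time from the widest column is an assumption about the strategy, not a description of the actual code.

```typescript
// Shave one character at a time from the currently widest column until the
// widths fit, never going below minWidth. Returns a new array.
function shaveToFit(
  columnWidths: number[],
  availableWidth: number,
  minWidth = 1,
): number[] {
  const widths = [...columnWidths];
  let total = widths.reduce((sum, w) => sum + w, 0);
  while (total > availableWidth) {
    const widest = Math.max(...widths);
    if (widest <= minWidth) break; // nothing left to shave
    widths[widths.indexOf(widest)] -= 1;
    total -= 1;
  }
  return widths;
}

const fitted = shaveToFit([10, 3, 3], 12);
```

Here the wide first column absorbs the whole reduction and the narrow columns keep their width, so the table can stay horizontal.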
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(vscode-ide-companion/session): force fresh sessions for new chats
Ensure explicit new-session actions bypass active ACP session reuse so the VS Code sidebar clears context correctly.
Add regression coverage for the agent manager and webview new-session entry points.
* fix(vscode): remove core runtime imports from webview bundle
Replace the runtime import of `isSupportedImageMimeType` from
`@qwen-code/qwen-code-core` with a local `SUPPORTED_PASTED_IMAGE_MIME_TYPES`
set in the vscode-ide-companion package. The webview is bundled for a
browser environment where Node.js-only core modules are unavailable,
so keeping the MIME list local avoids esbuild failures during development.
Added tests to verify the local list stays aligned with core and that
the webview bundle does not contain core runtime imports.
* fix(vscode): reset context usage display on new session (#2847)
The webview context-usage bar did not clear when the user started a new
session because the old code always fell back to DEFAULT_TOKEN_LIMIT,
producing a stale percentage even after usageStats and modelInfo were
both cleared.
Key changes:
- Extract `knownTokenLimit()` in core/tokenLimits.ts that returns
`undefined` for unrecognized models instead of a default, keeping
`tokenLimit()` behavior unchanged.
- In acpModelInfo.ts, derive `_meta.contextLimit` from the known-model
table when the ACP payload omits a numeric limit.
- Extract `computeContextUsage()` into its own module, which returns
`null` when no trusted numeric limit is available — the UI then
correctly hides the context bar.
- Remove the `@qwen-code/qwen-code-core` runtime import from App.tsx
so the webview bundle stays free of Node-only dependencies.
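
A sketch of the two extracted helpers, assuming a hypothetical one-entry model table and a simplified return shape. The real signatures in core/tokenLimits.ts and the webview module are not shown here.

```typescript
// Hypothetical model table; the real limits live in core/tokenLimits.ts.
const KNOWN_LIMITS: Record<string, number> = { 'example-model': 100000 };

// Unlike a defaulting lookup, unknown models yield undefined.
function knownTokenLimit(model: string): number | undefined {
  return Object.prototype.hasOwnProperty.call(KNOWN_LIMITS, model)
    ? KNOWN_LIMITS[model]
    : undefined;
}

interface ContextUsage {
  usedTokens: number;
  limit: number;
  percentage: number;
}

// Returns null when there is no trusted numeric limit, so the caller can
// hide the context bar instead of showing a stale percentage.
function computeContextUsage(
  usedTokens: number | undefined,
  limit: number | undefined,
): ContextUsage | null {
  if (usedTokens === undefined || limit === undefined || limit <= 0) {
    return null;
  }
  const percentage = Math.min(100, (usedTokens / limit) * 100);
  return { usedTokens, limit, percentage };
}
```

After a new session clears both inputs, `computeContextUsage(undefined, undefined)` returns `null` rather than a percentage computed against a default limit.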
Closes #2847
* fix(vscode-ide-companion/webview): reset state on new session
* test(vscode-ide-companion/webview): cover stale conversation reset
* fix(vscode): remove webview token limit runtime import
* fix(vscode): fully reset state for explicit new session
* fix(vscode-ide-companion/webview): clear residual state on new session
---------
Co-authored-by: tanzhenxin <tanzhenxing1987@gmail.com>
Compact mode confirmation dialog uses ProceedAlways for "Allow always"
option, but persistPermissionOutcome() only handled ProceedAlwaysProject
and ProceedAlwaysUser, causing the permission to never be saved.
Now ProceedAlways is treated as project scope (same as ProceedAlwaysProject).
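
The mapping can be pictured with the sketch below. The string-literal outcome type and the `persistScopeFor` helper are hypothetical; only the outcome names come from the message above, and `ProceedOnce` is included purely as an example of a non-persisted outcome.

```typescript
// Hypothetical subset of confirmation outcomes.
type Outcome =
  | 'ProceedOnce'
  | 'ProceedAlways'
  | 'ProceedAlwaysProject'
  | 'ProceedAlwaysUser';
type Scope = 'project' | 'user';

// Returns the scope to persist to, or undefined when nothing is persisted.
function persistScopeFor(outcome: Outcome): Scope | undefined {
  switch (outcome) {
    case 'ProceedAlways': // compact-mode "Allow always"
    case 'ProceedAlwaysProject':
      return 'project';
    case 'ProceedAlwaysUser':
      return 'user';
    default:
      return undefined;
  }
}
```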
* feat(plan): add "Yes, restore previous mode" option when exiting plan mode
When exiting plan mode, users previously had no way to restore their
original approval mode (e.g. YOLO). Add a new default option that
restores the pre-plan approval mode, with a dynamic label showing
which mode will be restored.
Closes #3002
* test: add fallback test for RestorePrevious when no prePlanMode recorded
* fix: handle RestorePrevious in telemetry and ACP mode notification
- Add RestorePrevious to telemetry decision mapping as ACCEPT
- Fix sendCurrentModeUpdateNotification to read actual mode for
RestorePrevious instead of defaulting to 'default'
* test: add plan confirmation tests for RestorePrevious in permissionUtils
* fix(permissions): match env-prefixed shell commands
Fixes #2846
* fix(core): improve shell command parsing for env vars and multiline commands
- Add dotAll flag to matchesCommandPattern for matching commands with embedded newlines
- Support newline operators in SHELL_OPERATORS for splitCompoundCommand
- Refactor getCommandRoot to skip leading VAR=value assignments
- Add test coverage for multiline commands and env var prefixed commands
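
Skipping leading assignments can be sketched as below. This is a simplified stand-in that splits on whitespace only, so quoted values containing spaces are not handled; `ENV_ASSIGNMENT_RE` is an illustrative name.

```typescript
// Matches a leading shell assignment such as FOO=bar or PATH=/x:$PATH.
const ENV_ASSIGNMENT_RE = /^[A-Za-z_][A-Za-z0-9_]*=/;

// Returns the first token that is not a VAR=value assignment, or undefined
// when the command is empty or consists only of assignments.
function getCommandRoot(command: string): string | undefined {
  const tokens = command
    .trim()
    .split(/\s+/)
    .filter((token) => token.length > 0);
  return tokens.find((token) => !ENV_ASSIGNMENT_RE.test(token));
}

const rootWithEnv = getCommandRoot('NODE_ENV=test CI=1 npm run build');
const rootPlain = getCommandRoot('git status');
```

With the prefix stripped, a permission rule written for `npm` also matches the env-prefixed form.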
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* fix(permissions): tighten shell command parsing
Handle env-prefixed commands and quoted Windows paths consistently.
Keep newline splitting heredoc-aware and avoid false heredoc detection in comments or arithmetic expressions.
* refactor(permissions): simplify fix by reverting splitCompoundCommand rewrite
Remove ~350 lines of heredoc/comment/arithmetic parsing from
splitCompoundCommand that were not needed to fix #2846. Revert to
the original main version, keeping only the core env-var stripping
logic in matchesCommandPattern and getCommandRoot.
This addresses both reviewer concerns:
- heredoc breakage: no longer an issue since splitCompoundCommand is unchanged
- Windows quoted paths: handled correctly by shell-quote parse in getCommandRoot
---------
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* test: add tests for confirmation-bus, prompt-registry, and cli/core modules
Add 42 new tests covering previously untested core modules:
- MessageBus: publish, subscribe/unsubscribe, request-response pattern (13 tests)
- PromptRegistry: register, dedup, query by server, clear, remove (11 tests)
- performInitialAuth: success, failure, no authType cases (3 tests)
- validateTheme: found, not found, no config cases (4 tests)
- initializeApp: i18n, auth, theme, IDE mode, auth dialog logic (11 tests)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: improve test quality - copyright headers, env safety, cleanup
- Fix copyright headers from Google LLC to Qwen Code in all 5 test files
- Use vi.stubEnv() instead of manual process.env mutation in initializer test
- Add removeAllListeners() cleanup in message-bus debug test
- Add void prefix to un-awaited publish() calls in message-bus test
- Verify invoke reference preserved after prompt rename in prompt-registry test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* test(message-bus): add AbortSignal coverage for request()
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
LLM was putting findings in the comments array WITHOUT line numbers,
creating orphaned PR comments. Clarified: comments array entries MUST
have a valid line. Findings without a mappable diff line go in body.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two rules repeatedly violated by LLMs despite being documented:
1. Language matching (Chinese output on English PRs)
2. Create Review API (falling back to individual gh api comments)
Moved both to a "Critical rules" section at the very top of the
prompt, before the design philosophy. Early placement = higher
attention from the LLM.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The APPROVE path for zero findings was missing the model footer.
Added YOUR_MODEL_ID to the body.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
replaceAll('YOUR_MODEL_ID', modelId) was also replacing the instruction
"The variable YOUR_MODEL_ID is declared..." → nonsensical text.
Removed the literal reference from the instruction line.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Some models (e.g., glm-5.1) ignore the {{model}} template in code
blocks and write their own footer without the model name. Fix:
1. BundledSkillLoader prepends YOUR_MODEL_ID="glm-5.1" as a top-level
declaration at the start of the skill body — impossible to miss
2. SKILL.md references YOUR_MODEL_ID in footer instructions
3. Empty model → empty string (no "unknown" — prefer omission)
4. YOUR_MODEL_ID declaration only prepended when model is available
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(core): prevent followup suggestion input/output from appearing in tool call UI
The follow-up suggestion generation was leaking into the conversation UI
through three channels:
1. The forked query included tools in its generation config, allowing the
model to produce function calls during suggestion generation. Fixed by
setting `tools: []` in runForkedQuery's per-request config (kept in
createForkedChat for speculation which needs tools).
2. logApiResponse and logApiError recorded suggestion API events to the
chatRecordingService, causing them to appear in session JSONL files
and the WebUI. Fixed by adding isInternalPromptId() guard that skips
chatRecordingService for 'prompt_suggestion' and 'forked_query' IDs.
uiTelemetryService.addEvent() is preserved so /stats still tracks
suggestion token usage.
3. LoggingContentGenerator logged suggestion requests/responses to the
OpenAI logger and telemetry pipeline. Fixed by skipping logApiRequest,
buildOpenAIRequestForLogging, and logOpenAIInteraction for internal
prompt IDs. _logApiResponse is preserved (for /stats) but its
chatRecordingService path is filtered by fix #2.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: deduplicate isInternalPromptId into shared export from loggers.ts
Address review feedback: extract isInternalPromptId() to a single
exported function in telemetry/loggers.ts and import it in
LoggingContentGenerator, eliminating the duplicate private method.
Also update loggingContentGenerator.test.ts mock to use importOriginal
so the real isInternalPromptId is available during tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract isInternalPromptId to shared utils, add tests
Address maintainer review feedback:
1. Move isInternalPromptId() to packages/core/src/utils/internalPromptIds.ts
using a ReadonlySet for the ID registry. Adding new internal prompt IDs
only requires changing one file. loggers.ts re-exports for compatibility,
loggingContentGenerator.ts imports directly from utils.
2. Extract `tools: []` magic value to a frozen NO_TOOLS constant in
forkedQuery.ts.
3. Add unit tests for isInternalPromptId: prompt_suggestion → true,
forked_query → true, user_query → false, empty string → false.
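
A minimal sketch of the registry from item 1, using the two IDs named above. The `recordIfUserVisible` helper is hypothetical and only illustrates how a recording path might use the guard.

```typescript
// Single registry of internal prompt IDs; new IDs are added here only.
const INTERNAL_PROMPT_IDS: ReadonlySet<string> = new Set([
  'prompt_suggestion',
  'forked_query',
]);

function isInternalPromptId(promptId: string): boolean {
  return INTERNAL_PROMPT_IDS.has(promptId);
}

// Hypothetical recording guard: internal traffic is skipped, everything
// else is appended to the user-visible log.
function recordIfUserVisible(
  promptId: string,
  log: string[],
  entry: string,
): void {
  if (isInternalPromptId(promptId)) return;
  log.push(entry);
}
```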
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Copilot review — docs, stream optimization, tests
1. Update forkedQuery.ts module docs to reflect that runForkedQuery
overrides tools: [] at the per-request level while createForkedChat
retains the full generationConfig for speculation callers.
2. Propagate isInternal into loggingStreamWrapper to skip response
collection and consolidation for internal prompts, avoiding
unnecessary CPU/memory overhead.
3. Add logApiResponse chatRecordingService filter tests: verify
prompt_suggestion/forked_query skip recording while normal IDs
still record.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: deep-freeze NO_TOOLS, add internal prompt guard tests
Address Copilot review round 3:
1. Deep-freeze NO_TOOLS.tools array to prevent shared mutable state
across forked query calls.
2. Add LoggingContentGenerator tests verifying that internal prompt IDs
(prompt_suggestion, forked_query) skip logApiRequest and OpenAI
interaction logging while preserving logApiResponse.
3. Add logApiError chatRecordingService filter tests matching the
existing logApiResponse coverage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: reconcile createForkedChat JSDoc with module header
Clarify that createForkedChat retains the full generationConfig
(including tools) for speculation callers, while runForkedQuery
strips tools at the per-request level via NO_TOOLS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: build errors and Copilot round 4 feedback
1. Fix NO_TOOLS type: Object.freeze produces readonly array incompatible
with ToolUnion[]. Use Readonly<Pick<>> instead; spread in requestConfig
already creates a fresh mutable copy per call.
2. Fix test missing required 'model' field in ContentGeneratorConfig.
3. Track firstResponseId/firstModelVersion in loggingStreamWrapper so
_logApiResponse/_logApiError have accurate values even when full
response collection is skipped for internal prompts.
4. Strengthen OpenAI logger test assertion: assert OpenAILogger was
constructed (not guarded by if), then assert logInteraction was
not called.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: remove dead Object.keys check, add streaming internal prompt test
1. Simplify runForkedQuery: requestConfig always has tools:[] from
NO_TOOLS spread, so the Object.keys().length > 0 ternary is dead
code. Pass requestConfig directly.
2. Add generateContentStream test for internal prompt IDs to match
the existing generateContent coverage, ensuring the streaming
wrapper also skips logApiRequest and OpenAI interaction logging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent Enter accept from re-inserting suggestion into buffer
When accepting a followup suggestion via Enter, accept() queued
buffer.insert(suggestion) in a microtask that executed after
handleSubmitAndClear had already cleared the buffer, leaving the
suggestion text stuck in the input.
Add skipOnAccept option to accept() so the Enter path bypasses the
onAccept callback. Also add runForkedQuery unit tests verifying
tools: [] is passed in per-request config.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(core): add speculation to internal IDs, fix logToolCall filtering, improve suggestion prompt
- Add 'speculation' to INTERNAL_PROMPT_IDS so speculation API traffic
and tool calls are hidden from chat recordings and tool call UI
- Add isInternalPromptId check to logToolCall() for consistency with
logApiError/logApiResponse
- Improve SUGGESTION_PROMPT: prioritize assistant's last few lines and
extract actionable text from explicit tips (e.g. "Tip: type X")
- Fix garbled unicode in prompt text
- Update design docs and user docs to reflect changes
- Add test coverage for all new behavior
* fix(core): deep-freeze NO_TOOLS, add speculation to loggingContentGenerator tests
- Object.freeze NO_TOOLS and its tools array to prevent runtime mutation
- Add 'speculation' to loggingContentGenerator internal prompt ID tests
for consistency with loggers.test.ts and internalPromptIds.ts
* fix(core): fix NO_TOOLS Object.freeze type error
Use `as const` with type assertion to satisfy TypeScript while keeping
runtime immutability via Object.freeze.
* refactor(core): remove unused isInternalPromptId re-export from loggers.ts
All consumers import directly from utils/internalPromptIds.js.
The re-export was dead code with no importers.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous version bump commit (bb4376c) only updated the root
package.json but did not run `npm run release:version` to propagate
the version and sandboxImageUri to all workspace packages.
This caused Docker sandbox integration tests to fail in CI with
"manifest unknown" because build_sandbox.js built image 0.14.1
(from packages/cli/package.json) while sandboxConfig.ts expected
image 0.14.2 (from root package.json).
Fixes: https://github.com/QwenLM/qwen-code/actions/runs/24135197272/job/70424966323
- Instruct agent to use "bash script.sh" pattern instead of direct
execution (agent cannot chmod +x without SHELL tool)
- Replace vague "skip optional locks" with concrete GIT_OPTIONAL_LOCKS=0
- Simplify "parent agent" framing to direct user-facing message
- Add qwen3.6-plus to both China and Global/Intl regions as the first
model in the Coding Plan template (1M context, enable_thinking)
- Set qwen3.6-plus as the new default MAINLINE_CODER_MODEL
- Add image+video input modality support for qwen3.6-plus
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Restructure the status line stdin JSON for clarity and accuracy:
- Rename model.id → model.display_name, cwd → workspace.current_dir
- Replace raw context_window size/count with used_percentage,
remaining_percentage, current_usage, context_window_size, and
total_input_tokens/total_output_tokens
- Add version field from cfg.getCliVersion()
- Add git.branch, metrics.models, metrics.files
- Remove upstream-only fields: tokens.tool (never populated),
session (start_time/elapsed_time not live-updating),
streaming_state, approval_mode, terminal, metrics.tools
- Rename tokens.candidates → tokens.completion (Qwen API convention)
- Fix template string escaping in builtin-agents to avoid
templateString() placeholder collision
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(core): adaptive output token escalation (8K default + 64K retry)
99% of model responses are under 5K tokens, but we previously reserved
32K for every request. This over-reserves GPU slot capacity by roughly 4x.
Now the default output limit is 8K. When a response hits this cap
(stop_reason=max_tokens), it automatically retries once at 64K — only
the ~1% of requests that actually need more tokens pay the cost.
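
The retry flow can be sketched as follows. The `Generate` callback is a hypothetical stand-in for the real API call, and treating 8K and 64K as 8 × 1024 and 64 × 1024 is an assumption.

```typescript
interface GenerateResult {
  text: string;
  stopReason: string;
}

// Hypothetical stand-in for the model call; takes the output token cap.
type Generate = (maxOutputTokens: number) => Promise<GenerateResult>;

// Assumption: "8K" and "64K" are read as multiples of 1024.
const DEFAULT_OUTPUT_TOKENS = 8 * 1024;
const ESCALATED_OUTPUT_TOKENS = 64 * 1024;

// Try the small cap first; retry exactly once with the large cap if the
// response was cut off by the output limit.
async function generateWithEscalation(
  generate: Generate,
): Promise<GenerateResult> {
  const first = await generate(DEFAULT_OUTPUT_TOKENS);
  if (first.stopReason !== 'max_tokens') return first;
  return generate(ESCALATED_OUTPUT_TOKENS);
}
```

Only responses that stop with `max_tokens` trigger the second call, so the common case costs a single request at the smaller reservation.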
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add design doc and user doc for adaptive output token escalation
- Add design doc covering problem, architecture, token limit
determination, escalation mechanism, and design decisions
- Document QWEN_CODE_MAX_OUTPUT_TOKENS env var in settings.md
- Add max_tokens adaptive behavior explanation in model config section
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
"Yes, and manually approve edits" was restoring getPrePlanMode() which
could be YOLO, contradicting the label. Now hardcodes DEFAULT to match
the "manually approve" semantics.
Align with observed provider prompt-cache TTL (~5 min). Add
`context.gapThresholdMinutes` setting so users can tune the threshold
for providers with different cache TTLs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM was putting all findings in the review body (creating a summary
comment) instead of the comments array (inline comments). Added
prominent warning: "Findings go in comments array, NOT in body."
Also: "Do NOT use COMMENT when there are Critical findings."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the two-phase posting (individual gh api comments + separate
gh pr review verdict) with a single Create Review API call that bundles
inline comments + verdict together — same approach as Copilot Code Review.
Benefits:
- No summary comment needed (inline comments ARE the review)
- No "two-phase posting" complexity
- No "STOP for Comment verdict" rules
- No duplicate/orphaned reviews
- One API call instead of N+1
- Verdict (approve/request_changes/comment) correctly attached
Replaces ~40 lines of complex posting rules with ~30 lines
of straightforward JSON construction.
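
A sketch of what the single-call payload construction might look like. The `Finding` shape and `buildReviewPayload` are hypothetical; `event`, `body`, and `comments` with `path`/`line`/`body` follow the GitHub create-review request body. Mapping Suggestion-only reviews to COMMENT is an assumption.

```typescript
// Hypothetical finding shape produced by the review step.
interface Finding {
  severity: 'Critical' | 'Suggestion';
  message: string;
  path?: string;
  line?: number;
}

interface ReviewPayload {
  event: 'APPROVE' | 'REQUEST_CHANGES' | 'COMMENT';
  body: string;
  comments: Array<{ path: string; line: number; body: string }>;
}

// Build one payload: findings with a diff line become inline comments,
// the rest go into the body, and the verdict travels in the same call.
function buildReviewPayload(findings: Finding[]): ReviewPayload {
  const comments: ReviewPayload['comments'] = [];
  const unmapped: string[] = [];
  for (const finding of findings) {
    if (finding.path !== undefined && finding.line !== undefined) {
      comments.push({
        path: finding.path,
        line: finding.line,
        body: finding.message,
      });
    } else {
      unmapped.push(finding.message);
    }
  }
  const hasCritical = findings.some((f) => f.severity === 'Critical');
  const event =
    findings.length === 0
      ? 'APPROVE'
      : hasCritical
        ? 'REQUEST_CHANGES'
        : 'COMMENT';
  return { event, body: unmapped.join('\n'), comments };
}
```

The resulting object would be sent as the JSON body of one review request, which is what removes the N+1 calls and the orphaned-comment cases.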
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues found from real /review output on PR #2921:
1. Critical findings but verdict submitted as --comment instead of
--request-changes. Added explicit: "Do NOT use --comment when
verdict is Request changes — this loses the blocking status."
2. Nice to have findings appeared in PR summary. Added: "Do NOT
include Nice to have findings" to all summary rules.
3. Clarified that failed-inline summary should only contain
Critical/Suggestion, never Nice to have.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues found from real review (PR #2826):
1. Multiple /review runs on same PR create duplicate comments. Now
Step 9 checks for existing "via Qwen Code /review" comments
before posting and warns the user about potential duplicates.
2. Comments posted without line numbers appear as orphaned PR
comments. Now enforced: every inline comment MUST reference a
specific line in the diff. Findings that can't be mapped to
diff lines go in the summary instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaced 5 numbered rules + example with example-first format.
LLMs pattern-match from examples better than parsing rules.
Rules condensed to 2 sentences after the example.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Inline comments now use ```suggestion blocks when the fix is a direct
line replacement. PR authors can accept fixes with one click instead
of manually copying code. Falls back to regular code blocks when the
fix spans multiple locations.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Logic errors causing incorrect behavior (wrong return values, skipped
code paths) were being classified as Suggestion instead of Critical.
Added explicit examples: "if code does something wrong, it's Critical
— not Suggestion."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three issues found in real review output:
1. Summary repeated findings already posted as inline comments
2. "Review Stats" (agent count, raw/confirmed) is internal noise
3. Summary was too verbose
Fix: partial-failure summary must contain ONLY the failed findings.
Distinguish terminal output (stats OK) from PR comments (no stats).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GitHub renders #1, #2 as links to issues/PRs with those numbers.
Review summaries using "#1 (logic error)" link to the wrong target.
Added guideline: use (1), [1], or descriptive references instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review comments, findings, and summaries must use the same language
as the PR (title/description/code comments). English PR → English
review. Chinese PR → Chinese review. No language switching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>