Commit graph

2117 commits

Author SHA1 Message Date
tanzhenxin
3c23952ef7
Merge pull request #2897 from QwenLM/feat/thinking-cross-turn-retention-idle-cleanup
feat(core): thinking block cross-turn retention with idle cleanup
2026-04-08 15:26:53 +08:00
zhangxy-zju
db7488f3a2
Merge pull request #2921 from QwenLM/feat/plan-mode
feat(cli): implement /plan command for plan mode
2026-04-08 15:23:27 +08:00
wenshao
121af70cc0 fix: ProceedOnce should set DEFAULT mode, not restore pre-plan mode
"Yes, and manually approve edits" was restoring getPrePlanMode() which
could be YOLO, contradicting the label. Now hardcodes DEFAULT to match
the "manually approve" semantics.
2026-04-08 14:51:26 +08:00
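The fix above can be illustrated with a minimal sketch. The type and function names here are hypothetical and not taken from the codebase; only the DEFAULT and YOLO modes and the getPrePlanMode() behavior come from the commit message.

```typescript
// Hypothetical mode names; the commit only mentions DEFAULT and YOLO.
type ApprovalMode = "default" | "auto-edit" | "yolo" | "plan";

// Sketch of the fixed behavior for "Yes, and manually approve edits".
// The pre-plan mode is deliberately ignored: restoring it could bring
// back YOLO, which contradicts the "manually approve" label.
function modeAfterProceedOnce(_prePlanMode: ApprovalMode): ApprovalMode {
  return "default";
}
```

Under these assumptions, a session that entered plan mode from YOLO leaves it in DEFAULT rather than returning to YOLO.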
wenshao
6a55a9aeea feat(config): make thinking idle threshold configurable and lower default to 5min
Align with observed provider prompt-cache TTL (~5 min). Add
`context.gapThresholdMinutes` setting so users can tune the threshold
for providers with different cache TTLs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 14:21:06 +08:00
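A rough sketch of how such a threshold check might look. Only the `context.gapThresholdMinutes` setting and the 5-minute default come from the commit message; the interface, function name, and the inclusive comparison are assumptions for illustration.

```typescript
// Hypothetical shape of the relevant settings slice.
interface ContextSettings {
  gapThresholdMinutes?: number;
}

// Default stated in the commit message (5 minutes).
const DEFAULT_GAP_THRESHOLD_MINUTES = 5;

// Returns true when the gap since the last activity has reached the
// configured threshold, i.e. the provider's prompt cache has likely expired.
function isIdleBeyondThreshold(
  lastActivityMs: number,
  nowMs: number,
  settings: ContextSettings = {},
): boolean {
  const minutes = settings.gapThresholdMinutes ?? DEFAULT_GAP_THRESHOLD_MINUTES;
  return nowMs - lastActivityMs >= minutes * 60 * 1000;
}
```

A user on a provider with a longer cache TTL could pass `{ gapThresholdMinutes: 10 }` to delay the cleanup.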
wenshao
a83877e3de fix(review): findings MUST go in comments array, NOT in body
LLM was putting all findings in the review body (creating a summary
comment) instead of the comments array (inline comments). Added
prominent warning: "Findings go in comments array, NOT in body."
Also: "Do NOT use COMMENT when there are Critical findings."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:22:57 +08:00
wenshao
d4b96c8527 feat(review): use Create Review API for single-call review submission
Replace the two-phase posting (individual gh api comments + separate
gh pr review verdict) with a single Create Review API call that bundles
inline comments + verdict together — same approach as Copilot Code Review.

Benefits:
- No summary comment needed (inline comments ARE the review)
- No "two-phase posting" complexity
- No "STOP for Comment verdict" rules
- No duplicate/orphaned reviews
- One API call instead of N+1
- Verdict (approve/request_changes/comment) correctly attached

Replaces ~40 lines of complex posting rules with ~30 lines
of straightforward JSON construction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:33:15 +08:00
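For illustration, the JSON construction could be sketched as below. The field names (`commit_id`, `event`, `body`, `comments`, `path`, `line`, `side`) follow GitHub's "Create a review for a pull request" REST endpoint (POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews); the TypeScript types and the helper name are hypothetical and not taken from the skill itself.

```typescript
// Verdict values accepted by the endpoint's `event` field.
type ReviewEvent = "APPROVE" | "REQUEST_CHANGES" | "COMMENT";

// Hypothetical internal representation of one inline finding.
interface InlineFinding {
  path: string; // file path relative to the repository root
  line: number; // line in the diff the comment attaches to
  body: string; // comment text
}

interface ReviewPayload {
  commit_id: string;
  event: ReviewEvent;
  body: string;
  comments: { path: string; line: number; side: "RIGHT"; body: string }[];
}

// Bundles the verdict and all inline comments into one request body,
// so a single API call replaces N comment calls plus one verdict call.
function buildReviewPayload(
  commitId: string,
  event: ReviewEvent,
  summary: string,
  findings: InlineFinding[],
): ReviewPayload {
  return {
    commit_id: commitId,
    event,
    body: summary,
    comments: findings.map((f) => ({
      path: f.path,
      line: f.line,
      side: "RIGHT", // comment on the new version of the file
      body: f.body,
    })),
  };
}
```

The serialized payload could then be sent with something like `gh api repos/{owner}/{repo}/pulls/{number}/reviews --input payload.json`; the exact invocation used by the skill is not shown in this log.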
wenshao
97aacfac4e fix(review): enforce correct verdict flag + exclude Nice to have from PR
Three issues found from real /review output on PR #2921:

1. Critical findings but verdict submitted as --comment instead of
   --request-changes. Added explicit: "Do NOT use --comment when
   verdict is Request changes — this loses the blocking status."
2. Nice to have findings appeared in PR summary. Added: "Do NOT
   include Nice to have findings" to all summary rules.
3. Clarified that failed-inline summary should only contain
   Critical/Suggestion, never Nice to have.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:26:59 +08:00
chinesepowered
6763035e1e fix(hooks): distinguish signal-killed hook from successful exit
2026-04-07 14:24:41 -07:00
wenshao
bb85768adf fix(review): detect old review comments + require line numbers
Two issues found from real review (PR #2826):

1. Multiple /review runs on same PR create duplicate comments. Now
   Step 9 checks for existing "via Qwen Code /review" comments
   before posting and warns the user about potential duplicates.

2. Comments posted without line numbers appear as orphaned PR
   comments. Now enforced: every inline comment MUST reference a
   specific line in the diff. Findings that can't be mapped to
   diff lines go in the summary instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 05:20:12 +08:00
wenshao
51964fa4b9 Merge remote-tracking branch 'origin/main' into feature/status-line-customization
# Conflicts:
#	packages/cli/src/ui/components/Footer.tsx
2026-04-08 05:05:04 +08:00
wenshao
ab72b5e6bb fix(review): simplify comment template — example-first, rules after
Replaced 5 numbered rules + example with example-first format.
LLMs pattern-match from examples better than parsing rules.
Rules condensed to 2 sentences after the example.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:28:15 +08:00
wenshao
664bc34d84 fix(review): fix broken nested code fence in comment template
Prettier mangled the nested code fences (4-backtick outer + 3-backtick
suggestion inner). Replaced with plain-text numbered structure +
indented example. Also fixed orphaned 4-backtick fence closing Step B.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:27:05 +08:00
wenshao
1a93078aee fix(review): suggestion blocks supported by GitLab/Gitea too, not just GitHub
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:19:49 +08:00
wenshao
3f91c1d1b7 feat(review): use GitHub suggestion blocks for one-click fixes
Inline comments now use ```suggestion blocks when the fix is a direct
line replacement. PR authors can accept fixes with one click instead
of manually copying code. Falls back to regular code blocks when the
fix spans multiple locations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:17:54 +08:00
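A minimal sketch of the comment formatting described above. The helper name and the boolean flag are hypothetical; the fence is built at runtime only so this example can itself sit inside a fenced block.

```typescript
// Three backticks, assembled at runtime to avoid nesting literal fences.
const FENCE = "`".repeat(3);

// Formats an inline review comment. A direct line replacement is wrapped
// in a suggestion block (one-click accept); anything else falls back to
// a regular code block.
function formatInlineComment(
  explanation: string,
  fixCode: string,
  isDirectLineReplacement: boolean,
): string {
  const opener = isDirectLineReplacement ? FENCE + "suggestion" : FENCE;
  return [explanation, "", opener, fixCode, FENCE].join("\n");
}
```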
wenshao
5a41e17b2b fix(review): clarify Critical vs Suggestion severity boundary
Logic errors causing incorrect behavior (wrong return values, skipped
code paths) were being classified as Suggestion instead of Critical.
Added explicit examples: "if code does something wrong, it's Critical
— not Suggestion."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:16:33 +08:00
wenshao
3982df56bc fix(review): PR summary should not repeat inline comments or show stats
Three issues found in real review output:
1. Summary repeated findings already posted as inline comments
2. "Review Stats" (agent count, raw/confirmed) is internal noise
3. Summary was too verbose

Fix: partial-failure summary must contain ONLY the failed findings.
Distinguish terminal output (stats OK) from PR comments (no stats).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:12:54 +08:00
wenshao
2824bf6327 fix(review): avoid #N notation in PR comments (GitHub auto-links)
GitHub renders #1, #2 as links to issues/PRs with those numbers.
Review summaries using "#1 (logic error)" link to the wrong target.
Added guideline: use (1), [1], or descriptive references instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:11:19 +08:00
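The commit adds this as a prompt guideline rather than code; as an illustration only, a post-processing step with the same effect might look like the hypothetical helper below.

```typescript
// Rewrites "#N" finding references to "(N)" so GitHub does not
// auto-link them to unrelated issues or pull requests.
function avoidIssueAutolinks(text: string): string {
  return text.replace(/#(\d+)\b/g, "($1)");
}
```

Such a rewrite would also alter intentional issue or PR references, so it would only be safe on text where every `#N` is a finding index.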
wenshao
ea2f8e1b78 feat(review): match review language to PR language
Review comments, findings, and summaries must use the same language
as the PR (title/description/code comments). English PR → English
review. Chinese PR → Chinese review. No language switching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 04:08:21 +08:00
wenshao
16640e92b9 Merge remote-tracking branch 'origin/main' into feat/review-skill-improvements 2026-04-07 22:15:13 +08:00

wenshao
6bb5e0a276 Merge remote-tracking branch 'origin/main' into feat/plan-mode
# Conflicts:
#	packages/cli/src/i18n/locales/de.js
#	packages/cli/src/i18n/locales/en.js
#	packages/cli/src/i18n/locales/ja.js
#	packages/cli/src/i18n/locales/pt.js
#	packages/cli/src/i18n/locales/ru.js
#	packages/cli/src/i18n/locales/zh.js
2026-04-07 21:04:25 +08:00
wenshao
297d81ee5f fix(review): enforce one-line summary body for Approve/Request changes
LLM was writing detailed analysis in the review summary body despite
"minimal body" instruction. Strengthened to "one-line body only, do
NOT include analysis/findings/explanations" with concrete examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:27:52 +08:00
LaZzyMan
80b0c6baec fix(core): add getDefaultPermission and allowExternalPaths to ripGrep tool
Add getDefaultPermission() override to GrepToolInvocation in ripGrep.ts
to match the behavior of grep.ts, returning "ask" for paths outside
the workspace and "allow" for workspace-internal paths.

Also pass allowExternalPaths: true to resolveAndValidatePath in both
the execute() and validateToolParamValues() methods, so external paths
are not rejected at the validation layer (permission is deferred to
getDefaultPermission as designed).

Fixes issue where grep searches in arbitrary workspace paths would
fail with "Path is not within workspace" even when the user intended
to search external directories.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-07 17:18:24 +08:00
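The permission rule described above can be sketched as a standalone function. This is not the actual `getDefaultPermission()` override; the function name is hypothetical, and the sketch assumes both paths are absolute, already normalized, and POSIX-style.

```typescript
type Permission = "allow" | "ask";

// Returns "allow" for paths inside the workspace and "ask" for paths
// outside it. Assumes absolute, normalized POSIX-style paths.
function defaultPermissionFor(searchPath: string, workspaceRoot: string): Permission {
  const root = workspaceRoot.endsWith("/") ? workspaceRoot.slice(0, -1) : workspaceRoot;
  // Compare against root + "/" so that "/ws-other" is not treated as inside "/ws".
  const inside = searchPath === root || searchPath.startsWith(root + "/");
  return inside ? "allow" : "ask";
}
```

Deferring the decision to a permission check like this is what lets the validation layer accept external paths instead of rejecting them outright.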
tanzhenxin
5b550ae7cd
Merge pull request #2858 from QwenLM/fix/anyof-schema-validation-coercion
fix(core): coerce stringified JSON values for anyOf/oneOf MCP tool schemas
2026-04-07 15:52:01 +08:00
wenshao
2717aa1269 feat(review): add "post comments" tip for zero-findings PR review
When PR review finds no issues and --comment was not specified, suggest
"post comments" so the user can formally approve the PR on GitHub.
Without this, the LGTM only appears in terminal — no approval status
on the PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:27:23 +08:00
wenshao
3e5582aa45 fix(review): simplify --comment on unchanged PR — just re-run review
Loading cached findings from a Markdown report is fragile (unstructured
prose, LLM might misparse). Instead, when --comment is specified on an
unchanged PR, simply run the full review. The user explicitly wants
comments posted — spending 7 LLM calls is acceptable.

Removed reportPath from cache schema (no longer needed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:57:53 +08:00
wenshao
6b54aff0e3 fix(review): add reportPath to cache for --comment on unchanged PR
When --comment is used on an unchanged PR, Step 9 needs prior findings
to post. Cache now stores reportPath pointing to the saved report from
Step 10, allowing findings to be loaded without re-running the review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:52:09 +08:00
wenshao
009ac9a6d9 fix(review): enforce exact footer template to prevent LLM rewriting
LLM was ignoring the {{model}} template and writing its own footer
("— Qwen Code /review" instead of "— glm-5.1 via Qwen Code /review").
Added explicit warning: footer must appear EXACTLY as shown, do NOT
shorten or rephrase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:50:01 +08:00
wenshao
a9038a1769 fix(review): 5 issues — CI security, incremental+comment, doc accuracy
1. CI config auto-discovery: read from base branch for PR reviews
   (PR branch is untrusted, malicious PR could inject commands)
2. Incremental early-exit: don't block --comment on unchanged PR —
   allow posting comments from previous review findings
3. Doc: review summary not always posted (Comment verdict skips it)
4. Doc: cross-repo reviews skip report persistence
5. Doc: clarify "Agents 1-4 findings verified" (not all — reverse
   audit findings skip verification)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:40:16 +08:00
tanzhenxin
8df861ac21 fix(core): handle array-form type in anyOf/oneOf variants in getAcceptedTypes
The function only checked for string-form type inside anyOf/oneOf variants,
missing the array form (e.g., type: ["array", "null"]). This mirrors the
handling already done at the top-level property schema.
2026-04-07 13:58:00 +08:00
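A simplified sketch of the idea, not the actual `getAcceptedTypes` implementation: the schema interface and function name below are hypothetical, and only the string-form versus array-form `type` handling comes from the commit message.

```typescript
// Minimal slice of a JSON Schema node relevant to this fix.
interface SchemaNode {
  type?: string | string[];
  anyOf?: SchemaNode[];
  oneOf?: SchemaNode[];
}

// Collects every type name a property schema accepts, handling both
// `type: "array"` and `type: ["array", "null"]` at the top level and
// inside each anyOf/oneOf variant.
function collectAcceptedTypes(schema: SchemaNode): Set<string> {
  const accepted = new Set<string>();
  const addType = (t: string | string[] | undefined): void => {
    if (typeof t === "string") {
      accepted.add(t);
    } else if (Array.isArray(t)) {
      t.forEach((name) => accepted.add(name));
    }
  };
  addType(schema.type);
  const variants = [...(schema.anyOf ?? []), ...(schema.oneOf ?? [])];
  for (const variant of variants) {
    addType(variant.type);
  }
  return accepted;
}
```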
wenshao
4091210b01 fix(review): move incremental check before worktree creation
Previously: fetch → create worktree → incremental check → "no changes"
→ delete worktree (wasted time creating it).

Now: fetch → incremental check (via git rev-parse on fetched ref) →
if no changes, delete ref and stop (no worktree ever created).
Worktree only created when review will actually proceed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:09:15 +08:00
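The reordered flow might be sketched as follows. The interface and function names are hypothetical; the step order and the use of `git rev-parse` on the fetched ref come from the commit message.

```typescript
// Hypothetical hooks around the git operations involved.
interface ReviewSteps {
  fetchRef(): string; // fetches the PR head and returns the ref name
  revParse(ref: string): string; // e.g. wraps `git rev-parse <ref>`
  deleteRef(ref: string): void;
  createWorktree(ref: string): void;
}

// Runs the incremental check before creating a worktree, so an
// unchanged PR never pays the cost of worktree creation.
function startReview(
  steps: ReviewSteps,
  lastReviewedSha: string | undefined,
): "skipped" | "proceeding" {
  const ref = steps.fetchRef();
  const headSha = steps.revParse(ref);
  if (headSha === lastReviewedSha) {
    steps.deleteRef(ref); // no changes: clean up the ref and stop
    return "skipped";
  }
  steps.createWorktree(ref); // only reached when a review will run
  return "proceeding";
}
```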
wenshao
60eba658f7 fix(review): 3 logic holes in Step 9 comment posting flow
1. Step 9 --comment guard blocked "post comments" follow-up path.
   Fixed: allow entry via either --comment flag OR follow-up request.
2. Cross-repo gh pr review needs -R {owner}/{repo}. Added note.
3. "post comments" tip shown even when --comment already set (double
   posting risk). Fixed: tip only shown when --comment NOT specified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:23:23 +08:00
wenshao
13bac8bdcb fix(review): make skip-summary rule more prominent with warning emoji
LLM ignored the "do not submit summary for Comment verdict" rule
despite it being written. Restructured: warning emoji + STOP instruction
first, then the exception cases. The default is "don't post", not
"decide whether to post."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:14:43 +08:00
wenshao
96fa7ac478 fix(review): enforce two-phase posting to prevent duplicate reviews
Root cause: LLM was calling gh pr review for each inline comment
instead of using gh api for comments and gh pr review once for the
verdict. Added explicit "Two-phase posting" instruction: complete ALL
inline comments (gh api) first, THEN submit verdict (gh pr review)
ONCE. Do NOT call gh pr review per comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:13:44 +08:00
wenshao
62ca195adb fix(review): inline comment footer should include full attribution
Was: "— glm-5.1" (model name only)
Now: "— glm-5.1 via Qwen Code /review" (full attribution matching
the review summary footer format)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:10:54 +08:00
wenshao
e671cd350b fix(review): make Maven/Gradle wrapper selection explicit
Commands were hardcoded as ./mvnw and ./gradlew despite text saying
"prefer wrapper if exists, else mvn/gradle". Changed to explicit
conditional: "use ./mvnw if it exists, otherwise mvn" with {mvn}
placeholder in examples. Applied to Step 3 and Agent 5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:15:41 +08:00
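The conditional amounts to a one-line choice. In the sketch below the function name is hypothetical and the existence check is injected so the example stays self-contained; a real caller would likely pass a file-system check.

```typescript
// Picks the wrapper script when present, otherwise the global command.
function pickBuildCommand(
  wrapper: string,
  fallback: string,
  exists: (path: string) => boolean,
): string {
  return exists(wrapper) ? wrapper : fallback;
}
```

For example, `pickBuildCommand("./mvnw", "mvn", exists)` would fill the {mvn} placeholder, and `pickBuildCommand("./gradlew", "gradle", exists)` would do the same for Gradle.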
wenshao
0c5aeb9545 docs: record parallel execution lessons in DESIGN.md
Two rejected alternatives added from real debugging:
1. Verbose agent prompts → exceed output token budget → serial fallback
2. Relaxed "try 3+2" instruction → model always takes the fallback

Key constraint: each agent prompt ≤200 words for all 5 to fit in
one response and launch in parallel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:58:38 +08:00
wenshao
776c9faefc fix(review): enforce short agent prompts to enable parallel launch
Each agent prompt must be under 200 words. With 5 verbose prompts,
the total output exceeds the model's output token limit, forcing
serial execution. Shorter prompts = all 5 fit in one response =
parallel execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:48:34 +08:00
wenshao
c826c24f6e fix(review): skip Agent 5 in cross-repo mode, update token counts
Cross-repo lightweight mode has no local codebase — Agent 5 (build/test)
is pointless. Now launches 4 agents instead of 5 in cross-repo mode.

Updated token count tables in SKILL.md, user doc, and DESIGN.md:
same-repo = 7 LLM calls, cross-repo = 6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:24:59 +08:00
wenshao
608550f2fc fix(review): restore strict parallel agent launch instruction
The relaxed "if you cannot fit 5, try 3+2" gave the model a fallback
that it always took (serial execution). Restored strict requirement:
MUST include all 5 tool calls in one response. The runtime confirms
concurrent agent execution is supported.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:19:03 +08:00
wenshao
205eefe6f9 Merge remote-tracking branch 'origin/main' into feat/review-skill-improvements 2026-04-07 09:15:18 +08:00

wenshao
446d309d53 fix(review): pragmatic parallel agent launch instruction
The strict "all 5 in one response" requirement may exceed the model's
output token limit. Changed to: prefer all 5 in one response, but
allow splitting (e.g., 3+2) as long as agents launch without waiting
for previous ones to finish. Runtime confirms parallel execution is
supported (coreToolScheduler tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:14:12 +08:00
wenshao
e96de40754 fix(review): handle partial inline comment failures + accept *italic*
1. When verdict is Comment and SOME inline comments fail, submit
   a summary with the failed findings (not lost). Only skip summary
   when ALL inline comments succeed.
2. Accept *italic* for model attribution in prose — prettier
   normalizes _italic_ to *italic* in markdown, both render the same.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:06:30 +08:00
wenshao
88ff2ac034 fix(review): prefer Maven/Gradle wrappers (./mvnw, ./gradlew)
Many repos rely on wrappers and don't require global Maven/Gradle
installed. Now prefers ./mvnw over mvn and ./gradlew over gradle
in both Step 3 (linter) and Agent 5 (build/test).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:00:07 +08:00
wenshao
b9d3602309 fix(review): 3 Copilot comments — conditional cleanup, italic format, cache SHA
1. Step 11: conditional worktree removal — skip if Step 8 flagged
   preservation (autofix commit/push failure)
2. Standardize model attribution to _italic_ (was mixed *italic*)
3. Cache stores pre-autofix headRefOid (not worktree HEAD which may
   include the autofix commit)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:58:13 +08:00
wenshao
226c9e2c19 fix(review): CI config discovery runs for ALL projects, not just unmatched
Previously, rule 7 (CI config auto-discovery) only ran for "unrecognized
projects." Java/Makefile (e.g., OpenJDK) and C/C++/Makefile matched
earlier rules that said "skip," so CI config was never read.

Now: language-specific rules 1-6 run first for known tools, then rule 7
runs for ALL projects to discover additional CI-defined checks. OpenJDK
will now discover jcheck and custom targets from .github/workflows/*.yml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 08:53:46 +08:00
wenshao
44241bc8c1 fix(review): 3 issues from self-audit
1. Step 9 code block: annotate Comment template applies only when
   no inline comments were posted (avoids LLM confusion)
2. Step 11: fix stale rationale — Step 9 uses API calls, not worktree
3. Verdict: explicitly based on high-confidence findings only —
   low-confidence findings don't influence PR approval status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:56:52 +08:00
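Point 3 above could be sketched like this. The types and function name are hypothetical, and the mapping from findings to verdicts (Critical means Request changes, nothing means Approve, anything else means Comment) is an assumption pieced together from other commits in this log, not a confirmed rule.

```typescript
type Severity = "Critical" | "Suggestion" | "Nice to have";
type ReviewVerdict = "Approve" | "Request changes" | "Comment";

// Hypothetical representation of one review finding.
interface ReviewFinding {
  severity: Severity;
  highConfidence: boolean;
}

// Low-confidence findings are dropped before the verdict is chosen,
// so they cannot influence the PR's approval status.
function decideVerdict(findings: ReviewFinding[]): ReviewVerdict {
  const counted = findings.filter((f) => f.highConfidence);
  if (counted.some((f) => f.severity === "Critical")) {
    return "Request changes";
  }
  return counted.length === 0 ? "Approve" : "Comment";
}
```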
wenshao
57b7353946 fix(review): skip redundant review summary when inline comments suffice
When verdict is "Comment" and inline comments were posted successfully,
do NOT submit an additional gh pr review — the inline comments are
sufficient. Only submit summary for Approve/Request changes (which
carry approval status) or when no inline comments were posted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:53:38 +08:00
wenshao
dcb58545d2 feat(review): model attribution on inline comments + minimal summary
1. Each inline PR comment now includes model attribution footer
   (e.g., "— qwen3-coder") so reviewers know which model produced
   each comment.
2. When inline comments are posted successfully, the review summary
   is minimal (verdict + model only, no repeated findings). Full
   summary is only used when no inline comments were posted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:52:40 +08:00
wenshao
8f723575bd docs: add CI config auto-discovery to user doc and design doc
- User doc: added "Other" row to language table + explanation that
  CI config is read for unrecognized projects
- DESIGN.md: added "Why auto-discover from CI config" decision
  section + added .qwen/review-tools.md to rejected alternatives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:47:07 +08:00
wenshao
2595567585 feat(review): auto-discover lint/build/test from CI config
For projects that don't match standard tool patterns (e.g., OpenJDK),
Step 3 and Agent 5 now read CI configuration files (.github/workflows,
.gitlab-ci.yml, Jenkinsfile, Makefile) to discover what lint/build/test
commands the project uses. No user configuration needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 07:44:48 +08:00