mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-21 18:46:47 +00:00
10 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ed14a33064
|
feat(core): add NotebookEdit tool for Jupyter notebooks
Adds NotebookEdit as the structured write counterpart to existing notebook read support.
Summary:
- Add `notebook_edit` for safe cell-level `.ipynb` replace/insert/delete operations.
- Integrate notebook editing with tool registration, permissions, Claude conversion, prior-read enforcement, IDE/inline modify flow, commit attribution, docs, and SDK permission docs.
- Harden notebook read/edit behavior for truncated notebook renders, ambiguous fallback cell IDs, internal modify metadata, compact JSON, UTF-8 BOM notebooks, and cache behavior after structural edits.
- Add unit and integration coverage for notebook read/edit behavior.
Follow-up work remains for tab-indented notebook formatting preservation, a few low-risk unit-test additions, and non-blocking hardening suggestions from review. |
||
|
|
1b66f79555
|
feat(cli,core): add Auto approval mode with LLM classifier (#4151)
* feat(cli,core): add Auto approval mode with LLM classifier (#auto-mode)
Add a fifth approval mode positioned between Auto-Edit and YOLO that uses
an LLM classifier to evaluate each tool call and auto-approve safe ones
while blocking risky ones — letting agents work autonomously on long
sessions without forcing users to confirm every shell/network call.
Three-layer filter when L4 returns 'ask'/'default':
L5.1 acceptEdits fast-path: Edit/Write inside workspace -> allow
L5.2 safe-tool allowlist: Read/Grep/LS/TodoWrite/... -> allow
L5.3 LLM classifier: two-stage (fast/thinking) via sideQuery
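The layer order above can be sketched as a single decision function. This is a minimal illustration under stated assumptions: `decideInAutoMode`, `ToolCall`, `AutoDecision`, and the two tool sets are hypothetical names, the classifier is modelled as a synchronous callback for brevity, and only the ordering (L4 verdict, then L5.1, L5.2, L5.3) is taken from the commit message.

```typescript
// Hypothetical sketch: names and signatures are illustrative, not the project's API.
type L4Verdict = 'allow' | 'deny' | 'ask' | 'default';

interface ToolCall {
  tool: string;
  path?: string; // target file for Edit/Write-style tools
}

interface AutoDecision {
  outcome: 'allow' | 'block';
  via: 'rule' | 'acceptEdits' | 'safeTool' | 'classifier';
}

// Illustrative subsets of the tool names mentioned in the commit message.
const EDIT_TOOLS = new Set<string>(['Edit', 'Write']);
const SAFE_TOOLS = new Set<string>(['Read', 'Grep', 'LS', 'TodoWrite']);

function decideInAutoMode(
  l4: L4Verdict,
  call: ToolCall,
  isInsideWorkspace: (path: string) => boolean,
  classify: (call: ToolCall) => 'allow' | 'block',
): AutoDecision {
  // Explicit L4 verdicts short-circuit: the L5 layers only run on 'ask'/'default'.
  if (l4 === 'allow') return { outcome: 'allow', via: 'rule' };
  if (l4 === 'deny') return { outcome: 'block', via: 'rule' };

  // L5.1: edits inside the workspace take the acceptEdits fast path.
  if (
    EDIT_TOOLS.has(call.tool) &&
    call.path !== undefined &&
    isInsideWorkspace(call.path)
  ) {
    return { outcome: 'allow', via: 'acceptEdits' };
  }

  // L5.2: read-only tools on the safe allowlist.
  if (SAFE_TOOLS.has(call.tool)) {
    return { outcome: 'allow', via: 'safeTool' };
  }

  // L5.3: everything else goes to the classifier.
  return classify(call) === 'allow'
    ? { outcome: 'allow', via: 'classifier' }
    : { outcome: 'block', via: 'classifier' };
}
```

Ordering the cheap checks first means the classifier callback is only reached for calls that neither fast path can settle.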
Anti-injection: assistant text and tool results are stripped from the
classifier transcript; each tool projects its args through a new
`toAutoClassifierInput` method to redact sensitive/voluminous fields.
Pending action is rendered as a user-role text turn so it survives the
OpenAI Chat Completions converter (which drops orphan tool_calls).
Safety: fail-closed on classifier failure; denial-tracking caps
3 consecutive blocks / 2 consecutive unavailable before falling back
to manual confirmation; dangerous allow rules (Bash interpreter
wildcards, any Agent/Skill allow) are temporarily stripped while in
AUTO and restored on exit — settings.json is never modified.
Config:
--approval-mode auto # CLI flag
tools.approvalMode: "auto" # settings.json
permissions.autoMode.hints.{allow,deny}: string[] # natural-lang
permissions.autoMode.environment: string[]
* chore(schema): regenerate settings.schema.json after adding tools.approvalMode 'auto'
The autogenerated VS Code settings schema was out of sync with the
runtime SETTINGS_SCHEMA after the AUTO mode addition; CI's Lint job
caught the drift. No behavior change — this is purely the regenerated
output of `npm run generate:settings-schema`.
* test(cli): update expected error message after adding 'auto' to approval-mode choices
Two tests in `loadCliConfig`'s error-path coverage hard-coded the list of
valid approval modes in the expected error string. Add `auto` to match
the runtime message produced by the new five-mode enum.
* test(core): fix autoMode test fixture on Windows
The fixture's mock isPathWithinWorkspace used path.sep to join the root
prefix, but the hard-coded test paths use forward slashes regardless of
OS. On Windows path.sep is '\\', so prefix matching failed and L5.1
fast-path tests returned false (and the L5.1-gating test then fell into
the classifier branch, hitting an undefined getToolRegistry mock).
Hard-code '/' in the fixture — it controls only intra-file consistency
between mock roots and mock paths, not real workspace behavior.
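A small reproduction of the mismatch described above, as a sketch only: `makeIsWithinRoot` is a hypothetical stand-in for the fixture's mock, and `WIN32_SEP` hard-codes the value that `path.sep` has on Windows so the example behaves the same on any OS.

```typescript
// Hypothetical stand-in for the test fixture's mock isPathWithinWorkspace.
const WIN32_SEP = '\\'; // the value of path.sep on Windows

function makeIsWithinRoot(
  root: string,
  sep: string,
): (candidate: string) => boolean {
  // Build "root + separator" so '/mock/workspace-other' does not match.
  const prefix = root.endsWith(sep) ? root : root + sep;
  return (candidate) => candidate === root || candidate.startsWith(prefix);
}

// Test paths in the fixture use forward slashes regardless of OS.
const mockRoot = '/mock/workspace';
const mockFile = '/mock/workspace/src/index.ts';

// Joining with the Windows separator yields the prefix '/mock/workspace\',
// which no forward-slash path starts with.
const withPlatformSep = makeIsWithinRoot(mockRoot, WIN32_SEP);
// Hard-coding '/' keeps mock roots and mock paths consistent.
const withForwardSlash = makeIsWithinRoot(mockRoot, '/');

console.log(withPlatformSep(mockFile)); // false
console.log(withForwardSlash(mockFile)); // true
```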
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cli,core): three asymmetries surfaced by self-review of PR #4151
ACP path (Session.ts) had two asymmetries with the CLI scheduler that
silently degraded AUTO behavior, and the classifier transcript builder
left historical tool_use calls vulnerable to the OpenAI converter's
orphan-tool_call filter on the default Qwen / DashScope backend.
1) ACP runs the classifier even when finalPermission === 'allow'
The CLI scheduler short-circuits when L4 returned 'allow' (user-
explicit rule matched) so the classifier never sees the call. The
ACP duplicate only short-circuits on 'deny'. Mirror the scheduler:
set autoModeAllowed = (finalPermission === 'allow') before the AUTO
L5 block. Without this, a user-written `Bash(git push *)` allow rule
in an ACP session could reach the classifier and be blocked by a
conservative Stage-1 verdict.
2) ACP never records a successful fallback approval
When the denialTracking streak forced fallback, ACP correctly dropped
into requestPermission — but after the user approved, the streak was
never reset. consecutiveBlock stayed at 3, so every subsequent call
re-fell into fallback. The session was permanently downgraded to
manual approval until the mode toggled. Add the post-outcome
recordFallbackApprove call paralleling coreToolScheduler.ts:1705-
1717 (approve outcomes only; cancel/abort preserve the streak).
3) Classifier transcript: historical functionCalls become orphans on
OpenAI-compatible backends
buildClassifierContents kept model.functionCall parts but stripped
tool results entirely (anti-injection). On Anthropic-native APIs
that's fine, but the OpenAI Chat Completions converter
(converter.ts:1422-1455) filters out tool_calls without a matching
tool response, and since the assistant message has no text content
either, the entire turn gets dropped. The classifier on Qwen /
DashScope ended up seeing only user prompts plus the pending action —
zero record of prior tool actions in the chain.
Match ClaudeCode's `buildTranscriptEntries` (yoloClassifier.ts):
render every historical model.functionCall as a user-role text turn
("Prior action: tool(args)") projected through toAutoClassifierInput.
The result contains only user-role text — no functionCall parts,
no assistant tool_calls — so it is converter-agnostic by
construction. Tests updated to assert the new shape and added a
regression guard verifying no functionCall part survives anywhere
in the output.
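The transcript shape described in item 3 might look roughly like the sketch below. The part and content types are simplified assumptions, `buildClassifierTranscript` is a hypothetical name, and the `project` callback stands in for each tool's `toAutoClassifierInput`; only the "Prior action: tool(args)" rendering and the user-role-only output come from the commit message.

```typescript
// Simplified, assumed shapes for conversation history.
interface TextPart {
  text: string;
}
interface FunctionCallPart {
  functionCall: { name: string; args: Record<string, unknown> };
}
interface FunctionResponsePart {
  functionResponse: { name: string; response: unknown };
}
type Part = TextPart | FunctionCallPart | FunctionResponsePart;

interface Content {
  role: 'user' | 'model';
  parts: Part[];
}
interface UserTextTurn {
  role: 'user';
  parts: TextPart[];
}

function buildClassifierTranscript(
  history: readonly Content[],
  // Hypothetical stand-in for toAutoClassifierInput: redacts/projects args.
  project: (tool: string, args: Record<string, unknown>) => string,
): UserTextTurn[] {
  const out: UserTextTurn[] = [];
  for (const turn of history) {
    for (const part of turn.parts) {
      if (turn.role === 'user' && 'text' in part) {
        // User prompts are kept as-is.
        out.push({ role: 'user', parts: [{ text: part.text }] });
      } else if (turn.role === 'model' && 'functionCall' in part) {
        // Historical tool calls become plain user-role text, so no
        // functionCall part exists for a converter to treat as an orphan.
        const { name, args } = part.functionCall;
        out.push({
          role: 'user',
          parts: [{ text: `Prior action: ${name}(${project(name, args)})` }],
        });
      }
      // Assistant text and tool results are dropped (anti-injection).
    }
  }
  return out;
}
```

Because every emitted turn is user-role text, the result does not depend on how a given backend converter pairs tool calls with tool responses.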
ACP fixes have no new unit tests: their logic is mechanically symmetric
with the CLI scheduler branch, the underlying recordFallbackApprove
state machine is covered by denialTracking.test.ts, and adding ACP
integration tests for these two-to-four-line branches would dwarf the
fix itself. The fix correctness is verifiable from the diff against
the existing scheduler comparison.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(core): recordFallbackApprove resets BOTH consecutive counters
Asymmetry caught by copilot[bot] on PR #4151: the original
implementation only cleared consecutiveBlock when the user approved
a fallback prompt, leaving consecutiveUnavailable at its threshold.
A transient classifier API blip (2 consecutive unavailable verdicts)
therefore permanently downgraded the rest of the session to manual
approval — even after the user explicitly approved the prompt —
because every subsequent shouldFallback() call kept seeing the
{reason: 'consecutive_unavailable'} branch.
The fix mirrors recordAllow: a manual approval signals the user
accepted the action and the next call should re-engage the
classifier. If the API is still degraded, the next call simply re-
arms the counter (one unavailable / one block), same recovery curve
as initial onset. No permanent lock-out, and the documented "Counter
resets on user approve or mode switch" behavior from the PR body
now actually holds for both reasons.
Existing test 'does not reset consecutiveUnavailable' was codifying
the bug — replaced with three positive cases (unavailable recovery,
total-counter preservation as telemetry, and the no-op guard).
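The counter behaviour described above can be sketched as a small state machine. This is an assumption-laden illustration: the `DenialTracker` class, the total counters' names, and the `'consecutive_block'` reason string are hypothetical; the thresholds (3 blocks, 2 unavailable), the `'consecutive_unavailable'` reason, and the rule that a fallback approval resets both consecutive counters while preserving totals come from the commit messages.

```typescript
type FallbackReason = 'consecutive_block' | 'consecutive_unavailable';

const MAX_CONSECUTIVE_BLOCK = 3;
const MAX_CONSECUTIVE_UNAVAILABLE = 2;

class DenialTracker {
  consecutiveBlock = 0;
  consecutiveUnavailable = 0;
  // Totals are kept as telemetry and never reset by approvals.
  totalBlock = 0;
  totalUnavailable = 0;

  recordBlock(): void {
    this.consecutiveBlock += 1;
    this.totalBlock += 1;
  }

  recordUnavailable(): void {
    this.consecutiveUnavailable += 1;
    this.totalUnavailable += 1;
  }

  // A classifier allow clears both streaks.
  recordAllow(): void {
    this.consecutiveBlock = 0;
    this.consecutiveUnavailable = 0;
  }

  // Mirrors recordAllow: a manual approval of a fallback prompt re-engages
  // the classifier on the next call, whichever streak triggered the fallback.
  recordFallbackApprove(): void {
    this.consecutiveBlock = 0;
    this.consecutiveUnavailable = 0;
  }

  shouldFallback(): { reason: FallbackReason } | null {
    if (this.consecutiveBlock >= MAX_CONSECUTIVE_BLOCK) {
      return { reason: 'consecutive_block' };
    }
    if (this.consecutiveUnavailable >= MAX_CONSECUTIVE_UNAVAILABLE) {
      return { reason: 'consecutive_unavailable' };
    }
    return null;
  }
}
```

With the pre-fix behaviour (resetting only `consecutiveBlock`), two unavailable verdicts would leave `shouldFallback()` non-null for the rest of the session, which is the lock-out the commit describes.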
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cli,core): address PR #4151 review findings (defense-in-depth + sibling-drift)
20 findings from reviewers wenshao (gpt-5.5 / deepseek-v4-pro / mimo-v2.5-pro)
on PR #4151. Triaged through the five-filter framework, accepted findings
clustered into four root-cause groups + a misc group.
A) Sibling drift: AUTO mode missing in entry-point allowlists
- packages/core/src/agents/background-agent-resume.ts —
`normalizeApprovalMode` now accepts `'auto'`; `reconcileResumedApprovalMode`
now treats `'auto'` as privileged (downgrade in untrusted folder).
- packages/cli/src/nonInteractive/control/controllers/permissionController.ts —
`validModes` for `set_permission_mode` includes `'auto'`; the
non-interactive tool-permission switch handles AUTO (delegates to the
scheduler's classifier).
- packages/cli/src/config/config.ts — non-interactive deny-list switch
adds an AUTO arm that mirrors PLAN/DEFAULT (no fallback UI available).
- packages/sdk-typescript/{types/protocol,types/queryOptionsSchema}.ts —
`PermissionMode` and the SDK `permissionMode` zod enum accept `'auto'`.
- packages/vscode-ide-companion/* — `ApprovalModeValue`, `ApprovalMode`
enum, `APPROVAL_MODE_MAP`, `APPROVAL_MODE_INFO`, `APPROVAL_MODE_VALUES`,
and all ACP-session mode unions now include AUTO.
B) Sub-agent AUTO path (architectural)
- agent.ts: untrusted-folder guard in `resolveSubagentApprovalMode` now
blocks the `AUTO` privileged mode the same way it blocks YOLO / AUTO_EDIT.
- agent.ts: `createApprovalModeOverride(_, AUTO)` now triggers
`PermissionManager.stripDangerousRulesForAutoMode()` on the shared
manager, so the override path matches the top-level entry path.
- agent.ts: `AgentTool.toAutoClassifierInput` forwards the full prompt
(was truncated to 200 chars, which hid attack payloads past character
200 from the classifier while the sub-agent received the full text).
C) Sibling drift: dangerous-rule surface
- dangerousRules.ts: interpreter list expanded with php / lua / julia /
R / rscript / groovy / awk / pwsh / cargo / npm / pnpm / yarn / make /
gradle / mvn / rake / just / eval / exec / source. Token-based
detection now catches multi-word interpreter subcommands
(`bun run *`, `npm run *`), absolute-path forms (`/usr/bin/python3 *`),
and Monitor-tool allow rules with the same logic. Literal concrete
commands (`Bash(npm test)`, `Bash(python script.py)`) are NOT flagged.
- permission-manager.ts: `addSessionAllowRule` / `addPersistentRule`
now stash newly added dangerous allow rules into `strippedAllowRules`
while in AUTO mode, instead of letting an "Always allow" choice on
a fallback prompt persist a broad rule that bypasses the classifier.
- tools/tools.ts: default `toAutoClassifierInput` returns `''` (the
no-security-relevance sentinel) instead of `undefined` (which fell
through to raw args). Third-party MCP tools no longer leak raw
parameters — potentially API keys, tokens, file contents — into the
classifier LLM prompt by default. Internal tools that need their
args inspected for safety override the method explicitly.
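As a rough illustration of the token-based detection described for `dangerousRules.ts` in group C, the sketch below flags a `Bash(...)` allow rule only when its first token resolves to an interpreter name and the rule contains a wildcard. Everything here is an assumption for illustration: the function name, the `Bash(...)` parsing, and the interpreter set (a small subset of the names listed above) are not the project's implementation, and the real logic is likely more nuanced.

```typescript
// Illustrative subset only; the commit lists many more interpreters.
const INTERPRETERS = new Set<string>([
  'python', 'python3', 'bun', 'npm', 'php', 'lua', 'awk', 'pwsh', 'make',
]);

function isDangerousBashAllowRule(rule: string): boolean {
  // Only rules shaped like "Bash(<command pattern>)" are considered here.
  if (!rule.startsWith('Bash(') || !rule.endsWith(')')) return false;
  const inner = rule.slice('Bash('.length, -1).trim();
  if (inner.length === 0) return false;

  const tokens = inner.split(/\s+/);

  // Literal concrete commands (no wildcard anywhere) are not flagged.
  const hasWildcard = tokens.some((t) => t.includes('*'));
  if (!hasWildcard) return false;

  // Absolute-path forms: compare the basename, e.g. /usr/bin/python3 -> python3.
  const first = tokens[0];
  const basename = first.slice(first.lastIndexOf('/') + 1);

  // Multi-word subcommands such as "bun run *" are caught because only the
  // leading token needs to be an interpreter.
  return INTERPRETERS.has(basename);
}
```

Checking tokens rather than matching whole strings is what lets one rule cover `bun run *`, `npm run *`, and `/usr/bin/python3 *` without listing each spelling.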
D) Classifier defense-in-depth (architectural)
- autoMode.ts: `send_message` removed from SAFE_TOOL_ALLOWLIST so the
classifier sees destination + body and can judge inter-agent steering.
- autoMode.ts: when `pmForcedAsk=true` (user wrote an explicit ask
rule), the function now returns `{ via: 'fallback' }` instead of
falling through to the classifier — honoring the documented "ask
rules force manual confirmation" guarantee.
- classifier.ts: new `sanitizeClassifierReason` strips angle-bracket
pseudo-tags, collapses whitespace, and clamps length to 200 chars;
applied at the stage-2 boundary so `decision.reason` cannot smuggle
a `<system>...` payload into the main model's tool-error message.
- classifier.ts: `buildClassifierContents` /
`buildClassifierSystemPrompt` are now wrapped in a try/catch that
funnels to the existing `failClosed` handler, so any pathological
input (circular projected args, registry lookup error, …) becomes
an `unavailable=true` block result instead of crashing the
tool-execution loop.
- classifier-transcript.ts: transcript now truncates to the most
recent 40 messages so long autonomous sessions don't overflow the
fast classifier's context window — which would otherwise tip the
session into the `consecutive_unavailable` fallback after two
overflow-induced failures.
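Two of the group D hardening steps, the fail-closed wrapper and the 40-message cap, can be sketched together. The names `classifySafely`, `truncateTranscript`, and the `ClassifierResult` shape are hypothetical; the sketch only assumes what the bullets state, namely that input-building errors become an `unavailable=true` block result and that only the most recent 40 messages are kept.

```typescript
interface ClassifierResult {
  decision: 'allow' | 'block';
  unavailable: boolean;
  reason: string;
}

const MAX_TRANSCRIPT_MESSAGES = 40;

// Keep only the most recent messages so long sessions stay within context.
function truncateTranscript<T>(
  messages: readonly T[],
  max: number = MAX_TRANSCRIPT_MESSAGES,
): T[] {
  if (max <= 0) return [];
  return messages.slice(-max);
}

function failClosed(reason: string): ClassifierResult {
  return { decision: 'block', unavailable: true, reason };
}

// Any exception while building or running the classifier input becomes a
// block result instead of propagating into the tool-execution loop.
function classifySafely(
  buildInput: () => string,
  run: (input: string) => ClassifierResult,
): ClassifierResult {
  try {
    return run(buildInput());
  } catch {
    return failClosed('classifier input could not be built');
  }
}

// Usage: circular args make JSON.stringify throw, which is caught above.
const circular: Record<string, unknown> = {};
circular['self'] = circular;
const result = classifySafely(
  () => JSON.stringify(circular),
  () => ({ decision: 'allow', unavailable: false, reason: 'ok' }),
);
console.log(result.decision, result.unavailable); // block true
```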
E) Misc
- coreToolScheduler.ts + Session.ts: `finalPermission === 'allow'`
path now calls `recordAllow` in AUTO mode so an explicit allow-rule
match resets the denialTracking streak (otherwise a 3-block streak
would silently force the next classifier-eligible call into manual
approval right after an allow-ruled call just worked).
- useAutoAcceptIndicator.ts: mount-time effect emits the first-time
AUTO information notice + stripped-rules notice when the session
starts already in AUTO (`--approval-mode auto` flag or
`tools.approvalMode: "auto"` in settings). Previously the notices
only fired on Shift+Tab / `/approval-mode` switches.
Test updates:
- permissions/autoMode.test.ts: SAFE_TOOL_ALLOWLIST snapshot updated
(no longer contains send_message). pmForcedAsk regression test now
asserts the new `via: 'fallback'` semantics.
- permissions/dangerousRules.test.ts: 25 new cases covering extended
interpreter list, multi-word subcommands, absolute paths, and
Monitor tool.
- tools/toAutoClassifierInput.test.ts: AgentTool now asserts full-
prompt passthrough rather than 200-char truncation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(vscode-ide-companion): include 'auto' in NEXT_APPROVAL_MODE cycle
The cycle map in `acpTypes.ts` is typed as
`{ [k in ApprovalModeValue]: ApprovalModeValue }`. After adding `'auto'`
to `ApprovalModeValue` in the previous commit, this map became missing
the `auto` arm — caught by CI's tsc check (`error TS2741: Property 'auto'
is missing`). Add it between `auto-edit` and `yolo` so the cycle order
remains plan → default → auto-edit → auto → yolo → plan, matching the
core APPROVAL_MODES ordering.
Local lint/typecheck only — not introduced or surfaced by review.
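A sketch of why the mapped type catches the omission: with `{ [k in ApprovalModeValue]: ApprovalModeValue }`, leaving out any key of the union is a compile-time error. The string values `'plan'`, `'default'`, and `'yolo'` are assumed spellings here; the type shape and the cycle order are taken from the message above.

```typescript
// Assumed string values for the union; 'auto-edit' and 'auto' appear in the commit.
type ApprovalModeValue = 'plan' | 'default' | 'auto-edit' | 'auto' | 'yolo';

// Every member of the union must appear as a key, or tsc reports TS2741.
const NEXT_APPROVAL_MODE: { [k in ApprovalModeValue]: ApprovalModeValue } = {
  plan: 'default',
  default: 'auto-edit',
  'auto-edit': 'auto',
  auto: 'yolo',
  yolo: 'plan',
};

// Walk the cycle once, starting from 'plan'.
function cycleFrom(start: ApprovalModeValue): ApprovalModeValue[] {
  const seen: ApprovalModeValue[] = [start];
  let current = NEXT_APPROVAL_MODE[start];
  while (current !== start) {
    seen.push(current);
    current = NEXT_APPROVAL_MODE[current];
  }
  return seen;
}

console.log(cycleFrom('plan').join(' -> '));
// plan -> default -> auto-edit -> auto -> yolo
```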
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(core): silence two CodeQL findings on PR #4151
CodeQL 223 — Incomplete multi-character sanitization
(packages/core/src/permissions/classifier.ts:258)
A single `/<[^>]*>/g` pass can leave residual angle-brackets when the
input is crafted to overlap (e.g. `<scr<script>ipt>`). In our actual
use case the sanitized string is a prompt fragment, not HTML output,
so a "reconstituted script tag" doesn't matter — but iterating the
strip until the string stabilises is cheap defense-in-depth and
removes the warning. Bounded by 8 iterations so the loop is always
O(n) regardless of how the attacker structures the input.
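The bounded strip-until-stable loop might look like the sketch below. The function name `sanitizeReason` and the exact step order are assumptions; the regex, the 8-iteration bound, the whitespace collapse, and the 200-character clamp are the values mentioned in these commit messages.

```typescript
const MAX_STRIP_PASSES = 8;
const MAX_REASON_LENGTH = 200;

function sanitizeReason(raw: string): string {
  let out = raw;
  // Re-run the tag strip until nothing changes, with a fixed upper bound
  // so the total work stays proportional to the input length.
  for (let pass = 0; pass < MAX_STRIP_PASSES; pass++) {
    const next = out.replace(/<[^>]*>/g, '');
    if (next === out) break;
    out = next;
  }
  // Collapse whitespace runs and clamp the length.
  return out.replace(/\s+/g, ' ').trim().slice(0, MAX_REASON_LENGTH);
}

console.log(sanitizeReason('blocked <system>ignore</system>   now'));
// blocked ignore now
```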
CodeQL 222 — Polynomial regex on uncontrolled data
(packages/core/src/permissions/dangerousRules.ts:93)
The regex `/[*]+$/` is actually linear (single-character class + `$`
anchor, no backtracking), but CodeQL flags any `replace(<regex>, ...)`
applied to user-controlled input. Replace the regex with a manual
trailing-`*` strip via `slice` + a counted loop — same semantics,
no regex engine involved, warning cleared.
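A regex-free equivalent of `replace(/[*]+$/, '')` can be written with a counted loop and `slice`, as in this sketch. The function name `stripTrailingStars` is hypothetical.

```typescript
function stripTrailingStars(input: string): string {
  let end = input.length;
  // Walk back from the end while the last remaining character is '*'.
  while (end > 0 && input[end - 1] === '*') {
    end -= 1;
  }
  return input.slice(0, end);
}

console.log(JSON.stringify(stripTrailingStars('npm run **'))); // "npm run "
console.log(JSON.stringify(stripTrailingStars('a*b'))); // "a*b"
```

Each character is visited at most once, so the loop is linear in the number of trailing `*` characters.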
Existing tests cover both branches (classifier transcript sanitizer
test suite, dangerousRules interpreter coverage). No regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cli,core,docs): address 4 non-blocker findings from PR #4151 review
Top-level review on
|
||
|
|
1d4c17b1ce |
fix: rename /plan execute to /plan exit
Rename the subcommand to accurately reflect its behavior (exits plan mode and restores previous approval mode, does not trigger execution). Update source, tests, i18n keys (6 locales), and docs. |
||
|
|
2e089cce71 |
fix(docs): fix mode count and update plan example in approval-mode docs
- Fix "three distinct permission modes" → "four" (Plan was always listed)
- Update refactor example to use /plan command instead of /approval-mode
- Fix grammar in example description
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
ee8fcc9230 |
docs: add /plan command usage to approval mode documentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> |
||
|
|
3296785b23 | feat use tab on windows instead of shift+tab | ||
|
|
a4e3d764d3 | docs: updated all links, click and open in vscode, new showcase video in overview | ||
|
|
6e9d4f1e3e | docs: delete docs folder, updated links & content | ||
|
|
9fd4f58c16 |
docs: Add detailed documentation for Qwen Code's approval modes and usage
- Introduced a comprehensive guide on the four permission modes: Plan, Default, Auto-Edit, and YOLO, including their use cases and risk levels.
- Updated the overview and quickstart documentation for clarity and consistency.
- Removed the outdated CLI reference document and integrated relevant content into the updated documentation.
- Improved command creation examples and best practices for custom commands. |
||
|
|
bfe8133ea3 | feat: refactor docs |