qwen-code/docs/e2e-tests/worktree.md
顾盼 609e05baee
feat(tools): add generic worktree support — EnterWorktree/ExitWorktree + Agent isolation (#4073)
* feat(tools): add generic worktree support (Phase A + B of #4056)

Adds first-class git worktree as a general-purpose capability:

Phase A — User-facing tools
- enter_worktree: creates `<projectRoot>/.qwen/worktrees/<slug>` on a
  `worktree-<slug>` branch and returns the absolute path. Slug auto-generated
  when omitted; validated against path traversal and disallowed characters.
- exit_worktree: keeps or removes the worktree (and its branch). Refuses to
  remove a worktree with uncommitted tracked changes or untracked files
  unless `discard_changes: true` is set.

Phase B — Agent isolation
- Agent tool gains an `isolation: 'worktree'` parameter that provisions a
  temporary `agent-<7hex>` worktree, prepends a worktree notice to the task
  prompt, and on completion either removes the worktree (no changes) or
  preserves it and reports its path/branch in the result. Background and
  foreground execution paths both wired up; rejected for fork agents.
- worktreeCleanup.cleanupStaleAgentWorktrees: fail-closed sweep for
  ephemeral `agent-<7hex>` worktrees older than 30 days with no tracked
  changes and no unpushed commits. User-named worktrees are never swept.
- buildWorktreeNotice helper for fork subagents (parity with claude-code).

Arena compatibility
- The existing Arena worktree implementation (GitWorktreeService.setupWorktrees,
  ArenaManager, agents.arena.worktreeBaseDir) is untouched. Arena uses its
  own batch APIs and `~/.qwen/arena` base dir; the new general-purpose APIs
  live alongside under `<projectRoot>/.qwen/worktrees/`.

Subagent safety
- enter_worktree / exit_worktree are added to EXCLUDED_TOOLS_FOR_SUBAGENTS
  so a subagent cannot mutate the parent session's worktree state.

Refs #4056

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(worktree): use path.join in expected paths so the test passes on Windows

The Windows CI run reported `enter-worktree.test.ts` failing because the
expected string was hardcoded with `/` while `getUserWorktreesDir()` uses
`path.join`, which returns `\\` on Windows. Build the expected path via
`path.join` so the platform-correct separator is compared.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(enter-worktree): treat empty name as auto-generate

Some models pass `{ "name": "" }` when calling EnterWorktree, because the
schema marks `name` as optional and they emit an empty placeholder. The
previous validation rejected the empty string with "Worktree name must be
a non-empty string", which surprised users running the auto-slug path.

Now both `validateToolParams` and `execute` treat `name: ""` as equivalent
to `name: undefined` and fall back to the auto-generated `{adj}-{noun}-{4hex}`
slug. Explicit invalid slugs (`'../etc'`, `'a/b'`, etc.) are still rejected
as before.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review findings 1-6 from PR #4073

Six issues raised on the initial review; each addressed with a verifiable
guarantee.

1. Real isolation for `agent isolation: 'worktree'`
   Before: subagent's Config still resolved `getTargetDir()` to the parent
   project root, so Edit/Write/Read workspace checks and Shell's default cwd
   silently operated on the parent tree. The cleanup helper then saw a
   "clean" worktree and removed it — destroying the evidence.
   After: the worktree is provisioned BEFORE `createApprovalModeOverride`,
   and the resulting agent Config has `getTargetDir`/`getCwd`/`getWorkingDir`
   rebound to the worktree path. Relative paths, unqualified shell
   commands, and glob/grep roots all confine to the worktree.

2. `exit_worktree action='remove'` now prompts in default/auto-edit modes
   Added `getDefaultPermission()` on the invocation: `'ask'` when action is
   `remove`, `'allow'` when `keep`. Brings it in line with edit, write_file,
   and run_shell_command.

3. Force-delete no longer silently destroys unpushed commits
   `removeUserWorktree` now uses `git branch -d` (refuses unmerged) by
   default and surfaces `branchPreserved: true` when git refuses. Added
   `hasUnmergedWorktreeCommits` (checks if branch tip is reachable from any
   other local branch or remote ref). Both the agent isolation cleanup and
   `exit_worktree action='remove'` use this check: if the branch has work
   not covered elsewhere, the worktree+branch are preserved even when
   `discard_changes: true` is set (there is no `discard_commits` flag —
   committed work is rarely what `remove` means to discard).

4. Both new tools are now deferred behind ToolSearch
   `shouldDefer: true` + `searchHint` on both. Verified via openai-logging:
   `enter_worktree` and `exit_worktree` no longer appear in the function-
   declaration list sent on every API request.

5. Stale-worktree cleanup is wired in
   `Config.initialize()` fires `cleanupStaleAgentWorktrees(targetDir)` as a
   non-awaited startup sweep (skipped in bare mode). Picks up orphaned
   `agent-<7hex>` worktrees left by crashed runs.

6. Foreground isolation no longer leaks on uncaught throw
   The foreground try block tracks whether the cleanup helper ran on the
   success path; the finally block invokes it as a fallback when the try
   bailed early. Mirrors the background path's pattern.

Verification:
- Unit tests: 83 passed (16 worktree + 64 existing agent + 3 cleanup) — no
  regressions.
- E2E #1: agent told to write `hello.txt` via RELATIVE path — file landed
  at `.qwen/worktrees/agent-XXXXXXX/hello.txt`, NOT at the parent root.
- E2E #3: created worktree, committed work inside it, called exit_worktree
  with `discard_changes=true` — refused with clear message; worktree and
  branch both preserved.
- E2E #4: openai-logging confirms worktree tools absent from API tool list
  (7 tools sent instead of 9).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review round 2 findings (1 from tanzhenxin, 7+8 from wenshao)

The first round closed the data-loss-class issues. This round addresses
follow-ups from a deeper audit:

1. Stale-worktree sweep was inert on common-case repos
   `cleanupStaleAgentWorktrees` previously ran `git log --branches --not
   --remotes --oneline` from each worktree's directory — that lists
   unpushed commits across EVERY local branch, not just the worktree's
   own branch. On any repo with no remote configured (or with stray
   unpushed branches), the sweep refused to remove every candidate.
   Replaced with `service.hasUnmergedWorktreeCommits(slug)` which scopes
   the check to the worktree branch via `for-each-ref --contains <tip>`.
   Also added the `branchPreserved` warn log requested in M7 and an
   `fs.access` shortcut for the empty-worktrees-dir case (M8).

2. `cleanupWorktreeIsolation` and `worktreeIsolation` were inside the
   inner try (~660 lines from the outer catch). Hoisted both to the top
   of `execute()` so the outer catch can reap or preserve the worktree
   when anything between provisioning and the inner try throws (e.g.
   `createApprovalModeOverride`, agent creation). Closure carries the
   resolved `repoRoot` so cleanup never has to re-resolve.

3. Background error path discarded the cleanup result. Now captures
   `formatWorktreeSuffix(...)` and appends it to the registry's failure
   /cancel message, so users see the preserved path/branch even when
   the agent crashed before reporting.

4. `cleanupWorktreeIsolation` now treats `result.success === false` as
   "worktree still on disk" and surfaces it as preserved instead of
   silently dropping it from the result.

5. Override was incomplete. Several Config methods read `this.targetDir`
   directly (`getProjectRoot`, `getFileService`, etc.) — own-property
   getter overrides did not redirect them. Now also shadows `targetDir`
   and `cwd` as own properties on the agent's Config override, swaps in
   a `FileDiscoveryService` rooted at the worktree, and rebuilds
   `WorkspaceContext` to point at the worktree only. Verified
   end-to-end: shell `pwd > pwd-record.txt` (no directory arg) lands at
   `.qwen/worktrees/agent-<7hex>/pwd-record.txt`, not the parent root.

6. monorepo subdir issue. Both `enter_worktree` and the agent isolation
   path now resolve `git rev-parse --show-toplevel` first and anchor
   `.qwen/worktrees/<slug>` at the repo root. Worktrees created from
   any subdirectory now end up where the startup sweep can find them.

7. Replaced `git worktree add -B` (silent force-reset of pre-existing
   branches) with `git worktree add -b` plus an explicit existence
   check via `git for-each-ref` (NOT `show-ref --quiet`, which
   simple-git swallows). Pre-existing `worktree-<slug>` branches now
   trigger a clear error instead of clobbering committed work.

8. First worktree creation in a repo writes `<projectRoot>/.qwen/.gitignore`
   with `worktrees/` so worktree contents stay out of the parent's
   `git status`, glob/grep results, and bundle tools. Idempotent: never
   overwrites an existing file.

9. Logging across the failure paths (`enter_worktree` errors,
   `agent.ts:failWorktreeProvisioning`, `cleanupWorktreeIsolation`,
   `hasUnmergedWorktreeCommits` swallowed errors,
   `cleanupStaleAgentWorktrees`'s `branchPreserved` race).

10. `exit_worktree` no longer suggests `discard_changes: true` when the
    git status check itself fails — that would be advising the user to
    bypass a safety check whose precondition is unknown. Now points at
    the underlying repo problem.

11. `generateAutoSlug` switched from `Math.random()` (4 hex, weak RNG,
    one-in-65k collision) to `randomBytes` (6 hex, ~16M combinations).
    Two RNG sources in this file collapsed to one.

Pushed back: the TOCTOU swap in `removeUserWorktree` (S6 round 1) is
left as-is — `git branch -d` is the real safety, and reordering does
not eliminate the window. Windows reserved-name validation (M5 round 2)
deferred to a follow-up; the current allowlist already rejects path
separators, `..`, leading dot/dash, and the >64-char case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): use randomInt to silence CodeQL biased-modulo finding

CodeQL's `js/biased-cryptographic-random` flagged
`randomBytes(4)[i] % ARRAY.length` in `generateAutoSlug`. The math is
actually exact for the current word-list lengths (256 % 8 == 0), but
the lint rule does not know that — and a future contributor changing
the list to a non-power-of-two length would silently introduce bias.

Switched the index lookups to `crypto.randomInt(0, length)`, which uses
rejection sampling and is uniform by construction. Suffix still uses
`randomBytes(3).toString('hex')` since hex encoding is unbiased.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review round 3 findings 1-6 from PR #4073

The previous round added `getRepoTopLevel` for `enter_worktree`'s
provisioning, but missed three sibling call sites that still used the
raw cwd. The double-cleanup race in the foreground path also leaked
stale `[worktree preserved]` suffixes on rejected promises. All six
findings from the deeper audit are addressed:

1. exit_worktree now resolves through `getRepoTopLevel()` before
   building its `GitWorktreeService`, mirroring `enter_worktree`. Without
   this, launching `qwen` from a monorepo subdirectory created the
   worktree under the repo root but exit_worktree looked under the
   subdir's `.qwen/worktrees/` and always returned "Worktree not found".
   Verified end-to-end: enter + exit from `packages/core/` works.

2. agent.ts cleanup helper now nulls `worktreeIsolation` immediately
   after capturing the closure value. The previous structure could
   reach the helper twice — once in the foreground try's success path
   and once in the foreground finally fallback (or once in the inner
   try and once in the outer catch on a thrown rejection). The second
   call would `hasWorktreeChanges()` against a directory the first
   call already removed, fail-closed, and emit a bogus
   `[worktree preserved: <missing path>]` suffix.

3. Config.initialize's startup sweep now resolves `getRepoTopLevel()`
   before invoking `cleanupStaleAgentWorktrees`. Without this, every
   subdir launch scanned a non-existent `<subdir>/.qwen/worktrees/`
   and the 30-day expiry sweep was permanently a no-op.

4. agent.ts's `buildWorktreeNotice` now passes
   `worktreeIsolation.repoRoot` as `parentCwd` instead of
   `this.config.getTargetDir()`. The notice's path-translation
   guidance (≈ "translate paths from <parent> to <worktree>") would
   otherwise misdirect the subagent in a monorepo subdir launch.

5. Removed dead method `GitWorktreeService.listUserWorktrees`. It had
   no callers anywhere in the codebase and used `execSync` in a loop
   (would have blocked the event loop if anyone wired it up).

6. `localBranchExists` no longer swallows git failures silently. The
   defensive `false` default is preserved (so `git worktree add -b`
   itself surfaces the conflict if the check missed an existing
   branch), but the catch now logs via `debugLogger.warn` so disk-full
   / permission / ref-store-corruption cases are visible in debug
   output instead of being invisible.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review round 4 findings (data-loss + visibility)

Seven actionable findings from a deeper audit, all closed:

1. User worktree slugs could collide with ephemeral-agent shape
   `validateUserWorktreeSlug` did not reject names starting with
   `agent-`, so a user-named `agent-1234567` matched the cleanup regex
   `/^agent-[0-9a-f]{7}$/` and would be silently swept after 30 days
   along with whatever work was in it. Now reserved — clear error
   message points users at the cause.

2. Slug producer and consumer were string-coupled across files
   `agent.ts` hardcoded `agent-${hex(7)}` and `worktreeCleanup.ts`
   independently hardcoded `/^agent-[0-9a-f]{7}$/`. Future change to
   hex length on one side would silently break the other. Lifted
   `AGENT_WORKTREE_PREFIX`, `AGENT_WORKTREE_HEX_LENGTH`,
   `AGENT_WORKTREE_SLUG_PATTERN`, and `generateAgentWorktreeSlug()` to
   `gitWorktreeService.ts`; both call sites import them.

3. Startup sweep was invisible at default log level
   Fire-and-forget sweep used `debug` for errors and discarded the
   success count. A leak-chasing operator had no log breadcrumb.
   Errors promoted to `warn`; successful removals (count > 0) logged
   at `info`.

4. `getRepoTopLevel()` silent catch
   Returned `null` on any git failure with no log. Combined with
   `?? cwd` fallback in callers, a flaky git would have made worktree
   creators and the startup sweep disagree silently about which dir to
   use. Now logs the underlying error.

5. `hasTrackedChanges()` silent catch
   Cleanup's fail-closed `return true` had no log. Couldn't tell
   "has real changes — leave alone" from "git index unreadable — repo
   may be corrupt". Now logs.

6. `cleanupWorktreeIsolation` claimed `preservedPath` for a removed dir
   When `removeUserWorktree` returns `{ success: true, branchPreserved:
   true }` it has already deleted the directory and failed only on
   `git branch -d`. The helper still reported the (now non-existent)
   path as preserved. Now returns only `preservedBranch` for that
   case; `formatWorktreeSuffix` emits a distinct message instructing
   recovery via `git worktree add <new-path> <branch>`.

7. `removeUserWorktree` swallowed branch-delete failures
   Both `-d` and `-D` catch blocks were empty. Locked refs, perms,
   disk full all looked identical to "unmerged commits". Both now
   `debugLogger.warn` with the underlying error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(worktree): self-review pass — reuse, parallelism, dead code

Self-review caught a handful of issues across three categories:

Reuse:
- `pathExists` in the new code now uses the existing `fileExists` from
  `utils/fileUtils.ts` instead of duplicating an `fs.access` wrapper.
- `worktree-` branch prefix was string-literalled in five places. Added
  `WORKTREE_BRANCH_PREFIX` and `worktreeBranchForSlug(slug)` exports in
  `gitWorktreeService.ts`; updated `gitWorktreeService.ts`,
  `worktreeCleanup.ts`, and `exit-worktree.ts` to use them. Future
  prefix changes are a single edit.

Efficiency:
- `Config.initialize` used two `await import(...)` calls inside the
  startup-sweep IIFE, paying that cost on every CLI start. Switched to
  static imports at the top of `config.ts` — the modules are tiny and
  the dynamic indirection bought nothing.
- `cleanupWorktreeIsolation` in `agent.ts` ran `hasWorktreeChanges` and
  `hasUnmergedWorktreeCommits` sequentially. They have no data
  dependency on each other and each spawns its own `git` invocation;
  `Promise.all` halves the cleanup wall-clock on the common path.
  Same fix in `worktreeCleanup.ts`'s per-entry loop.
- `ensureWorktreesGitignored` used `fs.access` then `fs.writeFile`, a
  TOCTOU race when two agent invocations created worktrees concurrently
  (both could pass the `access` check and the second would clobber the
  first's `.gitignore`). Now writes with `flag: 'wx'` and treats
  `EEXIST` as the no-op case — atomic in one syscall.

Quality:
- Dropped the `worktreeCleanupRan` boolean in the foreground execution
  path. `cleanupWorktreeIsolation` already nulls its closure variable
  at the top of every call (see the comment at its definition), so
  re-entries are no-ops. The boolean and its tracking were dead weight
  that obscured the real guard.
- Trimmed the Phase-2 override comment block to drop the WHAT-stating
  enumerations (items 3 and 4 just narrated the lines below) and
  removed a navigation comment about hoisted helpers — the helpers are
  visible at the top of the same method.

84 unit tests pass; typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review round 5 — design-doc commitments + correctness

Five critical findings + four suggestions, all closed.

Critical:
1. Wrong base branch for agent isolation. `createUserWorktree(slug)` with
   no `baseBranch` arg fell back to `getCurrentBranch()` on the **main**
   working tree, returning `main` regardless of which branch the user
   was actually on. A subagent invoked from `feature-x` would silently
   start from `main` and produce diffs against the wrong baseline.
   `enter_worktree` had the same bug. Both now resolve the parent's
   current branch first and pass it explicitly. Verified end-to-end:
   `git checkout feature-x` → `enter_worktree` → worktree HEAD includes
   the feature-x commit.

2. `countWorktreeChanges` (used by `exit_worktree`'s dirty-state guard)
   missed `status.conflicted[]`. In simple-git that array is mutually
   exclusive with the staged/modified/etc. arrays, so a worktree
   mid-merge with only conflicts looked `{tracked: 0, untracked: 0}`
   to the guard and `action='remove'` would proceed without
   `discard_changes: true`. Added `+ status.conflicted.length`.

3. `exit_worktree` had no session-ownership check, contradicting the
   design doc's "only operates on worktrees created by THIS session".
   In yolo mode a prompt injection could enumerate `.qwen/worktrees/`
   and pass any name to drop another session's work. Now:
   `enter_worktree` and agent isolation write a `.qwen-session`
   marker into the worktree at provisioning time; `exit_worktree
   action='remove'` reads it and refuses if it does not match the
   current `Config.getSessionId()`. Worktrees from before this guard
   (no marker file) are treated as "owner unknown" — allowed with a
   warn log so the change is observable.

4. `enter_worktree` did not refuse nested invocations from inside an
   existing worktree, contradicting the design doc. Now rejects any
   cwd containing `.qwen/worktrees/` as a path component, with a
   clear "Already inside a git worktree…" message. Verified: enter
   from inside a worktree returns is_error with that text.

6. `hasTrackedChanges` (cleanup sweep) had the same `conflicted[]`
   gap. Rewrote to use raw `git status --porcelain --untracked-files=no`
   which lists every tracked change including `UU` conflict markers
   in a single git call and explicitly skips the untracked walk
   (the prior comment claimed to skip it, but `status()` always
   does the scan).

Suggestion:
7. `buildWorktreeNotice` now receives the parent agent's actual
   `getTargetDir()` again (was switched to `repoRoot` in round 3 on
   a different reviewer's suggestion; round-5 caught that the model's
   inherited paths reference the parent's cwd, not necessarily the
   repo root, so the prior behaviour was correct).

8. Startup sweep now does `fs.access(<targetDir>/.qwen/worktrees)`
   *before* importing GitWorktreeService and spawning `git
   rev-parse --show-toplevel`. The git probe is reserved for users
   who actually have a worktrees directory locally — 99% of users
   pay only one syscall on startup.

9. Tests:
   - New `exit-worktree.test.ts` covers metadata, validation,
     `getDefaultPermission` (ask vs allow), and getDescription.
   - `agent.test.ts` adds three `validateToolParams` cases for the
     `isolation` parameter (accepted with subagent_type, rejected
     without, rejected for non-"worktree" values).
   - `enter-worktree.test.ts` adds round-trip tests for
     `writeWorktreeSessionMarker` / `readWorktreeSessionMarker` plus
     a `worktreeBranchForSlug` sanity check.
   - Total: 101 tests pass (was 86 → +15).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): drop unused @ts-expect-error in exit-worktree.test.ts

Empty string `''` is a valid `string` type, so the @ts-expect-error
directive on `validateToolParams({ name: '', action: 'keep' })` did
nothing — TypeScript correctly accepted the line, and `tsc --build`
in CI reported TS2578 ("Unused '@ts-expect-error' directive"). The
runtime assertion already covers the case; the directive was leftover
from an earlier draft.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(test): use importActual in ArenaManager mock to preserve new exports

The Arena test mocks `gitWorktreeService.js` with a factory that
returns only `{ GitWorktreeService }`. PR #4073 added several other
exports to that module (`AGENT_WORKTREE_SLUG_PATTERN`,
`WORKTREE_BRANCH_PREFIX`, `worktreeBranchForSlug`,
`generateAgentWorktreeSlug`, `writeWorktreeSessionMarker`,
`readWorktreeSessionMarker`, `WORKTREE_SESSION_FILE`).

Other modules in the dep graph reach the mocked surface — most
notably `worktreeCleanup.ts` imports `AGENT_WORKTREE_SLUG_PATTERN`
and `worktreeBranchForSlug`, and now reaches the mock via the static
`config.ts` → `worktreeCleanup.ts` import chain added in the
self-review pass. The Arena test failed at module-load with:

  Caused by: Error: [vitest] No "AGENT_WORKTREE_SLUG_PATTERN" export
  is defined on the "../../services/gitWorktreeService.js" mock. Did
  you forget to return it from "vi.mock"?

Use `importOriginal` to capture every real export, spread it into
the return object, and only replace `GitWorktreeService` (the class
the test actually needs to mock). The class-level mock keeps its
existing static-method shims.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): address review round 6 (5 critical + 6 suggestions)

The biggest item — #1 — is a self-inflicted regression from round 5:
the new agent- prefix reservation in `validateUserWorktreeSlug`
rejected EVERY slug that `generateAgentWorktreeSlug` produces, since
that helper emits exactly `agent-<7hex>`. Net effect: every
`AgentTool isolation: 'worktree'` invocation failed at validation.
The reservation now allows the canonical pattern through (everything
the helper can produce) and only rejects user-chosen `agent-*` names
that don't match it. Added a round-trip regression guard: 50
`generateAgentWorktreeSlug()` outputs are fed back through
`validateUserWorktreeSlug` and must all pass.

Other critical fixes:

2. `hasWorktreeChanges` (used by agent isolation cleanup) was the
   one remaining caller relying solely on `status.isClean()`.
   Defensive `|| status.conflicted.length > 0` so a future simple-git
   bookkeeping change can't let a mid-merge worktree appear clean and
   get auto-deleted.

3. `readWorktreeSessionMarker` swallowed every I/O error as "marker
   missing", which let a disk error / EACCES silently bypass the
   session-ownership guard. ENOENT is still treated as missing
   (legitimate); every other code now logs.

4. `exit_worktree` `fs.stat` catch was the same shape — every error
   collapsed to "Worktree not found". ENOENT → not found; everything
   else logs and returns a distinct "cannot access" error.

5. `cleanupStaleAgentWorktrees` `fs.stat` catch was again the same.
   ENOENT → silently skip (entry vanished between readdir and stat);
   everything else logs.

Suggestions:

6. Startup sweep fast-bail was running BEFORE resolving the repo
   top-level. For monorepo subdir launches, `targetDir/.qwen/worktrees`
   never exists and the sweep early-returned — permanently a no-op.
   Now resolves the root first, then fast-bails against the resolved
   `<root>/.qwen/worktrees`. Also logs the skip case so operators can
   tell "skipped" from "ran, found nothing".

7. `.qwen-session` marker was visible to `git add -A` inside the
   worktree. Now writes a `.git/info/exclude` rule (resolved via
   `git rev-parse --git-dir`, since worktree `.git` is a file
   pointing at the parent repo's `.git/worktrees/<name>/`).
   Best-effort: failure to write the rule does not abort
   provisioning.

8. Agent isolation now refuses to provision when the parent's cwd is
   already inside a worktree — same regex guard as `enter_worktree`.

9. `exit_worktree`'s wrapper around `hasUnmergedWorktreeCommits` now
   logs at the call site so the chain (caller → reason it asked →
   underlying git error) is complete in operator logs.

10. Sweep now logs unconditionally at `info`. Three distinct messages:
    "skipped (no worktrees dir)", "ran, nothing to remove", "removed N".

Tests:

11. New `execute()` coverage:
    • exit-worktree: session-ownership refusal, keep happy path,
      legacy/no-marker fallthrough with warn log, missing-worktree
      error, unmerged-commits guard with `discard_changes: true`,
      `writeWorktreeSessionMarker` round-trip.
    • enter-worktree: nested-guard rejection, non-git-repo error.
    These spin up real temp git repos (no filesystem mocking) and
    drive the actual tool invocation pipeline.

   Total: 135 tests pass (was 101 → +34).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(worktree): demote noise startup-sweep logs to debug

Self-review pass applying the round-6 review-triage framework
(filter #5: "If a log only fires on the happy path, it's noise.")
to my own round-6 changes:

- "Stale worktree sweep skipped: <dir> does not exist" — fires on
  every CLI start for ~99% of users who never use worktrees.
- "Stale worktree sweep ran under <root>: nothing to remove" —
  fires on every CLI start for users who have any worktrees but
  no stale ones at the moment.

Both are happy-path noise at `info`. Demoted to `debug` so an
operator can opt in via `--debug` when they want to confirm the
sweep is wired up, but normal output stays clean.

Only the actually-actionable case ("removed N worktrees") stays at
`info` — that's the signal someone chasing a worktree leak would
grep for.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(worktree): close AUTO_EDIT bypass + parent-dirty stale-code hazard

Round-7 review caught two correctness gaps:

1. exit_worktree action='remove' was still auto-approved in AUTO_EDIT
   `getDefaultPermission` returning 'ask' is necessary but not
   sufficient. `permissionFlow.isAutoEditApproved` auto-approves any
   tool whose `confirmationDetails.type` is 'edit' OR 'info', and
   `BaseToolInvocation` returns 'info' by default. So a session in
   AUTO_EDIT could silently destroy a worktree (with branch deletion)
   without a confirmation prompt — the data-loss path the round-1
   `'ask'` switch was meant to close. Now overrides
   `getConfirmationDetails` to return `type: 'exec'` for action=remove,
   which keeps the prompt in AUTO_EDIT. The `keep` action still falls
   through to the base info-type since it is non-destructive.

   Regression-guard test asserts the type is 'exec' (not 'info') for
   remove and that the command field describes both the worktree-remove
   and branch-delete operations.

2. Agent isolation worktrees ran against parent's HEAD, not its
   working tree
   `git worktree add -b <branch> <path> <base>` only checks out the
   base ref's tip — uncommitted edits in the parent's working tree do
   NOT propagate. The "edit code → ask review/test agent before
   committing" workflow silently ran the subagent against the
   pre-edit HEAD and returned results that looked authoritative but
   reflected stale code.

   Reviewer offered two options: overlay parent's dirty state à la
   Arena (~50 LOC, edge cases), or refuse isolation when parent is
   dirty (~10 LOC, clear UX). Chose the latter for Phase B scope —
   simpler, decisive, and matches the design-doc's explicit
   commitment that dirty-state overlay is Arena-specific. Users can
   commit/stash before re-invoking agent isolation; overlay can be a
   follow-up if users complain about the friction.

   Fail-closed on the dirty-check itself (assume dirty rather than
   silently launch on a possibly-stale tree).

   Test exercises both "dirty parent → guard fires" and
   "clean parent → guard passes" against real temp git repos.

139 unit tests pass (was 135, +4 regression guards).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 18:00:30 +08:00

15 KiB

Worktree Feature E2E Test Plan (Phase A + B)

Scope

End-to-end tests for the generic worktree capability:

  • Phase A: EnterWorktree / ExitWorktree tools + SessionService state
  • Phase B: Agent tool isolation: 'worktree' parameter + auto-cleanup + worktree notice

Test environment

Each test group runs in its own temp git repo and tmux session to avoid collisions. Template setup:

TEST_DIR=$(mktemp -d -t worktree-test-XXXXXX)
cd "$TEST_DIR"
git init -q
git config user.email "test@example.com"
git config user.name "Test"
echo "hello" > README.md
git add README.md
git commit -q -m "initial"

Each group uses a unique tmux session name (e.g. wt-test-a, wt-test-b) and a unique temp dir.

Baseline binary: globally installed qwen (0.15.10). Local build binary: node /Users/mochi/code/qwen-code/.claude/worktrees/trusting-euclid-6fdfb9/bundle/qwen.js.

Test Group A: EnterWorktree tool registration and basic creation

Mode: Headless, --approval-mode yolo, --output-format json

A1: Tool registered in system init

Steps:

<qwen> "say hello" --approval-mode yolo --output-format json 2>/dev/null \
  | jq -r 'select(.type=="system") | .tools[]' \
  | grep -E "^(enter_worktree|exit_worktree)$"

Pre-implementation: empty (tools not registered). Post-implementation: outputs enter_worktree and exit_worktree.

A2: Create worktree with auto-generated name

Steps:

<qwen> "create a new git worktree using the enter_worktree tool" \
  --approval-mode yolo --output-format json 2>/dev/null > /tmp/a2.json
# Check worktree dir created
ls -la .qwen/worktrees/ | grep -v "^\." | wc -l
# Should have a directory matching the auto-generated slug pattern

Pre-implementation: model says it can't find the tool; no .qwen/worktrees/ directory. Post-implementation: .qwen/worktrees/<slug> exists with auto-generated slug (format: {adj}-{noun}-{4hex}).

A3: Create worktree with custom name

Steps:

<qwen> "use the enter_worktree tool with name='my-feature' to create a worktree" \
  --approval-mode yolo --output-format json 2>/dev/null
ls .qwen/worktrees/my-feature/
git branch | grep worktree-my-feature

Pre-implementation: tool unknown. Post-implementation: .qwen/worktrees/my-feature/ directory exists; branch worktree-my-feature exists.

A4: Invalid slug rejected

Steps:

<qwen> "use enter_worktree with name='../../../etc' to create a worktree" \
  --approval-mode yolo --output-format json 2>/dev/null \
  | jq 'select(.type=="user") | .message.content[] | select(.is_error) | .content'

Pre-implementation: tool unknown. Post-implementation: tool result is_error=true with a validation error message.

Test Group B: ExitWorktree

Mode: Headless, two-step interaction within one prompt.

B1: Enter then exit with action=keep

Steps:

<qwen> "create a worktree named 'temp-keep' using enter_worktree, then immediately exit it with action='keep' using exit_worktree" \
  --approval-mode yolo --output-format json 2>/dev/null > /tmp/b1.json
# Directory should still exist (keep preserves it)
ls -d .qwen/worktrees/temp-keep
# Branch should still exist
git branch | grep worktree-temp-keep
# CWD should be original

Pre-implementation: tools unknown. Post-implementation: worktree dir and branch both still exist after exit.

B2: Enter then exit with action=remove (no changes)

Steps:

<qwen> "create a worktree named 'temp-remove' using enter_worktree, then immediately exit it with action='remove' using exit_worktree" \
  --approval-mode yolo --output-format json 2>/dev/null
ls -d .qwen/worktrees/temp-remove 2>&1
git branch | grep worktree-temp-remove

Pre-implementation: tools unknown. Post-implementation: worktree dir is removed; branch is deleted.

B3: Exit with action=remove refuses when uncommitted changes exist

Steps: Spawn an interactive tmux session, manually create files in worktree, then attempt exit.

tmux new-session -d -s wt-test-b3 -x 200 -y 50 "cd $TEST_DIR && <qwen> --approval-mode yolo"
sleep 3
tmux send-keys -t wt-test-b3 "create a worktree named 'dirty-test' using enter_worktree"
sleep 0.5
tmux send-keys -t wt-test-b3 Enter
# Wait for completion
for i in $(seq 1 30); do
  sleep 2
  tmux capture-pane -t wt-test-b3 -p | grep -q "Type your message" && break
done
# Create dirty file in worktree
echo "dirty" > "$TEST_DIR/.qwen/worktrees/dirty-test/dirty.txt"
# Try to remove without discard_changes
tmux send-keys -t wt-test-b3 "use exit_worktree with action='remove' to exit the worktree"
sleep 0.5
tmux send-keys -t wt-test-b3 Enter
for i in $(seq 1 30); do sleep 2; tmux capture-pane -t wt-test-b3 -p | grep -q "Type your message" && break; done
tmux capture-pane -t wt-test-b3 -p -S -100 > /tmp/b3.out
# Should mention "uncommitted changes" or "discard_changes" in output
grep -E "uncommitted|discard_changes" /tmp/b3.out
tmux kill-session -t wt-test-b3

Pre-implementation: tools unknown. Post-implementation: exit fails with a message about uncommitted changes and the discard_changes flag.

Test Group C: SessionService persistence

C1: Worktree state in session metadata

Steps:

SESSION_ID=$(<qwen> "create a worktree named 'persist-test' using enter_worktree" \
  --approval-mode yolo --output-format json 2>/dev/null \
  | jq -r 'select(.type=="system") | .session_id' | head -1)
# Check session storage for worktree state
find ~/.qwen -name "*${SESSION_ID}*" 2>/dev/null | head
grep -l "persist-test" ~/.qwen/projects/*/sessions/*.json 2>/dev/null || \
  grep -rl "worktreeSession\|persist-test" ~/.qwen/projects/ 2>/dev/null | head -5

Pre-implementation: no worktree session state stored anywhere. Post-implementation: session JSON contains a worktreeSession field with slug='persist-test', worktreePath, originalCwd, etc.

Test Group D: AgentTool isolation

D1: Agent isolation parameter accepted

Steps:

<qwen> "spawn an agent using the agent tool with isolation='worktree' to run 'echo hello'" \
  --approval-mode yolo --output-format json 2>/dev/null \
  | jq 'select(.type=="assistant") | .message.content[] | select(.type=="tool_use" and .name=="agent") | .input'
# Check that .qwen/worktrees/ contains an agent-* slug during execution

Pre-implementation: agent tool schema has no isolation parameter; model either omits it or the schema rejects it. Post-implementation: agent runs successfully with isolation='worktree'; an agent-<7hex> worktree is created.

D2: Agent auto-cleans worktree (no changes)

Steps:

ls .qwen/worktrees/ > /tmp/d2-before.txt 2>/dev/null
<qwen> "spawn an agent with isolation='worktree' to list files in the current directory using ls" \
  --approval-mode yolo --output-format json 2>/dev/null
ls .qwen/worktrees/ > /tmp/d2-after.txt 2>/dev/null
# After should equal before (no leftover agent-* dirs)
diff /tmp/d2-before.txt /tmp/d2-after.txt

Pre-implementation: N/A (no isolation parameter). Post-implementation: worktrees dir is unchanged after agent completes with no changes.

D3: Agent worktree preserved when changes made

Steps:

<qwen> "spawn an agent with isolation='worktree' to write 'test content' to a new file called test.txt" \
  --approval-mode yolo --output-format json 2>/dev/null > /tmp/d3.json
# Worktree should be preserved with the change
ls .qwen/worktrees/agent-* 2>/dev/null
ls .qwen/worktrees/agent-*/test.txt 2>/dev/null
# Agent result should include worktreePath/worktreeBranch
jq 'select(.type=="user") | .message.content[] | select(.tool_use_id) | .content' /tmp/d3.json | head

Pre-implementation: N/A. Post-implementation: .qwen/worktrees/agent-<7hex>/test.txt exists; agent result mentions worktree path and branch.

Test Group E: Stale cleanup

E1: Cleanup function removes old agent worktrees

This is harder to test e2e because it requires aging. Cover via unit tests in worktreeCleanup.test.ts:

  • Worktree with mtime > 30 days ago and matching agent-<7hex> pattern → removed
  • Worktree with mtime > 30 days ago but user-named (e.g., my-feature) → preserved
  • Worktree with mtime < 30 days → preserved
  • Worktree with uncommitted changes → preserved (fail-closed)
  • Worktree with unpushed commits → preserved (fail-closed)

E2E spot check (optional): manually touch -t 200001010000 .qwen/worktrees/agent-aabcdef0 and invoke cleanup; verify removal.

Test Group F: Arena compatibility (no regression)

F1: Arena worktree path unchanged

Steps: Run an Arena session (separate from EnterWorktree); verify it still creates worktrees under ~/.qwen/arena/<sessionId>/worktrees/ and not under .qwen/worktrees/.

# Setup: requires Arena-enabled config. Detailed steps depend on Arena CLI invocation.
# Pre-implementation: arena worktrees are under ~/.qwen/arena/.
# Post-implementation: SAME — arena path is independent.

(If Arena is not easily reachable from headless mode, this group is verified by unit test that ArenaManager.ts:125 (this.arenaBaseDir = arenaSettings?.worktreeBaseDir ?? path.join(Storage.getGlobalQwenDir(), 'arena')) is unchanged.)

Unit test coverage (collocated with implementation)

Outside of the E2E plan, these unit tests must accompany the implementation:

  • EnterWorktreeTool.test.ts: schema validation, slug rejection, nested-worktree rejection, cwd change, SessionService write
  • ExitWorktreeTool.test.ts: keep vs remove paths, dirty-state guard, discard_changes bypass, cwd restoration
  • gitWorktreeService.test.ts extensions: createUserWorktree, removeUserWorktree, createAgentWorktree, removeAgentWorktree
  • sessionService.test.ts extensions: WorktreeSession field read/write, resume restoration
  • worktreeCleanup.test.ts: cleanup pattern matching, age filter, fail-closed conditions
  • agent.test.ts extensions: isolation parameter accepted, worktree created and (in some cases) cleaned

Pass criteria

Group Pre-build expected Post-build expected
A1 tools not listed both tools listed
A2 error/no-op .qwen/worktrees/<auto-slug> created
A3 error/no-op .qwen/worktrees/my-feature created, branch present
A4 error/no-op tool result is_error with validation message
B1 error/no-op worktree dir + branch preserved
B2 error/no-op worktree dir + branch removed
B3 error/no-op exit refuses with uncommitted-changes message
C1 no worktree state session has worktreeSession field
D1 no isolation param agent runs in agent-<7hex> worktree
D2 N/A worktrees dir unchanged after agent with no changes
D3 N/A agent-<7hex> preserved with changes

Reproduction report (post-implementation)

Local build at dist/cli.js (commit at the tip of claude/trusting-euclid-6fdfb9).

Group Result Notes
A1 enter_worktree and exit_worktree listed in system.tools
A3 .qwen/worktrees/my-feature created, branch worktree-my-feature present
A4 covered by unit test validateUserWorktreeSlug rejects path-traversal etc. (enter-worktree.test.ts)
B1 keep action preserved both directory and branch
B2 remove action deleted directory and branch
B3 remove refused with Refusing to remove worktree "dirty-test" — it has 0 tracked change(s) and 1 untracked file(s).
C1 scope-out SessionService persistence deferred from Phase A (see scope notes in docs/design/worktree.md)
D1 Agent invocation accepted isolation: 'worktree', created agent-2c4e759
D2 After agent finished with no changes, worktrees dir was empty
D3 After agent wrote test.txt, worktree agent-bad55bd and branch worktree-agent-bad55bd preserved; result included [worktree preserved: ... (branch ...)] suffix
E1 covered by unit test worktreeCleanup.test.ts verifies isEphemeralSlug matches only agent-<7hex>
F1 scope-out (no Arena E2E in this run) Arena code paths untouched: ArenaManager.ts:125 and setupWorktrees() unchanged

Scope deviations from the test plan

  • C1 (SessionService persistence) was deferred from Phase A. The minimum-viable Phase A returns the absolute worktree path so the model uses it directly via absolute paths, instead of mechanically switching Config.targetDir. Resume support requires SessionService extension and is documented for a future phase.
  • A2 (auto-generated name) was indirectly verified via D1/D3, which exercise the same auto-slug path through the agent isolation flow.