fix(serve/fs): address PR review feedback (#4250)

Codex + Copilot review found 8 substantive issues against
7b0db4c3a; this commit fixes all of them. The first two are
P1 build-breakers introduced by the pre-commit `eslint --fix`
auto-promoting value imports to `import type` — 31 fs tests
were failing post-commit until this fix.

Issue list (links to PR comments at #4250):

1. Import type erased runtime values —
   workspaceFileSystem.ts:10-15. eslint's
   consistent-type-imports rule rewrote
   `import { Ignore, StandardFileSystemService, loadIgnoreRules,
   type WriteTextFileOptions }` -> `import type { ... }` because
   it saw type-only usage of `Ignore`. That erased
   `loadIgnoreRules` (called at runtime) and
   `StandardFileSystemService` (constructed at runtime), causing
   TS1361/TS2206 and runtime ReferenceErrors. Restored as a
   value import with inline `type` modifiers per-symbol and an
   eslint-disable line + comment so future autofixes don't
   repeat the regression.

2. Same import-type erasure in contract.test.ts:14 — `isFsError`
   was lumped under `import type` even though it's called at
   runtime. Same fix shape.

3. edit() OOM hole — workspaceFileSystem.ts. The earlier review
   pass added a pre-stat MAX_READ_BYTES gate to readText but
   missed edit, which fsp.readFile's the whole file before any
   size check. Multi-GB targets inside the workspace could OOM
   the daemon. Now stat-first; refuses above the cap with
   file_too_large; also rejects binary files (string indexOf
   over arbitrary bytes is meaningless).

4. glob accepted absolute / device patterns —
   workspaceFileSystem.ts. The ..-segment check stopped lexical
   traversal but /etc/** / C:\\Users\\foo\\** /
   \\\\server\\share\\** / //server/share/** still reached
   globAsync, walking outside the workspace before per-hit
   filtering dropped the results. Now rejects these patterns
   up-front with parse_error so no I/O happens outside.

5. glob ignore filter probed every hit as `file` —
   workspaceFileSystem.ts. The underlying `ignore` library
   needs a trailing-slash probe for dist/ / .git/-style
   directory patterns; probing as `file` silently leaked
   directory matches. Now lstats each hit and routes 'directory'
   vs 'file' to shouldIgnore so dir-only ignore rules actually
   match.

6. ReadTextOptions.line off-by-one — workspaceFileSystem.ts. The
   public option was documented as 1-based but forwarded as-is
   to readFileWithLineAndLimit, which is 0-based. A request with
   line: 1 returned content starting at the second line. Now
   converts 1-based -> 0-based at the boundary; doc clarified;
   truncation check uses the converted index.

7. ServeAppDeps.fsFactory JSDoc said trusted=true — server.ts:96.
   Stale from before the strict-default refactor in the same
   review pass. Rewrote to match the actual trusted: false +
   warn-once emit behavior.

8. MAX_READ_BYTES JSDoc said reads above cap return truncated —
   policy.ts:18. Stale from before the hard-cap refactor; now
   correctly states the cap throws file_too_large and that soft
   truncation only applies under the cap via enforceReadSize.

7 new tests cover the new behaviors:
- POSIX-absolute pattern rejection
- Win32 / UNC pattern rejection (4 variants)
- directory-pattern ignore (dist/)
- edit file-too-large
- edit binary refusal
- readText line: 1 returns from first line
- readText line: 2 starts from second line

418/418 serve tests pass; typecheck + eslint clean.

Deferred follow-ups (per PR review reply):
- glob maxResults is applied after globAsync materializes every
  match. A streaming iterator (glob.iterate) would bound the
  walk too. Non-trivial; tracked as a separate hardening
  follow-up since current behavior is correctness-safe (just
  not optimal under huge trees).
- Per-path glob escape hash in audit hint (currently aggregated
  count) — can revisit once PR 19 wires the routes and we see
  real audit volume.
- EVENT_SCHEMA_VERSION migration mechanism — orthogonal; the
  whole BridgeEvent schema lacks one and that's a Wave 5+
  concern.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
This commit is contained in:
doudouOUC 2026-05-17 23:51:39 +08:00
parent 7b0db4c3ae
commit a81ada43f5
5 changed files with 198 additions and 23 deletions

View file

@ -11,7 +11,14 @@ import * as path from 'node:path';
import { randomBytes } from 'node:crypto';
import { WorkspaceContext } from '@qwen-code/qwen-code-core';
import { canonicalizeWorkspace, resolveWithinWorkspace } from './paths.js';
import type { FsError, isFsError, type FsErrorKind } from './errors.js';
// `isFsError` is a runtime guard called below — must stay a value
// import. `FsError` is type-only here (typed `catch` variable); same
// for `FsErrorKind`. The eslint-disable mirrors the workspaceFileSystem.ts
// fix and exists because the auto-fix at commit 7b0db4c3a promoted the
// whole line to `import type`, erasing `isFsError` at runtime and
// failing 11 tests in this file alone.
import { isFsError, type FsError, type FsErrorKind } from './errors.js';
/**
* Kinds the boundary throws to indicate the path is *outside the

View file

@ -5,17 +5,30 @@
*/
import * as path from 'node:path';
import type { Ignore} from '@qwen-code/qwen-code-core';
import type { Ignore } from '@qwen-code/qwen-code-core';
import { isBinaryFile } from '@qwen-code/qwen-code-core';
import { FsError } from './errors.js';
import type { Intent, ResolvedPath } from './paths.js';
/**
* Maximum bytes returned by `readText`. Mirrors claude-code's
* Maximum bytes the `readText` boundary is willing to slurp before
* delegating to the core service. Mirrors claude-code's
* `MAX_OUTPUT_SIZE` (`src/utils/file.ts`) large enough for
* typical source files, small enough that an SSE replay buffer
* doesn't fill on a single read. Reads above this size return
* truncated content with `meta.truncated = true`.
* doesn't fill on a single read.
*
* Files **above** this cap are refused with `file_too_large` rather
* than truncated the underlying `readFileWithLineAndLimit`
* reads the whole file into memory before slicing lines, so soft
* truncation past the cap would still OOM the daemon. Files
* **at or below** the cap honor a tighter `opts.maxBytes` via
* post-decode truncation (`enforceReadSize`); that's where the
* `meta.truncated = true` flag fires.
*
* `enforceReadBytesSize` (the `readBytes` gate) and `edit()` use the
* same constant as a hard upper bound multi-GB files in the
* workspace can no longer reach `fsp.readFile` through any
* boundary path.
*/
export const MAX_READ_BYTES = 256 * 1024;

View file

@ -240,6 +240,39 @@ describe('WorkspaceFileSystem - glob', () => {
expect((err as { kind: string }).kind).toBe('parse_error');
});
it('rejects POSIX-absolute patterns up-front (no I/O outside workspace)', async () => {
const err = await h.fs.glob('/etc/**/*').catch((e) => e);
expect(isFsError(err)).toBe(true);
expect((err as { kind: string }).kind).toBe('parse_error');
});
it('rejects Win32 drive-letter and UNC patterns up-front', async () => {
for (const pattern of [
'C:\\Users\\foo\\**\\*.ts',
'C:/Users/foo/**/*.ts',
'\\\\server\\share\\**',
'//server/share/**',
]) {
const err = await h.fs.glob(pattern).catch((e: unknown) => e);
expect(isFsError(err)).toBe(true);
expect((err as { kind: string }).kind).toBe('parse_error');
}
});
it('respects directory-only ignore rules (e.g. dist/) on glob hits', async () => {
const ignore = new Ignore().add(['dist/']);
h = await makeHarness({ ignore });
await fsp.mkdir(path.join(h.workspace, 'dist'));
await fsp.writeFile(path.join(h.workspace, 'dist', 'bundle.js'), '');
await fsp.writeFile(path.join(h.workspace, 'src.ts'), '');
const hits = await h.fs.glob('*');
const names = hits.map((p) => path.basename(p)).sort();
// `dist` directory is filtered because the trailing-slash dir
// pattern now probes `<rel>/` against the directory ignorer.
expect(names).not.toContain('dist');
expect(names).toContain('src.ts');
});
it('respects maxResults', async () => {
const hits = await h.fs.glob('**/*', { maxResults: 1 });
expect(hits).toHaveLength(1);
@ -303,6 +336,44 @@ describe('WorkspaceFileSystem - write/edit', () => {
expect(isFsError(err)).toBe(true);
expect((err as { kind: string }).kind).toBe('parse_error');
});
it('rejects edit on file > MAX_READ_BYTES with file_too_large (no slurp)', async () => {
const policy = await import('./policy.js');
const big = path.join(h.workspace, 'huge.txt');
await fsp.writeFile(big, 'a'.repeat(policy.MAX_READ_BYTES + 1));
const r = await h.fs.resolve('huge.txt', 'write');
const err = await h.fs.edit(r, 'a', 'b').catch((e: unknown) => e);
expect(isFsError(err)).toBe(true);
expect((err as { kind: string }).kind).toBe('file_too_large');
});
it('rejects edit on binary file', async () => {
const bin = path.join(h.workspace, 'bin.dat');
const buf = Buffer.alloc(64);
buf[5] = 0;
await fsp.writeFile(bin, buf);
const r = await h.fs.resolve('bin.dat', 'write');
const err = await h.fs.edit(r, '\x00', 'x').catch((e: unknown) => e);
expect(isFsError(err)).toBe(true);
expect((err as { kind: string }).kind).toBe('binary_file');
});
it('readText converts 1-based line to 0-based slice (line: 1 returns from first line)', async () => {
const target = path.join(h.workspace, 'lines.txt');
await fsp.writeFile(target, 'one\ntwo\nthree\n');
const r = await h.fs.resolve('lines.txt', 'read');
const out = await h.fs.readText(r, { line: 1, limit: 1 });
// 1-based line 1 → 0-based slice index 0 → first line "one"
expect(out.content.split('\n')[0]).toBe('one');
});
it('readText with line: 2 starts from the second line', async () => {
const target = path.join(h.workspace, 'lines2.txt');
await fsp.writeFile(target, 'one\ntwo\nthree\n');
const r = await h.fs.resolve('lines2.txt', 'read');
const out = await h.fs.readText(r, { line: 2, limit: 1 });
expect(out.content.split('\n')[0]).toBe('two');
});
});
describe('WorkspaceFileSystem - trust gate', () => {

View file

@ -7,10 +7,19 @@
import { promises as fsp } from 'node:fs';
import * as path from 'node:path';
import { glob as globAsync } from 'glob';
import type {
Ignore,
// `StandardFileSystemService` is constructed and `loadIgnoreRules` is
// invoked at runtime — they MUST stay as value imports. The eslint
// auto-fix in commit 7b0db4c3a hoisted the whole block to `import type`
// (because the same line referenced the `Ignore` and `WriteTextFileOptions`
// types), which silently erased the value bindings and broke the runtime
// + 31 tests post-commit. The inline `type` modifiers below tell the
// `consistent-type-imports` rule per-symbol intent so future autofixes
// don't repeat the regression.
import {
StandardFileSystemService,
loadIgnoreRules,
type Ignore,
type WriteTextFileOptions,
} from '@qwen-code/qwen-code-core';
import type { BridgeEvent } from '../eventBus.js';
@ -69,7 +78,13 @@ export interface ReadMeta {
export interface ReadTextOptions {
/** Cap returned bytes; defaults to MAX_READ_BYTES. */
maxBytes?: number;
/** 1-based starting line for partial reads. */
/**
* 1-based starting line for partial reads. `1` returns the file
* from its first line. The boundary converts to the 0-based slice
* index `readFileWithLineAndLimit` expects internally; SDK
* consumers don't need to adjust. Values < 1 (or undefined) are
* treated as "from the beginning".
*/
line?: number;
/** Maximum number of lines to return. */
limit?: number;
@ -269,11 +284,17 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
const sizeOutcome = enforceReadSize(st.size, opts.maxBytes);
// Delegate encoding-aware read to the existing core service so
// BOM, CRLF, and iconv-supported codepages remain consistent
// with what the tools layer already does.
// with what the tools layer already does. The core service's
// `line` parameter is a 0-based slice index whereas the
// boundary's public `ReadTextOptions.line` is 1-based (the
// convention SDK consumers expect from line-numbered errors,
// editor jump-to-line, etc.). Convert here so the public
// contract isn't tied to the internal helper's indexing.
const startLineIndex = opts.line && opts.line > 1 ? opts.line - 1 : 0;
const result = await this.deps.lowFs.readTextFile({
path: p as string,
limit: opts.limit ?? Number.POSITIVE_INFINITY,
line: opts.line ?? 0,
line: startLineIndex,
});
const ignoreVerdict = shouldIgnore(
p,
@ -305,7 +326,7 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
opts.limit !== undefined &&
Number.isFinite(opts.limit) &&
result._meta?.originalLineCount !== undefined &&
result._meta.originalLineCount > opts.limit + (opts.line ?? 0)
result._meta.originalLineCount > opts.limit + startLineIndex
) {
meta.truncated = true;
}
@ -386,17 +407,36 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
const start = performance.now();
try {
assertTrustedForIntent(this.deps.trusted, 'glob');
// Reject `..` segments in the pattern outright. A pattern like
// `../../**/*.js` would otherwise let a caller search outside
// `cwd`. Boundary-checking each individual hit catches escapes
// too, but rejecting at the pattern level is cheaper and gives
// a clearer error.
// Reject patterns up-front before delegating to `glob` — the
// per-hit filter below catches escapes after the walk, but
// letting a clearly out-of-workspace pattern reach `globAsync`
// burns I/O *outside* the workspace before we drop the
// results. Three rejection classes:
// 1. `..` segments — would let `cwd` be escaped lexically.
// 2. POSIX absolute (`/etc/**`) — `glob` rooted outside cwd.
// 3. Windows-style absolute / device prefixes (`C:\…`,
// `\\?\…`, `\\server\share`) — same hazard on the other
// platform. `path.isAbsolute` covers POSIX `/`; the
// drive-letter / UNC checks cover Win32 even when the
// daemon runs on POSIX (clients may send Win32 paths).
if (pattern.split(/[\\/]/).some((seg) => seg === '..')) {
throw new FsError(
'parse_error',
`glob pattern may not contain '..' segments: ${pattern}`,
);
}
if (
path.isAbsolute(pattern) ||
/^[A-Za-z]:[\\/]/.test(pattern) ||
pattern.startsWith('\\\\') ||
pattern.startsWith('//')
) {
throw new FsError(
'parse_error',
`glob pattern must be workspace-relative: ${pattern}`,
{ hint: 'pass a relative pattern such as "src/**/*.ts"' },
);
}
const cwd = (opts.cwd as string | undefined) ?? this.deps.boundWorkspace;
const matches = await globAsync(pattern, {
cwd,
@ -434,11 +474,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
escapedCount += 1;
continue;
}
// Check the dirent kind so directory ignore rules (`dist/`,
// `.git/`, `node_modules/`) actually match — `shouldIgnore`
// probes `<rel>/` for the directory filter, which the
// underlying `ignore` library requires for trailing-slash
// patterns. Probing every hit as a `file` (the prior
// behavior) silently leaks ignored directories from
// `glob('**/*')` even when `includeIgnored` is false. We
// already realpath'd the hit, so an extra `lstat` here is
// cheap; on `lstat` failure (raced unlink) we conservatively
// treat the hit as a file so the file-pattern check still
// runs.
let dirent: { isDirectory(): boolean } | null = null;
try {
dirent = await fsp.lstat(canonical);
} catch {
dirent = null;
}
const kind = dirent?.isDirectory() ? 'directory' : 'file';
const verdict = shouldIgnore(
canonical as ResolvedPath,
this.deps.boundWorkspace,
this.deps.ignore,
'file',
kind,
);
if (verdict.ignored && !opts.includeIgnored) continue;
out.push(canonical as ResolvedPath);
@ -504,6 +562,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
const start = performance.now();
try {
assertTrustedForIntent(this.deps.trusted, 'edit');
// Mirror `readText`'s pre-stat OOM gate: `fsp.readFile` would
// otherwise slurp the whole target into memory before
// `enforceWriteSize` got a chance to refuse. A multi-GB file
// already inside the workspace can OOM the daemon even though
// the *edited output* would later fail the size check.
// Reject above `MAX_READ_BYTES` outright with a typed
// `file_too_large`; binary content is also refused since
// `current.indexOf(oldText)` over arbitrary bytes is meaningless.
const st = await fsp.stat(p as string);
if (st.size > MAX_READ_BYTES) {
throw new FsError(
'file_too_large',
`file of ${st.size} bytes exceeds edit cap of ${MAX_READ_BYTES} bytes`,
{
hint: 'split large edits into bounded readBytes/writeText sequences',
},
);
}
if (await detectBinary(p)) {
throw new FsError('binary_file', `cannot edit binary file: ${p}`, {
hint: 'edit() works on text files only',
});
}
const current = await fsp.readFile(p as string, 'utf-8');
// Single replacement to preserve atomic write-once semantics.
// Multi-occurrence handling lives in PR 20's edit endpoint

View file

@ -86,12 +86,15 @@ export interface ServeAppDeps {
* Workspace filesystem boundary factory (#4175 PR 18). When
* supplied, PR 19/20 routes will pull a per-request
* `WorkspaceFileSystem` off it; when omitted, `createServeApp`
* builds a permissive default (trusted=true, no-op audit emit).
* No PR 18 routes consume the factory yet the slot is wired so
* PR 19 read-only file routes can drop in without re-shaping
* `ServeAppDeps`. Once PR 19 lands, `runQwenServe` will inject a
* factory whose `trusted` flag mirrors `Config.isTrustedFolder()`
* and whose `emit` plumbs into the per-session EventBus.
* builds a strict default (`trusted: false`, warn-once no-op
* `emit`) so an upstream refactor that forgets to inject
* `fsFactory` never silently allows writes against an untrusted
* workspace. No PR 18 routes consume the factory yet the slot
* is wired so PR 19 read-only file routes can drop in without
* re-shaping `ServeAppDeps`. Once PR 19 lands, `runQwenServe`
* will inject a factory whose `trusted` flag mirrors
* `Config.isTrustedFolder()` and whose `emit` plumbs into the
* per-session EventBus.
*/
fsFactory?: WorkspaceFileSystemFactory;
}