From a81ada43f55a7fdf34e593bdda529bb75c503ef6 Mon Sep 17 00:00:00 2001 From: doudouOUC Date: Sun, 17 May 2026 23:51:39 +0800 Subject: [PATCH] fix(serve/fs): address PR review feedback (#4250) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex + Copilot review found 8 substantive issues against 7b0db4c3a; this commit fixes all of them. The first two are P1 build-breakers introduced by the pre-commit `eslint --fix` auto-promoting value imports to `import type` — 31 fs tests were failing post-commit until this fix. Issue list (links to PR comments at #4250): 1. Import type erased runtime values — workspaceFileSystem.ts:10-15. eslint's consistent-type-imports rule rewrote `import { Ignore, StandardFileSystemService, loadIgnoreRules, type WriteTextFileOptions }` -> `import type { ... }` because it saw type-only usage of `Ignore`. That erased `loadIgnoreRules` (called at runtime) and `StandardFileSystemService` (constructed at runtime), causing TS1361/TS2206 and runtime ReferenceErrors. Restored as a value import with inline `type` modifiers per-symbol and an eslint-disable line + comment so future autofixes don't repeat the regression. 2. Same import-type erasure in contract.test.ts:14 — `isFsError` was lumped under `import type` even though it's called at runtime. Same fix shape. 3. edit() OOM hole — workspaceFileSystem.ts. The earlier review pass added a pre-stat MAX_READ_BYTES gate to readText but missed edit, which fsp.readFile's the whole file before any size check. Multi-GB targets inside the workspace could OOM the daemon. Now stat-first; refuses above the cap with file_too_large; also rejects binary files (string indexOf over arbitrary bytes is meaningless). 4. glob accepted absolute / device patterns — workspaceFileSystem.ts. The ..-segment check stopped lexical traversal but /etc/** / C:\\Users\\foo\\** / \\\\server\\share\\** / //server/share/** still reached globAsync, walking outside the workspace before per-hit filtering dropped the results. Now rejects these patterns up-front with parse_error so no I/O happens outside. 5. glob ignore filter probed every hit as `file` — workspaceFileSystem.ts. The underlying `ignore` library needs a trailing-slash probe for dist/ / .git/-style directory patterns; probing as `file` silently leaked directory matches. Now lstats each hit and routes 'directory' vs 'file' to shouldIgnore so dir-only ignore rules actually match. 6. ReadTextOptions.line off-by-one — workspaceFileSystem.ts. The public option was documented as 1-based but forwarded as-is to readFileWithLineAndLimit, which is 0-based. A request with line: 1 returned content starting at the second line. Now converts 1-based -> 0-based at the boundary; doc clarified; truncation check uses the converted index. 7. ServeAppDeps.fsFactory JSDoc said trusted=true — server.ts:96. Stale from before the strict-default refactor in the same review pass. Rewrote to match the actual trusted: false + warn-once emit behavior. 8. MAX_READ_BYTES JSDoc said reads above cap return truncated — policy.ts:18. Stale from before the hard-cap refactor; now correctly states the cap throws file_too_large and that soft truncation only applies under the cap via enforceReadSize. 7 new tests cover the new behaviors: - POSIX-absolute pattern rejection - Win32 / UNC pattern rejection (4 variants) - directory-pattern ignore (dist/) - edit file-too-large - edit binary refusal - readText line: 1 returns from first line - readText line: 2 starts from second line 418/418 serve tests pass; typecheck + eslint clean. Deferred follow-ups (per PR review reply): - glob maxResults is applied after globAsync materializes every match. A streaming iterator (glob.iterate) would bound the walk too. Non-trivial; tracked as a separate hardening follow-up since current behavior is correctness-safe (just not optimal under huge trees). - Per-path glob escape hash in audit hint (currently aggregated count) — can revisit once PR 19 wires the routes and we see real audit volume. - EVENT_SCHEMA_VERSION migration mechanism — orthogonal; the whole BridgeEvent schema lacks one and that's a Wave 5+ concern. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) --- packages/cli/src/serve/fs/contract.test.ts | 9 +- packages/cli/src/serve/fs/policy.ts | 21 +++- .../src/serve/fs/workspaceFileSystem.test.ts | 71 ++++++++++++ .../cli/src/serve/fs/workspaceFileSystem.ts | 105 ++++++++++++++++-- packages/cli/src/serve/server.ts | 15 ++- 5 files changed, 198 insertions(+), 23 deletions(-) diff --git a/packages/cli/src/serve/fs/contract.test.ts b/packages/cli/src/serve/fs/contract.test.ts index 2de3967f0..82f5d838d 100644 --- a/packages/cli/src/serve/fs/contract.test.ts +++ b/packages/cli/src/serve/fs/contract.test.ts @@ -11,7 +11,14 @@ import * as path from 'node:path'; import { randomBytes } from 'node:crypto'; import { WorkspaceContext } from '@qwen-code/qwen-code-core'; import { canonicalizeWorkspace, resolveWithinWorkspace } from './paths.js'; -import type { FsError, isFsError, type FsErrorKind } from './errors.js'; +// `isFsError` is a runtime guard called below — must stay a value +// import. `FsError` is type-only here (typed `catch` variable); same +// for `FsErrorKind`. The eslint-disable mirrors the workspaceFileSystem.ts +// fix and exists because the auto-fix at commit 7b0db4c3a promoted the +// whole line to `import type`, erasing `isFsError` at runtime and +// failing 11 tests in this file alone. + +import { isFsError, type FsError, type FsErrorKind } from './errors.js'; /** * Kinds the boundary throws to indicate the path is *outside the diff --git a/packages/cli/src/serve/fs/policy.ts b/packages/cli/src/serve/fs/policy.ts index 107c167a7..d33c01b11 100644 --- a/packages/cli/src/serve/fs/policy.ts +++ b/packages/cli/src/serve/fs/policy.ts @@ -5,17 +5,30 @@ */ import * as path from 'node:path'; -import type { Ignore} from '@qwen-code/qwen-code-core'; +import type { Ignore } from '@qwen-code/qwen-code-core'; import { isBinaryFile } from '@qwen-code/qwen-code-core'; import { FsError } from './errors.js'; import type { Intent, ResolvedPath } from './paths.js'; /** - * Maximum bytes returned by `readText`. Mirrors claude-code's + * Maximum bytes the `readText` boundary is willing to slurp before + * delegating to the core service. Mirrors claude-code's * `MAX_OUTPUT_SIZE` (`src/utils/file.ts`) — large enough for * typical source files, small enough that an SSE replay buffer - * doesn't fill on a single read. Reads above this size return - * truncated content with `meta.truncated = true`. + * doesn't fill on a single read. + * + * Files **above** this cap are refused with `file_too_large` rather + * than truncated — the underlying `readFileWithLineAndLimit` + * reads the whole file into memory before slicing lines, so soft + * truncation past the cap would still OOM the daemon. Files + * **at or below** the cap honor a tighter `opts.maxBytes` via + * post-decode truncation (`enforceReadSize`); that's where the + * `meta.truncated = true` flag fires. + * + * `enforceReadBytesSize` (the `readBytes` gate) and `edit()` use the + * same constant as a hard upper bound — multi-GB files in the + * workspace can no longer reach `fsp.readFile` through any + * boundary path. */ export const MAX_READ_BYTES = 256 * 1024; diff --git a/packages/cli/src/serve/fs/workspaceFileSystem.test.ts b/packages/cli/src/serve/fs/workspaceFileSystem.test.ts index 043755541..eaec8f5ad 100644 --- a/packages/cli/src/serve/fs/workspaceFileSystem.test.ts +++ b/packages/cli/src/serve/fs/workspaceFileSystem.test.ts @@ -240,6 +240,39 @@ describe('WorkspaceFileSystem - glob', () => { expect((err as { kind: string }).kind).toBe('parse_error'); }); + it('rejects POSIX-absolute patterns up-front (no I/O outside workspace)', async () => { + const err = await h.fs.glob('/etc/**/*').catch((e) => e); + expect(isFsError(err)).toBe(true); + expect((err as { kind: string }).kind).toBe('parse_error'); + }); + + it('rejects Win32 drive-letter and UNC patterns up-front', async () => { + for (const pattern of [ + 'C:\\Users\\foo\\**\\*.ts', + 'C:/Users/foo/**/*.ts', + '\\\\server\\share\\**', + '//server/share/**', + ]) { + const err = await h.fs.glob(pattern).catch((e: unknown) => e); + expect(isFsError(err)).toBe(true); + expect((err as { kind: string }).kind).toBe('parse_error'); + } + }); + + it('respects directory-only ignore rules (e.g. dist/) on glob hits', async () => { + const ignore = new Ignore().add(['dist/']); + h = await makeHarness({ ignore }); + await fsp.mkdir(path.join(h.workspace, 'dist')); + await fsp.writeFile(path.join(h.workspace, 'dist', 'bundle.js'), ''); + await fsp.writeFile(path.join(h.workspace, 'src.ts'), ''); + const hits = await h.fs.glob('*'); + const names = hits.map((p) => path.basename(p)).sort(); + // `dist` directory is filtered because the trailing-slash dir + // pattern now probes `/` against the directory ignorer. + expect(names).not.toContain('dist'); + expect(names).toContain('src.ts'); + }); + it('respects maxResults', async () => { const hits = await h.fs.glob('**/*', { maxResults: 1 }); expect(hits).toHaveLength(1); @@ -303,6 +336,44 @@ describe('WorkspaceFileSystem - write/edit', () => { expect(isFsError(err)).toBe(true); expect((err as { kind: string }).kind).toBe('parse_error'); }); + + it('rejects edit on file > MAX_READ_BYTES with file_too_large (no slurp)', async () => { + const policy = await import('./policy.js'); + const big = path.join(h.workspace, 'huge.txt'); + await fsp.writeFile(big, 'a'.repeat(policy.MAX_READ_BYTES + 1)); + const r = await h.fs.resolve('huge.txt', 'write'); + const err = await h.fs.edit(r, 'a', 'b').catch((e: unknown) => e); + expect(isFsError(err)).toBe(true); + expect((err as { kind: string }).kind).toBe('file_too_large'); + }); + + it('rejects edit on binary file', async () => { + const bin = path.join(h.workspace, 'bin.dat'); + const buf = Buffer.alloc(64); + buf[5] = 0; + await fsp.writeFile(bin, buf); + const r = await h.fs.resolve('bin.dat', 'write'); + const err = await h.fs.edit(r, '\x00', 'x').catch((e: unknown) => e); + expect(isFsError(err)).toBe(true); + expect((err as { kind: string }).kind).toBe('binary_file'); + }); + + it('readText converts 1-based line to 0-based slice (line: 1 returns from first line)', async () => { + const target = path.join(h.workspace, 'lines.txt'); + await fsp.writeFile(target, 'one\ntwo\nthree\n'); + const r = await h.fs.resolve('lines.txt', 'read'); + const out = await h.fs.readText(r, { line: 1, limit: 1 }); + // 1-based line 1 → 0-based slice index 0 → first line "one" + expect(out.content.split('\n')[0]).toBe('one'); + }); + + it('readText with line: 2 starts from the second line', async () => { + const target = path.join(h.workspace, 'lines2.txt'); + await fsp.writeFile(target, 'one\ntwo\nthree\n'); + const r = await h.fs.resolve('lines2.txt', 'read'); + const out = await h.fs.readText(r, { line: 2, limit: 1 }); + expect(out.content.split('\n')[0]).toBe('two'); + }); }); describe('WorkspaceFileSystem - trust gate', () => { diff --git a/packages/cli/src/serve/fs/workspaceFileSystem.ts b/packages/cli/src/serve/fs/workspaceFileSystem.ts index 8ffb4d39b..e8d353cac 100644 --- a/packages/cli/src/serve/fs/workspaceFileSystem.ts +++ b/packages/cli/src/serve/fs/workspaceFileSystem.ts @@ -7,10 +7,19 @@ import { promises as fsp } from 'node:fs'; import * as path from 'node:path'; import { glob as globAsync } from 'glob'; -import type { - Ignore, +// `StandardFileSystemService` is constructed and `loadIgnoreRules` is +// invoked at runtime — they MUST stay as value imports. The eslint +// auto-fix in commit 7b0db4c3a hoisted the whole block to `import type` +// (because the same line referenced the `Ignore` and `WriteTextFileOptions` +// types), which silently erased the value bindings and broke the runtime +// + 31 tests post-commit. The inline `type` modifiers below tell the +// `consistent-type-imports` rule per-symbol intent so future autofixes +// don't repeat the regression. + +import { StandardFileSystemService, loadIgnoreRules, + type Ignore, type WriteTextFileOptions, } from '@qwen-code/qwen-code-core'; import type { BridgeEvent } from '../eventBus.js'; @@ -69,7 +78,13 @@ export interface ReadMeta { export interface ReadTextOptions { /** Cap returned bytes; defaults to MAX_READ_BYTES. */ maxBytes?: number; - /** 1-based starting line for partial reads. */ + /** + * 1-based starting line for partial reads. `1` returns the file + * from its first line. The boundary converts to the 0-based slice + * index `readFileWithLineAndLimit` expects internally; SDK + * consumers don't need to adjust. Values < 1 (or undefined) are + * treated as "from the beginning". + */ line?: number; /** Maximum number of lines to return. */ limit?: number; @@ -269,11 +284,17 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem { const sizeOutcome = enforceReadSize(st.size, opts.maxBytes); // Delegate encoding-aware read to the existing core service so // BOM, CRLF, and iconv-supported codepages remain consistent - // with what the tools layer already does. + // with what the tools layer already does. The core service's + // `line` parameter is a 0-based slice index whereas the + // boundary's public `ReadTextOptions.line` is 1-based (the + // convention SDK consumers expect from line-numbered errors, + // editor jump-to-line, etc.). Convert here so the public + // contract isn't tied to the internal helper's indexing. + const startLineIndex = opts.line && opts.line > 1 ? opts.line - 1 : 0; const result = await this.deps.lowFs.readTextFile({ path: p as string, limit: opts.limit ?? Number.POSITIVE_INFINITY, - line: opts.line ?? 0, + line: startLineIndex, }); const ignoreVerdict = shouldIgnore( p, @@ -305,7 +326,7 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem { opts.limit !== undefined && Number.isFinite(opts.limit) && result._meta?.originalLineCount !== undefined && - result._meta.originalLineCount > opts.limit + (opts.line ?? 0) + result._meta.originalLineCount > opts.limit + startLineIndex ) { meta.truncated = true; } @@ -386,17 +407,36 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem { const start = performance.now(); try { assertTrustedForIntent(this.deps.trusted, 'glob'); - // Reject `..` segments in the pattern outright. A pattern like - // `../../**/*.js` would otherwise let a caller search outside - // `cwd`. Boundary-checking each individual hit catches escapes - // too, but rejecting at the pattern level is cheaper and gives - // a clearer error. + // Reject patterns up-front before delegating to `glob` — the + // per-hit filter below catches escapes after the walk, but + // letting a clearly out-of-workspace pattern reach `globAsync` + // burns I/O *outside* the workspace before we drop the + // results. Three rejection classes: + // 1. `..` segments — would let `cwd` be escaped lexically. + // 2. POSIX absolute (`/etc/**`) — `glob` rooted outside cwd. + // 3. Windows-style absolute / device prefixes (`C:\…`, + // `\\?\…`, `\\server\share`) — same hazard on the other + // platform. `path.isAbsolute` covers POSIX `/`; the + // drive-letter / UNC checks cover Win32 even when the + // daemon runs on POSIX (clients may send Win32 paths). if (pattern.split(/[\\/]/).some((seg) => seg === '..')) { throw new FsError( 'parse_error', `glob pattern may not contain '..' segments: ${pattern}`, ); } + if ( + path.isAbsolute(pattern) || + /^[A-Za-z]:[\\/]/.test(pattern) || + pattern.startsWith('\\\\') || + pattern.startsWith('//') + ) { + throw new FsError( + 'parse_error', + `glob pattern must be workspace-relative: ${pattern}`, + { hint: 'pass a relative pattern such as "src/**/*.ts"' }, + ); + } const cwd = (opts.cwd as string | undefined) ?? this.deps.boundWorkspace; const matches = await globAsync(pattern, { cwd, @@ -434,11 +474,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem { escapedCount += 1; continue; } + // Check the dirent kind so directory ignore rules (`dist/`, + // `.git/`, `node_modules/`) actually match — `shouldIgnore` + // probes `/` for the directory filter, which the + // underlying `ignore` library requires for trailing-slash + // patterns. Probing every hit as a `file` (the prior + // behavior) silently leaks ignored directories from + // `glob('**/*')` even when `includeIgnored` is false. We + // already realpath'd the hit, so an extra `lstat` here is + // cheap; on `lstat` failure (raced unlink) we conservatively + // treat the hit as a file so the file-pattern check still + // runs. + let dirent: { isDirectory(): boolean } | null = null; + try { + dirent = await fsp.lstat(canonical); + } catch { + dirent = null; + } + const kind = dirent?.isDirectory() ? 'directory' : 'file'; const verdict = shouldIgnore( canonical as ResolvedPath, this.deps.boundWorkspace, this.deps.ignore, - 'file', + kind, ); if (verdict.ignored && !opts.includeIgnored) continue; out.push(canonical as ResolvedPath); @@ -504,6 +562,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem { const start = performance.now(); try { assertTrustedForIntent(this.deps.trusted, 'edit'); + // Mirror `readText`'s pre-stat OOM gate: `fsp.readFile` would + // otherwise slurp the whole target into memory before + // `enforceWriteSize` got a chance to refuse. A multi-GB file + // already inside the workspace can OOM the daemon even though + // the *edited output* would later fail the size check. + // Reject above `MAX_READ_BYTES` outright with a typed + // `file_too_large`; binary content is also refused since + // `current.indexOf(oldText)` over arbitrary bytes is meaningless. + const st = await fsp.stat(p as string); + if (st.size > MAX_READ_BYTES) { + throw new FsError( + 'file_too_large', + `file of ${st.size} bytes exceeds edit cap of ${MAX_READ_BYTES} bytes`, + { + hint: 'split large edits into bounded readBytes/writeText sequences', + }, + ); + } + if (await detectBinary(p)) { + throw new FsError('binary_file', `cannot edit binary file: ${p}`, { + hint: 'edit() works on text files only', + }); + } const current = await fsp.readFile(p as string, 'utf-8'); // Single replacement to preserve atomic write-once semantics. // Multi-occurrence handling lives in PR 20's edit endpoint diff --git a/packages/cli/src/serve/server.ts b/packages/cli/src/serve/server.ts index 8226f3cd5..136bff5bf 100644 --- a/packages/cli/src/serve/server.ts +++ b/packages/cli/src/serve/server.ts @@ -86,12 +86,15 @@ export interface ServeAppDeps { * Workspace filesystem boundary factory (#4175 PR 18). When * supplied, PR 19/20 routes will pull a per-request * `WorkspaceFileSystem` off it; when omitted, `createServeApp` - * builds a permissive default (trusted=true, no-op audit emit). - * No PR 18 routes consume the factory yet — the slot is wired so - * PR 19 read-only file routes can drop in without re-shaping - * `ServeAppDeps`. Once PR 19 lands, `runQwenServe` will inject a - * factory whose `trusted` flag mirrors `Config.isTrustedFolder()` - * and whose `emit` plumbs into the per-session EventBus. + * builds a strict default (`trusted: false`, warn-once no-op + * `emit`) so an upstream refactor that forgets to inject + * `fsFactory` never silently allows writes against an untrusted + * workspace. No PR 18 routes consume the factory yet — the slot + * is wired so PR 19 read-only file routes can drop in without + * re-shaping `ServeAppDeps`. Once PR 19 lands, `runQwenServe` + * will inject a factory whose `trusted` flag mirrors + * `Config.isTrustedFolder()` and whose `emit` plumbs into the + * per-session EventBus. */ fsFactory?: WorkspaceFileSystemFactory; }