mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-24 22:03:46 +00:00
fix(serve/fs): address PR review feedback (#4250)
Codex + Copilot review found 8 substantive issues against
7b0db4c3a; this commit fixes all of them. The first two are
P1 build-breakers introduced by the pre-commit `eslint --fix`
auto-promoting value imports to `import type` — 31 fs tests
were failing post-commit until this fix.
Issue list (links to PR comments at #4250):
1. Import type erased runtime values —
workspaceFileSystem.ts:10-15. eslint's
consistent-type-imports rule rewrote
`import { Ignore, StandardFileSystemService, loadIgnoreRules,
type WriteTextFileOptions }` -> `import type { ... }` because
it saw type-only usage of `Ignore`. That erased
`loadIgnoreRules` (called at runtime) and
`StandardFileSystemService` (constructed at runtime), causing
TS1361/TS2206 and runtime ReferenceErrors. Restored as a
value import with inline `type` modifiers per-symbol and an
eslint-disable line + comment so future autofixes don't
repeat the regression.
2. Same import-type erasure in contract.test.ts:14 — `isFsError`
was lumped under `import type` even though it's called at
runtime. Same fix shape.
3. edit() OOM hole — workspaceFileSystem.ts. The earlier review
pass added a pre-stat MAX_READ_BYTES gate to readText but
missed edit, which fsp.readFile's the whole file before any
size check. Multi-GB targets inside the workspace could OOM
the daemon. Now stat-first; refuses above the cap with
file_too_large; also rejects binary files (string indexOf
over arbitrary bytes is meaningless).
4. glob accepted absolute / device patterns —
workspaceFileSystem.ts. The ..-segment check stopped lexical
traversal but /etc/** / C:\\Users\\foo\\** /
\\\\server\\share\\** / //server/share/** still reached
globAsync, walking outside the workspace before per-hit
filtering dropped the results. Now rejects these patterns
up-front with parse_error so no I/O happens outside.
5. glob ignore filter probed every hit as `file` —
workspaceFileSystem.ts. The underlying `ignore` library
needs a trailing-slash probe for dist/ / .git/-style
directory patterns; probing as `file` silently leaked
directory matches. Now lstats each hit and routes 'directory'
vs 'file' to shouldIgnore so dir-only ignore rules actually
match.
6. ReadTextOptions.line off-by-one — workspaceFileSystem.ts. The
public option was documented as 1-based but forwarded as-is
to readFileWithLineAndLimit, which is 0-based. A request with
line: 1 returned content starting at the second line. Now
converts 1-based -> 0-based at the boundary; doc clarified;
truncation check uses the converted index.
7. ServeAppDeps.fsFactory JSDoc said trusted=true — server.ts:96.
Stale from before the strict-default refactor in the same
review pass. Rewrote to match the actual trusted: false +
warn-once emit behavior.
8. MAX_READ_BYTES JSDoc said reads above cap return truncated —
policy.ts:18. Stale from before the hard-cap refactor; now
correctly states the cap throws file_too_large and that soft
truncation only applies under the cap via enforceReadSize.
7 new tests cover the new behaviors:
- POSIX-absolute pattern rejection
- Win32 / UNC pattern rejection (4 variants)
- directory-pattern ignore (dist/)
- edit file-too-large
- edit binary refusal
- readText line: 1 returns from first line
- readText line: 2 starts from second line
418/418 serve tests pass; typecheck + eslint clean.
Deferred follow-ups (per PR review reply):
- glob maxResults is applied after globAsync materializes every
match. A streaming iterator (glob.iterate) would bound the
walk too. Non-trivial; tracked as a separate hardening
follow-up since current behavior is correctness-safe (just
not optimal under huge trees).
- Per-path glob escape hash in audit hint (currently aggregated
count) — can revisit once PR 19 wires the routes and we see
real audit volume.
- EVENT_SCHEMA_VERSION migration mechanism — orthogonal; the
whole BridgeEvent schema lacks one and that's a Wave 5+
concern.
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
This commit is contained in:
parent
7b0db4c3ae
commit
a81ada43f5
5 changed files with 198 additions and 23 deletions
|
|
@ -11,7 +11,14 @@ import * as path from 'node:path';
|
|||
import { randomBytes } from 'node:crypto';
|
||||
import { WorkspaceContext } from '@qwen-code/qwen-code-core';
|
||||
import { canonicalizeWorkspace, resolveWithinWorkspace } from './paths.js';
|
||||
import type { FsError, isFsError, type FsErrorKind } from './errors.js';
|
||||
// `isFsError` is a runtime guard called below — must stay a value
|
||||
// import. `FsError` is type-only here (typed `catch` variable); same
|
||||
// for `FsErrorKind`. The eslint-disable mirrors the workspaceFileSystem.ts
|
||||
// fix and exists because the auto-fix at commit 7b0db4c3a promoted the
|
||||
// whole line to `import type`, erasing `isFsError` at runtime and
|
||||
// failing 11 tests in this file alone.
|
||||
|
||||
import { isFsError, type FsError, type FsErrorKind } from './errors.js';
|
||||
|
||||
/**
|
||||
* Kinds the boundary throws to indicate the path is *outside the
|
||||
|
|
|
|||
|
|
@ -5,17 +5,30 @@
|
|||
*/
|
||||
|
||||
import * as path from 'node:path';
|
||||
import type { Ignore} from '@qwen-code/qwen-code-core';
|
||||
import type { Ignore } from '@qwen-code/qwen-code-core';
|
||||
import { isBinaryFile } from '@qwen-code/qwen-code-core';
|
||||
import { FsError } from './errors.js';
|
||||
import type { Intent, ResolvedPath } from './paths.js';
|
||||
|
||||
/**
|
||||
* Maximum bytes returned by `readText`. Mirrors claude-code's
|
||||
* Maximum bytes the `readText` boundary is willing to slurp before
|
||||
* delegating to the core service. Mirrors claude-code's
|
||||
* `MAX_OUTPUT_SIZE` (`src/utils/file.ts`) — large enough for
|
||||
* typical source files, small enough that an SSE replay buffer
|
||||
* doesn't fill on a single read. Reads above this size return
|
||||
* truncated content with `meta.truncated = true`.
|
||||
* doesn't fill on a single read.
|
||||
*
|
||||
* Files **above** this cap are refused with `file_too_large` rather
|
||||
* than truncated — the underlying `readFileWithLineAndLimit`
|
||||
* reads the whole file into memory before slicing lines, so soft
|
||||
* truncation past the cap would still OOM the daemon. Files
|
||||
* **at or below** the cap honor a tighter `opts.maxBytes` via
|
||||
* post-decode truncation (`enforceReadSize`); that's where the
|
||||
* `meta.truncated = true` flag fires.
|
||||
*
|
||||
* `enforceReadBytesSize` (the `readBytes` gate) and `edit()` use the
|
||||
* same constant as a hard upper bound — multi-GB files in the
|
||||
* workspace can no longer reach `fsp.readFile` through any
|
||||
* boundary path.
|
||||
*/
|
||||
export const MAX_READ_BYTES = 256 * 1024;
|
||||
|
||||
|
|
|
|||
|
|
@ -240,6 +240,39 @@ describe('WorkspaceFileSystem - glob', () => {
|
|||
expect((err as { kind: string }).kind).toBe('parse_error');
|
||||
});
|
||||
|
||||
it('rejects POSIX-absolute patterns up-front (no I/O outside workspace)', async () => {
|
||||
const err = await h.fs.glob('/etc/**/*').catch((e) => e);
|
||||
expect(isFsError(err)).toBe(true);
|
||||
expect((err as { kind: string }).kind).toBe('parse_error');
|
||||
});
|
||||
|
||||
it('rejects Win32 drive-letter and UNC patterns up-front', async () => {
|
||||
for (const pattern of [
|
||||
'C:\\Users\\foo\\**\\*.ts',
|
||||
'C:/Users/foo/**/*.ts',
|
||||
'\\\\server\\share\\**',
|
||||
'//server/share/**',
|
||||
]) {
|
||||
const err = await h.fs.glob(pattern).catch((e: unknown) => e);
|
||||
expect(isFsError(err)).toBe(true);
|
||||
expect((err as { kind: string }).kind).toBe('parse_error');
|
||||
}
|
||||
});
|
||||
|
||||
it('respects directory-only ignore rules (e.g. dist/) on glob hits', async () => {
|
||||
const ignore = new Ignore().add(['dist/']);
|
||||
h = await makeHarness({ ignore });
|
||||
await fsp.mkdir(path.join(h.workspace, 'dist'));
|
||||
await fsp.writeFile(path.join(h.workspace, 'dist', 'bundle.js'), '');
|
||||
await fsp.writeFile(path.join(h.workspace, 'src.ts'), '');
|
||||
const hits = await h.fs.glob('*');
|
||||
const names = hits.map((p) => path.basename(p)).sort();
|
||||
// `dist` directory is filtered because the trailing-slash dir
|
||||
// pattern now probes `<rel>/` against the directory ignorer.
|
||||
expect(names).not.toContain('dist');
|
||||
expect(names).toContain('src.ts');
|
||||
});
|
||||
|
||||
it('respects maxResults', async () => {
|
||||
const hits = await h.fs.glob('**/*', { maxResults: 1 });
|
||||
expect(hits).toHaveLength(1);
|
||||
|
|
@ -303,6 +336,44 @@ describe('WorkspaceFileSystem - write/edit', () => {
|
|||
expect(isFsError(err)).toBe(true);
|
||||
expect((err as { kind: string }).kind).toBe('parse_error');
|
||||
});
|
||||
|
||||
it('rejects edit on file > MAX_READ_BYTES with file_too_large (no slurp)', async () => {
|
||||
const policy = await import('./policy.js');
|
||||
const big = path.join(h.workspace, 'huge.txt');
|
||||
await fsp.writeFile(big, 'a'.repeat(policy.MAX_READ_BYTES + 1));
|
||||
const r = await h.fs.resolve('huge.txt', 'write');
|
||||
const err = await h.fs.edit(r, 'a', 'b').catch((e: unknown) => e);
|
||||
expect(isFsError(err)).toBe(true);
|
||||
expect((err as { kind: string }).kind).toBe('file_too_large');
|
||||
});
|
||||
|
||||
it('rejects edit on binary file', async () => {
|
||||
const bin = path.join(h.workspace, 'bin.dat');
|
||||
const buf = Buffer.alloc(64);
|
||||
buf[5] = 0;
|
||||
await fsp.writeFile(bin, buf);
|
||||
const r = await h.fs.resolve('bin.dat', 'write');
|
||||
const err = await h.fs.edit(r, '\x00', 'x').catch((e: unknown) => e);
|
||||
expect(isFsError(err)).toBe(true);
|
||||
expect((err as { kind: string }).kind).toBe('binary_file');
|
||||
});
|
||||
|
||||
it('readText converts 1-based line to 0-based slice (line: 1 returns from first line)', async () => {
|
||||
const target = path.join(h.workspace, 'lines.txt');
|
||||
await fsp.writeFile(target, 'one\ntwo\nthree\n');
|
||||
const r = await h.fs.resolve('lines.txt', 'read');
|
||||
const out = await h.fs.readText(r, { line: 1, limit: 1 });
|
||||
// 1-based line 1 → 0-based slice index 0 → first line "one"
|
||||
expect(out.content.split('\n')[0]).toBe('one');
|
||||
});
|
||||
|
||||
it('readText with line: 2 starts from the second line', async () => {
|
||||
const target = path.join(h.workspace, 'lines2.txt');
|
||||
await fsp.writeFile(target, 'one\ntwo\nthree\n');
|
||||
const r = await h.fs.resolve('lines2.txt', 'read');
|
||||
const out = await h.fs.readText(r, { line: 2, limit: 1 });
|
||||
expect(out.content.split('\n')[0]).toBe('two');
|
||||
});
|
||||
});
|
||||
|
||||
describe('WorkspaceFileSystem - trust gate', () => {
|
||||
|
|
|
|||
|
|
@ -7,10 +7,19 @@
|
|||
import { promises as fsp } from 'node:fs';
|
||||
import * as path from 'node:path';
|
||||
import { glob as globAsync } from 'glob';
|
||||
import type {
|
||||
Ignore,
|
||||
// `StandardFileSystemService` is constructed and `loadIgnoreRules` is
|
||||
// invoked at runtime — they MUST stay as value imports. The eslint
|
||||
// auto-fix in commit 7b0db4c3a hoisted the whole block to `import type`
|
||||
// (because the same line referenced the `Ignore` and `WriteTextFileOptions`
|
||||
// types), which silently erased the value bindings and broke the runtime
|
||||
// + 31 tests post-commit. The inline `type` modifiers below tell the
|
||||
// `consistent-type-imports` rule per-symbol intent so future autofixes
|
||||
// don't repeat the regression.
|
||||
|
||||
import {
|
||||
StandardFileSystemService,
|
||||
loadIgnoreRules,
|
||||
type Ignore,
|
||||
type WriteTextFileOptions,
|
||||
} from '@qwen-code/qwen-code-core';
|
||||
import type { BridgeEvent } from '../eventBus.js';
|
||||
|
|
@ -69,7 +78,13 @@ export interface ReadMeta {
|
|||
export interface ReadTextOptions {
|
||||
/** Cap returned bytes; defaults to MAX_READ_BYTES. */
|
||||
maxBytes?: number;
|
||||
/** 1-based starting line for partial reads. */
|
||||
/**
|
||||
* 1-based starting line for partial reads. `1` returns the file
|
||||
* from its first line. The boundary converts to the 0-based slice
|
||||
* index `readFileWithLineAndLimit` expects internally; SDK
|
||||
* consumers don't need to adjust. Values < 1 (or undefined) are
|
||||
* treated as "from the beginning".
|
||||
*/
|
||||
line?: number;
|
||||
/** Maximum number of lines to return. */
|
||||
limit?: number;
|
||||
|
|
@ -269,11 +284,17 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
|||
const sizeOutcome = enforceReadSize(st.size, opts.maxBytes);
|
||||
// Delegate encoding-aware read to the existing core service so
|
||||
// BOM, CRLF, and iconv-supported codepages remain consistent
|
||||
// with what the tools layer already does.
|
||||
// with what the tools layer already does. The core service's
|
||||
// `line` parameter is a 0-based slice index whereas the
|
||||
// boundary's public `ReadTextOptions.line` is 1-based (the
|
||||
// convention SDK consumers expect from line-numbered errors,
|
||||
// editor jump-to-line, etc.). Convert here so the public
|
||||
// contract isn't tied to the internal helper's indexing.
|
||||
const startLineIndex = opts.line && opts.line > 1 ? opts.line - 1 : 0;
|
||||
const result = await this.deps.lowFs.readTextFile({
|
||||
path: p as string,
|
||||
limit: opts.limit ?? Number.POSITIVE_INFINITY,
|
||||
line: opts.line ?? 0,
|
||||
line: startLineIndex,
|
||||
});
|
||||
const ignoreVerdict = shouldIgnore(
|
||||
p,
|
||||
|
|
@ -305,7 +326,7 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
|||
opts.limit !== undefined &&
|
||||
Number.isFinite(opts.limit) &&
|
||||
result._meta?.originalLineCount !== undefined &&
|
||||
result._meta.originalLineCount > opts.limit + (opts.line ?? 0)
|
||||
result._meta.originalLineCount > opts.limit + startLineIndex
|
||||
) {
|
||||
meta.truncated = true;
|
||||
}
|
||||
|
|
@ -386,17 +407,36 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
|||
const start = performance.now();
|
||||
try {
|
||||
assertTrustedForIntent(this.deps.trusted, 'glob');
|
||||
// Reject `..` segments in the pattern outright. A pattern like
|
||||
// `../../**/*.js` would otherwise let a caller search outside
|
||||
// `cwd`. Boundary-checking each individual hit catches escapes
|
||||
// too, but rejecting at the pattern level is cheaper and gives
|
||||
// a clearer error.
|
||||
// Reject patterns up-front before delegating to `glob` — the
|
||||
// per-hit filter below catches escapes after the walk, but
|
||||
// letting a clearly out-of-workspace pattern reach `globAsync`
|
||||
// burns I/O *outside* the workspace before we drop the
|
||||
// results. Three rejection classes:
|
||||
// 1. `..` segments — would let `cwd` be escaped lexically.
|
||||
// 2. POSIX absolute (`/etc/**`) — `glob` rooted outside cwd.
|
||||
// 3. Windows-style absolute / device prefixes (`C:\…`,
|
||||
// `\\?\…`, `\\server\share`) — same hazard on the other
|
||||
// platform. `path.isAbsolute` covers POSIX `/`; the
|
||||
// drive-letter / UNC checks cover Win32 even when the
|
||||
// daemon runs on POSIX (clients may send Win32 paths).
|
||||
if (pattern.split(/[\\/]/).some((seg) => seg === '..')) {
|
||||
throw new FsError(
|
||||
'parse_error',
|
||||
`glob pattern may not contain '..' segments: ${pattern}`,
|
||||
);
|
||||
}
|
||||
if (
|
||||
path.isAbsolute(pattern) ||
|
||||
/^[A-Za-z]:[\\/]/.test(pattern) ||
|
||||
pattern.startsWith('\\\\') ||
|
||||
pattern.startsWith('//')
|
||||
) {
|
||||
throw new FsError(
|
||||
'parse_error',
|
||||
`glob pattern must be workspace-relative: ${pattern}`,
|
||||
{ hint: 'pass a relative pattern such as "src/**/*.ts"' },
|
||||
);
|
||||
}
|
||||
const cwd = (opts.cwd as string | undefined) ?? this.deps.boundWorkspace;
|
||||
const matches = await globAsync(pattern, {
|
||||
cwd,
|
||||
|
|
@ -434,11 +474,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
|||
escapedCount += 1;
|
||||
continue;
|
||||
}
|
||||
// Check the dirent kind so directory ignore rules (`dist/`,
|
||||
// `.git/`, `node_modules/`) actually match — `shouldIgnore`
|
||||
// probes `<rel>/` for the directory filter, which the
|
||||
// underlying `ignore` library requires for trailing-slash
|
||||
// patterns. Probing every hit as a `file` (the prior
|
||||
// behavior) silently leaks ignored directories from
|
||||
// `glob('**/*')` even when `includeIgnored` is false. We
|
||||
// already realpath'd the hit, so an extra `lstat` here is
|
||||
// cheap; on `lstat` failure (raced unlink) we conservatively
|
||||
// treat the hit as a file so the file-pattern check still
|
||||
// runs.
|
||||
let dirent: { isDirectory(): boolean } | null = null;
|
||||
try {
|
||||
dirent = await fsp.lstat(canonical);
|
||||
} catch {
|
||||
dirent = null;
|
||||
}
|
||||
const kind = dirent?.isDirectory() ? 'directory' : 'file';
|
||||
const verdict = shouldIgnore(
|
||||
canonical as ResolvedPath,
|
||||
this.deps.boundWorkspace,
|
||||
this.deps.ignore,
|
||||
'file',
|
||||
kind,
|
||||
);
|
||||
if (verdict.ignored && !opts.includeIgnored) continue;
|
||||
out.push(canonical as ResolvedPath);
|
||||
|
|
@ -504,6 +562,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
|||
const start = performance.now();
|
||||
try {
|
||||
assertTrustedForIntent(this.deps.trusted, 'edit');
|
||||
// Mirror `readText`'s pre-stat OOM gate: `fsp.readFile` would
|
||||
// otherwise slurp the whole target into memory before
|
||||
// `enforceWriteSize` got a chance to refuse. A multi-GB file
|
||||
// already inside the workspace can OOM the daemon even though
|
||||
// the *edited output* would later fail the size check.
|
||||
// Reject above `MAX_READ_BYTES` outright with a typed
|
||||
// `file_too_large`; binary content is also refused since
|
||||
// `current.indexOf(oldText)` over arbitrary bytes is meaningless.
|
||||
const st = await fsp.stat(p as string);
|
||||
if (st.size > MAX_READ_BYTES) {
|
||||
throw new FsError(
|
||||
'file_too_large',
|
||||
`file of ${st.size} bytes exceeds edit cap of ${MAX_READ_BYTES} bytes`,
|
||||
{
|
||||
hint: 'split large edits into bounded readBytes/writeText sequences',
|
||||
},
|
||||
);
|
||||
}
|
||||
if (await detectBinary(p)) {
|
||||
throw new FsError('binary_file', `cannot edit binary file: ${p}`, {
|
||||
hint: 'edit() works on text files only',
|
||||
});
|
||||
}
|
||||
const current = await fsp.readFile(p as string, 'utf-8');
|
||||
// Single replacement to preserve atomic write-once semantics.
|
||||
// Multi-occurrence handling lives in PR 20's edit endpoint
|
||||
|
|
|
|||
|
|
@ -86,12 +86,15 @@ export interface ServeAppDeps {
|
|||
* Workspace filesystem boundary factory (#4175 PR 18). When
|
||||
* supplied, PR 19/20 routes will pull a per-request
|
||||
* `WorkspaceFileSystem` off it; when omitted, `createServeApp`
|
||||
* builds a permissive default (trusted=true, no-op audit emit).
|
||||
* No PR 18 routes consume the factory yet — the slot is wired so
|
||||
* PR 19 read-only file routes can drop in without re-shaping
|
||||
* `ServeAppDeps`. Once PR 19 lands, `runQwenServe` will inject a
|
||||
* factory whose `trusted` flag mirrors `Config.isTrustedFolder()`
|
||||
* and whose `emit` plumbs into the per-session EventBus.
|
||||
* builds a strict default (`trusted: false`, warn-once no-op
|
||||
* `emit`) so an upstream refactor that forgets to inject
|
||||
* `fsFactory` never silently allows writes against an untrusted
|
||||
* workspace. No PR 18 routes consume the factory yet — the slot
|
||||
* is wired so PR 19 read-only file routes can drop in without
|
||||
* re-shaping `ServeAppDeps`. Once PR 19 lands, `runQwenServe`
|
||||
* will inject a factory whose `trusted` flag mirrors
|
||||
* `Config.isTrustedFolder()` and whose `emit` plumbs into the
|
||||
* per-session EventBus.
|
||||
*/
|
||||
fsFactory?: WorkspaceFileSystemFactory;
|
||||
}
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue