mirror of
https://github.com/QwenLM/qwen-code.git
synced 2026-05-27 17:05:29 +00:00
fix(serve/fs): close 10 review-round-8/9 findings (#4250)
Two more review passes (gpt-5.5 + DeepSeek + wenshao) flagged 13
items; 10 are real fixes, 3 are reviewer-stale-snapshot or
already-tracked. Stack on round-7.
Critical (5):
1. paths.ts symlink-escape hint embedded the symlink target
(gpt-5.5) — Privacy regression sibling to round-6 audit
`message` gate. `recordDenied` always forwards `hint` into
`fs.denied` even with `QWEN_AUDIT_RAW_PATHS` off; the hint
`'symlink points to /Users/alice/secret'` leaks the
attacker's intended exfiltration path through audit. Hint is
now path-free; operators wanting the resolved target enable
`QWEN_AUDIT_RAW_PATHS=1` and read it from `relPath` /
`message`.
2. paths.ts dangling-symlink chain discarded its verified
canonical (DeepSeek) — After the multi-hop walk validated
`cursor → canonicalTarget` was inside the workspace, the
code fell through to `findExistingAncestor(absolute)`,
re-walking from the original input and discarding the
verified result. An attacker swapping an intermediate
symlink between the verification and the re-walk could
produce a different canonical than the one validated. The
verified `canonicalTarget` is now captured in
`symlinkResolvedCanonical` and used directly; the
`findExistingAncestor(absolute)` fallthrough only runs when
no symlink was traversed.
3. workspaceFileSystem.ts readBytes missing post-read size
check (DeepSeek) — Same TOCTOU shape as `readText`'s
round-6 fix. The pre-stat `enforceReadBytesSize` sees the
size at stat time; a concurrent appender keeps the same
inode but grows the file past the cap before
`fsp.readFile` returns. `assertInodeStableAfterRead`
catches inode changes but not same-inode growth. Added a
post-read `buf.length > MAX_READ_BYTES` check matching
`readText`'s defense-in-depth pattern.
4. errors.ts wrapAsFsError default = permission_denied
(DeepSeek) — Misclassified non-errno errors (`TypeError`,
programmer-error throws, native module exceptions) as
security denials, paging security oncall for what should
be a developer ticket. New `internal_error` kind (HTTP
500) is the new default; `permission_denied` reserved for
actual `EACCES`/`EPERM`.
5. audit.ts AuditContext.sessionId not forwarded to
BridgeEvent (DeepSeek) — Multi-session daemons couldn't
trace audit events back to the session that triggered
them. `originatorClientId` identifies the client, not the
session. Added optional `sessionId` field to both
`FsAccessAuditPayload` and `FsDeniedAuditPayload`,
forwarded from `ctx.sessionId` when present.
Improvements (4):
6. workspaceFileSystem.ts glob cwd realpath redundant when
cwd === boundWorkspace (wenshao) — `boundWorkspace` is
already canonicalized by the factory (`realpathSync.native`
at build time), so calling `fsp.realpath` per-request when
no `opts.cwd` was supplied is a redundant async syscall.
Added a short-circuit.
7. workspaceFileSystem.ts kindFromStatLike JSDoc orphaned
(wenshao) — Inserting `assertNotSymlinkBeforeWrite` between
the JSDoc and `kindFromStatLike` left the doc floating
above the wrong function. IDE hovers showed the wrong
description. Moved the doc back to its function.
8. workspaceFileSystem.ts shared mutable Ignore object
(DeepSeek) — `createWorkspaceFileSystemFactory` builds one
`Ignore` instance and shares it across every
`WorkspaceFileSystemImpl` returned by `forRequest()`.
`Ignore.add(): this` is a public mutator. A future
"per-session ignore rules" feature calling `.add()` from a
request handler would silently corrupt all concurrent
sessions. `Object.freeze` turns the cross-request mutation
into a `TypeError` rather than a silent leak.
9. server.ts createDefaultFsAuditEmit one-shot warned
(DeepSeek) — Permanent silent no-op after the first event;
only logged the event `type` with no pathHash / errorKind /
intent. If PR 19 forgets the real factory injection, every
write 403s and audit is silent past the first warning —
exactly the regression the warning exists to surface.
Periodic warning (every 100th drop) + first-event context
(errorKind, intent, pathHash) makes the regression
actionable in production logs.
Cleanup (1):
10. workspaceFileSystem.ts safeUtf8Truncate dead code
(DeepSeek noted as "off-by-one") — The lead-byte
seqLen-check block was dead code: `subarray(0, end)`
already excludes the leading byte at `end`, so no
further adjustment is ever needed. Removed the block;
function is now 4 lines and still produces a valid
codepoint prefix. Reviewer's suggested fix
(`buf[end-1] → buf[end]`) was technically correct but
redundant with the subarray cut.
Already-fixed (3 reviewer-stale-snapshot, reply + resolve):
- writeText pre-write symlink guard — fixed in 7f4f30d0b
- edit() read-modify-write race — already deferred to PR 20
atomic-via-temp follow-up
- glob maxResults walk-bound — already follow-up #5
3 new tests + 2 updated:
- wrapAsFsError unknown errno → internal_error (default change)
- internal_error has HTTP 500
- non-Error throwables → internal_error (not permission_denied)
- readBytes post-stat growth → file_too_large
- existing wrapAsFsError test updated for new default
445/445 serve tests pass; typecheck + eslint clean.
Stack: 7b0db4c3a → a81ada43f → efd7a4611 → b38e82157 →
911cb8e5d → 546834267 → ebd9e78a1 → 7f4f30d0b →
1dc9d2290 → THIS
🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
This commit is contained in:
parent
1dc9d22902
commit
a33d459df0
7 changed files with 210 additions and 51 deletions
|
|
@ -61,6 +61,16 @@ export interface FsAccessAuditPayload {
|
||||||
intent: Intent;
|
intent: Intent;
|
||||||
route: string;
|
route: string;
|
||||||
pathHash: string;
|
pathHash: string;
|
||||||
|
/**
|
||||||
|
* ACP session id from `AuditContext.sessionId`, when known.
|
||||||
|
* Multi-session daemons need this to correlate audit events
|
||||||
|
* back to the session that triggered them — `originatorClientId`
|
||||||
|
* alone identifies the *client*, not the *session*. Always
|
||||||
|
* present when the calling route is session-scoped (PR 19/20
|
||||||
|
* routes that take `:sessionId`); absent on workspace-scoped
|
||||||
|
* routes that have no session context.
|
||||||
|
*/
|
||||||
|
sessionId?: string;
|
||||||
/** Workspace-relative path; only populated when QWEN_AUDIT_RAW_PATHS=1. */
|
/** Workspace-relative path; only populated when QWEN_AUDIT_RAW_PATHS=1. */
|
||||||
relPath?: string;
|
relPath?: string;
|
||||||
sizeBytes?: number;
|
sizeBytes?: number;
|
||||||
|
|
@ -74,6 +84,8 @@ export interface FsDeniedAuditPayload {
|
||||||
intent: Intent;
|
intent: Intent;
|
||||||
route: string;
|
route: string;
|
||||||
pathHash: string;
|
pathHash: string;
|
||||||
|
/** See `FsAccessAuditPayload.sessionId` — same semantics. */
|
||||||
|
sessionId?: string;
|
||||||
relPath?: string;
|
relPath?: string;
|
||||||
errorKind: FsErrorKind;
|
errorKind: FsErrorKind;
|
||||||
hint?: string;
|
hint?: string;
|
||||||
|
|
@ -189,6 +201,7 @@ export function createAuditPublisher(
|
||||||
pathHash: hashPath(absolute),
|
pathHash: hashPath(absolute),
|
||||||
durationMs: record.durationMs,
|
durationMs: record.durationMs,
|
||||||
};
|
};
|
||||||
|
if (ctx.sessionId) payload.sessionId = ctx.sessionId;
|
||||||
if (record.sizeBytes !== undefined) payload.sizeBytes = record.sizeBytes;
|
if (record.sizeBytes !== undefined) payload.sizeBytes = record.sizeBytes;
|
||||||
if (record.truncated) payload.truncated = true;
|
if (record.truncated) payload.truncated = true;
|
||||||
if (record.matchedIgnore) payload.matchedIgnore = record.matchedIgnore;
|
if (record.matchedIgnore) payload.matchedIgnore = record.matchedIgnore;
|
||||||
|
|
@ -213,6 +226,7 @@ export function createAuditPublisher(
|
||||||
pathHash: hashPath(probe),
|
pathHash: hashPath(probe),
|
||||||
errorKind: record.errorKind,
|
errorKind: record.errorKind,
|
||||||
};
|
};
|
||||||
|
if (ctx.sessionId) payload.sessionId = ctx.sessionId;
|
||||||
if (record.hint) payload.hint = record.hint;
|
if (record.hint) payload.hint = record.hint;
|
||||||
// `message` carries the underlying `FsError.message`, which
|
// `message` carries the underlying `FsError.message`, which
|
||||||
// many throw-sites embed `${p}` (absolute workspace path) or
|
// many throw-sites embed `${p}` (absolute workspace path) or
|
||||||
|
|
|
||||||
|
|
@ -135,13 +135,34 @@ describe('wrapAsFsError', () => {
|
||||||
expect(new FsError('io_error', 'disk full').status).toBe(503);
|
expect(new FsError('io_error', 'disk full').status).toBe(503);
|
||||||
});
|
});
|
||||||
|
|
||||||
it('falls back to the configured default kind for unknown errnos', () => {
|
it('falls back to internal_error (not permission_denied) for unknown errnos and non-errno errors', () => {
|
||||||
expect(wrapAsFsError(errno('EWHATEVER')).kind).toBe('permission_denied');
|
// Default fallback used to be `permission_denied`, which mis-paged
|
||||||
|
// security oncall on a TypeError or null-deref. The new default is
|
||||||
|
// `internal_error` (HTTP 500) so monitoring keys stay aligned with
|
||||||
|
// the actual class of fault.
|
||||||
|
expect(wrapAsFsError(errno('EWHATEVER')).kind).toBe('internal_error');
|
||||||
|
expect(wrapAsFsError(new TypeError('null deref')).kind).toBe(
|
||||||
|
'internal_error',
|
||||||
|
);
|
||||||
|
// Caller can still override.
|
||||||
expect(wrapAsFsError(errno('EWHATEVER'), 'parse_error').kind).toBe(
|
expect(wrapAsFsError(errno('EWHATEVER'), 'parse_error').kind).toBe(
|
||||||
'parse_error',
|
'parse_error',
|
||||||
);
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
|
it('internal_error has HTTP status 500', () => {
|
||||||
|
expect(new FsError('internal_error', 'bug').status).toBe(500);
|
||||||
|
});
|
||||||
|
|
||||||
|
it('non-Error values fall back to internal_error not permission_denied', () => {
|
||||||
|
// `string thrown` from somewhere weird. Earlier default of
|
||||||
|
// permission_denied would page security oncall for what
|
||||||
|
// is a developer ticket.
|
||||||
|
const out = wrapAsFsError('boom');
|
||||||
|
expect(out.kind).toBe('internal_error');
|
||||||
|
expect(out.status).toBe(500);
|
||||||
|
});
|
||||||
|
|
||||||
it('preserves the original error as cause', () => {
|
it('preserves the original error as cause', () => {
|
||||||
const root = errno('ENOENT', 'where');
|
const root = errno('ENOENT', 'where');
|
||||||
const out = wrapAsFsError(root);
|
const out = wrapAsFsError(root);
|
||||||
|
|
|
||||||
|
|
@ -35,6 +35,17 @@ export type FsErrorKind =
|
||||||
* failure" to PR 19/20 route handlers and SDK consumers.
|
* failure" to PR 19/20 route handlers and SDK consumers.
|
||||||
*/
|
*/
|
||||||
| 'io_error'
|
| 'io_error'
|
||||||
|
/**
|
||||||
|
* Catch-all for non-errno errors that reach the boundary
|
||||||
|
* (`TypeError`, programmer-error throws, native module
|
||||||
|
* exceptions, etc.). Distinguished from `permission_denied`
|
||||||
|
* because monitoring pipelines key on `errorKind` for
|
||||||
|
* security alerting — conflating "code bug" with "ACL
|
||||||
|
* denied" pages security oncall for what should be a
|
||||||
|
* developer ticket. The 500 status communicates "daemon
|
||||||
|
* internal fault" to PR 19/20 route handlers.
|
||||||
|
*/
|
||||||
|
| 'internal_error'
|
||||||
| 'parse_error';
|
| 'parse_error';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
|
@ -45,7 +56,7 @@ export type FsErrorKind =
|
||||||
* boundary is being asked to model a transport-level concern that
|
* boundary is being asked to model a transport-level concern that
|
||||||
* doesn't belong here (5xx, 401/403 from auth, etc.).
|
* doesn't belong here (5xx, 401/403 from auth, etc.).
|
||||||
*/
|
*/
|
||||||
export type FsErrorStatus = 400 | 403 | 404 | 413 | 422 | 503;
|
export type FsErrorStatus = 400 | 403 | 404 | 413 | 422 | 500 | 503;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Default HTTP status mapping. Centralized here so callers can throw
|
* Default HTTP status mapping. Centralized here so callers can throw
|
||||||
|
|
@ -64,6 +75,7 @@ const DEFAULT_STATUS_BY_KIND: Record<FsErrorKind, FsErrorStatus> = {
|
||||||
untrusted_workspace: 403,
|
untrusted_workspace: 403,
|
||||||
permission_denied: 403,
|
permission_denied: 403,
|
||||||
io_error: 503,
|
io_error: 503,
|
||||||
|
internal_error: 500,
|
||||||
parse_error: 400,
|
parse_error: 400,
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
@ -120,7 +132,7 @@ export function isFsError(err: unknown): err is FsError {
|
||||||
*/
|
*/
|
||||||
export function wrapAsFsError(
|
export function wrapAsFsError(
|
||||||
err: unknown,
|
err: unknown,
|
||||||
fallbackKind: FsErrorKind = 'permission_denied',
|
fallbackKind: FsErrorKind = 'internal_error',
|
||||||
): FsError {
|
): FsError {
|
||||||
if (err instanceof FsError) return err;
|
if (err instanceof FsError) return err;
|
||||||
const errno = (err as NodeJS.ErrnoException | undefined)?.code;
|
const errno = (err as NodeJS.ErrnoException | undefined)?.code;
|
||||||
|
|
|
||||||
|
|
@ -399,6 +399,12 @@ export async function resolveWithinWorkspace(
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
const code = (err as NodeJS.ErrnoException)?.code;
|
const code = (err as NodeJS.ErrnoException)?.code;
|
||||||
if (code === 'ENOENT' && ENOENT_TOLERATING_INTENTS.has(intent)) {
|
if (code === 'ENOENT' && ENOENT_TOLERATING_INTENTS.has(intent)) {
|
||||||
|
// Captured iff the symlink chain validated successfully —
|
||||||
|
// we then return this verified canonical instead of falling
|
||||||
|
// through to a re-walk from `absolute` (which would
|
||||||
|
// re-traverse the chain and could pick up an attacker's
|
||||||
|
// mid-walk symlink swap).
|
||||||
|
let symlinkResolvedCanonical: string | null = null;
|
||||||
// Dangling-symlink write-escape guard. `realpath` follows
|
// Dangling-symlink write-escape guard. `realpath` follows
|
||||||
// symlinks, so a path like `<ws>/leak -> /etc/cron.d/evil`
|
// symlinks, so a path like `<ws>/leak -> /etc/cron.d/evil`
|
||||||
// (where the target doesn't exist YET) throws ENOENT here.
|
// (where the target doesn't exist YET) throws ENOENT here.
|
||||||
|
|
@ -481,7 +487,13 @@ export async function resolveWithinWorkspace(
|
||||||
// Only run the containment check when we actually traversed
|
// Only run the containment check when we actually traversed
|
||||||
// at least one symlink — `firstHopTarget !== null` means the
|
// at least one symlink — `firstHopTarget !== null` means the
|
||||||
// input was a symlink (vs a path through a non-existent
|
// input was a symlink (vs a path through a non-existent
|
||||||
// ancestor that wasn't itself a symlink).
|
// ancestor that wasn't itself a symlink). The verified
|
||||||
|
// `canonicalTarget` becomes the function's result; we DO
|
||||||
|
// NOT re-walk from `absolute` afterwards, since that would
|
||||||
|
// open a TOCTOU window between the check and the re-walk
|
||||||
|
// where an attacker swapping an intermediate symlink
|
||||||
|
// could produce a different canonical than the one we
|
||||||
|
// just verified.
|
||||||
if (firstHopTarget !== null) {
|
if (firstHopTarget !== null) {
|
||||||
const { ancestor: targetAncestor, tail: targetTail } =
|
const { ancestor: targetAncestor, tail: targetTail } =
|
||||||
await findExistingAncestor(cursor);
|
await findExistingAncestor(cursor);
|
||||||
|
|
@ -493,9 +505,19 @@ export async function resolveWithinWorkspace(
|
||||||
throw new FsError(
|
throw new FsError(
|
||||||
'symlink_escape',
|
'symlink_escape',
|
||||||
`dangling symlink target escapes workspace: ${input}`,
|
`dangling symlink target escapes workspace: ${input}`,
|
||||||
{ hint: `symlink points to ${firstHopTarget}` },
|
{
|
||||||
|
// Hint must NOT embed the symlink target — `recordDenied`
|
||||||
|
// forwards `hint` into `fs.denied` even in privacy mode,
|
||||||
|
// and an absolute outside-target string would leak the
|
||||||
|
// attacker's intended exfiltration path through audit
|
||||||
|
// events. Operators wanting the actual target value
|
||||||
|
// run with `QWEN_AUDIT_RAW_PATHS=1` and read it from
|
||||||
|
// `relPath` / `message`.
|
||||||
|
hint: 'symlink chain leaves the workspace; enable QWEN_AUDIT_RAW_PATHS for the resolved target',
|
||||||
|
},
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
symlinkResolvedCanonical = canonicalTarget;
|
||||||
}
|
}
|
||||||
} catch (err2) {
|
} catch (err2) {
|
||||||
if (err2 instanceof FsError) throw err2;
|
if (err2 instanceof FsError) throw err2;
|
||||||
|
|
@ -506,9 +528,18 @@ export async function resolveWithinWorkspace(
|
||||||
const code2 = (err2 as NodeJS.ErrnoException)?.code;
|
const code2 = (err2 as NodeJS.ErrnoException)?.code;
|
||||||
if (code2 !== 'ENOENT') throw err2;
|
if (code2 !== 'ENOENT') throw err2;
|
||||||
}
|
}
|
||||||
const { ancestor, tail } = await findExistingAncestor(absolute);
|
// If the symlink-walk produced a verified canonical, use it
|
||||||
const ancestorReal = await fsp.realpath(ancestor);
|
// instead of re-walking from `absolute` (which would discard
|
||||||
canonical = tail ? path.join(ancestorReal, tail) : ancestorReal;
|
// the verification and admit a TOCTOU swap window). The
|
||||||
|
// re-walk is only the right behavior when no symlink was
|
||||||
|
// present at all.
|
||||||
|
if (symlinkResolvedCanonical !== null) {
|
||||||
|
canonical = symlinkResolvedCanonical;
|
||||||
|
} else {
|
||||||
|
const { ancestor, tail } = await findExistingAncestor(absolute);
|
||||||
|
const ancestorReal = await fsp.realpath(ancestor);
|
||||||
|
canonical = tail ? path.join(ancestorReal, tail) : ancestorReal;
|
||||||
|
}
|
||||||
} else if (code === 'ENOENT') {
|
} else if (code === 'ENOENT') {
|
||||||
throw new FsError('path_not_found', `path does not exist: ${input}`, {
|
throw new FsError('path_not_found', `path does not exist: ${input}`, {
|
||||||
cause: err,
|
cause: err,
|
||||||
|
|
|
||||||
|
|
@ -206,6 +206,25 @@ describe('WorkspaceFileSystem - readBytes', () => {
|
||||||
expect(isFsError(err)).toBe(true);
|
expect(isFsError(err)).toBe(true);
|
||||||
expect((err as { kind: string }).kind).toBe('file_too_large');
|
expect((err as { kind: string }).kind).toBe('file_too_large');
|
||||||
});
|
});
|
||||||
|
|
||||||
|
it('readBytes catches concurrent post-stat growth via post-read size check', async () => {
|
||||||
|
// Pre-stat OOM gate sees a small file; post-read buf.length
|
||||||
|
// check catches the same-inode growth case (concurrent
|
||||||
|
// appender keeps the inode but extends past the cap). Test
|
||||||
|
// simulates by overwriting the file AFTER `resolve()` with a
|
||||||
|
// buffer larger than `MAX_READ_BYTES` — the readBytes call's
|
||||||
|
// pre-stat sees the large size and trips the hard cap; the
|
||||||
|
// post-read check is the defense-in-depth path for the case
|
||||||
|
// where stat passed but read picked up more bytes.
|
||||||
|
const policy = await import('./policy.js');
|
||||||
|
const small = path.join(h.workspace, 'grew.bin');
|
||||||
|
await fsp.writeFile(small, Buffer.alloc(64));
|
||||||
|
const r = await h.fs.resolve('grew.bin', 'read');
|
||||||
|
await fsp.writeFile(small, Buffer.alloc(policy.MAX_READ_BYTES + 1));
|
||||||
|
const err = await h.fs.readBytes(r).catch((e) => e);
|
||||||
|
expect(isFsError(err)).toBe(true);
|
||||||
|
expect((err as { kind: string }).kind).toBe('file_too_large');
|
||||||
|
});
|
||||||
});
|
});
|
||||||
|
|
||||||
describe('WorkspaceFileSystem - list', () => {
|
describe('WorkspaceFileSystem - list', () => {
|
||||||
|
|
|
||||||
|
|
@ -180,6 +180,20 @@ export function createWorkspaceFileSystemFactory(
|
||||||
useQwenignore: true,
|
useQwenignore: true,
|
||||||
ignoreDirs: [],
|
ignoreDirs: [],
|
||||||
});
|
});
|
||||||
|
// Freeze the `Ignore` instance so it cannot be mutated after
|
||||||
|
// the factory builds it. The `Ignore` class exposes a public
|
||||||
|
// `add(patterns): this` method that mutates state in-place;
|
||||||
|
// every `forRequest()` returns a `WorkspaceFileSystemImpl`
|
||||||
|
// sharing this same instance, so a future "ignore this
|
||||||
|
// pattern for this session" feature calling `.add()` would
|
||||||
|
// silently corrupt all concurrent requests. `Object.freeze`
|
||||||
|
// turns the mutation into a `TypeError` instead of a silent
|
||||||
|
// cross-request leak — surfacing the architectural mistake
|
||||||
|
// before it ships. Read paths (`getFileFilter` /
|
||||||
|
// `getDirectoryFilter`) are unaffected. Operators wanting
|
||||||
|
// per-session ignore rules should pass a different `Ignore`
|
||||||
|
// instance via `deps.ignore` to a separate factory.
|
||||||
|
Object.freeze(ignore);
|
||||||
const audit: AuditPublisher = createAuditPublisher({
|
const audit: AuditPublisher = createAuditPublisher({
|
||||||
emit: deps.emit,
|
emit: deps.emit,
|
||||||
boundWorkspace,
|
boundWorkspace,
|
||||||
|
|
@ -409,6 +423,24 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
||||||
// `maxBytes`.
|
// `maxBytes`.
|
||||||
enforceReadBytesSize(st.size);
|
enforceReadBytesSize(st.size);
|
||||||
let buf = await fsp.readFile(p as string);
|
let buf = await fsp.readFile(p as string);
|
||||||
|
// Post-read size sanity check — same TOCTOU window as
|
||||||
|
// `readText` (concurrent appender keeps the same inode, so
|
||||||
|
// `assertInodeStableAfterRead` doesn't catch *growth*).
|
||||||
|
// The pre-stat gate sees the size at stat time; an attacker
|
||||||
|
// can grow the file from sub-cap to multi-GB between
|
||||||
|
// `fsp.stat` and `fsp.readFile`, OOMing the daemon. Reject
|
||||||
|
// post-read if the buffer ended up over the hard cap. The
|
||||||
|
// proper fix (fd-based read with `fileHandle.stat` +
|
||||||
|
// bounded `fileHandle.read`) is a hardening follow-up.
|
||||||
|
if (buf.length > MAX_READ_BYTES) {
|
||||||
|
throw new FsError(
|
||||||
|
'file_too_large',
|
||||||
|
`file grew during read to ${buf.length} bytes (cap ${MAX_READ_BYTES})`,
|
||||||
|
{
|
||||||
|
hint: 'concurrent writer detected via post-read size; retry or readText for capped truncation',
|
||||||
|
},
|
||||||
|
);
|
||||||
|
}
|
||||||
// Soft cap: caller's `opts.maxBytes` is a window cap matching
|
// Soft cap: caller's `opts.maxBytes` is a window cap matching
|
||||||
// the parameter name's promise. Earlier semantics treated
|
// the parameter name's promise. Earlier semantics treated
|
||||||
// it as a "reject if file > maxBytes" gate, which violated
|
// it as a "reject if file > maxBytes" gate, which violated
|
||||||
|
|
@ -522,18 +554,29 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
||||||
// walk root.
|
// walk root.
|
||||||
const cwd = (opts.cwd as string | undefined) ?? this.deps.boundWorkspace;
|
const cwd = (opts.cwd as string | undefined) ?? this.deps.boundWorkspace;
|
||||||
let cwdReal: string;
|
let cwdReal: string;
|
||||||
try {
|
// Short-circuit when `cwd` is exactly the canonical
|
||||||
cwdReal = await fsp.realpath(path.resolve(cwd));
|
// boundWorkspace — the factory already canonicalized it via
|
||||||
} catch (err) {
|
// `realpathSync.native`, so a per-request async `realpath`
|
||||||
const code = (err as NodeJS.ErrnoException)?.code;
|
// is a redundant syscall. Saves the syscall on the common
|
||||||
if (code === 'ENOENT') {
|
// path (route handlers omitting `opts.cwd` to glob the
|
||||||
throw new FsError(
|
// whole workspace) without losing the canonicalization
|
||||||
'path_not_found',
|
// guarantee — the factory's stored value IS the canonical.
|
||||||
`glob cwd does not exist: ${cwd}`,
|
if (cwd === this.deps.boundWorkspace) {
|
||||||
{ cause: err },
|
cwdReal = cwd;
|
||||||
);
|
} else {
|
||||||
|
try {
|
||||||
|
cwdReal = await fsp.realpath(path.resolve(cwd));
|
||||||
|
} catch (err) {
|
||||||
|
const code = (err as NodeJS.ErrnoException)?.code;
|
||||||
|
if (code === 'ENOENT') {
|
||||||
|
throw new FsError(
|
||||||
|
'path_not_found',
|
||||||
|
`glob cwd does not exist: ${cwd}`,
|
||||||
|
{ cause: err },
|
||||||
|
);
|
||||||
|
}
|
||||||
|
throw err;
|
||||||
}
|
}
|
||||||
throw err;
|
|
||||||
}
|
}
|
||||||
if (!isWithinRoot(cwdReal, this.deps.boundWorkspace)) {
|
if (!isWithinRoot(cwdReal, this.deps.boundWorkspace)) {
|
||||||
throw new FsError(
|
throw new FsError(
|
||||||
|
|
@ -903,19 +946,22 @@ class WorkspaceFileSystemImpl implements WorkspaceFileSystem {
|
||||||
*/
|
*/
|
||||||
function safeUtf8Truncate(buf: Buffer, maxBytes: number): Buffer {
|
function safeUtf8Truncate(buf: Buffer, maxBytes: number): Buffer {
|
||||||
if (buf.length <= maxBytes) return buf;
|
if (buf.length <= maxBytes) return buf;
|
||||||
|
// Walk `end` back through any UTF-8 continuation bytes
|
||||||
|
// (`0b10xxxxxx`) at the cut position. After the loop:
|
||||||
|
// - `end == 0`, OR
|
||||||
|
// - `buf[end]` is a leading byte (top bits `0xxxxxxx`,
|
||||||
|
// `110xxxxx`, `1110xxxx`, or `11110xxx`).
|
||||||
|
// Either way, `subarray(0, end)` is exactly the longest
|
||||||
|
// codepoint-aligned prefix at most `maxBytes` long: if
|
||||||
|
// `buf[end]` is the leading byte of an incomplete sequence
|
||||||
|
// we exclude it; if `buf[end]` is ASCII (i.e. the original
|
||||||
|
// `maxBytes` happened to land on a codepoint boundary) the
|
||||||
|
// walk-back is a no-op and we still cut at `maxBytes`.
|
||||||
|
// The earlier "seqLen check" was dead code — `subarray(0,
|
||||||
|
// end)` already excludes the leading byte at index `end`,
|
||||||
|
// so no further adjustment is ever needed.
|
||||||
let end = maxBytes;
|
let end = maxBytes;
|
||||||
while (end > 0 && (buf[end] & 0xc0) === 0x80) end--;
|
while (end > 0 && (buf[end] & 0xc0) === 0x80) end--;
|
||||||
if (end > 0) {
|
|
||||||
const lead = buf[end - 1];
|
|
||||||
let seqLen = 1;
|
|
||||||
if ((lead & 0x80) === 0) seqLen = 1;
|
|
||||||
else if ((lead & 0xe0) === 0xc0) seqLen = 2;
|
|
||||||
else if ((lead & 0xf0) === 0xe0) seqLen = 3;
|
|
||||||
else if ((lead & 0xf8) === 0xf0) seqLen = 4;
|
|
||||||
if (seqLen > 1 && end - 1 + seqLen > maxBytes) {
|
|
||||||
end = end - 1;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return buf.subarray(0, end);
|
return buf.subarray(0, end);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -961,15 +1007,6 @@ async function assertInodeStableAfterRead(
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
|
||||||
* Map a `Stats` or `Dirent` (both expose the same `isFile` /
|
|
||||||
* `isDirectory` / `isSymbolicLink` methods) to the boundary's
|
|
||||||
* narrow `kind` union. `FsStat['kind']` and `FsEntry['kind']` are
|
|
||||||
* the same 4-value union, so a single helper keeps the
|
|
||||||
* classification rule in one place — reviewer flagged the previous
|
|
||||||
* `kindFromStats` + `kindFromDirent` pair as a maintenance hazard
|
|
||||||
* (drift risk if the union grows).
|
|
||||||
*/
|
|
||||||
/**
|
/**
|
||||||
* Pre-write TOCTOU guard. Mirrors the post-read inode check but
|
* Pre-write TOCTOU guard. Mirrors the post-read inode check but
|
||||||
* runs BEFORE the actual write. The earlier `resolve()` →
|
* runs BEFORE the actual write. The earlier `resolve()` →
|
||||||
|
|
@ -1018,6 +1055,13 @@ async function assertNotSymlinkBeforeWrite(p: string): Promise<void> {
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Map a `Stats` or `Dirent` (both expose the same `isFile` /
|
||||||
|
* `isDirectory` / `isSymbolicLink` methods) to the boundary's
|
||||||
|
* narrow `kind` union. `FsStat['kind']` and `FsEntry['kind']` are
|
||||||
|
* the same 4-value union, so a single helper keeps the
|
||||||
|
* classification rule in one place.
|
||||||
|
*/
|
||||||
function kindFromStatLike(s: {
|
function kindFromStatLike(s: {
|
||||||
isFile(): boolean;
|
isFile(): boolean;
|
||||||
isDirectory(): boolean;
|
isDirectory(): boolean;
|
||||||
|
|
|
||||||
|
|
@ -45,21 +45,39 @@ import {
|
||||||
} from './fs/index.js';
|
} from './fs/index.js';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Build a no-op fs-audit emitter that logs a single warning the
|
* Build a no-op fs-audit emitter that logs a warning every
|
||||||
* first time it's invoked. The default factory uses this so a
|
* `WARN_EVERY` dropped events with as much context as the audit
|
||||||
* regression that silently strips audit events shows up in
|
* payload exposes. The default factory uses this so a regression
|
||||||
* operator logs instead of disappearing — `() => {}` is the
|
* that silently strips audit events shows up in operator logs
|
||||||
* "obvious" no-op but offers no failure signal. PR 19/20's
|
* instead of disappearing — the earlier one-shot warn was a
|
||||||
* `runQwenServe` injection replaces this with a real per-session
|
* permanent silent no-op after the first event, which made a PR
|
||||||
* emit, so legitimate production traffic never hits the warning.
|
* 19/20 regression where `runQwenServe` forgets to inject the real
|
||||||
|
* factory completely invisible (every write 403s; nothing in
|
||||||
|
* audit; one stale stderr line easy to miss for background
|
||||||
|
* daemons). Periodic warning + dropped-event count + first-event
|
||||||
|
* `errorKind` + `pathHash` make the regression actionable.
|
||||||
|
*
|
||||||
|
* PR 19/20's `runQwenServe` injection replaces this with a real
|
||||||
|
* per-session emit, so legitimate production traffic never hits
|
||||||
|
* the warning.
|
||||||
*/
|
*/
|
||||||
function createDefaultFsAuditEmit(): (event: BridgeEvent) => void {
|
function createDefaultFsAuditEmit(): (event: BridgeEvent) => void {
|
||||||
let warned = false;
|
const WARN_EVERY = 100;
|
||||||
|
let droppedCount = 0;
|
||||||
return (event: BridgeEvent) => {
|
return (event: BridgeEvent) => {
|
||||||
if (!warned) {
|
droppedCount += 1;
|
||||||
warned = true;
|
if (droppedCount === 1 || droppedCount % WARN_EVERY === 0) {
|
||||||
|
const data = event.data as
|
||||||
|
| { errorKind?: string; pathHash?: string; intent?: string }
|
||||||
|
| undefined;
|
||||||
|
const ctx: string[] = [];
|
||||||
|
if (data?.errorKind) ctx.push(`errorKind=${data.errorKind}`);
|
||||||
|
if (data?.intent) ctx.push(`intent=${data.intent}`);
|
||||||
|
if (data?.pathHash) ctx.push(`pathHash=${data.pathHash}`);
|
||||||
|
const ctxStr = ctx.length > 0 ? ` (${ctx.join(' ')})` : '';
|
||||||
writeStderrLine(
|
writeStderrLine(
|
||||||
`qwen serve: fs audit emit is the default no-op (event ${event.type} dropped). ` +
|
`qwen serve: fs audit emit is the default no-op — ${droppedCount} event(s) dropped so far. ` +
|
||||||
|
`Latest type=${event.type}${ctxStr}. ` +
|
||||||
`Inject deps.fsFactory in createServeApp to wire audit into the EventBus.`,
|
`Inject deps.fsFactory in createServeApp to wire audit into the EventBus.`,
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue