fix(cli): address banner re-review (FIFO, mutex schema, display width, regex dedupe)

Addresses the five findings on PR #3710 from the latest re-review:

1. **[Critical] FIFO/pipe at `customAsciiArt.path` no longer hangs
   startup.** The resolver was calling `openSync(path, O_NOFOLLOW)`
   *before* the `fstatSync(...).isFile()` check; on POSIX, opening a
   FIFO read-only blocks until a writer connects, and `O_NOFOLLOW`
   doesn't help — it only refuses symlinks at the final path
   component. `readArtFile` now `lstatSync()`s first and refuses
   non-regular files (FIFO / socket / device / symlink) before the
   open, while keeping the post-open `fstatSync` check for TOCTOU
   safety against a swap between the lstat and the open. New
   POSIX-only regression test `mkfifo`s a named pipe and asserts the
   resolver soft-fails inside 1 s; if the open ever regresses to
   blocking, the test will hang past the timeout and the assertion
   will catch it.

2. **[Suggestion] `{path}` and `{small,large}` are now mutually
   exclusive in both schema and runtime.** The `jsonSchemaOverride`
   on `ui.customAsciiArt` is split into three branches (string,
   `{path}`, `{small?, large?}`); none of them allow `path` and tier
   keys to co-exist. `normalizeTiers()` mirrors that — an object
   carrying both kinds of keys is now soft-rejected with a `[BANNER]`
   warn rather than letting `path` silently win and dropping the
   tier values. New regression test pins the runtime side.

3. **[Suggestion] Column cap and tier-fit selection now measure in
   terminal cells.** `getAsciiArtWidth` (in `textUtils.ts`) and the
   `MAX_ART_COLS` cap in `customBanner.ts` were both using UTF-16
   `.length`, so 200 CJK fullwidth characters would slip the cap and
   render at ~400 cells, and `pickAsciiArtTier`'s width-fit check
   was wrong for any non-ASCII art. Switched both to
   `getCachedStringWidth` (string-width semantics, already in the
   repo); art truncation walks code points until adding another
   would push the cell width past the cap, so we never split a
   fullwidth code point or surrogate pair down the middle. New
   regression test exercises the CJK fullwidth case.

4. **[Suggestion] `collectScopedTiers()` no longer drops a whole
   scope just because it has no `file.path`.** Inline-string tiers
   don't need an owning settings directory; only `{path}` tiers do.
   The path-presence check was moved into the `{path}` branch, so a
   path-less scope (e.g. `systemDefaults`, future SDK-injected
   scopes) can still contribute inline art. `{path}` entries in such
   a scope soft-fail with a tier-specific `[BANNER]` warn rather
   than killing the whole scope. Two regression tests cover both
   sides.

5. **[Suggestion] OSC / CSI / SS2-3 regex are now authored once.**
   Extracted `TERMINAL_OSC_REGEX`, `TERMINAL_CSI_REGEX`,
   `TERMINAL_SHIFT_DCS_REGEX` from `stripTerminalControlSequences`
   in `@qwen-code/qwen-code-core` and re-export them from the
   package index. `customBanner.ts` reuses the constants for
   `sanitizeArt` (which still has to preserve `\n` / `\t`) and
   delegates the title/subtitle pipeline directly to
   `stripTerminalControlSequences`. Also backported the C1 control
   strip (0x80-0x9F) into the core helper so all callers
   (session-title, etc.) benefit from the same coverage; banner
   sanitizer was the only place catching single-byte CSI / DCS / ST.

Banner suite is now 40 + 17 + 6 + 16 = 79 tests, all green. Schema
regen is still byte-for-byte idempotent. `npm run typecheck` and
prettier clean on touched files.
This commit is contained in:
秦奇 2026-05-06 13:05:24 +08:00
parent d4d315d3fe
commit 7ccbfaeb1a
7 changed files with 274 additions and 66 deletions

View file

@ -799,13 +799,25 @@ const SETTINGS_SCHEMA = {
// the way (we don't want a multi-line ASCII editor in the TUI), but
// the JSON Schema needs a real union so VS Code stops flagging the
// documented bare-string form.
// The `oneOf` here uses three *mutually exclusive* branches rather
// than one permissive object branch, so VS Code rejects nonsense
// like `{ path, small, large }` (which the runtime would also
// reject — see `normalizeTiers` in `customBanner.ts`).
jsonSchemaOverride: {
oneOf: [
{ type: 'string' },
// Bare `{path}` — no tier keys allowed.
{
type: 'object',
properties: { path: { type: 'string' } },
required: ['path'],
additionalProperties: false,
},
// Width-aware `{small?, large?}` — `path` not allowed at this
// level; each tier is itself string-or-`{path}`.
{
type: 'object',
properties: {
path: { type: 'string' },
small: {
oneOf: [
{ type: 'string' },

View file

@ -227,6 +227,37 @@ describe('resolveCustomBanner', () => {
expect(out.asciiArt.small).toBe('ABS\nART');
});
it('refuses to open a FIFO at the configured path (POSIX) — must not hang startup', () => {
if (process.platform === 'win32') return;
const fifoPath = path.join(tmpDir, 'pipe.fifo');
// mkfifo via child_process keeps this test self-contained without
// pulling in a native dep. If `mkfifo` isn't available (very rare on
// POSIX dev boxes) we skip the assertion rather than fail the suite.
try {
// eslint-disable-next-line @typescript-eslint/no-require-imports
require('node:child_process').execFileSync('mkfifo', [fifoPath]);
} catch {
return;
}
const start = Date.now();
const out = resolveCustomBanner(
makeSettings({
userUi: {
customAsciiArt: { path: 'pipe.fifo' } as CustomAsciiArtSetting,
},
userPath: path.join(tmpDir, 'settings.json'),
}),
);
const elapsedMs = Date.now() - start;
// The pre-open lstat() check should reject the FIFO instantly. If the
// resolver ever regresses to opening a FIFO read-only, this assertion
// will catch it because the open would block until a writer connects
// (we never start one) and the test would hang well past 1s.
expect(elapsedMs).toBeLessThan(1000);
expect(out.asciiArt.small).toBeUndefined();
expect(out.asciiArt.large).toBeUndefined();
});
it('refuses to follow a symlink at the configured path on POSIX', () => {
if (process.platform === 'win32') return;
const real = path.join(tmpDir, 'real.txt');
@ -274,6 +305,73 @@ describe('resolveCustomBanner', () => {
expect(out.asciiArt.small?.length).toBe(200);
});
it('rejects {path} mixed with tier keys (mutually exclusive object branches)', () => {
const out = resolveCustomBanner(
makeSettings({
userUi: {
customAsciiArt: {
path: 'p.txt',
small: 'should-be-rejected',
} as unknown as CustomAsciiArtSetting,
},
userPath: '/home/u/.qwen/settings.json',
}),
);
// The resolver must NOT silently let `path` win — both forms are
// dropped and we fall through to the default art.
expect(out.asciiArt.small).toBeUndefined();
expect(out.asciiArt.large).toBeUndefined();
});
it('caps art width by terminal cells, not UTF-16 length (CJK fullwidth)', () => {
// 200 fullwidth CJK characters render at ~400 cells; before the fix
// the .length cap let them through. The cap is now visual width, so
// ~100 fullwidth chars max per line.
const fullwidth = '一'.repeat(150); // 150 chars × 2 cells = 300 cells
const out = resolveCustomBanner(
makeSettings({ userUi: { customAsciiArt: fullwidth } }),
);
const small = out.asciiArt.small ?? '';
// Truncated by visual width: each fullwidth char is 2 cells, so
// ≤ 100 chars fit under MAX_ART_COLS=200.
expect(small.length).toBeLessThanOrEqual(100);
// And it's a valid string of fullwidth chars (no surrogate / split
// mid-codepoint — every char is a full BMP code point here).
expect(/^一+$/.test(small)).toBe(true);
});
it('falls back when a {path} entry lives in a scope with no associated file path', () => {
// `userPath: ''` simulates a path-less scope (e.g. systemDefaults).
// A {path} tier in such a scope can't resolve relative paths, so we
// soft-fail it specifically instead of silently dropping the whole
// scope (which would block inline tiers from path-less scopes too).
const out = resolveCustomBanner(
makeSettings({
userUi: {
customAsciiArt: { path: 'art.txt' } as CustomAsciiArtSetting,
},
userPath: '',
}),
);
expect(out.asciiArt.small).toBeUndefined();
expect(out.asciiArt.large).toBeUndefined();
});
it('still accepts inline string art from a scope with no file path (decoupled from {path} resolution)', () => {
// The flip side of the previous test: inline strings don't need an
// owning settings directory, so a path-less scope must still
// contribute. Before the decoupling, the whole scope was dropped on
// `!file.path`, so even inline art would silently disappear.
const out = resolveCustomBanner(
makeSettings({
userUi: { customAsciiArt: 'INLINE-FROM-PATHLESS-SCOPE' },
userPath: '',
}),
);
expect(out.asciiArt.small).toBe('INLINE-FROM-PATHLESS-SCOPE');
expect(out.asciiArt.large).toBe('INLINE-FROM-PATHLESS-SCOPE');
});
it('rejects a malformed customAsciiArt and falls back to default', () => {
const out = resolveCustomBanner(
makeSettings({

View file

@ -6,12 +6,19 @@
import * as fs from 'node:fs';
import * as path from 'node:path';
import { createDebugLogger } from '@qwen-code/qwen-code-core';
import {
createDebugLogger,
stripTerminalControlSequences,
TERMINAL_OSC_REGEX,
TERMINAL_CSI_REGEX,
TERMINAL_SHIFT_DCS_REGEX,
} from '@qwen-code/qwen-code-core';
import type { LoadedSettings, SettingsFile } from '../../config/settings.js';
import type {
AsciiArtSource,
CustomAsciiArtSetting,
} from '../../config/settingsSchema.js';
import { getCachedStringWidth, toCodePoints } from './textUtils.js';
const debugLogger = createDebugLogger('BANNER');
@ -118,15 +125,37 @@ function collectScopedTiers(settings: LoadedSettings): {
if (small && large) break;
const raw = file.settings.ui?.customAsciiArt;
if (raw === undefined || raw === null) continue;
if (!file.path) continue;
const tiers = normalizeTiers(raw);
if (!tiers) continue;
const dir = path.dirname(file.path);
if (!small && tiers.small !== undefined) {
small = { source: tiers.small, dir };
// `dir` is only meaningful for `{path}` entries (relative paths
// resolve against the file that declared them). Inline-string tiers
// don't need it, so a scope with no associated file path (e.g.
// `systemDefaults`, future SDK-injected scopes) can still contribute
// string art. When a `{path}` lands in a path-less scope we soft-fail
// that tier specifically and log a `[BANNER]` warn — dropping the
// entire scope was unnecessary coupling.
const dir = file.path ? path.dirname(file.path) : '';
const considerTier = (
tier: AsciiArtSource | undefined,
label: 'small' | 'large',
): ScopedSource | undefined => {
if (tier === undefined) return undefined;
const isPathSource = typeof tier === 'object';
if (isPathSource && !dir) {
debugLogger.warn(
`Ignoring ui.customAsciiArt.${label}: {path} entry has no owning settings file directory to resolve against.`,
);
return undefined;
}
return { source: tier, dir };
};
if (!small) {
const next = considerTier(tiers.small, 'small');
if (next) small = next;
}
if (!large && tiers.large !== undefined) {
large = { source: tiers.large, dir };
if (!large) {
const next = considerTier(tiers.large, 'large');
if (next) large = next;
}
}
return { small, large };
@ -150,11 +179,25 @@ function normalizeTiers(
return undefined;
}
if ('path' in value && typeof value.path === 'string') {
// Mirror the JSON schema's mutually-exclusive object branches: an object
// with `path` cannot also carry `small` / `large`, and vice versa. The
// schema rejects this shape in VS Code; without the same check at
// runtime, JSON parsed at startup would silently let `path` win and
// drop the tier keys (or vice versa).
const hasPath = 'path' in value && typeof value.path === 'string';
const hasTierKeys = 'small' in value || 'large' in value;
if (hasPath && hasTierKeys) {
debugLogger.warn(
'Ignoring ui.customAsciiArt: object combines `path` with `small` / `large`. Use one shape or the other.',
);
return undefined;
}
if (hasPath) {
return { small: value, large: value };
}
if ('small' in value || 'large' in value) {
if (hasTierKeys) {
const tiered = value as {
small?: unknown;
large?: unknown;
@ -227,14 +270,40 @@ function memo(
function readArtFile(absolutePath: string): string | undefined {
let fd: number | undefined;
try {
// O_NOFOLLOW prevents a symlink at the configured path from redirecting
// the read to an attacker-controlled file. Not portable to Windows,
// where `O_NOFOLLOW` is not defined; fall back to a plain read there.
// Step 1: refuse non-regular files BEFORE opening. On POSIX, opening a
// FIFO / named pipe read-only blocks until a writer connects — which
// means a misconfigured `customAsciiArt: { "path": "/tmp/some-fifo" }`
// would hang CLI startup forever. `O_NOFOLLOW` does not help here; it
// refuses symlinks at the final path component, not FIFOs / sockets /
// devices. `lstatSync` (rather than `statSync`) also covers the
// "configured path is itself a symlink" case so we soft-fail before
// opening.
let preOpenStat: fs.Stats;
try {
preOpenStat = fs.lstatSync(absolutePath);
} catch (err) {
debugLogger.warn(
`Failed to stat ui.customAsciiArt at ${absolutePath}: ${(err as Error).message}`,
);
return undefined;
}
if (!preOpenStat.isFile()) {
debugLogger.warn(
`Ignoring ui.customAsciiArt: ${absolutePath} is not a regular file.`,
);
return undefined;
}
// Step 2: open with O_NOFOLLOW (POSIX only) so a TOCTOU symlink swap
// between the lstat above and this open also soft-fails. Windows has
// no equivalent constant, so it falls back to a plain read.
const flags =
typeof fs.constants.O_NOFOLLOW === 'number'
? fs.constants.O_RDONLY | fs.constants.O_NOFOLLOW
: fs.constants.O_RDONLY;
fd = fs.openSync(absolutePath, flags);
// Re-check via fstat on the FD: if anything changed between lstat and
// open, refuse rather than reading whatever the FD now points at.
const stat = fs.fstatSync(fd);
if (!stat.isFile()) {
debugLogger.warn(
@ -268,28 +337,27 @@ function readArtFile(absolutePath: string): string | undefined {
}
/**
* Banner-specific sanitizer. Like `stripTerminalControlSequences` but
* preserves `\n` so multi-line ASCII art survives. Strips OSC/CSI/SS2/SS3
* sequences and replaces every other C0/C1 control byte (and DEL) with a
* single space a hostile or accidental escape can't paint, redirect, or
* hyperlink in the user's terminal.
* Banner-specific sanitizer. Re-uses the OSC / CSI / SS2 / SS3 patterns
* exported from `stripTerminalControlSequences` (in
* `@qwen-code/qwen-code-core`) so the regexes are authored once, but
* preserves `\n` and `\t` multi-line / tab-aligned ASCII art needs
* those, while the shared core helper strips them. The fallback range
* here matches the core helper's C0/C1/DEL strip but carves out
* `\t` (0x09) and `\n` (0x0a) so they survive into the rendered art.
*/
function sanitizeArt(input: string): string {
// Normalize CRLF / CR to LF so the column cap is computed against the
// same line boundaries the renderer will see.
let s = input.replace(/\r\n?/g, '\n');
/* eslint-disable no-control-regex */
// OSC: ESC ] ... (BEL | ESC \)
s = s.replace(/\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g, ' ');
// CSI: ESC [ params final-byte
s = s.replace(/\x1b\[[\d;?]*[a-zA-Z]/g, ' ');
// SS2/SS3/DCS leaders
s = s.replace(/\x1b[NOP]/g, ' ');
s = s
.replace(TERMINAL_OSC_REGEX, ' ')
.replace(TERMINAL_CSI_REGEX, ' ')
.replace(TERMINAL_SHIFT_DCS_REGEX, ' ');
// Remaining C0 controls + DEL + C1 controls (0x80-0x9f, e.g. single-byte
// CSI 0x9b) → space. Keep \n (0x0a) and \t (0x09) so multi-line ASCII art
// and tab-aligned art survive.
// eslint-disable-next-line no-control-regex
s = s.replace(/[\x00-\x08\x0b-\x1f\x7f-\x9f]/g, ' ');
/* eslint-enable no-control-regex */
const rawLines = s.split('\n');
const truncatedRows = rawLines.length > MAX_ART_LINES;
@ -303,11 +371,22 @@ function sanitizeArt(input: string): string {
// doesn't expand differently per terminal.
const detabbed = line.replace(/\t/g, ' ');
const trimmed = detabbed.replace(/\s+$/u, '');
if (trimmed.length > MAX_ART_COLS) {
truncatedCols = true;
return trimmed.slice(0, MAX_ART_COLS);
// Cap by *visual* width (terminal cells), not UTF-16 length: 200 CJK
// fullwidth characters render as ~400 cells, and a `.length` slice
// could split a fullwidth code point or surrogate pair down the
// middle. We walk code points until adding the next one would push
// the cell width past the cap.
if (getCachedStringWidth(trimmed) <= MAX_ART_COLS) {
return trimmed;
}
return trimmed;
truncatedCols = true;
const codePoints = toCodePoints(trimmed);
let kept = '';
for (const cp of codePoints) {
if (getCachedStringWidth(kept + cp) > MAX_ART_COLS) break;
kept += cp;
}
return kept;
});
// Drop trailing empty lines so width measurement isn't skewed by a
@ -344,11 +423,12 @@ function sanitizeSubtitle(raw: unknown): string | undefined {
/**
* Shared cleaner for any single-line info-panel string (title, subtitle).
* Strips OSC / CSI / SS2 / SS3 leaders, replaces every other C0 / C1
* control byte (and DEL) with a space, then folds any internal whitespace
* (including the newlines we just dropped from controls) into a single
* space and trims the ends. Returns `undefined` for empty input so
* `<Header />` knows to fall back to its default rendering.
* Delegates the escape-sequence + C0/C1 stripping to the core
* `stripTerminalControlSequences` helper (which already handles `\n` /
* `\t` because single-line fields don't need them), then folds any
* remaining whitespace into a single space and trims the ends. Returns
* `undefined` for empty input so `<Header />` knows to fall back to its
* default rendering.
*/
function sanitizeSingleLine(
raw: unknown,
@ -356,17 +436,7 @@ function sanitizeSingleLine(
fieldLabel: string,
): string | undefined {
if (typeof raw !== 'string') return undefined;
/* eslint-disable no-control-regex */
let t = raw
.replace(/\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g, ' ')
.replace(/\x1b\[[\d;?]*[a-zA-Z]/g, ' ')
.replace(/\x1b[NOP]/g, ' ')
// C0 + DEL + C1 controls. Single-line fields never need newlines or
// tabs (they live on a single line of the info panel) so the range is
// denser than the art sanitizer's.
.replace(/[\x00-\x1f\x7f-\x9f]/g, ' ');
/* eslint-enable no-control-regex */
t = t.replace(/\s+/g, ' ').trim();
let t = stripTerminalControlSequences(raw).replace(/\s+/g, ' ').trim();
if (!t) return undefined;
if (t.length > maxLength) {
debugLogger.warn(`Truncated ${fieldLabel} to ${maxLength} characters.`);

View file

@ -10,16 +10,20 @@ import { stripVTControlCharacters } from 'node:util';
import stringWidth from 'string-width';
/**
* Calculates the maximum width of a multi-line ASCII art string.
* Calculates the maximum *visual* width (terminal cells) of a multi-line
* ASCII art string. Uses `string-width` semantics via `getCachedStringWidth`
* so CJK fullwidth characters count as 2 cells and emoji are sized
* correctly `.length` would undercount these and let oversized art slip
* past the width budget that `pickAsciiArtTier` applies.
* @param asciiArt The ASCII art string.
* @returns The length of the longest line in the ASCII art.
* @returns The widest line's terminal-cell width.
*/
export const getAsciiArtWidth = (asciiArt: string): number => {
if (!asciiArt) {
return 0;
}
const lines = asciiArt.split('\n');
return Math.max(...lines.map((line) => line.length));
return Math.max(...lines.map((line) => getCachedStringWidth(line)));
};
/*

View file

@ -147,7 +147,12 @@ export * from './services/gitWorktreeService.js';
export * from './services/sessionRecap.js';
export * from './services/sessionService.js';
export * from './services/sessionTitle.js';
export { stripTerminalControlSequences } from './utils/terminalSafe.js';
export {
stripTerminalControlSequences,
TERMINAL_OSC_REGEX,
TERMINAL_CSI_REGEX,
TERMINAL_SHIFT_DCS_REGEX,
} from './utils/terminalSafe.js';
export * from './services/shellExecutionService.js';
export * from './services/monitorRegistry.js';
export * from './services/backgroundShellRegistry.js';

View file

@ -4,6 +4,22 @@
* SPDX-License-Identifier: Apache-2.0
*/
/**
* Regex constants shared with banner customization (`packages/cli/src/ui/
* utils/customBanner.ts`) so the OSC / CSI / SS2 / SS3 patterns are
* authored once and stay aligned across call sites. Exported via
* `@qwen-code/qwen-code-core` so the CLI sanitizer can re-use them when
* it has to preserve `\n` (which `stripTerminalControlSequences` strips).
*/
/* eslint-disable no-control-regex */
/** OSC: `ESC ]` followed by any non-BEL/non-ESC bytes terminated by BEL or `ESC \`. */
export const TERMINAL_OSC_REGEX = /\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g;
/** CSI: `ESC [` parameters then a final letter (cursor / color / erase family). */
export const TERMINAL_CSI_REGEX = /\x1b\[[\d;?]*[a-zA-Z]/g;
/** SS2 / SS3 / DCS leader bytes after ESC. */
export const TERMINAL_SHIFT_DCS_REGEX = /\x1b[NOP]/g;
/* eslint-enable no-control-regex */
/**
* Strip the terminal control sequences from arbitrary text so the result can
* safely render in a TTY without painting cursor moves, clearing the screen,
@ -15,29 +31,23 @@
* - CSI sequences (`\x1b[...<letter>`) the common "cursor/color/erase"
* family.
* - SS2/SS3 / DCS leaders (`\x1b[NOP]`).
* - Any remaining C0/C1 control bytes plus DEL, flattened to a space. This
* backstop means a bare `\x1b` that wasn't part of a recognized sequence
* still can't execute the terminal only interprets ESC followed by
* specific bytes.
* - Any remaining C0 controls + DEL + C1 controls (`0x80-0x9F`, e.g.
* single-byte CSI `0x9B`, DCS `0x90`, ST `0x9C`), flattened to a space.
* This backstop means a bare `\x1b` that wasn't part of a recognized
* sequence still can't execute — and 8-bit terminals can't interpret
* the C1 codes that some legacy shells still honor.
*
* Used for LLM-returned text that ends up in the session picker (titles);
* without this, a compromised or prompt-injected fast model could paint on
* the user's terminal on every render.
*/
export function stripTerminalControlSequences(s: string): string {
// These regexes deliberately match control characters; the whole point of
// this module is to neutralize them. The no-control-regex rule is
// suppressed per-line rather than file-wide so any future additions still
// opt in explicitly.
return (
s
.replace(TERMINAL_OSC_REGEX, ' ')
.replace(TERMINAL_CSI_REGEX, ' ')
.replace(TERMINAL_SHIFT_DCS_REGEX, ' ')
// eslint-disable-next-line no-control-regex
.replace(/\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g, ' ')
// eslint-disable-next-line no-control-regex
.replace(/\x1b\[[\d;?]*[a-zA-Z]/g, ' ')
// eslint-disable-next-line no-control-regex
.replace(/\x1b[NOP]/g, ' ')
// eslint-disable-next-line no-control-regex
.replace(/[\x00-\x1f\x7f]/g, ' ')
.replace(/[\x00-\x1f\x7f-\x9f]/g, ' ')
);
}

View file

@ -286,7 +286,16 @@
"properties": {
"path": {
"type": "string"
},
}
},
"required": [
"path"
],
"additionalProperties": false
},
{
"type": "object",
"properties": {
"small": {
"oneOf": [
{