From 9d205363433183e774174c4691129682bb7c919b Mon Sep 17 00:00:00 2001 From: ChiGao Date: Fri, 15 May 2026 17:26:18 +0800 Subject: [PATCH] perf(cli): code-split lowlight to cut startup V8 parse cost (#4070) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * perf(cli): code-split lowlight to cut startup V8 parse cost Move the syntax-highlight engine out of the synchronously-parsed cli.js entry into a separately-emitted chunk and load it via dynamic import on the first code-block render. Until the chunk arrives, code blocks render as plain text; the next React commit of the surrounding subtree picks up the highlighted version, so users never see incorrect highlighting – just an imperceptibly later transition for the very first code block. Mechanics: - esbuild config: switch entry to outdir + splitting:true so that `await import('lowlight')` produces an actual on-disk chunk that's only parsed by V8 when first needed. - esbuild-shims: rename injected __dirname/__filename to qwen-prefixed symbols + use `define` to redirect free references. Previous inject collided with vendored libraries (yargs) that ship their own `var __dirname` ESM-compat polyfill once splitting flattens chunks. - prepare-package: include the new chunks/ directory in the published package's files list. - CodeColorizer: keep the public colorize{Code,Line} signatures and HAST rendering identical; on first call when the chunk hasn't loaded it returns the plain line and fires the dynamic import via a tiny standalone loader module. - lowlightLoader (new): isolates the lazy-load surface to a module with zero transitive imports (no themeManager, settings, or core). This lets test-setup prime the cache without dragging the whole UI module graph into every test file, which was observed to perturb theme and settings test outcomes when CodeColorizer was imported directly. - test-setup: await loadLowlight() once via the standalone loader so synchronous snapshot tests see the highlighted output deterministically. Measurements (real $HOME, n=15 interleaved A/B vs main HEAD, macOS): | Metric | Before (mean±sd ms) | After (mean±sd ms) | Δ | t | p | | ------------------ | ------------------- | ------------------ | -------- | ------ | -------- | | firstByte (wall) | 1633.5 ± 88.7 | 1475.8 ± 73.3 | -157.7 | 5.31 | 1.33e-5 | | idle (wall) | 2048.7 ± 93.6 | 1902.3 ± 80.2 | -146.3 | 4.60 | 8.71e-5 | | cli.js size | 25 MB | 6.9 MB | -18.1 MB | — | — | Both metrics clear the +50ms-or-10% Welch's t-test bar by an order of magnitude. cli.js drops 72%; total payload (cli.js + chunks/) is similar but only cli.js is parsed at module-eval time, which is the phase that dominates the user-visible startup gap. How to validate: npm run bundle ls dist/ # cli.js + chunks/lowlight-*.js node dist/cli.js -y # interactive UI still renders Generated with AI Co-authored-by: Qwen-Coder * fix(cli): resolve chunk-relative sibling paths under esbuild splitting With `splitting: true`, esbuild hoists modules with shared dependencies into `dist/chunks/`. Three modules derived runtime paths from `import.meta.url` assuming they were co-located with `cli.js`; once hoisted, `path.dirname(fileURLToPath(import.meta.url))` resolved to `dist/chunks/` and sibling-asset lookups silently missed: - `skill-manager.ts`: bundledSkillsDir → `dist/chunks/bundled` (actual `dist/bundled/`). The `existsSync` guard swallowed the miss, dropping all four bundled skills (`/review`, `/qc-helper`, `/batch`, `/loop`) with no user-visible signal. - `ripgrepUtils.ts`: `getBuiltinRipgrep()` → `dist/chunks/vendor/...`. Falls back to system rg if installed, otherwise null on minimal hosts — degrading grep to the slow internal scanner. - `i18n/index.ts`: `getBuiltinLocalesDir()` → `dist/chunks/locales`. User-visible behavior survives via the static glob import in `tryImportBundledTranslations`, but the loose-on-disk override path is dead. Each module now strips a trailing `chunks` segment when present, so the lookup resolves under `dist/`. In source / transpiled modes the basename is never `chunks`, so the fallback is a no-op. Also: - Add `chunks` to `DIST_REQUIRED_PATHS` in `create-standalone-package.js` so a regressed bundle that produces only `cli.js` fails the pre-packaging check instead of shipping a broken archive. - Expand `esbuild-shims.js` header so future contributors understand that `__qwen_filename` / `__qwen_dirname` always resolve to the shim's chunk file (dist/chunks/) and that sibling-asset lookups must strip the `chunks` segment. Reported by claude-opus-4-7 via Qwen Code /qreview on #4070. Generated with AI Co-authored-by: Qwen-Coder * perf(cli): prefetch lowlight from AppContainer + harden loader Three follow-ups to the lowlight code-split: - AppContainer fires `loadLowlight()` from a mount effect so the dynamic import is already in flight before any code block needs colorizing. Without this, code blocks committed to ink's append-only `` region before the import resolves stay plain text for the rest of the session — Static can only be re-rendered via `refreshStatic`, which is not wired to lowlight load completion. Common reachable paths: short `--prompt -p` runs that finalize quickly, Ctrl+C- cancelled first turns, and the first-paint history replay on `--resume`. The startup parse-cost win is preserved (V8 still parses off the critical path). - `lowlightLoader.ts` latches the first import failure so subsequent calls short-circuit to a rejected promise instead of re-attempting `import('lowlight')` on every keystroke. The colorizer already falls back to plain text on miss; recovery requires a fresh process anyway. - `test-setup.ts` wraps the top-level `await loadLowlight()` in try/catch. A transient import failure no longer crashes the entire vitest run — tests that hit a code block render the plain-text fallback and surface a warning. - `CodeColorizer.tsx` header comment updated to point at the AppContainer prefetch instead of claiming first-paint always sees a loaded instance. Reported by DeepSeek/deepseek-v4-pro and claude-opus-4-7 via Qwen Code /review and /qreview on #4070. Generated with AI Co-authored-by: Qwen-Coder * refactor(bundle): extract resolveBundleDir helper, apply to extensions/new Centralises the `chunks/` strip pattern that three sites (`i18n/index.ts`, `skills/skill-manager.ts`, `utils/ripgrepUtils.ts`) each duplicated after the round-3 fix in d581da04d. The implicit coupling to `esbuild.config.js`'s `chunkNames: 'chunks/[name]-[hash]'` now lives in a single helper (`packages/core/src/utils/bundlePaths.ts`), so a future rename only needs updating in one place. Also applies the same anchor to `commands/extensions/new.ts:EXAMPLES_PATH`. That module is currently bundled into `cli.js` (so the strip is a no-op today), but `qwen extensions new --help` always reads the examples directory in its yargs `builder` — confirmed against the built bundle that the lookup hits `dist/examples/` (sibling of `cli.js`). Using the helper future-proofs against esbuild later hoisting the module into a shared chunk, where the bare `__dirname`/`import.meta.url` lookup would silently break the command for every end user. While here, surface lowlight-load failures from `AppContainer`'s prefetch effect to the debug channel (`debugLogger.warn`) instead of swallowing them silently. The loader already latches failures permanently, so this fires at most once per session; `CodeColorizer` continues to fall back to plain text on miss, so user-visible behaviour is unchanged. Generated with AI Co-authored-by: Qwen-Coder * fix(bundle): restore __filename shadow in ripgrepUtils; harden lowlight loader Round-4 review (wenshao 2026-05-13 13:12) flagged five issues in the recent code-split work. This commit addresses all of them. CRITICAL — `packages/core/src/utils/ripgrepUtils.ts`: the round-3 `resolveBundleDir` refactor removed the local `__filename` declaration but `getBuiltinRipgrep` still references bare `__filename` to decide how many `..` segments to walk. In `npm run dev` (tsx, ESM) `__filename` is undefined so the function throws `ReferenceError`. In the bundle esbuild's `define` rewrites it to `__qwen_filename` (the shim chunk path), which is the wrong string but happens to short-circuit to `levelsUp = 0` — accidentally correct only because the chunk-path string never contains `path.join('src', 'utils')`. Reproduced via tsx: `__filename is not defined`; fixed by re-introducing the explicit local shadow plus a comment explaining why centralising both helpers into `resolveBundleDir` cannot replace the per-file shadow. `packages/cli/src/ui/utils/lowlightLoader.ts`: the previous permanent `lowlightFailed` latch left syntax highlighting dead for the entire process lifetime on transient errors (EMFILE, antivirus locks, slow-disk-after-wake). Replaced with a 30-second cooldown — within the window subsequent calls return the cached rejection synchronously (keeps the per-render short-circuit that protects against permanently-broken installs); after the cooldown the next call retries the dynamic import. Exposes `isLowlightCoolingDown()` so render-hot callers can also skip duplicate failure logging. `packages/cli/src/ui/utils/CodeColorizer.tsx`: hoisted `loadLowlight()` + log out of the per-line render loop into a single `ensureLowlightLoading()` call at the top of `colorizeCode`. In the failure case this collapses hundreds of duplicate debug entries (one per line) to one per block. The instance is now passed down to `highlightAndRenderLine` as a parameter. `packages/core/src/utils/bundlePaths.ts` + `esbuild.config.js`: exposed `BUNDLE_CHUNK_DIR = 'chunks'` as a named constant and updated `esbuild.config.js` to interpolate the same name into `chunkNames` (plus an explicit "MUST stay in sync" comment). Renaming on one side without the other now stands out at review time. Also expanded the `define` comment with a contributor-facing warning describing exactly why bare `__dirname` / `__filename` in source files becomes the shim chunk path, and pointing future contributors at the `fileURLToPath(import.meta.url)` shadow pattern (and `resolveBundleDir` for sibling-asset lookups). Verified: - typecheck (all 4 workspaces): clean - packages/core tests: 7747 passing (no regressions) - packages/cli tests: only the pre-existing `useAtCompletion.test.ts` filesystem-order failures remain (confirmed against `git stash`) - `npm run bundle` succeeds; `node dist/cli.js --version` returns `0.15.10`; `node dist/cli.js --help` renders normally - `npx tsx ` now returns the vendored path instead of throwing `ReferenceError` Generated with AI Co-authored-by: Qwen-Coder * fix(bundle): validate lowlight API shape; sync doc-comment drift; add tests - lowlightLoader: validate runtime shape of createLowlight() before the `as Lowlight` cast so an upstream API rename routes through the cooldown latch instead of silently degrading every code block to plain text. - bundlePaths: correct doc comment — esbuild.config.js maintains its own `BUNDLE_CHUNK_DIR` constant rather than importing this one (it runs before any TS compile step). - AppContainer: update prefetch-failure comment to reference the cooldown symbols (`LOWLIGHT_RETRY_COOLDOWN_MS` / `lowlightLastFailureAt`) that replaced the removed `lowlightFailed` latch. - New unit tests covering the lowlightLoader state machine (success, in-flight dedup, shape mismatch, cooldown skip, post-cooldown retry) and `resolveBundleDir`'s strip-only-on-exact-match contract. * test(bundlePaths): use path.resolve for Windows-compatible absolute paths CI failure on Windows: the new `resolveBundleDir` tests built expected values with `path.join(path.sep, ...)` (e.g. `\tmp\dist`), but `pathToFileURL` resolves drive-less paths against the current drive on Windows. The URL -> `fileURLToPath` round-trip returned `D:\tmp\dist`, while the expectation stayed `\tmp\dist`, tripping all three new assertions. Switched both the URL source and the expected value to a single `path.resolve(path.sep, ...)` anchor per test so both sides absorb whatever the platform considers absolute. POSIX behaviour is unchanged (`/tmp/dist` -> `/tmp/dist`). --------- Co-authored-by: 秦奇 Co-authored-by: Qwen-Coder --- esbuild.config.js | 38 ++++- packages/cli/src/commands/extensions/new.ts | 15 +- packages/cli/src/i18n/index.ts | 16 +- packages/cli/src/ui/AppContainer.tsx | 28 ++++ packages/cli/src/ui/utils/CodeColorizer.tsx | 60 +++++++- .../cli/src/ui/utils/lowlightLoader.test.ts | 141 ++++++++++++++++++ packages/cli/src/ui/utils/lowlightLoader.ts | 115 ++++++++++++++ packages/cli/test-setup.ts | 20 +++ packages/core/src/index.ts | 1 + packages/core/src/skills/skill-manager.ts | 9 +- packages/core/src/utils/bundlePaths.test.ts | 63 ++++++++ packages/core/src/utils/bundlePaths.ts | 62 ++++++++ packages/core/src/utils/ripgrepUtils.ts | 17 ++- scripts/create-standalone-package.js | 8 +- scripts/esbuild-shims.js | 31 ++-- scripts/prepare-package.js | 1 + 16 files changed, 593 insertions(+), 32 deletions(-) create mode 100644 packages/cli/src/ui/utils/lowlightLoader.test.ts create mode 100644 packages/cli/src/ui/utils/lowlightLoader.ts create mode 100644 packages/core/src/utils/bundlePaths.test.ts create mode 100644 packages/core/src/utils/bundlePaths.ts diff --git a/esbuild.config.js b/esbuild.config.js index c7aafb463..a842a2c6b 100644 --- a/esbuild.config.js +++ b/esbuild.config.js @@ -72,11 +72,22 @@ const external = [ '@teddyzhu/clipboard-win32-arm64-msvc', ]; +// Name of the directory under `dist/` that esbuild emits shared chunks into. +// MUST stay in sync with `BUNDLE_CHUNK_DIR` in +// `packages/core/src/utils/bundlePaths.ts`, whose `resolveBundleDir` helper +// strips this exact segment when modules look up sibling assets at runtime. +// Renaming here without renaming there silently breaks bundled-binary lookup +// in skill-manager / ripgrepUtils / i18n / extensions/new. +const BUNDLE_CHUNK_DIR = 'chunks'; + esbuild .build({ - entryPoints: ['packages/cli/index.ts'], + entryPoints: { cli: 'packages/cli/index.ts' }, bundle: true, - outfile: 'dist/cli.js', + outdir: 'dist', + entryNames: '[name]', + chunkNames: `${BUNDLE_CHUNK_DIR}/[name]-[hash]`, + splitting: true, platform: 'node', format: 'esm', target: 'node22', @@ -103,6 +114,29 @@ esbuild 'process.env.CLI_VERSION': JSON.stringify(pkg.version), // Make global available for compatibility global: 'globalThis', + // Redirect free __dirname/__filename references to the shim so that + // vendored libraries that emit their own `var __dirname` locals don't + // collide with our injected bindings when code-splitting is enabled. + // + // CONTRIBUTOR WARNING: this rewrite applies to *all* source files, so + // any bare `__dirname` / `__filename` in our own code resolves to the + // shim chunk's on-disk location (i.e. `dist/chunks/`), NOT the source + // file's own directory. To get a per-file path, declare a local shadow + // at the top of the module: + // + // import { fileURLToPath } from 'node:url'; + // const __filename = fileURLToPath(import.meta.url); + // const __dirname = path.dirname(__filename); + // + // esbuild leaves the local binding alone (it's a declared identifier, + // not a free reference). For sibling-asset lookups in modules that may + // be hoisted into a shared chunk, prefer + // `resolveBundleDir(import.meta.url)` from + // `packages/core/src/utils/bundlePaths.ts` — it both produces a + // per-file path and strips the chunk segment when the module ends up + // under `dist/chunks/`. + __dirname: '__qwen_dirname', + __filename: '__qwen_filename', }, loader: { '.node': 'file' }, plugins: [wasmBinaryPlugin, wasmLoader({ mode: 'embedded' })], diff --git a/packages/cli/src/commands/extensions/new.ts b/packages/cli/src/commands/extensions/new.ts index f47648ab9..dbadb9ec0 100644 --- a/packages/cli/src/commands/extensions/new.ts +++ b/packages/cli/src/commands/extensions/new.ts @@ -5,9 +5,9 @@ */ import { access, cp, mkdir, readdir, writeFile } from 'node:fs/promises'; -import { join, dirname, basename } from 'node:path'; +import { join, basename } from 'node:path'; import type { CommandModule } from 'yargs'; -import { fileURLToPath } from 'node:url'; +import { resolveBundleDir } from '@qwen-code/qwen-code-core'; import { getErrorMessage } from '../../utils/errors.js'; import { writeStdoutLine, writeStderrLine } from '../../utils/stdioHelpers.js'; @@ -16,10 +16,13 @@ interface NewArgs { template?: string; } -const __filename = fileURLToPath(import.meta.url); -const __dirname = dirname(__filename); - -const EXAMPLES_PATH = join(__dirname, 'examples'); +// Anchor the bundled extension-examples directory at the on-disk sibling of +// `cli.js` (i.e. `dist/examples/`, populated by `prepare-package.js`). Today +// this module is bundled into `cli.js` itself, so the `chunks/` strip in +// `resolveBundleDir` is a no-op — but using the same helper as the other +// asset-anchor sites means this code stays correct if esbuild later hoists +// this module into a shared chunk. +const EXAMPLES_PATH = join(resolveBundleDir(import.meta.url), 'examples'); async function pathExists(path: string) { try { diff --git a/packages/cli/src/i18n/index.ts b/packages/cli/src/i18n/index.ts index e4536cf53..86f91cb01 100644 --- a/packages/cli/src/i18n/index.ts +++ b/packages/cli/src/i18n/index.ts @@ -6,9 +6,9 @@ import * as fs from 'node:fs'; import * as path from 'node:path'; -import { fileURLToPath, pathToFileURL } from 'node:url'; +import { pathToFileURL } from 'node:url'; import { writeStderrLine } from '../utils/stdioHelpers.js'; -import { Storage } from '@qwen-code/qwen-code-core'; +import { Storage, resolveBundleDir } from '@qwen-code/qwen-code-core'; import { type SupportedLanguage, SUPPORTED_LANGUAGES, @@ -37,10 +37,14 @@ type TranslationLoadResult = | { translations?: undefined; error: Error }; // Path helpers -const getBuiltinLocalesDir = (): string => { - const __filename = fileURLToPath(import.meta.url); - return path.join(path.dirname(__filename), 'locales'); -}; +// +// Anchor the bundled locales directory at the on-disk sibling of `cli.js` +// (i.e. `dist/locales/`, populated by `prepare-package.js`). See +// `resolveBundleDir` for the rationale behind stripping a trailing +// `chunks/` segment when this module is hoisted into a shared esbuild +// chunk. +const getBuiltinLocalesDir = (): string => + path.join(resolveBundleDir(import.meta.url), 'locales'); const getUserLocalesDir = (): string => path.join(Storage.getGlobalQwenDir(), 'locales'); diff --git a/packages/cli/src/ui/AppContainer.tsx b/packages/cli/src/ui/AppContainer.tsx index 3f81cac35..4e8a39120 100644 --- a/packages/cli/src/ui/AppContainer.tsx +++ b/packages/cli/src/ui/AppContainer.tsx @@ -60,6 +60,7 @@ import { ToolNames, } from '@qwen-code/qwen-code-core'; import { buildResumedHistoryItems } from './utils/resumeHistoryUtils.js'; +import { loadLowlight } from './utils/lowlightLoader.js'; import { getStickyTodos, getStickyTodoMaxVisibleItems, @@ -435,6 +436,33 @@ export const AppContainer = (props: AppContainerProps) => { const lastTitleRef = useRef(null); const staticExtraHeight = 3; + // Prefetch the lowlight chunk on mount so the dynamic import is already + // in flight before the first code block needs colorizing. Without this + // kick-off, code blocks committed to ink's append-only region + // before the import resolves stay plain text for the rest of the session + // — Static can only be re-rendered via `refreshStatic`, which is not + // wired to lowlight load completion. Common reachable paths: short + // `--prompt -p` runs that finalize quickly, Ctrl+C-cancelled first turns, + // and the first-paint history replay on `--resume`. Firing the load + // from mount keeps the startup parse-cost win (V8 still parses off the + // critical path) while restoring the "first paint sees a loaded + // instance" guarantee. Errors are silently swallowed; CodeColorizer + // already falls back to plain text on miss. + useEffect(() => { + void loadLowlight().catch((err) => { + // The loader caches rejection with a cooldown (see + // `LOWLIGHT_RETRY_COOLDOWN_MS` / `lowlightLastFailureAt` in + // `lowlightLoader.ts`). This useEffect runs once on mount, so this + // catch fires at most once per session regardless. Log to the debug + // channel so a degraded syntax-highlight state (corrupted install, + // missing chunk) leaves a breadcrumb without spamming the user's + // TTY — `CodeColorizer` already falls back to plain text. + debugLogger.warn( + `Failed to load lowlight chunk; code blocks will render as plain text: ${err instanceof Error ? err.message : String(err)}`, + ); + }); + }, []); + // Initialize config (runs once on mount) useEffect(() => { (async () => { diff --git a/packages/cli/src/ui/utils/CodeColorizer.tsx b/packages/cli/src/ui/utils/CodeColorizer.tsx index da0d99132..9c731bac7 100644 --- a/packages/cli/src/ui/utils/CodeColorizer.tsx +++ b/packages/cli/src/ui/utils/CodeColorizer.tsx @@ -6,7 +6,6 @@ import React from 'react'; import { Text, Box } from 'ink'; -import { common, createLowlight } from 'lowlight'; import type { Root, Element, @@ -22,11 +21,25 @@ import { } from '../components/shared/MaxSizedBox.js'; import type { LoadedSettings } from '../../config/settings.js'; import { createDebugLogger } from '@qwen-code/qwen-code-core'; +import { + getLowlightInstance, + isLowlightCoolingDown, + loadLowlight, + type Lowlight, +} from './lowlightLoader.js'; -// Configure theming and parsing utilities. -const lowlight = createLowlight(common); const debugLogger = createDebugLogger('CODE_COLORIZER'); +// Lowlight is heavy (~1.5 MB bundled, ~36–60 ms V8 parse). It's loaded lazily +// from `./lowlightLoader.js` via dynamic import so it lives in a separate +// esbuild chunk that's only parsed once a code block actually needs +// highlighting. To avoid leaving code blocks committed to ink's append-only +// region as plain text for the rest of the session, AppContainer +// fires `loadLowlight()` from a mount effect — in steady state the import +// is already resolved by the time any colorize call lands. The fallback +// below still handles the brief window before resolution and any +// permanent-failure path (latched inside lowlightLoader). + function renderHastNode( node: Root | Element | HastText | RootContent, theme: Theme, @@ -92,11 +105,39 @@ function renderHastNode( return null; } +/** + * Fires the lazy `loadLowlight()` once if the instance isn't ready yet and + * we aren't inside the loader's failure cooldown. Returns the current + * instance (which may still be `null` if the load is in flight or cooling + * down). Centralising this here lets callers kick the load off-the-hot-path + * — `colorizeCode` fires once per block, not once per rendered line, which + * matters in the failure case: when the load is permanently broken, the + * loader rejects synchronously and a per-line trigger would emit hundreds + * of duplicate debug-log entries per code block. + */ +function ensureLowlightLoading(): Lowlight | null { + const ll = getLowlightInstance(); + if (ll) return ll; + if (!isLowlightCoolingDown()) { + void loadLowlight().catch((err) => { + debugLogger.error('[CodeColorizer] failed to load lowlight:', err); + }); + } + return null; +} + function highlightAndRenderLine( line: string, language: string | null, theme: Theme, + lowlight: Lowlight | null, ): React.ReactNode { + // Until lowlight resolves (or after a permanent failure), fall back to a + // plain-text rendering of the line. The next React render of the + // surrounding subtree will pick up the highlighted version on success. + if (!lowlight) { + return line; + } try { const getHighlightedLine = () => !language || !lowlight.registered(language) @@ -117,7 +158,12 @@ export function colorizeLine( theme?: Theme, ): React.ReactNode { const activeTheme = theme || themeManager.getActiveTheme(); - return highlightAndRenderLine(line, language, activeTheme); + return highlightAndRenderLine( + line, + language, + activeTheme, + ensureLowlightLoading(), + ); } /** @@ -142,6 +188,11 @@ export function colorizeCode( .replace(/\t/g, ' '.repeat(tabWidth)); const activeTheme = theme || themeManager.getActiveTheme(); const showLineNumbers = settings?.merged.ui?.showLineNumbers ?? true; + // Resolve the loader state once per block, not once per line. Triggers the + // lazy import on first use; subsequent renders pick up the highlighted + // output once the chunk lands. Hoisting this out of the per-line render + // loop also collapses duplicate failure logs to one per block. + const lowlight = ensureLowlightLoading(); try { // Render the HAST tree using the adapted theme @@ -173,6 +224,7 @@ export function colorizeCode( line, language, activeTheme, + lowlight, ); return ( diff --git a/packages/cli/src/ui/utils/lowlightLoader.test.ts b/packages/cli/src/ui/utils/lowlightLoader.test.ts new file mode 100644 index 000000000..3d739ab01 --- /dev/null +++ b/packages/cli/src/ui/utils/lowlightLoader.test.ts @@ -0,0 +1,141 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest'; + +/** + * The loader keeps top-level module state (lowlightInstance, in-flight + * promise, last-failure timestamp). `test-setup.ts` primes that state once + * per test worker, so to exercise the load / failure / cooldown / shape-check + * branches we need a fresh module copy per test plus a mock of `lowlight`. + * + * `vi.resetModules()` clears the module cache so the next dynamic + * `await import('./lowlightLoader.js')` re-runs the file with all state + * reset to zero. `vi.doMock('lowlight', ...)` injects the desired upstream + * shape for the dynamic `import('lowlight')` inside `loadLowlight`. + */ + +describe('lowlightLoader', () => { + beforeEach(() => { + vi.resetModules(); + vi.useFakeTimers(); + }); + + afterEach(() => { + vi.useRealTimers(); + vi.doUnmock('lowlight'); + vi.resetModules(); + }); + + function makeLowlightInstance() { + return { + registered: vi.fn(() => true), + highlight: vi.fn(() => ({ type: 'root', children: [] })), + highlightAuto: vi.fn(() => ({ type: 'root', children: [] })), + }; + } + + it('resolves with a Lowlight instance on first successful load', async () => { + const instance = makeLowlightInstance(); + vi.doMock('lowlight', () => ({ + createLowlight: vi.fn(() => instance), + common: {}, + })); + + const mod = await import('./lowlightLoader.js'); + expect(mod.getLowlightInstance()).toBeNull(); + + const loaded = await mod.loadLowlight(); + expect(loaded).toBe(instance); + expect(mod.getLowlightInstance()).toBe(instance); + expect(mod.isLowlightCoolingDown()).toBe(false); + }); + + it('dedupes concurrent in-flight loads to a single dynamic import', async () => { + const instance = makeLowlightInstance(); + const createLowlight = vi.fn(() => instance); + vi.doMock('lowlight', () => ({ createLowlight, common: {} })); + + const mod = await import('./lowlightLoader.js'); + const [a, b, c] = await Promise.all([ + mod.loadLowlight(), + mod.loadLowlight(), + mod.loadLowlight(), + ]); + + expect(a).toBe(instance); + expect(b).toBe(instance); + expect(c).toBe(instance); + // The factory is only called once across the three concurrent callers — + // proves the in-flight `lowlightLoad` promise is reused. + expect(createLowlight).toHaveBeenCalledTimes(1); + }); + + it('rejects and latches on upstream API-shape mismatch', async () => { + // Simulate a future lowlight release that renames `highlightAuto`. + const brokenInstance = { + registered: vi.fn(() => true), + highlight: vi.fn(() => ({ type: 'root', children: [] })), + // highlightAuto missing — shape check must fail + }; + vi.doMock('lowlight', () => ({ + createLowlight: vi.fn(() => brokenInstance), + common: {}, + })); + + const mod = await import('./lowlightLoader.js'); + await expect(mod.loadLowlight()).rejects.toThrow( + /lowlight instance does not match expected API/, + ); + expect(mod.getLowlightInstance()).toBeNull(); + // After the failure callers should see the cooldown latch on so they + // short-circuit the next render without retrying the broken import. + expect(mod.isLowlightCoolingDown()).toBe(true); + }); + + it('caches rejection within the cooldown window and skips re-import', async () => { + const importErr = new Error('chunk not found'); + const createLowlight = vi.fn(() => { + throw importErr; + }); + vi.doMock('lowlight', () => ({ createLowlight, common: {} })); + + const mod = await import('./lowlightLoader.js'); + await expect(mod.loadLowlight()).rejects.toThrow('chunk not found'); + expect(mod.isLowlightCoolingDown()).toBe(true); + + // A subsequent call inside the cooldown window must return the cached + // rejection without re-invoking the dynamic import. + await expect(mod.loadLowlight()).rejects.toThrow('chunk not found'); + expect(createLowlight).toHaveBeenCalledTimes(1); + }); + + it('retries after the cooldown elapses and recovers on transient failure', async () => { + const instance = makeLowlightInstance(); + let attempt = 0; + const createLowlight = vi.fn(() => { + attempt += 1; + if (attempt === 1) { + throw new Error('EMFILE: too many open files'); + } + return instance; + }); + vi.doMock('lowlight', () => ({ createLowlight, common: {} })); + + const mod = await import('./lowlightLoader.js'); + await expect(mod.loadLowlight()).rejects.toThrow(/EMFILE/); + expect(mod.isLowlightCoolingDown()).toBe(true); + + // Advance past the 30s cooldown window. + await vi.advanceTimersByTimeAsync(30_001); + expect(mod.isLowlightCoolingDown()).toBe(false); + + // Next call retries and now succeeds. + const loaded = await mod.loadLowlight(); + expect(loaded).toBe(instance); + expect(createLowlight).toHaveBeenCalledTimes(2); + }); +}); diff --git a/packages/cli/src/ui/utils/lowlightLoader.ts b/packages/cli/src/ui/utils/lowlightLoader.ts new file mode 100644 index 000000000..a566bb7c7 --- /dev/null +++ b/packages/cli/src/ui/utils/lowlightLoader.ts @@ -0,0 +1,115 @@ +/** + * @license + * Copyright 2025 Google LLC + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * Standalone loader for the lowlight syntax-highlight engine. + * + * Kept in its own module — with zero imports beyond `lowlight` itself — so + * that priming the cache from `test-setup.ts` does not transitively pull + * `themeManager`, settings, or `@qwen-code/qwen-code-core` into every test + * file's module graph. That cascade was observed to alter theme/config test + * outcomes (e.g. theme-manager auto-detection and QWEN_HOME env tests). + */ + +import type { Root } from 'hast'; + +export type Lowlight = { + registered(language: string): boolean; + highlight(language: string, value: string): Root; + highlightAuto(value: string): Root; +}; + +let lowlightInstance: Lowlight | null = null; +let lowlightLoad: Promise | null = null; +// Tracks recent failures so callers can short-circuit without re-attempting +// `import('lowlight')` on every render. Without this, every React render of +// a code block would re-call `loadLowlight()` — wasting CPU and spamming +// debug logs on every keystroke if the chunk file is permanently missing +// (corrupted install). +// +// We don't latch permanently though: transient errors (EMFILE, antivirus +// file lock, slow disk after wake-from-sleep) would otherwise leave syntax +// highlighting off for the entire session. Instead we use a short cooldown +// — subsequent calls within `LOWLIGHT_RETRY_COOLDOWN_MS` of the last failure +// return the cached rejection immediately; the next call after the cooldown +// retries the dynamic import. For a permanently broken install the chunk +// import will keep failing every `LOWLIGHT_RETRY_COOLDOWN_MS`, which is +// already orders of magnitude less work than the per-render hot loop the +// cooldown is designed to prevent. +const LOWLIGHT_RETRY_COOLDOWN_MS = 30_000; +let lowlightLastFailureAt = 0; +let lowlightError: Error | null = null; + +export function getLowlightInstance(): Lowlight | null { + return lowlightInstance; +} + +/** + * Returns true if a recent load attempt failed and we're still inside the + * cooldown window. Callers in render-hot paths can use this to skip both the + * `loadLowlight()` call and any duplicate failure-log it would emit. + */ +export function isLowlightCoolingDown(): boolean { + return ( + lowlightLastFailureAt > 0 && + Date.now() - lowlightLastFailureAt < LOWLIGHT_RETRY_COOLDOWN_MS + ); +} + +/** + * Kicks off (or returns the in-flight) load of the lowlight chunk. Exported + * for two callers: + * 1. `CodeColorizer.tsx` — fires the load on first colorize call so the + * next React commit picks up the highlighted output. + * 2. `test-setup.ts` — awaits this once to keep snapshot tests + * deterministic without dragging more modules into the test graph. + * + * On import failure the rejection is cached for `LOWLIGHT_RETRY_COOLDOWN_MS` + * (see `isLowlightCoolingDown`); subsequent calls inside the cooldown return + * the cached rejection without retrying. After the cooldown, the next call + * will retry the dynamic import — this recovers from transient errors + * (EMFILE, antivirus locks) without losing the per-render short-circuit that + * protects against permanently-broken installs. + */ +export function loadLowlight(): Promise { + if (lowlightInstance) return Promise.resolve(lowlightInstance); + if (isLowlightCoolingDown()) { + return Promise.reject( + lowlightError ?? new Error('lowlight import previously failed'), + ); + } + if (lowlightLoad) return lowlightLoad; + lowlightLoad = import('lowlight') + .then((mod) => { + const instance = mod.createLowlight(mod.common) as Partial; + // Validate the runtime shape before casting. Without this, an upstream + // API rename would silently coerce the mismatched object: the resulting + // TypeError in `highlightAndRenderLine` is swallowed by its catch and + // every code block falls back to plain text with no log breadcrumb. A + // throw here routes the failure through the cooldown latch above, so + // the degraded state surfaces in the debug channel exactly once. + if ( + typeof instance?.registered !== 'function' || + typeof instance?.highlight !== 'function' || + typeof instance?.highlightAuto !== 'function' + ) { + throw new Error( + 'lowlight instance does not match expected API (registered/highlight/highlightAuto)', + ); + } + lowlightInstance = instance as Lowlight; + lowlightLastFailureAt = 0; + lowlightError = null; + return lowlightInstance; + }) + .catch((err) => { + lowlightLastFailureAt = Date.now(); + lowlightError = err instanceof Error ? err : new Error(String(err)); + lowlightLoad = null; + throw err; + }); + return lowlightLoad; +} diff --git a/packages/cli/test-setup.ts b/packages/cli/test-setup.ts index c26e57fa5..e84e7809c 100644 --- a/packages/cli/test-setup.ts +++ b/packages/cli/test-setup.ts @@ -16,3 +16,23 @@ if (process.env['QWEN_DEBUG_LOG_FILE'] === undefined) { } import './src/test-utils/customMatchers.js'; + +// Lowlight is loaded asynchronously in production to keep it out of the +// startup-critical bundle chunk. Snapshot tests render synchronously via +// `lastFrame()` and would otherwise capture the plain-text fallback before +// the dynamic import resolves. Prime the cache once here so every test sees +// the fully-highlighted output. The loader is intentionally a tiny standalone +// module (no transitive imports of themeManager / settings / core) so this +// prime does not perturb any other test's module graph. +import { loadLowlight } from './src/ui/utils/lowlightLoader.js'; +try { + await loadLowlight(); +} catch (err) { + // Don't crash the entire test run if lowlight fails to import; snapshot + // tests that hit a code block will then render the plain-text fallback. + console.warn( + '[test-setup] Failed to prime lowlight cache, snapshot tests may ' + + 'show plain-text fallback:', + String(err), + ); +} diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts index 06566c33a..4f703e9cd 100644 --- a/packages/core/src/index.ts +++ b/packages/core/src/index.ts @@ -269,6 +269,7 @@ export * from './followup/index.js'; // ============================================================================ export * from './utils/browser.js'; +export * from './utils/bundlePaths.js'; export * from './utils/configResolver.js'; export * from './utils/debugLogger.js'; export * from './utils/editor.js'; diff --git a/packages/core/src/skills/skill-manager.ts b/packages/core/src/skills/skill-manager.ts index 0c7c62dbf..ddebb7103 100644 --- a/packages/core/src/skills/skill-manager.ts +++ b/packages/core/src/skills/skill-manager.ts @@ -8,8 +8,8 @@ import * as fs from 'fs/promises'; import * as fsSync from 'fs'; import * as path from 'path'; import * as os from 'os'; -import { fileURLToPath } from 'url'; import { watch as watchFs, type FSWatcher } from 'chokidar'; +import { resolveBundleDir } from '../utils/bundlePaths.js'; import { parse as parseYaml } from '../utils/yaml-parser.js'; import * as yaml from 'yaml'; import type { @@ -86,8 +86,13 @@ export class SkillManager { private activationRegistry: SkillActivationRegistry | null = null; constructor(private readonly config: Config) { + // Anchor the bundled skills directory at the on-disk sibling of + // `cli.js` (i.e. `dist/bundled/`, populated by `copy_bundle_assets.js`). + // See `resolveBundleDir` for the rationale behind stripping a trailing + // `chunks/` segment when this module is hoisted into a shared esbuild + // chunk. this.bundledSkillsDir = path.join( - path.dirname(fileURLToPath(import.meta.url)), + resolveBundleDir(import.meta.url), 'bundled', ); } diff --git a/packages/core/src/utils/bundlePaths.test.ts b/packages/core/src/utils/bundlePaths.test.ts new file mode 100644 index 000000000..a2ef33365 --- /dev/null +++ b/packages/core/src/utils/bundlePaths.test.ts @@ -0,0 +1,63 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as path from 'node:path'; +import { pathToFileURL } from 'node:url'; +import { describe, expect, it } from 'vitest'; +import { BUNDLE_CHUNK_DIR, resolveBundleDir } from './bundlePaths.js'; + +/** + * `resolveBundleDir` is the single chokepoint that hides whether a module + * was hoisted into `dist/chunks/` by esbuild's `splitting: true`. The check + * is intentionally narrow (only strips a trailing segment whose basename + * equals `BUNDLE_CHUNK_DIR`) — these tests pin that behaviour so a future + * tweak to the splitter or `chunkNames` doesn't silently break the four + * downstream callers (skill-manager, ripgrepUtils, i18n, extensions/new). + */ +describe('resolveBundleDir', () => { + it('keeps the constant in sync with esbuild.config.js', () => { + // Cross-checked by hand against `esbuild.config.js`'s + // `BUNDLE_CHUNK_DIR = 'chunks'`. If you change one, change both. + expect(BUNDLE_CHUNK_DIR).toBe('chunks'); + }); + + it('strips the trailing chunks segment when the module lives under dist/chunks/', () => { + // Use `path.resolve` so the expected value matches whatever the current + // platform considers absolute. On Windows, `pathToFileURL` of a + // drive-less path resolves against the current drive (e.g. `\tmp\dist` + // becomes `D:\tmp\dist`), so both the URL we synthesise and the + // expected return value must be built via `path.resolve` to stay aligned. + const distDir = path.resolve(path.sep, 'tmp', 'dist'); + const fakeChunk = pathToFileURL( + path.join(distDir, BUNDLE_CHUNK_DIR, 'chunk-AAAA.js'), + ).toString(); + expect(resolveBundleDir(fakeChunk)).toBe(distDir); + }); + + it('returns the module directory unchanged when not under chunks/', () => { + // Source / transpiled / non-split builds: the trailing segment is the + // module's own directory name, never the chunk constant. + const i18nDir = path.resolve( + path.sep, + 'repo', + 'packages', + 'cli', + 'src', + 'i18n', + ); + const sourceFile = pathToFileURL(path.join(i18nDir, 'index.ts')).toString(); + expect(resolveBundleDir(sourceFile)).toBe(i18nDir); + }); + + it('only strips when the basename matches exactly', () => { + // A directory whose name merely contains "chunks" must not be stripped. + const myChunksDir = path.resolve(path.sep, 'tmp', 'dist', 'my-chunks'); + const looksLikeChunks = pathToFileURL( + path.join(myChunksDir, 'mod.js'), + ).toString(); + expect(resolveBundleDir(looksLikeChunks)).toBe(myChunksDir); + }); +}); diff --git a/packages/core/src/utils/bundlePaths.ts b/packages/core/src/utils/bundlePaths.ts new file mode 100644 index 000000000..cda2ea29f --- /dev/null +++ b/packages/core/src/utils/bundlePaths.ts @@ -0,0 +1,62 @@ +/** + * @license + * Copyright 2025 Qwen team + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as path from 'node:path'; +import { fileURLToPath } from 'node:url'; + +/** + * Name of the directory under `dist/` that esbuild emits shared chunks into. + * + * Hardcoded here to match `esbuild.config.js`'s + * `chunkNames: 'chunks/[name]-[hash]'` setting. The two files each define + * their own copy (esbuild.config.js runs before any TS compile step and so + * cannot import this module), so renaming here requires renaming the + * `BUNDLE_CHUNK_DIR` constant in `esbuild.config.js` in the same commit. + * The comment block in `esbuild.config.js` cross-references this file as + * the authoritative definition for runtime callers. + * + * If you change this value, also re-check anything that filters or lists + * `dist/` entries (e.g. `scripts/prepare-package.js`, + * `scripts/create-standalone-package.js`, + * `vscode-ide-companion/scripts/copy-bundled-cli.js`). + */ +export const BUNDLE_CHUNK_DIR = 'chunks'; + +/** + * Resolves the on-disk directory a module should treat as a sibling of the + * bundled `cli.js` entry, given the caller's `import.meta.url`. + * + * Why this exists: `esbuild.config.js` ships with `splitting: true` and + * `chunkNames: '/[name]-[hash]'`, so modules that are + * hoisted into a shared chunk live at `dist//.js`. + * Any code that derives a path from `import.meta.url` and joins a sibling + * asset (e.g. `bundled/`, `vendor/`, `locales/`, `examples/`) would + * otherwise land in `dist//` and miss the actual + * `dist/` location. + * + * The fix is intentionally narrow: only strip the trailing path segment + * when its basename matches `BUNDLE_CHUNK_DIR`. In source / transpiled / + * non-split builds the trailing segment is the source directory's own + * name, never that constant, so this is a no-op there. + * + * Centralising the check keeps the coupling to esbuild's `chunkNames` + * setting in one place — if that ever changes, only `BUNDLE_CHUNK_DIR` + * needs updating (and `esbuild.config.js` picks up the new value via the + * imported constant). + * + * @param importMetaUrl Pass `import.meta.url` from the caller. It must be + * evaluated at the caller's chunk so the resolution matches that chunk's + * on-disk location; centralising the `fileURLToPath`/`dirname` work here + * does not change that. + * @returns The directory that should be used as the anchor for sibling + * asset lookups (`path.join(result, 'bundled')`, etc.). + */ +export function resolveBundleDir(importMetaUrl: string): string { + const moduleDir = path.dirname(fileURLToPath(importMetaUrl)); + return path.basename(moduleDir) === BUNDLE_CHUNK_DIR + ? path.dirname(moduleDir) + : moduleDir; +} diff --git a/packages/core/src/utils/ripgrepUtils.ts b/packages/core/src/utils/ripgrepUtils.ts index 7f6f3d9c8..a6f21f4bc 100644 --- a/packages/core/src/utils/ripgrepUtils.ts +++ b/packages/core/src/utils/ripgrepUtils.ts @@ -7,6 +7,7 @@ import path from 'node:path'; import { fileURLToPath } from 'node:url'; import { execFile } from 'node:child_process'; +import { resolveBundleDir } from './bundlePaths.js'; import { fileExists } from './fileUtils.js'; import { execCommand, isCommandAvailable } from './shell-utils.js'; import { createDebugLogger } from './debugLogger.js'; @@ -57,9 +58,21 @@ function wslTimeout(): number { : RIPGREP_RUN_TIMEOUT_MS; } -// Get the directory of the current module +// Resolved at module load to the directory that should anchor sibling-asset +// lookups (here: the vendored ripgrep binary copied to `dist/vendor/`). See +// `resolveBundleDir` for the rationale behind stripping a trailing `chunks/` +// segment when this module is hoisted into a shared esbuild chunk. +// +// `__filename` is needed separately by `getBuiltinRipgrep` to decide whether +// it's running from source / transpiled / bundled output (each requires a +// different `..`-traversal count). It is NOT just `path.join(__dirname, +// basename)` because in bundled mode esbuild rewrites every bare `__filename` +// reference to `__qwen_filename` (the shim chunk's path), which would make +// the heuristic always pick `levelsUp = 0` by accident; the explicit local +// shadow keeps the lookup correct in source/transpiled/dev modes too, where +// node ESM leaves `__filename` undefined. const __filename = fileURLToPath(import.meta.url); -const __dirname = path.dirname(__filename); +const __dirname = resolveBundleDir(import.meta.url); type Platform = 'darwin' | 'linux' | 'win32'; type Architecture = 'x64' | 'arm64'; diff --git a/scripts/create-standalone-package.js b/scripts/create-standalone-package.js index 968d6da45..4f6e52f33 100644 --- a/scripts/create-standalone-package.js +++ b/scripts/create-standalone-package.js @@ -36,9 +36,15 @@ const TARGETS = new Map([ ['win-x64', { outputExtension: 'zip', nodeExecutable: ['node.exe'] }], ]); -const DIST_REQUIRED_PATHS = ['cli.js', 'vendor', 'bundled/qc-helper/docs']; +const DIST_REQUIRED_PATHS = [ + 'cli.js', + 'chunks', + 'vendor', + 'bundled/qc-helper/docs', +]; const DIST_ALLOWED_ENTRIES = new Set([ 'cli.js', + 'chunks', 'vendor', 'bundled', 'package.json', diff --git a/scripts/esbuild-shims.js b/scripts/esbuild-shims.js index 9b71a3d4b..2a3427641 100644 --- a/scripts/esbuild-shims.js +++ b/scripts/esbuild-shims.js @@ -5,25 +5,38 @@ */ /** - * Shims for esbuild ESM bundles to support require() calls - * This file is injected into the bundle via esbuild's inject option + * Shims for esbuild ESM bundles. + * + * With code-splitting enabled, the inject is applied per-chunk and the + * exported bindings cannot collide with `var __dirname` polyfills that + * vendored libraries (e.g. yargs) emit in their own ESM compat layers. + * To stay collision-free, this file exposes prefixed names; the build + * config uses esbuild `define` to rewrite free `__dirname` / `__filename` + * references in source to these prefixed identifiers, while leaving + * vendor-declared locals untouched. */ import { createRequire } from 'node:module'; import { fileURLToPath } from 'node:url'; import { dirname } from 'node:path'; -// Create require function for the current module and make it global const _require = createRequire(import.meta.url); -// Make require available globally for dynamic requires if (typeof globalThis.require === 'undefined') { globalThis.require = _require; } -// Export for esbuild injection export const require = _require; - -// Setup __filename and __dirname for compatibility -export const __filename = fileURLToPath(import.meta.url); -export const __dirname = dirname(__filename); +// IMPORTANT: __qwen_filename / __qwen_dirname always resolve to this shim's +// chunk file — i.e. dist/chunks/ in a built bundle, NOT the directory of any +// source file that uses bare __dirname / __filename. esbuild's `define` +// rewrites all free references in source code to these symbols, so to get a +// per-file path you MUST declare a local shadow at the top of your module: +// const __filename = fileURLToPath(import.meta.url); +// const __dirname = path.dirname(__filename); +// Even with a local shadow, under code-splitting the path can still point to +// dist/chunks/ rather than the source dir — sibling-asset lookups (vendor/, +// bundled/, locales/) must strip a trailing `chunks` segment. See +// skill-manager.ts / ripgrepUtils.ts / i18n/index.ts for the pattern. +export const __qwen_filename = fileURLToPath(import.meta.url); +export const __qwen_dirname = dirname(__qwen_filename); diff --git a/scripts/prepare-package.js b/scripts/prepare-package.js index 28811c0fb..b9235a489 100644 --- a/scripts/prepare-package.js +++ b/scripts/prepare-package.js @@ -159,6 +159,7 @@ const distPackageJson = { }, files: [ 'cli.js', + 'chunks', 'vendor', '*.sb', 'README.md',