feat(core): add path-based context rule injection from .qwen/rules/ (#3339)

* feat(core): add path-based context rule injection from .qwen/rules/

Support multiple rule files in `.qwen/rules/` directories with optional
YAML frontmatter for conditional loading based on glob patterns.

Rules with a `paths:` field only load when matching files exist in the
project. Rules without `paths:` always load as baseline rules.

Key behaviors:
- Global rules from ~/.qwen/rules/ always load
- Project rules from <root>/.qwen/rules/ require folder trust
- HTML comments stripped to save tokens
- Files sorted alphabetically for deterministic ordering
- Deduplication when project root equals home directory
- Uses globIterate for early termination on first match
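
The baseline/conditional distinction described above can be sketched as a simple partition. This is a minimal illustration, not the code from this commit: `RuleFileSketch` and `splitRules` are hypothetical names, and the field names are assumed to mirror the `RuleFile` objects used in the tests.

```typescript
// Hypothetical shape, assumed to resemble the RuleFile type in rulesDiscovery.ts.
interface RuleFileSketch {
  filePath: string;
  paths?: string[]; // glob patterns from the `paths:` frontmatter, if any
  content: string;
}

// Sort by file path for deterministic ordering, then separate rules that
// always load (no `paths:`) from rules that depend on matching files.
function splitRules(rules: RuleFileSketch[]): {
  baseline: RuleFileSketch[];
  conditional: RuleFileSketch[];
} {
  const sorted = [...rules].sort((a, b) =>
    a.filePath < b.filePath ? -1 : a.filePath > b.filePath ? 1 : 0,
  );
  const baseline = sorted.filter((r) => !r.paths || r.paths.length === 0);
  const conditional = sorted.filter((r) => r.paths && r.paths.length > 0);
  return { baseline, conditional };
}

const split = splitRules([
  { filePath: '/p/.qwen/rules/02-ts.md', paths: ['**/*.ts'], content: 'Use strict.' },
  { filePath: '/p/.qwen/rules/01-general.md', content: 'Write clean code.' },
]);
```

Here `split.baseline` holds only `01-general.md` and `split.conditional` holds only `02-ts.md`.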

* feat(core): align rules loading with Claude Code reference implementation

Closes three gaps with Claude Code's .claude/rules/ feature:

1. Recursive directory scanning — .qwen/rules/ now supports subdirectories
   like frontend/, backend/ for organized rule hierarchies.

2. Exclusion patterns — new `contextRuleExcludes` config parameter accepts
   glob patterns to skip specific rule files (useful in monorepos with
   other teams' rules).

3. Turn-level lazy loading — conditional rules (with `paths:` frontmatter)
   are no longer injected eagerly at session start. Instead, they are
   stored in a per-session ConditionalRulesRegistry and injected on-demand
   via <system-reminder> when the model reads/edits a matching file
   (read_file, edit, write_file). Each rule is injected at most once per
   session.

Internals:
- loadRules() now returns { content, ruleCount, conditionalRules } — only
  baseline rules flow into the system prompt; conditional rules are
  deferred.
- ConditionalRulesRegistry pre-compiles picomatch matchers for efficiency
  and tracks injected rules to avoid duplicate injection.
- coreToolScheduler.ts injects matched rules after PostToolUse hooks but
  before the tool response is sent to the model.
- Path matching defensively rejects files outside the project root.
- /memory refresh and /directory add keep the registry in sync via
  setConditionalRulesRegistry().
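
The at-most-once injection described above can be sketched as follows. This is an illustration under stated assumptions, not the actual `ConditionalRulesRegistry`: the class name `OnceOnlyRuleRegistry` is hypothetical, the `matches` predicate stands in for a pre-compiled picomatch matcher, and POSIX path semantics are assumed so the example behaves the same on every host.

```typescript
import * as path from 'node:path';

interface ConditionalRuleSketch {
  filePath: string;
  content: string;
  // Stand-in for a pre-compiled glob matcher (the commit uses picomatch).
  matches: (relativePath: string) => boolean;
}

class OnceOnlyRuleRegistry {
  private readonly rules: ConditionalRuleSketch[];
  private readonly projectRoot: string;
  private readonly injected = new Set<string>();

  constructor(rules: ConditionalRuleSketch[], projectRoot: string) {
    this.rules = rules;
    this.projectRoot = projectRoot;
  }

  // Returns the combined content of newly matched rules, or undefined when
  // nothing new matches. Matched rules are marked so they never fire again.
  matchAndConsume(absolutePath: string): string | undefined {
    const rel = path.posix.relative(this.projectRoot, absolutePath);
    // Defensive: ignore files outside the project root.
    if (rel === '' || rel === '..' || rel.startsWith('../')) return undefined;

    const hits = this.rules.filter(
      (r) => !this.injected.has(r.filePath) && r.matches(rel),
    );
    if (hits.length === 0) return undefined;
    for (const r of hits) this.injected.add(r.filePath);
    return hits.map((r) => r.content).join('\n\n');
  }
}

const registry = new OnceOnlyRuleRegistry(
  [
    {
      filePath: '/r/fe.md',
      content: 'Use hooks.',
      matches: (p) => /^src\/.*\.tsx$/.test(p), // rough equivalent of src/**/*.tsx
    },
  ],
  '/project',
);
const firstHit = registry.matchAndConsume('/project/src/A.tsx');
const secondHit = registry.matchAndConsume('/project/src/B.tsx');
```

The first call returns the rule body and the second returns `undefined`, which mirrors the "injects each rule at most once" test case in this commit.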

* fix(core): correct field placement in config.test.ts mocks after merge

Earlier replace_all inserted ruleCount/conditionalRules/projectRoot
into the wrong mock call (readAutoMemoryIndex instead of
loadServerHierarchicalMemory), breaking the build with syntax errors.
Move the fields back to the correct mocked return value.

* fix(core): normalize rule display paths to forward slashes for Windows

On Windows, path.relative() returns backslash-separated paths, causing
the "Rule from:" marker to differ from Linux/macOS and breaking the
formats-rules-with-source-markers test on Windows CI.

Normalize to forward slashes for cross-platform consistency, matching
the convention used in glob patterns (paths: field) so that the model
sees the same format regardless of the host OS.
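
A small sketch of the normalization, using `path.win32` so the Windows behaviour can be reproduced on any host. The helper name `toForwardSlashes` is hypothetical; the commit does not show the actual helper in this excerpt.

```typescript
import * as path from 'node:path';

// Replace every backslash with a forward slash so display paths look the
// same on Windows, Linux, and macOS.
function toForwardSlashes(p: string): string {
  return p.replace(/\\/g, '/');
}

// path.win32 reproduces Windows semantics regardless of the host OS.
const windowsRelative = path.win32.relative(
  'C:\\repo',
  'C:\\repo\\.qwen\\rules\\test.md',
);
const marker = `--- Rule from: ${toForwardSlashes(windowsRelative)} ---`;
```

One caveat of this sketch: on POSIX systems a backslash is a legal filename character, so blanket replacement is only safe for display strings, not for paths that are later opened.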

* fix(core): harden rulesDiscovery path checks and sort determinism

Two small defensive improvements surfaced by the audit:

1. matchAndConsume now rejects the exact '..' relative path in addition
   to '../'-prefixed paths. path.relative returns '..' (no trailing
   slash) when the target equals the parent of projectRoot — rare in
   practice but worth guarding against.

2. loadRulesFromDir now uses Array.sort() default (UTF-16 code point
   comparison) instead of localeCompare. The previous sort was
   locale-dependent and could produce different rule loading order on
   machines with non-English locales (e.g. zh-CN). Rule filenames are
   typically ASCII so behaviour is unchanged in common cases, but
   deterministic ordering is preferable across environments.

Adds one test case for the '..' rejection path.
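
Both points can be illustrated in a few lines. The function name `escapesRoot` is hypothetical, and POSIX path semantics are assumed; this is a sketch of the checks described above rather than the code in `rulesDiscovery.ts`.

```typescript
import * as path from 'node:path';

// True when `target` lies outside `projectRoot`. The exact '..' case arises
// when target is the parent directory of projectRoot.
function escapesRoot(projectRoot: string, target: string): boolean {
  const rel = path.posix.relative(projectRoot, target);
  return rel === '..' || rel.startsWith('../');
}

// Default Array.prototype.sort compares UTF-16 code units, so the result
// does not depend on the machine's locale (unlike localeCompare).
const names = ['b.md', 'a.md', 'B.md', '10-x.md', '2-y.md'];
const ordered = [...names].sort();
```

With the default comparison, digits sort before uppercase letters and uppercase before lowercase, so `ordered` is `['10-x.md', '2-y.md', 'B.md', 'a.md', 'b.md']`. Note that a name such as `..foo` inside the root is not rejected, because it neither equals `..` nor starts with `../`.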

* fix(core): address CodeQL incomplete HTML comment sanitization

stripHtmlComments only matched complete <!-- ... --> pairs in a single
pass, so input like 'A<!-- one --><!-- two -->B<!--unclosed' would
leave a residual '<!--' marker — flagged by CodeQL as
incomplete-multi-character-sanitization.

Not a security issue in our context (the output goes to an LLM system
prompt, not an HTML renderer), but worth fixing to:
 - clear the CodeQL alert in CI
 - avoid token waste from dangling markers
 - produce deterministic output

Strategy: iteratively strip <!-- ... --> pairs until stable, then
remove any residual <!-- markers (leaving the following content
visible since the author probably intended it to appear in the rule).
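
The strategy can be sketched as below. The names `replaceUntilStable` and `stripHtmlCommentsSketch` are hypothetical. Repeating the residual-marker removal until stable is an extra precaution in this sketch, not something the commit message states, since deleting one `<!--` can splice a new one together from the surrounding characters.

```typescript
// Apply a removal pattern repeatedly until the string stops changing.
function replaceUntilStable(input: string, pattern: RegExp): string {
  let previous: string;
  let current = input;
  do {
    previous = current;
    current = current.replace(pattern, '');
  } while (current !== previous);
  return current;
}

function stripHtmlCommentsSketch(input: string): string {
  // Step 1: remove complete <!-- ... --> pairs, including ones that only
  // become complete after an inner pair has been removed.
  const withoutPairs = replaceUntilStable(input, /<!--[\s\S]*?-->/g);
  // Step 2: drop dangling openers but keep the text that follows them.
  return replaceUntilStable(withoutPairs, /<!--/g);
}

const cleaned = stripHtmlCommentsSketch('A<!-- one --><!-- two -->B<!--unclosed');
```

For the input from the CodeQL report, `cleaned` is `'ABunclosed'`: both complete comments disappear, and the text after the unclosed marker stays visible.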
Shaojin Wen 2026-04-17 22:05:50 +08:00 committed by GitHub
parent 7e83c08062
commit 355ac5d54a
12 changed files with 984 additions and 44 deletions

View file

@ -155,6 +155,9 @@ vi.mock('@qwen-code/qwen-code-core', async (importOriginal) => {
Promise.resolve({
memoryContent: extensionPaths?.join(',') || '',
fileCount: extensionPaths?.length || 0,
ruleCount: 0,
conditionalRules: [],
projectRoot: cwd || '/tmp',
}),
),
DEFAULT_MEMORY_FILE_FILTERING_OPTIONS: {

View file

@ -12,6 +12,7 @@ import {
FileDiscoveryService,
getAllGeminiMdFilenames,
loadServerHierarchicalMemory,
type LoadServerHierarchicalMemoryResponse,
setGeminiMdFilename as setServerGeminiMdFilename,
resolveTelemetrySettings,
FatalConfigError,
@ -666,7 +667,8 @@ export async function loadHierarchicalGeminiMemory(
extensionContextFilePaths: string[] = [],
folderTrust: boolean,
memoryImportFormat: 'flat' | 'tree' = 'tree',
): Promise<{ memoryContent: string; fileCount: number }> {
contextRuleExcludes: string[] = [],
): Promise<LoadServerHierarchicalMemoryResponse> {
// FIX: Use real, canonical paths for a reliable comparison to handle symlinks.
const realCwd = fs.realpathSync(path.resolve(currentWorkingDirectory));
const realHome = fs.realpathSync(path.resolve(homedir()));
@ -684,6 +686,7 @@ export async function loadHierarchicalGeminiMemory(
extensionContextFilePaths,
folderTrust,
memoryImportFormat,
contextRuleExcludes,
);
}

View file

@ -52,6 +52,7 @@ import {
type SpeculationState,
IDLE_SPECULATION,
ApprovalMode,
ConditionalRulesRegistry,
type PermissionMode,
} from '@qwen-code/qwen-code-core';
import { buildResumedHistoryItems } from './utils/resumeHistoryUtils.js';
@ -688,19 +689,24 @@ export const AppContainer = (props: AppContainerProps) => {
Date.now(),
);
try {
const { memoryContent, fileCount } = await loadHierarchicalGeminiMemory(
process.cwd(),
settings.merged.context?.loadFromIncludeDirectories
? config.getWorkspaceContext().getDirectories()
: [],
config.getFileService(),
config.getExtensionContextFilePaths(),
config.isTrustedFolder(),
settings.merged.context?.importFormat || 'tree', // Use setting or default to 'tree'
);
const { memoryContent, fileCount, conditionalRules, projectRoot } =
await loadHierarchicalGeminiMemory(
process.cwd(),
settings.merged.context?.loadFromIncludeDirectories
? config.getWorkspaceContext().getDirectories()
: [],
config.getFileService(),
config.getExtensionContextFilePaths(),
config.isTrustedFolder(),
settings.merged.context?.importFormat || 'tree', // Use setting or default to 'tree'
config.getContextRuleExcludes(),
);
config.setUserMemory(memoryContent);
config.setGeminiMdFileCount(fileCount);
config.setConditionalRulesRegistry(
new ConditionalRulesRegistry(conditionalRules, projectRoot),
);
setGeminiMdFileCount(fileCount);
historyManager.addItem(

View file

@ -10,7 +10,10 @@ import { MessageType } from '../types.js';
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { loadServerHierarchicalMemory } from '@qwen-code/qwen-code-core';
import {
loadServerHierarchicalMemory,
ConditionalRulesRegistry,
} from '@qwen-code/qwen-code-core';
import { t } from '../../i18n/index.js';
export function expandHomeDir(p: string): string {
@ -147,7 +150,7 @@ export const directoryCommand: SlashCommand = {
try {
if (config.shouldLoadMemoryFromIncludeDirectories()) {
const { memoryContent, fileCount } =
const { memoryContent, fileCount, conditionalRules, projectRoot } =
await loadServerHierarchicalMemory(
config.getWorkingDir(),
[
@ -159,9 +162,13 @@ export const directoryCommand: SlashCommand = {
config.getFolderTrust(),
context.services.settings.merged.context?.importFormat ||
'tree', // Use setting or default to 'tree'
config.getContextRuleExcludes(),
);
config.setUserMemory(memoryContent);
config.setGeminiMdFileCount(fileCount);
config.setConditionalRulesRegistry(
new ConditionalRulesRegistry(conditionalRules, projectRoot),
);
context.ui.setGeminiMdFileCount(fileCount);
}
addItem(

View file

@ -83,9 +83,13 @@ vi.mock('../tools/tool-registry', () => {
});
vi.mock('../utils/memoryDiscovery.js', () => ({
loadServerHierarchicalMemory: vi
.fn()
.mockResolvedValue({ memoryContent: '', fileCount: 0 }),
loadServerHierarchicalMemory: vi.fn().mockResolvedValue({
memoryContent: '',
fileCount: 0,
ruleCount: 0,
conditionalRules: [],
projectRoot: '/tmp',
}),
}));
vi.mock('../memory/store.js', () => ({
@ -612,6 +616,9 @@ describe('Server Config (config.ts)', () => {
vi.mocked(loadServerHierarchicalMemory).mockResolvedValue({
memoryContent: '--- Context from: QWEN.md ---\nProject rules',
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: '/tmp',
});
vi.mocked(readAutoMemoryIndex).mockResolvedValue(
'# Managed Auto-Memory Index\n\n- [Project Memory](project.md)',
@ -630,6 +637,9 @@ describe('Server Config (config.ts)', () => {
vi.mocked(loadServerHierarchicalMemory).mockResolvedValue({
memoryContent: '--- Context from: QWEN.md ---\nProject rules',
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: '/tmp',
});
vi.mocked(readAutoMemoryIndex).mockResolvedValue(null);

View file

@ -133,6 +133,7 @@ import {
} from '../services/sessionService.js';
import { randomUUID } from 'node:crypto';
import { loadServerHierarchicalMemory } from '../utils/memoryDiscovery.js';
import { ConditionalRulesRegistry } from '../utils/rulesDiscovery.js';
import {
createDebugLogger,
setDebugLogSession,
@ -477,6 +478,8 @@ export interface ConfigParameters {
projectHooks?: Record<string, unknown>;
hooks?: Record<string, unknown>;
/** Glob patterns to exclude from .qwen/rules/ loading. */
contextRuleExcludes?: string[];
/** Warnings generated during configuration resolution */
warnings?: string[];
/** Allowed HTTP hook URLs whitelist (from security.allowedHttpHookUrls) */
@ -574,6 +577,8 @@ export class Config {
private userMemory: string;
private sdkMode: boolean;
private geminiMdFileCount: number;
private conditionalRulesRegistry: ConditionalRulesRegistry | undefined;
private readonly contextRuleExcludes: string[];
private approvalMode: ApprovalMode;
private prePlanMode?: ApprovalMode;
private readonly accessibility: AccessibilitySettings;
@ -709,6 +714,7 @@ export class Config {
this.sdkMode = params.sdkMode ?? false;
this.userMemory = params.userMemory ?? '';
this.geminiMdFileCount = params.geminiMdFileCount ?? 0;
this.contextRuleExcludes = params.contextRuleExcludes ?? [];
this.approvalMode = params.approvalMode ?? ApprovalMode.DEFAULT;
this.accessibility = params.accessibility ?? {};
this.telemetrySettings = {
@ -1076,16 +1082,18 @@ export class Config {
}
async refreshHierarchicalMemory(): Promise<void> {
const { memoryContent, fileCount } = await loadServerHierarchicalMemory(
this.getWorkingDir(),
this.shouldLoadMemoryFromIncludeDirectories()
? this.getWorkspaceContext().getDirectories()
: [],
this.getFileService(),
this.getExtensionContextFilePaths(),
this.isTrustedFolder(),
this.getImportFormat(),
);
const { memoryContent, fileCount, conditionalRules, projectRoot } =
await loadServerHierarchicalMemory(
this.getWorkingDir(),
this.shouldLoadMemoryFromIncludeDirectories()
? this.getWorkspaceContext().getDirectories()
: [],
this.getFileService(),
this.getExtensionContextFilePaths(),
this.isTrustedFolder(),
this.getImportFormat(),
this.contextRuleExcludes,
);
if (this.getManagedAutoMemoryEnabled()) {
const managedAutoMemoryIndex = await readAutoMemoryIndex(
this.getProjectRoot(),
@ -1101,6 +1109,29 @@ export class Config {
this.setUserMemory(memoryContent);
}
this.setGeminiMdFileCount(fileCount);
this.conditionalRulesRegistry = new ConditionalRulesRegistry(
conditionalRules,
projectRoot,
);
}
getConditionalRulesRegistry(): ConditionalRulesRegistry | undefined {
return this.conditionalRulesRegistry;
}
/**
* Update the conditional rules registry. Called after external refresh
* paths (e.g. /memory refresh or /directory add) that bypass
* refreshHierarchicalMemory().
*/
setConditionalRulesRegistry(
registry: ConditionalRulesRegistry | undefined,
): void {
this.conditionalRulesRegistry = registry;
}
getContextRuleExcludes(): string[] {
return this.contextRuleExcludes;
}
getContentGenerator(): ContentGenerator {

View file

@ -1622,6 +1622,21 @@ export class CoreToolScheduler {
}
}
// Inject conditional rules when the model accesses a matching file.
// Rules are injected at most once per session per rule file.
const filePath = toolInput?.['file_path'];
if (typeof filePath === 'string') {
const rulesCtx = this.config
.getConditionalRulesRegistry()
?.matchAndConsume(filePath);
if (rulesCtx) {
content = appendAdditionalContext(
content,
`<system-reminder>\n${rulesCtx}\n</system-reminder>`,
);
}
}
const response = convertToFunctionResponse(toolName, callId, content);
const successResponse: ToolCallResponseInfo = {
callId,

View file

@ -241,6 +241,8 @@ export * from './utils/gitUtils.js';
export * from './utils/ignorePatterns.js';
export * from './utils/jsonl-utils.js';
export * from './utils/memoryDiscovery.js';
export { ConditionalRulesRegistry } from './utils/rulesDiscovery.js';
export type { RuleFile } from './utils/rulesDiscovery.js';
export { OpenAILogger, openaiLogger } from './utils/openaiLogger.js';
export * from './utils/partUtils.js';
export * from './utils/pathReader.js';

View file

@ -130,6 +130,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: '',
fileCount: 0,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -150,6 +153,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, defaultContextFile)} ---\ndefault context content\n--- End of Context from: ${path.relative(cwd, defaultContextFile)} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -173,6 +179,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, customContextFile)} ---\ncustom context content\n--- End of Context from: ${path.relative(cwd, customContextFile)} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -200,6 +209,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, projectContextFile)} ---\nproject context content\n--- End of Context from: ${path.relative(cwd, projectContextFile)} ---\n\n--- Context from: ${path.relative(cwd, cwdContextFile)} ---\ncwd context content\n--- End of Context from: ${path.relative(cwd, cwdContextFile)} ---`,
fileCount: 2,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -225,6 +237,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${customFilename} ---\nCWD custom memory\n--- End of Context from: ${customFilename} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -249,6 +264,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, projectRootGeminiFile)} ---\nProject root memory\n--- End of Context from: ${path.relative(cwd, projectRootGeminiFile)} ---\n\n--- Context from: ${path.relative(cwd, srcGeminiFile)} ---\nSrc directory memory\n--- End of Context from: ${path.relative(cwd, srcGeminiFile)} ---`,
fileCount: 2,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -274,6 +292,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${DEFAULT_CONTEXT_FILENAME} ---\nCWD memory\n--- End of Context from: ${DEFAULT_CONTEXT_FILENAME} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -311,6 +332,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, defaultContextFile)} ---\ndefault context content\n--- End of Context from: ${path.relative(cwd, defaultContextFile)} ---\n\n--- Context from: ${path.relative(cwd, rootGeminiFile)} ---\nProject parent memory\n--- End of Context from: ${path.relative(cwd, rootGeminiFile)} ---\n\n--- Context from: ${path.relative(cwd, projectRootGeminiFile)} ---\nProject root memory\n--- End of Context from: ${path.relative(cwd, projectRootGeminiFile)} ---\n\n--- Context from: ${path.relative(cwd, cwdGeminiFile)} ---\nCWD memory\n--- End of Context from: ${path.relative(cwd, cwdGeminiFile)} ---`,
fileCount: 4,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -331,6 +355,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, extensionFilePath)} ---\nExtension memory content\n--- End of Context from: ${path.relative(cwd, extensionFilePath)} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});
@ -354,6 +381,9 @@ describe('loadServerHierarchicalMemory', () => {
expect(result).toEqual({
memoryContent: `--- Context from: ${path.relative(cwd, includedFile)} ---\nincluded directory memory\n--- End of Context from: ${path.relative(cwd, includedFile)} ---`,
fileCount: 1,
ruleCount: 0,
conditionalRules: [],
projectRoot: expect.any(String),
});
});

View file

@ -13,6 +13,7 @@ import type { FileDiscoveryService } from '../services/fileDiscoveryService.js';
import { processImports } from './memoryImportProcessor.js';
import { QWEN_DIR } from './paths.js';
import { createDebugLogger } from './debugLogger.js';
import { loadRules, type RuleFile } from './rulesDiscovery.js';
const logger = createDebugLogger('MEMORY_DISCOVERY');
@ -303,11 +304,20 @@ function concatenateInstructions(
export interface LoadServerHierarchicalMemoryResponse {
memoryContent: string;
fileCount: number;
/** Number of baseline rules injected at session start. */
ruleCount: number;
/** Conditional rules (with `paths:`) for turn-level lazy injection. */
conditionalRules: RuleFile[];
/** Effective project root used for glob matching. */
projectRoot: string;
}
/**
* Loads hierarchical QWEN.md files and concatenates their content.
* Also loads path-based context rules from `.qwen/rules/` directories.
* This function is intended for use by the server.
*
* @param contextRuleExcludes - Glob patterns to skip when loading rules.
*/
export async function loadServerHierarchicalMemory(
currentWorkingDirectory: string,
@ -316,6 +326,7 @@ export async function loadServerHierarchicalMemory(
extensionContextFilePaths: string[] = [],
folderTrust: boolean,
importFormat: 'flat' | 'tree' = 'tree',
contextRuleExcludes: string[] = [],
): Promise<LoadServerHierarchicalMemoryResponse> {
logger.debug(
`Loading server hierarchical memory for CWD: ${currentWorkingDirectory} (importFormat: ${importFormat})`,
@ -332,26 +343,54 @@ export async function loadServerHierarchicalMemory(
extensionContextFilePaths,
folderTrust,
);
if (filePaths.length === 0) {
logger.debug('No QWEN.md files found in hierarchy.');
return { memoryContent: '', fileCount: 0 };
}
const contentsWithPaths = await readGeminiMdFiles(filePaths, importFormat);
// Pass CWD for relative path display in concatenated content
const combinedInstructions = concatenateInstructions(
contentsWithPaths,
currentWorkingDirectory,
);
// Only count files that match configured memory filenames (e.g., QWEN.md),
// excluding system context files like output-language.md
const memoryFilenames = new Set(getAllGeminiMdFilenames());
const fileCount = contentsWithPaths.filter((item) =>
memoryFilenames.has(path.basename(item.filePath)),
).length;
let combinedInstructions = '';
let fileCount = 0;
if (filePaths.length > 0) {
const contentsWithPaths = await readGeminiMdFiles(filePaths, importFormat);
// Pass CWD for relative path display in concatenated content
combinedInstructions = concatenateInstructions(
contentsWithPaths,
currentWorkingDirectory,
);
// Only count files that match configured memory filenames (e.g., QWEN.md),
// excluding system context files like output-language.md
const memoryFilenames = new Set(getAllGeminiMdFilenames());
fileCount = contentsWithPaths.filter((item) =>
memoryFilenames.has(path.basename(item.filePath)),
).length;
}
// Load path-based context rules from .qwen/rules/ directories
const resolvedCwd = path.resolve(currentWorkingDirectory);
const foundRoot = await findProjectRoot(resolvedCwd);
const effectiveRoot = foundRoot ?? resolvedCwd;
const {
content: rulesContent,
ruleCount,
conditionalRules,
} = await loadRules(effectiveRoot, folderTrust, contextRuleExcludes);
// Baseline rules go into the system prompt
let memoryContent = combinedInstructions;
if (rulesContent) {
memoryContent = memoryContent
? `${memoryContent}\n\n${rulesContent}`
: rulesContent;
}
if (!memoryContent && filePaths.length === 0 && ruleCount === 0) {
logger.debug('No QWEN.md files or rules found.');
}
return {
memoryContent: combinedInstructions,
fileCount, // Only count the context files
memoryContent,
fileCount,
ruleCount,
conditionalRules,
projectRoot: effectiveRoot,
};
}

View file

@ -0,0 +1,427 @@
/**
* @license
* Copyright 2025 Qwen
* SPDX-License-Identifier: Apache-2.0
*/
import { vi, describe, it, expect, beforeEach, afterEach } from 'vitest';
import * as fsPromises from 'node:fs/promises';
import * as os from 'node:os';
import * as path from 'node:path';
import {
parseRuleFile,
loadRules,
ConditionalRulesRegistry,
} from './rulesDiscovery.js';
import { QWEN_DIR } from './paths.js';
vi.mock('os', async (importOriginal) => {
const actualOs = await importOriginal<typeof os>();
return {
...actualOs,
homedir: vi.fn(),
};
});
describe('rulesDiscovery', () => {
let testRootDir: string;
let projectRoot: string;
let homedir: string;
async function createTestFile(fullPath: string, content: string) {
await fsPromises.mkdir(path.dirname(fullPath), { recursive: true });
await fsPromises.writeFile(fullPath, content);
return fullPath;
}
beforeEach(async () => {
testRootDir = await fsPromises.mkdtemp(
path.join(os.tmpdir(), 'rules-discovery-test-'),
);
vi.resetAllMocks();
vi.stubEnv('NODE_ENV', 'test');
vi.stubEnv('VITEST', 'true');
projectRoot = path.join(testRootDir, 'project');
await fsPromises.mkdir(projectRoot, { recursive: true });
homedir = path.join(testRootDir, 'userhome');
await fsPromises.mkdir(homedir, { recursive: true });
vi.mocked(os.homedir).mockReturnValue(homedir);
});
afterEach(async () => {
vi.unstubAllEnvs();
await fsPromises.rm(testRootDir, {
recursive: true,
force: true,
maxRetries: 3,
retryDelay: 10,
});
});
// ─────────────────────────────────────────────────────────────────────────
// parseRuleFile
// ─────────────────────────────────────────────────────────────────────────
describe('parseRuleFile', () => {
it('parses a rule with paths frontmatter', () => {
const content = `---
description: Frontend rules
paths:
- "src/**/*.tsx"
- "src/**/*.ts"
---
Use React functional components.
`;
const rule = parseRuleFile(content, '/test/rule.md');
expect(rule).not.toBeNull();
expect(rule!.description).toBe('Frontend rules');
expect(rule!.paths).toEqual(['src/**/*.tsx', 'src/**/*.ts']);
expect(rule!.content).toBe('Use React functional components.');
});
it('parses a baseline rule without paths', () => {
const content = `---
description: General coding standards
---
Always write tests.
`;
const rule = parseRuleFile(content, '/test/rule.md');
expect(rule!.paths).toBeUndefined();
expect(rule!.content).toBe('Always write tests.');
});
it('parses a rule without any frontmatter as baseline', () => {
const rule = parseRuleFile('Plain rules.\n\nParagraph.', '/test/r.md');
expect(rule!.paths).toBeUndefined();
expect(rule!.content).toBe('Plain rules.\n\nParagraph.');
});
it('strips HTML comments', () => {
const content = `---
description: Test
---
Visible.
<!-- stripped -->
Also visible.
`;
const rule = parseRuleFile(content, '/test/rule.md');
expect(rule!.content).not.toContain('stripped');
expect(rule!.content).toContain('Visible.');
expect(rule!.content).toContain('Also visible.');
});
it('strips adjacent and residual HTML comment markers', () => {
// Defensive cases that previously left residual <!-- in the output,
// flagged by CodeQL as incomplete multi-character sanitization.
const content = `---
description: Test
---
A<!-- one --><!-- two -->B<!--unclosed
`;
const rule = parseRuleFile(content, '/test/rule.md');
expect(rule!.content).not.toContain('<!--');
expect(rule!.content).toContain('A');
expect(rule!.content).toContain('B');
});
it('returns null for empty body after stripping', () => {
const content = `---
paths:
- "*.ts"
---
<!-- Only a comment -->
`;
expect(parseRuleFile(content, '/test/rule.md')).toBeNull();
});
it('handles empty paths array as baseline', () => {
const content = `---
paths:
---
Some content.
`;
expect(parseRuleFile(content, '/t.md')!.paths).toBeUndefined();
});
it('handles paths as a single string', () => {
const content = `---
paths: "src/**/*.ts"
---
Rule.
`;
expect(parseRuleFile(content, '/t.md')!.paths).toEqual(['src/**/*.ts']);
});
it('handles BOM and CRLF', () => {
const content = '\uFEFF---\r\ndescription: BOM\r\n---\r\nContent.\r\n';
const rule = parseRuleFile(content, '/t.md');
expect(rule!.description).toBe('BOM');
expect(rule!.content).toBe('Content.');
});
it('treats non-array/non-string paths as baseline', () => {
const content = `---
paths: 42
---
Body.
`;
expect(parseRuleFile(content, '/t.md')!.paths).toBeUndefined();
});
});
// ─────────────────────────────────────────────────────────────────────────
// loadRules — baseline vs conditional split
// ─────────────────────────────────────────────────────────────────────────
describe('loadRules', () => {
it('returns empty when no rules directory exists', async () => {
const result = await loadRules(projectRoot, true);
expect(result).toEqual({
content: '',
ruleCount: 0,
conditionalRules: [],
});
});
it('loads baseline rules into content', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(
path.join(rulesDir, 'general.md'),
`---
description: General
---
Always write tests.`,
);
const result = await loadRules(projectRoot, true);
expect(result.ruleCount).toBe(1);
expect(result.content).toContain('Always write tests.');
expect(result.conditionalRules).toEqual([]);
});
it('puts conditional rules in conditionalRules, not in content', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(
path.join(rulesDir, 'fe.md'),
`---
paths:
- "src/**/*.tsx"
---
Use hooks.`,
);
const result = await loadRules(projectRoot, true);
expect(result.ruleCount).toBe(0);
expect(result.content).toBe('');
expect(result.conditionalRules).toHaveLength(1);
expect(result.conditionalRules[0].content).toBe('Use hooks.');
});
it('splits baseline and conditional correctly', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(
path.join(rulesDir, '01-general.md'),
'Write clean code.',
);
await createTestFile(
path.join(rulesDir, '02-py.md'),
`---\npaths:\n - "**/*.py"\n---\nUse type hints.`,
);
await createTestFile(
path.join(rulesDir, '03-ts.md'),
`---\npaths:\n - "**/*.ts"\n---\nUse strict.`,
);
const result = await loadRules(projectRoot, true);
expect(result.ruleCount).toBe(1);
expect(result.content).toContain('Write clean code.');
expect(result.conditionalRules).toHaveLength(2);
});
it('recursively scans subdirectories', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(
path.join(rulesDir, 'frontend', 'react.md'),
'Use hooks.',
);
await createTestFile(
path.join(rulesDir, 'backend', 'api.md'),
'Validate inputs.',
);
await createTestFile(path.join(rulesDir, 'general.md'), 'Write tests.');
const result = await loadRules(projectRoot, true);
expect(result.ruleCount).toBe(3);
expect(result.content).toContain('Use hooks.');
expect(result.content).toContain('Validate inputs.');
expect(result.content).toContain('Write tests.');
});
it('skips project rules when folder is untrusted', async () => {
await createTestFile(
path.join(projectRoot, QWEN_DIR, 'rules', 'r.md'),
'Untrusted.',
);
const result = await loadRules(projectRoot, false);
expect(result.ruleCount).toBe(0);
});
it('loads global rules even when folder is untrusted', async () => {
await createTestFile(
path.join(homedir, QWEN_DIR, 'rules', 'g.md'),
'Global.',
);
const result = await loadRules(projectRoot, false);
expect(result.ruleCount).toBe(1);
expect(result.content).toContain('Global.');
});
it('does not duplicate rules when projectRoot equals homedir', async () => {
await createTestFile(
path.join(homedir, QWEN_DIR, 'rules', 's.md'),
'Shared.',
);
const result = await loadRules(homedir, true);
expect(result.ruleCount).toBe(1);
expect((result.content.match(/Shared\./g) || []).length).toBe(1);
});
it('excludes rules matching exclude patterns', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(path.join(rulesDir, 'keep.md'), 'Keep.');
const skipped = await createTestFile(
path.join(rulesDir, 'skip.md'),
'Skip.',
);
const result = await loadRules(projectRoot, true, [skipped]);
expect(result.ruleCount).toBe(1);
expect(result.content).toContain('Keep.');
expect(result.content).not.toContain('Skip.');
});
it('excludes rules in subdirectories by glob', async () => {
const rulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
await createTestFile(
path.join(rulesDir, 'other-team', 'r.md'),
'Their rule.',
);
await createTestFile(path.join(rulesDir, 'mine.md'), 'My rule.');
const result = await loadRules(projectRoot, true, ['**/other-team/**']);
expect(result.ruleCount).toBe(1);
expect(result.content).not.toContain('Their rule.');
});
it('formats rules with source markers', async () => {
await createTestFile(
path.join(projectRoot, QWEN_DIR, 'rules', 'test.md'),
'Content.',
);
const result = await loadRules(projectRoot, true);
expect(result.content).toContain(
`--- Rule from: ${QWEN_DIR}/rules/test.md ---`,
);
});
});
// ─────────────────────────────────────────────────────────────────────────
// ConditionalRulesRegistry
// ─────────────────────────────────────────────────────────────────────────
describe('ConditionalRulesRegistry', () => {
const rule = (fp: string, pats: string[], body: string) => ({
filePath: fp,
paths: pats,
content: body,
});
it('matches a file and returns formatted content', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/fe.md', ['src/**/*.tsx'], 'Use hooks.')],
'/project',
);
const result = reg.matchAndConsume('/project/src/App.tsx');
expect(result).toContain('Use hooks.');
});
it('returns undefined when no patterns match', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/fe.md', ['src/**/*.tsx'], 'Use hooks.')],
'/project',
);
expect(reg.matchAndConsume('/project/lib/utils.py')).toBeUndefined();
});
it('injects each rule at most once', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/fe.md', ['src/**/*.tsx'], 'Use hooks.')],
'/project',
);
expect(reg.matchAndConsume('/project/src/A.tsx')).toBeDefined();
expect(reg.matchAndConsume('/project/src/B.tsx')).toBeUndefined();
});
it('matches multiple rules for one file', () => {
const reg = new ConditionalRulesRegistry(
[
rule('/r/ts.md', ['**/*.tsx'], 'Strict.'),
rule('/r/react.md', ['src/**/*.tsx'], 'Hooks.'),
],
'/project',
);
const result = reg.matchAndConsume('/project/src/App.tsx');
expect(result).toContain('Strict.');
expect(result).toContain('Hooks.');
expect(reg.injectedCount).toBe(2);
});
it('tracks totalCount and injectedCount', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/a.md', ['**/*.ts'], 'A'), rule('/r/b.md', ['**/*.py'], 'B')],
'/project',
);
expect(reg.totalCount).toBe(2);
expect(reg.injectedCount).toBe(0);
reg.matchAndConsume('/project/foo.ts');
expect(reg.injectedCount).toBe(1);
});
it('returns undefined when registry is empty', () => {
const reg = new ConditionalRulesRegistry([], '/project');
expect(reg.matchAndConsume('/project/foo.ts')).toBeUndefined();
});
it('does not match files outside the project root', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/ts.md', ['**/*.ts'], 'Strict.')],
'/project',
);
expect(reg.matchAndConsume('/etc/passwd')).toBeUndefined();
expect(reg.matchAndConsume('/other/foo.ts')).toBeUndefined();
});
it('rejects the exact `..` relative path (parent of projectRoot)', () => {
// Pattern matches literal '..' — pathological but defensive
const reg = new ConditionalRulesRegistry(
[rule('/r/dot.md', ['..'], 'Parent rule.')],
'/project',
);
// Exact parent directory (unlikely but possible input)
expect(reg.matchAndConsume('/')).toBeUndefined();
});
it('resolves relative paths against projectRoot', () => {
const reg = new ConditionalRulesRegistry(
[rule('/r/ts.md', ['src/**/*.ts'], 'Strict.')],
'/project',
);
// A relative file_path should be resolved against the project root
// so "src/foo.ts" matches "src/**/*.ts".
const result = reg.matchAndConsume('src/foo.ts');
expect(result).toContain('Strict.');
});
});
});
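
The registry tests above exercise three behaviours: relative inputs are resolved against the project root, paths that escape the root are rejected, and each rule fires at most once. A minimal sketch of that flow follows, assuming POSIX-style paths and using a plain predicate in place of the picomatch matchers. The `SketchRule` type and `matchOnce` helper are hypothetical names for illustration, not part of this PR.

```typescript
import * as path from 'node:path';

// Hypothetical stand-in for a compiled rule: `matches` replaces the
// picomatch matchers used by the real registry.
interface SketchRule {
  id: string;
  matches: (relativePath: string) => boolean;
}

// Returns the ids of rules newly matched by `filePath` and records them in
// `injected` so that later calls skip them.
function matchOnce(
  rules: SketchRule[],
  injected: Set<string>,
  projectRoot: string,
  filePath: string,
): string[] {
  const p = path.posix; // assumption: POSIX separators only
  const absolute = p.isAbsolute(filePath)
    ? filePath
    : p.resolve(projectRoot, filePath);
  const relative = p.relative(projectRoot, absolute);
  // The parent of projectRoot (`..`) and anything below it (`../x`) is
  // outside the project boundary.
  if (relative === '..' || relative.startsWith('../')) return [];
  const hits = rules.filter((r) => !injected.has(r.id) && r.matches(relative));
  for (const r of hits) injected.add(r.id);
  return hits.map((r) => r.id);
}

const rules: SketchRule[] = [
  { id: 'ts', matches: (rel) => rel.startsWith('src/') && rel.endsWith('.ts') },
];
const injected = new Set<string>();
// Relative input, resolved against the project root: matches.
const first = matchOnce(rules, injected, '/project', 'src/foo.ts');
// Same rule, different file: already injected, so nothing is returned.
const second = matchOnce(rules, injected, '/project', '/project/src/bar.ts');
// File outside the project root: rejected before any matching happens.
const outside = matchOnce(rules, new Set<string>(), '/project', '/other/src/foo.ts');
```

Keeping the injected set keyed by rule identity (the real registry uses `rule.filePath`) is what makes injection once-per-session rather than once-per-file.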


@@ -0,0 +1,367 @@
/**
* @license
* Copyright 2025 Qwen
* SPDX-License-Identifier: Apache-2.0
*/
// Path-based context rule injection.
//
// Discovers .qwen/rules/ files (recursively) with optional YAML frontmatter.
// Rules declare applicable file paths via glob patterns in `paths:`.
//
// - Rules WITHOUT `paths:` always load at session start (baseline rules).
// - Rules WITH `paths:` are deferred and injected on-demand when the model
// reads or edits a matching file (turn-level lazy loading).
// - HTML comments are stripped to save tokens.
import * as fs from 'node:fs/promises';
import * as path from 'node:path';
import { homedir } from 'node:os';
import picomatch from 'picomatch';
import { parse as parseYaml } from './yaml-parser.js';
import { normalizeContent } from './textUtils.js';
import { QWEN_DIR } from './paths.js';
import { createDebugLogger } from './debugLogger.js';
const logger = createDebugLogger('RULES_DISCOVERY');
// ─────────────────────────────────────────────────────────────────────────────
// Types
// ─────────────────────────────────────────────────────────────────────────────
export interface RuleFile {
filePath: string;
description?: string;
paths?: string[];
content: string;
}
export interface LoadRulesResponse {
/** Formatted baseline rules (no `paths:`) for the system prompt. */
content: string;
/** Number of baseline rules injected at session start. */
ruleCount: number;
/** Conditional rules (with `paths:`) for turn-level lazy injection. */
conditionalRules: RuleFile[];
}
// ─────────────────────────────────────────────────────────────────────────────
// Parsing
// ─────────────────────────────────────────────────────────────────────────────
const FRONTMATTER_REGEX = /^---\n([\s\S]*?)\n---(?:\n|$)([\s\S]*)$/;
function stripHtmlComments(content: string): string {
// Repeat the replacement until the string stops changing, so that a
// comment opener reassembled by an earlier removal
// (e.g. <!<!-- A -->-- B -->) is also stripped.
let result = content;
let prev: string;
do {
prev = result;
result = prev.replace(/<!--[\s\S]*?-->/g, '');
} while (result !== prev);
// Strip any residual unclosed <!-- markers. Not a security issue in
// system-prompt context (output isn't rendered as HTML), but leaving
// them would waste tokens and trip static analyzers (CodeQL flags
// "incomplete multi-character sanitization" without this step).
return result.replace(/<!--/g, '');
}
/**
* Parse a rule file's YAML frontmatter and body content.
* Returns null if the file has no usable content after processing.
*/
export function parseRuleFile(
rawContent: string,
filePath: string,
): RuleFile | null {
const normalized = normalizeContent(rawContent);
const match = normalized.match(FRONTMATTER_REGEX);
let body: string;
let paths: string[] | undefined;
let description: string | undefined;
if (match) {
const [, frontmatterYaml, rawBody] = match;
try {
const frontmatter = parseYaml(frontmatterYaml);
const pathsRaw = frontmatter['paths'];
if (Array.isArray(pathsRaw)) {
paths = pathsRaw.map(String).filter(Boolean);
if (paths.length === 0) paths = undefined;
} else if (typeof pathsRaw === 'string' && pathsRaw) {
paths = [pathsRaw];
}
if (frontmatter['description'] != null) {
description = String(frontmatter['description']);
}
} catch (error) {
logger.warn(`Failed to parse frontmatter in ${filePath}: ${error}`);
}
body = rawBody;
} else {
body = normalized;
}
const content = stripHtmlComments(body).trim();
if (!content) return null;
return { filePath, description, paths, content };
}
// ─────────────────────────────────────────────────────────────────────────────
// Directory scanning (recursive)
// ─────────────────────────────────────────────────────────────────────────────
/**
* Recursively collect all .md file paths under a directory.
* Paths are returned in directory-traversal order; the caller is
* responsible for sorting them for deterministic ordering.
*/
async function collectMdFiles(dir: string): Promise<string[]> {
let entries;
try {
entries = await fs.readdir(dir, { withFileTypes: true });
} catch {
return [];
}
const files: string[] = [];
for (const entry of entries) {
const fullPath = path.join(dir, entry.name);
if (entry.isDirectory()) {
files.push(...(await collectMdFiles(fullPath)));
} else if (entry.isFile() && entry.name.endsWith('.md')) {
files.push(fullPath);
}
}
return files;
}
/**
* Discover and load rule files from a single `.qwen/rules/` directory.
* Scans recursively; files are sorted alphabetically for deterministic ordering.
*
* @param excludes - Glob patterns to skip (matched against absolute paths).
*/
async function loadRulesFromDir(
rulesDir: string,
excludes: string[],
): Promise<RuleFile[]> {
const allPaths = await collectMdFiles(rulesDir);
if (allPaths.length === 0) return [];
// Sort for deterministic ordering. Use Array.sort() default (UTF-16 code
// unit comparison) rather than localeCompare, since locale-dependent sorting
// can produce different orders on machines with different locales.
allPaths.sort();
// Compile exclude matchers once
const excludeMatchers =
excludes.length > 0 ? excludes.map((p) => picomatch(p, { dot: true })) : [];
const ruleFiles: RuleFile[] = [];
for (const filePath of allPaths) {
// Gap 2: skip files that match any exclude pattern
if (excludeMatchers.some((m) => m(filePath))) {
logger.debug(`Excluding rule by setting: ${filePath}`);
continue;
}
try {
const rawContent = await fs.readFile(filePath, 'utf-8');
const rule = parseRuleFile(rawContent, filePath);
if (rule) {
ruleFiles.push(rule);
}
} catch (error) {
logger.warn(`Failed to load rule file ${filePath}: ${error}`);
}
}
return ruleFiles;
}
// ─────────────────────────────────────────────────────────────────────────────
// Formatting
// ─────────────────────────────────────────────────────────────────────────────
/**
* Format loaded rules into a single string with source markers,
* consistent with the `--- Context from: ... ---` format used for QWEN.md.
*/
export function formatRules(rules: RuleFile[], projectRoot: string): string {
return rules
.map((rule) => {
const rawDisplayPath = path.isAbsolute(rule.filePath)
? path.relative(projectRoot, rule.filePath)
: rule.filePath;
// Normalize to forward slashes for cross-platform consistency in the
// system prompt. Glob patterns in `paths:` use forward slashes, so
// display paths should match — otherwise Windows shows `.qwen\rules\foo.md`
// and Linux shows `.qwen/rules/foo.md`, which is confusing in diffs/tests.
const displayPath = rawDisplayPath.replace(/\\/g, '/');
return (
`--- Rule from: ${displayPath} ---\n` +
`${rule.content}\n` +
`--- End of Rule from: ${displayPath} ---`
);
})
.join('\n\n');
}
// ─────────────────────────────────────────────────────────────────────────────
// ConditionalRulesRegistry (Gap 3: turn-level lazy loading)
// ─────────────────────────────────────────────────────────────────────────────
interface CompiledRule {
rule: RuleFile;
matchers: picomatch.Matcher[];
}
/**
* Registry that holds conditional rules and injects them on-demand when
* the model accesses a file matching a rule's `paths:` patterns.
*
* Each rule is injected at most once per session. Patterns are pre-compiled
* with picomatch for efficient repeated matching.
*/
export class ConditionalRulesRegistry {
private readonly compiledRules: CompiledRule[];
private readonly injected = new Set<string>();
private readonly projectRoot: string;
constructor(rules: RuleFile[], projectRoot: string) {
this.projectRoot = projectRoot;
this.compiledRules = rules.map((rule) => ({
rule,
matchers: (rule.paths ?? []).map((p) => picomatch(p, { dot: false })),
}));
logger.debug(
`ConditionalRulesRegistry created with ${rules.length} rule(s)`,
);
}
/**
* Check if a file path matches any conditional rules that haven't been
* injected yet. Matched rules are marked as consumed and their formatted
* content is returned for injection into the conversation context.
*
* @param filePath - Path of the file being accessed; absolute, or relative
* to projectRoot.
* @returns Formatted rule content, or undefined if no new rules match.
*/
matchAndConsume(filePath: string): string | undefined {
if (this.compiledRules.length === 0) return undefined;
// Resolve first to handle both absolute and relative input paths,
// then compute the path relative to projectRoot for pattern matching.
const absolutePath = path.isAbsolute(filePath)
? filePath
: path.resolve(this.projectRoot, filePath);
const relativePath = path
.relative(this.projectRoot, absolutePath)
.replace(/\\/g, '/');
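// Paths outside the project root produce `../` prefixes (or exact `..`
// when the target equals the parent of projectRoot). On Windows, a path on
// a different drive stays absolute because no relative form exists. Don't
// inject rules for files outside the project boundary in either case.
if (
relativePath === '..' ||
relativePath.startsWith('../') ||
path.isAbsolute(relativePath)
) {
return undefined;
}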
const newMatches = this.compiledRules.filter(({ rule, matchers }) => {
if (this.injected.has(rule.filePath)) return false;
return matchers.some((m) => m(relativePath));
});
if (newMatches.length === 0) return undefined;
for (const { rule } of newMatches) {
this.injected.add(rule.filePath);
logger.debug(`Injecting conditional rule: ${rule.filePath}`);
}
return formatRules(
newMatches.map((m) => m.rule),
this.projectRoot,
);
}
get totalCount(): number {
return this.compiledRules.length;
}
get injectedCount(): number {
return this.injected.size;
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Main entry point
// ─────────────────────────────────────────────────────────────────────────────
/**
* Load rules from both global (`~/.qwen/rules/`) and project-level
* (`.qwen/rules/`) directories.
*
* Baseline rules (no `paths:`) are returned in `content` for immediate
* injection into the system prompt. Conditional rules (with `paths:`) are
* returned separately in `conditionalRules` for turn-level lazy loading.
*
* @param projectRoot - Absolute path to the project root (git root or CWD).
* @param folderTrust - Whether the project folder is trusted.
* @param excludes - Glob patterns to skip (matched against absolute paths).
*/
export async function loadRules(
projectRoot: string,
folderTrust: boolean,
excludes: string[] = [],
): Promise<LoadRulesResponse> {
logger.debug(`Loading rules for project: ${projectRoot}`);
const allRules: RuleFile[] = [];
// 1. Global rules: ~/.qwen/rules/
const globalRulesDir = path.join(homedir(), QWEN_DIR, 'rules');
const globalRules = await loadRulesFromDir(globalRulesDir, excludes);
allRules.push(...globalRules);
logger.debug(`Loaded ${globalRules.length} global rule(s)`);
// 2. Project-level rules: <projectRoot>/.qwen/rules/ (trusted only)
// Skip if it resolves to the same directory as global rules.
if (folderTrust) {
const projectRulesDir = path.join(projectRoot, QWEN_DIR, 'rules');
if (path.resolve(projectRulesDir) !== path.resolve(globalRulesDir)) {
const projectRules = await loadRulesFromDir(projectRulesDir, excludes);
allRules.push(...projectRules);
logger.debug(`Loaded ${projectRules.length} project rule(s)`);
} else {
logger.debug(
'Project rules dir same as global — skipping to avoid duplicates',
);
}
}
// Split into baseline (no paths) and conditional (has paths)
const baselineRules: RuleFile[] = [];
const conditionalRules: RuleFile[] = [];
for (const rule of allRules) {
if (rule.paths) {
conditionalRules.push(rule);
} else {
baselineRules.push(rule);
}
}
logger.debug(
`Split: ${baselineRules.length} baseline, ${conditionalRules.length} conditional`,
);
const content =
baselineRules.length > 0 ? formatRules(baselineRules, projectRoot) : '';
return { content, ruleCount: baselineRules.length, conditionalRules };
}
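
To make the baseline/conditional split concrete, here is a minimal sketch of how a rule file is separated into frontmatter and body and how HTML comments are removed. It reuses the frontmatter regex and the comment-stripping loop from above. As an assumption for brevity, it extracts the `paths:` list with a naive line scan instead of the project's YAML parser. `splitRule` and `naivePaths` are hypothetical helpers, not functions from this PR.

```typescript
const FRONTMATTER = /^---\n([\s\S]*?)\n---(?:\n|$)([\s\S]*)$/;

// Naive stand-in for a YAML parser: collects the `- item` lines that
// directly follow a `paths:` key. Real frontmatter needs a proper parser.
function naivePaths(yaml: string): string[] | undefined {
  const lines = yaml.split('\n');
  const start = lines.findIndex((l) => l.trim() === 'paths:');
  if (start === -1) return undefined;
  const items: string[] = [];
  for (const line of lines.slice(start + 1)) {
    const m = line.match(/^\s*-\s*(.+)$/);
    if (!m) break;
    // Drop one surrounding quote character on each side, if present.
    items.push(m[1].trim().replace(/^['"]|['"]$/g, ''));
  }
  return items.length > 0 ? items : undefined;
}

function splitRule(raw: string): { paths: string[] | undefined; content: string } {
  const match = raw.match(FRONTMATTER);
  const body = match ? match[2] : raw;
  const paths = match ? naivePaths(match[1]) : undefined;
  // Strip complete comments until stable, then any unclosed opener.
  let content = body;
  let prev: string;
  do {
    prev = content;
    content = prev.replace(/<!--[\s\S]*?-->/g, '');
  } while (content !== prev);
  content = content.replace(/<!--/g, '').trim();
  return { paths, content };
}

// A rule with `paths:` would be deferred (conditional).
const conditional = splitRule(
  '---\npaths:\n  - "src/**/*.tsx"\n---\n<!-- note -->\nUse hooks.\n',
);
// A rule without frontmatter would load at session start (baseline).
const baseline = splitRule('Always write tests.\n');
```

In this sketch, `conditional.paths` being defined is what would route the rule to lazy injection, while `baseline.paths` being `undefined` would place it in the system prompt content.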