qwen-code/packages/core/src/services/fileReadCache.integration.test.ts
Shaojin Wen fcefab6df5
fix(core): clear FileReadCache on every history rewrite path (#3810)
* fix(core): clear FileReadCache after microcompaction

Microcompaction (the idle-cleanup pass that runs at the start of every
new user/cron message) replaces old read_file / shell / glob / grep /
edit / write_file tool outputs with a `[Old tool result content cleared]`
placeholder. The FileReadCache, however, still records the prior full
Reads as "seen in this conversation" — so the next ReadFile of an
unchanged file returns the file_unchanged placeholder pointing at bytes
the model can no longer retrieve from history. The result is a Read
that succeeds at the tool layer but delivers no usable content to the
model, which is the failure mode reported in #3805 ("read tool returns
no content in long-running sessions").

This mirrors the existing post-compaction clear in tryCompressChat —
microcompaction has the same "history rewrite invalidates the cache's
'model has seen this' assumption" property, it was just missed when the
cache was wired in.
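
The invariant being violated can be sketched in a few lines. The class and method names below are illustrative, not the real FileReadCache surface; only the contract comes from the description above:

```typescript
// Hypothetical miniature of the cache contract; the real FileReadCache
// tracks more state, but the invariant is the same: "unchanged" is only
// a safe answer while the bytes are still reachable in history.
class MiniFileReadCache {
  private seen = new Map<string, string>(); // path -> content hash

  record(filePath: string, hash: string): void {
    this.seen.set(filePath, hash);
  }

  isUnchanged(filePath: string, hash: string): boolean {
    return this.seen.get(filePath) === hash;
  }

  clear(): void {
    this.seen.clear();
  }
}

const miniCache = new MiniFileReadCache();
miniCache.record('/proj/foo.ts', 'hash-abc'); // first real Read

// Pre-fix: microcompaction rewrites history (bytes gone), cache untouched,
// so the next Read short-circuits to a placeholder the model cannot resolve.
const staleAnswer = miniCache.isUnchanged('/proj/foo.ts', 'hash-abc'); // true

// The fix: any history rewrite clears the cache, forcing a real re-read.
miniCache.clear();
const freshAnswer = miniCache.isUnchanged('/proj/foo.ts', 'hash-abc'); // false
```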

* fix(core): clear FileReadCache on every history rewrite path

PR1 only patched microcompaction, but a multi-round audit found four
more entry points that rewrite history without clearing the cache,
producing the same `file_unchanged` placeholder vs. missing-content
mismatch. Each is fixed in the same minimal way (clear() at the call
site) and covered by a regression test:

- GeminiClient.setHistory     — /restore checkpoint, /load_history
- GeminiClient.truncateHistory — rewind in AppContainer
- GeminiClient.resetChat       — public API; clearCommand happens to
  clear the cache via startNewSession beforehand, but other callers
  have no such guarantee
- stripOrphanedUserEntriesFromHistory — Retry path drops trailing user
  entries that may include read_file functionResponses

Also tightened the microcompaction comment ("compactable tool outputs"
instead of an enumerated list, since the source of truth is
microcompact.COMPACTABLE_TOOLS) and removed caller references per the
codebase comment style.

Reverse-tested every new clear() by commenting it out and confirming
the matching regression test fails.

* test(core): integration test for FileReadCache + history rewrite

End-to-end tests using the real ReadFileTool, real FileReadCache,
real microcompactHistory, and a real on-disk file. Three cases:

1. Without a cache clear after microcompact, the second Read of an
   unchanged file returns the file_unchanged placeholder while the
   prior content has already been wiped from history. Demonstrates
   the failure mode this PR fixes.
2. After an explicit cache.clear(), the second Read re-emits the
   real bytes. Demonstrates that the fix works.
3. When microcompact removes every prior read of a file, the
   placeholder leaves zero recoverable bytes — the model literally
   cannot find the content anywhere it can reach.

These complement the existing unit tests in client.test.ts (which
verify the call-site wiring) by proving the end-to-end behaviour
through the real code paths, without mocks.

* chore(core): add traceable debug log for every FileReadCache clear

Per review feedback: the new clear() call sites were silent, leaving
no breadcrumb in production debug streams when the cache is dropped.
Adds a `[FILE_READ_CACHE] clear after <reason>` log at every clear
site (5 new + 1 pre-existing in tryCompressChat) so operators can
grep one prefix and see why the cache was invalidated.
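
A rough sketch of that convention, assuming a shared helper; the real call sites may simply inline the log line, and the helper name here is hypothetical:

```typescript
// Hypothetical wrapper giving every cache invalidation one greppable prefix.
function clearWithReason(
  cache: { clear(): void },
  reason: string,
  debugLog: (msg: string) => void,
): void {
  debugLog(`[FILE_READ_CACHE] clear after ${reason}`);
  cache.clear();
}

// Simulate two of the call sites and capture the debug stream.
const stream: string[] = [];
let clears = 0;
const spyCache = { clear: () => { clears += 1; } };
clearWithReason(spyCache, 'microcompaction', (m) => stream.push(m));
clearWithReason(spyCache, 'setHistory', (m) => stream.push(m));
```

Grepping the stream for `[FILE_READ_CACHE]` then yields one line per invalidation, each naming its cause.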

* chore(core): refine truncateHistory cache clear + extract test helper

Per review feedback (deepseek-v4-pro):

1. truncateHistory now skips the cache clear when keepCount >=
   prevLen, since a no-op truncate leaves the cache valid against the
   unchanged history. Adds a regression test covering both
   keepCount==prevLen and keepCount>prevLen.

2. The 6 cache-spy test cases each repeated the same 4-line mock
   setup. Extract a `mockFileReadCacheClear()` helper so future
   changes to the FileReadCache mock surface only need one edit.

Both are quality-of-implementation tweaks; the underlying fix is
unchanged.
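
The extracted helper plausibly reduces to something like this. The name is taken from the commit; the body is an assumption, and the real version presumably wires up a vitest mock rather than a hand-rolled spy:

```typescript
// Hypothetical shape of the shared spy setup described above.
function mockFileReadCacheClear() {
  let clearCalls = 0;
  const cache = {
    clear: (): void => {
      clearCalls += 1;
    },
  };
  return {
    cache,
    timesCleared: (): number => clearCalls,
  };
}

// A test case now needs one line of setup instead of four repeated ones.
const spy = mockFileReadCacheClear();
spy.cache.clear(); // e.g. triggered by a simulated setHistory()
```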

* perf(core): use O(1) getHistoryLength in truncateHistory

Per Copilot review feedback: the previous commit's no-op detection in
truncateHistory called this.getChat().getHistory().length, but
GeminiChat.getHistory() does a structuredClone of the entire history
on every call (line 770 of geminiChat.ts) — paying an O(history)
clone purely to read .length. In long-running sessions with hundreds
of entries this is a meaningful regression.

Adds GeminiChat.getHistoryLength(): O(1), no clone. truncateHistory
switches to it. The behaviour (skip clear when keepCount >= prevLen)
is unchanged.

Also adds:
- Unit tests for GeminiChat.getHistoryLength (empty, after addHistory,
  parity with getHistory().length).
- A regression test asserting truncateHistory calls getHistoryLength
  and NOT getHistory, locking in the perf fix against future drift.
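
The cost difference is easy to see with a stripped-down stand-in. Only the two accessor shapes come from the commit message; everything else below is assumed:

```typescript
type SketchContent = { role: string; parts: Array<Record<string, unknown>> };

// Minimal stand-in for GeminiChat, showing the two accessors side by side.
class SketchChat {
  private history: SketchContent[] = [];

  addHistory(entry: SketchContent): void {
    this.history.push(entry);
  }

  // Existing accessor: defensive deep copy on every call, O(history).
  getHistory(): SketchContent[] {
    return structuredClone(this.history);
  }

  // New accessor: reads .length directly, O(1) and no clone.
  getHistoryLength(): number {
    return this.history.length;
  }
}

const chat = new SketchChat();
for (let i = 0; i < 500; i++) {
  chat.addHistory({ role: 'user', parts: [{ text: `turn ${i}` }] });
}
const cheap = chat.getHistoryLength(); // no clone
const expensive = chat.getHistory().length; // clones 500 entries just to count
```

Both return the same number; only the cheap one is safe to call on a hot path.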

* fix(core): close NaN hole + use public ReadFileTool API in tests

Two issues from Copilot review:

1. NaN edge case in truncateHistory cache invalidation. The
   "did anything actually change?" check was `keepCount < prevLen`,
   but `Array.slice(0, NaN)` returns [] (history wiped) while
   `NaN < prevLen` is false. That sequence would wipe the chat but
   leave the FileReadCache claiming the model has seen the prior
   reads — exactly the file_unchanged placeholder bug this PR is
   closing. Switched the check to compare actual post-truncate length
   (`newLen < prevLen`), which correctly invalidates whenever entries
   were removed regardless of how `keepCount` was malformed. Added
   a NaN regression test.

2. The integration test cast `tool` to `unknown` to reach the
   protected `createInvocation()` method. Switched to the public
   `tool.buildAndExecute(params, signal)` API so the test exercises
   the same surface real callers use, including build-time schema
   validation.
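
The two checks can be contrasted in isolation. The function below is a hypothetical reduction of truncateHistory, not its real signature; the comparison logic is the one described in item 1:

```typescript
// Keep the first keepCount entries; clear the cache only if entries were
// actually removed. Comparing post-truncate length closes the NaN hole:
// NaN < prevLen is false, but slice(0, NaN) still wipes the array.
function truncateWithCacheGuard(
  history: unknown[],
  keepCount: number,
  clearCache: () => void,
): unknown[] {
  const prevLen = history.length;
  const next = history.slice(0, keepCount); // slice(0, NaN) -> []
  if (next.length < prevLen) clearCache();
  return next;
}

let cleared = 0;
const onClear = () => { cleared += 1; };

// Malformed keepCount wipes the history; the fixed check still clears.
const wiped = truncateWithCacheGuard([1, 2, 3], Number.NaN, onClear);

// No-op truncate (keepCount >= length): cache stays valid, no clear.
const kept = truncateWithCacheGuard([1, 2, 3], 5, onClear);
```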
2026-05-04 22:42:06 +08:00


/**
 * Integration tests for the FileReadCache short-circuit. Real
 * filesystem, real ReadFileTool, real microcompactHistory — verify
 * that the placeholder fast-path stays correct under history rewrites.
 */
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import type { Content } from '@google/genai';
import { FileReadCache } from './fileReadCache.js';
import { ReadFileTool } from '../tools/read-file.js';
import { microcompactHistory } from './microcompaction/microcompact.js';
import { StandardFileSystemService } from './fileSystemService.js';

function makeConfig(targetDir: string, cache: FileReadCache, disabled = false) {
  const explicit: Record<string, unknown> = {
    getTargetDir: () => targetDir,
    getProjectRoot: () => targetDir,
    getWorkspaceContext: () => ({
      isPathWithinWorkspace: () => true,
    }),
    storage: {
      getProjectTempDir: () => path.join(targetDir, '.tmp'),
      getProjectDir: () => path.join(targetDir, '.proj'),
      getUserSkillsDirs: () => [],
    },
    getFileReadCache: () => cache,
    getFileReadCacheDisabled: () => disabled,
    getFileService: () => ({ shouldQwenIgnoreFile: () => false }),
    getFileFilteringOptions: () => ({}),
    getDebugMode: () => false,
    getFileSystemService: () => new StandardFileSystemService(),
    getContentGeneratorConfig: () => ({ modalities: {} }),
    getModel: () => 'test-model',
    getTruncateToolOutputLines: () => 2000,
    getTruncateToolOutputThreshold: () => 4_000_000,
    getUsageStatisticsEnabled: () => false,
  };
  return new Proxy(explicit, {
    get(target, prop) {
      if (prop in target)
        return (target as Record<string | symbol, unknown>)[prop];
      // Default: any unknown getter returns an undefined-yielding fn.
      return () => undefined;
    },
  }) as never;
}

describe('FileReadCache integration: read after history rewrite', () => {
  let tmpDir: string;
  let filePath: string;

  beforeEach(() => {
    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'repro-3805-'));
    filePath = path.join(tmpDir, 'foo.ts');
    fs.writeFileSync(
      filePath,
      'export function hello() {\n return "world";\n}\n'.repeat(10),
    );
  });

  afterEach(() => {
    fs.rmSync(tmpDir, { recursive: true, force: true });
  });

  it('returns the file_unchanged placeholder on a follow-up Read after microcompact, exposing why the cache must be cleared on history rewrite', async () => {
    const cache = new FileReadCache();
    const config = makeConfig(tmpDir, cache);
    const tool = new ReadFileTool(config);

    // STEP 1 — first real Read populates the cache.
    const r1 = await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );
    expect(typeof r1.llmContent).toBe('string');
    expect(r1.llmContent as string).toContain('export function hello');
    expect(cache.size()).toBe(1);

    // STEP 2 — build a conversation history mirroring real flow:
    // 6 prior read_file functionResponses with the foo.ts content.
    // microcompact's keepRecent=1 will clear the oldest 5.
    const history: Content[] = [];
    for (let i = 0; i < 6; i++) {
      history.push({
        role: 'model',
        parts: [
          {
            functionCall: {
              name: 'read_file',
              args: { file_path: filePath },
            },
          },
        ],
      });
      history.push({
        role: 'user',
        parts: [
          {
            functionResponse: {
              name: 'read_file',
              response: { output: r1.llmContent as string },
            },
          },
        ],
      });
    }

    // STEP 3 — microcompact fires (>60min idle).
    const mcResult = microcompactHistory(history, Date.now() - 90 * 60_000, {
      toolResultsThresholdMinutes: 60,
      toolResultsNumToKeep: 1,
    });
    expect(mcResult.meta).toBeDefined();
    expect(mcResult.meta!.toolsCleared).toBe(5);

    // Confirm: most foo.ts content has been wiped from history.
    const fooContentEntries = mcResult.history.filter((c) =>
      c.parts?.some((p) => {
        const out = p.functionResponse?.response?.['output'];
        return typeof out === 'string' && out.includes('export function hello');
      }),
    );
    // Only 1 fresh entry remains; the other 5 are placeholders.
    expect(fooContentEntries).toHaveLength(1);

    // STEP 4 — pre-fix code path: cache is NOT cleared after microcompact.
    // User reads foo.ts again. File on disk is unchanged.
    const r2 = await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );
    // THE BUG: returned content is the placeholder, NOT the real file.
    expect(r2.llmContent as string).toContain(
      'unchanged since last read in this session',
    );
    expect(r2.llmContent as string).not.toContain('export function hello');

    // The model now has:
    //  - history: 5 entries are [Old tool result content cleared],
    //    1 entry has real content (the most-recent kept one)
    //  - fresh tool response: a placeholder pointing at "earlier in
    //    this conversation" — which is partly true (1 entry remains)
    //    but if the LLM trusted the placeholder and discarded the
    //    last surviving entry, the bytes are unrecoverable.
    //
    // In a longer chain (e.g. 20 reads, keep 1, microcompact clears
    // 19), the surviving entry might not even be foo.ts — it would be
    // whatever was read most recently. Then the placeholder points at
    // ZERO bytes the model can find.
  });

  it('after cache.clear(), a follow-up Read of the same unchanged file re-emits the real bytes', async () => {
    const cache = new FileReadCache();
    const config = makeConfig(tmpDir, cache);
    const tool = new ReadFileTool(config);

    const r1 = await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );
    expect(r1.llmContent as string).toContain('export function hello');

    // The fix.
    cache.clear();

    const r2 = await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );
    expect(r2.llmContent as string).toContain('export function hello');
    expect(r2.llmContent as string).not.toContain(
      'unchanged since last read in this session',
    );
  });

  it('worst case: when microcompact removes every prior read of a file, the placeholder leaves zero recoverable bytes for the model', async () => {
    // This is the worst-case version: many reads, microcompact clears
    // everything, the surviving entry is a different file. The placeholder
    // then points the model at content that no longer exists anywhere
    // in its reachable context.
    const cache = new FileReadCache();
    const config = makeConfig(tmpDir, cache);
    const tool = new ReadFileTool(config);

    const otherPath = path.join(tmpDir, 'other.ts');
    fs.writeFileSync(otherPath, 'unrelated\n');

    // Read foo.ts (target file).
    await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );

    // Build history: 1 foo.ts read, then 1 other.ts read (kept).
    const fooContent = fs.readFileSync(filePath, 'utf-8');
    const history: Content[] = [
      {
        role: 'model',
        parts: [
          {
            functionCall: {
              name: 'read_file',
              args: { file_path: filePath },
            },
          },
        ],
      },
      {
        role: 'user',
        parts: [
          {
            functionResponse: {
              name: 'read_file',
              response: { output: fooContent },
            },
          },
        ],
      },
      {
        role: 'model',
        parts: [
          {
            functionCall: {
              name: 'read_file',
              args: { file_path: otherPath },
            },
          },
        ],
      },
      {
        role: 'user',
        parts: [
          {
            functionResponse: {
              name: 'read_file',
              response: { output: 'unrelated\n' },
            },
          },
        ],
      },
    ];

    const mc = microcompactHistory(history, Date.now() - 90 * 60_000, {
      toolResultsThresholdMinutes: 60,
      toolResultsNumToKeep: 1,
    });
    expect(mc.meta!.toolsCleared).toBe(1);

    // foo.ts content is gone from history; only other.ts remains.
    const surviving = mc.history
      .flatMap((c) => c.parts ?? [])
      .map((p) => p.functionResponse?.response?.['output'])
      .filter((o): o is string => typeof o === 'string');
    expect(surviving.some((o) => o.includes('export function hello'))).toBe(
      false,
    );

    // Now Read foo.ts again — pre-fix, cache returns placeholder.
    const r = await tool.buildAndExecute(
      { file_path: filePath },
      new AbortController().signal,
    );
    expect(r.llmContent as string).toContain(
      'unchanged since last read in this session',
    );

    // Total foo.ts content reachable to the model:
    //  - history           → 0 bytes
    //  - fresh tool result → placeholder, 0 bytes
    // The model literally cannot recover the file contents.
  });
});