qwen-code/docs/plans/memory-diagnostics-reference-design.md
易良 eef06ce376
feat(cli): add structured memory diagnostics JSON (#3785)
* feat(cli): add memory diagnostics doctor command

* fix(core): platform-aware maxRSS conversion and accurate risk message

- Extract platform detection before building diagnostics so the correct
  unit conversion can be applied: multiply by 1024 on Linux (where
  process.resourceUsage().maxRSS is in KB) but leave the value unchanged
  on macOS/Windows (where it is already in bytes).
- Correct the native-memory-pressure risk message to accurately state
  that the threshold is 2× heap used, not just "larger than heapUsed".
- Add a dedicated test to assert that maxRSS is not multiplied on a
  non-Linux platform (darwin).

All 3 core and 9 CLI tests pass; typecheck clean.

Agent-Logs-Url: https://github.com/QwenLM/qwen-code/sessions/9b413337-68ed-4d5c-af99-0d42378900c3

* test(core): cover active request memory risk

* fix(cli): address memory diagnostics review feedback

* fix(cli): harden memory diagnostics review fixes

* fix(memory-diagnostics): tighten risk thresholds and expand readable output

- Add 64MB absolute floor on native-memory-pressure so cold processes don't trip
  the 2x ratio check; raise active-handles threshold from 100 to 256
- Show detachedContexts, nativeContexts, maxRSS, CPU times, smapsRollup
  availability, and v8HeapSpaces summary in the readable /doctor memory output
- Validate unknown memory subcommand args with a usage hint instead of silently
  dropping them
- Wrap human-readable strings in t(...) for i18n parity with the rest of doctor
- Advertise the memory subcommand via /doctor argumentHint while keeping
  acceptsInput false so the parent still auto-submits
- Document _getActiveHandles/_getActiveRequests as undocumented Node internals
- Update tests for new thresholds, expanded output, unknown-arg path, and
  abort-during-json

* fix(cli): harden memory doctor diagnostics

* fix(core): correct maxRSS byte handling and heapRatio consistency

- Remove incorrect * 1024 multiplier for maxRSS on Linux (Node.js >=14.10 returns bytes on all platforms)
- Use v8HeapStats.usedHeapSize for heapRatio to avoid cross-API inconsistency
- Update test expectations and rename "does not multiply" test

* fix(cli): resolve rebase conflicts in memory diagnostics

- Rename local formatMemoryDiagnostics to formatCoreDiagnostics to avoid
  naming conflict with the imported utility from memoryDiagnostics.js
- Update Session.test.ts to use objectContaining for _meta field added
  in recent main commits
- Align doctorCommand.test.ts assertions with current parent command
  state (argumentHint includes --sample/--snapshot from main)

* fix(core): use null instead of undefined for optional probes, deduplicate active count helpers

- optionalProbe/optionalSyncProbe now return null on failure so
  JSON.stringify preserves the keys instead of silently omitting them.
- Merge getActiveHandlesCount/getActiveRequestsCount into a single
  parameterized getProcessInternalCount helper.
- Update MemoryDiagnostics interface: v8HeapSpaces, openFileDescriptors,
  smapsRollup are now T | null instead of T | undefined.

* fix(cli): finish memory diagnostics review fixes

* fix(cli): address memory diagnostics review feedback

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-05-17 19:52:46 +08:00

2 KiB

Memory Diagnostics Reference Design

Context

Issue #3000 tracks memory and performance diagnostics for long-running Qwen Code sessions. The first PR should establish a small, low-risk diagnostic surface before adding heavier profiling or retention changes.

The design is reference-first:

  • Claude Code keeps memory diagnostics separate from heap snapshot generation. Its diagnostics include process memory, V8 heap statistics, heap spaces, resource usage, active handles/requests, file descriptors, Linux smaps_rollup, and leak hints.
  • Codex focuses heavily on bounded retention and lazy loading for long-lived process state. Those ideas should guide later PRs that address conversation, command output, and history retention.

First PR Scope

Add a /doctor memory diagnostic path that captures a single point-in-time snapshot:

  • process.memoryUsage()
  • V8 heap statistics and heap spaces
  • process.resourceUsage()
  • active handle/request counts
  • open file descriptor count when /proc/self/fd is available
  • Linux smaps_rollup when available
  • basic risk hints for heap pressure, detached contexts, excessive handles, excessive requests, high file descriptor count, and native memory pressure

This command should be cheap enough to run in normal sessions and safe on platforms where Linux-only probes are unavailable.

Non-Goals

This PR intentionally does not:

  • write heap snapshots
  • run continuous polling
  • change prompt/history retention
  • change tool output retention
  • alter module loading behavior

Those are follow-up PRs after the diagnostic baseline exists.

Follow-Up PRs

  1. Add explicit snapshot/export support for deeper local investigation.
  2. Add bounded retention for large command/tool outputs, using Codex's capped output retention as the main reference.
  3. Audit lazy loading and module startup paths after measurements identify hot spots.
  4. Add repeatable memory/performance benchmark scenarios for long-running sessions.