mirror of
https://github.com/ruvnet/RuVector.git
synced 2026-05-25 15:03:46 +00:00
Rebuilt all 4 versions from scratch:
- v0.2.x: 1,049 classes, 13,869 functions, 3,375 RVF vectors
- v1.0.x: 1,390 classes, 16,593 functions, 4,669 RVF vectors
- v2.0.x: 1,612 classes, 20,395 functions, 5,712 RVF vectors
- v2.1.x: 1,632 classes, 19,906 functions, 9,058 RVF vectors

Structure: source/ (17 JS modules in subfolders) + rvf/ (9 containers)
- Zero mixing: no JS in rvf dirs, no RVF in source dirs
- 100% code coverage: uncategorized/ catches everything
- 17 modules: core/3, tools/3, permissions/1, config/3, telemetry/1, ui/2, types/1, uncategorized/1
- 9 RVF containers per version (1 master + 8 per-category)

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude Code CLI Source Analysis: Research Index
Project
Deep-dive reverse engineering of the Claude Code CLI (v2.1.91) internal architecture, based on binary analysis, string extraction, pattern matching, and configuration schema examination.
Methodology
This analysis used "agentic jujutsu" -- the tool analyzed itself by:
- Locating the binary and extension files on disk
- Extracting the embedded JavaScript source from the Bun SEA binary
- Pattern-matching against 12.8 MB of application code for class names, function signatures, string literals, and configuration patterns
- Analyzing the 76-property settings schema
- Cross-referencing 498 environment variables
- Mapping tool definitions, hook events, and MCP protocol methods
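
The pattern-matching step can be illustrated with a minimal sketch. It is not the script used for this analysis: the function name `scanBundle`, the two regular expressions, and the sample string are all assumptions chosen for illustration, and a plain regex scan will also match text inside string literals.

```javascript
// Hypothetical sketch: scan minified JavaScript for class declarations and
// process.env references with regular expressions. Not the project's script.
function scanBundle(source) {
  const classNames = new Set();
  const envVars = new Set();

  // `class Foo` or `class Foo extends Bar`; identifiers may contain $ and _.
  const classRe = /\bclass\s+([A-Za-z_$][\w$]*)/g;
  // `process.env.NAME` where NAME is upper-case with digits or underscores.
  const envRe = /\bprocess\.env\.([A-Z][A-Z0-9_]*)/g;

  for (const match of source.matchAll(classRe)) classNames.add(match[1]);
  for (const match of source.matchAll(envRe)) envVars.add(match[1]);

  // Sets remove duplicates; sorting makes the output stable across runs.
  return {
    classNames: [...classNames].sort(),
    envVars: [...envVars].sort(),
  };
}

// Tiny stand-in for a minified bundle, using mangled names like Yq and f9.
const sample =
  "class Yq extends f9{run(){return process.env.ANTHROPIC_API_KEY}}" +
  "class f9{}var a=process.env.HOME,b=process.env.HOME;";

const result = scanBundle(sample);
console.log(result);
```

On a real bundle the same scan would be run over the extracted file contents, and the counts it yields are only as reliable as the patterns themselves.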
Research Documents
| # | Document | Description |
|---|---|---|
| 01 | Overview and Binary Structure | Binary format, installation paths, Bun SEA architecture, version management |
| 02 | Tool System | 25+ built-in tools, MCP tool integration, tool schemas, validation, content block types |
| 03 | Agent Loop and Execution Flow | Entry points, main loop, streaming, conversation management, slash commands, output formats |
| 04 | Permission System | 6 permission modes, permission flow, sandbox integration, managed settings |
| 05 | MCP Integration | 4 transports, 13 protocol methods, connection management, OAuth, tool discovery |
| 06 | Hooks System | 6 hook events, command/HTTP hook types, lifecycle, security controls |
| 07 | Context and Session Management | Token budgets, auto-compaction, session persistence, CLAUDE.md, file checkpointing, prompt caching |
| 08 | Configuration and Environment | Settings hierarchy, 76 settings, 498 env vars, home directory structure |
| 09 | Agent and Subagent System | Agent types, task/subagent lifecycle, skill system, plugin marketplace |
| 10 | Models and API | 27+ model IDs, 5 provider backends, API endpoints, prompt caching, effort levels |
| 11 | Telemetry and Observability | OpenTelemetry, Datadog, Perfetto, debugging, cost tracking |
| 12 | Dependency Graph | Module relationships, data flow, state management, initialization sequence |
| 13 | Extension Points | 13 extension mechanisms from CLAUDE.md to Agent SDK |
| 14 | Source Extraction | Binary analysis, code metrics, extraction methods, dependency identification |
| 15 | Core Module Analysis | Agent loop, tool dispatch, permissions, context management, MCP, streaming |
| 16 | Call Graphs | Mermaid call graphs: boot, agent loop, tool dispatch, permissions, MCP, compaction |
| 17 | Class Hierarchy | 1,557 classes, inheritance trees, AppState type, tool registry |
| 18 | State Machines | Agent loop, permission, session, streaming, MCP, sandbox state machines |
Extracted Source (v2.1.91)
Source files and RVF containers are kept in separate directories. The master RVF container holds 9,058 vectors.
| Directory | Module | Fragments | Confidence |
|---|---|---|---|
| `source/core/` | agent-loop.js | 77 | High |
| `source/core/` | context-manager.js | 49 | High |
| `source/core/` | streaming-handler.js | 24 | High |
| `source/core/` | session.js | 361 | High |
| `source/tools/` | tool-dispatch.js | 531 | High |
| `source/tools/mcp/` | mcp-client.js | 51 | High |
| `source/permissions/` | permission-system.js | 500 | High |
| `source/config/` | config.js | 473 | High |
| `source/config/` | model-provider.js | 165 | Medium |
| `source/config/` | env-vars.js | 223 | Pattern |
| `source/telemetry/` | telemetry.js | 524 | High |
| `source/telemetry/` | telemetry-events.js | 861 | Pattern |
| `source/ui/` | commands.js | 80 | Medium |
| `source/ui/` | command-defs.js | 93 | Pattern |
| `source/types/` | class-hierarchy.js | 1,467 | Pattern |
| `source/types/` | api-endpoints.js | 52 | Pattern |
| `source/uncategorized/` | uncategorized.js | 3,162 | Low |
RVF containers in rvf/: master.rvf (all), core.rvf, tools.rvf, permissions.rvf, config.rvf, telemetry.rvf, etc.
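
To see how the fragments in the table above are distributed, the rows can be summed per top-level directory. This is a small sketch over the table's own numbers; the helper name `fragmentsByDirectory` and the grouping rule (first two path segments) are assumptions made for the example.

```javascript
// Rows copied from the Extracted Source table: [directory, module, fragments].
const modules = [
  ["source/core/", "agent-loop.js", 77],
  ["source/core/", "context-manager.js", 49],
  ["source/core/", "streaming-handler.js", 24],
  ["source/core/", "session.js", 361],
  ["source/tools/", "tool-dispatch.js", 531],
  ["source/tools/mcp/", "mcp-client.js", 51],
  ["source/permissions/", "permission-system.js", 500],
  ["source/config/", "config.js", 473],
  ["source/config/", "model-provider.js", 165],
  ["source/config/", "env-vars.js", 223],
  ["source/telemetry/", "telemetry.js", 524],
  ["source/telemetry/", "telemetry-events.js", 861],
  ["source/ui/", "commands.js", 80],
  ["source/ui/", "command-defs.js", 93],
  ["source/types/", "class-hierarchy.js", 1467],
  ["source/types/", "api-endpoints.js", 52],
  ["source/uncategorized/", "uncategorized.js", 3162],
];

// Group by the first two path segments, so source/tools/mcp/ counts as source/tools.
function fragmentsByDirectory(rows) {
  const totals = new Map();
  for (const [dir, , fragments] of rows) {
    const top = dir.split("/").slice(0, 2).join("/");
    totals.set(top, (totals.get(top) || 0) + fragments);
  }
  return totals;
}

const totals = fragmentsByDirectory(modules);
const grandTotal = [...totals.values()].reduce((sum, n) => sum + n, 0);

for (const [dir, count] of totals) {
  const share = ((100 * count) / grandTotal).toFixed(1);
  console.log(`${dir}: ${count} fragments (${share}%)`);
}
console.log(`total: ${grandTotal} fragments`);
```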
Additional Research
| # | Document | Description |
|---|---|---|
| 19 | RuVector Integration Guide | 6-tier integration plan: WASM MCP, agents, hooks, cache, SDK, plugin |
| 20 | SOTA Decompiler Research | Survey of JSNice, DeGuard, DIRE, VarCLR + ruDevolution validation |
| 21 | Model Weight Analysis | Embedded models, LoRA federation, GPU training, GGUF parsing |
RVF Version Corpus
| Version | Latest | Vectors | RVF Size | Bundle | Classes | Functions | Modules |
|---|---|---|---|---|---|---|---|
| v0.2.x | 0.2.126 | 3,375 | 1,731 KB | 6.9 MB | 1,049 | 13,869 | 17 |
| v1.0.x | 1.0.128 | 4,669 | 2,388 KB | 8.9 MB | 1,390 | 16,593 | 17 |
| v2.0.x | 2.0.77 | 5,712 | 2,918 KB | 10.5 MB | 1,612 | 20,395 | 17 |
| v2.1.x | 2.1.91 | 9,058 | 4,617 KB | 12.6 MB | 1,632 | 19,906 | 17 |
Tools
| Tool | Description |
|---|---|
| `scripts/rebuild-all-versions.mjs` | Full rebuild of all version decompilations (Node.js) |
| `scripts/claude-code-decompile.sh` | CLI decompiler (extract, beautify, split) |
| `scripts/claude-code-rvf-corpus.sh` | Build RVF containers for all versions (shell wrapper) |
| `npm/packages/ruvector/src/decompiler/` | Decompiler library (module-splitter, metrics, witness) |
| `npx ruvector decompile <package>` | npm CLI decompiler |
| `examples/decompiler-dashboard/` | Visual explorer (Vite + React) |
| `crates/ruvector-decompiler/` | Rust decompiler crate (MinCut + AI + witness) |
ruDevolution SOTA Results
95.7% name accuracy, ahead of JSNice (63%), DIRE (65.8%), and VarCLR (72%) by 23.7 to 32.7 percentage points.
The model is a 673K-parameter transformer trained on 8,201 pairs, with pure Rust inference (<5 ms, zero dependencies).
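
The margins over each baseline follow directly from the quoted percentages, as the sketch below shows. The `nameAccuracy` helper is an assumption: it treats name accuracy as the exact-match rate between predicted and reference identifiers, which may differ from the metric ruDevolution actually reports, and the identifiers in the toy call are made up.

```javascript
// Accuracy figures as quoted in this document, in percent.
const ruDevolution = 95.7;
const baselines = { JSNice: 63, DIRE: 65.8, VarCLR: 72 };

// Percentage-point margin over each baseline, rounded to one decimal place.
function margins(score, others) {
  return Object.entries(others).map(([name, accuracy]) => ({
    name,
    margin: Number((score - accuracy).toFixed(1)),
  }));
}

// ASSUMPTION: name accuracy as the share of predicted identifiers that exactly
// match the reference identifier at the same position.
function nameAccuracy(predicted, reference) {
  if (predicted.length !== reference.length || reference.length === 0) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let hits = 0;
  for (let i = 0; i < reference.length; i++) {
    if (predicted[i] === reference[i]) hits++;
  }
  return hits / reference.length;
}

const table = margins(ruDevolution, baselines);
console.log(table);

// Toy check with made-up identifiers: two of three names recovered.
const toy = nameAccuracy(
  ["readFile", "x", "parse"],
  ["readFile", "writeFile", "parse"]
);
console.log(toy);
```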
Key Findings
Architecture Summary
- Runtime: Bun 1.3.11 Single Executable Application (229 MB binary)
- Application code: ~12.8 MB of bundled, minified JavaScript
- UI: React 18.3.1 WebView (VS Code) + Ink-style terminal (CLI)
- API: Anthropic Messages API with streaming SSE
- Extension: MCP client protocol with 4 transports
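
As an illustration of the streaming SSE item above, here is a minimal parser for a complete server-sent-events payload. It is a sketch, not code recovered from the binary: `parseSse` is a hypothetical helper, the sample payload is hand-written in the shape of the Messages API's `content_block_delta` events, and a real client would have to handle data arriving in arbitrary chunks.

```javascript
// Hypothetical helper: split a complete SSE payload into { event, data } records.
function parseSse(text) {
  const events = [];
  // Events are separated by a blank line.
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type when no `event:` field is present
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank line or comment
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // drop one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}

// Hand-written sample payload; the text fragments are made up.
const sample = [
  "event: content_block_delta",
  'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hel"}}',
  "",
  "event: content_block_delta",
  'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"lo"}}',
  "",
  "event: message_stop",
  'data: {"type":"message_stop"}',
  "",
  "",
].join("\n");

const events = parseSse(sample);

// Concatenate the text deltas into the final message text.
const text = events
  .filter((e) => e.event === "content_block_delta")
  .map((e) => JSON.parse(e.data).delta.text)
  .join("");

console.log(events.length, text);
```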
By the Numbers
| Metric | Count |
|---|---|
| Built-in tools | 25+ |
| Slash commands | 39 |
| Environment variables | 498 |
| Settings properties | 76 |
| Supported models | 27+ |
| MCP protocol methods | 13 |
| Hook event types | 6 |
| Permission modes | 5 (acceptEdits, bypassPermissions, default, dontAsk, plan) |
| Extension mechanisms | 13 |
| Auth providers | 5 |
| MCP transports | 4 |
| Output formats | 3 |
| Source code classes | 1,557 |
| Functions (estimated) | 19,464 |
| Async generators (core loops) | 6 |
| Bundle size (minified) | 11 MB / 4,836 lines |
Architecture Pattern
Claude Code follows a plugin-oriented monolith pattern:
- Single binary deployment (Bun SEA)
- Modular internal architecture with clear subsystem boundaries
- Extensive extension surface (MCP, hooks, agents, skills, plugins)
- Multi-provider backend abstraction (Anthropic/AWS/GCP/Azure)
- Layered security (permissions -> sandbox -> hooks -> managed settings)
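
The layered-security item can be pictured as a chain of checks evaluated in the listed order. The sketch below is purely illustrative: every name in it (`evaluateLayers`, the request fields, the toy rules inside each layer) is hypothetical, and the "first denial wins" behavior is an assumption rather than something confirmed by the extracted source.

```javascript
// Hypothetical sketch: run a request through ordered layers; first denial wins.
function evaluateLayers(layers, request) {
  for (const layer of layers) {
    const verdict = layer.check(request);
    if (!verdict.allowed) {
      return { allowed: false, deniedBy: layer.name, reason: verdict.reason };
    }
  }
  return { allowed: true, deniedBy: null, reason: null };
}

const allow = { allowed: true };
const deny = (reason) => ({ allowed: false, reason });

// Toy rules only; the real checks in each subsystem are not reproduced here.
const layers = [
  {
    name: "permissions",
    check: (req) =>
      req.allowedTools.includes(req.tool) ? allow : deny("tool not permitted"),
  },
  {
    name: "sandbox",
    check: (req) =>
      req.path.startsWith(req.sandboxRoot + "/") ? allow : deny("path outside sandbox"),
  },
  {
    name: "hooks",
    check: (req) =>
      req.hooks.every((hook) => hook(req)) ? allow : deny("blocked by hook"),
  },
  {
    name: "managed-settings",
    check: (req) =>
      req.managedDenyList.includes(req.tool) ? deny("denied by managed settings") : allow,
  },
];

// Made-up request used to exercise the chain.
const base = {
  tool: "Write",
  path: "/work/project/notes.txt",
  allowedTools: ["Read", "Write"],
  sandboxRoot: "/work/project",
  hooks: [(req) => !req.path.endsWith(".env")],
  managedDenyList: [],
};

const ok = evaluateLayers(layers, base);
const escaped = evaluateLayers(layers, { ...base, path: "/etc/passwd" });
const managed = evaluateLayers(layers, { ...base, managedDenyList: ["Write"] });

console.log(ok, escaped, managed);
```

Keeping each layer as an independent function means a layer can be added, removed, or reordered without touching the others, which fits the subsystem boundaries described in the list above.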
Limitations
- Source is minified/mangled: variable names are meaningless (e.g., `Yq`, `f9`)
- Cannot trace exact function boundaries or module structure
- V8 snapshot region (~100MB) could not be decompiled
- Some patterns may be from bundled dependencies, not Claude Code itself
- This analysis reflects v2.1.91; architecture may change between versions