mirror of https://github.com/ruvnet/RuVector.git synced 2026-05-22 19:56:25 +00:00

rUv 930fca916f feat(sse): decouple SSE to mcp.pi.ruv.io proxy + Claude Code source research

SSE Proxy Decoupling (ADR-130):
- Fix ruvbrain-sse proxy: proper MCP handshake, session creation, drain polling
- Fix internal queue endpoints: session_create keeps receiver, drain returns buffered messages
- Add response_queues to AppState for SSE proxy communication
- Skip sparsifier for >5M edge graphs (was crashing on 16M edges)
- Add SSE_DISABLED/MAX_SSE env vars for configurable connection limits
- Route SSE to dedicated mcp.pi.ruv.io subdomain (Cloudflare CNAME)
- Serve SSE at root / path on proxy (no /sse needed)
- Update all references from pi.ruv.io/sse to mcp.pi.ruv.io
- Fix Dockerfile consciousness crate build (feature/version mismatches)

Claude Code CLI Source Research (ADR-133):
- 19 research documents analyzing Claude Code internals (3000+ lines)
- Decompiler script + RVF corpus builder for all major versions
- Binary RVF containers for v0.2, v1.0, v2.0, v2.1 (300-2068 vectors each)
- Call graphs, class hierarchies, state machines from minified source

Integration Strategy (ADR-134):
- 6-tier integration plan: WASM MCP, agents, hooks, cache, SDK, plugin
- Integration guide with architecture diagrams and performance targets

Co-Authored-By: claude-flow <ruv@ruv.net>

2026-04-02 23:39:56 +00:00

4.5 KiB

Raw Permalink Blame History

ADR-133: Claude Code CLI Source Code Analysis

Status

Deployed (2026-04-02)

Date

2026-04-02

Context

Understanding Claude Code's internal architecture is critical for building effective extensions (agents, hooks, skills, MCP servers) and for debugging integration issues. The CLI ships as a minified Bun 1.3.11 Single Executable Application (229 MB) with no public source code, making direct inspection necessary.

Decision

Perform systematic reverse-engineering analysis of the Claude Code CLI binary using string extraction, pattern matching, and structural analysis. Store findings as structured research documents in /docs/research/claude-code-rvsource/.

Approach: Agentic Jujutsu

Use Claude Code's own agent capabilities (background research agents, parallel tool calls) to analyze itself — the tool examines its own binary.

Architecture Findings

Binary Structure

Format: Bun 1.3.11 SEA (Single Executable Application)
Size: 229 MB binary, ~12.8 MB bundled/minified JavaScript
Entry: cli.js — single bundle with all application code
Runtime: Bun (not Node.js) with native module support

Code Metrics (from decompilation)

Metric	Count
Classes	1,557
Functions	19,464
Async functions	884
Arrow functions	23,537
Environment variables	498
Built-in tools	25+
Slash commands	39
Permission modes	6
Auth providers	5
Extension points	13

Core Modules Identified

Module	Minified Symbol	Role
Agent Loop	`s$`	Async generator — recursive `yield*` after tool execution
Tool Registry	`XF0`	`validateInput`/`call` interface, 25+ built-in tools
Permission Checker	—	5 modes, sandbox (bubblewrap/seatbelt), managed settings
Context Manager	—	Auto-compaction via `clear_tool_uses_20250919` API feature
MCP Client	—	stdio/SSE/WebSocket transports, OAuth/OIDC, MCPB bundles
Streaming Handler	—	SSE event processing, 3 stream modes

Key Architectural Patterns

Recursive generator loop: The agent loop is an async generator that yield*s itself after each tool execution, producing 13 distinct event types
Loopback proxy for MCP: Tool calls from MCP are proxied to REST endpoints via HTTP loopback on localhost
Deferred tool loading: MCP tool schemas loaded lazily via ToolSearch to keep initial context small
Context compaction: Automatic context window management with micro-compaction for stale tool results

Research Output

Documents (18 files, ~3,000 lines)

File	Content
`00-index.md`	Master index
`01` - `13`	Architecture analysis (overview, tools, agent loop, permissions, MCP, hooks, context, config, subagents, models, telemetry, dependencies, extensions)
`14` - `18`	Source code analysis (extraction, core modules, call graphs, class hierarchy, state machines)

Extracted Source (6 RVF-header files)

Module extractions in extracted/ with metadata headers:

agent-loop.rvf, tool-dispatch.rvf, permission-system.rvf
mcp-client.rvf, context-manager.rvf, streaming-handler.rvf

Note: These use text format with RVF-style metadata headers, not binary RVF containers. Future work: convert to proper RVF cognitive containers with vector embeddings using rvf-cli.

Decompiler Script

/scripts/claude-code-decompile.sh — automated extraction tool:

Finds Claude Code source (npm package or Bun SEA binary)
Extracts JavaScript via strings + pattern matching
Basic beautification and module splitting
Generates files with metadata headers

Consequences

Positive

Comprehensive internal reference for building Claude Code extensions
Call graphs and state machines clarify execution flow
Decompiler script enables analysis of future versions
Understanding of tool dispatch enables better MCP server design

Negative

Analysis based on minified code — some mappings are speculative
Findings may become stale as Claude Code updates frequently
Binary extraction captures string literals well but misses control flow nuances

Risks

Decompiled source is for internal research only — respect Anthropic's terms of service
Minified symbol names (s$, XF0, ye) may change between versions

References

Claude Code CLI documentation
MCP specification
Research output: /docs/research/claude-code-rvsource/

4.5 KiB Raw Permalink Blame History