ruvector/docs/adr/ADR-059-shared-brain-google-cloud.md

1406 lines
61 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# ADR-059: Shared Brain — Google Cloud Deployment
**Status**: Accepted
**Date**: 2026-02-27
**Authors**: RuVector Team
**Deciders**: ruv
**Supersedes**: N/A
**Related**: ADR-057 (Federated RVF Transfer Learning)
## 1. Overview
Public shared superintelligence for the RuVector/swarm/hive-mind ecosystem. Multiple Claude Code sessions share learning — patterns, solutions, debugging insights, transfer priors, policy kernels — through an RVF-native knowledge substrate. Knowledge enters as embeddings, gets verified by witness chains, partitioned by mincut, ranked by attention, drift-monitored by deltas, protected from poisoning by Byzantine-tolerant aggregation, gated by multi-factor reputation, and exchanged as RVF cognitive containers.
Hosted at `brain.ruv.io` (Δ.ruv.io).
**This is a PUBLIC system for UNTRUSTED users.** Every input is adversarial until proven otherwise.
The Shared Brain bridges the gap between isolated Claude Code sessions and a collective intelligence. Each session can contribute distilled insights — not raw code or conversation logs, but structured learning artifacts: SONA embeddings, transfer priors, policy kernels, cost curves, and debugging heuristics. These artifacts flow through a seven-layer security pipeline before entering the shared knowledge graph. Other sessions query this graph to bootstrap cold-start problems, avoid known pitfalls, and accelerate convergence on novel tasks.
The system is designed for zero-trust operation. Contributors are pseudonymous. All data is PII-stripped and differentially private before leaving the client. Server-side verification re-checks every guarantee. Byzantine-tolerant aggregation prevents poisoning even if a minority of contributors are adversarial. Multi-factor reputation ensures that high-quality contributors have more influence over time, while low-quality or malicious contributors are progressively marginalized.
## 2. Threat Model
The Shared Brain operates as a public service accepting input from untrusted, pseudonymous users. The following threat categories are addressed:
### Untrusted Users
All contributors are pseudonymous and potentially adversarial. No contributor is trusted by default. The system assumes any input may be crafted to degrade collective knowledge quality or extract private information.
### Adversarial Inputs
Malformed RVF containers, oversized payloads, invalid segment layouts, and schema-violating metadata are all expected. Input validation rejects anything that does not conform to the RVF specification before further processing.
### Embedding Poisoning Attacks
An adversary may submit carefully crafted embeddings designed to shift the collective knowledge centroid toward incorrect or misleading regions. Defenses include Byzantine-tolerant aggregation (2-sigma outlier exclusion), minimum observation thresholds, and reputation-weighted averaging that limits the influence of new or low-quality contributors.
### Credential Theft
API keys and Ed25519 signing keys are stored in Google Cloud Secret Manager with strict IAM policies. Keys are never embedded in code, configuration files, or container images. Rotation policies enforce periodic key changes.
### Replay Attacks
The server issues a challenge nonce via `GET /v1/challenge` (short-lived, single-use). The client includes this nonce in the `FederatedManifest (0x33)` segment and signs it with Ed25519. The server accepts exactly once per nonce and rejects replays. This avoids clock-drift issues inherent in timestamp windows. Nonces expire after 5 minutes if unused.
### DoS and DDoS
Defense is layered to reject cheap attacks before reaching expensive verification:
1. **Edge gate**: External HTTPS Load Balancer with Cloud Armor rate-based rules (IP-level, rejects botnets before they reach Cloud Run)
2. **First-packet gate**: Require a short-lived challenge token (server-issued nonce bound to contributor key). Reject requests without a valid token before any crypto verification.
3. **Application gate**: `BudgetTokenBucket` enforces per-contributor quotas (100 writes/hr, 1000 reads/hr). Payload size limits (1MB) prevent resource exhaustion. Optional proof-of-work challenges activate under sustained load.
This sequence ensures botnet traffic burns load balancer budget (cheap) rather than Cloud Run compute budget (expensive).
### PII Leakage
Client-side PII stripping (rvf-pii-strip) removes file paths, IP addresses, email addresses, API keys, usernames, and environment variable references before any data leaves the client. Server-side PII re-checking provides defense-in-depth. Differential privacy noise injection (rvf-diff-privacy) provides mathematical guarantees against reconstruction.
### Model Inversion via Embeddings
SONA embeddings could theoretically be inverted to reconstruct source content. Differential privacy noise injection (calibrated Gaussian noise with configurable epsilon/delta) ensures that individual contributions cannot be extracted from aggregated embeddings. The `DiffPrivacyProof (0x34)` segment attests to the noise parameters applied.
### Sybil Attacks
A single adversary may create multiple pseudonyms to amplify their influence. Defenses include: cold-start reputation (0.1) that limits initial influence, reputation decay (5%/month) that prevents inactive Sybils from accumulating score, and voting quorum requirements (minimum 3 distinct voters) that limit the impact of a small number of colluding pseudonyms.
### Byzantine Behavior
Contributors may behave correctly most of the time but inject subtle poisoning at strategic moments. The `FederatedAggregator` applies 2-sigma outlier exclusion on every aggregation round. `CognitiveMinCutEngine` applies SNN attractor dynamics to detect clusters of anomalous behavior. `VectorDelta` centroid tracking flags knowledge drift exceeding 30%.
## 3. RVF-Native Security
Every shared knowledge item is an RVF cognitive container with a full security envelope:
- **64-byte headers** with SHAKE-256 content hashes (quantum-robust — 128-bit security against Grover's algorithm)
- **Constant-time hash verification** via `subtle::ConstantTimeEq` to prevent timing side-channels
- **Ed25519 signatures** on every segment, verifiable without accessing payload content
- **WITNESS_SEG chains** linking all operations in a tamper-evident audit trail: PII stripping → embedding generation → DP noise injection → sharing → voting → transfer
- **CutCertificate** proving partition integrity via spectral graph analysis
- **AuditLogger** recording all graph mutations to `brain_audit_log` Firestore collection
### Seven Security Layers
**Layer 1: Input Sanitization (Client-Side)**
- PII strip via `rvf-pii-strip`: file paths, IPs, emails, API keys, usernames, environment variables
- Size limits: 1MB maximum payload, 11 segments maximum per container
- Schema validation: all segments must conform to RVF type specifications
- Differential privacy: calibrated Gaussian noise on all numerical parameters (default ε=1.0, δ=1e-5, sensitivity=1.0, clipping_norm=1.0). The `DiffPrivacyProof (0x34)` segment records the exact ε/δ/sensitivity/clipping parameters used. Server rejects containers whose DP proof does not match the enforced policy (making the privacy claim falsifiable, not just aspirational).
- RVF integrity: SHAKE-256 content hash computed and embedded in 64-byte header
**Layer 2: Server-Side Verification**
- PII re-check: server runs the same PII detection pipeline; rejects containers with detected PII
- Witness chain verification: every WITNESS_SEG must form a valid chain from the first operation to the final signature
- Signature verification: Ed25519 signature on every segment must verify against the contributor's registered public key
- Content hash verification: SHAKE-256 hash in header must match recomputed hash of payload (constant-time comparison)
- Embedding bounds check: all embedding dimensions must be within [-10.0, 10.0]; reject containers with out-of-bounds values
**Layer 3: DoS Protection (three sub-gates)**
- **Edge gate**: Cloud Armor rate-based rules on the external HTTPS LB reject botnets at L7 before reaching Cloud Run
- **First-packet gate**: Write endpoints require a server-issued challenge nonce (`GET /v1/challenge`). Requests without a valid nonce are rejected before expensive crypto verification. Nonces are single-use and expire in 5 minutes.
- **Application gate**: `BudgetTokenBucket` — 100 writes/hour, 1000 reads/hour per contributor pseudonym (10 writes/hour for cold-start). Per-endpoint budgets: search=500/hr, get=1000/hr, vote=200/hr, transfer=20/hr
- Optional proof-of-work: SHA-256 challenge with configurable difficulty (activated under sustained load for anonymous/low-reputation users)
- Negative cache: recently rejected contributor/hash pairs are cached for 5 minutes to prevent repeated submission
- Payload size limit: 1MB per request; 10MB per hour per contributor
- **Negative cost fuse**: When Firestore error rate >5% or GCS p99 latency >2s, the server sheds write load and forces read-only mode until health recovers
**Layer 4: Byzantine-Tolerant Aggregation**
- `FederatedAggregator`: weighted FedAvg with reputation-based weights
- 2-sigma outlier exclusion: contributions more than 2 standard deviations from the running mean are excluded
- `CognitiveMinCutEngine` isolation: SNN attractor dynamics partition the knowledge graph; anomalous clusters are quarantined
- Voting quorum: minimum 3 distinct voters required before a memory's quality score influences aggregation
**Layer 5: Multi-Factor Reputation**
- Composite score: `accuracy² × uptime × stake_weight` where `stake_weight = min(1.0, log10(stake + 1) / 6)`
- Cold start: new contributors begin at reputation 0.1
- Decay: 5% per month for inactive contributors
- Poisoning penalty: contributors with >5 downvotes and average quality <0.2 have their reputation halved
- Stored in `brain_contributors` Firestore collection with full history
**Layer 6: Anti-Drift Monitoring**
- `VectorDelta` centroid tracking: monitors the running centroid of all embeddings per knowledge domain
- Anomaly detection: flags shifts exceeding 30% of the domain centroid magnitude
- `CognitiveMinCutEngine` SNN: spectral analysis identifies emerging knowledge clusters and detects fragmentation
- Automated alerts when drift exceeds thresholds (Cloud Monitoring integration)
**Layer 7: Audit and Revocation**
- `WITNESS_SEG` chain: every operation from PII stripping to final aggregation is recorded in a tamper-evident chain
- Pseudonym revocation: revoking a contributor pseudonym cascades to remove all their contributions from the active graph
- `CutCertificate` audit: spectral graph certificates prove that partitioning preserved structural integrity
- Full audit log in `brain_audit_log` Firestore collection with 365-day retention
## 4. Google Cloud Architecture
### Cloud Run: ruvbrain
The core service runs as an axum-based HTTP server on Cloud Run:
- **Service name**: `ruvbrain`
- **Project**: `ruv-dev`
- **Region**: `us-central1`
- **Image**: `gcr.io/ruv-dev/ruvbrain:latest`
- **Auto-scaling**: 0 to 10 instances (scale-to-zero for cost efficiency)
- **Resources**: 2 vCPU, 2Gi RAM per instance
- **Concurrency**: 80 requests per instance
- **Startup probe**: `GET /v1/health` with 10-second timeout
- **Environment variables**:
- `GOOGLE_CLOUD_PROJECT=ruv-dev`
- `GCS_BUCKET=ruvector-brain-us-central1`
- `FIRESTORE_URL=https://firestore.googleapis.com/v1/projects/ruv-dev/databases/(default)/documents`
- `RUST_LOG=info`
- **Secrets** (via Secret Manager):
- `BRAIN_API_KEY=brain-api-key:latest`
- `BRAIN_SIGNING_KEY=brain-signing-key:latest`
### Firestore (Native Mode)
Collections in the `(default)` Firestore database (reusing the existing free-tier database on `ruv-dev`):
| Collection | Purpose | Key Fields |
|------------|---------|------------|
| `brain_memories` | Shared knowledge items | `id`, `contributor`, `embedding`, `quality_score`, `observations`, `created_at`, `updated_at` |
| `brain_contributors` | Contributor profiles and reputation | `pseudonym`, `public_key`, `reputation`, `stake`, `contributions_count`, `last_active` |
| `brain_graph_edges` | Knowledge graph relationships | `source_id`, `target_id`, `weight`, `edge_type`, `created_at` |
| `brain_audit_log` | Immutable operation log | `operation`, `contributor`, `memory_id`, `witness_hash`, `timestamp` |
### Google Cloud Storage (GCS)
- **Bucket**: `ruvector-brain-us-central1` (single-region for low latency, lifecycle-managed)
- **Object naming**: `{domain}/{contributor_pseudonym}/{timestamp}.rvf`
- **Storage class**: Standard for active data, Nearline for archive
- **Lifecycle**: auto-archive after 90 days, delete after 365 days
- **Encryption**: Google-managed keys (default) with option for CMEK
- **Access**: Private; accessible only via Cloud Run service account
### Secret Manager
Key separation principle: contributor keys authorize content operations only; service identity keys authorize infrastructure operations only. No contributor key can trigger Firestore/GCS admin operations.
- `brain-api-key`: API key for contributor authentication (authorizes read/write of content only)
- `brain-signing-key`: Ed25519 private key for server-side attestation signatures (never exposed to contributors)
- Service account authentication: Workload Identity Federation (preferred) no stored credentials. Falls back to `brain-firestore-credentials` only if WIF is unavailable.
- **Rotation**: 90-day rotation policy with automatic version management
- **Scoping**: Contributor Ed25519 keys (submitted in requests) authorize authorship and deletion. Service Ed25519 key (in Secret Manager) signs server attestations. These are distinct key hierarchies.
### Cloud IAM
- Service account: `mcp-brain-server@ruv-dev.iam.gserviceaccount.com`
- Minimal permissions:
- `roles/datastore.user` on Firestore
- `roles/storage.objectAdmin` on the brain GCS bucket
- `roles/secretmanager.secretAccessor` on brain secrets
- `roles/monitoring.metricWriter` for metrics export
- Workload Identity Federation for keyless authentication from Cloud Run
### External HTTPS Load Balancer + Serverless NEG
Cloud Armor and Cloud CDN require an external HTTPS Load Balancer they do not attach directly to Cloud Run domain mappings. The production architecture uses:
1. **Serverless Network Endpoint Group (NEG)**: Points to the Cloud Run `mcp-brain-server` service
2. **External HTTPS Load Balancer**: Routes traffic through the serverless NEG to Cloud Run
3. **Cloud Armor WAF policy** (`brain-waf-policy`): Attached to the load balancer backend service
- Rate limiting: 1000 requests/minute per IP
- Geo-blocking: configurable (default: allow all)
- OWASP CRS rules: SQL injection, XSS, protocol attack protection
- Custom rules: block requests with known malicious patterns
4. **Cloud CDN**: Enabled on the load balancer for read-heavy endpoints (`/v1/memories/search`, `/v1/status`)
> **Note**: Cloud Run direct domain mapping (Path A) is used for initial development. Production public deployment (Path B) requires the external HTTPS LB + serverless NEG path for full Cloud Armor and CDN integration. See the deployment runbook for both paths.
### VPC Service Controls
- Service perimeter: `brain-perimeter`
- Restricted services: Firestore, Cloud Storage
- Access levels: Cloud Run service account only
- Prevents data exfiltration via unauthorized API calls
### Custom Domain
- Domain: `brain.ruv.io` (also accessible as `Δ.ruv.io`)
- SSL: Google-managed certificate (auto-renewal via load balancer)
- DNS: A record pointing to the load balancer's external IP (production) or CNAME to `ghs.googlehosted.com` (development)
- Both paths serve the same Cloud Run backend
## 5. Data Flow (RVF-Native)
### Client-Side (Export Path)
```
Local Learning Artifacts
PII Strip (rvf-pii-strip)
│ → Remove file paths, IPs, emails, API keys, usernames
│ → Generate RedactionLog (0x35) segment
SONA Embed
│ → Convert stripped content to SONA embedding (f32 × dim)
│ → Generate Vec (0x01) segment
DP Noise Injection (rvf-diff-privacy)
│ → Add calibrated Gaussian noise (ε=1.0, δ=1e-5)
│ → Generate DiffPrivacyProof (0x34) segment
RVF Package Assembly
│ → Assemble 11 segments (see Section 6)
│ → Compute SHAKE-256 content hashes for all segments
│ → Generate Manifest (0x05) segment directory
Ed25519 Sign
│ → Sign full container with contributor's private key
│ → Generate Crypto (0x0C) segment
HTTPS POST to brain.ruv.io/v1/memories
```
### Server-Side (Ingest Path)
```
HTTPS POST received
Signature Verify
│ → Ed25519 signature check against registered public key
│ → Reject if signature invalid
Witness Chain Verify
│ → Walk WITNESS_SEG (0x0A) chain from first to last operation
│ → Reject if chain is broken or tampered
Content Hash Verify
│ → Recompute SHAKE-256 for each segment payload
│ → Constant-time comparison with header hash
│ → Reject if any hash mismatch
PII Re-Check
│ → Run server-side PII detection on all string fields
│ → Reject if any PII detected (defense-in-depth)
Embedding Bounds Check
│ → Verify all embedding dimensions in [-10.0, 10.0]
│ → Reject if out-of-bounds (potential poisoning)
Reputation Gate
│ → Check contributor reputation score
│ → Apply rate limits based on reputation tier
│ → Cold-start contributors limited to 10 writes/hr
Store
│ → Write RVF container to GCS bucket
│ → Write metadata to brain_memories Firestore collection
│ → Write audit entry to brain_audit_log
│ → Update contributor stats in brain_contributors
```
## 6. RVF Segment Layout per Memory
Each shared memory is a complete RVF cognitive container containing 11 segments:
| Index | Segment Code | Segment Name | Purpose |
|-------|-------------|--------------|---------|
| 0 | `0x05` | Manifest | Segment directory lists all segments in this container, their offsets, and sizes. Serves as the table of contents. |
| 1 | `0x01` | Vec | SONA embedding vector (`f32 × dim`). The semantic representation of the shared knowledge, used for similarity search. |
| 2 | `0x07` | Meta | Metadata: title (human-readable summary), content (the actual knowledge), tags (categorization), category (domain). |
| 3 | `0x30` | TransferPrior | Domain transfer priors: compact Beta posteriors from `MetaThompsonEngine`, capturing cross-domain learning rates and prior beliefs. |
| 4 | `0x31` | PolicyKernel | Best configuration snapshot: population-based policy search results, including hyperparameter settings and fitness scores. |
| 5 | `0x32` | CostCurve | Acceleration data: convergence curves, learning rate schedules, and cost-to-accuracy trade-offs from domain expansion. |
| 6 | `0x33` | FederatedManifest | Export metadata: contributor pseudonym, export timestamp, included segment IDs, privacy budget spent (ε consumed), format version, nonce. |
| 7 | `0x34` | DiffPrivacyProof | Differential privacy attestation: ε/δ values, noise mechanism (Gaussian), sensitivity bounds, clipping parameters, RDP accountant state. |
| 8 | `0x35` | RedactionLog | PII stripping attestation: count of redactions by type, SHAKE-256 hash of pre-redaction content (proves scanning without revealing content), rules that fired. |
| 9 | `0x0A` | Witness | WITNESS_SEG operation chain: SHAKE-256 hashes linking every operation (PII strip embed DP noise package sign). Tamper-evident audit trail. |
| 10 | `0x0C` | Crypto | Ed25519 signature footer: covers the entire container (all preceding segments). Verifiable without accessing payload content. |
Total container size is typically 2-50 KB depending on embedding dimension and metadata richness. The 1MB payload limit provides ample headroom for future segment additions.
## 7. Multi-Factor Reputation System
The reputation system is derived from `ReputationScore` in `ruvector-economy-wasm` and adapted for the public Shared Brain context.
### Composite Score Formula
```
composite = accuracy² × uptime × stake_weight
```
Where:
- **accuracy**: Fraction of a contributor's memories that have positive quality scores (upvotes > downvotes). Squared to heavily penalize inaccurate contributors.
- **uptime**: Fraction of the last 30 days that the contributor has been active (at least one read or write per day). Rewards consistent participation.
- **stake_weight**: `min(1.0, log10(stake + 1) / 6)` — logarithmic scaling of contributor's stake (number of high-quality contributions). Caps at 1.0 to prevent plutocratic dominance.
### Accuracy Computation
Accuracy uses a Bayesian prior Beta(1,1) rather than raw fraction to prevent early luck from distorting scores. The expected accuracy is `(upvotes + 1) / (upvotes + downvotes + 2)`, which smoothly converges to the true ratio as observations increase. A minimum of 5 observations is required before accuracy influences the composite score; below that, accuracy defaults to the prior mean (0.5).
### Cold Start
New contributors start with a reputation of 0.1. This allows them to participate immediately but with limited influence:
- Cold-start contributors are rate-limited to 10 writes/hour (vs. 100 for established contributors)
- Their contributions receive reduced weight in federated aggregation
- They cannot vote on other contributors' memories until they have at least 5 accepted contributions
### Decay
Reputation decays at 5% per month for inactive contributors. This prevents:
- Sybil accounts from accumulating reputation over time without contributing
- Abandoned pseudonyms from retaining disproportionate influence
- Stale reputation scores from misrepresenting current contributor quality
### Poisoning Penalty
Contributors who receive more than 5 downvotes with an average quality score below 0.2 have their reputation halved. This penalty:
- Is applied immediately upon crossing the threshold
- Stacks multiplicatively (repeated violations compound)
- Can be recovered from through sustained high-quality contributions
- Triggers a review entry in `brain_audit_log`
### Storage
Reputation data is stored in the `brain_contributors` Firestore collection:
```json
{
"pseudonym": "contributor_abc123",
"public_key": "ed25519_base64_...",
"reputation": 0.72,
"accuracy": 0.85,
"uptime": 0.93,
"stake": 47,
"contributions_count": 52,
"downvote_count": 1,
"last_active": "2026-02-27T14:30:00Z",
"created_at": "2026-01-15T09:00:00Z",
"penalties": []
}
```
## 8. Hive-Mind Emergence
The Shared Brain does not impose a fixed taxonomy on knowledge. Instead, structure emerges organically from the collective contributions through several mechanisms:
### Federated Aggregation with Byzantine Tolerance
The `FederatedAggregator` computes weighted FedAvg across all contributions in a given domain:
- Weights are derived from contributor reputation and contribution quality
- 2-sigma outlier filter excludes contributions that deviate significantly from the consensus
- This provides Byzantine fault tolerance: up to 1/3 of contributors can be adversarial without corrupting the aggregate
- Aggregation rounds run periodically (configurable, default: every 100 new contributions or 24 hours)
### MinCut Partitioning
`CognitiveMinCutEngine` applies spectral graph analysis to the knowledge graph:
- Memories are nodes; similarity and citation relationships are edges
- MinCut partitioning identifies natural knowledge domains — clusters that are internally cohesive but loosely coupled to other clusters
- These emergent domains are not predefined categories but arise from the actual structure of shared knowledge
- `CutCertificate` proves that each partition preserves structural integrity (no important edges were severed)
### SNN Attractor Dynamics
The `CognitiveMinCutEngine` uses Spiking Neural Network (SNN) attractor dynamics to assess cluster quality:
- Each knowledge cluster is modeled as an attractor basin
- Stable attractors represent well-established knowledge domains
- Unstable or rapidly shifting attractors indicate emerging or contested knowledge areas
- This provides an early warning system for knowledge drift or poisoning attempts
### Population-Based Policy Search
Best practices and configuration knowledge evolve through population-based search:
- `PolicyKernel` segments from multiple contributors form a population
- Fitness is determined by downstream quality scores (did this configuration help?)
- Tournament selection and mutation produce improved policy kernels over time
- The best kernels are promoted to the Pareto front
### Pareto Front Maintenance
The system maintains a Pareto front balancing two objectives:
- **Quality**: How accurate and useful is the knowledge? (measured by votes and downstream outcomes)
- **Novelty**: How different is this knowledge from existing entries? (measured by embedding distance from centroid)
- This prevents the knowledge base from collapsing into a single consensus viewpoint
- Novel but unverified knowledge is retained at lower confidence until validated
### Curiosity Bonus
To incentivize exploration of new knowledge domains, the system applies a curiosity bonus:
- Contributions to under-represented domains receive a quality multiplier
- This encourages coverage across the full problem space rather than concentration in popular areas
- The bonus decays as a domain reaches sufficient coverage (measured by embedding space density)
## 9. MCP Tool Interface
The `mcp-brain` crate provides 10 MCP tools for Claude Code sessions to interact with the Shared Brain. Tools are accessed via JSON-RPC 2.0 over stdio.
Registration: `claude mcp add mcp-brain -- cargo run -p mcp-brain`
### Tool Reference
#### 1. brain_share
Share learning with the collective brain.
**Parameters**:
- `title` (string, required): Human-readable summary of the knowledge
- `content` (string, required): The actual knowledge content
- `tags` (string[], optional): Categorization tags
- `category` (string, optional): Knowledge domain
- `embedding` (float[], optional): Pre-computed SONA embedding (auto-computed if omitted)
**Returns**: `{ id, witness_hash, quality_score, segments_stored }`
**Security**: Client-side PII strip, DP noise injection, Ed25519 signing applied automatically before upload. Server re-validates all guarantees.
#### 2. brain_search
Semantic search across all shared knowledge using SONA embeddings.
**Parameters**:
- `query` (string, required): Natural language search query
- `limit` (integer, optional, default: 10): Maximum results
- `category` (string, optional): Filter by knowledge domain
- `min_quality` (float, optional, default: 0.0): Minimum quality score threshold
**Returns**: `[{ id, title, content, quality_score, similarity, contributor, tags }]`
#### 3. brain_get
Retrieve a specific memory with full provenance information.
**Parameters**:
- `id` (string, required): Memory identifier
**Returns**: `{ id, title, content, quality_score, observations, contributor, witness_chain, segments, created_at, updated_at }`
#### 4. brain_vote
Quality-gate a memory via Bayesian score update.
**Parameters**:
- `id` (string, required): Memory identifier
- `vote` (string, required): `"up"` or `"down"`
- `reason` (string, optional): Explanation for the vote
**Returns**: `{ new_quality_score, total_observations, voter_reputation_delta }`
**Mechanism**: Votes update the memory's quality score using Bayesian updating. The voter's reputation influences vote weight. Voting also updates the voter's own reputation (accurate votes on eventually-consensus memories increase voter reputation).
#### 5. brain_transfer
Apply learned priors from the shared brain to a local task domain.
**Parameters**:
- `source_domain` (string, required): Domain to transfer from
- `target_domain` (string, required): Local domain to transfer to
- `min_quality` (float, optional, default: 0.5): Minimum quality threshold for priors
- `min_observations` (integer, optional, default: 10): Minimum evidence threshold
**Returns**: `{ transferred_priors, transferred_policies, estimated_acceleration, confidence }`
**Mechanism**: Extracts high-quality `TransferPrior (0x30)` and `PolicyKernel (0x31)` segments from the specified source domain, applies sqrt-scaling dampening (same as `MetaThompsonEngine::init_domain_with_transfer`), and returns them for local integration.
#### 6. brain_drift
Check knowledge drift in a specific domain.
**Parameters**:
- `domain` (string, optional): Specific domain to check (all domains if omitted)
- `window` (string, optional, default: "7d"): Time window for drift analysis
**Returns**: `{ domains: [{ domain, centroid_shift, anomaly_detected, new_clusters, fragmentation_score }] }`
#### 7. brain_partition
Get knowledge partitioned by MinCut spectral analysis.
**Parameters**:
- `domain` (string, optional): Specific domain (all if omitted)
- `min_cluster_size` (integer, optional, default: 3): Minimum memories per cluster
**Returns**: `{ partitions: [{ cluster_id, memories, centroid, cohesion_score, cut_certificate }] }`
#### 8. brain_list
List recent memories with optional filtering.
**Parameters**:
- `limit` (integer, optional, default: 20): Maximum results
- `category` (string, optional): Filter by category
- `contributor` (string, optional): Filter by contributor pseudonym
- `sort` (string, optional, default: "recent"): Sort order (`"recent"`, `"quality"`, `"trending"`)
**Returns**: `[{ id, title, quality_score, observations, contributor, category, created_at }]`
#### 9. brain_delete
Delete own contribution from the shared brain.
**Parameters**:
- `id` (string, required): Memory identifier (must be owned by the caller)
**Returns**: `{ deleted: true, audit_entry_id }`
**Security**: Only the original contributor (verified by Ed25519 signature) can delete their own memories. Deletion is recorded in `brain_audit_log`.
#### 10. brain_status
Get system health and statistics.
**Parameters**: None
**Returns**: `{ total_memories, total_contributors, active_contributors_30d, avg_quality_score, domains, uptime, storage_used, rate_limits_remaining }`
## 10. REST API Interface
The `mcp-brain-server` crate exposes a REST API (axum-based) deployed on Cloud Run at `brain.ruv.io`.
All endpoints require authentication via Bearer token (`Authorization: Bearer <api-key>`) or Ed25519 signed requests.
### Endpoint Reference
#### GET /v1/health
Health check endpoint. Unauthenticated.
**Response** (200):
```json
{
"status": "healthy",
"version": "0.1.0",
"uptime_seconds": 86400,
"firestore": "connected",
"gcs": "connected"
}
```
#### POST /v1/memories
Submit a new shared memory. Accepts an RVF cognitive container.
**Request**: Binary RVF container or JSON envelope:
```json
{
"title": "Optimal LoRA rank for code review",
"content": "rank=16 with alpha=32 converges 2x faster on code review tasks...",
"tags": ["lora", "code-review", "optimization"],
"category": "ml-tuning",
"embedding": [0.12, -0.34, ...],
"rvf_container_b64": "<base64-encoded RVF container>"
}
```
**Response** (201):
```json
{
"id": "mem_abc123",
"witness_hash": "shake256_...",
"quality_score": 0.5,
"segments_stored": 10
}
```
**Rate limit**: 100 writes/hour per contributor (10 for cold-start).
#### GET /v1/memories/search
Semantic search across shared knowledge.
**Query parameters**:
- `q` (string, required): Search query
- `limit` (integer, optional, default: 10)
- `category` (string, optional)
- `min_quality` (float, optional, default: 0.0)
**Response** (200):
```json
{
"results": [
{
"id": "mem_abc123",
"title": "Optimal LoRA rank for code review",
"content": "rank=16 with alpha=32...",
"quality_score": 0.87,
"similarity": 0.94,
"contributor": "contributor_xyz",
"tags": ["lora", "code-review"]
}
],
"total": 42
}
```
#### GET /v1/memories/{id}
Retrieve a specific memory with full provenance.
**Response** (200):
```json
{
"id": "mem_abc123",
"title": "Optimal LoRA rank for code review",
"content": "rank=16 with alpha=32...",
"quality_score": 0.87,
"observations": 15,
"contributor": "contributor_xyz",
"witness_chain": ["shake256_aaa...", "shake256_bbb..."],
"segments": ["Manifest", "Vec", "Meta", "TransferPrior", "PolicyKernel", "CostCurve", "FederatedManifest", "DiffPrivacyProof", "RedactionLog", "Witness", "Crypto"],
"created_at": "2026-02-27T10:00:00Z",
"updated_at": "2026-02-27T14:30:00Z"
}
```
#### POST /v1/memories/{id}/vote
Vote on a memory's quality.
**Request**:
```json
{
"vote": "up",
"reason": "Verified this LoRA configuration independently"
}
```
**Response** (200):
```json
{
"new_quality_score": 0.89,
"total_observations": 16,
"voter_reputation_delta": 0.01
}
```
#### DELETE /v1/memories/{id}
Delete own contribution. Requires Ed25519 signature matching the original contributor.
**Response** (200):
```json
{
"deleted": true,
"audit_entry_id": "audit_del_789"
}
```
**Response** (403): If the caller is not the original contributor.
#### POST /v1/transfer
Apply transfer learning from shared knowledge to a target domain.
**Request**:
```json
{
"source_domain": "code-review",
"target_domain": "security-audit",
"min_quality": 0.5,
"min_observations": 10
}
```
**Response** (200):
```json
{
"transferred_priors": 8,
"transferred_policies": 3,
"estimated_acceleration": 1.7,
"confidence": 0.82,
"rvf_container_b64": "<base64-encoded RVF container with transfer segments>"
}
```
#### GET /v1/drift
Check knowledge drift across domains.
**Query parameters**:
- `domain` (string, optional): Specific domain (all if omitted)
- `window` (string, optional, default: "7d")
**Response** (200):
```json
{
"domains": [
{
"domain": "ml-tuning",
"centroid_shift": 0.12,
"anomaly_detected": false,
"new_clusters": 1,
"fragmentation_score": 0.08
}
]
}
```
#### GET /v1/partition
Get knowledge partitioned by MinCut spectral analysis.
**Query parameters**:
- `domain` (string, optional)
- `min_cluster_size` (integer, optional, default: 3)
**Response** (200):
```json
{
"partitions": [
{
"cluster_id": "cluster_001",
"memory_count": 12,
"centroid": [0.15, -0.22, ...],
"cohesion_score": 0.91,
"cut_certificate": "cert_sha256_..."
}
]
}
```
#### GET /v1/status
System health and statistics. Includes real computed avg_quality (from all memories' BetaParams) and drift_status (from DriftMonitor).
**Response** (200):
```json
{
"total_memories": 1247,
"total_contributors": 89,
"graph_nodes": 1247,
"graph_edges": 3842,
"cluster_count": 7,
"avg_quality": 0.71,
"drift_status": "healthy",
"lora_epoch": 42,
"lora_pending_submissions": 1,
"total_pages": 23,
"total_nodes": 5,
"total_votes": 892
}
```
### Brainpedia Endpoints (ADR-062)
| Method | Path | Purpose |
|--------|------|---------|
| POST | `/v1/pages` | Create a Brainpedia page (reputation-gated) |
| GET | `/v1/pages/{id}` | Get page with delta log and evidence |
| POST | `/v1/pages/{id}/deltas` | Submit a delta (evidence-gated) |
| GET | `/v1/pages/{id}/deltas` | List deltas for a page |
| POST | `/v1/pages/{id}/evidence` | Add evidence to a page |
| POST | `/v1/pages/{id}/promote` | Promote Draft to Canonical (consensus-gated) |
### WASM Executable Nodes (ADR-063)
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/v1/nodes` | List published (non-revoked) nodes |
| POST | `/v1/nodes` | Publish a WASM node (reputation-gated, V1 ABI validated) |
| GET | `/v1/nodes/{id}` | Get node metadata + conformance vectors |
| GET | `/v1/nodes/{id}.wasm` | Download WASM binary (immutable cache headers) |
| POST | `/v1/nodes/{id}/revoke` | Revoke a node (original publisher only) |
### Federated MicroLoRA Endpoints (ADR-060/061)
| Method | Path | Purpose |
|--------|------|---------|
| GET | `/v1/lora/latest` | Serve current consensus MicroLoRA weights |
| POST | `/v1/lora/submit` | Submit session LoRA weights for federation |
| GET | `/v1/training/preferences` | Export preference pairs for DPO/reward model training |
**Total: 25 endpoints** across core (12), Brainpedia (6), WASM nodes (5), and LoRA/training (3).
### Persistence Architecture
All state uses a write-through cache pattern:
- **DashMap**: In-memory hot cache for all read operations
- **Firestore REST**: Durable backend, written on every mutation when `FIRESTORE_URL` is set
- **Startup hydration**: `load_from_firestore()` populates cache from Firestore on boot
- **Local-only mode**: When `FIRESTORE_URL` is absent, operates as in-memory only (dev/test)
Health endpoint reports `persistence_mode: "firestore"` or `"local-only"`.
## 11. Deployment Runbook
### Prerequisites
- Google Cloud SDK (`gcloud`) installed and authenticated as `ruv@ruv.net`
- Project: `ruv-dev` (project number: redacted)
- Region: `us-central1`
- Service account: `mcp-brain-server@ruv-dev.iam.gserviceaccount.com`
- GCS bucket: `ruvector-brain-us-central1`
- Firestore: `(default)` database
- Secrets: `brain-api-key`, `brain-signing-key`, `cloudflare-api-token` in Secret Manager
- Domain: `brain.ruv.io` (Cloudflare DNS — configured separately)
### deploy.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
PROJECT="ruv-dev"
REGION="us-central1"
SERVICE="ruvbrain"
BUCKET="ruvector-brain-${REGION}"
DB="(default)"
echo "=== Step 1: GCS Bucket Creation ==="
gcloud storage buckets create "gs://${BUCKET}" \
--project="${PROJECT}" \
--location="${REGION}" \
--uniform-bucket-level-access \
--public-access-prevention
# Lifecycle: archive after 90 days, delete after 365 days
cat > /tmp/lifecycle.json <<'LIFECYCLE'
{
"rule": [
{
"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
"condition": {"age": 90}
},
{
"action": {"type": "Delete"},
"condition": {"age": 365}
}
]
}
LIFECYCLE
gcloud storage buckets update "gs://${BUCKET}" \
--lifecycle-file=/tmp/lifecycle.json
echo "=== Step 2: Firestore Setup ==="
gcloud firestore databases create \
--project="${PROJECT}" \
--database="${DB}" \
--location="${REGION}" \
--type=firestore-native
# Create composite indexes for common queries
gcloud firestore indexes composite create \
--project="${PROJECT}" \
--database="${DB}" \
--collection-group="brain_memories" \
--field-config="field-path=category,order=ASCENDING" \
--field-config="field-path=quality_score,order=DESCENDING"
gcloud firestore indexes composite create \
--project="${PROJECT}" \
--database="${DB}" \
--collection-group="brain_memories" \
--field-config="field-path=contributor,order=ASCENDING" \
--field-config="field-path=created_at,order=DESCENDING"
echo "=== Step 3: Secret Manager Setup ==="
# Create secrets (values should be provided securely, not in script)
echo -n "${BRAIN_API_KEY:-$(openssl rand -hex 32)}" | \
gcloud secrets create brain-api-key \
--project="${PROJECT}" \
--replication-policy="automatic" \
--data-file=-
echo -n "${BRAIN_SIGNING_KEY:-placeholder}" | \
gcloud secrets create brain-signing-key \
--project="${PROJECT}" \
--replication-policy="automatic" \
--data-file=-
# Grant Cloud Run service account access
SA="mcp-brain-server@${PROJECT}.iam.gserviceaccount.com"
gcloud secrets add-iam-policy-binding brain-api-key \
--project="${PROJECT}" \
--member="serviceAccount:${SA}" \
--role="roles/secretmanager.secretAccessor"
gcloud secrets add-iam-policy-binding brain-signing-key \
--project="${PROJECT}" \
--member="serviceAccount:${SA}" \
--role="roles/secretmanager.secretAccessor"
echo "=== Step 4: Service Account and IAM ==="
gcloud iam service-accounts create "mcp-brain-server" \
--project="${PROJECT}" \
--display-name="MCP Brain Server"
# Firestore access
gcloud projects add-iam-policy-binding "${PROJECT}" \
--member="serviceAccount:${SA}" \
--role="roles/datastore.user"
# GCS access
gcloud storage buckets add-iam-policy-binding "gs://${BUCKET}" \
--member="serviceAccount:${SA}" \
--role="roles/storage.objectAdmin"
# Monitoring
gcloud projects add-iam-policy-binding "${PROJECT}" \
--member="serviceAccount:${SA}" \
--role="roles/monitoring.metricWriter"
echo "=== Step 5: Cloud Run Deployment ==="
# Note: --no-allow-unauthenticated requires IAM auth at the Cloud Run level.
# Public access is handled via the external HTTPS LB (Step 6b).
# The /v1/health endpoint is exempt via IAM invoker on the LB.
gcloud run deploy "${SERVICE}" \
--project="${PROJECT}" \
--region="${REGION}" \
--image="gcr.io/${PROJECT}/${SERVICE}:latest" \
--service-account="${SA}" \
--cpu=2 \
--memory=2Gi \
--min-instances=0 \
--max-instances=10 \
--concurrency=80 \
--port=8080 \
--set-env-vars="GOOGLE_CLOUD_PROJECT=${PROJECT},GCS_BUCKET=${BUCKET},FIRESTORE_URL=https://firestore.googleapis.com/v1/projects/${PROJECT}/databases/${DB}/documents,RUST_LOG=info" \
--set-secrets="BRAIN_API_KEY=brain-api-key:latest,BRAIN_SIGNING_KEY=brain-signing-key:latest" \
--allow-unauthenticated
echo "=== Step 6a: Custom Domain Mapping (Dev Path A — no Cloud Armor/CDN) ==="
echo "For development only. Production uses Step 6b."
# gcloud run domain-mappings create \
# --project="${PROJECT}" \
# --region="${REGION}" \
# --service="${SERVICE}" \
# --domain="brain.ruv.io"
echo "=== Step 6b: External HTTPS LB + Serverless NEG (Production Path B) ==="
# This path provides Cloud Armor WAF + Cloud CDN integration.
# Create serverless NEG pointing to Cloud Run
gcloud compute network-endpoint-groups create brain-neg \
--project="${PROJECT}" \
--region="${REGION}" \
--network-endpoint-type=serverless \
--cloud-run-service="${SERVICE}"
# Create backend service
gcloud compute backend-services create brain-backend \
--project="${PROJECT}" \
--global \
--protocol=HTTPS \
--port-name=http
# Add NEG to backend
gcloud compute backend-services add-backend brain-backend \
--project="${PROJECT}" \
--global \
--network-endpoint-group=brain-neg \
--network-endpoint-group-region="${REGION}"
# Create URL map
gcloud compute url-maps create brain-lb \
--project="${PROJECT}" \
--default-service=brain-backend
# Reserve static IP
gcloud compute addresses create brain-ip \
--project="${PROJECT}" \
--global
# Create managed SSL certificate for brain.ruv.io
gcloud compute ssl-certificates create brain-cert \
--project="${PROJECT}" \
--domains="brain.ruv.io" \
--global
# Create HTTPS proxy
gcloud compute target-https-proxies create brain-https-proxy \
--project="${PROJECT}" \
--url-map=brain-lb \
--ssl-certificates=brain-cert
# Create forwarding rule
BRAIN_IP=$(gcloud compute addresses describe brain-ip --project="${PROJECT}" --global --format='value(address)')
gcloud compute forwarding-rules create brain-https-rule \
--project="${PROJECT}" \
--global \
--target-https-proxy=brain-https-proxy \
--ports=443 \
--address="${BRAIN_IP}"
echo ""
echo "DNS Configuration Required:"
echo " Add an A record for brain.ruv.io pointing to ${BRAIN_IP}"
echo " SSL certificate will auto-provision once DNS resolves."
echo ""
echo "=== Step 7: Cloud Armor WAF (attached to LB backend) ==="
gcloud compute security-policies create brain-waf-policy \
--project="${PROJECT}" \
--description="WAF policy for brain.ruv.io"
# Rate limiting rule (edge gate — rejects botnets before reaching Cloud Run)
gcloud compute security-policies rules create 1000 \
--project="${PROJECT}" \
--security-policy="brain-waf-policy" \
--expression="true" \
--action="rate-based-ban" \
--rate-limit-threshold-count=1000 \
--rate-limit-threshold-interval-sec=60 \
--ban-duration-sec=300
# OWASP CRS rules
gcloud compute security-policies rules create 2000 \
--project="${PROJECT}" \
--security-policy="brain-waf-policy" \
--expression="evaluatePreconfiguredExpr('sqli-v33-stable')" \
--action="deny-403"
gcloud compute security-policies rules create 2001 \
--project="${PROJECT}" \
--security-policy="brain-waf-policy" \
--expression="evaluatePreconfiguredExpr('xss-v33-stable')" \
--action="deny-403"
# Attach Cloud Armor to the backend service
gcloud compute backend-services update brain-backend \
--project="${PROJECT}" \
--global \
--security-policy=brain-waf-policy
# Enable Cloud CDN on the backend (read-heavy caching)
gcloud compute backend-services update brain-backend \
--project="${PROJECT}" \
--global \
--enable-cdn
echo "=== Step 8: Monitoring and Alerting ==="
# Uptime check
gcloud monitoring uptime create "brain-health-check" \
--project="${PROJECT}" \
--display-name="Brain Health Check" \
--resource-type="uptime-url" \
--hostname="brain.ruv.io" \
--path="/v1/health" \
--check-interval=60s
# Alert policy for error rate
cat > /tmp/alert-policy.json <<'ALERT'
{
"displayName": "Brain Server Error Rate > 5%",
"conditions": [
{
"displayName": "Error Rate",
"conditionThreshold": {
"filter": "resource.type=\"cloud_run_revision\" AND resource.labels.service_name=\"mcp-brain-server\" AND metric.type=\"run.googleapis.com/request_count\" AND metric.labels.response_code_class=\"5xx\"",
"aggregations": [
{
"alignmentPeriod": "300s",
"perSeriesAligner": "ALIGN_RATE"
}
],
"comparison": "COMPARISON_GT",
"thresholdValue": 0.05,
"duration": "300s"
}
}
],
"combiner": "OR"
}
ALERT
gcloud alpha monitoring policies create \
--project="${PROJECT}" \
--policy-from-file=/tmp/alert-policy.json
echo "=== Deployment Complete ==="
echo "Service URL: https://brain.ruv.io"
echo "Health check: https://brain.ruv.io/v1/health"
echo "Status: https://brain.ruv.io/v1/status"
```
### Post-Deployment Verification
```bash
# Verify health endpoint
curl -s https://brain.ruv.io/v1/health | jq .
# Verify status endpoint
curl -s -H "Authorization: Bearer ${BRAIN_API_KEY}" \
https://brain.ruv.io/v1/status | jq .
# Test memory submission (with test data)
curl -s -X POST https://brain.ruv.io/v1/memories \
-H "Authorization: Bearer ${BRAIN_API_KEY}" \
-H "Content-Type: application/json" \
-d '{"title":"test","content":"deployment verification","tags":["test"]}' | jq .
# Verify Firestore connectivity
gcloud firestore documents list \
--project="${PROJECT}" \
--database="brain" \
--collection="brain_memories" \
--limit=1
# Verify GCS bucket
gcloud storage ls "gs://${BUCKET}/" --limit=5
```
## 12. Compliance
### GDPR (General Data Protection Regulation)
**Article 25 — Privacy by Design**: PII stripping is mandatory and automatic. No personal data leaves the client without passing through the three-stage PII pipeline (detection, redaction, attestation). Differential privacy noise injection provides mathematical guarantees against re-identification. Server-side PII re-checking provides defense-in-depth.
**Article 17 — Right to Erasure**: Contributors can delete their own memories via `brain_delete` / `DELETE /v1/memories/{id}`. Pseudonym revocation cascades to remove all associated contributions from `brain_memories`, `brain_graph_edges`, and GCS objects. The `brain_audit_log` retains a record of the deletion (without the deleted content) for compliance audit purposes.
**Article 15 — Right of Access**: Contributors can retrieve all their contributions via `brain_list` / `GET /v1/memories/search?contributor={pseudonym}`.
**Article 20 — Data Portability**: Contributors can export their contributions as standard RVF containers via `brain_get` / `GET /v1/memories/{id}`.
### CCPA (California Consumer Privacy Act)
**Section 1798.105 — Right to Delete**: Deletion requests are honored via the same `brain_delete` mechanism as GDPR Article 17. Pseudonym revocation removes all associated data within 30 days.
**Section 1798.100 — Right to Know**: Contributors can access all data associated with their pseudonym via the API.
**Section 1798.110 — Right to Know Categories**: The system stores only: SONA embeddings, metadata (title, content, tags, category), reputation scores, and audit logs. No raw personal data is stored.
### Pseudonym Revocation
Revoking a contributor pseudonym triggers:
1. Immediate exclusion from future aggregation rounds
2. Removal of all memories from `brain_memories` Firestore collection
3. Removal of all graph edges from `brain_graph_edges`
4. Deletion of all RVF containers from GCS bucket
5. Contributor record in `brain_contributors` is marked as revoked (not deleted, for audit)
6. Audit entry in `brain_audit_log` recording the revocation
### Data Retention
| Data Type | Active Retention | Archive | Deletion |
|-----------|-----------------|---------|----------|
| Shared memories | Indefinite (while active) | 90 days after last access | 365 days after archival |
| RVF containers (GCS) | Standard storage | Nearline after 90 days | Deleted after 365 days |
| Contributor profiles | While active | N/A | On pseudonym revocation |
| Audit logs | 365 days | N/A | Deleted after 365 days |
| Graph edges | While connected memories exist | N/A | Cascaded on memory deletion |
### PII Guarantees
1. **No PII in storage**: All data in Firestore and GCS has been PII-stripped (client-side) and PII-re-checked (server-side)
2. **No PII in embeddings**: SONA embeddings are generated from PII-stripped content with DP noise injection
3. **No PII in logs**: Audit logs record operation hashes and pseudonyms, never raw content
4. **No PII in transit**: All communication is HTTPS with TLS 1.3; request bodies contain only processed data
5. **Attestation**: Every memory includes a `RedactionLog (0x35)` segment proving PII stripping was performed, and a `DiffPrivacyProof (0x34)` segment attesting to the privacy parameters applied
## 13. Cognitive Integration Roadmap
This section documents the phased replacement of homebrew algorithms with production crates from the RuVector cognitive stack.
### Phase 1 — Semantic Foundation
- `sona::SonaEngine` replaces SHAKE-256 hash embeddings with real semantic vectors
- `ruvector-gnn::RuvectorLayer` + `RuvectorQuery` for GNN message-passing and HNSW-backed search
- `ruvector-solver::ForwardPushSolver` for O(1/e) personalized PageRank relevance propagation
### Phase 2 — Graph Intelligence
- `ruvector-mincut::MinCutBuilder` + `DynamicMinCut` replaces Union-Find with real subpolynomial min-cut partitioning
- `CutCertificate` provides verifiable proof of partition integrity
- `ruvector-delta-core::VectorDelta` for precise embedding drift detection with sparse/dense delta types
### Phase 3 — Neuromorphic Substrate
- `ruvector-nervous-system::ModernHopfield` for content-addressable memory recall (exponential capacity)
- `DentateGyrus` for hippocampal pattern separation (128D to 10000D, <1% collision rate)
- `Hypervector` + `HdcMemory` for binary hyperdimensional first-pass filtering (100x speedup)
### Phase 4 — Adaptive Ranking
- `ruvector-attention::TopologyGatedAttention` with coherence-gated mode selection (Stable/Cautious/Freeze)
- 46 attention mechanisms available via `AttentionBuilder` pipeline
- `ruvector-domain-expansion::DomainExpansionEngine` for real Meta Thompson Sampling transfer with dampened priors
### Phase 5 — Emergent Intelligence
- `cognitum-gate-kernel::TileState` + `EvidenceAccumulator` for 256-tile distributed quality consensus
- SONA continuous learning loops (InstantLoop <100us, BackgroundLoop with EWC++ forgetting prevention)
- `ruvector-solver::SolverRouter` auto-selects algorithm based on graph sparsity profile
### Module Migration Map
| Brain Module | Previous (Homebrew) | Now Uses (Real Crate) |
|---|---|---|
| embed.rs | SHAKE-256 hash -> random 128D | sona::SonaEngine semantic embeddings |
| graph.rs | Union-Find + brute cosine scan | ruvector-mincut::DynamicMinCut + ruvector-solver::ForwardPush |
| cognitive.rs | Euclidean distance threshold | ruvector-nervous-system (Hopfield + DentateGyrus + HDC) |
| ranking.rs | Static 0.5/0.3/0.2 weights | ruvector-attention::TopologyGatedAttention |
| drift.rs | CV on consecutive distances | ruvector-delta-core::VectorDelta |
| aggregate.rs | 2-sigma filter only | reputation-weighted + byzantine tolerance |
| transfer route | Mock response | ruvector-domain-expansion::DomainExpansionEngine |
| pipeline.rs | 3 regex PII rules | 12-rule PII + linked witness chain |
| verify.rs | inline sha3/ed25519-dalek | rvf-crypto (shake256_256, witness chains) — ADR-075 |
| verify.rs (PII) | 8 string patterns | rvf-federation::PiiStripper (12 regex rules) — ADR-075 |
| routes.rs (DP) | Not implemented | rvf-federation::DiffPrivacyEngine — ADR-075 |
| pipeline.rs (RVF) | Client-provided only | rvf-wire::write_segment (server-side build) — ADR-075 |
| routes.rs (cache) | Not implemented | rvf-runtime::NegativeCache — ADR-075 |
### Implementation Status (Completed)
All 8 modules have been migrated to production crate integrations:
- **embed.rs**: `SonaEngineBuilder` with MicroLoRA, `find_patterns()` for learned centroid reuse, SHAKE-256 fallback. 8 tests.
- **graph.rs**: `ForwardPushSolver` (alpha=0.85, epsilon=1e-4) for PPR-boosted search, `CsrMatrix<f64>` built from adjacency. Cosine + PageRank hybrid scoring. Union-Find partition retained for efficiency.
- **cognitive.rs**: `ModernHopfield::new(dim, 1.0)` for associative recall, `DentateGyrus::new(dim, dim*10, k, 42)` for pattern separation with `catch_unwind` safety, `HdcMemory` for binary similarity. 6 tests.
- **ranking.rs**: `TopologyGatedAttention::new(config)` with coherence-gated mode selection, `compute_gated()` for attention-weighted scoring, fallback to static weights on error.
- **drift.rs**: `VectorDelta::compute(old, new)` via `Delta` trait for precise sparse/dense delta, `l2_norm()` as distance metric, `is_identity()` for sparsity computation.
- **aggregate.rs**: `aggregate_weighted()` method for reputation-weighted Byzantine-tolerant FedAvg. 3 tests.
- **transfer route**: `DomainExpansionEngine::initiate_transfer()` + `verify_transfer()` with `TransferVerification` results. `scoreboard_summary()` for acceleration metrics.
- **pipeline.rs**: 12 PII regex patterns (paths, keys, tokens, emails, SSH, JWT, PEM, secrets, credentials, hosts). Linked `WitnessChain` with SHAKE-256. 14 tests.
Initial: 57 tests across both crates (28 server + 28 client + 1 doc-test).
### Optimization Pass (Completed)
Performance optimizations applied to the hot paths:
- **graph.rs**: Added `HashMap<Uuid, usize>` reverse index (`node_index`) for O(1) node position lookups, replacing O(n) linear scans on `node_ids` Vec. All edge-to-index conversions in `rebuild_csr`, `rebuild_mincut`, `partition_via_mincut`, and `pagerank_scores` now use the index. Deferred CSR cache rebuild via `csr_dirty` flag — CSR is only rebuilt on the next query, not after every `add_memory` call.
- **drift.rs**: Removed wasted `VectorDelta::compute` call in `record()` that computed and immediately discarded the delta on every embedding ingestion.
- **store.rs**: Deduplicated `cosine_similarity` — now imports from `graph.rs` instead of maintaining a local copy.
- **Compiler warnings**: Eliminated all 8 warnings across both brain crates (unused imports, unused fields, unused variables). Production fields renamed with `_` prefix to indicate deferred-use status.
### Federated MicroLoRA Communication Bridge (Completed)
The core challenge: how do independent Claude Code sessions effectively communicate through the brain? Raw hash embeddings are deterministic but not semantic — SHAKE-256 treats "rust programming" and "rust language" as completely unrelated. The solution is a two-stage embedding pipeline with federated weight exchange.
#### Stage 1: Structured Hash Features (Frozen Base)
Replaced the monolithic SHAKE-256 hash with multi-granularity signed hashing into disjoint subspaces:
```
EMBEDDING_DIM = 128
Subspace Allocation:
[0..42) Unigram features (33%) — individual word hashes
[42..84) Bigram features (33%) — consecutive word pair hashes
[84..128) Trigram features (34%) — consecutive word triple hashes
Signed Hashing:
Each n-gram → SHAKE-256(salt:n-gram) → (bucket_index, sign ∈ {+1, -1})
Sign reduces collision bias vs. unsigned counting
Short texts (<= 2 words) also add character trigrams at 0.5x weight
```
This is deterministic and identical across all sessions. Texts sharing n-grams now have measurably higher cosine similarity (verified in tests: "rust programming language features" vs "rust programming language syntax" > "cooking recipes for dinner tonight").
#### Stage 2: MicroLoRA Transform (Learned, Federated)
A rank-2 LoRA adapter transforms the frozen hash features before L2 normalization:
```
output = L2_normalize(features + scale × (features @ down_proj) @ up_proj)
down_proj: 128 × 2 = 256 f32s
up_proj: 2 × 128 = 256 f32s
Total: 512 f32s = 2048 bytes per session export
```
Weights are learned locally through SONA's trajectory-based learning, then periodically federated to/from the server via the `brain_sync` MCP tool.
#### Federation Protocol
**Pull** (download consensus): `GET /v1/lora/latest``ConsensusLoraWeights`
- Returns current epoch, contributor count, total evidence count
- Returns null weights if no consensus has been computed yet
**Push** (submit local weights): `POST /v1/lora/submit``LoraSubmitResponse`
- Rate limited as a write operation
- Passes through two-gate validation before acceptance
#### Two-Gate Aggregation Pipeline
**Gate A: Policy Validity** (per-submission, immediate reject):
1. Shape check: `down_proj.len() == hidden_dim × rank`, `up_proj.len() == rank × hidden_dim`
2. NaN/Inf scan: reject any non-finite values
3. Bound check: reject any weight outside [-2.0, 2.0]
4. Norm check: reject if L2 norm of either projection > 100
5. Evidence threshold: reject if `evidence_count < 5` (minimum training steps)
**Gate B: Robust Aggregation** (batch, on reaching min_submissions threshold):
1. Per-parameter median computation across all submissions
2. MAD (Median Absolute Deviation) for robust spread estimation
3. Trimmed mean: exclude parameters > 3×MAD from median
4. Reputation-weighted averaging: `weight = reputation × evidence_count`
5. Result stored as new consensus; previous consensus saved for rollback
#### Weight Drift Monitoring
After each aggregation round, L2 distance between new and previous consensus is computed. If drift exceeds 5.0 (empirical threshold for rank-2 on 128-dim), the consensus is automatically rolled back to the previous epoch.
#### MCP Tool: brain_sync
The 11th tool enables bidirectional weight exchange:
```json
{
"name": "brain_sync",
"description": "Sync local MicroLoRA weights with the shared brain",
"inputSchema": {
"properties": {
"direction": { "enum": ["pull", "push", "both"], "default": "both" }
}
}
}
```
- **pull**: Downloads consensus weights, applies to local BrainEmbedder
- **push**: Exports local SONA MicroLoRA weights, clips to [-2, 2], validates, submits
- **both**: Pull then push (default — bidirectional sync)
Returns: `{ pulled: bool, pushed: bool, direction: string, local_embed_count: u64 }`
#### Design Decisions
1. **Global MicroLoRA** (not per-domain): Single set of consensus weights for all knowledge. Per-domain heads can be added later when mincut partitions stabilize and we have enough per-domain evidence. Starting global avoids cold-start fragmentation.
2. **Weights-only submission** (not weights + trajectories): Sessions submit only LoRA weight deltas. EWC (Elastic Weight Consolidation) is done locally within SONA. Server relies on robust aggregation (median + trimmed mean) rather than trajectory replay. This keeps submissions small (2KB) and avoids transmitting potentially sensitive trajectory data.
#### Updated Test Counts
Total: **62 tests** across both crates (28 server + 33 client + 1 doc-test), all passing.
- New embed tests: `test_similar_texts_closer` (semantic), `test_disjoint_subspaces`, `test_signed_hash_distribution`, `test_lora_weights_validate`, `test_lora_forward_pass`, `test_consensus_import`
## 14. Deployment Status (2026-03-03)
### Live Metrics
| Metric | Value |
|--------|-------|
| Memories | 237 |
| Contributors | 17 |
| Graph nodes | 237 |
| Graph edges | 827 (threshold 0.55) |
| Clusters (MinCut) | 20 |
| Avg quality (Beta mean) | 0.73 |
| Total votes | 608 |
| LoRA epoch | 2 |
| Brainpedia pages | 8 |
| WASM nodes | 0 |
| Embedding engine | `ruvllm::RlmEmbedder` |
| Embedding dim | 128 |
| Search P@1 | **100%** (30/30) |
| Search P@3 | **100%** (30/30) |
| Persistence | Firestore (memories, contributors, votes, LoRA, pages) |
| Cloud Run revision | 00059 |
### Search Intelligence Stack
Six-layer hybrid search scoring pipeline:
1. **Keyword matching** (primary) — word-boundary matching with field weights (title 6×, tags 4×, category 3×, content 1×), phrase bonus up to 2.0, all-in-title bonus 0.6, keyword floor +1.0
2. **RLM neural embeddings** — RlmEmbedder context-aware embeddings (QueryConditioned for search, CorpusConditioned for storage), activated at 50+ corpus docs, re-embedded on startup for space consistency
3. **Graph PPR** — ForwardPushSolver personalized PageRank over 827-edge knowledge graph, 0.6×cosine + 0.4×PPR blend
4. **Query expansion** — 32 synonym rules for abbreviation expansion (ml→machine learning, gnn→graph neural network, etc.)
5. **Vote quality re-ranking** — Bayesian Beta mean from 608 votes as learning-to-rank signal
6. **Attention ranking** — TopologyGatedAttention post-processing with coherence-gated mode selection
### Persistence Across Restarts
All state survives Cloud Run cold starts:
- **Memories**: Firestore `brain_memories` → DashMap hydration
- **Contributors**: Firestore `brain_contributors` → DashMap hydration
- **Votes**: Firestore `brain_votes` → vote tracker + counter rebuild
- **LoRA**: Firestore `brain_lora` → consensus weights + epoch restore
- **Pages**: Firestore `brain_pages` + `brain_deltas` → Brainpedia hydration
- **Graph**: Rebuilt from hydrated memories on startup
- **Embeddings**: Re-embedded with RLM on startup for consistency