mirror of https://github.com/ruvnet/RuVector.git synced 2026-05-25 23:24:03 +00:00

Reuven a929fde654 feat(rvm): RVM — Coherence-Native Microhypervisor for the Agentic Age

Complete implementation of the RVM microhypervisor:

13 Rust crates (all #![no_std], #![forbid(unsafe_code)]):
- rvm-types: Foundation types (64-byte WitnessRecord, ~40 ActionKind variants)
- rvm-hal: AArch64 EL2 HAL (stage-2 page tables, PL011 UART, GICv2, timer)
- rvm-cap: Capability system (P1/P2 proof verification, derivation trees)
- rvm-witness: Witness logging (FNV-1a hash chain, ring buffer, replay)
- rvm-proof: Proof engine (3-tier, constant-time P2 evaluation)
- rvm-partition: Partition model (lifecycle, split/merge, IPC, device leases)
- rvm-sched: Scheduler (2-signal priority, SMP coordinator, switch hot path)
- rvm-memory: Memory tiers (buddy allocator, 4-tier, RLE compression)
- rvm-coherence: Coherence engine (Stoer-Wagner mincut, adaptive frequency)
- rvm-boot: Bare-metal boot (7-phase measured, EL2 entry, linker script)
- rvm-wasm: Agent runtime (7-state lifecycle, migration, quotas)
- rvm-security: Security gate (validation, attestation, DMA budget)
- rvm-kernel: Integration kernel (boot/tick/create/destroy)

602 tests, 0 failures, 0 clippy warnings.
21 criterion benchmarks (all ADR targets exceeded).
9 ADRs (132-140), 15 design constraints (DC-1 through DC-15).
11 security findings addressed.

Co-Authored-By: claude-flow <ruv@ruv.net>

2026-04-04 12:10:19 -04:00

49 KiB

Raw Permalink Blame History

RVM Hypervisor Core -- GOAP Plan

Goal-Oriented Action Planning for the RVM Coherence-Native Microhypervisor

Field	Value
Status	Draft
Date	2026-04-04
Scope	Research + Architecture + Implementation roadmap
Relates to	ADR-087 (RVM Cognition Kernel), ADR-106 (Kernel/RVF Integration), ADR-117 (Canonical MinCut)

0. Executive Summary

This document defines the GOAP (Goal-Oriented Action Planning) strategy for RVM Hypervisor Core -- a Rust-first, coherence-native microhypervisor that replaces the VM abstraction with coherence domains: graph-partitioned isolation units managed by dynamic min-cut, governed by proof-gated capabilities, and optimized for multi-agent edge computing.

RVM Hypervisor Core is NOT a KVM VMM. It is NOT a Linux module. It is a standalone hypervisor that boots bare metal, manages hardware directly, and uses coherence domains as its primary scheduling and isolation primitive. Traditional VMs are subsumed as a degenerate case (a coherence domain with a single opaque partition and no graph structure).

Current State Assessment

The RuVector project already has significant infrastructure in place:

ruvix kernel workspace -- 22 sub-crates, ~101K lines of Rust, 760 tests passing (Phase A complete)
ruvix-cap -- seL4-inspired capability system with derivation trees
ruvix-proof -- 3-tier proof engine (Reflex <100ns, Standard <100us, Deep <10ms)
ruvix-sched -- Coherence-aware scheduler with novelty boosting
ruvix-hal -- HAL traits for AArch64, RISC-V, x86 (trait definitions)
ruvix-aarch64 -- AArch64 boot, MMU stubs
ruvix-physmem -- Physical memory allocator
ruvix-boot -- 5-stage RVF boot with ML-DSA-65 signatures
ruvix-nucleus -- 12 syscalls, checkpoint/replay
ruvector-mincut -- Subpolynomial dynamic min-cut (the crown jewel)
ruvector-sparsifier -- Dynamic spectral graph sparsification
ruvector-solver -- Sublinear-time sparse solvers
ruvector-coherence -- Coherence scoring with spectral support
ruvector-raft -- Distributed consensus
ruvector-verified -- Formal verification with lean-agentic dependent types
cognitum-gate-kernel -- 256-tile coherence gate fabric (WASM)
qemu-swarm -- QEMU cluster simulation

Goal State

A running microhypervisor that:

Boots bare metal on QEMU virt (AArch64) to first witness in <250ms
Manages coherence domains (not VMs) as the primary isolation unit
Uses dynamic min-cut to partition, migrate, and reclaim partitions
Gates every privileged mutation with capability proofs
Emits a complete witness trail for every state change
Achieves hot partition switch in <10us
Reduces remote memory traffic by 20% via coherence-aware placement
Runs WASM agent partitions with zero-copy IPC
Recovers from faults without global reboot

1. Research Phase Goals

1.1 Bare-Metal Rust Hypervisors and Microkernels

Goal state: Deep understanding of existing Rust bare-metal systems to inform what to adopt, adapt, and discard.

Project	Why Study It	Key Takeaway for RVM
RustyHermit / Hermit	Unikernel in Rust; uhyve hypervisor boots directly. No Linux dependency.	Boot sequence patterns, QEMU integration, minimal device model
Theseus OS	Rust OS with live code swapping, cell-based memory. No address spaces.	Intralingual design, state spill prevention, namespace-based isolation
RedLeaf	Rust OS with language-level isolation, no hardware protection needed for safety.	Ownership-based isolation as alternative to page tables for trusted partitions
Tock OS	Embedded Rust OS with grant regions, capsule isolation.	Grant-based memory model maps to RVM region policies
Hubris	Oxide Computer's embedded Rust RTOS with static task allocation.	Panic isolation, supervisor/task fault model
Firecracker	KVM VMM in Rust. Anti-pattern for architecture (depends on KVM), but study for device model.	Minimal device model pattern, virtio implementation

Actions:

A1.1.1: Read Theseus OS OSDI 2020 paper -- extract cell/namespace isolation patterns
A1.1.2: Read RedLeaf OSDI 2020 paper -- extract ownership-based isolation model
A1.1.3: Study Hubris source for panic-isolated task model
A1.1.4: Study RustyHermit uhyve for bare-metal boot without KVM

Preconditions: None Effects: Design vocabulary for coherence domains established; isolation model decision data gathered

1.2 Capability-Based OS Designs

Goal state: Identify which capability model to adopt for proof-gated coherence domains.

System	Why Study It	Key Takeaway
seL4	Formally verified microkernel with capability-based access control. Gold standard.	Derivation trees, CNode design, retype mechanism. Already partially adopted in ruvix-cap.
CHERI	Hardware capability extensions (Arm Morello). Capabilities in registers.	Hardware-enforced bounds on pointers; could replace page table boundaries for fine-grained isolation
Barrelfish	Multikernel OS with capability system and message-passing IPC.	Per-core kernel model; capability transfer across cores maps to coherence domain migration
Fuchsia/Zircon	Capability-based microkernel from Google. Handles as capabilities.	Practical capability system at scale; job/process/thread hierarchy

Actions:

A1.2.1: Map seL4 CNode/Untyped/Retype to ruvix-cap derivation trees -- identify gaps
A1.2.2: Study CHERI Morello ISA for hardware-backed capability enforcement
A1.2.3: Study Barrelfish multikernel design for cross-core capability transfer
A1.2.4: Evaluate Fuchsia handle table design for runtime performance characteristics

Preconditions: A1.1 partially complete (understand isolation context) Effects: Capability model finalized; hardware vs. software enforcement decision made

1.3 Graph-Partitioned Scheduling

Goal state: Algorithm for coherence-pressure-driven scheduling that leverages ruvector-mincut.

Topic	Reference	Relevance
Graph-based task scheduling	Kwok & Ahmad survey (1999); modern DAG schedulers	Task dependency graphs as scheduling input
Spectral graph partitioning	Fiedler vectors, Normalized Cuts (Shi & Malik 2000)	Coherence domain boundary identification
Dynamic min-cut for placement	ruvector-mincut (our own subpolynomial algorithm)	Core algorithm for partition placement
Network flow scheduling	Coffman-Graham, Hu's algorithm	Precedence-constrained scheduling with graph structure
NUMA-aware scheduling	Linux NUMA balancing, AutoNUMA	Coherence-aware memory placement (lower-bound benchmark)

Actions:

A1.3.1: Formalize the coherence-pressure scheduling problem as a graph optimization
A1.3.2: Prove that dynamic min-cut provides bounded-approximation partition quality
A1.3.3: Design the scheduler tick: {observe graph state} -> {compute pressure} -> {select task}
A1.3.4: Benchmark ruvector-mincut update latency for partition-switch-time budget (<10us)

Preconditions: ruvector-mincut crate functional (it is) Effects: Scheduling algorithm specified; latency bounds proven

1.4 Memory Coherence Protocols

Goal state: Design a memory coherence model for coherence domains that eliminates false sharing and minimizes remote traffic.

Protocol	Why Study	Relevance
MOESI	Snooping protocol; AMD uses for multi-socket	Baseline understanding of cache coherence overhead
Directory-based (DASH, SGI Origin)	Scalable coherence for >4 sockets	Coherence domain as a directory entry abstraction
ARM AMBA CHI	Modern ARM coherence protocol for SoC	Target hardware protocol for Seed/Appliance chips
CXL (Compute Express Link)	Memory-semantic interconnect; CXL.mem for shared memory pools	Future interconnect for cross-chip coherence domains
Barrelfish message-passing	Eliminates shared memory entirely; replicate instead	Alternative: no coherence protocol, only message passing

Actions:

A1.4.1: Quantify coherence overhead: how much of current "VM exit" cost is actually coherence traffic
A1.4.2: Design coherence domain memory model: regions are either local (no coherence) or shared (proof-gated coherence)
A1.4.3: Prototype directory-based coherence for shared regions using ruvix-region policies
A1.4.4: Define the "coherence score" metric that drives scheduling and migration decisions

Preconditions: A1.3.1 (graph model defined) Effects: Memory model specified; coherence score formula defined

1.5 Formal Verification Approaches

Goal state: Identify which properties of the hypervisor can be formally verified, and with what tools.

Approach	Tool	What It Proves
seL4 style	Isabelle/HOL	Full functional correctness (gold standard, very expensive)
Kani/Verus	Rust-native model checkers	Bounded verification of Rust code; panic-freedom, overflow
Prusti	Viper + Rust	Pre/post condition verification with ownership
lean-agentic (ruvector-verified)	Lean 4 + Rust FFI	Dependent types for proof-carrying operations (we have this)
TLA+/P	Model checking	Protocol-level correctness (capability transfer, migration)

Actions:

A1.5.1: Use Kani to verify panic-freedom in ruvix-cap and ruvix-proof (no_std paths)
A1.5.2: Use ruvector-verified to generate proof obligations for capability derivation correctness
A1.5.3: Write TLA+ spec for coherence domain migration protocol
A1.5.4: Define the verification budget: which crates get full verification vs. testing-only

Preconditions: ruvector-verified crate functional Effects: Verification strategy decided; initial proofs for cap and proof crates

1.6 Agent Runtime Designs

Goal state: Design the WASM/agent partition interface that runs inside coherence domains.

Runtime	Why Study	Relevance
wasmtime	Production WASM runtime with Cranelift JIT	Reference for WASM execution in restricted environments
wasmer	WASM with multiple backends (Cranelift, LLVM, Singlepass)	Singlepass backend for predictable compilation time
lunatic	Erlang-like actor system built on WASM	Actor model inside WASM for agent isolation
wasm-micro-runtime (WAMR)	Lightweight WASM interpreter for embedded	Minimal footprint for edge/embedded coherence domains
Component Model (WASI P2)	Typed interface composition for WASM	Interface types for agent-to-agent IPC

Actions:

A1.6.1: Benchmark WAMR vs wasmtime-minimal for partition boot time
A1.6.2: Design the coherence domain WASM interface: capabilities exposed as WASM imports
A1.6.3: Define agent IPC: WASM component model typed channels backed by ruvix-queue
A1.6.4: Prototype: WASM partition that can query ruvix-vecgraph through capability-gated imports

Preconditions: A1.2 (capability model), ruvix-queue functional Effects: Agent runtime selection made; IPC interface defined

2. Architecture Goals

2.1 Coherence Domains Without KVM

Current world state: ruvix has 6 primitives (Task, Capability, Region, Queue, Timer, Proof) but no concept of a "coherence domain" as a first-class hypervisor object.

Goal state: Coherence domains are the primary isolation and scheduling unit, replacing the VM abstraction.

Definition

A coherence domain is a graph-structured isolation unit consisting of:

CoherenceDomain {
    id: DomainId,
    graph: VecGraph,           // from ruvix-vecgraph, nodes=tasks, edges=data dependencies
    regions: Vec<RegionHandle>,  // memory owned by this domain
    capabilities: CapTree,       // capability subtree rooted at domain cap
    coherence_score: f32,        // spectral coherence metric
    mincut_partition: Partition,  // current min-cut boundary from ruvector-mincut
    witness_log: WitnessLog,     // domain-local witness chain
    tier: MemoryTier,            // Hot | Warm | Dormant | Cold
}

How It Replaces VMs

VM Concept	Coherence Domain Equivalent
vCPU	Tasks within the domain's graph
Guest physical memory	Regions with domain-scoped capabilities
VM exit/enter	Partition switch (rescheduling at min-cut boundary)
Device passthrough	Lease-based device capability (time-bounded, revocable)
VM migration	Graph repartitioning via dynamic min-cut
Snapshot/restore	Witness log replay from checkpoint

Key Architectural Decisions

AD-1: No hardware virtualization extensions required. RVM uses capability-based isolation (software) + MMU page table partitioning (hardware) instead of VT-x/AMD-V/EL2 trap-and-emulate. This means:

No VM exits. No VMCS/VMCB. No nested page tables.
Isolation comes from: (a) capability enforcement in ruvix-cap, (b) MMU page table boundaries per domain, (c) proof-gated mutation.
A traditional VM is a degenerate coherence domain: single partition, opaque graph, no coherence scoring.

AD-2: EL2 is used for page table management only. On AArch64, the hypervisor runs at EL2. But EL2 is used purely to manage stage-2 page tables that enforce region boundaries -- not for trap-and-emulate virtualization.

AD-3: Coherence score drives everything. The coherence score (computed from the domain's graph structure via ruvector-coherence spectral methods) determines:

Scheduling priority (high coherence = more CPU time)
Memory tier (high coherence = hot tier; low coherence = demote to warm/dormant)
Migration eligibility (domains with suboptimal min-cut partition are candidates)
Reclamation order (lowest coherence reclaimed first under memory pressure)

Actions:

A2.1.1: Add CoherenceDomain struct to ruvix-types
A2.1.2: Add DomainCreate, DomainDestroy, DomainMigrate syscalls to ruvix-nucleus
A2.1.3: Implement domain-scoped capability trees in ruvix-cap
A2.1.4: Wire ruvector-coherence spectral scoring into ruvix-sched

2.2 Hardware Abstraction Layer

Current state: ruvix-hal defines traits for Console, Timer, InterruptController, Mmu, Power. ruvix-aarch64 has stubs.

Goal state: HAL supports three architectures with hypervisor-level primitives.

Architecture	HAL Implementation	Priority	Hypervisor Feature
AArch64	ruvix-aarch64	P0 (primary)	EL2 page table management, GIC-400/GIC-600
RISC-V	ruvix-riscv (new)	P1	H-extension for VS/HS mode, PLIC/APLIC
x86_64	ruvix-x86 (new)	P2	EPT page tables, APIC (lowest priority)

New HAL traits for hypervisor:

pub trait HypervisorMmu {
    fn create_stage2_tables(&mut self, domain: DomainId) -> Result<PageTableRoot, HalError>;
    fn map_region_to_domain(&mut self, region: RegionHandle, domain: DomainId, perms: RegionPolicy) -> Result<(), HalError>;
    fn unmap_region(&mut self, region: RegionHandle, domain: DomainId) -> Result<(), HalError>;
    fn switch_domain(&mut self, from: DomainId, to: DomainId) -> Result<(), HalError>;
}

pub trait CoherenceHardware {
    fn read_cache_miss_counter(&self) -> u64;
    fn read_remote_memory_counter(&self) -> u64;
    fn flush_domain_tlb(&mut self, domain: DomainId) -> Result<(), HalError>;
}

Actions:

A2.2.1: Extend ruvix-hal with HypervisorMmu and CoherenceHardware traits
A2.2.2: Implement AArch64 EL2 page table management in ruvix-aarch64
A2.2.3: Implement GIC-600 interrupt routing per coherence domain
A2.2.4: Define RISC-V H-extension HAL (trait impl stubs)

2.3 Memory Model

Current state: ruvix-region provides Immutable/AppendOnly/Slab policies with mmap-backed storage. ruvix-physmem has a buddy allocator.

Goal state: Hybrid memory model with capability-gated regions and tiered coherence.

Design: Four-Tier Memory Hierarchy

Tier     | Backing         | Access Latency | Coherence State | Eviction Policy
---------|-----------------|----------------|-----------------|------------------
Hot      | L1/L2 resident  | <10ns          | Exclusive/Modified | Never (pinned)
Warm     | DRAM            | ~100ns         | Shared/Clean    | LRU with coherence weight
Dormant  | Compressed DRAM | ~1us           | Invalid (reconstructable) | Coherence score threshold
Cold     | NVMe/Flash      | ~10us          | Tombstone       | Witness log pointer only

Key Innovation: Reconstructable Memory. Dormant regions are not stored as raw bytes. They are stored as:

A witness log checkpoint hash
A delta-compressed representation (using ruvector-temporal-tensor compression)
Reconstruction instructions that can rebuild the region from the witness log

This means memory reclamation does not destroy state -- it compresses it into the witness chain.

Actions:

A2.3.1: Extend ruvix-region with MemoryTier enum and tier-transition methods
A2.3.2: Implement dormant-region compression using witness log + delta encoding
A2.3.3: Implement cold-tier eviction to NVMe with tombstone references
A2.3.4: Wire physical memory allocator to tier-aware allocation (hot from buddy, warm from slab pool)
A2.3.5: Define the page table structure for stage-2 domain isolation

2.4 Scheduler: Graph-Pressure-Driven

Current state: ruvix-sched has a coherence-aware scheduler with deadline pressure, novelty signal, and structural risk. Fixed partition model.

Goal state: Scheduler uses live graph state from ruvector-mincut to make scheduling decisions.

Scheduling Algorithm: CoherencePressure

EVERY scheduler_tick:
  1. For each active coherence domain D:
     a. Read D.graph edge weights (data flow rates between tasks)
     b. Compute min-cut value via ruvector-mincut (amortized O(n^{o(1)}))
     c. Compute coherence_score = spectral_gap(D.graph) / min_cut_value
     d. Compute pressure = deadline_urgency * coherence_score * novelty_boost
  2. Sort domains by pressure (descending)
  3. Assign CPU time proportional to pressure
  4. If any domain's coherence_score < threshold:
     - Trigger repartition: invoke ruvector-mincut to compute new boundary
     - If repartition improves score by >10%: execute migration

Partition Switch Protocol (target: <10us):

switch_partition(from: DomainId, to: DomainId):
  1. Save from.task_state to from.region (register dump, ~500ns)
  2. Switch stage-2 page table root (TTBR write, ~100ns)
  3. TLB invalidate for from domain (TLBI, ~2us on ARM)
  4. Load to.task_state from to.region (~500ns)
  5. Emit witness record for switch (~200ns with reflex proof)
  6. Resume execution in to domain
  Total budget: ~3.3us (well within 10us target)

Actions:

A2.4.1: Refactor ruvix-sched to accept graph state from ruvix-vecgraph
A2.4.2: Integrate ruvector-mincut as a scheduling oracle (no_std subset)
A2.4.3: Implement partition switch protocol in ruvix-aarch64
A2.4.4: Benchmark partition switch time on QEMU virt

2.5 IPC: Zero-Copy Message Passing

Current state: ruvix-queue provides io_uring-style ring buffers with zero-copy semantics. 47 tests passing.

Goal state: Cross-domain IPC through shared regions with capability-gated access.

Design

Inter-domain IPC:
  1. Sender domain S holds Capability(Queue Q, WRITE)
  2. Receiver domain R holds Capability(Queue Q, READ)
  3. Queue Q is backed by a shared Region visible in both S and R stage-2 page tables
  4. Messages are written as typed records with coherence metadata
  5. Every send/recv emits a witness record linking the two domains

Intra-domain IPC:
  Same as current ruvix-queue, but within a single stage-2 address space.
  No page table switch required. Pure ring buffer.

Message Format:

struct DomainMessage {
    header: MsgHeader,        // 16 bytes: sender, receiver, type, len
    coherence: CoherenceMeta, // 8 bytes: coherence score at send time
    witness: WitnessHash,     // 32 bytes: hash linking to witness chain
    payload: [u8],            // variable: zero-copy reference into shared region
}

Actions:

A2.5.1: Extend ruvix-queue with cross-domain shared region support
A2.5.2: Implement capability-gated queue access for inter-domain messages
A2.5.3: Add CoherenceMeta and WitnessHash to message headers
A2.5.4: Benchmark zero-copy IPC latency (target: <100ns intra-domain, <1us inter-domain)

2.6 Device Model: Lease-Based Access

Goal state: Devices are not "assigned" to domains. They are leased with capability-bounded time windows.

struct DeviceLease {
    device_id: DeviceId,
    domain: DomainId,
    capability: CapHandle,     // Revocable capability for device access
    lease_start: Timestamp,
    lease_duration: Duration,
    max_dma_budget: usize,     // Maximum DMA bytes allowed during lease
    witness: WitnessHash,      // Proof of lease grant
}

Key properties:

Lease expiry automatically revokes capability (no explicit release needed)
DMA budget prevents device from exhausting memory during lease
Multiple domains can hold read-only leases to the same device simultaneously
Exclusive write lease requires proof of non-interference (via min-cut: device node has no shared edges)

Actions:

A2.6.1: Design DeviceLease struct and lease lifecycle
A2.6.2: Implement lease-based MMIO region mapping in ruvix-drivers
A2.6.3: Implement DMA budget enforcement in ruvix-dma
A2.6.4: Wire lease expiry to capability revocation in ruvix-cap

2.7 Witness Subsystem: Compact Append-Only Log

Current state: ruvix-boot has WitnessLog with SHA-256 chaining. ruvix-proof has 3-tier proof engine.

Goal state: Hypervisor-wide witness log that enables deterministic replay, audit, and fault recovery.

Design

WitnessLog (per coherence domain):
  - Append-only ring buffer in a dedicated Region(AppendOnly)
  - Each entry: [timestamp: u64, action_type: u8, proof_hash: [u8; 32], prev_hash: [u8; 32], payload: [u8; N]]
  - Fixed 82-byte entries (ATTESTATION_SIZE from ruvix-types)
  - Hash chain: entry[i].prev_hash = SHA256(entry[i-1])
  - Compaction: when ring buffer wraps, emit a Merkle root of the evicted segment to cold storage

GlobalWitness (hypervisor-level):
  - Merges per-domain witness chains at partition switch boundaries
  - Enables cross-domain causality reconstruction
  - Uses ruvector-dag for causal ordering

Actions:

A2.7.1: Implement per-domain witness log in ruvix-proof
A2.7.2: Implement global witness merge at partition switch
A2.7.3: Implement Merkle compaction for ring buffer overflow
A2.7.4: Implement deterministic replay from witness log + checkpoint

3. Implementation Milestones

M0: Bare-Metal Rust Boot on QEMU (No KVM, Direct Machine Code)

Goal: Boot RVM at EL2 on QEMU aarch64 virt, print to UART, emit first witness record.

Preconditions:

ruvix-hal traits defined (done)
ruvix-aarch64 boot stubs (partially done)
aarch64-boot directory with linker script and build system (exists)

Actions:

M0.1: Complete _start assembly: disable MMU, set up stack, branch to Rust
M0.2: Initialize PL011 UART via ruvix-drivers
M0.3: Initialize GIC-400 minimal (mask all interrupts except timer)
M0.4: Set up EL2 translation tables (identity mapping for kernel, device MMIO)
M0.5: Initialize witness log in a fixed RAM region
M0.6: Emit first witness record (boot attestation)
M0.7: Measure cold boot to first witness time (target: <250ms)

Acceptance criteria:

qemu-system-aarch64 -machine virt -cpu cortex-a72 -kernel ruvix.bin boots
UART prints "RVM Hypervisor Core v0.1.0"
First witness hash printed within 250ms of power-on

Estimated effort: 2-3 weeks Dependencies: None (all prerequisites exist)

M1: Partition Object Model + Capability System

Goal: Coherence domains as first-class kernel objects with capability-gated access.

Preconditions: M0 complete

Actions:

M1.1: Add CoherenceDomain to ruvix-types
M1.2: Add DomainCreate/DomainDestroy/DomainQuery syscalls to ruvix-nucleus
M1.3: Implement domain-scoped capability trees in ruvix-cap
M1.4: Implement stage-2 page table creation per domain in ruvix-aarch64
M1.5: Implement domain switch (save/restore + TTBR switch + TLB invalidate)
M1.6: Test: create two domains, switch between them, verify isolation

Acceptance criteria:

Two domains running concurrently with isolated memory
Capability violation (cross-domain access without cap) triggers fault
Domain switch measured at <10us

Estimated effort: 3-4 weeks Dependencies: M0

M2: Witness Logging + Proof Verifier

Goal: Every privileged action emits a witness record; proofs are verified before mutation.

Preconditions: M1 complete

Actions:

M2.1: Implement per-domain witness log (AppendOnly region)
M2.2: Wire all syscalls through proof verifier (ruvix-proof integration)
M2.3: Implement 3-tier proof routing: Reflex for hot path, Standard for normal, Deep for privileged
M2.4: Implement global witness merge at domain switch
M2.5: Test: replay witness log from checkpoint, verify state reconstruction

Acceptance criteria:

No syscall succeeds without valid proof
Witness log captures all state changes
Replay from checkpoint + witness log produces identical state

Estimated effort: 2-3 weeks Dependencies: M1

M3: Basic Scheduler with Coherence Scoring

Goal: Scheduler uses coherence scores from graph structure to drive scheduling decisions.

Preconditions: M1, M2 complete; ruvector-coherence available

Actions:

M3.1: Integrate ruvector-coherence spectral scoring into ruvix-sched
M3.2: Implement per-domain graph state tracking in ruvix-vecgraph
M3.3: Implement coherence-pressure scheduling algorithm
M3.4: Implement partition priority based on coherence score + deadline pressure
M3.5: Benchmark: measure scheduling overhead per tick

Acceptance criteria:

Domains with higher coherence scores get proportionally more CPU time
Scheduling tick overhead < 1us
Coherence-driven scheduling demonstrably reduces tail latency

Estimated effort: 2-3 weeks Dependencies: M1, M2

M4: Dynamic MinCut Integration from RuVector Crates

Goal: The hypervisor uses ruvector-mincut for live partition placement, migration, and reclamation.

Preconditions: M3 complete; ruvector-mincut and ruvector-sparsifier crates available

Actions:

M4.1: Create no_std-compatible subset of ruvector-mincut for kernel use
M4.2: Integrate min-cut computation into scheduler tick (amortized)
M4.3: Implement partition migration protocol: compute new cut -> transfer regions -> switch
M4.4: Implement memory reclamation: lowest-coherence partitions reclaimed first
M4.5: Integrate ruvector-sparsifier for efficient graph state maintenance in kernel
M4.6: Benchmark: min-cut update latency in kernel context

Acceptance criteria:

Dynamic repartitioning of domains based on workload changes
Min-cut computation completes within scheduler tick budget
Memory reclamation recovers regions without data loss (witness-backed)
20% reduction in remote memory traffic vs. static partitioning

Estimated effort: 4-5 weeks Dependencies: M3, ruvector-mincut, ruvector-sparsifier

M5: Memory Tier Management (Hot/Warm/Dormant/Cold)

Goal: Four-tier memory hierarchy with coherence-driven promotion/demotion.

Preconditions: M4 complete

Actions:

M5.1: Implement MemoryTier enum and tier metadata in ruvix-region
M5.2: Implement hot -> warm demotion (unpin, allow eviction)
M5.3: Implement warm -> dormant compression (delta encoding via witness log)
M5.4: Implement dormant -> cold eviction (to NVMe/flash with tombstone)
M5.5: Implement cold -> hot reconstruction (replay witness log from checkpoint)
M5.6: Wire tier transitions to coherence score thresholds

Acceptance criteria:

Memory tiers transition automatically based on coherence scoring
Dormant regions reconstruct correctly from witness log
Cold eviction and hot reconstruction maintain data integrity
Memory footprint reduced by 50%+ for dormant workloads

Estimated effort: 3-4 weeks Dependencies: M4

M6: Agent Runtime Adapter (WASM Partitions)

Goal: WASM-based agent partitions run inside coherence domains with capability-gated access.

Preconditions: M5 complete; WASM runtime selection made (from A1.6)

Actions:

M6.1: Integrate minimal WASM runtime (WAMR or wasmtime-minimal) into kernel
M6.2: Implement WASM import interface: capabilities as host functions
M6.3: Implement WASM partition boot: load .wasm from RVF package, instantiate in domain
M6.4: Implement agent IPC: WASM component model typed channels -> ruvix-queue
M6.5: Implement agent lifecycle: spawn, pause, resume, terminate (all proof-gated)
M6.6: Test: multi-agent scenario with 10+ WASM partitions in different domains

Acceptance criteria:

WASM partitions boot and execute within coherence domains
Agent-to-agent IPC through typed channels with <1us latency
Capability violations in WASM trapped and logged to witness
10 concurrent WASM agents run without interference

Estimated effort: 4-5 weeks Dependencies: M5, A1.6

M7: Seed/Appliance Hardware Bring-Up

Goal: Boot RVM on Cognitum Seed and Appliance hardware.

Preconditions: M6 complete; hardware available

Actions:

M7.1: Implement device tree parsing for Seed/Appliance SoC in ruvix-dtb
M7.2: Implement BCM2711 (or target SoC) interrupt controller driver
M7.3: Implement board-specific boot sequence in ruvix-rpi-boot or equivalent
M7.4: Implement NVMe driver for cold-tier storage
M7.5: Implement network driver for cross-node coherence domain migration
M7.6: Full integration test: boot, create domains, run WASM agents, migrate, recover

Acceptance criteria:

RVM boots on physical hardware
All M0-M6 acceptance criteria met on real hardware
Fault recovery without global reboot demonstrated
Cross-node migration demonstrated (if multi-node hardware available)

Estimated effort: 6-8 weeks Dependencies: M6, hardware availability

4. RuVector Integration Plan

4.1 MinCut Drives Partition Placement

Crate: ruvector-mincut (subpolynomial dynamic min-cut)

Integration points:

Hypervisor Function	MinCut Operation	Data Flow
Domain creation	Initial partition computation	Domain graph -> MinCutBuilder -> Partition
Scheduler tick	Amortized cut value query	MinCut.min_cut_value() -> coherence score input
Migration decision	Repartition computation	Updated graph -> MinCut.insert/delete_edge -> New partition
Memory reclamation	Cut-based ordering	MinCut values across domains -> reclamation priority
Fault isolation	Cut identifies blast radius	MinCut.min_cut_set() -> affected regions

no_std adaptation required:

ruvector-mincut currently depends on petgraph, rayon, dashmap, parking_lot (all require std/alloc)
Create ruvector-mincut-kernel feature flag that uses:
- Fixed-size graph representation (no heap allocation)
- Single-threaded computation (no rayon)
- Spin locks instead of parking_lot
- Inline graph storage instead of petgraph

Actions:

I4.1.1: Add no_std feature to ruvector-mincut with kernel-compatible subset
I4.1.2: Implement fixed-size graph backend (max 256 nodes, 4096 edges)
I4.1.3: Benchmark kernel-mode min-cut: target <5us for 64-node graphs
I4.1.4: Wire into ruvix-sched as scheduling oracle

4.2 Sparsifier Enables Efficient Graph State

Crate: ruvector-sparsifier (dynamic spectral graph sparsification)

Integration points:

Hypervisor Function	Sparsifier Operation	Purpose
Graph maintenance	Sparsify domain graph	Keep O(n log n) edges instead of O(n^2) for coherence queries
Coherence scoring	Spectral gap from sparsified Laplacian	Fast coherence score without full eigendecomposition
Migration planning	Sparsified graph for min-cut	Approximate min-cut on sparsified graph (faster)
Memory accounting	Sparse representation of access patterns	Track which regions are accessed by which tasks

Key insight: The sparsifier maintains a spectrally-equivalent graph with O(n log n / epsilon^2) edges. This means coherence scoring and min-cut computation can run on the sparse representation instead of the full graph, reducing kernel-mode computation time.

Actions:

I4.2.1: Add no_std feature to ruvector-sparsifier
I4.2.2: Implement incremental sparsification (update sparse graph on edge insert/delete)
I4.2.3: Wire sparsified graph into scheduler for fast coherence queries
I4.2.4: Benchmark: sparsified vs. full graph coherence scoring latency

4.3 Solver Handles Coherence Scoring

Crate: ruvector-solver (sublinear-time sparse solvers)

Integration points:

Hypervisor Function	Solver Operation	Purpose
Coherence score	Approximate Fiedler vector	Spectral gap computation
PageRank-style scoring	Forward push on domain graph	Task importance ranking
Migration cost estimation	Sparse linear system solve	Estimate data transfer cost

Key insight: The solver's Neumann series and conjugate gradient methods can compute approximate spectral properties of the domain graph in O(sqrt(n)) time. This is fast enough for per-tick coherence scoring.

Actions:

I4.3.1: Add no_std subset of ruvector-solver (Neumann series only, no nalgebra)
I4.3.2: Implement approximate Fiedler vector computation for coherence scoring
I4.3.3: Implement forward-push task importance ranking
I4.3.4: Benchmark: solver latency for 64-node domain graphs

4.4 Embeddings Enable Semantic State Reconstruction

Crate: ruvector-core (HNSW indexing), ruvix-vecgraph (kernel vector/graph stores)

Integration points:

Hypervisor Function	Embedding Operation	Purpose
Dormant state reconstruction	Semantic similarity search	Find related state fragments for reconstruction
Novelty detection	Vector distance from recent inputs	Scheduler novelty signal
Fault diagnosis	Embedding-based anomaly detection	Detect divergent domain states
Cold tier indexing	HNSW index over tombstone references	Fast lookup of cold-tier state

Key insight: When a dormant region needs reconstruction, the witness log provides the exact mutation sequence. But semantic embeddings can identify which other regions contain related state, enabling speculative prefetch during reconstruction.

Actions:

I4.4.1: Implement kernel-resident micro-HNSW in ruvix-vecgraph (fixed-size, no_std)
I4.4.2: Wire novelty detection into scheduler (vector distance from recent inputs)
I4.4.3: Implement embedding-based prefetch for dormant region reconstruction
I4.4.4: Implement anomaly detection for cross-domain state divergence

4.5 Additional RuVector Crate Integration

Crate	Integration Point	Priority
`ruvector-raft`	Cross-node consensus for multi-node coherence domains	P2 (M7)
`ruvector-verified`	Formal proofs for capability derivation correctness	P1 (M2)
`ruvector-dag`	Causal ordering in global witness log	P1 (M2)
`ruvector-temporal-tensor`	Delta compression for dormant regions	P1 (M5)
`ruvector-coherence`	Spectral coherence scoring	P0 (M3)
`cognitum-gate-kernel`	256-tile fabric as coherence domain topology	P2 (M7)
`ruvector-snapshot`	Checkpoint/restore for domain state	P1 (M5)

5. Success Metrics

5.1 Performance Targets

Metric	Target	Measurement Method	Milestone
Cold boot to first witness	<250ms	QEMU timer from power-on to first witness UART print	M0
Hot partition switch	<10us	ARM cycle counter around switch_partition()	M1
Remote memory traffic reduction	20% vs. static	Hardware perf counters (cache miss/remote access)	M4
Tail latency reduction	20% vs. round-robin	P99 latency of agent request/response	M4
Full witness trail	100% coverage	Audit: every syscall has witness record	M2
Fault recovery without global reboot	Domain-local recovery	Kill one domain, verify others unaffected	M5
WASM agent boot time	<5ms per agent	Timer around WASM instantiation	M6
Zero-copy IPC latency	<100ns intra, <1us inter	Benchmark ring buffer round-trip	M1
Coherence scoring overhead	<1us per domain per tick	Cycle counter around scoring function	M3
Min-cut update amortized	<5us for 64-node graph	Benchmark in kernel context	M4

5.2 Correctness Targets

Property	Verification Method	Milestone
Capability safety (no unauthorized access)	ruvector-verified + Kani	M1
Witness chain integrity (no gaps, no forgery)	SHA-256 chain verification	M2
Deterministic replay (same inputs -> same state)	Replay 10K syscall traces	M2
Proof soundness (invalid proofs always rejected)	Fuzzing + proptest	M2
Isolation (domain fault does not affect others)	Inject faults, verify containment	M5
Memory safety (no UB in kernel code)	Miri + Kani + `#![forbid(unsafe_code)]` where possible	All

5.3 Scale Targets

Dimension	Target	Milestone
Concurrent coherence domains	256	M4
Tasks per domain	64	M3
Regions per domain	1024	M1
Graph nodes per domain	256	M4
Graph edges per domain	4096	M4
WASM agents total	1024	M6
Witness log entries before compaction	1M	M2
Cross-node domains (federated)	16 nodes	M7

6. Dependency Graph (GOAP Action Ordering)

Research Phase (parallel):
  A1.1 (bare-metal Rust) ---|
  A1.2 (capability OS)    --+--> Architecture Phase
  A1.3 (graph scheduling) --|
  A1.4 (memory coherence) --|
  A1.5 (formal verification)|
  A1.6 (agent runtimes)   --|

Architecture Phase (partially parallel):
  A2.1 (coherence domains) --> A2.4 (scheduler)
  A2.2 (HAL)               --> M0 (bare-metal boot)
  A2.3 (memory model)      --> A2.4 (scheduler)
  A2.4 (scheduler)         --> M3
  A2.5 (IPC)               --> M1
  A2.6 (device model)      --> M7
  A2.7 (witness)           --> M2

Implementation (sequential with overlap):
  M0 (boot)
   |
  M1 (partitions + caps)
   |
  M2 (witness + proofs) -- can overlap with M3
   |
  M3 (coherence scheduler)
   |
  M4 (mincut integration) -- critical path
   |
  M5 (memory tiers)
   |
  M6 (WASM agents)
   |
  M7 (hardware bring-up)

RuVector Integration (parallel with milestones):
  I4.1 (mincut no_std)      --> M4
  I4.2 (sparsifier no_std)  --> M4
  I4.3 (solver no_std)      --> M3
  I4.4 (embeddings kernel)  --> M5

Critical Path

A2.2 (HAL) -> M0 -> M1 -> M3 -> M4 -> M5 -> M6 -> M7
                           ^
                           |
                     I4.1 (mincut no_std) -- this is the highest-risk integration

Highest risk item: Creating a no_std subset of ruvector-mincut that runs in kernel context within the scheduler tick budget. If the amortized min-cut update exceeds 5us for 64-node graphs, the scheduler design must fall back to periodic (not per-tick) repartitioning.

7. GOAP State Transitions

World State Variables

struct WorldState {
    // Research
    bare_metal_research_complete: bool,
    capability_model_decided: bool,
    scheduling_algorithm_specified: bool,
    memory_model_designed: bool,
    verification_strategy_decided: bool,
    agent_runtime_selected: bool,

    // Infrastructure
    boots_on_qemu: bool,
    uart_works: bool,
    mmu_configured: bool,
    interrupts_working: bool,

    // Core features
    coherence_domains_exist: bool,
    capabilities_enforce_isolation: bool,
    witness_log_records_all: bool,
    proofs_gate_all_mutations: bool,
    scheduler_uses_coherence: bool,
    mincut_drives_partitioning: bool,
    memory_tiers_work: bool,
    wasm_agents_run: bool,

    // Performance
    boot_under_250ms: bool,
    switch_under_10us: bool,
    traffic_reduced_20pct: bool,
    tail_latency_reduced_20pct: bool,

    // Hardware
    runs_on_seed_hardware: bool,
    runs_on_appliance_hardware: bool,
}

Initial State

WorldState {
    bare_metal_research_complete: false,
    capability_model_decided: true,  // seL4-inspired, already in ruvix-cap
    scheduling_algorithm_specified: false,
    memory_model_designed: false,
    verification_strategy_decided: false,
    agent_runtime_selected: false,

    boots_on_qemu: false,
    uart_works: false,
    mmu_configured: false,
    interrupts_working: false,

    coherence_domains_exist: false,
    capabilities_enforce_isolation: true,  // ruvix-cap works in hosted mode
    witness_log_records_all: false,
    proofs_gate_all_mutations: true,  // ruvix-proof works in hosted mode
    scheduler_uses_coherence: false,
    mincut_drives_partitioning: false,
    memory_tiers_work: false,
    wasm_agents_run: false,

    boot_under_250ms: false,
    switch_under_10us: false,
    traffic_reduced_20pct: false,
    tail_latency_reduced_20pct: false,

    runs_on_seed_hardware: false,
    runs_on_appliance_hardware: false,
}

Goal State

All fields set to true.

A* Search Heuristic

The heuristic for GOAP planning uses the number of false fields as the distance estimate. Each action sets one or more fields to true. The planner finds the minimum-cost path from initial to goal state.

Cost model:

Research action: 1 week (cost = 1)
Architecture action: 1-2 weeks (cost = 1.5)
Implementation milestone: 2-5 weeks (cost = 3)
Integration action: 1-3 weeks (cost = 2)
Hardware bring-up: 6-8 weeks (cost = 7)

Optimal plan total estimated duration: 28-36 weeks (with parallelism in research and integration phases, critical path through M0->M1->M3->M4->M5->M6->M7).

8. Risk Register

Risk	Likelihood	Impact	Mitigation
MinCut no_std too slow for per-tick scheduling	Medium	High	Fall back to periodic repartitioning (every 100 ticks); use sparsified graph
EL2 page table management bugs	High	Medium	Extensive QEMU testing; Miri for unsafe blocks; compare with known-good implementations
WASM runtime too large for kernel integration	Medium	Medium	Use WAMR interpreter (smallest footprint); or run WASM in EL1 with EL2 capability enforcement
Witness log overhead degrades hot path	Low	High	Reflex proof tier (<100ns) is already within budget; batch witness records if needed
Hardware coherence counters unavailable	Medium	Low	Fall back to software instrumentation (memory access tracking via page faults)
Formal verification scope creep	High	Low	Strict verification budget: only ruvix-cap and ruvix-proof get full verification
Cross-node migration protocol correctness	High	High	TLA+ model before implementation; extensive simulation in qemu-swarm

9. Existing Codebase Inventory

What We Have (and reuse directly)

Crate	LoC (est.)	Reuse Level	Notes
ruvix-types	~2,000	Direct	Add CoherenceDomain, MemoryTier types
ruvix-cap	~1,500	Direct	Add domain-scoped trees
ruvix-proof	~1,800	Direct	Add per-domain witness log
ruvix-sched	~1,200	Refactor	Wire to coherence scoring
ruvix-region	~1,500	Extend	Add tier management
ruvix-queue	~1,000	Extend	Add cross-domain shared regions
ruvix-boot	~2,000	Refactor	EL2 boot sequence
ruvix-vecgraph	~1,200	Extend	Add kernel HNSW
ruvix-nucleus	~3,000	Refactor	Add domain syscalls
ruvix-hal	~800	Extend	Add HypervisorMmu traits
ruvix-aarch64	~800	Major work	EL2 implementation
ruvix-drivers	~500	Extend	Lease-based device model
ruvix-physmem	~800	Direct	Tier-aware allocation
ruvix-smp	~500	Direct	Multi-core domain placement
ruvix-dma	~400	Extend	Budget enforcement
ruvix-dtb	~400	Direct	Device tree parsing
ruvix-shell	~600	Direct	Debug interface
qemu-swarm	~3,000	Direct	Testing infrastructure

What We Have (reuse via no_std adaptation)

Crate	Adaptation Needed
ruvector-mincut	no_std feature, fixed-size graph backend
ruvector-sparsifier	no_std feature, remove rayon
ruvector-solver	no_std Neumann series only
ruvector-coherence	Already minimal, add spectral feature
ruvector-verified	Lean-agentic proofs for cap verification
ruvector-dag	no_std causal ordering

What We Need to Build

Component	Estimated LoC	Milestone
CoherenceDomain lifecycle	~2,000	M1
EL2 page table management	~3,000	M0/M1
Partition switch protocol	~500	M1
Per-domain witness log	~1,000	M2
Global witness merge	~800	M2
Graph-pressure scheduler	~1,500	M3
MinCut kernel integration	~2,000	M4
Memory tier manager	~2,000	M5
WASM runtime adapter	~3,000	M6
Device lease manager	~1,000	M6
Hardware drivers (Seed/Appliance)	~5,000	M7
Total new code	~21,800

Combined with ~20K lines of existing ruvix code being reused/extended, the total codebase at M7 completion is estimated at ~42K lines of Rust.

10. Next Steps (Immediate Actions)

Week 1-2: Research Sprint

Read Theseus OS and RedLeaf papers (A1.1.1, A1.1.2)
Audit ruvix-cap against seL4 CNode spec (A1.2.1)
Formalize coherence-pressure scheduling problem (A1.3.1)
Benchmark ruvector-mincut update latency for kernel budget (A1.3.4)
Select WASM runtime (WAMR vs wasmtime-minimal) (A1.6.1)

Week 3-4: M0 Sprint

Complete _start assembly for AArch64 EL2 boot (M0.1)
Initialize PL011 UART (M0.2)
Configure EL2 translation tables (M0.4)
Emit first witness record (M0.6)
Measure boot time (M0.7)

Week 5-8: M1 + M2 Sprint

Implement CoherenceDomain in ruvix-types (M1.1)
Add domain syscalls (M1.2)
Implement stage-2 page tables (M1.4)
Wire witness logging to all syscalls (M2.2)

Week 9-12: M3 + M4 Sprint (Critical Path)

Integrate ruvector-coherence into scheduler (M3.1)
Create ruvector-mincut no_std kernel subset (I4.1.1)
Wire min-cut into scheduler (I4.1.4 / M4.2)
Implement migration protocol (M4.3)

Week 13-20: M5 + M6

Implement memory tier management (M5)
Integrate WASM runtime (M6)

Week 21-28: M7

Hardware bring-up on target platform (M7)

Appendix A: Glossary

Term	Definition
Coherence domain	The primary isolation unit in RVM; a graph-structured partition of tasks, regions, and capabilities managed by dynamic min-cut
Coherence score	A scalar metric derived from the spectral gap of a domain's task-data dependency graph; higher = more internally coherent, less external dependency
Partition switch	The act of saving one domain's state and loading another's; analogous to VM exit/enter but without hardware virtualization extensions
Proof-gated mutation	The invariant that no kernel state change occurs without a valid cryptographic proof token
Witness log	An append-only, hash-chained log recording every privileged action; enables deterministic replay and audit
Reconstructable memory	Dormant/cold memory that is not stored as raw bytes but as witness log references + delta compression, enabling reconstruction on demand
Device lease	A time-bounded, capability-gated grant of device access to a coherence domain; auto-revokes on expiry
Min-cut boundary	The set of edges in a domain's graph that, when removed, partitions the graph into the minimum-cost cut; used for migration and isolation decisions

Appendix B: Reference Papers

Bhandari et al. "Theseus: an Experiment in Operating System Structure and State Management." OSDI 2020.
Narayanan et al. "RedLeaf: Isolation and Communication in a Safe Operating System." OSDI 2020.
Klein et al. "seL4: Formal Verification of an OS Kernel." SOSP 2009.
Watson et al. "CHERI: A Hybrid Capability-System Architecture for Scalable Software Compartmentalization." IEEE S&P 2015.
Baumann et al. "The Multikernel: A new OS architecture for scalable multicore systems." SOSP 2009.
Karger et al. "Minimum Cuts in Near-Linear Time." JACM 2000.
Shi & Malik. "Normalized Cuts and Image Segmentation." IEEE PAMI 2000.
Spielman & Teng. "Spectral Sparsification of Graphs." SIAM J. Computing 2011.
Levy et al. "Ownership is Theft: Experiences Building an Embedded OS in Rust." PLOS 2015 (Tock).
Klimovic et al. "Pocket: Elastic Ephemeral Storage for Serverless Analytics." OSDI 2018.

49 KiB Raw Permalink Blame History

RVM Hypervisor Core -- GOAP Plan

0. Executive Summary

Current State Assessment

Goal State

1. Research Phase Goals

1.1 Bare-Metal Rust Hypervisors and Microkernels

1.2 Capability-Based OS Designs

1.3 Graph-Partitioned Scheduling

1.4 Memory Coherence Protocols

1.5 Formal Verification Approaches

1.6 Agent Runtime Designs

2. Architecture Goals

2.1 Coherence Domains Without KVM

Definition

How It Replaces VMs

Key Architectural Decisions

2.2 Hardware Abstraction Layer

2.3 Memory Model

Design: Four-Tier Memory Hierarchy

2.4 Scheduler: Graph-Pressure-Driven

Scheduling Algorithm: CoherencePressure

2.5 IPC: Zero-Copy Message Passing

Design

2.6 Device Model: Lease-Based Access

2.7 Witness Subsystem: Compact Append-Only Log

Design

3. Implementation Milestones

M0: Bare-Metal Rust Boot on QEMU (No KVM, Direct Machine Code)

M1: Partition Object Model + Capability System

M2: Witness Logging + Proof Verifier

M3: Basic Scheduler with Coherence Scoring

M4: Dynamic MinCut Integration from RuVector Crates

M5: Memory Tier Management (Hot/Warm/Dormant/Cold)

M6: Agent Runtime Adapter (WASM Partitions)

M7: Seed/Appliance Hardware Bring-Up

4. RuVector Integration Plan

4.1 MinCut Drives Partition Placement

4.2 Sparsifier Enables Efficient Graph State

4.3 Solver Handles Coherence Scoring

4.4 Embeddings Enable Semantic State Reconstruction

4.5 Additional RuVector Crate Integration

5. Success Metrics

5.1 Performance Targets

5.2 Correctness Targets

5.3 Scale Targets

6. Dependency Graph (GOAP Action Ordering)

Critical Path

7. GOAP State Transitions

World State Variables

Initial State

Goal State

A* Search Heuristic

8. Risk Register

9. Existing Codebase Inventory

What We Have (and reuse directly)

What We Have (reuse via no_std adaptation)

What We Need to Build

10. Next Steps (Immediate Actions)

Week 1-2: Research Sprint

Week 3-4: M0 Sprint

Week 5-8: M1 + M2 Sprint

Week 9-12: M3 + M4 Sprint (Critical Path)

Week 13-20: M5 + M6

Week 21-28: M7

Appendix A: Glossary

Appendix B: Reference Papers

49 KiB

Raw Permalink Blame History