Plans the integration path for .rvf acceptance test verification into
the npm ecosystem:
- npx ruvector rvf verify-witness <file.rvf> (N-API + WASM fallback)
- npx rvlite verify-witness <file.rvf> (WASM via cli-rvf.ts)
- rvlite SDK verifyWitnessChain() for browser-side verification
- MCP tool rvf_verify_witness for Claude Code agents
- 5-phase implementation plan, each independently shippable
Bridges the rvf_witness_verify WASM export (ADR-037) to end users
without requiring the Rust toolchain.
https://claude.ai/code/session_01RnwD4x5cbpB7FPvoyYQz8G
Add self-contained acceptance test artifact that external developers can
run offline and reproduce identical graded outcomes:
- SHA-256-linked witness chain: every puzzle decision (skip_mode,
context_bucket, steps, correct) hashed into a tamper-evident chain.
Changing any single bit invalidates everything downstream.
- Deterministic replay: frozen seeds → identical puzzles → identical
solve paths → identical chain_root_hash. Two runs with the same
config produce the same hash, proven by test.
- JSON manifest: config, per-mode scorecards (A/B/C), all six ablation
assertions with measured values, full witness chain, chain root hash.
- Verifier: re-runs with same config, recomputes chain, compares root
hash. Mismatch means non-identical outcomes.
- CLI binary: `acceptance-rvf generate -o manifest.json` to produce,
`acceptance-rvf verify -i manifest.json` to verify.
66 lib tests + 20 integration tests pass.
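The chaining idea above can be sketched in a few lines. This is only an illustration: FNV-1a is swapped in as a short stand-in for SHA-256, and the `WitnessRecord` struct, `link_hash`, `chain_root`, and `verify` names are hypothetical, not the crate's actual API.

```rust
/// One graded puzzle decision, mirroring the fields named above.
/// (Hypothetical struct; the real record type may differ.)
#[derive(Clone, Debug, PartialEq)]
pub struct WitnessRecord {
    pub skip_mode: String,
    pub context_bucket: String,
    pub steps: u32,
    pub correct: bool,
}

/// FNV-1a, used ONLY as a compact stand-in for SHA-256 in this sketch.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash of one link: H(prev_hash || record fields).
fn link_hash(prev: u64, r: &WitnessRecord) -> u64 {
    let mut buf = Vec::new();
    buf.extend_from_slice(&prev.to_le_bytes());
    buf.extend_from_slice(r.skip_mode.as_bytes());
    buf.push(0); // field separator
    buf.extend_from_slice(r.context_bucket.as_bytes());
    buf.push(0); // field separator
    buf.extend_from_slice(&r.steps.to_le_bytes());
    buf.push(r.correct as u8);
    fnv1a(&buf)
}

/// Fold the whole chain; the final value plays the role of chain_root_hash.
pub fn chain_root(records: &[WitnessRecord]) -> u64 {
    records.iter().fold(0u64, |prev, r| link_hash(prev, r))
}

/// Verifier: recompute the chain and compare with the recorded root.
pub fn verify(records: &[WitnessRecord], expected_root: u64) -> bool {
    chain_root(records) == expected_root
}
```

Because each link hashes the previous link's output, editing any record changes every hash after it, which is why a single root comparison is enough for the verifier.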
Fixed policy sign flip (Mode A):
risk_score = R - 30*D (was R + 30*D)
Distractors now reduce effective range, making Mode A conservative
under distractors. This is the defensible control arm: a rational
fixed agent should be more cautious when distractors are present.
Mode C must learn to outperform this baseline.
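A minimal sketch of the sign flip, assuming R is the range in days and D the distractor count. The function names and integer types are illustrative, not taken from the solver.

```rust
/// Weight applied per distractor (the constant 30 in the formula above).
const DISTRACTOR_WEIGHT: i64 = 30;

/// Mode A risk score after the fix: R - 30*D.
/// Each distractor shrinks the effective range.
pub fn risk_score(range_days: i64, distractors: i64) -> i64 {
    range_days - DISTRACTOR_WEIGHT * distractors
}

/// The pre-fix formula (R + 30*D), kept only for comparison.
pub fn risk_score_old(range_days: i64, distractors: i64) -> i64 {
    range_days + DISTRACTOR_WEIGHT * distractors
}
```

For a 200-day range with two distractors, the new score is 140 where the old one was 260, so the fixed policy now treats the puzzle as a smaller effective range rather than a larger one.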
EarlyCommitPenalty wired into bandit reward:
SkipModeStats now tracks early_commit_penalty_sum per arm.
reward() includes robustness_penalty = 0.2 * avg_penalty.
This means Mode C can actually learn to avoid early wrong commits
in distractor-heavy contexts. Previously the penalty was only
printed, not optimized.
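One way the penalty term could enter the per-arm reward is sketched below. The base reward (plain success rate) and the `attempts`, `successes`, and `record` names are assumptions; only `early_commit_penalty_sum`, `reward()`, and the 0.2 weight come from the description above.

```rust
/// Per-arm statistics for one skip mode (sketch).
#[derive(Default)]
pub struct SkipModeStats {
    pub attempts: u32,                 // hypothetical field
    pub successes: u32,                // hypothetical field
    pub early_commit_penalty_sum: f64, // named in the description above
}

impl SkipModeStats {
    /// Record one solve attempt and the penalty it incurred (0.0 if none).
    pub fn record(&mut self, correct: bool, early_commit_penalty: f64) {
        self.attempts += 1;
        if correct {
            self.successes += 1;
        }
        self.early_commit_penalty_sum += early_commit_penalty;
    }

    /// Reward = success rate minus 0.2 * average early-commit penalty.
    /// The success-rate base is an assumption made for this sketch.
    pub fn reward(&self) -> f64 {
        if self.attempts == 0 {
            return 0.0;
        }
        let n = self.attempts as f64;
        let success_rate = self.successes as f64 / n;
        let avg_penalty = self.early_commit_penalty_sum / n;
        let robustness_penalty = 0.2 * avg_penalty;
        success_rate - robustness_penalty
    }
}
```

With the penalty inside `reward()`, an arm that commits early and wrongly scores lower than an arm with the same accuracy that does not, so the bandit has a gradient to follow.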
Context buckets expanded to 18:
3 range (small/medium/large) × 3 distractor (clean/some/heavy)
× 2 noise (clean/noisy) = 18 buckets.
Previous: 4 range × 2 distractor = 8 (too coarse for bandit).
Noise flag now flows through AdaptiveSolver.noisy_hint.
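The 3 × 3 × 2 layout maps naturally onto a flat index. The sketch below assumes the classification into range and distractor classes has already happened; the enum names and `bucket_index` are illustrative, and the thresholds that decide small/medium/large are deliberately left out because they are not stated here.

```rust
/// Range class of the puzzle (small/medium/large).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RangeClass {
    Small = 0,
    Medium = 1,
    Large = 2,
}

/// Distractor class (clean/some/heavy). `Few` stands for "some".
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DistractorClass {
    Clean = 0,
    Few = 1,
    Heavy = 2,
}

/// 3 range classes x 3 distractor classes x 2 noise states.
pub const BUCKET_COUNT: usize = 3 * 3 * 2;

/// Flatten the three context dimensions into an index in 0..18.
pub fn bucket_index(range: RangeClass, distractors: DistractorClass, noisy: bool) -> usize {
    (range as usize * 3 + distractors as usize) * 2 + noisy as usize
}
```

Every combination gets its own slot, so clean and noisy variants of the same puzzle family no longer share bandit statistics.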
New ablation assertion:
c_penalty_better_than_b: Mode C EarlyCommitPenalty must be ≤90%
of Mode B penalty. Proves robustness improvement is explicit,
not just noise_accuracy-based.
Acceptance test noise plumbing:
solver.noisy_hint set to true for noisy puzzles in both training
and holdout evaluation. Context buckets now correctly distinguish
clean vs noisy conditions.
81 tests passing (61 lib + 20 integration).
PolicyKernel refinements:
- Fixed policy (Mode A): risk_score = R + k*D, k=30, T=140
Fixed constants (not learned) — Mode A is the control arm.
One distractor raises perceived risk by ~30 range-days.
Weekday only when range is large AND distractor-free.
- Normalized EarlyCommitPenalty: (remaining/initial) * scale
Committing with 5% of the range left unscanned is cheap (0.05);
with 90% left it is expensive (0.90). Only charged on wrong commits.
- Hybrid minimum evidence: stop_after_first disabled in Hybrid mode
so solver checks all matching weekdays before committing.
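The normalized penalty formula can be written out directly. This is a sketch under the assumption that `remaining` and `initial` are candidate counts and that `scale` is 1.0 in the quoted examples; the function name and signature are hypothetical.

```rust
/// Normalized EarlyCommitPenalty: (remaining / initial) * scale.
/// Charged only when the commit turned out to be wrong.
pub fn early_commit_penalty(remaining: u32, initial: u32, scale: f64, correct: bool) -> f64 {
    // Correct commits are never penalized; guard against an empty range.
    if correct || initial == 0 {
        return 0.0;
    }
    (remaining as f64 / initial as f64) * scale
}
```

Normalizing by the initial range keeps the penalty in [0, scale] regardless of puzzle size, so penalties from small and large ranges can be summed in the same statistic.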
Witness log:
- SolutionAttempt now carries skip_mode and context_bucket strings
- record_attempt_witnessed() for full policy audit trail
- Every trajectory records which skip mode was chosen and why
Observability:
- Puzzle tags now include distractor_count and has_dow (deterministic)
- count_distractors() made public for generator to tag puzzles
Ablation assertions (two new):
- a_skip_nonzero: Mode A uses skip at least sometimes (proves not hobbled)
- c_multi_mode: Mode C uses different skip modes across contexts (proves learning)
- Skip-mode distribution table printed per context bucket for Mode C
posterior_target monotonicity verified: 2→4→8→12→18→25→35→50→70→100
(never shrinks with difficulty)
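The monotonicity check amounts to a pairwise comparison over the table. The constant name and the assumption that the ten values are indexed by difficulty level are illustrative.

```rust
/// posterior_target values quoted above, assumed to be indexed by difficulty.
pub const POSTERIOR_TARGETS: [u32; 10] = [2, 4, 8, 12, 18, 25, 35, 50, 70, 100];

/// True when no element is smaller than the one before it.
pub fn is_non_decreasing(xs: &[u32]) -> bool {
    xs.windows(2).all(|w| w[0] <= w[1])
}
```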
81 tests passing (61 lib + 20 integration).
Three-fix iteration based on ablation diagnostics:
1. Bounded trial: Strategy Zero now caps trial budget at min(avg_steps*2,
external_limit/4) with floor of 10 steps. Makes false hits cheap
(max 100 steps overhead instead of full compiled budget).
2. Confidence gating: Strategy Zero only attempts when config confidence
>= 0.7 (Laplace-smoothed success rate). Compiled observations from
training seed initial confidence so configs start trusted.
3. 2-failure quarantine: any compiled signature with 2+ false hits is
disabled (expected_correct=false). Prevents persistent bad patterns.
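The three gates compose as shown in the sketch below. `avg_steps`, `observations`, `confidence()`, and `trial_budget()` are named in this commit; the `successes` and `false_hits` fields, the method signatures, and the add-one form of Laplace smoothing are assumptions.

```rust
/// Sketch of a compiled config with the three Strategy Zero gates.
pub struct CompiledSolveConfig {
    pub avg_steps: u32,
    pub observations: u32,
    pub successes: u32,  // hypothetical field
    pub false_hits: u32, // hypothetical field
}

impl CompiledSolveConfig {
    /// Laplace-smoothed success rate: (successes + 1) / (observations + 2).
    pub fn confidence(&self) -> f64 {
        (self.successes as f64 + 1.0) / (self.observations as f64 + 2.0)
    }

    /// Bounded trial: min(avg_steps * 2, external_limit / 4), floor of 10.
    pub fn trial_budget(&self, external_limit: u32) -> u32 {
        self.avg_steps
            .saturating_mul(2)
            .min(external_limit / 4)
            .max(10)
    }

    /// 2-failure quarantine: two or more false hits disable the signature.
    pub fn is_quarantined(&self) -> bool {
        self.false_hits >= 2
    }

    /// Confidence gating combined with the quarantine check.
    pub fn should_attempt(&self, confidence_threshold: f64) -> bool {
        !self.is_quarantined() && self.confidence() >= confidence_threshold
    }
}
```

With an external limit of 400 steps the budget tops out at 100, which matches the "max 100 steps overhead" figure; a config with no observations starts at confidence 0.5 and stays below the 0.7 gate until training data seeds it.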
Additional changes:
- Versioned signature prefix (v1:difficulty:constraints) for cache
safety across refactors
- CompiledSolveConfig gains avg_steps, observations, confidence(),
trial_budget() methods
- KnowledgeCompiler gains steps_saved tracking, confidence_threshold,
print_diagnostics() for per-signature analysis
- record_success now tracks actual steps for delta-cost calculation
- Verbose mode prints full compiler diagnostics after each ablation
Results: false hit rate dropped from 8.2% to 4.4% (PASS). The cost delta
is still net-positive (the compiler adds steps rather than saving them)
because constraint-determined search ranges are only 1-10 dates, leaving
structurally no room for compiler optimization. Next: PolicyKernel
constraint ordering for a real cost surface.
81 tests passing.
Wire the KnowledgeCompiler as Strategy Zero in AdaptiveSolver solve
path — compiled constraint-signature configs are consulted before any
strategy. Add StrategyRouter with epsilon-greedy contextual bandit for
adaptive strategy selection per difficulty/constraint family.
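For readers unfamiliar with the technique, a minimal epsilon-greedy bandit looks roughly like the sketch below. It is not the StrategyRouter implementation: the `EpsilonGreedy` type, its methods, and the xorshift generator used to avoid external crates are all illustrative.

```rust
/// Minimal epsilon-greedy bandit over a fixed number of arms (strategies).
pub struct EpsilonGreedy {
    epsilon: f64,
    pulls: Vec<u32>,
    mean_reward: Vec<f64>,
    rng_state: u64,
}

impl EpsilonGreedy {
    pub fn new(arms: usize, epsilon: f64, seed: u64) -> Self {
        EpsilonGreedy {
            epsilon: epsilon,
            pulls: vec![0; arms],
            mean_reward: vec![0.0; arms],
            rng_state: seed.max(1), // xorshift must not start at zero
        }
    }

    /// xorshift64 step; a tiny deterministic PRNG for the sketch.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.rng_state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.rng_state = x;
        x
    }

    /// Uniform value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Arm with the highest observed mean reward (lowest index wins ties).
    pub fn best_arm(&self) -> usize {
        let mut best = 0;
        for i in 1..self.mean_reward.len() {
            if self.mean_reward[i] > self.mean_reward[best] {
                best = i;
            }
        }
        best
    }

    /// Explore with probability epsilon, otherwise exploit the best arm.
    pub fn select(&mut self) -> usize {
        if self.next_f64() < self.epsilon {
            let arms = self.pulls.len() as u64;
            (self.next_u64() % arms) as usize
        } else {
            self.best_arm()
        }
    }

    /// Incremental mean update for the arm that was played.
    pub fn update(&mut self, arm: usize, reward: f64) {
        self.pulls[arm] += 1;
        let n = self.pulls[arm] as f64;
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / n;
    }
}
```

A contextual variant can be obtained by keeping one such bandit per context key, for example per difficulty/constraint family, so that each family learns its own preferred strategy.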
Implement three-mode ablation protocol (A/B/C):
- Mode A: baseline (no compiler, fixed router)
- Mode B: compiler only (Strategy Zero with early termination)
- Mode C: full (compiler + adaptive router)
Adds run_ablation_comparison() and AblationComparison::print() with
quantitative assertions (B beats A on cost >=15%, C beats B on
robustness >=10%, compiler false-hit rate <5%).
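The three quantitative assertions reduce to simple threshold checks. The sketch below assumes both percentages are relative to the other mode's score, that lower cost and higher robustness are better, and that the struct and function names are hypothetical.

```rust
/// Aggregate score for one ablation mode (hypothetical shape).
pub struct ModeScore {
    pub cost: f64,       // lower is better
    pub robustness: f64, // higher is better
}

/// Outcome of the three ablation assertions.
pub struct AblationChecks {
    pub b_beats_a_cost: bool,
    pub c_beats_b_robustness: bool,
    pub false_hit_rate_ok: bool,
}

/// Evaluate the thresholds quoted above: >=15% cost reduction for B over A,
/// >=10% robustness gain for C over B, compiler false-hit rate below 5%.
pub fn check_ablation(a: &ModeScore, b: &ModeScore, c: &ModeScore, false_hit_rate: f64) -> AblationChecks {
    AblationChecks {
        b_beats_a_cost: b.cost <= a.cost * 0.85,
        c_beats_b_robustness: c.robustness >= b.robustness * 1.10,
        false_hit_rate_ok: false_hit_rate < 0.05,
    }
}
```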
Other changes:
- Early termination (stop_after_first) in TemporalSolver for compiled
single-solution puzzles
- Step accumulation across Strategy Zero failures + fallback
- Promotion gating: patterns only promoted when holdout accuracy
doesn't regress
- Compiler false_hits tracking
- --ablation flag on agi-proof-harness binary
- 81 tests passing (61 unit + 20 integration)
Ablation result (100-task holdout, 5 cycles): compiler active at 59%
hit rate with 8.2% false hit rate. Cost and robustness targets not yet
met — solver needs more policy surface (step 5: PolicyKernel learning).
Implements a recursive intelligence amplification pipeline where each
level feeds the next, measuring IQ at every stage:
L1 Foundation         (IQ ~79)  Adaptive solver + ReasoningBank + retry
L2 Meta-Learning      (IQ ~82)  Learns optimal hyperparams per problem class
L3 Ensemble Arbiter   (IQ ~83)  Multi-strategy voting with learned selection
L4 Recursive Improve  (IQ ~85)  Bootstraps from own outputs + knowledge compiler
L5 Adversarial Grow   (IQ ~89)  Self-generated hard tasks + cascade reasoning
Key mechanisms:
- MetaParams: EMA-learned step budgets + retry benefit estimation
- StrategyEnsemble: N-solver majority vote, confidence-weighted
- KnowledgeCompiler: compiles patterns to direct lookup (54% hit rate)
- AdversarialGenerator: weakness-targeted difficulty escalation
- CascadeReasoner: multi-pass solve-verify-resolve
Results: +7.5 to +10.1 IQ gain across 5 levels, reaching IQ 86-89
depending on noise conditions. 100% accuracy at max difficulty in L4/L5.
Add builder methods with_authority_config() and with_domain_profile()
for the two new TLV tags (0x0110, 0x0111). Update ParsedAgiManifest
parser to extract these sections with round-trip test coverage.
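A round-trip test of this kind can be pictured with a generic TLV codec. The wire layout used here (u16 little-endian tag, u32 little-endian length, then the value) is an assumption made for illustration and may not match the manifest's real encoding; only the two tag values and their names come from this change set.

```rust
/// Tag values introduced by this change.
pub const AUTHORITY_CONFIG: u16 = 0x0110;
pub const DOMAIN_PROFILE: u16 = 0x0111;

/// Append one TLV record. ASSUMED layout: tag (u16 LE), length (u32 LE), value.
pub fn write_tlv(out: &mut Vec<u8>, tag: u16, value: &[u8]) {
    out.extend_from_slice(&tag.to_le_bytes());
    out.extend_from_slice(&(value.len() as u32).to_le_bytes());
    out.extend_from_slice(value);
}

/// Parse a buffer of TLV records; returns None on truncated input.
pub fn read_tlvs(mut buf: &[u8]) -> Option<Vec<(u16, Vec<u8>)>> {
    let mut records = Vec::new();
    while !buf.is_empty() {
        if buf.len() < 6 {
            return None; // header cut short
        }
        let tag = u16::from_le_bytes([buf[0], buf[1]]);
        let len = u32::from_le_bytes([buf[2], buf[3], buf[4], buf[5]]) as usize;
        if buf.len() - 6 < len {
            return None; // value cut short
        }
        records.push((tag, buf[6..6 + len].to_vec()));
        buf = &buf[6 + len..];
    }
    Some(records)
}
```

Writing both sections and parsing them back should return the same tags and payloads in order, which is the property a round-trip test pins down.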
- Resolve open questions: repo automation as first domain, four-level
AuthorityLevel enum, per-task ResourceBudget with hard caps,
CoherenceThresholds with validation
- Add AGI_MAX_CONTAINER_SIZE (16 GiB) with enforcement in validation
- Tighten ContainerSegments::validate: Verify/Live modes now require
world model data (VEC or INDEX segments), not just kernel/WASM
- Add ContainerError variants: InsufficientAuthority, BudgetExhausted
- Add to_flags support for orchestrator_present and world_model_present
- Add wire format section and cross-references to ADRs 029-033 in doc
- Add 2 new TLV tags: AUTHORITY_CONFIG (0x0110), DOMAIN_PROFILE (0x0111)
- Re-export new types from lib.rs
- Update rvf-runtime tests for tightened validation
- All 222 rvf-types + all rvf-runtime tests pass
Defines the full system boundary for portable intelligence:
- RuVector as existential substrate (world model, coherence signals)
- RVF as cognitive container format (packaging, witness chains, replay)
- Claude Code as control plane orchestrator (planning, tool use)
- Claude Flow as swarm coordinator (routing, shared memory, learning)
Key mechanisms:
- Structural health gates (min-cut coherence, contradiction pressure)
- Skill promotion with counterexample requirements
- Two execution modes: Replay (bit-identical) and Verify (same grades)
- 10 node types, 9 edge types, 4 invariants for the world model schema
- MCP tools: ruvector_query, ruvector_cypher, rvf_snapshot, eval_run
Acceptance test: same RVF artifact, two machines, 100 tasks,
95+ passing in verify mode, zero policy violations.
Implement a zero-dependency QR code encoder that generates QR code
images from RVQS seed payloads, feature-gated behind `qr`:
QR encoder (qr_encode.rs):
- QrEncoder with encode(), to_svg(), to_ascii() methods
- QrCode struct with modules matrix, version, and size
- EcLevel enum (L/M/Q/H) and QrError enum
- Versions 1-5 (21x21 to 37x37), byte-mode encoding
- Reed-Solomon EC over GF(2^8) with polynomial 0x11D
- All 8 mask patterns with automatic best-mask selection
- Finder patterns, timing patterns, alignment patterns (v2+)
- Format information with BCH(15,5) encoding
- Data interleaving across EC blocks
- 11 unit tests covering encoding, rendering, GF arithmetic, errors
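Two of the primitives listed above are small enough to show in full. The sketch below is an independent illustration, not code from qr_encode.rs: `gf_mul` multiplies in GF(2^8) reduced by 0x11D, and `format_info` builds the 15-bit BCH(15,5) format word using the standard QR generator 0x537 and mask 0x5412. Function names are hypothetical.

```rust
/// Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
pub fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc: u8 = 0;
    while b != 0 {
        if (b & 1) != 0 {
            acc ^= a; // add the current multiple of a
        }
        let carry = (a & 0x80) != 0;
        a <<= 1; // multiply a by x
        if carry {
            a ^= 0x1D; // reduce: 0x11D with the x^8 term dropped
        }
        b >>= 1;
    }
    acc
}

/// 15-bit QR format word for a 2-bit EC indicator and a 3-bit mask pattern.
/// EC indicator bits per the QR standard: L=01, M=00, Q=11, H=10.
pub fn format_info(ec_bits: u8, mask: u8) -> u16 {
    let data = (((ec_bits & 0b11) as u16) << 3) | ((mask & 0b111) as u16);
    // Polynomial division of data * x^10 by the generator 0x537.
    let mut rem = data << 10;
    for i in (10..15).rev() {
        if (rem & (1 << i)) != 0 {
            rem ^= 0x537 << (i - 10);
        }
    }
    // Data bits, then the 10 BCH bits, then the fixed XOR mask.
    ((data << 10) | rem) ^ 0x5412
}
```

The XOR mask guarantees the format word is never all zeros, which is why EC level M with mask pattern 0 encodes as 0x5412 itself.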
Integration:
- Module declared in lib.rs behind cfg(feature = "qr")
- Re-exports QrEncoder, QrCode, QrError, EcLevel
- `qr` feature in Cargo.toml (zero external deps)
- Example: qr_seed_encode.rs builds seed and renders SVG/ASCII
Fix doc example to use associated function syntax and suppress
dead_code warnings on internal helper fields.
All 236 tests pass: cargo test -p rvf-runtime --features qr
Build a minimal zero-dependency PWA under examples/pwa-loader/ that
decodes RVQS cognitive seeds and .rvf files in the browser:
- index.html: single-page app with file input, QR scanner button,
decoded seed info display, evidence viewer, and dark/light theme
- app.js: WASM module loading with JS fallback, RVQS 64-byte header
parsing (matching rvf-types binary layout), TLV manifest decoder,
RVF segment parser using WASM exports, QR camera scanner via
getUserMedia + BarcodeDetector API, file drag-and-drop handler
- style.css: CSS variables for dark/light themes, mobile-first
responsive layout, monospace hex display
- manifest.json: PWA manifest for standalone install
- sw.js: cache-first service worker for offline support
The WASM path is configurable via window.RVF_WASM_PATH (default
./rvf_wasm_bg.wasm). Gracefully falls back to pure JS parsing when
WASM is unavailable. No external CDN dependencies.
QR encoder (feature-gated behind `qr`):
- Pure-Rust QR code encoder with GF(2^8) Reed-Solomon
- SVG and ASCII renderers
- Version 1-5 support, byte mode, EC level M
- Example: qr_seed_encode
PWA loader:
- Browser-based RVF seed decoder (HTML/JS/CSS)
- Service worker for offline support
- Camera QR scanner via getUserMedia
no_std fixes:
- quality.rs test alloc import cleanup
- Cargo.toml feature gate for qr encoder
Minimal iOS App Clip template that bridges into the RVF C FFI to
decode QR cognitive seeds. Includes SPM package config linking
librvf_runtime.a, a C header mirroring ffi.rs exports (rvqs_parse_header,
rvqs_verify_signature, rvqs_verify_content_hash, rvqs_decompress_microkernel,
rvqs_get_primary_host_url), a Swift SeedDecoder wrapper with proper memory
management, and a SwiftUI view with QR scanner placeholder and decoded
seed info display.
Complete the QR Cognitive Seed pipeline with zero external dependencies:
- Pure SHA-256 (FIPS 180-4) verified against NIST test vectors
- HMAC-SHA256 (RFC 2104) verified against RFC 4231 test cases
- LZ77 compression (SCF-1 format) with 4KB sliding window
- Seed crypto: content hashing, signing, layer verification
- C FFI (5 extern "C" functions) for App Clip / mobile integration
- SeedBuilder.build_and_sign() with automatic hashing and signing
- ParsedSeed.verify_all() with full integrity and signature checks
- ParsedSeed.decompress_microkernel() using built-in LZ
- 11 end-to-end integration tests with real cryptography
- Updated ADR-034 with App Clip, PWA, Android delivery paths
- Example updated with full real-crypto round-trip demo
Total: 381 tests passing (183 types + 154 runtime + 11 e2e + 33 manifest)
Three fixes to ADR-033:
1. ResultQuality split into RetrievalQuality (per-candidate) and
ResponseQuality (per-response at API boundary). ResponseQuality
survives serialization across JSON/gRPC/MCP. DegradationReason
provides structured, inspectable evidence for why quality dropped.
2. Brute-force safety net dual-budgeted: max 5ms wall-clock AND max
50K candidates, whichever hits first. Both configurable via
QueryOptions. Budget=0 disables fallback entirely. Prevents O(N)
DoS from adversarial queries on large hot caches.
3. Mandatory acceptance test: malicious tail manifest with valid CRC
but redirected hotset pointers must fail deterministically under
Strict policy with a logged, stable error code. Separate test for
re-signed forgery (wrong signer vs no signature distinction).
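The dual budget in item 2 can be sketched as a linear scan that stops at whichever limit is reached first. Everything below is illustrative: the `ScanResult` type, the function signature, and the squared-L2 distance are assumptions, and the real limits would come from QueryOptions rather than plain arguments.

```rust
use std::time::{Duration, Instant};

/// Outcome of a budgeted brute-force scan (hypothetical shape).
pub struct ScanResult {
    pub best: Option<(usize, f32)>, // (index, squared distance) of the nearest vector seen
    pub scanned: usize,
    pub exhausted_budget: bool,
}

/// Linear scan bounded by BOTH a wall-clock budget and a candidate budget.
/// A zero budget on either axis disables the fallback entirely.
pub fn brute_force_scan(
    vectors: &[Vec<f32>],
    query: &[f32],
    max_time: Duration,
    max_candidates: usize,
) -> ScanResult {
    let mut result = ScanResult { best: None, scanned: 0, exhausted_budget: false };
    if max_candidates == 0 || max_time == Duration::from_secs(0) {
        return result;
    }
    let start = Instant::now();
    for (i, v) in vectors.iter().enumerate() {
        // Stop at whichever limit is hit first.
        if result.scanned >= max_candidates || start.elapsed() >= max_time {
            result.exhausted_budget = true;
            break;
        }
        let d: f32 = v.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
        let better = match result.best {
            Some((_, best_d)) => d < best_d,
            None => true,
        };
        if better {
            result.best = Some((i, d));
        }
        result.scanned += 1;
    }
    result
}
```

A call with `Duration::from_millis(5)` and `50_000` would correspond to the defaults quoted above. Checking the clock on every candidate is the simplest form; a production scan would more likely check it every few hundred candidates.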
Addresses four structural weaknesses in the progressive indexing system:
1. Content-addressed centroid stability — hotset pointers verified by
SHAKE-256 content hashes, not just byte offsets. Compaction becomes
physically destructive but logically stable.
2. Adversarial distribution resilience — distance entropy detection
with adaptive n_probe widening. Silent recall collapse replaced by
detected degradation with ResultQuality signaling.
3. Honest recall framing — empirical targets scoped to distribution
classes (natural/synthetic/adversarial). Monotonic recall improvement
property proven from append-only invariant. Brute-force safety net
when candidate count is insufficient.
4. Mandatory manifest signatures — SecurityPolicy defaults to Strict.
No signature = no mount in production. Prevents segment-swap attacks
on hotset pointers. CRC32C catches corruption; ML-DSA-65 catches
adversaries.
Add the WASM_SEG segment type and complete self-bootstrapping
architecture that allows RVF files to carry their own execution
runtime. When an RVF file embeds a WASM interpreter alongside the
microkernel, the host only needs raw execution capability — making
RVF "run anywhere compute exists."
Changes:
- rvf-types: Add SegmentType::Wasm (0x10), WasmHeader (64-byte),
WasmRole, WasmTarget enums, and feature flag constants
- rvf-runtime: Add embed_wasm(), extract_wasm(), extract_wasm_all(),
is_self_bootstrapping() methods on RvfStore, plus write_wasm_seg()
in the write path
- rvf-wasm: Add bootstrap module with resolve_bootstrap_chain() that
discovers WASM_SEGs, parses headers, and resolves the optimal
bootstrap strategy (None/HostRequired/SelfContained/TwoStage/Full)
- docs: Add spec/11-wasm-bootstrap.md with complete wire format,
bootstrap protocol, size budget analysis, and security model
The bootstrap stack (three layers on top of the raw file):
Layer 0: Raw bytes (.rvf file)
Layer 1: Embedded WASM interpreter (~50 KB)
Layer 2: WASM microkernel (~5.5 KB)
Layer 3: RVF data segments
All 131 rvf-types tests and 72 rvf-runtime tests pass.
- Fix BackendSpec.as_ref() error: backend is a struct, not Option; access options.early_exit directly
- Fix ii_IndexAttrNumbers array indexing: use [0] instead of .offset(0) for fixed-size [i16; 32]
- Bump rvf-cli deps to match rvf-launch 0.2.0 and rvf-server 0.2.0
- Update Docker image version label to 2.0.2
Co-Authored-By: claude-flow <ruv@ruv.net>
HNSW fixes:
- Extract vector dimensions from column atttypmod instead of hardcoding 128,
which caused corrupted indexes for non-128-dim embeddings (#171, #164)
- Add page boundary checks in read_vector/read_neighbors to prevent
segfaults on large tables with >100K rows (#164)
- Use BinaryHeap::into_sorted_vec() for deterministic result ordering
instead of into_iter() which yields arbitrary order (#171)
- Handle non-kNN scans (COUNT, WHERE IS NOT NULL) gracefully by returning
false from hnsw_gettuple when no ORDER BY operator is present (#152)
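The ordering fix relies on documented `BinaryHeap` behavior: `into_iter()` visits elements in arbitrary order, while `into_sorted_vec()` returns them ascending. The sketch below uses integer distances because `f32` does not implement `Ord`; the tuple layout and function name are illustrative and not taken from the index code.

```rust
use std::collections::BinaryHeap;

/// Return (distance, row id) pairs in ascending distance order.
/// Ties on distance are broken by row id, so the output is fully deterministic.
pub fn ordered_results(candidates: &[(u32, u64)]) -> Vec<(u32, u64)> {
    let heap: BinaryHeap<(u32, u64)> = candidates.iter().cloned().collect();
    // into_sorted_vec() sorts ascending; into_iter() would give heap-internal order.
    heap.into_sorted_vec()
}
```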
Agent/SPARQL fixes:
- Fix SQL type mismatch: ruvector_list_agents() and
ruvector_find_agents_by_capability() now use RETURNS TABLE(...)
matching the Rust TableIterator signatures instead of RETURNS SETOF jsonb (#167)
- Add empty query validation to ruvector_sparql() and
ruvector_sparql_json() to prevent panics on invalid input (#167)
- Change workspace panic profile from "abort" to "unwind" so pgrx can
convert Rust panics to PostgreSQL errors instead of killing the backend (#167)
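The last two items work together: validation turns the common bad input into an ordinary error, and the unwind profile lets any remaining panic be caught at the boundary. The sketch below shows the general pattern with `std::panic::catch_unwind`, which only works when panics unwind; the function names are hypothetical and this is not pgrx's actual guard code.

```rust
use std::panic;

/// Reject empty queries up front instead of panicking deeper in the parser.
/// (Stand-in for the real SPARQL entry points; the return value is a placeholder.)
pub fn run_sparql(query: &str) -> Result<String, String> {
    if query.trim().is_empty() {
        return Err("empty SPARQL query".to_string());
    }
    Ok(format!("parsed {} bytes", query.len()))
}

/// Convert a panic into an error value. With panic = "abort" the process
/// would terminate before this function could return.
pub fn call_guarded<F>(f: F) -> Result<String, String>
where
    F: FnOnce() -> String + panic::UnwindSafe,
{
    panic::catch_unwind(f).map_err(|_| "Rust panic converted to an error".to_string())
}
```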
Security:
- Bump lru dependency from 0.12 to 0.16 in ruvector-graph, ruvector-cli,
and ruvLLM to resolve GHSA-xpfx-fvgv-hgqp Stacked Borrows violation (#148)
Version bumps: workspace 2.0.3, ruvector-postgres 2.0.2
Co-Authored-By: claude-flow <ruv@ruv.net>