ruvector

mirror of https://github.com/ruvnet/RuVector.git synced 2026-06-01 23:00:37 +00:00

Reuven a0a8065a17 docs(adr): add P0 SOTA feature ADRs - Structured Output, Function Calling, Prefix Caching	2026-01-20 15:02:07 -05:00

Add architecture decision records for the 3 critical P0 features needed for production LLM inference parity with vLLM/SGLang:

ADR-009: Structured Output (JSON Mode)
- Constrained decoding with state machine token filtering
- GBNF grammar support for complex schemas
- Incremental JSON validation during generation
- Performance: <2ms overhead per token

ADR-010: Function Calling (Tool Use)
- OpenAI-compatible tool definition format
- Stop-sequence based argument extraction
- Parallel and sequential function execution
- Automatic retry with error context

ADR-011: Prefix Caching (Radix Tree)
- SGLang-style radix tree for prefix matching
- Copy-on-write KV cache page sharing
- LRU eviction with configurable cache size
- 10x speedup target for chat/RAG workloads

Also includes:
- GitHub issue markdown for tracking implementation
- Comprehensive SOTA analysis comparing RuvLLM vs competitors
- Detailed roadmap (Q1-Q4 2026) for feature parity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ADR-001-ruvector-core-architecture.md	fix(security): Apply 8 critical security fixes and update ADRs	2026-01-19 11:21:31 -05:00
ADR-002-ruvllm-integration.md	docs(adr): Update ADRs with v2.1.1 performance optimizations	2026-01-19 12:03:43 -05:00
ADR-003-simd-optimization-strategy.md	docs(adr): Update ADRs with v2.1.1 performance optimizations	2026-01-19 12:03:43 -05:00
ADR-004-kv-cache-management.md	fix(security): Apply 8 critical security fixes and update ADRs	2026-01-19 11:21:31 -05:00
ADR-005-wasm-runtime-integration.md	fix(security): Apply 8 critical security fixes and update ADRs	2026-01-19 11:21:31 -05:00
ADR-006-memory-management.md	fix(security): Apply 8 critical security fixes and update ADRs	2026-01-19 11:21:31 -05:00
ADR-007-security-review-technical-debt.md	fix(security): Apply 8 critical security fixes and update ADRs	2026-01-19 11:21:31 -05:00
ADR-008-mistral-rs-integration.md	feat(ruvllm): mistral-rs backend integration for production-scale serving	2026-01-20 14:03:48 -05:00
ADR-009-structured-output.md	docs(adr): add P0 SOTA feature ADRs - Structured Output, Function Calling, Prefix Caching	2026-01-20 15:02:07 -05:00
ADR-010-function-calling.md	docs(adr): add P0 SOTA feature ADRs - Structured Output, Function Calling, Prefix Caching	2026-01-20 15:02:07 -05:00
ADR-011-prefix-caching.md	docs(adr): add P0 SOTA feature ADRs - Structured Output, Function Calling, Prefix Caching	2026-01-20 15:02:07 -05:00