ruvector/crates/ruvllm/src
rUv ca62a44c2c
fix(ruvllm): reject unsupported GGUF architectures with clear error + add Qwen2/Gemma metadata keys (#486)
* fix(postgres): wrap optional-feature SQL functions in DO exception blocks

`CREATE EXTENSION ruvector` was failing when the extension was built
without optional feature flags (solver, math-distances, tda,
attention-extended, sona-learning, domain-expansion) because the SQL
migration unconditionally registered C functions whose symbols didn't
exist in the compiled .so file.

Wrap all 6 optional-feature sections in DO $$ BEGIN ... EXCEPTION WHEN
OTHERS THEN RAISE NOTICE ... END $$ blocks so PostgreSQL gracefully skips
missing C function symbols and logs an informational notice instead of
aborting the entire extension load.

Fixes #325

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(ruvllm): reject unsupported GGUF architectures with a clear error + add Qwen2/Gemma metadata keys

Previously, loading a Qwen2/Phi/Gemma GGUF file silently fell back to mock
inference (reporting ~500K tok/s) because qlama::ModelWeights::from_gguf
only understands Llama tensor naming conventions. Users had no indication
the model was not actually running.

- Read general.architecture from GGUF metadata before attempting to load weights
- Return RuvLLMError::Model with a clear explanation when the architecture is
  not llama/mistral-compatible, rather than silently using the wrong weight loader
- Add qwen2.*, gemma.*, gemma3.* metadata keys to all config extraction calls
  so config values are correctly read from Qwen2/Gemma GGUF files (useful when
  full architecture support is added in the future)
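
The check described above can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: `LoadError`, `check_architecture`, `block_count`, the string-only metadata map, and the exact allowlist are all hypothetical stand-ins. Only the `general.architecture` key and the `<arch>.block_count` key naming come from the GGUF metadata convention.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type standing in for RuvLLMError::Model.
#[derive(Debug, PartialEq)]
enum LoadError {
    MissingArchitecture,
    Unsupported(String),
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::MissingArchitecture => {
                write!(f, "GGUF metadata has no general.architecture key")
            }
            LoadError::Unsupported(arch) => write!(
                f,
                "architecture '{arch}' is not supported; only Llama-compatible tensor layouts can be loaded"
            ),
        }
    }
}

/// Assumed allowlist; the real set of accepted values may differ.
const SUPPORTED: &[&str] = &["llama", "mistral"];

/// Fail early, before any weights are read, if the architecture is unknown.
/// Metadata is simplified here to a string-to-string map.
fn check_architecture(metadata: &HashMap<String, String>) -> Result<&str, LoadError> {
    let arch = metadata
        .get("general.architecture")
        .ok_or(LoadError::MissingArchitecture)?;
    if SUPPORTED.contains(&arch.as_str()) {
        Ok(arch.as_str())
    } else {
        Err(LoadError::Unsupported(arch.clone()))
    }
}

/// Read a config value by trying each architecture prefix in turn,
/// returning the first key that is present and parses as u32.
fn block_count(metadata: &HashMap<String, String>) -> Option<u32> {
    const KEYS: &[&str] = &[
        "llama.block_count",
        "qwen2.block_count",
        "gemma.block_count",
        "gemma3.block_count",
    ];
    KEYS.iter()
        .find_map(|k| metadata.get(*k))
        .and_then(|v| v.parse().ok())
}
```

With this shape, a Qwen2 file yields an `Unsupported("qwen2")` error instead of reaching a fallback path, while its `qwen2.block_count` value can still be read by the config lookup.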

Fixes #324

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: ruvnet <ruvnet@gmail.com>
2026-05-22 01:24:29 -04:00
backends fix(ruvllm): reject unsupported GGUF architectures with clear error + add Qwen2/Gemma metadata keys (#486) 2026-05-22 01:24:29 -04:00
bitnet test: remove 12 flaky tests previously quarantined with #[ignore] (#393) 2026-04-26 23:10:00 -04:00
claude_flow fix: 19 surfaced test failures in ruvllm + prime-radiant (post PR #389) 2026-04-26 12:18:31 -04:00
context chore(workspace): clippy-clean every crate under -D warnings + fmt + repair pre-existing broken benches 2026-04-25 17:00:20 -04:00
evaluation fix: update pgrx to 0.12.9 in both CI workflows and fix formatting 2026-02-21 22:34:37 +00:00
gguf fix: apply cargo fmt across workspace and fix CI issues 2026-02-21 20:56:38 +00:00
hub fix: 19 surfaced test failures in ruvllm + prime-radiant (post PR #389) 2026-04-26 12:18:31 -04:00
intelligence fix: apply cargo fmt across workspace and fix CI issues 2026-02-21 20:56:38 +00:00
kernels fix: add missing pg17 feature flag in pgrx test commands and fix rustdoc link errors 2026-02-21 22:44:28 +00:00
lora fix: 19 surfaced test failures in ruvllm + prime-radiant (post PR #389) 2026-04-26 12:18:31 -04:00
metal fix(ci): Apple Silicon tests and gitignore improvements 2026-03-16 23:21:02 -04:00
models fix(ruvllm): cap RuvLtraMedium micro_lora_rank at 2 (sona constraint) 2026-04-26 11:26:09 -04:00
moe chore(workspace): clippy-clean every crate under -D warnings + fmt + repair pre-existing broken benches 2026-04-25 17:00:20 -04:00
optimization style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
qat fix: 19 surfaced test failures in ruvllm + prime-radiant (post PR #389) 2026-04-26 12:18:31 -04:00
quality style: cargo fmt — formatting fix for ruvllm coherence + claude_dataset 2026-04-26 12:20:13 -04:00
quantize test(ruvllm): fix 4 surfaced integration-test failures 2026-04-26 13:46:46 -04:00
reasoning_bank test: fix reasoning_bank lock contention + ignore nervous-system perf gate 2026-04-26 12:52:16 -04:00
reflection fix: add missing pg17 feature flag in pgrx test commands and fix rustdoc link errors 2026-02-21 22:44:28 +00:00
serving chore(workspace): clippy-clean every crate under -D warnings + fmt + repair pre-existing broken benches 2026-04-25 17:00:20 -04:00
sona style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
tests style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
training style: cargo fmt — formatting fix for ruvllm coherence + claude_dataset 2026-04-26 12:20:13 -04:00
adapter_manager.rs chore(workspace): clippy-clean every crate under -D warnings + fmt + repair pre-existing broken benches 2026-04-25 17:00:20 -04:00
autodetect.rs fix: 19 surfaced test failures in ruvllm + prime-radiant (post PR #389) 2026-04-26 12:18:31 -04:00
capabilities.rs feat(training): RuvLTRA v2.4 Ecosystem Edition - 100% routing accuracy (#123) 2026-01-20 20:08:30 -05:00
error.rs feat(training): RuvLTRA v2.4 Ecosystem Edition - 100% routing accuracy (#123) 2026-01-20 20:08:30 -05:00
kv_cache.rs style(ruvllm): fix rustfmt formatting in turbo_quant and kv_cache 2026-03-25 13:43:36 +00:00
lib.rs fix: CI clippy errors and Windows test failures 2026-03-16 23:21:01 -04:00
memory_pool.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
paged_attention.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
policy_store.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
ruvector_integration.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
session.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
session_index.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
speculative.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00
tokenizer.rs ADR-179: ruvllm 4-Pi 5 + Hailo HAT cluster — SOTA 20.5 tok/s, 28 iter loop (#423) 2026-05-05 08:36:32 -04:00
types.rs feat(training): RuvLTRA v2.4 Ecosystem Edition - 100% routing accuracy (#123) 2026-01-20 20:08:30 -05:00
witness_log.rs style: apply rustfmt across entire codebase 2026-01-28 17:00:26 +00:00