ruvector/crates/ruvector-cli/tests/cli_tests.rs
rUv d93101b203 Test and validate core functionality (#54)
* chore: Add proptest regression data from test run

Records edge cases found during property testing that cause
integer overflow failures. These will help reproduce and fix
the boundary condition bugs in distance calculations.

* fix: Resolve property test failures with overflow handling

- Fix ScalarQuantized::distance() i16 overflow: use i32 for diff*diff
  (255*255=65025 overflows i16 max of 32767)
- Fix ScalarQuantized::quantize() division by zero when all values equal
  (handle scale=0 case by defaulting to 1.0)
- Bound vector_strategy() to -1000..1000 range to prevent overflow in
  distance calculations with extreme float values

All 177 tests now pass in ruvector-core.

* fix(cli): Resolve short option conflicts in clap argument definitions

- Change --dimensions from -d to -D to avoid conflict with global --debug
- Change --db from -d to -b across all subcommands (Insert, Search, Info,
  Benchmark, Export, Import) to avoid conflict with global --debug

Fixes clap panic in debug builds: "Short option names must be unique"

Note: 4 CLI integration tests still fail due to pre-existing issue where
VectorDB doesn't persist its configuration to disk. When reopening a
database, dimensions are read from config defaults (384) instead of
from the stored database metadata. This is an architectural issue
requiring VectorDB changes to implement proper metadata persistence.

* feat(core): Add database configuration persistence and fix CLI test

- Add CONFIG_TABLE to storage.rs for persisting DbOptions
- Implement save_config() and load_config() methods in VectorStorage
- Modify VectorDB::new() to load stored config for existing databases
- Fix dimension mismatch by recreating storage with correct dimensions
- Fix test_error_handling CLI test to use /dev/null/db.db path

This ensures database settings (dimensions, distance metric, HNSW config,
quantization) are preserved across restarts. Previously opening an existing
database would use default settings instead of stored configuration.

* fix(ruvLLM): Guard against edge cases in HNSW and softmax

- memory.rs: Fix random_level() to handle r=0 (ln(0) = -inf)
- memory.rs: Fix ml calculation when hnsw_m=1 (ln(1) = 0 → div by zero)
- router.rs: Add division-by-zero guard in softmax for larger arrays

These edge cases could cause undefined behavior or NaN propagation.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-06 09:36:47 -05:00

204 lines
5.3 KiB
Rust

//! Integration tests for Ruvector CLI
use assert_cmd::Command;
use predicates::prelude::*;
use std::fs;
use tempfile::tempdir;
#[test]
fn test_cli_version() {
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("--version");
cmd.assert()
.success()
.stdout(predicate::str::contains("ruvector"));
}
#[test]
fn test_cli_help() {
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("--help");
cmd.assert().success().stdout(predicate::str::contains(
"High-performance Rust vector database",
));
}
#[test]
fn test_create_database() {
let dir = tempdir().unwrap();
let db_path = dir.path().join("test.db");
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("create")
.arg("--path")
.arg(db_path.to_str().unwrap())
.arg("--dimensions")
.arg("128");
cmd.assert()
.success()
.stdout(predicate::str::contains("Database created successfully"));
// Verify database file exists
assert!(db_path.exists());
}
#[test]
fn test_info_command() {
let dir = tempdir().unwrap();
let db_path = dir.path().join("test.db");
// Create database first
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("create")
.arg("--path")
.arg(db_path.to_str().unwrap())
.arg("--dimensions")
.arg("64");
cmd.assert().success();
// Check info
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("info").arg("--db").arg(db_path.to_str().unwrap());
cmd.assert()
.success()
.stdout(predicate::str::contains("Database Statistics"))
.stdout(predicate::str::contains("Dimensions: 64"));
}
#[test]
fn test_insert_from_json() {
let dir = tempdir().unwrap();
let db_path = dir.path().join("test.db");
let json_path = dir.path().join("vectors.json");
// Create test JSON file
let test_data = r#"[
{
"id": "v1",
"vector": [1.0, 2.0, 3.0],
"metadata": {"label": "test1"}
},
{
"id": "v2",
"vector": [4.0, 5.0, 6.0],
"metadata": {"label": "test2"}
}
]"#;
fs::write(&json_path, test_data).unwrap();
// Create database
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("create")
.arg("--path")
.arg(db_path.to_str().unwrap())
.arg("--dimensions")
.arg("3");
cmd.assert().success();
// Insert vectors
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("insert")
.arg("--db")
.arg(db_path.to_str().unwrap())
.arg("--input")
.arg(json_path.to_str().unwrap())
.arg("--format")
.arg("json")
.arg("--no-progress");
cmd.assert()
.success()
.stdout(predicate::str::contains("Inserted 2 vectors"));
}
#[test]
fn test_search_command() {
let dir = tempdir().unwrap();
let db_path = dir.path().join("test.db");
let json_path = dir.path().join("vectors.json");
// Create test data
let test_data = r#"[
{"id": "v1", "vector": [1.0, 0.0, 0.0]},
{"id": "v2", "vector": [0.0, 1.0, 0.0]},
{"id": "v3", "vector": [0.0, 0.0, 1.0]}
]"#;
fs::write(&json_path, test_data).unwrap();
// Create and populate database
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("create")
.arg("--path")
.arg(db_path.to_str().unwrap())
.arg("--dimensions")
.arg("3");
cmd.assert().success();
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("insert")
.arg("--db")
.arg(db_path.to_str().unwrap())
.arg("--input")
.arg(json_path.to_str().unwrap())
.arg("--format")
.arg("json")
.arg("--no-progress");
cmd.assert().success();
// Search
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("search")
.arg("--db")
.arg(db_path.to_str().unwrap())
.arg("--query")
.arg("[1.0, 0.0, 0.0]")
.arg("--top-k")
.arg("2");
cmd.assert()
.success()
.stdout(predicate::str::contains("v1"));
}
#[test]
fn test_benchmark_command() {
let dir = tempdir().unwrap();
let db_path = dir.path().join("test.db");
// Create database
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("create")
.arg("--path")
.arg(db_path.to_str().unwrap())
.arg("--dimensions")
.arg("128");
cmd.assert().success();
// Run benchmark
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("benchmark")
.arg("--db")
.arg(db_path.to_str().unwrap())
.arg("--queries")
.arg("100");
cmd.assert()
.success()
.stdout(predicate::str::contains("Benchmark Results"))
.stdout(predicate::str::contains("Queries per second"));
}
#[test]
fn test_error_handling() {
// Test with invalid database path - /dev/null is a device file, not a directory,
// so we cannot create a database file inside it. This guarantees failure
// regardless of user permissions.
let mut cmd = Command::cargo_bin("ruvector").unwrap();
cmd.arg("info").arg("--db").arg("/dev/null/db.db");
cmd.assert()
.failure()
.stderr(predicate::str::contains("Error"));
}