ruvector/examples/train-discoveries/Cargo.toml
Claude 402d5dccd8 feat: ETL pipeline with sublinear ForwardPush PPR for cross-domain discovery
Three-stage pipeline (Extract → Transform → Load) using ruvector-solver:
- Extract: loads 460+ discoveries from 48 JSON data sources
- Transform: embeds into 64-dim vectors, builds 8-NN sparse graph,
  runs ForwardPush PPR (sublinear O(1/ε), Andersen-Chung-Lang 2006)
- Load: outputs ranked cross-domain correlations + 12×12 domain matrix

New data sources from parallel explorer swarms:
- Humanities: Harvard Art, Library of Congress, Open Library, Nobel, Smithsonian
- Genetics/Env: ClinVar variants, GBIF endangered, EPA air, marine, satellite fires
- Tech/Infra: GitHub trending, Hacker News, SpaceX, ISS, crypto/forex markets

Novel discoveries found by PPR:
- Technology→Earth climate correlation (equatorial weather patterns)
- Technology→Space-science link (ultra-short period brown dwarf)
- Life-science→Academic (agentic AI + GPCR drug discovery bridge)

https://claude.ai/code/session_01UWE22wnsZRSHKhT4h4Axby
2026-03-16 23:17:00 -04:00

15 lines
562 B
TOML

[package]
name = "train-discoveries"
version = "0.1.0"
edition = "2021"
publish = false
description = "Cross-domain discovery ETL pipeline using RuVector sublinear solver"
[dependencies]
ruvector-core = { path = "../../crates/ruvector-core", default-features = false, features = ["parallel"] }
ruvector-solver = { path = "../../crates/ruvector-solver", features = ["forward-push", "neumann"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
rand = "0.8"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }