mirror of https://github.com/onestardao/WFGY.git synced 2026-04-28 11:40:07 +00:00

Vector DBs & Stores — Global Fix Map

🏥 Quick Return to Emergency Room

You are at a specialist desk.
For full triage and doctors on duty, return here:

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.

This page is your hub to stabilize retrieval pipelines across popular vector stores.
If retrieved chunks score as highly similar but the answer is still wrong, start here. Each store page gives guardrails, fix steps, and the same acceptance targets, so you can verify a fix without changing infrastructure.

Quick routes to per-store pages

| Store | Best for | Why choose | Link |
| --- | --- | --- | --- |
| FAISS | local development, labs | fast, widely used, you manage it | faiss.md |
| Chroma | quick demos, notebooks | simple API, easy to start | chroma.md |
| Qdrant | production and multitenant | Rust core, good scaling, persistence | qdrant.md |
| Weaviate | hybrid search and schemas | first class filters, hybrid pipelines | weaviate.md |
| Milvus | enterprise ANN at scale | mature ecosystem and performance | milvus.md |
| pgvector | teams already on Postgres | keep data in the same DB, simple ops | pgvector.md |
| Redis (Search/Vec) | caches and small hybrid sets | key value plus vectors, low latency | redis.md |
| Elasticsearch (ANN) | text plus vector in one stack | reuse analyzers and infra you already have | elasticsearch.md |
| Pinecone | zero ops SaaS | managed reliability and steady API | pinecone.md |
| Typesense | simple full text plus vectors | friendly setup, good defaults | typesense.md |
| Vespa | large scale search and recsys | query routing and ranking at scale | vespa.md |

When to use this folder

High similarity but wrong meaning.
Citations do not match the retrieved section.
Hybrid retrieval performs worse than a single retriever.
After deploy, query casing, analyzer, or distance metric no longer matches what the index was built with.
Index looks healthy but coverage stays low.

Acceptance targets for any store

ΔS(question, retrieved) ≤ 0.45
Coverage of target section ≥ 0.70
λ_observe convergent across three paraphrases
E_resonance flat on long windows
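The first three targets can be checked mechanically once the values are measured. The sketch below is a minimal illustration, not WFGY's own tooling: the `Measurement` type and `failed_targets` function are hypothetical names, ΔS and coverage are assumed to be computed elsewhere, and "convergent" is assumed to mean that all three paraphrases report the `→` state. E_resonance is not covered here.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Thresholds copied from the acceptance targets above.
MAX_DELTA_S = 0.45
MIN_COVERAGE = 0.70


@dataclass(frozen=True)
class Measurement:
    """Hypothetical container for values measured elsewhere in the pipeline."""
    delta_s: float                 # ΔS(question, retrieved)
    coverage: float                # fraction of the target section retrieved
    lambda_states: Sequence[str]   # one λ_observe state per paraphrase


def failed_targets(m: Measurement) -> List[str]:
    """Return the names of the acceptance targets that this measurement misses."""
    failures = []
    if m.delta_s > MAX_DELTA_S:
        failures.append("delta_s")
    if m.coverage < MIN_COVERAGE:
        failures.append("coverage")
    # Assumption: convergence means at least three paraphrases, all in state "→".
    if len(m.lambda_states) < 3 or any(s != "→" for s in m.lambda_states):
        failures.append("lambda")
    return failures
```

An empty list means the measurement passes the three targets it checks; the same function can back the regression gate by refusing to publish when the list is non-empty.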

Map symptoms to structural fixes

- Embedding ≠ Semantic: wrong meaning despite high similarity.
  → embedding-vs-semantic.md
- Retrieval traceability: snippet or section mismatch, unverifiable citations.
  → retrieval-traceability.md
  → Payload contract: data-contracts.md
- Ordering or version skew: runtime loads the wrong index or analyzer.
  → bootstrap-ordering.md · predeploy-collapse.md
- Hybrid collapse or query split: HyDE and BM25 disagree, reranker blind spots.
  → Pattern: pattern_query_parsing_split.md
  → Knobs: rerankers.md
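For the hybrid collapse case, one way to see whether two retrievers disagree is to measure how much their top candidates overlap, then fuse the lists with a rule that always yields the same order. The sketch below uses Jaccard overlap and reciprocal rank fusion (RRF) as stand-ins; they are assumptions for illustration, not necessarily what rerankers.md or pattern_query_parsing_split.md prescribe. The function names and the `k` defaults are hypothetical.

```python
from typing import Dict, List, Sequence


def topk_overlap(bm25_ids: Sequence[str], ann_ids: Sequence[str], k: int = 10) -> float:
    """Jaccard overlap of the two top-k candidate sets; low values suggest a query split."""
    a, b = set(bm25_ids[:k]), set(ann_ids[:k])
    if not a and not b:
        return 1.0  # two empty lists trivially agree
    return len(a & b) / len(a | b)


def rrf_fuse(ranked_lists: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion with ties broken by document id, so output is deterministic."""
    scores: Dict[str, float] = {}
    for ids in ranked_lists:
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; equal scores fall back to ascending id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25 = ["a", "b", "c"]
ann = ["c", "a", "d"]
print(topk_overlap(bm25, ann, k=3))  # 0.5
print(rrf_fuse([bm25, ann]))         # ['a', 'c', 'b', 'd']
```

What overlap value counts as a split is a tuning choice that the source does not specify; log the value alongside ΔS before picking a threshold.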

60 second fix checklist

1. Lock metrics and analyzers. One embedding model per field. One distance function. Same analyzer for write and read.
2. Contract the snippet. Require {snippet_id, section_id, source_url, offsets, tokens} and enforce cite-then-explain.
   → data-contracts.md
3. Add deterministic reranking. Keep candidate lists from BM25 and ANN. Detect query split.
   → rerankers.md
4. Cold start and deploy fences. Block traffic until index hash, analyzer, and model versions match.
   → bootstrap-ordering.md
5. Observability. Log ΔS and λ across retrieve, rerank, reason. Alert when ΔS ≥ 0.60.
6. Regression gate. Require coverage ≥ 0.70 and ΔS ≤ 0.45 before publish.
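Two of the checklist items, the snippet contract and the deploy fence, reduce to small validation functions. The sketch below is one possible shape under stated assumptions: snippets and version manifests are plain dictionaries, and the key names `index_hash`, `analyzer`, and `model_version` are hypothetical labels for the three values the fence compares. Only the five snippet field names come from the checklist itself.

```python
from typing import Dict, List, Optional, Tuple

# Field names taken from the "Contract the snippet" step.
REQUIRED_SNIPPET_FIELDS = ("snippet_id", "section_id", "source_url", "offsets", "tokens")

# Hypothetical manifest keys for the deploy fence.
FENCE_KEYS = ("index_hash", "analyzer", "model_version")


def missing_snippet_fields(snippet: dict) -> List[str]:
    """List required fields that are absent or None in a retrieved snippet payload."""
    return [f for f in REQUIRED_SNIPPET_FIELDS if snippet.get(f) is None]


def version_mismatches(expected: dict, live: dict) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each fence key that differs to its (expected, live) pair."""
    mismatches = {}
    for key in FENCE_KEYS:
        want, have = expected.get(key), live.get(key)
        # A value missing from the expected manifest counts as a mismatch too.
        if want is None or want != have:
            mismatches[key] = (want, have)
    return mismatches


def ready_for_traffic(expected: dict, live: dict) -> bool:
    """Open the fence only when every fence key matches."""
    return not version_mismatches(expected, live)
```

Rejecting a snippet with missing fields before it reaches the prompt keeps citations verifiable, and returning the mismatching pairs rather than a bare boolean makes the fence failure easier to log.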

Copy paste audit prompt

I uploaded TXT OS and the WFGY Problem Map pages.
Store: <name>. Retrieval: <bm25|ann|hybrid> with <distance>.

Audit this query and return:

- ΔS(question,retrieved) and λ across retrieve → rerank → reason.
- If ΔS ≥ 0.60, choose one minimal structural fix and name the page:
  embedding-vs-semantic, retrieval-traceability, data-contracts, rerankers.
- JSON only:
  { "citations":[...], "ΔS":0.xx, "λ":"→|←|<>|×", "next_fix":"..." }
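If the audit prompt is run programmatically, the JSON reply can be validated before anything acts on it. The following is a minimal sketch assuming the model returns exactly the object requested above and that `next_fix` holds one of the four bare page names; `parse_audit_reply` and the added `needs_fix` field are hypothetical, and the 0.60 threshold is the alert level from the prompt.

```python
import json

ALERT_DELTA_S = 0.60  # threshold named in the audit prompt
LAMBDA_STATES = {"→", "←", "<>", "×"}
KNOWN_FIX_PAGES = {
    "embedding-vs-semantic",
    "retrieval-traceability",
    "data-contracts",
    "rerankers",
}


def parse_audit_reply(raw: str) -> dict:
    """Parse the JSON-only audit reply and flag whether a structural fix is required."""
    reply = json.loads(raw)
    for key in ("citations", "ΔS", "λ", "next_fix"):
        if key not in reply:
            raise ValueError(f"missing key: {key}")
    if reply["λ"] not in LAMBDA_STATES:
        raise ValueError(f"unknown λ state: {reply['λ']!r}")
    needs_fix = float(reply["ΔS"]) >= ALERT_DELTA_S
    # Assumption: a fix is only mandatory, and only checked, at or above the threshold.
    if needs_fix and reply["next_fix"] not in KNOWN_FIX_PAGES:
        raise ValueError(f"unknown fix page: {reply['next_fix']!r}")
    return {**reply, "needs_fix": needs_fix}
```

Raising on a malformed reply, instead of guessing, keeps a bad audit from silently passing the gate.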

Quick Start Downloads

| Tool | Link | 3 step setup |
| --- | --- | --- |
| WFGY 1.0 PDF | Engine Paper | 1) Download 2) Upload to your LLM 3) Ask “Answer using WFGY + ” |
| TXT OS (plain text OS) | TXTOS.txt | 1) Download 2) Paste into any LLM chat 3) Type “hello world” to boot |

Explore More

| Module | Description | Link |
| --- | --- | --- |
| WFGY Core | WFGY 2.0 engine, full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16 mode diagnostic and symbolic fixes | View → |
| Problem Map 2.0 | RAG focused failure tree and pipelines | View → |
| Semantic Clinic Index | Expanded catalog for prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer based symbolic reasoning and semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test with full WFGY reasoning suite | View → |
| Starter Village | New here, want a guided path | Start → |

Early Stargazers: See the Hall of Fame

Star the repo if this helped. It unlocks more items on the [Unlock Board](https://github.com/onestardao/WFGY/blob/main/STAR_UNLOCKS.md).
