# Retrieval — Global Fix Map

## 🏥 Quick Return to Emergency Room

You are in a specialist desk. For full triage and doctors on duty, return here:

- WFGY Global Fix Map — main Emergency Room, 300+ structured fixes
- WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room. If you want the full consultation and prescriptions, go back to the Emergency Room lobby.
## Evaluation disclaimer (retrieval)

All retrieval scores and examples in this section come from controlled setups with chosen corpora and prompts. They help you compare retrieval strategies locally, but they are not universal rankings of models or vector stores.

This page is a compact hub for stabilizing retrieval quality across stacks, models, and stores. Use it to route symptoms to the exact structural fix, then verify the fix against measurable targets. No infra change is required.
## Orientation: what each page does
| Page | What it solves | Typical symptom |
|---|---|---|
| Retrieval Playbook | End to end rebuild order and knobs | You fixed one thing and another breaks |
| Retrieval Traceability | Cite-then-explain schema with required fields | Citations miss the exact section or cannot be verified |
| Rerankers | Deterministic reranking across BM25 + ANN | Hybrid worse than single retriever |
| Query Parsing Split | One query, two meanings; detect and route | Answers jump between two unrelated sections |
| Chunk Alignment | Chunking aligned with the model’s semantic window | Snippets cut mid-thought; anchors missing |
| ΔS Probes | Quick health check using ΔS and λ_observe | Looks fine by eye but flips across runs |
| Retrieval Eval Recipes | Deterministic, SDK-free evaluation | No stable way to tell if “better” shipped |
| Store-Agnostic Guardrails | Locks for metrics, analyzers, versions | Index “healthy” but recall still low |
## When to use this folder

- High similarity but wrong meaning.
- Correct facts exist in the corpus but never show up.
- Citations are inconsistent or missing across steps.
- Hybrid retrieval underperforms a single retriever.
- The index looks healthy while coverage remains low.
## Acceptance targets

- ΔS(question, retrieved) ≤ 0.45
- Coverage of the target section ≥ 0.70
- λ_observe convergent across 3 paraphrases and 2 seeds
- E_resonance flat on long windows
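The ΔS target above can be checked with a tiny probe. A minimal sketch, assuming ΔS is defined as 1 minus the cosine similarity between the question and retrieved-snippet embeddings; the toy vectors stand in for real embedding-model output:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def delta_s(q_vec, snippet_vec):
    """ΔS = 1 - cos(question, retrieved); lower is better."""
    return 1.0 - cosine(q_vec, snippet_vec)

def passes_target(q_vec, snippet_vec, threshold=0.45):
    """True if the retrieved snippet meets the ΔS acceptance target."""
    return delta_s(q_vec, snippet_vec) <= threshold

# Toy vectors standing in for embeddings of a question and a snippet.
q = [0.9, 0.1, 0.0]
good = [0.8, 0.2, 0.1]
print(passes_target(q, good))  # → True
```

Run the same probe over three paraphrases of the question and two seeds; if the pass/fail result flips between runs, treat that as the λ divergence signal described below.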
## Symptoms → exact fixes
| Symptom | Likely cause | Open this |
|---|---|---|
| High similarity yet wrong answer | Metric or analyzer mismatch | Embedding ≠ Semantic |
| Correct fact never retrieved | Fragmentation or missing anchors | Vectorstore Fragmentation · Chunking Checklist |
| Hybrid worse than single | Query parsing split or mis-weighted rerank | Query Parsing Split · Rerankers |
| Citations missing or unstable | Schema not enforced | Retrieval Traceability · Data Contracts |
| Answers flip between runs | Prompt header reordering or λ variance | Context Drift · Rerankers |
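When hybrid retrieval underperforms a single retriever, a deterministic fusion step over both candidate lists often helps. A minimal sketch of reciprocal-rank fusion (RRF); the document IDs and the `k=60` constant are illustrative, not prescribed by the Rerankers guide:

```python
def rrf_fuse(ranked_lists, k=60):
    """Reciprocal-rank fusion: score(d) = sum over lists of 1 / (k + rank).
    Deterministic for a given input; ties broken by doc id."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score, then by doc id for stable ordering.
    return sorted(scores, key=lambda d: (-scores[d], d))

bm25 = ["doc_a", "doc_b", "doc_c"]   # lexical candidates, best first
ann = ["doc_b", "doc_d", "doc_a"]    # vector candidates, best first
print(rrf_fuse([bm25, ann]))  # → ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because the fusion is rank-based, it sidesteps score-scale mismatches between BM25 and ANN, which is a common cause of "hybrid worse than single".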
## 60-second fix checklist

1. **Lock metrics and analyzers.** One embedding model per field. One distance metric. The same analyzer for write and read.
   Guide: Store-Agnostic Guardrails
2. **Enforce the snippet contract.** Require `snippet_id`, `section_id`, `source_url`, `offsets`, `tokens`.
   Guide: Data Contracts
3. **Measure ΔS and λ.** Run three paraphrases and two seeds.
   Guide: ΔS Probes
4. **Sweep k and rerankers.** Try k in {5, 10, 20}. Keep both the BM25 and ANN candidate lists.
   Guide: Rerankers
5. **Rebuild where needed.** Follow the sequence in the playbook and re-test coverage.
   Guide: Retrieval Playbook
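The snippet contract above can be enforced with a small gate before anything reaches the prompt. A minimal sketch; the field names come from the checklist, while the validator itself is an illustrative assumption, not part of the WFGY toolkit:

```python
REQUIRED_FIELDS = ("snippet_id", "section_id", "source_url", "offsets", "tokens")

def validate_snippet(snippet):
    """Return the list of missing or empty required fields.
    An empty return value means the snippet satisfies the contract."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = snippet.get(field)
        if value is None or value == "" or value == []:
            problems.append(field)
    return problems

snippet = {
    "snippet_id": "s-001",
    "section_id": "sec-2.3",
    "source_url": "https://example.com/doc#sec-2.3",
    "offsets": [120, 480],
    "tokens": 87,
}
print(validate_snippet(snippet))  # → []
```

Rejecting snippets at this gate, rather than downstream, keeps citation failures visible as data errors instead of silent answer drift.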
## Checklists — copy before deploy
| Checklist | Scope | Link |
|---|---|---|
| Retrieval Readiness | Pre-flight: embeddings, analyzers, index, gold set | retrieval_readiness.md |
| Reranker Sanity | Hybrid reranking health and overlap checks | reranker_sanity.md |
| Traceability Gate | Contract enforcement for cite-then-explain | traceability_gate.md |
## Vector DBs — jump if store specific

- Family index: Vector DBs & Stores
- Direct store guides: FAISS · Chroma · Qdrant · Weaviate · Milvus · pgvector · Redis · Elasticsearch · Pinecone · Typesense · Vespa
## Minimal probe pack you can paste

```txt
Context: I loaded TXT OS and the WFGY pages.
Task:
- Given question "Q", log ΔS(Q, retrieved) and λ across 3 paraphrases.
- Enforce cite-then-explain with the traceability schema.
- If ΔS ≥ 0.60 or λ flips, return the smallest structural change to push ΔS ≤ 0.45 and coverage ≥ 0.70.
- Use BBMC, BBCR, BBPF, BBAM when relevant.
Return JSON:
{ "citations": [...], "ΔS": 0.xx, "λ_state": "<>", "coverage": 0.xx, "next_fix": "..." }
```
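The JSON the probe returns can be checked mechanically against the acceptance targets. A minimal sketch, assuming the keys match the schema above; the `ds_max` and `coverage_min` defaults mirror the targets, and the sample report string is fabricated for illustration:

```python
import json

def meets_targets(report_json, ds_max=0.45, coverage_min=0.70):
    """Parse a probe report and check it against the acceptance targets."""
    report = json.loads(report_json)
    return report["ΔS"] <= ds_max and report["coverage"] >= coverage_min

report = '{"citations": ["s-001"], "ΔS": 0.38, "λ_state": "<>", "coverage": 0.74, "next_fix": "none"}'
print(meets_targets(report))  # → True
```

Wiring this check into CI lets a retrieval change ship only when the probe report clears both thresholds.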
## 🔗 Quick-Start Downloads (60 sec)
| Tool | Link | 3-Step Setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly |
## 🧭 Explore More
| Module | Description | Link |
|---|---|---|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| 🧙‍♂️ Starter Village 🏡 | New here? Lost in symbols? Click here and let the wizard guide you through | Start → |
👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open source builders who supported WFGY from day one.
⭐ WFGY Engine 2.0 is already unlocked. ⭐ Star the repo to help others discover it and unlock more on the Unlock Board.