WFGY/ProblemMap/GlobalFixMap/Retrieval/retrieval-playbook.md

Retrieval Playbook

🧭 Quick Return to Map

You are in a sub-page of Retrieval.
To reorient, go back here:

Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.

A practical, store-agnostic playbook to stabilize retrieval quality. Use this page to route symptoms to the right structural fix, apply measurable targets, and keep read/write parity across pipelines.

When to use

  • High similarity yet wrong meaning
  • Missing or unstable citations
  • Hybrid retrieval performs worse than a single retriever
  • Results flip across runs or paraphrases
  • New deploy returns empty or partial context

Acceptance targets

  • ΔS(question, retrieved) ≤ 0.45
  • Coverage ≥ 0.70 for the intended section
  • λ remains convergent across 3 paraphrases and 2 seeds
  • E_resonance stays flat on long windows
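The acceptance targets above can be gated mechanically. A minimal sketch, assuming ΔS is modeled as 1 minus cosine similarity between question and retrieved-context embeddings (the `delta_s` and `meets_targets` helpers are hypothetical names, not part of the WFGY API):

```python
import math

def delta_s(question_vec, retrieved_vec):
    # ΔS modeled as 1 - cosine similarity; identical vectors give ΔS = 0.
    dot = sum(q * r for q, r in zip(question_vec, retrieved_vec))
    nq = math.sqrt(sum(q * q for q in question_vec))
    nr = math.sqrt(sum(r * r for r in retrieved_vec))
    return 1.0 - dot / (nq * nr)

def meets_targets(ds, coverage):
    # Thresholds copied from the acceptance targets above.
    return ds <= 0.45 and coverage >= 0.70
```

Wire this gate into CI so a deploy that regresses past ΔS 0.45 or coverage 0.70 fails loudly instead of silently degrading answers.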

Helpers:


60-second fix path

  1. Probe
    Run ΔS(question, retrieved) at k = 5, 10, 20. Log λ for each paraphrase.
    Tool: deltaS_probes.md

  2. Lock schema
    Enforce cite-then-explain, and require snippet_id, section_id, source_url, offsets, tokens.
    Spec: Data Contracts

  3. Repair the failing layer

    • Wrong meaning with high similarity → see Metric and analyzer parity below
    • Missing or shaky citations → install Traceability schema
    • Hybrid worse than single → run Hybrid weighting and Query parsing split
    • Flips across runs → clamp with Rerankers and parity checks
  4. Verify
    Coverage ≥ 0.70 on 3 paraphrases; λ convergent on 2 seeds; ΔS ≤ 0.45.
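Step 1 of the fix path can be sketched as a plain loop. This assumes a `retrieve(query, k)` function and a `delta_s(question, chunks)` probe already exist in your pipeline; both names are placeholders for your own implementations:

```python
def k_sweep(question, paraphrases, retrieve, delta_s, ks=(5, 10, 20)):
    # Probe ΔS at each k for the question plus each paraphrase,
    # producing one log row per (query, k) pair for later comparison.
    log = []
    for q in [question, *paraphrases]:
        for k in ks:
            chunks = retrieve(q, k)
            log.append({"query": q, "k": k, "deltaS": delta_s(q, chunks)})
    return log
```

If ΔS stays high and flat across k, suspect metric or analyzer parity; if it drops sharply as k grows, suspect ordering (hybrid weighting or rerank).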


Root-cause map → exact fixes

1) Metric and analyzer parity

Symptoms: high similarity yet wrong meaning, language or casing skew, mixed punctuation behavior.

Actions

  • Align dense and sparse analyzers. Keep lowercasing, accent fold, token boundaries consistent.
  • Normalize vectors at write and read. Keep pooling identical.
  • Rebuild with explicit metric and dimension logged in traces.
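One way to keep analyzers aligned is to route both write and read text through a single normalization function, so the dense and sparse sides see identical tokens. A minimal sketch (lowercase, accent fold, punctuation collapsed to spaces; adjust to your own analyzer spec):

```python
import unicodedata

def normalize(text):
    # Same function on BOTH write and read paths: decompose accents,
    # strip combining marks, lowercase, and collapse punctuation/whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(cleaned.split())
```

Log the function's version alongside `analyzer` in index metadata so any drift between write and read is detectable in traces.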


2) Traceability and citation locks

Symptoms: answer looks right but citations are missing, wrong section id, or not reproducible.

Actions

  • Require snippet_id, section_id, source_url, offsets, tokens in every hop.
  • Forbid cross-section reuse unless explicitly whitelisted.
  • Enforce cite-then-explain in prompts.
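The citation contract above is easy to enforce with a field check at every hop. A sketch, assuming citations arrive as plain dicts (the helper name is hypothetical):

```python
# Required fields per the traceability schema above, in schema order.
REQUIRED = ("snippet_id", "section_id", "source_url", "offsets", "tokens")

def missing_fields(citation):
    # Return the names of any absent or empty required fields,
    # so the caller can fail fast with the exact failing field.
    return [f for f in REQUIRED if f not in citation or citation[f] in (None, "")]
```

Reject the hop (and surface the failing field, per the prompt block below) whenever this list is non-empty, rather than letting an uncited claim through.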


3) Hybrid retrieval that underperforms

Symptoms: BM25 + dense gives worse order than either alone; relevant docs appear far down; order flips.

Actions

  • Separate query parsing from retrieval. Fix the parse.
  • Weight dense and sparse explicitly. Add a deterministic tiebreak.
  • Add a rerank step with a fixed cross-encoder and seed.
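Explicit weighting with a deterministic tiebreak can be sketched as min-max normalization of each retriever's scores, a fixed blend, then a stable sort on (score, section_id, snippet_id). The 0.6 dense weight is an assumption to tune on your gold set, not a recommendation:

```python
def minmax(scores):
    # Scale scores to [0, 1]; a constant list maps to all zeros.
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def fuse(candidates, dense_scores, sparse_scores, w_dense=0.6):
    d, s = minmax(dense_scores), minmax(sparse_scores)
    scored = [
        (w_dense * dv + (1 - w_dense) * sv, c["section_id"], c["snippet_id"], c)
        for c, dv, sv in zip(candidates, d, s)
    ]
    # Sort by score descending, then ids ascending: ties resolve the
    # same way on every run, so order cannot flip across seeds.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [c for _, _, _, c in scored]
```

A cross-encoder rerank then runs over this stable order, never over raw unfused scores.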


4) Fragmentation or contamination

Symptoms: facts exist but never show; duplicates or stale shards; inconsistent analyzers by batch.

Actions

  • Rebuild a clean index with a single write path.
  • Stamp index_hash, log embedding model id and normalization.
  • Run a small gold set to verify recall.
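Stamping `index_hash` can be as simple as hashing a canonical manifest of the rebuild inputs. A sketch under the field schema below; the function name and field values are illustrative:

```python
import hashlib
import json

def index_manifest(chunks, embed_model, analyzer):
    # Canonical JSON (sorted keys) so the same inputs always hash the
    # same, and any change to chunks, model, or analyzer changes the hash.
    payload = json.dumps(
        {"chunks": chunks, "embed_model": embed_model, "analyzer": analyzer},
        sort_keys=True,
    )
    return {
        "index_hash": hashlib.sha256(payload.encode()).hexdigest(),
        "embed_model": embed_model,
        "analyzer": analyzer,
        "n_chunks": len(chunks),
    }
```

Two shards with different hashes came from different write paths by construction, which is exactly the contamination signal you want in traces.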



Guardrails to install in any pipeline

Write path

  • One tokenizer and analyzer spec. Log it.
  • One embedding model and pooling policy. Log it.
  • Chunk window and overlap recorded in metadata.
  • Field schema: doc_id, section_id, snippet_id, source_url, offsets, tokens, index_hash, embed_model, analyzer.

Read path

  • Same analyzer, same normalization.
  • k sweep at 5, 10, 20 for ΔS probes.
  • Deterministic tiebreak on (score, section_id, snippet_id).
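The "same analyzer, same normalization" rule is worth enforcing as a hard gate at query time, not just a convention. A minimal sketch that compares index metadata against the read-side config (field names follow the write-path schema above; `parity_check` is a hypothetical helper):

```python
def parity_check(index_meta, read_config):
    # Refuse to query when the read path's analyzer or embedding model
    # differs from what the index was built with.
    for key in ("analyzer", "embed_model"):
        if index_meta.get(key) != read_config.get(key):
            raise ValueError(f"read/write parity broken on {key!r}")
```

Failing loudly here converts a silent "high similarity, wrong meaning" bug into an immediate, attributable deploy error.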

Prompt contract

  • Cite first, then explain.
  • Enforce JSON with citations and λ state.
  • Forbid cross-section reuse unless allowed.

Specs


Copy-paste prompt block for the reasoning step

You have TXTOS and the WFGY Problem Map loaded.

Retrieval inputs:
- question: "{Q}"
- k sweep results: {k5:..., k10:..., k20:...}
- citations: [{snippet_id, section_id, source_url, offsets, tokens}, ...]

Do:
1) Validate cite-then-explain. If any citation is missing or mismatched, return the failing field and stop.
2) Report ΔS(question, retrieved) and λ state. If ΔS ≥ 0.60 or λ divergent, return the minimal structural fix:
   - metric/analyzer parity
   - hybrid weighting and rerank
   - traceability schema
3) Output JSON:
   { "answer": "...", "citations": [...], "ΔS": 0.xx, "λ": "<state>", "next_fix": "<page to open>" }
Keep it auditable and short.

Evaluation loop

  • Gold questions per section: 3 to 5
  • For each question: run 3 paraphrases, 2 seeds
  • Metrics to log: coverage, ΔS, λ, recall@k, MAP@k, citation match rate
  • Recipes → retrieval_eval_recipes.md
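Two of the logged metrics can be sketched directly; both helpers below are illustrative names operating on snippet id lists:

```python
def recall_at_k(retrieved_ids, gold_ids, k):
    # Fraction of the gold set found in the top-k retrieved ids.
    hits = set(retrieved_ids[:k]) & set(gold_ids)
    return len(hits) / len(gold_ids)

def citation_match_rate(cited_ids, gold_ids):
    # Fraction of emitted citations that point at a gold snippet;
    # an answer with no citations scores zero by definition.
    if not cited_ids:
        return 0.0
    gold = set(gold_ids)
    return sum(1 for c in cited_ids if c in gold) / len(cited_ids)
```

Run these per paraphrase and per seed, then log the spread, not just the mean: a wide spread across seeds is the λ-divergence signal this loop exists to catch.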

Store-specific adapters

If a symptom points to a store quirk or feature gap, jump here:


🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
|------|------|--------------|
| WFGY 1.0 PDF | Engine Paper | 1 Download · 2 Upload to your LLM · 3 Ask “Answer using WFGY + \<your question\>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1 Download · 2 Paste into any LLM chat · 3 Type “hello world” — OS boots instantly |

🧭 Explore More

| Module | Description | Link |
|--------|-------------|------|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |

👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open source builders who supported WFGY from day one.

GitHub stars WFGY Engine 2.0 is already unlocked. Star the repo to help others discover it and unlock more on the Unlock Board.

WFGY Main   TXT OS   Blah   Blot   Bloc   Blur   Blow