
🧭 Not sure where to start? Open the WFGY Engine Compass

WFGY System Map

(One place to see everything; links open the relevant section.)

| Layer | Page | What it's for |
| --- | --- | --- |
| Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF-based tension engine and early logic sketch. Legacy reference only. |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel and math engine for RAG and agents. |
| ⚙️ Engine | WFGY 3.0 | TXT-based Singularity tension engine (131 S-class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16-problem RAG failure checklist and fix map |
| 🗺️ Map | Problem Map 2.0 | RAG-focused recovery pipeline — 🔴 YOU ARE HERE 🔴 |
| 🗺️ Map | Problem Map 3.0 | Global Debug Card — image as a debug protocol layer |
| 🗺️ Map | Semantic Clinic | Symptom → family → exact fix |
| 🧓 Map | Grandma's Clinic | Plain-language stories, mapped to PM 1.0 |
| 🏡 Onboarding | Starter Village | Guided tour for newcomers |
| 🧰 App | TXT OS | .txt semantic OS — 60-second boot |
| 🧰 App | Blah Blah Blah | Abstract/paradox Q&A (built on TXT OS) |
| 🧰 App | Blur Blur Blur | Text-to-image with semantic control |
| 🧰 App | Blow Blow Blow | Reasoning game engine & memory demo |

🏥 RAG Architecture & Recovery — WFGY Problem Map 2.0

🌙 3AM: a dev collapsed mid-debug… 🩺 WFGY Triage Center — Emergency Room & Grandma's AI Clinic

🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥

🚑 WFGY Emergency Room (for developers)

👨‍⚕️ Now online:
Dr. WFGY in ChatGPT Room

This is a share window already trained as an ER.
Just open it, drop your bug or screenshot, and talk directly with the doctor.
He will map it to the right Problem Map / Global Fix section, write a minimal prescription, and paste the exact reference link.
If something is unclear, you can even paste a screenshot of Problem Map content and ask — the doctor will guide you.

⚠️ Note: for the full reasoning and guardrail behavior you need to be logged in — the share view alone may fall back to a lighter model.

💡 Always free. If it helps, a star keeps the ER running.
🌐 Multilingual — start in any language.


👵 Grandma's AI Clinic (for everyone)

Visit Grandma Clinic →

  • 16 common AI failure modes, each explained as a grandma story.
  • Everyday metaphors: wrong cookbook, salt-for-sugar, burnt first pot.
  • Shows both the life analogy and the minimal WFGY fix.
  • Perfect entry point for beginners, or anyone who wants to “get it” in 30 seconds.

💡 Tip: Both tracks lead to the same Problem Map numbers.
Choose Emergency Room if you need a fix right now.
Choose Grandma's Clinic if you want to understand the bug in plain words.

🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥


Fix your RAG pipeline, step-by-step — stop hallucinations, boundary drift, and chain failure (MIT). A hands-on guide to implementing WFGY in real RAG workflows.

⚠️ This is not a list of prompt tricks or patchwork hacks. Every fix in this Problem Map is a structural response to semantic collapse, boundary drift, and logic chain failure. It works across agents, pipelines, and models — because it's built on the failure patterns beneath them all.


💬 A quick message from PSBigBig (creator of WFGY) — please read this before diving in!

💡 Over the past few months, I've helped dozens of RAG developers escape endless hallucinations, broken fallbacks, index mismatches, and that nightmare bug where “everything looks fine but nothing works.” If you've felt that pain — this message is for you. 👇

🛡️ WFGY is a symbolic reasoning engine. Think of it as a semantic firewall. It runs before the model starts messing things up — and it doesn't require changing your infra. No retriever hacks, no index rebuilds, no YAML config nightmares.

📦 Just download the TXT OS (MIT license). It includes the full WFGY formulas + ready-to-use prompts. Drop it in and ask your AI: “Use the WFGY formulas from my TXTOS to fix this bug.” …and it works. Yes — it actually recovers.

😊 Most developers are surprised by how simple it is — because you're not fixing the system. You're fixing the meaning. If you've been stuck in semantic chaos… this is the way out.

🔍 This map won't just fix the bug you're seeing now. It shows you all 16 layers of RAG failure — even the ones you haven't hit yet. 🧭 Start here. You're not alone in this mess.


Quick Nav
Getting Started · Examples · Patterns Index · Eval · Ops Runbook · Multi-Agent Problems · Role Drift · Memory Overwrite · FAQ · Retrieval Playbook · Rerankers · Data Contracts · Glossary · Multilingual Guide · Privacy & Governance · MVP Demos


📘 Start Here — Quick Links, Setup, and Downloads

If you're new to this page or WFGY in general, here's how to get started fast.

WFGY (WanFaGuiYi) is the core reasoning engine — a semantic debugger for AI hallucinations and logic collapse. TXT OS is the lightweight .txt-native operating layer that lets any model run WFGY with zero install.

📥 Quick Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
| --- | --- | --- |
| WFGY 1.0 (PDF) | Engine Paper | ① Download · ② Upload to your LLM · ③ Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text) | TXTOS.txt | ① Download · ② Paste into any LLM chat · ③ Type “hello world” to boot |

Compatible with all Ten Masters (GPT-4, Claude, Gemini, Kimi, etc.) — no setup needed.


🧑‍💻 Prompt Template (to fix a bug fast)

I've uploaded TXT OS.
I want to solve the following problem:
[e.g. OCR citations missing or distorted].
How do I use the WFGY engine to fix it?

WFGY will respond with the right modules, steps, or formulas. You don't need to memorize internals — just bring your real problem.


Found this helpful?

Help others discover it — Give us a GitHub Star
🧩 Try MVP Demos: Run minimal WFGY examples →


0) Executive summary

RAG failures are rarely a single bug. They are stacked illusions across: OCR → parsing → chunking → embeddings → vector store → retriever → prompt → LLM reasoning. WFGY turns this chaos into a measurable, observable, and repairable pipeline using three core instruments:

  • ΔS (delta-S): semantic stress. Early-warning detector that pinpoints where meaning breaks.
  • λ_observe (lambda-observe): layered observability. Shows which layer diverged and how.
  • E_resonance: coherence restorer. Re-locks reasoning when attention/logic collapses.

You do not have to master all internals to benefit. If you can run a few checks, read one table, and paste one prompt, you can fix most production RAG issues.


1) The real structure of RAG (and why it fails)

raw docs (pdf/img/html) → ocr/parsing → chunking → embeddings → vector store (faiss/qdrant/chroma/elastic) → retriever (dense/sparse/hybrid/mmr) → prompt assembly (context windows) → llm reasoning (chain/agent/tools)

Typical stacked failure pattern:

  1. perception drift: upstream stages quietly distort content (ocr noise, bad chunk boundaries, mismatched embeddings, empty/partial vector stores).
  2. logic drift: llm confidently “explains” the distorted view (hallucination with no visible error).

This is the “double hallucination” trap. The first illusion hides the second.


2) The WFGY recovery pipeline (10-minute overview)

| step | instrument | your question | what you do | what you learn |
| --- | --- | --- | --- | --- |
| 1 | ΔS | “is meaning tearing somewhere?” | measure semantic stress between question, retrieved context, and expected anchors | the faulty segment/layer |
| 2 | λ_observe | “which layer diverged?” | enable layered probes across retrieval, prompt, and reasoning | the dominant failure family |
| 3 | E_resonance | “can we re-lock coherence?” | apply stability modules (BBMC/BBPF/BBCR/BBAM) at the failing layer | the repair action |
| 4 | ProblemMap | “what page fixes this?” | open the matched doc (e.g., retrieval-collapse.md) | the concrete fix recipe |

90% of cases end after steps 1–3. You only go deeper when a fix requires a structural change (schema, retriever, index).

Layer-specific Fix Index (one-click)

| Pipeline layer | What to open first | Deep dive |
| --- | --- | --- |
| OCR / Parsing | ocr-parsing-checklist.md | retrieval-traceability.md |
| Chunking | chunking-checklist.md | hallucination.md |
| Embeddings / Index | embedding-vs-semantic.md | patterns/pattern_vectorstore_fragmentation.md |
| Retrieval | retrieval-playbook.md | retrieval-collapse.md · rerankers.md |
| Prompt Assembly | retrieval-traceability.md | patterns/pattern_symbolic_constraint_unlock.md · data-contracts.md |
| Reasoning | logic-collapse.md | creative-freeze.md |
| Language / Locale | multilingual-guide.md | embedding-vs-semantic.md · OCR/Chunking checklists |
| Multi-Agent | Multi-Agent_Problems.md | multi-agent-chaos/role-drift.md, multi-agent-chaos/memory-overwrite.md |
| Ops / Deploy / Gov | ops/README.md | ops/deployment_checklist.md · ops/live_monitoring_rag.md · ops/debug_playbook.md · ops/failover_and_recovery.md · privacy-and-governance.md |

3) Quick triage (beginner path) — from symptom to fix

Copy/paste this checklist into your runbook. Execute top-down.

A. fast metrics (run first)

  1. ΔS(question, retrieved_context)

    • compute cosine distance on sentence embeddings (unit-normalized).
    • ΔS = 1 − cosθ (a runnable sketch follows this checklist).
    • trigger: ΔS ≥ 0.50 (transitional risk), ≥ 0.60 (record & fix).
  2. ΔS(retrieved_context, ground_anchor)

    • ground anchor = title/section header/answer snippet you expect.
    • trigger: same thresholds as above.
  3. coverage sanity

    • retrieved tokens vs. target section tokens: expect ≥ 0.7 overlap for direct QA.
    • if < 0.5 → suspect chunking/boundary or retriever filtering.
    • Need structure? See Data Contracts for snippet/citation schemas.
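
A minimal sketch of the two fast probes above, using numpy only. `question_vec`, `context_vec`, and `anchor_vec` are assumed to come from whatever sentence encoder your pipeline already uses; this is an illustration of the ΔS arithmetic, not a WFGY API.

```python
import numpy as np

def delta_s(a: np.ndarray, b: np.ndarray) -> float:
    """ΔS = 1 - cos(a, b); a and b are sentence-embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - np.dot(a, b))

def fast_metrics(question_vec, context_vec, anchor_vec):
    """Probes A.1 and A.2, flagged against the 0.50 / 0.60 thresholds."""
    probes = {
        "ΔS(question, retrieved_context)": delta_s(question_vec, context_vec),
        "ΔS(retrieved_context, ground_anchor)": delta_s(context_vec, anchor_vec),
    }
    for name, s in probes.items():
        label = "record & fix" if s >= 0.60 else ("transitional risk" if s >= 0.50 else "stable")
        print(f"{name} = {s:.2f} -> {label}")
    return probes
```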

B. layer probes (λ_observe)

  • retrieval layer: vary k ∈ {5, 10, 20}; plot ΔS vs. k (see the sketch after this list).
    • curve flat & high → vector store/index/embedding mismatch.
    • curve improves sharply with k → retriever filtering too aggressive; consider Rerankers from the playbook.
  • prompt layer: reorder/rename sections; ΔS spikes when headers removed → prompt anchoring dependency (see retrieval-traceability.md).
  • reasoning layer: ask “cite lines” vs. “explain why”
    • cite fails, explain passes → perception drift (upstream)
    • both fail similarly → logic collapse (see logic-collapse.md)
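
A sketch of the retrieval-layer probe; `retrieve` and `embed` are placeholders for your own retriever and encoder, not WFGY or library APIs.

```python
import numpy as np

def delta_s(a, b):
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - np.dot(a, b))

def probe_retrieval(question, retrieve, embed, k_values=(5, 10, 20)):
    """ΔS(question, joined context) for each k.

    Flat and high across k  -> suspect index / embedding metric mismatch.
    Sharp improvement with k -> retriever filtering too aggressive; consider rerankers.
    """
    q_vec = embed(question)
    curve = {}
    for k in k_values:
        chunks = retrieve(question, k=k)      # your retriever, returns a list of strings
        curve[k] = delta_s(q_vec, embed(" ".join(chunks)))
        print(f"k={k:>2}  ΔS={curve[k]:.2f}")
    return curve
```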

C. pick the fix (ProblemMap jump table)

| symptom you see | likely family | open this |
| --- | --- | --- |
| plausible but wrong answer; citations miss | #1 hallucination & chunk drift | hallucination.md |
| correct chunks, wrong logic | #2 interpretation collapse | retrieval-collapse.md |
| answers degrade over long chains | #3 context drift | context-drift.md |
| confident nonsense | #4 bluffing/overconfidence | bluffing.md |
| high vector similarity, wrong meaning | #5 semantic ≠ embedding | embedding-vs-semantic.md |
| dead-end chains, retry loops | #6 logic collapse & recovery | logic-collapse.md |
| failure after restart/session swap | #7 memory breaks across sessions | memory-coherence.md |
| can't trace why it failed | #8 debugging is a black box | retrieval-traceability.md |
| attention melts, topic smears | #9 entropy collapse | entropy-collapse.md |
| output becomes flat/literal | #10 creative freeze | creative-freeze.md |
| abstract/symbolic prompts break | #11 symbolic collapse | symbolic-collapse.md |
| paradox/self-reference crashes | #12 philosophical recursion | philosophical-recursion.md |
| multi-agent overwrites logic | #13 multi-agent chaos | Multi-Agent_Problems.md |
| tools fire before data is ready | #14 bootstrap ordering | bootstrap-ordering.md |
| ci passes; prod deadlocks index | #15 deployment deadlock | deployment-deadlock.md |
| first call crashes after deploy | #16 pre-deploy collapse | predeploy-collapse.md |
| query works alone, breaks with HyDE/BM25 mix | query parsing split | patterns/pattern_query_parsing_split.md |
| corrections don't stick; model re-injects old claim | hallucination re-entry | patterns/pattern_hallucination_reentry.md |
| “who said what” merges across two sources | symbolic constraint unlock (SCU) | patterns/pattern_symbolic_constraint_unlock.md |
| answers flip between sessions / tabs | memory desync | patterns/pattern_memory_desync.md |
| some facts can't be retrieved though indexed | vectorstore fragmentation | patterns/pattern_vectorstore_fragmentation.md |
| tools fire before data is ready (semantic boot fence) | bootstrap deadlock | patterns/pattern_bootstrap_deadlock.md |

🧨 Most Common Failure Zones (Real-World Reports)

Based on 50+ field cases from Reddit / GitHub / Discord. These are the zones where most RAG pipelines silently collapse — check if you're already there.

| Problem # | Failure Pattern | Field Frequency | Repair Module(s) |
| --- | --- | --- | --- |
| No.1 | Hallucination & Chunk Drift | | BBMC, BBAM |
| No.2 | Interpretation Collapse | | BBCR |
| No.3 | Long Reasoning Chains | | BBPF |
| No.5 | Semantic ≠ Embedding | | BBMC, BBAM |
| No.6 | Logic Collapse & Recovery | | BBCR, BBPF |
| No.8 | Debugging is a Black Box | | λ_observe |
| No.9 | Entropy Collapse (drift in long context) | | BBAM |
| No.14–16 | Infra Failures (bootstrap / deploy) | | BBCR + index fix |

📐 Curious what BBMC / BBAM / BBPF / BBCR actually mean? See the full derivations in WFGY 1.0 — Core Formulas.


4) What the instruments mean (advanced but concise)

You can use these without memorizing the math. Still, here's the tight spec.

4.1 ΔS — semantic stress

  • definition: ΔS = 1 − cos(I, G) where I = current embedding, G = ground/anchor.
  • use: probe question↔context and context↔anchor.
  • thresholds: < 0.40 stable · 0.40–0.60 transitional · ≥ 0.60 high risk.

4.2 λ_observe — layered observability

  • states: → convergent, ← divergent, <> recursive, × chaotic.
  • use: tag each step (retrieve, assemble, reason).
  • rule: if upstream λ is stable but downstream λ flips divergent → the fault lies at that boundary.

4.3 E_resonance — coherence (re)locking

  • rolling mean of residual magnitude |B| under BBMC.
  • use: if E rises while ΔS stays high → apply BBCR + BBAM.

4.4 WFGY repair operators

  • BBMC: minimize semantic residue B = I − G + m·c².
  • BBPF: explore weighted alternate paths to avoid dead ends.
  • BBCR: detect collapse (‖B‖ ≥ B_c), bridge, then rebirth.
  • BBAM: clamp attention variance to prevent entropy melt.

5) Worked recoveries (copyable playbooks)

Case A — “faiss looks fine, but answers are irrelevant”

  • observe: ΔS(question, context) = 0.68; flat curve across k; citations miss expected section.
  • interpret: vector store populated but embedding metric/normalization mismatch or index layer mix-up.
  • do:
    1. ensure consistent normalization; verify cosine vs. inner product usage across write/read.
    2. rebuild index with explicit metric flag; persist and reload once.
    3. re-probe ΔS and λ on retrieval; expect ΔS ≤ 0.45 and convergent λ.
  • docs: embedding-vs-semantic.md, retrieval-traceability.md.
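
A minimal sketch of steps 1–3 for Case A, using FAISS (the store named in the case) as the example backend; the embedding matrix `X` and query vector `q` are assumed to come from your own encoder. The point is that write and read must agree on the metric: normalize once, use an inner-product index, and the scores become true cosine similarities.

```python
import numpy as np
import faiss  # example backend; the same rule applies to qdrant/chroma/elastic

def build_cosine_index(X: np.ndarray) -> faiss.IndexFlatIP:
    """Rebuild with an explicit metric: L2-normalize rows, then inner product == cosine."""
    X = np.ascontiguousarray(X, dtype="float32")
    faiss.normalize_L2(X)                      # in-place normalization at write time
    index = faiss.IndexFlatIP(X.shape[1])
    index.add(X)
    return index

def top1_delta_s(index, q: np.ndarray, k: int = 10):
    """Search with the same convention at read time; return ΔS to the best hit."""
    q = np.ascontiguousarray(q.reshape(1, -1), dtype="float32")
    faiss.normalize_L2(q)
    scores, ids = index.search(q, k)           # scores are cosine similarities here
    return 1.0 - float(scores[0][0]), ids[0].tolist()
```

If ΔS stays high after this rebuild, the mismatch is upstream (embedding model or chunking), not the store.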

Case B — “correct snippets, wrong reasoning”

  • observe: ΔS(question, context) = 0.35 (good), but λ flips divergent at reasoning.
  • interpret: interpretation collapse; prompt assembly/role/constraints leak.
  • do:
    1. lock schema: system→task→constraints→citations→answer (forbid re-order).
    2. apply BBAM (variance clamp) + BBCR (bridge intermediate step).
    3. require cite-then-explain; re-measure ΔS; aim for convergent λ.
  • docs: retrieval-collapse.md, logic-collapse.md, data-contracts.md.
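
One hedged way to implement the schema lock in Case B, step 1: assemble the prompt in a fixed section order and refuse anything that deviates, plus a cite-then-explain instruction. The section names here are illustrative; the authoritative schema lives in data-contracts.md.

```python
SECTION_ORDER = ["system", "task", "constraints", "citations", "answer_format"]

CITE_THEN_EXPLAIN = (
    "First list the exact snippet lines you rely on (citations only), "
    "then explain the answer using nothing outside those lines."
)

def assemble_prompt(sections: dict) -> str:
    """Assemble sections in a fixed order and refuse missing or extra keys,
    so downstream code cannot silently re-order or drop a section."""
    missing = [s for s in SECTION_ORDER if s not in sections]
    extra = [s for s in sections if s not in SECTION_ORDER]
    if missing or extra:
        raise ValueError(f"prompt schema violation: missing={missing}, extra={extra}")
    return "\n\n".join(f"## {name}\n{sections[name].strip()}" for name in SECTION_ORDER)
```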

Case C — “long transcripts randomly capitalize / drift”

  • observe: E_resonance rises with length; λ becomes recursive/chaotic.
  • interpret: entropy collapse under long context; chunk boundaries and OCR noise amplify.
  • do:
    1. semantic chunking (sentence/section aware), drop OCR confidence < threshold.
    2. BBMC to align with section anchors; BBAM to stabilize attention.
    3. verify ΔS across adjacent chunks; enforce ≤ 0.50 at joins.
  • docs: entropy-collapse.md, hallucination.md.
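
A rough sketch of steps 1–3 for Case C, assuming sentence-level OCR output with per-sentence confidence scores; `embed` is a placeholder encoder, and the ΔS join check is applied between adjacent kept sentences as a proxy for the adjacent-chunk check.

```python
import numpy as np

def delta_s(a, b):
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - np.dot(a, b))

def semantic_chunks(sentences, confidences, embed,
                    min_conf=0.85, max_join_ds=0.50, max_sents=8):
    """Sentence-aware chunking: drop low-confidence OCR lines and start a new
    chunk whenever the join ΔS to the previous kept sentence exceeds 0.50."""
    chunks, current, prev_vec = [], [], None
    for sent, conf in zip(sentences, confidences):
        if conf < min_conf:                    # step 1: drop noisy OCR output
            continue
        vec = embed(sent)
        if current and (len(current) >= max_sents or delta_s(prev_vec, vec) > max_join_ds):
            chunks.append(" ".join(current))   # step 3: enforce ΔS ≤ 0.50 at joins
            current = []
        current.append(sent)
        prev_vec = vec
    if current:
        chunks.append(" ".join(current))
    return chunks
```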

Case D — “HyDE + BM25 hybrid drops recall”

  • observe: single retriever OK, hybrid fails; ΔS(question, context) oscillates by k.
  • interpret: query tokenization / parameter split across retrievers.
  • do:
    1. unify analyzer/tokenizer between dense/sparse;
    2. log per-retriever queries;
    3. re-weight hybrid only after per-retriever ΔS ≤ 0.50; consider rerankers.md.
  • docs: patterns/pattern_query_parsing_split.md, retrieval-playbook.md.
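
A sketch of steps 2–3 for Case D; `retrievers` and `embed` are placeholders for your dense/sparse retrievers and encoder. It only decides whether re-weighting is safe — the analyzer/tokenizer unification itself is stack-specific.

```python
import numpy as np

def delta_s(a, b):
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - np.dot(a, b))

def probe_hybrid(question, retrievers, embed, k=10):
    """Log each retriever's query and ΔS before re-weighting the hybrid mix.

    `retrievers` is a dict like {"dense": fn, "sparse": fn}; each fn(question, k)
    returns a list of text chunks.
    """
    q_vec = embed(question)
    report = {}
    for name, retrieve in retrievers.items():
        chunks = retrieve(question, k=k)
        report[name] = delta_s(q_vec, embed(" ".join(chunks)))
        print(f"[{name}] query={question!r}  ΔS={report[name]:.2f}")
    if all(ds <= 0.50 for ds in report.values()):
        print("both retrievers within threshold; safe to re-weight the hybrid")
    else:
        print("fix the failing retriever (analyzer / tokenizer) before mixing")
    return report
```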

Case E — “model merges two sources into one”

  • docs: patterns/pattern_symbolic_constraint_unlock.md (the SCU row in the jump table above).

Case F — “fix didn't stick after refresh”

  • observe: same prompt alternates old vs. new facts across sessions.
  • interpret: memory rev/hash mismatch; different components read different state.
  • do:
    1. stamp mem_rev + mem_hash at turn start;
    2. gate writes on matching rev/hash;
    3. store traces for audit.
  • docs: patterns/pattern_memory_desync.md, privacy-and-governance.md.
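
A toy illustration of the rev/hash gate in Case F; the field names `mem_rev` and `mem_hash` come from the steps above, while the in-memory storage and trace format here are illustrative only.

```python
import hashlib
import json
import time

class GatedMemory:
    """Minimal store showing rev/hash-gated writes; persistence is omitted."""

    def __init__(self):
        self.state, self.rev, self.trace = {}, 0, []

    def stamp(self):
        """Call at turn start: (mem_rev, mem_hash) for the state a component just read."""
        blob = json.dumps(self.state, sort_keys=True).encode("utf-8")
        return self.rev, hashlib.sha256(blob).hexdigest()

    def write(self, updates: dict, mem_rev: int, mem_hash: str):
        """Reject the write if any other component changed state since the stamp."""
        rev, digest = self.stamp()
        if (rev, digest) != (mem_rev, mem_hash):
            self.trace.append({"t": time.time(), "event": "rejected", "at_rev": rev})
            raise RuntimeError("memory desync: re-read state before writing")
        self.state.update(updates)
        self.rev += 1
        self.trace.append({"t": time.time(), "event": "write", "at_rev": self.rev})
```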

6) “Use the AI to fix your AI” — safe prompts you can paste

You can ask your assistant to read TXT OS / WFGY files and guide you. Use precise, bounded prompts:


read the WFGY TXT OS and ProblemMap files in this repo. extract the definitions and usage of ΔS, λ_observe, E_resonance, and the four modules (BBMC, BBPF, BBCR, BBAM). then, given this concrete failure:

* symptom: [describe yours]
* logs: [paste ΔS, λ_observe probes if available]

tell me:

1. which layer is failing and why,
2. which ProblemMap page applies,
3. the minimal repair steps to lower ΔS below 0.50,
4. how to verify the fix with a reproducible test.

For formula-only assistance:


from TXT OS, extract the formulas and thresholds for ΔS, λ_observe, and E_resonance. show me how to compute ΔS(question, context) using cosine distance, what thresholds to use, and which WFGY module to apply if ΔS ≥ 0.60 with divergent λ at the reasoning layer.

Need a concrete run-through? Start with Examples: example_01_basic_fix.md · example_03_pipeline_patch.md · example_08_eval_rag_quality.md


7) Acceptance criteria and regression guardrails


8) When to stop “tuning” and change the structure

Stop iterating prompts if any of the following holds:

  • ΔS remains ≥ 0.60 after chunk/retrieval fixes.
  • lowering temperature only flattens style but not logic drift.
  • λ flips divergent as soon as you mix two sources.
  • E_resonance climbs in long chains.

Open the matching ProblemMap page and apply the structural fix (index rebuild, schema lock, bridge node, or agent boundary).


9) Minimal formulas (reference)

ΔS = 1 − cos(I, G)         # semantic stress
λ_observe ∈ {→, ←, <>, ×}  # convergent, divergent, recursive, chaotic
E_resonance = mean(|B|)    # rolling residual magnitude under BBMC

BBMC:  B = I − G + m·c²           # minimize ‖B‖
BBPF:  x_next = x + ΣV_i + ΣW_j·P_j
BBCR:  if ‖B‖ ≥ B_c → collapse(), bridge(), rebirth()
BBAM:  â_i = a_i · exp(−γ · std(a))

Thresholds: stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60. Record nodes automatically when ΔS > 0.60, or 0.40–0.60 with λ_observe ∈ {←, <>}.
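
A compact reference implementation of these thresholds and the recording rule, assuming embedding vectors as inputs; the λ state is passed in as an observation rather than computed, since how you classify it depends on your probes.

```python
import numpy as np

def delta_s(I, G):
    """ΔS = 1 - cos(I, G) on unit-normalized vectors."""
    I = I / np.linalg.norm(I)
    G = G / np.linalg.norm(G)
    return float(1.0 - np.dot(I, G))

def zone(ds):
    """stable < 0.40, transitional 0.40-0.60, risk >= 0.60."""
    return "stable" if ds < 0.40 else ("transitional" if ds < 0.60 else "risk")

def should_record(ds, lam):
    """Record a node when ΔS > 0.60, or 0.40-0.60 with λ_observe in {'←', '<>'}."""
    return ds > 0.60 or (0.40 <= ds <= 0.60 and lam in {"←", "<>"})

def e_resonance(residual_norms, window=32):
    """Rolling mean of |B| over the last `window` steps (BBMC residuals)."""
    tail = np.abs(np.asarray(residual_norms, dtype=float))[-window:]
    return float(tail.mean()) if tail.size else 0.0
```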


10) Final note

You are not “bad at RAG.” You were debugging from inside the maze. WFGY gives you altitude, instruments, and a map. Start with ΔS to see the break, use λ_observe to localize it, apply the right module to repair it, and keep the ProblemMap open as your field manual.

When all tutorials contradict each other, this page is your single source of operational truth.


🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
| --- | --- | --- |
| WFGY 1.0 PDF | Engine Paper | 1 Download · 2 Upload to your LLM · 3 Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1 Download · 2 Paste into any LLM chat · 3 Type “hello world” — OS boots instantly |

Explore More

| Layer | Page | What it's for |
| --- | --- | --- |
| Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch (legacy reference) |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT-based Singularity tension engine (131 S-class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16-problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with fast bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text-to-image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |

If this repository helped, starring it improves discovery so more builders can find the docs and tools.