WFGY/ProblemMap
2026-03-31 20:46:43 +08:00

🧭 Not sure where to start? Open the WFGY Engine Compass

WFGY System Map · Quick navigation

Problem Maps: PM1 taxonomy → PM2 debug protocol → PM3 troubleshooting atlas · built on the WFGY engine series

| Layer | Page | What it's for |
|---|---|---|
| Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT-based Singularity tension engine (131 S-class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16-problem RAG failure taxonomy and fix map (🔴 YOU ARE HERE 🔴) |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with 60-second bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text-to-image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |

🏥 WFGY Problem Map 1.0 · bookmark it. you'll need it

🛡️ reproducible AI bugs, structurally reduced at the reasoning layer

🌐 Recognition & ecosystem integration

As of 2026-03, the WFGY RAG 16 Problem Map line has been adopted or referenced by
20+ frameworks, academic labs, and curated lists in the RAG and agent ecosystem.
Most external references use the WFGY ProblemMap as a diagnostic layer for RAG / agent pipelines,
not the full WFGY product stack.
A smaller but growing set also uses WFGY 3.0 · Singularity Demo as a long-horizon TXT stress test.

Some representative integrations:

| Project | Segment | How it uses WFGY ProblemMap | Proof (PR / doc) |
|---|---|---|---|
| LlamaIndex | Mainstream RAG infra | Integrates the WFGY 16-problem RAG failure checklist into its official RAG troubleshooting docs as a structured failure mode reference. | PR #20760 |
| RAGFlow | Mainstream RAG engine | Introduced a RAG failure modes checklist guide to the RAGFlow documentation via PR, adapted from the WFGY 16-problem failure map for step-by-step RAG pipeline diagnostics. | PR #13204 |
| FlashRAG (RUC NLPIR Lab) | Academic lab / RAG research toolkit | Adapts the WFGY ProblemMap as a structured RAG failure checklist in its documentation. The 16-mode taxonomy is cited to support reproducible debugging and systematic failure-mode reasoning for RAG experiments. | PR #224 |
| DeepAgent (RUC NLPIR Lab) | Academic lab / agent research | Adds a multi-tool agent failure modes troubleshooting note inspired by WFGY-style debugging concepts for diagnosing tool selection loops, tool misuse, and multi-tool workflow failures in agent pipelines. | PR #15 |
| ToolUniverse (Harvard MIMS Lab) | Academic lab / tools | Provides a WFGY_triage_llm_rag_failure tool that wraps the 16-mode map for incident triage. | PR #75 |
| Rankify (University of Innsbruck) | Academic lab / system | Uses the 16 failure patterns in RAG and re-ranking troubleshooting docs. | PR #76 |
| Multimodal RAG Survey (QCRI LLM Lab) | Academic lab / survey | Cites WFGY as a practical diagnostic resource for multimodal RAG. | PR #4 |
| LightAgent | Agent framework | Incorporates WFGY ProblemMap concepts into its documentation via a Multi-agent troubleshooting (failure map) section, providing a structured symptom → failure-mode → debugging checklist for diagnosing role drift, cross-agent memory issues, and coordination failures in multi-agent systems. | PR #24 |
| OmniRoute | Gateway / routing infra | Adds an optional WFGY 16-problem RAG / LLM failure taxonomy to its official troubleshooting documentation, allowing teams to classify downstream RAG and agent failures with No.1 to No.16 alongside OmniRoute logs when the gateway itself appears healthy. | PR #164 |

For the complete 20+ project list (frameworks, benchmarks, curated lists), see the 👉 WFGY Recognition Map

If your project uses the WFGY ProblemMap and you would like to be listed,
feel free to open an issue or pull request in this repository.


🌙 3AM: a dev collapsed mid-debug… 🩺 WFGY Triage Center: Emergency Room & Grandma's AI Clinic

🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥

🚑 WFGY Emergency Room (for developers)

👨‍⚕️ Now online: Dr. WFGY in ChatGPT Room

This is a share window already trained as an ER. Just open it, drop your bug or screenshot, and talk directly with the doctor. He will map it to the right Problem Map / Global Fix section, write a minimal prescription, and paste the exact reference link. If something is unclear, you can even paste a screenshot of Problem Map content and ask — the doctor will guide you.

⚠️ Note: for the full reasoning and guardrail behavior you need to be logged in; the share view alone may fall back to a lighter model.

💡 Always free. If it helps, a star keeps the ER running. 🌐 Multilingual — start in any language.


👵 Grandma's AI Clinic (for everyone)

Visit Grandma Clinic →

  • 16 common AI failure modes, each explained as a grandma story.
  • Everyday metaphors: wrong cookbook, salt-for-sugar, burnt first pot.
  • Shows both the life analogy and the minimal WFGY fix.
  • Perfect entry point for beginners, or anyone who wants to “get it” in 30 seconds.

💡 Tip: Both tracks lead to the same Problem Map numbers. Choose Emergency Room if you need a fix right now. Choose Grandma's Clinic if you want to understand the bug in plain words.

🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥


⏱️ 60 seconds: WFGY as a semantic firewall. before vs after

most fixes today happen AFTER generation:

  • the model outputs something wrong, then we patch it with retrieval, chains, or tools.
  • the same failures reappear again and again.

WFGY inverts the sequence. BEFORE generation:

  • it inspects the semantic field (tension, residue, drift signals).
  • if the state is unstable, it loops, resets, or redirects the path.
  • only a stable semantic state is allowed to generate output.

this is why, once a failure mode is clearly mapped and monitored under the same conditions, it tends to stay fixed for that configuration.
you're not only firefighting after the fact; you're installing a reasoning firewall at the entry point of that stack.
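
The before-generation flow described above can be sketched as a small gate in front of the model call. This is a minimal illustration under stated assumptions, not WFGY's actual implementation: `SemanticState`, `measure_state`, `redirect`, the 0.45 cutoff, and the retry limit are hypothetical names and values chosen for the example, and this page does not specify how tension or drift signals are computed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SemanticState:
    """Hypothetical pre-generation signals; how they are measured is up to you."""
    delta_s: float     # drift signal, lower is more stable (assumed 0..1 scale)
    converging: bool   # whether the reasoning path is settling (assumed flag)


def gated_generate(
    prompt: str,
    measure_state: Callable[[str], SemanticState],
    generate: Callable[[str], str],
    redirect: Callable[[str], str],
    max_delta_s: float = 0.45,
    max_attempts: int = 3,
) -> Optional[str]:
    """Generate only when the measured state is stable; otherwise redirect and retry."""
    current = prompt
    for _ in range(max_attempts):
        state = measure_state(current)
        if state.delta_s <= max_delta_s and state.converging:
            return generate(current)   # stable: allowed to produce output
        current = redirect(current)    # unstable: reset or re-ground the prompt
    return None                        # never stabilized: refuse to generate
```

Returning `None` instead of a best-effort answer is a design choice for the sketch: it makes "no stable state was reached" visible to the caller rather than hiding it behind an output.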


📊 Before vs After

| | Traditional Fix (After Generation) | WFGY Semantic Firewall (Before Generation) 🏆 |
|---|---|---|
| Flow | Output → detect bug → patch manually | Inspect semantic field → only a stable state is allowed to generate |
| Method | Add rerankers, regex, JSON repair, tool patches | ΔS, λ, coverage checked upfront; loop/reset if unstable |
| Cost | High: every bug = new patch, risk of conflicts | Lower: once mapped, the bug usually stops recurring under the same assumptions |
| Ceiling | Often plateaus around 70–85% stability in practice | In internal tests, 90–95%+ stability observed on selected stacks; not a universal guarantee |
| Experience | Firefighting, “whack-a-mole” debugging | Structural firewall, “fix once, tends to stay fixed for that setup” |
| Complexity | Growing patch jungle, fragile pipelines | Unified acceptance targets, one-page repair guide |

Performance impact

  • Traditional patching: in our internal experience, stability often plateaus around 70–85%. Each new patch adds complexity and potential regressions.
  • WFGY firewall: in internal experiments on a small number of RAG/agent pipelines, we have seen 90–95%+ stability and roughly 60–80% reductions in repeat-debug time once failure families are properly mapped. These numbers are setup-dependent and should be treated as indicative, not as hard promises.
  • Unified metrics: in our own recipes, every fix is measured (for example ΔS ≤ 0.45, coverage ≥ 0.70, λ convergent) so that acceptance is explicit rather than based on gut feeling.
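
The example targets in the last bullet (ΔS ≤ 0.45, coverage ≥ 0.70, λ convergent) can be turned into an explicit check. The sketch below is only illustrative: `AcceptanceTargets` and `accept` are hypothetical names, and it assumes you already have ΔS, coverage, and a convergence flag measured for your own stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceTargets:
    max_delta_s: float = 0.45    # ΔS ≤ 0.45 (example target from the text)
    min_coverage: float = 0.70   # coverage ≥ 0.70 (example target from the text)


def accept(delta_s: float, coverage: float, lambda_convergent: bool,
           targets: AcceptanceTargets = AcceptanceTargets()) -> list[str]:
    """Return the list of failed targets; an empty list means the fix is accepted."""
    failures = []
    if delta_s > targets.max_delta_s:
        failures.append(f"delta_s {delta_s:.2f} > {targets.max_delta_s:.2f}")
    if coverage < targets.min_coverage:
        failures.append(f"coverage {coverage:.2f} < {targets.min_coverage:.2f}")
    if not lambda_convergent:
        failures.append("lambda not convergent")
    return failures
```

Returning the failed targets rather than a bare boolean keeps the reason for a rejection in the log, which is the point of making acceptance explicit.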

🛑 Key notes

  • This is not a plugin or SDK — it runs as plain text, zero infra changes.
  • You should apply acceptance targets: don't just eyeball; log ΔS and λ (or equivalent) to confirm for your own stack.
  • Once acceptance holds, we treat that path as sealed for that configuration. If drift recurs after model, data, or prompt changes, we treat it as a new failure mode that needs mapping, not a simple re-fix of the old one. Ongoing monitoring is still required.
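
One way to make "sealed for that configuration" concrete is to key the seal on a fingerprint of the model, data, and prompt, so any change to those produces a new, unsealed entry. This is a sketch under assumptions: `config_fingerprint` and `SealRegistry` are hypothetical helpers, and the three fields hashed here are simply the change sources named in the note above.

```python
import hashlib
import json


def config_fingerprint(model: str, data_version: str, prompt_template: str) -> str:
    """Stable short hash of the parts of the stack that can invalidate a seal."""
    payload = json.dumps(
        {"model": model, "data": data_version, "prompt": prompt_template},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class SealRegistry:
    """Tracks which (problem number, configuration) pairs have passed acceptance."""

    def __init__(self) -> None:
        self._sealed: set[tuple[int, str]] = set()

    def seal(self, problem_no: int, fingerprint: str) -> None:
        self._sealed.add((problem_no, fingerprint))

    def is_sealed(self, problem_no: int, fingerprint: str) -> bool:
        return (problem_no, fingerprint) in self._sealed
```

With this shape, swapping the model or editing the prompt template changes the fingerprint, so the old seal no longer applies and the path has to pass acceptance again.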

Summary: Others patch symptoms AFTER output. WFGY blocks unstable states BEFORE output.
That is why it often feels less like debugging, more like installing structural guardrails — risk-reducing heuristics, not a mathematical guarantee.


WFGY Problem Map = a reasoning layer for your AI. load TXT OS or WFGY Core, then ask: “which problem map number am i hitting?” you'll get a diagnosis and exact fix steps, with no infra changes required.

(tip: you can even paste the Problem Map page or a screenshot into the AI, and it will point you to the right number automatically.)

16 reproducible failure modes (e.g. RAG drift, broken indexes), each with a clear fix (MIT). A semantic firewall you install once, after which the same failure pattern tends to stay fixed under the same setup.

most readers found this map useful and left a ⭐. if it helps you too, please star it so others can discover it.

quick-start downloads (60 sec)

new here? skip the map. grab TXT OS or the WFGY PDF, boot, then ask your model: “answer using WFGY: ” or “which Problem Map number am i hitting?”

| tool | link | 3-step setup |
|---|---|---|
| WFGY 1.0 PDF | engine paper | 1) download 2) upload to your LLM 3) ask: “answer using WFGY + ” |
| TXT OS | TXTOS.txt | 1) download 2) paste into any LLM chat 3) type “hello world” to boot |

start here

  • RAG broke → open Retrieval Playbook and RAG Architecture & Recovery
  • Agents drift or loop → open Agents & Orchestration or Safety_PromptIntegrity
  • Local model feels unstable → open LocalDeploy_Inference and Embeddings: Metric Mismatch

💥 WFGY Global Fix Map: full index

🗺️ This is the panoramic index: all common AI infra / RAG / reasoning errors are organized here by category. Prefer Quick Access — it is the fastest way to self-orient, understand how this system works, and jump to the right fix: Quick Access. If you want the full folder view, open the Global Fix Map home: Global Fix Map README.


🧭 Providers & Agents

| Family | Coverage (all links) | Notes |
|---|---|---|
| LLM Providers | OpenAI · Azure OpenAI · Anthropic · Claude (Anthropic) · Google Gemini · Google Vertex AI · Mistral · Meta LLaMA · Cohere · DeepSeek · Kimi (Moonshot) · Groq · xAI Grok · AWS Bedrock · OpenRouter · Together AI | vendor-specific quirks, schema drift, API limits |
| Agents & Orchestration | Autogen · CrewAI · Haystack Agents · LangChain · LangGraph · LlamaIndex · OpenAI Assistants v2 · Rewind Agents · Semantic Kernel · Smolagents | orchestration bugs, cold boot order, role mixing |
| Chatbots & CX | Amazon Lex · Azure Bot Service · Dialogflow CX · Freshchat · Freshdesk · Intercom · Microsoft Copilot Studio · Rasa · Salesforce Einstein Bots · Twilio Studio · Watson Assistant · Zendesk | bot frameworks, CX stack, handoff gaps |
| Cloud Serverless | Cold Start Concurrency · Timeouts & Streaming Limits · Stateless KV/Queue Patterns · Edge Cache Invalidation · Network Egress & VPC · Deploy Traffic Shaping · Secrets Rotation · Serverless Limits Matrix · Multi-Region Routing · Failover Drills · Observability & SLO · Canary Release (Serverless) · Blue-Green Switchovers · Disaster Recovery · Data Retention & Backups · Privacy & PII Edges | infra stability, migration, compliance |

🧭 Data & Retrieval

| Family | Coverage (all links) | Notes |
|---|---|---|
| Vector DBs & Stores | FAISS · Chroma · Qdrant · Weaviate · Milvus · pgvector · Redis · Elasticsearch · Pinecone · Typesense · Vespa | metric, analyzer, index hygiene |
| RAG + VectorDB | Metric Mismatch · Normalization & Scaling · Tokenization & Casing · Chunking → Embedding Contract · Vectorstore Fragmentation · Dimension Mismatch & Projection · Update & Index Skew · Hybrid Retriever Weights · Duplication & Collapse · Poisoning & Contamination | store-agnostic knobs |
| Retrieval | Retrieval Playbook · Traceability · Rerankers · Query Parsing Split · Chunk Alignment · ΔS Probes · Eval Recipes · Store-Agnostic Guardrails | end-to-end routing & contracts |
| Embeddings | Metric Mismatch · Normalization & Scaling · Tokenization & Casing · Chunking → Embedding Contract · Vectorstore Fragmentation · Dimension Mismatch & Projection · Update & Index Skew · Hybrid Retriever Weights · Duplication & Collapse · Poisoning & Contamination | embedding≠semantic checks |
| Chunking | Chunk ID Schema · Checklist · Code / Tables / Blocks · Section Detection · Title Hierarchy · PDF Layouts & OCR · Reindex & Migration · Eval Precision & Recall · Live Monitoring | chunk/section discipline |
| RAG | Retrieval Drift · Hallucination RAG · Citation Break · Hybrid Failure · Index Skew · Context Drift · Entropy Collapse · Eval Drift | visual routes, acceptance targets |

🧭 Input & Parsing

| Family (link) | Coverage (all links) | Notes |
|---|---|---|
| DocumentAI_OCR | Tesseract · Google Document AI · AWS Textract · Azure OCR · ABBYY · PaddleOCR | pre-embedding text integrity |
| OCR_Parsing | Layout, Headers, Footers · Tokenization & Casing · Tables & Columns · Images & Figures · Scanned PDFs & Quality · Multi-language & Fonts | parser rails & checks |
| Language | Tokenizer Mismatch · Script Mixing · Locale Drift · Multilingual Guide · Proper Noun Aliases · Romanization & Transliteration · Query Language Detection · Query Routing & Analyzers · Hybrid Ranking (Multilingual) · Stopword & Morphology Controls · Fallback Translation & Glossary Bridge · Code-Switching Eval | cross-script retrieval stability |
| LanguageLocale | Tokenizer Mismatch (cross-lang) · Script Mixing (single query) · Locale Drift & Analyzer Skew · Unicode Normalization · CJK Segmentation / Word-break · Fullwidth vs Halfwidth, Punctuation · Diacritics & Folding · RTL / BiDi Control · Transliteration & Romanization · Locale Collation & Sort Keys · Numbering & Sort Orders · Date/Time Format Variants · Timezones & DST · Keyboard Input Methods · Input Language Switching · Emoji, ZWJ, Grapheme Clusters · Mixed-Locale Metadata | analyzer / normalization profiles |

🧭 Reasoning & Memory

| Family | Coverage (all links) | Notes |
|---|---|---|
| Reasoning | Entropy Overload · Recursive Loop · Hallucination Re-entry · Logic Collapse · Symbolic Collapse · Proof Dead Ends · Anchoring & Bridge Proofs · Context Stitching & Window Joins · Chain-of-Thought Variance Clamp · Redundant Evidence Collapse | BBMC / BBPF / BBCR / BBAM rails |
| MemoryLongContext | Memory Coherence · Entropy Collapse · Context Drift · Ghost Context · State Fork · Pattern Memory Desync · OCR Jitter · OCR Parsing Checklist · Data Contracts · Retrieval Traceability · Chunking Checklist | Long-window guardrails |
| Multimodal_LongContext | Alignment Drift · Anchor Misalignment · Boundary Fade · Caption Collapse · Cross-Modal Bootstrap · Cross-Modal Trace · Desync Amplification · Desync Anchor · Echo Loop · Fusion Blindspot · Fusion Latency · Modal Bridge Failure · Modality Dropout · Modality Swap · Multi-Hop Collapse · Multi-Seed Consistency · Multimodal Fusion Break · Phantom Visuals · Reference Bleed · Semantic Anchor Shift · Signal Drop · Spatial Fusion Error · Sync Loop · Time Sync Failure · Visual Anchor Shift | Multimodal joins & anchors |

🧭 Automation & Ops

| Family (link) | Coverage examples | Notes |
|---|---|---|
| Automation | Zapier · n8n · Make · Retool · IFTTT · Pipedream · Power Automate · GitHub Actions · Airflow · Airtable · Asana · GoHighLevel · Parabola · LangChain (automation) · LlamaIndex (automation) | idempotency, warmups, fences |
| OpsDeploy | Blue-Green Switchovers · Cache Warmup · DB Migration Guardrails · Feature Flags · Idempotency Dedup · Incident Comms · Postmortem & Regression · Rate Limit Backpressure · Read-Only Mode · Release Calendar · Retry & Backoff · Rollback & Recovery · Rollout Gate · Shadow Traffic · Staged Canary · Vector Index Swap · Version Pinning | prod safety rails |
| Safety_PromptIntegrity | Prompt Injection · Jailbreaks & Overrides · Role Confusion · Memory Fences · JSON & Tools · Citation First · Tool Selection & Timeouts · System/User/Role Order · Template Library · Eval Prompts | schema locks |
| PromptAssembly | Anti-Injection Recipes · Citation First · Eval Prompts · JSON Mode & Tools · Memory Fences · System/User/Role Order · Template Library · Tool Selection & Timeouts | contract & eval kits |
| LocalDeploy_Inference | AutoGPTQ · AWQ · BitsAndBytes · CTransformers · ExLlama · ExLlamaV2 · GPT4All · Jan · KoboldCpp · llama.cpp · LMStudio · Ollama · Textgen-WebUI · TGI · vLLM | local stack guardrails |
| DevTools_CodeAI | GitHub Copilot · Cursor · Sourcegraph Cody · VSCode Copilot Chat · Codeium · Tabnine · AWS CodeWhisperer · JetBrains AI Assistant | IDE/assist rails |

🧭 Eval & Governance

| Family (link) | Coverage examples | Notes |
|---|---|---|
| Eval | Eval_Benchmarking · Eval_Cost_Reporting · Eval_Cross_Agent_Consistency · Eval_Harness · Eval_Latency_vs_Accuracy · Eval_Operator_Guidelines · Eval_RAG_Precision_Recall · Eval_Semantic_Stability · Goldset_Curation | SDK-free evals |
| Eval_Observability | Alerting_and_Probes · Coverage_Tracking · DeltaS_Thresholds · Eval_Playbook · Lambda_Observe · Metrics_and_Logging · Regression_Gate · Variance_and_Drift | drift alarms |
| Governance | Audit_and_Logging · Audit_Logs_and_Traceability · Data_Lineage_and_Provenance · Escalation_and_Governance · Ethics_and_Bias_Mitigation · Eval_Governance_Gates_and_Signoff · Incident_Response_and_Postmortems · License_and_Dataset_Rights · Model_Governance_Model_Cards_and_Releases · PII_Handling_and_Minimization · Policy_Baseline · Prompt_Policy_and_Change_Control · Regulatory_Alignment · Risk_Register_and_Waivers · Roles_and_Access_RBAC_ABAC · Transparency_and_Explainability | program-level rails |
| Enterprise_Knowledge_Gov | Access_Control · Audit_and_Traceability · Compliance · Compliance_Audit · Data_Residency · Data_Sensitivity · Knowledge_Expiry · Retention_Policy | knowledge governance |

semantic memory & reasoning fix in action

BigBig Question — If AI bugs are not random but mathematically inevitable, can we finally define and prevent them? (this repo is one experiment toward that direction)


Quick Access

don't worry if this looks long. with TXT OS loaded, simply ask your LLM: “which Problem Map number fits my issue?” it will point you to the right page.

tip: if you're new, skip scrolling and use the minimal quick-start below.

failure catalog (with fixes)

if you are unsure which one applies, ask your LLM with TXT OS loaded: “which Problem Map number matches my trace?” it will route you.

legend

[IN] Input & Retrieval [RE] Reasoning & Planning [ST] State & Context [OP] Infra & Deployment {OBS} Observability/Eval {SEC} Security {LOC} Language/OCR

| # | problem domain (with layer/tags) | what breaks | doc |
|---|---|---|---|
| 1 | [IN] hallucination & chunk drift {OBS} | retrieval returns wrong/irrelevant content | hallucination.md |
| 2 | [RE] interpretation collapse | chunk is right, logic is wrong | retrieval-collapse.md |
| 3 | [RE] long reasoning chains {OBS} | drifts across multi-step tasks | context-drift.md |
| 4 | [RE] bluffing / overconfidence | confident but unfounded answers | bluffing.md |
| 5 | [IN] semantic ≠ embedding {OBS} | cosine match ≠ true meaning | embedding-vs-semantic.md |
| 6 | [RE] logic collapse & recovery {OBS} | dead-ends, needs controlled reset | logic-collapse.md |
| 7 | [ST] memory breaks across sessions | lost threads, no continuity | memory-coherence.md |
| 8 | [IN] debugging is a black box {OBS} | no visibility into failure path | retrieval-traceability.md |
| 9 | [ST] entropy collapse | attention melts, incoherent output | entropy-collapse.md |
| 10 | [RE] creative freeze | flat, literal outputs | creative-freeze.md |
| 11 | [RE] symbolic collapse | abstract/logical prompts break | symbolic-collapse.md |
| 12 | [RE] philosophical recursion | self-reference loops, paradox traps | philosophical-recursion.md |
| 13 | [ST] multi-agent chaos {OBS} | agents overwrite or misalign logic | Multi-Agent_Problems.md |
| 14 | [OP] bootstrap ordering | services fire before deps ready | bootstrap-ordering.md |
| 15 | [OP] deployment deadlock | circular waits in infra | deployment-deadlock.md |
| 16 | [OP] pre-deploy collapse {OBS} | version skew / missing secret on first call | predeploy-collapse.md |

for No.13 deep dives: • role drift → multi-agent-chaos/role-drift.md • cross-agent memory overwrite → multi-agent-chaos/memory-overwrite.md
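
If you want to route traces to these pages from a script rather than by asking an LLM, the catalog can be kept as a small lookup table. The sketch below only transcribes the numbers, layer tags, names, and doc pages listed in the catalog; the `lookup` and `by_layer` helpers are hypothetical conveniences, not part of WFGY.

```python
# Problem Map 1.0 catalog: number -> (layer, problem name, doc page).
PROBLEM_MAP = {
    1: ("IN", "hallucination & chunk drift", "hallucination.md"),
    2: ("RE", "interpretation collapse", "retrieval-collapse.md"),
    3: ("RE", "long reasoning chains", "context-drift.md"),
    4: ("RE", "bluffing / overconfidence", "bluffing.md"),
    5: ("IN", "semantic ≠ embedding", "embedding-vs-semantic.md"),
    6: ("RE", "logic collapse & recovery", "logic-collapse.md"),
    7: ("ST", "memory breaks across sessions", "memory-coherence.md"),
    8: ("IN", "debugging is a black box", "retrieval-traceability.md"),
    9: ("ST", "entropy collapse", "entropy-collapse.md"),
    10: ("RE", "creative freeze", "creative-freeze.md"),
    11: ("RE", "symbolic collapse", "symbolic-collapse.md"),
    12: ("RE", "philosophical recursion", "philosophical-recursion.md"),
    13: ("ST", "multi-agent chaos", "Multi-Agent_Problems.md"),
    14: ("OP", "bootstrap ordering", "bootstrap-ordering.md"),
    15: ("OP", "deployment deadlock", "deployment-deadlock.md"),
    16: ("OP", "pre-deploy collapse", "predeploy-collapse.md"),
}


def lookup(problem_no: int) -> tuple[str, str, str]:
    """Return (layer, name, doc page) for a Problem Map number."""
    if problem_no not in PROBLEM_MAP:
        raise KeyError(f"Problem Map numbers run from 1 to 16, got {problem_no}")
    return PROBLEM_MAP[problem_no]


def by_layer(layer: str) -> list[int]:
    """List the problem numbers tagged with a layer code: IN, RE, ST, or OP."""
    return [no for no, (tag, _, _) in sorted(PROBLEM_MAP.items()) if tag == layer]
```

For example, `by_layer("OP")` gives the infra and deployment entries, which is a quick filter when the symptom shows up before the first model call.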

🧪 one-click sandboxes — run WFGY instantly

run lightweight diagnostics with zero install and zero api key. powered by colab.

these tools map directly to the problem classes. others are handled inside WFGY and will surface in later CLIs.

ΔS diagnostic (mvp) — measure semantic drift

open in colab

detects: No.2 (Interpretation Collapse)
steps: run all, paste prompt+answer, read ΔS and fix tip
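
To get a feel for what a prompt-versus-answer drift score looks like outside the notebook, here is a toy version. It is a sketch under loud assumptions: this page does not define ΔS, so the code assumes ΔS = 1 − cosine similarity between two vectors, and it uses a bag-of-words count as a stand-in for a real embedding model. The function names are hypothetical and the numbers will not match the colab tool.

```python
import math
import re
from collections import Counter


def bow(text: str) -> Counter:
    """Toy bag-of-words vector; a real setup would use a sentence-embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def delta_s(a: str, b: str) -> float:
    """Assumed definition: 1 - cosine similarity of the two vectors."""
    va, vb = bow(a), bow(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    if norm == 0:
        return 1.0  # empty text on either side: treat as maximal drift
    return 1.0 - dot / norm
```

An answer that reuses none of the prompt's vocabulary scores 1.0 here, while an answer identical to the prompt scores about 0; a real embedding would also credit paraphrases, which word counts cannot do.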

λ_observe checkpoint — mid-step re-grounding

open in colab

fixes: No.6 (Logic Collapse & Recovery)
steps: run all, compare ΔS before/after, fall back to BBCR if needed

ε_resonance — domain-level harmony

open in colab

explains: No.12 (Philosophical Recursion)
steps: run, tune anchors, read ε

λ_diverse — answer-set diversity

open in colab

detects: No.3 (Long Reasoning Chains)
steps: run, supply ≥3 answers, read score
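
A rough local stand-in for an answer-set diversity score is the mean pairwise distance between answers. The sketch below assumes Jaccard distance over lowercase word sets, which is an illustrative choice and not the formula used by λ_diverse; `tokens` and `diversity` are hypothetical names. It keeps the "at least 3 answers" requirement from the steps above.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercase word set for one answer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def diversity(answers: list[str]) -> float:
    """Mean pairwise Jaccard distance: 0.0 = identical answers, 1.0 = no shared words."""
    if len(answers) < 3:
        raise ValueError("supply at least 3 answers")
    sets = [tokens(a) for a in answers]
    dists = []
    for x, y in combinations(sets, 2):
        union = x | y
        # Two empty answers are treated as identical (distance 0.0).
        dists.append(1.0 - len(x & y) / len(union) if union else 0.0)
    return sum(dists) / len(dists)
```

A score near 0 across repeated runs of the same multi-step task suggests the answers are stable; a high score suggests the chain is landing somewhere different each time.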

why this matters long-term

these 16 errors are not random. they are structural weak points every ai pipeline hits eventually. with WFGY as a semantic firewall you don't just fix today's issue; you shield tomorrow's.

this isn't just a bug list. it's an x-ray for your pipeline, so you stop guessing and start repairing.

see the end-to-end view: RAG Architecture & Recovery

minimal quick-start

  1. open Beginner Guide and follow the symptom checklist.
  2. use the Visual RAG Guide to locate the failing stage.
  3. open the matching page and apply the patch.

ask any LLM to apply WFGY (TXT OS makes it smoother):


i've uploaded TXT OS / WFGY notes.
my issue: [e.g., OCR tables look fine but answers point to wrong sections]
which WFGY modules should i apply and in what order?

status & difficulty
| # | problem (with layer/tags) | difficulty* | implementation |
|---|---|---|---|
| 1 | [IN] hallucination & chunk drift {OBS} | medium | stable |
| 2 | [RE] interpretation collapse | high | stable |
| 3 | [RE] long reasoning chains {OBS} | high | stable |
| 4 | [RE] bluffing / overconfidence | high | stable |
| 5 | [IN] semantic ≠ embedding {OBS} | medium | stable |
| 6 | [RE] logic collapse & recovery {OBS} | very high | stable |
| 7 | [ST] memory breaks across sessions | high | stable |
| 8 | [IN] debugging black box {OBS} | medium | stable |
| 9 | [ST] entropy collapse | high | stable |
| 10 | [RE] creative freeze | medium | stable |
| 11 | [RE] symbolic collapse | very high | stable |
| 12 | [RE] philosophical recursion | very high | stable |
| 13 | [ST] multi-agent chaos {OBS} | very high | stable |
| 14 | [OP] bootstrap ordering | medium | stable |
| 15 | [OP] deployment deadlock | high | ⚠️ beta |
| 16 | [OP] pre-deploy collapse {OBS} | medium-high | stable |

*distance from default LLM behavior to a production-ready fix.

🔬 Behind the Map

The Problem Map is practical and ready to use. But if you wonder why these fixes work, and how we're defining physics inside embedding space: → The Hidden Value Engine (WFGY Physics)

🔮 coming soon: global fix map

a universal layer above providers, agents, and infra. Problem Map is step one. Global Fix Map expands the same reasoning-first firewall to RAG, infra boot, agents, evals, and more. same zero-install experience. launching around Sep.

contributing / support

  • open an issue with a minimal repro (inputs → calls → wrong output).
  • PRs for clearer docs, repros, or patches are welcome.
  • project home: github.com/onestardao/WFGY
  • TXT OS: browse the OS
  • if this map helped you, a ⭐ helps more devs find it.

Explore More

| Layer | Page | What it's for |
|---|---|---|
| Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch (legacy reference) |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT-based Singularity tension engine (131 S-class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16-problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with fast bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text-to-image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |

If this repository helped, starring it improves discovery so more builders can find the docs and tools.