30 KiB
🧭 Not sure where to start ? Open the WFGY Engine Compass
WFGY System Map · Quick navigation
Problem Maps: PM1 taxonomy → PM2 debug protocol → PM3 troubleshooting atlas · built on the WFGY engine series
| Layer | Page | What it’s for |
|---|---|---|
| ⭐ Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT-based Singularity tension engine (131 S-class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16-problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map — 🔴 YOU ARE HERE 🔴 |
| 🧰 App | TXT OS | .txt semantic OS with 60-second bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text-to-image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |
Problem Map 3.0 Troubleshooting Atlas 🧭
The first failure grammar for complex AI systems that changes the first repair move.
Stop debugging from symptoms. Route the failure, find the broken invariant, and repair the right layer first.
🌐 Recognition & ecosystem integration
As of 2026-03, the WFGY RAG 16 Problem Map line has been adopted or referenced by 20+ frameworks, academic labs, and curated lists in the RAG and agent ecosystem. Most external references use the WFGY ProblemMap as a diagnostic layer for RAG / agent pipelines, not the full WFGY product stack. A smaller but growing set also uses WFGY 3.0 · Singularity Demo as a long-horizon TXT stress test.
Some representative integrations:
| Project | Stars | Segment | How it uses WFGY ProblemMap | Proof (PR / doc) |
|---|---|---|---|---|
| LlamaIndex | Mainstream RAG infra | Integrates the WFGY 16-problem RAG failure checklist into its official RAG troubleshooting docs as a structured failure mode reference. | PR #20760 | |
| RAGFlow | Mainstream RAG engine | Introduced a RAG failure modes checklist guide to the RAGFlow documentation via PR, adapted from the WFGY 16-problem failure map for step-by-step RAG pipeline diagnostics. | PR #13204 | |
| FlashRAG (RUC NLPIR Lab) | Academic lab / RAG research toolkit | Adapts the WFGY ProblemMap as a structured RAG failure checklist in its documentation. The 16-mode taxonomy is cited to support reproducible debugging and systematic failure-mode reasoning for RAG experiments. | PR #224 | |
| DeepAgent (RUC NLPIR Lab) | Academic lab / agent research | Adds a multi-tool agent failure modes troubleshooting note inspired by WFGY-style debugging concepts for diagnosing tool selection loops, tool misuse, and multi-tool workflow failures in agent pipelines. | PR #15 | |
| ToolUniverse (Harvard MIMS Lab) | Academic lab / tools | Provides a WFGY_triage_llm_rag_failure tool that wraps the 16 mode map for incident triage. |
PR #75 | |
| Rankify (University of Innsbruck) | Academic lab / system | Uses the 16 failure patterns in RAG and re-ranking troubleshooting docs. | PR #76 | |
| Multimodal RAG Survey (QCRI LLM Lab) | Academic lab / survey | Cites WFGY as a practical diagnostic resource for multimodal RAG. | PR #4 | |
| LightAgent | Agent framework | Incorporates WFGY ProblemMap concepts into its documentation via a Multi-agent troubleshooting (failure map) section, providing a structured symptom → failure-mode → debugging checklist for diagnosing role drift, cross-agent memory issues, and coordination failures in multi-agent systems. | PR #24 |
For the complete 20+ project list (frameworks, benchmarks, curated lists), see the 👉 WFGY Recognition Map
If your project uses the WFGY ProblemMap and you would like to be listed, feel free to open an issue or pull request in this repository.
Modern AI systems rarely fail in one clean way.
A case that looks like hallucination may actually begin as grounding drift.
A case that looks like reasoning collapse may actually begin as a broken formal container.
A case that looks like safety trouble may actually begin as missing observability.
A case that looks like memory trouble may actually begin as execution closure failure.
That is why ordinary debugging advice collapses too early.
Problem Map 3.0 was built for a more precise job:
- identify the failure family
- locate the best-fit node
- inspect the broken invariant
- choose the right first repair surface
In short:
Problem Map 3.0 helps humans and AI systems avoid starting with the wrong fix.
What this system actually does
Problem Map 3.0 does not stop at naming the failure.
It helps humans and AI systems do five things more reliably:
- classify a failure
- identify which invariant is broken
- separate neighboring failure regions that are easy to confuse
- choose the right first repair direction
- prevent future debugging from collapsing into ad hoc guesswork
This is why the project should be understood as a debugging decision system, not just a checklist.
The biggest cost in complex AI debugging is often not the final answer itself.
It is the first wrong repair move.
Why this exists
Modern AI systems are increasingly:
- retrieval-heavy
- multi-step
- tool-using
- stateful
- agentic
- operational
As systems grow like this, symptom words become too coarse:
- hallucination
- prompting issue
- bad retrieval
- bad reasoning
- memory problem
- alignment problem
Those labels can be useful, but they are often too shallow to decide what should be repaired first.
Problem Map 3.0 Troubleshooting Atlas was built to cut these regions apart more cleanly, so diagnosis becomes more stable and first repair moves become more precise.
The core promise
You can think of this project in one sentence:
a system that helps humans and AI avoid walking into the wrong repair path at the start of complex debugging
That is the practical threshold.
Not just:
- what went wrong
But also:
- where the failure lives
- what neighboring region is tempting but wrong
- what should be repaired first
- what should not be repaired first
A simple view of the system
flowchart LR
A[Input case] --> B[Failure family]
B --> C[Best-fit node]
C --> D[Broken invariant]
D --> E[First repair surface]
Route first. Repair second. Stop guessing from symptoms alone.
Why “3.0” matters
The name is intentional.
Problem Map stays because this system grows out of the earlier Problem Map line and keeps its original debugging spirit.
3.0 matters because this is not a small update. It is a structural jump:
- from checklist logic to atlas logic
- from flat failure naming to routing grammar
- from isolated debugging tips to reusable failure mapping
- from local AI debugging toward a broader complex-system bridge
Troubleshooting Atlas matters because this project is meant to feel like a map, not a loose article, and like an operating surface, not a decorative theory page.
What makes this different
Most debugging material does one of three things:
- it names symptoms
- it lists best practices
- it suggests local fixes
Problem Map 3.0 does something more structural.
It organizes failure space into a stable mother table, then teaches how to move through that space using:
- family routing
- boundary rules
- canonical cases
- relation lines
- first repair directions
- patch discipline
That is why this project is better understood as a routing grammar for failures than as a checklist.
The seven-family mother table 🧩
The current atlas organizes failure space through seven top-level families.
F1 · Grounding & Evidence Integrity
The system fails to remain correctly aligned with external evidence anchors, truth-like anchors, world anchors, or semantic targets.
Short intuition the output is no longer properly tied to reality, evidence, or the intended target
F2 · Reasoning & Progression Integrity
The reasoning chain, decomposition chain, recursive chain, or recovery path loses continuity, controllability, or recoverability.
Short intuition the system is no longer moving through reasoning space in a stable way
F3 · State & Continuity Integrity
Memory, role, ownership, session thread, or continuity thread can no longer remain stable across steps, sessions, or interacting entities.
Short intuition the system no longer preserves what should persist
F4 · Execution & Contract Integrity
Readiness, ordering, bridge integrity, liveness, closure, protocol, or enforcement skeletons fail to close.
Short intuition the workflow or operational skeleton breaks before the task can complete safely
F5 · Observability & Diagnosability Integrity
The system cannot stably expose, trace, audit, interpret, or anticipate the structures required to understand the failure.
Short intuition the problem may already be there, but you still cannot see it clearly enough
F6 · Boundary & Safety Integrity
Goal, control, incentive, collective, or regime boundaries drift, erode, fragment, or become captured.
Short intuition the system no longer stays inside a safe or viable boundary
F7 · Representation & Localization Integrity
Symbolic shells, formal containers, layouts, local anchors, explanations, or synthetic structures fail to preserve structure faithfully.
Short intuition the container that carries meaning is distorted before the task can remain stable
Why these seven families exist
These seven families were not chosen by aesthetics, convenience, or rhetorical style.
They were carved through a longer WFGY line:
- WFGY 1.0 contributed the original self-healing logic and correction framework
- WFGY 2.0 pushed the system toward explicit routing, text-native control, and guardrail logic
- WFGY 3.0 expanded the pressure field through a much larger stress-tested problem set
The result is that these seven families are not topic buckets.
They are better understood as seven recurring modes of instability in complex systems.
That is why the atlas can begin with AI failures while still pointing beyond AI.
What already exists ✅
Problem Map 3.0 already includes a stable first body of work.
Core atlas
A frozen first atlas structure with:
- seven-family mother table
- major routing rules
- canonical node layer
- high-value subtree layer
- relation matrix
- patch discipline
Casebook layer
A first canonical casebook that teaches:
- what each family looks like
- how important boundaries should be cut
- how diagnosis changes the first repair move
AI adapter layer
A first atlas-to-AI adapter layer that compresses atlas logic into reusable routing modes for model-facing use.
Fix layer
A first repair-facing layer that connects correct routing to first repair surfaces and misrepair discipline.
Demo layer
A first official demo pack showing that different routes lead to different first repair moves.
Patch layer
A first completed patch wave that thickens selected subtrees, strengthens relations, improves case teaching, and improves adapter usability.
Cross-domain bridge layer
A first formal bridge pack showing that the current atlas can already extend beyond narrow AI-only framing without requiring a redraw of the mother table.
Use the atlas directly with AI ⚡
Problem Map 3.0 is not only a document system.
It now also includes a compact product-facing routing pack:
Troubleshooting Atlas Router v1
This is the first compact TXT routing pack built from the atlas.
Its purpose is simple:
- route the case first
- identify the broken invariant
- separate the strongest neighboring pressure
- suggest the first repair direction
- warn about likely misrepair
- stay honest when evidence is weak
Short version:
The Atlas is the map.
The Router is the first compact executable surface of that map.
If you want the practical entry points:
What the Router is not:
- not the full Atlas
- not the full Casebook
- not a full auto-repair engine
- not a claim of full diagnosis closure
What it does give you is something more immediate:
drop the TXT into an AI system, feed it a failure case, and the model becomes much more likely to classify the failure family correctly before jumping into the wrong fix
From routing to repair
Problem Map 3.0 does not stop at diagnosis.
It opens a controlled path from routing to first repair.
Atlas layer
The atlas routes the failure.
Casebook layer
The casebook teaches how major cuts should be made and how neighboring regions should be separated.
Fix layer
The fix surface turns correct routing into a disciplined first repair move.
Deeper bridge layer
WFGY remains the deeper exploration engine when the case needs stronger structural intervention.
This means the system is not just:
- classify and stop
It is:
- route
- cut correctly
- repair the right layer first
- only then escalate deeper if needed
Use it now
If you want the shortest working path, start here:
- want the full product overview → Problem Map 3.0 Troubleshooting Atlas
- want the full system map → Atlas Hub
- want the compact TXT routing pack → Troubleshooting Atlas Router v1
- want first repair-facing guidance → Fixes Hub
- want official proof-of-use demos → Official Flagship Demos
This is the shortest practical interpretation of the current system:
read the atlas if you want the map
use the router if you want the compact operational entry
use the fixes layer if you want the first repair surface
Proof that this is usable, not just theoretical
The current system already crosses the line from “interesting framework” into “usable troubleshooting surface.”
The strongest current public proof is simple:
different routes lead to different first repair moves
That is exactly what the official demos are designed to show.
The first demo pack focuses on four sharp families:
- F1 grounding-first
- F5 observability-first
- F4 execution-first
- F7 container-first
These were chosen because they are the fastest way to show that the atlas does not only classify failures.
It changes what should happen next.
How to use this atlas ⚙️
There are three practical ways to use Problem Map 3.0.
1. Human debugging
Use the atlas to ask:
- what kind of failure is this
- which family should I route to first
- which neighboring family is tempting but wrong
- what first repair direction should I try
2. AI-assisted routing
Use the atlas as an AI-facing routing grammar so that a model can classify a case more consistently and explain why one family is primary and another is only secondary.
3. Product and workflow design
Use the atlas as a design surface for:
- triage flows
- case cards
- routing prompts
- onboarding
- benchmark failure analysis
- patch-aware debugging workflows
Why this matters now
AI systems are becoming more layered, more stateful, more agentic, and more operational.
When systems grow like this, debugging starts failing if every mistake is reduced to labels like:
- hallucination
- prompting issue
- model limitation
- alignment problem
- bad retrieval
- bad reasoning
Those labels are too coarse.
Teams increasingly need a reusable grammar that can say:
- this is grounding-first, not reasoning-first
- this is container-first, not semantics-first
- this is observability-first, not boundary-first
- this is execution-first, not continuity-first
That is the practical value of this atlas.
The broader direction 🌍
Problem Map 3.0 is being built first as a powerful AI troubleshooting atlas.
That is the practical entry point.
At the same time, the long-range direction is larger.
The same family grammar appears capable of absorbing more general failures in:
- coordination
- institutions
- coherence
- collective pressure
- structural breakdown
The correct reading is:
AI Troubleshooting Atlas is the first validated operational surface. A broader complex-system bridge is the next step, not a marketing shortcut.
That distinction matters, and it is intentional.
What this page does not claim 🔒
This page does not claim that:
- every possible failure has already been captured
- all subtrees are fully expanded
- all relations are fully enumerated
- all future cross-domain problems are already solved by the current map
- no more patching is needed
- the final civilization-scale atlas is already complete
The safer and more accurate claim is:
the first formal atlas version is complete enough to matter, and future work should continue through patching, thickening, adaptation, and demonstration expansion
FAQ
What is the difference between Problem Map 1.0, 2.0, and 3.0?
Problem Map 1.0 is the canonical 16-problem RAG failure taxonomy and fix map.
Problem Map 2.0 is the Global Debug Card layer.
It compresses debugging objects, metrics, ΔS zones, and operating modes into a visual protocol.
Problem Map 3.0 is the broader troubleshooting atlas.
It moves from flat failure naming toward routing grammar, family structure, boundary rules, case teaching, repair-facing direction, and broader bridge work.
Short version:
- 1.0 gives the base failure vocabulary
- 2.0 gives the compressed visual debug protocol
- 3.0 gives the broader troubleshooting atlas and routing system
Is this a checklist, a framework, or a routing system?
It begins where a checklist stops.
Problem Map 3.0 should be understood as a debugging decision system and a failure routing grammar.
It still preserves map-like clarity, but its real job is not just to name failures.
Its real job is to help humans and AI systems decide:
- where the failure lives
- what neighboring region is tempting but wrong
- which invariant is broken
- what should be repaired first
So the most accurate answer is:
it is a routing grammar and troubleshooting decision system, not just a checklist
Do I need to read the full Atlas to use it?
No.
The full Atlas is the strongest version if you want the full structure, deeper definitions, casebook, patch logic, and bridge materials.
But you do not need to read the full Atlas just to start using the system.
If you want the compact entry point, use:
That is the shortest route from “I have a bug case” to “help me classify this correctly.”
What does Troubleshooting Atlas Router actually do?
The Router is the first compact TXT routing pack built from the Atlas.
Its job is to help an AI system do the following in order:
- identify the most likely primary family
- identify the strongest neighboring family pressure if it is real
- explain why the primary cut is stronger
- identify the broken invariant
- suggest the first repair direction
- warn about likely misrepair
- stay honest about confidence and evidence sufficiency
It is best understood as:
the first compact executable surface of the Atlas
It is not the whole Atlas and not a full repair engine.
Does this system already repair everything automatically?
No.
The current public system is strongest at:
- route-first classification
- boundary-aware diagnosis
- broken-invariant reading
- first repair direction
- misrepair warning
- deeper escalation paths when needed
That is already very valuable.
But it is not the same thing as claiming:
- full autonomous diagnosis
- full autonomous repair
- complete root-cause closure in every case
The current repair logic is best understood as:
route first, choose the right first move, then escalate deeper only when needed
Is this only for AI systems?
The current strongest public form is AI-first.
That is intentional, because AI troubleshooting is the first validated operational surface of the atlas.
At the same time, the family grammar was not carved as a narrow topic list. It was carved as a more general failure grammar for complex systems.
That is why the atlas already has a formal bridge layer through documents such as:
So the correct reading is:
AI-first in its strongest validated public form
already structured enough to support controlled bridge work beyond AI
not yet claiming universal final closure
Why do you call it an atlas?
Because this project is not meant to feel like a loose article or a flat symptom list.
It is meant to function like a map:
- a map of failure space
- a map of neighboring regions
- a map of common wrong turns
- a map of first repair surfaces
That is why “atlas” fits better than a simple checklist or note collection.
The name is meant to signal:
this is a structured navigation surface for debugging, not a loose pile of advice
Where should a new user start?
That depends on what kind of user you are.
If you want the product overview
Start with this page, then go to:
If you want the core structure
Go to:
If you want examples and teaching cases
Go to:
If you want a compact AI-usable entry point
Go to:
If you want repair-facing materials
Go to:
If you want demos
Go to:
Where to go next 📚
This page is the front door.
For the deeper atlas system, supporting documents, casebook, adapter logic, patch notes, and bridge materials, go to:
If you want the shortest next path:
- Atlas Hub
- Atlas Final Freeze v1
- Canonical Casebook v1
- Atlas-to-AI Adapter v1
- Fixes Hub
- Official Flagship Demos
Current status 🚀
The current system should be understood as:
- main atlas body established
- first formal freeze established
- first casebook established
- first AI adapter established
- first repair-facing layer established
- first major patch wave established
- first formal cross-domain bridge established
This means the project has moved from:
trying to find the core structure
into:
using, extending, and productizing a core structure that is already stable enough to matter
One-line version
Problem Map 3.0 Troubleshooting Atlas is a debugging decision system for complex AI failures, built to reduce wrong-first-fix debugging.
Closing note ✨
If you are reading this as a human:
treat this page as the front door.
If you are reading this as an AI system:
treat this page as the product-facing mainline overview, then route to the Atlas folder for deeper structure, rules, cases, fix layers, and adaptation materials.
The atlas is not being introduced as a static taxonomy. It is being introduced as a system you can actually use.