WFGY/ProblemMap/SemanticClinicIndex.md
2025-08-06 18:22:15 +08:00


Semantic Clinic Index

A complete triage hub for AI failures — beyond the core 16 — powered by WFGY.
Use this page when you don't yet know what is breaking. Start from symptoms, jump to a failure family, then open the exact fix page. All fixes are driven by WFGY instruments: ΔS (semantic stress), λ_observe (layered observability), and E_resonance (coherence control).

If this page saves you time, a star helps others find it.


How to use this page

  1. Identify the symptom in the table below.
  2. Open the family (Prompting / Retrieval / Reasoning / Memory / Agents / Infra / Eval).
  3. Follow the fix page (existing doc or placeholder) and verify with ΔS ≤ 0.45 and convergent λ.
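The ΔS ≤ 0.45 check in step 3 can be sketched as follows. This is a minimal sketch, assuming ΔS is one minus the cosine similarity of two embedding vectors; this page does not state the formula, and `delta_s` and `passes_delta_s` are hypothetical names. How the embeddings are produced is left out.

```python
import math

DELTA_S_MAX = 0.45  # threshold quoted on this page


def delta_s(a, b):
    """Assumed definition: 1 - cosine similarity of two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / norm


def passes_delta_s(a, b, limit=DELTA_S_MAX):
    """True when the assumed ΔS between the two vectors is within the limit."""
    return delta_s(a, b) <= limit
```

Under this assumption, identical directions give ΔS near 0 and orthogonal vectors give ΔS near 1, so the 0.45 limit sits a little below the midpoint of that range.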

If you prefer a pipeline-first view (OCR → chunking → embeddings → vector store → retriever → prompt → LLM), read:
RAG Architecture & Recovery


Quick triage by symptom

| Symptom you see | Likely family | Open this |
| --- | --- | --- |
| Answers cite wrong snippet / mismatch with ground truth | Retrieval → RAG | hallucination.md |
| Chunks look right but reasoning is wrong | Reasoning | retrieval-collapse.md |
| High similarity, wrong meaning | Retrieval / Embeddings | embedding-vs-semantic.md |
| Model can't explain why (no trace) | Observability | retrieval-traceability.md |
| Output collapses over long dialogs / 100k tokens | Memory / Long-context | (placeholder) long-context-stress.md |
| Jailbreak / prompt injection succeeds | Prompting / Safety | (placeholder) prompt-injection.md |
| Multi-agent tools fight each other | Orchestration | multi-agent-chaos.md |
| First prod call crashes after deploy | Infra / Boot | predeploy-collapse.md |
| Index looks fine; retrieval is irrelevant | Vector store hygiene | (placeholder) vectorstore-metrics-and-faiss-pitfalls.md |
| OCR PDFs “look correct” but answers drift | Data / OCR | (placeholder) ocr-parsing-checklist.md |

Can't find it? See the full Failure catalog (16) in the Problem Map root or scan the families below.


Families & maps (with exact fixes)

A) Prompting & Safety

Guard against injections, jailbreaks, role drift, and schema leakage.

Minimum verification: ΔS(question, context) ≤ 0.45; λ stays convergent across paraphrases; injection probes do not change λ.
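The λ half of this check can be sketched as below. It is a rough illustration, assuming λ_observe yields one label per run (for example `"convergent"`); the label strings and both function names are hypothetical, not part of WFGY.

```python
def lambda_stable(states):
    """True when at least one state was observed and all are 'convergent'."""
    return len(states) > 0 and all(s == "convergent" for s in states)


def injection_changes_lambda(clean_states, probed_states):
    """Compare λ per prompt with and without an injection probe appended.

    Returns the indices of prompts whose λ state changed under the probe.
    """
    if len(clean_states) != len(probed_states):
        raise ValueError("need one probed run per clean run")
    return [
        i for i, (clean, probed) in enumerate(zip(clean_states, probed_states))
        if clean != probed
    ]
```

An empty list from `injection_changes_lambda` corresponds to "injection probes do not change λ"; any index returned points at the paraphrase to inspect first.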


B) Retrieval, Data & Vector Stores

Make the index correct, measured, and explainable.

Minimum verification: coverage ≥ 0.7 to target section; ΔS(question, retrieved) ≤ 0.45; ΔS(retrieved, anchor) ≤ 0.45; flat-high ΔS vs k ⇒ index/metric mismatch.
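Two of these checks lend themselves to a short sketch. The code assumes coverage is the fraction of the target section's chunk ids that appear in the retrieved set, and treats "flat-high" as ΔS staying above the limit while varying by no more than a small band as k grows. Both definitions, the `flat_band` value, and the function names are assumptions made for illustration.

```python
def coverage(retrieved_ids, target_ids):
    """Fraction of the target section's chunk ids present in the retrieved set."""
    target = set(target_ids)
    if not target:
        raise ValueError("target section has no chunks")
    return len(target & set(retrieved_ids)) / len(target)


def flat_high(delta_s_by_k, limit=0.45, flat_band=0.05):
    """True when ΔS stays above `limit` for every k and barely moves as k grows.

    `delta_s_by_k` maps a top-k value to the ΔS measured at that k.
    """
    values = list(delta_s_by_k.values())
    if not values:
        raise ValueError("no measurements")
    return min(values) > limit and (max(values) - min(values)) <= flat_band
```

For example, `flat_high({5: 0.71, 10: 0.70, 20: 0.72})` is true, which under this reading would point at the index or distance metric rather than at k.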


C) Reasoning & Logic Control

Detect and repair logic collapse, dead ends, and abstraction failures.

Minimum verification: if upstream λ is stable but λ flips at reasoning, apply BBCR (bridge) and BBAM (variance clamp); re-measure until λ stays convergent.
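The decision rule above can be sketched as a scan over a per-stage λ trace. The trace format (a list of `(stage, state)` pairs), the stage names, and the function names are hypothetical; BBCR and BBAM are only named here, not implemented.

```python
def first_flip(trace):
    """Return the first stage whose λ state is not 'convergent', or None."""
    for stage, state in trace:
        if state != "convergent":
            return stage
    return None


def suggest_repair(trace, reasoning_stage="reasoning"):
    """Suggest BBCR + BBAM only when the flip first appears at the reasoning stage."""
    flipped = first_flip(trace)
    if flipped is None:
        return []
    if flipped == reasoning_stage:
        return ["BBCR", "BBAM"]
    return ["fix upstream stage first: " + flipped]
```

After applying a repair, the same trace would be recorded again and passed back through `first_flip` until it returns `None`.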


D) Memory & Long-Context

Keep threads coherent across sessions and very long windows.

Minimum verification: E_resonance does not trend upward; ΔS does not spike at window boundaries; stitched turns keep λ convergent.
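One way to make "does not trend upward" and "does not spike" measurable is sketched below. It assumes E_resonance and ΔS are logged once per turn, uses a least-squares slope as the trend, and treats a spike as a rise over the previous turn larger than `max_jump`. The logging format, the `max_jump` value, and the function names are assumptions.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def boundary_spikes(delta_s_per_turn, boundary_turns, max_jump=0.15):
    """Boundary turns where ΔS rises by more than `max_jump` over the previous turn."""
    return [
        t for t in boundary_turns
        if 0 < t < len(delta_s_per_turn)
        and delta_s_per_turn[t] - delta_s_per_turn[t - 1] > max_jump
    ]
```

A slope near zero or negative for E_resonance, together with an empty list from `boundary_spikes`, would satisfy the first two conditions under these assumptions.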


E) Multi-Agent & Orchestration

Coordinate tools, roles, and shared memory without conflict.

Minimum verification: when agents are isolated, λ is convergent; when coupled, ΔS does not jump and arbitration logs are traceable.
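The isolated-versus-coupled comparison can be sketched as a per-agent difference. This assumes one ΔS value per agent is recorded in each mode; the `max_jump` threshold and the function name are hypothetical, and the arbitration-log check is not covered.

```python
def coupling_report(isolated, coupled, max_jump=0.1):
    """Flag agents whose ΔS rises by more than `max_jump` once agents are coupled.

    `isolated` and `coupled` map agent name -> ΔS measured in each mode.
    Returns a dict of agent name -> ΔS increase for the flagged agents.
    """
    missing = set(isolated) ^ set(coupled)
    if missing:
        raise ValueError(f"agents measured in only one mode: {sorted(missing)}")
    return {
        name: round(coupled[name] - isolated[name], 6)
        for name in isolated
        if coupled[name] - isolated[name] > max_jump
    }
```

An empty report means no agent's ΔS jumped when coupled; a non-empty one names the agents whose interaction to trace first.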


F) Infra / Deploy

Make the system boot in a known-good order, every time.

Minimum verification: idempotent index builds; version/secret checks before first call; deterministic warm-up traces.
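A pre-deploy check along these lines might look like the sketch below. It assumes an index build can be summarized by an order-independent hash of its chunk texts, so two builds from the same input give the same fingerprint, and that secrets and versions are read from an environment-style mapping. The key names in the usage note and both function names are hypothetical.

```python
import hashlib


def index_fingerprint(chunks):
    """Order-independent SHA-256 over chunk texts; equal inputs give equal fingerprints."""
    digest = hashlib.sha256()
    chunk_hashes = sorted(hashlib.sha256(c.encode("utf-8")).hexdigest() for c in chunks)
    for chunk_hash in chunk_hashes:
        digest.update(chunk_hash.encode("ascii"))
    return digest.hexdigest()


def preflight(env, required_keys, expected_versions):
    """Return a list of problems to resolve before the first production call."""
    problems = [f"missing secret: {k}" for k in required_keys if not env.get(k)]
    for name, expected in expected_versions.items():
        actual = env.get(name)
        if actual != expected:
            problems.append(f"version mismatch: {name}={actual!r}, expected {expected!r}")
    return problems
```

In practice `env` could be `os.environ`, for example `preflight(os.environ, ["API_KEY"], {"EMBED_MODEL_VERSION": "v2"})`, with the first call blocked until the returned list is empty.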


G) Evaluation & Guardrails

Detect “double hallucination” and prevent regression.

Acceptance:

  • Retrieval QA: coverage ≥ 0.7 and ΔS(question, context) ≤ 0.45
  • Stability: λ convergent on 3 paraphrases; E_resonance flat
  • Repeatability: 5 seeds cluster in embedding space (low variance)
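The three acceptance bullets can be combined into one gate, sketched below. The 0.7 and 0.45 limits come from this page; everything else is an assumption made for illustration: "flat" is read as a max-minus-min range within `flat_band`, "cluster" as a mean distance to the centroid within `max_spread`, and the function names and default values are hypothetical.

```python
import math


def seed_spread(vectors):
    """Mean Euclidean distance from each seed's embedding to the centroid."""
    n = len(vectors)
    if n == 0:
        raise ValueError("no vectors")
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return sum(math.dist(v, centroid) for v in vectors) / n


def accept(cov, ds, lambda_states, e_resonance, seed_vectors,
           max_spread=0.1, flat_band=0.02):
    """Apply the acceptance bullets; return (passed, reasons_for_failure)."""
    reasons = []
    if cov < 0.7:
        reasons.append("coverage below 0.7")
    if ds > 0.45:
        reasons.append("ΔS above 0.45")
    if len(lambda_states) < 3 or any(s != "convergent" for s in lambda_states):
        reasons.append("λ not convergent on 3 paraphrases")
    if max(e_resonance) - min(e_resonance) > flat_band:
        reasons.append("E_resonance not flat")
    if len(seed_vectors) < 5 or seed_spread(seed_vectors) > max_spread:
        reasons.append("seeds do not cluster")
    return (not reasons, reasons)
```

Returning the list of reasons rather than a bare boolean makes a regression report point at the failing bullet directly.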

Ask the AI to fix your AI (safe prompt)

Paste this in any LLM after uploading TXT OS:

Read the WFGY TXT OS and ProblemMap docs. Extract ΔS, λ_observe, E_resonance and the modules (BBMC, BBPF, BBCR, BBAM).
Given my failure:

- symptom: [describe]
- traces: [ΔS probes, λ states if any]

Tell me:
1) which layer/family is failing and why,
2) which fix page to open,
3) the minimal steps to push ΔS below 0.45 and keep λ convergent,
4) how to verify with a reproducible test.


Notes on placeholders

Items labeled (placeholder) are active stubs. If you have a minimal repro (inputs → calls → wrong output), open an Issue and we will prioritize the write-up.


🧭 Explore More

| Module | Description | Link |
| --- | --- | --- |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |

👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open source builders who supported WFGY from day one.

Help reach 10,000 stars by 2025-09-01 to unlock Engine 2.0 for everyone: Star WFGY on GitHub.
