WFGY/ProblemMap/faq.md
2025-08-15 23:16:36 +08:00


# FAQ — Fast Answers for Busy Builders

Short, practical answers to the questions we get every day.

## Quick Nav

Getting Started · RAG Map 2.0 · Retrieval Playbook · Rerankers · Patterns · Eval · Ops


## General

What is WFGY?
A symbolic reasoning layer + diagnostic toolkit that sits above your stack. It measures semantic stress (ΔS), shows which layer drifted (λ_observe), and applies repair operators (BBMC/BBPF/BBCR/BBAM). You don't need to change your infra.

Do I need a GPU?
No. You can prototype on CPU with small embedding models. Rerankers and local LLMs benefit from GPU but are optional. See: Retrieval Playbook.

License? Can I use it at work?
MIT. Commercial use allowed. If you ship improvements, we welcome PRs (docs or code).

How is this different from LangChain/LlamaIndex?
Those are orchestration layers. WFGY is a reasoning firewall and diagnostic map—it detects/repairs semantic failure regardless of framework.


## Setup & scope

What's the fastest way to try it?
Grab TXT OS and WFGY 1.0 PDF, paste TXT into any model, and follow the prompts in Getting Started.

Which embedding model should I start with?
General docs: all-MiniLM-L6-v2 (light) or bge-base. Multilingual: bge-m3 / LaBSE. Keep write/read normalization identical. See: Embedding vs Semantic.

Do I need a reranker?
Only if first-stage recall@50 ≥ 0.85 but Top-k precision is weak. Otherwise fix candidate generation. See: Rerankers.

How big can my PDFs be?
Start with a gold set (10–50 Q/A pairs with citations). For ingestion, chunk by semantic sections (not fixed tokens). Verify ΔS thresholds before scaling.
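Chunking by semantic sections can be sketched with a heading-based splitter. This is a minimal illustration (the function name is mine, not part of WFGY); real ingestion would also handle tables, code fences, and front matter:

```python
import re

def chunk_by_sections(text):
    """Split a markdown document on headings instead of fixed token windows,
    so each chunk stays a coherent semantic unit."""
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [p.strip() for p in parts if p.strip()]
```

Each returned chunk keeps its heading, which later doubles as a ground anchor for ΔS checks.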


## Diagnosing failures

The chunks look right but the answer is wrong—now what?
Measure ΔS(question, retrieved). If ≥ 0.60, fix retrieval first; if ≤ 0.45 and reasoning still fails, open Interpretation Collapse.
Links: hallucination.md · retrieval-collapse.md
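The routing rule above can be written as a tiny triage function (a sketch; the band labels are mine and mirror the thresholds in this FAQ, not an official WFGY API):

```python
def triage(ds):
    """Route a failing answer based on ΔS(question, retrieved)."""
    if ds >= 0.60:
        return "fix retrieval first"
    if ds <= 0.45:
        return "open Interpretation Collapse"
    return "transitional: inspect both layers"
```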

Hybrid (BM25 + dense) got worse—why?
Likely Query Parsing Split (tokenizer/analyzer drift). Unify analyzers and log per-retriever queries.
Link: pattern_query_parsing_split.md

Citations bleed across sources.
Enforce per-source fences + a “cite-then-explain” schema; this is SCU (Symbolic Constraint Unlock).
Links: retrieval-traceability.md · pattern_symbolic_constraint_unlock.md

Fixes don't stick after refresh.
You're seeing Memory Desync. Stamp mem_rev/mem_hash, gate writes, and audit traces.
Link: pattern_memory_desync.md
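Stamping and gating can be sketched like this (field names follow the mem_rev/mem_hash convention above; the helper functions are mine, for illustration only):

```python
import hashlib
import json

def stamp(memory, rev):
    """Attach mem_rev / mem_hash so stale state is detectable after a refresh."""
    payload = json.dumps(memory, sort_keys=True).encode()
    return {
        "mem_rev": rev,
        "mem_hash": hashlib.sha256(payload).hexdigest(),
        "state": memory,
    }

def gate_write(current, incoming):
    """Accept a write only if its revision is strictly newer than the stored one."""
    return incoming["mem_rev"] > current["mem_rev"]
```

Logging the hash alongside every trace makes desync auditable: two records with the same mem_rev but different mem_hash are an immediate red flag.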


## Implementation details

How do I compute ΔS quickly?
Use cosine on unit-normalized sentence embeddings: ΔS = 1 − cos(I, G).
Thresholds: < 0.40 stable · 0.40–0.60 transitional · ≥ 0.60 act.
Ground anchor G can be a section title/snippet you expect.
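A minimal numpy sketch of this computation (function names are mine, not part of WFGY's API; embeddings are assumed to come from whatever sentence-embedding model you already use):

```python
import numpy as np

def delta_s(i_vec, g_vec):
    """Semantic stress: 1 - cosine similarity of unit-normalized embeddings."""
    i = np.asarray(i_vec, dtype=float)
    g = np.asarray(g_vec, dtype=float)
    i = i / np.linalg.norm(i)
    g = g / np.linalg.norm(g)
    return 1.0 - float(np.dot(i, g))

def zone(ds):
    """Map ΔS to the action bands from this FAQ."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "act"
```

Because both vectors are normalized first, the result is invariant to embedding magnitude, so the same thresholds apply across models.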

What are BBMC/BBPF/BBCR/BBAM in one line?

- BBMC — minimize semantic residue vs anchors.
- BBPF — branch safely across multiple paths.
- BBCR — detect collapse; insert a bridge node and restart cleanly.
- BBAM — modulate attention variance to avoid entropy melt.

Where are the data shapes?
See: Data Contracts. They're JSON-first, easy to log, and versioned.
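As an illustration of the JSON-first style, here is a hypothetical log record (field names are assumptions for this sketch, not the official contract — see Data Contracts for the real shapes):

```python
from dataclasses import dataclass, asdict

@dataclass
class RetrievalRecord:
    """Illustrative JSON-first trace record: one retrieval hit, ready to log."""
    query: str
    source_id: str
    chunk_id: str
    delta_s: float
    contract_rev: str = "v1"  # version the shape so old logs stay parseable

rec = RetrievalRecord("who wrote X?", "doc-07", "doc-07#s3", 0.42)
line = asdict(rec)  # plain dict, ready for json.dumps into a log line
```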


## Teams & Ops

How do we avoid regressions?
Commit goldset.jsonl, measure recall@50, nDCG@10, and ΔS across PRs.
Links: eval_rag_precision_recall.md · eval_semantic_stability.md
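Both metrics are easy to compute in a CI check. A minimal sketch assuming binary relevance (gold items are either relevant or not; function names are mine):

```python
import math

def recall_at_k(relevant, ranked, k=50):
    """Fraction of gold items that appear in the top-k candidates."""
    return len(set(relevant) & set(ranked[:k])) / len(relevant)

def ndcg_at_k(relevant, ranked, k=10):
    """Binary-relevance nDCG: discounted gain over the ideal ordering."""
    dcg = sum(1.0 / math.log2(i + 2) for i, d in enumerate(ranked[:k]) if d in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal
```

Run these over goldset.jsonl on every PR and fail the build when either metric drops below the committed baseline.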

Any privacy guidance?
Yes—PII redaction, retention, access control, and provider governance patterns are here: Privacy & Governance.


## Known limits

- Extremely noisy OCR may require manual anchors or char-level retrieval.
- Cross-domain abstract reasoning (#11/#12) needs stronger models.
- Rerankers improve precision but add latency — prove gains via nDCG.

## 🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1 Download · 2 Upload to your LLM · 3 Ask “Answer using WFGY + \<your question\>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1 Download · 2 Paste into any LLM chat · 3 Type “hello world” — OS boots instantly |

## 🧭 Explore More

| Module | Description | Link |
|---|---|---|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| 🧙‍♂️ Starter Village 🏡 | New here? Lost in symbols? Click here and let the wizard guide you through | Start → |

👑 Early Stargazers: See the Hall of Fame
Engineers, hackers, and open source builders who supported WFGY from day one.

WFGY Engine 2.0 is already unlocked. Star the repo to help others discover it and unlock more on the Unlock Board.

WFGY Main   TXT OS   Blah   Blot   Bloc   Blur   Blow