WFGY/ProblemMap/embedding-vs-semantic.md
2025-07-28 12:48:38 +08:00


📒 Problem · High Vector Similarity, Wrong Meaning

Classic RAG ranks chunks by cosine similarity, but close vectors ≠ correct logic.
Result: chunks that look relevant yet derail the answer. WFGY replaces surface matching with semantic residue checks.


🤔 Why Cosine Match Misleads

| Weakness | Practical Failure |
| --- | --- |
| Embedding ≠ Understanding | Cosine overlap captures phrasing, not intent |
| Keywords ≠ Intent | Ambiguous terms bring unrelated chunks |
| No Semantic Guard | System never validates logical fit |

⚠️ Example Mis-Retrieval

User: “How do I cancel my subscription after the free trial?”
Retrieved chunk: “Subscriptions renew monthly or yearly, depending on plan.”
→ High cosine, zero help → hallucinated answer.
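The failure mode can be reproduced with a toy scorer. The sketch below is not WFGY code and does not use a real embedding model: it computes cosine similarity over plain bag-of-words counts, and the three sentences are made up for illustration. It only shows that shared wording can outrank a chunk that actually answers the question.

```python
import math
import re
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two strings."""
    va = Counter(re.findall(r"[a-z]+", a.lower()))
    vb = Counter(re.findall(r"[a-z]+", b.lower()))
    # Counter returns 0 for missing tokens, so only shared tokens contribute.
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical texts, chosen only to illustrate the effect.
query = "cancel subscription after free trial"
off_topic = "subscription plans renew after the free trial"           # shares words, no answer
on_topic = "to stop billing, open settings and choose end membership"  # answers, shares no words

score_off = bow_cosine(query, off_topic)
score_on = bow_cosine(query, on_topic)
print(f"{score_off:.3f} {score_on:.3f}")  # prints "0.676 0.000"
```

Dense embeddings are less brittle than raw word counts, but the ranking is still driven by vector proximity rather than by whether the chunk resolves the user's intent.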


🛡️ WFGY Fix · BBMC Residue Minimization

B = I - G + m·c²      # minimize ‖B‖

| Symbol | Meaning |
| --- | --- |
| I | Input semantic vector |
| G | Ground-truth anchor (intent) |
| B | Semantic residue (error) |

  • Large ‖B‖ → chunk is semantically off → WFGY rejects or asks for context.
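A minimal sketch of the residue check follows, under stated assumptions. This section does not define `m` or `c`, so the code treats them as scalars whose product `m·c²` is added to every component, and defaults both to 0. The cut-off `max_norm` and the example vectors are hypothetical; the source gives no numeric bound on ‖B‖.

```python
import math


def bbmc_residue(input_vec, anchor_vec, m=0.0, c=0.0):
    """Return B = I - G + m*c^2, component by component."""
    if len(input_vec) != len(anchor_vec):
        raise ValueError("I and G must have the same dimension")
    offset = m * c ** 2  # assumption: scalar term broadcast over all components
    return [i - g + offset for i, g in zip(input_vec, anchor_vec)]


def residue_norm(residue):
    """Euclidean norm of the residue vector."""
    return math.sqrt(sum(b * b for b in residue))


def keep_chunk(input_vec, anchor_vec, max_norm, m=0.0, c=0.0):
    """Keep the chunk only if its residue norm stays within max_norm (hypothetical cut-off)."""
    return residue_norm(bbmc_residue(input_vec, anchor_vec, m, c)) <= max_norm


anchor = [1.0, 0.0, 0.0]      # hypothetical intent vector G
aligned = [0.9, 0.1, 0.0]     # hypothetical chunk close to the intent
divergent = [0.0, 0.0, 1.0]   # hypothetical chunk far from the intent

print(keep_chunk(aligned, anchor, max_norm=0.5))    # True
print(keep_chunk(divergent, anchor, max_norm=0.5))  # False
```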

🔍 Key Defenses

| Layer | Action |
| --- | --- |
| BBMC | Computes residue; filters divergent chunks |
| ΔS Threshold | Rejects high semantic tension (ΔS > 0.6) |
| BBAM | Downweights misleading high-attention tokens |
| Tree Anchor | Confirms chunk aligns with prior logic path |
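The ΔS gate from the table can be sketched as a simple threshold test. This page does not define how ΔS is computed, so the code assumes ΔS = 1 − cosine similarity between the chunk vector and the intent anchor; that definition, the function names, and the example vectors are illustrative assumptions. Only the 0.6 threshold comes from the table.

```python
import math

DELTA_S_MAX = 0.6  # rejection threshold quoted in the table above


def delta_s(a, b):
    """Assumed definition: 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0  # treat an empty vector as maximally tense
    return 1.0 - dot / (na * nb)


def passes_delta_s(chunk_vec, anchor_vec, threshold=DELTA_S_MAX):
    """A chunk passes when its ΔS does not exceed the threshold."""
    return delta_s(chunk_vec, anchor_vec) <= threshold


anchor = [1.0, 0.0]          # hypothetical intent vector
print(passes_delta_s([1.0, 0.0], anchor))  # True  (ΔS = 0)
print(passes_delta_s([0.0, 1.0], anchor))  # False (ΔS = 1)
```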

✍️ Quick Repro (1 min)

1️⃣  Start
> Start

2️⃣  Paste misleading chunk
> "Plans include yearly renewal."

3️⃣  Ask
> "How do I cancel a free trial?"

WFGY:
• ΔS high → chunk rejected  
• Prompts for trial-specific info instead of hallucinating
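The reject-or-ask behaviour in this repro can be sketched as a small routing step. The function name, the return shape, the fallback message, and the ΔS score of 0.82 are all hypothetical; the sketch assumes each chunk already carries a ΔS score and only shows the control flow, not how WFGY implements it.

```python
def route(scored_chunks, threshold=0.6):
    """Answer from chunks with ΔS <= threshold, otherwise ask the user for context.

    scored_chunks: list of (chunk_text, delta_s) pairs with precomputed scores.
    """
    accepted = [text for text, ds in scored_chunks if ds <= threshold]
    if accepted:
        return {"action": "answer", "chunks": accepted}
    return {
        "action": "ask_for_context",
        "message": "No retrieved chunk fits the question; add a relevant source or rephrase.",
    }


# Hypothetical score for the misleading chunk from the repro.
result = route([("Plans include yearly renewal.", 0.82)])
print(result["action"])  # ask_for_context
```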

🔬 Sample Output

Surface overlap detected, but content lacks trial-cancellation detail.  
Add a policy chunk on trial termination or rephrase the query.

🛠 Module Cheat-Sheet

| Module | Role |
| --- | --- |
| BBMC | Residue minimization |
| ΔS Metric | Measures semantic tension |
| BBAM | Suppresses noisy tokens |
| Semantic Tree | Validates anchor alignment |

📊 Implementation Status

| Feature | State |
| --- | --- |
| BBMC residue calc | Stable |
| ΔS filter | Stable |
| Token attention modulation | ⚠️ Basic |
| Misleading chunk rejection | Active |

🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
| --- | --- | --- |
| WFGY 1.0 PDF | Engine Paper | 1 Download · 2 Upload to LLM · 3 Ask “Answer using WFGY + <your question>” |
| TXTOS (plaintext OS) | TXTOS.txt | 1 Download · 2 Paste into any LLM chat · 3 Type “hello world” (OS boots instantly) |

Saved you from “keyword hallucinations”? Drop a ⭐ to keep the fixes coming.

↩︎ Back to Problem Index