mirror of https://github.com/onestardao/WFGY.git synced 2026-04-28 11:40:07 +00:00

PSBigBig 7b52023b87

Update qdrant.md

2025-09-05 11:54:01 +08:00

Qdrant: Guardrails and Fix Patterns

🧭 Quick Return to Map

You are in a sub-page of VectorDBs_and_Stores.
To reorient, go back here:

VectorDBs_and_Stores — vector indexes and storage backends

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.

A compact field guide for stabilizing Qdrant when your pipeline touches RAG, agents, or long context. Use the checks below to localize the failure, then jump to the exact WFGY fix page.

Open these first

Visual map and recovery: RAG Architecture & Recovery
End to end retrieval knobs: Retrieval Playbook
Why this snippet and how to trace it: Retrieval Traceability
Ordering control after recall: Rerankers
Embedding versus semantic meaning: Embedding ≠ Semantic
Long chains and drift checks: Context Drift, Entropy Collapse
Structural collapse and recovery: Logic Collapse
Vectorstore fragmentation signals: Pattern: Vectorstore Fragmentation
Boot fences and cold start traps: Pattern: Bootstrap Deadlock
Live ops and monitoring: Live Monitoring for RAG

Core acceptance

ΔS(question, retrieved) ≤ 0.45 across three paraphrases.
Coverage ≥ 0.70 to the target section.
λ remains convergent across seeds.
E_resonance stays flat across long windows.
Exact run is repeatable with the same data snapshot.
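
The first two targets above can be checked mechanically. The sketch below assumes you already have one ΔS value per paraphrase and a coverage fraction from your own evaluation harness. The function name `passes_core_acceptance` and its arguments are hypothetical and not part of WFGY.

```python
from typing import Sequence


def passes_core_acceptance(
    delta_s_per_paraphrase: Sequence[float],
    coverage: float,
    max_delta_s: float = 0.45,   # ceiling from "ΔS(question, retrieved) ≤ 0.45"
    min_coverage: float = 0.70,  # floor from "Coverage ≥ 0.70"
) -> bool:
    """Return True when every paraphrase meets the ΔS ceiling and coverage meets the floor."""
    if len(delta_s_per_paraphrase) < 3:
        raise ValueError("need ΔS for at least three paraphrases")
    delta_ok = all(d <= max_delta_s for d in delta_s_per_paraphrase)
    return delta_ok and coverage >= min_coverage
```

The λ and E_resonance targets are not covered here because this page does not define how to compute them.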

Fix in 60 seconds

Measure ΔS
- Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor).
- Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
Probe with λ_observe
- Vary top-k {5, 10, 20}. Flat high curve suggests index or metric mismatch.
- Reorder prompt headers. If ΔS spikes, lock the schema with Data Contracts.
Apply the module
- Retrieval drift → BBMC + Data Contracts.
- Logic collapse → BBCR bridge then BBAM variance clamp.
- Dead ends in long runs → BBPF alternate path.
Verify
- Rerun on two paraphrases and one seed change. All acceptance targets must pass.
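
A minimal sketch of the measurement step. This page does not define ΔS, so the code assumes ΔS is one minus the cosine similarity of two embedding vectors; treat that as an assumption and swap in your own definition if it differs. The band boundaries come from the list above, and the function names are illustrative.

```python
import math
from typing import Sequence


def delta_s(a: Sequence[float], b: Sequence[float]) -> float:
    """ASSUMPTION: ΔS = 1 - cosine similarity of two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a zero vector has no direction")
    return 1.0 - dot / (norm_a * norm_b)


def band(value: float) -> str:
    """Map a ΔS value to the bands listed above."""
    if value < 0.40:
        return "stable"
    if value < 0.60:
        return "transitional"
    return "risk"
```

Compute `delta_s` once for (question, retrieved) and once for (retrieved, expected anchor), then repeat for each top-k setting to see whether the curve stays flat and high.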

Typical breakpoints and the right fix

1) Distance metric does not match the embedding family

Symptom: high similarity scores but wrong meaning.
Check: confirm the collection distance matches the metric the embedding model expects. That is cosine for most sentence embeddings; dot product or Euclidean on unnormalized vectors can degrade recall.
Fix: recreate the collection with the correct metric and re-ingest. See Embedding ≠ Semantic and Retrieval Playbook.
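
To see why the metric matters, the toy example below ranks the same two vectors by dot product and by cosine. The vectors are made up for illustration. With unnormalized vectors, a long off-axis vector can beat a short vector that points exactly along the query.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = [1.0, 0.0]
docs = {
    "aligned_short": [1.0, 0.0],  # same direction as the query, small norm
    "off_axis_long": [3.0, 3.0],  # 45 degrees off the query, large norm
}

# Dot product rewards vector length, cosine only rewards direction.
best_by_dot = max(docs, key=lambda name: dot(query, docs[name]))
best_by_cosine = max(docs, key=lambda name: cosine(query, docs[name]))

print(best_by_dot, best_by_cosine)
```

If your embedding model emits unit-length vectors, the two rankings agree and the mismatch is harmless; the problem shows up when norms vary.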

2) Vector dimension drift after model switch

Symptom: inserts fail, or the client silently truncates vectors and retrieval later becomes erratic.
Fix: confirm the embedding dimension equals the collection's vector size. If the dimension changed, create a new collection and backfill. See Vectorstore Fragmentation.
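
One way to catch this early is a guard in your own ingestion code that runs before any upsert. The sketch below is client-agnostic; `check_dimension` is a hypothetical helper, and `collection_size` is assumed to be the vector size you read from your collection configuration.

```python
from typing import Sequence


def check_dimension(vector: Sequence[float], collection_size: int) -> None:
    """Raise before upsert when the embedding does not match the collection size."""
    if len(vector) != collection_size:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"collection expects {collection_size}; "
            "create a new collection and backfill instead of truncating"
        )
```

Failing loudly here keeps a model switch from turning into silent truncation further down the pipeline.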

3) HNSW recall too low

Symptom: relevant chunk never appears in top-k until k is very large.
Fix: raise ef_construct at build time and the search-time ef (hnsw_ef in Qdrant search parameters) for accuracy checks. For audits, run exact search when your client exposes it and compare the result against the approximate one. Then tune m and ef. See Retrieval Playbook and Rerankers.
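
The audit comparison can be reduced to a recall@k number. The sketch below assumes you have run the same query twice, once with the default approximate search and once in exact mode, and collected the returned point IDs in rank order. `recall_at_k` is an illustrative helper, not a Qdrant API.

```python
from typing import Hashable, Sequence


def recall_at_k(
    approx_ids: Sequence[Hashable],
    exact_ids: Sequence[Hashable],
    k: int,
) -> float:
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact search returned no results")
    hits = len(truth & set(approx_ids[:k]))
    return hits / len(truth)
```

Track this number while you raise the search-time ef. If recall stays low even at high ef, the build-time settings (m, ef_construct) are the more likely cause.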

4) Payload filter without proper index

Symptom: filters work but top-k ordering is erratic or slow.
Fix: create payload indexes for frequently used keys. Validate that the filter reduces the candidate set, then rerank. Map to Retrieval Traceability.

5) Named vectors mismatch

Symptom: empty results or strange recall after adding a multi-vector schema.
Fix: confirm client queries the intended named vector. Align updater and retriever. See Data Contracts.

6) Quantization hurting recall

Symptom: answers look fuzzy at small k after enabling scalar or product quantization (PQ).
Fix: disable quantization when doing quality checks. If you must keep it, increase k and rerank. See Retrieval Playbook.
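
The "increase k and rerank" step can be sketched as follows. The code assumes you fetched more candidates than you need from the quantized index and still have the original full-precision vectors for those candidates. The helper name and the cosine scoring are assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank_full_precision(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],  # id -> original, unquantized vector
    k: int,
) -> List[str]:
    """Re-score over-fetched candidates with full-precision vectors and keep the top k."""
    ranked = sorted(
        candidates,
        key=lambda cid: _cosine(query, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]
```

For example, fetch 4k candidates from the quantized index, then keep the top k after this pass and compare the result with an unquantized run.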

7) Cluster version skew or cold replicas

Symptom: node A returns a different result set from node B.
Fix: confirm all shards are green, replicas in sync, and warm. Run the ops checklist. See Live Monitoring for RAG and Bootstrap Deadlock.
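
A quick way to quantify the symptom is to send the same query to each node and compare the returned IDs. The sketch below uses Jaccard overlap; the helper name is hypothetical, and what counts as an acceptable overlap is your call.

```python
from typing import Hashable, Iterable


def result_overlap(ids_a: Iterable[Hashable], ids_b: Iterable[Hashable]) -> float:
    """Jaccard overlap of two result sets: 1.0 means identical, 0.0 means disjoint."""
    a, b = set(ids_a), set(ids_b)
    if not a and not b:
        return 1.0  # two empty result sets agree with each other
    return len(a & b) / len(a | b)
```

Run it over a fixed query set before and after the ops checklist so you can tell whether the nodes have actually converged.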

8) Hybrid retrieval wired incorrectly

Symptom: BM25 returns good docs but hybrid fusion gets worse.
Fix: normalize scores before fusing, or rerank with a cross-encoder. See Rerankers and Retrieval Playbook.
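
A minimal sketch of "normalize, then fuse", assuming min-max normalization and a weighted sum. Both choices are illustrative; the page does not prescribe a fusion method. Raw BM25 scores and vector similarities live on different scales, so adding them unnormalized lets one retriever dominate.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score list maps to 1.0 for every id."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    dense_weight: float = 0.5,
) -> List[str]:
    """Return ids ordered by the weighted sum of normalized dense and sparse scores."""
    d, s = min_max(dense), min_max(sparse)
    fused = {
        doc_id: dense_weight * d.get(doc_id, 0.0)
        + (1.0 - dense_weight) * s.get(doc_id, 0.0)
        for doc_id in set(d) | set(s)
    }
    # Sort by score descending, then by id so ties are deterministic.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
```

A document missing from one retriever contributes 0.0 on that side, which is a design choice you may want to revisit if one retriever returns far fewer results than the other.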

Minimal reproduce prompt for your AI

Paste this into your LLM after you have uploaded TXT OS and the Problem Map.

I uploaded TXT OS and the WFGY ProblemMap files.
My Qdrant bug:
- symptom: [one line]
- traces: [index settings, distance, dim, ef, named vectors, filters, collection schema]
- ΔS(question,retrieved)=..., ΔS(retrieved,anchor)=..., λ states

Tell me:
1) which layer is failing and why,
2) which exact fix page to open from this repo,
3) the minimal steps to push ΔS ≤ 0.45 and keep λ convergent,
4) how to verify with a reproducible test.
Use BBMC/BBPF/BBCR/BBAM where relevant.

Patterns to check next

Vectorstore fragmentation: pattern page
Query parsing split in HyDE or BM25: pattern page
Hallucination re-entry: pattern page

Escalate when

You changed metric or dimension. Rebuild the collection.
You see per-node inconsistency. Freeze writes, take a snapshot, verify shard state, then rerun the acceptance checks.
You rely on heavy filters. Add payload indexes and move final ordering to a reranker.

🔗 Quick-Start Downloads (60 sec)

Tool	Link	3-Step Setup
WFGY 1.0 PDF	Engine Paper	1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS)	TXTOS.txt	1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

🧭 Explore More

Module	Description	Link
WFGY Core	WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack	View →
Problem Map 1.0	Initial 16-mode diagnostic and symbolic fix framework	View →
Problem Map 2.0	RAG-focused failure tree, modular fixes, and pipelines	View →
Semantic Clinic Index	Expanded failure catalog: prompt injection, memory bugs, logic drift	View →
Semantic Blueprint	Layer-based symbolic reasoning & semantic modulations	View →
Benchmark vs GPT-5	Stress test GPT-5 with full WFGY reasoning suite	View →
🧙‍♂️ Starter Village 🏡	New here? Lost in symbols? Click here and let the wizard guide you through	Start →

👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open source builders who supported WFGY from day one.

⭐ WFGY Engine 2.0 is already unlocked. ⭐ Star the repo to help others discover it and unlock more on the Unlock Board.
