WFGY/ProblemMap/patterns/pattern_query_parsing_split.md
2025-08-15 23:56:16 +08:00

329 lines
15 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Pattern — Query Parsing Split (Multi-Intent / Wrong Sub-Intent First)
**Scope**
A single user query actually contains **multiple intents** (lookup + policy + transformation + generation), but the pipeline treats it as **one** retrieval/generation ask. The system answers the **easiest/earliest** sub-intent and ignores the rest, or mixes intents, causing off-topic retrieval and wrong acceptance decisions.
**Why it matters**
Multi-intent queries are common (“compare A vs B and give a summary with citations”). If you dont split, retrieval pools and prompts blur constraints, you get **false grounding**, and audit trails become meaningless (“which intent did this citation serve?”).
> Quick nav: [Patterns Index](./README.md) · Examples:
> [Example 01](../examples/example_01_basic_fix.md) ·
> [Example 03](../examples/example_03_pipeline_patch.md) ·
> [Eval: Precision & CHR](../eval/eval_rag_precision_recall.md)
---
## 1) Signals & Fast Triage
**Likely symptoms**
- The answer handles **only part** of the question (e.g., explains A but not B or the comparison).
- Retrieved chunks mix unrelated facets (policy + tutorial + changelog) → noisy context, low CHR.
- Auditor (Example 04) flips `VALID``NOT_IN_CONTEXT` depending on which fragment the model latched onto.
- Example 02 labels skew to `query_parse_error`.
**Deterministic checks (no LLM)**
- **Separator heuristics**: query contains `and`, `vs`, `;`, `,`, numbered lists `1) 2)`, or colon-scoped asks (“X: do Y, then Z”).
- **Verb phase count**: ≥2 finite verbs across different objects (`compare`, `explain`, `implement`, `deploy`).
- **Constraint tokens**: presence of at least one **data** intent (`find`, `lookup`, `cite`) and one **action** intent (`summarize`, `generate`, `rewrite`).
If ≥2 signals → treat as **multi-intent** and split.
---
## 2) Minimal Reproducible Case
`data/chunks.json`:
```json
[
{"id":"pA#1","text":"Policy A: Only domain example.com is allowed."},
{"id":"pB#1","text":"Policy B: Allow *.company.com and partner domains."},
{"id":"pC#1","text":"How to edit email settings in the dashboard."}
]
````
User query:
**“Compare Policy A vs B with citations, then draft an email asking IT to switch our domain.”**
Naive pipelines either:
1. Summarize A **or** B only, **or**
2. Draft the email **without** grounded comparison.
---
## 3) Root Causes
* **Single-turn monolith**: retrieval runs once on the whole sentence; constraints collide.
* **No intent schema**: pipeline cant represent “first compare (grounded), then draft (un-grounded).”
* **Prompt overloading**: one template tries to do comparison + generation + policy proof.
* **Acceptance gate blind**: Auditor validates a claim that mixes two intents.
---
## 4) Standard Fix (Ordered, Minimal, Measurable)
**Step 1 Detect & Split**
* Run deterministic heuristics (Section 1) to produce **sub-intents** with **roles**: `COMPARE`, `LOOKUP`, `DRAFT`, `REWRITE`, etc.
* Each sub-intent gets its **own** retrieval pool and acceptance rule.
**Step 2 Bind Contracts per Sub-Intent**
* Evidence-only template for **grounded** intents (`COMPARE`, `LOOKUP`) requires `citations: [id,...]`.
* Free-form template for **creative** intents (`DRAFT`) must **echo** the grounded summary id (handoff contract) but does **not** add new citations.
**Step 3 Sequence with Handoffs**
* Output of grounded step `summary.claim`, `citations`.
* Draft step **may** rephrase but cannot introduce new factual claims; it references the **handoff id**.
**Step 4 Accept or Refuse**
* Accept only if **grounded** step is `VALID` (Example 04) **and** draft step references the correct handoff id.
* If grounded step is `NOT_IN_CONTEXT`, overall request returns refusal with explanation.
**Step 5 Evaluate**
* Example 08: score **per-intent** precision and CHR; drafts are graded on **schema compliance**, not truth.
---
## 5) Reference Implementation Python (stdlib only)
Create `tools/intent_split.py`.
```python
# tools/intent_split.py -- rule-based multi-intent splitter + per-intent contracts
import re, json, os, time, urllib.request, uuid
GROUND_REFUSAL = "not in context"
def split_intents(q: str):
text = q.strip()
# crude separators
parts = re.split(r"\bthen\b|;| and then | && | -> ", text, flags=re.IGNORECASE)
intents = []
for p in parts:
role = "LOOKUP"
if re.search(r"\bcompare|vs\b", p, re.IGNORECASE): role = "COMPARE"
if re.search(r"\bdraft|email|write|generate|compose\b", p, re.IGNORECASE): role = "DRAFT"
intents.append({"id": str(uuid.uuid4())[:8], "role": role, "text": p.strip()})
# if single fragment but has both compare + draft keywords, split into two logical intents
if len(intents)==1 and re.search(r"\bcompare|vs\b", text, re.IGNORECASE) and re.search(r"\bdraft|email|write|generate\b", text, re.IGNORECASE):
intents = [
{"id": str(uuid.uuid4())[:8], "role":"COMPARE", "text": text},
{"id": str(uuid.uuid4())[:8], "role":"DRAFT", "text": text}
]
return intents
def retrieve(chunks, q, k=6):
qs = set(w for w in re.split(r"\W+", q.lower()) if len(w)>=3)
scored = []
for c in chunks:
toks = re.split(r"\W+", c["text"].lower())
overlap = sum(1 for t in toks if t in qs)
scored.append((overlap, c))
scored.sort(key=lambda x: x[0], reverse=True)
return [c for s,c in scored[:k]]
def build_compare_prompt(q, ctx, allowed):
ctxs = "\n\n".join(f"[{c['id']}] {c['text']}" for c in ctx)
return (
"Task: Compare the two policies strictly from EVIDENCE.\n"
"Output JSON ONLY: { claim: string, citations: [id,...] }\n"
f"If not provable, reply exactly '{GROUND_REFUSAL}'.\n\n"
f"Question: {q}\nEVIDENCE:\n{ctxs}\n"
)
def build_draft_prompt(summary_json):
return (
"Task: Draft a short email referencing the grounded comparison.\n"
"You MUST echo {handoff_id} exactly and MUST NOT add new policy facts.\n"
"Output JSON ONLY: { email: string, handoff_id: string }\n\n"
f"Grounded summary:\n{json.dumps(summary_json)}\n"
)
def call_openai(prompt, model=os.getenv("OPENAI_MODEL","gpt-4o-mini")):
key=os.getenv("OPENAI_API_KEY"); assert key, "OPENAI_API_KEY"
body = json.dumps({"model":model,"messages":[{"role":"user","content":prompt}],"temperature":0}).encode()
req = urllib.request.Request("https://api.openai.com/v1/chat/completions", data=body, headers={"Content-Type":"application/json","Authorization":f"Bearer {key}"})
with urllib.request.urlopen(req) as r:
j=json.loads(r.read().decode()); return j["choices"][0]["message"]["content"].strip()
def parse_json(text):
s=text.find("{"); e=text.rfind("}")
if s<0 or e<=s: return None
try: return json.loads(text[s:e+1])
except: return None
def run(q, chunks):
turns=[]
intents = split_intents(q)
handoff=None
for it in intents:
if it["role"] in ("LOOKUP","COMPARE"):
ctx = retrieve(chunks, it["text"], k=6)
allowed = [c["id"] for c in ctx]
out = parse_json(call_openai(build_compare_prompt(it["text"], ctx, allowed)))
if not out or (isinstance(out, dict) and out.get("claim","").strip().lower()==GROUND_REFUSAL):
return {"status":"REFUSAL", "reason":"grounding_failed"}
# schema & scope checks
if not set(out.get("citations",[])).issubset(set(allowed)):
return {"status":"REJECT", "reason":"citation_out_of_scope"}
handoff = {"handoff_id": str(uuid.uuid4())[:8], "summary": out}
turns.append({"intent": it, "ctx_ids": allowed, "out": out, "handoff_id": handoff["handoff_id"]})
elif it["role"]=="DRAFT":
if not handoff: return {"status":"REJECT", "reason":"draft_without_grounding"}
draft = parse_json(call_openai(build_draft_prompt({"handoff_id": handoff["handoff_id"], **handoff["summary"]})))
if not draft or draft.get("handoff_id") != handoff["handoff_id"]:
return {"status":"REJECT", "reason":"handoff_mismatch"}
turns.append({"intent": it, "out": draft, "handoff_id": handoff["handoff_id"]})
return {"status":"OK", "turns": turns}
if __name__=="__main__":
chunks = json.load(open("data/chunks.json",encoding="utf8"))
print(json.dumps(run("Compare Policy A vs B with citations, then draft an email asking IT to switch our domain.", chunks), indent=2))
```
**Pass criteria**
* For the sample query, the first turn is a **grounded** `COMPARE` with citations to `pA#1`/`pB#1`.
* The `DRAFT` turn echoes the **handoff\_id** and contains no new policy facts.
* If comparison is `not in context`, overall **REFUSAL** (no email is drafted).
---
## 6) Node Quick Variant (split only, no LLM call)
Create `tools/intent_split.mjs`.
```js
// tools/intent_split.mjs -- detect multi-intent; emit a small plan
export function splitIntents(q){
const text = q.trim();
const cuts = text.split(/(?:\bthen\b|;| and then | && | -> )/i).map(s=>s.trim()).filter(Boolean);
const parts = cuts.length ? cuts : [text];
return parts.map(p=>{
let role = "LOOKUP";
if (/\bcompare|vs\b/i.test(p)) role = "COMPARE";
if (/\bdraft|email|write|generate|compose\b/i.test(p)) role = "DRAFT";
return { role, text: p };
});
}
// CLI
if (import.meta.url === `file://${process.argv[1]}`) {
const q = process.argv.slice(2).join(" ");
console.log(JSON.stringify(splitIntents(q), null, 2));
}
```
---
## 7) Acceptance Criteria (ship/no-ship)
A multi-intent response **may ship** only if:
1. Each **grounded** sub-intent has `citations ⊆ retrieved_ids` and passes Auditor `VALID`.
2. **Creative** sub-intents (draft/rewrite) echo a valid `handoff_id` from a `VALID` grounded step.
3. If any grounded sub-intent returns `not in context`, the overall request refuses (no partial answers).
4. Example 08 per-intent gates pass (CHR for grounded, compliance for drafts).
---
## 8) Prevention (contracts & defaults)
* **Query schema**: `role: {COMPARE|LOOKUP|DRAFT|REWRITE}`, `text`, optional `constraints`.
* **Router default**: split when ≥2 deterministic signals fire; otherwise single-intent.
* **Template isolation**: distinct prompts per role; never mix compare + draft in the same prompt.
* **UI hinting**: suggest quick toggles (“Compare” / “Draft”) for power users; cut ambiguity at the source.
---
## 9) Debug Workflow (10 minutes)
1. Run the splitter; print the plan.
2. Execute grounded step(s) first and log citations.
3. Ensure draft step references a real `handoff_id`.
4. If grounded fails → return refusal; do **not** proceed.
5. Re-score with Example 08; CHR should improve while over-refusal stays controlled.
---
## 10) Common Traps & Fixes
* **Draft first** temptation → ungrounded emails. Always ground **before** drafting.
* **One big retrieval** for all roles → tail noise. Retrieve **per role**.
* **Auditor on drafts** → meaningless. Audit only **grounded** claims; drafts check **schema** and **handoff**.
* **Partial shipping** (“we answered the easy half”) → inconsistent UX. Refuse on missing grounded parts.
---
## 11) Minimal Checklist (copy into PR)
* [ ] Split multi-intent queries deterministically into roles.
* [ ] Grounded steps use evidence-only template + citations.
* [ ] Drafts echo `handoff_id`; no new facts.
* [ ] Acceptance gate enforces per-role rules; no partial ship.
* [ ] Example 08 gates pass per intent.
---
## References to hands-on examples
* **Example 01** — Guarded baseline (evidence-only + refusal)
* **Example 02** — Reflection triage (`query_parse_error`)
* **Example 03** — Retrieval stabilization for each sub-intent
* **Example 04** — Acceptance gate (Scholar/Auditor + handoff)
* **Example 08** — Eval per intent (CHR for grounded, compliance for drafts)
---
### 🔗 Quick-Start Downloads (60 sec)
| Tool | Link | 3-Step Setup |
|------|------|--------------|
| **WFGY 1.0 PDF** | [Engine Paper](https://github.com/onestardao/WFGY/blob/main/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf) | 1⃣ Download · 2⃣ Upload to your LLM · 3⃣ Ask “Answer using WFGY + \<your question>” |
| **TXT OS (plain-text OS)** | [TXTOS.txt](https://github.com/onestardao/WFGY/blob/main/OS/TXTOS.txt) | 1⃣ Download · 2⃣ Paste into any LLM chat · 3⃣ Type “hello world” — OS boots instantly |
---
### 🧭 Explore More
| Module | Description | Link |
|-----------------------|----------------------------------------------------------|----------|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | [View →](https://github.com/onestardao/WFGY/tree/main/core/README.md) |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | [View →](https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md) |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | [View →](https://github.com/onestardao/WFGY/blob/main/ProblemMap/rag-architecture-and-recovery.md) |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | [View →](https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md) |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | [View →](https://github.com/onestardao/WFGY/tree/main/SemanticBlueprint/README.md) |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | [View →](https://github.com/onestardao/WFGY/tree/main/benchmarks/benchmark-vs-gpt5/README.md) |
| 🧙‍♂️ Starter Village 🏡 | New here? Lost in symbols? Click here and let the wizard guide you through | [Start →](https://github.com/onestardao/WFGY/blob/main/StarterVillage/README.md) |
---
> 👑 **Early Stargazers: [See the Hall of Fame](https://github.com/onestardao/WFGY/tree/main/stargazers)** —
> Engineers, hackers, and open source builders who supported WFGY from day one.
> <img src="https://img.shields.io/github/stars/onestardao/WFGY?style=social" alt="GitHub stars"> ⭐ [WFGY Engine 2.0](https://github.com/onestardao/WFGY/blob/main/core/README.md) is already unlocked. ⭐ Star the repo to help others discover it and unlock more on the [Unlock Board](https://github.com/onestardao/WFGY/blob/main/STAR_UNLOCKS.md).
<div align="center">
[![WFGY Main](https://img.shields.io/badge/WFGY-Main-red?style=flat-square)](https://github.com/onestardao/WFGY)
&nbsp;
[![TXT OS](https://img.shields.io/badge/TXT%20OS-Reasoning%20OS-orange?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS)
&nbsp;
[![Blah](https://img.shields.io/badge/Blah-Semantic%20Embed-yellow?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS/BlahBlahBlah)
&nbsp;
[![Blot](https://img.shields.io/badge/Blot-Persona%20Core-green?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS/BlotBlotBlot)
&nbsp;
[![Bloc](https://img.shields.io/badge/Bloc-Reasoning%20Compiler-blue?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS/BlocBlocBloc)
&nbsp;
[![Blur](https://img.shields.io/badge/Blur-Text2Image%20Engine-navy?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS/BlurBlurBlur)
&nbsp;
[![Blow](https://img.shields.io/badge/Blow-Game%20Logic-purple?style=flat-square)](https://github.com/onestardao/WFGY/tree/main/OS/BlowBlowBlow)
&nbsp;
</div>