# Azure OpenAI — Guardrails and Fix Patterns

Use this page when failures look provider-specific to Azure OpenAI: a model alias used where a deployment name is required, a missing `api-version`, tool-call payload shape drift, content-safety blocks, or regional throttling. Each fix maps back to WFGY pages with measurable targets.
## Core acceptance
- ΔS(question, retrieved) ≤ 0.45
- coverage ≥ 0.70 for the target section
- λ remains convergent across 3 paraphrases
## Open these first
- Visual map and recovery: RAG Architecture & Recovery
- End-to-end retrieval knobs: Retrieval Playbook
- Why this snippet (traceability schema): Retrieval Traceability
- Ordering control: Rerankers
- Embedding vs meaning: Embedding ≠ Semantic
- Hallucination and chunk boundaries: Hallucination
- Long chains and entropy: Context Drift, Entropy Collapse
- Snippet and citation schema: Data Contracts
- Structural collapse and recovery: Logic Collapse
- First call after deploy fails: Bootstrap Ordering, Deployment Deadlock, Pre-deploy Collapse
## Fix in 60 seconds

- **Measure ΔS**
  - Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor).
  - Thresholds: stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
- **Probe with λ_observe**
  - Vary k (5, 10, 20). If ΔS stays high while recall is fine, suspect an index or metric mismatch.
  - Reorder prompt headers. If ΔS spikes, lock the schema.
- **Apply the module**
  - Retrieval drift → BBMC + Data Contracts.
  - Reasoning collapse → BBCR bridge + BBAM variance clamp.
  - Dead ends in long runs → BBPF alternate path.
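The ΔS measurement step above can be sketched in a few lines. This assumes ΔS is defined as 1 minus cosine similarity over embedding vectors (the embedding model itself is out of scope here); the threshold values are taken directly from the list above.

```python
import math

def delta_s(vec_a, vec_b):
    """ΔS as 1 - cosine similarity between two embedding vectors.

    Identical directions give 0.0; orthogonal vectors give 1.0.
    """
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return 1.0 - dot / (norm_a * norm_b)

def classify(ds):
    """Map a ΔS value onto the stability bands listed above."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "risk"
```

Run `classify(delta_s(q_vec, retrieved_vec))` for each hop and log the band alongside the λ state, so a drift into the risk band is visible per step rather than only at the final answer.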
## Azure-specific failure signatures and the right fix
| Symptom | Likely cause on Azure | Open this fix |
|---|---|---|
| 200 OK but empty or tool call ignored | Missing or wrong `api-version` for that model family | Data Contracts, Logic Collapse |
| "Deployment not found" even though the model exists | Using the model name instead of the deployment name, or wrong resource/region | Retrieval Traceability |
| Output blocked, vague "content filtered" | Azure content-safety layer differs from the OpenAI default | Hallucination, then clamp with BBAM |
| Tool-call schema mismatch vs OpenAI | Response keys or enum names differ across `api-version` values | Data Contracts |
| Works in one region, fails in another | Model availability and quotas are regional | Bootstrap Ordering |
| Latency spikes or 429 under load | Per-resource rate limits, private link, or VNet egress | Ops Live Monitoring, Debug Playbook |
| Function calls drop arguments | An old `api-version` truncates or renames fields | Data Contracts |
| Fine-tuned or staged deployment not picked up | Wrong deployment alias bound in the prod slot | Retrieval Traceability |
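One cheap guard against the "Deployment not found" row in the table is to build the request URL from an explicit tuple, so a model alias can never be substituted for the deployment name. This is a minimal sketch following the documented Azure OpenAI URL shape; the resource name, deployment name, and `api-version` in the usage example are placeholders, and a real client should normally come from the official SDK.

```python
def azure_chat_url(resource: str, deployment: str, api_version: str) -> str:
    """Build the Azure OpenAI chat-completions URL.

    `deployment` must be the deployment name you created for the resource,
    not the underlying model name ("gpt-4o" is a model, not a deployment).
    """
    if not (resource and deployment and api_version):
        raise ValueError("resource, deployment, and api-version are all required")
    return (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )
```

For example, `azure_chat_url("my-resource", "gpt4o-prod", "2024-06-01")` makes the `{resource, deployment, api-version}` tuple explicit at the call site, which is exactly what the audit prompt below asks the model to confirm.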
## Minimal provider checklist

- **Endpoint correctness.** Resource URL, region, and deployment name are consistent. Avoid mixing the model name with the deployment name in the same client.
- **Version pinning.** Pin an `api-version` known to support your features (tools, JSON mode, response format). Treat version bumps as schema migrations.
- **Schema lock.** Adopt the Problem Map Data Contracts snippet for tool payloads and citations. Reject partial responses. Require a structured `finish_reason`.
- **Safety behavior.** Expect an extra content-safety layer. When blocked, route to the BBPF alternate path, down-shift temperature, then retry with a narrowed scope.
- **Observability.** Log λ state per step, ΔS per hop, and the exact deployment id used. Carry region and version in traces.
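The schema-lock item above can be enforced with a small validator that rejects partial responses before they reach your pipeline. A minimal sketch: the field names mirror the OpenAI-style response shape, and `REQUIRED_ARGS` is a hypothetical per-tool contract you would define in your own Data Contract.

```python
# Hypothetical contract: each tool name maps to the argument keys it must carry.
REQUIRED_ARGS = {"search_docs": {"query", "top_k"}}

def validate_completion(choice: dict) -> list:
    """Return a list of contract violations; an empty list means it passes."""
    errors = []
    # Require a structured finish_reason; anything else is a partial response.
    if choice.get("finish_reason") not in {"stop", "tool_calls"}:
        errors.append(f"unexpected finish_reason: {choice.get('finish_reason')!r}")
    # Every tool call must carry all arguments its contract demands.
    for call in choice.get("tool_calls", []):
        name = call.get("name")
        missing = REQUIRED_ARGS.get(name, set()) - set(call.get("arguments", {}))
        if missing:
            errors.append(f"{name} missing arguments: {sorted(missing)}")
    return errors
```

Wiring this in as a hard gate (raise or retry on a non-empty list) turns "function calls drop arguments" from a silent downstream failure into an immediate, logged rejection.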
## Copy-paste prompt (safe to hand the AI)

```txt
I am using Azure OpenAI. Audit my run as follows:

* Check ΔS(question, retrieved) and ΔS(retrieved, anchor). Show the numbers.
* Confirm the endpoint tuple {resource, region, deployment}. Confirm `api-version` and the tool schema.
* If the tool schema mismatches: apply the WFGY Data Contracts checklist and propose the exact fields to lock.
* If blocked by content safety: switch to a narrower prompt schema, reduce temperature, and route via BBPF.
* Keep λ convergent across 3 paraphrases. If it flips, apply BBCR + BBAM and show the before/after traces.
* Link me to the exact Problem Map pages I should read next.
```
## Escalation

- Still unstable after schema lock → re-index and re-embed, then verify with the Retrieval Playbook and Embedding ≠ Semantic.
- Consistent provider errors → freeze `api-version`, roll back the deployment alias, rerun with a fixed seed, attach λ traces, then open the Debug Playbook.
- First call after deploy fails → rebuild the boot fence with Pre-deploy Collapse.
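For the 429 and regional-throttling cases in the table above, a bounded retry wrapper keeps reruns deterministic enough to compare traces. This is a sketch under assumptions: `RateLimitError` stands in for whatever exception your client raises, and the `retry_after` hint models a `Retry-After` header when the service provides one.

```python
import time

class RateLimitError(Exception):
    """Stand-in for the client's throttling error; may carry a server hint."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after

def with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `call` with capped exponential backoff on rate-limit errors."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise  # budget exhausted; surface the error to the caller
            # Honor the server's hint when present, otherwise back off.
            sleep(err.retry_after if err.retry_after else delay)
            delay = min(delay * 2, 30.0)  # cap growth at 30 s
```

Injecting `sleep` makes the wrapper testable without real waits, and logging each attempt with the deployment id and region (per the observability checklist) shows whether throttling is per-resource or regional.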
## Acceptance targets
- coverage to target section ≥ 0.70
- ΔS(question, retrieved) ≤ 0.45
- λ convergent across seeds and paraphrases
- repeatable traces and identical schema across regions
## 🔗 Quick-Start Downloads (60 sec)
| Tool | Link | 3-Step Setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly |
## 🧭 Explore More
| Module | Description | Link |
|---|---|---|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| 🧙‍♂️ Starter Village 🏡 | New here? Lost in symbols? Click here and let the wizard guide you through | Start → |
👑 **Early Stargazers:** see the Hall of Fame, the engineers, hackers, and open-source builders who supported WFGY from day one.

⭐ WFGY Engine 2.0 is already unlocked. Star the repo to help others discover it and unlock more on the Unlock Board.