Azure OCR (Computer Vision): Guardrails and Fix Patterns
🧭 Quick Return to Map
You are in a sub-page of DocumentAI_OCR.
To reorient, go back here:
- DocumentAI_OCR — document parsing and optical character recognition
- WFGY Global Fix Map — main Emergency Room, 300+ structured fixes
- WFGY Problem Map 1.0 — 16 reproducible failure modes
Think of this page as a desk within a ward.
If you need the full triage and all prescriptions, return to the Emergency Room lobby.
Use this page when Azure OCR (part of Azure Cognitive Services / Computer Vision) drives ingestion for PDFs, scanned images, or mixed-language docs.
Typical failures involve layout instability, multilingual tokenization errors, or coverage gaps in table/handwriting recognition.
Open these first
- Visual map and recovery: RAG Architecture & Recovery
- Retrieval knobs: Retrieval Playbook
- Citation schema: Retrieval Traceability
- Embedding vs meaning: Embedding ≠ Semantic
- Hallucination and drift: Context Drift, Hallucination
- Chunk stability: Chunking Checklist
Core acceptance
- ΔS(question, retrieved) ≤ 0.45
- Coverage ≥ 0.70 to target section
- λ convergent across 3 paraphrases and 2 seeds
- Multilingual tokens ≥ 90% fidelity (baseline against source)
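The four targets above can be wired into a simple pass/fail gate. The sketch below is illustrative only: the `OcrMetrics` container, the function names, and the definition of token fidelity (share of source tokens that reappear in the OCR output, counted with multiplicity) are assumptions made for this example, not part of WFGY or the Azure SDK. How ΔS and λ convergence are computed is left to your own pipeline.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class OcrMetrics:
    """Hypothetical container for the four acceptance measurements."""
    delta_s: float           # ΔS(question, retrieved), computed upstream
    coverage: float          # fraction of the target section retrieved
    lambda_convergent: bool  # True if λ converged on 3 paraphrases x 2 seeds
    token_fidelity: float    # multilingual token fidelity against the source


def token_fidelity(source_tokens, ocr_tokens):
    """Assumed definition: share of source tokens found in the OCR output."""
    if not source_tokens:
        return 1.0
    src, ocr = Counter(source_tokens), Counter(ocr_tokens)
    matched = sum(min(count, ocr[tok]) for tok, count in src.items())
    return matched / len(source_tokens)


def acceptance_failures(m: OcrMetrics) -> list:
    """Return the names of the acceptance targets that are not met."""
    failures = []
    if m.delta_s > 0.45:
        failures.append("delta_s")
    if m.coverage < 0.70:
        failures.append("coverage")
    if not m.lambda_convergent:
        failures.append("lambda")
    if m.token_fidelity < 0.90:
        failures.append("token_fidelity")
    return failures
```

An empty list means the run passes; otherwise the returned names indicate which target to investigate first.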
Typical breakpoints → structural fix
- Language mixing errors (Chinese + English, Arabic + Latin text) → Embedding ≠ Semantic, Chunking Checklist
- Table recognition drops column anchors → Data Contracts, Retrieval Traceability
- Handwriting recognition unstable across runs → Entropy Collapse
- ΔS > 0.60 when OCR normalizes accents/diacritics → Context Drift, clamp with BBAM
- Injected content hidden in image metadata → Prompt Injection
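For the accent/diacritic breakpoint, it can help to confirm that the drift really comes from stripped or altered marks before reaching for a fix. The following is a minimal sketch using only Python's standard `unicodedata` module; the function names are hypothetical, and it assumes you already have reference words aligned one-to-one with OCR words.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks (accents, umlauts, etc.)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def differs_only_in_diacritics(reference_word: str, ocr_word: str) -> bool:
    """True when the two words match once diacritics are ignored,
    but are not identical after NFC normalization."""
    ref = unicodedata.normalize("NFC", reference_word)
    ocr = unicodedata.normalize("NFC", ocr_word)
    return ref != ocr and strip_diacritics(ref) == strip_diacritics(ocr)


def diacritic_loss_rate(reference_words, ocr_words) -> float:
    """Share of aligned word pairs that differ only in diacritics."""
    pairs = list(zip(reference_words, ocr_words))
    if not pairs:
        return 0.0
    hits = sum(1 for ref, ocr in pairs if differs_only_in_diacritics(ref, ocr))
    return hits / len(pairs)
```

A high rate here suggests the ΔS spike is a normalization artifact rather than a genuine recognition failure. Letters that do not decompose under NFD (for example `ø`) are not covered by this check.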
Fix in 60 seconds
- Measure ΔS between OCR tokens and reference text.
- Enforce schema: page, block, line, word. Require `bbox` and language tag.
- Cross-check coverage: at least 70% of expected lines present.
- Apply λ probes — vary recognition mode (printed, handwriting, mixed).
- Clamp variance with BBAM if multilingual drift repeats.
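The schema and coverage steps above can be sketched as two small checks. This is an illustration under stated assumptions: each OCR word is represented as a plain dict with the keys shown (a hypothetical layout, not the Azure response format), `bbox` is any list of at least four numbers because the exact shape depends on the API version, and lines are matched by exact text after whitespace normalization.

```python
REQUIRED_FIELDS = ("page", "block", "line", "word", "bbox", "language")


def schema_errors(record: dict) -> list:
    """List schema violations for one OCR word record (hypothetical layout)."""
    errors = [f"missing {field}" for field in REQUIRED_FIELDS if field not in record]
    bbox = record.get("bbox")
    if bbox is not None:
        is_numeric_list = (
            isinstance(bbox, (list, tuple))
            and len(bbox) >= 4
            and all(isinstance(v, (int, float)) for v in bbox)
        )
        if not is_numeric_list:
            errors.append("bbox must be a list of at least four numbers")
    return errors


def _normalize(line: str) -> str:
    """Collapse runs of whitespace so spacing differences do not count as misses."""
    return " ".join(line.split())


def line_coverage(expected_lines, ocr_lines) -> float:
    """Fraction of expected lines that appear in the OCR output."""
    if not expected_lines:
        return 1.0
    seen = {_normalize(line) for line in ocr_lines}
    hits = sum(1 for line in expected_lines if _normalize(line) in seen)
    return hits / len(expected_lines)


def passes_coverage(expected_lines, ocr_lines, threshold: float = 0.70) -> bool:
    """Apply the 70% line-coverage floor from the checklist."""
    return line_coverage(expected_lines, ocr_lines) >= threshold
```

Exact matching is deliberately strict. If your documents are noisy, a fuzzy comparison may be more appropriate, but that changes what the 70% figure means and should be recorded alongside the result.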
Copy-paste LLM guard prompt
I uploaded TXTOS and the WFGY Problem Map.
OCR provider: Azure OCR (Computer Vision).
Symptoms: unstable multilingual recognition, ΔS ≥ 0.60, coverage < 0.70.
Steps:
1. Identify failing layer (chunking, contracts, retrieval).
2. Point to the WFGY fix (embedding-vs-semantic, chunking-checklist, retrieval-traceability).
3. Return JSON:
{ "citations": [...], "answer": "...", "ΔS": 0.xx, "λ_state": "<>", "next_fix": "..." }
Keep it auditable.
When to escalate
- Multilingual drift remains after re-chunking → verify with Embedding ≠ Semantic.
- Tables drop anchors repeatedly → rebuild layout with Data Contracts.
- Handwriting ΔS unstable across seeds → clamp with BBAM, audit using Entropy Collapse.
🔗 Quick-Start Downloads (60 sec)
| Tool | Link | 3-Step Setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>” |
| TXT OS (plain-text OS) | TXTOS.txt | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly |
🧭 Explore More
| Module | Description | Link |
|---|---|---|
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| 🧙♂️ Starter Village 🏡 | New here? Lost in symbols? Click here and let the wizard guide you through | Start → |
👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open source builders who supported WFGY from day one.
⭐ WFGY Engine 2.0 is already unlocked. ⭐ Star the repo to help others discover it and unlock more on the Unlock Board.