# LLM Providers — Guardrails, FAQ, and Fix Patterns

🏥 Quick Return to Emergency Room

> You are in a specialist desk. > For full triage and doctors on duty, return here: > > - [**WFGY Global Fix Map** — main Emergency Room, 300+ structured fixes](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/README.md) > - [**WFGY Problem Map 1.0** — 16 reproducible failure modes](https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md) > > Think of this page as a sub-room. > If you want full consultation and prescriptions, go back to the Emergency Room lobby.

This page helps you **choose between LLM vendors** and **fix provider-looking bugs** that are actually schema, retrieval, orchestration, or eval drift. If you are new, start with the Orientation table and the FAQ. If you are debugging, jump to the Fix Hub. --- ## Orientation: who is who | Provider | What it is | Typical use case | Link | |---|---|---|---| | OpenAI | GPT-4/4o from OpenAI Inc. | Direct API, fastest model access | [openai.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/openai.md) | | Azure OpenAI | Microsoft enterprise wrapper for OpenAI models | VNet, compliance, enterprise billing | [azure_openai.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/azure_openai.md) | | Anthropic | The company behind Claude | Safety-focused platform | [anthropic.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/anthropic.md) | | Claude (Anthropic) | The model family from Anthropic | Long context, tool use, JSON control | [anthropic_claude.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/anthropic_claude.md) | | Google Gemini | Google DeepMind multimodal models | Multimodal chat, reasoning | [gemini.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/gemini.md) | | Google Vertex AI | Google Cloud AI platform that hosts Gemini and more | Pipelines, deployment, governance | [google_vertex_ai.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/google_vertex_ai.md) | | Mistral | EU startup with efficient open-weight models (e.g., Mixtral MoE) | Cost/perf, open ecosystem | [mistral.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/mistral.md) | | Meta LLaMA | Meta open-weight model family | Local or private deployment, llama.cpp | [meta_llama.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/meta_llama.md) | | Cohere | Enterprise NLP API and embeddings | RAG stacks, enterprise NLP | [cohere.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/cohere.md) | | DeepSeek | CN player with infra-optimized long-context models | Cost-efficient, long windows | [deepseek.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/deepseek.md) | | Kimi (Moonshot) | CN chat-first models, very large parameter claims | Consumer chat focus | [kimi.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/kimi.md) | | Groq | Hardware vendor: LPUs for transformer inference | Ultra-low latency serving (not a model) | [groq.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/groq.md) | | xAI Grok | xAI model family | X/Twitter integration, general chat | [grok_xai.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/grok_xai.md) | | AWS Bedrock | AWS gateway to many models via one API | Enterprises already on AWS | [aws_bedrock.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/aws_bedrock.md) | | OpenRouter | Community model aggregator, OpenAI-style endpoint | Try many models via one API key | [openrouter.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/openrouter.md) | | Together AI | Aggregator + infra for open weights and fine-tunes | Fast hosting, tuning services | [together.md](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/LLM_Providers/together.md) | --- ## FAQ for newcomers **OpenAI vs Azure OpenAI — are they the same?** Same models, different packaging. OpenAI = direct API and fastest releases. Azure OpenAI = Microsoft billing, VNet, compliance, data residency. **Anthropic vs Claude — why two pages?** Anthropic is the company. Claude is the model family. We separate because “platform issues” and “model quirks” often need different fixes. **Gemini vs Vertex AI — what is the relation?** Gemini is a model. Vertex AI is Google Cloud’s platform that runs Gemini and provides pipelines, eval, and deployment features. **What makes Mistral special?** Efficient open-weights and MoE designs. Good cost/perf. Easy to host in your own infra. **Meta LLaMA vs local LLaMA** Meta releases the weights. Community tools like `llama.cpp` let you run them locally on CPU or GPU. **Groq LPU vs GPU** GPU is general purpose. LPU is a chip specialized for transformer inference. You get very low latency for chat workloads. **Bedrock vs OpenRouter vs Together** Bedrock is an AWS enterprise gateway. OpenRouter is a community aggregator with OpenAI-style API. Together is an infra host for open weights with training and fine-tune options. --- ## Open these first - Visual map and recovery: [RAG Architecture & Recovery](https://github.com/onestardao/WFGY/blob/main/ProblemMap/rag-architecture-and-recovery.md) - End to end retrieval knobs: [Retrieval Playbook](https://github.com/onestardao/WFGY/blob/main/ProblemMap/retrieval-playbook.md) - Why this snippet (traceability schema): [Retrieval Traceability](https://github.com/onestardao/WFGY/blob/main/ProblemMap/retrieval-traceability.md) - Ordering control: [Rerankers](https://github.com/onestardao/WFGY/blob/main/ProblemMap/rerankers.md) - Embedding vs meaning: [Embedding ≠ Semantic](https://github.com/onestardao/WFGY/blob/main/ProblemMap/embedding-vs-semantic.md) - Hallucination and chunk boundaries: [Hallucination](https://github.com/onestardao/WFGY/blob/main/ProblemMap/hallucination.md) - Long chains and entropy: [Context Drift](https://github.com/onestardao/WFGY/blob/main/ProblemMap/context-drift.md), [Entropy Collapse](https://github.com/onestardao/WFGY/blob/main/ProblemMap/entropy-collapse.md) - Structural collapse and recovery: [Logic Collapse](https://github.com/onestardao/WFGY/blob/main/ProblemMap/logic-collapse.md) - Snippet and citation schema: [Data Contracts](https://github.com/onestardao/WFGY/blob/main/ProblemMap/data-contracts.md) - Live ops: [Live Monitoring for RAG](https://github.com/onestardao/WFGY/blob/main/ProblemMap/ops/live_monitoring_rag.md), [Debug Playbook](https://github.com/onestardao/WFGY/blob/main/ProblemMap/ops/debug_playbook.md) - Boot order issues: [Bootstrap Ordering](https://github.com/onestardao/WFGY/blob/main/ProblemMap/bootstrap-ordering.md), [Deployment Deadlock](https://github.com/onestardao/WFGY/blob/main/ProblemMap/deployment-deadlock.md), [Pre-Deploy Collapse](https://github.com/onestardao/WFGY/blob/main/ProblemMap/predeploy-collapse.md) --- ## Core acceptance targets - ΔS(question, retrieved) ≤ 0.45 - Coverage ≥ 0.70 for the target section - λ remains convergent across three paraphrases and two seeds - E_resonance stays flat on long windows --- ## Fix Hub — typical provider symptoms → exact fix | Symptom | Likely cause | Open this | |---|---|---| | JSON mode breaks, invalid objects | Schema too loose or nested tool calls | [Data Contracts](https://github.com/onestardao/WFGY/blob/main/ProblemMap/data-contracts.md), [Logic Collapse](https://github.com/onestardao/WFGY/blob/main/ProblemMap/logic-collapse.md) | | Tool calls loop or stall | Agent role drift, missing timeouts | [Multi-Agent Problems](https://github.com/onestardao/WFGY/blob/main/ProblemMap/Multi-Agent_Problems.md), [Role-drift deep dive](https://github.com/onestardao/WFGY/blob/main/ProblemMap/multi-agent-chaos/role-drift.md) | | High similarity yet wrong snippet | Metric mismatch or fragmented store | [Embedding ≠ Semantic](https://github.com/onestardao/WFGY/blob/main/ProblemMap/embedding-vs-semantic.md), [Vectorstore Fragmentation](https://github.com/onestardao/WFGY/blob/main/ProblemMap/patterns/pattern_vectorstore_fragmentation.md) | | Answers flip between runs | Prompt headers reorder and λ flips | [Context Drift](https://github.com/onestardao/WFGY/blob/main/ProblemMap/context-drift.md), [Retrieval Traceability](https://github.com/onestardao/WFGY/blob/main/ProblemMap/retrieval-traceability.md) | | Hybrid retrievers worse than single | Query parsing split, mis-weighted rerank | [Query Parsing Split](https://github.com/onestardao/WFGY/blob/main/ProblemMap/patterns/pattern_query_parsing_split.md), [Rerankers](https://github.com/onestardao/WFGY/blob/main/ProblemMap/rerankers.md) | | Jailbreaks or bluffing | Overconfidence and missing fences | [Bluffing Controls](https://github.com/onestardao/WFGY/blob/main/ProblemMap/bluffing.md), [Retrieval Traceability](https://github.com/onestardao/WFGY/blob/main/ProblemMap/retrieval-traceability.md) | --- ## Fix in 60 seconds 1) **Measure ΔS** Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor). Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60. 2) **Probe λ_observe** Vary top-k and prompt headers. If λ flips, lock the schema and apply a BBAM variance clamp. 3) **Apply the module** Retrieval drift → BBMC + Data Contracts Reasoning collapse → BBCR bridge + BBAM Dead ends in long runs → BBPF alternate paths 4) **Verify** Coverage ≥ 0.70 on three paraphrases. λ convergent on two seeds. --- ### Quick-Start Downloads | Tool | Link | 3-step setup | |------|------|--------------| | WFGY 1.0 PDF | [Engine Paper](https://github.com/onestardao/WFGY/blob/main/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf) | 1) Download 2) Upload to your LLM 3) Ask “Answer using WFGY + ” | | TXT OS (plain text OS) | [TXTOS.txt](https://github.com/onestardao/WFGY/blob/main/OS/TXTOS.txt) | 1) Download 2) Paste into any LLM chat 3) Type “hello world” to boot | --- ### Explore More | Layer | Page | What it’s for | | --- | --- | --- | | Proof | [WFGY Recognition Map](/recognition/README.md) | External citations, integrations, and ecosystem proof | | Engine | [WFGY 1.0](/legacy/README.md) | Original PDF based tension engine | | Engine | [WFGY 2.0](/core/README.md) | Production tension kernel and math engine for RAG and agents | | Engine | [WFGY 3.0](/TensionUniverse/EventHorizon/README.md) | TXT based Singularity tension engine, 131 S class set | | Map | [Problem Map 1.0](/ProblemMap/README.md) | Flagship 16 problem RAG failure checklist and fix map | | Map | [Problem Map 2.0](/ProblemMap/rag-architecture-and-recovery.md) | RAG focused recovery pipeline | | Map | [Problem Map 3.0](/ProblemMap/wfgy-rag-16-problem-map-global-debug-card.md) | Global Debug Card, image as a debug protocol layer | | Map | [Semantic Clinic](/ProblemMap/SemanticClinicIndex.md) | Symptom to family to exact fix | | Map | [Grandma’s Clinic](/ProblemMap/GrandmaClinic/README.md) | Plain language stories mapped to Problem Map 1.0 | | Onboarding | [Starter Village](/StarterVillage/README.md) | Guided tour for newcomers | | App | [TXT OS](/OS/README.md) | TXT semantic OS, fast boot | | App | [Blah Blah Blah](/OS/BlahBlahBlah/README.md) | Abstract and paradox Q and A built on TXT OS | | App | [Blur Blur Blur](/OS/BlurBlurBlur/README.md) | Text to image with semantic control | | App | [Blow Blow Blow](/OS/BlowBlowBlow/README.md) | Reasoning game engine and memory demo | If this repository helped, starring it improves discovery so more builders can find the docs and tools. [![GitHub Repo stars](https://img.shields.io/github/stars/onestardao/WFGY?style=social)](https://github.com/onestardao/WFGY)