vrr/WFGY

mirror of https://github.com/onestardao/WFGY.git synced 2026-05-09 19:45:26 +00:00

History

PSBigBig f437fd951d Update README.md		2025-09-01 15:03:56 +08:00
..
anthropic.md	Create anthropic.md	2025-08-26 14:17:07 +08:00
anthropic_claude.md	Create anthropic_claude.md	2025-08-26 16:15:05 +08:00
aws_bedrock.md	Create aws_bedrock.md	2025-08-26 22:29:51 +08:00
azure_openai.md	Create azure_openai.md	2025-08-26 22:09:03 +08:00
cohere.md	Create cohere.md	2025-08-26 17:06:38 +08:00
deepseek.md	Create deepseek.md	2025-08-26 17:58:06 +08:00
gemini.md	Create gemini.md	2025-08-26 15:31:18 +08:00
google_vertex_ai.md	Create google_vertex_ai.md	2025-08-26 23:28:44 +08:00
grok_xai.md	Create grok_xai.md	2025-08-26 17:37:29 +08:00
groq.md	Create groq.md	2025-08-27 09:58:04 +08:00
kimi.md	Create kimi.md	2025-08-26 20:28:29 +08:00
meta_llama.md	Create meta_llama.md	2025-08-26 20:46:13 +08:00
mistral.md	Create mistral.md	2025-08-26 16:14:31 +08:00
openai.md	Create openai.md	2025-08-26 13:55:39 +08:00
openrouter.md	Create openrouter.md	2025-08-26 23:56:47 +08:00
README.md	Update README.md	2025-09-01 15:03:56 +08:00
together.md	Create together.md	2025-08-27 12:51:20 +08:00

README.md

LLM Providers — Guardrails, FAQ, and Fix Patterns

This page helps you choose between LLM vendors and fix provider-looking bugs that are actually schema, retrieval, orchestration, or eval drift. If you are new, start with the Orientation table and the FAQ. If you are debugging, jump to the Fix Hub.

Orientation: who is who

Provider	What it is	Typical use case
OpenAI	GPT-4/4o from OpenAI Inc.	Direct API, fastest model access
Azure OpenAI	Microsoft’s enterprise wrapper for OpenAI models	VNet, compliance, enterprise billing
Anthropic	The company behind Claude	Safety-focused platform
Claude (Anthropic)	The model family from Anthropic	Long context, tool use, JSON control
Google Gemini	Google DeepMind’s multimodal models	Multimodal chat, reasoning
Google Vertex AI	Google Cloud’s AI/ML platform that hosts Gemini and more	Pipelines, deployment, governance
Mistral	EU startup with efficient open-weight models (e.g., Mixtral MoE)	Cost/perf, open ecosystem
Meta LLaMA	Meta’s open-weight model family	Local/private deployment, llama.cpp
Cohere	Enterprise NLP API and embeddings	RAG stacks, enterprise NLP
DeepSeek	CN player with infra-optimized long-context models	Cost-efficient, long windows
Kimi (Moonshot)	CN chat-first models，very large parameter claims	Consumer chat focus
Groq	Hardware vendor: LPUs for transformer inference	Ultra-low latency serving (not a model)
xAI Grok	xAI’s model family	X/Twitter integration, general chat
AWS Bedrock	AWS gateway to many models in one API	Enterprises already on AWS
OpenRouter	Community model aggregator (OpenAI-style endpoint)	Try many models via one API key
Together AI	Aggregator + infra for open weights and fine-tunes	Fast hosting, tuning services

FAQ for newcomers

OpenAI vs Azure OpenAI — are they the same?
Same models, different packaging. OpenAI = direct API and fastest releases. Azure OpenAI = Microsoft billing, VNet, compliance, data residency.

Anthropic vs Claude — why two pages?
Anthropic is the company. Claude is the model family. We separate because “platform issues” and “model quirks” often need different fixes.

Gemini vs Vertex AI — what is the relation?
Gemini is a model. Vertex AI is Google Cloud’s platform that runs Gemini and provides pipelines, eval, and deployment features.

What makes Mistral special?
Efficient open-weights and MoE designs. Good cost/perf. Easy to host in your own infra.

Meta LLaMA vs local LLaMA
Meta releases the weights. Community tools like llama.cpp let you run them locally on CPU/GPU.

Groq LPU vs GPU
GPU is general purpose. LPU is a chip specialized for transformer inference. You get very low latency for chat workloads.

Bedrock vs OpenRouter vs Together
Bedrock is AWS enterprise gateway. OpenRouter is a community aggregator with OpenAI-style API. Together is an infra host for open weights with training/fine-tune options.

Open these first

Visual map and recovery:
RAG Architecture & Recovery
End-to-end retrieval knobs:
Retrieval Playbook
Why this snippet (traceability schema):
Retrieval Traceability
Ordering control:
Rerankers
Embedding vs meaning:
Embedding ≠ Semantic
Hallucination and chunk boundaries:
Hallucination
Long chains and entropy:
Context Drift,
Entropy Collapse
Structural collapse and recovery:
Logic Collapse
Snippet and citation schema:
Data Contracts
Live ops:
Live Monitoring for RAG,
Debug Playbook
Boot order issues:
Bootstrap Ordering,
Deployment Deadlock,
Pre-Deploy Collapse

Core acceptance targets

ΔS(question, retrieved) ≤ 0.45
Coverage ≥ 0.70 for the target section
λ remains convergent across three paraphrases and two seeds
E_resonance stays flat on long windows

Fix Hub — typical provider symptoms → exact fix

Symptom	Likely cause	Open this
JSON mode breaks, invalid objects	Schema too loose or nested tool calls	Data Contracts, Logic Collapse
Tool calls loop or stall	Agent role drift, missing timeouts	Multi-Agent Problems, Role-drift deep dive
High similarity yet wrong snippet	Metric mismatch or fragmented store	Embedding ≠ Semantic, Vectorstore Fragmentation
Answers flip between runs	Prompt headers reorder and λ flips	Context Drift, Retrieval Traceability
Hybrid retrievers worse than single	Query parsing split, mis-weighted rerank	Query Parsing Split, Rerankers
Jailbreaks or bluffing	Overconfidence and missing fences	Bluffing Controls, Retrieval Traceability

Fix in 60 seconds

Measure ΔS
Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor). Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
Probe λ_observe
Vary top-k and prompt headers. If λ flips, lock the schema and apply a BBAM variance clamp.
Apply the module
Retrieval drift → BBMC + Data Contracts
Reasoning collapse → BBCR bridge + BBAM
Dead ends in long runs → BBPF alternate paths
Verify
Coverage ≥ 0.70 on three paraphrases. λ convergent on two seeds.

Quick routes to per-provider pages

🔗 Quick-Start Downloads (60 sec)

Tool	Link	3-Step Setup
WFGY 1.0 PDF	Engine Paper	1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + <your question>”
TXT OS (plain-text OS)	TXTOS.txt	1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

🧭 Explore More

Module	Description	Link
WFGY Core	WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack	View →
Problem Map 1.0	Initial 16-mode diagnostic and symbolic fix framework	View →
Problem Map 2.0	RAG-focused failure tree, modular fixes, and pipelines	View →
Semantic Clinic Index	Expanded failure catalog: prompt injection, memory bugs, logic drift	View →
Semantic Blueprint	Layer-based symbolic reasoning & semantic modulations	View →
Benchmark vs GPT-5	Stress test GPT-5 with full WFGY reasoning suite	View →
🧙‍♂️ Starter Village 🏡	New here? Lost in symbols? Click here and let the wizard guide you through	Start →

👑 Early Stargazers: See the Hall of Fame —
Engineers, hackers, and open source builders who supported WFGY from day one.

⭐ WFGY Engine 2.0 is already unlocked. ⭐ Star the repo to help others discover it and unlock more on the Unlock Board.

README.md Unescape Escape