vrr/WFGY

mirror of https://github.com/onestardao/WFGY.git synced 2026-05-19 07:55:29 +00:00

LLM Providers — Guardrails, FAQ, and Fix Patterns

🏥 Quick Return to Emergency Room

You are at a specialist desk.
For full triage and doctors on duty, return here:

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.

This page helps you choose between LLM vendors and fix bugs that look provider-specific but are actually caused by schema, retrieval, orchestration, or eval drift. If you are new, start with the Orientation table and the FAQ. If you are debugging, jump to the Fix Hub.

Orientation: who is who

Provider	What it is	Typical use case	Link
OpenAI	GPT-4/4o from OpenAI Inc.	Direct API, fastest model access	openai.md
Azure OpenAI	Microsoft enterprise wrapper for OpenAI models	VNet, compliance, enterprise billing	azure_openai.md
Anthropic	The company behind Claude	Safety-focused platform	anthropic.md
Claude (Anthropic)	The model family from Anthropic	Long context, tool use, JSON control	anthropic_claude.md
Google Gemini	Google DeepMind multimodal models	Multimodal chat, reasoning	gemini.md
Google Vertex AI	Google Cloud AI platform that hosts Gemini and more	Pipelines, deployment, governance	google_vertex_ai.md
Mistral	EU startup with efficient open-weight models (e.g., Mixtral MoE)	Cost/perf, open ecosystem	mistral.md
Meta LLaMA	Meta open-weight model family	Local or private deployment, llama.cpp	meta_llama.md
Cohere	Enterprise NLP API and embeddings	RAG stacks, enterprise NLP	cohere.md
DeepSeek	Chinese vendor with infra-optimized long-context models	Cost-efficient, long windows	deepseek.md
Kimi (Moonshot)	Chat-first models from Moonshot AI (China), with claims of very large parameter counts	Consumer chat focus	kimi.md
Groq	Hardware vendor: LPUs for transformer inference	Ultra-low latency serving (not a model)	groq.md
xAI Grok	xAI model family	X/Twitter integration, general chat	grok_xai.md
AWS Bedrock	AWS gateway to many models via one API	Enterprises already on AWS	aws_bedrock.md
OpenRouter	Community model aggregator, OpenAI-style endpoint	Try many models via one API key	openrouter.md
Together AI	Aggregator + infra for open weights and fine-tunes	Fast hosting, tuning services	together.md

FAQ for newcomers

OpenAI vs Azure OpenAI — are they the same?
Same models, different packaging. OpenAI = direct API and fastest releases. Azure OpenAI = Microsoft billing, VNet, compliance, data residency.
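
A minimal sketch of how the packaging difference shows up at the HTTP level, assuming the publicly documented REST shapes for both services. The helper names, the resource and deployment names, and the `api_version` value are placeholders for illustration; check each vendor's current reference before relying on them. Nothing is sent over the network here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatRequest:
    """A request description only; this sketch never sends it."""
    url: str
    headers: dict
    body: dict


def openai_request(api_key: str, model: str, messages: list) -> ChatRequest:
    # Direct OpenAI API: the model name travels in the request body
    # and the key goes in a Bearer Authorization header.
    return ChatRequest(
        url="https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        body={"model": model, "messages": messages},
    )


def azure_openai_request(api_key: str, resource: str, deployment: str,
                         api_version: str, messages: list) -> ChatRequest:
    # Azure OpenAI: the deployment name travels in the URL path,
    # an api-version query parameter is required, and key-based auth
    # uses an "api-key" header.
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{deployment}/chat/completions?api-version={api_version}"
    )
    return ChatRequest(url=url, headers={"api-key": api_key},
                       body={"messages": messages})
```

The practical consequence is that a "model not found" error on Azure is often a deployment-name or API-version mismatch rather than a model problem.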

Anthropic vs Claude — why two pages?
Anthropic is the company. Claude is the model family. We separate because “platform issues” and “model quirks” often need different fixes.

Gemini vs Vertex AI — what is the relation?
Gemini is a model. Vertex AI is Google Cloud’s platform that runs Gemini and provides pipelines, eval, and deployment features.

What makes Mistral special?
Efficient open-weight models and MoE designs, good cost/performance, and easy to host on your own infrastructure.

Meta LLaMA vs local LLaMA
Meta releases the weights. Community tools like llama.cpp let you run them locally on CPU or GPU.

Groq LPU vs GPU
A GPU is general purpose. An LPU (Groq's Language Processing Unit) is a chip specialized for transformer inference, which gives very low latency for chat workloads.

Bedrock vs OpenRouter vs Together
Bedrock is an AWS enterprise gateway. OpenRouter is a community aggregator with an OpenAI-style API. Together is an infra host for open weights with training and fine-tuning options.

Open these first

Visual map and recovery: RAG Architecture & Recovery
End to end retrieval knobs: Retrieval Playbook
Why this snippet (traceability schema): Retrieval Traceability
Ordering control: Rerankers
Embedding vs meaning: Embedding ≠ Semantic
Hallucination and chunk boundaries: Hallucination
Long chains and entropy: Context Drift, Entropy Collapse
Structural collapse and recovery: Logic Collapse
Snippet and citation schema: Data Contracts
Live ops: Live Monitoring for RAG, Debug Playbook
Boot order issues: Bootstrap Ordering, Deployment Deadlock, Pre-Deploy Collapse

Core acceptance targets

ΔS(question, retrieved) ≤ 0.45
Coverage ≥ 0.70 for the target section
λ remains convergent across three paraphrases and two seeds
E_resonance stays flat on long windows
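
The four targets above can be turned into a simple gate. The sketch below is illustrative only and rests on several assumptions not stated on this page: ΔS is taken to be 1 minus cosine similarity between two embedding vectors, λ is represented as one string label per run, and "flat" E_resonance is read as a small spread between the largest and smallest samples. The function names and the `flat_tol` value are hypothetical.

```python
import math


def delta_s(a, b):
    """Assumed definition: 1 - cosine similarity of two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embedding vectors must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


def passes_targets(ds, coverage, lambda_states, e_resonance, flat_tol=0.05):
    """Return (overall_ok, per_check) for the four acceptance targets.

    lambda_states: one label per run, e.g. 3 paraphrases x 2 seeds = 6 labels.
    e_resonance:   non-empty list of samples taken along a long window.
    flat_tol:      hypothetical tolerance for what counts as "flat".
    """
    if not lambda_states or not e_resonance:
        raise ValueError("need at least one lambda label and one E_resonance sample")
    checks = {
        "delta_s": ds <= 0.45,
        "coverage": coverage >= 0.70,
        "lambda": all(state == "convergent" for state in lambda_states),
        "e_resonance": (max(e_resonance) - min(e_resonance)) <= flat_tol,
    }
    return all(checks.values()), checks
```

Returning the per-check dictionary alongside the overall verdict makes it easier to see which target failed when a run is rejected.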

Fix Hub — typical provider symptoms → exact fix

Symptom	Likely cause	Open this
JSON mode breaks, invalid objects	Schema too loose or nested tool calls	Data Contracts, Logic Collapse
Tool calls loop or stall	Agent role drift, missing timeouts	Multi-Agent Problems, Role-drift deep dive
High similarity yet wrong snippet	Metric mismatch or fragmented store	Embedding ≠ Semantic, Vectorstore Fragmentation
Answers flip between runs	Prompt headers reorder and λ flips	Context Drift, Retrieval Traceability
Hybrid retrievers worse than single	Query parsing split, mis-weighted rerank	Query Parsing Split, Rerankers
Jailbreaks or bluffing	Overconfidence and missing fences	Bluffing Controls, Retrieval Traceability
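
For the first row (JSON mode breaks because the schema is too loose), one provider-independent mitigation is to validate every model response against a strict contract before it reaches downstream code. The sketch below uses only the standard library. The contract fields (`snippet_id`, `source_url`, `text`, `score`) are hypothetical and are not the schema defined in Data Contracts.

```python
import json

# Hypothetical contract: field names and types are illustrative only.
SNIPPET_CONTRACT = {
    "snippet_id": str,
    "source_url": str,
    "text": str,
    "score": float,
}


def validate_snippet(raw: str) -> dict:
    """Parse model output and reject anything that deviates from the contract."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(obj, dict):
        raise ValueError("top-level value must be a JSON object")

    # A strict contract rejects both missing and unexpected keys.
    missing = SNIPPET_CONTRACT.keys() - obj.keys()
    extra = obj.keys() - SNIPPET_CONTRACT.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")

    for key, expected in SNIPPET_CONTRACT.items():
        value = obj[key]
        if expected is float:
            # Accept ints where a float is expected, but never booleans.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            raise ValueError(f"{key}: expected {expected.__name__}")
    return obj
```

Rejecting unexpected keys is a deliberate choice: silently ignoring them tends to hide the prompt or schema drift this table is meant to surface.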

Fix in 60 seconds

1) Measure ΔS
   Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor). Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
2) Probe λ_observe
   Vary top-k and prompt headers. If λ flips, lock the schema and apply a BBAM variance clamp.
3) Apply the module
   Retrieval drift → BBMC + Data Contracts
   Reasoning collapse → BBCR bridge + BBAM
   Dead ends in long runs → BBPF alternate paths
4) Verify
   Coverage ≥ 0.70 on three paraphrases. λ convergent on two seeds.
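
The steps above can be sketched as a small triage helper. This is an illustration under stated assumptions, not the WFGY implementation: the function names, the dictionary keys, and the use of string labels for λ are hypothetical, and the code only encodes the thresholds and module names listed in this section.

```python
def delta_s_zone(ds: float) -> str:
    """Map a ΔS value to the zones named in step 1."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "risk"


def lambda_flips(states) -> bool:
    """Step 2: λ 'flips' if the observed label changes across probe runs."""
    return len(set(states)) > 1


# Step 3: failure type -> modules to apply (keys are hypothetical labels).
MODULES = {
    "retrieval_drift": ["BBMC", "Data Contracts"],
    "reasoning_collapse": ["BBCR bridge", "BBAM"],
    "dead_end": ["BBPF alternate paths"],
}


def verify(coverages, lambda_by_seed) -> bool:
    """Step 4: coverage on three paraphrases, λ convergent on two seeds."""
    return (
        len(coverages) >= 3
        and all(c >= 0.70 for c in coverages)
        and len(lambda_by_seed) >= 2
        and all(state == "convergent" for state in lambda_by_seed)
    )
```

A typical pass would classify the measured ΔS, probe λ under varied top-k and headers, apply the listed modules for the observed failure type, and rerun `verify` before closing the incident.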

Quick-Start Downloads

Tool	Link	3-step setup
WFGY 1.0 PDF	Engine Paper	1) Download 2) Upload to your LLM 3) Ask “Answer using WFGY + ”
TXT OS (plain text OS)	TXTOS.txt	1) Download 2) Paste into any LLM chat 3) Type “hello world” to boot

Explore More

Module	Description	Link
WFGY Core	WFGY 2.0 engine, full symbolic reasoning architecture and math stack	View →
Problem Map 1.0	Initial 16-mode diagnostic and symbolic fix framework	View →
Problem Map 2.0	RAG-focused failure tree, modular fixes, and pipelines	View →
Semantic Clinic Index	Expanded catalog for prompt injection, memory bugs, logic drift	View →
Semantic Blueprint	Layer-based symbolic reasoning and semantic modulations	View →
Benchmark vs GPT-5	Stress test with full WFGY reasoning suite	View →
Starter Village	New here, want a guided path	Start →

Early stargazers: See the Hall of Fame
Star the repo if this helped. It unlocks more items on the Unlock Board.
