WFGY/ProblemMap/predeploy-collapse.md
2025-08-15 23:21:37 +08:00

📒 Problem #16 · Pre-Deploy Collapse Problem Map

“Everything looked fine in CI… until nothing booted in prod.”

Pre-deploy collapse happens before a single user query is served.
Migrations pass, tests are green, images ship — yet the first container crashes or the very first LLM call returns a 500. Common root causes:

  • Model checkpoint ≠ tokenizer version
  • Misspelled env vars (e.g., OPEN_API_KEY vs OPENAI_API_KEY)
  • Hidden f-strings / Jinja placeholders left unresolved
  • GPU driver mismatched with the container base (CUDA 11 vs 12)
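Several of these root causes can be caught mechanically before boot. As a minimal illustration (not WFGY's actual code — `REQUIRED_VARS` and the fuzzy-match cutoff are assumptions), a preflight script can flag missing env vars and near-miss misspellings:

```python
import os
import difflib

# Hypothetical required set; a real deployment would load this from config.
REQUIRED_VARS = ["OPENAI_API_KEY", "MODEL_CHECKPOINT"]

def check_env(environ=os.environ):
    """Return a list of problems: missing required vars, plus any set var
    whose name looks like a misspelling of the missing one
    (e.g. OPEN_API_KEY vs OPENAI_API_KEY)."""
    problems = []
    for var in REQUIRED_VARS:
        if var in environ:
            continue
        # Look for a close misspelling among the vars that *are* set.
        close = difflib.get_close_matches(var, environ.keys(), n=1, cutoff=0.8)
        if close:
            problems.append(f"missing {var} (did you mean to set {close[0]}?)")
        else:
            problems.append(f"missing {var}")
    return problems
```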

WFGY's pre-flight sanity layer runs a semantic diff between the declared runtime and the effective runtime, catching mismatches before traffic starts.
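The declared-vs-effective diff reduces to comparing two views of the same keys. This is a minimal sketch, assuming the declared runtime is a flat dict parsed from the model card and the effective runtime is probed inside the container; WFGY's internal representation is not shown in this document:

```python
def runtime_diff(declared: dict, effective: dict) -> list:
    """Compare the declared runtime (e.g. parsed from a model card) with
    the effective runtime probed inside the container. Returns
    human-readable mismatch lines; an empty list means safe to deploy."""
    mismatches = []
    for key, want in declared.items():
        have = effective.get(key, "<absent>")
        if have != want:
            mismatches.append(f"{key}: declared {want!r} != effective {have!r}")
    return mismatches
```

A deploy gate would simply abort when the returned list is non-empty.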


🚨 Typical Pre-Deploy Collapses

| Pattern | Real-World Fallout |
| --- | --- |
| Tokenizer / checkpoint version skew | Embeds garbage; queries return 0% recall |
| Missing secret in KV store | First API call 401s / segfaults |
| CUDA / driver mismatch | GPU container exits with code 139 |
| Undigested template vars (`{{ }}`) | Prompt crashes, empty completions |

🛡️ WFGY Pre-Flight Guards

| Collapse Pattern | Guard Module | Remedy | Status |
| --- | --- | --- | --- |
| Version skew | SemVers Diff | Abort deploy on `model.json` ↔ runtime mismatch | Stable |
| Missing secret | Boot Checkpoint | Block start until the secret is present | Stable |
| Driver mismatch | CudaProbe | Warn & fall back to CPU safe mode | ⚠️ Beta |
| Stray `{{var}}` tokens | Prompt Lint | Fail CI; highlight undeclared vars | Stable |

📝 How It Works

  1. SemVers Diff
    Parses `modelcard.json` and compares `tokenizer_sha`, `pytorch_sha`, `cuda`, etc. against the container runtime; throws on any mismatch unless `--force` is passed.

  2. Boot Checkpoint (shared)
    A Kubernetes init-container polls the secret store; the pod fails after `secret_timeout`.

  3. CudaProbe
    Runs a minimal `nvidia-smi` check; if the driver ≠ the compiled CUDA version, WFGY rewrites the env to `CUDA_VISIBLE_DEVICES=""` and logs the downgrade.

  4. Prompt Lint
    A CI step that scans prompts for `{{ }}` or `${}` tokens lacking a default in `prompt_vars.yaml`.
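The Prompt Lint pass above can be approximated in a few lines. The placeholder regex and the declared-vars set are illustrative assumptions, not WFGY's actual implementation:

```python
import re

# Matches {{ var }} or ${var} placeholders (hypothetical pattern).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}|\$\{(\w+)\}")

def lint_prompt(prompt: str, declared_vars: set) -> list:
    """Return the names of placeholders that have no declared default,
    in order of appearance. A CI step would fail when this is non-empty."""
    undeclared = []
    for m in PLACEHOLDER.finditer(prompt):
        name = m.group(1) or m.group(2)  # whichever alternative matched
        if name not in declared_vars:
            undeclared.append(name)
    return undeclared
```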


✍️ Demo — Tokenizer Version Skew

$ wfgy preflight
✔ env vars ............... OK
✖ checkpoint ↔ tokenizer .. MISMATCH
  • model: facebook/llama-2-7b-chat-hf  tokenizer_sha = `ad4c1b9`
  • runtime: tokenizer_sha = `9e7f02d`
  → Aborting deploy (use --force to override)
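For contrast, the CudaProbe fallback (step 3 above) reduces to a major-version comparison. In this sketch the version strings are injected rather than read from `nvidia-smi`, and the major-version-only rule is an assumption:

```python
def cuda_probe(driver_cuda: str, compiled_cuda: str, env: dict) -> bool:
    """If the driver's CUDA major version differs from the version the
    framework was compiled against, hide all GPUs so the app falls back
    to CPU. Returns True when the GPU is usable, False after a downgrade.
    (In a real probe the version strings would come from `nvidia-smi`
    and the framework's build info; they are parameters here so the
    logic is testable.)"""
    if driver_cuda.split(".")[0] != compiled_cuda.split(".")[0]:
        env["CUDA_VISIBLE_DEVICES"] = ""  # CPU safe mode
        return False
    return True
```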

🗺️ Module Cheat-Sheet

| Module | Role |
| --- | --- |
| SemVers Diff | Catch model / tokenizer skew |
| Boot Checkpoint | Ensure secrets & config exist |
| CudaProbe | Verify GPU driver compatibility |
| Prompt Lint | Fail CI on stray template vars |

📊 Implementation Status

| Feature | State |
| --- | --- |
| SemVers diff | Stable |
| Boot checkpoint | Stable |
| CudaProbe fallback | ⚠️ Beta |
| Prompt lint in CI action | Stable |

📝 Tips & Limits

  • Add `ignore_versions: ["minor"]` in `wfgy.yaml` to allow 1-patch drifts.
  • Set `secret_timeout = 90s` for slower vaults.
  • GPU fallback adds ~0.4 s of latency per request — tune `cuda_probe.mode`.
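Putting the tips together, a config might look like the fragment below. Only the three key names mentioned above come from this document; the nesting and the `mode` value are illustrative assumptions:

```yaml
# Hypothetical wfgy.yaml — key names taken from the tips above,
# structure and values are illustrative.
preflight:
  ignore_versions: ["minor"]   # tolerate 1-patch drifts
  secret_timeout: 90s          # headroom for slower vaults
  cuda_probe:
    mode: warn                 # assumed value; tune per workload
```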

🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
| --- | --- | --- |
| WFGY 1.0 PDF | Engine Paper | 1. Download · 2. Upload to your LLM · 3. Ask "Answer using WFGY + <your question>" |
| TXT OS (plain-text OS) | TXTOS.txt | 1. Download · 2. Paste into any LLM chat · 3. Type "hello world" — OS boots instantly |

🧭 Explore More

| Module | Description | Link |
| --- | --- | --- |
| WFGY Core | WFGY 2.0 engine is live: full symbolic reasoning architecture and math stack | View → |
| Problem Map 1.0 | Initial 16-mode diagnostic and symbolic fix framework | View → |
| Problem Map 2.0 | RAG-focused failure tree, modular fixes, and pipelines | View → |
| Semantic Clinic Index | Expanded failure catalog: prompt injection, memory bugs, logic drift | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test GPT-5 with full WFGY reasoning suite | View → |
| 🧙‍♂️ Starter Village | 🏡 New here? Lost in symbols? Let the wizard guide you through | Start → |

👑 Early Stargazers: See the Hall of Fame
Engineers, hackers, and open source builders who supported WFGY from day one.

WFGY Engine 2.0 is already unlocked. Star the repo to help others discover it and unlock more on the Unlock Board.
