# Ops & Deploy — Global Fix Map
🏥 Quick Return to Emergency Room
> You are at a specialist desk.
> For full triage and doctors on duty, return here:
>
> - [**WFGY Global Fix Map**: main Emergency Room, 300+ structured fixes](https://github.com/onestardao/WFGY/blob/main/ProblemMap/GlobalFixMap/README.md)
> - [**WFGY Problem Map 1.0**: 16 reproducible failure modes](https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md)
>
> Think of this page as a sub-room.
> If you want full consultation and prescriptions, go back to the Emergency Room lobby.


A compact hub to **ship safely and keep RAG/LLM systems stable after release**. Use this folder to pick the right guardrail, verify it against measurable targets, and recover fast when things wobble. No infra change required.

---

## Open these first

- Visual recovery map → [RAG Architecture & Recovery](../../rag-architecture-and-recovery.md)
- Retrieval knobs end-to-end → [Retrieval Playbook](../../retrieval-playbook.md)
- Traceability and snippet schema → [Retrieval Traceability](../../retrieval-traceability.md) · [Data Contracts](../../data-contracts.md)
- Boot order and deploy traps → [Bootstrap Ordering](../../bootstrap-ordering.md) · [Deployment Deadlock](../../deployment-deadlock.md) · [Pre-Deploy Collapse](../../predeploy-collapse.md)
- Live ops tools → [Live Monitoring for RAG](../../ops/live_monitoring_rag.md) · [Debug Playbook](../../ops/debug_playbook.md)

---

## When to use this folder

- First calls after deploy crash or return stale content.
- ΔS and citations looked fine yesterday but flip today.
- Rate limits cascade, queues spike, latency climbs.
- Canary looks good, then full rollout breaks retrieval.
- Index swap succeeds but answers cite old snippets.
- Retries cause duplicate side effects or charges.
- Feature flags bleed traffic into unfinished paths.
- Maintenance windows corrupt embeddings or anchors.

---

## Acceptance targets for a safe rollout

- **ΔS(question, retrieved) ≤ 0.45** across three paraphrases.
- **Coverage ≥ 0.70** on the expected new section.
- **λ remains convergent** on 2 seeds during rollout.
- **Idempotency ≥ 99.9%** on retry storms.
- **Zero silent index mismatches** (hash + counts match).
- **P95 latency stays in budget** with backpressure active.

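
The acceptance targets above can be collected into a single go/no-go check. The following is a minimal sketch, not the repository's own gate: `RolloutMetrics`, `gate_failures`, and every field name are hypothetical, and the ΔS, coverage, and λ values are assumed to have been measured elsewhere and passed in. The function only compares them against the thresholds listed above and returns the names of the targets that failed, so an empty list means the rollout may proceed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RolloutMetrics:
    """Hypothetical container for measurements taken before or during rollout."""
    delta_s_per_paraphrase: List[float]     # ΔS(question, retrieved), one value per paraphrase
    coverage: float                         # coverage on the expected new section
    lambda_convergent_per_seed: List[bool]  # one convergence flag per seed
    idempotency_rate: float                 # share of retried requests with no duplicate effect
    expected_index_hash: str
    live_index_hash: str
    expected_doc_count: int
    live_doc_count: int
    p95_latency_ms: float
    p95_budget_ms: float


def gate_failures(m: RolloutMetrics) -> List[str]:
    """Return the names of the acceptance targets that are not met."""
    failures = []
    # ΔS must stay at or below 0.45 on at least three paraphrases.
    if len(m.delta_s_per_paraphrase) < 3 or max(m.delta_s_per_paraphrase) > 0.45:
        failures.append("delta_s")
    if m.coverage < 0.70:
        failures.append("coverage")
    # λ must be convergent on at least two seeds.
    if len(m.lambda_convergent_per_seed) < 2 or not all(m.lambda_convergent_per_seed):
        failures.append("lambda")
    if m.idempotency_rate < 0.999:
        failures.append("idempotency")
    # A silent index mismatch is any difference in hash or document count.
    if (m.expected_index_hash != m.live_index_hash
            or m.expected_doc_count != m.live_doc_count):
        failures.append("index_mismatch")
    if m.p95_latency_ms > m.p95_budget_ms:
        failures.append("p95_latency")
    return failures
```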

---

## Quick routes: per-page guides

| Scenario | Fix Page |
|----------|----------|
| Rollout readiness | [rollout_readiness_gate.md](./rollout_readiness_gate.md) |
| Canary strategy | [staged_rollout_canary.md](./staged_rollout_canary.md) |
| Blue/green cutover | [blue_green_switchovers.md](./blue_green_switchovers.md) |
| Version pin & freeze | [version_pinning_and_model_lock.md](./version_pinning_and_model_lock.md) |
| Vector index swap | [vector_index_build_and_swap.md](./vector_index_build_and_swap.md) |
| Cache warmup | [cache_warmup_invalidation.md](./cache_warmup_invalidation.md) |
| Rate limits | [rate_limit_backpressure.md](./rate_limit_backpressure.md) |
| Feature flags | [feature_flags_safe_launch.md](./feature_flags_safe_launch.md) |
| Idempotency | [idempotency_dedupe.md](./idempotency_dedupe.md) |
| Retry logic | [retry_backoff.md](./retry_backoff.md) |
| Rollback plan | [rollback_and_fast_recovery.md](./rollback_and_fast_recovery.md) |
| Postmortems | [postmortem_and_regression_tests.md](./postmortem_and_regression_tests.md) |
| Change freeze | [release_calendar_and_change_freeze.md](./release_calendar_and_change_freeze.md) |
| Incident comms | [incident_comms_and_statuspage.md](./incident_comms_and_statuspage.md) |
| Shadow traffic | [shadow_traffic_mirroring.md](./shadow_traffic_mirroring.md) |
| Maintenance window | [read_only_mode_and_maintenance_window.md](./read_only_mode_and_maintenance_window.md) |
| DB migrations | [db_migration_guardrails.md](./db_migration_guardrails.md) |

---

## 60-second ship checklist

1. **Freeze the world** → Pin model IDs, prompt revs, index hashes.
2. **Warm up safely** → Build index off-path, preload caches with canary.
3. **Shadow then canary** → Mirror prod queries, step rollout 5% → 25% → 100%.
4. **Guard the edge** → Enable backpressure, retries with jitter, idempotency keys.
5. **Know your exit** → Keep rollback switch and comms draft ready.

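
Step 4 of the checklist pairs retries with jitter and idempotency keys. The sketch below shows how the two fit together, under stated assumptions: `TransientError`, `backoff_delay`, `call_with_retry`, and `DedupingHandler` are hypothetical names, the backoff uses the common "full jitter" variant (a uniform delay up to an exponentially growing cap), and the handler is a toy in-memory stand-in for a real service with a durable dedupe store. The point it illustrates is that the client generates one key per logical request and reuses it on every retry, so the server can apply the side effect once and replay the stored result afterwards.

```python
import random
import time
import uuid


class TransientError(Exception):
    """Hypothetical error type for failures that are safe to retry."""


def backoff_delay(attempt, base=0.2, cap=5.0, rng=random.random):
    """Full-jitter backoff: a random delay below min(cap, base * 2**attempt) seconds."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, payload, max_attempts=5, sleep=time.sleep):
    """Call send(key, payload), retrying transient failures with the same key."""
    key = str(uuid.uuid4())  # one idempotency key per logical request
    for attempt in range(max_attempts):
        try:
            return send(key, payload)
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted, surface the error
            sleep(backoff_delay(attempt))


class DedupingHandler:
    """Toy server side: applies the side effect at most once per key."""

    def __init__(self):
        self.results = {}      # idempotency key -> stored result
        self.side_effects = 0  # how many times the effect really ran
        self.fail_next = 0     # simulate this many lost responses

    def handle(self, key, payload):
        if key not in self.results:
            self.side_effects += 1
            self.results[key] = f"charged:{payload}"
        if self.fail_next > 0:
            # The effect was applied but the response never reached the client.
            self.fail_next -= 1
            raise TransientError("response lost")
        return self.results[key]
```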

---

## Symptoms → exact fix

| What you see | Open this |
|--------------|-----------|
| Deploy points to old snippets | [vector_index_build_and_swap.md](./vector_index_build_and_swap.md) · [cache_warmup_invalidation.md](./cache_warmup_invalidation.md) |
| Canary fine, full rollout breaks | [staged_rollout_canary.md](./staged_rollout_canary.md) · [feature_flags_safe_launch.md](./feature_flags_safe_launch.md) |
| Wrong model after failover | [version_pinning_and_model_lock.md](./version_pinning_and_model_lock.md) |
| Retries duplicate charges | [idempotency_dedupe.md](./idempotency_dedupe.md) · [retry_backoff.md](./retry_backoff.md) |
| Rate-limit storms, timeouts | [rate_limit_backpressure.md](./rate_limit_backpressure.md) |
| Need rollback now | [rollback_and_fast_recovery.md](./rollback_and_fast_recovery.md) · [blue_green_switchovers.md](./blue_green_switchovers.md) |
| Maintenance corrupts anchors | [read_only_mode_and_maintenance_window.md](./read_only_mode_and_maintenance_window.md) · [db_migration_guardrails.md](./db_migration_guardrails.md) |
| Unsure if safe to ship | [rollout_readiness_gate.md](./rollout_readiness_gate.md) |

---

## FAQ

**Q: What does ΔS mean here?**
A: ΔS is a stability score. It measures how much the retrieved content drifts from the expected anchor when you change the query slightly. Lower is better (≤ 0.45 is safe).

**Q: What is λ convergence?**
A: λ tracks whether retrieval order flips unpredictably. If λ is stable across seeds, your rollout is consistent.

**Q: Why do I need idempotency keys?**
A: Without them, retries can double-charge a user or run the same side effect twice. Keys make every request “safe to retry.”

**Q: How do I know if my index swap worked?**
A: Check doc counts and hashes before cutover. If they mismatch, you’re pointing at an incomplete index.

**Q: Canary looked fine but production broke. Why?**
A: Canary often hides tail latency, cache misses, or load-based rate limits. Always test at an increasing percentage of live traffic.

**Q: Why do you mention rollback comms?**
A: Technical rollback is only half the job. Users and stakeholders need fast updates, so pre-drafted Statuspage or Slack messages are essential.

---

### 🔗 Quick-Start Downloads (60 sec)

| Tool | Link | 3-Step Setup |
|------|------|--------------|
| **WFGY 1.0 PDF** | [Engine Paper](https://github.com/onestardao/WFGY/blob/main/I_am_not_lizardman/WFGY_All_Principles_Return_to_One_v1.0_PSBigBig_Public.pdf) | 1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + \<your question>” |
| **TXT OS (plain-text OS)** | [TXTOS.txt](https://github.com/onestardao/WFGY/blob/main/OS/TXTOS.txt) | 1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” and the OS boots instantly |

---

### Explore More

| Layer | Page | What it’s for |
| --- | --- | --- |
| Proof | [WFGY Recognition Map](/recognition/README.md) | External citations, integrations, and ecosystem proof |
| Engine | [WFGY 1.0](/legacy/README.md) | Original PDF based tension engine |
| Engine | [WFGY 2.0](/core/README.md) | Production tension kernel and math engine for RAG and agents |
| Engine | [WFGY 3.0](/TensionUniverse/EventHorizon/README.md) | TXT based Singularity tension engine, 131 S class set |
| Map | [Problem Map 1.0](/ProblemMap/README.md) | Flagship 16 problem RAG failure checklist and fix map |
| Map | [Problem Map 2.0](/ProblemMap/rag-architecture-and-recovery.md) | RAG focused recovery pipeline |
| Map | [Problem Map 3.0](/ProblemMap/wfgy-rag-16-problem-map-global-debug-card.md) | Global Debug Card, image as a debug protocol layer |
| Map | [Semantic Clinic](/ProblemMap/SemanticClinicIndex.md) | Symptom to family to exact fix |
| Map | [Grandma’s Clinic](/ProblemMap/GrandmaClinic/README.md) | Plain language stories mapped to Problem Map 1.0 |
| Onboarding | [Starter Village](/StarterVillage/README.md) | Guided tour for newcomers |
| App | [TXT OS](/OS/README.md) | TXT semantic OS, fast boot |
| App | [Blah Blah Blah](/OS/BlahBlahBlah/README.md) | Abstract and paradox Q and A built on TXT OS |
| App | [Blur Blur Blur](/OS/BlurBlurBlur/README.md) | Text to image with semantic control |
| App | [Blow Blow Blow](/OS/BlowBlowBlow/README.md) | Reasoning game engine and memory demo |

If this repository helped, starring it improves discovery so more builders can find the docs and tools.

[![GitHub Repo stars](https://img.shields.io/github/stars/onestardao/WFGY?style=social)](https://github.com/onestardao/WFGY)