Deployment checklist — RAG pipeline (pre-deploy & post-deploy)

Purpose: a short, rigorous checklist to verify your environment and reduce bootstrap/dependency issues during deployment.

Before you deploy (pre-flight)

Kubernetes cluster is accessible and kubectl points to the correct context:
```
kubectl config current-context
kubectl get nodes
```
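
A pre-flight script can refuse to continue when kubectl points at the wrong cluster. The following is a minimal sketch, not a definitive implementation: `require_context` is a hypothetical helper and `prod-cluster` is a placeholder context name.

```shell
#!/usr/bin/env bash
# require_context: fail fast if the active kubectl context is not the one
# the caller expects. Hypothetical helper; the expected name is an argument.
require_context() {
  local expected="$1"
  local current
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "Refusing to deploy: context is '$current', expected '$expected'" >&2
    return 1
  fi
  echo "Context OK: $current"
}

# Usage (prod-cluster is a placeholder context name):
#   require_context prod-cluster && kubectl get nodes
```

Passing the expected name explicitly keeps the guard usable across environments.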

Cluster resources: enough CPU, memory, and ephemeral storage for the vectorstore. Confirm quotas.
Secrets: LLM API keys, database credentials, and vectorstore credentials stored in a Kubernetes Secret or a vault.
Helm chart / manifests: reviewed, with values set for production (replicas, resources, liveness/readiness probes).

values.yaml contains:
- resources.requests and resources.limits for the retriever and generator.
- replicaCount >= 2 for critical services (when expected load is more than light).
- readinessProbe and livenessProbe configured.

Vector store sizing: index_shards, disk IOPS, memory for the embedding index.
Network egress rules allow traffic to the model API (if using an external LLM).
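
One way to check the values.yaml items above before deploying is to render the chart and look for the expected fields. This is a rough sketch, assuming the chart sits in the current directory and uses values.prod.yaml as in the install step. `check_rendered` is a hypothetical helper, and a plain text search only shows that each field appears somewhere in the output, not that every container sets it.

```shell
#!/usr/bin/env bash
# check_rendered: read rendered manifests on stdin and report which of the
# expected fields are missing. Hypothetical helper; a text search only proves
# a field appears at least once, not that every container sets it.
check_rendered() {
  local manifests missing=0 field
  manifests="$(cat)"
  for field in readinessProbe livenessProbe requests limits; do
    if ! printf '%s\n' "$manifests" | grep -q "^ *${field}:"; then
      echo "missing: ${field}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (assumes the chart is in the current directory):
#   helm template rag . -n rag-prod -f values.prod.yaml | check_rendered
```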

Deploy

Create namespace & secrets:

kubectl create ns rag-prod --dry-run=client -o yaml | kubectl apply -f -
kubectl -n rag-prod apply -f k8s/secrets.yaml
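
After applying the manifest, it can help to confirm that the Secrets the services expect actually exist. This is a minimal sketch: `missing_secrets` is a hypothetical helper, and the names `llm-api-key`, `db-credentials`, and `vectorstore-credentials` are placeholders for whatever k8s/secrets.yaml defines.

```shell
#!/usr/bin/env bash
# missing_secrets: print the name of every Secret that is absent from the
# given namespace; returns non-zero if any are missing. Hypothetical helper.
missing_secrets() {
  local ns="$1" name rc=0
  shift
  for name in "$@"; do
    if ! kubectl -n "$ns" get secret "$name" >/dev/null 2>&1; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (secret names are placeholders, not taken from the chart):
#   missing_secrets rag-prod llm-api-key db-credentials vectorstore-credentials
```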

Install/upgrade Helm chart:

helm upgrade --install rag . -n rag-prod -f values.prod.yaml

Wait for pods to be ready (watch):

kubectl -n rag-prod rollout status deploy/rag-api -w
kubectl -n rag-prod get pods -o wide

After you deploy (post-deploy checks)

Smoke tests (simple requests):

curl -fsS http://<ingress>/healthz
curl -fsS -X POST http://<ingress>/api/qa -H 'Content-Type: application/json' -d '{"qid":"smoke-1","q":"Who is the CEO of WFGY?"}' | jq .
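
Right after a rollout the ingress may take a few seconds to route traffic, so a single curl can fail spuriously. This is a small sketch of a retry loop around the health check. `wait_healthy` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
#!/usr/bin/env bash
# wait_healthy: poll a URL until curl succeeds or the attempts run out.
# Arguments: URL, max attempts (default 10), delay in seconds (default 3).
# Hypothetical helper; the defaults are arbitrary.
wait_healthy() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Only pause when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage (replace <ingress> with your ingress host):
#   wait_healthy "http://<ingress>/healthz" 10 3
```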

Confirm retriever returns docs for 10 sample queries:
- Use your retrieval debug endpoint to inspect retrieved_ids.
Confirm p95 end-to-end latency ≤ the target for this environment. Collect it from Grafana or from kubectl logs.
Confirm CHR on 10 smoke items ≥ expected baseline (manually assert correctness).
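
If latencies are pulled from logs rather than Grafana, p95 can be computed with standard tools. This is a minimal sketch using the nearest-rank method, assuming one latency value in milliseconds per line on stdin. `p95` and `latency_ok` are hypothetical helpers, and the 800 ms target in the usage line is only an example.

```shell
#!/usr/bin/env bash
# p95: print the 95th percentile (nearest-rank) of numeric values on stdin,
# one value per line. Prints nothing and fails if there is no input.
p95() {
  sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # integer ceiling of 0.95 * N
      print v[rank]
    }'
}

# latency_ok: succeed when the measured p95 is at or below the target.
latency_ok() {
  local measured="$1" target="$2"
  awk -v m="$measured" -v t="$target" 'BEGIN { exit !(m + 0 <= t + 0) }'
}

# Usage (latencies.txt and the 800 ms target are examples):
#   latency_ok "$(p95 < latencies.txt)" 800 && echo "p95 within target"
```

With only 10 smoke requests the nearest-rank p95 is simply the slowest request, so a larger sample gives a more meaningful number.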

Check for error spikes in logs:

kubectl -n rag-prod logs -l app=rag --since=10m --tail=-1 | grep -E "ERROR|WARN" | head -n 200
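
To turn the log check into a pass/fail gate, the matching lines can be counted and compared against a threshold. This is a sketch: `errors_within_limit` is a hypothetical helper, and the limit of 20 is an arbitrary example rather than a recommended value.

```shell
#!/usr/bin/env bash
# errors_within_limit: count lines on stdin that contain ERROR and succeed
# only when the count is at or below the limit given as the first argument.
# Hypothetical helper.
errors_within_limit() {
  local limit="$1" count
  # grep -c exits non-zero when nothing matches, but still prints 0.
  count="$(grep -c 'ERROR' || true)"
  echo "ERROR lines: $count"
  [ "$count" -le "$limit" ]
}

# Usage (the limit of 20 is an arbitrary example; --tail=-1 asks for all
# lines, since kubectl returns only the last 10 per pod when -l is used):
#   kubectl -n rag-prod logs -l app=rag --since=10m --tail=-1 | errors_within_limit 20
```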

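Rollback if any of the checks above fails:
- Smoke tests fail or the health endpoint is unreachable.
- The retriever returns no documents for the sample queries.
- p95 end-to-end latency exceeds the target.
- CHR on the smoke items falls below the expected baseline.
- Logs show an error spike.

Rollback command example: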

helm rollback rag <previous_revision> -n rag-prod
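
The rollback decision can be scripted so that a failed post-deploy check leads to the command above. This is a minimal sketch, assuming each check has already been reduced to an exit status (0 for pass). `rollback_needed` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# rollback_needed: given the exit statuses of the post-deploy checks
# (0 = pass), succeed if at least one check failed. Hypothetical helper.
rollback_needed() {
  local status
  for status in "$@"; do
    if [ "$status" -ne 0 ]; then
      return 0
    fi
  done
  return 1
}

# Usage: collect statuses from your own checks, then roll back if needed.
#   curl -fsS "http://<ingress>/healthz" >/dev/null; smoke=$?
#   if rollback_needed "$smoke"; then
#     helm rollback rag <previous_revision> -n rag-prod
#   fi
```

`helm history rag -n rag-prod` lists the release revisions, which is one way to find the value for `<previous_revision>`.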
