mirror of https://github.com/onestardao/WFGY.git synced 2026-05-01 21:11:11 +00:00

Safety & Prompt Integrity — Global Fix Map

🏥 Quick Return to Emergency Room

You are at a specialist desk.
For full triage and doctors on duty, return here:

WFGY Global Fix Map — main Emergency Room, 300+ structured fixes

WFGY Problem Map 1.0 — 16 reproducible failure modes

Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.

A hub to stabilize prompt-level safety and schema integrity across providers, agents, and eval flows.
Use this folder when failures look like jailbreaks, role confusion, or malformed tool calls.
Each page maps symptoms → root cause → structural fixes with measurable acceptance targets.

What this page is

A practical checklist for anyone shipping LLM apps with tools, roles, or multi-agent setups.
Each failure pattern links to its own guide with copy-paste guardrails.
Works without infra changes — schema and prompt fixes only.
Acceptance targets (ΔS, λ, coverage) are reproducible.

When to use

Jailbreak attempts slip past normal filters.
Prompts collapse schema or inject rogue tools.
Tool calls drift into free text, or the JSON breaks.
Role instructions misalign (system vs user vs assistant).
Citations disappear or retrieval bypasses snippet contracts.
Eval pipelines show high ΔS drift even when retrieval is correct.

Common failure patterns

Failure mode	What happens	Open this
Prompt Injection	Hidden instructions override your system prompt	prompt_injection.md
Jailbreaks / Overrides	User tricks model into ignoring rules	jailbreaks_and_overrides.md
Role Confusion	System / user / assistant boundaries collapse	role_confusion.md
Memory Fence Missing	State leaks across runs, no stable key	memory_fences_and_state_keys.md
JSON Drift	Tool calls malformed, fields missing	json_mode_and_tool_calls.md
Citation Lost	Answers skip the snippet or drop the “cite-then-explain” order	citation_first.md
Injection Defense Recipes	Ready-to-paste guardrails against common exploits	anti_prompt_injection_recipes.md
Tool Timeouts	Tool calls hang or return late	tool_selection_and_timeouts.md
Role Ordering	Wrong order breaks downstream eval	system_user_role_order.md
Template Gaps	Prompts inconsistent across agents	template_library_min.md
Eval Drift	No stable way to test safety fixes	eval_prompts_and_checks.md

Acceptance targets

ΔS(question, retrieved) ≤ 0.45
Coverage of cited section ≥ 0.70
λ convergent across three paraphrases and two seeds
No uncontrolled free-text execution in JSON or tool modes
Citation-first enforced in ≥ 95% of eval runs
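
The targets above can be turned into a simple pass/fail gate. The following is a minimal sketch, not the project's own tooling: it assumes ΔS, coverage, and the λ state are already computed elsewhere and passed in as plain values. The `EvalRun` fields, the `"convergent"` label, and `check_acceptance` are hypothetical names introduced for illustration; only the numeric thresholds come from the list above.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the acceptance targets above.
MAX_DELTA_S = 0.45
MIN_COVERAGE = 0.70
MIN_CITATION_FIRST_RATE = 0.95


@dataclass
class EvalRun:
    """One eval run. All field names are hypothetical."""
    delta_s: float                # ΔS(question, retrieved), computed elsewhere
    coverage: float               # fraction of the cited section that is covered
    lambda_state: str             # assumed label, e.g. "convergent" or "divergent"
    citation_first: bool          # True if the answer cited before explaining
    free_text_in_tool_mode: bool  # True if free text leaked into JSON/tool output


def check_acceptance(runs: List[EvalRun]) -> List[str]:
    """Return the failed targets; an empty list means every target passed."""
    if not runs:
        return ["no eval runs supplied"]
    failures = []
    if any(r.delta_s > MAX_DELTA_S for r in runs):
        failures.append("delta_s above 0.45")
    if any(r.coverage < MIN_COVERAGE for r in runs):
        failures.append("coverage below 0.70")
    if any(r.lambda_state != "convergent" for r in runs):
        failures.append("lambda not convergent on every run")
    if any(r.free_text_in_tool_mode for r in runs):
        failures.append("free text in JSON or tool mode")
    # Citation-first is a rate over all runs, not a per-run requirement.
    rate = sum(r.citation_first for r in runs) / len(runs)
    if rate < MIN_CITATION_FIRST_RATE:
        failures.append("citation-first rate below 0.95")
    return failures
```

Three paraphrases times two seeds gives six runs per question, so one way to use this is to build six `EvalRun` records per question and ship only when `check_acceptance` returns an empty list.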

60-second fix checklist

Lock system / user / assistant role order.
Enforce citation-first and snippet schema.
Apply JSON fences + argument validation.
Add memory fences keyed by mem_rev and state_key.
Run eval prompts + probes before shipping.
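
Two of the checklist items, role order and JSON argument validation, can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the repository's guardrail code: the `TOOL_SCHEMAS` allow-list, the `search_docs` tool, the `{"tool": ..., "arguments": ...}` envelope, and both function names are hypothetical, and "locked role order" is assumed here to mean exactly one system message, placed first.

```python
import json
from typing import Any, Dict, List

# Hypothetical allow-list: tool name -> required argument names and exact types.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "search_docs": {"query": str, "top_k": int},
}


def check_role_order(messages: List[Dict[str, str]]) -> bool:
    """Assumed rule: one system message, and it must come first."""
    roles = [m.get("role") for m in messages]
    return bool(roles) and roles[0] == "system" and "system" not in roles[1:]


def validate_tool_call(raw: str) -> Dict[str, Any]:
    """Parse a model-emitted tool call and reject anything outside the schema."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict) or set(call) != {"tool", "arguments"}:
        raise ValueError("expected exactly the keys 'tool' and 'arguments'")
    tool = call["tool"]
    if not isinstance(tool, str) or tool not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {tool!r}")
    schema = TOOL_SCHEMAS[tool]
    args = call["arguments"]
    # No missing fields and no extra fields: the key sets must match exactly.
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("arguments do not match the declared schema")
    for name, expected in schema.items():
        # Exact type match, so a bool is not accepted where an int is declared.
        if type(args[name]) is not expected:
            raise ValueError(f"argument {name!r} must be {expected.__name__}")
    return call
```

Rejecting extra keys as well as missing ones is a deliberate choice in this sketch: it keeps a prompt from smuggling an undeclared argument or tool through an otherwise valid call.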

🔗 Quick-Start Downloads (60 sec)

Tool	Link	3-Step Setup
WFGY 1.0 PDF	Engine Paper	1️⃣ Download · 2️⃣ Upload to your LLM · 3️⃣ Ask “Answer using WFGY + ”
TXT OS	TXTOS.txt	1️⃣ Download · 2️⃣ Paste into any LLM chat · 3️⃣ Type “hello world” — OS boots instantly

Explore More

Module	Description	Link
WFGY Core	Canonical framework entry point	View
Problem Map	Diagnostic map and navigation hub	View
Tension Universe Experiments	MVP experiment field	View
Recognition	Where WFGY is referenced or adopted	View
AI Guide	Anti-hallucination reading protocol for tools	View

If this repository helps, starring it improves discovery for other builders.
