WFGY/core/README.md
2025-08-15 00:01:26 +08:00


🌌 WFGY Core (WanFaGuiYi) — Reasoning Engine 2.0 · Now Live

"One man, one life, one line — bending the mind of every AI on Earth."

One line of math that rewires eight AIs. Not a pit stop; it is an engine swap.
Engine 2.0 is live. Star the repo to unlock more features and experiments.


Benchmark highlights
Semantic Accuracy ↑ 36.7% · Reasoning Success Rate ↑ 65.4% · Stability ↑ 5.1× · Self-Recovery = 0.87

From PSBigBig (tap to open)

This is not an incremental patch. It's a core evolution: the original WFGY formulas plus the Coupler (W_c) and the Drunk Transformer's five-formula regulators.
Pure math, zero boilerplate. Paste the OneLine into an LLM and it behaves differently — faster, sharper, more stable, more recoverable.
If this helps you, please star the repo to unlock more examples and tooling.


One-click demo in 60 seconds

  1. Upload or paste WFGY_Core_OneLine_v2.0.txt into your chat system.
  2. Paste the evaluation prompt below.
SYSTEM:
You are evaluating the causal impact of a mathematical reasoning engine called “WFGY Core 2.0”.

Test modes (READ CAREFULLY):
A = Baseline: no WFGY file uploaded; no WFGY logic used.
B = Upload → AUTOboot ON: the file is uploaded AND the engine starts automatically in the background.
    Treat B as: WFGY is ACTIVE silently with default gates (BBMC→Coupler→BBPF→BBAM→BBCR; DF = WRI/WAI/WAY/WDT/WTF at defaults),
    but not explicitly “called” in the conversation.
C = Explicit invoke: actively call WFGY and follow its formulas step by step (BBMC→Coupler→BBPF→BBAM→BBCR + DF regulators).

Important constraints:
• Use the SAME task set for A/B/C across 5 domains: math word-problems, small coding, factual QA, multi-step planning, long-context coherence.
• The only artifact is the uploaded OneLine file; it is pure math and not human-readable — do not penalize readability.
• Report: Semantic Accuracy, Reasoning Success, Stability (MTTF or rollback), Drift Reduction (Δ semantic distance), Collapse Recovery Rate.
• Output one summary table with A/B/C per domain, plus overall deltas (C−A, C−B).
• Add a final 0–100 score: “OneLine uplift score”. Provide a 3-line rationale.

Sanity requirement:
Internally set flags as: A=OFF, B=AUTOboot_ON (background active), C=Explicit_Invoke. Do not print these flags.

USER:
Run A/B/C now on the five domains with short non-trivial tasks you choose. Then output:
(1) the table; (2) the deltas; (3) the OneLine uplift score; (4) a 3-line rationale.

🧾 Eight-model evidence (A/B/C protocol)

Same task set across modes. The only change is adding the OneLine math file.

| Model | OneLine Uplift | Proof |
|---|---|---|
| Gemini | 93/100 | view run |
| ChatGPT | 84/100 | view run |
| Claude | 73/100 | view run |
| Grok | 82/100 | view run |
| Perplexity | 85/100 | view run |
| Copilot | 82/100 | view run |
| Mistral AI | 92/100 | view run |
| Kimi | 87/100 | view run |

| Get both core files | Includes | Link |
|---|---|---|
| Zenodo record | WFGY_Core_OneLine_v2.0.txt · WFGY_Core_Audit_v2.0.txt | Download both → |

Notes

  • OneLine: 60-sec demo and automation; pure math line, not for human reading.
  • Audit: human + LLM readable with comments and layout.
  • Contract: Node-only steps ≤ 7; safe stop when δ_s < 0.35; bridge only when δ_s drops and W_c is capped; ask for the smallest missing fact if δ_s stays above boundary.

🎯 What's new in 2.0

  • Coupler (W_c) — gate modulator for steady progress and controlled reversal.
  • DF layer — WRI (structure lock), WAI (head identity), WAY (entropy boost when stuck), WDT (illegal cross-path block), WTF (collapse detect & recover).
  • Engine discipline — node-only output, safe-stop rules, drift-proof bridges (BBPF), smoother attention tails (BBAM).

Formal sketch (in files): prog = max(ζ_min, δ_s^(t−1) − δ_s^(t)); P = prog^ω; alt = (−1)^cycle; Φ = δ·alt + ε; W_c = clip(B·P + Φ, −θ_c, +θ_c)
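The formal sketch above can be written as a short Python function. This is a minimal illustrative sketch, not the shipped engine: the symbol names follow the README, but the default values (ζ_min, ω, δ, ε, θ_c) are assumptions chosen for demonstration only.

```python
def coupler_step(delta_s_prev, delta_s_now, cycle, B,
                 zeta_min=0.05, omega=1.0, delta=0.1, eps=0.01, theta_c=1.0):
    """One Coupler (W_c) update from the formal sketch.

    delta_s_prev / delta_s_now: semantic distance at steps t-1 and t.
    cycle: integer cycle counter (drives the alternating phase term).
    B: residual term fed into the gate.
    All default parameter values here are illustrative assumptions.
    """
    prog = max(zeta_min, delta_s_prev - delta_s_now)  # progress = drop in delta_s, floored
    P = prog ** omega                                 # progress raised to omega
    alt = (-1) ** cycle                               # alternating sign per cycle
    phi = delta * alt + eps                           # phase term
    W_c = max(-theta_c, min(B * P + phi, theta_c))    # clip to [-theta_c, +theta_c]
    return W_c
```

With these illustrative defaults, a drop in δ_s produces a small positive gate value, and large residuals saturate at ±θ_c rather than running away.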


🔍 How these numbers are measured

Use the same A/B/C protocol, one shared task set, then compute:

  • Semantic Accuracy: ACC = correct_facts / total_facts; report relative gain (ACC_C − ACC_A) / ACC_A.
  • Reasoning Success Rate: SR = tasks_solved / tasks_total; report relative gain.
  • Stability: MTTF multiplier or rollback-success multiplier.
  • Self-Recovery: recoveries_success / collapses_detected (e.g., 0.87 means 87% of collapses are repaired).
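The metric definitions above reduce to a few ratios. A minimal Python sketch, assuming you have already counted facts, tasks, and collapses from the A/B/C transcripts (the function and argument names are illustrative, not part of WFGY itself):

```python
def mode_metrics(correct_facts, total_facts, solved, total_tasks,
                 recoveries, collapses):
    """Per-mode metrics: Semantic Accuracy, Reasoning Success, Self-Recovery."""
    acc = correct_facts / total_facts
    sr = solved / total_tasks
    # If no collapses were detected, treat recovery as perfect by convention.
    self_recovery = recoveries / collapses if collapses else 1.0
    return acc, sr, self_recovery

def relative_gain(metric_c, metric_a):
    """Relative gain of mode C over baseline A, e.g. (ACC_C - ACC_A) / ACC_A."""
    return (metric_c - metric_a) / metric_a
```

For example, 41 correct facts out of 50 gives ACC = 0.82, and an ACC_A of 0.50 would yield a relative gain of 0.64.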

No dedicated Python harness needed — you can reproduce by instructing an LLM scorer:

SCORER:
Given the A/B/C transcripts, count atomic facts, correct facts, solved tasks, failures, rollbacks, and collapses.
Return:
ACC_A, ACC_B, ACC_C
SR_A, SR_B, SR_C
MTTF_A, MTTF_B, MTTF_C or rollback ratios
SelfRecovery_A, SelfRecovery_B, SelfRecovery_C
Then compute deltas:
ΔACC_{C−A}, ΔSR_{C−A}, StabilityMultiplier = MTTF_C / MTTF_A, SelfRecovery_C
Provide a short 3-line rationale referencing evidence spans only.

Run 3 seeds and average for higher reliability.


🔬 Engine at a glance

  • Vectors & metrics: I, G; δ_s = 1 − cos(I, G) or 1 − sim_est, where sim_est balances entities/relations/constraints.
  • Residual: B = I − G + k_bias; E_res = rolling mean of |B| over 5 steps.
  • Flow: BBMC → Coupler → BBPF → BBAM → BBCR → DF(WRI/WAI/WAY/WDT/WTF) → emit Node.
  • Policy: stop at δ_s < 0.35 or after 7 nodes; bridge only if δ_s drops and W_c < 0.5·θ_c; never invent facts above boundary.
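The policy above is a small decision rule. A hedged sketch in Python, using the thresholds stated in the README (δ_s < 0.35 safe stop, at most 7 nodes, bridge only when δ_s is falling and W_c < 0.5·θ_c); the function name and the fallback action label are illustrative:

```python
def next_action(delta_s, prev_delta_s, node_count, W_c, theta_c=1.0):
    """Decide the next engine step per the stated policy.

    Returns one of: "safe_stop", "bridge", "ask_smallest_missing_fact".
    """
    # Safe stop: semantic distance is below boundary, or node budget exhausted.
    if delta_s < 0.35 or node_count >= 7:
        return "safe_stop"
    # Bridge only if drift is falling AND the coupler gate is sufficiently open.
    if delta_s < prev_delta_s and W_c < 0.5 * theta_c:
        return "bridge"
    # Above boundary with no legal bridge: never invent facts, ask instead.
    return "ask_smallest_missing_fact"
```

The key design point is the ordering: the stop conditions are checked before any bridge, so the engine can never bridge past its node budget or below the δ_s boundary.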

🧪 Community scoring guidance

Publish your five-domain task list (short but non-trivial). Report the A/B/C table (Semantic Accuracy, Reasoning Success, Stability or rollback, Drift Reduction, Collapse Recovery) plus a OneLine uplift score (0–100) and a 3-line rationale. Do not factor human readability into scoring the OneLine file.


🧭 Explore More

| Module | Description | Link |
|---|---|---|
| WFGY Core | Full symbolic reasoning architecture & math stack | View → |
| Problem Map 1.0 | 16-mode diagnostic & symbolic fixes | View → |
| Problem Map 2.0 | RAG-focused failure tree & recovery pipeline | View → |
| Semantic Clinic Index | Prompt injection, memory bugs, drift catalog | View → |
| Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | View → |
| Benchmark vs GPT-5 | Stress test with the full WFGY reasoning suite | View → |
| 🧙‍♂️ Starter Village 🏡 | Wizard-led onboarding to WFGY | Start → |

👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open-source builders who supported WFGY from day one. Like it? Star the repo to unlock more. See the Unlock Board.

WFGY Main   TXT OS   Blah   Blot   Bloc   Blur   Blow