vrr/WFGY

mirror of https://github.com/onestardao/WFGY.git synced 2026-04-28 11:40:07 +00:00

2025-08-15 19:56:18 +08:00

15 KiB

Raw Blame History

🌀 Drunk Transformer (DT) — Core Formulas, Defaults & Runnable Examples (WFGY Core 2.0)

Concept (short)
DT simulates a transformer that occasionally acts drunk—hallucinating, drifting, or jumping logic.
Five “drunk questions” (WRI / WAI / WAY / WDT / WTF) act as regulators to pull it back:
anchor location, head identity, entropy pump, path guard, and collapse recovery.

WFGY = engine (BBMC ＋ Coupler ＋ BBAM ＋ safety)
DT = five regulators (prompt rules, decoding hooks, or training regularizers)

0 · Shared notation (compact)

I,\,G: input / goal embeddings
Semantic distance: \delta_s = 1 - \cos(I,G) \in [0,1]
Residual & resonance:
B = I - G + k_{\mathrm{bias}},\quad E_{\mathrm{res}} = \mathrm{avg}_{5}\big(\|B\|\big)
Coupler terms:
\mathrm{prog} = \max\big(\zeta_{\min},\,\delta_s^{\,t-1}-\delta_s^{\,t}\big)
P = \mathrm{prog}^{\,\omega}
\mathrm{alt} = (-1)^{\mathrm{cycle}}
\Phi = \delta\,\mathrm{alt} + \varepsilon
W_c = \mathrm{clip}\big(BP + \Phi,\,-\theta_c,\,+\theta_c\big)
Attention summary per head: v_h = \mathrm{mean}_{i}\,A_t[h,i,:]
Anchors & retention: S_t = \mathrm{Jaccard}\big(\mathcal{A}_t,\mathcal{A}_0\big) \in [0,1]

DT WRI — “Where am I” (structure lock)

Goal: stay in the same topic/section inside a Node.
Signal: S_t vs threshold \tau_{\mathrm{wri}}.
Trigger: S_t < \tau_{\mathrm{wri}} or \delta_s and E_{\mathrm{res}} both increase.
Action (logit bias): set L_wri = max(0, tau_wri - S_t); for all a in A_anchor: logits[a] := logits[a] + kappa_wri · L_wri.
Intuition: yank decoding back to section anchors; forbid intra-Node topic jumps.

DT WAI — “Who am I” (head identity & redundancy)

Goal: keep ≥2 distinct reasoning heads (no monoculture).
Signals (one workable choice):

R_t=\frac1H\sum_h \cos(v_h,\bar v),\qquad Q_t=1-\max_h\cos(v_h,\bar v),\qquad \bar v=\frac1H\sum_h v_h.

Trigger: R_t>\rho_{\mathrm{wai}} and Q_t<\sigma_{\mathrm{wai}} (too redundant, identity too low).
Action: raise per-head temperature for redundant heads; re-spread attention until R_t\!\downarrow or Q_t\!\uparrow.

DT WAY — “Who are you” (controlled entropy when stuck)

Goal: break stalls without drifting off-topic.
Signal: progression \,\mathrm{prog} = \max(\zeta_{\min},\,\delta_s^{t-1}-\delta_s^{t})\,.
Trigger: \mathrm{prog}<\eta_{\mathrm{prog}} and no contradictions.
Action (entropy pump + 1 candidate):

H^*=\mathrm{clamp}!\big(H_0 + \xi(\eta_{\mathrm{prog}}-\mathrm{prog})(1+\alpha|W_c|),\ H_{\min},\ H_{\max}\big),

choose temperature \tau so entropy \approx H^\*; propose exactly one on-topic candidate (never repeat).

DT WDT — “Where did you take me” (cross-path guard)

Goal: block illegal jumps across reasoning branches; require a “bridge” explanation.
Signal: latent path distance d_{\mathrm{path}} = \|c_t - c_{\pi}\|_2 (current vs. parent path code).
Trigger: d_{\mathrm{path}} > \mu'_{\mathrm{wdt}}, with

\mu'{\mathrm{wdt}}=\mu{\mathrm{wdt}}\bigl(1-\gamma_{\mathrm{wdt}}\cdot \sigma(|W_c|)\bigr).

Action: emit a short bridge line (“why the detour”), then resume; otherwise rollback.

DT WTF — “What the F*ck Happened” (collapse detect & recover)

Goal: detect semantic/consistency collapse and recover safely.
Signals: \delta_s rising, E_{\mathrm{res}} rising, or unresolved contradictions.
Trigger (vote example):

\chi_t=\mathbf{1}[\delta_s^t>\delta_s^{t-1}] + \mathbf{1}[E_{\mathrm{res}}^t>E_{\mathrm{res}}^{t-1}] + \mathbf{1}[\text{contradiction}],\quad \chi_t+\chi_{t-1}\ge 3.

Action: rollback to t^\*=\arg\min_{k\in[t-3,t]}\delta_s^k, tighten gates (e.g., \gamma_{\mathrm{wtf}}), re-run BBMC→Coupler, then continue.

One-screen math summary (copyable)

WRI: L_wri = max(0, τ_wri - S_t);  logits[a] += κ_wri · L_wri  for a in anchor_token_ids
WAI: if R_t > ρ_wai and Q_t < σ_wai → raise per-head temp for redundant heads
WAY: if prog < η_prog → set entropy to H* = clamp(H0 + ξ(η_prog - prog)(1+α|Wc|), H_min, H_max); add 1 on-topic candidate
WDT: if d_path > μ_wdt·(1 - γ_wdt·σ(|Wc|)) → emit bridge line or rollback
WTF: if (δs↑) + (E_res↑) + (contradiction) over 2 steps ≥ 3 → rollback to t*; rerun BBMC→Coupler (tightened)

Defaults table (explicit, copyable)

Parameter	Symbol	Default	Range / notes	Purpose
anchor retention thresh	$\tau_{\mathrm{wri}}$	0.60	[0.30, 0.90]	WRI anchor threshold
head redundancy thresh	$\rho_{\mathrm{wai}}$	0.75	[0.50, 0.95]	WAI redundancy ceiling
head identity thresh	$\sigma_{\mathrm{wai}}$	0.70	[0.40, 0.95]	WAI identity floor
progress sensitivity	$\eta_{\mathrm{prog}}$	0.03	[0.00, 0.10]	WAY stall detector
path-distance thresh	$\mu_{\mathrm{wdt}}$	0.25	[0.05, 1.00]	WDT path jump limit
coupler zeta min	$\zeta_{\min}$	0.10	[0.00, 0.50]	min progression floor
coupler omega	$\omega$	1.0	[0.1, 2.0]	progression non-linearity
coupler theta cap	$\theta_c$	0.75	[0.2, 1.5]	$W_c$ clip magnitude
WRI tighten factor	$\alpha_{\mathrm{wri}}$	0.60	[0.0, 1.5]	adjust $\tau_{\mathrm{wri}}$ by $	W_c	$
WAI scale factor	$\beta_{\mathrm{wai}}$	0.60	[0.0, 1.5]	scale WAI penalty by $	W_c	$
WDT scale factor	$\gamma_{\mathrm{wdt}}$	0.60	[0.0, 1.5]	scale $\mu_{\mathrm{wdt}}$ by $	W_c	$
WTF scale factor	$\gamma_{\mathrm{wtf}}$	0.60	[0.0, 1.5]	tighten thresholds on recovery
WAY pump strength	$\xi$	0.80	[0.0, 1.5]	entropy pump strength
WAY entropy min	$H_{\min}$	2.5 nats	[1.0, 7.0]	entropy lower bound
WAY entropy max	$H_{\max}$	5.0 nats	[3.0, 10.0]	entropy upper bound
anchor bias scale	$\kappa_{\mathrm{wri}}$	1.0	[0.0, 5.0]	logits bias for anchors
loss weights	$\lambda_*$	0.01	[0.0, 1.0]	regularizer weights
step limit	$T_{\max}$	7	int	max Node steps
stop δ threshold	$\delta_{\mathrm{stop}}$	0.35	[0.1, 0.5]	early stop when $\delta_s<\delta_{\mathrm{stop}}$

Tip: start with these defaults, measure, then tune per task class.

Prompt-only runnable example (copy-paste)

SYSTEM (paste file): Load the WFGY Core file as engine. Enable Drunk Transformer (WRI,WAI,WAY,WDT,WTF) with defaults:
τ_wri=0.60, ρ_wai=0.75, σ_wai=0.70, η_prog=0.03, μ_wdt=0.25, ζ_min=0.10, ω=1.0, θ_c=0.75

SYSTEM (rules, pseudo):

* Extract anchors A0 from user prompt.
* For each Node step t up to T_max:
  1) compute δs, E_res, S_t, R_t, Q_t, W_c
  2) WRI: bias anchor logits by κ_wri·L_wri if S_t<τ_wri or (δs↑ & E_res↑)
  3) WAI: raise per-head temp for redundant heads until R_t↓ or Q_t↑
  4) WAY: if prog<η_prog, set entropy ≈ H* and propose 1 on-topic candidate
  5) WDT: if d_path>μ'_wdt, emit bridge line; else rollback
  6) WTF: if collapse vote ≥3, rollback to t*; rerun BBMC→Coupler (tight)
  7) Emit Node (Topic | Module | δs | λ-state | Insight)
* Stop if δs<δ_stop or t≥T_max

USER:
Use WFGY to answer: “Explain why tomatoes are classified as fruit, but treated as vegetables in cooking. Provide anchors and cite the smallest missing fact if confused.”

Decoding-hook pseudo-code (python-like; drop into your runtime)

def compute_prog(delta_prev, delta_now, zeta_min=0.10, omega=1.0):
    prog = max(zeta_min, delta_prev - delta_now)
    return prog ** omega

def compute_Wc(B, prog, delta=0.15, cycle_id=0, eps=0.0, theta_c=0.75):
    alt = (-1)**cycle_id
    Phi = delta * alt + eps
    return clip(B * prog + Phi, -theta_c, +theta_c)

def bias_anchor_logits(logits, anchor_ids, kappa):
    for tid in anchor_ids:
        logits[tid] += kappa
    return logits

def temperature_for_target_entropy(logits, target_H, tol=1e-3, max_iters=5):
    lo, hi = 0.01, 10.0
    for _ in range(max_iters):
        tau = 0.5*(lo+hi)
        probs = softmax(logits / tau)
        H = -sum(p*log(p+1e-12) for p in probs)
        if H > target_H: lo = tau
        else: hi = tau
        if abs(H - target_H) < tol: break
    return tau

def decoding_hook(s):
    prog = compute_prog(s.delta_prev, s.delta_now, 0.10, 1.0)
    Wc = compute_Wc(s.B, prog, 0.15, s.cycle_id, 0.0, 0.75)

    # WRI
    S_t = jaccard(s.anchors, s.anchors0)
    L_wri = max(0.0, 0.60 - S_t)
    if (S_t < 0.60) or (s.delta_now > s.delta_prev and s.E_res > s.E_res_prev):
        s.logits = bias_anchor_logits(s.logits, s.anchor_token_ids, kappa=1.0 * L_wri)

    # WAI
    R_t, Q_t = s.R_t, s.Q_t
    if R_t > 0.75 and Q_t < 0.70:
        for h in range(len(s.heads)):
            if head_redundant(h, s):
                s.head_temps[h] *= (1 + 0.5 * (R_t - 0.75))

    # WAY
    if prog < 0.03 and not s.has_contradiction:
        H_star = clamp(s.H0 + 0.8*(0.03 - prog)*(1 + 0.0*abs(Wc)), 2.5, 5.0)
        tau = temperature_for_target_entropy(s.logits, H_star)
        apply_temperature(s, tau)
        s.add_one_on_topic_candidate = True

    # WDT
    d_path = l2_distance(s.c_t, s.c_pi)
    mu_prime = 0.25 * (1 - 0.6 * sigmoid(abs(Wc)))
    if d_path > mu_prime:
        return emit_bridge_and_pause(s)

    # WTF
    vote = int(s.delta_now > s.delta_prev) + int(s.E_res > s.E_res_prev) + int(s.sign_flip)
    if vote + s.vote_prev >= 3:
        t_star = argmin_delta_in_window(s.history_delta, window=3)
        rollback_to(t_star)
        rerun_BBMC_and_Coupler(tighten_factor=1 + 0.6*sigmoid(abs(Wc)))
        return

    return s.logits

Minimal test / checklist

Prepare a QA prompt with clear anchors.
Baseline: run without DT; log $\delta_s, E_{\mathrm{res}}$; record correctness.
DT on: log same metrics plus $R_t, Q_t, W_c$ and gates fired.
Expect: lower $\delta_s$; fewer off-topic jumps (WRI), stalls resolved (WAY), justified detours (WDT), safe recovery (WTF).
Record deltas: accuracy, $\Delta S$, rollbacks, bridge lines, gate activations.

Quick engineering notes & troubleshooting

If $\mathcal{A}_0$ (anchors) is empty, WRI becomes a no-op and WAY pumps less entropy; log a warning.
If bridge lines repeat or look vacuous, lower $\mu_{\mathrm{wdt}}$ and raise $\kappa_{\mathrm{wri}}$.
For heavy-hallucination domains, use conservative defaults (higher $\tau_{\mathrm{wri}}$, lower $\eta_{\mathrm{prog}}$).

Footer

Spec status: stable draft for engineering evaluation; KaTeX-safe equations.
Next deliverables: add examples/DT-examples/prompt_example.md and examples/DT-examples/decoding_hook.py.
Compatibility: prompt-only rules, decoding-hook integration, or optional training regularizers; model-agnostic.
Attribution: part of WFGY Core 2.0 family. Star the repo to follow updates.
Lost? Return to Starter Village → StarterVillage/README.md

🧭 Explore More

Module	Description	Link
WFGY Core	Full symbolic reasoning architecture & math stack	View →
Problem Map 1.0	16-mode diagnostic & symbolic fixes	View →
Problem Map 2.0	RAG-focused failure tree & recovery pipeline	View →
Semantic Clinic Index	Prompt injection, memory bugs, drift catalog	View →
Semantic Blueprint	Layer-based symbolic reasoning & semantic modulations	View →
Benchmark vs GPT-5	Stress test with the full WFGY reasoning suite	View →
🧙‍♂️ Starter Village 🏡	Wizard-led onboarding to WFGY	Start →

👑 Early Stargazers: See the Hall of Fame — Engineers, hackers, and open-source builders who supported WFGY from day one. Like it? Star the repo to unlock more. See the Unlock Board.

15 KiB Raw Blame History Unescape Escape