# 🌀 Drunk Transformer (DT) — Core Formulas, Defaults & Runnable Examples (WFGY Core 2.0) **Concept (short)** DT simulates a transformer that occasionally acts *drunk*—hallucinating, drifting, or jumping logic. Five “drunk questions” (WRI / WAI / WAY / WDT / WTF) act as regulators to pull it back: anchor location, head identity, entropy pump, path guard, and collapse recovery. > **WFGY** = engine (BBMC + Coupler + BBAM + safety) > **DT** = five regulators (prompt rules, decoding hooks, or training regularizers) --- ## 0 · Shared notation (compact) * $I,\,G$: input / goal embeddings * Semantic distance: $\delta_s = 1 - \cos(I,G) \in [0,1]$ * Residual & resonance: $B = I - G + k_{\mathrm{bias}},\quad E_{\mathrm{res}} = \mathrm{avg}_{5}\big(\|B\|\big)$ * Coupler terms: $\mathrm{prog} = \max\big(\zeta_{\min},\,\delta_s^{\,t-1}-\delta_s^{\,t}\big)$ $P = \mathrm{prog}^{\,\omega}$ $\mathrm{alt} = (-1)^{\mathrm{cycle}}$ $\Phi = \delta\,\mathrm{alt} + \varepsilon$ $W_c = \mathrm{clip}\big(BP + \Phi,\,-\theta_c,\,+\theta_c\big)$ * Attention summary per head: $v_h = \mathrm{mean}_{i}\,A_t[h,i,:]$ * Anchors & retention: $S_t = \mathrm{Jaccard}\big(\mathcal{A}_t,\mathcal{A}_0\big) \in [0,1]$ --- ### DT WRI — “Where am I” (structure lock) * **Goal:** stay in the same topic/section inside a Node. * **Signal:** $S_t$ vs threshold $\tau_{\mathrm{wri}}$. * **Trigger:** $S_t < \tau_{\mathrm{wri}}$ or $\delta_s$ and $E_{\mathrm{res}}$ both increase. * **Action (logit bias):** set L_wri = max(0, tau_wri - S_t); for all a in A_anchor: logits[a] := logits[a] + kappa_wri · L_wri. * **Intuition:** yank decoding back to section anchors; forbid intra-Node topic jumps. --- ### DT WAI — “Who am I” (head identity & redundancy) **Goal:** keep ≥2 distinct reasoning heads (no monoculture). **Signals (one workable choice):** $$ R_t=\frac1H\sum_h \cos(v_h,\bar v),\qquad Q_t=1-\max_h\cos(v_h,\bar v),\qquad \bar v=\frac1H\sum_h v_h. $$ **Trigger:** $R_t>\rho_{\mathrm{wai}}$ **and** $Q_t<\sigma_{\mathrm{wai}}$ (too redundant, identity too low). **Action:** raise per-head temperature for redundant heads; re-spread attention until $R_t\!\downarrow$ or $Q_t\!\uparrow$. --- ### DT WAY — “Who are you” (controlled entropy when stuck) **Goal:** break stalls without drifting off-topic. **Signal:** progression $\,\mathrm{prog} = \max(\zeta_{\min},\,\delta_s^{t-1}-\delta_s^{t})\,$. **Trigger:** $\mathrm{prog}<\eta_{\mathrm{prog}}$ and no contradictions. **Action (entropy pump + 1 candidate):** $$ H^\*=\mathrm{clamp}\!\big(H_0 + \xi(\eta_{\mathrm{prog}}-\mathrm{prog})(1+\alpha|W_c|),\ H_{\min},\ H_{\max}\big), $$ choose temperature $\tau$ so entropy $\approx H^\*$; propose exactly **one** on-topic candidate (never repeat). --- ### DT WDT — “Where did you take me” (cross-path guard) **Goal:** block illegal jumps across reasoning branches; require a “bridge” explanation. **Signal:** latent path distance $d_{\mathrm{path}} = \|c_t - c_{\pi}\|_2$ (current vs. parent path code). **Trigger:** $d_{\mathrm{path}} > \mu'_{\mathrm{wdt}}$, with $$ \mu'_{\mathrm{wdt}}=\mu_{\mathrm{wdt}}\bigl(1-\gamma_{\mathrm{wdt}}\cdot \sigma(|W_c|)\bigr). $$ **Action:** emit a short bridge line (“why the detour”), then resume; otherwise rollback. --- ### DT WTF — “What the F\*ck Happened” (collapse detect & recover) **Goal:** detect semantic/consistency collapse and recover safely. **Signals:** $\delta_s$ rising, $E_{\mathrm{res}}$ rising, or unresolved contradictions. **Trigger (vote example):** $$ \chi_t=\mathbf{1}[\delta_s^t>\delta_s^{t-1}] + \mathbf{1}[E_{\mathrm{res}}^t>E_{\mathrm{res}}^{t-1}] + \mathbf{1}[\text{contradiction}],\quad \chi_t+\chi_{t-1}\ge 3. $$ **Action:** rollback to $t^\*=\arg\min_{k\in[t-3,t]}\delta_s^k$, tighten gates (e.g., $\gamma_{\mathrm{wtf}}$), re-run **BBMC→Coupler**, then continue. --- ### One-screen math summary (copyable) ```txt WRI: L_wri = max(0, τ_wri - S_t); logits[a] += κ_wri · L_wri for a in anchor_token_ids WAI: if R_t > ρ_wai and Q_t < σ_wai → raise per-head temp for redundant heads WAY: if prog < η_prog → set entropy to H* = clamp(H0 + ξ(η_prog - prog)(1+α|Wc|), H_min, H_max); add 1 on-topic candidate WDT: if d_path > μ_wdt·(1 - γ_wdt·σ(|Wc|)) → emit bridge line or rollback WTF: if (δs↑) + (E_res↑) + (contradiction) over 2 steps ≥ 3 → rollback to t*; rerun BBMC→Coupler (tightened) ```` --- ## Defaults table (explicit, copyable) | Parameter | Symbol | Default | Range / notes | Purpose | | | | ----------------------- | --------------------------: | -------: | ------------- | ----------------------------------------------------- | ---- | -- | | anchor retention thresh | \$\tau\_{\mathrm{wri}}\$ | 0.60 | \[0.30, 0.90] | WRI anchor threshold | | | | head redundancy thresh | \$\rho\_{\mathrm{wai}}\$ | 0.75 | \[0.50, 0.95] | WAI redundancy ceiling | | | | head identity thresh | \$\sigma\_{\mathrm{wai}}\$ | 0.70 | \[0.40, 0.95] | WAI identity floor | | | | progress sensitivity | \$\eta\_{\mathrm{prog}}\$ | 0.03 | \[0.00, 0.10] | WAY stall detector | | | | path-distance thresh | \$\mu\_{\mathrm{wdt}}\$ | 0.25 | \[0.05, 1.00] | WDT path jump limit | | | | coupler zeta min | \$\zeta\_{\min}\$ | 0.10 | \[0.00, 0.50] | min progression floor | | | | coupler omega | \$\omega\$ | 1.0 | \[0.1, 2.0] | progression non-linearity | | | | coupler theta cap | \$\theta\_c\$ | 0.75 | \[0.2, 1.5] | \$W\_c\$ clip magnitude | | | | WRI tighten factor | \$\alpha\_{\mathrm{wri}}\$ | 0.60 | \[0.0, 1.5] | adjust \$\tau\_{\mathrm{wri}}\$ by \$ | W\_c | \$ | | WAI scale factor | \$\beta\_{\mathrm{wai}}\$ | 0.60 | \[0.0, 1.5] | scale WAI penalty by \$ | W\_c | \$ | | WDT scale factor | \$\gamma\_{\mathrm{wdt}}\$ | 0.60 | \[0.0, 1.5] | scale \$\mu\_{\mathrm{wdt}}\$ by \$ | W\_c | \$ | | WTF scale factor | \$\gamma\_{\mathrm{wtf}}\$ | 0.60 | \[0.0, 1.5] | tighten thresholds on recovery | | | | WAY pump strength | \$\xi\$ | 0.80 | \[0.0, 1.5] | entropy pump strength | | | | WAY entropy min | \$H\_{\min}\$ | 2.5 nats | \[1.0, 7.0] | entropy lower bound | | | | WAY entropy max | \$H\_{\max}\$ | 5.0 nats | \[3.0, 10.0] | entropy upper bound | | | | anchor bias scale | \$\kappa\_{\mathrm{wri}}\$ | 1.0 | \[0.0, 5.0] | logits bias for anchors | | | | loss weights | \$\lambda\_\*\$ | 0.01 | \[0.0, 1.0] | regularizer weights | | | | step limit | \$T\_{\max}\$ | 7 | int | max Node steps | | | | stop δ threshold | \$\delta\_{\mathrm{stop}}\$ | 0.35 | \[0.1, 0.5] | early stop when \$\delta\_s<\delta\_{\mathrm{stop}}\$ | | | > Tip: start with these defaults, measure, then tune per task class. --- ## Prompt-only runnable example (copy-paste) ``` SYSTEM (paste file): Load the WFGY Core file as engine. Enable Drunk Transformer (WRI,WAI,WAY,WDT,WTF) with defaults: τ_wri=0.60, ρ_wai=0.75, σ_wai=0.70, η_prog=0.03, μ_wdt=0.25, ζ_min=0.10, ω=1.0, θ_c=0.75 SYSTEM (rules, pseudo): * Extract anchors A0 from user prompt. * For each Node step t up to T_max: 1) compute δs, E_res, S_t, R_t, Q_t, W_c 2) WRI: bias anchor logits by κ_wri·L_wri if S_t<τ_wri or (δs↑ & E_res↑) 3) WAI: raise per-head temp for redundant heads until R_t↓ or Q_t↑ 4) WAY: if prog<η_prog, set entropy ≈ H* and propose 1 on-topic candidate 5) WDT: if d_path>μ'_wdt, emit bridge line; else rollback 6) WTF: if collapse vote ≥3, rollback to t*; rerun BBMC→Coupler (tight) 7) Emit Node (Topic | Module | δs | λ-state | Insight) * Stop if δs<δ_stop or t≥T_max USER: Use WFGY to answer: “Explain why tomatoes are classified as fruit, but treated as vegetables in cooking. Provide anchors and cite the smallest missing fact if confused.” ``` --- ## Decoding-hook pseudo-code (python-like; drop into your runtime) ```python def compute_prog(delta_prev, delta_now, zeta_min=0.10, omega=1.0): prog = max(zeta_min, delta_prev - delta_now) return prog ** omega def compute_Wc(B, prog, delta=0.15, cycle_id=0, eps=0.0, theta_c=0.75): alt = (-1)**cycle_id Phi = delta * alt + eps return clip(B * prog + Phi, -theta_c, +theta_c) def bias_anchor_logits(logits, anchor_ids, kappa): for tid in anchor_ids: logits[tid] += kappa return logits def temperature_for_target_entropy(logits, target_H, tol=1e-3, max_iters=5): lo, hi = 0.01, 10.0 for _ in range(max_iters): tau = 0.5*(lo+hi) probs = softmax(logits / tau) H = -sum(p*log(p+1e-12) for p in probs) if H > target_H: lo = tau else: hi = tau if abs(H - target_H) < tol: break return tau def decoding_hook(s): prog = compute_prog(s.delta_prev, s.delta_now, 0.10, 1.0) Wc = compute_Wc(s.B, prog, 0.15, s.cycle_id, 0.0, 0.75) # WRI S_t = jaccard(s.anchors, s.anchors0) L_wri = max(0.0, 0.60 - S_t) if (S_t < 0.60) or (s.delta_now > s.delta_prev and s.E_res > s.E_res_prev): s.logits = bias_anchor_logits(s.logits, s.anchor_token_ids, kappa=1.0 * L_wri) # WAI R_t, Q_t = s.R_t, s.Q_t if R_t > 0.75 and Q_t < 0.70: for h in range(len(s.heads)): if head_redundant(h, s): s.head_temps[h] *= (1 + 0.5 * (R_t - 0.75)) # WAY if prog < 0.03 and not s.has_contradiction: H_star = clamp(s.H0 + 0.8*(0.03 - prog)*(1 + 0.0*abs(Wc)), 2.5, 5.0) tau = temperature_for_target_entropy(s.logits, H_star) apply_temperature(s, tau) s.add_one_on_topic_candidate = True # WDT d_path = l2_distance(s.c_t, s.c_pi) mu_prime = 0.25 * (1 - 0.6 * sigmoid(abs(Wc))) if d_path > mu_prime: return emit_bridge_and_pause(s) # WTF vote = int(s.delta_now > s.delta_prev) + int(s.E_res > s.E_res_prev) + int(s.sign_flip) if vote + s.vote_prev >= 3: t_star = argmin_delta_in_window(s.history_delta, window=3) rollback_to(t_star) rerun_BBMC_and_Coupler(tighten_factor=1 + 0.6*sigmoid(abs(Wc))) return return s.logits ``` --- ## Minimal test / checklist 1. Prepare a QA prompt with clear anchors. 2. **Baseline:** run without DT; log \$\delta\_s, E\_{\mathrm{res}}\$; record correctness. 3. **DT on:** log same metrics plus \$R\_t, Q\_t, W\_c\$ and gates fired. 4. Expect: lower \$\delta\_s\$; fewer off-topic jumps (WRI), stalls resolved (WAY), justified detours (WDT), safe recovery (WTF). 5. Record deltas: accuracy, \$\Delta S\$, rollbacks, bridge lines, gate activations. --- ## Quick engineering notes & troubleshooting * If \$\mathcal{A}\_0\$ (anchors) is empty, WRI becomes a no-op and WAY pumps less entropy; log a warning. * If bridge lines repeat or look vacuous, lower \$\mu\_{\mathrm{wdt}}\$ and raise \$\kappa\_{\mathrm{wri}}\$. * For heavy-hallucination domains, use conservative defaults (higher \$\tau\_{\mathrm{wri}}\$, lower \$\eta\_{\mathrm{prog}}\$). --- ### Footer * **Spec status:** stable draft for engineering evaluation; KaTeX-safe equations. * **Next deliverables:** add `examples/DT-examples/prompt_example.md` and `examples/DT-examples/decoding_hook.py`. * **Compatibility:** prompt-only rules, decoding-hook integration, or optional training regularizers; model-agnostic. * **Attribution:** part of **WFGY Core 2.0** family. Star the repo to follow updates. * Lost? Return to **Starter Village** → `StarterVillage/README.md` --- ### 🧭 Explore More | Module | Description | Link | | ------------------------ | ----------------------------------------------------- | -------------------------------------------------------------------------------------------------- | | WFGY Core | Full symbolic reasoning architecture & math stack | [View →](https://github.com/onestardao/WFGY/tree/main/core/README.md) | | Problem Map 1.0 | 16-mode diagnostic & symbolic fixes | [View →](https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md) | | Problem Map 2.0 | RAG-focused failure tree & recovery pipeline | [View →](https://github.com/onestardao/WFGY/blob/main/ProblemMap/rag-architecture-and-recovery.md) | | Semantic Clinic Index | Prompt injection, memory bugs, drift catalog | [View →](https://github.com/onestardao/WFGY/blob/main/ProblemMap/SemanticClinicIndex.md) | | Semantic Blueprint | Layer-based symbolic reasoning & semantic modulations | [View →](https://github.com/onestardao/WFGY/tree/main/SemanticBlueprint/README.md) | | Benchmark vs GPT-5 | Stress test with the full WFGY reasoning suite | [View →](https://github.com/onestardao/WFGY/tree/main/benchmarks/benchmark-vs-gpt5/README.md) | | 🧙♂️ Starter Village 🏡 | Wizard-led onboarding to WFGY | [Start →](https://github.com/onestardao/WFGY/blob/main/StarterVillage/README.md) | --- > 👑 **Early Stargazers: [See the Hall of Fame](https://github.com/onestardao/WFGY/tree/main/stargazers)** — > Engineers, hackers, and open-source builders who supported WFGY from day one. > **Like it? Star the repo to unlock more.** See the [Unlock Board](https://github.com/onestardao/WFGY/blob/main/STAR_UNLOCKS.md).