Create persona-behavior-checks.md

2026-04-28 03:29:51 +00:00 · 2026-04-01 17:15:04 +08:00 · 2026-04-01 17:15:04 +08:00 · 58252d1c67
commit 58252d1c67
parent 8a30811a3c
1 changed file with 443 additions and 0 deletions
--- a/Avatar/eval/persona-behavior-checks.md
+++ b/Avatar/eval/persona-behavior-checks.md
@@ -0,0 +1,443 @@
+<!--
+AI_NOTE_START
+
+Document role:
+This page explains how to inspect whether an avatar route still behaves like itself.
+
+What this page is for:
+1. Define the main behavioral checks that matter for Avatar routes.
+2. Help users inspect whether a route is strengthening, drifting, or collapsing.
+3. Turn vague impressions into clearer review questions.
+4. Support stronger decisions around tuning, saving, and branching.
+5. Keep the page practical, reusable, and easy to apply across different avatars.
+
+What this page is not:
+1. Not the full research theory of route identity.
+2. Not a universal scoring rubric for all future avatars.
+3. Not a replacement for demos, workflow, or blackfan testing.
+4. Not a claim that every route can be perfectly measured with simple checklists.
+5. Not a guarantee that passing these checks means a route is finished forever.
+
+How to use this page:
+1. Run one route on one or more real tasks.
+2. Use the checks below to review the route honestly.
+3. Look for drift, fake polish, route blur, and loss of reuse.
+4. Decide whether the route should be tuned, saved, discarded, or branched.
+5. Treat this page as a practical inspection surface, not as a final court of truth.
+
+Important boundary:
+These checks are meant to improve clarity and honesty.
+They do not replace judgment.
+They help users notice route quality more clearly, but they do not fully automate taste, strength, or future usefulness.
+
+AI_NOTE_END
+-->
+
+# 🧪 Persona Behavior Checks
+
+This page is for one core question:
+
+**does the avatar route still feel like itself**
+
+That sounds simple, but it matters a lot.
+
+A route can still produce fluent text and already be drifting.
+
+A route can still sound impressive and already be losing its center.
+
+A route can still feel emotional, warm, intelligent, or stylish and still become:
+
+- more generic
+- more theatrical
+- less reusable
+- less grounded
+- less stable
+- less itself
+
+That is why behavior checks matter.
+
+This page gives you a practical way to inspect whether a route is actually holding together.
+
+---
+
+## ✨ Why This Page Exists
+
+Many people judge an avatar too quickly.
+
+They see one nice answer and think:
+
+- this route is strong
+- this route is finished
+- this route is my final version
+
+That is often too fast.
+
+Better questions are:
+
+**what is this route consistently doing**
+
+**what is this route starting to lose**
+
+**what is this route overproducing**
+
+**is this route becoming easier or harder to reuse**
+
+Those questions are more useful.
+
+This page exists to make that kind of review easier.
+
+---
+
+## 🧠 What You Are Really Checking
+
+You are not only checking whether the text is “good.”
+
+You are checking things like:
+
+- route recognizability
+- route blur
+- drift
+- over-polish
+- emotional distortion
+- loss of grounding
+- loss of branch identity
+- false improvement
+- reusability across more than one task
+
+That is a much richer kind of inspection.
+
+It is also much more aligned with what Avatar is trying to build.
+
+---
+
+## 📍 Check 1. Route Recognizability
+
+The first question is simple:
+
+**if I use this route again, does it still feel like the same route**
+
+A recognizable route usually has:
+
+- a stable opening feel
+- a stable level of warmth or sharpness
+- a stable degree of grounding
+- a recognizable pressure pattern
+- some continuity across different tasks
+
+A route becomes less recognizable when:
+
+- every task feels like a different personality
+- the core vibe changes too easily
+- the route feels more like random output than a route
+- the identity depends on only one obvious trick
+
+This is one of the most important checks.
+
+If recognizability is weak, the route is already harder to keep.
+
+---
+
+## 🌫️ Check 2. Generic Drift
+
+A lot of routes become weaker in the same boring way:
+
+they become more generic.
+
+This often looks like:
+
+- safer wording
+- flatter tone
+- less specific presence
+- smoother but emptier rhythm
+- more “default assistant” behavior
+- less route identity
+
+Generic drift is dangerous because it can feel deceptively polished.
+
+The text may still look clean.
+
+But the route may be losing what made it worth using.
+
+Ask:
+
+- is this route still distinct
+- or is it slowly turning into a nicer version of average AI output
+
+That difference matters.
+
+---
+
+## ✨ Check 3. Over-Polish Risk
+
+Some routes become weaker not because they are messy, but because they become too polished.
+
+This often looks like:
+
+- too much smoothness
+- too much slogan energy
+- too much clean closure
+- too much “nice line” behavior
+- less residue
+- less lived texture
+- less unpredictable, human pressure
+
+This can trick people.
+
+Over-polished output often looks “better” at first glance.
+
+But over time, it may become:
+
+- less believable
+- less reusable
+- less grounded
+- less alive
+
+Ask:
+
+- is this route becoming cleaner in a good way
+- or is it becoming polished in a dead way
+
+That is a very important distinction.
+
+---
+
+## 🪨 Check 4. Grounding Strength
+
+A strong route usually feels anchored.
+
+That does not mean it is always concrete.
+
+It means the route does not float away too easily.
+
+A grounded route tends to show:
+
+- clearer object reference
+- stronger practical wording
+- less abstract fog
+- more contact with the actual task
+- less decorative framing before payload
+
+Weak grounding often looks like:
+
+- too much abstraction
+- too much atmosphere before substance
+- too much summary language
+- too much general wisdom without local grip
+
+Ask:
+
+- is this route touching the real task
+- or is it hovering around it elegantly
+
+Grounding matters a lot for reuse.
+
+---
+
+## ❤️ Check 5. Emotional Shape
+
+Routes often drift emotionally long before users clearly notice it.
+
+A route may become:
+
+- too soft
+- too cold
+- too sugary
+- too distant
+- too eager to comfort
+- too eager to impress
+- too flat to feel human
+- too emotionally loud to stay usable
+
+This is one reason emotional shape deserves its own check.
+
+Ask:
+
+- does the warmth feel real
+- does the softness become sugar
+- does the calm become distance
+- does the care become fake intimacy
+- does the force become aggression
+
+You are not looking for “more emotion.”
+
+You are looking for the right emotional shape for the route.
+
+---
+
+## 🗣️ Check 6. Voice Pressure
+
+Every route has some kind of pressure signature.
+
+For example:
+
+- some routes move fast
+- some routes hold back
+- some routes push analysis
+- some routes protect softness
+- some routes hit the point early
+- some routes carry more spoken texture
+- some routes sound more formal
+- some routes sound more public-facing
+
+This is not only about tone.
+
+It is about how the route moves.
+
+Ask:
+
+- does this route still carry its intended pressure
+- or is the force flattening out
+- or becoming exaggerated in the wrong direction
+
+This is especially useful when comparing two close variants.
+
+---
+
+## 🔁 Check 7. Reusability Across Tasks
+
+A route that only works on one lucky prompt is much weaker than it looks.
+
+That is why reusability matters.
+
+A stronger route should survive:
+
+- more than one task
+- more than one subject
+- more than one prompt style
+- more than one opening condition
+
+It does not need to be universally strong.
+
+But it should not collapse immediately outside one narrow setup.
+
+Ask:
+
+- would I trust this route tomorrow
+- would I use it again for a related task
+- is the route strong, or only the example strong
+
+This is one of the best checks for deciding whether a variant deserves to become a saved build.
+
+---
+
+## 🧬 Check 8. Branch Identity
+
+Once you start saving variants, another question appears:
+
+**does this branch actually have its own identity**
+
+A branch should be more than “slightly different.”
+
+A real branch usually has:
+
+- a clearer direction
+- a more legible reason to exist
+- a stronger intended use
+- a recognizable shift from the parent route
+
+Weak branches often feel like:
+
+- accidental edits
+- vague forks
+- tiny changes with no real payoff
+- noise disguised as experimentation
+
+Ask:
+
+- does this branch deserve its own name
+- or is it still only a draft of the parent
+
+This helps keep your build library healthier.
+
+---
+
+## ⚠️ Check 9. False Improvement Risk
+
+This is one of the most important checks.
+
+Sometimes a change feels like an improvement but is not one.
+
+Common false improvements:
+
+- louder but weaker
+- prettier but emptier
+- warmer but more fake
+- sharper but less reusable
+- more dramatic but less grounded
+- more polished but less alive
+
+That is why you should not judge only by your first emotional reaction.
+
+Ask:
+
+- what actually improved
+- what became easier to reuse
+- what became more legible
+- what became less real
+
+False improvement is one of the biggest traps in avatar work.
+
+---
+
+## 📋 A Simple Practical Review Pass
+
+If you want one fast inspection pass, use these questions:
+
+### Route check
+- does it still feel like itself
+
+### Distinctness check
+- is it still different from generic AI output
+
+### Grounding check
+- does it still touch the actual task
+
+### Emotional check
+- is the warmth, calm, force, or softness still in range
+
+### Reuse check
+- would I actually use this route again
+
+### Branch check
+- is this different enough to keep or name
+
+These six questions already catch a lot.
+
+---
+
+## 🧪 Suggested Review Format
+
+Here is a simple way to review a route after a run.
+
+```md
+## Route Review
+
+### Route
+<route name>
+
+### Task
+<what you tested>
+
+### Recognizability
+<strong / medium / weak>
+
+### Generic Drift
+<low / medium / high>
+
+### Over-Polish Risk
+<low / medium / high>
+
+### Grounding
+<strong / medium / weak>
+
+### Emotional Shape
+<in range / drifting / unstable>
+
+### Reusability
+<strong / medium / weak>
+
+### Branch Identity
+<clear / partial / unclear>
+
+### Notes
+<short honest explanation>