WFGY/ProblemMap/Atlas/tiny-planner-output-examples-pack-v1.md
2026-03-19 23:49:54 +08:00

Tiny Planner Output Examples Pack v1 🧩

Example pack for small, valid repair-planner outputs inside Auto Repair v1

If the planner spec defines what the planner is allowed to produce, this page shows what that output actually looks like in small, usable examples. 🧭

This file is here for a very practical reason:

many systems sound clean at the spec level
but become vague, oversized, or fake-confident when they actually produce output

This page helps prevent that.

It makes the planner layer more concrete by showing tiny outputs that are:

  • structured
  • modest
  • validation-aware
  • rollback-aware
  • clearly first-move oriented

Short version:

diagnose first
plan one controlled move
define validation
keep rollback visible
escalate only when needed

That is the spirit of this pack.


Quick start 🚀

I am new to the planner layer

Use this path:

  1. read Repair Planner Spec v1
  2. read Repair Planner Prompt v1
  3. read this page
  4. compare the examples against Repair Plan Schema v1
  5. check them against Planner Review Checklist v1

I already know the planner and want the shortest route

Start here:

  1. read Example 1, Example 2, and Example 3
  2. inspect the repair_move, validation_check, and rollback_trigger
  3. compare compact planner behavior across F1, F4, and F5 style cases
  4. use the review checklist if you want to judge output quality

Shortest possible reading:

good planner output is small
tied to the diagnosed break
explicit about validation
explicit about rollback
and never pretends to solve the whole case in one jump


What this page is teaching 🎯

This example pack does more than show JSON-like planner shapes.

It is teaching five deeper habits:

  1. repair must stay route-sensitive
    the move should follow the diagnosed failure family

  2. the first move must stay small
    the planner should not explode into a giant repair screenplay

  3. validation is part of the output
    a planner output without validation is incomplete

  4. rollback must stay visible
    small repair still needs a safe exit

  5. escalation must remain honest
    some cases should not pretend to be locally solved

That is why tiny examples matter.

Tiny examples reveal whether the planner really understands restraint.


What good tiny planner output should look like

A good tiny planner output usually has these properties:

  • one clear first move
  • one clear target invariant
  • one clear validation check
  • one clear rollback trigger
  • one clear escalation condition when needed

It should feel like:

the smallest controlled step that is still worth trying

It should not feel like:

  • a whole migration plan
  • a long repair essay
  • a speculative redesign
  • a fake “AI solved everything” story

Canonical tiny output shape

The exact schema lives elsewhere, but most good tiny planner outputs here follow a compact shape like this:

{
  "case_id": "example_name",
  "diagnosed_family": "F4",
  "broken_invariant": "execution_skeleton_closure_broken",
  "repair_move": "insert readiness gate before dependent action",
  "why_this_move_first": "current failure appears before downstream retries and before broader redesign is justified",
  "validation_check": "verify dependent action is never executed before readiness condition is true",
  "rollback_trigger": "if readiness gate blocks valid execution path or creates new deadlock symptoms",
  "escalate_if": "ordering becomes stable but closure still fails for a deeper reason"
}

This is not the only acceptable form.

But this is the right scale.
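As a rough illustration of checking a candidate output against this compact shape, here is a minimal Python sketch. The helper name `missing_fields` is hypothetical, and treating every field in the shape as required is an assumption made for the sketch; the actual rules belong to Repair Plan Schema v1.

```python
import json

# Field names come from the compact shape above. Treating every one of them
# as required is an assumption of this sketch; Repair Plan Schema v1 is the
# authority on what is actually mandatory.
EXPECTED_FIELDS = (
    "case_id",
    "diagnosed_family",
    "broken_invariant",
    "repair_move",
    "why_this_move_first",
    "validation_check",
    "rollback_trigger",
    "escalate_if",
)


def missing_fields(raw_output: str) -> list[str]:
    """Return expected fields that are absent or blank in a planner output."""
    data = json.loads(raw_output)
    if not isinstance(data, dict):
        # Anything other than a JSON object carries none of the fields.
        return list(EXPECTED_FIELDS)
    return [
        name
        for name in EXPECTED_FIELDS
        if not isinstance(data.get(name), str) or not data[name].strip()
    ]


canonical = json.dumps({
    "case_id": "example_name",
    "diagnosed_family": "F4",
    "broken_invariant": "execution_skeleton_closure_broken",
    "repair_move": "insert readiness gate before dependent action",
    "why_this_move_first": "current failure appears before downstream retries",
    "validation_check": "verify dependent action is never executed before readiness condition is true",
    "rollback_trigger": "if readiness gate blocks valid execution path",
    "escalate_if": "ordering becomes stable but closure still fails for a deeper reason",
})

print(missing_fields(canonical))  # an empty list means nothing is missing
```

A check like this only confirms that the fields are present. It says nothing about whether the move is small enough or matches the diagnosed family, which still needs human or checklist review.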


Example pack 📦

Example 1 · F1 grounding-first case

Case summary

The answer cites retrieved material, but the selected chunks do not actually support the final claim.

Tiny planner output

{
  "case_id": "f1_grounding_recheck",
  "diagnosed_family": "F1",
  "broken_invariant": "anchor_to_claim_coupling_broken",
  "repair_move": "force claim-to-evidence trace before final answer emission",
  "why_this_move_first": "the main break is grounding integrity, so re-linking claims to supporting evidence is the smallest meaningful first repair",
  "validation_check": "sample outputs must show that every major claim can be traced to the cited chunk set",
  "rollback_trigger": "if trace insertion reduces answer quality without improving grounding correctness",
  "escalate_if": "evidence trace is present but semantic target mismatch still dominates"
}

Why this is a good tiny output

This output is good because it does not jump into full retrieval redesign.

It stays focused on the first useful controlled move:

  • restore traceability between claim and evidence
  • validate whether grounding actually improved
  • keep rollback visible if the added constraint harms behavior without fixing the real failure
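One way the validation check in this example could be approximated is sketched below. The `Claim` record and the `untraced_claims` helper are hypothetical names introduced for illustration, and the sketch assumes each major claim already carries the ids of the chunks it cites.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """Hypothetical record: one major claim plus the chunk ids it cites."""
    text: str
    cited_chunk_ids: list[str] = field(default_factory=list)


def untraced_claims(claims: list[Claim], retrieved_chunk_ids: set[str]) -> list[Claim]:
    """Return claims that cite nothing, or cite chunks outside the retrieved set."""
    return [
        claim
        for claim in claims
        if not claim.cited_chunk_ids
        or not set(claim.cited_chunk_ids) <= retrieved_chunk_ids
    ]


sample = [
    Claim("latency dropped after the cache change", ["chunk_2"]),
    Claim("the change is safe for all tenants", []),
]
for claim in untraced_claims(sample, {"chunk_1", "chunk_2"}):
    print("untraced:", claim.text)
```

This only verifies that a trace exists and points into the retrieved set. Whether the cited chunk actually supports the claim is a semantic judgment the sketch does not attempt, which is the situation the `escalate_if` field is written for.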

Example 2 · F4 readiness / ordering case

Case summary

A downstream action runs before the required resource is ready, causing repeated workflow breakage.

Tiny planner output

{
  "case_id": "f4_readiness_gate",
  "diagnosed_family": "F4",
  "broken_invariant": "execution_skeleton_closure_broken",
  "repair_move": "insert readiness gate before the dependent action",
  "why_this_move_first": "the visible failure is premature execution, so the first move should repair ordering before deeper architecture changes",
  "validation_check": "dependent action must not fire while readiness state is false",
  "rollback_trigger": "if the new gate blocks valid runs or introduces deadlock-like waiting behavior",
  "escalate_if": "ordering becomes correct but bridge closure still fails later in the path"
}

Why this is a good tiny output

This output is good because it respects execution-first logic.

It does not drift into:

  • memory blame
  • observability essay
  • total workflow redesign

It chooses the smallest move that matches the break:

  • enforce readiness first
  • validate ordering
  • watch for deadlock side effects
  • escalate only if closure still fails after the visible ordering problem is fixed
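A readiness gate of this kind might look like the following minimal sketch. The names `run_behind_gate` and `GateTimeout` are hypothetical, and the bounded number of checks is an assumption added so that deadlock-like waiting surfaces as an error instead of hanging, which is the behavior the rollback trigger watches for.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class GateTimeout(RuntimeError):
    """Raised when the resource never becomes ready within the check budget."""


def run_behind_gate(
    is_ready: Callable[[], bool],
    action: Callable[[], T],
    max_checks: int = 5,
    wait_seconds: float = 0.0,
) -> T:
    """Run the dependent action only after the readiness check returns True."""
    for _ in range(max_checks):
        if is_ready():
            return action()
        time.sleep(wait_seconds)
    # The action was never called; the caller decides whether to roll back.
    raise GateTimeout(f"resource not ready after {max_checks} checks")


# Hypothetical usage: the resource becomes ready on the third poll.
polls = iter([False, False, True])
print(run_behind_gate(lambda: next(polls), lambda: "dependent action ran"))
```

The dependent action is structurally unreachable while the readiness state is false, so the validation check holds by construction for any call routed through the gate.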

Example 3 · F5 observability-first case

Case summary

A workflow fails intermittently, but logs do not expose where the failure path actually breaks.

Tiny planner output

{
  "case_id": "f5_trace_exposure",
  "diagnosed_family": "F5",
  "broken_invariant": "failure_path_visibility_broken",
  "repair_move": "add step-level trace markers at each transition boundary",
  "why_this_move_first": "the system is too dark to justify deeper repair, so visibility uplift is the first legitimate move",
  "validation_check": "failed runs must expose the last successful transition and the first missing or broken transition",
  "rollback_trigger": "if trace insertion creates excessive noise without improving path localization",
  "escalate_if": "the failure path becomes visible and a deeper F4 or F6 break becomes primary"
}

Why this is a good tiny output

This output is good because it refuses premature deep repair.

It recognizes that the right first move is:

  • expose the path
  • improve visibility
  • then let later routing decide whether the true deeper family is F4, F6, or something else

That is exactly the kind of planner restraint Auto Repair v1 is supposed to preserve.
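To make the validation check concrete, here is a small sketch of step-level trace markers. The function name `run_with_markers` and the marker format are assumptions for illustration; a real workflow would emit these through its own logging layer.

```python
from typing import Callable, Optional


def run_with_markers(steps: list[tuple[str, Callable[[], None]]]) -> dict:
    """Run named steps in order and record a marker at each transition boundary."""
    markers: list[str] = []
    last_ok: Optional[str] = None
    for name, step in steps:
        markers.append(f"enter:{name}")
        try:
            step()
        except Exception as exc:
            # Stop at the first broken transition and report where it happened.
            markers.append(f"fail:{name}:{type(exc).__name__}")
            return {"markers": markers, "last_ok": last_ok, "first_broken": name}
        markers.append(f"exit:{name}")
        last_ok = name
    return {"markers": markers, "last_ok": last_ok, "first_broken": None}


def broken_transform() -> None:
    raise ValueError("unexpected payload")


report = run_with_markers([
    ("load", lambda: None),
    ("transform", broken_transform),
    ("store", lambda: None),
])
print(report["last_ok"], report["first_broken"])
```

A failed run then exposes both the last successful transition and the first broken one, without claiming anything about why the step failed.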


Example 4 · F7 representation-first case

Case summary

The model output looks superficially valid, but required fields appear in unstable locations and break downstream schema parsing.

Tiny planner output

{
  "case_id": "f7_schema_stabilization",
  "diagnosed_family": "F7",
  "broken_invariant": "representation_container_fidelity_broken",
  "repair_move": "enforce one stable schema shell before field population",
  "why_this_move_first": "the first visible failure is container instability, so carrier repair should come before reasoning or prompt expansion",
  "validation_check": "multiple runs must preserve identical required-field placement and parse successfully",
  "rollback_trigger": "if schema stabilization reduces flexibility but does not improve parse reliability",
  "escalate_if": "container becomes stable but semantic content remains mismatched or incomplete"
}

Why this is a good tiny output

This output is good because it does not confuse structural shell failure with deeper reasoning failure.

The first move is carrier stabilization, which is exactly what a route-sensitive planner should produce here.
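The validation check for this case could be approximated with a sketch like the one below. The helpers `field_paths` and `placement_is_stable` are hypothetical, and the sketch assumes outputs are JSON objects whose required fields live in nested objects, not inside arrays.

```python
import json


def field_paths(node, prefix: str = "") -> dict:
    """Map each key name to the set of dotted paths where it appears."""
    paths: dict = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.setdefault(key, set()).add(path)
            for name, found in field_paths(value, path).items():
                paths.setdefault(name, set()).update(found)
    return paths


def placement_is_stable(raw_runs: list[str], required: list[str]) -> bool:
    """True if every run parses and places each required field identically."""
    placements = []
    for raw in raw_runs:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            return False
        paths = field_paths(parsed)
        if any(name not in paths for name in required):
            return False
        placements.append({name: frozenset(paths[name]) for name in required})
    return all(p == placements[0] for p in placements)


runs = [
    '{"id": 1, "result": {"status": "ok"}}',
    '{"id": 2, "status": "ok", "result": {}}',  # "status" moved to the top level
]
print(placement_is_stable(runs, ["id", "status"]))
```

The check compares only where fields sit, not what they contain, so it stays on the container side of the container-versus-content boundary that the `escalate_if` field draws.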


Example 5 · F3 continuity-first case

Case summary

A multi-step agent loses role ownership and continuity across handoff, causing later actions to be applied under the wrong context.

Tiny planner output

{
  "case_id": "f3_role_reisolation",
  "diagnosed_family": "F3",
  "broken_invariant": "state_continuity_broken",
  "repair_move": "re-isolate role and ownership state at each handoff boundary",
  "why_this_move_first": "the earliest decisive break is continuity drift across handoff, so role fencing is the smallest valid first move",
  "validation_check": "handoff state must preserve the same role and ownership identity across the tested sequence",
  "rollback_trigger": "if isolation fragments valid shared context and causes new task discontinuity",
  "escalate_if": "continuity stabilizes but execution closure still fails in later stages"
}

Why this is a good tiny output

This output is good because it distinguishes continuity repair from execution repair.

It avoids the common failure of treating every multi-step failure as if it were immediately an F4 closure issue.
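As an illustration of the validation check, the sketch below scans a tested handoff sequence for the first point where role or ownership identity changes. The function name `first_continuity_break` and the `role` and `owner` keys are assumptions; the real state shape depends on the agent framework in use.

```python
from typing import Optional


def first_continuity_break(
    handoffs: list[dict], keys: tuple = ("role", "owner")
) -> Optional[int]:
    """Return the index of the first handoff whose identity differs from the start."""
    if not handoffs:
        return None
    baseline = {key: handoffs[0].get(key) for key in keys}
    for index, state in enumerate(handoffs[1:], start=1):
        if any(state.get(key) != baseline[key] for key in keys):
            return index
    return None


sequence = [
    {"role": "planner", "owner": "agent_a"},
    {"role": "planner", "owner": "agent_a"},
    {"role": "executor", "owner": "agent_a"},  # role drifted at this handoff
]
print(first_continuity_break(sequence))
```

Reporting the first break, not every later mismatch, keeps the result aligned with the "earliest decisive break" reasoning in `why_this_move_first`.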


Example 6 · early escalation case

Case summary

A case shows mixed pressure from more than one possible failure family. The first visible local repair may help, but there is a real risk that deeper structural continuation will still be needed.

Tiny planner output

{
  "case_id": "mixed_case_escalation_ready",
  "diagnosed_family": "F5",
  "broken_invariant": "failure_path_visibility_broken",
  "repair_move": "expose the local failure path before deeper structural intervention",
  "why_this_move_first": "the current evidence is still too thin for a larger repair claim, so visibility is the safest honest first move",
  "validation_check": "local path exposure must clarify whether the deeper break is F4, F6, or unresolved",
  "rollback_trigger": "if local tracing adds cost without clarifying the underlying break",
  "escalate_if": "local visibility improves but the structural failure remains unresolved or keeps recurring"
}

Why this is a good tiny output

This output is good because it leaves room for escalation without being lazy.

It does not say either of these:

  • “we do not know anything”
  • “just escalate immediately”

Instead it says:

  • do one useful local move first
  • validate whether that move reveals the true deeper structure
  • escalate only if the local move is insufficient

That is mature planner behavior.


Contrast examples 🚧

These are not good outputs. They are here to make the planner standard even clearer.


Bad example 1 · too large

{
  "repair_move": "redesign retrieval, prompt architecture, memory policy, schema layer, and workflow orchestration"
}

Why this is bad

This is not a first move.

It is a vague mini-roadmap pretending to be a planner output.

Tiny planner output should stay small enough that validation and rollback remain believable.


Bad example 2 · no validation

{
  "repair_move": "insert readiness gate",
  "why_this_move_first": "ordering seems wrong"
}

Why this is bad

This is incomplete.

A valid planner output must state explicitly:

  • how the move will be checked
  • when it should be rolled back
  • when deeper escalation becomes justified

Without that, it is just a suggestion, not a controlled plan.


Bad example 3 · fake closure

{
  "repair_move": "fix the entire bug",
  "confidence": "high"
}

Why this is bad

This is planner theater.

A first planner layer should not pretend it has solved full closure from one compact output.

That is exactly the kind of overclaim this package is trying to avoid.
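The three contrast examples can be caught by very crude heuristics, sketched below. The function `review_flags`, the flag names, and the thresholds are all hypothetical; they are not part of Planner Review Checklist v1 and would miss subtler cases.

```python
CONTROL_FIELDS = ("validation_check", "rollback_trigger", "escalate_if")


def review_flags(output: dict) -> list[str]:
    """Return heuristic warning flags for a candidate planner output."""
    flags = []
    move = str(output.get("repair_move", ""))
    # Assumption: a move that lists several comma-separated targets is
    # probably a roadmap, not a single first move.
    if move.count(",") >= 2:
        flags.append("too_large")
    if any(not output.get(name) for name in CONTROL_FIELDS):
        flags.append("missing_control_fields")
    # Assumption: a bare confidence claim signals overclaiming.
    if "confidence" in output:
        flags.append("overclaim")
    return flags


bad_examples = [
    {"repair_move": "redesign retrieval, prompt architecture, memory policy, "
                    "schema layer, and workflow orchestration"},
    {"repair_move": "insert readiness gate",
     "why_this_move_first": "ordering seems wrong"},
    {"repair_move": "fix the entire bug", "confidence": "high"},
]
for candidate in bad_examples:
    print(review_flags(candidate))
```

Heuristics like these are best treated as a first filter before a human review, since a large or overconfident plan can easily be phrased to slip past them.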


What these examples collectively teach 📚

These examples collectively teach six important things.

1. Tiny does not mean shallow

A small planner output can still be structurally serious if it includes:

  • correct first move
  • validation
  • rollback
  • escalation boundary

2. The planner must stay route-sensitive

A good output is different depending on whether the diagnosed break is:

  • F1 grounding
  • F3 continuity
  • F4 closure
  • F5 visibility
  • F7 representation

That difference is the whole point.
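Route sensitivity can be pictured as a lookup from diagnosed family to first move. The table below simply restates the `repair_move` values from the examples on this page; the `first_move_for` helper is hypothetical, and a real planner would reason from the case instead of reading a static table.

```python
from typing import Optional

# First moves copied from the examples on this page, keyed by diagnosed family.
FIRST_MOVES = {
    "F1": "force claim-to-evidence trace before final answer emission",
    "F3": "re-isolate role and ownership state at each handoff boundary",
    "F4": "insert readiness gate before the dependent action",
    "F5": "add step-level trace markers at each transition boundary",
    "F7": "enforce one stable schema shell before field population",
}


def first_move_for(family: str) -> Optional[str]:
    """Return the example first move for a family, or None if none is shown here."""
    return FIRST_MOVES.get(family)


print(first_move_for("F4"))
print(first_move_for("F6"))  # no F6 example on this page, so None
```

Returning `None` for an unlisted family, without falling back to a default move, mirrors the page's point that the planner should not guess when the route is unclear.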

3. Validation is part of the plan

If the planner output does not say how improvement will be checked, it is not ready.

4. Rollback is part of safety

Even a good first move may still fail locally.

That possibility must remain visible.

5. Escalation is not failure

Sometimes the right tiny output is the one that cleanly prepares the case for deeper continuation.

6. Restraint is a real capability

A planner that stays modest is usually more useful than one that pretends to be omniscient.


How to use this page in practice 🧠

This page is useful in at least four ways.

A. As a planner calibration page

Use these examples to compare the actual planner output against the intended output shape.

B. As a review page

Use this file together with Planner Review Checklist v1 when checking whether a candidate output is too vague, too large, or too confident.

C. As a prompt-teaching page

Use these examples to teach an AI system what “small but real” planner behavior looks like.

D. As a demo support page

These examples are small enough to be shown in a README, slide, or tiny demo pack without overwhelming the reader.


Relationship to the rest of Auto Repair v1 🔗

This file should be read in the correct layer order.

Structure first

Example shape next

  • this file

Review and control after that

This order matters because examples are strongest when the structure behind them is already clear.


Current status 📌

Tiny Planner Output Examples Pack v1 is now established as the first compact example layer for what a valid small repair-planner output should look like inside Auto Repair v1.

It does not prove full autonomous repair.

It does show that the planner layer can be presented in a compact, teachable, and structurally disciplined form.


Next steps

After this page, most readers continue with:

  1. Open Planner Review Checklist v1
  2. Open Planner Test Note v1
  3. Open Repair Validation Loop v1
  4. Open Rollback Policy v1


One-line status 🌍

This file provides the first compact example pack for what small, valid, validation-aware repair-planner outputs should look like in Auto Repair v1.


Closing note

A planner becomes more trustworthy when it can show its behavior in small form.

Not in giant architecture talk. Not in fake autonomous drama. In small form.

That is what this page is for.

It shows that the planner layer can already produce outputs that are:

  • small
  • controlled
  • reviewable
  • validation-aware
  • rollback-aware
  • escalation-ready