vrr/WFGY

Fork 0

mirror of https://github.com/onestardao/WFGY.git synced 2026-04-28 11:40:07 +00:00

PSBigBig + MiniPS b2fdd28017

Update tiny-planner-output-examples-pack-v1.md

2026-03-18 21:36:48 +08:00

16 KiB

Raw Blame History

Tiny Planner Output Examples Pack v1 🧪

The first compact positive reference pack for Atlas Auto Repair planner outputs

Quick links:

If the planner spec explains what good planner behavior should be, this pack shows what that behavior should actually look like in compact output form. 🧭

Its purpose is practical:

show a few compact examples of what good planner output should look like
after Atlas routing is already available

This file does not claim to be a benchmark pack or a complete planner corpus.

It claims something smaller and more useful:

the project now has a first small set of concrete planner output examples
that can be reused for review, onboarding, demos, and future tuning

Quick start 🚀

I want the shortest example reading

Use this path:

read one case summary
inspect the planner output object
read why it is a good example
compare it against the review checklist
use it as a reference, not as a universal script

I want the stronger review path

Use this page together with:

Short version:

keep the family
keep the action set small
name validation clearly
warn about the wrong path
end with a usable next step ✨

1. Why this pack exists

The planner layer already has:

a specification
a prompt contract
an output schema
test notes
a review checklist

But those layers still benefit from visible examples.

Without examples, people may still ask:

what does a good planner output actually look like
how short or detailed should it be
how should action ordering be written
what kind of misrepair warning is strong enough
what kind of next step is considered acceptable

This pack fills that gap.

In short:

this is the first tiny reference pack for planner output style and quality

2. What these examples are meant to show

These examples are intentionally small.

They are meant to show:

how the planner should stay aligned with the routed family
how the planner should propose 1 to 3 actions only
how the planner should name a clear validation target
how the planner should warn about the main misrepair risk
how the planner should choose a reasonable next step

They are not meant to show:

full repair closure
full autonomous execution
benchmark-level diversity
every possible family pattern

They are first reference samples.

3. Examples quick map 🗂️

Example	Main teaching focus
F1 example	grounding-first planning should stay concrete and local
F4 example	workflow-first planning should stay on closure and ordering
F7 example	container-first planning should repair structure before abstract pressure
cautious F5 example	boundary-adjacent cases should narrow scope rather than overreach

This page is the right place when the question is what good planner output should look like in practice, not whether the planner has been comprehensively benchmarked.

4. Pack scope

This v1 pack includes four tiny planner output examples:

one F1 example
one F4 example
one F7 example
one cautious F5 example

These were chosen because they show:

grounding-first planning
execution-first planning
container-first planning
observability-first planning under boundary pressure

This is enough for a strong first sample set.

5. Standard output example format

Each example follows the same shape:

Example ID
Case summary
Planner output object
Why this is a good example
Main teaching point

This keeps the pack compact and reusable.

Example 1 · F1 Planner Output Example

Example ID

TPO_F1_001

Case summary

The case is already routed to F1.

The answer is fluent but grounded in a semantically adjacent source chunk rather than the intended source.

The routed diagnosis indicates:

primary family: F1
broken invariant: evidence-anchor integrity broken
best current fit: Retrieval Anchor Drift
fix surface direction: re-grounding or anchor re-check

Planner output object

{
  "selected_repair_family": "F1",
  "planner_confidence": "high",
  "plan_scope": "minimal",
  "candidate_actions": [
    {
      "action_id": "F1_RG_001",
      "action_title": "Re-ground evidence set"
    },
    {
      "action_id": "F1_AF_001",
      "action_title": "Filter misleading adjacent anchors"
    }
  ],
  "action_ordering": [
    "Try re-grounding first",
    "Then filter semantically adjacent but misleading anchors if needed"
  ],
  "primary_validation_target": "anchor alignment",
  "misrepair_risk": "may over-tighten representation while the real issue remains grounding",
  "recommended_next_step": "validate-first-action"
}

Why this is a good example

This is a good example because:

it stays in F1
it proposes only two local actions
the first validation target is explicit
the misrepair risk is real and relevant
the next step is operational rather than vague

Main teaching point

A good F1 planner output should feel concrete, local, and evidence-aware.

Example 2 · F4 Planner Output Example

Example ID

TPO_F4_001

Case summary

The case is already routed to F4.

A downstream step is executing before upstream readiness is complete.

The routed diagnosis indicates:

primary family: F4
secondary family: F3
broken invariant: deployment liveness closure broken
best current fit: Pre-Readiness Execution Failure
fix surface direction: readiness audit or gate insertion

Planner output object

{
  "selected_repair_family": "F4",
  "planner_confidence": "medium",
  "plan_scope": "constrained",
  "candidate_actions": [
    {
      "action_id": "F4_GT_001",
      "action_title": "Insert readiness gate"
    },
    {
      "action_id": "F4_OC_001",
      "action_title": "Correct downstream ordering"
    }
  ],
  "action_ordering": [
    "Insert readiness gate first",
    "Review downstream ordering only if closure remains weak"
  ],
  "primary_validation_target": "readiness state",
  "misrepair_risk": "may block valid progress if the real issue is continuity rather than closure",
  "recommended_next_step": "validate-first-action",
  "why_not_other_repair_family": "F3 pressure exists, but the first visible break is execution closure rather than continuity loss"
}

Why this is a good example

This is a good example because:

it stays anchored to F4
it acknowledges F3 pressure without drifting into it
it preserves action locality
it defines a validation target that matches the failure
it gives a useful neighbor-family explanation

Main teaching point

A good F4 planner output should repair workflow structure first, not throw more reasoning or instructions at the case.

Example 3 · F7 Planner Output Example

Example ID

TPO_F7_001

Case summary

The case is already routed to F7.

The content is partly correct, but the structured shell is broken and cannot be consumed reliably.

The routed diagnosis indicates:

primary family: F7
broken invariant: container fidelity broken
best current fit: Symbolic Representation Fidelity Failure
fix surface direction: schema tightening or shell correction

Planner output object

{
  "selected_repair_family": "F7",
  "planner_confidence": "medium",
  "plan_scope": "constrained",
  "candidate_actions": [
    {
      "action_id": "F7_SC_001",
      "action_title": "Tighten output schema"
    },
    {
      "action_id": "F7_SH_001",
      "action_title": "Restore shell boundary"
    }
  ],
  "action_ordering": [
    "Tighten the schema shell first",
    "Restore shell boundary only if the output still leaks across fields"
  ],
  "primary_validation_target": "schema validity",
  "misrepair_risk": "may produce a cleaner container while weakening semantic task fit",
  "recommended_next_step": "validate-first-action"
}

Why this is a good example

This is a good example because:

it chooses container repair rather than abstract reasoning pressure
it keeps the action set short
it defines a visible validation target
it includes a realistic warning about false success

Main teaching point

A good F7 planner output should repair the shell first and validate structure without pretending that structure alone guarantees truth.

Example 4 · F5 Cautious Planner Output Example

Example ID

TPO_F5_001

Case summary

The case is already routed to F5 with some neighboring F6 pressure.

The system is hard to inspect, and the failure path remains opaque.

The routed diagnosis indicates:

primary family: F5
secondary family: F6
broken invariant: failure-path visibility broken
best current fit: Failure Path Opacity
fix surface direction: trace exposure or observability insertion

Planner output object

{
  "selected_repair_family": "F5",
  "planner_confidence": "low",
  "plan_scope": "planner-only",
  "candidate_actions": [
    {
      "action_id": "F5_TE_001",
      "action_title": "Insert trace exposure point"
    }
  ],
  "action_ordering": [
    "Expose one narrow hidden transition point first"
  ],
  "primary_validation_target": "failure-path visibility",
  "misrepair_risk": "may add more visible data without improving diagnosability, or may trigger premature boundary intervention",
  "recommended_next_step": "escalate-to-review",
  "secondary_repair_pressure": "F6 boundary pressure remains possible but should not be treated as first repair target yet",
  "why_not_other_repair_family": "the case still lacks sufficient visibility to justify stronger boundary-first intervention",
  "escalation_reason": "routing is usable, but evidence is not strong enough for broader action scope"
}

Why this is a good example

This is a good example because:

it stays cautious
it narrows scope instead of pretending certainty
it proposes only one small action
it explains why F6 is not first
it escalates rather than overreaching

Main teaching point

A good F5 planner output near F6 pressure should become more careful, not more aggressive.

6. What these examples have in common

Across all four examples, good planner behavior shares the same core traits:

family-aligned
small action set
explicit validation target
explicit misrepair warning
realistic scope control
useful next step

These traits matter more than stylistic polish.

7. What bad examples would usually look like

These examples also imply what weak planner output looks like.

Common bad patterns include:

too many actions
vague action titles
no validation target
no misrepair warning
strong confidence under weak evidence
neighbor-family drift without explanation
heroic repair language instead of operational planning

This pack should be used partly as a positive reference set and partly as a negative contrast tool.

8. How to use this pack

This pack can be used in several practical ways.

A. Prompt tuning support

Use these examples to see whether planner output matches the intended v1 style.

B. Review calibration

Use these examples alongside the review checklist so reviewers share the same sense of what “good” looks like.

C. Demo support

Reuse these examples in small planning demos or repair-planning explanations.

D. Future planner regression checks

When planner behavior changes later, these examples can serve as a tiny stability reference set.

9. What this pack does not yet include

Tiny Planner Output Examples Pack v1 does not yet include:

failing output examples in a separate pack
model-to-model output comparison
score labels
benchmark-wide diversity
cross-family chain outputs
automated grading logic

Those can come later.

This pack is intentionally compact.

10. Recommended next step

Once this pack exists, the next useful follow-up is one of these:

create a bad-planner-output-examples-pack-v1.md
create a tiny-semi-auto-demo-spec-v1.md
create a paired note that maps each planner example to a validation example and rollback example

The strongest immediate next step is probably:

create a tiny semi-auto demo spec

That would take the planner layer one step closer to visible repair workflow demos.

11. Next steps ✨

After this page, most readers continue with:

If you want the broader product surface:

12. One-line summary 🌍

Tiny Planner Output Examples Pack v1 provides the first compact examples of what good Atlas Auto Repair planner output should look like across F1, F4, F7, and cautious F5 cases.

16 KiB Raw Blame History

Tiny Planner Output Examples Pack v1 🧪

The first compact positive reference pack for Atlas Auto Repair planner outputs

Quick start 🚀

I want the shortest example reading

I want the stronger review path

1. Why this pack exists

2. What these examples are meant to show

3. Examples quick map 🗂️

4. Pack scope

5. Standard output example format

Example 1 · F1 Planner Output Example

Example ID

Case summary

Planner output object

Why this is a good example

Main teaching point

Example 2 · F4 Planner Output Example

Example ID

Case summary

Planner output object

Why this is a good example

Main teaching point

Example 3 · F7 Planner Output Example

Example ID

Case summary

Planner output object

Why this is a good example

Main teaching point

Example 4 · F5 Cautious Planner Output Example

Example ID

Case summary

Planner output object

Why this is a good example

Main teaching point

6. What these examples have in common

7. What bad examples would usually look like

8. How to use this pack

A. Prompt tuning support

B. Review calibration

C. Demo support

D. Future planner regression checks

9. What this pack does not yet include

10. Recommended next step

11. Next steps ✨

12. One-line summary 🌍

16 KiB

Raw Blame History