vrr/WFGY

Fork 0

mirror of https://github.com/onestardao/WFGY.git synced 2026-04-28 11:40:07 +00:00

PSBigBig + MiniPS d3bc823dfb

Update README.md

2026-03-15 20:52:49 +08:00

Demo 2 · F5 Observability First

Problem Map 3.0 Troubleshooting Atlas

Official flagship demo for diagnosability-first repair

Quick links:

Open in Colab if you want the recommended replay notebook view.
The README and replay artifacts are already enough to understand the teaching pattern.

Replay-only MVP
No API key required

This is the second flagship demo in the official runnable demo pack.

If the AI eval snapshots suggest that better routing may reduce hidden debugging waste, this page goes one step further.

This demo is meant to make one mechanism-level claim visible:

when the real bottleneck is diagnosability,
the first repair move becomes visibility uplift, not premature workflow surgery

It was chosen because many real systems fail in a painful but very common way:

something is clearly going wrong
but the system still does not expose enough structure for a correct first repair

This demo is designed to make that situation visible.

It shows that once a case is routed as F5 Observability & Diagnosability Integrity, the first repair move changes from premature intervention to explicit visibility uplift.

Quick start

If you want the shortest path through this demo, use this order:

read Section 1 · What this demo proves
read Section 3 · Why not F4 first
read Section 5 · First repair move
open replay_outputs.json
open the recommended replay notebook only if you want the notebook presentation in Colab

If you only want the takeaway, this demo teaches one clean lesson:

if the system is still too opaque to diagnose correctly, repair visibility first

1. What this demo proves

This demo proves four things.

A. Some failures are blocked by invisibility before they are blocked by intelligence

A system can fail in a way that tempts people to say:

the workflow is broken
the model is confused
the chain is unstable
the execution layer must be fixed first

Sometimes those statements may later become true.

But the first practical problem is still:

the failure path is not exposed
the trace is too thin
the relevant signals are hidden
the system cannot yet be audited clearly enough

That is an F5-style first cut.

B. Correct routing changes the first repair move

If the case is routed correctly into F5, the first repair move becomes:

observability insertion
trace exposure
logging uplift
diagnostic visibility recovery

This is different from trying to fix workflow logic or model behavior too early.

C. Replay mode is enough to teach the pattern

This demo is intentionally replay-first.

The user should be able to understand the case without live execution.
The point is not “the answer became magically better.”
The point is:

the failure became legible

D. This MVP does not require live mode

For this first release, replay mode is enough.

This demo is about the first correct repair move.
It is not trying to simulate a full observability platform.

2. Family route

Primary family

F5 · Observability & Diagnosability Integrity

Secondary family

F4 · Execution & Contract Integrity

Why F5 is primary

The main failure is that the system still cannot stably expose enough structure to diagnose the problem correctly.

The workflow may indeed be broken.
But the first broken invariant is still:

the failure path cannot yet be seen, traced, or audited well enough

Best current fit

F5_N01 Failure Path Opacity

In stronger warning-oriented variants this may later move toward:

F5_E02 Early Warning Deficit

But the flagship teaching version stays centered on failure-path opacity.

3. Why not F4 first

The main tempting neighboring cut is F4.

That temptation is reasonable because many systems with poor observability also suffer from:

weak execution closure
broken bridges
thin stage boundaries
unreliable liveness

But this demo is not primarily about closure failure yet.

It is primarily about this:

the system still does not expose enough structure for the operator to know what to repair first

That means the first broken layer is diagnosability, not execution closure.

Wrong cut

“This is mainly a workflow execution bug.”

Better cut

“This is a diagnosability-first case with possible execution pressure behind it.”

That distinction matters because the first repair move changes immediately.

If you cut too early to F4, you are likely to start rebuilding the machine before you can even see the broken path.

4. Baseline failure

The baseline case is intentionally simple and realistic.

A workflow produces a bad result.
The operator can observe that:

the result is wrong
the process seems unreliable
something is clearly off

But the system does not yet provide enough of the following:

clear step traces
output lineage
stage-level visibility
branch decisions
failure checkpoints
useful logging context

So the operator is left with the most frustrating state:

I know something is wrong, but I cannot yet see the right thing to repair

This is exactly why the first repair move belongs to F5.

5. First repair move

Once the case is routed to F5, the first repair move should be simple and disciplined.

Recommended first repair stack

observability insertion
Add explicit visibility to the relevant pipeline stages.
trace exposure
Surface the path from input to intermediate state to output.
diagnostic logging uplift
Add the minimum trace needed to tell where the failure shape begins.
failure-surface clarification
Separate what is visible from what is still inferred.
only then consider deeper repair
After visibility improves, reassess whether the downstream repair belongs to F4, F6, F1, or elsewhere.

What should not happen first

Do not start with:

large workflow rewrites
broad prompt overhauls
random retry loops
“just upgrade the model” reactions

Those may create movement, but they do not solve the first problem if the failure path is still hidden.

First repair principle

if the system is still opaque, expose the failure path first

That is the teaching core of this demo.

6. Replay mode

Replay mode is the default reading mode for this MVP.

It requires no API key and no live execution.

Replay mode should show

the workflow or case setup
the baseline result
the thin or missing trace structure
the family route
the why-primary-not-secondary statement
the broken invariant
the first repair move
the improved visibility state
a short note on what newly became visible

What replay mode proves

Replay mode proves that:

diagnosability-first routing is understandable
the repair move is visibly different
the result shift can be measured in legibility, not only in output quality
a demo can teach through structure, not only through live execution

This is enough for the first public MVP.

Core replay artifacts:

7. Current notebook versions

This folder currently preserves two replay notebook versions.

Recommended version

demo_02_f5_observability_first_replay_v2.ipynb

This is the current recommended notebook for public reading, Colab use, and MVP presentation.

It keeps the replay-first design, while adding:

clearer route summary
clearer before / after contrast
stronger teaching framing
better aligned shared-layer presentation

Original version

demo_02_f5_observability_first_replay.ipynb

This version is intentionally retained.

It remains part of the demo history and preserves the earlier first-pass MVP presentation.

Version rule

The rule for this folder is simple:

v1 is preserved
v2 is the recommended replay notebook
replay-first remains the official design center for Demo 2

8. Files in this folder

Core files

File roles

input_case.json

Contains the workflow case, visible artifacts, and missing-diagnosability context.

replay_outputs.json

Contains the baseline state, route explanation, first repair move, and improved visibility state.

expected_output.json

Contains the clean target structure for what the demo is meant to make visible.

demo_02_f5_observability_first_replay.ipynb

The original first-pass replay notebook retained for continuity.

demo_02_f5_observability_first_replay_v2.ipynb

The current recommended replay notebook for public reading and Colab use.

9. Expected outcome

If the demo works, the user should walk away with the following understanding:

the workflow was failing in a way that was initially hard to diagnose
the atlas routed the case to F5, not directly to F4
the first repair move was observability insertion, not workflow redesign
after visibility improved, the real shape of the failure became more legible

The demo does not need to solve everything.

It only needs to make the first correct move visibly different.

That is enough.

10. Limits of this demo

This demo has real limits, and those limits should be stated clearly.

It does not prove

that every hidden failure belongs to F5
that all workflow bugs are really diagnosability issues
that logging alone solves structural failure
that visibility replaces deeper repair
that this single example covers all observability systems

It does prove

that some failures should be repaired through observability first
that route-first repair changes what the operator does next
that diagnosability can itself be a first-class repair target

These are already strong claims.
There is no need to inflate them.

11. Why this demo matters

This demo teaches one clean lesson: