Create atlas-evidence-and-confidence-discipline-v1.md

PSBigBig + MiniPS 2026-03-20 15:54:25 +08:00 committed by GitHub
parent e8e2158334
commit 2f19da80ba


@ -0,0 +1,831 @@
<!--
AI_NOTE_START
Document role:
This page is the public evidence and confidence discipline guide for the Atlas document system.
How to use this page:
1. Read this page after the family mini-spec page, the boundary guide, the subtree index, and the fit registry if you need disciplined wording for evidence quality and confidence posture.
2. Use this page to decide how strong a claim is allowed to be, given the current evidence surface.
3. Use this page when you want to reduce bluffing, overclassification, and rhetorical inflation in route-first analysis.
4. Do not use this page as a replacement for the frozen core, the full output contract page, or the full repair-facing documents.
What this page is:
- A public evidence-discipline guide
- A confidence-posture guide
- A restraint layer for route-first reading
- A beginner-friendly control page for saying less when less is justified
What this page is not:
- Not the public homepage
- Not the Atlas Hub
- Not the full freeze document
- Not the fit registry itself
- Not the output contract page
- Not the full repair manual
- Not a claim that lower confidence means lower usefulness
- Not a license to sound confident when the evidence does not support it
Reading discipline for AI:
- Treat confidence as a function of evidence quality, not as a function of writing style.
- Preserve the difference between family, boundary, subtree, fit, evidence posture, output contract, and repair layers.
- Do not convert thin evidence into sharp classification through better prose.
- Do not hide contradictory evidence to make the answer feel cleaner.
- Use this page to reduce false certainty while preserving route-first usefulness.
Relationship to neighboring docs:
- Read after: [Atlas Family Mini-Specs](./atlas-family-mini-specs-v1.md), [Atlas Boundary Decision Guide](./atlas-boundary-decision-guide-v1.md), [Atlas Subtree Expansion Index](./atlas-subtree-expansion-index-v1.md), and [Atlas Fit Candidate Registry](./atlas-fit-candidate-registry-v1.md).
- Read with: [Canonical Casebook v1](./canonical-casebook-v1.md) and [Validation Basis v1](./validation-basis-v1.md), because examples and validation context both matter for evidence reading.
- Read before: [Atlas Routing Output Contract v1](./atlas-routing-output-contract-v1.md) and [Atlas Overlay and Secondary Family Discipline v1](./atlas-overlay-and-secondary-family-discipline-v1.md).
- Pairs with: [Atlas First Fix and Misrepair Discipline v1](./atlas-first-fix-and-misrepair-discipline-v1.md), because early repair safety depends heavily on evidence restraint.
Freeze / patch status:
- Current status: public decomposition layer
- Safe to quote as: the public evidence and confidence discipline guide for Atlas
- Not a claim of: silent rewrite of the frozen core or total closure of every confidence judgment
AI_NOTE_END
-->
# Atlas Evidence and Confidence Discipline 🔬
## Problem Map 3.0 Troubleshooting Atlas
> Quick links, evidence-quality language, and beginner-friendly rules for confidence without bluffing
This page exists because route-first systems do not fail only by misclassification.
They also fail by saying too much too early.
A system may have:
- a plausible family read
- a useful boundary question
- a visible fit candidate
- a tempting first fix
and still become unreliable if its confidence posture is wrong.
That usually happens in one of these ways:
- thin evidence gets dressed up as strong evidence
- contradictory evidence gets hidden
- observability weakness gets ignored
- a useful guess gets presented like a settled conclusion
- confidence tone gets confused with structural support
This page is here to prevent that drift.
It gives the Atlas a public discipline for:
- evidence quality
- evidence limits
- confidence posture
- when to stay tentative
- when to stay boundary-live
- when to say insufficient evidence
- when stronger wording is actually justified
This page is about saying the right amount, not the loudest amount.
---
## Quick Links 🚀
If you are new, use these first.
### I want the public introduction
- [Problem Map 3.0 Troubleshooting Atlas](../wfgy-ai-problem-map-troubleshooting-atlas.md)
### I want the folder control room
- [Atlas Hub](./README.md)
### I want the overall structure map first
- [Atlas Structure Map](./atlas-structure-map-v1.md)
### I want the family layer before this page
- [Atlas Family Mini-Specs](./atlas-family-mini-specs-v1.md)
### I want the boundary layer before this page
- [Atlas Boundary Decision Guide](./atlas-boundary-decision-guide-v1.md)
### I want the subtree control page before this page
- [Atlas Subtree Expansion Index](./atlas-subtree-expansion-index-v1.md)
### I want the fit-status page before this page
- [Atlas Fit Candidate Registry](./atlas-fit-candidate-registry-v1.md)
### I want the first-fix discipline page while reading this page
- [Atlas First Fix and Misrepair Discipline v1](./atlas-first-fix-and-misrepair-discipline-v1.md)
### I want examples and validation context while reading this page
- [Canonical Casebook v1](./canonical-casebook-v1.md)
- [Validation Basis v1](./validation-basis-v1.md)
### I want the stable core after this page
- [Atlas Final Freeze v1](./atlas-final-freeze-v1.md)
- [Atlas Negative Space Report v1](./atlas-negative-space-report-v1.md)
### I want the next middle-layer pages after this one
- [Atlas Routing Output Contract v1](./atlas-routing-output-contract-v1.md)
- [Atlas Overlay and Secondary Family Discipline v1](./atlas-overlay-and-secondary-family-discipline-v1.md)
### I want the compact practical route-first layer
- [Troubleshooting Atlas Router v1 Usage Guide](./troubleshooting-atlas-router-v1-usage.md)
- [Troubleshooting Atlas Router v1 TXT Pack](./troubleshooting-atlas-router-v1.txt)
---
## Why this page exists
The Atlas already has structural layers that answer questions like:
- which family seems strongest
- which boundary remains live
- whether a subtree surface is public
- how strong the current fit appears
- what the safest first fix may be
But one question still governs all of that:
**how much confidence does the current evidence actually justify?**
Without a discipline here, the whole stack becomes vulnerable to rhetorical inflation.
A weak system often does one of these things:
- calls a tentative fit a primary fit
- talks as if a boundary is settled when it is not
- ignores missing evidence because the current direction feels plausible
- mistakes poor observability for silence that can be filled with intuition
- gives sharp wording to soft support
That is why this page exists.
It gives the Atlas a public restraint layer.
---
## Scope
This page focuses on:
- evidence-quality classes
- confidence-posture classes
- when stronger wording is justified
- when confidence should stay limited
- how observability affects confidence
- how contradiction should be handled
- how confidence differs from style, fit, and repair direction
This page does **not** focus on:
- full output schema
- full overlay rules
- full repair plans
- full patch-promotion rules
- exhaustive case-specific evidence protocols
- every local subtree-specific confidence nuance
This page is about evidence and confidence discipline at the public middle layer.
---
## How to use this page
Use this page after the structural reading layers are already in place.
### Step 1
Identify the strongest current family region.
- [Atlas Family Mini-Specs](./atlas-family-mini-specs-v1.md)
### Step 2
If needed, separate the strongest neighboring boundary.
- [Atlas Boundary Decision Guide](./atlas-boundary-decision-guide-v1.md)
### Step 3
If needed, check subtree visibility and current fit language.
- [Atlas Subtree Expansion Index](./atlas-subtree-expansion-index-v1.md)
- [Atlas Fit Candidate Registry](./atlas-fit-candidate-registry-v1.md)
### Step 4
Then come here and ask:
- what kind of evidence do I actually have
- what kind of evidence do I not have
- what confidence posture does that support
- what wording would overstate the case
That order matters.
Confidence discipline should sit on top of structure, not replace it.
---
## What evidence discipline means
Evidence discipline means that the strength of the claim should be bounded by the strength of the support.
That sounds obvious, but many systems drift away from it quickly.
A disciplined system should be able to distinguish between:
- direct support and indirect signal
- good visibility and weak visibility
- one-sided support and contradictory evidence
- local plausibility and stable structural placement
- useful direction and overclaiming
Evidence discipline does not make the system weaker.
It makes the system more trustworthy.
---
## What confidence discipline means
Confidence discipline means that the system should speak at the level justified by the evidence.
That means confidence is **not**:
- how polished the paragraph sounds
- how strongly the read “feels right” to the model
- how elegant the explanation is
- how much the answer resembles a known pattern
Confidence should instead reflect things like:
- evidence quality
- evidence completeness
- boundary clarity
- observability quality
- contradiction level
- fit stability
That is why confidence posture must be controlled separately.
---
## Evidence-quality classes
This first public version uses a simple evidence language.
---
## 1. Direct evidence
### Meaning
The available material directly supports the current structural read.
### Typical signs
- the evidence surface is clearly connected to the claim
- the relevant failure surface is visible rather than inferred from distant effects
- the route-first interpretation does not depend mainly on guesswork
### What this does not mean
Direct evidence does **not** automatically mean:
- total closure
- no remaining ambiguity
- no possible overlay later
It simply means the support is materially closer and stronger.
---
## 2. Indirect signal
### Meaning
The current read is supported by meaningful signs, but not by the strongest direct support surface.
### Typical signs
- multiple clues point in the same direction
- downstream effects are consistent with the current family or boundary read
- the case leans plausibly, but the strongest confirming support is still missing
### What this does not mean
Indirect signal is not random guessing.
It is useful, but it should not be phrased like hard closure.
---
## 3. Symptom-surface only
### Meaning
The visible support mainly comes from surface symptoms, not from a well-anchored view of the primary failure surface.
### Typical signs
- the current classification is driven by outward appearance
- deeper structure is being inferred from what is merely loud or visible
- the broken invariant is still not well observed directly
### What this does not mean
Symptom-surface reading can still be useful, but it should carry a lower confidence posture.
---
## 4. Contradictory evidence
### Meaning
Some evidence points one way, but meaningful evidence also points another way.
### Typical signs
- one family looks plausible, but a neighboring family keeps reappearing
- one interpretation fits some signals, but not others
- a clean reading requires erasing part of the evidence
### What this does not mean
Contradiction is not failure.
Sometimes contradiction is exactly what the system must preserve honestly.
---
## 5. Missing critical evidence
### Meaning
A stronger judgment would require evidence that is currently absent.
### Typical signs
- the decisive separation question cannot be answered
- the case depends on traces or support that are not visible
- the system is being asked for a sharp classification without the information needed to justify it
### What this does not mean
Missing critical evidence does not mean “say nothing.”
It means “say only what is justified.”
---
## 6. Observability-blocked evidence
### Meaning
The current case may be classifiable in principle, but poor visibility blocks stronger evidence quality.
### Typical signs
- logs, traces, metrics, or intermediate states are too weak
- the current uncertainty is driven by diagnostic blindness
- stronger confidence depends on better inspection rather than better rhetoric
### What this does not mean
This is not the same thing as “the system has no idea.”
It often means the next best move is to improve observability before sharpening the claim.
---
## Evidence summary table 📚
| Evidence class | What it means | Main use | Main risk if misused |
|---|---|---|---|
| Direct evidence | support is materially close to the claim | stronger bounded confidence | treating it like total closure |
| Indirect signal | meaningful clues point in one direction | useful leaning without overclaiming | phrasing it like a settled fact |
| Symptom-surface only | support comes mostly from visible effects | early route-first orientation | mistaking symptoms for primary structure |
| Contradictory evidence | meaningful signals support more than one reading | preserve live alternatives honestly | erasing conflict to sound cleaner |
| Missing critical evidence | stronger judgment needs absent support | justify restraint | faking a sharper answer |
| Observability-blocked evidence | poor visibility blocks stronger support | route toward diagnosis improvement | pretending better prose solves blindness |
---
## Confidence-posture classes
This page also uses a simple public confidence language.
---
## 1. High confidence, bounded
### Meaning
The current read is strongly supported within the claimed level.
### Use this when
- evidence is strong and close enough
- neighboring alternatives are weaker
- the claim remains bounded to the right level
### Important note
“High confidence” should still stay bounded.
For example:
- high confidence at family level
- high confidence in the current boundary lean
- high confidence in the first fix direction
This is not a license to imply universal closure.
---
## 2. Moderate confidence
### Meaning
The current read is useful and materially supported, but still carries meaningful limits.
### Use this when
- the structural direction is plausible and practical
- evidence is decent but not maximally strong
- some neighboring uncertainty remains visible
### Important note
Moderate confidence is often the healthiest default in non-trivial cases.
---
## 3. Tentative
### Meaning
The current read is a plausible early orientation, but should be expressed with clear restraint.
### Use this when
- evidence is mostly indirect
- the boundary remains active
- the visible signal is real but still fragile
### Important note
Tentative does not mean useless.
A tentative route-first read can still guide the next diagnostic move well.
---
## 4. Boundary-live
### Meaning
Confidence should remain shared across a live boundary rather than being forced into one side.
### Use this when
- two neighboring readings both remain materially plausible
- the decisive separation question is not yet answered
- strong one-sided phrasing would distort the case
### Important note
Boundary-live is a confidence posture, not just a fit phrase.
It says the system should preserve the unresolved tension honestly.
---
## 5. Insufficient evidence
### Meaning
The current support does not justify stronger classification language.
### Use this when
- evidence is too thin
- contradiction is too unresolved
- observability is too weak
- stronger wording would be mainly stylistic theater
### Important note
This does not mean the system stops being useful.
It often means the next best move is to improve evidence quality.
---
## Confidence summary table 📚
| Confidence posture | Best use case | What it protects against |
|---|---|---|
| High confidence, bounded | strong support at the claimed level | overstating beyond the level actually justified |
| Moderate confidence | useful direction with visible limits | fake certainty |
| Tentative | early orientation under weaker support | false sharpness |
| Boundary-live | unresolved neighboring reads | forced one-sided classification |
| Insufficient evidence | support too weak for stronger claims | rhetorical inflation |
---
## Evidence and confidence are not the same thing
This distinction matters.
Evidence describes the support surface.
Confidence describes the allowed strength of the wording.
So:
- direct evidence often supports stronger confidence, but not always total closure
- indirect signal may support moderate or tentative confidence
- contradictory evidence often supports boundary-live or reduced confidence
- observability-blocked evidence often limits the posture even if one route feels plausible
A polished answer can still have weak evidence.
A restrained answer can still be very useful.
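The separation above can be sketched in code. The following is a minimal illustration, not part of the Atlas itself: the `STRENGTH` ranking, the `CEILING` mapping, and the `is_overstated` helper are all assumptions made for this example. The page does not define a numeric ordering of postures, and boundary-live is ranked alongside tentative here only for convenience.

```python
from enum import Enum


class Evidence(Enum):
    DIRECT = "direct evidence"
    INDIRECT = "indirect signal"
    SYMPTOM_ONLY = "symptom-surface only"
    CONTRADICTORY = "contradictory evidence"
    MISSING_CRITICAL = "missing critical evidence"
    OBSERVABILITY_BLOCKED = "observability-blocked evidence"


class Posture(Enum):
    HIGH_BOUNDED = "high confidence, bounded"
    MODERATE = "moderate confidence"
    TENTATIVE = "tentative"
    BOUNDARY_LIVE = "boundary-live"
    INSUFFICIENT = "insufficient evidence"


# Assumed strength ranking (not defined by the Atlas). Boundary-live is
# ranked with tentative because it shares confidence across two reads.
STRENGTH = {
    Posture.INSUFFICIENT: 0,
    Posture.TENTATIVE: 1,
    Posture.BOUNDARY_LIVE: 1,
    Posture.MODERATE: 2,
    Posture.HIGH_BOUNDED: 3,
}

# Assumed ceilings: the strongest posture each evidence class may carry.
CEILING = {
    Evidence.DIRECT: Posture.HIGH_BOUNDED,
    Evidence.INDIRECT: Posture.MODERATE,
    Evidence.SYMPTOM_ONLY: Posture.TENTATIVE,
    Evidence.CONTRADICTORY: Posture.BOUNDARY_LIVE,
    Evidence.MISSING_CRITICAL: Posture.TENTATIVE,
    Evidence.OBSERVABILITY_BLOCKED: Posture.TENTATIVE,
}


def is_overstated(evidence: Evidence, posture: Posture) -> bool:
    """Return True when the posture is stronger than the evidence ceiling."""
    return STRENGTH[posture] > STRENGTH[CEILING[evidence]]
```

Under these assumptions, `is_overstated(Evidence.INDIRECT, Posture.HIGH_BOUNDED)` is true, while a tentative posture on direct evidence is never flagged: saying less than the evidence allows is not an error.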
---
## What confidence is not
Confidence is not:
### 1. Style
Good prose is not strong support.
### 2. Familiarity
A pattern feeling familiar does not automatically justify stronger posture.
### 3. Repair urgency
Needing to act does not improve the evidence.
### 4. Local usefulness
A useful next move can exist even under limited confidence.
### 5. Strong preference
A preferred explanation is not the same as a justified explanation.
---
## Confidence inflation patterns
A good public discipline page should name the common failure modes.
### Pattern 1. Clean prose inflation
The explanation sounds elegant, so the confidence silently rises.
### Pattern 2. Boundary erasure
A live neighboring family is dropped to make the answer simpler.
### Pattern 3. Observability denial
Poor instrumentation is ignored, and the answer is sharpened anyway.
### Pattern 4. Symptom capture
A loud symptom is treated like direct evidence of the deeper structure.
### Pattern 5. Repair-backed certainty
A plausible first fix is mistaken for proof that the diagnosis is already strong.
These patterns are common, and this page exists partly to block them.
---
## Contradiction handling rules
Contradictory evidence should not be hidden just because it makes the answer less elegant.
Use these rules.
### Rule 1. Keep meaningful contradiction visible
If a neighboring route remains live, say so.
### Rule 2. Do not collapse conflict into style
A smoother sentence is not a better diagnosis.
### Rule 3. Let contradiction lower the posture when appropriate
Not every case deserves one-sided phrasing.
### Rule 4. Contradiction can still be useful
Sometimes it tells you exactly which boundary or observability surface needs attention next.
---
## Observability and confidence
Observability matters a lot because some cases are not mainly “unknown.”
They are **blocked from better knowledge** by weak visibility.
That means poor observability should usually:
- reduce confidence
- keep more boundaries live
- limit subtree promotion
- narrow first-fix boldness
- increase the need for diagnostic-first moves
This is one reason F5-related cases are so important in the Atlas.
Poor visibility can distort every later layer if not handled honestly.
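One way to picture the five effects listed above is as a single adjustment applied to a routing state. This is a hedged sketch only: `RouteState`, its field names, the `LADDER` ordering, and `degrade_for_poor_observability` are hypothetical names invented for this example and are not defined anywhere in the Atlas.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Assumed ordinal ladder, weakest first. Boundary-live is left out on
# purpose: it is a shared posture rather than a rung on this ladder.
LADDER = (
    "insufficient evidence",
    "tentative",
    "moderate confidence",
    "high confidence, bounded",
)


@dataclass(frozen=True)
class RouteState:
    posture: str
    live_boundaries: Tuple[str, ...]
    allow_subtree_promotion: bool
    first_fix_scope: str  # "broad" or "narrow" (hypothetical values)
    diagnostic_first: bool


def degrade_for_poor_observability(
    state: RouteState, candidate_boundaries: Tuple[str, ...]
) -> RouteState:
    """Apply the five restraint effects of weak visibility to a state."""
    # 1. Reduce confidence by one rung; non-ladder postures stay unchanged.
    if state.posture in LADDER:
        lowered = LADDER[max(LADDER.index(state.posture) - 1, 0)]
    else:
        lowered = state.posture
    # 2. Keep more boundaries live: merge candidates, preserving order.
    merged = tuple(dict.fromkeys(state.live_boundaries + tuple(candidate_boundaries)))
    # 3-5. Block subtree promotion, narrow the first fix, diagnose first.
    return replace(
        state,
        posture=lowered,
        live_boundaries=merged,
        allow_subtree_promotion=False,
        first_fix_scope="narrow",
        diagnostic_first=True,
    )
```

The one-rung step is an arbitrary choice for illustration. A real case may need a larger reduction, or none if the posture is already restrained enough.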
---
## When stronger confidence is justified
Stronger confidence is generally more justified when:
- evidence is direct or close to direct
- the broken invariant is visible enough
- a neighboring alternative is meaningfully weaker
- contradiction is small or well-accounted for
- observability is good enough to support the reading
- the claimed level stays bounded
Notice the last part.
A case may justify:
- high confidence at family level
without justifying:
- high confidence at local subtree level
That distinction matters a lot.
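A small sketch can make the level-bounding concrete. It assumes a hypothetical `BoundedClaim` record and `posture_at` lookup, neither of which comes from the Atlas. The point it illustrates is that a posture stated at one level is never inherited by another level.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BoundedClaim:
    target: str   # e.g. "F4"
    level: str    # e.g. "family", "boundary", "subtree", "first-fix"
    posture: str  # e.g. "high confidence, bounded"


def posture_at(claims: Iterable[BoundedClaim], level: str) -> Optional[str]:
    """Return the posture stated at exactly this level, or None.

    None means no claim was made at that level. A posture stated at a
    coarser level is deliberately not inherited.
    """
    for claim in claims:
        if claim.level == level:
            return claim.posture
    return None


claims = [BoundedClaim("F4", "family", "high confidence, bounded")]
family_posture = posture_at(claims, "family")    # the stated posture
subtree_posture = posture_at(claims, "subtree")  # None: nothing claimed
```

Returning `None` instead of a default posture is a design choice for this sketch: it keeps "no claim made" distinct from "insufficient evidence", which is itself a claim about the evidence.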
---
## When lower confidence is the right answer
Lower confidence is usually the right answer when:
- evidence is mainly indirect
- symptoms are loud but deeper structure is still inferred
- contradiction remains meaningful
- the decisive separation question is still unanswered
- observability is weak
- the case is mixed or overlay-prone
- finer public structure is not yet stable enough
Lower confidence is not weakness.
It is structural honesty.
---
## Example wording patterns
These are not full output contracts.
They are example language patterns for evidence and confidence posture.
### Example 1. Direct evidence with bounded strength
> current primary fit: F4 at family level
> evidence is direct enough for a high-confidence bounded family-level read, but not yet for finer local promotion
### Example 2. Indirect signal with moderate posture
> current strongest candidate is F1
> support is meaningful but still partly indirect, so moderate confidence is safer than stronger closure
### Example 3. Contradictory evidence
> current read remains boundary-live between F3 and F4
> evidence supports both continuity loss and contract-execution failure strongly enough that a one-sided phrasing would overstate the case
### Example 4. Observability-blocked posture
> tentative F5/F6 boundary read
> confidence remains limited because observability weakness blocks stronger separation
### Example 5. Symptom-surface restraint
> current F7 lean remains tentative
> the visible packaging failure is clear, but deeper grounding and reasoning surfaces are not yet well observed
---
## Common evidence and confidence mistakes
### Mistake 1. Strong tone replacing strong support
Writing harder does not make the evidence stronger.
### Mistake 2. Treating “plausible” like “settled”
Plausibility is useful, but not equivalent to closure.
### Mistake 3. Ignoring the blocked-evidence question
Sometimes the real answer is not “unclear.”
It is “currently not observable with enough quality.”
### Mistake 4. Treating contradiction as noise to delete
Contradiction often contains the most important structural clue.
### Mistake 5. Using the repair move to backfill certainty
A repair suggestion can be helpful without proving the fit strongly.
---
## When this page is enough
This page is often enough when:
- you need to choose a confidence posture
- you need to avoid bluffing
- you need to explain why a case stays tentative or boundary-live
- you want a public language for “useful but not overclaimed”
In those situations, this page already does a lot of work.
---
## When this page is not enough
This page is usually not enough when:
- you need a formal output schema
- you need explicit overlay rules
- you need patch-promotion thresholds
- you need a full repair-facing sequence
- you need the final output field discipline
Then the natural next pages are:
- [Atlas Routing Output Contract v1](./atlas-routing-output-contract-v1.md)
- [Atlas Overlay and Secondary Family Discipline v1](./atlas-overlay-and-secondary-family-discipline-v1.md)
- [Atlas Promotion and Patch Thresholds v1](./atlas-promotion-and-patch-thresholds-v1.md)
- [Fixes Hub](./Fixes/README.md)
---
## Practical use
Here is the simplest practical workflow.
### Step 1
Identify the strongest current structural read:
- [Atlas Family Mini-Specs](./atlas-family-mini-specs-v1.md)
- [Atlas Boundary Decision Guide](./atlas-boundary-decision-guide-v1.md)
- [Atlas Fit Candidate Registry](./atlas-fit-candidate-registry-v1.md)
### Step 2
Ask what kind of evidence is actually present:
- direct
- indirect
- symptom-surface only
- contradictory
- missing critical
- observability-blocked
### Step 3
Choose the narrowest honest confidence posture.
### Step 4
Check whether the wording accidentally says more than the evidence does.
### Step 5
Only then move into output formatting or action.
That sequence keeps the Atlas reliable.
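Step 4 can be partly mechanized as a rough lint pass. The sketch below is only a heuristic under stated assumptions: the `CLOSURE_TERMS` vocabulary and the `overclaim_terms` helper are hypothetical, and a term list cannot replace a human check that the claim stays bounded to the right level.

```python
import re

# Hypothetical closure vocabulary; tune it for your own outputs.
CLOSURE_TERMS = (
    "definitely",
    "certainly",
    "confirmed",
    "proven",
    "settled",
    "root cause is",
)


def overclaim_terms(text: str, posture: str) -> list:
    """List closure terms that appear in text written under a weaker posture.

    Text under "high confidence, bounded" is not flagged here; whether
    the claim stays bounded to its level still needs a manual check.
    """
    if posture == "high confidence, bounded":
        return []
    lowered = text.lower()
    return [
        term
        for term in CLOSURE_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]
```

For example, `overclaim_terms("The root cause is definitely F1.", "tentative")` returns `["definitely", "root cause is"]`, which signals that the sentence says more than a tentative posture allows.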
---
## Relation to other Atlas docs
This page sits after family, boundary, subtree, fit, and first-fix discipline, but before output contract and overlay discipline.
### Upstream neighbors
These pages prepare the reader for evidence and confidence discipline:
- [Problem Map 3.0 Troubleshooting Atlas](../wfgy-ai-problem-map-troubleshooting-atlas.md)
- [Atlas Structure Map](./atlas-structure-map-v1.md)
- [Atlas Family Mini-Specs](./atlas-family-mini-specs-v1.md)
- [Atlas Boundary Decision Guide](./atlas-boundary-decision-guide-v1.md)
- [Atlas Subtree Expansion Index](./atlas-subtree-expansion-index-v1.md)
- [Atlas Fit Candidate Registry](./atlas-fit-candidate-registry-v1.md)
- [Atlas First Fix and Misrepair Discipline v1](./atlas-first-fix-and-misrepair-discipline-v1.md)
### Side neighbors
This page pairs especially well with:
- [Canonical Casebook v1](./canonical-casebook-v1.md)
- [Validation Basis v1](./validation-basis-v1.md)
Why:
examples and validation context both sharpen what “good enough evidence” actually means in practice.
### Downstream neighbors
These are the natural next steps:
- [Atlas Routing Output Contract v1](./atlas-routing-output-contract-v1.md)
- [Atlas Overlay and Secondary Family Discipline v1](./atlas-overlay-and-secondary-family-discipline-v1.md)
- [Atlas Promotion and Patch Thresholds v1](./atlas-promotion-and-patch-thresholds-v1.md)
Why:
this page controls evidence and posture, while later pages govern how that posture gets rendered and constrained in output and future public growth.
---
## Current status
This page should be read as the stable **public evidence and confidence guide**.
That means:
- it does not rewrite the frozen core
- it does not replace the fit registry
- it gives readers a clear language for evidence quality and confidence posture
- it reduces the pressure to bluff stronger closure than the case supports
Its value is restraint-with-utility.
That is what this layer should do.
---
## Future extension
This page will become even stronger once its closest companion pages exist.
The most important future companions are:
- [Atlas Routing Output Contract v1](./atlas-routing-output-contract-v1.md)
- [Atlas Overlay and Secondary Family Discipline v1](./atlas-overlay-and-secondary-family-discipline-v1.md)
- [Atlas Promotion and Patch Thresholds v1](./atlas-promotion-and-patch-thresholds-v1.md)
Later versions may also add:
- more worked evidence patterns
- more family-specific evidence notes
- more subtree-sensitive posture examples
- more contrast examples between strong wording and justified wording
But the core job of this page should stay simple:
make confidence earn itself.
---
## Closing note 🔭
A strong system is not only good at finding a plausible route.
It is also good at saying how much support that route actually has,
how much remains unclear,
and where sharper language would cross the line from useful to dishonest.
That is what this page is for.
It gives the Atlas a public evidence and confidence discipline
so the rest of the system can stay trustworthy.