mirror of https://github.com/onestardao/WFGY.git synced 2026-04-26 10:40:55 +00:00

PSBigBig d1abf509a5

Update Q117_scientific_realism_vs_anti_realism.md

2026-01-31 16:41:07 +08:00

Q117 · Scientific realism vs anti realism

0. Header metadata

ID: Q117
Code: BH_PHIL_SCIENCE_REALISM_L3_117
Domain: Philosophy
Family: Philosophy of science
Rank: S
Projection_dominance: C
Field_type: socio_technical_field
Tension_type: consistency_tension
Status: Reframed_only
Semantics: hybrid
E_level: E1
N_level: N1
Last_updated: 2026-01-31

0. Effective layer disclaimer

This entry is written strictly at the effective layer of the Tension Universe (TU) framework.

It specifies only:
- state spaces,
- observables and fields,
- tension scores and functionals,
- counterfactual patterns,
- engineering style modules and experiments.
It does not specify:
- any underlying TU core axioms,
- any PDE like generative rules for TU,
- any constructive mapping from raw empirical or textual data into internal TU fields.
It does not:
- prove or disprove the canonical philosophical problem of scientific realism vs anti realism,
- claim that any particular stance is metaphysically correct,
- introduce new theorems about scientific realism beyond the cited literature.
All scalar tension quantities in this document are understood as dimensionless scores on the TU tension scale described in the TU Tension Scale Charter. Low values correspond to low tension bands. Higher values correspond to medium or high tension bands.
This page can be used to:
- encode different stances as patterns of observables and tension,
- design falsifiable experiments and evaluation harnesses,
- define reusable components for other S class problems.
It must not be cited as evidence that:
- the realism vs anti realism debate has been settled,
- any specific metaphysical stance has been proven true.

This page should be read together with the TU charters, in particular the TU Tension Scale Charter cited above.

1. Canonical problem and status

1.1 Canonical statement

The canonical problem of scientific realism vs anti realism asks:

When our best scientific theories make successful, precise and wide ranging predictions about observable phenomena, should we regard their theoretical entities and structures as approximately true descriptions of an independent reality, or should we treat them only as useful instruments for organizing and predicting experience, without ontological commitment?

More concretely, the dispute concerns questions such as:

Are unobservable entities posited by science (for example electrons, fields, spacetime curvature, wave functions) genuinely part of what exists, at least approximately?
Does the success of a theory provide non accidental support for the approximate truth of its claims about such entities?
Can scientists remain entirely agnostic or anti realist about theoretical entities while still accounting for the depth, unification and counterfactual richness of scientific practice?

Scientific realism, in a standard formulation, holds that:

Mature, well confirmed scientific theories are approximately true.
Theoretical terms in such theories (for example “electron”) successfully refer to real entities or structures.
The explanatory and predictive success of science is best explained by the approximate truth of these theories.

Anti realist positions (for example constructive empiricism, instrumentalism) deny at least one of these claims, typically insisting that:

The proper aim of science is empirical adequacy.
Commitment should be restricted to claims about observable phenomena.
Theoretical entities are either useful fictions or tools, not objects of belief in the same sense as observables.

Q117 treats this dispute as an S class problem because it organizes many other realism debates (about mathematics, morality, probability and AI) and because there is no consensus resolution.

1.2 Status and difficulty

The scientific realism vs anti realism debate has persisted for decades in contemporary philosophy of science. It is characterized by:

Long running and sophisticated argument exchange without stable convergence.
Multiple refined positions and arguments on both sides, for example:
- selective realism,
- structural realism,
- entity realism,
- pessimistic meta induction,
- constructive empiricism.
Deep connections to issues in theory change, underdetermination, explanation, confirmation and the role of models.

There is no accepted decision procedure that, given a body of scientific practice, outputs a unique and compulsory stance. Instead, philosophers and scientists adopt positions that trade off:

explanatory depth and metaphysical commitment,
flexibility under theory change and robustness of reference,
simplicity of epistemic norms and capacity to account for unifying structures.

Within TU, Q117 is therefore not a problem that is expected to receive a single final proof. It is encoded as a structural problem about how different stances generate different patterns of consistency_tension between:

what scientific theories say,
what they predict and explain,
what they commit us to regarding what there is.

1.3 Role in the BlackHole project

Within the BlackHole S problem collection, Q117 plays several roles:

It is the prototype consistency_tension problem for ontology in science. It makes precise how different stances about what is real generate different patterns of mismatch across theory, evidence and explanation.
It provides reusable components for other realism style debates, for example:
- Q111 mind body relation,
- Q114 status of moral facts,
- Q116 foundations of mathematics,
- Q119 meaning of probability.
It offers templates for encoding stance dependent interpretation of models in complex socio technical systems, including:
- Q059 ultimate thermodynamic cost of information processing,
- Q098 Anthropocene system dynamics,
- Q123 scalable interpretability in AI.

1.4 References

Stanford Encyclopedia of Philosophy, “Scientific Realism”, first published 2002, substantive revisions in later years.
Stanford Encyclopedia of Philosophy, “Constructive Empiricism”, first published 1998, substantive revisions in later years.
S. Psillos, “Scientific Realism: How Science Tracks Truth”, Routledge, 1999.
B. C. van Fraassen, “The Scientific Image”, Oxford University Press, 1980.

2. Position in the BlackHole graph

This block records how Q117 sits within the BlackHole graph of Q001 to Q125. Edges are described using one line reasons that point to components or tension types defined at the effective layer.

2.1 Upstream problems

These problems provide prerequisites or structural tools for Q117.

Q111 (BH_PHIL_MIND_BODY_L3_111) Reason: Supplies templates for relating higher level states, such as minds, to physical reality. These templates are reused for relating theoretical entities to the world.
Q115 (BH_PHIL_INDUCTION_L3_115) Reason: Encodes the tension between evidence and generalization, which directly constrains how realism and anti realism justify belief in theoretical claims.
Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Provides parallel debates about mathematical ontology that help define cross domain realism components.
Q119 (BH_PHIL_PROB_MEANING_L3_119) Reason: Gives worked examples of how realism and anti realism about probability interact with modeling and evidence.

2.2 Downstream problems

These problems reuse Q117 components or depend on its stance templates.

Q114 (BH_PHIL_MORAL_REALISM_L3_114) Reason: Reuses the RealismCommitmentIndex and stance tension functional to encode moral realism vs non cognitivism.
Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Reuses empirical equivalence and invariance components to structure debates about mathematical structures and ontology.
Q119 (BH_PHIL_PROB_MEANING_L3_119) Reason: Reuses stance templates to distinguish realist, subjectivist and pragmatist interpretations of probability within scientific models.
Q121 (BH_AI_ALIGN_L3_121) Reason: Uses Q117 stance components to frame realism vs instrumentalism about values, utilities and preferences in AI alignment.

2.3 Parallel problems

These nodes share similar tension types but have no strict component dependence.

Q111 (BH_PHIL_MIND_BODY_L3_111) Reason: Both treat ontological commitment to non observable entities as a source of consistency_tension between theory and experience.
Q114 (BH_PHIL_MORAL_REALISM_L3_114) Reason: Mirrors the realism vs anti realism axis in a different domain, using comparable stance observables.
Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Uses analogous structures to encode commitment to mathematical objects vs structural roles.

2.4 Cross domain edges

Cross domain edges indicate reuse of Q117 components in other domains.

Q059 (BH_CS_INFO_THERMODYN_L3_059) Reason: Reuses the RealismCommitmentIndex to distinguish views that treat information as an ontologically robust quantity vs mere bookkeeping.
Q098 (BH_EARTH_ANTHROPOCENE_DYN_L3_098) Reason: Uses empirical equivalence and stance tension components to encode realism vs instrumentalism about complex Earth system models.
Q123 (BH_AI_INTERP_L3_123) Reason: Reuses the stance tension functional to distinguish realism vs instrumentalism about internal features and mechanisms in AI interpretability.

3. Tension Universe encoding (effective layer)

All content in this block is at the effective layer. It only specifies:

a state space,
observables and fields,
invariants and tension scores,
singular sets and domain restrictions.

It does not specify any deep TU generative rule or mapping from raw data to internal fields.

3.1 State space M

We assume a semantic state space M.

Elements of M are configurations of scientific practice and stance.

A state m in M encodes, at the effective layer:

A finite portfolio of scientific theories or models that are currently active in some context.
A pattern of empirical applications, prediction records and explanatory uses for these theories.
A stance profile that records how agents or communities treat the theoretical entities of these theories, along a realist to anti realist axis.

We assume:

Each m contains enough structure to evaluate the observables defined below.
There is no requirement that M be minimal or uniquely defined. Multiple state spaces could serve as models as long as they support the observables and constraints.

We do not describe how states in M are constructed from texts, experiments or agent histories. We only assume that such states exist at the effective layer.

3.2 Effective observables and fields

We introduce the following observables on M. All values lie in the closed interval [0, 1] and are interpreted according to the TU tension scale conventions.

Realist commitment index

R_commit(m) in [0, 1]

R_commit(m) is also referred to as the RealismCommitmentIndex.
Intended meaning:
- R_commit(m) = 1 means a fully realist stance. Theoretical entities in the portfolio are treated as approximately real.
- R_commit(m) = 0 means a fully anti realist or instrumentalist stance.
- Intermediate values represent partial or selective realism.

Empirical adequacy score

E_adequacy(m) in [0, 1]

Summarizes, at the effective layer, how well the theory portfolio in m fits the relevant domain of observable phenomena.
Higher values indicate broader and more precise empirical success.

Cross theory invariance score

I_invariant(m) in [0, 1]

Measures how stable certain structures or entities remain across theory change encoded in m.
High values mean that, as theories are replaced or refined, there is a robust mapping between key theoretical entities or structures.

Empirical equivalence spread

EE_spread(m) in [0, 1]

Captures the degree of nontrivial empirical equivalence among different theories in the portfolio.
High values indicate that multiple conceptually distinct theories share overlapping empirical consequences.

These observables are defined at the effective layer as given scalar summaries. No claim is made about how they are computed from underlying data. The hybrid semantics is explicit: stance labels are discrete, while these observables are continuous scores in [0, 1].
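As a minimal sketch, the four observables can be carried in a small validated container. The class name `Observables` and its field names are illustrative assumptions, not part of the TU specification, and the numeric values are placeholders rather than assignments derived from any case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observables:
    """Effective-layer observables for one state m (hypothetical container)."""
    r_commit: float     # RealismCommitmentIndex
    e_adequacy: float   # empirical adequacy score
    i_invariant: float  # cross theory invariance score
    ee_spread: float    # empirical equivalence spread

    def __post_init__(self) -> None:
        # Enforce the [0, 1] bound stated for every observable.
        for name, value in vars(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0, 1]")


# Placeholder values for illustration only.
mature_state = Observables(r_commit=0.8, e_adequacy=0.9, i_invariant=0.7, ee_spread=0.2)
```

Rejecting out-of-range values at construction time keeps later tension computations from silently leaving the [0, 1] scale.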

3.3 Tension observables

We define two mismatch observables that capture the costs of adopting realist or anti realist stances in a given state. They are also normalized to the interval [0, 1] so that they can be read as tension scores on the TU tension scale.

Realist mismatch

DeltaS_realist(m) in [0, 1]

Increases when:
- R_commit(m) is high,
- but I_invariant(m) is low, which means weak cross theory invariance,
- or EE_spread(m) is high, which means many empirically equivalent rivals.
Interpreted as the degree to which a strong realist stance over commits beyond what the stability and distinctiveness of theories seem to support.

Anti realist mismatch

DeltaS_anti(m) in [0, 1]

Increases when:
- R_commit(m) is low,
- but E_adequacy(m) is high, which means strong empirical success,
- and I_invariant(m) is high, which means stable structures across theory change.
Interpreted as the degree to which a strict anti realist stance refuses to acknowledge robust structural features that are naturally treated as real.

The exact functional forms that map R_commit, E_adequacy, I_invariant and EE_spread into DeltaS_realist and DeltaS_anti are part of the encoding and belong to an admissible encoding class described below.
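To make the qualitative description concrete, here is one candidate pair of functional forms. Both formulas are assumptions chosen for illustration: the text above fixes only the direction in which each mismatch should move, not the formulas themselves.

```python
def delta_s_realist(r_commit: float, i_invariant: float, ee_spread: float) -> float:
    """Hypothetical realist mismatch: commitment times the stronger warning sign."""
    # Warning signs: weak invariance (1 - I_invariant) or many
    # empirically equivalent rivals (EE_spread).
    return r_commit * max(1.0 - i_invariant, ee_spread)


def delta_s_anti(r_commit: float, e_adequacy: float, i_invariant: float) -> float:
    """Hypothetical anti realist mismatch: withheld commitment times robust success."""
    # Grows only when empirical success and structural stability are both high.
    return (1.0 - r_commit) * e_adequacy * i_invariant


# A fully stable, fully successful portfolio (illustrative extreme case):
realist_cost = delta_s_realist(r_commit=1.0, i_invariant=1.0, ee_spread=0.0)
anti_cost = delta_s_anti(r_commit=0.0, e_adequacy=1.0, i_invariant=1.0)
```

With inputs in [0, 1], both outputs stay in [0, 1]. In the extreme case shown, full realist commitment incurs no mismatch, while full anti realism incurs the maximum.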

3.4 Admissible encoding class and fairness constraints

To prevent trivial tuning, we impose the following constraints on the class of admissible encodings for Q117.

Observable bounds

All observables R_commit(m), E_adequacy(m), I_invariant(m), EE_spread(m) take values in [0, 1].
The mismatch observables DeltaS_realist(m) and DeltaS_anti(m) also take values in [0, 1].
Encodings must respect these bounds for all m in their domain.

Monotonicity

DeltaS_realist(m) is nondecreasing in R_commit(m) and in EE_spread(m) when I_invariant(m) is held fixed and low.
DeltaS_anti(m) is nondecreasing in E_adequacy(m) and in I_invariant(m) when R_commit(m) is held fixed and low.

Nondegeneracy

There exist admissible states m_high_realist and m_high_anti in M such that:
```
DeltaS_realist(m_high_realist) > 0
DeltaS_anti(m_high_anti)       > 0
```
Neither stance is trivially free of mismatch across all states.

No post hoc adjustment

Once an encoding within the admissible class is fixed for a given experiment or application, it must be held fixed across all states and cases in that experiment.
It is not permitted to alter the functional forms or internal parameters of DeltaS_realist or DeltaS_anti after inspecting the outputs on specific cases.

Refinement stability

We consider refinement sequences of the form:

refine(k),  k = 0, 1, 2, ...

where:

refine(k) enlarges or sharpens the case library, improves assignments of observables, or adds more detailed structure to the regular domain M_reg (defined in Block 3.5).
The encoding functions and their parameters are fixed once at k = 0 and remain unchanged for all k.

An encoding is considered refinement stable for Q117 if:

bands for DeltaS_realist(m) and DeltaS_anti(m) do not flip arbitrarily under small and reasonable changes introduced by refine(k),
patterns such as realism being systematically lower tension in mature stable theory states do not disappear or reverse under minor refinement.

These constraints are designed so that any low tension result is a substantive property of the stance and the configuration, rather than a consequence of arbitrary parameter choices or after the fact adjustments.
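The bounds, monotonicity and nondegeneracy constraints can be spot checked mechanically. The sketch below tests a candidate encoding on a coarse grid. The function `candidate` is a hypothetical encoding, and the checker applies monotonicity on the whole grid, which is stricter than the "held fixed and low" wording above. A grid check can find violations but cannot prove admissibility.

```python
from itertools import product
from typing import Callable, List, Tuple

Encoding = Callable[[float, float, float, float], Tuple[float, float]]
GRID = [k / 4 for k in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0


def candidate(r: float, e: float, i: float, ee: float) -> Tuple[float, float]:
    """Hypothetical encoding returning (DeltaS_realist, DeltaS_anti)."""
    return r * max(1.0 - i, ee), (1.0 - r) * e * i


def check_admissible(encode: Encoding, step: float = 0.25) -> List[str]:
    """Return the constraint violations found on the grid (empty list if none)."""
    problems = []
    any_realist = any_anti = False
    for r, e, i, ee in product(GRID, repeat=4):
        ds_r, ds_a = encode(r, e, i, ee)
        if not (0.0 <= ds_r <= 1.0 and 0.0 <= ds_a <= 1.0):
            problems.append("bounds")
        any_realist = any_realist or ds_r > 0.0
        any_anti = any_anti or ds_a > 0.0
        # Monotonicity: one grid step up must not lower the mismatch.
        if r + step <= 1.0 and encode(r + step, e, i, ee)[0] < ds_r:
            problems.append("DeltaS_realist decreases in R_commit")
        if ee + step <= 1.0 and encode(r, e, i, ee + step)[0] < ds_r:
            problems.append("DeltaS_realist decreases in EE_spread")
        if e + step <= 1.0 and encode(r, e + step, i, ee)[1] < ds_a:
            problems.append("DeltaS_anti decreases in E_adequacy")
        if i + step <= 1.0 and encode(r, e, i + step, ee)[1] < ds_a:
            problems.append("DeltaS_anti decreases in I_invariant")
    if not (any_realist and any_anti):
        problems.append("nondegeneracy")
    return sorted(set(problems))


violations = check_admissible(candidate)
```

Running the check once, before any case is scored, fits the no post hoc adjustment rule: the encoding is vetted and then frozen.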

3.5 Singular set and domain restriction

Some states may fail to support coherent evaluation of the observables. We define the singular set:

S_sing = { m in M :
           R_commit(m) is undefined
           or E_adequacy(m) is undefined
           or I_invariant(m) is undefined
           or EE_spread(m) is undefined }

We restrict Q117 analysis to the regular domain:

M_reg = M \ S_sing

Rules:

All tension related quantities DeltaS_realist(m) and DeltaS_anti(m) are only evaluated on M_reg.
States in S_sing are treated as out of domain for this problem, not as evidence in favor of any stance.
Experiments that attempt to evaluate tension on S_sing are considered to have encountered an encoding breakdown, not a metaphysical result.
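The domain restriction can be sketched as a filter over a finite collection of states. Representing a state as a dictionary and an undefined observable as `None` (or a missing key) are assumptions made for this example only.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, Optional[float]]
REQUIRED = ("R_commit", "E_adequacy", "I_invariant", "EE_spread")


def is_singular(state: State) -> bool:
    """A state is in S_sing if any required observable is undefined."""
    return any(state.get(name) is None for name in REQUIRED)


def split_domain(states: List[State]) -> Tuple[List[State], List[State]]:
    """Return (M_reg, S_sing) for a finite collection of states."""
    regular = [s for s in states if not is_singular(s)]
    singular = [s for s in states if is_singular(s)]
    return regular, singular


# Placeholder states: the second lacks a defined E_adequacy.
states = [
    {"R_commit": 0.8, "E_adequacy": 0.9, "I_invariant": 0.7, "EE_spread": 0.2},
    {"R_commit": 0.3, "E_adequacy": None, "I_invariant": 0.4, "EE_spread": 0.6},
]
m_reg, s_sing = split_domain(states)
```

Tension scores would then be computed for `m_reg` only, and any entry in `s_sing` would be logged as an encoding breakdown.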

4. Tension principle for this problem

This block explains how Q117 is treated as a tension problem within TU, at the effective layer.

4.1 Core tension functionals

We define two nonnegative tension functionals for each state m in M_reg:

T_realist(m) in [0, 1]
T_anti(m)    in [0, 1]

For Q117 we choose the simplest normalization that aligns directly with the TU tension scale:

T_realist(m) = DeltaS_realist(m)
T_anti(m)    = DeltaS_anti(m)

These functionals represent the overall consistency_tension incurred by adopting a realist or anti realist stance in state m. They are already normalized to [0, 1] and are read as tension bands according to the TU Tension Scale Charter, for example:

values near 0 fall into low tension bands,
intermediate values fall into medium bands,
values near 1 fall into high tension bands.

Alternative monotone rescalings that preserve the interval [0, 1] are allowed inside the admissible encoding class but must be fixed before any experiment and cannot be adjusted after seeing outputs.
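A band lookup might be sketched as follows. The cut points 1/3 and 2/3 are placeholders: the actual band boundaries belong to the TU Tension Scale Charter and are not stated on this page.

```python
def tension_band(score: float, low_max: float = 1 / 3, medium_max: float = 2 / 3) -> str:
    """Map a tension score in [0, 1] to a band label using placeholder cut points."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("tension scores must lie in [0, 1]")
    if score <= low_max:
        return "low"
    if score <= medium_max:
        return "medium"
    return "high"


bands = [tension_band(t) for t in (0.1, 0.5, 0.9)]
```

Like the encoding itself, the cut points would have to be fixed before any experiment is run.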

4.2 Realism as a low tension principle

Within the TU encoding for Q117, scientific realism is favored as a low tension principle if the following pattern holds across a wide range of states in M_reg:

For states encoding mature, empirically successful and structurally stable theory portfolios, there exist encodings in the admissible class such that:
```
T_realist(m) is in a low tension band
```
with corresponding small numerical values that remain small under refinement of the case library and observable assignments.
For the same states, any attempt to maintain a strict anti realist stance leads to:
```
T_anti(m) stays in a medium or high tension band
```
reflecting the difficulty of accounting for explanatory depth and cross theory invariance without ontological commitment.

In such worlds, realist stances are systematically lower tension and more robust under refinement.

4.3 Anti realism as a low tension principle

Conversely, scientific anti realism is favored as a low tension principle if:

For states encoding portfolios with:
- high empirical equivalence spread,
- frequent and deep theory change,
- limited cross theory invariance,
there exist admissible encodings such that:
```
T_anti(m) is in a low tension band
```
and remains in low bands when case descriptions are refined.
Realist stances in these states incur:
```
T_realist(m) in medium or high tension bands
```
reflecting the cost of committing to entities that cannot be stably tracked across an evolving theory landscape.

In such worlds, anti realist stances are systematically lower tension.

4.4 Mixed or selective stance benchmarks

The Q117 encoding also allows for mixed or selective stances in which:

realist commitment is adopted for entities that are highly invariant and central to successful theories,
anti realist caution is adopted for entities that appear only in frequently revised, unstable or empirically equivalent fragments.

These hybrid stances are evaluated by computing T_realist(m) and T_anti(m) on a component wise basis and aggregating their contributions into a single score, written T_selective(m) below. They serve as benchmarks to test whether a global pure stance is necessary or whether selective realism provides a strictly lower tension profile.

In a hybrid low tension regime we expect:

T_selective(m) in a lower band
T_realist(m)   in a higher band for volatile components
T_anti(m)      in a higher band for robust structural cores

for representative states in M_reg.
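One way to picture the component wise aggregation is sketched below. The per component cost formulas, the averaging rule and the invariance threshold `i_min` are all assumptions. The page does not fix how T_selective(m) is aggregated.

```python
from statistics import mean
from typing import Dict, List, Tuple

Component = Tuple[float, float, float]  # (E_adequacy, I_invariant, EE_spread)


def realist_cost(c: Component) -> float:
    """Hypothetical DeltaS_realist under full commitment (R_commit = 1)."""
    _, i_inv, ee = c
    return max(1.0 - i_inv, ee)


def anti_cost(c: Component) -> float:
    """Hypothetical DeltaS_anti under zero commitment (R_commit = 0)."""
    e_adq, i_inv, _ = c
    return e_adq * i_inv


def stance_profile(components: List[Component], i_min: float = 0.5) -> Dict[str, float]:
    """Average per component costs for uniform and selective stances."""
    # Selective stance: realist where invariance is at least i_min, else anti realist.
    selective = [realist_cost(c) if c[1] >= i_min else anti_cost(c) for c in components]
    return {
        "T_realist": mean(realist_cost(c) for c in components),
        "T_anti": mean(anti_cost(c) for c in components),
        "T_selective": mean(selective),
    }


# One robust structural core and one volatile peripheral component (placeholders).
profile = stance_profile([(1.0, 1.0, 0.0), (0.5, 0.0, 1.0)])
```

In this toy portfolio each uniform stance pays full cost on one component, so both average 0.5, while the selective stance scores 0.0. With a threshold rule like this, the selective stance is not guaranteed to come out lower on every input, which is what makes it usable as a benchmark.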

5. Counterfactual tension worlds

This block describes counterfactual worlds at the effective layer. It does not construct internal TU fields from raw data. It only specifies patterns of observables and tension functionals.

We consider three illustrative worlds:

World R: realism favored, anti realism disfavored.
World A: anti realism favored, realism disfavored.
World H: a hybrid stance favored over pure positions.

5.1 World R (scientific realism as global low tension stance)

In World R:

Theory change trajectories
- Across the history of science encoded in M_reg, major transitions, such as Newtonian mechanics to relativistic mechanics or classical to quantum mechanics, exhibit:
```
I_invariant(m) high
EE_spread(m)  moderate or low
```
  for states representing mature stages of the theories.
Explanatory depth
- Explanations in mature theories achieve wide unification and counterfactual support that is naturally captured by moderate to high R_commit(m).
Tension pattern
- For states representing mature, well confirmed theories:
```
T_realist(m) stays in low tension bands
T_anti(m)    tends to occupy medium or high tension bands
```
  because strict anti realism must treat robust structures as mere instruments, generating high DeltaS_anti(m).
Stability under refinement
- As encodings are refined by adding more detailed theory change data or better measures of invariance, the inequalities between T_realist and T_anti persist rather than being inverted by small adjustments.

5.2 World A (scientific anti realism as global low tension stance)

In World A:

Theory and model proliferation
- For many domains, there exist distinct theories with overlapping empirical consequences, so that:
```
EE_spread(m) high
I_invariant(m) low or moderate
```
  for key states in M_reg.
Frequent deep revisions
- Theory change is frequent and often disruptive enough that attempts to track theoretical entities across changes are fragile and heavily interpretation dependent.
Tension pattern
- For these states, pure anti realist stances with low R_commit(m) yield:
```
T_anti(m) in low tension bands
```
  while realist stances incur:
```
T_realist(m) in medium or high tension bands
```
  because they attach ontological weight to entities that are not stably supported by the structure of theory change.
Stability under refinement
- Refinements that better represent the volatility and empirical equivalence do not erase the tension difference. They reinforce the advantage of anti realist stances within this encoding.

5.3 World H (hybrid selective realism as low tension stance)

In World H:

Structural cores and peripheral models
- Some parts of scientific practice exhibit high invariance and explanatory depth, while others are more opportunistic or domain limited.
Stance pattern
- A selective stance is adopted:
  - high R_commit(m) for structural cores with high I_invariant(m) and strong E_adequacy(m),
  - low R_commit(m) for peripheral models with high EE_spread(m) and low invariance.
Tension outcome
- When tension is aggregated component wise:
```
T_selective(m) in a lower band
T_realist(m)   in a higher band when applied uniformly
T_anti(m)      in a higher band when applied uniformly
```
  for representative states in M_reg. Pure realism over commits in volatile areas. Pure anti realism under acknowledges robust structural cores.
Role in Q117
- World H serves as a benchmark to test whether Q117 points toward a single pure stance or toward context sensitive or selective realism as a lower tension equilibrium.

6. Falsifiability and discriminating experiments

This block specifies experiments that can falsify or support particular Q117 encodings at the effective layer. They do not prove or disprove any metaphysical thesis. They only test whether the observables and tension functionals behave in a stable and discriminating way within the TU framework.

Experiment 1: Historical theory change tension scan

Goal: Test whether a given Q117 encoding of R_commit, E_adequacy, I_invariant, EE_spread, DeltaS_realist and DeltaS_anti can produce stable and discriminating tension patterns across canonical episodes of theory change.

Setup:

Select at least three historical case studies, for example:
- phlogiston theory to oxygen based chemistry,
- Newtonian mechanics to special and general relativity,
- classical to quantum mechanics.
For each case, construct a finite sequence of states:
```
m_1, m_2, ..., m_k in M_reg
```
representing key stages of the episode, such as early theory, mature theory, transitional theory and replacement.
Fix a specific encoding from the admissible class and a specific choice of any rescaling parameters, if present, before any evaluation.

Protocol:

For each state m_j in each episode, assign approximate values for:
```
R_commit(m_j)
E_adequacy(m_j)
I_invariant(m_j)
EE_spread(m_j)
```
according to historical and philosophical scholarship.

Compute:

DeltaS_realist(m_j)
DeltaS_anti(m_j)
T_realist(m_j)
T_anti(m_j)

using the chosen encoding.

For each episode, construct summary statistics such as:

Avg_T_realist_mature
Avg_T_anti_mature
Avg_T_realist_transitional
Avg_T_anti_transitional

Map these averages into TU tension bands and compare tension patterns across episodes to see whether they consistently favor a particular stance or reveal context dependencies.
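The protocol steps above can be sketched as a small summary routine. The encoding inside `encode` is a hypothetical admissible candidate, and the episode values are placeholders, not assignments drawn from historical scholarship.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# (stage, R_commit, E_adequacy, I_invariant, EE_spread)
StateRow = Tuple[str, float, float, float, float]


def encode(r: float, e: float, i: float, ee: float) -> Tuple[float, float]:
    """Hypothetical encoding; T_realist and T_anti equal the mismatches (Block 4.1)."""
    return r * max(1.0 - i, ee), (1.0 - r) * e * i


def summarize(episode: List[StateRow]) -> Dict[str, float]:
    """Average T_realist and T_anti per stage label for one episode."""
    buckets = defaultdict(list)
    for stage, r, e, i, ee in episode:
        t_realist, t_anti = encode(r, e, i, ee)
        buckets[f"Avg_T_realist_{stage}"].append(t_realist)
        buckets[f"Avg_T_anti_{stage}"].append(t_anti)
    return {name: mean(values) for name, values in buckets.items()}


episode = [
    ("mature", 0.5, 1.0, 1.0, 0.0),
    ("mature", 0.5, 1.0, 0.5, 0.0),
    ("transitional", 0.5, 0.5, 0.0, 1.0),
]
summary = summarize(episode)
```

In this toy episode the mature stages give a lower average realist tension (0.125) than anti realist tension (0.375), and the transitional stage reverses the ordering.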

Metrics:

For each episode:
- average tension values and corresponding bands for realist and anti realist stances in mature stages,
- average tension values and bands in transitional stages.
Global measures:
- fraction of episodes where realism yields lower band tension in mature stages,
- fraction where anti realism yields lower band tension,
- sensitivity of these fractions to small perturbations in observable assignments that respect the admissible class.

Falsification conditions:

If small, reasonable perturbations in the assignments of R_commit, E_adequacy, I_invariant and EE_spread cause arbitrary flips in which stance appears in a lower tension band for mature stages, the encoding is considered unstable and rejected.
If, across all episodes and reasonable perturbations, T_realist and T_anti remain nearly identical with no consistent band structure, the encoding is considered non informative and rejected.
If the encoding class can be tuned after seeing the data to make either stance appear low tension at will, without constraints from admissibility conditions, the implementation is considered to violate the fairness constraints in Block 3.4 and is rejected.
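The first falsification condition can be approximated with a deterministic perturbation scan. This is a simplified sketch: it compares raw scores rather than TU bands, uses a hypothetical encoding, and treats a shift of `eps = 0.05` per observable as "small", all of which are assumptions.

```python
from itertools import product
from typing import Tuple

State = Tuple[float, float, float, float]  # (R_commit, E_adequacy, I_invariant, EE_spread)


def encode(r: float, e: float, i: float, ee: float) -> Tuple[float, float]:
    """Hypothetical encoding returning (T_realist, T_anti)."""
    return r * max(1.0 - i, ee), (1.0 - r) * e * i


def clip(x: float) -> float:
    """Keep perturbed observables inside [0, 1]."""
    return min(1.0, max(0.0, x))


def favored(state: State) -> str:
    """Name the stance with the lower tension score, or 'tie'."""
    t_realist, t_anti = encode(*state)
    if t_realist == t_anti:
        return "tie"
    return "realist" if t_realist < t_anti else "anti"


def flip_fraction(state: State, eps: float = 0.05) -> float:
    """Fraction of +/- eps perturbations that change the favored stance."""
    base = favored(state)
    perturbed = [
        tuple(clip(x + d) for x, d in zip(state, deltas))
        for deltas in product((-eps, 0.0, eps), repeat=4)
    ]
    return sum(favored(p) != base for p in perturbed) / len(perturbed)


robust = flip_fraction((0.5, 0.9, 0.9, 0.1))
knife_edge = flip_fraction((0.5, 1.0, 0.5, 0.55))
```

A flip fraction of zero on mature states is the stable behavior the condition asks for. A state like the second one, where small shifts reverse the verdict, would count against the encoding if it represented a mature stage.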

Semantics implementation note: All observables and tension functionals are treated at an abstract hybrid level where both discrete episode labels and continuous scores in [0, 1] coexist. No claim is made about underlying mathematical structure beyond the constraints in Block 3.

Boundary note: Falsifying a TU encoding of Q117 does not solve the canonical realism vs anti realism debate. Even if one encoding consistently yields lower tension bands for a given stance across these historical episodes, this does not prove that the stance is metaphysically correct. It only shows that, under the chosen encoding and case library, that stance is a lower tension effective layer description.

Experiment 2: AI stance switch stability test

Goal: Evaluate whether Q117 observables and tension functionals can be used as an evaluation harness for AI systems instructed to adopt realist vs anti realist stances about the same cases.

Setup:

Prepare a set of scientific case descriptions, for example classic textbook examples or simplified historical episodes.
Use an AI system to generate pairs of explanations for each case:
- one explanation under a “scientific realist” instruction,
- one explanation under an “anti realist or constructive empiricist” instruction.
For each generated explanation, construct an approximate state in M_reg that captures its stance and structural content, without specifying how this mapping is implemented at the TU level.

Protocol:

For each explanation, assign approximate values of:
```
R_commit(m)
E_adequacy(m)
I_invariant(m)
EE_spread(m)
```
guided by the explicit language of the explanation and its treatment of entities and theory change.

Compute the corresponding:

DeltaS_realist(m)
DeltaS_anti(m)
T_realist(m)
T_anti(m)

For each case, compare:
- tension values and bands for the realist explanation vs the anti realist explanation,
- consistency of these differences across cases.
Optionally, repeat with different AI models or with the same model under different training conditions.

Metrics:

For each case:
- whether T_realist for the realist explanation falls in a lower band than T_anti for the anti realist explanation,
- magnitude of the difference between stance specific tensions.
Across cases:
- fraction of cases consistent with a World R pattern,
- fraction consistent with a World A pattern,
- variability across models and prompts.
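The cross case fractions could be tallied as in the following sketch. The three band scale with cut points 1/3 and 2/3 is a placeholder, and the scores are invented for illustration.

```python
from typing import Dict, List, Tuple

# (T_realist of the realist explanation, T_anti of the anti realist explanation)
Case = Tuple[float, float]


def band(score: float) -> int:
    """Placeholder three band scale: 0 = low, 1 = medium, 2 = high."""
    return 0 if score <= 1 / 3 else 1 if score <= 2 / 3 else 2


def world_fractions(cases: List[Case]) -> Dict[str, float]:
    """Fraction of cases matching a World R pattern, a World A pattern, or neither."""
    if not cases:
        raise ValueError("at least one case is required")
    n = len(cases)
    world_r = sum(band(tr) < band(ta) for tr, ta in cases)
    world_a = sum(band(ta) < band(tr) for tr, ta in cases)
    return {
        "World_R": world_r / n,
        "World_A": world_a / n,
        "undecided": (n - world_r - world_a) / n,
    }


fractions = world_fractions([(0.1, 0.7), (0.2, 0.5), (0.8, 0.2), (0.4, 0.5)])
```

Reporting the undecided share separately avoids forcing cases that land in the same band into either world pattern.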

Falsification conditions:

If the encoding cannot systematically distinguish realist leaning from anti realist leaning explanations and produces nearly identical band profiles across all cases and stances, the encoding is considered non discriminating and rejected.
If small changes in the evaluation scheme, within the admissible class, lead to arbitrary reversals in which stance looks lower tension for the same explanation, the encoding is considered unstable and rejected.
If the evaluation protocol can be tuned after seeing model outputs to make any chosen stance appear favorable, it is considered to violate the fairness constraints in Block 3.4 and is rejected.

Semantics implementation note: The AI system and its explanations are treated as generating hybrid encodings where discrete stance labels and continuous observables coexist. The Q117 encoding only requires that these observables be well defined at the effective layer.

Boundary note: This experiment evaluates the usefulness of Q117 as an AI assessment module. Even if one stance tends to occupy lower tension bands for many tasks under a particular encoding, this does not settle the philosophical correctness of scientific realism or anti realism. It only constrains how those stances behave as effective layer descriptions in the TU framework.

7. AI and WFGY engineering spec

This block describes how Q117 can be used as an engineering module for AI systems within the WFGY framework, at the effective layer.

7.1 Training signals

We define several training signals that can be used as auxiliary objectives.

signal_realist_commitment_consistency
- Definition: a penalty proportional to a function of DeltaS_realist(m) when the model produces strongly realist language in contexts where EE_spread(m) is high and I_invariant(m) is low.
- Purpose: encourage the model to either reduce realist language in such contexts or explicitly acknowledge uncertainty, thereby lowering realist mismatch and keeping tension in lower bands where appropriate.
signal_anti_realist_explanatory_loss
- Definition: a penalty proportional to DeltaS_anti(m) in contexts where the model uses strongly anti realist language but relies on deep and structured explanations that implicitly treat entities or structures as robust.
- Purpose: encourage the model to either accept some realist commitment where structural robustness is high or explicitly mark its explanations as purely instrumental.
signal_stance_separation
- Definition: a signal that measures overlap between internal representations used for realist and anti realist answers to matched questions and penalizes excessive overlap.
- Purpose: prevent the model from collapsing distinct stances into a single undifferentiated pattern, improving clarity and controllability.
signal_theory_change_sensitivity
- Definition: a signal that measures whether the model stance indicators change when prompted with theory change scenarios encoded as sequences of states.
- Purpose: ensure that the model does not maintain fixed stance markers regardless of how theory change affects invariance and empirical equivalence.
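The first two signals can be illustrated as scalar penalties. The mismatch formulas and the `weight` parameter are assumptions. The definitions above say only that each penalty is proportional to a function of the corresponding mismatch.

```python
def signal_realist_commitment_consistency(
    r_commit: float, i_invariant: float, ee_spread: float, weight: float = 1.0
) -> float:
    """Penalty for strong realist language where invariance is weak or rivals abound."""
    delta_s_realist = r_commit * max(1.0 - i_invariant, ee_spread)  # hypothetical form
    return weight * delta_s_realist


def signal_anti_realist_explanatory_loss(
    r_commit: float, e_adequacy: float, i_invariant: float, weight: float = 1.0
) -> float:
    """Penalty for anti realist language that leans on robust, successful structure."""
    delta_s_anti = (1.0 - r_commit) * e_adequacy * i_invariant  # hypothetical form
    return weight * delta_s_anti


# Strongly realist language in a volatile context (placeholder values).
penalty = signal_realist_commitment_consistency(
    r_commit=1.0, i_invariant=0.0, ee_spread=1.0, weight=0.5
)
```

In a training loop these values would be added to the main objective as auxiliary terms. How that is wired in depends on the training framework and is left open here.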

7.2 Architectural patterns

We outline module patterns that can reuse Q117 structures.

StanceHead_Q117
- Role: a head that maps internal representations of scientific discourse into approximate values of R_commit(m), E_adequacy(m), I_invariant(m) and EE_spread(m).
- Interface:
  - Inputs: hidden states for a given passage.
  - Outputs: a small vector of stance observables in [0, 1] and a stance label prediction, for example realist, anti realist, selective or indeterminate.
TensionEvaluator_Q117
- Role: a module that takes the outputs of StanceHead_Q117 and produces DeltaS_realist(m), DeltaS_anti(m), T_realist(m) and T_anti(m) as auxiliary signals.
- Interface:
  - Inputs: stance observables and the current stance label.
  - Outputs: normalized tension scores in [0, 1] for each stance, interpreted using the TU tension scale.
TheoryChangeTracker_Q117
- Role: a recurrent or sequence module that processes sequences of theory related contexts and computes approximate I_invariant(m) and EE_spread(m) values for each stage.
- Interface:
  - Inputs: ordered lists of model internal states for different theory descriptions.
  - Outputs: dynamic invariance and equivalence profiles.
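
The StanceHead_Q117 interface can be sketched as below. This is a minimal sketch under stated assumptions, not a reference implementation: mean pooling, a single linear layer with a sigmoid, argmax label selection and the names `StanceHeadQ117`, `StanceOutput`, `obs_weights` and `label_weights` are all hypothetical choices. Only the four observables, their [0, 1] range and the example stance labels come from the text above.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence

OBSERVABLE_NAMES = ("R_commit", "E_adequacy", "I_invariant", "EE_spread")
STANCE_LABELS = ("realist", "anti_realist", "selective", "indeterminate")


@dataclass(frozen=True)
class StanceOutput:
    observables: Dict[str, float]  # each value lies in [0, 1]
    label: str                     # one of STANCE_LABELS


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(w, x))


def _mean_pool(hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-token hidden states of one passage."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]


class StanceHeadQ117:
    """Hypothetical linear head: hidden states -> stance observables and label."""

    def __init__(
        self,
        obs_weights: Sequence[Sequence[float]],
        label_weights: Sequence[Sequence[float]],
    ) -> None:
        if len(obs_weights) != len(OBSERVABLE_NAMES):
            raise ValueError("need one weight row per observable")
        if len(label_weights) != len(STANCE_LABELS):
            raise ValueError("need one weight row per stance label")
        self.obs_weights = obs_weights
        self.label_weights = label_weights

    def __call__(self, hidden_states: Sequence[Sequence[float]]) -> StanceOutput:
        pooled = _mean_pool(hidden_states)
        observables = {
            name: _sigmoid(_dot(w, pooled))
            for name, w in zip(OBSERVABLE_NAMES, self.obs_weights)
        }
        logits = [_dot(w, pooled) for w in self.label_weights]
        label = STANCE_LABELS[logits.index(max(logits))]
        return StanceOutput(observables=observables, label=label)
```

The output of such a head is what TensionEvaluator_Q117 would consume; the evaluator itself is not sketched here because it depends on the tension definitions given earlier in this entry.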

7.3 Evaluation harness

We propose an evaluation harness that uses Q117 components.

Benchmark design
- Collect tasks that involve:
  - explaining specific scientific theories,
  - comparing competing theories with overlapping empirical coverage,
  - describing theory change episodes.
Conditions
- Baseline model: no explicit Q117 modules or signals.
- TU augmented model: includes StanceHead_Q117, TensionEvaluator_Q117 and associated training signals.
Metrics
- Stance clarity: agreement between human stance labels and the model's stance predictions.
- Stance consistency: stability of stance across logically similar prompts.
- Tension stability: robustness of tension scores and bands under minor rephrasing of prompts.
Comparison
- Compare baseline and TU augmented models on these metrics. Improvements in clarity, consistency and stability without excessive rigidity would indicate effective use of Q117 components.
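
One possible way to turn the three metrics into numbers is sketched below. The specific formulas (plain accuracy, mean majority-label share per prompt group, and one minus the mean score range per paraphrase group) are assumptions chosen for simplicity; the text above does not prescribe them, and the function names are hypothetical.

```python
from collections import Counter
from typing import Sequence


def stance_clarity(human_labels: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of items where the model stance prediction matches the human label."""
    if len(human_labels) != len(predicted) or not human_labels:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(h == p for h, p in zip(human_labels, predicted))
    return matches / len(human_labels)


def stance_consistency(groups: Sequence[Sequence[str]]) -> float:
    """Mean share of the majority stance label within each group of similar prompts."""
    if not groups or any(len(g) == 0 for g in groups):
        raise ValueError("need at least one group and no empty groups")
    shares = [Counter(g).most_common(1)[0][1] / len(g) for g in groups]
    return sum(shares) / len(shares)


def tension_stability(groups: Sequence[Sequence[float]]) -> float:
    """One minus the mean range of tension scores within each paraphrase group."""
    if not groups or any(len(g) == 0 for g in groups):
        raise ValueError("need at least one group and no empty groups")
    spreads = [max(g) - min(g) for g in groups]
    return 1.0 - sum(spreads) / len(spreads)
```

All three values lie in [0, 1] when tension scores do, with higher values meaning clearer, more consistent or more stable behavior. A score of exactly 1 on consistency for every prompt set could also indicate the excessive rigidity mentioned above, so these numbers are best read alongside the theory change tasks.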

7.4 60 second reproduction protocol

This is a minimal protocol that lets external users try Q117 style encoding in an AI system.

Baseline setup
- Prompt the AI: “Explain the debate between scientific realism and anti realism in science. Give arguments on both sides.”
- Record the answer. Typical issues:
  - the explanation may blur distinctions between stances,
  - it may fail to connect stance differences to patterns of theory change or empirical equivalence.
TU encoded setup
- Prompt the same AI with Q117 modules enabled: “Explain the debate between scientific realism and anti realism in science. Make explicit:
  - how each stance treats theoretical entities as real or instrumental,
  - how theory change and empirical equivalence affect each stance,
  - where each stance incurs tension with empirical success or with underdetermination.”
- Record the answer and any exposed tension indicators.
Comparison metric
- Human evaluators rate:
  - clarity of stance description,
  - explicitness of tradeoffs (realist risks vs anti realist costs),
  - use of theory change and empirical equivalence examples.
What to log
- Prompts, responses, stance predictions and tension scores. This allows later inspection of how Q117 components shaped the behavior, without exposing any deep TU generative rules.
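
The logging step can be sketched as one JSON line per run. The record layout, the field names and the `ReproductionRecord` class are hypothetical; the only things taken from the protocol are the four logged items and the [0, 1] range of tension scores.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class ReproductionRecord:
    condition: str                          # e.g. "baseline" or "tu_encoded"
    prompt: str
    response: str
    stance_prediction: Optional[str] = None  # absent when no stance head is exposed
    tension_scores: Dict[str, float] = field(default_factory=dict)


def to_jsonl_line(record: ReproductionRecord) -> str:
    """Serialize one record as a single JSON line, checking score bounds first."""
    for name, value in record.tension_scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Appending one such line per prompt to a file keeps baseline and TU encoded runs side by side for later inspection.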

8. Cross problem transfer template

This block describes reusable components produced by Q117 and their transfer to other problems.

8.1 Reusable components produced by this problem

ComponentName: RealismCommitmentIndex
- Type: observable
- Minimal interface:
```
Input: context describing theories and their use
Output: r in [0, 1] representing realist commitment
```
- Preconditions:
  - the context describes at least one theory and its intended interpretation,
  - it is possible to distinguish talk about observables and theoretical entities.
ComponentName: EmpiricalEquivalenceProfile
- Type: observable
- Minimal interface:
```
Input: set of theories or models with shared domain
Output: e in [0, 1] representing degree of empirical equivalence
```
- Preconditions:
  - there are at least two distinct theories with overlapping domains,
  - some information about their comparative empirical performance is available.
ComponentName: StanceTensionFunctional_Q117
- Type: functional
- Minimal interface:
```
Inputs: R_commit, E_adequacy, I_invariant, EE_spread
Outputs: T_realist, T_anti
```
- Preconditions:
  - observables obey bounds and monotonicity constraints in Block 3.4,
  - the stance being evaluated is clearly specified.
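
The minimal interface of StanceTensionFunctional_Q117 can be made checkable with a thin wrapper, sketched below. The wrapper enforces only the [0, 1] bounds on inputs and outputs; it does not check the monotonicity constraints. The names `checked` and `toy_functional` are hypothetical, and `toy_functional` is an illustrative placeholder, not the tension functional defined in this entry.

```python
from typing import Callable, Tuple

# (R_commit, E_adequacy, I_invariant, EE_spread) -> (T_realist, T_anti)
TensionFunctional = Callable[[float, float, float, float], Tuple[float, float]]


def _check_unit(name: str, value: float) -> None:
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must lie in [0, 1], got {value}")


def checked(functional: TensionFunctional) -> TensionFunctional:
    """Wrap a functional so that input and output bounds are verified on every call."""

    def wrapper(
        r_commit: float, e_adequacy: float, i_invariant: float, ee_spread: float
    ) -> Tuple[float, float]:
        _check_unit("R_commit", r_commit)
        _check_unit("E_adequacy", e_adequacy)
        _check_unit("I_invariant", i_invariant)
        _check_unit("EE_spread", ee_spread)
        t_realist, t_anti = functional(r_commit, e_adequacy, i_invariant, ee_spread)
        _check_unit("T_realist", t_realist)
        _check_unit("T_anti", t_anti)
        return t_realist, t_anti

    return wrapper


def toy_functional(
    r_commit: float, e_adequacy: float, i_invariant: float, ee_spread: float
) -> Tuple[float, float]:
    """Placeholder only: NOT the Q117 definition. Ignores E_adequacy for brevity."""
    t_realist = r_commit * ee_spread            # commitment under many empirical rivals
    t_anti = (1.0 - r_commit) * i_invariant     # low commitment despite stable structure
    return t_realist, t_anti
```

Problems that reuse the component, such as those listed in 8.2, could pass their own functional through the same wrapper so that the bounds precondition is enforced uniformly.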

8.2 Direct reuse targets

Q114 (status of moral facts)
- Reused components:
  - RealismCommitmentIndex,
  - StanceTensionFunctional_Q117.
- Why it transfers:
  - moral realism vs non cognitivism or expressivism can be encoded with an analogous commitment index and mismatch functionals.
- What changes:
  - E_adequacy measures coherence with moral practice, patterns of judgment and interpersonal justification rather than empirical data,
  - I_invariant refers to stability of moral judgments across reflection and cultural change.
Q116 (foundations of mathematics)
- Reused components:
  - RealismCommitmentIndex,
  - EmpiricalEquivalenceProfile,
  - StanceTensionFunctional_Q117.
- Why it transfers:
  - debates about set theoretic realism, structuralism and nominalism mirror scientific realism debates.
- What changes:
  - E_adequacy becomes adequacy for mathematical practice, proofs and problem solving,
  - EE_spread becomes degree of underdetermination of mathematical ontology by mathematical practice.
Q119 (meaning of probability)
- Reused components:
  - RealismCommitmentIndex,
  - EmpiricalEquivalenceProfile.
- Why it transfers:
  - realist, subjectivist and pragmatist views of probability differ in how they treat probabilities as real properties or instruments.
- What changes:
  - stance labels and observables are adapted to probabilistic contexts, for example frequencies vs credences.
Q123 (scalable interpretability in AI)
- Reused components:
  - RealismCommitmentIndex,
  - StanceTensionFunctional_Q117.
- Why it transfers:
  - the question of whether internal features and circuits in AI systems are real mechanisms or convenient descriptions is structurally parallel to Q117.
- What changes:
  - I_invariant measures stability of extracted features across training runs and models,
  - E_adequacy measures success in prediction, control or safety tasks.

9. TU roadmap and verification levels

This block explains how Q117 fits into the TU verification ladder and what the next measurable steps are.

9.1 Current levels

E_level: E1
- An effective layer encoding has been specified with:
  - a clear state space,
  - defined observables,
  - normalized tension functionals in [0, 1],
  - at least two experiments with falsification conditions.
- No implementation level or large scale empirical program has yet been completed.
N_level: N1
- The narrative of the realism vs anti realism debate has been linked to the TU encoding in a coherent way.
- Counterfactual worlds have been described qualitatively but not instantiated in detailed case libraries or public code.

9.2 Next measurable step toward E2

To move Q117 from E1 to E2, it is sufficient to complete at least one of the following:

Implement a case study library
- Construct a small but concrete library of historical episodes encoded as sequences of states in M_reg.
- Evaluate T_realist and T_anti under one fixed admissible encoding and publish the tension profiles, bands and code.
Build an AI evaluation harness
- Implement the stance switch experiment with one or more AI models.
- Publicly release the prompts, explanations, approximate observable assignments, tension scores and band assignments.

In both cases, the key requirement is that:

- observable assignments and encodings are specified in enough detail to be checked and critiqued,
- the fairness constraints on encodings are respected.

9.3 Long term role in the TU program

In the long term, Q117 is expected to serve as:

A master template for encoding realism vs anti realism disputes across domains, supplying:
- stance observables,
- equivalence and invariance measures,
- stance specific tension functionals.
A bridge between philosophy of science and AI engineering, allowing:
- explicit control over stance taking behavior in AI systems,
- systematic evaluation of how different stances affect reasoning and explanation.
A diagnostic node in the BlackHole graph for checking whether other S class problems inadvertently assume realist or anti realist positions without making their stance explicit.

10. Elementary but precise explanation

This block gives an explanation of Q117 for non specialists while staying faithful to the effective layer encoding.

Scientists often talk about things we cannot see directly, such as electrons, fields, spacetime curvature and wave functions. When theories that use these ideas work very well, a natural question arises:

Are these invisible things really out there in the world, or are they just useful stories that help us organize what we observe?

Scientific realism says that when a theory has been tested many times and works in many ways, it probably tells us something approximately true about how the world is, including its invisible parts. On this view, theoretical entities are not just stories; they are part of reality.

Anti realism says that the job of a theory is to fit the observations. As long as the theory saves the phenomena, we do not need to believe that its invisible entities really exist. They can be treated as tools for calculations and predictions.

In the Tension Universe view, we do not try to declare one side simply correct. Instead we ask different questions:

- For a given piece of science, how much tension is created if we treat its theoretical entities as real?
- How much tension is created if we treat them only as instruments?

We look at things like:

- how well the theory fits the data,
- how stable its main ideas are when theories change or are refined,
- how many rival theories there are that fit the data equally well.

From this, we define numbers between 0 and 1 that summarize:

- how costly it is to be a realist in this context,
- how costly it is to be an anti realist in this context.

Values closer to 0 are low tension. Values closer to 1 are high tension.

Then we consider different possible worlds:

- worlds where realist stances usually have low tension, because the structures of science are very stable and deep,
- worlds where anti realist stances have lower tension, because there are many rival theories that fit the data equally well,
- worlds where a mixed or selective stance wins, being realist about some robust structures and anti realist about the rest.

Q117 does not decide once and for all which world we live in. Instead, it gives a way to:

- make the realism vs anti realism debate precise in terms of patterns of tension,
- design experiments and case studies that test whether a given encoding of these patterns is stable and informative,
- reuse the same tools in other debates, for example about mathematics, morality and AI.

In this way, Q117 acts as a structural map of the scientific realism problem inside the Tension Universe at the effective layer, rather than a final verdict about what is ultimately true.

Tension Universe effective layer footer

This page is part of the WFGY and Tension Universe S problem collection.

Scope of claims

The goal of this document is to specify an effective layer encoding of the scientific realism vs anti realism problem.
It does not claim to prove or disprove the canonical statement in Section 1.
It does not introduce any new theorem beyond what is already established in the cited literature.
It should not be cited as evidence that the corresponding open philosophical problem has been solved.

Effective layer boundary

All objects used here, such as state spaces M, observables, invariants, tension scores and counterfactual worlds, live at the effective layer.
This page does not define or assume any particular TU core axioms or generative mechanisms.
No mapping from raw empirical or textual data into internal TU fields is specified. Only the existence of such mappings is assumed in an abstract sense.

Encoding and fairness

All observables and tension scores are constrained to the interval [0, 1] and are interpreted using the TU tension scale.
Admissible encodings must satisfy bounds, monotonicity and nondegeneracy conditions and must remain fixed throughout each experiment.
Post hoc adjustment of encodings or parameters in response to particular cases is considered a violation of the TU Encoding and Fairness Charter.

Experiments and falsifiability

The experiments in Section 6 provide ways to falsify or refine specific implementations of the Q117 encoding.
Falsifying an encoding does not falsify the underlying philosophical positions. It only shows that a particular way of mapping those positions into the TU framework is unstable or uninformative.
Likewise, success in these experiments does not prove that scientific realism or anti realism is metaphysically correct. It only identifies lower tension descriptions inside this effective layer modeling scheme.

Relation to TU charters

This page should be read together with the following charters, which define the global rules for TU effective layer work, encoding fairness and tension scale interpretation:


Consistency note:
This entry has passed the internal formal-consistency and symbol-audit checks under the current WFGY 3.0 specification.
The structural layer is already self-consistent; any remaining issues are limited to notation or presentation refinement.
If you find a place where clarity can improve, feel free to open a PR or ping the community.
WFGY evolves through disciplined iteration, not ad-hoc patching.
