Q117 · Scientific realism vs anti realism
0. Header metadata
ID: Q117
Code: BH_PHIL_SCIENCE_REALISM_L3_117
Domain: Philosophy
Family: Philosophy of science
Rank: S
Projection_dominance: C
Field_type: socio_technical_field
Tension_type: consistency_tension
Status: Reframed_only
Semantics: hybrid
E_level: E1
N_level: N1
Last_updated: 2026-01-31
0. Effective layer disclaimer
This entry is written strictly at the effective layer of the Tension Universe (TU) framework.
It specifies only:
- state spaces,
- observables and fields,
- tension scores and functionals,
- counterfactual patterns,
- engineering style modules and experiments.
It does not specify:
- any underlying TU core axioms,
- any PDE like generative rules for TU,
- any constructive mapping from raw empirical or textual data into internal TU fields.
It does not:
- prove or disprove the canonical philosophical problem of scientific realism vs anti realism,
- claim that any particular stance is metaphysically correct,
- introduce new theorems about scientific realism beyond the cited literature.
-
All scalar tension quantities in this document are understood as dimensionless scores on the TU tension scale described in the TU Tension Scale Charter. Low values correspond to low tension bands. Higher values correspond to medium or high tension bands.
This page can be used to:
- encode different stances as patterns of observables and tension,
- design falsifiable experiments and evaluation harnesses,
- define reusable components for other S class problems.
It must not be cited as evidence that:
- the realism vs anti realism debate has been settled,
- any specific metaphysical stance has been proven true.
This page should be read together with the following charters:
1. Canonical problem and status
1.1 Canonical statement
The canonical problem of scientific realism vs anti realism asks:
When our best scientific theories make successful, precise and wide ranging predictions about observable phenomena, should we regard their theoretical entities and structures as approximately true descriptions of an independent reality, or should we treat them only as useful instruments for organizing and predicting experience, without ontological commitment?
More concretely, the dispute concerns questions such as:
- Are unobservable entities posited by science (for example electrons, fields, spacetime curvature, wave functions) genuinely part of what exists, at least approximately?
- Does the success of a theory provide non accidental support for the approximate truth of its claims about such entities?
- Can scientists remain entirely agnostic or anti realist about theoretical entities while still accounting for the depth, unification and counterfactual richness of scientific practice?
Scientific realism, in a standard formulation, holds that:
- Mature, well confirmed scientific theories are approximately true.
- Theoretical terms in such theories (for example “electron”) successfully refer to real entities or structures.
- The explanatory and predictive success of science is best explained by the approximate truth of these theories.
Anti realist positions (for example constructive empiricism, instrumentalism) deny at least one of these claims, typically insisting that:
- The proper aim of science is empirical adequacy.
- Commitment should be restricted to claims about observable phenomena.
- Theoretical entities are either useful fictions or tools, not objects of belief in the same sense as observables.
Q117 treats this dispute as an S class problem because it organizes many other realism debates (about mathematics, morality, probability and AI) and because there is no consensus resolution.
1.2 Status and difficulty
The scientific realism vs anti realism debate has persisted for decades in contemporary philosophy of science. It is characterized by:
- Long running and sophisticated argument exchange without stable convergence.
- Multiple refined positions and arguments on both sides, for example:
  - selective realism,
  - structural realism,
  - entity realism,
  - pessimistic meta induction,
  - constructive empiricism.
- Deep connections to issues in theory change, underdetermination, explanation, confirmation and the role of models.
There is no accepted decision procedure that, given a body of scientific practice, outputs a unique and compulsory stance. Instead, philosophers and scientists adopt positions that trade off:
- explanatory depth and metaphysical commitment,
- flexibility under theory change and robustness of reference,
- simplicity of epistemic norms and capacity to account for unifying structures.
Within TU, Q117 is therefore not a problem that is expected to receive a single final proof. It is encoded as a structural problem about how different stances generate different patterns of consistency_tension between:
- what scientific theories say,
- what they predict and explain,
- what they commit us to regarding what there is.
1.3 Role in the BlackHole project
Within the BlackHole S problem collection, Q117 plays several roles:
- It is the prototype consistency_tension problem for ontology in science. It makes precise how different stances about what is real generate different patterns of mismatch across theory, evidence and explanation.
- It provides reusable components for other realism style debates, for example:
  - Q111 mind body relation,
  - Q114 status of moral facts,
  - Q116 foundations of mathematics,
  - Q119 meaning of probability.
- It offers templates for encoding stance dependent interpretation of models in complex socio technical systems, including:
  - Q059 ultimate thermodynamic cost of information processing,
  - Q098 Anthropocene system dynamics,
  - Q123 scalable interpretability in AI.
1.4 References
- Stanford Encyclopedia of Philosophy, “Scientific Realism”, first published 2002, substantive revisions in later years.
- Stanford Encyclopedia of Philosophy, “Constructive Empiricism”, first published 1998, substantive revisions in later years.
- S. Psillos, “Scientific Realism: How Science Tracks Truth”, Routledge, 1999.
- B. C. van Fraassen, “The Scientific Image”, Oxford University Press, 1980.
2. Position in the BlackHole graph
This block records how Q117 sits within the BlackHole graph of Q001 to Q125. Edges are described using one line reasons that point to components or tension types defined at the effective layer.
2.1 Upstream problems
These problems provide prerequisites or structural tools for Q117.
- Q111 (BH_PHIL_MIND_BODY_L3_111) Reason: Supplies templates for relating higher level states, such as minds, to physical reality. These templates are reused for relating theoretical entities to the world.
- Q115 (BH_PHIL_INDUCTION_L3_115) Reason: Encodes the tension between evidence and generalization, which directly constrains how realism and anti realism justify belief in theoretical claims.
- Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Provides parallel debates about mathematical ontology that help define cross domain realism components.
- Q119 (BH_PHIL_PROB_MEANING_L3_119) Reason: Gives worked examples of how realism and anti realism about probability interact with modeling and evidence.
2.2 Downstream problems
These problems reuse Q117 components or depend on its stance templates.
- Q114 (BH_PHIL_MORAL_REALISM_L3_114) Reason: Reuses the RealismCommitmentIndex and stance tension functional to encode moral realism vs non cognitivism.
- Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Reuses empirical equivalence and invariance components to structure debates about mathematical structures and ontology.
- Q119 (BH_PHIL_PROB_MEANING_L3_119) Reason: Reuses stance templates to distinguish realist, subjectivist and pragmatist interpretations of probability within scientific models.
- Q121 (BH_AI_ALIGN_L3_121) Reason: Uses Q117 stance components to frame realism vs instrumentalism about values, utilities and preferences in AI alignment.
2.3 Parallel problems
These nodes share similar tension types but have no strict component dependence.
- Q111 (BH_PHIL_MIND_BODY_L3_111) Reason: Both treat ontological commitment to non observable entities as a source of consistency_tension between theory and experience.
- Q114 (BH_PHIL_MORAL_REALISM_L3_114) Reason: Mirrors the realism vs anti realism axis in a different domain, using comparable stance observables.
- Q116 (BH_PHIL_FOUND_MATH_L3_116) Reason: Uses analogous structures to encode commitment to mathematical objects vs structural roles.
2.4 Cross domain edges
Cross domain edges indicate reuse of Q117 components in other domains.
- Q059 (BH_CS_INFO_THERMODYN_L3_059) Reason: Reuses the RealismCommitmentIndex to distinguish views that treat information as an ontologically robust quantity vs mere bookkeeping.
- Q098 (BH_EARTH_ANTHROPOCENE_DYN_L3_098) Reason: Uses empirical equivalence and stance tension components to encode realism vs instrumentalism about complex Earth system models.
- Q123 (BH_AI_INTERP_L3_123) Reason: Reuses the stance tension functional to distinguish realism vs instrumentalism about internal features and mechanisms in AI interpretability.
3. Tension Universe encoding (effective layer)
All content in this block is at the effective layer. It only specifies:
- a state space,
- observables and fields,
- invariants and tension scores,
- singular sets and domain restrictions.
It does not specify any deep TU generative rule or mapping from raw data to internal fields.
3.1 State space M
We assume a semantic state space:
M
Elements of M are configurations of scientific practice and stance.
A state m in M encodes, at the effective layer:
- A finite portfolio of scientific theories or models that are currently active in some context.
- A pattern of empirical applications, prediction records and explanatory uses for these theories.
- A stance profile that records how agents or communities treat the theoretical entities of these theories, along a realist to anti realist axis.
We assume:
- Each m contains enough structure to evaluate the observables defined below.
- There is no requirement that M be minimal or uniquely defined. Multiple state spaces could serve as models as long as they support the observables and constraints.
We do not describe how states in M are constructed from texts, experiments or agent histories. We only assume that such states exist at the effective layer.
3.2 Effective observables and fields
We introduce the following observables on M.
All values lie in the closed interval [0, 1] and are interpreted according to the TU tension scale conventions.
- Realist commitment index
  R_commit(m) in [0, 1]
  - R_commit(m) is also referred to as the RealismCommitmentIndex.
  - Intended meaning:
    - R_commit(m) = 1 means a fully realist stance. Theoretical entities in the portfolio are treated as approximately real.
    - R_commit(m) = 0 means a fully anti realist or instrumentalist stance.
    - Intermediate values represent partial or selective realism.
- Empirical adequacy score
  E_adequacy(m) in [0, 1]
  - Summarizes, at the effective layer, how well the theory portfolio in m fits the relevant domain of observable phenomena.
  - Higher values indicate broader and more precise empirical success.
- Cross theory invariance score
  I_invariant(m) in [0, 1]
  - Measures how stable certain structures or entities remain across theory change encoded in m.
  - High values mean that, as theories are replaced or refined, there is a robust mapping between key theoretical entities or structures.
- Empirical equivalence spread
  EE_spread(m) in [0, 1]
  - Captures the degree of nontrivial empirical equivalence among different theories in the portfolio.
  - High values indicate that multiple conceptually distinct theories share overlapping empirical consequences.
These observables are defined at the effective layer as given scalar summaries. No claim is made about how they are computed from underlying data. The hybrid semantics is explicit: stance labels are discrete, while these observables are continuous scores in [0, 1].
3.3 Tension observables
We define two mismatch observables that capture the costs of adopting realist or anti realist stances in a given state.
They are also normalized to the interval [0, 1] so that they can be read as tension scores on the TU tension scale.
- Realist mismatch
  DeltaS_realist(m) in [0, 1]
  - Increases when:
    - R_commit(m) is high,
    - but I_invariant(m) is low, which means weak cross theory invariance,
    - or EE_spread(m) is high, which means many empirically equivalent rivals.
  - Interpreted as the degree to which a strong realist stance over commits beyond what the stability and distinctiveness of theories seem to support.
- Anti realist mismatch
  DeltaS_anti(m) in [0, 1]
  - Increases when:
    - R_commit(m) is low,
    - but E_adequacy(m) is high, which means strong empirical success,
    - and I_invariant(m) is high, which means stable structures across theory change.
  - Interpreted as the degree to which a strict anti realist stance refuses to acknowledge robust structural features that are naturally treated as real.
The exact functional forms that map R_commit, E_adequacy, I_invariant and EE_spread into DeltaS_realist and DeltaS_anti are part of the encoding and belong to an admissible encoding class described below.
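As an illustration only, one candidate pair of functional forms can be sketched as follows. The source does not fix any formula, so the specific expressions, the helper `clamp01` and the numeric inputs are all assumptions chosen to match the qualitative description above: the realist mismatch grows with commitment when invariance is weak or equivalence spread is wide, and the anti realist mismatch grows with lack of commitment when adequacy and invariance are both high.

```python
def clamp01(x: float) -> float:
    """Clip a score into the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def delta_s_realist(r_commit: float, i_invariant: float, ee_spread: float) -> float:
    # Hypothetical form: commitment scaled by the worse of two warning signs,
    # weak invariance (1 - I_invariant) or wide empirical equivalence (EE_spread).
    return clamp01(r_commit * max(1.0 - i_invariant, ee_spread))


def delta_s_anti(r_commit: float, e_adequacy: float, i_invariant: float) -> float:
    # Hypothetical form: lack of commitment scaled by empirical success
    # and structural stability taken together.
    return clamp01((1.0 - r_commit) * e_adequacy * i_invariant)


# Placeholder observable values for a mature, stable portfolio.
# They are illustrative and not taken from any case study.
e_adequacy, i_invariant, ee_spread = 0.95, 0.90, 0.20

print(delta_s_realist(0.9, i_invariant, ee_spread))  # realist stance, R_commit = 0.9
print(delta_s_anti(0.1, e_adequacy, i_invariant))    # anti realist stance, R_commit = 0.1
```

Using `max` for the realist form mirrors the "or" in the description, and the product in the anti realist form mirrors the "and". Other choices could fit the same description equally well.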
3.4 Admissible encoding class and fairness constraints
To prevent trivial tuning, we impose the following constraints on the class of admissible encodings for Q117.
- Observable bounds
  - All observables R_commit(m), E_adequacy(m), I_invariant(m), EE_spread(m) take values in [0, 1].
  - The mismatch observables DeltaS_realist(m) and DeltaS_anti(m) also take values in [0, 1].
  - Encodings must respect these bounds for all m in their domain.
- Monotonicity
  - DeltaS_realist(m) is nondecreasing in R_commit(m) and in EE_spread(m) when I_invariant(m) is held fixed and low.
  - DeltaS_anti(m) is nondecreasing in E_adequacy(m) and in I_invariant(m) when R_commit(m) is held fixed and low.
- Nondegeneracy
  - There exist admissible states m_high_realist and m_high_anti in M such that:
    DeltaS_realist(m_high_realist) > 0
    DeltaS_anti(m_high_anti) > 0
  - Neither stance is trivially free of mismatch across all states.
- No post hoc adjustment
  - Once an encoding within the admissible class is fixed for a given experiment or application, it must be held fixed across all states and cases in that experiment.
  - It is not permitted to alter the functional forms or internal parameters of DeltaS_realist or DeltaS_anti after inspecting the outputs on specific cases.
- Refinement stability
  We consider refinement sequences of the form:
  refine(k), k = 0, 1, 2, ...
  where:
  - refine(k) enlarges or sharpens the case library, improves assignments of observables, or adds more detailed structure to M_reg.
  - The encoding functions and their parameters are fixed once at k = 0 and remain unchanged for all k.
  An encoding is considered refinement stable for Q117 if:
  - bands for DeltaS_realist(m) and DeltaS_anti(m) do not flip arbitrarily under small and reasonable changes introduced by refine(k),
  - patterns such as realism being systematically lower tension in mature stable theory states do not disappear or reverse under minor refinement.
These constraints are designed so that any low tension result is a substantive property of the stance and the configuration, rather than a consequence of arbitrary parameter choices or after the fact adjustments.
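The first three constraints can be spot checked numerically. The sketch below is a hypothetical helper, not part of the TU specification: it evaluates a candidate encoding on a coarse grid and reports whether bounds, monotonicity and nondegeneracy hold there. The grid resolution, the threshold `low = 0.25` used for "held fixed and low", and the sample encoding are assumptions. A grid check can expose violations but cannot prove admissibility.

```python
from itertools import product

GRID = [k / 4 for k in range(5)]  # 0.0, 0.25, 0.5, 0.75, 1.0


def step_up(x: float) -> float:
    """Next grid value above x, or x itself at the top of the grid."""
    return GRID[min(GRID.index(x) + 1, len(GRID) - 1)]


def check_admissible(ds_realist, ds_anti, low: float = 0.25, tol: float = 1e-12) -> dict:
    """Check bounds, monotonicity and nondegeneracy on the grid.

    ds_realist(r, i, ee) and ds_anti(r, e, i) are candidate mismatch functions.
    """
    report = {"bounds": True, "monotone": True,
              "realist_positive": False, "anti_positive": False}
    for r, e, i, ee in product(GRID, repeat=4):
        dr, da = ds_realist(r, i, ee), ds_anti(r, e, i)
        if not (0.0 <= dr <= 1.0 and 0.0 <= da <= 1.0):
            report["bounds"] = False
        if i <= low:  # I_invariant held fixed and low
            if (ds_realist(step_up(r), i, ee) < dr - tol
                    or ds_realist(r, i, step_up(ee)) < dr - tol):
                report["monotone"] = False
        if r <= low:  # R_commit held fixed and low
            if (ds_anti(r, step_up(e), i) < da - tol
                    or ds_anti(r, e, step_up(i)) < da - tol):
                report["monotone"] = False
        report["realist_positive"] |= dr > 0.0
        report["anti_positive"] |= da > 0.0
    return report


# Sample encoding (an assumption, used only to exercise the checker).
def sample_realist(r, i, ee):
    return r * max(1.0 - i, ee)


def sample_anti(r, e, i):
    return (1.0 - r) * e * i


result = check_admissible(sample_realist, sample_anti)
print(result)
```

The no post hoc adjustment and refinement stability constraints concern how an encoding is used over time, so they are procedural and are not covered by this check.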
3.5 Singular set and domain restriction
Some states may fail to support coherent evaluation of the observables. We define the singular set:
S_sing = { m in M :
R_commit(m) is undefined
or E_adequacy(m) is undefined
or I_invariant(m) is undefined
or EE_spread(m) is undefined }
We restrict Q117 analysis to the regular domain:
M_reg = M \ S_sing
Rules:
- All tension related quantities DeltaS_realist(m) and DeltaS_anti(m) are only evaluated on M_reg.
- States in S_sing are treated as out of domain for this problem, not as evidence in favor of any stance.
- Experiments that attempt to evaluate tension on S_sing are considered to have encountered an encoding breakdown, not a metaphysical result.
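A minimal sketch of the domain restriction, assuming a state is summarized by its four observables and that an undefined observable is represented by `None`. The `State` class and both helper names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class State:
    """Effective layer summary of a state m; None marks an undefined observable."""
    r_commit: Optional[float]
    e_adequacy: Optional[float]
    i_invariant: Optional[float]
    ee_spread: Optional[float]


def in_s_sing(m: State) -> bool:
    """True when at least one of the four observables is undefined."""
    return any(getattr(m, f.name) is None for f in fields(m))


def split_domain(states: List[State]) -> Tuple[List[State], List[State]]:
    """Partition states into (M_reg, S_sing) without scoring the singular ones."""
    regular = [m for m in states if not in_s_sing(m)]
    singular = [m for m in states if in_s_sing(m)]
    return regular, singular


# Placeholder states: the second has no defined invariance score.
states = [
    State(0.9, 0.95, 0.90, 0.20),
    State(0.5, 0.80, None, 0.40),
]
m_reg, s_sing = split_domain(states)
print(len(m_reg), len(s_sing))
```

Keeping singular states in a separate list, rather than assigning them a default score, matches the rule that they count as an encoding breakdown and not as evidence for any stance.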
4. Tension principle for this problem
This block explains how Q117 is treated as a tension problem within TU, at the effective layer.
4.1 Core tension functionals
We define two nonnegative tension functionals for each state m in M_reg:
T_realist(m) in [0, 1]
T_anti(m) in [0, 1]
For Q117 we choose the simplest normalization that aligns directly with the TU tension scale:
T_realist(m) = DeltaS_realist(m)
T_anti(m) = DeltaS_anti(m)
These functionals represent the overall consistency_tension incurred by adopting a realist or anti realist stance in state m.
They are already normalized to [0, 1] and are read as tension bands according to the TU Tension Scale Charter, for example:
- values near 0 fall into low tension bands,
- intermediate values fall into medium bands,
- values near 1 fall into high tension bands.
Alternative monotone rescalings that preserve the interval [0, 1] are allowed inside the admissible encoding class but must be fixed before any experiment and cannot be adjusted after seeing outputs.
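Reading a score as a band can be sketched as a simple threshold lookup. The actual band boundaries belong to the TU Tension Scale Charter and are not given in this entry, so the cut points 1/3 and 2/3 below are placeholders, as is the function name `band`.

```python
def band(t: float, low_cut: float = 1 / 3, high_cut: float = 2 / 3) -> str:
    """Map a tension score in [0, 1] to a coarse band label.

    The cut points are assumed values, to be fixed before any experiment.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError(f"tension score {t} is outside [0, 1]")
    if t < low_cut:
        return "low"
    if t < high_cut:
        return "medium"
    return "high"


for score in (0.18, 0.50, 0.77):
    print(score, band(score))
```

Whatever cut points are used, they count as encoding parameters and fall under the same rule as rescalings: chosen once, before outputs are inspected.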
4.2 Realism as a low tension principle
Within the TU encoding for Q117, scientific realism is favored as a low tension principle if the following pattern holds across a wide range of states in M_reg:
- For states encoding mature, empirically successful and structurally stable theory portfolios, there exist encodings in the admissible class such that:
  T_realist(m) is in a low tension band
  with corresponding small numerical values that remain small under refinement of the case library and observable assignments.
- For the same states, any attempt to maintain a strict anti realist stance leads to:
  T_anti(m) stays in a medium or high tension band
  reflecting the difficulty of accounting for explanatory depth and cross theory invariance without ontological commitment.
In such worlds, realist stances are systematically lower tension and more robust under refinement.
4.3 Anti realism as a low tension principle
Conversely, scientific anti realism is favored as a low tension principle if:
- For states encoding portfolios with:
  - high empirical equivalence spread,
  - frequent and deep theory change,
  - limited cross theory invariance,
  there exist admissible encodings such that:
  T_anti(m) is in a low tension band
  and remains in low bands when case descriptions are refined.
- Realist stances in these states incur:
  T_realist(m) in medium or high tension bands
  reflecting the cost of committing to entities that cannot be stably tracked across an evolving theory landscape.
In such worlds, anti realist stances are systematically lower tension.
4.4 Mixed or selective stance benchmarks
The Q117 encoding also allows for mixed or selective stances in which:
- realist commitment is adopted for entities that are highly invariant and central to successful theories,
- anti realist caution is adopted for entities that appear only in frequent, unstable or empirically equivalent fragments.
These hybrid stances are evaluated by computing T_realist(m) and T_anti(m) on a component wise basis and aggregating their contributions into a single score, written T_selective(m). They serve as benchmarks to test whether a global pure stance is necessary or whether selective realism provides a strictly lower tension profile.
In a hybrid low tension regime we expect:
T_selective(m) in a lower band
T_realist(m) in a higher band for volatile components
T_anti(m) in a higher band for robust structural cores
for representative states in M_reg.
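The component wise aggregation can be illustrated with a toy portfolio. The entry does not say how contributions are combined, so everything here is an assumption: each component is scored by the larger of its two mismatches at that component's own commitment level, components are averaged with equal weight, and the mismatch forms and observable values are placeholders.

```python
def ds_realist(r, i, ee):
    # Hypothetical realist mismatch: commitment times the worse warning sign.
    return r * max(1.0 - i, ee)


def ds_anti(r, e, i):
    # Hypothetical anti realist mismatch: lack of commitment times success and stability.
    return (1.0 - r) * e * i


def aggregate(components, commitments):
    """Mean over components of the larger mismatch at that component's R_commit."""
    scores = [
        max(ds_realist(r, c["i"], c["ee"]), ds_anti(r, c["e"], c["i"]))
        for c, r in zip(components, commitments)
    ]
    return sum(scores) / len(scores)


# Placeholder portfolio: one stable core, one volatile peripheral model.
components = [
    {"name": "structural core", "e": 0.95, "i": 0.90, "ee": 0.10},
    {"name": "peripheral model", "e": 0.70, "i": 0.20, "ee": 0.80},
]

t_realist = aggregate(components, [1.0, 1.0])    # uniform realism
t_anti = aggregate(components, [0.0, 0.0])       # uniform anti realism
t_selective = aggregate(components, [0.9, 0.1])  # commitment follows invariance

print(t_realist, t_anti, t_selective)
```

With these placeholder numbers the selective assignment scores below both uniform stances, which is the hybrid low tension regime described above. A different aggregation rule or different inputs could change that ordering.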
5. Counterfactual tension worlds
This block describes counterfactual worlds at the effective layer. It does not construct internal TU fields from raw data. It only specifies patterns of observables and tension functionals.
We consider three illustrative worlds:
- World R: realism favored, anti realism disfavored.
- World A: anti realism favored, realism disfavored.
- World H: a hybrid stance favored over pure positions.
5.1 World R (scientific realism as global low tension stance)
In World R:
- Theory change trajectories
  - Across the history of science encoded in M_reg, major transitions, such as Newtonian mechanics to relativistic mechanics or classical to quantum mechanics, exhibit:
    I_invariant(m) high
    EE_spread(m) moderate or low
    for states representing mature stages of the theories.
- Explanatory depth
  - Explanations in mature theories achieve wide unification and counterfactual support that is naturally captured by moderate to high R_commit(m).
- Tension pattern
  - For states representing mature, well confirmed theories:
    T_realist(m) stays in low tension bands
    T_anti(m) tends to occupy medium or high tension bands
    because strict anti realism must treat robust structures as mere instruments, generating high DeltaS_anti(m).
- Stability under refinement
  - As encodings are refined by adding more detailed theory change data or better measures of invariance, the inequalities between T_realist and T_anti persist rather than being inverted by small adjustments.
5.2 World A (scientific anti realism as global low tension stance)
In World A:
- Theory and model proliferation
  - For many domains, there exist distinct theories with overlapping empirical consequences, so that:
    EE_spread(m) high
    I_invariant(m) low or moderate
    for key states in M_reg.
- Frequent deep revisions
  - Theory change is frequent and often disruptive enough that attempts to track theoretical entities across changes are fragile and heavily interpretation dependent.
- Tension pattern
  - For these states, pure anti realist stances with low R_commit(m) yield:
    T_anti(m) in low tension bands
    while realist stances incur:
    T_realist(m) in medium or high tension bands
    because they attach ontological weight to entities that are not stably supported by the structure of theory change.
- Stability under refinement
  - Refinements that better represent the volatility and empirical equivalence do not erase the tension difference. They reinforce the advantage of anti realist stances within this encoding.
5.3 World H (hybrid selective realism as low tension stance)
In World H:
- Structural cores and peripheral models
  - Some parts of scientific practice exhibit high invariance and explanatory depth, while others are more opportunistic or domain limited.
- Stance pattern
  - A selective stance is adopted:
    - high R_commit(m) for structural cores with high I_invariant(m) and strong E_adequacy(m),
    - low R_commit(m) for peripheral reports with high EE_spread(m) and low invariance.
- Tension outcome
  - When tension is aggregated component wise:
    T_selective(m) in a lower band
    T_realist(m) in a higher band when applied uniformly
    T_anti(m) in a higher band when applied uniformly
    for representative states in M_reg. Pure realism over commits in volatile areas. Pure anti realism under acknowledges robust structural cores.
- Role in Q117
  - World H serves as a benchmark to test whether Q117 points toward a single pure stance or toward context sensitive or selective realism as a lower tension equilibrium.
6. Falsifiability and discriminating experiments
This block specifies experiments that can falsify or support particular Q117 encodings at the effective layer. They do not prove or disprove any metaphysical thesis. They only test whether the observables and tension functionals behave in a stable and discriminating way within the TU framework.
Experiment 1: Historical theory change tension scan
Goal:
Test whether a given Q117 encoding of R_commit, E_adequacy, I_invariant, EE_spread, DeltaS_realist and DeltaS_anti can produce stable and discriminating tension patterns across canonical episodes of theory change.
Setup:
- Select at least three historical case studies, for example:
  - phlogiston theory to oxygen based chemistry,
  - Newtonian mechanics to special and general relativity,
  - classical to quantum mechanics.
- For each case, construct a finite sequence of states:
  m_1, m_2, ..., m_k in M_reg
  representing key stages of the episode, such as early theory, mature theory, transitional theory and replacement.
- Fix a specific encoding from the admissible class and a specific choice of any rescaling parameters, if present, before any evaluation.
Protocol:
- For each state m_j in each episode, assign approximate values for:
  R_commit(m_j)
  E_adequacy(m_j)
  I_invariant(m_j)
  EE_spread(m_j)
  according to historical and philosophical scholarship.
- Compute:
  DeltaS_realist(m_j)
  DeltaS_anti(m_j)
  T_realist(m_j)
  T_anti(m_j)
  using the chosen encoding.
- For each episode, construct summary statistics such as:
  Avg_T_realist_mature
  Avg_T_anti_mature
  Avg_T_realist_transitional
  Avg_T_anti_transitional
- Map these averages into TU tension bands and compare tension patterns across episodes to see whether they consistently favor a particular stance or reveal context dependencies.
Metrics:
- For each episode:
  - average tension values and corresponding bands for realist and anti realist stances in mature stages,
  - average tension values and bands in transitional stages.
- Global measures:
  - fraction of episodes where realism yields lower band tension in mature stages,
  - fraction where anti realism yields lower band tension,
  - sensitivity of these fractions to small perturbations in observable assignments that respect the admissible class.
Falsification conditions:
- If small, reasonable perturbations in the assignments of R_commit, E_adequacy, I_invariant and EE_spread cause arbitrary flips in which stance appears in a lower tension band for mature stages, the encoding is considered unstable and rejected.
- If, across all episodes and reasonable perturbations, T_realist and T_anti remain nearly identical with no consistent band structure, the encoding is considered non informative and rejected.
- If the encoding class can be tuned after seeing the data to make either stance appear low tension at will, without constraints from admissibility conditions, the implementation is considered to violate the fairness constraints in Block 3.4 and is rejected.
Semantics implementation note:
All observables and tension functionals are treated at an abstract hybrid level where both discrete episode labels and continuous scores in [0, 1] coexist. No claim is made about underlying mathematical structure beyond the constraints in Block 3.
Boundary note: Falsifying a TU encoding of Q117 does not solve the canonical realism vs anti realism debate. Even if one encoding consistently yields lower tension bands for a given stance across these historical episodes, this does not prove that the stance is metaphysically correct. It only shows that, under the chosen encoding and case library, that stance is a lower tension effective layer description.
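The perturbation check behind the first falsification condition can be sketched as a flip rate: how often the lower tension stance changes when the observable assignments are jittered slightly. The tension forms, the stance commitment levels 0.9 and 0.1, the jitter size `eps` and all observable values are assumptions for illustration. None of them come from historical scholarship.

```python
import random


def clamp01(x):
    return max(0.0, min(1.0, x))


def t_realist(r, i, ee):
    # Hypothetical encoding, fixed before any evaluation.
    return clamp01(r * max(1.0 - i, ee))


def t_anti(r, e, i):
    return clamp01((1.0 - r) * e * i)


def favored(e, i, ee, r_realist=0.9, r_anti=0.1):
    """Return which stance has the lower tension for one state."""
    return "realist" if t_realist(r_realist, i, ee) < t_anti(r_anti, e, i) else "anti"


def flip_rate(e, i, ee, eps=0.05, trials=1000, seed=0):
    """Fraction of jittered assignments whose favored stance differs from the baseline."""
    rng = random.Random(seed)

    def jitter(x):
        return clamp01(x + rng.uniform(-eps, eps))

    base = favored(e, i, ee)
    flips = 0
    for _ in range(trials):
        perturbed = favored(jitter(e), jitter(i), jitter(ee), jitter(0.9), jitter(0.1))
        flips += perturbed != base
    return flips / trials


# Placeholder assignments: a clearly mature state and a borderline one.
print(flip_rate(0.95, 0.90, 0.20))
print(flip_rate(0.98, 0.50, 0.30))
```

A flip rate near zero for mature stages is what a stable encoding should show. A high flip rate on mature stages would count against the encoding under the first condition, while flips on genuinely borderline states are less informative.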
Experiment 2: AI stance switch stability test
Goal: Evaluate whether Q117 observables and tension functionals can be used as an evaluation harness for AI systems instructed to adopt realist vs anti realist stances about the same cases.
Setup:
- Prepare a set of scientific case descriptions, for example classic textbook examples or simplified historical episodes.
- Use an AI system to generate pairs of explanations for each case:
  - one explanation under a “scientific realist” instruction,
  - one explanation under an “anti realist or constructive empiricist” instruction.
- For each generated explanation, construct an approximate state in M_reg that captures its stance and structural content, without specifying how this mapping is implemented at the TU level.
Protocol:
- For each explanation, assign approximate values of:
  R_commit(m)
  E_adequacy(m)
  I_invariant(m)
  EE_spread(m)
  guided by the explicit language of the explanation and its treatment of entities and theory change.
- Compute the corresponding:
  DeltaS_realist(m)
  DeltaS_anti(m)
  T_realist(m)
  T_anti(m)
- For each case, compare:
  - tension values and bands for the realist explanation vs the anti realist explanation,
  - consistency of these differences across cases.
- Optionally, repeat with different AI models or with the same model under different training conditions.
Metrics:
- For each case:
  - whether T_realist for the realist explanation falls in a lower band than T_anti for the anti realist explanation,
  - magnitude of the difference between stance specific tensions.
- Across cases:
  - fraction of cases consistent with a World R pattern,
  - fraction consistent with a World A pattern,
  - variability across models and prompts.
Falsification conditions:
- If the encoding cannot systematically distinguish realist leaning from anti realist leaning explanations and produces nearly identical band profiles across all cases and stances, the encoding is considered non discriminating and rejected.
- If small changes in the evaluation scheme, within the admissible class, lead to arbitrary reversals in which stance looks lower tension for the same explanation, the encoding is considered unstable and rejected.
- If the evaluation protocol can be tuned after seeing model outputs to make any chosen stance appear favorable, it is considered to violate the fairness constraints in Block 3.4 and is rejected.
Semantics implementation note: The AI system and its explanations are treated as generating hybrid encodings where discrete stance labels and continuous observables coexist. The Q117 encoding only requires that these observables be well defined at the effective layer.
Boundary note: This experiment evaluates the usefulness of Q117 as an AI assessment module. Even if one stance tends to occupy lower tension bands for many tasks under a particular encoding, this does not settle the philosophical correctness of scientific realism or anti realism. It only constrains how those stances behave as effective layer descriptions in the TU framework.
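The across case fractions for this experiment can be sketched as a small tally. Each case is reduced to a pair: T_realist for the realist explanation and T_anti for the anti realist explanation. The band cut points, the rule used to label a pair as a World R or World A pattern, and the scores themselves are assumptions for illustration.

```python
from collections import Counter


def band(t, low_cut=1 / 3, high_cut=2 / 3):
    """Coarse band label; cut points are assumed, not taken from the charter."""
    if t < low_cut:
        return "low"
    return "medium" if t < high_cut else "high"


def world_pattern(t_realist, t_anti):
    """Label one case by which stance, if either, sits alone in the low band."""
    b_realist, b_anti = band(t_realist), band(t_anti)
    if b_realist == "low" and b_anti != "low":
        return "World R"
    if b_anti == "low" and b_realist != "low":
        return "World A"
    return "neither"


def pattern_fractions(cases):
    counts = Counter(world_pattern(tr, ta) for tr, ta in cases)
    return {label: counts[label] / len(cases) for label in ("World R", "World A", "neither")}


# Placeholder (T_realist, T_anti) pairs, one per case description.
cases = [(0.18, 0.77), (0.10, 0.60), (0.70, 0.15), (0.40, 0.45)]
fractions = pattern_fractions(cases)
print(fractions)
```

A large "neither" fraction would point toward the non discriminating outcome in the falsification conditions, since the encoding would then fail to separate the two stances by band.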
7. AI and WFGY engineering spec
This block describes how Q117 can be used as an engineering module for AI systems within the WFGY framework, at the effective layer.
7.1 Training signals
We define several training signals that can be used as auxiliary objectives.
- signal_realist_commitment_consistency
  - Definition: a penalty proportional to a function of DeltaS_realist(m) when the model produces strongly realist language in contexts where EE_spread(m) is high and I_invariant(m) is low.
  - Purpose: encourage the model to either reduce realist language in such contexts or explicitly acknowledge uncertainty, thereby lowering realist mismatch and keeping tension in lower bands where appropriate.
- signal_anti_realist_explanatory_loss
  - Definition: a penalty proportional to DeltaS_anti(m) in contexts where the model uses strongly anti realist language but relies on deep and structured explanations that implicitly treat entities or structures as robust.
  - Purpose: encourage the model to either accept some realist commitment where structural robustness is high or explicitly mark its explanations as purely instrumental.
- signal_stance_separation
  - Definition: a signal that measures overlap between internal representations used for realist and anti realist answers to matched questions and penalizes excessive overlap.
  - Purpose: prevent the model from collapsing distinct stances into a single undifferentiated pattern, improving clarity and controllability.
- signal_theory_change_sensitivity
  - Definition: a signal that measures whether the model stance indicators change when prompted with theory change scenarios encoded as sequences of states.
  - Purpose: ensure that the model does not maintain fixed stance markers regardless of how theory change affects invariance and empirical equivalence.
7.2 Architectural patterns
We outline module patterns that can reuse Q117 structures.
-
StanceHead_Q117
-
Role: a head that maps internal representations of scientific discourse into approximate values of R_commit(m), E_adequacy(m), I_invariant(m) and EE_spread(m).
-
Interface:
- Inputs: hidden states for a given passage.
- Outputs: a small vector of stance observables in [0, 1] and a stance label prediction, for example realist, anti realist, selective or indeterminate.
-
-
TensionEvaluator_Q117
-
Role: a module that takes the outputs of StanceHead_Q117 and produces DeltaS_realist(m), DeltaS_anti(m), T_realist(m) and T_anti(m) as auxiliary signals.
-
Interface:
- Inputs: stance observables and the current stance label.
- Outputs: normalized tension scores in [0, 1] for each stance, interpreted using the TU tension scale.
-
-
TheoryChangeTracker_Q117
-
Role: a recurrent or sequence module that processes sequences of theory related contexts and computes approximate I_invariant(m) and EE_spread(m) values for each stage.
-
Interface:
- Inputs: ordered lists of model internal states for different theory descriptions.
- Outputs: dynamic invariance and equivalence profiles.
-
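The StanceHead_Q117 and TensionEvaluator_Q117 interfaces can be sketched as follows. This is a dependency free illustration under stated assumptions: the linear probe with mean pooling, the label thresholds and the two mismatch formulas are hypothetical placeholders. The actual definitions of DeltaS_realist, DeltaS_anti, T_realist and T_anti belong to Block 3 and are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class StanceObservables:
    r_commit: float
    e_adequacy: float
    i_invariant: float
    ee_spread: float


def _sigmoid(x: float) -> float:
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class StanceHead:
    """Hypothetical linear probe from hidden states to four observables."""

    def __init__(self, weights: Sequence[Sequence[float]], bias: Sequence[float]):
        if len(weights) != 4 or len(bias) != 4:
            raise ValueError("expected one weight row and one bias per observable")
        self.weights = [list(row) for row in weights]
        self.bias = list(bias)

    def __call__(
        self, hidden_states: Sequence[Sequence[float]]
    ) -> Tuple[StanceObservables, str]:
        if not hidden_states:
            raise ValueError("need at least one hidden state")
        dim = len(hidden_states[0])
        # Mean-pool the passage, then squash each logit into [0, 1].
        pooled = [
            sum(h[i] for h in hidden_states) / len(hidden_states) for i in range(dim)
        ]
        values = [
            _sigmoid(sum(w * x for w, x in zip(row, pooled)) + b)
            for row, b in zip(self.weights, self.bias)
        ]
        obs = StanceObservables(*values)
        return obs, self._label(obs)

    @staticmethod
    def _label(obs: StanceObservables) -> str:
        # Placeholder thresholds, not taken from the source.
        if obs.r_commit >= 0.66:
            return "realist"
        if obs.r_commit <= 0.33:
            return "anti realist"
        return "selective" if obs.i_invariant >= 0.5 else "indeterminate"


def evaluate_mismatch(obs: StanceObservables) -> Dict[str, float]:
    """Placeholder mismatch scores; both products stay in [0, 1]."""
    return {
        "DeltaS_realist": obs.r_commit * obs.ee_spread * (1.0 - obs.i_invariant),
        "DeltaS_anti": (1.0 - obs.r_commit) * obs.e_adequacy * obs.i_invariant,
    }
```

The placeholder formulas only mirror the qualitative directions stated in 7.1: realist mismatch grows with EE_spread and shrinks with I_invariant, while anti realist mismatch grows when adequacy and invariance are both high.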
7.3 Evaluation harness
We propose an evaluation harness that uses Q117 components.
-
Benchmark design
-
Collect tasks that involve:
- explaining specific scientific theories,
- comparing competing theories with overlapping empirical coverage,
- describing theory change episodes.
-
-
Conditions
- Baseline model: no explicit Q117 modules or signals.
- TU augmented model: includes StanceHead_Q117, TensionEvaluator_Q117 and associated training signals.
-
Metrics
- Stance clarity: agreement between human labels of stance and the model stance predictions.
- Stance consistency: stability of stance across logically similar prompts.
- Tension stability: robustness of tension scores and bands under minor rephrasing of prompts.
-
Comparison
- Compare baseline and TU augmented models on these metrics. Improvements in clarity, consistency and stability without excessive rigidity would indicate effective use of Q117 components.
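The three metrics above can be made concrete in several ways. The sketch below shows one simple option, assuming stance labels are plain strings and tension scores are floats in [0, 1]. The specific formulas (exact match agreement, majority share per prompt group, one minus mean score range) are illustrative choices, not definitions taken from the Q117 specification.

```python
from collections import Counter
from typing import Dict, List, Sequence


def stance_clarity(human_labels: Sequence[str], model_labels: Sequence[str]) -> float:
    """Fraction of items where the model stance label matches the human label."""
    if not human_labels or len(human_labels) != len(model_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(human_labels, model_labels))
    return matches / len(human_labels)


def stance_consistency(groups: Dict[str, List[str]]) -> float:
    """Mean share of the majority stance within each group of similar prompts."""
    if not groups:
        raise ValueError("need at least one prompt group")
    shares = []
    for group_id, labels in groups.items():
        if not labels:
            raise ValueError(f"group {group_id!r} has no labels")
        majority_count = Counter(labels).most_common(1)[0][1]
        shares.append(majority_count / len(labels))
    return sum(shares) / len(shares)


def tension_stability(paraphrase_scores: Dict[str, List[float]]) -> float:
    """One minus the mean range of tension scores across paraphrases."""
    if not paraphrase_scores:
        raise ValueError("need at least one prompt group")
    ranges = []
    for group_id, scores in paraphrase_scores.items():
        if not scores:
            raise ValueError(f"group {group_id!r} has no scores")
        ranges.append(max(scores) - min(scores))
    return 1.0 - sum(ranges) / len(ranges)
```

Running the same three functions on the baseline and the TU augmented model gives directly comparable numbers. A value of 1.0 on all three would mean perfect agreement, but could also signal the excessive rigidity mentioned above, so the scores should be read alongside the raw answers.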
7.4 60 second reproduction protocol
A minimal protocol for external users to experience Q117 style encoding in an AI system.
-
Baseline setup
-
Prompt the AI: “Explain the debate between scientific realism and anti realism in science. Give arguments on both sides.”
-
Record the answer. Typical issues:
- the explanation may blur distinctions between stances,
- it may fail to connect stance differences to patterns of theory change or empirical equivalence.
-
-
TU encoded setup
-
Prompt the same AI with Q117 modules enabled: “Explain the debate between scientific realism and anti realism in science. Make explicit:
- how each stance treats theoretical entities as real or instrumental,
- how theory change and empirical equivalence affect each stance,
- where each stance incurs tension with empirical success or with underdetermination.”
-
Record the answer and any exposed tension indicators.
-
-
Comparison metric
-
Human evaluators rate:
- clarity of stance description,
- explicitness of tradeoffs between realist risks and anti realist costs,
- use of theory change and empirical equivalence examples.
-
-
What to log
- Prompts, responses, stance predictions and tension scores. This allows later inspection of how Q117 components shaped the behavior, without exposing any deep TU generative rules.
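One possible shape for the log is a JSON Lines file with one record per run. The sketch below is an assumption about format only: the RunRecord field names and the condition values "baseline" and "tu_encoded" are hypothetical, chosen to match the items listed above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class RunRecord:
    condition: str                      # assumed values: "baseline" or "tu_encoded"
    prompt: str
    response: str
    stance_prediction: Optional[str] = None   # None when no stance head is attached
    tension_scores: Dict[str, float] = field(default_factory=dict)


def to_jsonl_line(record: RunRecord) -> str:
    """Serialize one run as a single JSON line after checking score bounds."""
    for name, score in record.tension_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"tension score {name!r} must lie in [0, 1]")
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Each line stores only effective layer quantities (prompts, responses, labels and scores), which is consistent with the requirement not to expose deeper generative rules.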
8. Cross problem transfer template
This block describes reusable components produced by Q117 and their transfer to other problems.
8.1 Reusable components produced by this problem
-
ComponentName: RealismCommitmentIndex
-
Type: observable
-
Minimal interface:
Input: context describing theories and their use
Output: r in [0, 1] representing realist commitment
-
Preconditions:
- the context describes at least one theory and its intended interpretation,
- it is possible to distinguish talk about observables and theoretical entities.
-
-
ComponentName: EmpiricalEquivalenceProfile
-
Type: observable
-
Minimal interface:
Input: set of theories or models with shared domain
Output: e in [0, 1] representing degree of empirical equivalence
-
Preconditions:
- there are at least two distinct theories with overlapping domains,
- some information about their comparative empirical performance is available.
-
-
ComponentName: StanceTensionFunctional_Q117
-
Type: functional
-
Minimal interface:
Inputs: R_commit, E_adequacy, I_invariant, EE_spread
Outputs: T_realist, T_anti
-
Preconditions:
- observables obey bounds and monotonicity constraints in Block 3.4,
- the stance being evaluated is clearly specified.
-
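As an illustration of a minimal interface with explicit precondition checks, the sketch below gives one hypothetical realization of EmpiricalEquivalenceProfile. It assumes each theory is represented as a mapping from observable names to predicted values, and it scores equivalence as mean pairwise agreement on shared observables within a tolerance. Both the representation and the scoring rule are assumptions made for this example.

```python
from itertools import combinations
from typing import Mapping


def empirical_equivalence(
    predictions: Mapping[str, Mapping[str, float]],
    tolerance: float = 1e-6,
) -> float:
    """Return e in [0, 1]: mean pairwise agreement on shared observables.

    predictions maps theory name -> {observable name -> predicted value}.
    """
    # Precondition: at least two distinct theories.
    if len(predictions) < 2:
        raise ValueError("need at least two distinct theories")

    pair_scores = []
    for (_, a), (_, b) in combinations(predictions.items(), 2):
        shared = a.keys() & b.keys()
        # Precondition: the theories must have overlapping domains.
        if not shared:
            raise ValueError("theories must have overlapping domains")
        agree = sum(abs(a[k] - b[k]) <= tolerance for k in shared)
        pair_scores.append(agree / len(shared))
    return sum(pair_scores) / len(pair_scores)
```

For example, two theories that agree on one of two shared observables receive e = 0.5. Observables predicted by only one theory are ignored rather than counted as disagreement, which is a design choice that a real encoding would need to justify.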
8.2 Direct reuse targets
-
Q114 (status of moral facts)
-
Reused components: RealismCommitmentIndex, StanceTensionFunctional_Q117.
-
Why it transfers:
- moral realism vs non cognitivism or expressivism can be encoded with an analogous commitment index and mismatch functionals.
-
What changes:
- E_adequacy measures coherence with moral practice, patterns of judgment and interpersonal justification rather than empirical data,
- I_invariant refers to stability of moral judgments across reflection and cultural change.
-
-
Q116 (foundations of mathematics)
-
Reused components: RealismCommitmentIndex, EmpiricalEquivalenceProfile, StanceTensionFunctional_Q117.
-
Why it transfers:
- debates about set theoretic realism, structuralism and nominalism mirror scientific realism debates.
-
What changes:
- E_adequacy becomes adequacy for mathematical practice, proofs and problem solving,
- EE_spread becomes degree of underdetermination of mathematical ontology by mathematical practice.
-
-
Q119 (meaning of probability)
-
Reused components: RealismCommitmentIndex, EmpiricalEquivalenceProfile.
-
Why it transfers:
- realist, subjectivist and pragmatist views of probability differ in how they treat probabilities as real properties or instruments.
-
What changes:
- stance labels and observables are adapted to probabilistic contexts, for example frequencies vs credences.
-
-
Q123 (scalable interpretability in AI)
-
Reused components: RealismCommitmentIndex, StanceTensionFunctional_Q117.
-
Why it transfers:
- the question of whether internal features and circuits in AI systems are real mechanisms or convenient descriptions is structurally parallel to Q117.
-
What changes:
- I_invariant measures stability of extracted features across training runs and models,
- E_adequacy measures success in prediction, control or safety tasks.
-
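The reuse targets above can also be recorded as data, so that tooling can query which problems depend on a given component. The TransferSpec structure and the targets_reusing helper below are hypothetical. The component names, target IDs and reinterpretations are copied from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TransferSpec:
    target: str
    reused: Tuple[str, ...]
    # Observable name -> how it is reinterpreted in the target problem.
    reinterpretation: Dict[str, str] = field(default_factory=dict)


TRANSFERS: Tuple[TransferSpec, ...] = (
    TransferSpec(
        "Q114",
        ("RealismCommitmentIndex", "StanceTensionFunctional_Q117"),
        {
            "E_adequacy": "coherence with moral practice and justification",
            "I_invariant": "stability of moral judgments across reflection",
        },
    ),
    TransferSpec(
        "Q116",
        (
            "RealismCommitmentIndex",
            "EmpiricalEquivalenceProfile",
            "StanceTensionFunctional_Q117",
        ),
        {
            "E_adequacy": "adequacy for mathematical practice",
            "EE_spread": "underdetermination of mathematical ontology",
        },
    ),
    TransferSpec(
        "Q119",
        ("RealismCommitmentIndex", "EmpiricalEquivalenceProfile"),
    ),
    TransferSpec(
        "Q123",
        ("RealismCommitmentIndex", "StanceTensionFunctional_Q117"),
        {
            "I_invariant": "stability of extracted features across runs",
            "E_adequacy": "success in prediction, control or safety tasks",
        },
    ),
)


def targets_reusing(component: str) -> List[str]:
    """List target problem IDs that reuse the named component."""
    return [spec.target for spec in TRANSFERS if component in spec.reused]
```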
9. TU roadmap and verification levels
This block explains how Q117 fits into the TU verification ladder and what the next measurable steps are.
9.1 Current levels
-
E_level: E1
-
An effective layer encoding has been specified with:
- a clear state space,
- defined observables,
- normalized tension functionals in [0, 1],
- at least two experiments with falsification conditions.
-
No implementation level or large scale empirical program has yet been completed.
-
-
N_level: N1
- The narrative of the realism vs anti realism debate has been linked to the TU encoding in a coherent way.
- Counterfactual worlds have been described qualitatively but not instantiated in detailed case libraries or public code.
9.2 Next measurable step toward E2
To move Q117 from E1 to E2, it is sufficient to complete at least one of the following:
-
Implement a case study library
- Construct a small but concrete library of historical episodes encoded as sequences of states in M_reg.
- Evaluate T_realist and T_anti under one fixed admissible encoding and publish the tension profiles, bands and code.
-
Build an AI evaluation harness
- Implement the stance switch experiment with one or more AI models.
- Publicly release the prompts, explanations, approximate observable assignments, tension scores and band assignments.
In both cases, the key requirement is that:
- observable assignments and encodings are specified in enough detail to be checked and critiqued,
- the fairness constraints on encodings are respected.
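A case study library of this kind could start from a very small core, sketched below. It assumes a state is a mapping from observable names to values in [0, 1] and that the tension functional is passed in as a function. The Encoding class, its band edges at 1/3 and 2/3, and the tension_profile helper are hypothetical. The actual band boundaries are defined by the TU Tension Scale Charter, not by this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

State = Dict[str, float]  # observable name -> value in [0, 1]


@dataclass(frozen=True)
class Encoding:
    """Frozen so it cannot be adjusted after seeing particular cases."""

    name: str
    low_max: float = 1.0 / 3.0      # hypothetical band edge
    medium_max: float = 2.0 / 3.0   # hypothetical band edge

    def band(self, score: float) -> str:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score must lie in [0, 1], got {score}")
        if score <= self.low_max:
            return "low"
        if score <= self.medium_max:
            return "medium"
        return "high"


def tension_profile(
    episode: Sequence[State],
    tension_fn: Callable[[State], float],
    encoding: Encoding,
) -> List[Dict[str, object]]:
    """Score every state of one episode under a single fixed encoding."""
    profile: List[Dict[str, object]] = []
    for step, state in enumerate(episode):
        score = tension_fn(state)
        profile.append(
            {"step": step, "score": score, "band": encoding.band(score)}
        )
    return profile
```

Because Encoding is a frozen dataclass, any attempt to change its parameters mid experiment raises an error, which gives a lightweight mechanical check on the requirement that encodings remain fixed. Publishing the encoding instance together with the resulting profiles makes the observable assignments open to critique.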
9.3 Long term role in the TU program
In the long term, Q117 is expected to serve as:
-
A master template for encoding realism vs anti realism disputes across domains, supplying:
- stance observables,
- equivalence and invariance measures,
- stance specific tension functionals.
-
A bridge between philosophy of science and AI engineering, allowing:
- explicit control over stance taking behavior in AI systems,
- systematic evaluation of how different stances affect reasoning and explanation.
-
A diagnostic node in the BlackHole graph for checking whether other S class problems inadvertently assume realist or anti realist positions without making their stance explicit.
10. Elementary but precise explanation
This block gives an explanation of Q117 for non specialists while staying faithful to the effective layer encoding.
Scientists often talk about things we cannot see directly, such as electrons, fields, spacetime curvature and wave functions. When theories that use these ideas work very well, a natural question arises:
- Are these invisible things really out there in the world, or are they just useful stories that help us organize what we observe?
Scientific realism says that when a theory has been tested many times and works in many ways, it probably tells us something approximately true about how the world is, including its invisible parts. Theoretical entities are not just stories, they are part of reality.
Anti realism says that the job of a theory is to fit the observations. As long as the theory saves the phenomena, we do not need to believe that its invisible entities really exist. They can be treated as tools for calculations and predictions.
In the Tension Universe view, we do not try to declare one side simply correct. Instead we ask different questions:
- For a given piece of science, how much tension is created if we treat its theoretical entities as real?
- How much tension is created if we treat them only as instruments?
We look at things like:
- how well the theory fits the data,
- how stable its main ideas are when theories change or are refined,
- how many rival theories there are that fit the data equally well.
From this, we define numbers between 0 and 1 that summarize:
- how costly it is to be a realist in this context,
- how costly it is to be an anti realist in this context.
Values closer to 0 are low tension. Values closer to 1 are high tension.
Then we consider different possible worlds:
- worlds where realist stances usually have low tension, because the structures of science are very stable and deep,
- worlds where anti realist stances have lower tension, because there are many rival theories that fit the data equally well,
- worlds where a mixed or selective stance wins, being realist about some robust structures and anti realist about the rest.
Q117 does not decide once and for all which world we live in. Instead, it gives a way to:
- make the realism vs anti realism debate precise in terms of patterns of tension,
- design experiments and case studies that test whether a given encoding of these patterns is stable and informative,
- reuse the same tools in other debates, for example about mathematics, morality and AI.
In this way, Q117 acts as a structural map of the scientific realism problem inside the Tension Universe at the effective layer, rather than a final verdict about what is ultimately true.
Tension Universe effective layer footer
This page is part of the WFGY and Tension Universe S problem collection.
Scope of claims
- The goal of this document is to specify an effective layer encoding of the scientific realism vs anti realism problem.
- It does not claim to prove or disprove the canonical statement in Section 1.
- It does not introduce any new theorem beyond what is already established in the cited literature.
- It should not be cited as evidence that the corresponding open philosophical problem has been solved.
Effective layer boundary
- All objects used here, such as state spaces M, observables, invariants, tension scores and counterfactual worlds, live at the effective layer.
- This page does not define or assume any particular TU core axioms or generative mechanisms.
- No mapping from raw empirical or textual data into internal TU fields is specified. Only the existence of such mappings is assumed in an abstract sense.
Encoding and fairness
- All observables and tension scores are constrained to the interval [0, 1] and are interpreted using the TU tension scale.
- Admissible encodings must satisfy bounds, monotonicity and nondegeneracy conditions and must remain fixed throughout each experiment.
- Post hoc adjustment of encodings or parameters in response to particular cases is considered a violation of the TU Encoding and Fairness Charter.
Experiments and falsifiability
- The experiments in Section 6 provide ways to falsify or refine specific implementations of the Q117 encoding.
- Falsifying an encoding does not falsify the underlying philosophical positions. It only shows that a particular way of mapping those positions into the TU framework is unstable or uninformative.
- Likewise, success in these experiments does not prove that scientific realism or anti realism is metaphysically correct. It only identifies lower tension descriptions inside this effective layer modeling scheme.
Relation to TU charters
This page should be read together with the following charters, which define the global rules for TU effective layer work, encoding fairness and tension scale interpretation:
- TU Effective Layer Charter
- TU Encoding and Fairness Charter
- TU Tension Scale Charter
- TU Global Guardrails
Consistency note:
This entry has passed the internal formal-consistency and symbol-audit checks under the current WFGY 3.0 specification.
The structural layer is already self-consistent; any remaining issues are limited to notation or presentation refinement.
If you find a place where clarity can improve, feel free to open a PR or ping the community.
WFGY evolves through disciplined iteration, not ad-hoc patching.