50 KiB
Q118 · Limits of classical logic
0. Header metadata
ID: Q118
Code: BH_PHIL_REF_LOGIC_L3_118
Domain: Philosophy
Family: Logic and foundations
Rank: S
Projection_dominance: C
Field_type: cognitive_field
Tension_type: consistency_tension
Status: Open
Semantics: discrete
E_level: E1
N_level: N2
Last_updated: 2026-01-31
0. Effective layer disclaimer
All statements in this entry are made strictly at the effective layer of the Tension Universe (TU) framework.
Scope of claims
- The goal of this document is to specify an effective layer encoding of Q118, the problem of limits of classical logic.
- It does not claim to prove or disprove any canonical thesis in philosophy of logic, for example that classical logic is the one correct logic of rational consequence.
- It does not introduce any new theorem beyond what is already established in the cited literature.
- It should not be cited as evidence that the corresponding philosophical problem has been solved.
Effective layer boundary
- All objects introduced here, including the state space
M, scenario libraries, logic libraries, observables, and tension scores, live at the effective layer. - No deep TU axiom system, no PDE style field equation, and no constructive derivation of TU itself are specified here.
- No explicit mapping is given from real agents, texts, or experiments to internal TU fields. We only assume that such mappings exist in principle and can produce the effective summaries used in this page.
Encoding and fairness
- The encoding for Q118 follows the general constraints of admissible TU encodings. Logic libraries, scenario libraries, and weight parameters must be fixed before evaluation.
- Classical and non classical logics are evaluated under the same admissible encoding constraints. No stance may receive scenario specific tuning that is unavailable to others.
- Experiments described here can falsify or refine particular encodings of Q118. They cannot, by themselves, establish a unique correct logic for all rational reasoning.
Tension scale and bands
- All scalar tension quantities in this document, including
DeltaS_classical_norm(m),DeltaS_classical_extended(m; L), andDeltaS_logic(m), are understood as dimensionless scores on the TU tension scale. - Each such score lies in the closed interval
[0, 1], with low values associated with low tension bands, intermediate values with medium bands, and high values with high bands as defined in the TU Tension Scale Charter. - The experiments and falsification conditions use both numeric comparisons and band comparisons. A stance that systematically occupies a strictly higher band is treated as higher tension, even if raw numeric differences are small.
Semantics: discrete
-
The header metadata sets
Semantics: discrete. This means that:- The state space
Mis a discrete collection of reasoning scenario configurations. - All observables and mismatch scores are finite summaries over these discrete configurations.
- No claim is made here about whether the physical world is discrete or continuous. The discrete semantics only describes the level of abstraction used in this encoding.
- The state space
For general principles that govern effective layer encodings, fairness, and tension scales, this page relies on the TU charters listed again in the footer:
- TU Effective Layer Charter
- TU Encoding and Fairness Charter
- TU Tension Scale Charter
1. Canonical problem and status
1.1 Canonical statement
The canonical question behind Q118 can be stated as follows:
Are the inference rules and structural principles of classical logic sufficient to capture all forms of rational reasoning, or are there stable patterns of rational inference that require non classical logics at the effective level?
Classical logic is usually characterized by at least the following features:
- Bivalence: every statement is either true or false.
- Law of excluded middle: for any statement
P, the disjunctionP or not Pis valid. - Law of non contradiction: no statement
Pcan be both true and false. - Explosion: from a contradiction, anything follows.
- Monotonicity: adding premises never removes valid consequences.
The problem asks whether these features, together with the standard structural rules, are enough to describe all rational inference across mathematics, science, everyday reasoning, law, and ethics, at the level where we actually evaluate arguments.
1.2 Status and difficulty
There is no consensus answer. Instead, there is a landscape of positions.
-
Classical monism The view that classical logic is the one correct logic of rational consequence.
-
Logical pluralism The view that more than one logic can be correct, depending on context or on the notion of consequence.
-
Non classical challenges Positions that treat some non classical logic as strictly better for certain domains, for example paraconsistent logic for inconsistent but informative theories, or intuitionistic logic for constructive reasoning.
Several facts make Q118 an S rank philosophical problem.
- There are deeply developed non classical logics, such as intuitionistic, relevance, paraconsistent, modal, and substructural systems, that capture apparently rational patterns where classical logic struggles. Examples include reasoning from inconsistent data without triviality, default reasoning, or reasoning about quantum phenomena.
- There are strong arguments that classical logic remains adequate once we model context and meaning carefully enough, and that non classical logics can often be simulated within classical frameworks.
- There is no widely accepted meta theory that settles which logic or logics are correct, or what criteria such correctness should satisfy.
Because of these tensions, the question whether classical logic is sufficient for all rational reasoning remains open. It is not an open problem in the sense of a single conjecture in mathematics. It functions as an open structural question in philosophy of logic.
1.3 Role in the BlackHole project
Within the BlackHole S problem collection, Q118 has three roles.
- It is the canonical node for consistency_tension in logical systems, where the mismatch between formal consequence and normative judgments is measured.
- It anchors a cluster of problems in philosophy of logic and rationality, including questions about induction, probability, value of information, and AI alignment.
- It provides a template for encoding logic choice at the effective layer, without claiming to identify any fundamental logic at the deep layer.
References
- Stanford Encyclopedia of Philosophy, "Classical Logic", standard reference entry on the nature, principles, and scope of classical logic, latest revision.
- Stanford Encyclopedia of Philosophy, "Non classical Logic", and related entries on "Paraconsistent Logic" and "Intuitionistic Logic", standard surveys of logics that relax or revise classical principles.
- J. C. Beall and Greg Restall, "Logical Pluralism", Oxford University Press, 2006.
- Graham Priest, "An Introduction to Non Classical Logic", second edition, Cambridge University Press, 2008.
2. Position in the BlackHole graph
This block records how Q118 sits inside the BlackHole graph. Each edge is listed with a one line reason that points to a concrete component or tension type.
2.1 Upstream problems
These problems provide prerequisites or background tools for Q118.
-
Q116 (Foundations of mathematics) Reason: Supplies the formal systems background that
LogicalSystemDescriptoruses to encode classical and non classical logics as effective fields. -
Q115 (Problem of induction) Reason: Provides core patterns of ampliative reasoning that
LogicTensionFunctionalmust evaluate when classical deduction interacts with uncertain premises. -
Q111 (Mind body relation) Reason: Frames how
cognitive_fieldandconsistency_tensionare interpreted as about agents, representations, or abstract structures, without committing to a specific metaphysics.
2.2 Downstream problems
These problems reuse Q118 components or depend directly on its tension structure.
-
Q119 (Meaning of probability) Reason: Reuses
LogicTensionFunctionalto test whether classical consequence plus Kolmogorov axioms are enough to model rational credence updates. -
Q120 (Value of information and knowledge) Reason: Uses
LogicalSystemDescriptorto evaluate how different logics affect the appraisal of information gain and knowledge states. -
Q121 (AI alignment problem) Reason: Depends on
LogicTensionFunctionalto formalize when an AI system’s reasoning is logically safe and consistent with human normative standards. -
Q123 (Scalable interpretability) Reason: Uses
LogicalSystemDescriptorto classify the implicit logics that complex models implement at the effective layer.
2.3 Parallel problems
Parallel nodes share similar tension types but no direct component dependence.
-
Q112 (Free will and determinism) Reason: Both Q118 and Q112 examine how formal frameworks support rational explanation, but via different components and observables.
-
Q114 (Status of moral facts) Reason: Both ask whether a classical style of structure, logical or moral, is sufficient for all rational discourse, but without reusing specific logic encodings.
-
Q119 (Meaning of probability) Reason: Both investigate whether classical constructs can capture all rational practice, Q118 for consequence and Q119 for probabilistic coherence.
2.4 Cross domain edges
Cross domain edges connect Q118 to problems in other domains that can reuse its components.
-
Q051 (P versus NP) Reason: Reuses
LogicTensionFunctionalto relate classical proof systems to computational feasibility in search of low tension reasoning architectures. -
Q052 (Quantum computation and complexity) Reason: Uses
LogicalSystemDescriptorto study whether classical logic is adequate to reason about quantum processes, or if quantum logics reduce consistency_tension. -
Q057 (Generalization in reinforcement learning) Reason: Applies
LogicTensionFunctionalto understand how different logics handle default rules and generalization under sparse data. -
Q121 (AI alignment problem) Reason: Shares components for representing agent level logical constraints and measuring when classical rules are sufficient for safe reasoning.
-
Q123 (Scalable interpretability) Reason: Uses
LogicalSystemDescriptoras a template for mapping internal model representations to effective logics.
3. Tension Universe encoding (effective layer)
This block defines the effective layer encoding for Q118. It only introduces:
- state space,
- observables and fields,
- invariants and tension scores,
- singular sets and domain restrictions,
- admissible encoding class constraints.
No deep generative rules or mappings from real agents to TU fields are given.
3.1 State space
We define a discrete state space
M
with the following interpretation at the effective layer.
- Each
minMis a reasoning scenario configuration.
A configuration includes:
- a finite set of premises and candidate conclusions in a fixed formal language,
- a designation of which inferences are endorsed as rational in that scenario,
- a record of which logic or logics are being evaluated on that scenario.
We also assume a finite library of logics:
L_lib = {L_classical, L_intuitionistic, L_paraconsistent, L_relevance, L_nonmonotonic}
The exact list is part of the encoding choice but must be fixed before any measurements are taken.
We assume a finite library of benchmark scenarios:
S_lib = {s_1, s_2, ..., s_N}
where each s_k is associated with:
- a canonical scenario description,
- one or more normative judgments about acceptable inferences,
- classification into types such as inconsistent but informative, default reasoning, vagueness, and related categories.
For each s_k and each logic L in L_lib, there are states m in M that encode how L handles s_k and how normative judgments classify the inferences in s_k.
We do not specify how S_lib or the normative judgments are constructed from real data or real agents. We only require that they can be represented as finite discrete structures.
3.2 Effective fields and observables
On M we define the following effective observables and fields.
- Logical consequence summaries
For each L in L_lib we define:
Consequence_L(m)
Interpretation:
- Input: a state
minM. - Output: a finite summary object that records which conclusion patterns are derivable from the premises in
maccording to the logicL.
We do not require a particular data structure. We only require that for any scenario in S_lib the summary is well defined and finite.
We write
Consequence_classical(m) = Consequence_L(m) with L = L_classical
- Normative judgments observable
We define:
Normative_judgment(m)
Interpretation:
- Input: a state
m. - Output: a finite summary of which inferences are judged rational or acceptable according to a specified normative standard for the scenario. For example expert philosophical judgments or controlled experimental data.
- Paradox flag
We define:
Paradox_flag(m) in {0, 1}
Interpretation:
Paradox_flag(m) = 1if, under classical logic, the scenario encoded bymexhibits triviality or explosion in the sense that nearly all statements become derivable.Paradox_flag(m) = 0otherwise.
This is an effective observable only. We do not specify how the triviality condition is computed.
3.3 Logic mismatch observables
We define two primary mismatch observables. Both are normalized to the TU tension scale.
- Classical versus normative mismatch
DeltaS_classical_norm(m) in [0, 1]
Interpretation:
DeltaS_classical_norm(m)measures how farConsequence_classical(m)deviates fromNormative_judgment(m)on the scenario encoded inm.DeltaS_classical_norm(m) = 0if and only if every inference endorsed by the normative judgment matches a classical consequence and no classical consequence is normatively rejected.- Higher values of
DeltaS_classical_norm(m)correspond to higher bands on the TU tension scale. They indicate greater mismatch between classical consequence and normative judgments.
- Classical versus extended logic mismatch
For each L in L_lib we define:
DeltaS_classical_extended(m; L) in [0, 1]
Interpretation:
DeltaS_classical_extended(m; L)measures how farConsequence_classical(m)deviates fromConsequence_L(m)on the same scenario.DeltaS_classical_extended(m; L) = 0if and only if classical logic and logicLvalidate exactly the same inference patterns in the scenario encoded bym.- Higher values indicate larger divergence between classical logic and the comparison logic for that scenario.
The normalization to [0, 1] is understood as a rescaling of a more detailed mismatch measure. Only the normalized, dimensionless scores appear in this page.
3.4 Combined logic tension functional
We define a combined logic tension observable:
DeltaS_logic(m) = w_norm * DeltaS_classical_norm(m)
+ w_ext * DeltaS_classical_extended(m; L_star(m))
where:
-
w_normandw_extare nonnegative weights that satisfyw_norm >= 0 w_ext >= 0 w_norm + w_ext = 1 -
L_star(m)is a designated comparison logic fromL_libfor the scenario encoded bym.
At the encoding level we require:
-
A fixed rule that maps scenario types to comparison logics, for example:
- consistent deductive scenarios use
L_classical, - inconsistent but informative scenarios use
L_paraconsistent, - default reasoning scenarios use
L_nonmonotonic.
- consistent deductive scenarios use
-
This mapping rule is specified once when the encoding is defined and is not tuned after observing measurement outcomes.
Since both DeltaS_classical_norm(m) and DeltaS_classical_extended(m; L_star(m)) lie in [0, 1] and the weights are convex, DeltaS_logic(m) also lies in [0, 1] for all states where it is defined.
Intuitively:
DeltaS_classical_norm(m)captures how far classical consequence deviates from normative judgments in a scenario.DeltaS_classical_extended(m; L_star(m))captures how far classical consequence deviates from a chosen non classical logic that is designed to handle that type of scenario.DeltaS_logic(m)combines these deviations into a single consistency_tension score on the TU tension scale.
3.5 Singular set and domain restrictions
There are configurations where some observables are not defined or not finite. Examples include:
- scenarios where normative judgments are not available or are unstable,
- logics in
L_libthat do not provide clear consequence relations for a given language fragment, - degenerate cases where any finite representation of mismatch is impossible.
We collect such cases into a singular set:
S_sing = { m in M :
DeltaS_classical_norm(m) is undefined
or DeltaS_classical_extended(m; L_star(m)) is undefined
or any required summary is not finite }
We define the regular domain:
M_reg = M \ S_sing
All logic tension analysis for Q118 at the effective layer is restricted to M_reg. If an experiment or protocol attempts to evaluate DeltaS_logic(m) for m in S_sing, the outcome is recorded as out of domain.
States in S_sing cannot be used as evidence in favor of or against the adequacy of classical logic. They only indicate an encoding breakdown or a lack of sufficient data at the effective layer.
3.6 Admissible encoding class
We restrict attention to an admissible encoding class E_adm with the following properties.
-
Finite libraries
- The logic library
L_liband scenario libraryS_libare finite. - Their contents and classification rules are fixed before any evaluation of
DeltaS_logic(m).
- The logic library
-
Fixed weight constraints
-
The pair
(w_norm, w_ext)is chosen from a predefined finite set of candidate pairs, for example{(1, 0), (0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0, 1)} -
Once chosen,
(w_norm, w_ext)remains fixed for all scenarios and experiments in that encoding. -
It is not permitted to change the weights after inspecting mismatch or tension values on particular scenarios.
-
-
Stable comparison mapping
- The mapping from scenario types to comparison logics
L_star(m)is defined by a simple rule that depends only on scenario metadata inS_lib, not on the actual mismatch values. - It is not permitted to switch
L_star(m)for specific scenarios after seeing their tension scores.
- The mapping from scenario types to comparison logics
-
Refinement order
-
There exists a refinement index
ksuch that:-
For a given underlying reasoning pattern, there is a sequence of scenarios
m_1, m_2, ..., m_k, ...in
M_regwith increasing detail or coverage. -
An encoding in
E_admis expected to produce a sequence of tension valuesDeltaS_logic(m_k)that either stabilizes within a band on the TU tension scale or reveals persistent divergence.
-
-
-
No scenario specific tuning
- Encodings cannot be modified in a way that depends on the tension profile of individual scenarios.
- In particular, it is not allowed to adjust mismatch normalizations, weights, or comparison mappings to make a preferred logic appear low tension on a given benchmark after scores have been observed.
These constraints ensure that when we compare different encodings or worlds, the differences in tension cannot be explained by arbitrary post hoc parameter choices.
4. Tension principle for this problem
This block explains how Q118 is framed as a tension problem in TU at the effective layer.
4.1 Core tension interpretation
The core tension functional for Q118 is DeltaS_logic(m) defined above. Intuitively:
DeltaS_logic(m)small in a low tension band means that classical logic and the comparison logic both track rational judgments well on that scenario.DeltaS_logic(m)large in a high tension band means that there is a significant mismatch between classical consequence, extended logic, and normative judgments.
We are interested in how these scores behave:
- across scenario types in
S_lib, - across refinement sequences for each type,
- across different admissible encodings in
E_adm.
4.2 Classical adequacy as low logic tension
At the effective layer, we can phrase the thesis that classical logic is sufficient for all rational reasoning as follows.
For every reasoning pattern that actually occurs in rational practice, and for every admissible encoding in
E_adm, there exist refined statesm_kinM_regrepresenting that pattern such that the logic tensionDeltaS_logic(m_k)remains in low tension bands on the TU scale.
More concretely:
- For each underlying reasoning context, as we move along a refinement sequence
m_kthat captures more detail while staying inM_reg, the valuesDeltaS_logic(m_k)remain below a small thresholdepsilon_logicthat is compatible with low tension bands. - The threshold
epsilon_logicmay depend on the context but is bounded and does not blow up with refinement.
This formulation treats classical logic as effectively adequate. When classical logic is combined with reasonable modeling of context and with fair encoding rules, it stays within low tension bands for rational reasoning in all domains represented in S_lib.
4.3 Classical inadequacy as persistent high tension
The competing thesis that classical logic is not sufficient for all rational reasoning can be formulated as follows.
There exist families of reasoning patterns and admissible encodings such that, for any refinement sequence compatible with faithful modeling, the logic tension for classical logic stays in medium or high tension bands, while some alternative logic in
L_libachieves a strictly lower and more stable band profile.
Formally, this means:
-
There exist scenarios and encodings in
E_admfor which, along refinement sequencesm_krepresenting the same underlying reasoning practice,DeltaS_logic_classical(m_k) >= delta_logic > 0for all sufficiently large
k, withDeltaS_logic_classical(m_k)consistently occupying a band strictly above low tension. -
For the same scenarios, there exists a non classical logic
LaltinL_liband an admissible encoding that treatsLaltas the primary reference logic such thatDeltaS_logic_alt(m_k) <= epsilon_altwith
epsilon_altin low tension bands and withepsilon_alt < delta_logic.
This formulation does not claim that classical logic is false in any absolute sense. It states that, at the effective layer where we model rational reasoning, classical logic produces persistent higher tension in some domains that non classical logics can handle with lower tension.
4.4 Fairness and comparison constraints
All of the above depends on retaining fairness.
- Encodings cannot be designed so that classical logic is penalized by construction.
- Weights, logic libraries, and scenario libraries are fixed in advance and shared across stances within a given encoding.
- Both classical and non classical logics are evaluated under the same admissible encoding constraints.
Q118 is not about winning an unfair contest. It is about whether classical logic can be the unique low tension attractor across reasoning domains under transparent and symmetric comparison rules.
5. Counterfactual tension worlds
We describe two counterfactual worlds at the effective layer.
- World C: classical logic is fully adequate for all rational reasoning.
- World NC: classical logic is not fully adequate, non classical logics are needed to achieve low tension in stable ways.
These worlds are described in terms of observable patterns of DeltaS_logic(m), not in terms of any deep metaphysical claims about logic.
5.1 World C (classical logic fully adequate)
In World C:
-
Classical alignment with normative judgments
-
For every scenario type in
S_lib, there exist refined statesm_krepresenting actual reasoning patterns such thatDeltaS_classical_norm(m_k) <= epsilon_normfor some small bound
epsilon_normthat keepsDeltaS_classical_norm(m_k)in low tension bands on the TU scale.
-
-
Limited benefit of non classical logics
-
For each scenario type and for all encodings in
E_adm, the alternative logics inL_libdo not consistently achieve significantly lower tension bands than classical logic:DeltaS_classical_extended(m_k; L) <= epsilon_extwhere
epsilon_extis comparable in magnitude toepsilon_normacross logics and scenario types and yields similar tension bands.
-
-
Rare and local paradox flags
Paradox_flag(m)is rarely equal to1for states that represent realistic reasoning patterns.- When triviality does occur, it can be resolved by better modeling of premises or meanings without changing the logic itself. After refinement, the corresponding states move into low tension bands.
Overall, World C exhibits low and stable logic tension values for classical logic across the entire benchmark library S_lib and across refinement sequences.
5.2 World NC (classical logic not fully adequate)
In World NC:
-
Stable classical norm mismatch
-
There exist scenario types, especially inconsistent but informative, default reasoning, or vague predicates, such that for all refined states
m_krepresenting those typesDeltaS_classical_norm(m_k) >= delta_normwith
delta_norm > 0that placesDeltaS_classical_norm(m_k)in medium or high tension bands. This mismatch is not reducible by better modeling alone.
-
-
Systematic advantage of non classical logics
-
For these scenario types, at least one non classical logic
LaltinL_libyields tension values under some admissible encoding such thatDeltaS_logic_alt(m_k) <= epsilon_altwith
epsilon_altlying in low tension bands and withepsilon_alt < delta_normin a stable way across refinement sequences and encodings inE_adm.
-
-
Frequent paradox flags
Paradox_flag(m)takes value1for many realistic scenarios when classical logic is applied directly, indicating triviality or explosion.- Paraconsistent or non monotonic logics can suppress explosion while retaining useful inferences, thereby lowering consistency_tension and moving
DeltaS_logicinto lower bands.
World NC thus exhibits persistent high or medium tension for classical logic in specific domains, while certain non classical logics can systematically reduce that tension without ad hoc tuning.
5.3 Interpretive note
These worlds are effective layer constructs. They do not assert:
- that reality itself is governed by a non classical logic, or
- that there is a unique logic that is true.
They assert that, when we model rational reasoning with explicit mismatch observables under admissible encodings, we see either:
- universal low tension for classical logic across benchmark scenarios, or
- domain specific persistent higher tension for classical logic that some non classical logics mitigate under the same fairness constraints.
6. Falsifiability and discriminating experiments
This block specifies experiments and protocols that:
- test the coherence of the Q118 encoding,
- distinguish between different logic tension models,
- provide evidence for or against specific TU encodings in
E_adm.
They do not settle the philosophical debate. They can falsify particular encodings.
Experiment 1: Human normative inference tasks
Goal
Test whether classical logic, under the current encoding, can match human judgments of rational inference as well as selected non classical logics.
Setup
-
Construct or collect a finite benchmark library
S_libof reasoning scenarios, each with:- a formalization in a fixed language,
- recorded normative judgments about which inferences are rational.
-
For each scenario type, select up to three logics from
L_libthat are commonly discussed as relevant, for exampleL_classical,L_paraconsistent, andL_nonmonotonic. -
Fix the weights
(w_norm, w_ext)and the mappingL_star(m)according to the admissible encoding class rules. -
Ensure that mismatch scores are normalized to
[0, 1]and interpreted through TU tension bands.
Protocol
- For each scenario
s_kand each logicLof interest, construct a regular statem_k(L)inM_regthat encodes the scenario, the logic’s consequence patterns, and the normative judgments. - Compute
DeltaS_classical_norm(m_k(L_classical))andDeltaS_classical_extended(m_k(L_classical); L_star(m_k))for eachk, then combine them intoDeltaS_logic(m_k(L_classical)). - Compute analogous combined tension scores for non classical logics when they are treated as primary logics in alternative encodings, using the same benchmark library and normative judgments.
- Map each scalar tension score to its band on the TU tension scale. For each scenario, record which band is occupied by classical logic and by each comparison logic.
- Aggregate the results across the benchmark library by computing average and maximal tension values for each logic and the distribution of tension bands for each scenario type.
Metrics
-
Average tension:
Avg_classical = average over k of DeltaS_logic(m_k(L_classical)) Avg_alt = average over k of DeltaS_logic_alt(m_k(Lalt)) -
Band distributions:
- fraction of scenarios where classical logic is in a higher, equal, or lower tension band than each alternative logic.
-
Maximal tension per logic:
- highest band attained by each logic on any scenario type.
Falsification conditions
-
If, for a substantial subset of scenario types in
S_libthat involve inconsistency, default reasoning, or vagueness, the following pattern holds under all admissible choices of(w_norm, w_ext):Avg_classicalis greater thanAvg_altby at least a fixed margintau > 0, and- classical logic occupies a strictly higher tension band than the best non classical alternative on most scenarios of that type, and
- this pattern is stable under refinement of the scenarios and observables, then an encoding that treats classical logic as uniquely low tension for all reasoning patterns is considered falsified.
-
If small admissible changes in
(w_norm, w_ext)within the fixed candidate set cannot remove this inequality without breaking other constraints or moving alternative logics into implausible bands, the falsification applies to the entire encoding.
Semantics implementation note
All observables and mismatch scores are treated in a discrete semantics setting that matches the metadata. Scenarios, logics, and judgments are represented as finite discrete structures with no continuous fields introduced in this experiment.
Boundary note
Falsifying a TU encoding for Q118 in this experiment does not settle which logic is correct in a philosophical sense. It only rejects a particular effective layer encoding and its claim that classical logic is always in low tension bands under fair comparison.
Experiment 2: AI reasoning benchmarks under controlled logics
Goal
Test whether AI systems constrained to classical inference exhibit higher logic tension on benchmark tasks than systems that incorporate non classical reasoning modules, under the same encoding rules.
Setup
-
Select a subset
S_AIof scenarios fromS_libthat can be implemented as AI benchmark tasks, for example:- inconsistent knowledge bases with important but conflicting rules,
- default reasoning tasks with exceptions,
- context sensitive obligations that require non monotonic behavior.
-
Build two AI system variants:
Model_classical: uses only classical inference rules internally to handle logical constraints.Model_hybrid: has access to a non classical module, such as a paraconsistent or non monotonic inference engine, wrapped behind a simple interface.
Protocol
-
For each scenario in
S_AI, collect the model’s derived conclusions under each variant. -
Translate each model run into a state
m_k(Model_type)inM_reg, with:- premises encoded,
- the model’s conclusions encoded,
- normative judgments for that scenario available.
-
For
Model_classical, computeDeltaS_classical_norm(m_k(Model_classical))andDeltaS_classical_extended(m_k(Model_classical); L_star(m_k)), then obtainDeltaS_logic_classical(m_k). -
For
Model_hybrid, use the same normative judgments and scenario metadata, and treat the non classical module as the comparison logic where appropriate. ComputeDeltaS_logic_hybrid(m_k)using the same admissible encoding rules. -
Map the tension scores for both models to TU tension bands and compare band distributions across tasks.
Metrics
-
Average and maximal tension for each model type on
S_AI. -
Band distributions:
- fraction of tasks where
Model_classicaloccupies a strictly higher band thanModel_hybridand vice versa.
- fraction of tasks where
-
Frequency of triviality events for
Model_classical, detected viaParadox_flag(m_k(Model_classical)). -
Performance metrics such as accuracy or task completion rate, to verify that lower tension aligns with better task behavior or more coherent reasoning.
Falsification conditions
-
If, across a majority of scenarios in
S_AI, the hybrid model satisfies both:DeltaS_logic_hybrid(m_k) <= DeltaS_logic_classical(m_k) - taufor some fixedtau > 0, andModel_hybridoccupies equal or lower tension bands thanModel_classicalon those scenarios, and this pattern remains under refinements of the scenarios and benchmarks, then an encoding that treats classical logic as sufficient for those domains is considered falsified.
-
If classical logic only achieves comparable bands by moving many scenarios into
S_singor by treating triviality as acceptable, the encoding is judged unstable and rejected.
Semantics implementation note
The experiment interprets AI model behavior in discrete terms: finite sets of premises and conclusions, finite mismatch scores, and discrete flags. No continuous fields are introduced beyond aggregated statistics.
Boundary note
Falsifying a TU encoding for Q118 in this experiment evaluates engineering level adequacy of classical logic for the chosen AI benchmarks. It does not settle the deeper philosophical question of the one true logic.
7. AI and WFGY engineering spec
This block describes how Q118 can be used as an engineering module for AI systems in the WFGY framework, at the effective layer.
7.1 Training signals
We define several training signals derived from the observables in Block 3.
-
signal_consistency_gap- Definition: a nonnegative signal proportional to
DeltaS_classical_norm(m)on scenarios where normative judgments are available. - Usage: penalize internal states that imply many inferences judged irrational or that miss inferences judged rational, encouraging alignment of classical style reasoning with normative standards.
- Definition: a nonnegative signal proportional to
-
signal_logic_choice_sensitivity- Definition: a signal based on
DeltaS_classical_extended(m; L_star(m)), measuring how much the choice of logic affects derived consequences in a scenario. - Usage: encourage the model to recognize when logic choice is consequential and to flag such contexts for special handling, for example through logic switching or explicit uncertainty.
- Definition: a signal based on
-
signal_paradox_exposure- Definition: a signal derived from
Paradox_flag(m), higher when classical inference leads to triviality or widespread inconsistency. - Usage: discourage internal states that make large parts of the knowledge base trivial or unusable, and steer the model toward strategies that avoid explosion.
- Definition: a signal derived from
-
signal_low_tension_preference- Definition: a composite signal based directly on
DeltaS_logic(m), with lower values and lower bands preferred under scenarios that are known to admit low tension encodings. - Usage: align the model toward representations and inference strategies that stay in low tension regimes when possible, while respecting fairness constraints.
- Definition: a composite signal based directly on
7.2 Architectural patterns
We outline module patterns that reuse Q118 structures without revealing any deep TU rules.
-
LogicTensionHead-
Role: given an internal representation of a reasoning context, outputs estimates of
DeltaS_classical_norm(m),DeltaS_classical_extended(m; L_star(m)), andDeltaS_logic(m), together with their tension bands. -
Interface:
- Inputs: context embeddings or hidden states for a reasoning task.
- Outputs: a small vector of normalized tension scores in
[0, 1]and optional band labels.
-
-
LogicalSystemDescriptor-
Role: maintains an effective description of which logical rules and structural principles are active in the current context.
-
Interface:
- Inputs:
scenario_metadata,internal_context_state. - Output: a
logic_descriptorindicating which logic or logic mix fromL_libis active.
- Inputs:
-
-
NonMonotonicGate-
Role: controls when non monotonic or paraconsistent reasoning modules are allowed to override classical inference, based on
LogicTensionHeadoutputs and scenario type. -
Interface:
- Inputs: tension scores, band labels, and scenario descriptors.
- Outputs: gate signals for inference modules, such as switches between classical and non classical reasoning paths.
-
7.3 Evaluation harness
We suggest an evaluation harness that uses Q118 components.
-
Task selection
-
Choose benchmark suites that include:
- classical theorem proving tasks,
- inconsistent but informative knowledge base tasks,
- default reasoning and non monotonic tasks.
-
-
Conditions
- Baseline: model uses classical inference only and does not incorporate Q118 modules explicitly.
- TU enhanced: model uses
LogicTensionHeadandLogicalSystemDescriptorto modulate inference strategies, and may employNonMonotonicGate.
-
Metrics
- Task performance metrics such as accuracy, proof success rate, and robustness to inconsistent data.
- Logic tension metrics, including average and maximal
DeltaS_logic(m)and the distribution of bands across tasks. - Alignment metrics such as how often model outputs accord with expert normative judgments.
The harness is designed to show whether Q118 inspired modules can reduce consistency_tension in AI reasoning without sacrificing task performance.
7.4 60 second reproduction protocol
A minimal protocol that allows external users to experience the impact of Q118 encoding.
-
Baseline setup:
- Prompt an AI model to reason about a small inconsistent but informative knowledge base using ordinary instructions, without any mention of logic tension.
- Observation: note whether the model ignores contradictions, collapses into triviality, or gives unstable results across prompts.
-
TU encoded setup:
-
Prompt the same model, but now instruct it to explicitly separate:
- what follows under strict classical consequence,
- what follows under a paraconsistent or default logic,
- where its own reasoning sees high consistency_tension according to
DeltaS_logic(m)and associated bands.
-
Observation: note whether the outputs become more structured, with explicit boundaries between safe and unsafe inferences.
-
-
Comparison metric:
-
Use a simple rubric that scores:
- explicit handling of contradictions,
- separation between strict and default inferences,
- stability across minor prompt variations.
-
-
What to log:
- Prompts, model outputs, tension scores, and band labels.
- These logs can be used to audit whether the model actually uses Q118 components rather than only changing surface wording.
8. Cross problem transfer template
This block describes the reusable components produced by Q118 and their direct reuse targets.
8.1 Reusable components produced by this problem
-
ComponentName:
LogicalSystemDescriptor-
Type: field.
-
Minimal interface:
- Inputs:
scenario_metadata,internal_context_state. - Output:
logic_descriptorindicating which logic or logics fromL_libare active.
- Inputs:
-
Preconditions:
- Scenario metadata must classify the context into a type that the descriptor knows how to map to a logic or logic mix.
-
-
ComponentName:
LogicTensionFunctional-
Type: functional.
-
Minimal interface:
- Inputs:
logic_descriptor,consequence_summaries,normative_judgments. - Output:
tension_scoresincludingDeltaS_classical_norm(m),DeltaS_classical_extended(m; L_star(m)), and combinedDeltaS_logic(m), each in[0, 1]with associated TU bands.
- Inputs:
-
Preconditions:
- Consequence summaries and normative judgments must be available and finite for the scenario.
-
-
ComponentName:
ParadoxScenarioPattern-
Type: experiment_pattern.
-
Minimal interface:
- Inputs:
knowledge_base,query_set. - Output:
scenario_familythat systematically probes triviality and inconsistency behavior under classical and non classical reasoning.
- Inputs:
-
Preconditions:
- The knowledge base must be representable in the chosen formal language. Query sets must be finite and well defined.
-
8.2 Direct reuse targets
-
Q119 (Meaning of probability)
- Reused components:
LogicalSystemDescriptorandLogicTensionFunctional. - Why it transfers: probabilistic reasoning often blends deductive and default inferences. Describing which logic is active and how tension behaves is directly relevant.
- What changes: consequence summaries now include probabilistic coherence and conditionalization behavior, not only truth functional inference. Normative judgments incorporate probabilistic rationality criteria.
- Reused components:
-
Q120 (Value of information and knowledge)
- Reused component:
LogicTensionFunctional. - Why it transfers: the value of information depends on how additional information changes consistency_tension and rational inference.
- What changes: tension scores are tracked before and after information updates to see how they affect knowledge states and whether classical logic or non classical logics yield lower tension profiles.
- Reused component:
-
Q121 (AI alignment problem)
- Reused components:
LogicalSystemDescriptor,LogicTensionFunctional, andParadoxScenarioPattern. - Why it transfers: alignment depends on ensuring that AI reasoning is logically safe, especially under inconsistency and ambiguity.
- What changes: scenarios include multi agent and safety critical contexts. Tension is interpreted as part of risk assessment for misaligned or unsafe behavior.
- Reused components:
-
Q123 (Scalable interpretability)
- Reused component:
LogicalSystemDescriptor. - Why it transfers: interpretability efforts often try to understand what implicit logic a model uses in different subsystems.
- What changes: descriptor inputs now come from internal activations or circuits rather than explicit scenario metadata. Logic descriptors help organize interpretability findings.
- Reused component:
9. TU roadmap and verification levels
This block explains where Q118 currently sits in the TU verification ladder and what the next measurable steps are.
9.1 Current levels
-
E_level: E1
- A coherent effective layer encoding has been specified, including state space, observables, mismatch functionals, and singular set.
- Admissible encoding constraints are stated, but not yet implemented as a concrete library with published data.
-
N_level: N2
- The narrative explains how classical and non classical logics interact under consistency_tension.
- Counterfactual worlds C and NC are described in a way that can be instantiated on benchmark scenario families.
9.2 Next measurable step toward E2
To reach E2, at least the following should be achieved.
-
Implement a finite benchmark library
S_libwith public documentation:- A list of scenarios, their types, and normative judgments.
- A documented procedure for constructing states
minM_regfrom these scenarios.
-
Implement at least one concrete encoding in
E_adm:- A fixed
L_lib,S_lib,(w_norm, w_ext), and mappingL_star(m). - A working prototype that computes
DeltaS_logic(m)and band labels for all scenarios inS_liband publishes the resulting tension profiles.
- A fixed
-
Run one instance of Experiment 1 or Experiment 2 in a limited setting:
- Collect data on how classical and at least one non classical logic perform under the same encoding.
- Report results in a way that other groups can replicate.
These steps can be completed without revealing any deep TU generative rules. They operate on effective summaries and benchmark data.
9.3 Long term role in the TU program
Long term, Q118 is expected to serve as:
- the reference node for all problems involving logic choice and consistency_tension,
- a template for encoding philosophical disputes as structured tension comparisons under transparent fairness constraints,
- a bridge between philosophical logic, AI safety, and interpretability, by treating logic adequacy as an observable property of reasoning systems.
10. Elementary but precise explanation
This block explains Q118 in accessible terms, while staying aligned with the effective layer encoding.
Classical logic is a system of rules that tell you what follows from what. It includes ideas such as:
- either a statement is true or it is false,
- from a contradiction you can derive any statement,
- if a conclusion follows from some information then adding more information never makes that conclusion invalid.
The big question behind Q118 is:
If we look at all the ways people and machines reason when they are being rational, is classical logic always enough, or do we sometimes need other logics to describe what is going on?
In the Tension Universe view, we do not try to decide this question by slogans alone. We do three things.
-
We imagine a library of reasoning situations, such as:
- trying to reason with inconsistent but useful data,
- making default assumptions that can have exceptions,
- dealing with vague or context sensitive words.
-
For each situation, we record:
- what classical logic says follows from the premises,
- what some non classical logics say,
- what people or experts judge to be reasonable inferences.
-
We define numbers in
[0, 1]that measure how far classical logic is from:- those human judgments,
- those non classical logics in exactly the same situations.
These numbers are tension scores on the TU tension scale.
- If the tension score is small and remains in low tension bands, then classical logic fits rational practice well in that scenario.
- If the tension score is large and lies in higher bands, and if some non classical logic has a smaller score in lower bands under the same rules, then classical logic looks less adequate for that kind of reasoning.
Q118 does not declare a winner for all time. Instead, it provides:
- a precise way to talk about the match or mismatch between logics and rational practice,
- a set of experiments that can reject unfair or unstable encodings,
- reusable components that help study logic choice in mathematics, science, and AI.
In this way, Q118 functions as a structural map of the limits of classical logic inside the Tension Universe, rather than as a final verdict about which logic is ultimately correct.
Tension Universe effective layer footer
This page is part of the WFGY / Tension Universe S problem collection for logic and rationality.
Scope of claims
- The aim of this document is to provide an effective layer encoding of Q118 and to list concrete experiments that can falsify or refine this encoding.
- It does not prove or disprove any canonical philosophical thesis about classical logic or non classical logics.
- It does not introduce new mathematical theorems and should not be cited as a solution to the limits of classical logic problem.
Effective layer boundary
- All state spaces, observables, mismatch measures, and tension scores described here live at the effective layer.
- No deep TU axiom system, field equation, or generative rule is specified.
- No explicit pipeline from real world data to TU fields is given. Any such pipeline must be documented separately and audited on its own terms.
Encoding and fairness
- The encoding belongs to an admissible class
E_admwith finite libraries, fixed weight choices, and stable comparison rules. - Classical and non classical logics are evaluated under the same constraints. Scenario specific tuning is prohibited.
- States in the singular set
S_singrepresent encoding breakdown or lack of data, not evidence for or against any stance about logic.
Experiments and falsifiability
- Experiments in Section 6 can falsify specific Q118 encodings by showing unstable or non discriminating tension patterns, or by revealing systematic band advantages for alternative logics under fair comparison.
- Falsifying an encoding does not settle the philosophical problem. It only rules out that particular effective layer implementation.
Relation to TU charters
This page should be read together with the following charters:
- TU Effective Layer Charter
- TU Encoding and Fairness Charter
- TU Tension Scale Charter
- TU Global Guardrails
Index:
← Back to Event Horizon
← Back to WFGY Home
Consistency note:
This entry has passed the internal formal-consistency and symbol-audit checks under the current WFGY 3.0 specification.
The structural layer is already self-consistent; any remaining issues are limited to notation or presentation refinement.
If you find a place where clarity can improve, feel free to open a PR or ping the community.
WFGY evolves through disciplined iteration, not ad-hoc patching.